Transcriptome Reconstruction and Quantification from RNA Sequencing Data
Sahar Al Seesi, Serghei Mangul, Adrian Caciula, Alex Zelikovsky and Ion Măndoiu
from: Genome Analysis: Current Procedures and Applications (Edited by: Maria S. Poptsova). Caister Academic Press, U.K. (2014)
Massively parallel whole transcriptome sequencing has become the technology of choice for transcriptome analysis since it supports a wider range of applications than the previously popular microarray technology. In this chapter we focus on two of these applications, namely transcriptome reconstruction and quantification. We discuss the key computational problems related to these applications and describe some of the best-performing algorithms available for each. For transcriptome reconstruction, we present in detail a statistical genome-guided method called "Transcriptome Reconstruction using Integer Programming" (TRIP) that incorporates the fragment length distribution into the reconstruction of novel transcripts from paired-end RNA-Seq reads. Experimental results on both real and synthetic datasets show that TRIP is more accurate than methods that ignore fragment length distribution information. For transcriptome quantification, we focus on two Expectation-Maximization (EM) algorithms, for the RNA-Seq and Digital Gene Expression (DGE) sequencing protocols respectively. Both algorithms take into account alternative splicing and mapping ambiguities. We present experimental results on real datasets comparing the two protocols as well as the methods available for each protocol. Results show that the EM algorithms outperform other available methods for both RNA-Seq and DGE, and that they yield comparable quantification accuracy on real data generated using the RNA-Seq and DGE protocols.
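As a rough illustration of how an EM iteration can resolve mapping ambiguities during quantification, the sketch below estimates the fraction of reads originating from each transcript when some reads are compatible with several transcripts. It is a generic textbook-style sketch, not the chapter's algorithms: the function name `em_read_fractions`, the input layout (`compat`, `lengths`), and the uniform read-position model are all assumptions made for this example, and fragment length distributions, read quality, and strand information are ignored.

```python
def em_read_fractions(compat, lengths, n_iter=200):
    """Estimate the fraction of reads coming from each transcript.

    compat  : list of lists; compat[r] holds the indices of the transcripts
              that read r maps to (hypothetical input layout).
    lengths : effective length of each transcript (assumed known).
    """
    n_t = len(lengths)
    theta = [1.0 / n_t] * n_t  # start from a uniform guess

    for _ in range(n_iter):
        counts = [0.0] * n_t

        # E-step: split each read among its compatible transcripts in
        # proportion to theta[t] / lengths[t] (uniform start position model).
        for ts in compat:
            weights = [theta[t] / lengths[t] for t in ts]
            total = sum(weights)
            if total == 0.0:
                continue  # read cannot be explained by the current estimate
            for t, w in zip(ts, weights):
                counts[t] += w / total

        # M-step: re-estimate the fractions from the expected read counts.
        assigned = sum(counts)
        if assigned == 0.0:
            break  # no assignable reads; keep the current estimate
        theta = [c / assigned for c in counts]

    return theta


# Toy input: 30 reads unique to transcript 0, 10 unique to transcript 1,
# and 20 reads that map to both transcripts (equal lengths).
reads = [[0]] * 30 + [[1]] * 10 + [[0, 1]] * 20
fractions = em_read_fractions(reads, lengths=[1000.0, 1000.0])
```

In this toy input the ambiguous reads are shared according to the evidence from the uniquely mapped reads, rather than being discarded or split evenly, which is the basic idea behind handling mapping ambiguities with EM.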