Inferring Viral Quasispecies Spectra from Shotgun and Amplicon Next-generation Sequencing Reads
Irina Astrovskaya, Nicholas Mancuso, Bassam Tork, Serghei Mangul, Alex Artyomenko, Pavel Skums, Lilia Ganova-Raeva, Ion Măndoiu and Alex Zelikovsky
from: Genome Analysis: Current Procedures and Applications (Edited by: Maria S. Poptsova). Caister Academic Press, U.K. (2014)
Many clinically relevant viruses, including hepatitis C virus (HCV) and human immunodeficiency virus (HIV), exhibit high genomic diversity within infected hosts, which may explain the failure of vaccines and resistance to existing antiviral therapies. Characterizing the viral population infecting a host requires reconstructing all co-existing (related, but non-identical) viral variants, referred to as quasispecies, and inferring their relative abundances. Next-generation sequencing is a promising approach for characterizing viral diversity because it generates large numbers of reads at low cost. However, standard assembly software was originally designed for single-genome assembly and cannot be used to assemble multiple closely related quasispecies sequences and estimate their abundances. In this chapter, we focus on the problem of reconstructing viral quasispecies populations from next-generation sequencing reads produced by the two most commonly used strategies: shotgun sequencing and sequencing of partially overlapping PCR amplicons. We discuss the computational challenges associated with each strategy and review existing approaches to quasispecies reconstruction, with a focus on two state-of-the-art software tools: Viral Spectrum Assembler (ViSpA), designed for shotgun reads, and Viral Assembler (VirA), which handles amplicon reads. Both tools have been tested on simulated and real read data from HCV and HIV (ViSpA) and HBV (VirA) quasispecies, and have been shown to compare favorably with other existing methods.