The Virophage Family Lavidaviridae

Double-stranded (ds) DNA viruses of the family Lavidaviridae, commonly known as virophages, are a fascinating group of eukaryotic viruses that depend on a coinfecting giant dsDNA virus of the Mimiviridae for their propagation. Instead of replicating in the nucleus, virophages multiply in the cytoplasmic virion factory of a coinfecting giant virus inside a phototrophic or heterotrophic protistal host cell. Virophages are parasites of giant viruses and can inhibit their replication, which may lead to increased survival rates of the infected host cell population. The genomes of virophages are 17–33 kilobase pairs (kbp) long and encode 16–34 proteins. Genetic signatures of virophages can be found in metagenomic datasets from various saltwater and freshwater environments around the planet. Most virophages share a set of conserved genes that code for a major and a minor capsid protein, a cysteine protease, a genome-packaging ATPase, and a superfamily 3 helicase, although the genomes are otherwise diverse and variable. Lavidaviruses share genes with other mobile genetic elements, suggesting that horizontal gene transfer and recombination have been major forces in shaping these viral genomes. Integrases are occasionally found in virophage genomes and enable these DNA viruses to persist as provirophages in the chromosomes of their viral and cellular hosts. As we watch the genetic diversity of this new viral family unfold through metagenomics, additional isolates are still lacking and critical questions regarding their infection cycle, host range, and ecology remain to be answered.
caister.com/cimb Curr. Issues Mol. Biol. Vol. 40


Introduction
Virophages are a recently discovered class of double-stranded (ds) DNA viruses that have evolved a dependency on complex dsDNA viruses of eukaryotes, so-called giant viruses. The discovery of giant viruses was therefore a prerequisite for the isolation and characterization of virophages (see Fig. 12.1), as shall be reviewed here briefly (see also Reteno et al., 2018).
In 1992, following a pneumonia outbreak in Bradford, England, an intra-amoebal parasite was isolated by the team of T.J. Rowbotham and given the name 'Bradford coccus'. This microorganism was initially assumed to be a bacterium because of its size and positive Gram-stain reaction; however, attempts to amplify and analyse its 16S ribosomal DNA sequences failed. A decade later in the laboratory of Didier Raoult in Marseille, electron microscopy revealed that Bradford coccus was in fact a giant virus with a fibre-studded, 0.75 µm-diameter particle (La Scola et al., 2003). The serendipitous discovery of Acanthamoeba polyphaga mimivirus (family Mimiviridae, genus Mimivirus, species Acanthamoeba polyphaga mimivirus) and the analysis of its 1.2 million base pair dsDNA genome have profoundly changed our view of the viral world (Raoult et al., 2004). This virus exhibited a particle size and genome length that exceeded those of the smallest cellular organisms, blurring the boundary between viruses and cells. The finding that some mimivirus genes were homologous to genes that were previously known to occur only in cellular genomes, in particular four aminoacyl-tRNA synthetases and four translation factors, spurred evolutionary scenarios in which giant viruses were hypothesized to be descendants of a cellular ancestor (Boyer et al., 2010; Nasir et al., 2012; Raoult et al., 2004). Such hypotheses, however, have been widely criticized and the origin of giant viruses remains a matter of ongoing debate (Williams et al., 2011; Yutin et al., 2014). Akin to poxviruses (Broyles, 2003), mimivirus and related giant viruses replicate solely in the cytoplasm of their unicellular eukaryotic host (Mutsafi et al., 2010), which is made possible by hundreds of virus-encoded proteins that provide nuclear functions to the virion factory (VF), such as DNA replication and transcription.
For instance, the mimivirus-encoded transcription apparatus consists of at least eight DNA-dependent RNA polymerase subunits, a trifunctional mRNA capping enzyme, a polyadenylate polymerase, and several transcription factors.
A targeted search for giant viruses was launched following the characterization of mimivirus and resulted in the isolation of dozens of new viral strains. One of the first strains to be isolated was Acanthamoeba castellanii mamavirus, a close relative of mimivirus that was recovered from a cooling tower in Paris. Electron microscopy analysis of mamavirus-infected amoebae revealed the presence of a second, smaller icosahedral virus that was named Sputnik (La Scola et al., 2008). Sputnik was unable to infect Acanthamoeba cells on its own, and replicated only when the cells were coinfected with mamavirus. In coinfected cells, Sputnik colocalized to the mamavirus VF, providing the first evidence that Sputnik uses giant virus-encoded enzymes for its propagation. The presence of Sputnik also elicited a morphological phenotype in mamavirus, with visible deformations such as partial capsid thickening in many of the newly synthesized virions. The yield of mamavirus progeny from coinfected cells was reduced by ≈70% (La Scola et al., 2008). Sputnik thus acted as a parasite of mimiviruses and the term 'virophage' was coined to reflect the relationship as a 'virus of a virus'. However, the parasitic interaction appears to be restricted to the intracellular phase of the virus life cycle, when the biochemical complexity of giant viruses unfolds in the form of the VF organelle. In the years following the discovery of Sputnik, additional virophages such as mavirus (Fischer and Suttle, 2011) and Zamilon (Gaia et al., 2014) were isolated, and more than a dozen putative virophage genomes were found in metagenomic datasets (Gong et al., 2016; Oh et al., 2016; Yau et al., 2011; Yutin et al., 2015a; Zhou et al., 2013, 2015) (see Table 12.1). These eukaryotic dsDNA viruses are geographically widespread, genetically diverse, and appear to be commonly associated with giant virus-infected protist populations.
The interested reader is referred to previously published review articles on this topic (Claverie and Abergel, 2009b; Desnues and Raoult, 2010; Gaia et al., 2013a).

Virion structure
Only a few virophage representatives (Sputnik, mavirus, and Zamilon) have been isolated in laboratory culture and are thus amenable to structural studies. The virophage particles that have been examined so far are 50-75 nm in diameter and possess icosahedral symmetry (Fig. 12.2). A 3.5 Å resolution cryo-electron microscopy (cryo-EM) reconstruction of the 75 nm wide Sputnik particle suggested the lack of a membrane component (Zhang et al., 2012), contradicting earlier reports that assumed a lipid component in the Sputnik particle (Desnues et al., 2012a; Sun et al., 2010). The Sputnik capsid structure is composed of the major capsid protein (MCP) encoded by the V20 gene, and the minor capsid protein (mCP) encoded by the V18/V19 gene. The MCP contains a double jelly-roll fold and forms trimeric capsomers (hexons) with pseudohexagonal symmetry that build the 20 faces of the icosahedral particle (Fig. 12.2C), whereas the mCP is a single jelly-roll protein that forms pentameric capsomers (pentons) occupying the 12 vertices. The mature capsid consists of 260 hexons and 12 pentons that are arranged in a lattice with a triangulation (T) number of 27 (h = 3; k = 3) (Sun et al., 2010; Zhang et al., 2012).
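The capsid numbers quoted above are consistent with the standard Caspar-Klug relations for icosahedral lattices, T = h² + hk + k², with 12 pentons and 10(T − 1) hexons. The following minimal Python sketch checks this arithmetic for Sputnik; the function names are illustrative and are not taken from the cited structural studies.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsomer_counts(t: int) -> dict:
    """Capsomer counts for an icosahedral capsid with triangulation number t."""
    return {
        "pentons": 12,            # one pentameric capsomer per icosahedral vertex
        "hexons": 10 * (t - 1),   # pseudo-hexameric capsomers filling the lattice
        "capsomers": 10 * t + 2,  # total number of capsomers (pentons + hexons)
    }


# Sputnik lattice indices as given in the text (h = 3, k = 3)
t_sputnik = triangulation_number(3, 3)
sputnik_counts = capsomer_counts(t_sputnik)
print(t_sputnik, sputnik_counts)
```

With h = 3 and k = 3 the sketch reproduces T = 27 and the 260 hexons and 12 pentons reported for the mature Sputnik capsid.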
The capsid proteins of different virophages show low levels of conservation on the amino acid level (e.g. ~40% for the MCPs of Sputnik and mavirus). The mCP in particular is highly diverse, which hinders sequence similarity-based identification of novel virophages. Despite the high sequence divergence, all MCPs and mCPs are assumed to adopt the double and single jelly-roll folds, respectively, and preliminary data show that the three-dimensional structure of the mavirus MCP is very similar to that of Sputnik (D. Born, L. Reuter, U. Mersdorf, M. Mueller, M.G. Fischer, A. Meinhart and J. Reinstein, under review). However, since most of the sequence space occupied by virophages remains unknown, the existence of different capsid architectures cannot be excluded. For instance, the virophage genomes assembled from a sheep rumen metagenome apparently lack a penton gene (Yutin et al., 2015a). In addition to the main capsid components (MCP and mCP), other virophage-encoded proteins may be present in the mature virion, such as the mavirus MV13
Fig. 12.2 (C) Cryo-EM reconstruction of the Sputnik virion at 3.5 Å resolution with a magnified major capsid protein trimer (PDB entry 3J26). Modified from Zhang et al. (2012).
Virophage genomes contain from 16 to 34 protein-coding sequences, which are for the most part spaced in a non-overlapping manner, similar to other dsDNA viruses and bacterial genomes (Fig. 12.3). The Organic Lake virophage (OLV) has five annotated ORFs that are nested within larger predicted coding sequences (CDSs) (Yau et al., 2011). There is no indication that they encode proteins, however, since one of those ORFs (OLV26) has no significant similarity to sequences in public databases, and the other four ORFs (OLV14-16, 21) contain highly repetitive amino acid patterns, which may give false-positive homologues in BLAST searches.

Conserved genes
Nearly all virophage genomes encode a structural module that consists of a major capsid protein (MCP), a minor capsid or penton protein (mCP), a cysteine protease (PRO), and an FtsK-HerA-like ATPase (coloured in Fig. 12.3). These proteins are involved in virion formation. The highly conserved MCP-mCP gene pair is found in all virophages including the virophage-like element PgVV and also in polinton-like viruses (PLVs), which encode poorly conserved versions of these capsid genes (Yutin et al., 2015b). These two genes are syntenic in most virophages ( Fig. 12.3), indicative of their importance. The MCP adopts a double jelly-roll fold, whereas the mCP gene encodes the single jelly-roll penton protein of virophage capsids. Yellowstone Lake virophage (YSLV) 1 encodes two versions of the mCP, coding sequences (CDSs) 26 and 27, of which CDS 26 contains a 390 aa long insertion that is missing in other virophage mCPs. Because mCP pentamers are located at the vertices of the icosahedral capsid, they are most likely to interact with host components (e.g. membrane proteins), which may explain their high sequence variability due to adaptation to new hosts. The virophage-encoded PRO bears similarity to maturation proteases from adenoviruses and PRD1, and this protease is most likely required for virion morphogenesis (D. Born, L. Reuter, U. Mersdorf, M. Mueller, M.G. Fischer, A. Meinhart and J. Reinstein, under review). The cryo-EM reconstruction of the Sputnik virion revealed that the assembled MCP was 87 aa shorter than predicted from the DNA sequence (Zhang et al., 2012). Therefore, the C-terminus must be cleaved off at a diglycine motif that is conserved across all virophages except for PgVV and the rumen virophages (RVPs). PRO is thus responsible for C-terminal processing of the MCP in at least Sputnik and mavirus. The last conserved virophage gene is a predicted genome-translocating ATPase that belongs to the FtsK-HerA superfamily.
These NTPases usually form hexamers at a unique vertex of the viral capsid and pump the genome through the central pore of this hexamer, a process that is powered by ATP hydrolysis (Burroughs et al., 2007). The mavirus ATPase differs from those of all other virophages by a unique 54 aa insertion between Walker A and Walker B motifs.
Most virophage genomes encode a superfamily 3 helicase (S3H) domain (Fig. 12.3), but the genetic relationships among these genes are complicated. Virophage S3H genes are apparently polyphyletic, and the versions found in Sputnik, Zamilon, OLV, YSLV 1, YSLV 6, QLV, and PgVV are coupled to an amino-terminal domain with predicted primase-polymerase activity (TVpol, related to bacterial DNA polymerase I enzymes) (Iyer et al., 2008; Yutin et al., 2013). The S3H may thus have adopted different functions in different virophages, e.g. as a helicase in mavirus (which encodes a separate DNA polymerase), and as the primary DNA replication enzyme in Sputnik. Short zinc ribbon domains are also frequently found in virophages, where they can be fused with GIY-YIG endonucleases (e.g. mavirus, OLV), or occur separately (e.g. Sputnik).

Non-conserved genes
Many of the genes found in virophage genomes exhibit diverse evolutionary affiliations, including eukaryotes, bacteria, different viral families, and diverse mobile genetic elements (Fischer and Suttle, 2011; Krupovic and Koonin, 2015; La Scola et al., 2008; Yau et al., 2011; Yutin et al., 2013). Examples are the bacteriophage lambda-type tyrosine recombinase encoded by Sputnik or the retroviral integrase encoded by mavirus. Although virophages are genetically distinct from their associated host viruses, they encode one to several genes with clear affiliations to giant DNA viruses, and these genes often contain repetitive motifs. The genomes of Sputnik, Zamilon, and predicted algae-infecting virophages contain collagen-like repeat proteins that are also prominent in genomes of Mimiviridae members (Gaia et al., 2014; La Scola et al., 2008; Suhre, 2005; Yau et al., 2011). Sputnik ORF V12 codes for a low-complexity protein that is similar to mimivirus ORF R546; and Sputnik ORFs V4, V14, V16, and V17 display similarities to proteins encoded mainly by moumouviruses. The most frequent repetitive element in the CroV genome is called FNIP or IP22 (Fischer et al., 2010; O'Day et al., 2006) and occupies more than 10% of the 692 kbp genome (Hackl and Fischer, in preparation), and the same repeat is also found in mavirus ORF MV20. These observations suggest that DNA or protein repeats may be involved in virophage-giant virus interactions. Repetitive proteins are also found in the genomes of uncultivated virophages, e.g. the Kelch, ankyrin, and adhesin domain-containing proteins in the endogenous Bigelowiella natans virophages (Blanc et al., 2015).

Diversity and taxonomy of virophages
After the isolation and characterization of Sputnik and mavirus, it became apparent that these viruses were the first representatives of a previously unknown group of dsDNA viruses. Classification of virophages was initially complicated by their requirement for coinfecting giant dsDNA viruses for their propagation. This property, which virophages share with satellite viruses, led to the initial classification of Sputnik as a satellite virus, placed within the obscure category of 'subviral agents' in the Ninth Report of the International Committee on Taxonomy of Viruses (ICTV) (King et al., 2011). The distinction between satellite viruses and virophages has caused some confusion and debate (Fischer, 2011; Krupovic and Cvirkaite-Krupovic, 2011) and shall be discussed here briefly.

Virophages and satellite viruses
Virophages as well as satellite viruses can exert a negative effect on their associated giant viruses, or helper viruses, respectively. This effect varies significantly depending on the virus and multiple infection parameters. Satellite viruses also share some other aspects with virophages, such as subcellular localization or overlap with their helper viruses regarding polyadenylation sites (Krupovic and Cvirkaite-Krupovic, 2011). On the other hand, virophages are genetically distinct from satellite viruses. While the latter have short ssRNA genomes (satellite viruses infecting plants and arthropods) or short ssDNA genomes (adeno-associated viruses) (King et al., 2011), virophages possess considerably longer dsDNA genomes encoding genes that are conserved in the PRD1-adenovirus lineage. Unlike satellite viruses and owing to the greater complexity of virophage particles, virophages encode multiple morphogenesis proteins (two capsid proteins, maturation protease, DNA-packaging ATPase) and DNA replication proteins. As discussed in more detail below, virophages probably depend on the giant virus-provided transcription apparatus in a way that is similar to how other small dsDNA viruses depend on the cellular transcription system.
Thus, while several phenomena are shared between virophages and satellite viruses (dependency on another virus, variable hypovirulence), their larger particle size and coding potential, genetic and infection properties, and relatedness to viruses of the PRD1-adenovirus lineage clearly set virophages apart from those much simpler and genetically distinct groups of satellite viruses. Virophages can be viewed as eukaryotic dsDNA viruses that became evolutionarily adapted to giant dsDNA viruses, and that use the viral transcription enzymes in the cytoplasm instead of accessing the host nucleus for cellular transcription.

The Lavidaviridae
The family Lavidaviridae was created to reflect the unique properties of virophages, which distinguish them from other groups of viruses including satellite viruses. 'Lavida-' is an acronym for 'large virus dependent or associated' and highlights the most striking feature of these viruses, i.e. their co-dependency on giant dsDNA viruses. This family currently contains two genera, Sputnikvirus and Mavirus, which were created based on shared subsets of genes that are specific to each genus, in addition to phylogenetic analysis of genes that are conserved in all virophages.
The topologies of the resulting trees vary somewhat depending on the chosen marker gene (Fig. 12.4). The most highly conserved gene and thus arguably the best phylogenetic marker is the major capsid protein (Fig. 12.4A), but the ATPase is also conserved enough to be useful for phylogenetic studies (Fig. 12.4B). Both marker genes resolve the two genera with high branch support, and additional well-supported clusters can be observed, as discussed below. Given the rapid accumulation of virophage genomes from metagenomes, which leads to a better resolution of the phylogenetic reconstructions, additional Lavidaviridae genera will likely be added in the near future.

The genus Sputnikvirus
Sputnik, the first virophage to be discovered, was isolated from water of a cooling tower near Paris, France (La Scola et al., 2008). The same sample also contained a new mimivirus strain named mamavirus with a 1,191,693 bp dsDNA genome. Based on transmission electron microscopy, the Sputnik particles were initially reported to be 50 nm wide, in contrast to cryo-EM studies that were conducted later and reported a capsid diameter of 74 nm (Sun et al., 2010). After Sputnik virions were purified from mamavirus by filtration through 0.2 µm pore-size filters and inoculated on amoebae, no Sputnik multiplication was observed. This virus instead only replicated when the amoebae were coinfected with mamavirus. Inspection of coinfected amoebal cells by electron microscopy further revealed that Sputnik and mamavirus particles emerged from the same cytoplasmic virion factory (Fig. 12.5). Sputnik multiplied faster than mamavirus and progeny Sputnik virions appeared at distinct locations of the virion factory. In addition, it was noted that Sputnik had a negative effect on mamavirus production, as the yield of the latter was reduced by ~70% and abnormal giant virus capsids were observed in the presence of Sputnik (La Scola et al., 2008). This led to the classification of Sputnik as a parasite of mamavirus and the introduction of the term 'virophage', as a virus infecting another virus.
Figure caption (partial): ... (Pei et al., 2008), and the manually edited alignment was used to create an unrooted Bayesian phylogenetic tree using MrBayes v3.1.2 (Ronquist and Huelsenbeck, 2003) with 1 million generations and a burn-in of 1000. Branches with posterior probabilities less than 0.5 were collapsed; those with posterior probabilities higher than 0.90 are marked by black dots. Cultured virophages are printed in blue.
The Sputnik genome consists of a circular dsDNA molecule 18,343 bp in length with an A+T content of 73%, and encodes 21 predicted protein-coding sequences (CDSs). Two other Sputnik strains have been isolated and genetically characterized. Sputnik 2 was found associated with lentillevirus, a mimivirus strain isolated from lens fluid (Desnues et al., 2012b; La Scola et al., 2010), and Sputnik 3 was recovered via a reporter system from a soil sample near Marseille, France, during a screen for new virophages (Gaia et al., 2013b). The Sputnik 2 and 3 genomes are four base pairs shorter than that of Sputnik 1, which results from single base pair deletions at positions 877, 7949, 7958, and 12936. Initially, the Sputnik 1 genome displayed a further additional base at position 14047. This, however, was found to be a sequencing artifact that resulted in a frame shift artificially separating ORFs 18 and 19 (Gaia et al., 2013b; Zhang et al., 2012). The corrected Sputnik 1 sequence encodes the minor capsid protein V18/19 in a single ORF, which is not only in agreement with the gene topology in Sputnik 2 and 3, but also enables full-length sequence alignments with homologous mCP genes in other virophages. The genome of Sputnik 3 further differs from Sputnik 1 and 2 by two T to A transversions at positions 8986 and 8991 (located in the intergenic region between V12 and V13) and two A to G transitions at positions 16098 and 17666, resulting in an A464T change in the MCP V20 and a D371N change in V21, respectively. Due to their nearly identical genomes, Sputniks 1, 2 and 3 are closely related strains.
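The distinction between transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or vice versa) used above can be made explicit in a few lines. The Python sketch below is for illustration only: it classifies the four Sputnik 3 substitutions listed in the text, and the helper name `classify_substitution` is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base DNA substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref!r} -> {alt!r}")
    # A change within the same chemical class is a transition
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


# Substitutions distinguishing Sputnik 3 from Sputnik 1 and 2, as listed in the text:
# (genome position, base in Sputnik 1/2, base in Sputnik 3)
sputnik3_substitutions = [
    (8986, "T", "A"),
    (8991, "T", "A"),
    (16098, "A", "G"),
    (17666, "A", "G"),
]

for position, ref, alt in sputnik3_substitutions:
    print(position, f"{ref}>{alt}", classify_substitution(ref, alt))
```

Run on these positions, the sketch labels the two T to A changes as transversions and the two A to G changes as transitions, matching the description in the text.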
Another Sputnik isolate is the Rio Negro virophage (RNV) that was found in the Rio Negro River of the Brazilian Amazon (Campos et al., 2014). RNV is associated with a mimivirus variant called Samba virus. The RNV particles were reported to be only 35 nm in diameter, even though the partial major capsid gene sequence was 100% identical to the Sputnik MCP gene, which makes it likely that the true capsid size of RNV is comparable to that of Sputnik.
The Zamilon virophage was isolated from Tunisian soil together with its host virus, the Mont1 mimivirus (Boughalmi et al., 2013;Gaia et al., 2014). Zamilon (Arabic for 'neighbour') is a relative of the Sputnik virophages, and its circular 17,276-bp-long genome shares 76% nucleotide identity with Sputnik. The Zamilon genome encodes 20 predicted proteins, 17 of which have homologues in Sputnik. In contrast to Sputnik, Zamilon does not replicate with all members of the Mimivirus genus. Zamilon replication was only detected during coinfection with viruses of lineage B (moumouviruses) and lineage C (megaviruses), but not with lineage A members (mimiviruses) (Gaia et al., 2014). In addition, Zamilon differs from Sputnik by not inhibiting the replication of its coinfecting giant virus.

Mavirus
The second described virophage was mavirus (meaning, maverick-related virus; mavericks are large transposons found in eukaryotic genomes that are also called polintons from the presence of DNA polymerase and integrase genes; see 'Evolutionary connections of virophages' below), a parasite of Cafeteria roenbergensis virus (CroV). The exact geographic origin of mavirus is not clear. CroV was isolated from coastal waters of the Gulf of Mexico near Aransas Pass, Texas, USA, in 1989 (Garza and Suttle, 1995). The flagellate host strain, on which CroV was isolated, originated from the Pacific coast near Yaquina Bay, Oregon, USA ( Gonzalez and Suttle, 1993). After mavirus was discovered in CroV-infected flagellate cultures (Fischer and Suttle, 2011), it was assumed that mavirus had been co-isolated with CroV from the same Gulf of Mexico water sample and that it had been propagated alongside CroV until the virophage was identified in 2009. Recent findings of endogenous mavirus genomes integrated in flagellate chromosomal DNA (Fischer and Hackl, 2016), however, have opened up an alternative explanation, according to which some cells of the Oregon host strain could have harboured a mavirus provirophage that was subsequently reactivated upon contact with CroV.
The mavirus genome is a circular 19,063-bp-long molecule that is able to linearize for integration. The terminal inverted repeats (TIRs) of exogenous mavirus genomes are ≈50 bp long (Fischer and Suttle, 2011), whereas the TIRs of the endogenous mavirus genome are 615/616 bp long, resulting in a total length of 20,190 bp for the provirophage genome (Fischer and Hackl, 2016). Mavirus encodes 20 predicted proteins and has a GC content of 30%, which is similar to the properties of Sputnik and Zamilon.
The particles of mavirus are icosahedral in shape and have a diameter of 60-65 nm in thin-section transmission electron micrographs, and 70-75 nm in negative stain electron micrographs (Fig. 12.2B). The burst size of mavirus is roughly ten times higher than that of CroV, the latter as measured during a mavirus-free CroV infection of C. roenbergensis.
Ace Lake mavirus
Although known only from metagenomic sequences, the genome of Ace Lake mavirus (ALM) clearly falls within the genus Mavirus. The ALM genome was assembled from short metagenomic reads that originated from the Antarctic Ace Lake (Zhou et al., 2013). The genome sequence is not complete and may contain sequence errors due to erroneous assembly and low sequence coverage. Despite these potential problems, the ALM genome provided a first glance at sequence diversity within the mavirus subgroup of virophages. ALM and mavirus share 13 of their 22 and 20 predicted genes, respectively. The gene order is also conserved; however, an inversion affecting seven genes occurred in one of the genomes. In addition to the four highly conserved morphogenesis genes, mavirus and ALM encode homologues for the predicted pPolB, rve-INT, S3H, as well as six genes of unknown function.
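To put the gene-sharing figures above on a common scale, the shared gene counts can be converted to fractions. The short Python sketch below does this, and additionally computes a Jaccard index under the assumption (not stated in the source) that the 13 shared genes correspond one-to-one between the two genomes; the function name is illustrative.

```python
def shared_fraction(shared: int, total: int) -> float:
    """Fraction of a genome's predicted genes that have a homologue in the other genome."""
    if total <= 0 or not 0 <= shared <= total:
        raise ValueError("need 0 <= shared <= total and total > 0")
    return shared / total


SHARED_GENES = 13    # genes shared by ALM and mavirus (from the text)
ALM_GENES = 22       # predicted genes in ALM (from the text)
MAVIRUS_GENES = 20   # predicted genes in mavirus (from the text)

alm_fraction = shared_fraction(SHARED_GENES, ALM_GENES)
mavirus_fraction = shared_fraction(SHARED_GENES, MAVIRUS_GENES)

# Jaccard index, ASSUMING a one-to-one mapping between shared homologues
jaccard = SHARED_GENES / (ALM_GENES + MAVIRUS_GENES - SHARED_GENES)

print(f"ALM: {alm_fraction:.1%}, mavirus: {mavirus_fraction:.1%}, Jaccard: {jaccard:.2f}")
```

Under these assumptions, roughly three fifths of ALM genes and about two thirds of mavirus genes have a counterpart in the other genome.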
ALM ORF14 probably starts at the second ATG start codon, as suggested by alignments with its homologue mavirus ORF12 and the presence of a putative promoter motif directly upstream of the second ATG site, whereas no such promoter motif is present in the immediate upstream region of the first start codon. Like mavirus, ALM encodes an rve-INT. The ALM homologue is annotated in two adjacent reading frames and at present it is not known whether this gene is split into two separate ORFs, or whether an artificial frameshift was introduced during assembly of the metagenomic reads.

Additional virophages from metagenomes
Virophages appear to be distributed world-wide and several groups of virophage-like genomes have been discovered in metagenomes in recent years (Figs. 12.1, 12.3 and 12.6). This was made possible by seeding BLAST searches with genes from the fully sequenced genomes of the few virophage isolates that are currently available. Interestingly, whereas Sputnik originated from a freshwater cooling tower, Zamilon from soil, and mavirus from an oceanic environment, most uncultured virophages were found in lakes, suggesting that lake ecosystems provide conditions that allow certain virophage populations to thrive.
The Organic Lake virophage
The first virophage genome assembled from a metagenome was reported in 2011 (Yau et al., 2011). A genomic study of the microbial community in East Antarctica's hypersaline Organic Lake revealed the presence of a virophage-like genome. The Organic Lake virophage (OLV) is probably associated with NCLDVs of the extended Mimiviridae family that infect a photosynthetic host. Its genome was assembled into a 26,421 bp long sequence, encoding 24 ORFs. The phylogenetic position of OLV depends on the gene used (Fig. 12.4), but based on MCP phylogeny, there are currently no known close relatives of OLV.
The Yellowstone Lake virophages and Ace Lake mavirus
In 2013, Zhou et al. reported the assembly of four virophage genomes (YSLV1-4) from a metagenome of Yellowstone Lake water samples (Zhou et al., 2013). These findings were expanded by the same group of researchers with the publication of three additional Yellowstone Lake virophage sequences (YSLV5-7) 2 years later (Zhou et al., 2015). The YSLV genomes are 23-30 kb long and appear to belong to different subgroups, as shown by phylogenetic reconstruction (Fig. 12.4). In particular, YSLV5 and YSLV7 do not cluster with other YSLVs. In their 2013 publication, the authors also described the genome of Ace Lake mavirus (ALM), assembled from the saline meromictic Ace Lake in the Vestfold Hills of Antarctica (Zhou et al., 2013). Interestingly, this lake is in close geographic proximity to Organic Lake, where OLV was detected (Yau et al., 2011).
The Dishui Lake and Qinghai Lake virophages
In 2016, virophage genomes were reconstructed from metagenomic data of two Asian freshwater lakes, one in Shanghai and one in Tibet. The Dishui Lake virophages (DSLVs) occur in an artificial freshwater lake in Shanghai, China, and are closely related to YSLV 3 (Fig. 12.4) (Gong et al., 2016). They are presumed to replicate in combination with algae-infecting giant viruses. Another population of virophages was found in a surface water metagenome of Lake Qinghai, in the mountains of Tibet (Oh et al., 2016). The Qinghai Lake virophages (QLVs) are most closely related to YSLVs 1 + 4 and could infect phototrophic protists.
Virophages from a sheep rumen metagenome
A novel lineage of virophages was identified in the metagenome of sheep rumen (Yutin et al., 2015a). The longest of these genomes was 26,209 bp and encoded 22 ORFs. The rumen virophage (RVP) genomes appear to be a hybrid between bona fide virophages (with genes for MCP, ATPase, PRO) and polintons (mavericks, encoding a protein-primed DNA polymerase B). Such a scenario has also been proposed for the origin of mavirus; however, RVPs and mavirus are phylogenetically distinct based on pPolB and MCP analyses and RVPs branch separately from all other virophages (Yutin et al., 2015a) (Fig. 12.4). So far, no minor capsid protein has been found in RVP genomes, suggesting that their capsid architecture differs from that of Sputnik and mavirus.
Other virophage sightings
A metagenomic study of viromes from cryoconite holes in Greenland yielded a partial virophage genome (Bellas et al., 2015). Cryoconite ('ice dust') holes are cylindrical depressions at the surface of glaciers which are filled with dark coloured sediments. The cryoconite virophage (CryV) partial genome is 12,572 bp long and includes the four morphogenesis genes (MCP, mCP, PRO, ATPase). Virophage genome fragments were also identified at the opposite pole, from surface soil of the dry Miers Valley in Eastern Antarctica (Zablocki et al., 2014). Gene fragments of this Miers Valley soil virophage (MVSV) suggested that viruses of the Sputnik clade exist in Antarctica, as well as mimiviruses and phycodnaviruses, whose genomic signatures were also found in the same study. Antarctica therefore harbours a diverse virophage community in its aquatic and terrestrial environments, as evidenced by OLV, ALM, and MVSV. Not surprisingly, genetic signatures of giant viruses are also present in Antarctica (Kerepesi and Grolmusz, 2017; Yau et al., 2011).
The endogenous virophage-like elements found in the nuclear genome of Bigelowiella natans (Blanc et al., 2015) are discussed later in this review. A recent study identified 25 uncultivated virophage populations using time-series metagenomes from two North-American freshwater lakes (Roux et al., 2017). This report will help refine the taxonomy of virophages and confirms that most of their genetic diversity remains to be described.

The virophage-like element PgVV and the polinton-like viruses
The large DNA virus PgV-16T belongs to the extended Mimiviridae family and infects the bloom-forming microalga Phaeocystis globosa (Santini et al., 2013). During assembly of the 459,984-bp-long, linear dsDNA PgV-16T genome, an additional contig was found. This 19,527-bp-long linear dsDNA molecule with 1 kb terminal inverted repeats contained 16 predicted CDSs that are all located on the same strand. Based on the similarity of three of these CDSs to genes in OLV and mavirus, this genetic element was termed PgV-associated virophage (PgVV). However, no 50-80 nm sized capsids were found and the authors proposed that PgVV replicates either as a linear plasmid or as a provirophage that is integrated in the PgV-16T genome (Santini et al., 2013). Although the PgVV genome was initially reported to lack any recognizable capsid genes, ORF PgVV_00012 likely encodes a distant version of a double-jelly roll major capsid protein and ORF PgVV_00010 could encode a minor capsid protein (Krupovic et al., 2014). At the time of discovery, PgVV was most closely related to mavirus and OLV. Based on the identification of polinton-like viruses (PLVs) in metagenomic datasets using the PgVV-predicted MCP as bait, however, PgVV is considered to be a PLV rather than a virophage (Yutin et al., 2015b). The distinction is justified by the lack of a cysteine protease that is conserved in virophages, as well as by the distinct versions of MCP, mCP, and ATPase genes found in PgVV and PLVs (Krupovic et al., 2014; Yutin et al., 2015b).

Evolutionary connections of virophages
Virophages may have evolved from an ancestral virus of the PRD1-adenovirus lineage by multiple recombination and gene replacement events (Krupovic and Koonin, 2015). Genome analysis of the first virophage, Sputnik, suggested mixed origins for its genes, which exhibited links to viruses infecting archaea, bacteria, and eukaryotes (La Scola et al., 2008). Surprisingly, seven of the 20 genes from the second described virophage, mavirus, had their closest homologues in one particular group of genetic elements, the maverick/polinton elements (MPEs) (Fischer and Suttle, 2011). The MPEs were originally described as a class of DNA transposons widespread in eukaryotic genomes with highly variable copy numbers (Jordan et al., 2004; Kapitonov and Jurka, 2006; Pritham et al., 2007). They are 15-20 kb long and contain conserved rve-INT integrase and pPolB DNA polymerase genes in addition to other virus-affiliated genes such as an FtsK/HerA-like genome packaging ATPase and an adenovirus-like protease. Recently, it was shown that MPEs also code for putative mCP and MCP genes (Krupovic et al., 2014), which resulted in a shift of perception, and capsid-encoding MPEs are now regarded as endogenous viruses, called polintoviruses (Krupovic and Koonin, 2015), rather than as transposons. Roughly one-third of the genome of the human parasite Trichomonas vaginalis consists of MPEs, indicating that at least in some eukaryotic lineages these elements are able to spread intragenomically.
MPEs and virophages of the genus Mavirus share seven homologous proteins: S3H, rve-INT, pPolB, FtsK/HerA-type ATPase, PRO, mCP and MCP. In addition, virophages and MPEs have similar genome length (~20 kb), overlapping host range (mainly protists, although MPEs are also found in animals), and the mavirus genome exhibits terminal inverted repeats similar to those found in MPEs. The genes shared by mavirus and MPEs have thus evolved from a common ancestor; however, the nature of this ancestor is unclear. One hypothesis states that ancestral mavirus-like virophages integrated into a eukaryotic host genome, where they conferred resistance to the host against infection by giant viruses and were positively selected for in the host cell population (Fischer and Suttle, 2011). An alternative hypothesis, which is mainly based on tree topologies in phylogenetic reconstructions of conserved MPE/virophage proteins, suggests that virophages evolved from ancestral MPEs through multiple gene transfers and rearrangements and that they subsequently became dependent on giant DNA viruses (Krupovic and Koonin, 2015;Yutin et al., 2013).
Recent data mining approaches in metagenomic sequences have revealed several additional groups of viral elements with ties to virophages and MPEs. The RVP virophages found in a sheep rumen metagenome share with MPEs the pPolB, ATPase, PRO and MCP genes, but lack the rve-INT and mCP. In phylogenetic trees based on multiple sequence alignments of MCP, the RVP virophages cluster as a monophyletic group, distinct from Sputnik-, mavirus-, and OLV-like virophages (Yutin et al., 2015a) (Fig. 12.4). Another group of related viruses discovered in metagenomes are the polinton-like viruses (PLVs) (Yutin et al., 2015b), which share a highly diverged version of the double-jelly roll MCP with PgVV. The PLVs are 18-28 kb in length, encode ATPase, mCP, and MCP (but no PRO) and some of them contain long TIRs. Notably, PLVs were also found in the genomes of photosynthetic protists, which suggests, together with the endogenous B. natans virophages, that virophage integration into eukaryotic genomes may be a common phenomenon (Blanc et al., 2015; Fischer, 2015; Yutin et al., 2015b). For a more detailed overview of the evolutionary connections between virophages, MPEs, PLVs, and a class of mimivirus-associated parasitic linear DNA elements called transpovirons, see Koonin and Krupovic (2017) and Krupovic and Koonin (2016).

The virophage infection cycle
Owing to the paucity of culture-based virophage systems, many details of their replication cycle remain unknown. Studies on the Acanthamoeba-mimivirus-Sputnik and Cafeteria-CroV-mavirus systems, however, have given us a vague idea about the intracellular events that occur during virophage infection (Fig. 12.7). The most striking feature about these viruses is that, despite their respectable size and coding potential, they cannot reproduce inside a host cell without a suitable coinfecting giant virus. Hence, the replication cycle of virophages is tightly linked to that of their host viruses.

Virion entry
Virions enter the host cell either by endocytosis, or as a composite with their host virus by phagocytosis. An example of the former entry mode is mavirus, which attaches to the host cell surface via an unknown receptor (Fig. 12.8B). The membrane at the attachment site then starts to invaginate and the virus is endocytosed (Fig. 12.8C). Inside infected host cells, mavirus particles can be observed by electron microscopy within coated (Fig. 12.8A and D) as well as uncoated vesicles (Fig. 12.8E), which implies that virions are internalized by clathrin-mediated endocytosis (Fischer and Suttle, 2011).
In contrast, Sputnik has never been observed to enter the amoeba independently of a coinfecting virus of the Mimiviridae family. Sputnik is able to adhere to the 1250 Å long, heavily glycosylated surface fibres that cover the mimivirus capsid (Piacente et al., 2012; Xiao et al., 2009). This interaction may be mediated by the ~100 Å long mushroom-like surface fibres that decorate the Sputnik virion (Sun et al., 2010). The virophage-giant virus composite is then phagocytosed by the amoeba and Sputnik particles entangled within mimivirus surface fibres are occasionally observed by electron microscopy. In particular, the mimivirus R135 gene product, a glycosylated fibre protein related to GMC-type oxidoreductases that elicits an antigenic response (Klose et al., 2015; Pelletier et al., 2009), appears to have a high affinity for Sputnik particles, as this protein was identified by mass spectrometry in a purified Sputnik preparation (La Scola et al., 2008). Additional support for this hypothesis stems from the analysis of a mimivirus deletion mutant, which lacks the surface fibres due to the deletion of 155 genes. This bald form of mimivirus, termed M4, no longer supports Sputnik replication (Boyer et al., 2011). The R135 gene is among the genes that are deleted in mimivirus M4.
The composite entry model of Sputnik and mimivirus via phagocytosis creates the need for a mechanism that would allow Sputnik particles to escape from the phagosome. Uncoating of mimivirus particles in the phagosome is initiated by opening of the stargate portal, a unique structural feature of mimiviruses (Reteno et al., 2018). The open stargate exposes the inner viral membrane, which subsequently fuses with the phagosomal membrane, thereby creating a gateway to the cytoplasm for the viral core (Zauberman et al., 2008). This uncoating mechanism, however, grants cytoplasmic access only to molecules that are located inside the mimivirus membrane. Any other particles inside the phagosome, such as Sputnik capsids, would have to use an alternative escape route. For this reason, a mimivirus-independent entry pathway of Sputnik cannot be excluded, and the exact mechanism of viral genome delivery from the particle into the host cytoplasm remains to be determined. Cryo-EM studies of empty Sputnik virions suggested that low pH conditions (pH ~5.5) can trigger the dissociation of some of the penton structures, which would create a portal through which the viral DNA could exit the virion (Zhang et al., 2012). Capsid dismantling of Sputnik may thus resemble that of adenoviruses, which are known to lose penton bases in acidified endosomes during cell entry (Greber et al., 1993). The uncoating process of mavirus, by contrast, is not known.
As suggested by mathematical modelling, the two different entry modes may have consequences for long-term population dynamics of viruses and hosts (Taylor et al., 2014).

Virus factories
Following entry, the virophage genome is targeted to the developing virion factory of the coinfecting giant virus. After an eclipse phase, which lasts 2-4 h for Sputnik, newly synthesized virophage particles are visible within the virion factory (Figs. 12.5 and 12.9). Sputnik particle production usually begins at one pole of the mimivirus factory (La Scola et al., 2008). The phenotype of mavirus-infected CroV factories varies, depending on parameters such as cell line, multiplicity of infection (MOI), and whether a coinfection with external mavirus particles or a reactivation of endogenous mavirus takes place. Occasionally, factories producing both CroV and mavirus particles can be observed (Fig. 12.9A). More frequently, however, and especially during coinfection with high MOIs of mavirus, the CroV factories will develop to occupy a large part of the cytoplasm, but no CroV particle production occurs. Instead, nested electron-dense subfactories synthesizing mavirus virions can be seen (Fig. 12.9B).

Coinfection dependence
Sputnik, Zamilon, mavirus, and presumably other virophages to be isolated in the future cannot replicate in their host cells without coinfection of their respective host viruses. At first glance, the gene content of virophages does not offer any obvious explanation for such a dependency. The conserved morphogenetic gene module consists of MCP, mCP, PRO and ATPase, and virophages are thus presumably able to synthesize their own capsids. Virophage genomes also code for DNA replication proteins, which should enable them to copy their genetic information. Therefore, virophages appear to be self-sufficient for DNA replication and particle assembly.
Several observations, in contrast, suggest that transcription is the process for which virophages depend on their coinfecting host viruses. First, the computational analysis of the mavirus genome revealed that all 20 genes contain a conserved sequence motif in their immediate 5′ upstream region (Fischer and Suttle, 2011). This motif consists of a 'TCTA' core which is flanked by AT-rich sequences and is located approximately 14 nt upstream of the start codon. A highly similar conserved sequence motif had been found earlier at the same upstream position of those CroV genes that are expressed late during infection (Fischer et al., 2010). Second, quantitative real-time PCR analysis of cDNA generated from RNA samples isolated at different time points post infection revealed that mavirus genes become active at the beginning of late phase of CroV infection (≈4 h p.i.), whereas no mavirus gene activity was detected before the onset of late phase (K. Fenzl and M. Fischer, unpublished data). For Sputnik, a less obvious promoter motif was identified, which also resembled a putative late promoter motif of mimivirus (Legendre et al., 2010). Third, Sputnik and mimivirus share similar transcription termination signals. In mimivirus, the location of polyadenylation sites appears to be defined by palindromic sequences, leading to the proposition of a 'hairpin rule' for mimivirus polyadenylation (Byrne et al., 2009). In the Sputnik genome, 16 such palindromic sequences were identified, 14 of which were located in intergenic regions, which makes them candidates for polyadenylation signals (Claverie and Abergel, 2009a). In summary, these findings strongly suggest that virophages use the giant virus-encoded transcription machinery, and in particular the late phase-specific transcription factors, to express their genes.
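The kind of upstream motif search described above can be illustrated with a minimal Python sketch. It is not the method used in the cited studies: the function names, the flank width of 6 nt, the AT-content threshold of 0.8, and the convention of measuring distance from the first base of the core to the start codon are all assumptions made for illustration, and the sequence is invented toy data rather than a real mavirus upstream region.

```python
def at_fraction(seq):
    """Fraction of A/T bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def find_late_promoter_candidates(upstream, core="TCTA", flank=6, min_at=0.8):
    """Scan a 5' upstream region for `core` flanked by AT-rich sequence.

    `upstream` is the DNA immediately preceding a start codon, written
    5'->3', so its last base is position -1 relative to the start codon.
    Returns a list of (distance_to_start_codon, left_flank, right_flank),
    where the distance is counted from the first base of the core.
    All thresholds are illustrative assumptions.
    """
    upstream = upstream.upper()
    hits = []
    start = upstream.find(core)
    while start != -1:
        left = upstream[max(0, start - flank):start]
        right = upstream[start + len(core):start + len(core) + flank]
        if at_fraction(left) >= min_at and at_fraction(right) >= min_at:
            hits.append((len(upstream) - start, left, right))
        # Continue searching after this position (allows overlapping hits).
        start = upstream.find(core, start + 1)
    return hits


# Toy upstream region (invented; not a real viral sequence).
toy_upstream = "GCGCGCAATTAATCTAATATTAGGCC"
candidates = find_late_promoter_candidates(toy_upstream)
```

In this toy example the single 'TCTA' core sits 14 nt upstream of the hypothetical start codon and both flanks consist entirely of A and T, so it is reported as a candidate. A 'TCTA' embedded in GC-rich sequence would be rejected by the flank filter.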

DNA replication
There are currently no data to indicate that the DNA replication proteins encoded by giant viruses may play a role in replicating virophage genomes.
Members of the Mavirus genus encode a predicted protein-primed B-family DNA polymerase, whereas a distinct version of bacterial DNA polymerase I (TVPol) with a predicted primase activity is found in the Sputnik and OLV clades (Iyer et al., 2008; Yutin et al., 2013). Protein-primed family B DNA polymerases are encoded by various parasitic elements such as adenoviruses, phi29-like viruses, tectiviruses, some archaeal viruses, and mavericks/polintons (Redrejo-Rodríguez and Salas, 2014). These polymerases use a terminal protein that is covalently attached to the 5′ ends of the genome as a primer. Due to low sequence conservation, however, a terminal protein for mavirus has not been identified yet. In addition, protein-primed replication requires a linear DNA template and the mavirus genome has circular topology; thus, the exact mechanism of DNA replication in viruses of the Mavirus genus remains to be explored.

The protist−virus−virophage triangle
Host range
Lavidaviruses require two distinct hosts for successful replication, a susceptible host cell and a permissive host virus in the form of the cytoplasmic virion factory. Each of the two hosts must fulfil certain requirements for virophage replication to occur. The host cell must be susceptible to the virophage, i.e. the virophage must be able to enter the cell. In addition, the virophage relies on the host cell for energy, metabolites, protein translation, and other essential systems. As a second host component, the virophage requires the presence of a permissive coinfecting giant virus, which provides the components (presumably transcription proteins) that are necessary for the virophage to complete its replication cycle. Currently, the only known cellular hosts for virophages are phagotrophic protists. With only two such virophage-virus-host systems in culture (Acanthamoeba sp. and Cafeteria sp.), however, the information on virophage host cell range is extremely limited. Most of the environmental virophage genomes that have been assembled from metagenomic sequences, or host-integrated virophage genomes (Blanc et al., 2015; Gong et al., 2016; Oh et al., 2016; Yau et al., 2011; Zhou et al., 2013), are assumed to be associated with giant viruses of the extended Mimiviridae family (Fischer, 2016), which mostly infect phototrophic protists. We can predict that it will only be a matter of time until the first algal virophages are isolated and grown in laboratory cultures.
Cellular host range of virophages will mainly be determined by the ability of the virophage to enter the cell. Members of the genus Sputnikvirus are proposed to enter the amoebal host by attachment to the external fibres of mimiviruses and via subsequent uptake of the virus-virophage composite by phagocytosis. Since this entry mode presumably does not depend on virophage-host cell contact, cellular host range should be determined solely by the giant virus host requirements, and the host cell might not be able to become resistant to the virophage while remaining susceptible to the giant virus. In contrast, members of the genus Mavirus enter their flagellate host cells by receptor-mediated endocytosis, a process that requires specific virus-host interactions at the cell surface. Here, it is conceivable that a cellular host becomes resistant to the virophage by mutations in the cell surface receptor, while remaining susceptible to the giant virus. Evolutionarily, this would probably result in a selective disadvantage for the host cell, since mavirus can strongly inhibit CroV replication and increase the survival of CroV-infected flagellate populations (Fischer and Hackl, 2016).
The type of entry of virophages may reflect the host specificity of their associated giant viruses. Mimiviruses are assumed to have a broad host range and can replicate in various amoebae (Reteno et al., 2018). Although mimiviruses replicate well in Acanthamoeba sp., their main natural host is unknown. Hence, mimivirus-associated virophages may be maladapted to Acanthamoeba, which might explain why they are not capable of independent cell entry. Mavirus, on the other hand, is able to trigger specific uptake by C. roenbergensis cells via clathrin-mediated endocytosis. Its viral host CroV infects members of the genus Cafeteria, as well as closely related genera such as Picophagus (M. Fischer, unpublished data). Unlike mimiviruses, CroV appears to enter host cells not by phagocytosis, but by docking to the cytoplasmic membrane and releasing only the viral core into the cytoplasm (U. Mersdorf and M. Fischer, unpublished results). This suggests that marine heterotrophic nanoflagellates such as Cafeteria are the natural hosts for CroV and mavirus.
In general, virophage replication depends on the availability of cytoplasmic transcription. Therefore, the giant virus specificity of lavidaviruses is likely determined by the ability of the giant virus-encoded transcription machinery to synthesize virophage transcripts. Based on the high similarity of late gene promoters as well as transcription termination motifs between giant viruses and their virophages, it is predicted that this interaction largely depends on the recognition of virophage gene promoters by a giant virus-encoded transcription factor. The Mimiviridae is not the only family of large dsDNA viruses that encode their own transcription system, and it is possible that virophage-like parasites also exist for other members of the proposed viral order Megavirales, such as pithoviruses and poxviruses.
Virophages are unlikely to replicate in human cells or cause disease (Desnues et al., 2012b;Parola et al., 2012). The case of Sputnik 2, however, shows that virophages may be part of the human virome.
Sputnik 2 was isolated together with lentillevirus from the contact lens of a keratitis patient, whose symptoms were probably caused by Acanthamoeba cells that also happened to harbour the giant virus and the virophage (Desnues et al., 2012b).

Virophage-giant virus interactions
Virophages that encode integrases are able to physically link their genomes to those of other organisms. The tyrosine recombinase found in Sputnik virophages may catalyse the occasional integration into mimivirus genomes. Analysis of paired-end Illumina reads of lentillevirus DNA revealed that Sputnik 2 may also exist as a provirophage in the genome of its host virus (Desnues et al., 2012b). Similarly, the PgVV genome likely integrates at multiple sites within its associated Phaeocystis globosa virus 16T (Santini et al., 2013). Persistence as a provirophage in a giant virus genome has clear advantages for the smaller virus genome, as it will automatically coinfect and colocalize with its supporting giant virus.
Even though virophages are known to inhibit the production of giant virus particles during a coinfection, their parasitic effect on the host virus appears to vary considerably. Whereas mavirus is a potent inhibitor of CroV and coinfected cells often produce no CroV virions at all, the negative impact of Sputnik on mimivirus is less pronounced and viral factories producing both types of capsids are frequently observed. The Zamilon virophage, on the other hand, has no discernible negative impact on the replication of a coinfecting mimivirus. The reason for this variation is unknown, but the degree of virus-against-virus pathogenicity most certainly affects the cellular and viral host dynamics, and the resulting ecological consequences will differ for different virophages. As is to be expected for a host-parasite system, the host can become resistant to the parasite, or acquire means to control the parasite. Mimivirus has thus been shown to become resistant to Sputnik through large-scale deletions at the genome termini, resulting in a 'bald' capsid phenotype where external fibres are no longer present (Boyer et al., 2011). This may prevent Sputnik from entering the amoeba via a composite mechanism, although it cannot be excluded that other genes essential for virophage replication are located in the deleted regions.
A general defence mechanism against virophages was recently proposed in mimivirus (Levasseur et al., 2016). Of the three mimivirus lineages (A: mimiviruses, B: moumouviruses, C: megaviruses), Zamilon can only replicate with lineage B and C strains, but not with lineage A mimiviruses. Genome analysis of lineage A viruses revealed an open reading frame (R349) with four 15-nt long repeats of a Zamilon DNA sequence. This so-called 'mimivirus virophage resistance element' (MIMIVIRE) was proposed to represent a CRISPR-Cas-like adaptive immune system against virophages. Silencing of the R349 gene allowed Zamilon to replicate in lineage A mimiviruses. Two additional genes located close to the R349 gene, the helicase R350 and the nuclease R354, were also proposed to be part of this defence system, because silencing either of these two genes increased Zamilon replication in lineage A mimiviruses. Criticism has been voiced, however, over the alleged similarity to CRISPR-Cas systems, owing, amongst other things, to the lack of regularly spaced or flanking repeats that are typical for CRISPR-Cas (Claverie and Abergel, 2016; Mohanraju et al., 2016). Instead, Claverie and Abergel proposed a protein-protein interaction scenario, according to which the repeated Zamilon sequences in the R349 protein would compete for binding partners with the Zamilon ORF 4 protein, in which the repeated sequences are present (Claverie and Abergel, 2016). The R350 and R354 proteins would not be directly involved in virophage restriction, but would be required for general replication or transcription processes, and their silencing would thus indirectly benefit the replication of Zamilon. Clearly, the mechanism of the observed resistance to Zamilon in lineage A mimiviruses remains to be elucidated. Regardless of the specific workings of such a defence system, it becomes evident that host-parasite arms races are not restricted to cellular hosts and are also being waged between giant viruses and their viral parasites.
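To make the notion of repeated virophage-derived sequence within a giant virus ORF more concrete, the following Python sketch counts virophage 15-mers that occur in multiple copies in a second sequence. It is only an illustration under stated assumptions: the function name `shared_kmer_repeats`, the exact-match criterion, and the restriction to a single strand are choices made here, not the procedure of Levasseur et al. (2016), and both sequences are invented toy data rather than Zamilon or R349 sequence.

```python
from collections import Counter


def shared_kmer_repeats(virophage_seq, host_orf, k=15, min_copies=2):
    """Return {kmer: copies} for virophage k-mers repeated in host_orf.

    Only exact matches on the given strand are counted; overlapping
    occurrences are allowed. Both choices are simplifying assumptions.
    """
    virophage_seq = virophage_seq.upper()
    host_orf = host_orf.upper()
    virophage_kmers = {
        virophage_seq[i:i + k] for i in range(len(virophage_seq) - k + 1)
    }
    host_counts = Counter(
        host_orf[i:i + k] for i in range(len(host_orf) - k + 1)
    )
    return {
        kmer: copies
        for kmer, copies in host_counts.items()
        if copies >= min_copies and kmer in virophage_kmers
    }


# Invented toy data: one 15-nt unit present once in the "virophage"
# and four times, separated by spacers, in the "giant virus ORF".
repeat_unit = "ACGTTGCAAGGCTTA"
toy_virophage = "GGGG" + repeat_unit + "CCCC"
toy_orf = "TATATA".join([repeat_unit] * 4)
repeats = shared_kmer_repeats(toy_virophage, toy_orf)
```

Here `repeats` contains the single 15-nt unit with a copy number of four. A real analysis would also have to consider the reverse complement, inexact matches, and the statistical significance of short shared words, none of which is addressed by this sketch.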

Virophage-host cell interactions
From the beginning, virophages were hypothesized to integrate into host genomes. This was based on the presence of integrases in Sputnik and mavirus, and the genetic relatedness between mavirus and the endogenous eukaryotic maverick/polinton elements (Fischer and Suttle, 2011; La Scola et al., 2008). Mavirus/ALM and Sputnik/Zamilon encode two different types of integrases. Whereas members of the genus Sputnikvirus have a bacteriophage lambda-like tyrosine recombinase (Gaia et al., 2014; La Scola et al., 2008), the mavirus-like virophages encode a retrovirus RVE-type integrase (Fischer and Suttle, 2011; Zhou et al., 2013). The benefits of genome integration for virophages are apparent: this process physically ties the virophage to one of its two host components and increases its chances to encounter conditions favourable for replication. At the same time, integrated viral DNA is maintained and replicated by the host cell, whereas free virions are exposed to various environmental parameters that may destroy infectivity, such as ultraviolet irradiation, pH, salinity, etc.
A targeted computational search for virophages in published eukaryotic genomes revealed a group of closely related viral elements in the nuclear genome of the chlorarachniophyte alga Bigelowiella natans (Blanc et al., 2015). The authors of this study identified 38 virophage-like elements with high similarity to each other (average 91% nucleotide similarity), ranging in size from 0.1 kb to 33.3 kb. Based on phylogenetic reconstruction of MCP and ATPase, the B. natans virophages are most closely related to YSLV 5 (Fig. 12.4). Interestingly, many of these virophage-like genes were found to be transcribed, including conserved genes such as MCP, mCP, and ATPase. The presence of integrated genomic fragments from putative viruses of the NCLDV clade suggests that B. natans is host to both giant DNA viruses and virophages (Blanc et al., 2015). It remains unclear, however, how old these endogenous virophage-like elements are and how frequently they integrate into host cell genomes.
An experimental study in the Cafeteria-CroV-mavirus system was able to shed first light on these questions by showing that mavirus integrates very efficiently into the nuclear genome of C. roenbergensis during a coinfection with CroV (Fischer and Hackl, 2016). The authors sequenced and assembled the nuclear genome of a host strain before and after coinfection, which allowed them to determine the exact integration sites of mavirus. At least 11 distinct mavirus integrations were found within a single host genome and the integration sites displayed no obvious consensus motif, suggesting that integration is not sequence-specific. Host and virus population data from this study showed that nearly every CroV-infected host cell is destined to lyse. The high percentage of de novo provirophages can be explained by the ability of mavirus to integrate into the host genome in the absence of CroV (M. Fischer, unpublished data). Unlike the B. natans virophages, the endogenous mavirus genes were found to be transcriptionally silent. Infection of a mavirus-bearing cell line with CroV, however, resulted in gene expression, DNA replication, and particle formation of the endogenous mavirus genomes.
The presence of the endogenous virophage did not benefit the CroV-infected host cell directly, nor was CroV replication inhibited as a result of mavirus reactivation. During subsequent rounds of infection, however, the reactivated mavirus particles inhibited CroV replication during regular coinfections, which led to a significantly reduced host cell mortality rate, depending on the MOI of mavirus and CroV (Fischer and Hackl, 2016). Thus, endogenous virophages can in principle protect host populations from giant virus infection, as proposed earlier (Fischer and Suttle, 2011). A crucial condition to ensure host survival, however, is that the spread of the lytic virus can be stopped before all cells are infected. The initially infected provirophage-bearing cells would be sacrificed in an altruistic manner to release virophage particles and eventually protect the population. Not all virophages may act in a way similar to mavirus though, and the sporadic distribution of integrases in virophage genomes, as well as the moderate effect of Zamilon on mimiviruses, suggests that not all virophages are beneficial to their cellular hosts. This should not, however, detract from the possible ecological importance of virophages on their viral and cellular hosts, which was proposed soon after the discovery of virophages (Fischer and Suttle, 2011;Yau et al., 2011).
Several studies have explored host-virus-virophage dynamics in silico using mathematical models. Yau et al. (2011) presented a Lotka-Volterra simulation with OLV as a predator of a lytic giant virus which in turn preys on an alga. In this model, the presence of the virophage reduced the recovery time of the host population after giant virus lysis. Algal blooms were predicted to occur more frequently, with an overall increase in secondary production due to the action of the virophage. Wodarz explored the evolutionary dynamics of a virophage-containing tripartite system by modelling cells infected with one, both, or none of the two viruses (Wodarz, 2013). This model suggested that the virophage evolves towards higher levels of pathogenicity on the giant virus, which may lead to instability and eventual extinction of giant virus and virophage. Thus, a refined model is needed to allow for persistence of the virophage, which is apparently the case for naturally occurring virophages and their hosts. Taylor et al. (2014) constructed a model in which they paid particular attention to different entry modes of virophages, i.e. entry either as a composite with the giant virus (Sputnik) or independently of the giant virus (mavirus). The authors found that both entry modes allow for stable coexistence of such a tripartite system, albeit with slight differences. Also, entry mode did not influence the beneficial effect of virophages on host populations as long as the virophage was pathogenic for the giant virus. Clearly, more experimental data are needed to validate these theoretical predictions and explore the ecological role of virophages.
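The general structure of such a three-level predator-prey model can be sketched in a few lines of Python. The code below is a generic Lotka-Volterra food chain (host, giant virus, virophage) with logistic host growth, integrated by a fixed-step fourth-order Runge-Kutta scheme. It is not the model of Yau et al. (2011), Wodarz (2013), or Taylor et al. (2014): the equations, all parameter names and values, and the initial conditions are arbitrary assumptions chosen only to show how the pieces fit together.

```python
def derivatives(state, p):
    """Right-hand side of a generic host / giant virus / virophage chain.

    state = (h, g, v): host, giant virus and virophage abundances.
    p holds illustrative rate constants (all assumed, none from data).
    """
    h, g, v = state
    dh = p["r"] * h * (1 - h / p["K"]) - p["a"] * h * g   # growth minus lysis
    dg = p["b"] * p["a"] * h * g - p["m"] * g - p["c"] * g * v
    dv = p["e"] * p["c"] * g * v - p["d"] * v
    return (dh, dg, dv)


def rk4_step(state, p, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2), p)
    k3 = derivatives(shifted(k2, dt / 2), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial, p, dt=0.01, steps=5000):
    """Return the list of states, including the initial one."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], p, dt))
    return trajectory


# Arbitrary illustrative parameters (not fitted to any system).
PARAMS = {"r": 1.0, "K": 10.0, "a": 0.5, "b": 0.5,
          "m": 0.3, "c": 0.5, "e": 0.5, "d": 0.2}

with_virophage = simulate((5.0, 1.0, 1.0), PARAMS)
without_virophage = simulate((5.0, 1.0, 0.0), PARAMS)
```

Comparing the host column of `with_virophage` and `without_virophage` is one way to explore how an added top-level parasite alters host trajectories in such a model. Any quantitative outcome depends entirely on the assumed parameters and should not be read as a prediction for a real system.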

Conclusion
Viruses are often perceived as the ultimate parasites, encoding just the bare essentials for successfully infecting and manipulating a host cell. The discovery of giant DNA viruses with genome lengths upwards of 1 million base pairs contradicted this paradigm. Increasing complexity allows these viruses to become independent from the host cell in many biochemical pathways, such as DNA replication and repair, transcription, or glycosylation. On the other hand, building enzymatically rich replication compartments also bears the risk of becoming a target for other parasites. Virophages of the family Lavidaviridae have evolved to utilize the giant virus-encoded transcription machinery for their gene expression, instead of locating to the nucleus to use the DNA-dependent RNA polymerase of the host. Until recently, virophages have escaped human notice because they are primarily associated with unicellular eukaryotic hosts and cause no disease in plants or animals. Nevertheless, these viruses are probably very ancient and can be found all over the globe in diverse marine and freshwater environments, where they replicate within various heterotrophic and presumably also phototrophic protists, and are associated with different types of giant DNA viruses. Many aspects of their infection biology, ecology and evolution remain to be studied, but from the very beginning of their discovery, these fascinating 'viruses of viruses' have captured the interest of scientists and the general public alike.