Functional Evolution of Bacterial Histone-Like HU Proteins

Bacterial histone-like HU proteins are critical to maintenance of the nucleoid structure. In addition, they participate in all DNA-dependent functions, including replication, repair, recombination and gene regulation. In these capacities, their function is typically architectural, inducing a specific DNA topology that promotes assembly of higher-order nucleo-protein structures. Although HU proteins are highly conserved, individual homologs have been shown to exhibit a wide range of different DNA binding specificities and affinities. The existence of such distinct specificities indicates functional evolution and predicts distinct in vivo roles. Emerging evidence suggests that HU proteins discriminate between DNA target sites based on intrinsic flexure, and that two primary features of protein binding contribute to target site selection: The extent to which protein-mediated DNA kinks are stabilized and a network of surface salt-bridges that modulate interaction between DNA flanking the kinks and the body of the protein. These features confer target site selection for a specific HU homolog, they suggest the ability of HU to induce different DNA structural deformations depending on substrate, and they explain the distinct binding properties characteristic of HU homologs. Further divergence is evidenced by the existence of HU homologs with an additional lysine-rich domain also found in eukaryotic histone H1.


Introduction
Integrity of the bacterial genome is essential to survival of the organism. Further, the size of the bacterial cell necessitates significant compaction of the genomic DNA, yet availability to various cellular machineries is important for cell growth. A variety of small DNA-binding proteins encompass these functions (Gualerzi and Pon, 1986; Kellenberger and Arnold-Schultz-Gahmen, 1992; Dame, 2005). These proteins are sometimes referred to as histone-like, not because of sequence or structural similarity to eukaryotic histones, but because of comparable roles in nucleoid compaction. A number of such nucleoid-associated proteins have been identified in Escherichia coli, including H-NS, Fis, Dps (DNA protection during starvation), HU, and IHF (Integration Host Factor), all of which are present at concentrations up to or even exceeding 10 µM, depending on growth conditions (Azam and Ishihama, 1999). These proteins have different DNA-binding properties and function together (and sometimes in opposition to each other) to organize genomic DNA and to regulate DNA-dependent activities.
Notably, most bacterial species do not encode homologs of all of these proteins. HU proteins, however, appear to be encoded by all eubacteria, and some bacterial species encode more than one homolog. Further, homologs are found in organelles such as chloroplasts, where they appear to serve similar functions in DNA organization (Ram et al., 2008; Karcher et al., 2009). An HU deficiency in E. coli has a mild phenotype, while HU appears to be essential in Bacillus subtilis and other gram-positive organisms (Wada et al., 1988; Huisman et al., 1989; Kano and Imamoto, 1990; Yasusawa et al., 1992; Micka and Marahiel, 1992; Boubrik and Rouvière-Yaniv, 1995; Jaffé et al., 1997; Painbeni et al., 1997; Li and Waters, 1998; Bartels et al., 2001; Liu et al., 2008; Nguyen et al., 2009). This is likely due, at least in part, to a lack of redundancy of other nucleoid-associated proteins in gram-positive organisms. This review discusses recent advances towards understanding molecular mechanisms underlying DNA substrate selectivity by HU proteins, including the function of homologs with an additional domain otherwise characteristic of eukaryotic linker histones. While extracellular roles of HU homologs in cell adhesion and in eliciting immune responses have been reported (e.g., Stinson et al., 1998; Pethe et al., 2001), the focus of this review is on intracellular functions.

Sequence and structural conservation
The HU/IHF family of proteins (sometimes referred to as Type II DNA-binding proteins) consists of orthologs that share significant sequence identity (Swinger and Rice, 2004). HU derives its name from the E. coli strain in which it was first identified, U93, with the letter H denoting 'histone-like' (Rouvière-Yaniv and Gros, 1975). Active protomers function as dimers, usually composed of 90-99 amino acid subunits (Figures 1 and 2). IHF homologs are usually heterodimers, while HU homologs typically are homodimers (notable exceptions include HU from E. coli and other enterobacteria, which are heterodimers). HU and IHF proteins adopt a conserved, compact core of intertwined monomers (Tanaka et al., 1984; Vis et al., 1995; Jia et al., 1996; Rice et al., 1996; Swinger et al., 2003); two helical segments from each monomer form the body of the protein that is capped by β-strands, which extend to embrace the DNA helix. These DNA-embracing β-strands are largely disordered in the absence of DNA but fold on DNA binding. A short C-terminal α-helix completes the structure. HU proteins are typically quite stable; for example, melting temperatures of Bacillus subtilis HU of 33-48 °C have been reported under various solution conditions (Wilson et al., 1990; Welfle et al., 1992; Christodoulou et al., 2002). Helix packing within the protein core is significantly influenced by the sequence of the loop connecting helices one and two, with glycine mediating a loop flexibility that promotes optimal helix packing and thermal stability (Andera et al., 1994; Kawamura et al., 1996; Liu et al., 2000; Chen et al., 2004). The structures of E. coli IHF, Borrelia burgdorferi Hbb (which is neither a strict HU nor IHF homolog), and Anabaena HU in complex with DNA show that highly conserved prolines at the tips of the DNA-embracing β-strands mediate two sharp DNA kinks, the DNA bend deriving from the prolines partially intercalating from the minor groove (Rice et al., 1996; Swinger et al., 2003; Mouw and Rice, 2007). As seen in the crystal structures and inferred from DNA-binding experiments, the two protein-mediated DNA kinks are introduced at a separation of 9 bp (Rice et al., 1996; Grove et al., 1996a,b; Grove and Lim, 2001; Swinger et al., 2003; Chen et al., 2004; Kamau et al., 2005; Mouw and Rice, 2007). Consistent with a key role in engaging the DNA, the DNA-intercalating proline is conserved among all homologs, and its substitution causes significantly altered DNA binding (Lee et al., 1992).
The significant DNA bend induced on HU binding is energetically costly; hence, predisposing the DNA to bending by introducing imperfections such as nicks or mismatches reduces the energetic cost of bending. This property is the basis for the observation that most HU homologs exhibit preferred binding to DNA with such structural distortions compared to perfect duplex DNA. This property was also exploited to obtain the structure of HU in complex with DNA; while IHF binds preferred sequences, HU homologs generally bind without sequence-specificity. Accordingly, the structure of Anabaena HU in complex with DNA was solved using DNA with distortions that promote site-specific binding and therefore permit crystallization of the complex. Another general feature of HU/IHF is that indirect readout forms the basis for site-specific binding rather than direct contacts to the functional groups of the DNA bases. Both IHF and Borrelia Hbb use indirect readout to recognize their cognate sites through contacts with the minor groove and the phosphodiester backbone, and Anabaena HU employs similar strategies for structure-specific binding (Rice et al., 1996; Swinger et al., 2003; Mouw and Rice, 2007).

Differential substrate specificity
Based on the significant sequence conservation and structural homology, and considering that the mechanism of DNA binding involves indirect readout and the introduction of sharp DNA kinks, in part through partial intercalation of conserved prolines, the expectation might be for equivalent DNA binding properties of HU homologs. This is not observed. Consistent with distinct properties, it is also well documented that E. coli HU and IHF are not completely interchangeable (Segall et al., 1994). Two variable features of protein binding contribute to the existence of differential DNA substrate selectivity: the extent to which the DNA kinks are stabilized, and how well DNA flanking the proline-mediated DNA kinks interacts with positively charged residues on the protein surface, as such positively charged surface patches vary between HU homologs.
Variable DNA site sizes
DNA binding by several HU homologs has been reported. For example, E. coli HU binds nonspecifically and with low affinity to duplex DNA, and it binds with ~100-fold higher affinity to cruciform DNA and DNA with nicks and gaps (Pontiggia et al., 1993; Bonnefoy et al., 1994; Castaing et al., 1995; Pinson et al., 1999). Such substrate specificity is consistent with the reported role of E. coli HU in events such as replication, recombination and repair (Huisman et al., 1989; Aki and Adhya, 1997; Li and Waters, 1998; Shanado et al., 1998; Bahloul et al., 2001; Williams and Foster, 2007; Oberto et al., 2009). In its interaction with duplex DNA, E. coli HU binds an ~9 bp site with an equilibrium dissociation constant (Kd) of 200-2500 nM, although binding to prebent DNA has suggested a binding geometry similar to that exhibited by IHF (Bonnefoy et al., 1994; Pinson et al., 1999). By comparison, the sequence-specific E. coli IHF binds 9 bp of non-specific DNA with low affinity (10-30 µM Kd), but with high affinity (low nM Kd) to its 35 bp cognate site (Yang and Nash, 1995; Holbrook et al., 2001). These observations illustrate that a given homolog may bind a DNA site of variable length, depending on the nature of the DNA substrate.
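The affinity figures quoted above can be put on a common footing with a short calculation. The sketch below assumes a simple 1:1 binding isotherm with free protein in excess; this is an assumption of the illustration, not a model taken from the cited studies, and the function names are hypothetical. It converts a Kd into fractional site occupancy and a fold-difference in affinity into a free energy difference.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987e-3


def fraction_bound(protein_nM, kd_nM):
    """Fractional occupancy of a DNA site for a simple 1:1 isotherm.

    Assumes free protein concentration is approximately the total
    (illustrative simplification, not taken from the review).
    """
    return protein_nM / (protein_nM + kd_nM)


def ddg_from_fold(fold, temp_K=298.15):
    """Free energy difference (kcal/mol) for a fold-difference in affinity."""
    return R_KCAL * temp_K * math.log(fold)


if __name__ == "__main__":
    # Kd range quoted in the text for E. coli HU on duplex DNA: 200-2500 nM.
    for kd in (200.0, 2500.0):
        theta = fraction_bound(protein_nM=200.0, kd_nM=kd)
        print(f"Kd = {kd:6.0f} nM -> occupancy at 200 nM protein: {theta:.2f}")

    # The ~100-fold preference for nicked, gapped or cruciform DNA.
    print(f"100-fold preference: {ddg_from_fold(100.0):.2f} kcal/mol")
```

Under these assumptions, a ~100-fold preference for distorted DNA corresponds to roughly 2.7 kcal/mol at 25 °C, which gives a sense of the deformational energy saved when the DNA is already predisposed to bending.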
What has also become increasingly apparent is that HU orthologs encoded by different eubacteria exhibit significantly different DNA-binding properties in vitro. Reported non-specific binding site sizes in duplex DNA range between ~9 bp for E. coli HU and 12 bp for Bacillus subtilis HU to 17-19 bp for HU from Anabaena and Helicobacter pylori and >35 bp for HU from Thermotoga maritima and Deinococcus radiodurans, with affinities ranging between Kd ~5 nM for T. maritima HU and >200 nM for E. coli HU (Broyles and Pettijohn, 1986; Bonnefoy and Rouvière-Yaniv, 1991; Lavoie et al., 1996; Liu et al., 1998; Azam and Ishihama, 1999; Kobryn et al., 2000; Castaing et al., 1995; Fernandez et al., 1997; Grove et al., 1996a,b; Esser et al., 1999; Grove and Lim, 2001; Chen et al., 2004; Ghosh and Grove, 2004; Kamau et al., 2005; Koh et al., 2008). Based on shared elements of DNA site recognition, an ~9 bp site corresponds to the distance between DNA base pairs at which proline residues partially intercalate. Longer DNA sites reflect contacts between DNA distal to the kinks and the sides of the protein. Accordingly, longer DNA site sizes imply that DNA kinks are stabilized sufficiently for flanking DNA to contact the sides of the protein, and that positively charged surface patches exist that permit such contacts to flanking DNA.

Stabilization of DNA kinks
Considering the relationship between occluded DNA site sizes and the distance between proline-induced DNA kinks, the molecular basis for two related observations must be defined: (1) that E. coli HU, for example, binds a 9 bp site in perfect duplex DNA, while contacts to a longer DNA site are inferred for interaction with pre-bent or flexible DNA, and (2) that various HU orthologs bind DNA sites of different length in perfect duplex DNA. As noted above, the answer to this question hinges on the extent to which proline-mediated DNA bends are stabilized by neighboring positively charged residues and on the availability of positively charged patches on the protein surface that can engage DNA flanking the kinks. The crystal structures of HU/IHF in complex with DNA provide the first essential insights. The sequence-specific IHF statically bends its 35 bp cognate DNA by ~160° (Rice et al., 1996). In contrast, the structure of Anabaena HU in complex with DNA reveals variable bend angles of 105-139° induced by each of two identical monomers, indicating that HU may induce a range of bend angles depending on DNA substrate (Swinger et al., 2003; Swinger and Rice, 2004). A comparison of structures with different DNA substrates also led to the inference that greater bend angles correlate with longer site sizes. Thus, a given HU homolog may engage different DNA substrates differently, as dictated by its ability to induce and stabilize the requisite DNA bends. Indeed, reduced demands for deformational energy are the basis for enhanced binding of many HU homologs to DNA with specific defects, such as nicks and mismatches, and to pre-bent DNA such as four-way DNA junctions. For example, analysis of HU from E. coli, B. subtilis, and H. pylori has shown that less binding energy is expended on bending more flexible DNA substrates, resulting in preferred binding and a greater bend angle that brings flanking DNA into contact with the sides of the protein; this results in a longer DNA site size in flexible DNA compared to perfect duplex (Pontiggia et al., 1993; Bonnefoy et al., 1994; Pinson et al., 1999; Wojtuszewski and Mukerji, 2003; Arthanari et al., 2004; Chen et al., 2004; Kamau et al., 2005). The importance of DNA flexure for binding site selection is also reflected in the observation that the affinity of IHF for its binding sites is determined by the deformational energy required for DNA to adopt the IHF-bound structure (Grove et al., 1996a; Goodman and Kay, 1999; Teter et al., 2000; Aeling et al., 2006; Swinger and Rice, 2007).
Similar arguments suggest that a differential ability to stabilize DNA kinks by different HU orthologs may be reflected in different DNA site sizes in duplex DNA. Such stabilization may be achieved by positively charged residues near the kinks. Inspecting the sequence immediately preceding the DNA-intercalating proline (Figure 1) reveals significant conservation of an RNP motif. The arginine is seen in the IHF-DNA structure to make hydrogen bonds to the DNA, contacts that would also be possible in the Anabaena HU structure (in this structure, the arginine makes crystal packing contacts to a neighboring complex; Rice et al., 1996; Swinger et al., 2003). That this arginine contributes to stabilization of the DNA kinks is suggested by the observation that replacement of the original V with R in T. maritima HU results in enhanced affinity (Grove and Lim, 2001). Even more compelling is the observation that both a significantly increased affinity and a diminished preference for more pliable DNA are seen on replacing F with R in the Bacillus phage SPO1-encoded HU homolog TF1 (Sayre and Geiduschek, 1990). For these two HU homologs, replacement of this otherwise conserved arginine can be rationalized based on observed DNA-binding properties: T. maritima HU has high affinity (low nM Kd) for duplex DNA, and removal of this arginine appears to be necessary to retain any discrimination between perfect duplex and damaged DNA; for TF1, reduced stabilization of proline-mediated DNA kinks is required to maintain preferred binding to its natural DNA target, which it recognizes on the basis of the increased flexibility of hydroxymethyluracil-A base pair steps that define its cognate site (Grove et al., 1996b; Grove et al., 1997). This arginine is also not conserved in HU homologs encoded by mycoplasma, in which its absence may be required to confer optimal DNA binding to the genomic DNA, which is characterized by unusually low G+C content. Taken together, enhanced stabilization of DNA kinks, either by suitably placed basic residues near the DNA-intercalating prolines or by predisposing the DNA to bending, results in enhanced affinity and often an increased DNA site size. Notably, optimal stabilization of DNA kinks is desired, not maximal stabilization; if DNA kinks are maximally stabilized in duplex DNA, then enhanced DNA flexure as imposed by DNA damage may not result in preferred protein binding. This is observed for T. maritima HU, which binds a 37 bp site in perfect duplex DNA and has only modest preference for distorted DNA (Grove and Lim, 2001). Encoded by a thermophile, T. maritima HU may have evolved other properties in preference to discrimination between different DNA targets, such as the extensive DNA compaction and protection not emulated by HU proteins from mesophilic organisms (Mukherjee et al., 2008b). By contrast, the bacteriophage SPO1-encoded TF1 binds preferred sites in the phage genome, which is characterized by the global replacement of thymine with hydroxymethyluracil; TF1 recognizes its preferred sites based on their pliability and loses this ability to discriminate if mutations are introduced that promote binding (Sayre and Geiduschek, 1990).
Surface salt bridges modulate interaction between DNA flanking the kinks and the body of the protein
DNA site sizes longer than 9 bp demand interaction between the sides of the protein and DNA distal to the kinks. Indeed, several examples exist of protein assemblies that organize long DNA segments, and it has become apparent that such extensive DNA organization depends on the wrapping of the DNA duplex across a protein surface. A recurring feature of such wrapped complexes, including HU and IHF, is the existence not only of cationic surface residues responsible for interaction with the negatively charged DNA, but also numerous anionic residues. Therefore, cationic residues can form salt bridges with neighboring anionic residues in the free protein, competing with formation of electrostatic contacts to the DNA (Holbrook et al., 2001; Saecker and Record, 2002; Grove, 2003; Vander Meulen et al., 2008). The negatively charged DNA surface causes accumulation of salt cations that are released into bulk solution on protein binding, providing an entropic driving force for the binding event. This phenomenon may be experimentally detected as a dependence of the binding constant on salt concentration. A DNA wrapping event such as that occurring on IHF-DNA complex formation might therefore be expected to be strongly salt-dependent; however, the salt-dependence of binding is only modest and the binding event entropically unfavorable. This was explained by the existence of numerous surface salt-bridges that must be disrupted to permit subsequent hydration and interaction of cationic groups with the DNA (Holbrook et al., 2001; Vander Meulen et al., 2008). Inspection of HU protein surfaces likewise reveals a large number of positively charged surface residues and a comparable number of negatively charged residues (for instance, each monomer of B. subtilis HU has 16 and 14 basic and acidic residues, respectively). Consistent with a role of surface-exposed lysines and their salt-bridging partners in defining DNA-binding affinities and occluded site sizes, both DNA site size and binding affinity may be altered on changing either positively or negatively charged residues lining the surface of HU homologs (Grove, 2003; Chen et al., 2004; Kamau et al., 2005).
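The salt dependence mentioned above is commonly summarized as the slope of log K (observed association constant) against log [salt]; a steep negative slope is generally read as a large net release of ions on binding, and a shallow one as little net release. The following is a minimal sketch of how such a slope could be estimated. The data are synthetic values generated inside the script purely for illustration (they are not measurements from the cited work), and the helper name is hypothetical.

```python
import math


def log_log_slope(salt_molar, k_obs):
    """Least-squares slope of log10(K_obs) versus log10([salt]).

    salt_molar: salt concentrations in mol/L
    k_obs: observed association constants at those concentrations
    """
    xs = [math.log10(s) for s in salt_molar]
    ys = [math.log10(k) for k in k_obs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def synthetic_series(salt_molar, k_ref, slope, salt_ref=0.1):
    """Generate illustrative K_obs values obeying a pure power law."""
    return [k_ref * (s / salt_ref) ** slope for s in salt_molar]


if __name__ == "__main__":
    salt = [0.05, 0.10, 0.20, 0.40]

    # Hypothetical "strongly salt-dependent" and "modestly salt-dependent" cases.
    steep = synthetic_series(salt, k_ref=1e9, slope=-6.0)
    shallow = synthetic_series(salt, k_ref=1e9, slope=-1.5)

    print(f"steep series slope:   {log_log_slope(salt, steep):.2f}")
    print(f"shallow series slope: {log_log_slope(salt, shallow):.2f}")
```

In this framing, a wrapped complex that showed only a shallow slope would be consistent with the explanation given in the text: cationic side chains must first be freed from surface salt bridges, so fewer ions are released on a net basis than the extent of the protein-DNA interface would suggest.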
Surface-exposed lysines must contact DNA flanking the kinks to generate a wrapped complex. However, evaluating the roles of individual residues has revealed that their position relative to the DNA kinks is important. Based on this observation, a specific model was articulated based on the interaction of TF1 with its cognate DNA and subsequently confirmed based on analysis of B. subtilis HU (Grove and Saavedra, 2002; Kamau et al., 2005). TF1, like IHF, binds to preferred sites, and it engages 37 bp of duplex. It was found that Lys3 (numbering matching TF1 and B. subtilis HU) is a critical residue for wrapping of DNA longer than ~25 bp, suggesting that optimal interactions with flanking DNA occur at a distance from the DNA kinks where optimal leverage may be exerted on the DNA. Lys3 would contact DNA 8-9 bp distal to the proline-mediated DNA kinks, suggesting that TF1 derives significant binding energy from contacts at the edges of a 29 bp core. Similarly, the DNA-bending protein CAP forms contacts to DNA 12-14 bp from the center of its binding site that contribute strongly to affinity, while contacts to more distal DNA (beyond a 28 bp core) contribute only modestly (Liu-Johnson et al., 1986). Taken together, the inference is for optimal interactions in a region of the DNA that is far from the bend center, and that the most efficient use of binding energy occurs when the interactions necessary for bending allow the protein to anchor the ends of the DNA and pull them toward the center. For HU homologs, Lys3 appears to be in a position to contribute such optimized interactions when not sequestered in a surface salt-bridge (Grove and Saavedra, 2002; Kamau et al., 2005).
Despite extensive sequence and structural homology, substrate specificities and affinities vary among HU proteins. Large-scale comparisons of DNA-binding proteins have shown that specificity changes within a family of sequence-specific proteins are due largely to changes in residues making direct contacts (Luscombe and Thornton, 2002). Even non-sequence-specific proteins tend to exhibit greater conservation of residues involved in indirect contacts to the DNA backbone, except for members of the Type II DNA-binding protein family (Luscombe and Thornton, 2002). With indirect readout dominant in determining substrate specificity, substitutions in surface-exposed lysine or arginine residues or their salt-bridging partners would be expected to change DNA-binding properties, as would substitutions in amino acids that alter the surface disposition of charged residues. The latter type of substitution, in particular, may be difficult to appreciate from sequence alignments; consequently, predicting substrate specificity of HU proteins (and hence in vivo functions) is not straightforward. For example, a glycine in the turn between helices one and two has been shown to be important for thermal stability of HU proteins, as noted above, but substitution of this residue also alters DNA binding properties (Andera et al., 1994; Grove et al., 1996b; Kawamura et al., 1998; Christodoulou and Vorgias, 2002). For the bacteriophage SPO1-encoded HU homolog TF1, structural analysis has shown that substitution of the original E for G in this position contributes to the generation of a positively charged surface patch and to increased DNA-binding affinity (Grove et al., 1996b; Liu et al., 2000).

Restraint of DNA supercoiling
HU proteins generally constrain negative DNA supercoils in plasmid DNA (exemplified for E. coli HU in Broyles and Pettijohn, 1986). Borrelia Hbb likewise constrains DNA supercoils (Kobryn et al., 2000), but E. coli IHF does not. In principle, there are two ways in which such negative writhe may be generated in the presence of topoisomerases: HU may underwind the bound DNA, or the two DNA kinks may be phased such that the bent DNA does not lie in a single plane. The available co-crystal structures illustrate the mechanism by which HU restrains negative supercoils. In the IHF-DNA structure, the two DNA bends are nearly co-planar, and undertwisting of the DNA near the kinks is essentially compensated by overtwisting elsewhere (Rice et al., 1996). As a result, IHF does not constrain DNA supercoils in the presence of topoisomerase. In contrast, Anabaena HU introduces DNA underwinding, and the two DNA bends are not co-planar, resulting in a node consistent with negative supercoiling (Swinger et al., 2003; Swinger and Rice, 2004). Such non-planarity of wrapped DNA is also seen in the nucleosome (Luger et al., 1997). In Hbb, the DNA lies nearly in a single plane, with the dihedral angle between the DNA segments bound along the sides of the protein less than 5°, but the DNA is significantly underwound at the kinks (Mouw and Rice, 2007). Thus, for Hbb, its nonspecific binding mode leads to supercoiling by a mechanism that is likely in part based on DNA underwinding, although it remains a possibility that interaction with non-cognate DNA may result in non-equivalent bends at either kink and therefore greater dihedral angles between flanking DNA. Indeed, the inference from available co-crystal structures is that static bends of specific DNA sites appear to result in DNA lying essentially in a plane, while the variable bends induced by HU may correlate with out-of-plane bending and hence DNA supercoiling (Figure 3).

Formation of rigid filaments at high HU:DNA ratios
HU proteins have long been known to participate in organization of the bacterial genome and to do so by means of their ability to bend and compact the DNA. Evidence for such functions has come not only from biochemical data, as discussed above, but also from in vivo visualization of the bacterial nucleoid. For example, inactivation of several HU homologs, including E. coli HU, has been reported to result in de-compaction of the nucleoid (Rouvière-Yaniv et al., 1979; Broyles and Pettijohn, 1986; Kellenberger and Arnold-Schultz-Gahmen, 1992; Köhler and Marahiel, 1997; Nguyen et al., 2009; Karcher et al., 2009; Mukherjee et al., 2009). In vivo DNA compaction on overexpression of an HU homolog has also been reported (for example, for a variant of E. coli HU; Kar et al., 2005), and HU from the thermophilic T. maritima induces both nucleoid condensation when overexpressed in E. coli and a DNA compaction in vitro that shields DNA from nuclease digestion, an effect not emulated by B. subtilis HU (Mukherjee et al., 2008b). However, a different mode of DNA binding at high HU:DNA ratios has been inferred from single molecule experiments. Under such conditions, HU appears to reduce DNA flexibility and to form rigid filaments with the DNA (van Noort et al., 2004; Sagi et al., 2004; Salomo et al., 2006). In contrast, no such DNA rigidification was observed in similar experiments with E. coli IHF (Ali et al., 2001). Consistent with this observation, IHF does not appear to recruit additional protomers to the DNA, in contrast to what is observed for E. coli HU (Benevides et al., 2008). While HU clearly functions in vivo to organize the genomic DNA, cellular concentrations of E. coli HU (corresponding to about 1 dimer per 100 bp) suggest that higher local concentrations are possible that may cause the formation of rigid filaments. Noting the presence of numerous other nucleoid-associated proteins, such DNA rigidification has been proposed to counteract the compaction induced by the DNA-condensing protein H-NS (van Noort et al., 2004).
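The figure of about 1 dimer per 100 bp can be related to the site sizes discussed earlier with simple arithmetic. The sketch below assumes that every dimer is DNA-bound and that sites do not overlap; both are simplifications made for this illustration only, and the function names are hypothetical. It estimates the fraction of DNA covered at the average density and the local enrichment needed for contiguous coverage.

```python
def fraction_covered(site_size_bp, bp_per_dimer=100.0):
    """Fraction of DNA occluded if one dimer is bound per `bp_per_dimer` bp.

    Assumes non-overlapping sites and that all protein is DNA-bound
    (illustrative simplifications, not claims from the review).
    """
    return min(1.0, site_size_bp / bp_per_dimer)


def fold_enrichment_for_full_coverage(site_size_bp, bp_per_dimer=100.0):
    """Local enrichment over the average density needed to tile the DNA."""
    return bp_per_dimer / site_size_bp


if __name__ == "__main__":
    # Site sizes mentioned in the text: ~9 bp (E. coli HU on duplex DNA),
    # 12 bp (B. subtilis HU) and 37 bp (T. maritima HU).
    for site in (9, 12, 37):
        cov = fraction_covered(site)
        fold = fold_enrichment_for_full_coverage(site)
        print(f"{site:2d} bp site: {cov:.0%} covered on average, "
              f"{fold:.1f}x local enrichment to tile the DNA")
```

With a ~9 bp site, the average density leaves most of the DNA unoccupied, so contiguous filaments would require HU to be locally concentrated well above the genome-wide average, in line with the suggestion in the text that filament formation depends on higher local concentrations.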

HU homologs from species other than eubacteria
Endosymbiosis and gene transfer from eubacteria are responsible for the occurrence of HU homologs in distantly related species. Eukaryotic organelles such as mitochondria and chloroplasts evolved from bacterial endosymbionts (alpha-proteobacteria and cyanobacteria, respectively) and share some similarities in terms of how their genomes were reduced. In mitochondria, however, nucleoid-associated proteins vary between species and include members of the high mobility group (HMG) box family rather than proteins of bacterial origin (reviewed in Kucej and Butow, 2007). In plastids, many original bacterial genes have likewise been replaced with eukaryotic counterparts over the course of evolution, but some plastid genomes have been shown to encode functional HU homologs with particular homology to cyanobacterial HU. For example, the HU homolog of Cryptomonas Φ (Guillardia theta, a cryptomonad alga), which shares 53% sequence identity with HU from Nostoc sp., was shown to complement a mutation in B. subtilis HU; HU from the primitive red alga Cyanidioschyzon merolae complements an HU mutation in E. coli; and knockdown of the HU homolog in the green alga Chlamydomonas reinhardtii reduces the level of compaction and genome copy number of its chloroplast nucleoid (Grasser et al., 1997; Kobayashi et al., 2002; Karcher et al., 2009).
In general, chloroplast genomes show modest variation in terms of gene content, but greatly reduced and variable plastid genomes are found in dinoflagellates and apicomplexans, two of the three phyla that constitute the Alveolata. While many dinoflagellates are photosynthetic, the plastid organelle of apicomplexans, the apicoplast, is nonphotosynthetic. Both dinoflagellates and apicomplexans, such as Plasmodium sp., have been reported to encode an HU homolog that is targeted to the plastid where it is essential for nucleoid structuring. However, while apicomplexan HU homologs appear related to HU from cyanobacteria, dinoflagellate homologs cluster with β/γ-proteobacterial HU, suggesting that they originate from a separate symbiotic event (Chan et al., 2006; Chan and Wong, 2007; Ram et al., 2008; Sasaki et al., 2009). That both dinoflagellates and apicomplexans have independently retained the gene encoding a bacterial HU homolog in their nuclear genome speaks to the versatility of these proteins.
While archaea typically do not encode HU homologs, BLAST searches of archaeal genomes identify a few exceptions, including HU homologs encoded by Thermoplasma acidophilum and T. volcanium, by the ecologically and phylogenetically related Picrophilus torridus, and by Halorubrum lacusprofundi and Ferroplasma acidarmanus. The presence of HU homologs in these species is most likely the result of lateral gene transfer and may have occurred to adapt to particular extreme environments (e.g., see Angelov and Liebl, 2006), although the physiological roles of the HU homologs in these species have not been determined.
An HU homolog is encoded by the lytic B. subtilis bacteriophage SPO1. This homolog, designated TF1 for its role as a transcription factor in regulating the expression of genes involved in transition from middle to late gene expression, has been well characterized (Greene et al., 1984). This homolog exploits the global substitution of hydroxymethyluracil for thymine in the phage genome to bind preferred sites within its target promoters, as discussed above. Several genes within the recently sequenced SPO1 genome are thought to have been acquired from bacterial sources and subsequently adapted to meet the needs of the phage, and TF1 likely belongs in this category (Stewart et al., 2009).
An even more remarkable case of gene transfer is exemplified by the presence of a gene encoding an HU homolog in the eukaryotic African Swine Fever Virus (ASFV), a large DNA virus that replicates in the cytoplasm of infected cells. This virus encodes multiple genes with similarity to bacterial genes, suggesting trans-kingdom gene transfer, perhaps in a eukaryotic host simultaneously infected with ASFV and a bacterium. The ASFV-encoded HU homolog is expressed at late stages in the infection cycle, and it is localized to the virion nucleoid (Borca et al., 1996). That it appears to be conserved among ASFV isolates suggests an essential function; however, the precise functions of this protein in the virus lifecycle remain unknown.

Divergent HU homologs with lysine-rich extensions
HU proteins are grouped with other small nucleoid-associated proteins as histone-like proteins based on their role in genomic DNA organization. For some homologs, this moniker is particularly apt. Unusual two-domain HU homologs are encoded by some members of the Actinomycetes, including Streptomycetes, Mycobacteria, and members of the genus Kineococcus (Figure 1). These homologs feature the classical HU-fold at the N-terminus, followed by a long C-terminal extension that is characterized by a series of low complexity lysine-, alanine-, and proline-rich repeats. These repeats are similar to the (S/T)PKK repeats found in eukaryotic histone H1 proteins. Some species, such as the Mycobacteria, only encode a two-domain HU homolog, whereas the Streptomycetes encode both a conventional and a two-domain homolog.
A similar set of repeats is also found within the N-terminal extension of HU encoded by members of the genus Deinococcus, which also do not encode a conventional HU homolog (Figure 1). D. radiodurans HU has been reported to be essential and to be important for nucleoid structuring (Nguyen et al., 2009). Deletion of the N-terminal extension leaves a functional DNA-binding protein with properties akin to those of E. coli HU, including a short ~11 bp DNA site size and preferred binding to DNA with junctions and imperfections (Ghosh and Grove, 2006). These binding properties are significantly modulated by the N-terminal extension, which causes an increased occluded site size, markedly attenuated preference for DNA with nicks and gaps, and a binding mode with four-way junction DNA that involves binding to the junction arms and not the crossover (Ghosh and Grove, 2004, 2006). Considering that D. radiodurans has evolved unusually effective means by which double-strand DNA breaks are repaired (Cox et al., 2010), it is tempting to speculate that its HU homolog has evolved binding properties to support such repair mechanisms.
In vitro characterization of DNA binding by mycobacterial HU homologs (typically referred to as Hlp, for histone-like protein) also revealed markedly different properties imposed by the presence of the C-terminal domain. Truncating the entire C-terminal domain to retain only the HU-fold renders a protein that no longer binds DNA; this is similar to what was previously observed on truncation of the short C-terminal extension of both E. coli IHF and TF1, where such truncations caused loss of specificity and loss of DNA binding, respectively (Mengeritsky et al., 1993; Andera and Geiduschek, 1994; Mukherjee et al., 2008a). In the latter proteins, which both bind 35-37 bp sites, the extended C-termini are important for contacts to DNA beyond the kinks, suggesting that the HU domain of mycobacterial Hlp may likewise have a DNA site size longer than the 9 bp separating the DNA kinks. Notably, truncation of only the section of the C-terminal domain comprising the lysine-rich repeats reveals that this region is responsible for mediating a DNA association not seen with the truncated protein (Mukherjee et al., 2008a). As full-length Hlp causes compaction of genomic DNA when expressed in E. coli, these observations point to a role for mycobacterial Hlp in DNA association and compaction and indicate that the lysine-rich C-terminus is responsible for this function. Consistent with a role in DNA protection, M. smegmatis Hlp is upregulated in dormancy (Lee et al., 1998). Kineococcus radiotolerans is characterized by a resistance to γ-radiation and desiccation comparable to that reported for D. radiodurans, suggesting protective functions of its nucleoid-associated proteins (Phillips et al., 2002). Differential expression of conventional and two-domain HU homologs in Streptomyces coelicolor likewise speaks to specialized functions (Salerno et al., 2009). The two-domain HU homolog was seen to be nucleoid-associated in spores, and its deletion caused an increased nucleoid size and decreased heat resistance. Taken together, these observations suggest a role for these extended C-terminal domains in DNA protection and compaction.

An evolutionary link to eukaryotic linker histones?
What is striking is that these eubacteria encode proteins with sequences that closely resemble the C-terminal domains of eukaryotic H1. Linker histones of multicellular eukaryotes have a tripartite domain organization, in which a central globular winged helix domain is flanked by two less structured domains rich in basic residues. While the winged helix domain is conserved among metazoan H1 homologs, the sequence of the flanking domains is quite heterogeneous (reviewed in Happel and Doenecke, 2009). The C-terminal domain is characterized by a series of repeats rich in lysine, alanine and proline, and it has been proposed to adopt a kinked helix conformation on interaction with DNA (Clark et al., 1988). A major function of histone H1 is to condense the linker DNA and to promote formation of the 30 nm fiber, and it has been shown that the lysine-rich C-terminus is essential for such DNA condensation (Bharath et al., 2002; Ellen and van Holde, 2004). Curiously, H1 homologs from kinetoplastids such as Trypanosoma cruzi lack the winged helix domain and are effectively similar to the C-terminal domains of H1 from plants and animals. Since sequences of eubacterial lysine-rich domains sometimes resemble the sequences of metazoan H1 C-terminal domains more than the latter resemble the sequences of the truncated H1 of protists, the possibility therefore exists that the lysine-rich sequences found in eubacterial DNA-binding proteins are evolutionarily related to the C-terminal domains of H1 and to the truncated H1 homologs of protists (Kasinsky et al., 2001).

Concluding remarks
The proposed function of HU proteins is to serve an architectural role by overcoming the torsional rigidity of B-form DNA. The ~160° DNA bend created when IHF interacts with its 35 bp preferred DNA site is seemingly static, and the DNA lies largely in a single plane. Consistent with a more static interaction with its DNA sites, E. coli IHF cannot substitute for HU in all binding events. In contrast, the structure of Anabaena HU in complex with DNA reveals a greater range of allowable bend angles. As the extent of the protein-mediated DNA bend will depend on the nature of the DNA substrate, with more flexible substrates resulting in less binding energy expended on bending and higher affinity binding, individual HU homologs have the potential not only to select preferred DNA substrates, but to induce different DNA structural deformations on binding different DNA sites. This property undoubtedly contributes to the wide range of DNA-dependent activities in which HU participates.
Following a similar logic, the distinct substrate specificities reported for some HU homologs are likely to be reflected in specialized functions in vivo. Consistent with such optimized functions, some bacterial species encode multiple HU homologs or, as reported for the heterodimeric E. coli HU, regulate production of either subunit or homolog individually. Of particular note is the existence of HU homologs that feature a separate domain whose sequence is closely related to the C-terminal domain of eukaryotic linker histones. Consistent with a role in DNA protection, such lysine-rich extensions evidently exist in HU homologs encoded by bacterial species that are particularly adept at resisting stress conditions, as reflected in the desiccation and radiation tolerance characteristic of Deinococcus and Kineococcus, the ability to enter dormancy characteristic of mycobacteria, or in the heat resistance required of Streptomyces spores.

Figure 1. Multiple sequence alignment of HU homologs. Helical segments are denoted above the alignment (based on the structure of B. stearothermophilus HU). The DNA-intercalating proline (enclosed in a red box) is part of a conserved RNP motif. Some members of the Actinomycetes encode homologs with C-terminal extensions characterized by proline-, alanine-, and lysine-rich repeats (exemplified for Streptomyces coelicolor, Mycobacterium tuberculosis and M. smegmatis, and Kineococcus radiotolerans), while members of the genus Deinococcus encode homologs with a comparable N-terminal extension (D. radiodurans and D. geothermalis).

Figure 2. Structure of Anabaena HU in complex with DNA (1P78). The DNA-intercalating prolines are colored red and the arginines of the RNP motif are shown in blue stick representation. Lysine-3 is shown in orange. Contacts between Lys3 and DNA distal to the kinks are inferred to exert optimal leverage on the DNA.

Figure 3. Toroidal DNA supercoiling by HU. A. IHF bends its DNA site with the two DNA bends nearly coplanar (left panel; 1IHF). A rotation by 90° about the vertical axis shows planarity of IHF-bent DNA (red DNA duplex). When DNA bends are not coplanar, the two DNA segments flanking the kinks no longer lie in the same plane (blue DNA duplex), resulting in toroidal supercoiling of the DNA about the protein. B. Supercoiling of DNA by toroidal coiling about individual HU molecules.