The Molecular Biology of Recombination in Mycobacteria: What Do We Know and How Can We Use It?

Recombination is a ubiquitous genetic process which results in the exchange of DNA between two substrates. Homologous recombination occurs between DNA species with identical sequence, whereas illegitimate recombination can occur between DNA with very little or no homology. Site-specific recombination is often used by temperate phages to stably integrate into bacterial chromosomes. Characterisation of the mechanisms of recombination in mycobacteria has mainly focussed on RecA-dependent homologous recombination and phage-directed site-specific recombination. In contrast, the high frequency of illegitimate recombination in slow-growing mycobacteria has not been explained. The role of DNA repair in dormancy and infection has not yet been fully established, but early work suggests that RecA-mediated pathways are not required for virulence. All three recombination mechanisms have been utilised in developing genetic techniques for the analysis of the biology and pathogenesis of mycobacteria. A recently developed method for studying essential genes will generate further insights into the biology of these important organisms.


Introduction
Recombination mechanisms have been widely studied in the Gram-negative model organism, Escherichia coli, and this evidence forms the basis for our understanding of these ubiquitous processes. The mycobacteria are Gram-positive actinomycetes which are not closely related to E. coli. The study of recombination processes in the mycobacteria themselves should help to elucidate the differences in pathways between these bacteria and shed further light on the mechanism by which pathogenic mycobacteria preserve the integrity of their chromosomal DNA. In addition, recombination is a useful genetic tool for investigating the biology of these bacteria. This review will focus on what is known about recombination in the mycobacteria and how these processes are being used by molecular microbiologists to unravel the mysteries of mycobacterial biology.

Mycobacteria
Mycobacteria are acid-fast, Gram-positive aerobes with a slightly curved or straight rod shape. Colony morphology is species dependent, varying from rough to smooth and from pigmented to non-pigmented. Mycobacteria are generally susceptible to heat and UV light, but they are resistant to the action of acids and alkalis. This is largely attributed to the complex cell wall, which contains a high proportion of lipids including the mycolic acids (high-molecular-weight alpha-substituted, beta-hydroxy fatty acids), trehalose-6,6'-dimycolate and peptidoglycolipids. The mycolic acids form an outer layer in the cell wall which acts as a hydrophobic permeability barrier and is responsible for the characteristic acid-fast property as assayed by the Ziehl-Neelsen stain. The cell wall is also the target of some anti-tuberculosis drugs.
The genus Mycobacterium encompasses both pathogenic species, including opportunistic and obligate pathogens, and non-pathogenic species. The most important pathogenic mycobacteria are those of the Mycobacterium tuberculosis complex and Mycobacterium leprae, which cause tuberculosis and leprosy respectively. The Mycobacterium avium complex, Mycobacterium ulcerans, Mycobacterium fortuitum, and Mycobacterium abscessus are opportunistic human pathogens. Many other mycobacterial species are free-living, occupying diverse environmental niches such as rivers and wet soil. Many species, including Mycobacterium smegmatis, rarely cause human disease. Mycobacteria can be subdivided on the basis of growth rate into the slow-growers, which take from seven days to three weeks or more to produce colonies on solid media, and the rapid-growers, which take less than seven days.

Slow-growers
The Mycobacterium tuberculosis complex consists of the pathogenic species M. tuberculosis, Mycobacterium bovis, Mycobacterium africanum, Mycobacterium microti, Mycobacterium canettii and the attenuated vaccine strain Mycobacterium bovis BCG. These species are grouped together because their similarity at the nucleotide level is >95%. M. tuberculosis poses a major public health problem. World Health Organisation (WHO) figures estimate 8 million new cases of tuberculosis and 2 million deaths per year worldwide. Furthermore, a third of the world population is thought to be latently, asymptomatically infected with TB. Factors which have contributed to the global resurgence of TB include poor patient compliance with the strict drug regimen (6-8 months treatment with strong antimicrobials), which has resulted in the emergence of multi-drug resistant strains.
For latently-infected individuals, HIV co-infection is the greatest risk factor for progression to active disease. M. bovis is the causative agent of tuberculosis in warm-blooded animals including cattle, dogs, cats and pigs, as well as humans. The clinical features of human infection are similar to those caused by M. tuberculosis. The attenuated BCG strain is used in many parts of the world as a vaccine against tuberculosis.
The adaptive ability of mycobacteria has great significance in the clinical context, since tubercle bacilli are able to adapt and persist in a non-replicative state in the lung lesions of the human host for decades after the initial episode of infection, under hostile environmental conditions. M. tuberculosis is an intracellular pathogen which infects and multiplies within macrophages, where it may be exposed to DNA-damaging conditions such as oxidative stress. As such it needs to ensure that DNA damage is effectively dealt with. Homologous recombination has a major role to play in DNA repair, particularly in the repair of double-stranded breaks, which are potentially lethal. Since pathogenic mycobacteria are exposed to DNA-damaging agents in their intracellular phase, much attention has been focussed on the function of recombination and repair in terms of pathogenesis. For this reason the main DNA recombination pathway, and in particular RecA, has been studied extensively in M. tuberculosis. The availability of the complete genome for M. tuberculosis has enabled the identification of all the major genes involved in DNA recombination and repair pathways (Cole et al., 1998; Mizrahi and Andersen, 1998).
The M. avium complex (MAC) consists of Mycobacterium avium and Mycobacterium intracellulare. These species can be found in many different environmental niches such as water, soil and plants and are often implicated in human disease. Since the emergence of HIV, MAC organisms have gained a foothold as opportunistic pathogens in AIDS patients. In non-AIDS patients MAC infection causes pulmonary disease and disseminated infections are rare. However, for those with HIV the infection is commonly disseminated and involves almost any internal organ, but especially the liver, spleen, and bone marrow.

Fast-growers
Of all the fast-growers, M. smegmatis has been the most widely studied, since it has been used as a genetic model system for the slow-growers, precisely because of its faster growth rate and non-pathogenic nature. M. smegmatis rarely causes human disease, although it has been associated with osteomyelitis, post-surgical infections and cellulitis. Most mycobacteriophages have been identified using M. smegmatis and the mechanisms of site-specific recombination have been studied in this organism.

Recombination
Recombination involves the exchange of DNA between two molecules. Exchange can take place between identical DNA sequences (homologous recombination) or unrelated DNA sequences (illegitimate recombination). For recombination to occur, double-strand breaks (or ends) of DNA are required; these can be generated by DNA damage or by recombination enzymes, or can be provided by linear DNA. Strand exchange between the two molecules can then proceed, followed by rejoining of the DNA ends. Recombination plays an essential role in maintaining the integrity of chromosomal DNA in all cells and a number of different recombination pathways have been identified. Understanding of the processes involved in recombination has been greatly facilitated by work done in E. coli. Numerous genes involved in recombination have been identified and many of these have been characterised, leading to a better understanding of the sequence of events which take place during recombination and repair of damaged DNA (Clark and Sandler, 1994; Kowalczykowski et al., 1994; Kuzminov, 1999).

Homologous recombination
Homologous recombination (HR) occurs between two copies of an identical (or nearly identical) sequence and results in exchange of homologous sequences between DNA molecules (Figure 1). The key player in the recombination process is the product of the recA gene, encoding a DNA-dependent ATPase, which was the first recombination gene to be identified (Clark and Margulies, 1965). RecA-dependent pathways include the RecBCD pathway, which is the principal pathway in wild-type cells, and the RecF pathway, which facilitates recombination between plasmids (Clark and Sandler, 1994). RecA function is dependent on the length of homology as well as the structure of the replicons involved. In general, recombination events for homologies greater than 1 kb are RecA-mediated, whereas RecA-independent mechanisms can mediate cross-over events at sites of homology less than 1 kb in length.

Figure 1. Model of homologous recombination (Holliday, 1964). In this model RecA mediates the exchange of DNA strands after strand breakage has occurred. Branch migration is followed by the resolution of the crossover point, involving cutting and religating of the DNA ends. Resolution of Holliday structures can also occur when two migrating Holliday junctions converge.

Sequencing of the M. tuberculosis genome has led to the identification of a number of mycobacterial homologues of recombination genes including RecA, RecBCD, RecF, RuvA, RuvB and RuvC. This indicates that the basic mechanism of recombination is conserved in mycobacteria. However, an early indication of major differences between the enterobacteria and the mycobacteria was found when it was shown that illegitimate recombination (IR) occurs at a high frequency in the M. tuberculosis complex (Kalpana et al., 1991). HR was also shown to occur when allelic replacement was demonstrated at a high frequency in M. smegmatis (Mustafa, 1995) and at a much lower frequency in the slow-growers (Kalpana et al., 1991; Marklund et al., 1995).
The major recombination pathway of M. tuberculosis (RecA pathway) has been extensively studied because of its role in recombinational DNA repair and presumed importance for virulence. Exhaustive studies have been carried out on the mycobacterial RecA. In contrast to M. smegmatis recA, the M. tuberculosis recA (Mt-recA) encodes an intein in the middle of the open reading frame (Davis et al., 1991). The intein has protein splicing activity which directs its precise removal from the RecA pre-protein.
In vitro studies have shown that Mt-RecA is produced as an inactive 85 kDa precursor molecule which undergoes post-translational splicing by the intein to produce the active 38 kDa protein (Davis et al., 1992). It had been suggested that the presence of the intein within the M. tuberculosis recA gene was responsible for the difficulties in achieving allelic exchange via HR (McFadden, 1996). However, when this hypothesis was tested, the function of the Mt-RecA was shown to be the same regardless of whether it possessed the intein or not, as both versions were equally proficient at directing HR in the genetic host M. smegmatis (Papavinasasundaram et al., 1998).
Mt-RecA is negatively regulated by RecX, with which it is co-expressed (Papavinasasundaram et al., 1997). RecX inhibits RecA-promoted ATP hydrolysis and abolishes the initiation and progression of strand exchange (Venkatesh et al., 2002). The formation of the nucleoprotein filament in M. tuberculosis is facilitated by the joint action of the RecA protein and SSB. Filament formation is pH-dependent, with SSB increasing the pH range over which this occurs. The efficiency of Mt-RecA-mediated strand exchange is dependent on the length of the duplex DNA, pH, ionic strength and ATP. It has been suggested that these limitations may contribute to a slower rate of strand exchange in M. tuberculosis, leading to inefficient HR (Vaze and Muniyappa, 1999).
Initial studies on the RecA pathway proposed that recombinational repair is a critical process during infection and plays a role in the pathogenesis of mycobacteria. Since M. tuberculosis is an intracellular pathogen residing within macrophages, it may be exposed to many DNA-damaging agents including oxidative stresses generated by the macrophage. Surprisingly, deletion of RecA from the vaccine strain M. bovis BCG did not result in further attenuation in an animal model of infection, despite the deletion strain being more sensitive to DNA-damaging agents in vitro (Sander et al., 2001). In addition, such strains were indistinguishable from wild-type during growth in dormancy-inducing conditions (oxygen depletion), a situation in which recombinational repair would seem to be necessary (Wayne, 2001; Wayne and Sohaskey, 2001). Future work with the pathogenic strains is needed to resolve this apparent conflict. Further support for the view that the RecA pathway is dispensable during infection comes from the work of Sassetti and Rubin, who demonstrated that the base excision repair pathway, but not the RecA pathway, was required for in vivo growth of M. tuberculosis (Sassetti and Rubin, 2003).

Illegitimate recombination
Illegitimate recombination (IR) is recombination between DNA strands which are either non-homologous or possess very short regions of homology. It can lead to genome rearrangements such as translocations, deletions, duplications and insertions. The mechanisms which give rise to IR have been studied in E. coli and two main types of IR have been identified: short homology-dependent and homology-independent. Although the mechanism has not been fully elucidated, homology-independent recombination is mediated by DNA gyrase and typically involves regions of homology of less than 1.7 bp (Shimizu et al., 1997; Ashizawa et al., 1999). Homology-dependent IR occurs between regions of very short homology (typically 4-10 bp), is RecA-independent, is increased after DNA damage by UV or other agents (Ikeda et al., 1995) and is mediated by the RecE pathway, which includes RecE, RecT, RecJ, RecO and RecR (Shiraishi et al., 2002). This type of IR is proposed to be a repair mechanism which evolved to deal with double-strand breaks in the chromosome (Kuzminov, 1999).
IR usually occurs at low frequencies in bacterial cells but commonly occurs in mammalian cells. Early work in the slow-growing M. tuberculosis complex species indicated that IR occurred at a higher frequency than HR (Kalpana et al., 1991). In the fast-grower M. smegmatis, IR was not observed at a high frequency. As yet, the mechanism of IR in mycobacteria is unknown, although it is thought to require the joining of the ends of DNA strands. IR in M. tuberculosis between linear DNA and chromosomal DNA requires breaks in the latter to occur. It is possible that IR is part of an efficient DNA repair system, thereby preserving the integrity of the genome. This would be particularly important to the survival of slow-growers such as M. tuberculosis in, for example, the macrophage environment, where the bacillus has to cope with the abundance of free-radical species and copious DNA damage. Although IR has been subsequently demonstrated by at least one other group to occur at high frequencies, the role of this process in mycobacteria has not been investigated. Since IR as well as HR can repair double-stranded breaks, this may partly explain why HR is not absolutely required for survival under dormant conditions or for virulence. Future work to investigate whether these two pathways repair the same type of DNA damage would answer the question of redundancy in repair pathways.

Site-specific recombination
Site-specific recombination, as its name suggests, is recombination between specific DNA sites which is catalysed by a recombinase. A feature common to all site-specific recombination systems is the presence of a functional core in the target DNA, which usually consists of two binding sites for the recombinase arranged as an inverted repeat. The recombinase recognizes the specific core sequence and binds to form a synaptic complex. It then catalyses the cleavage of the core DNA followed by realignment and ligation of the cleaved ends. During site-specific recombination no loss or gain of genetic information occurs and no replication occurs. In many cases site-specific recombination results in the integration of one DNA molecule into the other rather than exchange of DNA between the two molecules. The recombinase may also catalyse the reverse reaction to remove the integrated sequence. Many temperate bacteriophages which form stable lysogens do so via a site-specific recombinase system.

Mycobacteriophage L5
The best-characterised mycobacteriophage is L5, a temperate phage that infects both fast- and slow-growing strains including M. smegmatis and the M. tuberculosis complex (Lee et al., 1991; Lee and Hatfull, 1993; Pena et al., 1996; Pena et al., 1997; Pena et al., 1998; Pena et al., 1999). Site-specific recombination between the phage and the host chromosome occurs between the phage attachment site (attP) and the attachment site in the bacterial chromosome (attB). Recombination between attP and attB is catalysed by a phage-encoded integrase (L5 gpInt) and requires a mycobacterial host protein factor (Pedulla et al., 1996). attP contains a 43 bp core site at which strand exchange occurs, and recombination results in the formation of an integrated prophage delimited by the left (attL) and right (attR) junction sites. Excision of the prophage is catalysed by an excisionase and occurs naturally during the induction of the L5 lysogens in the late lytic growth phase. Again it involves a site-specific recombination event, this time between the attachment junctions attL and attR, to regenerate the original phage genome and bacterial chromosome as separate entities (Lewis and Hatfull, 2000). In L5 the integrase can also mediate excision at a low level (Springer et al., 2001). The ability of L5 mycobacteriophage to infect both fast- and slow-growing mycobacteria is due to the presence of the conserved attB site, which overlaps the tRNA(Gly) gene.
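The attP x attB reaction described above can be sketched as a toy string model. This is only an illustration of the bookkeeping (which arms end up in attL and attR, and how excision restores attP and attB), not a model of the integrase mechanism. All sequences, the 8-character core and the names `AttSite`, `integrate` and `excise` are hypothetical placeholders and do not correspond to real L5 or mycobacterial DNA.

```python
from typing import NamedTuple, Tuple


class AttSite(NamedTuple):
    """An attachment site modelled as left arm + shared core + right arm."""
    left: str
    core: str
    right: str

    @property
    def seq(self) -> str:
        return self.left + self.core + self.right


def integrate(phage: str, chromosome: str, att_p: AttSite,
              att_b: AttSite) -> Tuple[str, AttSite, AttSite]:
    """Insert a circular phage genome into a linear chromosome at attB."""
    if att_p.core != att_b.core:
        raise ValueError("attP and attB must share the same core")
    i = phage.find(att_p.seq)
    j = chromosome.find(att_b.seq)
    if i < 0 or j < 0:
        raise ValueError("attachment site not found")
    # Rotate the circular genome so it starts at attP, then drop attP itself.
    body = (phage[i:] + phage[:i])[len(att_p.seq):]
    # Strand exchange in the core yields hybrid junction sites.
    att_l = AttSite(att_b.left, att_b.core, att_p.right)
    att_r = AttSite(att_p.left, att_p.core, att_b.right)
    lysogen = (chromosome[:j] + att_l.seq + body + att_r.seq
               + chromosome[j + len(att_b.seq):])
    return lysogen, att_l, att_r


def excise(lysogen: str, att_l: AttSite, att_r: AttSite) -> Tuple[str, str]:
    """Reverse reaction: attL x attR regenerates phage and chromosome."""
    i = lysogen.find(att_l.seq)
    if i < 0:
        raise ValueError("attL not found")
    j = lysogen.find(att_r.seq, i + len(att_l.seq))
    if j < 0:
        raise ValueError("attR not found")
    body = lysogen[i + len(att_l.seq):j]
    att_p = AttSite(att_r.left, att_r.core, att_l.right)
    att_b = AttSite(att_l.left, att_l.core, att_r.right)
    phage = att_p.seq + body  # circular genome, written starting at attP
    chromosome = lysogen[:i] + att_b.seq + lysogen[j + len(att_r.seq):]
    return phage, chromosome


# Hypothetical example data (upper case = att sites, lower case = flanks).
CORE = "GGTCTACC"
ATT_P = AttSite("TTGAC", CORE, "ACATT")
ATT_B = AttSite("CCGGA", CORE, "GGCAT")
PHAGE = "aaaa" + ATT_P.seq + "cccc"
CHROMOSOME = "atgcatgc" + ATT_B.seq + "ttaattaa"
```

Calling `integrate(PHAGE, CHROMOSOME, ATT_P, ATT_B)` and then `excise` on the result returns the original chromosome and a rotation of the original circular phage string, which mirrors the statement that no genetic information is gained or lost.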

Mycobacteriophage Bxb1
Bxb1 mycobacteriophage is a temperate phage which forms stable lysogens following integration into the genome of M. smegmatis. Bxb1 is closely related to the L5 phage and its overall genome organization is similar to that of L5 (Mediavilla et al., 2000). Bxb1 has a serine integrase which mediates the integration of Bxb1 into a specific region of the host genome. Unusually, the attB site is within the coding region of the groEL1 gene (Kim et al., 2003). Mycobacteria have two groEL genes with a high level of similarity at the nucleotide level (70%), but due to sequence specificity Bxb1 only integrates into groEL1. Bxb1 does not infect slow-growing mycobacteria as the attB site is not sufficiently conserved. Excision of Bxb1 requires an excisionase protein which has not yet been identified.

Mycobacteriophage Ms6
Ms6 forms stable lysogens in M. smegmatis (Anes et al., 1992). It is unrelated to phages L5 or Bxb1. The Ms6 attP attachment site has a high A+T content and contains numerous direct and inverted repeats. The 26 bp core region of attP overlaps the 3' end of the tRNA(Ala) gene, which is a conserved feature in both fast- and slow-growing mycobacteria (Freitas-Vieira et al., 1998). Integration of Ms6 into the host genome is mediated by the Ms6 integrase, which directs integration into the 3' end of the tRNA(Ala) genes of both fast- and slow-growing mycobacteria.

pSAM2 integrative element
pSAM2 is an 11 kb integrative element which was originally isolated from Streptomyces ambofaciens, the producer of the macrolide antibiotic spiramycin (Pernodet et al., 1984; Boccard et al., 1989). The recombination system of pSAM2 is similar to that of temperate phages, involving an integrase and attB/P sites. The attB site is a 58 bp region extending from the anticodon loop to the 3' end of the tRNA(Pro) gene. The attB site is conserved in actinomycetes including mycobacteria (Mazodier et al., 1990). Thus pSAM2 can integrate into a wide range of actinomycete species. Integration is facilitated by an integrase of the λ-integrase family (Seoane et al., 1997).

Construction of genetic knock-outs
Homologous recombination has been widely used to inactivate chromosomal genes in bacteria and the mycobacteria are no exception (Figure 2). All of the methods now employed for generating mutant strains rely on the fact that HR between a copy of a gene on the chromosome and a mutated version of the gene elsewhere will result in the replacement of the former; it is the delivery of the altered gene that varies. HR in M. smegmatis is relatively straightforward and gene knock-outs are readily obtained. However, initial attempts to use HR in slow-growing mycobacteria were frustrated by both the high level of IR and the low frequency of HR, and several different approaches have been used to overcome these problems. The reason for the low frequency of HR in M. tuberculosis is still unknown; however, the suggestion that lack of HR arises from low transformation efficiencies may hold only for certain species such as M. avium. In M. tuberculosis, high transformation efficiencies are easily attainable (in our hands routinely 10^7 per µg plasmid DNA), whereas HR efficiencies are low, in the range of 1-100 per µg. In addition, the frequency of HR varies by orders of magnitude between genetic loci (Parish et al., 1999), showing that it is the frequency of HR that is the limiting factor. Although extensive studies assessing the variables which affect HR frequency have not been carried out, the available data show that certain types of DNA substrates (alkali-denatured, UV-treated, long linear fragments) are more recombinogenic than others (Balasubramanian et al., 1996; Hinds et al., 1999). The minimum length of homology required for HR has not been established, but recombination can be achieved with fragments of 700 bp (Parish et al., 1999).
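The gap between transformation efficiency and HR efficiency quoted above can be made explicit with a line of arithmetic. The sketch below simply divides the two per-µg figures from the text; the function name `hr_per_transformant` is a hypothetical label, and the comparison assumes both figures refer to the same amount of DNA.

```python
def hr_per_transformant(hr_per_ug: float, transformants_per_ug: float) -> float:
    """Homologous recombinants expected per cell that takes up DNA."""
    if transformants_per_ug <= 0:
        raise ValueError("transformation efficiency must be positive")
    return hr_per_ug / transformants_per_ug


# Figures quoted in the text for M. tuberculosis.
TRANSFORMANTS_PER_UG = 1e7          # replicating plasmid
HR_RANGE_PER_UG = (1.0, 100.0)      # homologous recombinants

for hr in HR_RANGE_PER_UG:
    frequency = hr_per_transformant(hr, TRANSFORMANTS_PER_UG)
    print(f"{hr:>5.0f} recombinants/µg -> {frequency:.0e} per transformant")
```

On these figures, somewhere between one in a hundred thousand and one in ten million transformable cells yields a homologous recombinant, which is why DNA uptake is unlikely to be the limiting step in M. tuberculosis.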
Since replacement of the chromosomal gene requires two recombination events to occur, recombination methods can be divided into one-step and two-step methods. Using a one-step method, gene replacement is directly selected for after introduction of the mutant allele (Figure 2A); this method requires that the incoming gene is marked with, e.g., an antibiotic resistance gene to allow for selection of recombinants. In the two-step method, single cross-over strains which have the whole delivery vector integrated into the chromosome are selected first; the second recombination event is then allowed to occur and the double cross-overs are isolated (Figure 2B). Using this method, the vector must carry suitable selection and/or resistance marker genes. However, in contrast to the single-step method, unmarked deletions or point mutations can be constructed. In the slow-growers, where HR frequency is low, two-step methods are preferred.
In our laboratory, we routinely make unmarked in-frame deletions when constructing deletion mutants using a two-step method (Parish and Stoker, 2000). This has two major advantages: first, there are no reporter/marker genes left in the chromosome, so sequential deletions can be made without the need for extra reporters; and secondly, the likelihood of polar effects, a particular problem for genes in operons, can be minimised.
Delivery of the recombination substrate and isolation of recombinants requires both that the DNA is introduced and that the delivery vector is subsequently lost, so that selection for the marker genes only recovers cells in which HR has occurred. Three main methods of delivering the recombination substrate have been developed for use in mycobacteria: temperature-sensitive (ts) phages or plasmids, suicide (non-replicating) plasmids and plasmid incompatibility.
One-step method - phage delivery
A conditionally replicating phage system has been developed by Bardarov et al. (1997) (Figure 3). Although initially developed for transposon mutagenesis, the system is equally applicable to gene replacement (Bardarov et al., 2002). In this method, the gene of interest is disrupted by an antibiotic resistance marker and cloned into a phasmid vector. This vector behaves as a plasmid in E. coli and as a phage in mycobacteria. The phage has a ts mutation so that it will only replicate as a phage at a low temperature (30°C). Once the phasmid has been constructed it is transformed into the host M. smegmatis at the permissive temperature in order to generate a high titre phage lysate. This recombinant phage is then used to infect the host mycobacterium at the non-permissive temperature (39°C). Only cells in which HR has occurred will grow after selection on the appropriate antibiotic. The advantage of this system is that phage infection is extremely efficient, so that even if HR frequency is low, recombinants should be obtained. The problem of leaving an antibiotic resistance marker in the gene of interest has been addressed by including the res sites of the γδ transposon system (Figure 2C). These sites are the target of the γδ resolvase and flank the resistance gene. Site-specific recombination mediated by the γδ resolvase removes the resistance gene sequence, leaving an unmarked version. Although this solves the problem of leaving the marker in the chromosome, it is still not possible to use this system for generating more precise mutations such as point mutations or in-frame deletions, as one of the res sites is left inside the gene. In addition, if the gene is essential, then no recombinants will be obtained.

Figure 3. A recombinant phasmid containing the marked disrupted gene is propagated as phage at 30°C in M. smegmatis to make a high titre lysate. This recombinant phage is then used to infect the mycobacterial host. Selection at 39°C results in the isolation of recombinants as the phage cannot replicate at this temperature. The resulting mutation is marked but can be unmarked using the resolvase as depicted in Figure 2C.

Two-step strategies - conditionally replicating plasmids
Many modifications have been made to the basic two-step strategy in order to improve its efficiency, particularly with respect to use in M. tuberculosis. The use of a replicating vector allows the recombination substrate to be present in the cell for a longer period of time, making HR more likely. Once recombination has occurred, a mechanism for removing the plasmid is required. The isolation of a plasmid which is ts in mycobacteria (Guilhot et al., 1992), in combination with the use of sacB as a negative selection marker (Pelicic et al., 1996a), has led to the development of a two-step gene replacement method for use in M. smegmatis (Pelicic et al., 1996b), M. bovis BCG (Pelicic et al., 1996c) and M. tuberculosis (Pelicic et al., 1998) (Figure 4A).

Figure 4. In both cases the inactivated gene is introduced on replicating plasmids and recombination occurs, followed by subsequent plasmid loss. In (A) plasmid loss is driven by shifting the temperature up to 39°C and at the same time selecting for cells that have lost the plasmid using sucrose selection. In (B) plasmids are lost normally and, as they cannot replicate independently of each other, the loss of one plasmid means that the other is no longer maintained.

Plasmid incompatibility
We have used plasmid incompatibility to drive plasmid loss after recombination (Pashley et al., 2003) (Figure 4B). The system is based on providing the plasmid replication proteins on separate plasmids so that they can replicate only when both plasmids of a pair are present. One plasmid carries the recombination substrate. Selection for both plasmids maintains their presence in the cell, allowing time for HR to occur. Subsequent plasmid loss is achieved by growing bacteria without antibiotics: since the plasmid used has a low copy number, one of the pair is frequently lost from the cell and so neither can replicate. Such loss occurs at a high frequency, so that recombinants without plasmids can easily be isolated.

Two-step strategies - suicide vectors
A number of approaches using recombination substrates which cannot replicate in mycobacteria have been used. Balasubramanian et al. used long linear substrates of 40-50 kb in order to achieve HR in M. tuberculosis (Balasubramanian et al., 1996). Whilst this approach was successful, it has not been widely used due to the low efficiency of introducing such large DNA fragments. Short linear fragments are unsuitable as they tend to be IR substrates (Kalpana et al., 1991; Collins et al., 2002).
Much use has been made of suicide vectors, which cannot replicate in mycobacteria, as delivery vectors for recombination substrates. In general, vectors which replicate in E. coli do not replicate in mycobacteria, so most of the widely-used E. coli cloning vectors are suitable for this purpose. In the simplest case, the gene of interest is disrupted by an antibiotic resistance gene, cloned into a suicide vector and transformed into mycobacteria by electroporation. Selection of antibiotic-resistant colonies should isolate strains in which recombination has occurred (Figure 2A). There are several potential problems with this procedure: (1) using a single marker, it is not possible to distinguish single and double cross-over strains; (2) IR occurs at a relatively high frequency in the slow-growing mycobacteria, especially M. tuberculosis; (3) if the gene is essential, no recombinants will be obtained; and (4) recombination frequency is low in comparison to transformation efficiency with a replicating vector, often occurring at 10^-5 to 10^-6 for single cross-overs and probably at least an order of magnitude lower for double cross-overs. Despite these limitations, gene replacement has been achieved with a few genes in this straightforward manner. For the fast-growers, such as M. smegmatis, where IR is not common, the problems are fewer and more success has been achieved with these species.
In order to overcome these limitations, several approaches have been used. We have used pre-treatment of the DNA to stimulate HR and reduce IR, so that any transformants isolated result from the former (Hinds et al., 1999; Parish et al., 1999; Parish and Stoker, 2000). Use of either single-stranded or UV-irradiated DNA results in abolition of IR at the same time as raising the frequency of HR (Hinds et al., 1999). As well as increasing HR events, we have also developed a two-step procedure involving multiple selection/marker genes which make the selection and identification of recombinants as straightforward as possible (Figure 2B).
Recently, Malaga et al. used a two-step strategy in which the mutated gene was disrupted by an antibiotic resistance gene (Malaga et al., 2003). The antibiotic marker was flanked by the res sites of the transposon γδ system, as with the phage delivery system. Again, the TnpR resolvase was expressed in cells, where it acted on the res sites and specifically removed the marker. In this way, they were able to construct unmarked deletions in the chromosome of M. smegmatis and M. bovis BCG (Figure 2C).

Selection and marker genes
The problems encountered in M. tuberculosis are more numerous than those for M. smegmatis. As well as the problem of IR in M. tuberculosis, there is also the problem of significant levels of spontaneous resistance arising. For example, the background level of resistance to kanamycin can often be higher than the frequency of HR; use of a single resistance marker is therefore not recommended, which precludes the routine use of a single-step HR method with plasmid delivery systems. The combination of two antibiotic resistance genes, e.g. hygromycin and kanamycin, overcomes this and reduces the background level of spontaneous resistance essentially to zero. Alternatively, reporter genes can be used in conjunction with an antibiotic resistance gene. Two main reporter genes have been used in mycobacteria, namely lacZ and xylE, both of which can be assayed using simple colour changes in the presence of appropriate substrates (Curcic et al., 1994; Timm et al., 1994).
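The reasoning behind using two resistance markers can be illustrated with a simple probability sketch. It assumes that spontaneous resistance to each antibiotic arises independently, so the chance of a cell being resistant to both is the product of the individual rates. The rates and the number of cells plated below are hypothetical placeholders chosen for illustration; they are not measurements from the text.

```python
from math import prod


def expected_background(cells_plated: float, *resistance_rates: float) -> float:
    """Expected number of spontaneously resistant colonies on a plate.

    Assumes independent mutations, so the joint rate is the product of
    the per-antibiotic rates.
    """
    if not resistance_rates:
        raise ValueError("at least one resistance rate is required")
    return cells_plated * prod(resistance_rates)


# Hypothetical numbers for illustration only.
CELLS_PLATED = 1e8
KAN_RATE = 1e-6   # assumed spontaneous kanamycin resistance per cell
HYG_RATE = 1e-7   # assumed spontaneous hygromycin resistance per cell

single = expected_background(CELLS_PLATED, KAN_RATE)
double = expected_background(CELLS_PLATED, KAN_RATE, HYG_RATE)
print(f"kanamycin only: {single:g} background colonies expected")
print(f"kanamycin + hygromycin: {double:g} background colonies expected")
```

Under these assumed rates a single marker gives a background that would swamp rare recombinants, whereas the double selection gives far less than one expected background colony per plate, consistent with the statement that the background falls essentially to zero.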
Three negative selection genes have been used in mycobacteria: katG, rpsL and sacB (Norman et al., 1995; Sander et al., 1995; Pelicic et al., 1996). katG encodes a catalase which confers isoniazid sensitivity, rpsL encodes the wild-type allele of the ribosomal protein S12 which confers streptomycin sensitivity, and sacB encodes a levansucrase which confers sucrose sensitivity. Both the katG and rpsL counter-selection methods require the use of strains which are initially resistant to the antibiotic. In contrast, sacB has the advantage that it results in sucrose sensitivity in any wild-type mycobacterial strain. Negative selection markers are invaluable in the two-step procedure as a mechanism for selecting against single cross-overs and therefore for double cross-overs (Reyrat et al., 1998). One disadvantage of sacB is the high rate of spontaneous resistance, typically in the region of 10⁻⁴ to 10⁻⁵ in M. tuberculosis, and so it is usually utilised in conjunction with a screening reporter such as xylE (Pelicic et al., 1997) or lacZ (Parish and Stoker, 2000).
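The reasoning behind pairing sacB with a screening reporter can be sketched as a simple decision rule. The following is an illustrative sketch only, not a published protocol: the `Colony` type, its field names and the wording of the returned labels are hypothetical, and it assumes that both sacB and the reporter gene (xylE or lacZ) sit on the delivery vector, so that they are lost together when the second cross-over occurs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """Hypothetical phenotype record for one colony from the second step."""
    sucrose_resistant: bool  # grows on sucrose: sacB lost or inactivated
    reporter_positive: bool  # colour reaction from xylE or lacZ on the vector


def classify(colony: Colony) -> str:
    """Interpret a colony, assuming sacB and the reporter are both vector-borne."""
    if not colony.sucrose_resistant:
        # sacB is still active, so the vector is still integrated.
        return "single cross-over (vector retained)"
    if colony.reporter_positive:
        # Sucrose resistant but the reporter is still present: the vector
        # has not been lost, so sacB has most likely been inactivated.
        return "probable spontaneous sucrose resistance (vector retained)"
    # Both vector-borne functions are gone together.
    return "candidate double cross-over (vector lost)"


if __name__ == "__main__":
    for colony in (Colony(False, True), Colony(True, True), Colony(True, False)):
        print(colony, "->", classify(colony))
```

A colony classed as a candidate double cross-over may still carry either the wild-type or the mutant allele (Figure 2B), so the genotype has to be confirmed separately.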
Although there is now a choice of methods for gene replacement in the M. tuberculosis complex, other mycobacteria are still refractory to such genetic methods. Very few mutants of the M. avium-intracellulare-scrofulaceum complex have been generated by HR, possibly due to the difficulty of transforming the cells with DNA (Marklund et al., 1995; Mahenthiralingam et al., 1998). The ability to obtain mutants appears to be strain dependent, with certain laboratory strains being more amenable to manipulation (Marklund et al., 1995; Mahenthiralingam et al., 1998). Recently the phage delivery system has been used in these bacteria to introduce the recombination substrate and this may overcome the difficulties associated with electroporation of plasmid DNA into the cells (Otero et al., 2003). In addition, katG counter-selection has been successfully used (Maslow et al., 2003).

Illegitimate recombination
Early attempts at gene replacement in the slow-growing mycobacteria were hampered by the much higher frequency of IR (Kalpana et al., 1991). This resulted in the essentially random integration of linear DNA into the chromosome at a frequency of 10⁻⁴ to 10⁻⁵. Since HR occurs at a lower, albeit locus-dependent, frequency of 10⁻⁵ to 10⁻⁷, IR posed a significant barrier to obtaining gene replacement. However, IR has proved useful in itself as a method for generating random mutations and mutant libraries. Collins et al. used IR to construct a library of 1000 illegitimate recombinants in M. bovis, which they subsequently screened for attenuation of virulence. They isolated five mutants showing reduced virulence, demonstrating the utility of this method (Collins et al., 2002).
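The size of this barrier can be illustrated with a small calculation based on the frequency ranges quoted above. This is a rough sketch, not a measured result: it assumes that the HR and IR frequencies are expressed per the same unit and that each integrant arises from exactly one of the two routes. The function name `hr_fraction` is introduced here for illustration only.

```python
def hr_fraction(hr_freq: float, ir_freq: float) -> float:
    """Expected fraction of integrants that arise by HR rather than IR."""
    total = hr_freq + ir_freq
    if total <= 0:
        raise ValueError("at least one frequency must be positive")
    return hr_freq / total


# Frequency ranges quoted in the text for slow-growing mycobacteria.
HR_RANGE = (1e-7, 1e-5)
IR_RANGE = (1e-5, 1e-4)

# Most favourable case: highest HR frequency against lowest IR frequency.
best_case = hr_fraction(max(HR_RANGE), min(IR_RANGE))
# Least favourable case: lowest HR frequency against highest IR frequency.
worst_case = hr_fraction(min(HR_RANGE), max(IR_RANGE))

if __name__ == "__main__":
    print(f"best case:  {best_case:.3f} of integrants from HR")
    print(f"worst case: {worst_case:.5f} of integrants from HR")
```

Under these assumptions, at best about half of the integrants would be homologous recombinants, and at worst roughly one in a thousand, which is consistent with the need for methods that suppress IR or enrich for HR events.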

Site-specific recombination
The main use of site-specific recombination has been in the development of vectors for use in mycobacteria. Phages which integrate into the chromosome at specific sites have been developed for use as delivery vectors for exogenous DNA. The advantages of these vectors mainly lie in the fact that they have an effective copy number of one. This is especially important when studying promoter activity, as higher copy numbers can give artefactual results with controllable promoters. For example, the hsp60 promoter is a heat-shock inducible promoter, but when it is placed on an extra-chromosomal vector (copy number approximately five per cell), it is expressed constitutively, possibly due to the titration of regulatory proteins (Stover et al., 1991). Copy number may also be important when inserting genes which are potentially toxic when expressed at a high level. Another consideration has been the stability of vectors, which is generally considered to be higher for integrated vectors than for extrachromosomal vectors, which are often subject to deletions and rearrangements (Haeseleer, 1994).
Figure 5. Excision and switching of L5 integrated vectors. A: Schematic diagram of a deletion strain with the only functional copy of an essential gene in the L5 integrated vector. B: Excision of the integrated vector can be achieved by transforming with a plasmid containing the excisionase gene (xis) and results in the strain shown. For essential genes this strain is not viable. C: Direct replacement of an integrated vector can be achieved by transforming with a second L5 integrating vector carrying an alternative resistance marker. In the case depicted a test gene is also included. If the test gene can complement the function of the deleted chromosomal gene then the cells will be viable. The limits of the integrating vector are denoted by the att sites. gm: gentamicin resistance; hyg: hygromycin resistance. An in-frame deletion is denoted by the "explosion" symbol.
The most commonly used system is derived from mycobacteriophage L5. Incorporation of the L5 integrase and the attP site into a vector is all that is required for integration into the tRNA-Gly region of the chromosome (Lee et al., 1991). A variety of derivatives have been used, including a promoter-probe version (Dussurget et al., 1999; Casali and Ehrt, 2001). Different antibiotic resistance genes have been incorporated into the vectors, including kanamycin, hygromycin, gentamicin and streptomycin resistance (Lee and Hatfull, 1993; Mahenthiralingam et al., 1998; Dussurget et al., 1999). Although L5-based vectors are considered to be stably maintained, the integrase can mediate excision of the vector and in some cases this may be a problem. In order to circumvent this problem, a two-plasmid system which provides the integrase on a non-replicating vector separate from the attP site has been developed (Pena et al., 1997; Springer et al., 2001). This allows for integration of the vector and subsequent loss of the integrase so that no excision can occur.
We have also used the L5-based integrating vectors to construct merodiploid strains in order to demonstrate gene essentiality (Parish and Stoker, 2000; Parish et al., 2001; Parish and Stoker, 2002). We have shown for glnE and aroK that the chromosomal copy could only be deleted when a functional copy of the gene was provided in the L5 vector. We also demonstrated that when the integrated vector contained the only functional copy, it could not be removed by the excisionase (Figure 5). Recent work in our laboratory has developed this method further to enable us to study essential genes by demonstrating that a resident integrated vector can be replaced by an incoming vector in M. tuberculosis (Pashley and Parish, 2003) (Figure 5), extending the studies in M. smegmatis (Pena et al., 1997; Springer et al., 2001). This new method is now allowing us to study the function of essential genes and adds further impetus to the genetic analysis of mycobacteria.
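The logic of the vector-switching test (Figure 5C) can be written out as a small truth table. This is a deliberately simplified sketch under stated assumptions: the chromosomal copy of the gene is deleted, the resident integrated vector carries the only functional copy, and the incoming vector fully replaces the resident one. The function and parameter names are hypothetical.

```python
from itertools import product


def switched_strain_viable(gene_is_essential: bool,
                           incoming_has_functional_copy: bool) -> bool:
    """Predict viability after the resident integrated vector is replaced.

    Assumes the chromosomal copy is deleted, so after switching the only
    possible source of gene function is the incoming vector.
    """
    return (not gene_is_essential) or incoming_has_functional_copy


if __name__ == "__main__":
    print("essential  complemented  viable")
    for essential, complemented in product((True, False), repeat=2):
        viable = switched_strain_viable(essential, complemented)
        print(f"{essential!s:<10} {complemented!s:<13} {viable}")
```

Read this way, the informative outcome is the absence of viable switched transformants when the incoming vector lacks a functional copy, combined with their recovery when a complementing test gene is supplied.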
Another phage-based integrating vector has been developed from phage Ms6 (Freitas-Vieira et al., 1998). Integration of a vector carrying the Ms6 attP site and integrase has been demonstrated in both fast- and slow-growing species (M. smegmatis, M. vaccae, M. bovis BCG and M. tuberculosis). The vector is stably maintained and, since it has a different site of integration (tRNA-Ala gene), can be used in combination with L5-based integrating vectors.
An integrative plasmid from Streptomyces has also been shown to function in mycobacteria (Martin et al., 1991). pSAM2 integration into the chromosome occurs in a range of actinomycetes which have a conserved tRNA-Pro site (Mazodier et al., 1990). Although integration has only been directly demonstrated in M. smegmatis, other mycobacteria have the conserved attB site and thus this plasmid should function in these species as well. A vector which carries the attP site and integrase, but not the excisionase, is available for stable maintenance (Seoane et al., 1997).

Conclusion
Although recombination mechanisms have been well characterised in Gram negative bacteria, the details in mycobacteria still remain sketchy. The unusual presence of an intein in Mt-RecA and the unexpectedly high frequency of IR in the slow-growers point to a requirement for repair pathways that is critically different from that of enterobacteria. In addition, the surprising observation that RecA is not required for survival under infection conditions poses further questions. These observations may represent another manifestation of the particular environment which the pathogenic mycobacteria inhabit and provide further clues as to the type of DNA damage they sustain. Future work therefore needs to focus on the potential redundancy of HR and IR for repairing double-strand breaks and on determining which DNA repair pathways are critical for virulence.
As well as informing the basic biology of these organisms, the study of recombination pathways has enabled the development of genetic methods which are proving invaluable in the study of the pathogenesis of these medically important bacteria. Alongside established methods for constructing mutant strains, newer techniques will allow us to study the function of essential genes in the future.

Figure 1. The Holliday model of homologous recombination (Holliday, 1964). In this model RecA mediates the exchange of DNA strands after strand breakage has occurred. Branch migration is followed by the resolution of the crossover point involving cutting and religating of the DNA ends. Resolution of Holliday structures can also occur when two migrating Holliday junctions converge.

Figure 2. Schematic of gene replacement by homologous recombination. A: One-step gene replacement requires two simultaneous recombination events to occur, resulting in the mutant gene replacing the wild-type gene. B: Two-step recombination is based on the isolation of single cross-overs carrying the whole delivery vector integrated in the first step. In the second step double cross-overs are isolated which may have either of the two gene alleles. C: Removal of marker genes by site-specific recombination.

Figure 3. Gene replacement using conditionally-replicating phages. A recombinant phasmid containing the marked disrupted gene is propagated as phage at 30°C in M. smegmatis to make a high titre lysate. This recombinant phage is then used to infect the mycobacterial host. Selection at 39°C results in the isolation of recombinants as the phage cannot replicate at this temperature. The resulting mutation is marked but can be unmarked using the resolvase as depicted in Figure 2C.

Figure 4. Use of replicating vectors to deliver recombination substrates.A: Temperature sensitive plasmid.B: Plasmid incompatibility.In both cases the inactivated gene is introduced on replicating plasmids and recombination occurs followed by subsequent plasmid loss.In (A) plasmid loss is driven by shifting the temperature up to 39°C and at the same time selecting for cells that have lost the plasmid using sucrose selection.In (B) plasmids are lost normally and as they cannot replicate independently of each other the loss of one plasmid means that the other is no longer maintained.