Microbiology and Molecular Biology Reviews, September 2006, p. 830-856, Vol. 70, No. 3
1092-2172/06/$08.00+0     doi:10.1128/MMBR.00016-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Epigenetic Gene Regulation in the Bacterial World

Josep Casadesús1 and David Low2*

Departamento de Genética, Universidad de Sevilla, Seville 41080, Spain,1 Molecular, Cellular, and Developmental Biology, University of California, Santa Barbara, California 931062

SUMMARY
INTRODUCTION
FOUNDATIONS
    Origins: R-M Systems
    Orphan DNA MTases
        Dam.
        CcrM.
    Regulation of Cellular Events by the Hemimethylated DNA State
    DNA Methylation Patterns
DNA ADENINE METHYLATION-DEPENDENT REGULATORY SYSTEMS
    Pap Pili
        The Pap OFF- to ON-phase transition.
        Environmental mechanisms for switch control.
        The Pap ON- to OFF-phase transition.
    Pap-Related Systems
        PapI homologue acting as a positive regulator of pilus expression.
        PapI homologue acting as a negative regulator of pilus expression.
    Phase-Variable Outer Membrane Protein Ag43
    VSP Repair
    Bacteriophage Infection
        Regulation of DNA packaging in bacteriophage P1.
        Regulation of the cre gene in bacteriophage P1.
        Regulation of the mom operon in bacteriophage Mu.
    Conjugal Transfer in the Virulence Plasmid of Salmonella enterica
        Regulation of traJ transcription.
        Regulation of finP transcription.
    Bacterial Virulence
        Roles of Dam methylation in Salmonella virulence.
        Attenuation of bacterial virulence by Dam methylase overproduction.
    CcrM Methylation and Regulation of Cell Cycle in Alphaproteobacteria
        Regulation of ccrM transcription.
        Regulation of ctrA transcription.
CONCLUDING REMARKS
ACKNOWLEDGMENTS
REFERENCES

   SUMMARY
 
Like many eukaryotes, bacteria make widespread use of postreplicative DNA methylation for the epigenetic control of DNA-protein interactions. Unlike eukaryotes, however, bacteria use DNA adenine methylation (rather than DNA cytosine methylation) as an epigenetic signal. DNA adenine methylation plays roles in the virulence of diverse pathogens of humans and livestock animals, including pathogenic Escherichia coli, Salmonella, Vibrio, Yersinia, Haemophilus, and Brucella. In Alphaproteobacteria, methylation of adenine at GANTC sites by the CcrM methylase regulates the cell cycle and couples gene transcription to DNA replication. In Gammaproteobacteria, adenine methylation at GATC sites by the Dam methylase provides signals for DNA replication, chromosome segregation, mismatch repair, packaging of bacteriophage genomes, transposase activity, and regulation of gene expression. Transcriptional repression by Dam methylation appears to be more common than transcriptional activation. Certain promoters are active only during the hemimethylation interval that follows DNA replication; repression is restored when the newly synthesized DNA strand is methylated. In the E. coli genome, however, methylation of specific GATC sites can be blocked by cognate DNA binding proteins. Blockage of GATC methylation beyond cell division permits transmission of DNA methylation patterns to daughter cells and can give rise to distinct epigenetic states, each propagated by a positive feedback loop. Switching between alternative DNA methylation patterns can split clonal bacterial populations into epigenetic lineages in a manner reminiscent of eukaryotic cell differentiation. Inheritance of self-propagating DNA methylation patterns governs phase variation in the E. coli pap operon, the agn43 gene, and other loci encoding virulence-related cell surface functions.


   INTRODUCTION
 
The word "epigenetics" is based on the Greek prefix "epi-," denoting "on" or "in addition," and "genetic," meaning "pertaining to or produced from genes." In the past, the term "epigenetics" has been used to describe the differentiation of genetically identical cells into distinct cell types to form tissues and organs during development of a multicellular organism. In current practice the word is used by biologists to describe heritable changes in gene expression that occur without changes in the DNA sequence. In the strict sense, epigenetic systems involve two or more heritable states, each maintained by a positive feedback loop. In a broader sense, however, any additional information superimposed on the DNA sequence (e.g., methylation of DNA) can be considered "epigenetic." Here we review the current state of research in the field of bacterial epigenetics, with an emphasis on systems controlled by DNA methylation, which are the best known at the molecular level. We refer the reader to reviews covering other aspects of DNA methylation and related topics (16, 32, 51, 96, 143, 160, 172, 178, 202, 214, 264, 265, 285).

Epigenetic phenomena include prions, in which protein structure is heritably transmitted (223, 231, 235, 259); genomic imprinting, characterized by monoallelic repression of maternally or paternally inherited genes (52, 84, 128, 195, 213); histone modification, such as methylation of lysines by histone methyltransferases (MTases) that maintain active and silent chromatin states (132, 273); and DNA methylation patterns formed as a result of inhibition of methylation of specific DNA bases by protein binding (29, 41, 118, 262, 263). Each of these phenomena involves self-perpetuating states, be they protein or DNA related (116, 155, 230-232), and the particular state that the molecule is in affects gene expression.

Epigenetic regulation can enable unicellular organisms to respond rapidly to environmental stresses or signals. For example, the yeast prion PSI+ is generated by a conformational change of the Sup35p translation termination factor, which is then inherited by daughter cells. The PSI+ form of Sup35p allows readthrough of nonsense codons that can provide a survival advantage under adverse conditions such as growth in paraquat or caffeine (259). The PSI+ prion is a metastable element that is generated and lost spontaneously at low rates, and thus within a population of yeast, some yeast cells will carry the prion and others will not. This situation provides potential flexibility in the response of the yeast population to environmental changes, orchestrated through the ability of the PSI+ prion to act upon native Sup35p protein and convert it to prion protein (223).

Methylation of specific DNA sequences by DNA methyltransferases provides another mechanism by which epigenetic inheritance can be orchestrated. For example, in certain eukaryotes, including mammals, methylation of cytosine residues at 5'-CG-3' (CpG) sequences facilitates binding of methyl-CpG binding proteins (134, 156, 187). In turn, methyl-CpG binding proteins affect the transcription state of a local DNA region through further interaction with chromatin-remodeling proteins (145). Methylation of CpG can affect gene expression, and the methylated state is usually correlated with transcriptional repression. The methylation pattern of a DNA region is defined as the collective presence or absence of methyl groups on specific target sites. DNA methylation patterns can vary between cells, tissues, and individuals. DNA methylation patterns are established via de novo methylation during the first stages of embryonic development (28, 81, 213). Such patterns are propagated by DNA methyltransferases known as maintenance methylases (Dnmt1), which are active on hemimethylated DNA substrates generated by DNA replication. Thus, if a DNA region contains methylated CpG sequences, they will be propagated in the methylated state. Nonmethylated CpG sequences, however, are not substrates for the maintenance DNA methylases. Thus, if a DNA region contains nonmethylated CpGs, they will tend to remain nonmethylated. A major area of research in eukaryotic epigenetic regulation is directed at understanding the mechanisms by which DNA methylation patterns are erased following cleavage of the fertilized egg and then established via de novo methylation (74, 81, 141, 180).

DNA methylation plays important roles in the biology of bacteria: phenomena such as timing of DNA replication, partitioning nascent chromosomes to daughter cells, repair of DNA, and timing of transposition and conjugal transfer of plasmids are sensitive to the methylation states of specific DNA regions (16, 160, 172, 178, 202, 285). All of these events use as a signal the hemimethylated state of newly replicated DNA, generated by semiconservative replication of a fully methylated DNA molecule. In the case of DNA replication, the protein SeqA binds preferentially to hemimethylated DNA target sites (GATC sequence) clustered in the origin of replication (oriC) and sequesters the origin from replication initiation. In addition, SeqA also transiently blocks synthesis of the DnaA protein, which is necessary for replication initiation, by binding to hemimethylated GATC sites in the dnaA promoter (36, 49, 100, 140, 146, 163, 179, 249). In DNA repair, the methyl-directed mismatch repair protein MutH recognizes hemimethylated DNA sites and cuts the nonmethylated daughter DNA strand, ensuring that the methylated parental strand will be used as the template for repair-associated DNA synthesis (8, 12, 25, 178, 227, 237). In transposition of Tn10, hemimethylated DNA plays two roles: enhancing binding of RNA polymerase to the transposase promoter and enhancing binding of transposase to its DNA target sites (144, 181, 219). DNA methylation appears to play similar roles in regulating Tn5 transposition (73, 161, 175, 217, 253, 292). None of these phenomena are heritable since the hemimethylated state of DNA is not heritable, occurring transiently in newly replicated DNA.

Phenomena involving inheritance of DNA methylation patterns are also known in bacteria, and the best-known examples involve phase variation. In phase variation, gene expression alternates between active (ON phase) and inactive (OFF phase) states. For example, uropathogenic Escherichia coli (UPEC) cells undergo pilus phase variation, which can be observed using immunoelectron microscopy with antipilus antibodies marked with colloidal gold (Fig. 1). Phase variation can occur through a variety of genetic mechanisms involving changes in nucleotide sequence (e.g., site-specific recombination and mutation) which result in heritably altered gene expression (1, 4, 26, 32, 33, 42, 53, 69, 75, 79, 86, 98, 113, 119, 122, 133, 164, 191, 229, 240, 244, 256, 265, 298). Bacteria also use epigenetic mechanisms to control phase variation. In all cases examined, these systems use DNA methylation patterns to pass information regarding the phenotypic expression state of the mother cell on to the daughter cells. A DNA methylation pattern is formed by binding of a regulatory protein(s) to a site that overlaps a methylation target, blocking methylation. This pattern can control gene expression if methylation, in turn, affects binding of the regulatory protein(s) to its DNA target site, which could occur by steric hindrance or alteration of DNA structure due to methylation (206, 207). Notably, most adhesin genes in E. coli are regulated by epigenetic mechanisms involving DNA methylation patterns (32, 115, 116, 262).


 
FIG. 1. Pap phase variation in uropathogenic E. coli. Pap17 pilus phase variation of uropathogenic E. coli strain C1212 was visualized with anti-Pap17 antibodies labeled with 10-nm colloidal gold particles. The bacterium at the left is in the ON-phase state for Pap17 expression, whereas the two bacteria at the right are in the OFF phase. Note that these two bacteria express unmarked Pap21 pili, which are also under phase variation control but are not marked with the anti-Pap17 antiserum. In addition to the Pap pili (diameter of about 7 nm), flagella (diameter of about 20 nm) can also be seen.

 
Little is known concerning how widespread epigenetic control is in the bacterial world and the roles that epigenetic regulatory systems play in bacterial biology, including pathogenesis. Our main goal in writing this review is to introduce the reader to epigenetic regulatory control, focusing on the main features and unique aspects of the epigenetic control systems that have been studied. The list of examples discussed below can be grouped into several classes: (i) strict-sense epigenetic inheritance involving heritable transmission of DNA methylation states to daughter cells, as in the pap operon of uropathogenic E. coli; (ii) DNA methylation signals that generate distinct epigenetic states in DNA molecules coexisting in the same cell, as in IS10 transposition and in traJ regulation; and (iii) systems that are "epigenetic" in a broader sense, since DNA methylation provides a signal for temporal or spatial control of DNA-protein interactions but does not give rise to distinct lineages of cells or DNA molecules. Examples of the last class include the control of bacterial mismatch repair by DNA methylation and the coupling of promoters to distinct DNA methylation states during the cell cycle. We hope that this will be useful not only in understanding experiments carried out to date but also as a primer for future work in bacterial epigenetics.


   FOUNDATIONS
 
Most epigenetic systems known in bacteria use DNA methylation as a signal that regulates a specific DNA-protein interaction. These systems are usually composed of a DNA methylase and a DNA binding protein(s) that bind to DNA sequences overlapping the target methylation site, blocking methylation of that site. Methylation of the target site, in turn, inhibits protein binding, resulting in two alternative methylation states of the target site, methylated and nonmethylated. The epigenetic regulatory methylases known in bacteria are designated "orphan" methylases since they lack a cognate restriction enzyme. We begin by discussing restriction-modification (R-M) systems, since they are likely the progenitors of the orphan methylases regulating epigenetic processes. Indeed, DNA methylation plays a regulatory role in some R-M systems, as described below.

Origins: R-M Systems

DNA methylation was originally discovered in the context of restriction-modification systems, in which a restriction endonuclease cleaves a specific target DNA sequence unless that sequence has been methylated by a cognate DNA methyltransferase (5, 27, 39, 153, 220, 260). Three main groups of R-M systems (types I, II, and III) have been described, based on whether the restriction and modification activities are within a single polypeptide (types I and III) or separate polypeptides (type II) and on whether the restriction enzymes cut at a site close to (types II and III) or far from (type I) the methylation target sequence (185, 221, 236, 238, 284). It has been postulated that R-M systems evolved as a form of cellular defense, targeting incoming viral and other foreign DNA sequences for degradation. Note that foreign DNAs would not be methylated at the appropriate target sites unless they were derived from a bacterium with a cognate methylase of the same specificity (6, 77). In these systems, the restriction enzyme and cognate methylase are both expressed at levels that allow complete methylation of the genome, sufficient to block double-strand DNA cleavage by the restriction enzyme, a potentially fatal event. Incoming foreign DNA is efficiently destroyed, since the restriction enzyme has the upper hand over the methylase: for the DNA to survive, every restriction site it carries would have to be methylated before even a single site is cleaved by the cognate restriction enzyme, an unlikely event.
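The "upper hand" argument can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that each restriction site on an incoming DNA molecule is independently methylated before it is cleaved with some probability p; neither the independence assumption nor any particular value of p comes from the studies cited above, and the function name is hypothetical.

```python
def survival_probability(n_sites: int, p_methylated_first: float) -> float:
    """Chance that every one of n_sites is methylated before any is cleaved.

    Assumes (hypothetically) that sites behave independently and that each is
    methylated before cleavage with probability p_methylated_first.
    """
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    if not 0.0 <= p_methylated_first <= 1.0:
        raise ValueError("p_methylated_first must lie in [0, 1]")
    # All sites must win the race, so the per-site probabilities multiply.
    return p_methylated_first ** n_sites


if __name__ == "__main__":
    # Illustrative values only: survival falls off steeply with site count.
    for n in (1, 5, 20):
        print(n, survival_probability(n, 0.5))
```

Under these assumptions the survival probability decays geometrically with the number of sites, which is one way to read why a single unprotected site is enough for restriction to win.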

Work by Kobayashi and colleagues has suggested that R-M systems have attributes of selfish genes (148-150). Nakayama and Kobayashi showed that a plasmid containing the type II R-M EcoRV system could not be displaced from cells by an incompatible plasmid due to the death of cells that lost the EcoRV-containing plasmid, a form of postsegregational killing (186). In cells that have lost the R-M gene complex, the levels of methylase and cognate restriction enzyme drop to a point where insufficient methylase is present to protect all chromosomal target sites; the restriction enzyme then cleaves one or more sites, killing the cell. This scenario is similar to that for addiction modules such as hok-sok, in which the sok gene expresses an antisense RNA that inhibits translation of the hok toxin gene. When cells lose a plasmid containing hok-sok, they die: because hok mRNA is stable but sok RNA is unstable (half-life [t1/2], <30 s), translation of hok ensues, which leads to cell death (91, 92). Other addiction modules are made of two proteins, a toxin and an antitoxin (82, 90, 106).

Further analysis of the EcoRV system has shown that a regulatory gene designated "C," sandwiched between the R and M genes, codes for a product that activates R gene expression (186). The C gene appears to be required for expression of the R gene, since postsegregational killing does not occur in C gene mutants. One function of the C gene is in establishment of an R-M system in a new host. In this case the M gene is immediately activated, allowing modification of host DNA sites. At the same time, C gene expression is also activated, building up the C protein level to a point that allows activation of R gene expression. This temporal delay in expression of the restriction enzyme is critical in allowing time for all chromosomal sites to be methylated and protected from digestion. In addition, C also functions as a suicide immunity gene, forcing expression of the R gene of an incoming closely related R-M complex with different restriction specificity, resulting in host cell death. This would be expected to prevent spread of a competing R-M complex of the same C gene immunity group (any R-M complex in which the resident C protein activates expression of an incoming R gene) within a bacterial population (250).

A second regulatory strategy used by R-M systems utilizes methylation of the cognate restriction site to control R-M transcription via a direct effect on RNA polymerase binding. For example, in the CfrBI system of Citrobacter freundii, methylation of a cytosine within the 5'-CCATGG-3' DNA restriction site decreases expression of the CfrBI methylase (CfrBIM) and concomitantly increases expression of the CfrBI restriction enzyme (CfrBIR) (18, 294). This appears to occur as a result of the location of the cfrBI site within the –35 RNA polymerase σ70 binding site of the cfrBIM gene. Since the cfrBIM promoter is stronger than that of cfrBIR, any bacterial cell receiving the CfrBI system will be methylated before restriction can occur. As the intracellular methylase level increases, the cfrBI site is methylated, decreasing expression of cfrBIM and enabling expression of cfrBIR. The latter may protect the cell from incoming foreign DNA lacking methylated sequences.

A third R-M regulatory mechanism utilizes the methylase itself as a feedback regulator. In a number of cases binding of the methylase to DNA occurs via an N-terminal extension containing a helix-turn-helix motif (142, 196, 197). For example, in the SsoII R-M system of Shigella sonnei, the SsoII methyltransferase (SsoIIM) represses its own synthesis and stimulates expression of the cognate restriction endonuclease (SsoIIR). Similar N-terminal extensions are present on a number of 5-methylcytosine methyltransferases, including those in the EcoRII, dcm, MspI, and LlaJI systems (142). The last system, present in Lactococcus lactis, encodes two methylases, M1.LlaJI and M2.LlaJI, recognizing the complementary and asymmetric sequences 5'-GACGC-3' and 5'-GCGTC-3', respectively, with methylation of the internal cytosine in each case. Two LlaJI restriction sites are present 8 bp apart within the regulatory region of the llaJI operon, with one site overlapping the –35 RNA polymerase σ70 recognition site of the operon. Notably, methylation of both 5'-GCGTC-3' sites by M2.LlaJI enhances binding of M1.LlaJI, repressing transcription of the llaJI operon. The ability of the M1.LlaJI methylase to distinguish methylated and nonmethylated target sites provides a feedback mechanism by which expression of the llaJI operon is controlled by DNA methylation.

The analysis of regulation of the EcoRV, CfrBI, and LlaJI R-M systems described above has provided insight into the evolution of epigenetic control systems that are predominantly controlled by "orphan" methyltransferases, including DNA cytosine methylase (Dcm) (202) in E. coli. It has been postulated that orphan methylases such as Dcm may have arisen by selection as vaccines against invasion of a restriction-modification complex (250). In the case of Dcm, which methylates the duplex sequence 5'-CCWGG-3' (top strand shown; W = A or T) at the first cytosine, this methylation protects against cleavage by EcoRII. It was shown that postsegregational killing by the EcoRII R-M complex was diminished by the presence of dcm (250), which partially protected host chromosomal DNA from restriction attack. This function of Dcm as a possible molecular vaccine may be analogous to the function of cytosine methylation in certain eukaryotes, including mammals, where methylation has been postulated to inactivate transposons (293), although this hypothesis has been challenged (30). Dcm is not known to be involved in gene regulatory control. However, the other orphan methylase in E. coli, DNA adenine methylase (Dam), with homologues in other Gammaproteobacteria, does play an essential role in regulating epigenetic circuits. As well, Alphaproteobacteria have a cell cycle-regulated methylase (CcrM) which plays a major role in the control of chromosome replication and regulates expression of certain genes. In the next section we describe the biochemical properties of these DNA methylases and additional components of epigenetic switches before discussing specific epigenetic systems in detail.

Orphan DNA MTases

Dam. Dam of E. coli is classified in the α group of DNA MTases based on the organization of 10 domains (167). The E. coli dam gene (accession no. J01600) is 834 bp and codes for a 32-kDa monomeric protein (114). Dam homologues are present in Salmonella spp., Haemophilus influenzae, and additional gram-negative bacteria (16, 204, 254). Dam binds to DNA nonspecifically as a monomer, moving by linear diffusion and specifically methylating 5'-GATC-3' sequences. At GATC sites the adenine base is flipped out 180° into the active site of the enzyme, where it is stabilized by hydrophobic stacking with a tyrosine in the DPPY motif, which is conserved among adenine methyltransferases (123, 157). The methyl group donor, S-adenosyl-L-methionine (AdoMet), is required for stable binding of the flipped adenine in the active-site pocket of the enzyme and binds to Dam after the methylase binds DNA, transferring a methyl group to the exocyclic N6 nitrogen of adenine (261). AdoMet binds to two sites in the Dam protein: one is the catalytic center, and the other seems to be involved in an allosteric change that may increase specific binding of Dam to DNA (22). Dam appears to methylate only one of the adenosines of a duplex GATC sequence at a time (261). Notably, Dam shows high processivity for most DNAs; that is, after one methylation event, it slides on the same DNA molecule and carries out additional methylation events (turnovers). This high processivity effectively increases the rate of Dam methylation and may reflect the fact that there are few (<100) Dam molecules present in a single E. coli cell, yet there are about 19,000 GATC sites to methylate. Dam levels vary according to growth rate as a result of increased transcription from one of five dam gene promoters, designated P2 (158).

Based on the estimated numbers of Dam and GATC target sites per cell, each Dam molecule modifies between 20 and 100 GATC sites per minute (kcat) (261). This number is about 100-fold higher than the turnover number observed in vitro using an oligonucleotide substrate with one GATC site, indicating that there is likely some difference(s) in vivo that enables Dam to be more efficient at methylation (261). One possibility, suggested by Urig et al. (261), is that Dam is associated with the DNA polymerase III machine, scanning DNA for GATC sites as DNA replication proceeds and thus methylating DNA much more efficiently than it would in a random walk.
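The scale of this discrepancy can be made concrete with a back-of-the-envelope calculation. The sketch below uses only the figures quoted above (about 19,000 GATC sites, at most about 100 Dam molecules, 20 to 100 sites per Dam per minute in vivo, and an in vitro rate roughly 100-fold lower). It makes the simplifying assumption, not taken from the cited work, that all sites are available to all Dam molecules at once; the function and constant names are hypothetical.

```python
GATC_SITES = 19_000             # approximate GATC sites per genome (from the text)
DAM_MOLECULES = 100             # upper bound on Dam molecules per cell (from the text)
IN_VIVO_RATES = (20.0, 100.0)   # sites per Dam per minute in vivo (from the text)
IN_VITRO_FOLD_LOWER = 100.0     # in vitro turnover is ~100-fold lower (from the text)


def minutes_to_methylate(sites: int, dam: int, rate_per_min: float) -> float:
    """Time for `dam` enzymes, each at `rate_per_min`, to modify `sites` sites.

    Simplifying assumption: every site is available from time zero, which
    ignores that hemimethylated sites arise progressively behind the fork.
    """
    return sites / (dam * rate_per_min)


if __name__ == "__main__":
    print("sites per Dam molecule:", GATC_SITES / DAM_MOLECULES)
    for rate in IN_VIVO_RATES:
        in_vitro_rate = rate / IN_VITRO_FOLD_LOWER
        print(
            f"in vivo {rate:g}/min -> "
            f"{minutes_to_methylate(GATC_SITES, DAM_MOLECULES, rate):.1f} min; "
            f"in vitro {in_vitro_rate:g}/min -> "
            f"{minutes_to_methylate(GATC_SITES, DAM_MOLECULES, in_vitro_rate):.0f} min"
        )
```

Under these assumptions, the in vitro turnover number would require hours to methylate one genome equivalent, whereas the in vivo estimate requires minutes, which is consistent with the suggestion that something in the cell makes Dam more efficient.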

The processive nature of Dam contrasts sharply with DNA methylases associated with R-M systems, such as the EcoRV methylase (MEcoRV), which methylates its GATATC recognition sites distributively (95). In this case and for other R-M systems, incoming DNA needs to be restricted (cut) by the restriction enzyme before every site is methylated. The restriction enzyme has the advantage, since if just one restriction site in an incoming phage genome is left unmodified, the enzyme can cleave the DNA and block its replication. Note that restriction could be hampered if R-M DNA methylases were highly processive like Dam: processivity would increase the chances that all restriction sites in an incoming phage, for example, would be modified before restriction could occur.

Other gram-negative Gammaproteobacteria besides E. coli, including Salmonella spp., Serratia marcescens, Yersinia spp., Vibrio cholerae, Haemophilus influenzae, and Neisseria meningitidis, code for orphan MTases that share significant sequence identity with EcoDam and target the adenosine of the GATC DNA sequence (162). Although Dam is not essential for growth of E. coli and Salmonella on laboratory media (14, 172, 254), the Dam homologues in Yersinia pseudotuberculosis, Yersinia enterocolitica, and Vibrio cholerae are essential gene products (135). However, a strain of Y. pseudotuberculosis in which dam mutations are viable has been described (252). It is not known what essential function(s) Dam plays in the pathogens in which it is essential, but it is provocative that both Yersinia and Vibrio contain two chromosomes, in contrast to the single chromosomes in E. coli and Salmonella spp., where Dam is not essential. A speculation is that Dam may be essential to coordinate DNA replication in bacteria with two or more chromosomes (78).

Dam homologues without a restriction enzyme counterpart are also present in bacteriophages, including Sulfolobus neozealandicus droplet-shaped virus (7), halophilic phage φCh1 (15), H. influenzae phage HP1 (204), phage P1 (61), phage T1 (9), and phage T4 (226). The last MTase, T4Dam, has been well characterized biochemically, primarily by Hattman and colleagues (123, 228). T4Dam, like EcoDam, is highly processive (169) and complements the mutator phenotype of an E. coli dam mutant (226). T4Dam and EcoDam may have a common evolutionary origin, sharing up to 64% sequence identity in four different regions (11 to 33 amino acids long) (105). After methylation, with the resulting formation of S-adenosyl-L-homocysteine, AdoMet binds to T4Dam without the enzyme dissociating from the DNA duplex (299). Like EcoDam, T4Dam appears to flip out the adenosine of the GATC sequence, facilitating its methylation (168).

CcrM. The cell cycle-regulated DNA MTase family (CcrM) constitutes a second important group of orphan methyltransferases, classified in the β group of MTases and originally identified in Caulobacter crescentus (167, 242, 300). CcrM binds to and methylates adenosine in the sequence 5'-GANTC-3', where "N" is any nucleotide (167, 300). Like EcoDam, CcrM is a functional monomer and acts processively (20), although evidence suggests that it is a dimer at physiologic concentration (234). However, unlike EcoDam, CcrM has a distinct preference for hemimethylated DNA as a substrate, based on the observation that the turnover rate for hemimethylated DNA containing a GANTC target site(s) was significantly higher than that for DNA containing nonmethylated sites (20). The GANTC sequence is also the target of HinfM methylase, which shares 49% identity with CcrM and whose cognate restriction enzyme HinfI from H. influenzae cuts at nonmethylated GANTC sites (300).
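The two recognition sequences discussed in this section, 5'-GATC-3' for Dam and 5'-GANTC-3' for CcrM, can be located in a sequence with a simple pattern search. The following is a minimal sketch; the function name and the example sequence are made up for illustration. Because both motifs are palindromic, scanning one strand identifies the sites on both strands.

```python
import re


def find_sites(seq: str, motif: str) -> list:
    """Return 0-based start positions of `motif` in `seq`.

    'N' in the motif matches any of A, C, G, or T. A lookahead is used so
    that overlapping occurrences would also be reported.
    """
    pattern = motif.upper().replace("N", "[ACGT]")
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


if __name__ == "__main__":
    example = "TTGATCAAGAGTCGGGATTCAGATC"  # made-up sequence, for illustration only
    print("Dam targets (GATC):", find_sites(example, "GATC"))
    print("CcrM targets (GANTC):", find_sites(example, "GANTC"))
```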

In Caulobacter, CcrM is an essential cell component and plays a crucial role in cell cycle regulation (20, 139, 170, 214-216, 242, 243, 300). CcrM homologues, which are likewise essential, have been found in Agrobacterium tumefaciens, the causative agent of crown gall disease in plants (137); in Rhizobium meliloti, the nitrogen-fixing symbiont of alfalfa and other legumes (286); and in the animal pathogen Brucella abortus (222). In B. abortus, aberrant CcrM expression impairs the pathogen's ability to proliferate in murine macrophages, raising the possibility that CcrM methylation might control the synthesis of virulence factors (222).

Regulation of Cellular Events by the Hemimethylated DNA State

Following passage of the DNA replication fork in E. coli, GATC sites methylated on the top and bottom strands in a mother cell (denoted as fully methylated) are converted into two hemimethylated DNA duplexes: one methylated on the top strand and nonmethylated on the bottom strand and one methylated on the bottom strand and nonmethylated on the top strand due to semiconservative replication (Fig. 2A). Most GATC sites are rapidly remethylated by Dam and exist in the hemimethylated state for only a fraction of the cell cycle (Fig. 2A). Exceptions are the DNA replication origin oriC, the dnaA promoter, and possibly additional GATC sites in the chromosome which bind SeqA (60). SeqA preferentially binds to clusters of two or more hemimethylated GATC sites spaced one to two helical turns apart (Fig. 2B). In the case of oriC, which contains a cluster of 13 GATC sites, sequestration delays remethylation and prevents binding of the DnaA protein, which controls the initiation of DNA replication. At other sites, binding of SeqA tetramers to hemimethylated GATC sites may organize nucleoid domains (100). Notably, the transcription profile of an E. coli SeqA mutant was found to be similar to that of a Dam overproducer strain. Based on this observation, a model was developed in which Dam and SeqA compete for binding to hemimethylated DNA generated at the replication fork (159).


 
FIG. 2. Generation of hemimethylated and nonmethylated GATC sites. (A) The vast majority of chromosomal GATC sites in E. coli are fully methylated until DNA replication generates two hemimethylated species, one methylated on the top strand and one methylated on the bottom strand. Within a short time after replication (less than 5 min), Dam methylates the nonmethylated GATC site, regenerating a fully methylated GATC site. (B) Two or more helically phased GATC sites (for example, in oriC) can be bound by SeqA when they are in the hemimethylated state. Binding of SeqA inhibits Dam methylation, maintaining the hemimethylated state for a portion of the cell cycle. Dissociation of SeqA allows Dam to methylate the hemimethylated DNAs, generating fully methylated DNA. (C) Certain GATC sites are present within or adjacent to regulatory protein binding sites. In some but not all cases, protein binding blocks DNA methylation over the entire cell cycle, stabilizing the hemimethylated state in the first generation and leading to a nonmethylated state in the second generation (only the second generation for the DNA methylated on the top strand is shown at the right).
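The progression summarized in panels A and C can be traced with a small simulation. This is a minimal sketch under simplifying assumptions: a single GATC site is followed, Dam remethylation is treated as all-or-nothing within each generation, and a bound regulatory protein is modeled as a permanent block. The type alias and function names are hypothetical.

```python
from typing import List, Tuple

# (top strand methylated, bottom strand methylated) at a single GATC site
Duplex = Tuple[bool, bool]


def replicate(duplex: Duplex) -> List[Duplex]:
    """Semiconservative replication: each daughter keeps one parental strand."""
    top, bottom = duplex
    # The newly synthesized strand always starts out nonmethylated.
    return [(top, False), (False, bottom)]


def dam_methylate(duplex: Duplex, blocked: bool) -> Duplex:
    """Dam methylates both strands unless protein binding blocks the site."""
    return duplex if blocked else (True, True)


def state_name(duplex: Duplex) -> str:
    return {2: "full", 1: "hemi", 0: "none"}[sum(duplex)]


def generations(start: Duplex, n: int, blocked: bool) -> List[List[str]]:
    """Methylation states of all descendant duplexes after each generation."""
    population = [start]
    history = [[state_name(d) for d in population]]
    for _ in range(n):
        population = [
            dam_methylate(child, blocked)
            for parent in population
            for child in replicate(parent)
        ]
        history.append([state_name(d) for d in population])
    return history


if __name__ == "__main__":
    print("unblocked:", generations((True, True), 2, blocked=False))
    print("blocked:  ", generations((True, True), 2, blocked=True))
```

With the block in place, the first generation is hemimethylated and nonmethylated duplexes first appear in the second generation, matching panel C; without it, every duplex returns to the fully methylated state.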

 
The half-life of hemimethylated GATC sites not bound by SeqA has been estimated to be between 0.5 and 4 min, based on analysis of synchronized E. coli cells and monitoring the methylation status with restriction enzymes DpnI, which cuts fully methylated GATC sites; MboI, which cuts fully nonmethylated sites; and Sau3AI, which cuts GATC sites regardless of methylation state (50). In contrast, analysis of the origin of replication in the colicinogenic plasmid ColE1 indicated that remethylation of hemimethylated GATC sites occurs within a few seconds of passage of the replication fork (241). Notably, remethylation appeared to occur asynchronously, with methylation at GATC sites on the leading replication arm occurring more rapidly than GATC methylation on the lagging arm (about 2 seconds versus 4 seconds), suggesting that remethylation on the lagging arm occurs after ligation of Okazaki fragments. The reason for the discrepancy in estimation of the half-life of GATC sites is unclear but could reflect differences in chromosomal versus plasmid replication. For chromosomal replication the DNA polymerase III replication machinery is stationary, bound to the cytoplasmic membrane with DNA moving through it (154, 179). It is possible that Dam is present in a complex bound near the origin, methylating nascent DNA sequences as they arise.
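The three-enzyme readout described above can be sketched as a small decision rule. This is a minimal illustration that assumes each digest is scored as a simple yes/no at a single GATC site, exactly as the enzymes are characterized in the text; the function name and return labels are hypothetical, and partial digestion or mixed populations are ignored.

```python
from typing import Optional


def infer_gatc_state(dpni_cuts: bool, mboi_cuts: bool, sau3ai_cuts: bool) -> Optional[str]:
    """Infer the methylation state of one GATC site from three digests.

    Assumes the idealized specificities given in the text:
    DpnI cuts fully methylated sites, MboI cuts fully nonmethylated
    sites, and Sau3AI cuts regardless of methylation (positive control).
    """
    if not sau3ai_cuts:
        # No cleavage by the methylation-insensitive enzyme: the site is
        # absent or the digest failed, so nothing can be inferred.
        return None
    if dpni_cuts and mboi_cuts:
        # Contradictory for a single homogeneous site; likely a mixed population.
        return None
    if dpni_cuts:
        return "fully methylated"
    if mboi_cuts:
        return "nonmethylated"
    # Cut by Sau3AI only: resistant to both DpnI and MboI.
    return "hemimethylated"
```

Under these assumptions, a site that resists both DpnI and MboI but is cut by Sau3AI is scored as hemimethylated, which is the signal followed in the synchronized-cell experiments.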

The presence of hemimethylated GATC sites provides a signal that DNA replication has just occurred and plays a role in diverse cellular processes. For example, in methyl-directed mismatch repair the MutH protein binds to hemimethylated GATC sites and cleaves the nonmethylated DNA strand, ensuring that mutations in the daughter DNA strand are repaired using the parental strand as a template. In the absence of Dam, MutH can cleave the daughter strand, the parental strand, or both DNA strands. If the cell survives double-strand DNA breakage, 50% of the time the mutant daughter strand is used as a template to "repair" the parental strand, resulting in fixation of a mutation into the DNA (172, 285). Hemimethylated GATC sites are also used to control rates of transposition of insertion sequences IS3, IS10, IS50, and IS903 as well as transposons Tn5, Tn10, and Tn903 (73, 217, 219, 292). Elegant studies from Kleckner's laboratory showed that hemimethylated GATC sites control IS10 transposition in two different ways (181, 219). First, a GATC site present at bp –67 to –70 (here designated GATC-68) within the –10 module of the transposase promoter pIN controls transcription of the transposase gene. Full methylation of GATC-68 inhibits RNA polymerase binding, reducing the level of tnp IS10 transcription. A second GATC site at bp 1320 to 1323 (GATC-1321) near the inner terminus of IS10 controls binding of transposase. Full methylation of GATC-1321 blocks transposition by inhibiting transposase binding. These two effects of DNA methylation on transposase expression and binding effectively limit IS10 transposition to a brief period immediately following DNA replication when GATC-68 and GATC-1321 are hemimethylated. Remarkably, the two hemimethylated IS10 DNAs have different transposition activities: IS10 methylated on the template strand is about 330 times more active than IS10 methylated on the nontemplate strand and 1,000 times more active than fully methylated IS10 (219). The majority of this difference is due to increased binding of transposase at the inner IS10 terminus; in addition, activation of the transposase promoter is more efficient in the IS10 hemimethylated species whose template strand is methylated. Since transposition of Tn10 does not involve the inner terminus, stimulation of Tn10 transposition following DNA replication is less efficient than for IS10 (219).

Like that of Tn10, transposition of IS50 and of Tn5 is stimulated by DNA replication (175). GATC sites are present within the inside end (IE) of IS50, similar to the case for IS10, and within the –10 region of the transposase regulatory region (73, 253, 292). In both IS50 and Tn5, Dam methylation represses tnp promoter activity and transposase binding to the IS50 IE (73, 253, 292). Increased transposition of IS50 and Tn5 in a Dam-deficient host requires integration host factor (IHF), probably to compensate for a DNA conformational defect associated with the lack of Dam (165). In turn, binding of Fis (factor for inversion stimulation) to the IE inhibits IS50 transposition (276). Methylation of three GATC sites within the Fis recognition sequence inhibits Fis binding. Thus, immediately following DNA replication, Fis binds to the IE, inhibiting IS50 transposition, and counteracts the positive effects of the hemimethylated state on IS50 transposition. In contrast, Tn5 transposition is not inhibited by Fis, since it does not use the IE (276).

DNA hemimethylation may regulate transcription of additional genes that contain GATC sites within their promoter regions. The list includes glnS, sulA, trpS, trpR, and tyrR of E. coli and cre of bacteriophage P1 (16, 172, 205, 246). Expression of these genes was increased in the absence of Dam, suggesting that GATC methylation may decrease binding of RNA polymerase. The possible physiologic significance of methylation of these sites is not known, but it could tie gene expression to the replication state of the cell, increasing transcription immediately after passage of the replication fork. In the case of the trpR gene, which encodes the repressor of the trp operon, an attractive speculation has been proposed by M. G. Marinus: because trpR is located between the origin of replication and the trp operon, a transient boost in trpR transcription might provide the increased concentration of repressor necessary to maintain repression when chromosome replication doubles trp operon dosage (171).

DNA Methylation Patterns

About 16 years ago, Blyn et al. discovered that one of two GATC sites within the regulatory region of the chromosomally encoded pyelonephritis-associated pilus (pap) operon of uropathogenic Escherichia coli (UPEC) was heritably nonmethylated, depending upon the pilus expression state of the cells (34). When DNA was isolated from cells expressing pyelonephritis-associated pili (Pap pili) (ON-phase cells), it was found that a GATC site proximal to the pap pilin promoter was methylated, whereas the promoter-distal GATC site was nonmethylated. This DNA methylation pattern characteristic of ON-phase cells differed from that of OFF-phase cells, which contained the converse pattern where the GATC site proximal to the pap pilin promoter was nonmethylated and the promoter-distal GATC site was methylated. The term "nonmethylated" is defined here as a state in which the GATC target of DNA adenine methylase is not methylated on either the top or bottom DNA strand, constituting a DNA methylation pattern analogous to those observed in mammalian cells (34). Since the term "unmethylated" might imply that an active demethylation has occurred, we prefer use of "nonmethylated" to describe DNA lacking a methyl group on both the top and bottom DNA strands. The phenomenon of demethylation, which occurs in eukaryotes to reset the DNA methylation pattern after zygote formation (88, 147), has not been reported to occur in prokaryotes. DNA methylation patterns are formed in bacteria by binding of a protein(s) at a DNA site(s) overlapping or near a GATC site(s), preventing methylation of that site(s) throughout the cell cycle (Fig. 2C). A direct role for DNA methylation patterns in the heritable control of gene expression in bacteria was first shown in the Pap system (41).
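How a bound protein converts a fully methylated GATC site into a nonmethylated one without any demethylation step (Fig. 2C) can be followed in a toy simulation. The sketch below is illustrative only: it assumes that protection is all-or-none over the whole cell cycle, that Dam fully methylates every unprotected site between rounds of replication, and the function names are hypothetical.

```python
from typing import List, Tuple

# A GATC site is tracked as (top strand methylated, bottom strand methylated).
Site = Tuple[bool, bool]


def replicate(site: Site) -> List[Site]:
    """Semiconservative replication: each daughter duplex keeps one
    parental strand, and the newly synthesized strand is nonmethylated."""
    top, bottom = site
    return [(top, False), (False, bottom)]


def remethylate(site: Site, protected: bool) -> Site:
    """Dam methylates both strands unless a bound protein blocks access."""
    return site if protected else (True, True)


def describe(site: Site) -> str:
    labels = {2: "fully methylated", 1: "hemimethylated", 0: "nonmethylated"}
    return labels[sum(site)]


def lineage(protected: bool, generations: int = 2) -> List[List[str]]:
    """Follow one fully methylated site through successive generations."""
    population: List[Site] = [(True, True)]
    history = []
    for _ in range(generations):
        population = [
            remethylate(daughter, protected)
            for parent in population
            for daughter in replicate(parent)
        ]
        history.append([describe(site) for site in population])
    return history
```

With `protected=True` the first generation is hemimethylated and nonmethylated duplexes appear in the second generation, purely by dilution of the parental methyl groups; with `protected=False` every duplex returns to the fully methylated state.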

Further analysis of DNA methylation patterns in E. coli showed that multiple GATC sequences (ca. 36 sites) in the genome of E. coli K-12, which lack pap DNA sequences, were stably nonmethylated (218, 272). These sites were identified by digestion of chromosomal DNA with MboI, which cuts at nonmethylated GATC sites. Since nonmethylated GATC sites are rare, the DNA fragments generated by MboI digestion are too large to be resolved by conventional agarose gel electrophoresis. Pulsed-field gel electrophoresis was used to resolve these fragments; however, the DNA sequences flanking the nonmethylated GATC sites were not determined. Ringquist and Smith (218) also showed for the first time that a number of Dcm target sites [CC(A/T)GG; the second cytosine is methylated at the C-5 position] were stably nonmethylated.

Wang and Church analyzed Dam DNA methylation patterns to assess the binding of proteins to chromosomal DNA sites. Chromosomal DNA was digested with MboI and ClaI and cloned into pBluescript, which enabled the nonmethylated GATC sites to be sequenced (272). Since binding of proteins such as catabolite gene activator protein (CAP) is dependent upon environmental conditions via the secondary regulator cyclic AMP (cAMP), DNA methylation patterns within the regulatory regions of genes bound by cAMP-CAP and other regulatory factors were found to be environmentally controlled (218, 251). For example, a GATC sequence within the regulatory region of the car operon, controlling carbamoyl phosphate synthetase and involved in arginine and pyrimidine anabolism, was found to be protected from Dam methylation (272). This nonmethylated GATC site and others are listed in Table 1, with the chromosomal position (bp 29444 for the GATC near the carA gene) in E. coli MG1655 (a K-12 isolate) also shown. No protection of the car GATC site was detected in the absence of pyrimidines, consistent with the hypothesis that a pyrimidine repressor(s) binds to the car promoter region near or overlapping the GATC site, protecting it from methylation. Indeed, CarP and IHF were shown to bind in the regulatory region of carAB and protect GATC-207 (Table 1) from methylation (54).


TABLE 1. Nonmethylated GATC sites in the E. coli K-12 chromosome

 
Another nonmethylated GATC site identified was in the gut (also known as srl) operon, controlling uptake of the alcohol sugar glucitol (bp 2823768). A binding site for CAP was identified near the nonmethylated GATC site located at –44.5 (GATC-44.5) relative to the transcription start site (263), suggesting the possibility that binding of CAP to the gut promoter blocks methylation of the GATC –44.5 site (note that in Table 1 this GATC site is 86 bp upstream of the AUG start site for gutA and is thus labeled "–86"). Analysis of DNA methylation in E. coli containing a deletion of the crp gene, coding for CAP, showed that methylation protection of the GATC-44.5 was reduced from 95% in crp+ cells to 50% in Δcrp cells. These data supported the hypothesis that CAP contributes to methylation protection of GATC-44.5 in vivo. However, further analysis of the gut operon showed that although cAMP-CAP binds to sites overlapping GATC-44.5, CAP does not protect this site from Dam methylation (263). Instead, the GutR repressor, which also binds at GATC-44.5, blocks methylation of this site both in vitro and in vivo. GutR-dependent protection of methylation of GATC-44.5 in vivo was not observed in the presence of glucitol, an activator of gut transcription, indicating that under these conditions GutR was no longer bound at GATC-44.5, allowing methylation of this site by Dam. However, methylation of GATC-44.5 did not affect binding of GutR to the gut regulatory region. These results led to the conclusion that although methylation protection indicates the presence of a DNA binding site in vivo, the absence of methylation protection of a GATC site does not prove the absence of binding of a protein at that site (263).

Wang and Church also identified nonmethylated GATC sites within the mtl (mannitol, bp 3769597), cdd (deoxycytidine deaminase, bp 2229798), flh (flagellar synthesis, bp 1976481), psp (stress response, bp 1366007), and fep (iron transport, bp 621523) operons (272). Using a similar approach in which nonmethylated GATC sites in the E. coli chromosome were cloned by digestion with MboI and AvaI, Hale et al. identified four nonmethylated GATC sites in the regulatory regions of the ppiA (bp 3490085), yhiP (bp 3638351), rspA (bp 1653241), and b1776 (bp 1859455) genes (99). Protection of the ppiA GATC site was dependent upon growth phase and carbon source. Protection of a GATC site near yhiP required leucine-responsive regulatory protein (Lrp) and was leucine responsive, similar to the case for some operons controlled by this global regulator (44, 68, 188, 189). The other GATC sites were protected under all the environmental conditions examined (99). A more comprehensive approach to identification of nonmethylated GATC sites was undertaken by Tavazoie and Church (251); this approach allowed 12 additional sites to be identified, all of which were located within 5' noncoding regions of genes and open reading frames (Table 1).
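The coordinates quoted above make it easy to see why MboI digestion of Dam-methylated chromosomal DNA yields fragments that need pulsed-field gels. The sketch below computes fragment sizes on a circular chromosome cut only at the eleven nonmethylated sites named in the text. The chromosome length of 4,639,221 bp is an assumption (the length of the published MG1655 sequence; it is not stated here), the function name is hypothetical, and a real digest would also be cut at the other nonmethylated sites.

```python
from typing import Dict, Iterable, List

# Assumption: length of the published E. coli MG1655 sequence (not given in the text).
GENOME_LENGTH = 4_639_221

# Nonmethylated GATC positions (bp) quoted in the text.
NONMETHYLATED_SITES: Dict[str, int] = {
    "car": 29444, "gut": 2823768, "mtl": 3769597, "cdd": 2229798,
    "flh": 1976481, "psp": 1366007, "fep": 621523, "ppiA": 3490085,
    "yhiP": 3638351, "rspA": 1653241, "b1776": 1859455,
}


def circular_fragments(cut_positions: Iterable[int], genome_length: int) -> List[int]:
    """Fragment lengths from cutting a circular molecule at the given positions."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [genome_length]  # an uncut circle stays a single molecule
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    # The last fragment wraps around the origin of the coordinate system.
    fragments.append(genome_length - cuts[-1] + cuts[0])
    return fragments


fragments = circular_fragments(NONMETHYLATED_SITES.values(), GENOME_LENGTH)
print(sorted(fragments))
```

With only these eleven cuts, the smallest fragment is 117,026 bp (between the rspA/b1776 and flh sites), far above the range that conventional agarose gels resolve.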

Recent work by Blomfield's group on fim regulation controlling type 1 pili has identified two nonmethylated GATC sites at bp 4537512 and 4538525 in the E. coli chromosome near yjhA that are stably nonmethylated, separated from the fim locus by 1.4 kilobase pairs (80). These GATC sites are located near cis-active element regions 1 and 2, both of which play positive roles in transcription of the fimB recombinase gene, controlling type 1 pilus phase variation together with FimE (239). Binding of two regulatory proteins, the NanR sialic acid-responsive regulator and NagC, the N-acetylglucosamine-responsive regulatory protein, is required to activate fimB expression. Binding of NanR to region 1 blocks methylation of one adjacent GATC site, and binding of NagC to region 2 blocks methylation of the second GATC site. Only a fraction of the two GATC sites are nonmethylated after growth in glycerol minimal medium (239). Methylation protection of these GATC sites is not observed after addition of sialic acid (also known as N-acetyl-neuraminic acid). This likely occurs via inhibition of NanR binding, which is sensitive to sialic acid and inhibition by NagC via binding of N-acetylglucosamine-6-phosphate generated by sialic acid catabolism. Thus, binding of NanR and NagC controls methylation of two GATC sites adjacent to yjhA, likely by steric hindrance of Dam. However, mutation of the GATC site adjacent to region 1 did not affect fimB expression (239), indicating that methylation of this GATC site does not, in turn, modulate NagC binding. Moreover, in a dam mutant, expression of fimB is decreased, the opposite of what would be expected if GATC methylation inhibits NagC and NanR binding. These results indicate that the reported regulation of fim expression by Dam (199) does not occur via methylation of the GATC sites located near regions 1 and 2 adjacent to fim.

In summary, a small fraction of the approximately 20,000 GATC sites in the E. coli chromosome are totally or partially nonmethylated in any given growth state and environmental condition. The protection of GATC site methylation by Dam is dependent upon competition between Dam and specific DNA binding proteins. Dam appears to methylate most GATC sites in a highly processive manner, as discussed above. Recently, however, analysis of methylation of the regulatory GATC sites in the pap operon indicated that they are not methylated processively (32). That is, Dam binds to pap DNA, methylates one GATC site, and then dissociates before methylating the second site. This effectively reduces the ability of Dam to compete with proteins that bind to DNA sequences containing one or more GATC sites. Bergerat et al. first proposed that DNA sequences surrounding GATC sites may dictate the avidity of Dam for its target sites (23). Mutation of the AT-rich flanking sequences of the pap GATC sites to CG sequences increased processivity, which appeared to be due to changes in the kinetics of methyl transfer and not in binding affinity (203). Analysis of known nonmethylated GATC sites tentatively suggests a trend toward having AT-rich flanking sequences, though this is not always the case (Table 1).
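The flanking-sequence trend mentioned above could be examined with a simple scan. The sketch below locates every GATC in a sequence and reports the AT fraction of its flanks. The flank width of 5 bp and the example sequence are arbitrary, hypothetical choices for illustration, not values taken from the cited studies.

```python
from typing import List, Tuple


def gatc_flank_at_content(seq: str, flank: int = 5) -> List[Tuple[int, float]]:
    """Return (0-based position, AT fraction of flanks) for each GATC site.

    The flanks are the `flank` bases on each side of the GATC tetranucleotide,
    truncated at the sequence ends.
    """
    seq = seq.upper()
    results = []
    start = seq.find("GATC")
    while start != -1:
        left = seq[max(0, start - flank):start]
        right = seq[start + 4:start + 4 + flank]
        flanks = left + right
        at_fraction = sum(base in "AT" for base in flanks) / len(flanks) if flanks else 0.0
        results.append((start, at_fraction))
        start = seq.find("GATC", start + 1)
    return results


# Made-up sequence: one GATC in an AT-rich context, one in a GC-rich context.
example = "AAATTGATCTTAAACCGGCGATCGGCCC"
print(gatc_flank_at_content(example))
```

Applied to the regulatory regions listed in Table 1, a scan of this kind would give a first, rough view of whether nonmethylated sites sit in AT-rich contexts; it says nothing about processivity itself, which depends on the kinetics of methyl transfer.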

Since DNA methylation patterns are formed as a result of binding of proteins primarily at gene regulatory regions, they are altered by growth conditions that affect regulatory protein level(s) and/or DNA binding properties. As discussed above, identification of nonmethylated GATC sites has been used as a sort of natural in vivo footprint system to track binding of regulatory proteins under different environmental conditions (251, 272). In addition, it is clear that a subset of nonmethylated GATC sites (for example within the pap, sfa, daa, agn43, and other operons [see below]) play important roles in epigenetic regulation. In these systems, not only is a DNA methylation pattern established by protection of specific GATC sites by a regulatory protein(s), but methylation of the GATC site(s), in turn, modulates regulatory protein binding (263). This results in two heritable states: either the regulatory protein is bound to a specific DNA sequence containing a GATC site(s), protecting it from methylation, or the regulatory protein is not bound due to a reduction of binding affinity for target sequence(s) caused by GATC methylation. Clearly, only a subset of all nonmethylated GATC sites have these particular properties and are involved in epigenetic control systems. For example, as shown in Table 1, DNA methylation patterns have been shown to directly control expression of agn43 (111, 271) but do not control the gut (srl) operon (263) and do not appear to directly regulate fim (239). Further study will be necessary to determine if any of the other genes containing nonmethylated GATC sites in their regulatory regions are under methylation pattern control (Table 1).


   DNA ADENINE METHYLATION-DEPENDENT REGULATORY SYSTEMS
In the sections below we describe the current state of knowledge regarding how DNA methylation controls bacterial gene expression. Our focus for each methylation-controlled operon is on aspects of regulation affected by methylation and not on complete descriptions of regulatory networks.

Pap Pili

Pyelonephritis-associated pili play an important role in attachment of UPEC to uroepithelial cells lining the upper urinary tract, facilitating colonization of the kidneys. Pap pilus expression switches on and off within individual cells in the bacterial population, a process known as phase variation. The biological role of Pap pilus phase variation is not known, but possibilities include (i) escape from immune detection; (ii) facilitation of a bind-release-bind series of events in which successive generations of bacteria ascend the urinary tract; and (iii) controlling growth of UPEC by modulating the effects of contact-dependent growth inhibition, a newly described bacterial phenomenon (3).

DNA adenine methylase controls Pap phase variation by methylation of two GATC sites, one proximal to the pap pilin promoter (GATCprox), located 53 bp from the papBA transcription start site, and the other located 102 bp upstream of GATCprox, designated GATCdist (Fig. 3A). Note that these two GATC sites are located within Lrp DNA binding site 2 and site 5, respectively. Methylation at these two pap GATC sites controls the binding of the global regulator Lrp (44, 189) and the coregulatory protein PapI (118, 138) to pap DNA sites 1, 2, and 3 proximal to the papBA pilin promoter and to sites 4, 5, and 6 distal to papBA. Lrp appears to bind cooperatively to sites 1, 2, and 3 or to sites 4, 5, and 6 (193). Binding to all six sites can be achieved in vitro by addition of sufficient Lrp but rarely occurs in vivo based on analysis of the methylation states of GATCprox and GATCdist (41). In ON-phase cells GATCdist is nonmethylated and GATCprox is methylated (41) (Fig. 3D). Protection of GATCdist from Dam methylation requires both Lrp and PapI based on the observation that GATCdist is fully methylated in either an lrp or a papI mutant (40, 41). In contrast, OFF-phase cells display the converse DNA methylation pattern in which GATCprox is nonmethylated and GATCdist is methylated (Fig. 3A). Protection of GATCprox requires Lrp but not PapI (41, 263). Based on these in vivo DNA methylation patterns together with in vitro studies of Lrp binding, it was concluded that in ON-phase cells PapI-Lrp binds to sites 4, 5, and 6, protecting GATCdist from Dam, and in OFF-phase cells Lrp binds to sites 1, 2, and 3, protecting GATCprox from Dam (41). These DNA methylation patterns result from competition between Dam and Lrp for binding at sites 1, 2, and 3 and at sites 4, 5, and 6, containing GATCprox and GATCdist, respectively, as discussed in detail below.
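The correspondence between the two methylation patterns and the expression states described above can be written as a small lookup. This is only a summary of the patterns reported in the text; the function name and the "undetermined" label for the two remaining combinations are hypothetical.

```python
def pap_phase(gatc_prox_methylated: bool, gatc_dist_methylated: bool) -> str:
    """Map the methylation pattern of the two pap GATC sites to a phase state."""
    if gatc_prox_methylated and not gatc_dist_methylated:
        # PapI-Lrp bound at sites 4, 5, and 6 protects GATCdist from Dam.
        return "ON"
    if gatc_dist_methylated and not gatc_prox_methylated:
        # Lrp bound at sites 1, 2, and 3 protects GATCprox from Dam.
        return "OFF"
    # Both sites methylated or both protected: not one of the two
    # heritable patterns described for phase-ON or phase-OFF cells.
    return "undetermined"
```

The both-protected case corresponds to Lrp occupying all six sites, which the text notes can be forced in vitro but rarely occurs in vivo.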


FIG. 3. The Pap OFF- to ON-phase transition mechanism. The regulatory region of the pap operon is shown at the top, with six DNA binding sites for Lrp (gray rectangles) and GATCprox and GATCdist within Lrp binding sites 2 and 5, respectively. The divergent papI and papBA promoters are shown with arrows. Lrp (ovals), PapI (triangles), and PapB (diamonds) are shown. The methylation states of the top and bottom DNA strands of a GATC site are depicted by an open circle (nonmethylated) or closed circle (methylated). The OFF-to-ON switch is described in the text.

 
The Pap OFF- to ON-phase transition. In Fig. 3A (lower section), pap regulatory DNA with the OFF-phase DNA methylation pattern is depicted: GATCdist is fully methylated, and GATCprox is fully nonmethylated as a result of binding of Lrp at pap sites 1, 2, and 3 overlapping GATCprox. Transcription from papBA is blocked by binding of Lrp at sites 1, 2, and 3 overlapping the promoter, likely as a result of steric hindrance of RNA polymerase binding (278). The OFF-phase state is stabilized by two main factors: mutual exclusion and DNA methylation. Binding of Lrp at sites 1, 2, and 3 reduces the affinity of Lrp for pap sites 4, 5, and 6 (overlapping GATCdist) by 10-fold via a phenomenon that has been denoted "mutual exclusion" (116). Mutual exclusion requires a supercoiled pap substrate by an unknown mechanism. One possibility is that Lrp could induce bending at sites 1, 2, and 3, propagating an alteration in twist to sites 4, 5, and 6. Methylation of GATCdist reduces the affinity of Lrp for sites 4, 5, and 6 by about 20-fold based on in vitro DNA binding measurements (118). In addition, there is an intrinsic twofold-higher affinity of Lrp for sites 1, 2, and 3 versus 4, 5, and 6. These factors contribute to stabilization of the OFF-phase Pap expression state (116).

The transition from the OFF to ON phase requires that GATCprox be methylated by Dam; either a dam mutant E. coli strain or a GCTCprox A-to-C transversion mutant that cannot be methylated by Dam but does not significantly alter the affinity of Lrp for sites 1, 2, and 3 is locked in the OFF phase (41). In contrast, methylation of GATCdist has an inhibitory effect on the OFF-to-ON switch: overexpression of Dam by just fourfold prevents the OFF-to-ON switch. Moreover, E. coli containing a GCTCdist mutation that blocks Dam methylation is locked in the ON phase, even under conditions of Dam overexpression (41). These data support the hypothesis that OFF-to-ON switching requires DNA replication to generate a hemimethylated GATCdist intermediate, which is bound by PapI-Lrp with a higher affinity than DNA with a fully methylated GATCdist (118). A low level of the coregulatory protein PapI, required for Pap pili expression (138, 193, 194), increases the affinity of Lrp for pap DNA hemimethylated at GATCdist but does not enhance binding of Lrp to pap DNA fully methylated at GATCdist (118). Notably, the hemimethylation state of pap matters: PapI increases Lrp's affinity for DNA methylated on the top strand at GATCdist about fourfold more than for DNA methylated on the bottom strand (118). These results raise the intriguing possibility that Pap phase switching may be biased: daughter cells receiving a DNA methylated on the top strand may have a higher probability of switching to the ON phase than cells receiving DNA methylated on the bottom strand.

PapI is a small (ca. 9-kDa) coregulatory protein expressed from the papI promoter divergent to the papBA pilin promoter (Fig. 3A, top). PapI increases the affinity of Lrp for pap site 5, and to a lesser extent site 2, but has no effect on binding of Lrp to any of the other four Lrp binding sites (118) (Fig. 3C). pap Lrp binding sites 5 and 2 share the sequence "ACGATC," which differs from the other four pap Lrp binding sites and the ilvIH Lrp binding site 2 (65, 129, 138), which do not display PapI-dependent Lrp binding (118). All pap Lrp binding sites share the sequence "GNNNTTT" with the Lrp binding consensus determined by systematic evolution of ligands by exponential enrichment (64).

PapI does not appear to bind specifically to pap DNA by itself, based on gel shift analysis (138) and DNA cross-linking (118). DNA methylation interference indicated that methylation of bases in the sequence 5'-GNCGAT-3' overlapping GATCdist in the top strand and 3'-TGCTAG-5' in the bottom strand significantly reduced PapI-dependent Lrp binding compared with binding of Lrp alone. Methylation of the bottom-strand cytosine complementary to the guanine of "GATC" (meC9) blocked formation of the ternary PapI-Lrp-pap site 5 complex without affecting Lrp binding (118). These results support the hypothesis that enhancement of Lrp binding to site 5 occurs via formation of a PapI-dependent ternary complex with Lrp and pap DNA. Cross-linking with a photoactivatible 9-Å azidophenacyl cross-linker three bases from the presumptive PapI binding sequence "ACGATC" showed that PapI and Lrp were both cross-linked to pap DNA in the ternary complex with nonmethylated DNA, while only Lrp was cross-linked with DNA methylated at C9 (118). These results indicate that PapI is located near the pap ACGATC sequence in the PapI-Lrp-pap site 5 ternary complex and may directly contact this sequence.

The observation that PapI (100 nM) increases Lrp's affinity for pap site 2 (which contains the ACGATC PapI-specific sequence identical to site 5) (118) presents an apparent paradox, since this should block pap transcription due to its close proximity to the papBA pilin promoter (278). Further analysis showed that at low PapI levels significant enhancement of Lrp binding occurred at sites 4, 5, and 6 (CGATCdist) but not at sites 1, 2, and 3 (CGATCprox) (118). At 5 nM PapI, the affinity of Lrp was fourfold higher for pap sites 4, 5, and 6 (Kd = 0.25 nM) than for sites 1, 2, and 3 (Kd = 1.0 nM). Conversely, in the absence of PapI, the affinity of Lrp for sites 1, 2, and 3 (Kd = 1.2 nM) was about twofold higher than that for sites 4, 5, and 6 (Kd = 2.5 nM). Thus, binding of Lrp at sites 4, 5, and 6 should be favored at low PapI levels, resulting in activation of papBA transcription. This, in turn, would increase the PapI level via a PapB-mediated positive feedback loop whereby PapB binds upstream of the papI promoter and helps activate PapI expression (11, 85, 288) (Fig. 3B). High PapI levels could potentially shut off pap transcription by increasing the binding of PapI-Lrp complexes at promoter-proximal sites 1, 2, and 3. However, this is prevented by methylation of GATCprox by Dam, which specifically blocks PapI-dependent Lrp binding without affecting binding of Lrp alone (118).
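The dissociation constants quoted above can be turned into a quick comparison of the two site clusters. The sketch below uses the Kd values from the text and a simple single-site binding isotherm; that isotherm is an assumption made for illustration, since it ignores cooperativity, mutual exclusion, and methylation, and the names are hypothetical.

```python
# Kd values (nM) for Lrp binding, as quoted in the text (118).
KD_NM = {
    ("no PapI", "sites 1-3"): 1.2,
    ("no PapI", "sites 4-6"): 2.5,
    ("5 nM PapI", "sites 1-3"): 1.0,
    ("5 nM PapI", "sites 4-6"): 0.25,
}


def fraction_bound(lrp_nm: float, kd_nm: float) -> float:
    """Fractional occupancy for simple noncooperative binding (an assumption)."""
    return lrp_nm / (lrp_nm + kd_nm)


def distal_preference(condition: str) -> float:
    """Affinity for sites 4-6 relative to sites 1-3 (values above 1 favor distal)."""
    return KD_NM[(condition, "sites 1-3")] / KD_NM[(condition, "sites 4-6")]


for condition in ("no PapI", "5 nM PapI"):
    print(condition, round(distal_preference(condition), 2))
```

The ratio shifts from 0.48 without PapI (proximal sites favored about twofold) to 4.0 at 5 nM PapI (distal sites favored fourfold), which is the reversal that the text proposes as the trigger for papBA activation.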

To determine if the essential role of methylation of GATCprox in the OFF- to ON-phase transition is to specifically block PapI-dependent Lrp binding to sites 1, 2, and 3, the wild-type CGATCprox sequence was mutated to TGATCprox to specifically inhibit PapI-dependent Lrp binding. It was reasoned that under conditions in which PapI-dependent binding of Lrp to sites 1, 2, and 3 was blocked, switching from OFF to ON phase should occur in the absence of Dam. Analysis of the TGATCprox mutant showed that PapI-dependent Lrp binding to sites 1, 2, and 3 was inhibited but binding of Lrp was unaffected both in vitro and in vivo. Switch frequency analysis of E. coli containing the TGATCprox mutation showed that the OFF-to-ON rate (5.6 × 10⁻⁴/cell/generation) was about sevenfold higher than that of wild-type cells (8.2 × 10⁻⁵/cell/generation). Notably, in a dam null mutant background cells were locked in the ON-phase state, showing that methylation is not required for pap transcription under conditions in which PapI-dependent binding of Lrp to pap site 2 containing GATCprox is blocked. These results support the conclusion that methylation at GATCprox is required for the OFF- to ON-phase transition by specifically inhibiting PapI-dependent Lrp binding to sites 1, 2, and 3 (Fig. 3C, top).

Environmental mechanisms for switch control. Binding of Lrp at sites 4, 5, and 6, together with binding of cAMP-CAP at –215.5 (relative to the papBA transcription start site) (277), enhances papBA transcription via contact between CAP activating region 1 and the αC-terminal domain of RNA polymerase (277). In this way, Pap pilus expression is environmentally controlled by carbon source via the cAMP level. The role of Lrp may be structural, bending pap DNA between the CAP binding site at –215.5 and the papBA promoter to facilitate contact between cAMP-CAP and the αC-terminal domain. This results in transcription initiation from papBA and expression of PapB, which has been reported to bind with highest affinity to a site between the papI promoter and the CAP binding site (85), stimulating papI transcription, which constitutes a positive feedback loop (Fig. 3D). The high PapI level ensures binding of PapI-Lrp to sites 4, 5, and 6, and methylation of GATCprox prevents binding of PapI-Lrp to sites 1, 2, and 3, which would shut off papBA transcription and turn the switch OFF (278). The fact that both PapI and PapB are required for switching from the OFF to ON phase raises a chicken-and-egg problem that has not been adequately addressed: which regulatory factor initiates the switch? We speculate that regulation is at the level of PapB expression and that a low level of papBA mRNA is made following DNA replication and Lrp/H-NS dissociation from sites 1, 2, and 3 (266). If this papBA mRNA is rapidly translated, it would induce papI transcription, initiating the OFF-to-ON switch cascade. There is indirect evidence to support the idea that there may be translational control involved in Pap pilus expression, since a rimJ mutation affects pap gene regulation (280-282). RimJ acetylates ribosomal protein S5 in the 30S subunit. Thus, it is possible that ultimately the initiation of the Pap OFF-to-ON switch may be dependent upon the translation of a basal level of papBA mRNA present immediately following DNA replication.

The global regulatory protein H-NS is not required for Pap phase variation (266), but it does modulate Pap gene expression and Pap switch rates. H-NS represses papBA transcription in response to low temperature (94), high osmolarity (283), and rich medium (283). This may occur by specific binding of H-NS to the pap regulatory region, as evidenced by blocking of methylation of both pap regulatory GATC sites in vitro and in vivo (279). Binding of H-NS near the papBA promoter could inhibit binding of RNA polymerase, repressing transcription. Notably, at 37°C H-NS appears to positively affect Pap phase variation, since the OFF-to-ON switch rate is reduced in an hns mutant (266, 283). This positive effect of H-NS on the OFF- to ON-phase transition could occur via competition with Lrp at sites 1, 2, and 3, which would help to move PapI-Lrp to sites 4, 5, and 6, analogous to the role of methylation of GATCprox (Fig. 3C).

Another environmental input into Pap phase variation is mediated by the CpxAR response regulatory system (117, 127). Under certain conditions that stress the cell envelope, including high pH, CpxA located in the inner membrane autophosphorylates and then transfers a phosphate group to CpxR to yield CpxR-phosphate (CpxR-P) (176, 211). CpxR-P binds to sites overlapping all six pap Lrp binding sites, competes with Lrp for binding to these sites, and shuts off papBA transcription and Pap pilus expression (115, 117). Notably, CpxR-P binding to pap sites 1 to 6 is not inhibited by DNA methylation, in contrast to Lrp, even though CpxR-P, like Lrp, binds at sites overlapping the pap GATCprox and GATCdist sites. The biological role of CpxAR regulation of Pap pilus expression is not fully clear. One possibility is that under conditions of envelope stress it makes sense to curtail pilus expression to prevent further damage to the membrane. Another provocative possibility is that under conditions of stress UPEC cells stop making Pap pili, making them susceptible to contact-dependent growth inhibition (3). The physiologic significance of this is unknown, but it might contribute to survival under harsh conditions by slowing bacterial metabolism and growth (3).

The Pap ON- to OFF-phase transition. The Pap ON- to OFF-phase transition occurs at about a 100-fold-higher rate than the OFF- to ON-phase transition (35, 266). Notably, factors including H-NS, carbon source, and osmolarity do not affect the ON- to OFF-phase transition rate (35, 266, 283); therefore it appears that the ON- to OFF-phase transition rate is relatively constant under different environmental conditions. The ON- to OFF-phase transition has not been thoroughly examined, but based on knowledge of the OFF-to-ON switch mechanism (116-118) (see above), the following model is postulated. Starting with a cell in the ON-phase state (Fig. 4A), DNA replication is postulated to dissociate PapI-Lrp from sites 4, 5, and 6, enabling Dam to compete with Lrp for binding at GATCdist (Fig. 4C). Methylation of GATCdist is essential for the OFF-phase state (41). DNA replication also generates two hemimethylated GATCprox sites, one methylated on the top strand and one on the bottom strand (Fig. 4B). Whether a cell remains in the ON phase or transitions to the OFF phase may be dictated by competition of Lrp for binding to pap promoter-proximal sites 1, 2, and 3 versus distal sites 4, 5, and 6 (Fig. 4B). Lrp has about a twofold-higher affinity for the proximal sites than for the distal sites, and methylation of GATCprox does not affect Lrp binding to these proximal sites (118). In contrast, methylation of GATCdist inhibits binding of Lrp and PapI-Lrp to the distal sites (118, 194). These two factors should favor binding of Lrp to the proximal sites over the distal sites, which may account in part for the high ON-to-OFF rate observed. Following one additional round of DNA replication, the OFF-phase state is attained (Fig. 4D).


FIG. 4. The Pap ON- to OFF-phase transition mechanism. See the legend to Fig. 3 for explanations of symbols. The ON-to-OFF switch mechanism is described in the text.

 
Clearly, the Pap epigenetic switch mechanism is complex, involving distinct DNA methylation and protein-DNA binding states. Therefore, it would be highly useful to have a mathematical model that could predict switch rates under a variety of conditions and identify the key regulatory step(s) determining switch outcome. Liao and coworkers have developed a model for Pap phase variation that takes into account many of the protein-protein and protein-DNA interactions of Lrp, PapI, and Dam described above (131, 297). To rigorously test a model, one would need to alter cellular levels of PapI, Lrp, and Dam and experimentally determine switch rates. In addition, a useful model should be able to predict switch outcomes when, for example, the affinities of PapI, Lrp, and Dam for pap DNA have been altered. Although these types of analyses have not yet been carried out, preliminary data suggest that the Markov chain model for Pap may be useful in understanding Pap switch dynamics. However, the model underestimated, for example, the frequency of ON-state cells in the population (297). Reliable numbers for biochemical parameters of the Pap switch, such as association and dissociation binding constants for PapI-Lrp, Lrp, and Dam at sites 1, 2, and 3 and at sites 4, 5, and 6, have not yet been obtained. This makes it difficult to determine whether discrepancies between the Pap model and experimental data arise from incorrect biochemical parameters used in the model or from incorrect or incomplete assumptions in the model. Recently, another Pap switch model was developed by Munsky and Khammash (183, 184). Further work as outlined above will be necessary to test these models and determine if they are useful in furthering our understanding of the Pap switch and other epigenetic switch systems (see below).
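
To make the idea of a switch-rate model concrete, the sketch below reduces phase variation to the simplest possible abstraction: a two-state Markov chain in which a cell switches OFF to ON with probability p_on per generation and ON to OFF with probability p_off per generation. This is not the model of Liao and coworkers or of Munsky and Khammash, both of which track individual methylation and protein-binding states. The absolute rates used here are placeholder assumptions; only the roughly 100-fold ratio between the two rates is taken from the text.

```python
def steady_state_on_fraction(p_on: float, p_off: float) -> float:
    """Stationary ON fraction of a two-state chain: p_on / (p_on + p_off)."""
    if p_on < 0 or p_off < 0 or (p_on + p_off) == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return p_on / (p_on + p_off)


def simulate_on_fraction(p_on: float, p_off: float,
                         generations: int, start_on: float = 0.0) -> float:
    """Propagate the population-level ON fraction one generation at a time."""
    fraction_on = start_on
    for _ in range(generations):
        # ON cells that stay ON, plus OFF cells that switch ON.
        fraction_on = fraction_on * (1.0 - p_off) + (1.0 - fraction_on) * p_on
    return fraction_on


# Hypothetical per-generation switch probabilities (assumed for illustration).
P_ON = 1e-4            # OFF-to-ON, placeholder value
P_OFF = 100 * P_ON     # ON-to-OFF, about 100-fold higher as stated in the text

if __name__ == "__main__":
    print("steady-state ON fraction:", steady_state_on_fraction(P_ON, P_OFF))
    print("ON fraction after 5000 generations:",
          simulate_on_fraction(P_ON, P_OFF, 5000))
```

Under these assumptions the stationary ON fraction depends only on the ratio of the two rates (1/101, or about 1%), so a comparison with measured ON-state frequencies tests the rate ratio rather than either rate on its own.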

Pap-Related Systems

Analysis of pilus operons containing regulatory regions with homology to pap indicates that there are two groups: those that are positively regulated by PapI homologues, similar to the pap system, and those negatively regulated by PapI homologues.

PapI homologue acting as a positive regulator of pilus expression. The regulatory regions of many pilus operons in E. coli, including Pap-related fimbriae (Prf), foo (F1651 pili), clp (CS31A pili), sfa (S pili), daa (F1845), fae (K88), and afa (afimbrial adhesin), share two GATC sites analogous to GATCprox and GATCdist and spaced 102 base pairs apart as in pap (151) (Fig. 5). Moreover, these GATC sites are present within additional conserved sequences, "CGATCdistTTTT" and "CGATCproxTT," with the entire sequence called a "GATC box" (note the inverse orientations of the GATC boxes in the pilus regulatory sequences shown in Fig. 5). Since the GATC box sequence contains binding sites for Lrp and Dam, as well as a portion of the PapI response element "ACGATC," this provides the means by which these various pilus operons are controlled by DNA methylation patterns.
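
The shared architecture described above (two GATC sites at a fixed spacing) is simple enough to search for computationally. The following is a minimal sketch that scans one strand of a sequence for GATC pairs at a given spacing. The function name is hypothetical, and the convention that the 102-bp spacing is measured from the first base of one GATC to the first base of the other is an assumption; the spacing is a parameter so it can be adjusted if a different convention applies.

```python
def find_gatc_pairs(seq: str, spacing: int = 102) -> list[tuple[int, int]]:
    """Return 0-based start positions of GATC pairs separated by `spacing` bp.

    Spacing is measured start-to-start (an assumed convention).
    Because GATC is palindromic, scanning one strand finds sites on both.
    """
    seq = seq.upper()
    starts = [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]
    start_set = set(starts)
    return [(i, i + spacing) for i in starts if (i + spacing) in start_set]


if __name__ == "__main__":
    # Synthetic test sequence, not a real pilus regulatory region.
    synthetic = "A" * 10 + "GATC" + "A" * 98 + "GATC" + "A" * 10
    print(find_gatc_pairs(synthetic))
```

A match at the right spacing is only a candidate. The flanking conserved bases of the GATC box would still need to be checked before a region is compared with pap.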


FIG. 5. DNA sequence alignment of the GATC box regions from pilus operons under DNA methylation pattern control. DNA base pairs conserved in all pap family regulatory regions are shaded black with light lettering. The distal and proximal regulatory GATC sites (GATCdist and GATCprox, respectively) are shown. Arrows show the inverted orientation of the two GATC box regions. The accession numbers for the sequences shown are as follows: pap, X14471; foo, AF109675; sfa, S59541; afa, X76688; daa, M98766; clp, L48184; fae, X77671; pef, L08613.

 
The sfa, daa, prf (pap-related fimbria), and afa-3 operons appear to be regulated by DNA methylation patterns, analogous to regulation of pap. Each of these pilus operons codes for a PapI and a PapB homologue, and cross-complementation of the PapB and PapI homologues has been shown between prf and sfa (182) and between pap and both sfa and daa (267). The DaaF and SfaC proteins function similarly to PapI, positively regulating expression of daa and sfa, respectively, by facilitating binding of Lrp to promoter-distal binding sites overlapping GATCdist (267). Methylation of the pap-related GATC sites, in turn, controls binding of Lrp.

PapI homologue acting as a negative regulator of pilus expression. Two methylation-controlled pilus operons in E. coli, clp (CS31A) and fae (K88), and one pilus operon in Salmonella enterica serovar Typhimurium, pef, share common regulatory features with pap but have distinct differences as well. The regulatory regions of clp, fae, and pef contain conserved GATC box sites and spacing identical to that in pap (Fig. 5). Also similar to pap, binding of Lrp to regulatory DNA is controlled by DNA methylation and a PapI homologue. However, all three methylation-controlled operons are carried on plasmids, and in each case PapI homologues negatively control phase variation and transcription.

K88 pili, expressed by enterotoxigenic E. coli infecting pigs, are not under phase variation control, in contrast to the case for all other Pap family members (124). The fae regulatory region shares GATC box sequences with pap, spaced 102 bp apart, as well as a PapI homologue, FaeA, and a PapB homologue, FaeB (124). A third regulatory GATC site (GATC-III) is present 28 bp downstream (toward the faeB promoter) of GATCprox, and two IS1 sequences are present between faeB and faeA (Fig. 5). In contrast to the case for pap, FaeA and Lrp act to negatively control fae transcription. Data from Huisman et al. indicated that in the absence of FaeA, Lrp binds at sites overlapping GATCprox, protecting it from methylation by Dam (124, 125). However, in contrast to the case for pap, this Lrp binding has little effect on pilin transcription. In the presence of FaeA, the PapI homologue, additional binding of Lrp near GATC-III occurs, blocking methylation of both GATCprox and GATC-III and reducing fae transcription. This GATC-III site shares the "CGATCTTTTA" sequence of the pap and fae GATCdist sites, though in opposite orientation, possibly accounting for FaeA-mediated binding of Lrp to this region. However, FaeA-mediated binding of Lrp to GATCdist was not observed. In fact, mutation of the GATCdist site to a GTTC sequence was lethal due to overproduction of K88 pili, indicating that methylation of GATCdist normally blocks binding of FaeA-Lrp. Whether FaeA-Lrp binds to GATCdist under normal physiologic conditions is not clear, but it is possible that binding to a hemimethylated GATCdist site might occur immediately following DNA replication, stimulating K88 expression under certain conditions. Another difference between regulation of fae and pap is in control of faeA and of papI transcription. In the case of pap, papI is regulated by PapB via a positive feedback mechanism (116), whereas in fae, an IS1 insertion apparently disrupts this positive feedback. Instead, FaeA may bind to its own promoter, acting as a positive autoregulator (125).

Regulation of the clp operon, coding for CS31A pili, which are expressed by enterotoxigenic E. coli, shares common regulatory features with pap but, like for fae and pef, has distinct differences as well. In E. coli isolate CS31A harboring clp, CS31A pili are under phase variation control, yet the plasmid-carried clp operon does not have a papI homologue associated with it (62, 173). It seems likely that a pap operon identified on the chromosome of E. coli CS31A supplies PapI in trans, but this has not been confirmed. Analysis of clp regulation in E. coli K-12 (no papI homologue present) showed that Lrp and the PapB homologue ClpB repressed clp transcription. However, even in the presence of Lrp and ClpB, a moderate level of clp pilin transcription was observed. In addition, in lrp+ clpB+ cells lacking Dam, transcription was almost maximally derepressed. Introduction of the PapI homologue AfaF resulted in phase variation of CS31A expression: instead of a normally distributed transcription of CS31A among the cell population, individual cells either transcribed (ON phase) or did not transcribe (OFF phase) the clp operon, with the methylation patte