Department of Plant Biochemistry and Molecular and Cellular Biology, Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas, E-18008 Granada, Spain,¹ Laboratory of Applied Microbiology, Marine Biotechnology Institute, 3-75-1 Heita, Kamaishi, Iwate 026-0001, Japan,² Department of Biological Sciences, Imperial College London, Flowers Building, Armstrong Road, South Kensington, London, SW7 2AZ, United Kingdom,³ Department of Biochemistry & Molecular Biology, Oregon Health Sciences University, 3181 SW Sam Jackson Park Road, Portland, Oregon 97239-3098⁴
SUMMARY INTRODUCTION DEFINING THE TetR FAMILY TetR Family Profile Identification of TetR Family Members in DNA and Protein Databases PROTEINS WITH KNOWN THREE-DIMENSIONAL STRUCTURES TetR Regulator Tetracycline resistance and the role of the transcriptional regulator TetR. TetR DNA-binding domain: a symmetric TetR dimer binds a palindromic operator. QacR Regulator Two QacR dimers bind the operator to repress the qacA multidrug transporter gene. QacR as a model for multidrug recognition. Three-Dimensional Structure of CprB EthR Structure Crystal structure of TetR family members with unknown functions. DNA-BINDING PREDICTIONS BASED ON TetR AND QacR CRYSTAL STRUCTURES Relationship between Profile Positions and Structural Positioning SOME REGULATORS ARE PART OF COMPLEX REGULATORY CIRCUITS AcrR Regulator Is the Local Specific Regulator of the acrAB Efflux Pump Mtr Circuit of Neisseria BetI Controls the Choline-Glycine Betaine Pathway of E. coli ArpA Regulator from Streptomyces HapR Regulates Virulence Genes in Vibrio cholerae Other Quorum-Sensing Circuits BIOTECHNOLOGICAL APPLICATIONS AND FUTURE PROSPECTS ACKNOWLEDGMENTS REFERENCES
| SUMMARY |
|---|
α-, β-, and γ-proteobacteria, cyanobacteria, and archaea. The set of genes they regulate is known for 85 out of the 2,353 members of the family. These proteins are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The regulatory network in which the family member is involved can be simple, as in TetR (i.e., TetR bound to the target operator represses tetA transcription and is released in the presence of tetracycline), or more complex, involving a series of regulatory cascades in which either the expression of the TetR family member is modulated by another regulator or the TetR family member triggers a cell response to react to environmental insults. Based on what has been learned from the cocrystals of TetR and QacR with their target operators and from their three-dimensional structures in the absence and in the presence of ligands, and based on multialignment analyses of the conserved stretch of 47 amino acids in the 2,353 TetR family members, two groups of residues have been identified. One group includes highly conserved positions involved in the proper orientation of the helix-turn-helix motif and hence seems to play a structural role. The other group, of less conserved residues, is involved in establishing contacts with the phosphate backbone and target bases in the operator. Information related to the TetR family of regulators has been updated in a database that can be accessed at www.bactregulators.org.
| INTRODUCTION |
|---|
In most cases, the adaptive responses are mediated by transcriptional regulators. Most microbial regulators involved in transcriptional control are two-domain proteins with a signal-receiving domain and a DNA-binding domain which transduces the signal (1, 18, 145, 152, 170, 207, 271, 292-294, 298, 303, 345, 369, 428, 431) (Table 1). In other cases, the sensing of signals that trigger a transcriptional process involves two proteins, as in two-component regulatory systems such as CzcR/CzcS; DcuS/DcuR; NifL/NifA; NtrB/NtrC; PhoP/PhoQ; and TodS/TodT (75, 139, 200, 206, 233, 234, 257, 307, 309, 316, 409, 423). One protein is usually a membrane-linked kinase that, upon sensing the appropriate signal, phosphorylates a DNA-binding protein that mediates transcription from its cognate promoter. Structural analyses have revealed that the helix-turn-helix (HTH) signature is the most recurrent DNA-binding motif in prokaryotic transcriptional factors, since almost 95% of all transcriptional factors described in prokaryotes use the HTH motif to bind their target DNA sequences (12, 19, 27, 41, 43, 104, 135, 136, 302, 335, 343).
This review focuses on the TetR family, a family of transcriptional regulators with an HTH DNA-binding motif that is well represented and widely distributed among bacteria (210, 211, 246, 288).
Members of the TetR family of repressors are identified by a profile (see below) which can be easily used to recognize TetR family members in SWISS-PROT and TrEMBL and in all available proteins from prokaryotic genome sequences. After compiling data from protein and nucleic acid databases, the TetR family of regulators was found to include 2,353 nonredundant sequences (as of December 2004). The specific function regulated by members of the TetR family is known for only about 85 members (Table 2). These proteins control genes whose products are involved in multidrug resistance, enzymes implicated in different catabolic pathways, biosynthesis of antibiotics, osmotic stress, and pathogenicity of gram-negative and gram-positive bacteria (Table 2). The most relevant information on these proteins is collected in a database available at http://www.bactregulators.org (235). The database also supplies information for each member of the family, including identifiers, names, sequences, source, function, COG (clusters of orthologous groups), position and orientation of the corresponding gene in the genome, and, when available, three-dimensional structures.
| DEFINING THE TetR FAMILY |
|---|
To develop the TetR family profile, we first selected a set of 120 sequences as belonging to the TetR family based on two criteria: a positive score for PROSITE signature PS01081, and a high score for PF00440 HMM. The 120 sequences were clustered into 42 groups using BLAST, and a representative sequence was selected and aligned for each cluster using CLUSTAL (http://clustalw.genome.ad.jp/). This revealed that the most conserved region corresponded to the HTH domain described in the TetR and QacR crystals (120, 150, 287, 288, 289, 349, 350, 351). The initial HTH motif was progressively extended until the global score of the multialignment diminished. Figure 1 shows the final alignment of the sequences. This conserved stretch corresponded in TetR and QacR crystals to the almost complete α-helix 1, the HTH domain formed by α-helices 2 and 3, and five residues of α-helix 4 that connect the DNA-interacting region with the core of the protein (see Fig. 2 for the three-dimensional structure of TetR).
To verify the quality of this TetR profile for specificity (false positives) and sensitivity (false negatives), we implemented a new tool called Provalidator which uses Interpro, Swiss-Prot, Prodom, TIGRfam, CoGnitor, NCBI-RPS-BLAST, and PSI-BLAST resources (68, 128, 154, 323, 348, 387, 449). In the first step, we searched for false positives among the 2,353 proteins we assigned to the TetR family. Interpro assigned 2,315 proteins to the TetR family, and these 2,315 were considered true positives. The remaining 38 proteins were analyzed with other resources such as TIGRfam, Prodom, NCBI-RPS-BLAST and PSI-BLAST (128, 449). This allowed us to assign 34 of them to the TetR family. Three of the four false positives (Q89RN6, Q988I6, and Q6N8G8) were members of the AraC/XylS family of transcription activators (109, 394). These proteins have two HTH motifs at the C-terminal end, typical of AraC/XylS family members (109, 229). They were identified as potential TetR members because one of their HTH motifs is highly similar to the DNA-binding domain in TetR. The fourth false positive is a transposase (Q981E7).
Provalidator detected 15 false negatives (Q742Y2, Q8CJK3, Q73ZY1, Q6D1J7, Q8KU64, Q9A917, Q880T2, Q6D2Z4, Q885G7, Q8PC90, Q9A466, Q9S6C0, Q9ZH26, Q6A626, and Q8G822), which are proteins assigned to the TetR family by INTERPRO but whose Z-score was between 6.407 and 8.487. In summary, the TetR profile with a Z-score threshold of 8.5 identified proteins that were not detected by INTERPRO, and among the 660,992 proteins analyzed, only four false positives were found. These results indicate that the new algorithm is highly effective for the detection of members of the TetR family.
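The validation figures quoted above can be turned into the usual classifier metrics. The following is a minimal sketch in Python, not part of the original analysis or of Provalidator. It assumes that the 660,992 proteins analyzed form the whole screened set and that the 15 false negatives lie outside the 2,353 proteins assigned by the profile; the function names are illustrative.

```python
# Counts quoted in the text above.
ASSIGNED_BY_PROFILE = 2_353     # proteins the profile placed in the TetR family
CONFIRMED_BY_INTERPRO = 2_315   # considered true positives
CONFIRMED_BY_OTHER_TOOLS = 34   # confirmed with TIGRfam, Prodom, RPS-BLAST, PSI-BLAST
FALSE_NEGATIVES = 15            # Interpro members scoring below the 8.5 cutoff
PROTEINS_SCREENED = 660_992


def confusion_counts():
    """Return (tp, fp, fn, tn) derived from the published counts."""
    tp = CONFIRMED_BY_INTERPRO + CONFIRMED_BY_OTHER_TOOLS
    fp = ASSIGNED_BY_PROFILE - tp
    fn = FALSE_NEGATIVES
    # Assumption: every other screened protein is a true negative.
    tn = PROTEINS_SCREENED - tp - fp - fn
    return tp, fp, fn, tn


def precision(tp, fp):
    """Fraction of profile hits that are genuine family members."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Fraction of genuine family members recovered by the profile."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-members correctly rejected by the profile."""
    return tn / (tn + fp)


if __name__ == "__main__":
    tp, fp, fn, tn = confusion_counts()
    print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
    print(f"precision={precision(tp, fp):.4f}")
    print(f"sensitivity={sensitivity(tp, fn):.4f}")
    print(f"specificity={specificity(tn, fp):.6f}")
```

Under these assumptions the four false positives follow directly from 2,353 - (2,315 + 34), which agrees with the count given in the text.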
Table 3 shows that members of the TetR family were detected in 144 microbial genomes belonging to 80 genera and 113 species of gram-positive and α-, β-, and γ-proteobacteria, cyanobacteria, and archaea, indicating wide taxonomic distribution. We have found that proteins of the TetR family are encoded both in chromosomes and in plasmids, and the mobility of the latter elements could be a source of the spread of genes in this family via horizontal transfer (147, 383), as is also the case with catabolic genes (77, 160, 236, 410, 426), antimicrobial resistance determinants (20, 100, 124), and 16S rRNA genes (347).
As a general corollary, proteins of the TetR family appear to be involved in adaptation to complex and changing environments. This in turn correlates with the fact that many members of the TetR family are found among microbes with abundant extracytoplasmic function sigma factors (52, 227, 236, 277, 444).
| PROTEINS WITH KNOWN THREE-DIMENSIONAL STRUCTURES |
|---|
In addition, given that all members of the family whose function is known are repressors, they probably function in a similar way. Binding of an inducer molecule to the nonconserved domain of a TetR family member probably causes conformational changes in the conserved DNA-binding region that result in release of the repressor from the operator and thus allow transcription from the cognate promoter. To gain insights into the mechanisms of action of the TetR family members, we analyzed in detail the three-dimensional structure of the four members of the family, TetR, QacR, CprB, and EthR, whose crystal structures have been obtained (150, 264, 286-289, 349-351), in order to identify common and differential features of the TetR family members.
Adjacent to tetA and divergently oriented is tetR (112), whose gene product tightly controls expression of both tetA and tetR (148, 150). The intergenic region between the tetR and tetA genes contains two identical operators separated by 11 bp. TetR binds to these operators and thus prevents transcription from both promoters (Fig. 3) (288). In all TetR crystal structures elucidated to date (PDB identifiers: 2TCT; 2TRT; 1A6I; 1BJO; 1BJY; 1BJZ; 1ORK; and 1RP1), this repressor appears as a homodimer (29, 30, 159, 183, 287-289, 366). The TetR homodimer binds to the operator (Fig. 3). Each 15-bp operator shows an internal palindromic symmetry with an extra central base pair (Fig. 3A). The operator sequences overlap with promoters for tetA and tetR, thereby blocking the expression of both genes. When tetracycline complexed with Mg2+ binds to TetR (166, 384), a conformational change takes place that renders the TetR protein unable to bind DNA. As a consequence, TetR and TetA are expressed (286).
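The notion of a palindromic operator with an extra central base pair can be made concrete with a short check. The sketch below, in Python, tests whether the two half sites flanking the central base are reverse complements of each other. It is only an illustration: the function name and the 15-bp test sequences are hypothetical and are not the tet operator sequences from Fig. 3.

```python
from typing import List

# Watson-Crick complement table for upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def palindromic_mismatches(operator: str) -> List[int]:
    """Return the offsets from the central base at which symmetry is broken.

    The operator must have odd length: one central base flanked by two
    half sites. Offset k compares the base k positions to the left of the
    centre with the base k positions to the right; the pair is symmetric
    when one base is the complement of the other.
    """
    seq = operator.upper()
    if len(seq) % 2 == 0:
        raise ValueError("expected an odd-length operator with a central base")
    centre = len(seq) // 2
    mismatches = []
    for offset in range(1, centre + 1):
        left = seq[centre - offset]
        right = seq[centre + offset]
        if left.translate(COMPLEMENT) != right:
            mismatches.append(offset)
    return mismatches


if __name__ == "__main__":
    # Hypothetical 15-bp sequences: 7-bp half site, central base, 7-bp half site.
    perfect = "GATTACA" + "C" + "TGTAATC"
    imperfect = "GATTACA" + "C" + "TGTAATG"
    print(palindromic_mismatches(perfect))    # no broken positions
    print(palindromic_mismatches(imperfect))  # outermost pair is broken
```

An empty result means the half sites are perfectly symmetric, which is the arrangement that lets a symmetric homodimer place one HTH motif on each half site.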
α-helices with connecting turns and loops (Fig. 2). The three-dimensional structure of the TetR monomer is stabilized mainly by hydrophobic helix-to-helix contacts. The global structure of the TetR homodimer can be divided into two DNA-binding domains at the N-terminal end of each monomer, and a regulatory core domain involved in dimerization and ligand binding (150, 286-289). The DNA-binding domains are constituted by helices α1, α2, and α3 and their symmetric helices α1', α2', and α3' (a prime denotes the second monomer). Helices α4 and α4' connect these domains with the regulatory core domain composed of helices α5 to α10 and their symmetric counterparts α5' to α10' (150, 287, 289). The regulatory domain is responsible for dimerization and contains, for each monomer, a binding pocket that accommodates tetracycline in the presence of a divalent cation. Helices α5, α8, and α10 and their counterparts α5', α8', and α10' constitute the scaffold of the core domain, and their structure is the most conserved in both TetR conformations (150, 287-289).
The tetracycline-binding pocket is identical in both monomers. The cavity to which the [TcMg]+ complex binds is depicted in Fig. 4 (286, 287, 289). The entrance of this cavity is controlled by α9' and the C-terminal end of α8' and the loop that connects both, while the exit is closed by loop 4-5 (287-289). When [TcMg]+ enters the tunnel, its A ring makes contacts with loop 4-5, and the interaction with the effector triggers a cascade of conformational changes. The contacts that His100 and Thr103, both in α6, establish with the magnesium ion of the complex displace α6, which undergoes a conformational change in its C terminus to form a β-turn (Fig. 4). The 6-7 loop is also pushed near the inducer, so that Arg104 and Pro105 interact with tetracycline. Translation of α6 forces α4 to move in the same direction due to van der Waals contacts. His64 of α4, anchored to α5 and to tetracycline, acts as a pivot point, and α4 moves like a pendulum. As a consequence of the rotation of α4 and α4', recognition helices α3 and α3' move further apart, and the DNA contacts are disrupted (Fig. 5) (286, 287, 289). Tetracycline is prevented from leaving the binding cavity, and TetR cannot bind its target DNA again. It should be noted that residues outside the binding cavity can influence affinity for tetracycline, as revealed by Kamionka et al. (168), who isolated a double mutant (G96E, L205S) with reduced affinity for the antibiotic.
TetR DNA-binding domain: a symmetric TetR dimer binds a palindromic operator. Cocrystallization of TetR with its operator DNA established that the TetR homodimer binds perpendicularly to the longitudinal DNA axis (Fig. 3A). Two adjacent DNA major groove regions covering a 6-base-pair area on both strands are involved in the almost perfect docking with the two TetR-interacting domains (Fig. 3A and 3B) (288). No water molecules were found at the TetR-DNA interface, where the crucial interactions are hydrophobic (288).
The interactions of each HTH domain with the operator DNA are summarized in Fig. 3A and 3B. The TetR monomer A binds the main strand from positions −4 to −7 while contacting the complementary strand from operator positions +4 to +2, and the symmetric monomer A' binds the main strand from positions +2 to +4 and the complementary strand from positions −4 to −7 (Fig. 3A and 3B).
Crystallographic analysis revealed that helix α3 (from Gln38 to His44) is the main element responsible for sequence-specific recognition, since all residues in this helix contribute to it, except for Leu41, which is part of the hydrophobic core stabilizing the α1, α2, and α3 helix bundle. The Thr40 residue in monomer A establishes direct contacts with operator base pairs T(−7) and C(−6) in the main DNA strand (Fig. 3A and 3B). Trp43 interacts with T(−7) as well. Pro39 interacts with both strands at bases T(−5) and A(−4) of the main strand and T(+4) of the complementary strand. In the rest of the operator half site, the α3 helix of monomer A interacts with the complementary strand, with Tyr42 contacting T(+4) and Gln38 contacting A(+3). Helix α2 supplies an additional specific contact with the complementary strand, namely, Arg28 contacts G(+2).
Although the TetR DNA binding domain maintains its structure thanks to a hydrophobic core formed by residues from the α1, α2, and α3 bundle (288), interactions with DNA lead to changes in the TetR DNA binding domain. One such change is that α3 forms a 3₁₀-helical turn at the N-terminal end as a result of complex DNA contacts. The H-bonds between Arg28-G(+2), and Gln38-A(+3) increase the separation between base pairs 1 and 2 from 3.4 Å to 3.9 Å (288). The two phosphate groups accompanying the G at position +2 establish H-bonds with side chains of Thr26, Thr27, Tyr42, and Lys48, and with the amino groups of the main chain of Thr27 and Lys48 (Fig. 3B). These contacts draw DNA closer to TetR near G(+2). Although the DNA is kinked away from TetR at position +2 in both operator strands, bending toward TetR in the area corresponding to positions +3 to +6 compensates for the DNA deviation. Crystallographic studies revealed that Lys48 located in α4, outside the HTH motif, also established contacts with the target DNA region (Fig. 3B). This lysine is relatively well conserved among TetR family members, and we are tempted to suggest that this residue plays an equivalent role in other proteins of the TetR family.
The three-dimensional structure of QacR (PDB identifiers 1JTX, 1JT6, 1JTY, 1JUM, 1JUP, 1JUS, and 1JT0) revealed that it is an all-helical protein which contains a DNA-binding HTH motif embedded within an N-terminal three-helix bundle and a second domain involved in drug binding and dimerization (350, 351). It should be noted that unlike TetR, two QacR dimers, rather than one, bind the operator site (339, 340) (Fig. 6).
α3 helix of QacR A distal and B distal monomers establish the most extensive specific interactions with the operator (351). The Tyr41 residue of the A distal monomer (Fig. 6B) establishes hydrophobic contacts with base T(10) of the DNA main strand as well as with the phosphate at position 11 in the main strand, while Tyr40 contacts T(+7) (Fig. 6B). In addition, tight docking with DNA is facilitated by specific hydrogen bonds between Lys36 and base G(+6) in the complementary strand, and between Gly37 and base G(8) in the main strand. Gly37 is important in repression because nucleotide G(8) is the transcription start site for the qacA gene. Monomers A and B proximal also establish a series of critical interactions. For instance, Tyr41 of B proximal contacts the C(6) base in the main strand, whereas Tyr40 contacts base T(+3) and phosphate (+2) in the complementary strand (351). Gly37 in the A proximal monomer contacts G(4) in the complementary strand, whereas Lys36 contacts G(+1) in the main strand. A number of residues in α2, loop α2-α3, α3 and the positive dipole of the α1 (N terminus) also interact with the phosphate backbone of both DNA strands (351). Figure 6C shows how each dimer engages the DNA major groove in a face almost opposite to the other dimer, forming an angle between the two dimer axes of less than 180° (Fig. 7). Studies of QacR binding to DNA have indicated that the two dimers bind DNA cooperatively (120, 121, 351). Analysis of the three-dimensional structure suggested that such cooperativity does not arise from protein-protein interactions, as the closest approach of the dimers is 5.0 Å. Rather, binding cooperativity appears to be mediated through conversion of the DNA structure from a B-DNA conformation to the high-affinity undertwisted configuration observed in the crystal structure. Conversion of the DNA conformation is necessary because the optimal distance between each of the HTH motifs of the QacR dimer is 37 Å. This requires expansion of the 34-Å distance between successive major groove regions on one edge of the canonical B-DNA. It has been suggested that binding of the first QacR dimer forces this energetically unfavorable conformational change, which in turn produces an optimal DNA conformation for the easy binding of the second dimer (351). Experimental data reported by Grkovic et al. (121, 122) suggested that the two dimers must bind simultaneously and cooperatively to the operator in order to maintain the DNA deformation detected in the crystal.
Schumacher and Brennan (349) noticed that TetR and QacR achieve the same degree of specificity in DNA binding through different mechanisms. They noted that TetR recruits Arg28, located outside its recognition helix, to make a base pair-specific contact (288), whereas QacR does not employ residues outside α3 to ensure DNA binding specificity. They also noted that TetR kinks its binding site and induces a 17° bend towards the protein to optimize the position of its HTH motifs for specific base interactions within each DNA half site, whereas QacR widens the major groove of the entire IR1 binding site smoothly and bends its DNA site by only 3°. These distinctions are reflected in the different HTH center-to-center distances observed in QacR (37 Å). Thus, an important lesson derived from comparisons of the QacR-DNA and TetR-DNA structures is that even structurally homologous proteins of the same family that share a similar function, i.e., repression, can utilize slightly different mechanisms of action.
QacR as a model for multidrug recognition. QacR is released from the qacA operator by its interaction with a number of cationic lipophilic drugs such as rhodamine 6G, crystal violet, and ethidium (119). More recently, Grkovic et al. (122) showed that effector recognition of QacR can be extended to several bivalent cationic dyes and plant alkaloids. In spite of the existence of two binding pockets, only one drug molecule is bound by each homodimer, as determined by equilibrium dialysis studies and isothermal titration calorimetry for the QacR-R6G complex (350). The QacR crystal bound to different drugs revealed another remarkable finding: the presence of an expansive and multifaceted drug-binding pocket with a volume of 1,100 Å³, so that different drugs partially overlap different subpockets (349, 351). A similar cavity able to bind multiple drugs was reported by Yu et al. (445, 446) for the AcrB multidrug transporter.
Crystallographic studies by Schumacher et al. (350) and Murray et al. (261) have demonstrated that multidrug recognition mediated by the QacR dimer is a rather simple process that, contrary to expectations, does not require sophisticated molecular mechanisms. Indeed, the drug binding domain of QacR consists of six α-helices (PDB identifiers: 1JTX, 1JT6, 1JTY, 1JUP, 1JUS, 1JT0, 1RKW, and 1RPW). Entry to the mostly buried drug-binding pocket is through a small opening formed by the divergence of helices α6, α7, α8, and α8'. The stoichiometry of one drug molecule for two QacR subunits leads to an asymmetric induction process, in which the drug-bound monomer undergoes a major structural change. Comparison of the drug-bound structure with the DNA-bound structure reveals that drug binding triggers a coil-to-helix transition of residues 89 to 93, which extends helix α5 by a turn. This transition removes the drug surrogates Tyr92 and Tyr93 from the hydrophobic core of the protein. Expulsion of these tyrosines also leads to the relocation of nearby helix α6 and its tethered DNA-binding domain. The result of this structural transition is a 9-Å translation and a 37° rotation of the DNA-binding domain, effectively rendering the QacR dimer unable to bind its target DNA.
γ-butyrolactones as autoregulators or microbial hormones, together with their specific receptors (γ-butyrolactone receptors), to control morphological differentiation, antibiotic production, or both (150, 151). The most representative of the γ-butyrolactone autoregulatory factors is 2-isocapryloyl-3R-hydroxymethyl-γ-butyrolactone, known as A-factor, which is essential for aerial mycelium formation, streptomycin production, streptomycin resistance, and yellow pigment production (133, 134, 155) in Streptomyces griseus. However, the A-factor receptor protein, known as ArpA, has proved to be difficult to purify. In contrast, the CprB protein from Streptomyces coelicolor A3(2), which is 30% identical to ArpA (284), has been purified and crystallized (264), although the ligand for CprB is still unknown. Nonetheless, CprB binds the same nucleotide sequence as does ArpA (375) and indeed CprB also serves as a negative regulator for both secondary metabolism and morphogenesis in S. coelicolor, as ArpA does in S. griseus (264, 284).
The CprB dimer is omega shaped, and the two subunits in the dimer are related by a pseudo-twofold axis. Each monomer of CprB is composed of 10 α-helices and has two domains: a DNA-binding domain (residues 1 to 52) and a regulatory domain (residues 77 to 215). The three-dimensional structure of CprB is essentially similar to that of QacR bound to DNA except for the lack of α10 (350, 351). In addition, the DNA-binding domains of the two proteins are very similar, so much so that the two DNA-binding domains can be superimposed with an rms deviation of 1.48 Å for 71 Cα atoms (264). Although no information on CprB-operator DNA is available, the high degree of sequence conservation allowed the authors to predict that the core of the DNA-binding domain is composed of Ile14, Ile15, Ala18, Phe22, Leu32, Ile35, Leu46, and Phe50.
It has been suggested that a CprB dimer binds to its target DNA as found in the TetR-DNA complex (150, 287, 288). This is because structure-based amino acid sequence alignment shows that the DNA-binding domains of CprB and TetR share a high degree of identity at the amino acid sequence level. This suggests that there is an evolutionary relationship between the DNA-binding domains of the two proteins. The regulatory domain of CprB is composed of six α-helices (helices α5 to α10) (264), which can also be superimposed on the corresponding domain of TetR (286, 287, 289) (PDB code 1JT0).
The expression of ethA is regulated by EthR in M. tuberculosis. Overexpression of ethR leads to ethionamide resistance, whereas chromosomal inactivation of ethR promotes ethionamide hypersensitivity (28). EthR was found to bind directly and specifically to DNA sequences corresponding to the ethRA intergenic region (28, 90). The large EthR operator, which comprises 55 bp in comparison with the 15-bp operators recognized by most other family members, is organized as a putative highly degenerated palindrome containing pairs of overlapping inverted and tandem repeat sequences (90). In the absence of DNA, EthR forms a homodimer in solution, and surface plasmon resonance measurements suggest that EthR octamerizes when bound to DNA (90).
The EthR monomer is an all-helical, two-domain molecule (79). The N-terminal domain comprises helices 1 to 3, with helices 2 and 3 forming the HTH DNA-binding motif seen in other TetR family protein structures. The larger C-terminal domain, which in QacR and TetR has been dubbed the drug-binding domain, consists of helices 4 to 9, and its function in EthR is unknown. The crystal structure revealed that the dimerization interface, a conserved structural feature among the TetR class of repressors, is primarily formed by helices 8 and 9 (288, 351).
One of the most striking features of the EthR structure is a narrow tunnel-like cavity formed by helices 4, 5, 7, and 8 that opens to the bottom of the molecule (79). The tunnel measures about 20 Å in length and is lined predominantly, albeit not exclusively, by aromatic residues, with helices 5 and 7 constituting the majority of side chains. The loop connecting helices 4 and 5 restricts the opening of the hydrophobic tunnel, and the electron density in this loop is only poorly defined, indicating a certain degree of structural flexibility in the loop. This cavity may serve as the binding site for an as yet unknown ligand.
Crystal structure of TetR family members with unknown functions. New genomic/proteomic approaches are leading to the crystallization of a number of proteins, many of which have no assigned function. The following proteins of the TetR family have been crystallized: Cgl2612 of Corynebacterium glutamicum (PDB 1V7B); YbiH of Salmonella enterica serovar Typhimurium (PDB 1T33); YcdC of Escherichia coli (PDB 1PB6); and YfiR and YsiA from Bacillus subtilis (PDB 1RKT and 1VIO, respectively).
| DNA-BINDING PREDICTIONS BASED ON TetR AND QacR CRYSTAL STRUCTURES |
|---|
α-helices involved in contacts with DNA in the multialignment of the 2,353 members of the TetR family in this domain. Based on these findings, we hypothesized that residues at the same position in the multialignment of all family members may play equivalent roles. This prompted us to analyze each amino acid in the multialignment within the DNA binding domain.
Tyr42 in TetR and Tyr40 in QacR corresponded to position 37 in the profile sequence displayed in Fig. 1, where a Tyr residue appeared in 74.16% of the aligned proteins (Table 4). The next most highly represented residues in this position are also aromatic amino acids: phenylalanine (8%) and histidine (4%) (Table 4). Tyr42 in TetR and Tyr40 in QacR appear at the center of α-helix 3 and contact a thymine located at the center of the palindrome forming the operator and also contact a phosphate one position towards the center of the palindrome (Fig. 3B and 6B).
The residue at position 39 of the profile in the multialignment corresponds to His44 in TetR and His42 in QacR. In the corresponding cocrystals, these residues established contacts with the phosphate backbone (Fig. 3B and Fig. 6B). In the multiple sequence alignment of all family members, either histidine or tyrosine appears at position 39. We are tempted to propose that this residue is critical for interactions with the phosphate backbone.
A lysine-DNA phosphate interaction is shared at residues Lys48 in TetR and Lys46 in QacR, which correspond to position 43 in the multialignment and are located in the amino end of the α4 helix. A lysine residue is present at this position in 77% of TetR proteins, and its interactions with DNA phosphates seem to be crucial to adjust the HTH domain to contact DNA (Fig. 3B and 6B). At position 22 of the profile (Thr27 in TetR and Thr25 in QacR), five residues are the most abundant (Val, Leu, Met, Ile, and Thr). Thr27 in TetR and Thr25 in QacR are involved in interactions with the phosphate backbone.
Thus, in the TetR family, the contacts established by the residue aligned at position 37 in α3 (tyrosine present in 74% of the cases) and 39 in α3 (His or Tyr present in 98% of the cases) and a residue at position 43 in α4 (Lys present in 77% of the cases) probably orient the HTH motif to interact with the DNA major groove and anchor the protein to the phosphate backbone.
Glycine at position 16 in the multialignment, located at the end of α1, is highly conserved and is involved in changing the polypeptide direction in the TetR and QacR crystals to orient the HTH DNA binding domain properly.
Positions 33, 34, 35, and 38 of the profile align many different residues (Table 4). In TetR and QacR, the corresponding residues establish specific contacts with different DNA bases except Asn38 of QacR (position 35 in the multialignment), which contacts the phosphate backbone. Based on the high variability of these positions in the corresponding multiple alignment of the family, we are tempted to propose that these positions endow specificity to each protein so that it can recognize its operator through specific protein-DNA interactions.
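The distinction drawn above between conserved and variable profile positions rests on per-column residue frequencies in the multiple alignment. The following Python sketch shows one way such frequencies could be computed and used to label a position. It is not the authors' procedure: the four-column toy alignment, the function names, and the 0.7 cutoff are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, List


def column_frequencies(alignment: List[str], column: int) -> Dict[str, float]:
    """Return residue frequencies at one alignment column, ignoring gaps."""
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.most_common()}


def classify_position(freqs: Dict[str, float], cutoff: float = 0.7) -> str:
    """Label a column 'conserved' if its most common residue reaches the cutoff.

    The cutoff is a hypothetical choice, not a value taken from the text.
    """
    return "conserved" if max(freqs.values()) >= cutoff else "variable"


if __name__ == "__main__":
    # Hypothetical aligned fragments; each string is one protein.
    toy_alignment = ["GYHK", "GYYK", "GFHR", "GYHK", "GHYK"]
    for col in range(len(toy_alignment[0])):
        freqs = column_frequencies(toy_alignment, col)
        print(col, classify_position(freqs), freqs)
```

Applied to the real alignment, a column such as profile position 37 (Tyr in about 74% of the proteins) would fall on the conserved side of this illustrative cutoff, whereas positions such as 33 to 35 would be labeled variable.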
| SOME REGULATORS ARE PART OF COMPLEX REGULATORY CIRCUITS |
|---|
However, TetR family proteins also participate in other types of regulatory networks that underlie complex processes, such as homeostasis in metabolism (biosynthesis of amino acids, nucleotides, protoheme, and reserve material), synthesis of osmoprotectants, quorum sensing, drug resistance, virulence, and processes related to growth phase-dependent differentiation (sporulation and biosynthesis of antibiotics) (Table 2) (www.bactregulators.org) (235).
Figure 8 shows a series of schemes in which a TetR family member plays a role in complex circuits. Below, for the sake of brevity, we analyze only some representative sets of regulatory networks, including proteins involved in drug resistance (AcrR of E. coli and MtrR of Neisseria gonorrhoeae), biosynthesis of an osmoprotectant (BetI), a key protein involved in idiophase antibiotic production and differentiation in Streptomyces (ArpA), a protein involved in pathogenesis in Vibrio (HapR), and some proteins involved in quorum sensing.
AcrB is a large cytoplasmic membrane protein (224, 226, 445, 446) which associates with AcrA, a membrane fusion protein (281), and TolC, a protein that forms a channel for the extrusion of substrates into the medium (102, 193). The acrA and acrB genes form an operon (224) whose transcription is regulated by the acrR gene product. The acrR gene is divergently transcribed from the acrAB operon. Overexpression of AcrR represses the transcription of acrAB, which is consistent with AcrR functioning as a repressor of acrAB transcription. Gel mobility shift assays provided direct evidence that AcrR binds to the promoter region of acrAB. DNA sequencing (92) of certain isolates that overexpressed acrB mRNA revealed that the mutant strains had insertions that disrupted the acrR gene or point mutations that rendered the regulator nonfunctional, i.e., an amino acid substitution of cysteine for arginine at position 45 of AcrR. Together, this biochemical and genetic evidence supports the regulatory role of AcrR.
MarA, SoxS, and Rob are related transcriptional activators of the AraC/XylS family (7, 112, 367) that activate acrAB expression, although they are not involved in the regulation of acrAB in response to general stress conditions (13, 14, 21, 35, 110, 224) because the acrAB operon can be activated in response to these stresses in genetic backgrounds lacking mar and sox (223-225). It was also found that general stress conditions increased the transcription of acrAB in the absence of functional AcrR, and these conditions, surprisingly, increased the transcription of acrR to a greater extent than that of acrAB. These results suggest the existence of a mar-sox-independent pathway to control acrAB expression in response to the general stress conditions. This transcriptional control of acrAB is also AcrR independent. Therefore, a major role of AcrR is to function as a specific secondary modulator to fine-tune the level of acrAB transcription and prevent unwanted overexpression of the efflux pump. This represents a novel mechanism for regulating gene expression in E. coli.
The mtrR gene is divergently transcribed with respect to the mtrCDE operon (Fig. 8F). The promoters of mtrR and mtrC overlap in their −35 boxes, and footprinting analysis showed that MtrR binds a 40-bp region within the −10 to −35 region of the mtrR promoter, which contains an inverted repeat (221). MtrR bound to its target site prevented expression from the efflux pump operon and its regulator (Fig. 8F). The expression of mtr genes is enhanced by the AraC/XylS member MtrA, although the mechanism of activation of this protein is unknown.
On the other hand, Veal and Shafer (407) have recently identified a gene that was designated mtrF, located downstream of the mtrR gene, that is predicted to encode a 56.1-kDa cytoplasmic membrane protein containing 12 transmembrane domains. Expression of mtrF was enhanced in a strain deficient in MtrR production, indicating that this gene, together with the closely linked mtrCDE operon, is subject to MtrR-dependent transcriptional control. Genetic evidence suggests that MtrF is also important in the expression of high-level detergent resistance by gonococci, and it was proposed that MtrF acts in conjunction with the MtrC-MtrD-MtrE efflux pump to confer high-level resistance to certain hydrophobic agents in gonococci. MtrR also controls the farAB operon, which encodes an efflux pump involved in resistance to long-chain fatty acids (Fig. 8F). The efflux pump FarAB uses MtrE as the outer membrane component (208).
The osmoregulatory choline-glycine betaine pathway is encoded by the bet genes. The betA gene encodes choline dehydrogenase; betB encodes betaine aldehyde dehydrogenase; betT encodes a transport system for choline; and betI encodes a 21.8-kDa repressor protein involved in choline regulation of the bet genes (Fig. 8C). The bet genes are linked, with betT being transcribed divergently from the betIBA operon (203, 374). Primer extension analysis identified two partially overlapping promoters which are responsible for the divergent expression of the betT gene and the betIBA operon. The transcripts are initiated 61 bp apart and are induced by osmotic stress, but full expression requires choline in the growth medium (91, 202, 204). Because the ArcA protein represses the expression of bet genes in E. coli under anaerobiosis, the bet genes are expressed only under aerobic conditions. An arcA mutation caused complete derepression of the bet genes. A similar pattern of derepression has been reported previously for other genes (sodA and arcA) which are directly regulated by ArcA (65).
Results from different laboratories suggest that choline regulation, but not osmotic regulation, of the bet promoters depends on BetI, a TetR family member. This was indicated by the requirement for choline, in addition to osmotic stress, for betT to be expressed in a mutant strain in which betI was supplied in trans. Furthermore, this choline effect was not seen in cells lacking betI. These findings indicate that betI encodes a repressor that reduces the expression of betT (331).
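The inputs that act on the bet promoters (osmotic stress, choline, oxygen availability via ArcA, and BetI) can be summarized as a small qualitative decision model. This is only an illustrative sketch: the function `bet_expression`, its parameters, and the four labels ("repressed", "basal", "partial", "full") are hypothetical names chosen here, and the model assumes that, in cells lacking BetI, expression under osmotic stress is full regardless of choline (the text states only that the choline effect is not seen in such cells).

```python
def bet_expression(osmotic_stress, choline, aerobic,
                   beti_present=True, arca_present=True):
    """Qualitative expression level of the bet promoters in a toy Boolean model."""
    if not aerobic and arca_present:
        # ArcA represses the bet genes under anaerobiosis
        return "repressed"
    if not osmotic_stress:
        # transcripts are induced by osmotic stress
        return "basal"
    if beti_present and not choline:
        # BetI limits expression until choline is supplied (assumed level label)
        return "partial"
    # osmotic stress plus choline, or no BetI repressor (assumption, see lead-in)
    return "full"


# Wild type, aerobic, osmotic stress with and without choline
print(bet_expression(osmotic_stress=True, choline=True, aerobic=True))
print(bet_expression(osmotic_stress=True, choline=False, aerobic=True))
# arcA mutant under anaerobiosis: ArcA-mediated repression is lost
print(bet_expression(osmotic_stress=True, choline=True, aerobic=False, arca_present=False))
```

The ordering of the checks encodes the hierarchy described above: ArcA-mediated repression under anaerobiosis overrides the other signals, osmotic stress is required for induction, and BetI accounts only for the choline-dependent component.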
A chimeric BetI glutathione S-transferase fusion protein (BetI*) was purified, and gel mobility shift assays showed that BetI* formed a complex with a 41-bp DNA fragment carrying the intergenic betI promoter region. Footprinting revealed the presence of two sequences of dyad symmetry which probably constitute the BetI operator.
The Sinorhizobium meliloti bet genes have been cloned, and their involvement in response to osmotic stress has been analyzed (304, 305, 368).
The ultimate regulator of the mentioned processes in Streptomyces griseus is a homodimeric protein called A-factor receptor protein (ArpA) (155-158), which regulates the switch for physiological and morphological differentiation. The main biologically significant target of ArpA is the adpA gene. The AdpA protein in turn controls the expression of other genes. These genes include strR, which serves as a pathway-specific transcriptional activator for streptomycin biosynthetic genes (278); an open reading frame encoding a probable pathway-specific regulator for a polyketide compound (441); adsA, which encodes an extracytoplasmic function sigma factor of RNA polymerase essential for aerial mycelium formation (438); sgmA, which encodes a metalloendopeptidase probably involved in apoptosis of substrate hyphae during aerial mycelium development (174); ssgA, which encodes a small acidic protein essential for spore septum formation (437); amfR, essential for aerial hyphae formation (398, 440); and the sprT and sprU genes, which encode trypsin-like proteases (173).
In vitro, ArpA binds its target DNA site at the −10/−35 region, which is a 22-bp palindrome (5'-GG(T/C)CGGT(A/T)(T/C)G(T/G)-3'). Addition of the γ-butyrolactone effector to the ArpA-DNA complex immediately releases ArpA from the DNA. A mutant strain deficient in ArpA or producing a mutant ArpA protein unable to bind to its target DNA overproduces streptomycin and forms aerial mycelia and spores earlier than the wild-type strain (282, 283). Replacement of Val-41 by Ala in the HTH motif in the N-terminal portion of ArpA abolished DNA-binding activity but not γ-butyrolactone-binding activity, suggesting the involvement of this HTH in DNA binding. On the other hand, mutation of Trp-119 (Trp119→Ala) generated a mutant protein unable to bind the γ-butyrolactone and therefore unable to sense the presence of A-factor. These data suggest that ArpA consists of an HTH DNA-binding domain at the N-terminal end and an effector-binding domain at the C-terminal end of the protein.
In the streptomycin biosynthetic gene cluster in S. griseus, StrR is the pathway-specific regulator that serves as a transcriptional activator for the other genes in the cluster (320). Expression of the strR gene was controlled by the AdpA protein, which binds the region upstream of the strR promoter and activates its transcription (311, 312). adpA knockout mutants produced no streptomycin, and overexpression of adpA caused the wild-type S. griseus strain to produce streptomycin at an earlier growth stage in a larger amount. This set of events explains how A-factor triggers streptomycin biosynthesis.
Disruption of the chromosomal adsA gene encoding σAdsA resulted in loss of aerial hypha formation but not streptomycin production, indicating that this sigma factor is involved in morphological development (401, 438).
Several receptor proteins for γ-butyrolactone-type autoregulators have been described in other species of Streptomyces. For example, CprA and CprB are involved in secondary metabolism and aerial mycelium formation in S. coelicolor A3(2) (284). The virginiae butanolide receptor BarA is involved in virginiamycin biosynthesis in Streptomyces virginiae (280), and the IM-2 receptor, FarA, is involved in blue pigment production in another Streptomyces strain (413).
FarA in Streptomyces lavendulae senses the concentration of the γ-butyrolactone IM-2 and also transduces this signal, thus derepressing antibiotic and blue pigment biosynthesis. In addition, FarA also seems to be necessary for IM-2 biosynthesis (Fig. 8J).
The role of the ScbR protein in the quorum-sensing circuit of Streptomyces coelicolor is similar to that of FarA in S. lavendulae. ScbR is also involved in three functions: positively regulating the synthesis of SCB1 (the γ-butyrolactone that acts as the signal), receiving the signal, and transducing this signal, thereby derepressing the production of the antibiotics actinorhodin and U-prodigiosin (Fig. 8M).
HapR, in turn, is regulated by quorum-sensing signals that are sensed and transmitted by LuxO. The quorum-sensing apparatus in Vibrio cholerae is unusually complex and is composed of three parallel signaling systems (247). In contrast to other bacteria, in which high cell density triggers virulence gene expression, in Vibrio cholerae low cell density is the condition that activates the production of the pathogenic factors CT and TCP (455). At high cell density HapR positively regulates the expression of a hemagglutinin protease (Hap) that promotes detachment of Vibrio cholerae from the gastrointestinal epithelium (365) and exerts a negative effect on biofilm formation. Taking into account the regulatory functions of HapR and considering that some pathogenic biotypes lose HapR expression whereas others lose the aphA binding site, it appears that HapR expression is related to diminished toxicity and colonization capacity. These features offer potentially fruitful avenues of research to design drugs to modulate Vibrio cholerae pathogenicity.