Microbiology and Molecular Biology Reviews, September 2003, p. 303-342, Vol. 67, No. 3
1092-2172/03/$08.00+0 DOI: 10.1128/MMBR.67.3.303-342.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida 32611,¹ BioScience Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544,² Department of Chemistry, City College of New York, New York, New York 10031³
SUMMARY INTRODUCTION Biochemical Pathway of Tryptophan Biosynthesis Nomenclature. Seven catalytic domains and two α/β-subunit complexes. Relatives of Trp pathway catalytic domains. Identical Trp pathways exist within varied metabolic contexts. Operon Stability trp Operon and Its Regulation Known regulatory mechanisms. Unknown regulatory systems awaiting discovery? Feasibility for Deduction of Evolutionary Histories GENOMIC DISTRIBUTION OF THE TRYPTOPHAN PATHWAY Mapping of trp Gene Patterns to the 16S rRNA Tree Trp Biosynthesis in Its Larger Context of Aromatic Biosynthesis Implications of Missing Genes Unidentified analogue genes. Alternative metabolic relationships. Reductive evolution. Search for an Elusive trpC Gene in Actinomycete Bacteria One actinomycete exception is explained by LGT. Post-LGT events of vertical descent can be tracked in C. diphtheriae. Pattern and profile search. Evaluation of an unknown gene inserted in the trp operon. Possible catalysis of the TrpC reaction by HisA. Evolution of competence for TrpC catalysis by TrpD. Other possibilities. 
GENE FUSIONS Phylogenetic Distribution of trp Gene Fusions Nested Gene Fusions Trp PATHWAY GENE ORGANIZATION IN THE ARCHAEA Trp PATHWAY GENE ORGANIZATION IN THE BACTERIA Whole-Pathway trp Operons Dispersal of trp Operon Genes Gene Scrambling RETENTION OF THE ANCESTRAL OPERON AT SPACED PHYLOGENETIC NODES IN BACTERIA TWO MAJOR EVENTS UNDERLIE THE DYNAMICS OF trp OPERON CHANGE IN BACTERIA Operon Scission Yields Two Half-Pathway Operons Fusion of trpD with trpC Restores a Whole-Pathway Operon LATERAL GENE TRANSFER OF trp OPERONS Lateral Gene Transfer of Whole-Pathway Operons Lateral Gene Transfer of Partial-Pathway trp Operons FINE-TUNED EVOLUTIONARY DEDUCTIONS Single Change in a Common Ancestor versus Multiple Independent Changes in Descendants Distinguishing Derived States from Ancestral States Deducing Ancestral Character States at Phylogenetic Node Positions Value of Flanking-Gene Context EXPANDED METABOLIC CONTEXT Pyrococcus and Its Archaeal Relatives Convergent trp and giant aro operons of Pyrococcus. Dynamics of archaeal gene shuffling. Bacillus/Staphylococcus Clade B. subtilis subgroup. Listeria subgroup. Interconnectivity of the trp, aro, pab, and his operons. Evolutionary information derived from flanking-gene context. Deducing the likely common ancestor of the clade. OVERVIEW PERSPECTIVES Lineage-Specific Evolutionary Trends Individual Divergences Unmasked in the Larger Genomic Context Analysis of the Ancestral State at Phylogenetic Nodes Intellectual Dilemma Addressed Does trp gene reorganization necessarily imply functional deterioration? Are there any clear examples of efficient operon systems that have been disrupted? Elaborate regulation seems to be fairly recent. Regulation extending beyond the Trp pathway. Does Regulation Power Evolutionary Dynamics? 
FUTURE PROSPECTS FOR ELEVATED KNOWLEDGE OF Trp PATHWAY EVOLUTION APPENDIX Analysis of Raw DNA Sequence Data 16S rRNA Tree Construction DNA Composition Fusion Protein and Linker Region Analyses ACKNOWLEDGMENTS REFERENCES
An ideal operon system for this analysis is the trp operon. We show that the trp operon must have been present in early prokaryote ancestors. In Bacteria but not in Archaea, sufficient genome representation exists to deduce an ancestral whole-pathway trp operon. The regulation of this operon may initially have been quite minimal since the first evolutionary step(s) probably would be to collect the structural genes together. Parsimony principles support a hypothesis developed in this paper of two major evolutionary events in Bacteria, one splitting the ancestral operon in two and the other rejoining it by gene fusion. We assert that a detailed analysis can recognize occasional events of lateral gene transfer (LGT) or paralogy. Both are likely to be associated with Trp pathway genes engaged in specialized metabolic pathways other than primary amino acid biosynthesis. We show that when two sister lineages differ in particular trp operon characteristics, it is possible to deduce which is the derived change and which reflects the state of the ancestral node.
Recently, Gogarten et al. (28) endorsed a "synthesis" that will acknowledge both the traditional tree-like behavior (vertical descent of genes) and web-like, reticulate behavior (horizontal gene transfer) of the evolutionary process. They leave it open whether or not "vertical descent remains the best descriptor of the history of most genes over evolutionary time." Our overall analysis yields a very optimistic viewpoint that the evolution of the trp operon can be deduced as a vertical genealogy, with events of LGT and paralogy enriching the analysis as interesting features rather than undermining or obliterating the vertical trace of evolutionary history.
TrpAa and TrpEa are α-subunits for anthranilate synthase and tryptophan synthase, respectively; TrpAb and TrpEb are β-subunits for anthranilate synthase and tryptophan synthase, respectively. Capital letters are assigned according to the order of the enzyme reactions (or overall reactions, in the case of the two complexes). C. Yanofsky has expressed to us his preference (probably shared by most experimentalists working specifically with trp systems) for adherence to previous nomenclature schemes to minimize disruption of what is most familiar in the existing literature. Admittedly, the designations generally in use for the Trp branch do not generate as many problems of annotation errors as is the case for the rest of the aromatic pathway, but for consistency with our overall work with the aromatic pathway, we use the new naming system in this paper. Both sets of designations are shown in Table 1.
FIG. 1. Biochemical pathway of tryptophan biosynthesis. The nomenclature used in this paper for the seven catalytic domains is in boxes. See Table 1 for the alternative designations used in the literature. Anthranilate synthase catalyzes the overall reaction from chorismate to anthranilate via the half-reactions shown, whereby 2-amino-2-deoxyisochorismate (ADIC) is an enzyme-bound intermediate (62). The TrpAa/TrpAb complex functions as an amidotransferase, utilizing glutamine as the source of the o-amino group of anthranilate. TrpAa can catalyze the overall reaction alone in the presence of NH3 (thereby functioning as an aminase). TrpAb alone in some cases may be able to function as a glutaminase. As shown by McDonald et al. (59), Pseudomonas and Streptomyces species form ADIC as the product of a reaction catalyzed by PhzE. PhzE has fused domains that are homologues of TrpAa and TrpAb, which we have denoted TrpAaTrpAb_phz (93) (see Table 1). In these organisms, ADIC can be considered a branch point that proceeds to Trp on the one hand and to phenazine pigments on the other hand. Tryptophan synthase catalyzes a second overall reaction, converting indoleglycerol phosphate to Trp in a reaction path where indole is always an intermediate. The alpha (TrpEa) and beta (TrpEb) subunits catalyze the reactions shown in which the indole intermediate is processed through a tunnel (85). PR, phosphoribosyl; IGP, indoleglycerol phosphate; G3P, glyceraldehyde 3-phosphate.
TABLE 1. Key to nomenclature conversions
Seven catalytic domains and two α/β-subunit complexes.
Trp is one of the essential amino acids required by mammals. Trp is generally synthesized by free-living prokaryotes, lower eukaryotes, and higher plants. The Trp pathway is one of three amino acid branches diverging from a common flow route that produces chorismate. The apparently universal pathway for Trp biosynthesis that initiates with chorismate and L-glutamine is shown in Fig. 1. Seven catalytic domains are deployed to carry out the reactions shown. In a given organism these may be individually expressed, but a wide variety of gene fusions that encode single proteins carrying two or more catalytic domains are known. TrpAa can function as an ammonia-utilizing aminase in the anthranilate synthase reaction. Although the aminase reaction can proceed with ammonia at unphysiologically high pH values, such reactions typically rely upon a glutamine-utilizing glutaminase subunit to deliver the ammonia at the active site (probably within a "tunnel"). Accordingly, TrpAb is a glutaminase homologue that forms a complex with TrpAa, thereby conferring an amidotransferase component to the overall anthranilate synthase reaction in the presence of glutamine. In either case, whether or not the overall anthranilate synthase reaction is carried out in the presence of TrpAb, 2-amino-2-deoxyisochorismate (ADIC) is an enzyme-bound intermediate. Interestingly, some species of Pseudomonas and Streptomyces produce an enzyme called PhzE (59), which carries out the ADIC synthase reaction but not the ADIC lyase reaction (see Fig. 1). ADIC is then converted ultimately to phenazine pigments. PhzE is a fusion of domains homologous to TrpAa and TrpAb (hence our designation TrpAaTrpAb_phz in Table 1).
TrpAa belongs to a protein superfamily that includes other chorismate-utilizing enzymes: PabAa converts chorismate to 4-amino-4-deoxychorismate (precursor of 4-aminobenzoate), and MenF and EntC are different homologue subgroups that convert chorismate to isochorismate (as precursors of ubiquinones and an iron siderophore, respectively).
Tryptophan synthase also exists as a complex of nonidentical subunits and is one of the best-understood examples of allosteric interaction exerted between subunits (97). Why indole should be sequestered to a tunnel in the α/β complex of tryptophan synthase is not known, but indole is volatile and rather toxic. Yanofsky has speculated that recent findings of a role for indole in quorum sensing and biofilm formation might suggest that indole either produced by tryptophanase or otherwise available in the environment may serve as a metabolite cue that might otherwise be disrupted if biosynthetic indole were not enzyme-bound (see reference 96 and references therein). It has been speculated (92) that some Archaea may not form a tryptophan synthase complex.
Relatives of Trp pathway catalytic domains. The pathway of Trp biosynthesis is the first amino acid pathway for which the atomic structure of every catalytic domain has been determined (58), a circumstance of significance because evolutionary analysis can be greatly enhanced through insight gained at the structural level of protein folding. Consultation of the reference by Yanofsky et al. (97) is highly recommended for a definitive presentation of the detailed literature up to about 2000. Each catalytic domain belongs to a protein superfamily at the structural level of protein folding. Many of the catalytic domains exhibit clear homology on the criterion of amino acid identity with proteins that have different substrate specificities and which participate in different pathways. From an evolutionary perspective, this is of interest with respect to such questions as the extent to which the Trp pathway enzymes have been assembled (via gene duplication and substrate alteration) by recruitment of homologues from other pathways or the extent to which the Trp pathway has been the source of genes recruited for function in other pathways or a homologous gene with a recent history of function in another pathway has "crossed over" to replace a Trp pathway gene (or vice versa). This aspect is not addressed further in this article except indirectly (e.g., see the later section on the search for an elusive trpC gene).
Identical Trp pathways exist within varied metabolic contexts. The Trp pathway is generally defined as an unbranched pathway that begins with chorismate and produces Trp as a substrate for general protein synthesis. The Trp pathway appears to have evolved only once. These aspects of universality are favorable for the task of deducing the evolutionary history. However, many aspects of biochemical individuality are not usually considered. In some cases, Trp biosynthesis does not compete with Phe and/or Tyr biosynthesis because one or both of these are absent. In other cases, as exemplified by the use of ADIC for phenazine biosynthesis in Pseudomonas and Streptomyces species, chorismate is no longer the last branch point, and if one starts with chorismate as a reference point, then the pathway is branched. The pathway does not necessarily end exclusively with the Trp end product supplying protein synthesis, e.g., in cases where Trp may be a component of an antibiotic (as in Streptomyces), or where it is converted to indoleacetic acid in plant symbionts such as Azospirillum. Eukaryotes (but no prokaryotes so far) deploy Trp as a precursor of niacin. In such cases, the pathway can be considered divergently branched at the end, with Trp being guided to different molecular fates.
Trp is the most biochemically expensive of the amino acid pathways, requiring the input of erythrose-4-phosphate, ATP, phosphoribosyl pyrophosphate (PRPP), two phosphoenolpyruvate molecules, L-glutamine, and L-serine. Thus, efficient regulation is generally expected, but these rules no longer apply in an endosymbiont such as Buchnera, which has abandoned Trp regulation. In this case, loss of regulation can be viewed as a positive selective step in order to satisfy the needs of its aphid host. In addition, some prokaryotes sustain different physiological or developmental states where the demands impacting the Trp pathway may be more complicated than just sensing the availability of Trp for protein synthesis. These often involve specialized pathways that coexist with primary Trp biosynthesis. These specialized pathways are encoded in part or entirely by divergent trp gene duplicates whose expression is triggered by a variety of temporal and environmental cues, e.g., to make a given pigment or antibiotic derived in part from the Trp pathway.
These are all interesting but complicating elements that we have tried to keep in mind. This is relevant to the task of sorting out and recognizing paralogues (or xenologues) that may be engaged in specialist pathways other than primary Trp biosynthesis. Appreciation of such complexity may also prove relevant to understanding the nature of split-pathway trp operons in many prokaryotes.
The Itoh et al. study (37) was a broad-scope analysis of many operons that was necessarily limited with respect to in-depth consideration of any individual operon system. It should be noted that for these kinds of studies, operons have been considered simply as a collection of structural genes that are linked. The presence or absence of linked or unlinked regulatory elements has not usually been evaluated, undoubtedly because this is not easily done. In this paper we pursue in great detail the evolution of a single well-known operon system in the large number of prokaryote genomes now available. We found strong support for the hypothesis that the trp operon, minimally defined as the linked assemblage of structural genes for tryptophan (Trp) biosynthesis, is of ancient origin and has indeed followed a dynamic time course of change that includes several identifiable milestone events in Bacteria. Our study leads to the further hypothesis that the instability of early trp operons (and perhaps some modern ones) can be attributed to weak positive selection conferred by relatively undeveloped control mechanisms.
We suggest that since the time that operons evolved a variety of control mechanisms, the characterization of operons as dynamic (rather than unstable) yields better semantics to describe a positive ongoing process of fine-tuning. In modern free-living organisms, the variety of recently evolved trp operon systems which differ from one another and are endowed with intricate control features mediated by one or more unlinked regulatory genes may in fact be highly stable in the contemporary time frame. One caveat, however, is that this frequently will not apply to pathogenic or endosymbiotic relatives, where the rules dictating selective advantage have completely changed.
Known regulatory mechanisms. At the bioinformatic level, the analysis of trp operons in the literature has been largely restricted to the structural genes. Consideration of regulatory features has been understandably limited, mainly because relatively little comparative information is available at the experimental level and also because analysis of alternative stem-loop structures, etc., is not a trivial task. Escherichia coli, Bacillus subtilis, Pseudomonas aeruginosa, and Lactococcus lactis represent clades for which detailed control mechanisms have been described, each of them entirely different. Importantly, each mechanism seems to be narrowly distributed, and therefore we infer that they are of recent origin. Note that in each case, unlinked genes exist that markedly decrease the probability that the total regulated operon system could be transferred by LGT in one event.
Regulation of Trp biosynthesis in E. coli, the most widely known system, is quite sophisticated (23, 94), being subject to the following multiple levels of control: (i) repression control via the Trp repressor (encoded by the unlinked trpR) which binds Trp as a corepressor moiety, (ii) an attenuation mechanism mediated by a Trp-rich leader peptide (encoded by trpL), and (iii) allosteric feedback inhibition of anthranilate synthase by Trp (95). The E. coli mechanisms of overall trp operon regulation are generally shared by the enteric lineage of Bacteria, defined by us as the clade that includes Shewanella putrefaciens as the outlying point of divergence from E. coli.
Bacillus subtilis has a different system of trp operon regulation (72, 80, 95, 96), whereby genes unlinked to the trp operon encode (i) a trp RNA-binding attenuation protein (TRAP) encoded by mtrB as well as (ii) an anti-TRAP gene product encoded by rtpA (80). Trp both feedback inhibits anthranilate synthase and activates TRAP for attenuator function, whereas uncharged tRNATrp induces synthesis of anti-TRAP. TRAP can also block translation of the trp operon through interference with the ribosome-binding site. The clade sharing the TRAP system of regulation includes Bacillus halodurans, Bacillus stearothermophilus, and Oceanobacillus iheyensis in addition to Bacillus subtilis. At this time it is not clear whether the anti-TRAP component is present throughout this clade.
A third finely tuned system of regulation has been documented in Lactococcus lactis (69). In this case uncharged tRNA can bind directly to the leader transcript, stabilizing an antiterminator configuration that promotes expression of the operonic genes. In Lactococcus lactis, unlinked, unknown genes involved in trp operon transcript processing and in transcription initiation have been suggested (69). The presence or absence of the Lactococcus lactis mode of trp operon regulation in close relatives, such as species of Streptococcus, has apparently not yet been investigated.
In Pseudomonas aeruginosa, the fourth well-documented system, the Trp pathway is represented by four operon entities: a free-standing trpAa, the trpAbBD operon, a free-standing trpC, and the trpEbtrpEa operon. The trpAa and trpAbBD operons are regulated by attenuation mechanisms employing leader peptides (67), whereas the trpEbtrpEa operon is controlled by an indoleglycerol phosphate-activated regulatory protein encoded by trpI (6). trpC is not known to be regulated in any way. The P. aeruginosa system is complicated by the presence of paralogues of trpAa and trpAb. These include genes of unknown physiological function (also known as phnA and phnB) expressed in stationary phase (57) as well as two copies of PhzE (trpAatrpAb_phz), a gene that encodes ADIC synthase (Fig. 1), the initial reaction committed to phenazine biosynthesis. It is not entirely clear what physiological conditions exist in P. aeruginosa (and close relatives) that have resulted in its unusual use of indoleglycerol phosphate as a regulatory cue for the selective regulation of the trpEbtrpEa operon, but it is certainly evident that much has been committed to the overall regulation in this system. Close genomic neighbors of P. aeruginosa that possess identical split-pathway trp operons and trpI include Pseudomonas fluorescens, Pseudomonas syringae, and Azotobacter vinelandii.
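The bookkeeping in this paragraph (four operon entities that together must account for all seven catalytic domains) can be checked mechanically. The following is a minimal sketch only: the compact operon strings, the function names, and the tokenizing rule are assumptions introduced for illustration and are not part of the analysis reported in this paper.

```python
import re

# The seven Trp catalytic domains in the nomenclature of this paper,
# written without the "trp" prefix.
TRP_DOMAINS = {"Aa", "Ab", "B", "C", "D", "Ea", "Eb"}

# Hypothetical compact encoding of the four P. aeruginosa operon entities
# named in the text: trpAa, trpAbBD, trpC, and trpEbtrpEa.
P_AERUGINOSA_OPERONS = ["Aa", "AbBD", "C", "EbEa"]

# Two-letter domains (Aa, Ab, Ea, Eb) are tried before single letters.
_TOKEN = re.compile(r"A[ab]|E[ab]|[BCD]")


def split_domains(operon):
    """Split a compact operon string such as 'AbBD' into domain names."""
    tokens = _TOKEN.findall(operon)
    # Reject strings containing anything that is not a known domain.
    if "".join(tokens) != operon:
        raise ValueError("unrecognized domain in operon string: %r" % operon)
    return tokens


def missing_domains(operons):
    """Return the Trp domains not encoded by any of the given operons."""
    found = set()
    for operon in operons:
        found.update(split_domains(operon))
    return TRP_DOMAINS - found
```

Under this encoding, `missing_domains(P_AERUGINOSA_OPERONS)` returns an empty set, consistent with the statement that the four entities together represent the whole pathway. The same helper could be applied to any other gene arrangement described in compact form, although it says nothing about paralogues such as phnA, phnB, or the PhzE copies.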
Unknown regulatory systems awaiting discovery? We do not know the extent to which the total network of regulatory elements governing the single trp operons in the E. coli, B. subtilis, and L. lactis clades or the multiple split-pathway operons of the P. aeruginosa clade might be more elaborate than that of most other organisms. Different lifestyles undoubtedly select mechanisms accommodating varied ranges of control responsiveness. A simple mode of Trp regulation may very well be appropriate in a cyanobacterium but not E. coli. A variety of alternative regulatory systems in other modern lineages probably remain to be elucidated. Transcriptional regulation has been reported in the whole-pathway operons of Methanobacterium thermoautotrophicum (26) and Pyrococcus kodakaraensis (77), but the exact mechanisms are unknown. The split-pathway operons of the clade represented by Rhizobium meliloti (7) and Azospirillum brasilense (21) exhibit an attenuation mechanism involving a Trp-rich leader peptide, upstream of the trpAatrpAb fusion, but no regulation of the remaining two partial-pathway operons is known. Physically separated split-pathway trp operons may be of positive selective value per se for presently unknown reasons, whereby it might be of value to discoordinate the expression of some trp genes from others, or they may simply be the outcome of initially disrupted whole-pathway operons that subsequently recruited a refined control mechanism accommodating the gene separations.
As a first step toward deducing the evolutionary history of overall aromatic biosynthesis, we selected the Trp branch as a challenging but manageable metabolic segment for initial analysis. Trp pathway genes have sometimes been recruited for function in specialized biochemical pathways, and ancient paralogues or xenologues may coexist with the Trp pathway genes that are engaged in primary biosynthesis. We have shown (93) that detailed case-by-case analysis can distinguish ancient trp paralogues (or xenologues) from their homologues engaged in primary Trp biosynthesis. A comparable study in the literature produced a detailed analysis of homologues of ornithine carbamoyltransferase in which the challenges to tracking a vertical path of evolutionary descent that are caused by the complexities of xenology and ancient paralogy were sorted out (73). This study was preceded by an analysis (49) showing that ornithine carbamoyltransferases in turn belong to a larger protein family in which the ornithine and aspartate carbamoyltransferases are very ancient paralogues. The conclusions of such comprehensive studies are consistent with the contentions of Glansdorff (27) and Woese (87) that complications of ancient paralogy, ancient analogy, and lateral gene transfer can be recognized sufficiently well to allow the events of vertical ancestry to be tracked.
Here we present results from an in-depth, manual analysis of Trp pathway genes in over 100 genomes. A limited amount of information is also given to illustrate the very important perspective that the evolutionary relationships of Trp biosynthesis will ultimately be best understood in its larger context as one branch of a highly divergent pathway responsible for the biosynthesis of aromatic amino acids as well as many other important metabolites.
TABLE 2. Cross-reference guide to organisms and figures
FIG. 2. Distribution of aromatic-pathway catalytic domains among prokaryotes. In each panel, 16S rRNA trees are shown at the left, and the presence (shaded circles) or absence (open circles) of domains is shown at the right. Note that only the presence or absence of genes, not gene order, is indicated. Catalytic domains of the common trunk of aromatic biosynthesis (Aro), the phenylalanine branch (Phe), the tyrosine branch (Tyr), and the Trp branch are labeled across the top right; the specific letter designation for a given domain is shown at the bottom. In the Trp grouping, split circles are used to indicate the presence or absence of TrpAa (top half-circle) and TrpAb (bottom half-circle) or TrpEa (top half-circle) and TrpEb (bottom half-circle). In panel A, the presence or absence of transketolase (Trk) is indicated by the left column of circles. The connecting point of a tree segment in any given panel (A, B, C, and D) with a tree segment(s) in another panel is marked with a broken line. The scale bar corresponds to substitutions per site. Dotted lines in the Streptococcus region (B) and the Buchnera region (D) indicate our suggestion that the 16S rRNA tree shown may not reflect exactly the correct order of branching, and perhaps these organisms branch from a slightly deeper position. See Fig. 8 for the suggested branching order of Buchnera. Circled numbers indicate eight node positions from which Trp protein trees are congruent with the 16S rRNA tree. The common trunk of aromatic biosynthesis is encoded by seven genes whose corresponding gene products are named AroA through AroG. The common-pathway genes are named in exact order of pathway reactions according to the precedent implemented in references 12, 31, 76, 90, and 91. The chorismate mutase block is represented by homologues of either AroQ (usually) or AroH (seldom) (12). PheA refers to prephenate dehydratase, the sequence of the relatively infrequent arogenate dehydratase being currently unknown. 
TyrA refers to a homologue family that includes prephenate dehydrogenase, arogenate dehydrogenase, or cyclohexadienyl dehydrogenase (9, 88). See Fig. 1 for details of Trp biosynthesis. The names of organisms retaining the putative ancestral whole-pathway trp operon are shaded orange, those having the two split-pathway operons are shaded magenta, and those having operons rejoined by fusion of trpD and trpC are shaded aqua. These correspond to the major evolutionary events portrayed in Fig. 12 and indicated with the same color-coding scheme. Probable pseudogenes in chlamydiae (C) and Coxiella (D) are indicated with heavy black slash marks. Genes that function in two pathways (trpAb in Bacillus subtilis and trpC in actinomycete bacteria) are marked with magenta bull's-eyes in B. Panel A includes the Archaea and a few of the deeper-branching Bacteria at the bottom. Panel B includes the gram-positive Bacteria. Panel C includes cyanobacteria, chlamydiae, and other organisms on the 16S rRNA tree between the gram-positive organisms in panel B and the organisms in panel D, which contains the gram-negative subdivisions of the Proteobacteria. Wolbachia sp. (panel D) is an endosymbiont of Brugia malayi. A cross-index of all organisms shown in both this figure and the remaining figures is given in Table 2.
The multipurpose Fig. 2 provides a summary of the presence or absence of Trp pathway genes in the larger context of the presence or absence of genes specifying the common aromatic trunk and the sister phenylalanine and tyrosine branches. The circles in Fig. 2 from left to right represent catalytic domains (specified at the bottom of each panel) corresponding to the seven common-pathway steps (aroA through aroG), chorismate mutase (aroQ or aroH) (which is common to the short Phe and Tyr branches), and the seven catalytic domains of the Trp pathway (Fig. 1 and Table 1).
The key enzyme of Phe biosynthesis is PheA, and the key enzyme of Tyr biosynthesis is TyrA. The Phe and Tyr branches each utilize an aminotransferase step, not shown as a circle because of bioinformatic difficulties associated with deducing the substrate specificity of multiple and ubiquitous broad-specificity aminotransferases (42). Most intermediary metabolites of aromatic biosynthesis are not likely to be available from the environment; only quinate, shikimate, and anthranilate, all abundant in nature (10), are feasible precursors of Trp. Although these metabolites are indeed readily utilized when available, no prokaryotes have yet been found to rely on an exogenous source of quinate, shikimate, or anthranilate as exclusive and obligatory beginning precursors. One interesting special-case exception is Chlamydophila psittaci, an obligate intracellular parasite that utilizes host-derived anthranilate as a required Trp precursor (89).
Alternative metabolic relationships.
In contrast to the apparent universality of the specific Trp branch, alternative enzyme steps appear to exist in nature for the Phe and Tyr branches as well as for the common trunk of aromatic biosynthesis. Some Archaea (Fig. 2A) and two widely spaced members of the Bacteria (Aquifex and Desulfovibrio, Fig. 2A and 2D) lack both AroA and AroB. Transketolase (Trk), required for generation of a substrate for AroA, is also shown in Fig. 2A because most (but not all) organisms that lack AroA and AroB also lack transketolase. (Desulfovibrio vulgaris [Fig. 2D] does have transketolase.) In the last six organisms, dehydroquinate, the substrate of AroC, presumably connects with carbohydrate metabolism in some unknown way that does not involve AroB or any of the known AroA homology groupings AroAIα, AroAIβ, or AroAII (31, 44, 76). Some support for this putative alternative metabolic connection, based on tracer methodology, exists in the literature (79). It is also possible that quinate, either from the environment or arising endogenously in some unknown way, could be the source of dehydroquinate via the action of a quinate dehydrogenase.
Although species of Chlamydophila and Chlamydia are very close phylogenetically, the presence of Trp pathway genes varies from complete absence in C. pneumoniae to almost all present in C. psittaci. It appears that the Trp pathway in C. trachomatis and C. muridarum is in a contemporary process of reductive evolution, and the few remaining genes may be remnants (25, 89). In contrast to these species, an "incomplete" trp operon in C. psittaci appears to play a role in the capture of host kynurenine derived from tryptophan (89). Although C. psittaci does lack trpAa and trpAb, the remaining five trp genes coexist in an operon into which two novel genes have been recruited. These encode kynureninase and PRPP synthase. This creates the ability to generate PRPP (needed for the TrpB step) and to intercept host kynurenine as a source of anthranilate, cycling host-catabolized Trp back to Trp in the intracellular parasite (89). Effectively, a host-pathogen metabolic mosaic has been created, and the variant operon generates a kynurenine-to-Trp flow route instead of the usual chorismate-to-Trp flow route.
As explained above, the absence of trpAa and trpAb in C. psittaci is by design, and the remaining Trp pathway is functional. The likelihood that aroA and aroB, which are absent in some organisms, will prove to reflect either a new metabolic connection or the existence of unknown analogue genes has already been mentioned. In a few cases tyrA or pheA was the only aromatic-pathway gene not found by homology search. The endosymbiont Buchnera (Fig. 2D), which lacks tyrA, may not need to synthesize tyrosine because the host has phenylalanine hydroxylase, which can convert phenylalanine to tyrosine. Aeropyrum pernix (Fig. 2A) and Helicobacter pylori (Fig. 2D), which both lack pheA, may very well possess arogenate dehydratase, an alternative pathway step for prephenate dehydratase (reference 39 and references therein). No gene encoding an arogenate dehydratase has yet been cloned and sequenced.
Reductive evolution. Reductive evolution is descriptive of the process in which pathogens or symbionts decrease genome size by abandoning genes that are needed by their free-living relatives but dispensable because of the availability of resources from a host or symbiont partner. The genus Pyrococcus exhibits marked variation in the capability for aromatic biosynthesis. Pyrococcus horikoshii has experienced total reductive evolution. Only TrpEb remains in P. horikoshii, and the case has been made that this may have some other function, such as serine deaminase activity (92). P. abyssi possesses genes encoding common-pathway and Trp pathway steps but lacks the Phe and Tyr branches. Although chorismate mutase (aroQ) is present, it could have some other substrate specificity (13). Since P. abyssi lacks the competing Phe and Tyr branches, an unusual metabolic circumstance exists in which the representation of tryptophan biosynthesis can be collapsed to that of a linear pathway of 12 overall steps (corresponding to the seven common-pathway steps followed by the five overall steps that are specifically dedicated to Trp biosynthesis). In contrast to the foregoing two differentially auxotrophic species of Pyrococcus, P. furiosus possesses a complete assemblage of aromatic-pathway genes.
Organisms that lack the entire branched system of aromatic amino acid biosynthesis include P. horikoshii (Fig. 2A), Ureaplasma urealyticum and Mycoplasma species (Fig. 2B), Borrelia burgdorferi and Treponema pallidum (Fig. 2C), and Rickettsia prowazekii and Wolbachia spp. (Fig. 2D). These whole-pathway reductive evolutions are generally associated with intracellular parasitism or endosymbiosis, and they imply auxotrophic dependence upon the host not only for all three aromatic amino acids but also for end products of the vitamin-like branches (e.g., folate, vitamin K, and ubiquinones) that derive from chorismate. In the Bacteria, some organisms possess an otherwise intact aromatic pathway but the Trp branch is uniquely absent. Among gram-positive bacteria (Fig. 2B), this includes Enterococcus faecalis and Clostridium difficile, and this pattern is also seen in the gram-negative Haemophilus ducreyi (Fig. 2D).
Interestingly, some organisms lack all three of the terminal aromatic amino acid branches but possess an intact common pathway to chorismate: Streptococcus pyogenes (Fig. 2B), Streptococcus equi (Fig. 1B), chlamydial species (Fig. 2C), Porphyromonas gingivalis (Fig. 2C), and Treponema denticola (Fig. 2C). The implication is that the remaining common pathway still links to one or more of the vitamin-like pathways. In the chlamydiae, we could not detect (by use of homology searching) a single gene encoding any known chorismate-utilizing enzyme. However, this could easily be accounted for by the existence of analogue genes that have not yet been identified. For example, E. coli chorismate lyase, which catalyzes the initial step of ubiquinone biosynthesis, is encoded by a gene (66) that is of very limited distribution. Therefore, elucidation of presently unknown analogue genes encoding chorismate lyase surely must be forthcoming.
FIG. 3. Apparent absence of trpC and an event of LGT in a lineage of actinomycete bacteria. A broader phylogenetic context can be viewed in Fig. 2B and 6A. chyp denotes a conserved hypothetical membrane protein exhibiting about 28% identity in comparison of a given Mycobacterium species with a given Corynebacterium species. Color-coded boxes pointing in the direction of transcription represent genes of Trp biosynthesis. For clarity of presentation, trpAa is shown as Aa, etc. Open boxes with question marks denote hypothetical proteins. Intergenic spacing is shown, with negative values indicating gene overlap. trpDtrpC fusions are represented by short black linker bars. On the left are 16S rRNA-based phylogenetic trees of the genomes having the gene organizations shown on the right. Orthologues that match the mycobacterial trpAa/chyp/D/Eb/Ea operon genes are aligned vertically. Contemporary trp operons in coryneform species that originated in their common ancestor by LGT of trpAa/Ab/B/DC/Eb/Ea from a source within the enteric lineage are shown within brackets. Except for the two coryneform species, all actinomycetes have a free-standing trpB gene. The Mycobacterium spp. and Streptomyces also have a free-standing trpAb gene. The corresponding TrpB and TrpAb proteins exhibit high identity with one another but not with TrpB and TrpAb of the coryneform species. Thermomonospora has dissociated trpAa from the typical clade operon and fused it with trpAb (as also shown in Fig. 4). The trpAa/Ab/B/D/aroAII operon of S. coelicolor is known to be specifically associated with antibiotic biosynthesis (see text).
A comprehensive phylogenetic tree for TrpD proteins (data not shown) reveals that all of the TrpD proteins in Fig. 3 exhibit cohesive clustering and an order of branching that is congruent with the corresponding genome positions on the 16S rRNA phylogenetic tree except, of course, for the TrpD domain of the trpDtrpC fusion protein in the two coryneform species. Thus, in C. diphtheriae and C. glutamicum, the free-standing trpD outside of the whole-pathway trp operon is more closely related to trpD inside the partial-pathway trp operons of all the other organisms. An inner-membrane protein of unknown function separating trpAa and trpD in all of the mycobacteria, encoded by chyp, also flanks the nonoperonic trpD of the two coryneform species. As expected for the suggested LGT scenario, trees of TrpAa, TrpEa, and TrpEb proteins that are encoded from the partial-pathway operons of mycobacterial species, Streptomyces, and Thermomonospora in Fig. 3 all cluster closely together with the exclusion of the corresponding LGT genes from the coryneform bacteria.
Post-LGT events of vertical descent can be tracked in C. diphtheriae. Since the time that an alien trpAa/trpAb/trpB/trpDtrpC/trpEb/trpEa operon displaced the trp genes present in the common ancestor of coryneform bacteria, leaving behind only chyp and trpD as remnants, subsequent vertical evolutionary events in the C. diphtheriae genome are apparent. Thus, an insertion containing panB and panC occurred recently between trpDtrpC and trpEb in the C. diphtheriae lineage after its divergence from C. glutamicum. In C. glutamicum, closely related panB and panC orthologues (encoding ketopantoate hydroxymethyltransferase and pantothenate synthetase) comprise a characterized operon of D-pantothenate biosynthesis that is located elsewhere in the genome (71). In C. diphtheriae, the translocation of panB and panC into the trp operon is associated with an inversion event between these two genes. Hence, the opposite transcriptional direction of the inserted panC has now isolated trpEb/trpEa from its former operonic transcriptional continuity, presumably forcing it to become a separate transcriptional unit. It is interesting that the otherwise alien operon of C. diphtheriae now contains the native genes panB and panC, transposed from the resident genome. C. diphtheriae has also produced a gene duplicate of the gene encoding the alien TrpEb, which has then become the proximal member of the operon. This paralogue TrpEb is probably deficient in complex formation with TrpEa, because conserved residue K-167 (Salmonella enterica serovar Typhimurium numbering), which forms a salt bridge with residue D-56 of TrpEa, has been changed to S-167 (85). Also, the highly conserved residue 162-G has been changed to a charged residue, 162-E. Thus, after the LGT event, several subsequent vertical events of evolution that occurred in C. diphtheriae but not in C. glutamicum can be tracked.
The following approaches were taken in an attempt to locate the missing trpC genes in the above-mentioned actinomycete organisms.
Pattern and profile search. TrpC is a short and relatively divergent sequence. Known TrpC homologues may have identities as low as 22%. In an initial Blast screening with E. coli TrpC as the query, for example, the Ferroplasma acidarmanus genome did not return any hits and appeared to lack TrpC by this criterion. However, the position of an unknown gene within the trp operon of F. acidarmanus strongly implicated its presence as a divergent trpC gene because it occupies the same relative position as trpC in two closely related Thermoplasma species. Indeed, identity as trpC (second iteration) was amply confirmed by use of PSI-Blast (5), as well as by the observed conservation in multiple alignments of critical residues established by structural studies of TrpC from E. coli. In addition, the use of TrpC query sequences from most of the Archaea did return positive Blast hits from the F. acidarmanus genome.
With this background in mind, the genomes of T. fusca, S. coelicolor, and the mycobacteria M. avium, M. tuberculosis, and M. bovis were subjected to a pattern and profile search that included a ProSite-like pattern based upon critical residues reported in the PDB summary, the use of TrpC domains as query sequences that were available from the closest relatives of the group missing TrpC, and the generation of a hidden Markov model based on a multiple sequence alignment of known TrpC sequences. No illuminating results were obtained with this approach.
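As a minimal sketch of the first element of such a search, a ProSite-like pattern can be translated into a regular expression and scanned against a protein sequence. The pattern and sequence below are hypothetical placeholders, not the TrpC critical-residue pattern or any genome data used here, and the function names prosite_to_regex and find_pattern_hits are illustrative assumptions.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # '<' anchors the N terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # '>' anchors the C terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"empty pattern element in {pattern!r}")
        core, low, high = m.groups()
        if core == "x":                  # 'x' accepts any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # {ABC} excludes residues
        # [ABC] and single letters are already valid regex fragments.
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_pattern_hits(sequence: str, pattern: str):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group(0)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_pattern = "[LIVM]-x(2)-G-{P}-K"   # hypothetical pattern
    toy_protein = "MSTLAAGDKPLVQQGPK"     # hypothetical sequence
    print(prosite_to_regex(toy_pattern))
    print(find_pattern_hits(toy_protein, toy_pattern))
```

A pattern scan of this kind finds only exact matches to the specified residues, which is one reason a profile method such as a hidden Markov model is normally used alongside it for divergent sequences.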
Evaluation of an unknown gene inserted in the trp operon. M. tuberculosis has a conserved hypothetical gene (Rv1610) inserted between trpAa and trpD (denoted chyp in Fig. 3). The absence of trpC coupled with the insertion of this unexpected gene within the trp operon invited careful scrutiny. This was, in fact, reminiscent of the previously mentioned situation with the operonic trpC of F. acidarmanus, which initially eluded detection as trpC. However, critical residues expected of TrpC could not be matched to Rv1610 by manual alignment. Furthermore, Rv1610 appears to encode an inner-membrane protein with three transmembrane segments. In addition, if Rv1610 were, in fact, a divergent TrpC, we would expect to find homologues in T. fusca and S. coelicolor. We did not.
Possible catalysis of the TrpC reaction by HisA. TrpC catalyzes an intramolecular oxidoreduction (Amadori rearrangement) that parallels the isomerase reaction catalyzed by HisA. Both reactions involve isomerization of an identical phosphoribosyl moiety. TrpC and HisA each exhibit (βα)8 barrel structures. Jurgens et al. (46) in fact generated hisA mutants that could catalyze the TrpC reaction both in vivo and in vitro. One of these variants retained significant HisA activity. We therefore envisioned the possibility that an ancestor of the TrpC-deficient block of organisms might have duplicated hisA and recruited one copy to TrpC function. However, second copies of hisA were not found. We then further considered the possibility that HisA in these organisms might catalyze both reactions, since that potential had been established in vitro. However, the alignment of HisA sequences did not reveal any obvious variant residues common to the TrpC-deficient block of organisms that might suggest potential for TrpC activity.
Evolution of competence for TrpC catalysis by TrpD. Altamirano et al. (3) recently reported the evolution of TrpC activity from the α/β barrel scaffold of TrpD following in vitro mutagenesis and recombination. Thus, one might envision an event of trpD gene duplication followed by divergence of one of the paralogues to TrpC function. Although a gene duplicate of trpD was found in S. coelicolor, other organisms of the trpC-"deficient" block do not have a trpD gene duplicate. In consideration of the additional possibility that a modified trpD might encode an enzyme capable of both reactions, a careful comparison of the multiple alignment for trpD sequences failed to reveal a variant subgroup that might be expected of an evolved dual-function trpC/trpD protein. This is perhaps not surprising in view of the recent retraction (4) of the results of Altamirano et al. (3).
Other possibilities. Enzymes possessing triose phosphate isomerase (TIM) (βα)8 barrel-like folds are widespread and accommodate a particularly wide range of functions (15). Within this large grouping, TrpC, TrpD, TrpAa, and Rpe (D-ribulose 5-phosphate 3-epimerase) belong to the ribulose phosphate binding superfamily within the SCOP (structural classification of proteins) database (15, 86). Therefore, both TrpAa and Rpe were also evaluated as possible evolutionary sources of the missing TrpC, with the approaches described for HisA and TrpD. Suggestive evidence was not found.
The isomerase step catalyzed by TrpC is clearly a facile reaction, and although none of the foregoing possibilities considered produced the answer sought, they illustrate nicely the rationale and sorts of in silico strategies for gene discovery that can be anticipated in the near future. Until the time that this article was under review, the identity of trpC in the organisms included in Fig. 3 had remained a mystery. However, convincing evidence has been obtained recently that the HisA isomerase in these organisms does in fact catalyze the isomerase reaction in both pathways (9). The gene name priA (phosphoribosyl isomerase A) has been suggested to accommodate its functional role in two pathways. Although this possibility was anticipated as outlined earlier, the natural bifunctional proteins of actinomycete bacteria did not resemble those obtained experimentally (46) in terms of amino acid sequence matches. Barona-Gómez and Hodgson (9) suggested that the bifunctional actinomycete isomerases represent an ancient evolutionary state that is in line with the recruitment hypothesis (38). If so, specialization in the gene duplicate that became trpC must have required more divergence than in the gene duplicate that became hisA, because the homology of PriA proteins with HisA is evident but that with TrpC proteins is not.
FIG. 4. Mapping of the distribution of Trp pathway gene fusions to the 16S rRNA tree. The presence of fusion subtypes is color-coded as indicated in the legend. Although Buchnera aphidicola maps near E. coli on the 16S rRNA tree, as shown, its true point of divergence is probably prior to Yersinia, as portrayed by dotted lines in Fig. 8.
TABLE 3. Comparison of GC content in gene fusions and cognate genomes
The ultimate analysis of the total inventory of fused genes in any given genome should provide an excellent phylogenetic tool for deducing the order of branching. This approach should be greatly enhanced by the rapid increase in the number of sequenced genomes coupled with the enormous advantage of being able to identify gene fusions with bioinformatic methods. However, it was not expected at the time that fusions could occur independently at such frequencies or that LGT should be taken seriously. Therefore, application of the approach of nested gene fusions will require sufficient background work to recognize and discriminate fusion clusters that have independent origins on the vertical tree as well as ones that might have been spread in the horizontal direction by LGT.
trpAa/trpAb and trpEb_1/trpEa.
FIG. 5. Organization of trp operon genes in the Archaea. Each trp gene is color coded differently, including the two subtypes of trpEb (Eb_1 and Eb_2) (92). trp genes that exist in the genome unlinked to any other trp genes are not shown. Archaeoglobus fulgidus has a trpDtrpB gene fusion (see Fig. 4). Intergenic spacing is shown, with negative values indicating gene overlap. Genes that are not specific trp pathway genes are in white boxes. F. acidarmanus possesses a gene encoding the aroAIβ subclass (44) of DAHP synthase. aspC in S. solfataricus is an aromatic aminotransferase of the I aspartate aminotransferase type (42). This gene insertion corresponds to genes that appear to have escaped from the aro operons shown in Fig. 10. The gene order shown for Methanosarcina barkeri is the same as those in Methanosarcina acetivorans and Methanosarcina mazei. The gene order shown for S. solfataricus is the same as that for Sulfolobus tokodaii.
Usually, the pair of genes encoding the two subunits of tryptophan synthase are adjacent in prokaryotes. In the case of P. aerophilum, trpEa and trpEb_2 have been separated from one another within the operon. This may reflect the inability of TrpEb_2 to form a complex with TrpEa. In P. aerophilum, trpC and trpD have become separately dissociated from the operon. trpC and trpD are adjacent in the operon of Aeropyrum pernix, but separated in the operon of Sulfolobus solfataricus. Although all of the trp genes in A. pernix are adjacent, they are organized as two divergently transcribed groups, trpEa/trpEb_2/trpC/trpD and trpB/trpAa/trpAb. The A. pernix trpEa/trpEb order is very unusual, the trpEb/trpEa gene order being one of the most highly conserved gene couples in all prokaryotic genomes (17). Methanosarcina barkeri and Halobacterium spp. have identical gene orders, but the intact operon currently seen in M. barkeri corresponds to a splitting into two separate operons in Halobacterium spp. In some cases, other aromatic-pathway genes have been inserted into the trp operon. Thus, the trp operon of F. acidarmanus has aroAIβ as its most distal gene, whereas S. solfataricus has aspC (encoding aromatic aminotransferase) as the most distal gene of its trp operon.
FIG. 6. Organization of trp operon genes in the Bacteria. Each trp gene is color coded differently, including the two subtypes of trpEb. (trpEb in this figure refers to the major trpEb_1 subtype.) The tree sections in A and B join as indicated by the dashed line. Intergenic spacing is shown, with negative values indicating gene overlap. Separations showing white space and no intergenic spacing values indicate that the gene clusters are not linked to one another. Insertions of hypothetical genes and known genes are shown as white boxes. Short black bars connecting arrows denote gene fusions. Links to zoom-in expansions of particular lineages in other figures of this paper are indicated by binoculars. In B, the gene organization shown for Rhodopseudomonas palustris is identical to those of the closely related Agrobacterium tumefaciens, Rhizobium loti, Brucella melitensis, and Sinorhizobium meliloti; that shown for Burkholderia fungorum is identical to that of Burkholderia pseudomallei and Burkholderia mallei; that shown for Bordetella parapertussis is identical to that of Bordetella pertussis and Bordetella bronchiseptica; that for Neisseria meningitidis is identical to that of Neisseria gonorrhoeae; and that for Pseudomonas aeruginosa is identical to that of Pseudomonas putida, Pseudomonas fluorescens, and Pseudomonas syringae. The apparent supraoperon of Anabaena sp. (A) has been discussed in reference 93. kynU and kprS on the Chlamydophila psittaci line (A) refer to genes encoding kynureninase and PRPP synthase, respectively (89). The linked trpAa/trpAb genes shown for P. aeruginosa (B) were named phnA/phnB by Essar et al. (24) because they were thought to be dedicated to phenazine biosynthesis, a conclusion shown to be incorrect by Mavrodi et al. (57). This gene pair is not within the vertical line of descent (see later section), as indicated by the LGT notation. The trpAaAb operon shown on the left for Xylella is also outside the vertical line of descent (i.e., origin by LGT) (93). Shewanella putrefaciens (B) has the newly proposed name of Shewanella oneidensis (81).
In some cases, bacteria possess two chromosomes (19, 54). It is interesting that, in Rhodobacter sphaeroides, not only has the ancestral trp operon been split apart, but also the resulting partial-pathway operons (trpAa/yibQ/trpAb/trpB/trpD and trpC/aroR/trpEb) now reside on separate chromosomes. trpEa has become completely dissociated from these operons (54). The closest genomic neighbor of R. sphaeroides available on the 16S rRNA tree is Sphingomonas aromaticivorans, and it possesses the same split-pathway arrangement as R. sphaeroides except that trpEb and trpEa have remained together (i.e., trpC/trpEb/trpEa). The intriguing partitioning of the trp split-pathway operons between two chromosomes is typical of a 16S rRNA grouping of organisms that includes Rhodopseudomonas palustris, Rhizobium loti, Brucella melitensis, Sinorhizobium meliloti, Rhodobacter sphaeroides, and Sphingomonas aromaticivorans. Most of these organisms are not shown in Fig. 6, but a detailed breakdown of Trp pathway gene organization in this part of the tree is given in reference 93.
At one extreme of gene dissociation, gene dispersal has completely eliminated any linkage of trp pathway genes, as observed in Aquifex, unicellular cyanobacteria (Synechocystis, Synechococcus, and Prochlorococcus), and Chlorobium tepidum. (Only organisms possessing at least some linked trp genes are shown in the various figures of this paper.) One might reasonably consider whether these organisms simply manifest retention of a "preoperon" ancestral state, but this seems untenable with respect to the application of parsimony principles because they represent distinctly separate, widely spaced lineages.
All cyanobacteria possess a common phylogenetically congruous set of completely dispersed genes for tryptophan biosynthesis. However, Nostoc and Anabaena possess in addition some redundant trp genes that are linked to one another. The assemblage of linked trp genes in Anabaena spp. (shown in the middle of Fig. 6A) is very similar to that of the closely related Nostoc punctiforme (not shown, but see Fig. 2 of reference 93 for details) and seems to be part of a larger gene assemblage (possible supraoperon) that includes several other aromatic-pathway genes. Nostoc and Anabaena (large-genome, filamentous, and heterocystous cyanobacteria) possess these linked genes in addition to copies of all of the dispersed trp pathway genes found in the unicellular cyanobacteria. Hence, the redundant set of linked genes that are uniquely present in Nostoc and Anabaena seemed to be obvious candidates for origin by LGT. However, no support for LGT was found, and it has been suggested (93) that ancient paralogues have been retained in the Nostoc/Anabaena lineage, whereas the set of linked paralogue genes has been lost in the unicellular cyanobacteria.
Brown and Doolittle (11) made the correct observations, as long ago as 1997, even with vastly less data, that the consensus gene order seemed to be trpAa/Ab/B/D/C/Eb/Ea in Bacteria, that archaeal gene orders seem to be more variable than in Bacteria, and that the trpAa/trpAb and trpEb/trpEa linkage groups might be ancestral.
At the deepest branching position shown in Fig. 6A, T. maritima possesses a compact ancestral operon, differing only in that trpAb and trpB have fused. This fusion is rare, having occurred elsewhere only in the distant E. coli/S. enterica serovar Typhimurium/K. pneumoniae subgrouping (Fig. 4). At the next phylogenetic node in Fig. 6A, D. ethenogenes has retained the ancestral trp operon, albeit with an aroAIβ insertion between trpC and trpEb. In the gram-positive organisms shown in the Fig. 2B tree, ancestral operons are present in the following organisms from the deepest to more shallow phylogenetic nodes: Clostridium acetobutylicum > Desulfitobacterium hafniense > Listeria monocytogenes, Bacillus anthracis, and species of Staphylococcus. The ancestral trp operon has not survived in most of the phylogenetic groupings shown in Fig. 2C. In many cases, some or all of the Trp pathway genes have been lost by reductive evolution in pathogenic Bacteria. In other cases (cyanobacteria and Chlorobium tepidum), the trp genes have all been dispersed. Cytophaga hutchinsonii is the sole organism shown in Fig. 2C that has retained a complete trp operon with the ancestral gene order. Finally, in the top node illustrated in Fig. 2D, Desulfovibrio vulgaris has retained the compact ancestral trp operon, as shown near the bottom of Fig. 6A.
FIG. 9. Conserved genes flanking the trpC/trpEb/trpEa operon of organisms within the split-operon portion of the 16S rRNA tree. Organisms in the upper grouping are α-Proteobacteria; the cluster between Thiobacillus and Neisseria consists of β-Proteobacteria; and the bottom cluster is that fraction of the γ-Proteobacteria that diverged prior to the trpDtrpC fusion event. lysM and truA, conserved at the flanking gene position at the left throughout the β- and γ-Proteobacteria, are shaded grey, as are accD and folC (conserved in the flanking gene position at the right throughout the phylogenetic span portrayed in this figure). The deduced gene order of the common ancestor for each of the two major 16S rRNA clades is the same as shown for the two contemporary organisms Rhodopseudomonas palustris and Nitrosomonas europaea, as indicated by outlining in orange. Intervening genes, either hypothetical or known, are shown as open block arrows.
FIG. 8. Zoom-in from Fig. 6B showing Trp pathway gene organization in a range of Proteobacteria defined by the presence of the trpD trpC fusion. Deduced phylogenetic events described on the left are identified by number on the 16S rRNA tree at the evolutionary times indicated. The actual position of Buchnera on the 16S rRNA tree (as shown in Fig. 2D and Fig. 4) is closest to E. coli. However, the long branch (Fig. 4) is consistent with the more likely order of branching depicted by the dotted line for Buchnera in this figure.
It also is worthwhile to consider whether what is effective regulation for one organism would be appropriate for organisms that have a completely different lifestyle. E. coli, for example, experiences regular episodes of feast and famine in the gut of humans, and the ability of E. coli to regulate Trp enzymes over a large range of expression confers rapid response and efficiency. On the other hand, cyanobacteria generally grow in a nutritionally dilute environment and synthesize most of their amino acids most of the time. Under these conditions, possession of an operon system that is responsive over several orders of magnitude may not confer selective advantages.
There are a number of well-spaced genomes that possess the putative ancestral operon of Bacteria, highlighted orange in Fig. 2, e.g., trpAa/Ab/B/D/C/Eb/Ea is present in species of Listeria, species of Streptococcus, species of Staphylococcus, Clostridium acetobutylicum, and Desulfovibrio vulgaris. We considered the possibility that the trp operons in these organisms are related to one another by LGT rather than by vertical descent. However, we did not find that the trp operon proteins in any of these organisms clustered together when comprehensive trees for all seven proteins were inspected (data not shown), as would be expected for relationships of LGT. Therefore, we conclude that in these lineages, the exact ancestral operon was simply retained without gene dispersal, gene insertion, or gene fusion.
On the other hand, the fusion-containing trp operon (trpAa/Ab/B/DC/Eb/Ea) in the enteric lineage is related to those of coryneform bacteria and Helicobacter pylori by LGT. We know that coryneform bacteria must have been the recipient rather than the donor because they retain remnants of the original host. We conclude that H. pylori was also a recipient of LGT from the ancestral lineage because the Helicobacter/Campylobacter node of divergence is more recent than the root of divergence for the enteric lineage. Therefore, if Helicobacter had been the trp operon donor, one would expect Campylobacter to also have the fusion-containing trp operon. As pointed out before, the modern Helicobacter operon lacks repression control by trpR, presumably because trpR of the alien enteric lineage donor was unlinked to the transferred operon. It would be interesting to know how the regulation of the modern H. pylori trp operon compares to that of the modern Campylobacter jejuni trp operon, which presumably would be similar to the original H. pylori trp operon that was displaced.
Figure 6B shows two different partial-pathway trpAa/trpAb operons that were acquired by LGT in Xylella fastidiosa and in Pseudomonas aeruginosa, as discussed previously by Xie et al. (93). In Xylella it has been speculated (93) that trpAa/trpAb coexists within an operon with acl, which encodes an aryl-coenzyme A ligase that might have the specificity of an anthranilate-coenzyme A ligase. This might then be a point of divergence, whereby coenzyme A-activated anthranilate proceeds to an antibiotic, siderophore, etc. This anthranilate synthase appears to be resistant to feedback inhibition by Trp, consistent with the absence of Trp as an end product of the putative specialized pathway. The P. aeruginosa trpAa/trpAb operon shown in Fig. 6B was originally denoted phnA/phnB (phn for phenazine) because their expression in stationary phase, unregulated by Trp, was thought to be a mechanism to produce anthranilate precursor for phenazine synthesis in the presence of Trp. Although it is now known (57) that this operon is not part of the phenazine pathway and that anthranilate is not a phenazine precursor, it would appear to constitute a system designed for production of anthranilate in an unknown functional role in stationary-phase metabolism.
Streptomyces coelicolor possesses an operon (trpAa/trpAb/trpB/trpD/aroAII) (Fig. 4) that is nested within a large cluster of genes that dictate synthesis of a calcium-dependent antibiotic (CDA) (70). This antibiotic contains Trp. The origin of this operon by LGT has been mentioned (70), but a detailed analysis has not yet been done. However, even if it originated via ancient paralogy instead, it is a good example of a contemporary operon that could confer a specialized ability to make Trp in the presence of fully charged tryptophanyl-tRNA via LGT. The key aspects are an operon free of any mode of regulation by Trp and inclusion of the gene encoding a homologue of DAHP synthase (AroAII) that is not inhibited by amino acids, hence ensuring an unrestrained supply of chorismate. Thus, normal restraints in place for primary biosynthesis at the branch point levels of both DAHP synthase and anthranilate synthase have been removed in order to accommodate the secondary synthesis of antibiotic. Note also in these examples with S. coelicolor that the primary and secondary pathways are not entirely separate.
The antibiotic-oriented operon system lacks trpEa and trpEb. Therefore, the tryptophan synthase that is utilized for primary biosynthesis must also be used to make Trp molecules destined for incorporation into antibiotic molecules. In view of the recent revelation (9) that priA fulfills the isomerase function in both the histidine and tryptophan pathways in S. coelicolor, as discussed earlier, it would appear that priA must also have a functional role in a third pathway to the CDA antibiotic. S. coelicolor has four paralogues of trpAa: one engaged in primary Trp biosynthesis (undoubtedly sensitive to feedback inhibition), a free-standing trpAa of unknown function, one dedicated to antibiotic biosynthesis (probably not sensitive to feedback inhibition), and another (not shown in Fig. 3) that is a domain component of trpAatrpAb_phz and dedicated to phenazine biosynthesis.
FIG. 7. Zoom-in from Fig. 6A showing instances of Trp pathway reductive evolution and expansion of intergenic space in one phylogenetic section of some gram-positive bacteria whose 16S rRNA tree relationships are shown at the left. Loss of various metabolic capabilities is indicated by scissors. Note that the order of branching of Lactococcus lactis (shown in orange) has been altered from that shown in the 16S rRNA tree of Fig. 2B. The gene order and compact spacing of Listeria innocua is the same as that shown for Listeria monocytogenes.
alpha-Proteobacteria (top major grouping), the ß-Proteobacteria, and some of the gamma-Proteobacteria. We did not find the trpAa/Ab/B/D partial-pathway operon to be flanked by conserved genes, but the trpC/Eb/Ea partial-pathway operon did exhibit flanking conserved genes. On the right in Fig. 9 are shown the positions of conserved genes that do flank the trpC/Eb/Ea operon. Genes encoding the ß subunit of acetyl-coenzyme A carboxylase (accD) and folylpolyglutamate synthase/dihydrofolate synthase (folC) follow trpEa in most cases. Occasionally folC appears to have been translocated away from trpEa/accD, as exemplified in Bordetella parapertussis, Neisseria meningitidis, and Neisseria gonorrhoeae. For the lower group of organisms (from Thiobacillus through the Pseudomonas/Azotobacter cluster), the trpC/Eb/Ea operon is additionally flanked on the left by genes encoding fimbria V protein (lysM) and tRNA pseudouridine synthase A (truA). The top group of organisms shown in Fig. 9 exhibits the gene order trpC/trpEb/trpEa/accD/folC (boxed) that likely mirrors the ancestral gene order of the alpha-Proteobacteria, whereas it is reasonable to suggest that the gene order of Nitrosomonas europaea represents the ancestral gene order of the remaining organisms in the tree. These conserved flanking genes provide information that can help guide fine-tuned evolutionary deductions. For example, the clade that includes Pseudomonas aeruginosa, P. syringae, P. fluorescens, and Azotobacter vinelandii possesses a trpC gene that has become separated from trpEb/trpEa. Was trpC or trpEb/Ea transposed away from the original trpC/Eb/Ea operon? The answer clearly is trpEb/trpEa, since trpC is flanked on the left by lysM/truA and on the right by accD/folC. Likewise, in the Magnetococcus sp., trpC is flanked on the right by accD and folC, and therefore the trpEb/trpEa operon must have been translocated away from trpC. Also, trpD in Magnetococcus must have migrated from the trpAa/Ab/D operon to its anomalous contemporary position near trpC.
Both Magnetospirillum and Azospirillum resemble the Magnetococcus sp. in that trpC has separated from trpEb/trpEa. However, in contrast to Magnetococcus sp., in which trpEb/trpEa has been transposed, in both Magnetospirillum and Azospirillum it is trpEb/trpEa that is linked to accD, and therefore it is clearly trpC that has been transposed away.
In Bordetella parapertussis trpC has separated from trpEb/trpEa in such a way that trpC retains linkage with lysM/truA on the left and trpEb/trpEa retains linkage with accD on the right. This could be consistent with a very large insertion (49,000 bp) between trpC and trpEb, or more likely trpEb/trpEa/accD were jointly transposed. In Neisseria meningitidis and Neisseria gonorrhoeae, trpC, trpEb, and trpEa have all become separated from one another. In this case, trpEa has retained its linkage with accD.
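The flanking-gene reasoning applied to Fig. 9 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a procedure used in the analysis itself: the function names and the representation of a genome as a list of loci are hypothetical, and the gene neighbourhoods are simplified from the descriptions in the text.

```python
# Conserved flanking genes of the trpC/trpEb/trpEa operon, as described in the text.
LEFT_FLANK = {"lysM", "truA"}
RIGHT_FLANK = {"accD", "folC"}
FLANKS = LEFT_FLANK | RIGHT_FLANK


def genes_in_context(loci):
    """Return the trp genes that sit at a locus still carrying a conserved flanking gene."""
    retained = set()
    for locus in loci:
        if set(locus) & FLANKS:
            retained |= {gene for gene in locus if gene.startswith("trp")}
    return retained


def transposed_segment(loci, operon=("trpC", "trpEb", "trpEa")):
    """Return operon genes that have lost all flanking context (the inferred movers).

    An empty list means that every fragment kept some flanking gene, so the
    context alone does not identify which fragment moved.
    """
    retained = genes_in_context(loci)
    return [gene for gene in operon if gene not in retained]


# Simplified neighbourhoods (hypothetical encodings of the cases in the text).
pseudomonas = [["lysM", "truA", "trpC", "accD", "folC"], ["trpEb", "trpEa"]]
magnetospirillum = [["trpC"], ["trpEb", "trpEa", "accD"]]
bordetella = [["lysM", "truA", "trpC"], ["trpEb", "trpEa", "accD"]]

print(transposed_segment(pseudomonas))
print(transposed_segment(magnetospirillum))
print(transposed_segment(bordetella))
```

With these encodings, the Pseudomonas case yields trpEb/trpEa as the transposed segment and the Magnetospirillum case yields trpC. The Bordetella case returns an empty list, which matches the ambiguity noted in the text: both fragments retain a conserved neighbour, so either a large insertion or a joint transposition remains possible.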
FIG. 10. Linkage relationship of genes within the larger context of aromatic amino acid biosynthesis in Archaea. The tree is the same as that shown in Fig. 5, where the full organism names corresponding to the acronyms used can be viewed. Common-pathway genes are shaded and designated by the gene letter, e.g., Q = aroQ. Hypothetical genes are denoted as hypo. Genes are labeled within block arrows that point in the direction of transcription. Copies of genes encoding transketolase are designated trk-alpha and trk-ß. Short black bars connecting arrows indicate gene fusions. Deleted genes, pathway branches, and entire pathways are indicated with scissors.
Dynamics of archaeal gene shuffling. It is suggestive that the gene orders within the largest archaeal linkage groups that represent either the Euryarchaeota (P. furiosus) or the Crenarchaeota (S. solfataricus) show some similarities, and we speculate that the ancestral gene order might have resembled that of the P. furiosus aro operon. This speculation is influenced by the gene order (aroAIß/aroB/aroC/aroD/aroE/aroF/aroG) of the closest neighbor of S. solfataricus, Aeropyrum pernix. The altered order of aroC and aroG in S. solfataricus may reflect derived transposition events. If P. furiosus does represent the ancestral order, deletion of aspC could have resulted in the aroQtyrA fusion in S. solfataricus, which must then have been inserted between the ancestral trk-ß and aroAIß. If so, the deleted aspC gene was then inserted into the trp operon of S. solfataricus (see Fig. 5) to become the distal gene member of the operon. Whether the trp operon became associated with the convergently transcribed aro operon in the Pyrococcus lineage or whether the trp operon dissociated from the aro operon in the Euryarchaeota seem to be equally possible alternatives that await resolution with the advent of more closely spaced genome representation.
The two Thermoplasma species (T. acidophilum and T. volcanium) and the closely related Ferroplasma acidarmanus have two identical aro operons except that aroAIß is missing in Ferroplasma in comparison with the aroQ/tyrA/aroAIß operon of the Thermoplasma species (Fig. 10). It is quite intriguing that this aroAIß gene has been inserted into the trp operon of F. acidarmanus at the distal gene position (Fig. 5).
It is apparent that genes of both Trp biosynthesis (Fig. 5) and overall aromatic biosynthesis (Fig. 10) have been atypically dispersed in Methanococcus jannaschii. This is reminiscent of the tendencies toward gene dispersal seen in some but relatively few of the Bacteria (species of cyanobacteria, Aquifex, and Chlorobium). Methanopyrus kandleri, a relatively close relative of M. jannaschii, also has dispersed Trp pathway genes, with only trpAa and trpAb (20-bp gene overlap) being adjacent (data not shown).
FIG. 11. Zoom-in from Fig. 6A showing a conserved gram-positive region containing the six-gene aro operon (or remnants of it) and the trp/aro supraoperon of the B. subtilis/B. halodurans/B. stearothermophilus subgroup. (A) The aro and trp operons are mapped on a 16S rRNA tree at the far left. (The exact branching order of Oceanobacillus iheyensis has not been determined.) The Enterococcus/Streptococcus/Lactococcus grouping branches off between Listeria and the B. anthracis subgroup on a 16S rRNA tree (not shown, but see Fig. 2B and Fig. 7), but we believe from a variety of observations that it belongs just outside of the lineage shown in this figure. Shaded bracketed regions around aro operons and trp/aro supraoperons can be related to the presence of a context of conserved, flanking genes, as shown in part B. The separate aro and trp operons of a putative common ancestor are shown at the bottom of A. aro genes in B are color coded to match the genes shown in A. The conserved region to the left of aro operon genes includes eight genes (gpsA, hbs, hepS, menH, hepT, ndk, aroG, and aroB) that are conserved in every organism shown (heavy black overbars). Gene abbreviations: gpsA, glycerol 3-phosphate dehydrogenase; spoIVA, sporulation protein IVA; hbs, nonspecific DNA-binding protein; mtrA, GTP cyclohydrolase I; mtrB, TRAP; hepS, heptaprenyldiphosphate synthase (component I); menH, heptaprenyl naphthoquinone methyltransferase; qpt, quinone polyprenyltransferase; acd, aromatic acid decarboxylase; hypo, hypothetical gene; hepT, heptaprenyldiphosphate synthase (component II); ndk, nucleoside diphosphate kinase; cheR, chemotaxis protein methyltransferase; tpr, tetratricopeptide repeat-containing protein (COG0457).
hisHb (subscript denotes broad specificity) encodes a subgroup of imidazole acetol phosphate aminotransferase that is widespread and functions as an aromatic aminotransferase (42). The other subgroup, HisHn (subscript denotes narrow specificity), functions in the pathway of histidine biosynthesis. Interestingly, the hisHb/tyrA/aroF gene combination is part of another supraoperon (serC/aroQppheA/hisHb/tyrA/aroF/cmk/rpsA) which has been characterized in Pseudomonas stutzeri and P. aeruginosa (90, 91). aroH is a relatively rare analogue class of chorismate mutase, thus far known to be present only in cyanobacteria and in a scattered distribution of gram-positive Bacteria, including, in addition to the lower group of Bacillus in Fig. 11A, Desulfitobacterium hafniense, Carboxydothermus hydrogenoformans, Clostridium botulinum (but not other Clostridium species), Thermoanaerobacter tengcongensis, Streptomyces coelicolor, Thermomonospora fusca, and Heliobacillus mobilis. The gene organizations of the Bacillus halodurans and Bacillus stearothermophilus supraoperons are essentially identical to that of B. subtilis. However, note that in B. stearothermophilus a conspicuous expansion of intergenic space between trpC and trpEb and between trpEb and trpEa is evident (Fig. 11A). We can be fairly sure, because of parsimony principles applied to the comparative data, that this intergenic expansion is a derived evolutionary state rather than an ancestral one.
Upstream of the B. subtilis supraoperon is the mtrA/mtrB operon, encoding GTP cyclohydrolase I and the TRAP regulatory protein, respectively (Fig. 11B). mtrB is uniquely present within the lower subgroup. B. stearothermophilus has conserved the general region shown in Fig. 11B between gpsA and the supraoperon, but tpr and its flanking region to the right have been transposed away. B. halodurans exhibits a number of unique insertions in the conserved region shown in Fig. 11B.
Listeria subgroup. In spite of the current generic naming, Bacillus anthracis is closer on the 16S rRNA tree to species of Staphylococcus and Listeria (upper group in Fig. 11A) than to the other Bacillus species of the lower group. Members of this upper group all possess a complete seven-gene trp operon, including trpAb, which is absent in the lower Bacillus grouping of Fig. 11A. The Staphylococcus/B. anthracis group lacks the tryptophan RNA-binding attenuator protein (TRAP) encoded by mtrB (29), which is present throughout the lower group. The Staphylococcus/B. anthracis group also differs from the lower group and Listeria in the absence of aroH.
The aroH gene may be in a general process of displacement by aroQ, which is by far the most ubiquitous gene encoding chorismate mutase (13). Indeed, even within the lower group, one widely used strain of B. subtilis (strain 168) has lost aroH and relies exclusively on aroQ (48). The strain 168 genome, which has been sequenced and reported to possess aroH (as shown in Fig. 11A), is actually a hybrid prototrophic transformant with B. subtilis strain 23, the donor of aroH and linked trp genes (48). In Staphylococcus species of the upper group of Fig. 11A, the presumptive ancestral hisHb/tyrA/aroF linkage group has been disrupted, and aroF is now linked to aroG/aroB, whereas tyrA is now linked (divergently) with an intact trp operon. In contrast, B. anthracis retains the hisHb/tyrA/aroF linkage, but this has been expanded by addition of a gene duplicate of aroG at the 3' end. In addition, the putative ancestral aroG/aroB has acquired a duplicate of hisHb at the 5' end.
Note that we can distinguish which paralogues of aroG and hisHb in B. anthracis have remained in flanking gene context and which have been transposed away, i.e., the bracketed aroG/aroB/hisHb operon of Fig. 11A exists within the context shown in Fig. 11B. If aroH was present in the common ancestor of the clade, as speculated at the bottom of Fig. 11A, then it was lost in the common ancestor of the upper group. Otherwise, it arrived in the lower group either as a newly evolved innovation or by LGT. The first alternative may be more likely, considering that some fairly close relatives outside of the clade shown (e.g., Clostridium botulinum and Thermoanaerobacter tengcongensis) possess aroH.
Interconnectivity of the trp, aro, pab, and his operons. Figure 11 illustrates that organisms like Listeria and Oceanobacillus possess six-gene aro and seven-gene trp operons that are located in widely spaced parts of their genomes. They also have pab operons and his operons (not shown) that altogether constitute four separately spaced and seemingly unrelated operons. This presumably represents the straightforward ancestral state of the clade. In the B. subtilis clade, however, these separate operon systems have become integrated via the following events. (i) The trp operon was inserted into the aro operon to produce the well-studied supraoperon. (ii) hisHn, a substrate-specific imidazole acetol phosphate aminotransferase, was deleted from the his operon, making the histidine pathway dependent upon HisHb, a broad-specificity imidazole acetol phosphate aminotransferase encoded by the aro portion of the supraoperon. (iii) trpAb was deleted from the trp operon, leaving the Trp pathway dependent upon the dual-function PabAb encoded from the pab operon. A metabolic basis for integration of the aro, trp, and pab operons is readily apparent in that the component genes are all part of the divergently branched pathway of aromatic biosynthesis. A metabolic relationship between the aromatic and histidine pathways is not as straightforward. However, both have a precursor relationship with pentose phosphate metabolism, both utilize a glutamine amidotransferase reaction, and both utilize PRPP as a key early substrate.
Evolutionary information derived from flanking-gene context. Figure 11B shows a conserved region between gpsA and tpr that is the location of the six-gene aro operon in Listeria and Oceanobacillus. Upstream between the highly conserved hbs and hepS are mtrA and mtrB (if present). The shaded brackets in Fig. 11A indicate the genes that are present within the flanking gene context detailed in Fig. 11B. In the major upper group, the trp operon has no consistent pattern of flanking genes. In B. subtilis and B. halodurans, the supraoperon genes are ordered within the region shown on the bottom line of Fig. 11B. In B. anthracis, an aroG/aroB/hisHb segment of the original six-gene aro operon has remained in the original context of flanking genes. Paralogues of aroG and hisHb, now associated with tyrA and aroF, have migrated to a new genomic position. In Staphylococcus the remnant of the original aro operon, aroG/aroB/aroF, has remained in the original context of flanking genes; aroH has been lost from the genome; and hisHb and tyrA have separately been moved elsewhere. In the case of tyrA, it has now been divergently positioned directly upstream of the trp operon.
Thus, both this analysis and the analysis represented by the data shown in Fig. 9 illustrate how flanking-gene context in relatively close sister lineages can help sort derived evolutionary events from ancestral ones.
Deducing the likely common ancestor of the clade. Thus, the major upper and lower groups of Fig. 11A differ in the gene organization of the trp operon (presence or absence of trpAb), in the regulation of the operon (presence or absence of mtrB), and in the particular context of association with other aromatic-pathway genes (Fig. 11B). The most conserved gene order arrangements overall, in addition to the trp operon, are aroG/aroB and hisHb/tyrA/aroF. One can be fairly certain that the common ancestor possessed the complete trpAa/Ab/B/D/C/Eb/Ea, aroG/aroB, and hisHb/tyrA/aroF gene orders. This is because the linkage of aroG/aroB persists throughout the organisms shown in Fig. 11A and because the hisHb/tyrA/aroF linkage is well conserved, even at a deeper level, in the Bacteria. Deduction of a convincing common ancestor will require the genome sequences of additional organisms that will present a more finely spaced phylogenetic progression. A case in point that illustrates the process was our recent consideration of the new genome sequence for Thermoanaerobacter tengcongensis in this connection. When the Blast similarities of proteins from T. tengcongensis were scored against the overall genomic database (8), the highest score was for B. halodurans. Had this reflected membership of T. tengcongensis in the Fig. 11A clade, as we anticipated, it might have assisted deduction of evolutionary events in the clade. However, T. tengcongensis does not have the clade-conserved aroQaroA fusion, and its position on the 16S rRNA tree also places it outside the clade.
Given the tentative deduced ancestral linkages shown at the bottom of Fig. 11A, evolution of the supraoperon of the lower group must have involved loss of trpAb and the connection of aroG/aroB/aroH at the 5' end of the operon, as well as joining of hisHb/tyrA/aroF at the 3' end of the operon. If the common ancestor possessed aroG/aroB/aroH/hisHb/tyrA/aroF as a single linkage group (as seems probable in view of the presence of this six-gene aro cluster in Listeria and Oceanobacillus), a single event of insertion of the trp operon between aroH and hisHb would account for the contemporary supraoperon. We propose that the gene organization of the common ancestor of the clade shown in Fig. 11A was very similar to that of the modern Listeria monocytogenes.
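The proposed route from the deduced ancestral linkage groups to the supraoperon can be written out as two list operations. The following Python sketch is purely illustrative: the helper name insert_operon is hypothetical, the gene orders are those stated in the text, and the relative timing of the trpAb deletion and the insertion event is not specified by the model (the resulting order is the same either way).

```python
def insert_operon(cluster, operon, after_gene):
    """Return a new gene order with `operon` inserted directly after `after_gene`."""
    position = cluster.index(after_gene) + 1
    return cluster[:position] + operon + cluster[position:]


# Deduced ancestral linkage groups (bottom of Fig. 11A, as described in the text).
ancestral_aro = ["aroG", "aroB", "aroH", "hisHb", "tyrA", "aroF"]
ancestral_trp = ["trpAa", "trpAb", "trpB", "trpD", "trpC", "trpEb", "trpEa"]

# Event 1: loss of trpAb from the seven-gene trp operon.
trp_six_gene = [gene for gene in ancestral_trp if gene != "trpAb"]

# Event 2: a single insertion of the trp operon between aroH and hisHb.
supraoperon = insert_operon(ancestral_aro, trp_six_gene, after_gene="aroH")

print("/".join(supraoperon))
```

Under these assumptions the result places aroG/aroB/aroH at the 5' end and hisHb/tyrA/aroF at the 3' end of a six-gene trp block, which is the arrangement the single-insertion hypothesis is meant to explain.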
The comparative analysis of the histidine operon is well beyond the scope of this paper, but dynamics of gene scrambling similar to those seen with the trp operon are evident, e.g., the E. coli gene order hisG/D/C/BdBpx/H/A/F/IE compared to the Sulfolobus solfataricus gene order hisC/G/A/Bd/F/D/E/H/I/Bpx. A preliminary assessment indicates that the histidine pathway gene organization exhibits some intriguing parallels to Trp pathway gene organization. Different events of gene scrambling, gene dispersal, gene fusion, intergenic expansion, and operon fragmentation exist in both Bacteria and Archaea. Similar to what seems to be the case for the Trp pathway gene organization, gene scrambling also seems to be more frequent for histidine pathway gene organization in the Archaea than in the Bacteria.
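One simple way to put a number on the scrambling in this example is to count the gene adjacencies that the two orders share. The Python sketch below is an illustration only, not an analysis performed in this review: the function names are hypothetical, and it assumes that the fused E. coli genes written as BdBpx and IE can be split into their component domains (Bd, Bpx, I, E) for comparison with the unfused Sulfolobus genes.

```python
def adjacencies(order):
    """Return the set of unordered neighbouring gene pairs in a gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def shared_adjacencies(order_a, order_b):
    """Return the neighbouring pairs that are present in both gene orders."""
    return adjacencies(order_a) & adjacencies(order_b)


# Gene orders from the text; fusions BdBpx and IE are split into domains (assumption).
ecoli = ["hisG", "hisD", "hisC", "hisBd", "hisBpx",
         "hisH", "hisA", "hisF", "hisI", "hisE"]
sulfolobus = ["hisC", "hisG", "hisA", "hisBd", "hisF",
              "hisD", "hisE", "hisH", "hisI", "hisBpx"]

shared = shared_adjacencies(ecoli, sulfolobus)
print(len(adjacencies(ecoli)), len(adjacencies(sulfolobus)), len(shared))
```

With the orders encoded this way, each genome has nine adjacencies and none of them is shared, which is consistent with the extensive scrambling described above. Orientation and intergenic distance are ignored in this measure.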
FIG. 12. Schematic of the major evolutionary events (milestone I and milestone II) following the ancient establishment of a trp operon in the domain Bacteria. The ancestral trp operon has been retained by organisms such as Listeria monocytogenes (Lmo), Clostridium acetobutylicum (Cac), Streptococcus pneumoniae (Spn), and Desulfovibrio vulgaris (Dvu). The emergence of selected contemporary organisms is shown. The three stages highlighted with an orange oval, a magenta oval, and a green oval correspond to the color coding used in Fig. 2 to designate the particular contemporary organisms that have retained the exact gene organization illustrated within one of the three ovals.
The well-studied trp operons of Pseudomonas aeruginosa and Bacillus subtilis are illustrated in Fig. 12 as examples of operons from organisms that are not representative of the deeper phylogenetic node. Since the time of the landmark splitting of the ancestral operon, a history of additional fragmentations in P. aeruginosa has left trpAa isolated from trpAb/B/D and trpC isolated from trpEb/Ea. Likewise, the well-studied trp operon of Bacillus subtilis is not representative of the broader Listeria/Bacillus/Staphylococcus node. In relatively recent events, trpAb has been discarded, and the remaining trp operon appears to have been inserted into an aro (aroG/aroB/aroH/hisHb/tyrA/aroF) operon (see Fig. 11 and the attending discussion in this text). Since the dual use of pabAb in the lower group for both anthranilate and PABA synthesis is isolated to this lineage, the seven-gene trp operon of Bacillus anthracis and Staphylococcus species is more representative of the node of Fig. 11 organisms than is the six-gene B. subtilis operon.
The clade of actinomycete Bacteria shown in Fig. 3 offers a particularly apt example of how genes representing the ancestral state of Trp biosynthesis can be sorted out from genes originating by LGT or ancient paralogy. The Thermomonospora, Streptomyces, Corynebacterium, and Mycobacterium genera each exhibit substantial differences from one another. Mycobacterium lacks paralogue copies of trp genes, and therefore its trpAa/D/Eb/Ea operon plus dispersed copies of trpAb, trpB and the missing trpC that are present can reasonably be assumed by default to specify the primary pathway of Trp biosynthesis. The situation is the same in Thermomonospora except that trpAa and trpAb are fused. Streptomyces possesses several trp operons, but the primary trpAa/D/Eb/Ea biosynthetic operon can be identified by phylogenetic analysis. Thus, proteins encoded by each of the trpAa/D/Eb/Ea operons as well as the free-standing copies of trpAb and trpB in the organisms shown in Fig. 3 all cluster together on phylogenetic trees to the exclusion of other paralogues present in the Streptomyces genome.
The trpAa/Ab/B/D/aroAII operon is known to have a specialized role in antibiotic production that is unique to Streptomyces. The free-standing TrpD of S. coelicolor specifically clusters in the phylogenetic tree with TrpD proteins encoded by the trpAa/D/Eb/Ea operons in the rest of the clade. It is a trpD remnant, since all other genes have been otherwise replaced by a whole-pathway operon via LGT. Thus, with all of this information, we can reasonably predict that the common ancestor at the node position for actinomycete bacteria (as depicted in Fig. 6) possessed a trpAa/D/Eb/Ea operon, with the remaining trp genes dispersed.
Events of gene insertion and gene shuffling are not necessarily events of gene disruption. The reshuffled deck of trp genes in an operon such as that of Desulfitobacterium hafniense (Fig. 6A) seems curious, indeed, but there is no reason to believe that this compact operon is any less efficient for the shuffling. Perhaps the shuffling reflects nature's continuing experimentation to test for different orders of translationally coupled genes that produce different protein-protein interactions. When previously compact operons are altered by expansion of intergenic spacing, perhaps this is a necessary evolutionary step for successful gene fusion. Sufficient intergenic space would seem to be necessary for evolution of a linker region that does not intrude on the catalytic domains being fused.
Are there any clear examples of efficient operon systems that have been disrupted? We do not know the extent to which the high efficiency of regulation that is fully documented in only a few organisms such as E. coli and B. subtilis is typical of other trp operons. The regulatory features of E. coli and B. subtilis are distributed within rather narrow clades, and it may be that these exemplify relatively recent advanced operon systems that will in fact strongly resist future disruptive events in all of the free-living descendants. It would be most informative to know the details of regulation in a well-spaced phylogenetic progression of other modern whole-pathway operons (such as the operons carried by the orange-highlighted organisms in Fig. 2). For example, a two-component response regulator gene is positioned only 17 bp upstream of trpAa in Thermotoga maritima. Might this reflect the presence of a completely different mode of control?
It is possible that many trp operons in nature are relatively primitive and only have the advantages conferred by a common promoter and (in the case of overlapping genes) either translational coupling or protection from mRNA degradation. It seems quite probable that many free-living organisms have no use for the huge range of trp gene expression that is typical of a feast-and-famine organism such as E. coli. For example, cyanobacteria probably make most or all Trp endogenously and thus may require regulation over a minimal range. One could envision that simple feedback inhibition of anthranilate synthase might constitute the main regulation in operation. This is consistent with the results of two studies of cyanobacteria in which exogenous Trp transport was two orders of magnitude less than in B. subtilis (32), anthranilate synthase is 100% inhibited at 10 µM Trp (36), and the range of enzyme expression varies only two- to threefold except for a 20-fold range in the case of tryptophan synthase (36).
There are distinct examples where operon disruption has followed acquisition of a finely tuned trp operon system, e.g., dissociation of trpEb/trpEa in Pasteurella multocida in the enteric lineage (see Fig. 8). However, these are special cases in organisms that have become pathogens or intracellular symbionts. There is ample evidence that evolved interorganismal relationships can produce completely new selective conditions that no longer require an efficient operon. In the extreme case, many pathogenic organisms undergo reductive evolution and abandon the pathway altogether because the host provides Trp. Since eukaryote hosts (such as humans) are relatively recent, such processes are likely to be in an ongoing state. In these cases, events of gene insertion and gene dissociation may not be selectively disadvantageous. Indeed, they may be steps in the selected process of genome reduction. In this connection it might be instructive to consider the recent disruptive events that have occurred in the pathogenic Corynebacterium diphtheriae but not in the free-living sister species Corynebacterium glutamicum since the LGT-mediated acquisition of the trp operon in their common ancestor (see Fig. 3 and attending discussion).
In a completely different context of interorganismal relationship, Buchnera aphidicola is an endosymbiont that produces Trp for the host. In this case, one can pinpoint a fairly recent time of selection against efficient regulation of Trp biosynthesis. Here the endosymbiont cells have been challenged to overproduce Trp for export to the host. This is primarily accomplished by translocation of trpAa/trpAb to a plasmid, with the result of giving a 16-fold amplification of the rate-limiting first step of Trp biosynthesis (50). It is very important to keep in mind that genomic sequencing has been heavily biased in favor of organisms that directly impact humans, and genomic representation of free-living organisms is still relatively weak.
The answer to the question raised in the heading is then yes and no. Pathogens (especially obligate pathogens) are in the process of abandoning the trp operon altogether. Endosymbionts, such as Buchnera, may abandon the regulation altogether in order to engineer themselves to saturate the needs of the host. However, there is thus far no evidence that a free-living organism equipped with a highly evolved and efficiently regulated trp operon experiences instability with respect to that operon.
Elaborate regulation seems to be fairly recent. Primitive trp operons may have been regulated by relatively simple schemes. Consistent with this is that all elaborate control systems for trp operons are restricted to narrow clades. The advanced trp operon of E. coli differs from that of the putative common ancestor of Bacteria in having two pairs of structural gene fusions (Fig. 4), the trpR repressor, and a leader peptide (trpL) for attenuation. The distribution of trpR is limited to the enteric lineage except for Coxiella burnetii, Xylella fastidiosa, and some species of Chlamydia (89). Regulation by attenuation mechanisms seems to be distinctly more widespread than repression control by trpR (7, 53, 75). However, particular attenuation mechanisms can be distinctly different. Thus, the mechanism in E. coli that relies on the trpL leader peptide (95) is quite distinct from the Bacillus subtilis mechanism that utilizes a Trp-activated RNA-binding protein (TRAP) (29) as well as an anti-TRAP protein whose synthesis is induced by uncharged tRNATrp (80).
Does the enteric clade (see Fig. 8), with its multiple mechanisms of control, perhaps possess a relatively superior trp operon that would resist future events of operon disruption? It may very well be that the enteric lineage (as represented in Fig. 8) currently has a very highly conserved operon system in its free-living members. Exceptions in pathogenic organisms that are undergoing reductive evolution are easily understood (e.g., Haemophilus species), as are exceptions in intracellular symbionts such as Buchnera.
The L. lactis tRNA-directed transcription termination mechanism might prove to be the most broadly distributed mechanism, since various gram-positive organisms utilize this mechanism for a number of different amino acid biosynthetic pathways (34). The loss of trpAb from the trp operon of the B. subtilis clade and reliance upon the broad-specificity homologue in the folate pathway for dual function in anthranilate and 4-aminobenzoate synthesis may have favored an even more advanced regulatory system that integrates folate and Trp biosynthesis. In accord with this, TRAP also regulates the transcript levels in the B. subtilis folate operon (20).
Regulation of Trp biosynthesis in organisms lacking the whole-pathway operon may be relatively undeveloped aside from the widespread sensitivity of anthranilate synthase to feedback inhibition by Trp. Several partial-pathway operons are known to possess only a degree of regulation. Thus, in Rhizobium meliloti, the trpAatrpAb operon is regulated by transcription attenuation but not the trpBD operon or the trpCEbEa operon (7). However, such a generalization may not be justified in consideration of P. aeruginosa and its close relatives P. putida, P. fluorescens, P. syringae, and Azotobacter vinelandii, in which transcription of the trpEbEa operon is activated by trpI (6, 14) and the free-standing trpAa and the trpAbBD operon are regulated by attenuation (67).
Given the variety of trp operon regulatory mechanisms that are known to have evolved and others undoubtedly yet to be discovered (30), one might think that selection for the most efficient operons would have proceeded rapidly via LGT. This may be an oversimplification in that different levels of efficiency may be selected for different lifestyles. Feast-and-famine organisms such as E. coli may be most suited to relatively large ranges of control modulation. In any event, only the LGT relationship of whole-operon transfer between Helicobacter pylori, coryneform bacteria, and enteric bacteria is evident at present. An obvious roadblock to LGT of at least some complexly regulated operons is the presence of regulatory genes at unlinked loci with respect to the operonic structural genes. It may very well be (see following section) that what is efficient in the metabolic context of one lineage is not so efficient in the metabolic context of another lineage (40).
Regulation extending beyond the Trp pathway.
From the vantage point of operon stability, we think that it is very important to consider how deeply some modern trp operon systems have become integrated into a broader metabolic network. The first example is trpR in E. coli. Not only the trp operon but also four additional transcription units belong to the trpR regulon (68). Other members of the regulon include the trpR gene itself (which is therefore autoregulated), mtr (encodes a Trp-specific transporter), aroL (encodes shikimate kinase II), and aroH (a paralogue of the DAHP synthase AroAI homology group that is also feedback inhibited by Trp). aroL is also a member of the tyrR regulon. Thus, fine-tuned regulation by trpR is not only focused upon the specific Trp branch, but also influences the broader aromatic pathway, which generates precursor molecules. There is a certain integrant relationship in which the presence of trpR correlates with multiple, differentially regulated isoenzymes of DAHP synthase. It may be relevant here that there is a correlation between disruption of the whole-pathway trp operon in Haemophilus influenzae and the loss of genes encoding two of the three differentially regulated isoenzymes of DAHP synthase that are typically present in enterics.
The second example is that of mtrB in B. subtilis, which encodes TRAP. Here again, TRAP exerts regulatory influences across metabolic pathways, in this case between the Trp and folate pathways. TRAP not only regulates the trp operon by both transcription attenuation and a translational control mechanism, it also regulates the translation of pabAb (required for both Trp and folate biosynthesis), yhg (a putative Trp transporter), and ycbk (encoding a protein of unknown function). Thus, Trp and folate biosynthesis are coordinated via the regulatory abilities of TRAP. An organism such as Oceanobacillus possesses mtrB, a seven-gene trp operon that contains trpAb, and a folate operon that contains pabAb. Thus, it seems likely that once the dual regulatory role of mtrB in both pathways was established, integration was further elevated in the B. subtilis/B. halodurans/B. stearothermophilus clade by loss of trpAb and reliance upon pabAb to form alternative complexes with either trpAa or pabAa.
Aspects of regulation that may merit increased attention are the factors that influence the rate of mRNA decay. It is generally accepted that the differential stability of mRNA plays an important role in determining the steady-state levels of gene expression. Individual mRNA decay rates can vary more than 100-fold. In contrast to the level of knowledge about initiation of trp gene transcription, little is known about the specificity, precision, and regulatory role of mRNA decay. New capabilities for the systematic measurement of mRNA decay rates (83) should enhance our understanding of this important aspect of regulation.
One can envision that such mechanisms might have preceded the commitment of genetic material to the elaboration of regulatory proteins. Consider the relative contribution of attenuation (relatively weak) and trpR-mediated repression (relatively strong) in E. coli. Repression is designed to detect Trp, whereas attenuation is designed to detect uncharged tRNATrp. Under many growth conditions, the free Trp concentration in the cell may be fairly low but still sufficient to keep tRNATrp largely charged. Thus, trpR-mediated repression is responsible for a large range of expression, and only after maximal derepression does relief from attenuation ensue. Consider also that the repressor binds not only to the trp operator but also to operators relevant to DAHP synthase and trpR itself (autoregulation). The modern whole-pathway operon systems that do possess efficient control features should be highly stable, barring any evolutionary transitions to pathogenic or symbiotic relationships. This would not preclude presumably desirable changes such as gene fusions. Simple, unregulated operons (both ancient and modern) or weakly regulated operons can be expected to be relatively unstable compared to complex, regulated operon systems that can sense a variety of different cues with a good range of sensitivity. To the extent that these deploy unlinked regulatory elements, intergenomic transfer should be relatively unlikely due to the necessity for cotransfer of unlinked genes in order to obtain the complete operon system.
Thus, we are beginning to get a fairly good picture of the evolutionary progressions that have taken place with respect to the organization of trp genes as whole-pathway operons, partial-pathway operons, and dispersed genes. However, the driving forces that power these evolutionary dynamics are not so clear. This limitation can probably be attributed to the relatively small amount of information about Trp pathway regulation that is available in the broad comparative context. To completely describe trp operon systems, one needs to evaluate any linked or unlinked regulatory elements that may exist. Two widely spaced organisms may have identical whole-pathway trp operons but may have evolved completely different control systems, or one of the two may be quite complex and the other simple. It seems significant that the current systems of trp operon regulation that can be described as elaborate are present in narrow bacterial clades and therefore must be of relatively recent origin. The use of comparative bioinformatics to elucidate the range of regulatory mechanisms in place for trp operons in modern organisms is an initiative that is only beginning (60) and should be most informative.
Complexly regulated Trp systems are likely to involve the integration of Trp biosynthesis with other pathways, as has been elucidated between Trp and folate (mediated by TRAP) in B. subtilis or between Trp and the greater aromatic pathway (mediated by TrpR) in E. coli. One could envision a yet-to-be-discovered metabolic relationship between Trp and serine or between Trp and histidine.
A second aspect of complexity involves the variety of multiple pathways that can exist within a single organism in which Trp or Trp intermediates can have different fates. For example, Streptomyces coelicolor has four TrpAa/TrpAb homologues that compete to direct chorismate to the specific alternative fates of phenazine biosynthesis, antibiotic biosynthesis, siderophore (coelibactin) biosynthesis, and primary Trp substrate for protein synthesis. All of these competing systems would be expected to respond to entirely different regulatory cues. In some cases, a given trp gene product may be shared by more than one pathway. Larger genomes can be expected to more frequently exhibit this kind of paralogy/xenology complexity, and indeed we have seen examples for the Trp pathway in large-genome organisms such as Nostoc sp., Pseudomonas aeruginosa, and Streptomyces coelicolor.
In this article, a strong foundation has been developed that should help guide the selection of key organisms for studies designed to gain insight into how Trp pathway regulation is related to the driving forces of evolution.
Deduced amino acid sequences were analyzed for N-terminal signal sequences and transmembrane domains with Psort (http://psort.ims.u-tokyo.ac.jp/) (64).
Hidden Markov model and Prosite pattern search. Multiple alignments were obtained with the ClustalW program (78) included in the BioEdit (version 5.0.9) multiple alignment tool (33). A hidden Markov model based upon a multiple sequence alignment of known TrpC sequences was generated with version 2.2g of the HMMER program (22). A Prosite-like regular expression pattern was generated manually, and both the hidden Markov model and the Prosite pattern were then used to search the genomes that are missing trpC.
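The pattern-search step can be illustrated with a minimal Python sketch. It is not the procedure or the pattern actually used in this work: the function names, the example pattern, and the toy sequences are all hypothetical, and the HMMER search is not reproduced. The sketch only shows how a Prosite-style pattern might be translated into a regular expression and scanned against a set of protein sequences.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regular expression.

    Handles the common syntax: '-' separators, 'x' for any residue,
    [ABC] allowed sets, {ABC} forbidden sets, (n) or (n,m) repeats,
    and '<' / '>' anchors for the N and C termini.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError("empty pattern element in %r" % pattern)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # forbidden residues
        parts.append(prefix + core + ("{" + repeat + "}" if repeat else "") + suffix)
    return "".join(parts)


def scan(pattern, sequences):
    """Return (name, 1-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = []
    for name, seq in sequences.items():
        for m in regex.finditer(seq.upper()):
            hits.append((name, m.start() + 1, m.group()))
    return hits


# Hypothetical pattern and toy sequences, for illustration only.
example_pattern = "[LIVM]-x(2)-G-{P}-D."
toy_sequences = {"orf1": "MKTLAAGSDQ", "orf2": "MKTLAAGPDQ"}
print(scan(example_pattern, toy_sequences))
```

In this toy case only orf1 is reported, because orf2 carries the forbidden proline at the fifth pattern position. A scan of this kind reports exact pattern matches only, whereas a profile hidden Markov model scores candidates probabilistically, which is one reason to run both kinds of search.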
We thank A. Osterman of Integrated Genomics, Inc. (Chicago, Ill.) for provision of access to ERGO. Most importantly, we thank Charles Yanofsky for his extraordinary generosity in providing an almost continuous critique of the early version of this work and for a number of suggestions, among them the idea that an ancient attenuation mechanism might have involved a simple and direct association of small molecules with nascent RNA.
Florida Agricultural Experiment Station Journal series no. R-09160.
/ß-barrel scaffold. Nature 403:617-622.
/ß-barrel scaffold: retraction. Nature 417:468.
)(8) barrels: implications for the evolution of metabolic pathways. J. Mol. Biol. 303:627-641.
-barrel enzymes that catalyze three sequential reactions in the pathway of tryptophan biosynthesis. Biochemistry 30:9161-9169.