Microbiology and Molecular Biology Reviews, December 2007, p. 576-599, Vol. 71, No. 4
1092-2172/07/$08.00+0 doi:10.1128/MMBR.00015-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Department of Microbiology and Plant Molecular Biology/Biotechnology Program, The Ohio State University, 484 West 12th Avenue, Columbus, Ohio 43210-1292,1 Graduate College of Marine and Earth Studies, Delaware Biotechnology Institute, University of Delaware, 127 DBI, 15 Innovation Way, Newark, Delaware 19711,2 Howard Hughes Medical Institute, UCLA-DOE Institute for Genomics and Proteomics, Department of Chemistry and Biochemistry, University of California, Los Angeles, Box 951570, Los Angeles, California 90095-1570,3 Department of Plant Cellular and Molecular Biology, The Ohio State University, 582 Aronoff Laboratory, 318 W. 12th Avenue, Columbus, Ohio 43210-12924
SUMMARY INTRODUCTION DIFFERENT MOLECULAR FORMS FOR THE SAME (AND DIFFERENT) FUNCTIONS The RubisCO-Like Protein (Form IV), a Homolog of RubisCO The RubisCO Superfamily at Present Sequence Conservation in the RubisCO Superfamily Evidence for Distinct Functions among RLP Lineages (i) Active-site substitution patterns and implications from functional studies. (ii) Local gene conservation as an indicator of different functions. Genomic Context-Based Analyses of Diverse RLPs Suggests Functional Diversity PROBING THE EVOLUTIONARY ORIGINS OF RubisCO: EVIDENCE FOR ARCHAEAL CENTRAL METABOLISM AS THE ULTIMATE SOURCE OF ALL EXTANT RubisCO AND RLP SEQUENCES Non-RubisCO/RLP Structural Homologs Physiological Role for Archaeal (Form III) RubisCO RubisCO AND RLP STRUCTURES: SIMILAR YET DIFFERENT ENOUGH Potential of Structural Comparisons To Enhance Functional Studies Comparison of Secondary Structural Elements Unique to RLP and RubisCO: Possible Implications for RLP Structure-Function Relationships CONCLUSIONS AND OUTLOOK ACKNOWLEDGMENTS REFERENCES
5 s⁻¹) that is among the lowest for any biological catalyst (13, 68).
55,000 Mr) and small (15,000 Mr) polypeptides in an (L2)4(S4)2 structure (4, 35). This type of enzyme, now called form I, is the predominant RubisCO form found in nature, and it is present in terrestrial and marine plants, eukaryotic algae, cyanobacteria, and most phototrophic and chemolithoautotrophic proteobacteria (68). The name form I was originally used to distinguish this type of RubisCO from another structurally simpler form of the enzyme that was shown to be a dimer of only large subunits, which was discovered originally in the nonsulfur phototrophic bacterium Rhodospirillum rubrum (69, 70). Interestingly, another nonsulfur purple phototrophic bacterium, Rhodobacter sphaeroides, also appeared to contain this second structural form of RubisCO (albeit in higher aggregates of large subunits), which was originally isolated as a second peak of activity after ion-exchange fractionation of extracts from induced R. sphaeroides. Form I RubisCO was isolated from the same crude extracts, i.e., in the first activity peak that eluted from the column (29). Thus, the enzyme from the second activity peak (peak II), which contained the novel structural form analogous to R. rubrum RubisCO, was eventually called the form II enzyme to distinguish it from the first peak of activity or the form I enzyme. Form II RubisCO proteins were shown to catalyze the same reaction as form I RubisCO, and both enzymes catalyze an oxygen fixation reaction whereby the enediol of RuBP is attacked by molecular oxygen. The form II enzyme, composed only of multimers of large-type subunits [(L2)x], shows only about 30% amino acid sequence identity to form I large subunits. In addition, form II enzymes all appear to be less efficient in partitioning the two gaseous substrates of RubisCO, CO2 and O2. Most importantly, the form II enzyme takes on a distinct physiological role, as it is used primarily to enable the CBB pathway to balance the redox potential of the cell under select growth conditions (19, 68, 74).
To this day, the relative differences and similarities in primary structure serve as a convenient means to classify all the different forms of RubisCO found in nature. By the mid-1990s, it was recognized that the form I enzyme could be further classified, according to amino acid sequence homologies, as either "green" (cyanobacterial, algal, and plant) or "red" (phototrophic bacterial and nongreen eukaryotic algal) (16, 19, 67, 68, 74). As more RubisCO gene sequences became available, the green enzymes were further subdivided into forms IA and IB, and the red enzymes were subdivided into forms IC and ID (67, 68) (Fig. 1). Form II bacterial enzymes, and even eukaryotic homologs found in symbiotic dinoflagellates, all appear to be fairly closely related, and there is no clear subdivision. This convenient division into different phylogenetic and catalytically distinct structural forms (forms I and II) lasted for about 20 years. The more recent explosion of complete genomic sequencing projects has led to putative RubisCO sequences showing up in some unusual places, including organisms that use alternatives to the CBB pathway to fix CO2 and even microorganisms that do not use CO2 as a major carbon source. For example, it was shown that various archaea, including those that use other means for primary CO2 assimilation or those that may even grow on organic compounds, contain genes that encode a bona fide functional RubisCO (25, 73; F. R. Tabita, G. M. Watson, and J. P. Yu, presented at the 98th Meeting of the American Society for Microbiology, 1998). Moreover, phylogenetic analyses clearly placed the archaeal RubisCO sequences in a separate category, which was termed form III (68, 73).
Thus, by the late 1990s, it was apparent that nature still had some surprises for RubisCO biochemists and evolutionists (RubisCOlogists), and the rather comfortable and long-standing classification of RubisCO into only forms I and II was clearly incomplete and, in fact, incorrect. For those interested in structure-function relationships, the advent of the form III enzymes, obtained from organisms that never see molecular oxygen, offers tantalizing possibilities to learn more about how the active site of RubisCO might have evolved. This is especially relevant since it was found that several archaeal enzymes are highly sensitive to molecular oxygen and have extremely poor capabilities to discriminate between CO2 and O2 (27, 37, 73) due in part to an extremely high affinity of these enzymes for O2 (37).
FIG. 1. Unrooted NJ tree of RubisCO/RLP lineages. To construct this tree, a total of 193 sequences were aligned with MEGA 3.1 (38) and evaluated by ProtTest (1), and the tree was then constructed using the equal-input model with a gamma rate distribution of 1.554. The total numbers of sequences considered in each lineage were 35 for I-A, 16 for I-B, 9 for I-C, 22 for I-D, 20 for II, 10 for III-1, 4 for III-2, 20 for IV-NonPhoto, 2 for IV-EnvOnly, 14 for IV-Photo, 16 for IV-DeepYkrW, 12 for IV-YkrW, and 5 for IV-GOS. The width of the arrows is directly proportional to the number of sequences considered for each clade. For a complete list of sequences and sources, see Table S1 in the supplemental material. The scale bar represents a difference of 0.5 substitutions per site. Bootstrap values for nodes are shown in Fig. 2A. Single-sequence abbreviations and sequence identifiers are as follows: IV-Arc.ful-DSM 4304, Archaeoglobus fulgidus strain DSM4304 (GenBank accession number NP_070416); Met.bur-DSM6242, Methanococcoides burtonii strain DSM6242 (accession number ZP_00563653); Met.hun-JF-1, Methanospirillum hungatei strain JF-1 (accession number YP_503739); Met.the-PT, Methanosaeta thermophila strain PT (accession number ZP_01153096).
Phylogenetic analyses of RubisCO and RLP sequences indicate that there are at least three distinct lineages of bona fide RubisCO and six distinct clades of RLP molecules (Fig. 1). The well-studied form I and form II groups are each monophyletic and, despite their clear separation, are somewhat related to each other. Form III sequences are recognizably distinct from forms I and II by any phylogenetic reconstruction method employed (31) (Fig. 2), which initially suggested a relationship to RLP (68). However, all form III proteins analyzed thus far can catalyze RubisCO activity (27, 73) in vitro, while no RLP has ever been documented to catalyze RuBP-dependent CO2 fixation, undoubtedly due to the absence of critical conserved active-site residues (13, 68) in the latter (Fig. 3). Thus, the only currently known bona fide RubisCO sequences are those found within forms I, II, and III. Outlying sequences observed in recently sequenced methanogen genomes will be discussed below.
FIG. 2. Comparison of RubisCO/RLP tree topologies reconstructed with NJ (A), ME (B), UPGMA (C), and MP (D). All except MP assumed a distribution of 1.554 of evolutionary rates across four categories as calculated by ProtTest (1). Values at nodes represent bootstrap support observed in 1,000 trials per method. IV-Arc.ful-DSM 4304, Archaeoglobus fulgidus strain DSM4304 (GenBank accession number NP_070416); Met.bur-DSM6242, Methanococcoides burtonii strain DSM6242 (accession number ZP_00563653); Met.hun-JF-1, Methanospirillum hungatei strain JF-1 (accession number YP_503739); Met.the-PT, Methanosaeta thermophila strain PT (accession number ZP_01153096).
FIG. 3. Conservation of RubisCO active-site residues in RubisCO/RLP family members as noted previously by Cleland et al. (13) and Tabita (68). All form III RubisCO and RLP (form IV) sequences used in the reconstruction of phylogenetic relationships are included. Residues are noted in single-letter IUPAC code. Positions shaded green indicate conservation, while yellow indicates a semiconservative substitution and red indicates a nonconservative substitution. C, catalytic residue; R, RuBP binding residue.
TABLE 1. RubisCO and RLP lineage properties and phylogenetic distribution
FIG. 8. RubisCO and B. subtilis RLP catalyze similar enolase-type reactions and employ structurally analogous substrates (see reference 33). In each instance, a carbamylated lysine catalyzes proton abstraction from the substrate to initialize enolization. DK-MTP 1-P, 2,3-diketo-5-methylthiopentyl-1-phosphate; HK-MTP 1-P, 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate. (Adapted with permission from reference 33. Copyright 2007 American Chemical Society.)
TABLE 2. Residues conserved at various percentages across all RubisCO/RLP sequences analyzed
Given this variation, it seems unlikely that members of one lineage would functionally substitute for a member from another RLP family, although evidence exists otherwise (discussed below). Currently, detailed functional studies have been carried out for only four RLPs: C. tepidum RLP (30, 31), the YkrW/MtnW proteins of Bacillus subtilis and Geobacillus kaustophilus (8, 33, 45, 63), and the YkrW-like RLP from the cyanobacterium Microcystis aeruginosa (11). Thus far, the three-dimensional structures have been solved only for C. tepidum RLP (37), R. palustris RLP2 (this paper), and G. kaustophilus RLP (33, 39). The RLP from C. tepidum and the RLP2 from R. palustris are structurally very similar at the active site but possess four different active-site residues compared to the B. subtilis and G. kaustophilus proteins. Specific catalytic residues appear to be differentially conserved between the two lineages. The major difference is the Glu versus the Lys at Asn-123 (spinach RubisCO numbering), suggesting possible differences in hydrogen-bonding patterns with their respective substrates. In addition, Asn versus Val/Met identities at the Lys-177 position in C. tepidum versus B. subtilis groups of RLPs, respectively, may indicate different needs or participants for proton abstraction at the presumptive active site (see below), whereas Phe versus Pro identities at Arg-295, the residue that interacts with P2 phosphate in spinach RubisCO, likely indicate that each type of RLP reacts with distinct substrates with different hydrophobicities at the P2 site.
The B. subtilis YkrW/MtnW protein and, more recently, its M. aeruginosa and G. kaustophilus RLP homologs, have all been shown to function as a 2,3-diketo-5-methylthiopentyl-1-phosphate enolase in the methionine salvage pathway. Thus, a B. subtilis mutant lacking YkrW/MtnW has a relatively constrained phenotype that is manifested only under severe sulfur starvation conditions (45, 63). Based on structural comparisons discussed elsewhere in this review, it appears that 2,3-diketo-5-methylthiopentyl-1-phosphate is not compatible with the active-site pocket in C. tepidum RLP or R. palustris RLP2 (39). Thus, it was not surprising that inactivating these genes resulted in strains with distinct phenotypic properties in different organisms. For example, an insertionally inactivated RLP mutant of C. tepidum (strain ::RLP) had a highly pleiotropic phenotype, with defects observed in pigmentation, the ability to metabolize some sulfur compounds, and the aberrant expression of stress response proteins (31). More specifically, strain ::RLP is unable to oxidize thiosulfate efficiently, although the ability to oxidize sulfide remains unperturbed (30). Strain ::RLP is also deficient in oxidizing elemental sulfur, as it was found to produce significantly more extracellular elemental sulfur than the wild type (31).
A null mutation in the gene encoding RLP in C. tepidum also results in the overproduction of two oxidative stress response-related proteins, i.e., a thiol-specific antioxidant (Tsa) protein and superoxide dismutase. The levels of these two proteins are 12- and 3-fold enhanced, respectively, in the ::RLP strain compared with the wild type. The accumulation of these proteins correlates with the transcript levels of the corresponding genes (30). The ::RLP strain is also significantly more resistant to hydrogen peroxide exposure during growth than is the wild type (30). Further analyses indicate that the C. tepidum genome also encodes two potentially relevant transcriptional regulators, i.e., the ferric ion uptake regulator (Fur) and the peroxide regulator (PerR). Since these regulators are reported to be involved in the regulation of oxidative stress response genes in various bacteria including Escherichia coli, Bacillus subtilis, and Staphylococcus aureus, the possibility that RLP might be involved with the function of these regulators was considered. However, insertional inactivation of both the fur and perR genes of C. tepidum did not affect the accumulation of the Tsa and superoxide dismutase proteins in the ::RLP mutant strain (Singh and Tabita, unpublished).
How RLP specifically contributes to sulfur oxidation and oxidative stress in chlorobia is still unknown. These areas have received relatively little experimental attention in chlorobia to date, although this is beginning to change with the exploitation of available genomic data (12). Genes encoding RLPs have been found in all Chlorobium genomes sequenced to date (see http://img.jgi.doe.gov/cgi-bin/pub/main.cgi for details) even though these strains vary considerably in the spectra of reduced sulfur compounds used to support growth. Regarding oxidative stress, Chlorobium sp. strain GSB1, recently isolated from a hydrothermal vent sample (9), was found to maintain viability during prolonged exposure to molecular oxygen only in the absence of light and sulfide. Clearly, experiments utilizing oxidative stress elicitors other than molecular oxygen (e.g., organic hydroperoxides, methyl viologen, or diamide) in addition to experiments examining the interplay of light, sulfur compounds, and oxygen are required. Such studies of stress physiology and sulfur oxidation will likely contribute to delineating the function of RLP in C. tepidum.
Phylogenetic analyses (Fig. 1) indicated that some organisms (i.e., Rhodopseudomonas palustris, Rhodospirillum rubrum, and Microcystis aeruginosa) contain both bona fide RubisCO and RLP. Indeed, R. palustris, a purple nonsulfur bacterium, has two bona fide RubisCOs (form I and form II) and two RLPs (RLP1 and RLP2). Both the sequence alignment and structural analysis (discussed above) show that one of the RLPs (RLP2) is closely related to the C. tepidum RLP (30). Although the overall recently solved structure of R. palustris RLP2 is similar to that of C. tepidum RLP, there are subtle differences (discussed below). Disruption of either of the two RLPs present in R. palustris or the single RLP in R. rubrum does not appear to affect the expression of any oxidative stress response proteins (J. Singh, T. E. Hanson, S. Romagnoli, and F. R. Tabita, unpublished data). Because the disruption of the RLPs in these organisms failed to evoke a detectable phenotype, either these proteins do not function in the stress response or, perhaps, the amount of RubisCO present is enough to complement the function of the missing RLP (much like how RubisCO complements the defect in methionine metabolism in B. subtilis) (8). Interestingly, both R. rubrum and R. palustris are capable of using 5-methylthioadenosine (MTA) as the sole sulfur source for growth (Fig. 4), much like B. subtilis. Further studies indicate that RLP is definitely involved in MTA-dependent growth in these organisms (Singh and Tabita, unpublished). In addition, bioinformatic analyses indicate that R. rubrum and R. palustris contain the requisite genes of the methionine salvage pathway, while other nonsulfur bacteria such as R. sphaeroides and R. capsulatus do not, nor are the latter two organisms capable of MTA-dependent growth (Fig. 4).
Why physiologically related organisms that are typically found in similar environments appear to both utilize (and contain RLP) and not utilize the methionine salvage pathway is not clear at this time.
FIG. 4. Growth of four purple nonsulfur bacteria on MTA as the sole sulfur source. Rr, Rhodospirillum rubrum; Rp, Rhodopseudomonas palustris; Rc, Rhodobacter capsulatus; Rs, Rhodobacter sphaeroides. Growth on MTA correlates with the presence of RLP, further shown by inactivating the RLP gene (Singh and Tabita, unpublished).
FIG. 5. Local conservation near genes encoding form III RubisCO (A) or the RLP lineages IV-Photo (B), IV-NonPhoto (C), and IV-YkrW (D). Gene neighborhoods were visualized using tools at the Integrated Microbial Genomes website (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi). RubisCO/RLP genes are indicated in red. Other open reading frames are colored and identified according to their annotation in the Integrated Microbial Genomes database. Methyl-coM, methyl coenzyme M; Bchl, bacteriochlorophyll; Me-T'ferase, methyltransferase; 5-Me-C RE, 5-methyl-cytosine removing enzyme; EF-Ts, elongation factor Ts; SDR, short chain dehydrogenase/reductase.
Aside from the form I and II bona fide RubisCOs and the IV-Photo and IV-YkrW lineages discussed above, only three other examples of local gene conservation were found. Form III RubisCOs in Methanosarcina spp. share conserved gene organizations downstream, including a methyl coenzyme M (CoM) reductase operon; the polC gene, encoding a DNA polymerase; and others (Fig. 5A). In Pyrococcus spp., genes encoding form III RubisCO are preceded by four genes encoding conserved hypothetical proteins as well as potential operons encoding Na+/H+-translocating ATPase and potential DNA repair functions (Fig. 5A). Finally, in the IV-NonPhoto lineage, there seems to be a conserved gene encoding a surface or secreted protein predicted to be dependent on a type III secretion system for export (Fig. 5C). The functional significance of these other instances of local gene conservation is currently unknown.
We calculated the functional linkages of 11 RLP sequences out of 44 known sequences using a confidence threshold of 0.5. Based on the functional linkages, the 11 RLP sequences can be divided into two major groups (Fig. 6). The first group consisted of the RLPs from C. tepidum, R. palustris (RLP1 and RLP2), Archaeoglobus fulgidus, Mesorhizobium loti, and Sinorhizobium meliloti. These RLPs are all functionally linked to hypothetical proteins, which reside near the RLPs on the chromosome. The second group consisted mainly of RLPs from Bacillus spp., including B. subtilis, B. cereus, B. anthracis strain A2012, and B. anthracis strain Ames. These RLP genes all overlap with haloacid dehalogenase-like hydrolases with an intergenic distance of –3. In addition, they all have functional linkages to aminotransferases, which reside near the RLPs on the chromosome. Based on biochemical studies, the RLP from B. subtilis, YkrW/MtnW, and its two functionally linked proteins, hydrolase (YkrX/MtnX) and aminotransferase (YkrV/MtnV), have been suggested to function together in the methionine salvage pathway (8) (Fig. 7). As discussed above, YkrW/MtnW functions as an enolase with 2,3-diketo-5-methylthiopentyl-1-phosphate as its substrate. By analogy, the RLPs in the second group probably all function as enolases in the methionine salvage pathway. Certainly, the close structural similarity of RuBP and the 2,3-diketo compound of the methionine salvage pathway (Fig. 8) and the reported ability of the R. rubrum RubisCO gene to complement a ykrW knockout in B. subtilis (8) suggest both a functional relationship and an evolutionary relationship between RubisCOs and RLPs. Lastly, the RLP from Bordetella bronchiseptica has no functional linkages above the confidence threshold and may thus belong to another group of RLPs.
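The relationship between an intergenic distance of –3 and a 3-bp overlap can be illustrated with a minimal sketch. It assumes 1-based, inclusive gene coordinates for two adjacent genes on the same strand; the function names and the coordinates are hypothetical and are not taken from any Bacillus genome or from the linkage software used in this analysis.

```python
def intergenic_distance(upstream_end: int, downstream_start: int) -> int:
    """Bases between two adjacent genes (1-based, inclusive coordinates).

    Positive: bases separating the genes; 0: the genes abut;
    negative: the genes overlap by that many bases.
    """
    return downstream_start - upstream_end - 1


def overlap_bp(upstream_end: int, downstream_start: int) -> int:
    """Number of bases shared by the two genes (0 if they do not overlap)."""
    return max(0, -intergenic_distance(upstream_end, downstream_start))


# Hypothetical coordinates: the upstream gene ends at position 1200 and the
# downstream gene starts at 1198, so positions 1198-1200 are shared.
print(intergenic_distance(1200, 1198))  # -3
print(overlap_bp(1200, 1198))           # 3
```

Under this convention, a short negative distance indicates that the two open reading frames share a few bases, which is the situation described for the RLP and hydrolase genes in the Bacillus group.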
FIG. 6. RLPs grouped by their functional linkage patterns. The 11 RLPs indicated here can be divided into two major groups. In the first group, all RLPs are linked to a hypothetical protein by the gene cluster method with short intergenic distances. The two hypothetical proteins next to the RLPs in Mesorhizobium loti and Sinorhizobium meliloti are homologous to each other. All the RLPs from Bacillus species form the second group. They have very similar gene organizations on the chromosome. They all reside between an aminotransferase and a hydrolase, which overlaps with RLPs by 3 bp. The RLP from Bordetella bronchiseptica does not have any functional linkages with high confidence. Oxred, oxidoreductase; Hypo, hypothetical protein; Amt, aminotransferase; MetSal, methylthioribose salvage protein; Hydro, hydrolase, haloacid dehalogenase-like hydrolase; CtRLP, C. tepidum RLP; RpRLP2, R. palustris RLP2; AfRLP, A. fulgidus RLP; SmRLP, Sinorhizobium meliloti RLP; BsRLP, B. subtilis RLP; BcRLP, B. cereus RLP; BaRLP, B. anthracis RLP; BbRLP, Bordetella bronchiseptica RLP.
FIG. 7. Methionine salvage pathway in which the YkrW-type RLP, such as the protein from B. subtilis (8), encoded by the mtnW/ykrW gene, participates in an enolase reaction whereby 2,3-diketo-5-methylthiopentyl-1-phosphate is converted to 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate (highlighted). The products of the mtnX/ykrX, mtnZ/ykrZ, and mtnV/ykrV genes then allow methionine to be formed. SAM, S-adenosylmethionine. (Adapted from reference 8 with permission of the publisher.)
In every phylogenetic reconstruction examined, the bona fide RubisCOs (forms I to III) form a coherent clade, suggesting that they share a common line of descent. With minimum evolution and neighbor joining, forms I and II are late-descending nodes in a clade where the deepest branches are form III RubisCO and two additional RubisCO sequences from Methanosaeta thermophila and Methanospirillum hungatei. These two archaeal sequences consistently clade with one another and separate from other archaeal RubisCO sequences in form III. In addition, the sequence of the RubisCO from Methanococcoides burtonii, a methanogenic archaeon isolated from Antarctic marine sediments (60), consistently branches at the base of the form II clade in every method employed. These sequences are quite divergent, averaging only 28% (M. thermophila and M. hungatei) and 24% (M. burtonii) identity with all other RubisCO/RLP sequences. This consistent distribution of archaeal sequences at the base of clades containing all known bona fide RubisCO sequences suggests that this clade may have originated in the Archaea and subsequently been distributed to bacteria, eukaryotic algae, and higher plants. Overall bootstrap support is high for nodes in both methods with mean values of 75% and 83% for NJ and ME, respectively. The lowest bootstrap values were observed for internal nodes of the RLP cluster, while all terminal nodes are strongly supported.
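Percent identity values such as those quoted above depend on how the comparison is defined. The following is a minimal sketch of one common convention, in which only alignment columns without a gap in either sequence are counted; the function name and the toy sequences are hypothetical, and this is not necessarily the calculation used to obtain the 28% and 24% figures.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not RubisCO sequences):
# 6 gap-free columns, 5 of which are identical.
toy_a = "MKT-AYLG"
toy_b = "MRTQAY-G"
print(round(percent_identity(toy_a, toy_b), 1))  # 83.3
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which yields lower values for the same alignment.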
The two other methods employed to reconstruct RubisCO/RLP relationships, UPGMA and MP, display different relationships among forms I to III. UPGMA maintains the same two lineages of form III observed by ME and NJ methods but places them as a sister group to the M. thermophila and M. hungatei sequences. This archaeal cluster is a sister clade to all form I sequences by the UPGMA method. With MP, the form III sequences are rearranged into two different lineages (III* and III** in Fig. 2D) that differ from the III-1 and III-2 lineages found by the three other methods. With this rearrangement, form I RubisCOs appear as a daughter clade nested within form III sequences. MP further produces a tree in which all clades are nested and splits the IV-Photo group found by all other methods into proteobacterial and Chlorobium clades. The major differences between the NJ, ME, and UPGMA trees are due to the branching order of RLP lineages, specifically whether the IV-NonPhoto lineage branches deeply off the RubisCO lineage or within the RLP lineage. Mean bootstrap support is also high for UPGMA and MP, at 80% and 100%, respectively.
Obviously, the tree topologies are highly dependent on the phylogenetic inference method employed. Both UPGMA and MP are known to be the most reliable for estimating trees in data sets where evolutionary rates are nearly constant across lineages (15), while NJ with rate correction was found to operate reliably when faced with variable rates across lineages. ProtTest analysis of the RubisCO/RLP sequence set indicated a moderate amount of rate variability that could confound UPGMA and MP analyses. Thus, the phylogenetic relationships inferred by NJ and ME, which indicate an archaeal origin for RubisCO/RLP, appear to be the most robust.
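The sensitivity of UPGMA to unequal rates across lineages can be shown with a toy four-taxon example. The sketch below is illustrative only: the distance matrix is hypothetical (additive distances on a true tree ((A,B),(C,D)) in which B and D lie on long branches), it compares only the first pair joined by each method, and it does not reproduce the MEGA analyses or address MP.

```python
from itertools import combinations

# Hypothetical additive distances for the true tree ((A,B),(C,D)):
# terminal branches A=1, B=4, C=1, D=4 and an internal branch of 1.
DIST = {
    ("A", "B"): 5, ("A", "C"): 3, ("A", "D"): 6,
    ("B", "C"): 6, ("B", "D"): 9, ("C", "D"): 5,
}
TAXA = ["A", "B", "C", "D"]


def d(x, y):
    """Symmetric lookup in the distance table."""
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]


def first_upgma_pair():
    """UPGMA joins the pair with the smallest raw distance."""
    return min(combinations(TAXA, 2), key=lambda p: d(*p))


def first_nj_pair():
    """NJ joins the pair minimizing Q(i,j) = (n-2)*d(i,j) - r_i - r_j."""
    n = len(TAXA)
    row_sum = {t: sum(d(t, u) for u in TAXA if u != t) for t in TAXA}

    def q(pair):
        i, j = pair
        return (n - 2) * d(i, j) - row_sum[i] - row_sum[j]

    return min(combinations(TAXA, 2), key=q)


print(first_upgma_pair())  # ('A', 'C'): the two short branches, not true sisters
print(first_nj_pair())     # ('A', 'B'): a true sister pair
```

In this example, UPGMA groups the two slowly evolving taxa because their raw distance is smallest, whereas the NJ criterion corrects for each taxon's average distance to all others and recovers a true sister pair.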
RubisCO and RLP appear to be more prevalent in the euryarchaea, which, along with the crenarchaeota, form the two major branches of descent in the archaea as delineated by 16S rRNA gene sequence comparisons (18). There are only two crenarchaeal form III sequences known, those from Hyperthermus butylicus DSM 5456 (YP_001012710) and Thermofilum pendens Hrk-5 (YP_920628). Within the archaeal RubisCOs, there appears to be more flexibility in the range of residues accepted at active-site positions (Fig. 3), indicating that the final sequence of the active site is variable in this group, while it appears set in all form I and form II sequences. Additionally, the most deeply branching RLP sequence is found in the euryarchaeon A. fulgidus. Taken together, these observations suggest that the euryarchaea harbor the deepest-branching RubisCO and RLP sequences, which therefore makes them the best candidates for the evolutionary root of the RubisCO and RLP superfamily.
When the phylogenetic distribution of RubisCO/RLP lineages (Table 1) was examined, a single transfer of RLP from a methanogenic euryarchaeon into an ancestor of the Firmicutes, Proteobacteria, and Chlorobia, with subsequent lateral transfer to chloroflexi, followed by gene losses, could account for the distribution of most of the RLP lineages. Likewise, lateral transfer of a form III RubisCO from a euryarchaeon to a common ancestor of Cyanobacteria and Proteobacteria and eukaryote RubisCOs being acquired via subsequent endosymbiotic events could account for the distribution of bona fide RubisCO lineages observed. From these considerations, the likely evolutionary development of the large subunit of RubisCO and RLP follows the model depicted in Fig. 9. In addition to this scheme, it appears that the M. burtonii sequence, found at the base of the form II clade, may be a result of lateral transfer of a bacterial form II sequence to the archaea.
FIG. 9. Model for the evolution of RubisCO large subunits and RLP. The ancestor of all extant RubisCO large subunits and RLPs is proposed to have arisen in the Methanomicrobia with subsequent distribution by vertical transmission (solid arrows) and lateral transfer (dashed arrows) within the archaea. A central event in the evolutionary history was the acquisition of both a form III RubisCO and an RLP (IV-DeepYkrW) by an ancestral eubacterium from the archaea. From these two ancestral sequences, diverse form I, form II, and form IV enzymes evolved within the Proteobacteria and Cyanobacteria and have been subsequently distributed by lateral gene transfer and by endosymbiotic events (dashed and dotted arrows) involving both Cyanobacteria and Alphaproteobacteria, leading to the phylogenetic distribution of sequences seen in nature today. The small subunit of form I RubisCO must have originated soon after the transfer of form III to the eubacterial ancestor prior to the divergence of Proteobacteria and Cyanobacteria.
The scenario outlined above and in Fig. 9 does not explain the presence of RLPs in the picoeukaryote marine chlorophyte O. tauri, which encodes two distinct and highly divergent RLPs in its nuclear genome (17) in addition to a typical form I large subunit in the chloroplast genome (57). The synthesis and function of the RLPs encoded by the O. tauri nuclear genome have not yet been demonstrated. It may be that this sole example of eukaryotic RLPs is indicative of an additional lateral gene transfer, possibly from a member of the Alphaproteobacteria. The other unusual prokaryote-to-eukaryote lateral transfer in this scheme, which explains the presence of form II RubisCO in the Dinophyceae, has been previously analyzed in detail (44, 51, 52, 58).
The archaeal origin model of RubisCO/RLP evolution proposed here is substantially different from that reported previously by Ashida et al. (7) and Carre-Mlouka et al. (11), who speculated that bona fide RubisCOs arose in the YkrW lineage. However, those authors relied on much smaller sets of sequences and more limited numbers of phylogenetic reconstructions to reach their conclusions. Clearly, models of RubisCO evolution will themselves need to evolve, as new sequences are continuously being reported, especially in metagenomic sequencing projects. However, at this point, no new distinct RubisCO forms have been uncovered; thus, the basic conclusions reached in Fig. 9 appear to represent the most feasible scenarios for RubisCO and RLP evolution.
-barrel motif, the TIM barrel (46, 76). The TIM barrel fold is composed of 32 superfamilies in the latest release of the Structural Classification of Proteins (SCOP) database (version 1.71 [http://scop.mrc-lmb.cam.ac.uk/scop/]), the largest number of structural superfamilies of any fold within the SCOP class of alpha and beta proteins (α/β). The functional flexibility of the TIM barrel scaffold has been well documented (reviewed by Anantharaman et al. [3]). The evolution of TIM barrel proteins has been previously examined, and the RubisCO superfamily was found to cluster with other TIM barrel superfamilies containing a sugar phosphate binding motif (76). However, a separate PSI-BLAST analysis failed to link the RubisCO structure with other TIM families (46). To identify structural homologs of RubisCO and RLP, a total of five RubisCO/RLP structures representing each major lineage (Protein Data Bank [PDB] accession numbers 1RBL [form I], 5RUB [form II], 1GEH [form III], 1YKW [form IV/RLP], and 2OEJ) were used to search the PDB structure collection using the DALI fold comparison search tool (http://www.ebi.ac.uk/dali/index.html). The DALI server was chosen based on its favorable evaluation relative to other fold comparison servers (48). Structural homologs were considered only if the average DALI Z score was >10 and the structure was identified by each of the five queries (Table 3). The number of homologs returned was driven primarily by PDB accession number 5RUB, the form II RubisCO structure from Rhodospirillum rubrum, which retrieved the fewest homologs.
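The selection rule for structural homologs can be sketched as a simple filter. The example below is a hedged illustration rather than the procedure actually used: it assumes that the DALI results have already been parsed into a mapping from each query to its hits and Z scores, that the average is taken across the five queries, and the hit names and scores are invented for demonstration.

```python
from statistics import mean

# The five query structures named in the text.
QUERIES = ["1RBL", "5RUB", "1GEH", "1YKW", "2OEJ"]


def consensus_homologs(hits, queries=QUERIES, min_mean_z=10.0):
    """Keep hits returned by every query whose mean Z score exceeds the cutoff.

    hits: {query_id: {hit_id: z_score}} (assumed, pre-parsed input format).
    Returns {hit_id: mean_z} for the hits that pass both criteria.
    """
    shared = set(hits[queries[0]])
    for query in queries[1:]:
        shared &= set(hits[query])  # must be identified by each query
    passing = {}
    for hit in shared:
        avg = mean(hits[query][hit] for query in queries)
        if avg > min_mean_z:
            passing[hit] = avg
    return passing


# Invented example data: hitA passes, hitB is missing from the 5RUB results,
# and hitC is found by all queries but has a mean Z score below the cutoff.
toy_hits = {
    "1RBL": {"hitA": 12.0, "hitB": 15.0, "hitC": 9.0},
    "5RUB": {"hitA": 11.0, "hitC": 8.0},
    "1GEH": {"hitA": 13.0, "hitB": 14.0, "hitC": 10.0},
    "1YKW": {"hitA": 12.0, "hitB": 16.0, "hitC": 9.0},
    "2OEJ": {"hitA": 12.0, "hitB": 15.0, "hitC": 9.0},
}
print(consensus_homologs(toy_hits))  # {'hitA': 12.0}
```

Because a hit must appear in every query's results, the query that retrieves the fewest structures limits the size of the final set, as noted above for 5RUB.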
TABLE 3. Structural homologs of RubisCO/RLP as determined by fold comparisons carried out at the DALI server with five representative RubisCO and RLP structures against the PDB database
With evidence pointing to RuBP as the exclusive substrate for archaeal RubisCO, how, then, does the organism synthesize RuBP? Again, negative data suggest that there is no demonstrable PRK activity in extracts from the organisms tested (26); moreover, analyses of the great majority of available genomes indicate no recognizable gene to encode PRK. Recently determined genomic sequences of Methanospirillum hungatei, Methanoculleus marisnigri, and Methanosaeta thermophila (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi) represent the only archaeal organisms where potential PRK genes may exist. As for the vast majority of archaea, which possess no discernible PRK gene, a satisfying positive finding was the demonstration of a novel means to synthesize RuBP. Direct enzymatic assays using alternative substrates with extracts of Methanocaldococcus jannaschii provided evidence for a previously uncharacterized pathway for RuBP synthesis from 5-phosphoribose-D-1-pyrophosphate (PRPP) in M. jannaschii and other methanogenic archaea (26). Thus, these experiments, using PRPP as the sole substrate, resolved the need for a kinase dedicated to RuBP generation because PRPP already contains the relevant phosphates at both the C1 and C5 positions. Based on studies with other systems, it was hypothesized that either there is a selective enzymatic dephosphorylation step at the C1 position or nonenzymatic dephosphorylation occurs at the pyrophosphate at both moderate and high temperatures in the presence of magnesium at neutral pH (26). In either instance, the product would be ribose-1,5-bisphosphate, a compound known to be synthesized in many other biological systems, including macrophages and red blood cells under conditions of hypoxia (discussed in reference 24). A further indication of a novel and specific enzymatic reaction(s) was the stoichiometric conversion of PRPP to RuBP using extracts of M. jannaschii, such that one molecule of PRPP was converted to two molecules of PGA.
These results provided experimental verification for the proposed pathway (26). Inhibition of the PRPP-to-PGA conversion in vitro by both the RubisCO transition state analog CABP and antibodies to M. jannaschii RubisCO convincingly reinforced the idea that RubisCO catalysis is essential to convert PRPP to PGA. The proposed unique enzymatic step of this pathway is the conversion of ribose-1,5-bisphosphate, or ribose-1,2 cyclic phosphate-5-P (ribose-1,2cP-5-P), to RuBP. This work thus identified a novel means to synthesize the CO2 acceptor and substrate for RubisCO in the absence of a detectable kinase such as PRK. More recently, studies with Thermococcus kodakarensis confirmed and greatly extended those studies and again pointed to ribose-1,5-bisphosphate as the direct precursor to RuBP (59). Moreover, Sato et al. identified the enzymes and the requisite structural genes, including RubisCO, that are involved in a pathway of AMP metabolism (59). In that scheme, AMP, which could be produced from PRPP, is acted upon by an AMP phosphorylase to produce ribose-1,5-bisphosphate, followed by a ribose-1,5-bisphosphate isomerase, to yield RuBP. Both studies proposed that this route to RuBP might point to unique evolutionary links between purine-pyrimidine recycling pathways and the CBB cycle, with RubisCO catalysis and PRPP/AMP metabolism providing the needed anaplerotic levels of PGA (26, 59). Apparently, the genes of the AMP-to-RuBP pathway are conserved in virtually all archaea that contain form III RubisCO (59), suggesting that this might be a universal means by which archaea employ RubisCO in metabolism.
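The stoichiometry reported above (one PRPP yielding two PGA) can be checked with a small carbon-balance sketch of the PRPP route. The carbon counts are standard chemistry rather than values taken from the article, the first step is the dephosphorylation that the source describes as hypothesized, and the names `PATHWAY`, `is_carbon_balanced`, and `linear_yield` are illustrative only.

```python
# Carbon atoms per molecule (standard chemistry, not taken from the article).
CARBONS = {
    "PRPP": 5,
    "ribose-1,5-bisphosphate": 5,
    "RuBP": 5,
    "CO2": 1,
    "PGA": 3,
}

# Each step maps substrates to products as {compound: stoichiometric coefficient}.
PATHWAY = [
    ({"PRPP": 1}, {"ribose-1,5-bisphosphate": 1}),  # dephosphorylation at C1 (hypothesized)
    ({"ribose-1,5-bisphosphate": 1}, {"RuBP": 1}),  # isomerization to RuBP
    ({"RuBP": 1, "CO2": 1}, {"PGA": 2}),            # RubisCO carboxylation
]


def carbon_count(side):
    """Total number of carbon atoms on one side of a reaction."""
    return sum(CARBONS[name] * n for name, n in side.items())


def is_carbon_balanced(substrates, products):
    """True if both sides of a step carry the same number of carbon atoms."""
    return carbon_count(substrates) == carbon_count(products)


def linear_yield(pathway, start):
    """Follow a linear pathway and return (end product, moles per mole of start)."""
    compound, amount = start, 1.0
    for substrates, products in pathway:
        if compound not in substrates:
            raise ValueError(f"{compound} is not consumed by this step")
        if len(products) != 1:
            raise ValueError("this sketch handles one product per step")
        (product, coefficient), = products.items()
        amount = amount * coefficient / substrates[compound]
        compound = product
    return compound, amount


balanced = [is_carbon_balanced(s, p) for s, p in PATHWAY]
end_product, pga_yield = linear_yield(PATHWAY, "PRPP")
```

The sketch tracks carbon only; released phosphate and water are deliberately left out, so it says nothing about the phosphate balance of the first step.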
α/β-barrel with two additional small α-helices forming a cap at the C terminus (39). Like form I, form II, and form III RubisCOs, the presumptive active site of RLP is located in the subunit interface between the C-terminal domain of one subunit and the N-terminal domain of another subunit (Fig. 10). As discussed above, compared to the invariant active-site residues found in RubisCO, 10 of 19 active-site residues differ in the C. tepidum RLP. These dissimilarities in the amino acid sequence confer unique shapes and chemical properties to the active site, suggesting that C. tepidum RLP may not bind RuBP but may bind a structurally related molecule. While the C. tepidum RLP active site appears able to accommodate the P1 phosphate group, the backbone of CABP, and a metal ion (possibly Mg2+), the geometry and chemistry seem to be incompatible with an incoming P2 phosphate group, as in CABP (39). It does appear, however, that a smaller and slightly hydrophobic group may fit into this active site. One is also confronted with the interesting result that R. rubrum RubisCO complements the function of B. subtilis RLP (YkrW/MtnW) in a ykrW/mtnW mutant (8). Since the substrates for the 2,3-diketo-5-methylthiopentyl-1-phosphate enolase and RubisCO reactions are fairly similar (Fig. 8), one would expect that the active sites of RLP should be able to bind a wide range of molecules similar to RuBP, as is the case with RubisCO (5). However, structural analyses of C. tepidum RLP indicate that nonidentical residues at positions coincident with mechanistically significant RubisCO residues make the RLP active-site pocket smaller and slightly more hydrophobic at the P2 site (39) (Fig. 10) and hence may cause steric hindrance for binding the P2 phosphate of RuBP. The active-site structure of C. tepidum RLP suggests that this protein might function as an enolase but probably could not catalyze carboxylation (39).
FIG. 10. The active sites in the crystal structures of form I (spinach) RubisCO (PDB accession number 8RUC) with bound CABP, C. tepidum RLP (PDB accession number 1YKW), R. palustris RLP2 (PDB accession number 2QYG), and G. kaustophilus RLP (accession number 2OEM) with bound DK-H-1-P. The side chains of active-site residues are shown as sticks, except for residue R383 in C. tepidum RLP and R. palustris RLP2. Only the backbone carbon and nitrogen atoms of R383 in the RLPs are shown. CABP and DK-H-1-P are shown in white, and the P1 and P2 phosphate groups are labeled in red and orange. Residues involved in contributing hydrogen bonds with the P1 phosphate group are green, residues involved in making hydrogen bonds with the backbone of CABP are orange, residues coordinating the Mg2+ atom (shown in magenta) are light red, and residues involved in binding P2 phosphate group are cyan. Not all parts of the structures are shown for the purpose of clarity.
α/β-barrel domain of RubisCO, plays an important role in catalysis (13). Among multiple form I RubisCO structures, loop 6 has been observed to partition between the "open" and "closed" conformations (20, 61, 62). In C. tepidum RLP, loop 6 is ordered and adopts a closed conformation similar to that found in the structure of activated RubisCO (PDB accession number 8RUC), although no substrate is bound at the active site. Loop 6 folds over and closes the active site. The backbone of a key residue on loop 6, Arg-327, superimposes well with that of Lys-334 in form I RubisCO (Fig. 10). Although the side chain has a different conformation, it can possibly make hydrogen bonds with CABP. There are two other major differences in the C. tepidum RLP structure compared to those of bona fide RubisCO proteins. First, there is an additional 14-residue loop, loop CD, between β-strands C and D in the N-terminal domain. Second, C. tepidum RLP is missing a β-hairpin turn between helix 6 and β-strand 7 in the C-terminal α/β-barrel domain. Loop CD approaches the active-site opening from the direction opposite from loop 6 and packs against loop 6. The positional similarity between loop CD and the C-terminal tail of RubisCO upon substrate binding suggests that loop CD may have a role in positioning loop 6 (discussed below).
In this review, we also report the second crystal structure of an RLP from the IV-Photo clade, RLP2 from R. palustris (PDB accession number 2QYG) (Table 4). As described above, the structure of R. palustris RLP2 is very similar to the structure of C. tepidum RLP, with a Cα atom rmsd of 0.8 Å (Fig. 11). A previously described disordered region in N-terminal domain residues 47 to 58 was found to be ordered in the R. palustris RLP2 structure, as in another independently solved C. tepidum RLP structure (cited in reference 31). In addition, much like C. tepidum RLP, the same residues analogous to RubisCO active-site residues are conserved in R. palustris RLP2. However, in R. palustris RLP2, residue R327 appears to take up a different conformation compared to that in the C. tepidum RLP structure (Fig. 10). In R. palustris RLP2, the side chain of R327 adopts a conformation comparable to that of K334, the corresponding catalytic residue of spinach (form I) RubisCO (PDB accession number 8RUC). The R. palustris RLP2 structure further supports the hypothesis that residue R327 can potentially form hydrogen bonds with the P1 phosphate and the backbone of a CABP-like ligand. Residue E119, although adopting a different conformation relative to the identical residue in C. tepidum RLP, can still potentially form a hydrogen bond with the backbone of a CABP-like ligand.
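For readers unfamiliar with the rmsd figure quoted above, the quantity is the square root of the mean squared distance between paired Cα atoms, rmsd = sqrt((1/N) Σ |a_i − b_i|²). The sketch below computes it for two coordinate lists that are assumed to be already superimposed and paired residue by residue; it is not the software used for the structures in this review, the function name `ca_rmsd` is illustrative, and the coordinates are made up.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units) between paired 3D points.

    Both inputs are equal-length sequences of (x, y, z) tuples that have
    already been superimposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Made-up coordinates (in Å): the second trace is the first shifted by 1 Å along z.
trace_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
example_rmsd = ca_rmsd(trace_a, trace_b)
```

A real comparison would first find the optimal superposition of the two structures and decide which residues are equivalent; both steps are outside the scope of this sketch.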
TABLE 4. X-ray data collection and refinement statistics of the R. palustris RLP2 structure
FIG. 11. The monomer structures of RLP2 from R. palustris and RLP from G. kaustophilus superimposed with the RLP from C. tepidum. R. palustris RLP2 is blue, G. kaustophilus RLP is red, and C. tepidum RLP is green. The root mean square deviation (RMSD) of the Cα atoms is 0.8 Å between R. palustris RLP2 and C. tepidum RLP, 1.3 Å between R. palustris RLP2 and G. kaustophilus RLP, and 1.3 Å between C. tepidum RLP and G. kaustophilus RLP. Two main structural differences can be seen in the N-terminal domain: loop CD in C. tepidum RLP and R. palustris RLP2 becomes a helix in G. kaustophilus RLP, and residues 47 to 58, missing in C. tepidum RLP, become a loop in R. palustris RLP2 and partly a helix in G. kaustophilus RLP.
The N-terminal 18 residues in the Photo-type RLPs are missing in the YkrW type. In addition, there are two main differences between the structures of the Photo-type RLP and YkrW-type RLP in the N-terminal domain. Loop CD in G. kaustophilus RLP becomes a helix and slightly swings away from loop 6 but forms a tighter interaction interface with loop CD (which should be helix CD in this case) from the other monomer. The second main difference is in the region of residues 47 to 58, which was previously missing from the C. tepidum RLP structure. This region is less flexible in the structure of R. palustris RLP2, forming a loop, and partly becomes a helix in the structure of G. kaustophilus RLP (Fig. 11). Loop 6 in G. kaustophilus RLP adopts a closed conformation, as seen in the Photo-type RLPs, when the substrate analog or Mg2+ is bound but becomes flexible when no substrate or Mg2+ is bound; in addition, the density for residues 304 to 308 is missing in the latter G. kaustophilus structure, presumably because of the flexibility of loop 6 (PDB accession number 2OEJ).
FIG. 12. Comparison of secondary structural elements in the X-ray crystal structures of the different forms of RubisCO. Large subunits from the structures of spinach (form I; PDB accession number 8RUC) (yellow), T. kodakarensis (form III; accession number 1GEH) (purple), and R. rubrum (form II; accession number 5RUB) (red) were superimposed on C. tepidum RLP (form IV; accession number 1YKW) (green) to align the α-carbon backbones. The transition state analog CABP (black sticks), which is present only in the spinach structure, has been drawn into the other structures to indicate the positions of active sites. A basic unit common to all four types of structures is formed as a result of the association of at least two of the large subunits. The active sites in bona fide RubisCO enzymes are contributed by residues from the N-terminal domain of one large subunit and the C-terminal domain of the other. Loop CD, which is present only in the RLPs, and the RubisCO β-hairpin structure, which is absent in the RLP structure, are indicated.
FIG. 13. Structural alignment of representative sequences from RLPs and RubisCO large subunits. Superimposition of the X-ray crystal structures of C. tepidum RLP (PDB accession number 1YKW; form IV), spinach RubisCO (accession number 8RUC; form I), T. kodakarensis RubisCO (accession number 1GEH; form III), and R. rubrum RubisCO (accession number 5RUB; form II) was used to deduce the alignment of secondary structural elements (helices as bars and β-strands as arrows). Residue numbers are indicated on each side of the sequences. Conserved active-site residues are marked with an "*" below the sequences. RubisCO large-subunit sequences are boxed in gray. Residues that are identical or similar to those in other species are colored uniquely based on the nature of the residue. The catalytic loop 6, β-hairpin (both present in RubisCO enzymes), and loop CD (present only in RLPs) are indicated. A. vinosum, Allochromatium vinosum; C. limicola, Chlorobium limicola; O. granulosus, Oceanicola granulosus; P. horikoshii, Pyrococcus horikoshii; T. denitrificans, Thiobacillus denitrificans.
FIG. 14. Comparison of the unique loop CD of C. tepidum RLP (PDB accession number 1YKW) (A) with the comparable region of form I (spinach) RubisCO (accession number 8RUC) (B). Residues Q78 to I91 form a loop (loop CD) (red ribbon and sticks), and residues in this loop have multiple interactions with residues of the same subunit (green ribbon and sticks) or the neighboring large subunit (purple ribbon and sticks). Notably, the hydroxyl group of S86 forms a hydrogen bond with loop 6 residue R327 (orange sticks) from the neighboring large subunit. Spinach form I residues equivalent to E75, E77, and H92 of C. tepidum RLP are E93, E94, and N95 (red ribbon and sticks). Residues K305 and V475 (yellow sticks) interact with E93 in the closed conformation of spinach RubisCO.
FIG. 15. Placement of the β-hairpin residues in the holoenzyme structure of form I (spinach) RubisCO (PDB accession number 8RUC). The β-hairpin residues (Y353 to S367) (red) that are absent in RLPs are exposed to the solvent in the holoenzyme structure of spinach RubisCO. The large subunits are yellow, and the small subunits are blue.
We thank Simona Romagnoli for the rlp2 clone and Yim Wu for her assistance in protein purification and crystallization.
Supplemental material for this article may be found at http://mmbr.asm.org.