Microbiology and Molecular Biology Reviews, December 2000, p. 786-820, Vol. 64, No. 4
1092-2172/00/$04.00+0
Copyright © 2000, American Society for Microbiology. All rights reserved.

Origin and Evolution of the Mitochondrial Proteome

C. G. Kurland (1, 2, *) and S. G. E. Andersson (1)

Department of Molecular Evolution, Evolutionary Biology Centre, University of Uppsala, Uppsala SE 752 36 (1), and Department of Microbiology, Lund University, Lund SE 223 62 (2), Sweden

SUMMARY
INTRODUCTION
OX-TOX MODEL
FROM BACTERIAL GENOME TO VESTIGE
    Genome Degradation
    Transfer to the Nucleus
THE YEAST MITOCHONDRIAL PROTEOME
    Energy Metabolism
    Information Processes
    Heat Shock Proteins
    Biosynthesis, Regulation, and Transport
    Summary
THE HOST
    Archaeazoa
    Hydrogenosomes
    Hydrogen Exporter---Late
    Hydrogen Exporter---Early
    Eukaryotic Heterotrophy
    Horizontal Transfer to and from Eukaryotes
    Phylogenetic Inference and Gene Transfer
FUTURE DIRECTIONS
ACKNOWLEDGMENTS
REFERENCES


SUMMARY

The endosymbiotic theory for the origin of mitochondria requires substantial modification. The three identifiable ancestral sources to the proteome of mitochondria are proteins descended from the ancestral α-proteobacteria symbiont, proteins with no homology to bacterial orthologs, and diverse proteins with bacterial affinities not derived from α-proteobacteria. Random mutations in the form of deletions large and small seem to have eliminated nonessential genes from the endosymbiont-mitochondrial genome lineages. This process, together with the transfer of genes from the endosymbiont-mitochondrial genome to nuclei, has led to a marked reduction in the size of mitochondrial genomes. All proteins of bacterial descent that are encoded by nuclear genes were probably transferred by the same mechanism, involving the disintegration of mitochondria or bacteria by the intracellular membranous vacuoles of cells to release nucleic acid fragments that transform the nuclear genome. This ongoing process has intermittently introduced bacterial genes to nuclear genomes. The genomes of the last common ancestor of all organisms, in particular of mitochondria, encoded cytochrome oxidase homologues. There are no phylogenetic indications either in the mitochondrial proteome or in the nuclear genomes that the initial or subsequent function of the ancestor to the mitochondria was anaerobic. In contrast, there are indications that relatively advanced eukaryotes adapted to anaerobiosis by dismantling their mitochondria and refitting them as hydrogenosomes. Accordingly, a continuous history of aerobic respiration seems to have been the fate of most mitochondrial lineages. The initial phases of this history may have involved aerobic respiration by the symbiont functioning as a scavenger of toxic oxygen. The transition to mitochondria capable of active ATP export to the host cell seems to have required recruitment of eukaryotic ATP transport proteins from the nucleus.
The identity of the ancestral host of the α-proteobacterial endosymbiont is unclear, but there is no indication that it was an autotroph. There are no indications of a specific α-proteobacterial origin to genes for glycolysis. In the absence of data to the contrary, it is assumed that the ancestral host cell was a heterotroph.


INTRODUCTION

Mitochondria are the ATP-generating organelles of eukaryotes, and in most organisms they are oxygen respiring. Roughly 2 billion years ago, the ambient oxygen tension of Earth's atmosphere increased rapidly. Here, rapidly means that the oxygen tension went from roughly 1% to more than 15% of present levels within less than 200 million years (88). Many believe that the origins of mitochondria as organelles in primitive eukaryotes can be associated with this environmental trauma (121).

Nevertheless, the Earth's atmosphere during the billions of years prior to this global oxygen shock was probably not the heavy reducing atmosphere suggested by Oparin (142). Geochemical evidence suggests that the oxygen tension in the atmosphere may have been as much as 1% of present levels from the very beginning (88, 157). In other words, during the entire history of the biosphere, oxygen was accessible at low levels in the atmosphere and quite possibly at higher levels locally. The continuous presence of oxygen matches the ancient origins of the terminal oxidases characteristic of mitochondria. Thus, the monophyletic lineage of cytochrome oxidases is well represented in the archaea, bacteria, and eukaryotes (40, 41, 104, 161, 166).

Phylogenetic reconstructions and distance measurements based on the sequences of cytochrome c oxidase and cytochrome b are consistent with divergence of mitochondria from bacteria between 1.5 and 2.0 billion years ago (165). Accordingly, the oxidative respiratory system that was introduced into eukaryotes by way of the primitive mitochondrion was already an ancient enzymatic system. There is now overwhelming support for the idea that the vehicle that introduced the respiratory system into the eukaryotic lineages was an endosymbiotic α-proteobacterium (20, 77, 78, 79).

The endosymbiotic theory of plastid as well as of mitochondrial origins arose in the nineteenth century and was given new life by Margulis (121) precisely when molecular methods could begin to test some of its predictions. The discovery of mitochondrial genomes and the results of phylogenetic reconstructions with sequences for rRNA as well as for a few proteins strengthened confidence in this theory (32, 79, 206). As a consequence, when we reviewed the literature on codon preferences in this journal 10 years ago, we found it convenient to treat the mitochondrial genome as though it were just another kind of bacterial genome (12).

Since then, detailed comparisons and phylogenetic reconstruction with relevant genome sequences have very much expanded our view of the mitochondrion. Most informative have been the mitochondrial genomes of protists (77, 80, 111), the nuclear genome of the yeast Saccharomyces cerevisiae (87; http://www.proteome.com), and the genome of the α-proteobacterium Rickettsia prowazekii (20, 76). The genomic comparisons show unambiguously that the coding sequences of the mitochondrial genomes are predominantly the descendants of α-proteobacterial homologues. Accordingly, some version of the endosymbiotic theory is in all probability relevant to the origins of mitochondria. However, to account for some of the new data, this theory needs to be modified significantly.

First, it turns out that only a small fraction of all proteins functioning in mitochondria are the descendants of the ancestral free-living α-proteobacterium. Most of the remaining proteins are descendants of nuclear genes with no bacterial antecedents (17, 95). That most of the genes of the ancestral α-proteobacterium have disappeared from the mitochondrial genome has been understood for some time (13, 14, 15, 16, 19, 20, 22, 77). The magnitude of the loss can be estimated as follows. To our knowledge, the smallest genome of a free-living α-proteobacterium is that of Bartonella henselae, with less than 2 × 10^6 base pairs, and the largest is Bradyrhizobium japonicum, with 8.7 × 10^6 base pairs (107, 155). Since the Bartonella genome encodes 1,600 or more proteins (Andersson et al., unpublished), we can take this figure as a conservative size estimate for the proteome encoded by the free-living α-proteobacterial ancestor of mitochondria. What then accounts for the enormous size discrepancy between the coding capacity of α-proteobacterial and mitochondrial genomes? Here, we need to compare 1,600 proteins with the 67 proteins encoded by the mitochondrion with the largest coding repertoire, that of Reclinomonas americana (76, 77).
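The comparison of 1,600 with 67 can be made explicit with a short calculation. The following is only a sketch: the two counts are the figures quoted above, while the variable and function names are illustrative.

```python
# Figures quoted in the text; the names are illustrative.
ANCESTOR_PROTEINS = 1600        # conservative estimate based on Bartonella henselae
RECLINOMONAS_MT_PROTEINS = 67   # largest known mitochondrial coding repertoire


def retained_fraction(retained: int, ancestral: int) -> float:
    """Fraction of the ancestral protein-coding repertoire still encoded."""
    if ancestral <= 0:
        raise ValueError("ancestral count must be positive")
    return retained / ancestral


retained = retained_fraction(RECLINOMONAS_MT_PROTEINS, ANCESTOR_PROTEINS)
lost = 1.0 - retained
print(f"retained in the organelle genome: {retained:.1%}")
print(f"no longer encoded in the organelle: {lost:.1%}")
```

On these figures, even the most gene-rich mitochondrial genome retains only about 4% of the conservatively estimated ancestral coding repertoire. Because 1,600 is a lower bound for the ancestor, the true retained fraction is, if anything, smaller.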

There are at least two large-scale reductive tendencies that account for the fact that contemporary mitochondrial genomes have evolved into mere vestiges of the ancestral genome. One is the massive loss of genes that are not essential to life in the eukaryotic cytosol. Thus, genes in the nuclear genome can replace many gene products originally encoded by ones in the endosymbiont's genome (6, 13, 15, 16, 108). This means that the suspension of purifying selection allows redundant genes in the mitochondria to be inactivated and deleted by random mutation (101). In addition, unique essential genes can be transferred to the nucleus if their protein products can be recruited from the cytosol for function in the mitochondrion (17, 27, 75, 95). A recent model for the evolution of mitochondrial genomes predicts that eventually, when such transfer is not destructive, all coding sequences will be displaced from the mitochondria to the nucleus (27).

Genes transferred to the nucleus can encode proteins that will be transported to the mitochondria by a specific transport system (138, 162). The same transport system can also assist mitochondria to recruit nonbacterial proteins encoded in the nucleus. For example, the nucleus of Saccharomyces cerevisiae contributes more than 400 proteins to the mitochondrion (87). Phylogenetic analysis suggests that half of the nuclear proteins that augment the mitochondrial proteome have no bacterial affinities (19, 95). This half is likely to be purely eukaryotic in origin (see below).

In effect, mitochondria have evolved in two distinctive modes. One is the reductive mode that reflects an extreme adaptation to an intracellular existence. The other is an expansive mode in which the mitochondria are the beneficiaries of nuclear evolution. In the following we document these modes of genome evolution. We stress the importance of extending the analysis of mitochondria beyond the relatively small and highly variable contribution of contemporary mitochondrial genomes. A narrow focus on the genomes of the organelle tends to obscure most of its evolutionary history. Since there are descendants of both ancient α-proteobacterial genes and more recent eukaryotic genes cooperating in mitochondrial functions, it is most convenient to view the evolution of the organelle as the evolution of a proteome. Viewed from this vantage, mitochondria no longer seem to be just another sort of bacteria.

Space limitations have meant that we cannot do justice to the vast amount of information that is available on many other aspects of the mitochondrial genome. We recommend that interested readers supplement their background information with the aid of an excellent book, Mitochondrial Genomes (205), edited by Wolstenholme and Jeon. This book contains a chapter by Michael W. Gray (75) that is a must.


OX-TOX MODEL

The endosymbiotic theory, as explicated by Margulis (121), was an eclectic formulation that concerned much about cellular evolution besides the origins of mitochondria and plastids, but convention has reduced the common use of the term. By 1998, the standard model to describe the origins of mitochondria was quite specific: the endosymbiont was identified as an α-proteobacterium, such as Paracoccus, and the host as an archaeon (50, 75, 79, 137). An important aspect of the standard model in all of its shifting forms is that it assumes, often tacitly, that the symbiosis leading to the mitochondrial lineages involved an exchange of ATP produced aerobically by the symbiont for organics provided by the anaerobic host.

This view was challenged recently on biochemical grounds. In particular, it was recognized that a free-living bacterium such as Paracoccus would probably not be able to actively transport ATP to a prospective host because bacteria do not in general have ATP exporters (10, 20, 124, 185). Two additional observations, to which we return below, are relevant. First, only two endocellular parasitic bacteria are known to have ATP transport proteins. These are importers of ATP that are clearly related to plastid homologues but unrelated to the ATP exporters of mitochondria (10, 17, 20, 201). Second, the ATP transporters of mitochondria seem to have evolved after the divergence of eukaryotes (17, 20, 95). From where do the ATP transport functions of the mitochondria come?

It turns out that this question of seemingly small detail opens into much larger issues. In particular, if the initial symbiotic relationship between the α-proteobacterium and its host did not depend on the sharing of ATP produced through the aerobic respiration of the symbiont, what was that relationship? Two current views of the initial symbiotic relationship between the ancestor of mitochondria and its host have emerged. One sort of model favors an evolutionary path that is initially supported by anaerobic syntrophy (115, 124). The other, which involves aerobic mutualism (17), we describe first.

As noted above, all coding sequences of characterized mitochondria are found in the mitochondrial genome of R. americana (77). This, along with the clear similarity of these to coding sequences of R. prowazekii and B. henselae (20; Andersson et al., unpublished data), suggests that the mitochondria arose only once and that they, together with the putative α-proteobacterial ancestor, make up a monophyletic lineage. This is consistent with the supposition that the initial endosymbiotic relationship to the ancestral host was an aerobic one.

Nevertheless, there are two details of this putative aerobic scenario that are challenging. First, it is not possible at present to identify the host of the ancestral symbiont with any confidence. Current preferences vacillate between an archaeon and a primitive eukaryote as the likely host (see, for example, references 76 and 131). Either way, we assume for simplicity that the host was a heterotroph that could provide the endosymbiont with substrates such as pyruvate. This is not a drastic assumption because of the near ubiquity of glycolytic pathways among archaea, bacteria, and eukaryotes (46). In addition, it is known that cytochrome oxidases may function as cytochrome c oxidases, quinol oxidases, or nitrogen oxide (NO) oxidases. Nevertheless, all members of this gene family belong to the same monophyletic lineage, and all three may have been present in the last common ancestor (40, 41). This, together with the particularly close monophyletic relationship between the cytochromes employed by bacteria and by mitochondria, is strong evidence that the ancestral endosymbiont had already acquired an aerobic respiratory chain (20, 77, 165). We return to the origins of eukaryotic heterotrophy below.

Second, as mentioned, neither transporters such as the ATP/ADP translocases of Rickettsia and Chlamydia nor protein transport systems such as those that recruit proteins into the mitochondria are found among free-living bacteria. This means that there is no reason to suppose that the ancestral endosymbiont could export ATP or import proteins, as do modern mitochondria. Instead, we suggest that initially the aerobic symbiont interacted with its host in ways that are not characteristic of modern mitochondria.

One possibility is that the ancestral α-proteobacterium was an aerobic symbiont that consumed oxygen with the aid of a respiratory chain ending in cytochrome oxidase and that, in return, its heterotrophic anaerobic host made pyruvate accessible (17, 95). Here, the host may not have benefited initially by sequestering the ATP produced aerobically by the symbiont. Instead, it is suggested that the consumption of oxygen per se constituted the service provided by the ancestral symbiont in the initial phase of the evolution of mitochondria. In effect, the cytochrome oxidase activity of the symbiont detoxified the host cytosol by converting oxygen to water. The benefit here is that elements of the host's anaerobic metabolism that were sensitive to oxygen would be protected by the activities of the endosymbiont's cytochrome oxidase.

The credibility of this conjecture derives from numerous examples of modern symbiotic relationships with an oxygen-scavenging function assigned to one of the partners (60, 61, 62). Roughly two billion years ago, the oxygen tension increased from less than 1.5% to greater than 15% of present levels (88). At that time, the demands for an oxygen-consuming symbiont to support an essentially anaerobic host would have been, if anything, more pressing than they are today. Thus, in modern organisms, activities such as peroxidases, catalases, and superoxide dismutases protect cells against the toxic effects of oxygen respiration. These activities might not have been so widespread two billion years ago. Indeed, even some modern cells are killed or debilitated by exposure to less than ambient atmospheric oxygen tensions (60, 61, 62).

In the Ox-Tox model, the evolution of the mitochondrion from the endosymbiont required the evolution of characteristic mitochondrial control and export functions that were derived from nuclear genes. The evolution of novel nuclear gene products for recruitment by the mitochondria is typified by the integration of the ATP/ADP translocase into the workings of the primitive mitochondrion. Thus, this activity is found universally in mitochondria, which dates its debut to a time prior to the divergence of the major branches of eukaryotes. This novel recruit to the mitochondrial proteome made possible an efficient supply of ATP to the host cell from the evolving mitochondrion. For this reason, the integration of the ATP/ADP translocase into the workings of the endosymbiont may be taken as a marker for the transformation of the endosymbiont into an organelle.


FROM BACTERIAL GENOME TO VESTIGE

One group of α-proteobacteria, the rickettsiae, are of special interest both as models for the evolution of mitochondria and as possible descendants of the endosymbiotic ancestor to mitochondria (11, 14, 58, 81). These organisms, like the putative ancestor of the mitochondria, are thought to be the descendants of free-living α-proteobacteria (191, 197). Furthermore, phylogenetic reconstructions for diverse protein-coding sequences suggest that rickettsiae are the closest modern relatives of the mitochondria (11, 81, 141, 166, 186). Once the genome sequence of Rickettsia prowazekii (20) and that of its close relative, Bartonella henselae (Andersson et al., unpublished data), became available, their intimate phylogenetic relationships to mitochondrial genomes became incontrovertible. Nevertheless, there is a glaring gap between, on the one hand, discovering this phylogenetic relationship and, on the other, understanding precisely what that ancestor was.

Furthermore, there is the enormous discrepancy between the number of coding sequences in mitochondria and that in free-living α-proteobacteria. Much has been made about the limited coding capacity of animal mitochondria (22, 204) as well as of the stark contrast between these and plant mitochondria with their relatively large, complicated genomic architectures (83, 180). Compared to the little rickettsial genome with its 834 protein-coding sequences (20), the coding capacity of any mitochondrion is insignificant. The numerical range of protein-coding sequences in mitochondria extends from 2 in Plasmodium falciparum to 67 in Reclinomonas americana (77). The protein-coding capacities of all known plant, animal, and fungal mitochondria are nested between those of these two protists. In fact, most mitochondrial genomes can boast between 12 and 20 or so protein-coding genes along with rRNA and tRNA genes, which is not much with which to run an organelle.

If an α-proteobacterium such as Bartonella with its 1,600 genes was the ancestor of mitochondria, why did nearly all of the coding capacity of this genome disappear from the organelle? Alternatively, why does the mitochondrial genome have any protein-coding capacity at all? Again, why are most of the genes needed by mitochondria found in the nucleus and not in the mitochondrial genome? Finally, are the nuclear mitochondrial genes of bacterial origin or of eukaryotic origin?

Genome Degradation

A free-living bacterium that initiates a symbiotic relationship with another cell will be bathed in the metabolic intermediates of its host. These metabolites make some of the symbiont's genes redundant as long as it shares the host metabolism. Thus, we expect some genes in the symbiont to be neutralized by the host's biochemical activities. Neutralized genes are subject to mutational degradation (101). When genes required for the free-living mode are forfeited, the facultative symbiont has evolved into an obligate symbiont or an obligate parasite, with a coding capacity that can be extremely limited (70, 71, 72, 94, 147, 170, 172). For example, the obligate parasites of the genus Rickettsia, like mitochondria, have virtually no genes for amino acid or nucleoside biosynthesis, but their facultative parasitic relatives, the Bartonella spp., are fully able to produce these intermediates in their free-living mode (20; Andersson et al., unpublished data).

The Rickettsia, like their relatives the mitochondria, have a well-developed oxidative metabolism that exploits the Krebs cycle along with an ATP-generating electron transport chain that terminates with cytochrome oxidase. Both sorts of genomes are devoid of genes for anaerobic glycolysis, and this may be attributed to the fact that their respective hosts supply them with pyruvate as the precursor to the Krebs cycle. In contrast, the Bartonella genome has a complete glycolytic repertoire (B. Canbäck, U. C. M. Alsmark, S. G. E. Andersson, and C. G. Kurland, unpublished data). In effect, a good deal of the difference in the gene complements of these two bacteria, which amounts to circa 1 million base pairs, may simply be the difference between the needs of an obligate and a facultative parasite. Likewise, the difference between the 834 genes of the Rickettsia genome and the roughly 400 genes that specify mitochondrial functions (see below) may reflect differences in the needs of an infective parasite and those of a captive organelle.

These streamlining effects are to some extent a reflection of the population structure common to the genomes of obligate symbionts, obligate parasites, and cellular organelles (15). These sorts of genomes tend to propagate as asexual lineages, which are characterized by small population sizes. Under these conditions, sublethal deleterious mutations accumulate, and these may include the inactivation or loss of nonessential genes (15, 107, 117, 118). In the case of mitochondria, such mutations will be subject to purifying selection at the cellular level (28). However, the efficiency of this selection will depend on the population size and the magnitude of the selective disadvantage of the mutations. As a consequence, asexual genomes in small populations or in populations subject to recurrent bottlenecks will tend to be degraded by the inroads of weakly deleterious mutations (59), i.e., by Muller's ratchet. The ratchet has been demonstrated experimentally in bacteriophages (43) and in free-living bacteria (6). The influence of Muller's ratchet has also been inferred in the genomes of endosymbionts (35, 110, 130, 192) as well as in mitochondria (108, 117, 118).

Muller's ratchet may also account for some of the reduction in the effective gene complement of the evolving mitochondrion (59, 137). The magnitude of the influence of the ratchet on a genome is related to the degree of mutational diversity in the genome population. This follows from the fact that the mutation frequency per genome is likely to be proportional to the size of the genome. This implies that the larger genome characteristic of an early stage of mitochondrial evolution should have been more vulnerable to the inroads of Muller's ratchet than that of a modern mitochondrion (27).
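The dependence of the ratchet on the genomic mutation rate can be illustrated with a toy simulation. The sketch below is not the model of reference 27; it assumes a Wright-Fisher population of asexual genomes with no recombination and no back mutation, multiplicative fitness, and a Poisson number of new deleterious mutations per genome per generation. All parameter values and function names are illustrative assumptions. A larger genome is represented simply by a larger per-genome mutation rate.

```python
import math
import random


def poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1


def simulate_ratchet(pop_size, mutation_rate, selection, generations, seed=0):
    """Return the mutation count of the least-loaded class after each generation.

    pop_size      -- number of asexual genomes (kept constant)
    mutation_rate -- mean new deleterious mutations per genome per generation
    selection     -- fitness cost per mutation, 0 <= selection < 1
    """
    rng = random.Random(seed)
    loads = [0] * pop_size  # every genome starts mutation-free
    history = []
    for _ in range(generations):
        # Fitness declines multiplicatively with the number of mutations carried.
        weights = [(1.0 - selection) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Offspring inherit the parental load plus any new mutations.
        loads = [k + poisson(rng, mutation_rate) for k in parents]
        history.append(min(loads))
    return history


# Hypothetical comparison: a tenfold larger genome, hence a tenfold higher
# per-genome mutation rate, in populations of the same small size.
small = simulate_ratchet(pop_size=50, mutation_rate=0.05, selection=0.02,
                         generations=300, seed=1)
large = simulate_ratchet(pop_size=50, mutation_rate=0.5, selection=0.02,
                         generations=300, seed=1)
print("clicks of the ratchet, small genome:", small[-1])
print("clicks of the ratchet, large genome:", large[-1])
```

Each increase in the minimum load corresponds to one click of the ratchet: the least-mutated class has been lost and, without recombination or back mutation, cannot be regenerated. The returned series can therefore never decrease.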

So far we have considered genetic mechanisms that influence the size of the mitochondrial genome. We may also consider the molecular mechanisms that mediate the degradation of genomes. There are at least two different ways that sequences may be extirpated from a genome. One would be a slippage mechanism in which short runs of nucleotides are removed. This is a slow but sure way to delete sequences. Indeed, traces of this mechanism are observed in the highly derived genome of Rickettsia. For example, nearly one quarter of the Rickettsia prowazekii genome is noncoding sequence (7, 20). It is possible to study the mutation spectrum of noncoding sequences from different species of Rickettsia. Such a comparison shows that short deletions provide the dominant evolutionary mode in these sequences (8, 9). Thus, noncoding sequences that are thought to be mutation-degraded versions of nonessential coding sequences can slowly depart the genome by virtue of small deletions (7, 8, 9, 11, 15).

A more dramatic deletion mechanism and one that has left a more obvious signature on highly derived genomes is that attending intrachromosomal recombination at repeat sequences. This sort of recombination event has been observed as the most common mechanism of large-scale deletions in bacteria under laboratory conditions (143). Such deletions leave at least two signatures. First, they lead to the loss of intervening sequences between two repeat sequences along with the deletion of one of the repeats. Second, they lead to rearrangements of the flanking sequences surrounding the original repeat sequences. Such rearrangements may be detected in descendants of the deleted genome as the loss of highly conserved sequence motifs, such as those of common operons.
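The two signatures of such a deletion can be shown with a deliberately simplified sketch. The genome is represented here as an ordered list of gene labels, and recombination is assumed to occur between the first two direct copies of a named repeat. The gene labels and the function name are hypothetical.

```python
def delete_between_repeats(genome: list[str], repeat: str) -> list[str]:
    """Collapse the segment between the first two copies of `repeat`.

    The intervening genes and one copy of the repeat are lost, and the genes
    that flanked the two repeats end up on either side of a single copy.
    """
    positions = [i for i, gene in enumerate(genome) if gene == repeat]
    if len(positions) < 2:
        return list(genome)  # fewer than two copies: nothing to recombine
    first, second = positions[0], positions[1]
    return genome[: first + 1] + genome[second + 1:]


# Hypothetical chromosome segment with two direct repeats ("rep").
before = ["geneA", "rep", "geneB", "geneC", "rep", "geneD"]
after = delete_between_repeats(before, "rep")
print(after)
```

In this example geneB and geneC are lost together with one repeat copy, and geneA and geneD, originally separated by two genes, are brought into a new neighbourhood. If the lost or separated genes belonged to a conserved operon, the descendant genome would show the disrupted gene order described above.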

In the reduced genomes of Rickettsia, gene duplications common to other bacteria such as multiple rRNA operons and duplicated elongation factor Tu genes are missing (21, 171). These, along with short repeat sequences that are common in free-living bacteria, seem to have been consumed by intrachromosomal recombination in the genome of Rickettsia (20). In addition, the correlate is observed. Thus, the highly conserved operons for rRNA, proteins of the translation apparatus, and some metabolic enzymes are either gone from the Rickettsia genome or are retained in scrambled form (7, 19, 21, 171). Such depredations are even more in evidence in the genomes of mitochondria. These commonly have their minimal coding sequences arranged with little rhyme or reason, except among some primitive protists and plants (75, 77).

As mentioned earlier, the largest number of coding sequences observed so far in mitochondria belongs to Reclinomonas americana (77, 80, 111). Although it has only 67 protein-coding genes, it is a giant among mitochondria. There is much to recommend interest in this genome, which is in some ways very unlike other mitochondrial genomes (77). Like some other protists and plant mitochondrial genomes, it has recognizable gene motifs, such as the rRNA operon and the giant ribosomal protein cluster seen in bacteria (20, 80, 111). The presence of such motifs is hard to explain other than by the conservation of ancient bacterial motifs.

That R. americana's mitochondrial genome contains 18 protein-coding sequences not seen in other mitochondrial genomes is not as remarkable as the fact that all of the proteins found in all other mitochondrial genomes are among the remaining 49 protein-coding genes (77, 80, 111). This simple fact speaks forcefully for the monophyletic character of the mitochondrial lineages, particularly when it is recalled that there are hundreds of mitochondrial proteins coded by genes in the nuclei of eukaryotes (77).

Phylogenetic reconstructions with the coding sequences from the R. americana mitochondrion along with those from other mitochondria and from bacterial genomes are unambiguous: the α-proteobacteria Rickettsia and Bartonella have a common ancestor with the mitochondrial lineage. That common ancestor was a free-living bacterium with a genome that was probably larger than that of Bartonella and certainly much larger than those of Rickettsia as well as mitochondria. Nevertheless, there is a very important difference between the genomes of Rickettsia and those of mitochondria. While the proteome of Rickettsia is at most twice the size of the mitochondrial proteome, typically less than 10% of the mitochondrial proteome is encoded by the mitochondrial genome. Thus, there is a dimension to the reductive genome evolution of mitochondria that is not shared by the Rickettsia.
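The share of the mitochondrial proteome that is encoded in the organelle itself can be estimated from the gene counts quoted earlier. This sketch takes the rough figure of 400 proteins for the mitochondrial proteome, which in the text is based on yeast. Applying it to other organisms is an assumption made purely for illustration, and the labels and function name are likewise illustrative.

```python
# Rough size of the mitochondrial proteome quoted in the text for yeast;
# using it for other organisms is an assumption made for illustration.
MITO_PROTEOME_SIZE = 400

# Protein-coding genes in the mitochondrial genome, as quoted in the text.
mt_encoded = {
    "Plasmodium falciparum": 2,
    "typical genome (low end)": 12,
    "typical genome (high end)": 20,
    "Reclinomonas americana": 67,
}


def encoded_share(n_mt_genes: int, proteome_size: int = MITO_PROTEOME_SIZE) -> float:
    """Fraction of the mitochondrial proteome encoded by the organelle genome."""
    return n_mt_genes / proteome_size


for name, n_genes in mt_encoded.items():
    print(f"{name}: {encoded_share(n_genes):.1%}")
```

With 12 to 20 genes, the organelle genome accounts for about 3% to 5% of a proteome of this size, in line with the statement that it typically encodes less than 10%. Only the exceptional R. americana genome exceeds that figure under this assumption.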

Transfer to the Nucleus

There is a decisive difference in the evolutionary coupling between the host cell genome and the genome of an endocellular parasite and between the genomes of an organelle and of an endosymbiont (15). The fitness of the endosymbiont, like that of the organelle, is coupled positively to that of the cell. In contrast, the parasite's fitness is negatively coupled to the cell's fitness. Consequently, mutations that adversely affect the parasite will benefit the host genome. Conversely, deleterious mutations in the mitochondrion can be compensated for by changes in the host genome that enhance the combined fitness of the two genomes. Much of the evolution of the mitochondrion can be understood with the help of this distinction.

The asexual character of mitochondrial lineages suggests that they might be particularly vulnerable to Muller's ratchet, especially compared to nuclear genomes, with their well-developed sexual mechanisms (107, 117, 119). Thus, Muller's ratchet (132) might account for the fact that the vast majority of genes constituting the mitochondrial proteome are found in the nuclear genome. Simply stated, it is conceivable that genes in the mitochondria would have a much heavier mutational load than the same genes in the nucleus. If a transfer mechanism existed, it would be advantageous for the cell to move genes from the organelle to the nucleus. Indeed, the data for metazoan mitochondria tend to support this notion, because in these organisms, the mutation rates are much higher in their mitochondria than in their nuclei (204). Furthermore, it can be shown that as long as a transfer mechanism exists to shuttle genes from mitochondria to nucleus, the mutational load will inevitably drive genes to the nuclear genome (27). However, there is a serious limitation to this model: it is only applicable to a small fraction of this planet's eukaryotes, primarily the metazoans.

Thus, the genomes of plant mitochondria tend to be less mutation prone than plant nuclear genomes, and in fungi the mutation frequencies of the two genomes are more or less equivalent (117, 118, 146, 202). Nevertheless, the coding capacities of plant and yeast mitochondria are in general not very different from those of animal cells; in all cases, the overwhelming majority of mitochondrial genes are found in the nucleus (75, 77). From these observations, it follows that mutational load alone cannot drive the migration of organelle genes to the nuclei of organisms other than animals.

For the most realistic situation of eukaryotic organisms in finite populations, what is required in addition to random mutations is a biased transfer mechanism. When the transfer mechanism is adequately biased in the direction of the nucleus, it can overcome a mutational gradient in the opposite direction (27). For example, if a cellular transfer mechanism favors moving genes from mitochondria to the nucleus over transferring genes from the nucleus to mitochondria, it can do so as long as its bias is greater than the mutational bias of the nucleus compared to the mitochondrion.
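The condition stated above can be made concrete with a minimal sketch. It assumes, as a simplification of ours rather than a statement of the cited model (27), that both biases can be written as plain ratios of two rates. The function name, the parameter names, and all numbers in the usage lines are hypothetical.

```python
def transfer_bias_overcomes_mutation_bias(
    rate_mito_to_nucleus: float,
    rate_nucleus_to_mito: float,
    mutation_rate_nucleus: float,
    mutation_rate_mito: float,
) -> bool:
    """Return True when the transfer bias toward the nucleus exceeds the
    mutational bias of the nucleus relative to the mitochondrion.

    Assumption: each bias is a simple ratio of two rates. The comparison
    (to_nucleus / to_mito) > (mut_nucleus / mut_mito) is cross-multiplied
    so that a reverse transfer rate of zero does not divide by zero.
    """
    return (rate_mito_to_nucleus * mutation_rate_mito
            > rate_nucleus_to_mito * mutation_rate_nucleus)


# Hypothetical case: the nucleus mutates 10 times faster than the
# mitochondrion, but transfer is 1,000 times more likely toward the nucleus.
biased = transfer_bias_overcomes_mutation_bias(1e-5, 1e-8, 1e-8, 1e-9)

# Hypothetical case: transfer is unbiased, so the same mutational
# gradient is not overcome.
unbiased = transfer_bias_overcomes_mutation_bias(1e-5, 1e-5, 1e-8, 1e-9)
```

In the first case the transfer bias (1,000) exceeds the mutational bias (10) and the function returns True; in the second the transfer bias is 1 and it returns False.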

There are data indicating that transfers between mitochondrial and nuclear genomes are an ongoing evolutionary process (1, 31, 45, 126, 140, 144-146, 176, 177). Furthermore, there is an experimental system to study and quantify the transfer of sequences between mitochondrial and nuclear genomes in the yeast Saccharomyces cerevisiae (66, 175, 176, 178). Here, plasmids have been introduced into mitochondria, and the transfer of coding sequences from these plasmids to nuclei has been studied quantitatively (38, 82, 178, 190). The genetic data suggest that the primary pathway for the uptake of mitochondrial coding sequences that are transferred and expressed in the nucleus is provided by autophagy of mitochondria by cellular vacuoles (phagolysosomes). Nucleic acid fragments liberated by disintegration of mitochondria may become intermediates in the transfer to nuclei.

Thorsness and Fox (175, 176) have estimated rates of transfer of coding sequences to and from the mitochondria. For wild-type S. cerevisiae, there is roughly 1 transfer event to the nucleus from the mitochondria per 10^5 generations. Their experiment failed to detect transfer in the opposite direction, suggesting that this rate is less than 1 transfer/10^10 generations. Thus, in this experimental model, the transfer process is expressed at least 10^5 times more frequently from mitochondria to the nucleus than in the reverse direction. This means that we have found a highly polar process to transport genes from the mitochondria to the nuclei in S. cerevisiae. How general is this likely to be?
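The lower bound on the polarity of transfer follows from simple arithmetic on the two rates quoted above. The sketch below only restates that arithmetic; the constant and function names are ours, and the rates are written as generations per transfer event so that the calculation stays in exact integers.

```python
# Rates quoted in the text for wild-type S. cerevisiae, expressed as
# "generations per transfer event".
GENERATIONS_PER_EVENT_TO_NUCLEUS = 10**5     # roughly 1 event per 10^5 generations
MIN_GENERATIONS_PER_EVENT_TO_MITO = 10**10   # undetected: fewer than 1 per 10^10


def minimum_polarity(gens_to_nucleus: int, min_gens_to_mito: int) -> int:
    """Lower bound on how many times more frequent transfer to the nucleus
    is than transfer to the mitochondrion."""
    return min_gens_to_mito // gens_to_nucleus


polarity = minimum_polarity(GENERATIONS_PER_EVENT_TO_NUCLEUS,
                            MIN_GENERATIONS_PER_EVENT_TO_MITO)
print(f"transfer to the nucleus is at least {polarity:,} times more frequent")
```

Because the reverse transfer was never observed, the result is a floor and not an estimate of the true ratio.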

One indication of the generality of the transfer process is that autophagic vacuoles are ubiquitous in eukaryotes. In addition, it seems that exogenous fragments of nucleic acids do not normally get into mitochondria. Thus, the experimental transfer to mitochondria of exogenous coding sequences such as those on plasmids has been accomplished to our knowledge only in S. cerevisiae. This transfer requires what is referred to as high-velocity microprojectile bombardment (66). In effect, this unique experimental system requires that the transferred sequences be shot into the cells to effect penetration of some mitochondria. It would seem that effective physical barriers normally prevent transfer of coding sequences into mitochondrial genomes.

Another argument favoring the asymmetric transfer to nuclei by vacuoles is that these universal organelles consume the mitochondria as well as peroxisomes in the normal course of their function. It is this degradation process that apparently releases the fragments of nucleic acids that are taken up by the nuclei (38, 82, 178, 190). Obviously, the destruction of nuclei by vacuoles would be lethal to a cell, which would prevent vacuole-dependent gene transfer from nuclei to mitochondria by this route. For these reasons, we are inclined to believe that preferential transfer of coding sequences to nuclei is the rule rather than the exception. This does not preclude the transfer of sequences from the nucleus to mitochondria by other transfer processes (123, 177).

In order to be recruited by the mitochondrion, a protein-coding sequence transferred to the nucleus often requires an addressing signal to direct its product back to the organelle (24, 138, 162, 163). Splicing pathways such as those that support exon shuffling might accelerate the tagging of newly transferred genes with appropriate addressing sequences. Indeed, the discovery of an intron between an addressing sequence and a mitochondrial gene that had been transferred to the nucleus in some plants confirms this expectation (45, 140). Accordingly, it is conceivable that one reason that splicing systems spread through primitive eukaryotic nuclear genomes was to satisfy the need to tag newly transferred genes from mitochondria and chloroplasts with addressing signals.

The process that transfers a gene to the nucleus can be envisioned as a neutral process with several identifiable states (27). First, an inactive gene is transferred to the nucleus, while the active version is retained by the mitochondrion. Then, the tagging process provides an addressing sequence to the nuclear version so that both mitochondrion and nucleus have active versions of the gene. Finally, mutation inactivates and eventually deletes the mitochondrial gene, and the nuclear allele takes over its function.
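The sequence of states described above can be written down as a small state machine. This is an illustrative sketch only: the state names and the function are ours, and the starting state (an active gene in the mitochondrion and no nuclear copy) is implied by the text, not listed in it.

```python
from enum import Enum


class GeneState(Enum):
    MITO_ACTIVE_ONLY = "active gene in the mitochondrion, no nuclear copy"
    NUCLEAR_COPY_INACTIVE = "inactive nuclear copy, active mitochondrial gene"
    BOTH_ACTIVE = "active copies in both nucleus and mitochondrion"
    NUCLEAR_ACTIVE_ONLY = "nuclear copy has taken over, mitochondrial gene lost"


# Each state maps to (event, next state), following the order in the text.
TRANSITIONS = {
    GeneState.MITO_ACTIVE_ONLY:
        ("transfer of an inactive copy to the nucleus",
         GeneState.NUCLEAR_COPY_INACTIVE),
    GeneState.NUCLEAR_COPY_INACTIVE:
        ("tagging of the nuclear copy with an addressing sequence",
         GeneState.BOTH_ACTIVE),
    GeneState.BOTH_ACTIVE:
        ("mutational inactivation and deletion of the mitochondrial gene",
         GeneState.NUCLEAR_ACTIVE_ONLY),
}


def transfer_pathway(start: GeneState = GeneState.MITO_ACTIVE_ONLY) -> list:
    """Walk the transitions from `start` until no further step is defined."""
    path = [start]
    state = start
    while state in TRANSITIONS:
        _event, state = TRANSITIONS[state]
        path.append(state)
    return path


for step in transfer_pathway():
    print(step.name)
```

The sketch is deliberately one-directional and deterministic. It does not model the probabilities of each step or the possibility that a nuclear copy is lost before it is tagged.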

Recent events in the evolution of legumes have produced intermediate stages of this sort for the transfer of cox2 from mitochondria to the nucleus (1, 45, 140). The cox2 sequence in the nucleus of legumes seems not to have been copied from mitochondrial DNA. Since this nuclear sequence does not require editing for expression, we may infer that it was copied from an edited RNA fragment (45, 140). Therefore, this transferred gene was probably copied from an RNA fragment that was released by the destruction of mitochondria.

The neutral transfer model implies that eventually all of the genes of the mitochondria that can be transferred without ill effect will be transferred to the nucleus (27). Why then are there any genes left in contemporary mitochondrial genomes? Obviously, some genes remaining in mitochondrial genomes may be ones for which the transfer process is not neutral or for which the transfer requires very rare mutational events. If the entire group of mitochondrial genes is destined for transfer to the nucleus, sequence technology may have discovered the last ones poised for transfer. We recall here that only two such proteins are left in the mitochondrial genome with the most limited coding capacity. It has been suggested that these two genes must remain fixed in the organelle's genome because their protein products can regulate their own expression in tune with the redox potential of the mitochondria (5). Consistent with this interpretation is the identity of the two proteins encoded by all known mitochondrial genomes; both are cytochrome subunits (77).

In summary, we can identify evolutionary forces that tend to reduce the coding capacity of mitochondrial genomes. These evolutionary forces are either destructive, leading to the loss of coding sequences from the cell, or conservative, leading to the transfer of coding sequence to the nucleus. In the next section we also document the existence of an expansive evolutionary tendency that has supplemented the ancestral endosymbiont's proteome and transformed it into that of a cellular organelle.


THE YEAST MITOCHONDRIAL PROTEOME

The discovery of a major fraction of mitochondrial proteins that are not the descendants of a bacterial ancestor was entirely unexpected (95). This discovery required access to a definitive genome sequence and data defining the mitochondrial proteome of the same organism. For the moment, the yeast S. cerevisiae is the only eukaryote for which these two requirements have been met. In this section, we summarize and illustrate the phylogenetic reconstructions obtained for the mitochondrial proteome of S. cerevisiae.

The relative fraction of genes with homologues in bacteria and in eukaryotes for each of six functional categories is summarized in Fig. 1. Roughly one third of all the proteins are classified as ambiguous. These may cluster with bacterial taxa at the ends of long branches in the phylogenetic trees, or they may have homology to proteins from both bacterial and eukaryotic taxa (95). No attempt has been made to deduce their origins. In contrast, a cohort of circa 50 mitochondrial proteins that are clearly most closely related to α-proteobacteria has been identified. These support the identification of α-proteobacteria as ancestors of the mitochondria (95).


FIG. 1.   Schematic illustration of the relative fraction of mitochondrial proteins with bacterial homologues (black bars) and without bacterial homologues (hatched bars). (Data taken from Karlberg et al. [95].)

Surprisingly, half of the mitochondrial proteome of S. cerevisiae, circa 200 proteins, have no discernable alignments with any bacterial homologues (P < e-10). They cluster exclusively as eukaryotic homologues (95). The presence of a sizable eukaryotic cohort contradicts an expectation of the endosymbiotic theory, which implies that the mitochondrial proteome is exclusively the descendant of an ancestral bacterial proteome. The data suggest that the endosymbiotic theory requires modification.

It is evident from Fig. 1 that the phylogenetic clustering of the mitochondrial proteins into α-proteobacterial and eukaryotic homologues goes hand in hand with the functional profiles of the clusters. Thus, the bacterial homologues seem to be mainly involved in translation and energy metabolism. In contrast, the eukaryotic proteins are typically associated with transport and regulatory functions.

A way of viewing the phylogenetic differences between the different functional categories of proteins is as follows (17, 95). First, we imagine that the ancestral bacterial symbiont introduced genes encoding proteins that have been perpetuated as an essential core of proteins functioning in aerobic respiration, the tricarboxylic acid (TCA) cycle, and gene expression. Furthermore, we suggest that the evolution of the mitochondria from the endosymbiont required that novel accessory proteins that arose in the eukaryotic genome complement such core functions. Some of these eukaryotic homologues augment the functions of the core proteins by participating in the assembly of complexes, while others function in regulation. In addition, there is a larger group of eukaryotic proteins with novel gene functions, such as ATP and protein transport. We suggest that the coevolution of the core α-proteobacterial components and the complementary eukaryotic nuclear components transformed the endosymbiont into an organelle.

We have also observed a number of bacterial protein homologues that may be examples of horizontal gene transfers from diverse bacteria. This means that there are proteins descended from three sorts of genomic ancestors in the yeast mitochondrial proteome: (i) homologues descended from an α-proteobacterial ancestor that encode core functions, (ii) some ill-defined orthologues, some of which may have been recruited from a diverse group of bacteria through horizontal gene transfer, and (iii) the dominant group of eukaryotic proteins that have been recruited from the nuclear genome. In effect, the ancestral α-proteobacterial proteome consisting of 1,600 or more proteins has been reduced to roughly 50 proteins. This residual core has been complemented by circa 350 novel proteins, mostly recruited from the eukaryotic nuclear genome.
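The scale of this turnover is easy to check with the round numbers given above. The sketch below is bookkeeping only: the variable names are ours, and the figures are the approximate values quoted in the text ("1,600 or more", "roughly 50", "circa 350"), so the results are rough proportions, not measurements.

```python
# Approximate counts quoted in the text.
ANCESTRAL_PROTEOME = 1600   # "1,600 or more": treated here as a lower bound
RETAINED_CORE = 50          # proteins of clear alpha-proteobacterial descent
NOVEL_PROTEINS = 350        # proteins added to the residual core

# Size of the contemporary yeast mitochondrial proteome implied by the text.
current_proteome = RETAINED_CORE + NOVEL_PROTEINS

# Share of the ancestral proteome that survives in the organelle (an upper
# bound, because the ancestral count is a lower bound).
fraction_retained = RETAINED_CORE / ANCESTRAL_PROTEOME

# Share of the contemporary proteome that is not part of the retained core.
fraction_novel = NOVEL_PROTEINS / current_proteome

print(f"current proteome: about {current_proteome} proteins")
print(f"ancestral proteome retained: at most {fraction_retained:.1%}")
print(f"contemporary proteome outside the core: about {fraction_novel:.1%}")
```

On these figures, about 3% of the ancestral proteome persists, and it makes up about one eighth of the contemporary mitochondrial proteome.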

This interpretation rests on detailed phylogenetic analyses of the mitochondrial proteins of S. cerevisiae as well as of other eukaryotes that we present next. Here, all references to genes and their encoded proteins are for S. cerevisiae unless otherwise specified. Some relevant informational details may assist the reader in studying the phylogenetic reconstructions. All of the phylogenetic reconstructions are to be found at http://web1.ebc.uu.se/molev/publications/cfg2000 (95). The sequences of proteins of the yeast mitochondrial proteome are found in the Yeast Protein Database at http://www.proteome.com (87). The sequences of the mitochondrial genome of Reclinomonas americana are at http://megasun.bch.umontreal.ca/ogmp/projects/other/mtcomp.html. Sequences for the nematode Caenorhabditis elegans are in the database WormPep17 at ftp://ftp.sanger.ac.uk/pub/databases/wormpep/. Other sequences from the SwissProt and NCBI databases were also used for the phylogenetic reconstructions. Alignments of homologous proteins (P < e-10) were used to construct phylogenetic trees (95, 174). To structure the discussion, we have followed the somewhat arbitrary biochemical classification used by Karlberg et al. (95).

Energy Metabolism

ATP production in eukaryotes from glucose and oxygen normally consists of two catabolic processes: the conversion of glucose to pyruvate via a glycolytic pathway, and the oxidative conversion of pyruvate to H2O and CO2. Both Rickettsia and mitochondria rely for pyruvate on the glycolytic systems of their host cells. More than 100 mitochondrial enzymes are involved in the oxidation of pyruvate. In S. cerevisiae, the relevant enzymes correspond to five proteins in the pyruvate dehydrogenase complex, 14 proteins in the ATP synthase complex, 16 proteins in the TCA cycle, and more than 70 proteins in the respiratory chain complexes. The ATP produced by this system is exported into the cytoplasm by the ATP/ADP translocases, which transport 1 molecule of ATP in exchange for 1 molecule of ADP.
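The statement that more than 100 enzymes take part in pyruvate oxidation can be checked against the per-complex counts given for S. cerevisiae. This is a minimal tally; the dictionary name is ours, and the respiratory chain entry is a lower bound because the text says "more than 70".

```python
# Protein counts per system in S. cerevisiae, as quoted in the text.
PROTEINS_PER_SYSTEM = {
    "pyruvate dehydrogenase complex": 5,
    "ATP synthase complex": 14,
    "TCA cycle": 16,
    "respiratory chain complexes": 70,  # "more than 70": lower bound
}

# Because one entry is a lower bound, the sum is a lower bound as well.
minimum_total = sum(PROTEINS_PER_SYSTEM.values())
print(f"at least {minimum_total} proteins")
```

The four systems together account for at least 105 proteins, which is consistent with the figure of more than 100.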

The Rickettsia genome contains a similar cohort of proteins involved in the pyruvate dehydrogenase complex, the TCA cycle, the respiratory chain complex, and the ATP synthase complex. There are in addition five genes coding for ATP/ADP translocases, but these are required for the import of ATP from the host cell cytosol. Thus, Rickettsia and mitochondria have functionally related systems for ATP production (Fig. 2). However, ATP transport is somewhat different in these two systems (see below).


FIG. 2.   Schematic illustration of the bioenergetic machineries in mitochondria and Rickettsia. (Modified from reference 9.) Ac-CoA, acetyl-CoA.

The pyruvate dehydrogenase complex that converts pyruvate to acetyl coenzyme A (acetyl-CoA) consists of multiple copies of each of three enzymatic components: pyruvate dehydrogenase (E1, two subunits), dihydrolipoamide acetyltransferase (E2), and lipoamide dehydrogenase (E3), as summarized in Fig. 3A. All three mitochondrial subunits are encoded by the nuclear genome and are more closely related to their homologs from α-proteobacteria than to those from other bacteria (Fig. 3B, C, and D). A notable exception is the pdx1 gene, which encodes a protein required for anchoring the E3 to the E2 component in the complex. This protein has no bacterial homologues, so it may be a relatively recent addition to the pyruvate dehydrogenase complex.




FIG. 3.   (A) Schematic illustration of the pyruvate dehydrogenase complex. (B, C, and D) Phylogenetic reconstructions are based on the combined protein sequences of the α and β subunits of the pyruvate dehydrogenase E1 component (B), the dihydrolipoamide acetyltransferase E2 component (C), and the dihydrolipoamide dehydrogenase E3 component (D) from representative species. Names of species from the α-proteobacteria are shown in boldface and those from mitochondria (mit) are in italics in this and all subsequent figures where bootstrap numbers are indicated above the horizontal lines. The phylogenetic trees were constructed as described by Karlberg et al. (95).

Acetyl-CoA, produced by the pyruvate dehydrogenase complex, is fed into the TCA cycle (Fig. 4A). There are eight enzyme complexes in the TCA cycle, all of which are encoded by the nucleus in at least some eukaryotes. Three enzymes display particularly strong relationships to the α-proteobacteria, whereas the evolutionary history of the others is more complex. Genes encoding succinyl-CoA synthetase, succinate dehydrogenase, and fumarase seem to descend from an ancestral α-proteobacterium, and these have been transferred subsequently into the nucleus. For example, α-proteobacterial enzymes of the succinate dehydrogenase complex (Sdh) are closely related to their mitochondrial homologs whether these are encoded in the nucleus or in the mitochondrion (Fig. 4B). However, the membrane-anchoring subunit of this complex, Sdh4, has so far been found only in eukaryotes.




FIG. 4.   (A) Schematic illustration of the TCA cycle. (B, C, and D) Phylogenetic reconstructions based on the succinate dehydrogenase iron sulfur protein (B), malate dehydrogenase (C), and aconitase (D) from representative species. The phylogenetic trees were constructed as described by Karlberg et al. (95). gly, glycosome; per, peroxisome; cyt, cytosol; chl1 and chl2, chloroplast 1 and 2, respectively; mit, mitochondrion.

Most if not all of the genes encoding enzymes in the TCA cycle have been duplicated. The resulting paralogues are often recruited to different subcellular compartments. For example, there are three genes encoding malate dehydrogenases in the nuclear genome of S. cerevisiae. These are targeted to the mitochondrion, the cytoplasm, and the peroxisome. All three enzymes form a phylogenetic cluster which is closely related to the mitochondrial malate dehydrogenases in other species (Fig. 4C). This suggests that the yeast cytosolic and peroxisomal forms are derived relatively recently from the mitochondrial malate dehydrogenase. On the other hand, the mouse cytosolic and mitochondrial malate dehydrogenases are highly divergent, which suggests that they arose in a more ancient gene duplication. The malate dehydrogenase in R. prowazekii seems not to be particularly closely related to either the mitochondrial or the cytoplasmic form of malate dehydrogenases in eukaryotes.

There are in S. cerevisiae at least two genes encoding aconitase (aconitate hydratase). The bacterial homologues cluster more closely to the cytoplasmic homologues and are more distant from the mitochondrial ones (Fig. 4D). This suggests that the homologue recruited for cytoplasmic functions and that recruited to the mitochondrion may have had different ancestors. The history of isocitrate dehydrogenases is even more complex. There are two forms of isocitrate dehydrogenase, the NAD+- and the NADP+-specific forms, both of which have the same function in the TCA cycle. The NAD+-specific protein is a two-subunit protein in S. cerevisiae and a three-subunit protein in higher eukaryotes. The two yeast NAD+ subunits cluster closely together, suggesting that they have been derived from the same ancestral gene. However, the NADP+-specific isocitrate dehydrogenases are only remotely related to the NAD+-specific proteins. The phylogenetic analysis suggests that the cytoplasmic, mitochondrial, and chloroplast NADP+ homologues have a common origin but have been recruited to different subcellular compartments in a pattern that is species specific. The NADP+ homologues in mitochondria are most similar to those in Sphingomonas yanoikuyae (an α-proteobacterium) and Mycobacterium tuberculosis, whereas more distantly related paralogues are found in all other bacteria. The isocitrate dehydrogenase found in Rickettsia prowazekii is highly divergent from all of these enzymes.

There are several other examples of mitochondrial and cytosolic isoforms that are highly divergent from bacterial homologues. For example, the mitochondrial citrate synthase seems to have originated from within the eukaryotic genome and displays no sequence identity with its bacterial analogue. The phylogenetic analysis of the enzymes of the TCA cycle reveals a complex evolutionary network. The complexities include (i) some genes that have been transferred from an α-proteobacterial ancestor into the nuclear genome and retargeted to the mitochondrion, (ii) others that have been transferred from an α-proteobacterial genome to the nucleus but targeted to other subcellular compartments, and (iii) still others that seem to have been recruited from different bacterial or eukaryotic ancestors.

The electron transport system consists of three energy-coupling sites: (i) the NADH dehydrogenase complex, (ii) the cytochrome bc1 complex, and (iii) the cytochrome oxidase complex (Fig. 5A). Although some subunits of the NADH dehydrogenases from diverse eukaryotes are encoded by mitochondrial genomes, most are encoded by nuclear genomes. For example, the mitochondrial genome of Reclinomonas americana contains as many as 10 genes encoding subunits of the NADH dehydrogenase complex. This arrangement is reminiscent of the nine NADH dehydrogenase subunits located in immediate proximity to each other in Escherichia coli. A reduced version of this arrangement is also seen in R. prowazekii. Phylogenetic reconstructions based on a combined set of mitochondrial NADH dehydrogenase subunits suggest that these are derived from the α-proteobacteria (Fig. 5B). In contrast, only a single nuclear gene that encodes one of the subunits in the NADH dehydrogenase complex has been identified in S. cerevisiae. The corresponding protein is roughly 30% identical to the NADH dehydrogenase subunit of the γ-proteobacteria E. coli and Haemophilus influenzae. This low level of sequence identity suggests that the yeast homologue may not be a descendant of bacterial orthologues. Indeed, it seems likely that the bacterial NADH dehydrogenase genes may have been discarded from the yeast mitochondrial as well as nuclear genomes and replaced by eukaryotic analogues that are too dissimilar to be identified by current routines for identifying sequence similarity.



FIG. 5.   (A) Schematic illustration of the respiratory chain complex. (B and C) Phylogenetic reconstructions based on the combined protein sequences of NADH dehydrogenase I chains A, J, K, L, M, and N (B) and the combined protein sequences of cytochrome b and cytochrome c oxidase subunit I (C) from representative species. The phylogenetic trees were constructed as described by Karlberg et al. (95). (B) The arrow indicates the bootstrap value at that node.

The second coupling site of the electron transport system is the cytochrome bc1 complex, which contains a core of proteins such as cytochrome b, cytochrome c1, and the Rieske iron-sulfur protein. This complex has been isolated from many α-proteobacterial species, such as Paracoccus denitrificans, Rhodobacter capsulatus, Rhodospirillum rubrum, and Bradyrhizobium japonicum. The gene encoding cytochrome b is located in the mitochondrial genome and is closely related to its α-proteobacterial relatives (Fig. 5C). The nuclear gene encoding cytochrome c1 (cyc1) is also related to one found in γ-proteobacteria. Homologues for the Rieske iron-sulfur protein (rip1) have been found in α- and γ-proteobacteria as well as in cyanobacteria and green sulfur bacteria. All three mitochondrial proteins cluster closely with their α-proteobacterial relatives. In S. cerevisiae, the cytochrome bc1 complex is composed of as many as 10 subunits. The seven other components of this complex in yeast cells are related to eukaryotic homologues, but they are unrelated to any bacterial analogues. These seven appear to be later additions to the central core derived from α-proteobacteria.

The cytochrome oxidase complex in S. cerevisiae provides another example of a complex where proteins of eukaryotic descent have been added to a core of α-proteobacterial proteins. The core genes are coxI, coxII, and coxIII, encoding cytochrome c oxidase subunits I, II, and III, respectively. These three subunits are almost always mitochondrially encoded and cluster closely with their α-proteobacterial homologues (Fig. 5C). In contrast, none of the other yeast proteins in this complex can be aligned with bacterial homologues. These nonbacterial homologues are only occasionally found in other eukaryotes, which is consistent with the interpretation that they are recent nuclear contributions. Only 1 of the 13 yeast proteins responsible for the assembly of the cytochrome oxidase complex has a homologue in R. prowazekii. Four of these assembly proteins can be identified in the proteomes of other eukaryotes. The remaining assembly proteins for the cytochrome oxidase complex are specific for S. cerevisiae.

The gene order for the core components of the ATP synthase complex (Fig. 6A) is highly conserved among bacterial genomes (Fig. 6B). There are 14 proteins in the ATP synthase complex in S. cerevisiae, half of which can be identified as bacterial descendants. Three of the latter are encoded in yeast mitochondria (atp6, atp8, and atp9), and each displays a close phylogenetic relationship to α-proteobacterial homologues. Similarly, phylogenetic reconstruction with the concatenated alignments of proteins encoded by the nuclear genes for the α and γ subunits of the ATP synthase (atp1 and atp3) reveals a cluster with strong bootstrap support for mitochondrial and α-proteobacterial homologues (Fig. 6C). A similar cluster is also observed for the genes atp2 and atp5. Of the remaining seven yeast proteins in this complex, four are found only in eukaryotes and three appear to be specific to S. cerevisiae.
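The partition of the 14 yeast ATP synthase proteins described above can be laid out explicitly. The sketch below is bookkeeping only. It assumes that the seven genes named in the text are exactly the seven proteins of bacterial descent, which the text suggests but does not state outright; the variable names are ours.

```python
ATP_SYNTHASE_TOTAL = 14

# Genes of bacterial descent named in the text, grouped by coding genome.
MITO_ENCODED_BACTERIAL = {"atp6", "atp8", "atp9"}
NUCLEAR_ENCODED_BACTERIAL = {"atp1", "atp2", "atp3", "atp5"}

# The remaining proteins are given only as counts, not by name.
EUKARYOTE_ONLY_COUNT = 4
YEAST_SPECIFIC_COUNT = 3

bacterial_descendants = MITO_ENCODED_BACTERIAL | NUCLEAR_ENCODED_BACTERIAL
non_bacterial_count = EUKARYOTE_ONLY_COUNT + YEAST_SPECIFIC_COUNT

print(f"bacterial descent: {len(bacterial_descendants)} of {ATP_SYNTHASE_TOTAL}")
print(f"no bacterial homologue: {non_bacterial_count} of {ATP_SYNTHASE_TOTAL}")
```

Under that assumption the named genes account for the bacterial half of the complex (3 mitochondrially encoded plus 4 nucleus-encoded), and the two remaining groups account for the other seven.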



FIG. 6.   (A) Schematic representation of the ATP synthase complex and (B) organization of the ATP synthase genes. (C) Phylogenetic reconstructions based on the combined protein sequences of the α and γ subunits of the ATP synthase complex from representative species. The phylogenetic trees were constructed as described by Karlberg et al. (95). Chl/Nuc refers to chloroplast/nucleus.

In summary, phylogenetic reconstructions for the core components of the respiratory chain complexes identify these as α-proteobacterial descendants that are often encoded by the mitochondrial genomes of contemporary eukaryotes. Other components of the mitochondrial proteome descended from α-proteobacteria, such as atp1 and atp3, have been transferred to nuclei. Associated with these core components are accessory proteins such as those that assist in the assembly of complexes for which no homologues exist in bacteria. These seem to have evolved after the ancestral endosymbiotic event that contributed the core components. In total, 42% of the proteins participating in the aerobic ATP-generating system of yeast mitochondria have been found so far only in S. cerevisiae and other eukaryotes.

Information Processes

Processes such as replication, transcription, and translation are supported by more than one third of the mitochondrial proteome. Genes encoding components of these systems are typically located in the nuclear genome (Fig. 7), although the mitochondrial genomes of many protists encode different subsets of ribosomal proteins. In particular, almost half of the mitochondrial genome of Reclinomonas americana consists of genes involved in sequence information processing.


FIG. 7.   Schematic illustration of the import of cytosolic components for the gene expression systems in mitochondria and Rickettsia. AA, amino acid; RNA-P, RNA polymerase.

rRNA sequences have served as the key reference sequences for phylogenetic reconstructions (197). Indeed, the very first molecular data supporting the notion that mitochondria are derived from α-proteobacteria were obtained from phylogenetic reconstructions based on rRNA sequences (75, 197). More refined studies based on larger data sets suggested that there may be a particularly close relationship with the group of bacteria to which the Rickettsia belong, the Rickettsiaceae (Fig. 8) (75, 141, 197).


FIG. 8.   Phylogenetic reconstructions based on rRNA sequences from Zea mays mitochondria and representative α-proteobacterial species. The tree is drawn according to Olsen et al. (141).

The interpretation of these phylogenetic reconstructions is not free of ambiguities. That many mitochondrial genomes are organized with extreme economy provides one complication. Thus, there is a direct correlation between the size of the mitochondrial genomes and their rRNA genes (13). In particular, in animal mitochondrial genomes smaller than 20 kb, the small- and large-subunit rRNAs are as short as ca. 850 and ca. 1,500 nucleotides, respectively. These lengths can be compared to those of the plant mitochondrial genomes longer than 300 kb that encode rRNAs with ca. 1,500 and 3,000 nucleotides. Fortunately, a conserved core of nucleotide sequences retained in all rRNA sequences facilitates alignments of rRNA sequences from genomes as diverse as those of bacteria and animal mitochondria despite their substantial size variation.

Another complication is that mutation biases and rates of nucleotide substitutions vary markedly among the mitochondrial genomes. The small animal mitochondrial genomes have a strong A+T mutation bias and evolve more than 50 times faster than the large plant mitochondrial genomes. This may account for the particularly close phylogenetic relationship observed between α-proteobacteria and plant mitochondria, with their long rRNA sequences and slow rates of nucleotide substitution. Although the rapid, biased sequence evolution of rRNA in animal mitochondria complicates phylogenetic reconstruction, it is likely that these mitochondrial rRNAs also descend from those of the α-proteobacterial subdivision (75).

Only a small fraction of the genes for the ubiquitous core of mitochondrial ribosomal proteins are found in the organelle's genome. The largest ribosomal protein assembly in a mitochondrial genome is found in R. americana, which encodes 11 large-subunit ribosomal proteins and 7 small-subunit ribosomal proteins. The organization of genes for these proteins in the protist's genome resembles the super-ribosomal protein gene operon found in bacteria (Fig. 9A) (111). Indeed, much of the bacterial gene order, with some departures, is very well conserved in the mitochondrial genome of R. americana (77). A similar string of genes has been retained as a contiguous segment in the Bacillus subtilis genome. In the E. coli genome, this string has been divided into two segments, with the rif region located at 89 min and the ribosomal protein gene operons str-S10-spc and α located at 74 min (165). So much gene order seems to have been preserved in R. americana that it is tempting to infer that the entire string of genes associated with its ribosomal protein cluster is as it was in the genome of the ancestral endosymbiont.


FIG. 9.   (A) Schematic illustration of the organization of the ribosomal protein genes in Escherichia coli, Rickettsia prowazekii, and the mitochondrial (mit) genome of Reclinomonas americana. (B) Phylogenetic reconstructions based on the combined protein sequences of the ribosomal proteins S2, S7, S10, S12, S13, S14, S19, L5, L6, and L16 from representative species. The phylogenetic trees were constructed as described by Karlberg et al. (95).

Nevertheless, R. americana is exceptional. Most other mitochondrial genomes appear to have lost all traces of the gene order of the putative bacterial ancestor. In effect, the gene order of the R. prowazekii genome is intermediate between that of R. americana and the more representative genomes of mitochondria (14, 171). Thus, ancestral sequence motifs for genes encoding translation and transcription components are recognizable in R. prowazekii. However, these gene orders are slightly scrambled (Fig. 9A), presumably as a result of intrachromosomal recombination within a genome that originally was arranged as in some modern bacteria (19, 171).

Phylogenetic reconstructions based on the concatenated sequences of eight small ribosomal proteins and three large ribosomal proteins strongly support a clustering of the mitochondrial and alpha-proteobacterial sequences (Fig. 9B). A particularly interesting feature of this tree is that the nucleus-encoded yeast ribosomal proteins cluster with the mitochondrion-encoded ribosomal proteins. This supports the interpretation that ribosomal protein genes from the alpha-proteobacterial ancestor were transferred to the nucleus from a mitochondrial-endosymbiont intermediate.

The very small sizes of the rRNA sequences and the unusually large protein-RNA ratios in some mitochondria suggest that functions previously provided by the missing parts in these rRNA sequences may have been taken over by nucleus-encoded ribosomal proteins (T. O'Brien, personal communication). Indeed, the yeast mitochondrial ribosome contains a total of 23 small-subunit and 37 large-subunit proteins, many of which have no homologues in bacteria or other eukaryotes. Similarly, the human mitochondrial ribosome contains a large number of ribosomal proteins not found in bacteria. These examples of species-specific ribosomal proteins seem to represent novel solutions to the problems created by the tendency of mitochondria to delete short stretches of sequence from rRNA genes. Here, specific protein structures may replace the functions of deleted rRNA patches in the ribosomes of some mitochondria.

Most of the mitochondrial tRNAs are encoded by the organelle's genome. They are presumably the descendants of tRNAs from the ancestral endosymbiont (18, 98). As an apparent exception, several plant mitochondrial tRNA genes appear to have been replaced rather recently by chloroplast tRNA genes (75). It is conceivable that tRNA genes have crossed organelle boundaries several times in the evolutionary past.

The total number of mitochondrial tRNA genes varies in different species, with a minimum of only 22 tRNA genes in some animal mitochondria. It has been suggested that this reduction in the diversity of the tRNA population was associated with the codon reassignments characteristic of animal mitochondria (13, 37, 108). For example, AUA specifies methionine rather than isoleucine in animal and yeast mitochondria. AGA and AGG encode serine instead of arginine in the insect and echinoderm mitochondrial genes, while these same codons are termination codons in mammalian mitochondria. The standard tRNA-Arg(AGR) is absent in animal mitochondria. In insect and echinoderm systems, the translation of AGR codons is performed by the tRNA-Ser(AGY).

An absolute minimal tRNA set would theoretically consist of one tRNA per amino acid, i.e., 20 different tRNAs. However, since a minimum of two different tRNAs are required for the translation of arginine, isoleucine, serine, and leucine codons, a more realistic minimum of 24 different tRNAs might be required to translate the conventional genetic code. On the other hand, the diversity of the tRNA population could be further reduced if codon recognition patterns are altered so that a single tRNA can translate the amino acids encoded by three or six codons in the conventional genetic code (13, 108).
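
The arithmetic behind these numbers can be made explicit with a short calculation. The following Python sketch is purely illustrative and is not part of the analyses cited above. It tabulates the conventional genetic code, counts the codons assigned to each amino acid, and applies the simple rule implied in the text, namely the assumption that an amino acid encoded by three or six codons needs two tRNAs and every other amino acid needs one. The names STANDARD_CODE, codons_per_amino_acid, and minimal_trna_count are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Conventional genetic code with codons enumerated in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...); '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_per_amino_acid(code):
    """Number of sense codons assigned to each amino acid."""
    return Counter(aa for aa in code.values() if aa != "*")


def minimal_trna_count(code):
    """Assumed rule: two tRNAs for amino acids with 3 or 6 codons, else one."""
    counts = codons_per_amino_acid(code)
    return sum(2 if n in (3, 6) else 1 for n in counts.values())


counts = codons_per_amino_acid(STANDARD_CODE)
needs_two = sorted(aa for aa, n in counts.items() if n in (3, 6))

print(len(counts))                        # 20 amino acids
print(needs_two)                          # ['I', 'L', 'R', 'S']
print(minimal_trna_count(STANDARD_CODE))  # 24
```

Under this assumed rule, isoleucine (three codons) and leucine, arginine, and serine (six codons each) are the four amino acids that raise the count from 20 to 24.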

Thus, extreme pressure to minimize the set of tRNA species encoded in the mitochondrial genomes might lead to the following sorts of codon reassignments: AUA can be reassigned from isoleucine to methionine, UUG and UUA from leucine to phenylalanine, and AGA as well as AGG from arginine to serine or, alternatively, AGU and AGC from serine to arginine. Each of these reassignments could reduce the number of tRNA species required to translate all 64 triplets of the genetic code. Obviously, the reduction in the minimum number of tRNA isoacceptors requires that the remaining tRNA species expand their codon degeneracy. This may be done by reducing the contribution of the third codon position to function in what has been referred to as hyperwobble (108). A conversion of any of these codons into termination codons would serve the same purpose. Instances of each of these sense codon reassignments in animal mitochondria have been observed. Accordingly, it was suggested that the observed codon reassignments in animal mitochondria are a consequence of the minimization of the canonical set of tRNA genes inherited from the ancestral endosymbiont (13, 108).
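
The effect of such reassignments on the size of the tRNA set can be illustrated with a rough counting model. The Python sketch below is a simplification introduced here for illustration and is not the method of references 13 and 108. It assumes that one tRNA can read either a whole four-codon box (all four third-position bases, as in hyperwobble) or one two-codon half of a box (third-position pyrimidines or third-position purines), and that termination codons need no tRNA. The names STANDARD_CODE, reassign, and trna_count are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Conventional genetic code with codons enumerated in UCAG order;
# '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reassign(code, changes):
    """Return a copy of the code with the given codons reassigned."""
    new_code = dict(code)
    new_code.update(changes)
    return new_code


def trna_count(code):
    """Count tRNAs under the assumed whole-box / half-box reading model."""
    total = 0
    for first, second in product(BASES, repeat=2):
        box = first + second
        # Amino acids read in the pyrimidine-ending and purine-ending halves.
        y_half = {code[box + b] for b in "UC"} - {"*"}
        r_half = {code[box + b] for b in "AG"} - {"*"}
        if y_half == r_half and len(y_half) == 1:
            total += 1  # one tRNA reads the whole box
        else:
            total += len(y_half) + len(r_half)  # one tRNA per amino acid per half
    return total


aua_met = reassign(STANDARD_CODE, {"AUA": "M"})
uur_phe = reassign(STANDARD_CODE, {"UUA": "F", "UUG": "F"})
agr_ser = reassign(STANDARD_CODE, {"AGA": "S", "AGG": "S"})
aua_met_agr_stop = reassign(STANDARD_CODE, {"AUA": "M", "AGA": "*", "AGG": "*"})

print(trna_count(STANDARD_CODE))     # 24
print(trna_count(aua_met))           # 23
print(trna_count(uur_phe))           # 23
print(trna_count(agr_ser))           # 23
print(trna_count(aua_met_agr_stop))  # 22
```

Under these assumptions the conventional code requires 24 tRNAs and each of the reassignments listed above removes one. Combining AUA as methionine with AGA and AGG as termination codons gives 22, the same number as the minimum reported for some animal mitochondria, although a model this simple should be read only as an illustration of the argument.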

Point mutations such as those affecting tRNA-Leu(UUR) are particularly common in human mitochondrial diseases. A spontaneous suppressor of such a mutant has been identified as a heteroplasmic alteration of the anticodon of tRNA-Leu(CUN) that enables the suppressor to translate leucine codons of the form UUR (129). In this case the suppressor mutation is carried by approximately 10% of the mitochondrial genomes (129). A similar scenario might account for the loss of tRNA species and the evolution of codon reassignments during the evolution of animal mitochondria. Here, a mutant with an ancestral tRNA gene that has accumulated point mutations and/or deletions might be rescued by other mutant tRNA variants that expand their codon recognition range to compensate for the first tRNA variant's defects.

In the yeast Schizosaccharomyces pombe, a viable mutant tRNA isoacceptor with an altered codon recognition pattern has been identified (160). The existence of such viable mutants contradicts the common assumption that mutations that alter codon recognition patterns are lethal. Such tRNA variants with or without accompanying suppressor tRNA species may be fixed in a lineage of mitochondrial genomes. Here, the transition from the canonical tRNA ensemble to one with novel codon recognition patterns might require minimum levels of heteroplasmy to support a transition from the standard translation pattern to an atypical one.

The aminoacyl-tRNA synthetases of the mitochondria are largely descendants of the alpha-proteobacterial ancestor that have been transferred to nuclear genomes. However, the transfer pattern is not without its complexities. In contrast to ribosomal components, each of the aminoacyl-tRNA synthetases needs to interact with only a few molecules, such as the amino acids, ATP, and the tRNAs. This, along with the universality of their functions, may account for the fact that many aminoacyl-tRNA synthetases are associated with complex phylogenetic patterns that reflect the influence of horizontal gene transfers (199). The mitochondrial synthetases present opportunities for particularly complex phylogenetic patterns.

The initial association of the ancestral alpha-proteobacterium and its host brought together two complete translation systems with a total of 40 different aminoacyl-tRNA synthetases. The combination of the symbiont's synthetase complement in the presumptive organelle and the host's complement in the cytoplasm might have provided opportunities for extensive gene replacements and gene losses. In particular, after two billion years it is conceivable that the number of synthetases could have been reduced to a minimum of 20 that service both the mitochondrial and cytoplasmic compartments.

What is observed is more complex and only partially consistent with this conjecture. In S. cerevisiae, for example, three different aminoacyl-tRNA synthetases descended either from the symbiont or the host have been duplicated to serve the mitochondrion and the cytoplasm, while the complementary synthetase has been lost. Likewise, a total of four synthetases are each encoded by a single gene whose product most likely functions in the cytoplasm as well as in the mitochondrion. Surprisingly, a majority of the 20 nominal aminoacyl-tRNA synthetases in S. cerevisiae have retained the presumed ancestral pattern, represented by the presence of both a mitochondrial and a cytoplasmic protein of distinct phylogenetic origin. It is possible that the exceptional structures of mitochondrial tRNA species (see, for example, reference 108) and the requirement for these to be recognized by a coadapted synthetase have in some cases constrained the evolution of cell compartment-specific synthetase homologues.

We can imagine that the aminoacyl-tRNA synthetases have evolved in a three-stage process. (i) In gene transfer, the bacterial gene is transferred from the mitochondrion to the nuclear genome, which already contains an ancestral eukaryotic synthetase gene of the same specificity. Both genes are expressed, and the bacterially derived synthetase is targeted back to the mitochondrion, while the eukaryotic enzyme is targeted to the cytoplasm. (ii) In gene duplication and replacement, the nuclear and/or bacterial gene is duplicated, and one of the ancestral genes is replaced by the new paralogous gene copy. (iii) In functional duality and gene loss, signal sequences are added to the duplicated pair of genes so that their products can be recruited to the cytoplasm as well as to the mitochondrion. Finally, all of the unnecessary gene copies are purged by random mutation.

The first stage in this process seems to be represented by as many as 12 different aminoacyl-tRNA synthetases in S. cerevisiae, those for Glu, Phe, Leu, Met, Tyr, Asn, Asp, Trp, Ile, Lys, Ser, and Pro (Fig. 10A). Each synthetase is represented by a mitochondrial and a cytoplasmic form, each with a separate origin. The mitochondrial synthetases are normally similar to their bacterial homologues, but there is typically no specific clustering with the rickettsiae.




FIG. 10.   Schematic illustration of the evolution of aminoacyl-tRNA synthetases. The phylogenetic trees are based on glutamyl-tRNA synthetases (A), arginyl-tRNA synthetases (B), and histidyl-tRNA synthetases (C) from representative species. The phylogenetic trees were constructed as described by Karlberg et al. (95).

The disparity between rickettsial and mitochondrial synthetases has several sources. In the simplest case, the Rickettsia do not have an Asn tRNA synthetase. For other synthetases, the absence of this specific relationship is explained by the exceptional placement of Rickettsia in the phylogenetic tree of synthetases. Indeed, several examples of putative horizontal gene transfer events have been identified among the rickettsial aminoacyl-tRNA synthetases (20). For example, phylogenetic reconstructions suggest that the Rickettsia Met tRNA synthetase clusters more closely with a Mycobacterium tuberculosis homologue than with other alpha-proteobacterial versions of the enzyme. Rickettsia is also unusual in that it contains a class I Lys tRNA synthetase which is more similar to the class I Lys tRNA synthetase in the archaea than to the class II Lys tRNA synthetase of other proteobacteria. Finally, Rickettsia may have recruited the gene encoding the cytoplasmic Ile tRNA synthetase, since the rickettsial and the yeast cytoplasmic Ile tRN