Microbiology and Molecular Biology Reviews, December 2000, p. 786-820, Vol. 64, No. 4
1092-2172/00/$04.00+0
Copyright © 2000, American Society for Microbiology. All rights reserved.
Department of Molecular Evolution, Evolutionary Biology Centre, University of Uppsala, Uppsala SE 752 36,¹ and Department of Microbiology, Lund University, Lund SE 223 62,² Sweden
SUMMARY
INTRODUCTION
OX-TOX MODEL
FROM BACTERIAL GENOME TO VESTIGE
Genome Degradation
Transfer to the Nucleus
THE YEAST MITOCHONDRIAL PROTEOME
Energy Metabolism
Information Processes
Heat Shock Proteins
Biosynthesis, Regulation, and Transport
Summary
THE HOST
Archaeazoa
Hydrogenosomes
Hydrogen Exporter: Late
Hydrogen Exporter: Early
Eukaryotic Heterotrophy
Horizontal Transfer to and from Eukaryotes
Phylogenetic Inference and Gene Transfer
FUTURE DIRECTIONS
ACKNOWLEDGMENTS
REFERENCES
SUMMARY
The endosymbiotic theory for the origin of mitochondria requires substantial modification. The three identifiable ancestral sources to the proteome of mitochondria are proteins descended from the ancestral α-proteobacterial symbiont, proteins with no homology to bacterial orthologs, and diverse proteins with bacterial affinities not derived from α-proteobacteria. Random mutations in the form of deletions large and small seem to have eliminated nonessential genes from the endosymbiont-mitochondrial genome lineages. This process, together with the transfer of genes from the endosymbiont-mitochondrial genome to nuclei, has led to a marked reduction in the size of mitochondrial genomes. All proteins of bacterial descent that are encoded by nuclear genes were probably transferred by the same mechanism, involving the disintegration of mitochondria or bacteria by the intracellular membranous vacuoles of cells to release nucleic acid fragments that transform the nuclear genome. This ongoing process has intermittently introduced bacterial genes to nuclear genomes. The genomes of the last common ancestor of all organisms, in particular of mitochondria, encoded cytochrome oxidase homologues. There are no phylogenetic indications either in the mitochondrial proteome or in the nuclear genomes that the initial or subsequent function of the ancestor to the mitochondria was anaerobic. In contrast, there are indications that relatively advanced eukaryotes adapted to anaerobiosis by dismantling their mitochondria and refitting them as hydrogenosomes. Accordingly, a continuous history of aerobic respiration seems to have been the fate of most mitochondrial lineages. The initial phases of this history may have involved aerobic respiration by the symbiont functioning as a scavenger of toxic oxygen. The transition to mitochondria capable of active ATP export to the host cell seems to have required recruitment of eukaryotic ATP transport proteins from the nucleus. The identity of the ancestral host of the α-proteobacterial endosymbiont is unclear, but there is no indication that it was an autotroph. There are no indications of a specific α-proteobacterial origin to genes for glycolysis. In the absence of data to the contrary, it is assumed that the ancestral host cell was a heterotroph.
INTRODUCTION
Mitochondria are the ATP-generating organelles of eukaryotes, and in most organisms they are oxygen respiring. Roughly 2 billion years ago, the ambient oxygen tension of Earth's atmosphere increased rapidly. Here, rapidly means that the oxygen tension went from roughly 1% to more than 15% of present levels within less than 200 million years (88). Many believe that the origins of mitochondria as organelles in primitive eukaryotes can be associated with this environmental trauma (121).
Nevertheless, the Earth's atmosphere during the billions of years prior to this global oxygen shock was probably not the heavily reducing atmosphere suggested by Oparin (142). Geochemical evidence suggests that the oxygen tension in the atmosphere may have been as much as 1% of present levels from the very beginning (88, 157). In other words, during the entire history of the biosphere, oxygen was accessible at low levels in the atmosphere and quite possibly at higher levels locally. The continuous presence of oxygen matches the ancient origins of the terminal oxidases characteristic of mitochondria. Thus, the monophyletic lineage of cytochrome oxidases is well represented in the archaea, bacteria, and eukaryotes (40, 41, 104, 161, 166).
Phylogenetic reconstructions and distance measurements based on the
sequences of cytochrome c oxidase and cytochrome
b are consistent with divergence of mitochondria from
bacteria between 1.5 and 2.0 billion years ago (165).
Accordingly, the oxidative respiratory system that was introduced into
eukaryotes by way of the primitive mitochondrion was already an ancient
enzymatic system. There is now overwhelming support for the idea that
the vehicle that introduced the respiratory system into the eukaryotic lineages was an endosymbiotic α-proteobacterium (20, 77, 78, 79).
The endosymbiotic theory of plastid as well as of mitochondrial origins arose in the nineteenth century and was given new life by Margulis (121) precisely when molecular methods could begin to test some of its predictions. The discovery of mitochondrial genomes and the results of phylogenetic reconstructions with sequences for rRNA as well as for a few proteins strengthened confidence in this theory (32, 79, 206). As a consequence, when we reviewed the literature on codon preferences in this journal 10 years ago, we found it convenient to treat the mitochondrial genome as though it were just another kind of bacterial genome (12).
Since then, detailed comparisons and phylogenetic reconstruction with
relevant genome sequences have very much expanded our view of the
mitochondrion. Most informative have been the mitochondrial genomes of
protists (77, 80, 111), the nuclear genome of the yeast
Saccharomyces cerevisiae (87; http://www.proteome.com), and the genome of the α-proteobacterium Rickettsia prowazekii (20, 76). The
genomic comparisons show unambiguously that the coding
sequences of the mitochondrial genomes are predominantly the
descendants of α-proteobacterial homologues. Accordingly, some
version of the endosymbiotic theory is in all probability relevant to
the origins of mitochondria. However, to account for some of the new
data, this theory needs to be modified significantly.
First, it turns out that only a small fraction of all proteins
functioning in mitochondria are the descendants of the ancestral free-living α-proteobacterium. Most of the remaining proteins are
descendants of nuclear genes with no bacterial antecedents (17,
95). That most of the genes of the ancestral α-proteobacterium have disappeared from the mitochondrial genome has been understood for
some time (13, 14, 15, 16, 19, 20, 22, 77). The magnitude of
the loss can be estimated as follows. To our knowledge, the smallest
genome of a free-living α-proteobacterium is that of Bartonella henselae, with less than 2 × 10⁶ base pairs, and the largest is Bradyrhizobium japonicum, with 8.7 × 10⁶ base pairs (107, 155). Since the
Bartonella genome encodes 1,600 or more proteins (Andersson
et al., unpublished), we can take this figure as a conservative size
estimate for the proteome encoded by the free-living α-proteobacterial ancestor of mitochondria. What then accounts for the enormous size discrepancy between the coding capacity of α-proteobacterial and mitochondrial genomes? Here, we need to compare
1,600 proteins with the 67 proteins encoded by the mitochondrion with
the largest coding repertoire, that of Reclinomonas
americana (76, 77).
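The scale of this discrepancy can be made concrete with a short calculation. The sketch below uses only the two figures quoted above (about 1,600 proteins for the Bartonella genome and 67 for the R. americana mitochondrion); the variable names are illustrative and not part of the original analysis.

```python
# Figures quoted in the text; the variable names are illustrative only.
ancestral_proteome = 1600   # conservative estimate taken from the Bartonella genome
largest_mito_coding = 67    # proteins encoded by the R. americana mitochondrion

# Share of the estimated ancestral coding capacity still found in the
# most gene-rich mitochondrial genome known, and the share that is gone.
retained = largest_mito_coding / ancestral_proteome
lost = 1 - retained

print(f"retained in the organelle genome: {retained:.1%}")
print(f"lost or relocated: {lost:.1%}")
```

On these numbers, even the largest mitochondrial coding repertoire corresponds to roughly 4% of the conservative ancestral estimate.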
There are at least two large-scale reductive tendencies that account for the fact that contemporary mitochondrial genomes have evolved into mere vestiges of the ancestral genome. One is the massive loss of genes that are not essential to life in the eukaryotic cytosol. Thus, genes in the nuclear genome can replace many gene products originally encoded by ones in the endosymbiont's genome (6, 13, 15, 16, 108). This means that the suspension of purifying selection allows redundant genes in the mitochondria to be inactivated and deleted by random mutation (101). In addition, unique essential genes can be transferred to the nucleus if their protein products can be recruited from the cytosol for function in the mitochondrion (17, 27, 75, 95). A recent model for the evolution of mitochondrial genomes predicts that eventually, when such transfer is not destructive, all coding sequences will be displaced from the mitochondria to the nucleus (27).
Genes transferred to the nucleus can encode proteins that will be transported to the mitochondria by a specific transport system (138, 162). The same transport system can also assist mitochondria to recruit nonbacterial proteins encoded in the nucleus. For example, the nucleus of Saccharomyces cerevisiae contributes more than 400 proteins to the mitochondrion (87). Phylogenetic analysis suggests that half of the nuclear proteins that augment the mitochondrial proteome have no bacterial affinities (19, 95). This half is likely to be purely eukaryotic in origin (see below).
In effect, mitochondria have evolved in two distinctive modes. One is
the reductive mode that reflects an extreme adaptation to an
intracellular existence. The other is an expansive mode in which the
mitochondria are the beneficiaries of nuclear evolution. In the
following we document these modes of genome evolution. We stress the
importance of extending the analysis of mitochondria beyond the
relatively small and highly variable contribution of contemporary
mitochondrial genomes. A narrow focus on the genomes of the organelle
tends to obscure most of its evolutionary history. Since there are
descendants of both ancient α-proteobacterial genes and more recent
eukaryotic genes cooperating in mitochondrial functions, it is most
convenient to view the evolution of the organelle as the evolution of a
proteome. Viewed from this vantage, mitochondria no longer seem to be
just another sort of bacteria.
Space limitations have meant that we cannot do justice to the vast amount of information that is available on many other aspects of the mitochondrial genome. We recommend that interested readers supplement their background information with the aid of an excellent book, Mitochondrial Genomes (205), edited by Wolstenholme and Jeon. This book contains a chapter by Michael W. Gray (75) that is a must.
OX-TOX MODEL
The endosymbiotic theory, as explicated by Margulis
(121), was an eclectic formulation that concerned much about
cellular evolution besides the origins of mitochondria and plastids,
but convention has narrowed the common use of the term. By 1998, the standard model to describe the origins of mitochondria was quite specific: the endosymbiont was identified as an α-proteobacterium, such as Paracoccus, and the host as an
archaeon (50, 75, 79, 137). An important aspect of the
standard model in all of its shifting forms is that it assumes, often
tacitly, that the symbiosis leading to the mitochondrial lineages
involved an exchange of ATP produced aerobically by the symbiont for
organics provided by the anaerobic host.
This view was challenged recently on biochemical grounds. In particular, it was recognized that a free-living bacterium such as Paracoccus would probably not be able to actively transport ATP to a prospective host because bacteria do not in general have ATP exporters (10, 20, 124, 185). Two additional observations, to which we return below, are relevant. First, only two endocellular parasitic bacteria are known to have ATP transport proteins. These are importers of ATP that are clearly related to plastid homologues but unrelated to the ATP exporters of mitochondria (10, 17, 20, 201). Second, the ATP transporters of mitochondria seem to have evolved after the divergence of eukaryotes (17, 20, 95). From where do the ATP transport functions of the mitochondria come?
It turns out that this question of seemingly small detail opens into
much larger issues. In particular, if the initial symbiotic relationship between the α-proteobacterium and its host did not depend on the sharing of ATP produced through the aerobic respiration of the symbiont, what was that relationship? Two current views of the
initial symbiotic relationship between the ancestor of mitochondria and
its host have emerged. One sort of model favors an evolutionary path
that is initially supported by anaerobic syntrophy (115,
124). The other, which involves aerobic mutualism (17), we describe first.
As noted above, all coding sequences of characterized mitochondria are
found in the mitochondrial genome of R. americana
(77). This, along with the clear similarity of these to
coding sequences of R. prowazekii and B. henselae
(20; Andersson et al., unpublished data), suggests
that the mitochondria arose only once and that they, together with the
putative α-proteobacterial ancestor, make up a monophyletic lineage.
This is consistent with the supposition that the initial endosymbiotic
relationship to the ancestral host was an aerobic one.
Nevertheless, there are two details of this putative aerobic scenario that are challenging. First, it is not possible at present to identify the host of the ancestral symbiont with any confidence. Current preferences vacillate between an archaeon and a primitive eukaryote as the likely host (see, for example, references 76 and 131). Either way, we assume for simplicity that the host was a heterotroph that could provide the endosymbiont with substrates such as pyruvate. This is not a drastic assumption because of the near ubiquity of glycolytic pathways among archaea, bacteria, and eukaryotes (46). In addition, it is known that cytochrome oxidases may function as cytochrome c oxidases, quinol oxidases, or nitrogen oxide (NO) oxidases. Nevertheless, all members of this gene family belong to the same monophyletic lineage, and all three may have been present in the last common ancestor (40, 41). This, together with the particularly close monophyletic relationship between the cytochromes employed by bacteria and by mitochondria, is strong evidence that the ancestral endosymbiont had already acquired an aerobic respiratory chain (20, 77, 165). We return to the origins of eukaryotic heterotrophy below.
Second, as mentioned, neither transporters such as the ATP/ADP translocases of Rickettsia and Chlamydia nor protein transport systems such as those that recruit proteins into the mitochondria are found among free-living bacteria. This means that there is no reason to suppose that the ancestral endosymbiont could export ATP or import proteins, as do modern mitochondria. Instead, we suggest that initially the aerobic symbiont interacted with its host in ways that are not characteristic of modern mitochondria.
One possibility is that the ancestral α-proteobacterium was an
aerobic symbiont that consumed oxygen with the aid of a respiratory chain ending in cytochrome oxidase and that, in return, its
heterotrophic anaerobic host made pyruvate accessible (17,
95). Here, the host may not have benefited initially by
sequestering the ATP produced aerobically by the symbiont. Instead, it
is suggested that the consumption of oxygen per se constituted the
service provided by the ancestral symbiont in the initial phase of the evolution of mitochondria. In effect, the cytochrome oxidase activity of the symbiont detoxified the host cytosol by converting oxygen to
water. The benefit here is that elements of the host's anaerobic metabolism that were sensitive to oxygen would be protected by the
activities of the endosymbiont's cytochrome oxidase.
The credibility of this conjecture derives from numerous examples of modern symbiotic relationships with an oxygen-scavenging function assigned to one of the partners (60, 61, 62). Roughly two billion years ago, the oxygen tension increased from less than 1.5% to greater than 15% of present levels (88). At that time, the demands for an oxygen-consuming symbiont to support an essentially anaerobic host would have been, if anything, more pressing than they are today. Thus, in modern organisms, activities such as peroxidases, catalases, and superoxide dismutases protect cells against the toxic effects of oxygen respiration. These activities might not have been so widespread two billion years ago. Indeed, even some modern cells are killed or debilitated by exposure to less than ambient atmospheric oxygen tensions (60, 61, 62).
In the Ox-Tox model, the evolution of the mitochondrion from the endosymbiont required the evolution of characteristic mitochondrial control and export functions that were derived from nuclear genes. The evolution of novel nuclear gene products for recruitment by the mitochondria is typified by the integration of the ATP/ADP translocase into the workings of the primitive mitochondrion. Thus, this activity is found universally in mitochondria, which dates its debut to a time prior to the divergence of the major branches of eukaryotes. This novel recruit to the mitochondrial proteome made possible an efficient supply of ATP to the host cell from the evolving mitochondrion. For this reason, the integration of the ATP/ADP translocase into the workings of the endosymbiont may be taken as a marker for the transformation of the endosymbiont into an organelle.
FROM BACTERIAL GENOME TO VESTIGE
One group of α-proteobacteria, the rickettsiae, are of special
interest both as models for the evolution of mitochondria and as
possible descendants of the endosymbiotic ancestor to mitochondria (11, 14, 58, 81). These organisms, like the putative
ancestor of the mitochondria, are thought to be the descendants of
free-living α-proteobacteria (191, 197). Furthermore,
phylogenetic reconstructions for diverse protein-coding sequences
suggest that rickettsiae are the closest modern relatives of the
mitochondria (11, 81, 141, 166, 186). Once the genome
sequence of Rickettsia prowazekii (20) and that
of its close relative, Bartonella henselae (Andersson et
al., unpublished data), became available, their intimate phylogenetic relationships to mitochondrial genomes became incontrovertible. Nevertheless, there is a glaring gap between, on the one hand, discovering this phylogenetic relationship and, on the other, understanding precisely what that ancestor was.
Furthermore, there is the enormous discrepancy between the number of
coding sequences in mitochondria and that in free-living α-proteobacteria. Much has been made about the limited coding capacity of animal mitochondria (22, 204) as well as of the stark contrast between these and plant mitochondria with their relatively large, complicated genomic architectures (83,
180). Compared to the little rickettsial genome with its 834 protein-coding sequences (20), the coding capacity of any
mitochondrion is insignificant. The numerical range of protein-coding
sequences in mitochondria extends from 2 in Plasmodium
falciparum to 67 in Reclinomonas americana
(77). The protein-coding capacities of all known plant,
animal, and fungal mitochondria are nested between those of these two
protists. In fact, most mitochondrial genomes can boast between 12 and
20 or so protein-coding genes along with rRNA and tRNA genes, which is not much with which to run an organelle.
If an α-proteobacterium such as Bartonella with its 1,600 genes was the ancestor of mitochondria, why did nearly all of the coding capacity of this genome disappear from the organelle?
Alternatively, why does the mitochondrial genome have any
protein-coding capacity at all? Again, why are most of the genes needed
by mitochondria found in the nucleus and not in the mitochondrial
genome? Finally, are the nuclear mitochondrial genes of bacterial
origin or of eukaryotic origin?
Genome Degradation
A free-living bacterium that initiates a symbiotic relationship with another cell will be bathed in the metabolic intermediates of its host. These metabolites make some of the symbiont's genes redundant as long as it shares the host metabolism. Thus, we expect some genes in the symbiont to be neutralized by the host's biochemical activities. Neutralized genes are subject to mutational degradation (101). When genes required for the free-living mode are forfeited, the facultative symbiont has evolved into an obligate symbiont or an obligate parasite, with a coding capacity that can be extremely limited (70, 71, 72, 94, 147, 170, 172). For example, the obligate parasites of the genus Rickettsia, like mitochondria, have virtually no genes for amino acid or nucleoside biosynthesis, but their facultative parasitic relatives, the Bartonella spp., are fully able to produce these intermediates in their free-living mode (20; Andersson et al., unpublished data).
The Rickettsia, like their relatives the mitochondria, have a well-developed oxidative metabolism that exploits the Krebs cycle along with an ATP-generating electron transport chain that terminates with cytochrome oxidase. Both sorts of genomes are devoid of genes for anaerobic glycolysis, and this may be attributed to the fact that their respective hosts supply them with pyruvate as the precursor to the Krebs cycle. In contrast, the Bartonella genome has a complete glycolytic repertoire (B. Canbäck, U. C. M. Alsmark, S. G. E. Andersson, and C. G. Kurland, unpublished data). In effect, a good deal of the difference in the gene complements of these two bacteria, which amounts to circa 1 million base pairs, may simply be the difference between the needs of an obligate and a facultative parasite. Likewise, the difference between the 834 genes of the Rickettsia genome and the roughly 400 genes that specify mitochondrial functions (see below) may reflect differences in the needs of an infective parasite and those of a captive organelle.
These streamlining effects are to some extent a reflection of the population structure common to the genomes of obligate symbionts, obligate parasites, and cellular organelles (15). These sorts of genomes tend to propagate as asexual lineages, which are characterized by small population sizes. Under these conditions, sublethal deleterious mutations accumulate, and these may include the inactivation or loss of nonessential genes (15, 107, 117, 118). In the case of mitochondria, such mutations will be subject to purifying selection at the cellular level (28). However, the efficiency of this selection will depend on the population size and the magnitude of the selective disadvantage of the mutations. As a consequence, asexual genomes in small populations or in populations subject to recurrent bottlenecks will tend to be degraded by the inroads of weakly deleterious mutations (59), i.e., by Muller's ratchet. The ratchet has been demonstrated experimentally in bacteriophages (43) and in free-living bacteria (6). The influence of Muller's ratchet has also been inferred in the genomes of endosymbionts (35, 110, 130, 192) as well as in mitochondria (108, 117, 118).
Muller's ratchet may also account for some of the reduction in the effective gene complement of the evolving mitochondrion (59, 137). The magnitude of the influence of the ratchet on a genome is related to the degree of mutational diversity in the genome population. This follows from the fact that the mutation frequency per genome is likely to be proportional to the size of the genome. This implies that the larger genome characteristic of an early stage of mitochondrial evolution should have been more vulnerable to the inroads of Muller's ratchet than that of a modern mitochondrion (27).
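The dependence of the ratchet on the per-genome mutation rate can be illustrated with a toy simulation. What follows is a minimal sketch under stated assumptions, not a model taken from the cited work: a fixed-size asexual population, multiplicative fitness costs, Poisson-distributed new mutations, and no back mutation. The function names and all parameter values (population size, mutation rates, selection coefficient) are hypothetical and chosen only for illustration. The larger mutation rate stands in for a larger genome, following the proportionality argued above.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson variate (Knuth's method; adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def ratchet(pop_size, mut_rate, sel_coeff, generations, seed=1):
    """Return the mutation count of the least-loaded genome in each generation."""
    rng = random.Random(seed)
    loads = [0] * pop_size          # deleterious mutations carried by each genome
    least_loaded = []
    for _ in range(generations):
        # Selection: fitness falls multiplicatively with each mutation.
        weights = [(1 - sel_coeff) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Asexual reproduction: offspring inherit the parental load plus
        # new mutations; there is no recombination and no back mutation.
        loads = [k + poisson(rng, mut_rate) for k in parents]
        least_loaded.append(min(loads))
    return least_loaded

# Hypothetical parameters: a tenfold difference in per-genome mutation rate
# represents a tenfold difference in genome size.
small_genome = ratchet(pop_size=50, mut_rate=0.1, sel_coeff=0.02, generations=300)
large_genome = ratchet(pop_size=50, mut_rate=1.0, sel_coeff=0.02, generations=300)
print("least-loaded class, small genome:", small_genome[-1])
print("least-loaded class, large genome:", large_genome[-1])
```

Because offspring can only gain mutations, the load of the best genome in the population never decreases; each increase is one click of the ratchet. Under these assumptions one would expect the run with the higher per-genome mutation rate to accumulate clicks faster, although the exact numbers depend on the arbitrary parameters and the random seed.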
So far we have considered genetic mechanisms that influence the size of the mitochondrial genome. We may also consider the molecular mechanisms that mediate the degradation of genomes. There are at least two different ways that sequences may be extirpated from a genome. One would be a slippage mechanism in which short runs of nucleotides are removed. This is a slow but sure way to delete sequences. Indeed, traces of this mechanism are observed in the highly derived genome of Rickettsia. For example, nearly one quarter of the Rickettsia prowazekii genome is noncoding sequence (7, 20). It is possible to study the mutation spectrum of noncoding sequences from different species of Rickettsia. Such a comparison shows that short deletions provide the dominant evolutionary mode in these sequences (8, 9). Thus, noncoding sequences that are thought to be mutation-degraded versions of nonessential coding sequences can slowly depart the genome by virtue of small deletions (7, 8, 9, 11, 15).
A more dramatic deletion mechanism and one that has left a more obvious signature on highly derived genomes is that attending intrachromosomal recombination at repeat sequences. This sort of recombination event has been observed as the most common mechanism of large-scale deletions in bacteria under laboratory conditions (143). Such deletions leave at least two signatures. First, they lead to the loss of intervening sequences between two repeat sequences along with the deletion of one of the repeats. Second, they lead to rearrangements of the flanking sequences surrounding the original repeat sequences. Such rearrangements may be detected in descendants of the deleted genome as the loss of highly conserved sequence motifs, such as those of common operons.
In the reduced genomes of Rickettsia, gene duplications common to other bacteria such as multiple rRNA operons and duplicated elongation factor Tu genes are missing (21, 171). These, along with short repeat sequences that are common in free-living bacteria, seem to have been consumed by intrachromosomal recombination in the genome of Rickettsia (20). In addition, the correlate is observed. Thus, the highly conserved operons for rRNA, proteins of the translation apparatus, and some metabolic enzymes are either gone from the Rickettsia genome or are retained in scrambled form (7, 19, 21, 171). Such depredations are even more in evidence in the genomes of mitochondria. These commonly have their minimal coding sequences arranged with little rhyme or reason, except among some primitive protists and plants (75, 77).
As mentioned earlier, the largest number of coding sequences observed so far in mitochondria belongs to Reclinomonas americana (77, 80, 111). Although it has only 67 protein-coding genes, it is a giant among mitochondria. There is much to recommend interest in this genome, which is in some ways very unlike other mitochondrial genomes (77). Like some other protists and plant mitochondrial genomes, it has recognizable gene motifs, such as the rRNA operon and the giant ribosomal protein cluster seen in bacteria (20, 80, 111). The presence of such motifs is hard to explain other than by the conservation of ancient bacterial motifs.
That R. americana's mitochondrial genome contains 18 protein-coding sequences not seen in other mitochondrial genomes is not as remarkable as the fact that all of the proteins found in all other mitochondrial genomes are among the remaining 49 protein-coding genes (77, 80, 111). This simple fact speaks forcefully for the monophyletic character of the mitochondrial lineages, particularly when it is recalled that there are hundreds of mitochondrial proteins coded by genes in the nuclei of eukaryotes (77).
Phylogenetic reconstructions with the coding sequences from
the R. americana mitochondrion along with those from
other mitochondria and from bacterial genomes are unambiguous: the α-proteobacteria Rickettsia and Bartonella have
a common ancestor with the mitochondrial lineage. That common ancestor
was a free-living bacterium with a genome that was probably larger than
that of Bartonella and certainly much larger than those of
Rickettsia as well as mitochondria. Nevertheless, there is a
very important difference between the genomes of Rickettsia
and those of mitochondria. While the proteome of Rickettsia
is at most twice the size of the mitochondrial proteome, typically less
than 10% of the mitochondrial proteome is encoded by the mitochondrial
genome. Thus, there is a dimension to the reductive genome
evolution of mitochondria that is not shared by the
Rickettsia.
Transfer to the Nucleus
The evolutionary coupling between a host cell genome and the genome of an endocellular parasite differs decisively from the coupling between a host cell genome and the genome of an organelle or an endosymbiont (15). The fitness of the endosymbiont, like that of the organelle, is coupled positively to that of the cell. In contrast, the parasite's fitness is negatively coupled to the cell's fitness. Consequently, mutations that adversely affect the parasite will benefit the host genome. Conversely, deleterious mutations in the mitochondrion can be compensated for by changes in the host genome that enhance the combined fitness of the two genomes. Much of the evolution of the mitochondrion can be understood with the help of this distinction.
The asexual character of mitochondrial lineages suggests that they might be particularly vulnerable to Muller's ratchet, especially compared to nuclear genomes, with their well-developed sexual mechanisms (107, 117, 119). Thus, Muller's ratchet (132) might account for the fact that the vast majority of genes constituting the mitochondrial proteome are found in the nuclear genome. Simply stated, it is conceivable that genes in the mitochondria would have a much heavier mutational load than the same genes in the nucleus. If a transfer mechanism existed, it would be advantageous for the cell to move genes from the organelle to the nucleus. Indeed, the data for metazoan mitochondria tend to support this notion, because in these organisms, the mutation rates are much higher in their mitochondria than in their nuclei (204). Furthermore, it can be shown that as long as a transfer mechanism exists to shuttle genes from mitochondria to nucleus, the mutational load will inevitably drive genes to the nuclear genome (27). However, there is a serious limitation to this model: it is only applicable to a small fraction of this planet's eukaryotes, primarily the metazoans.
Thus, the genomes of plant mitochondria tend to be less mutation prone than plant nuclear genomes, and in fungi the mutation frequencies of the two genomes are more or less equivalent (117, 118, 146, 202). Nevertheless, the coding capacities of plant and yeast mitochondria are in general not very different from those of animal cells; in all cases, the overwhelming majority of mitochondrial genes are found in the nucleus (75, 77). From these observations, it follows that mutational load alone cannot drive the migration of organelle genes to the nuclei of organisms other than animals.
For the most realistic situation of eukaryotic organisms in finite populations, what is required in addition to random mutations is a biased transfer mechanism. When the transfer mechanism is adequately biased in the direction of the nucleus, it can overcome a mutational gradient in the opposite direction (27). For example, a cellular transfer mechanism that favors moving genes from mitochondria to the nucleus over transferring genes from the nucleus to mitochondria can relocate genes to the nucleus as long as its bias is greater than the mutational bias of the nucleus compared to the mitochondrion.
There are data indicating that transfers between mitochondrial and nuclear genomes are an ongoing evolutionary process (1, 31, 45, 126, 140, 144-146, 176, 177). Furthermore, there is an experimental system to study and quantify the transfer of sequences between mitochondrial and nuclear genomes in the yeast Saccharomyces cerevisiae (66, 175, 176, 178). Here, plasmids have been introduced into mitochondria, and the transfer of coding sequences from these plasmids to nuclei has been studied quantitatively (38, 82, 178, 190). The genetic data suggest that the primary pathway for the uptake of mitochondrial coding sequences that are transferred and expressed in the nucleus is provided by autophagy of mitochondria by cellular vacuoles (phagolysosomes). Nucleic acid fragments liberated by disintegration of mitochondria may become intermediates in the transfer to nuclei.
Thorsness and Fox (175, 176) have estimated rates of transfer of coding sequences to and from the mitochondria. For wild-type S. cerevisiae, there is roughly 1 transfer event to the nucleus from the mitochondria per 10^5 generations. Their experiment failed to detect transfer in the opposite direction, suggesting that this rate is less than 1 transfer/10^10 generations. Thus, in this experimental model, the transfer process is expressed at least 10^5 times more frequently from mitochondria to the nucleus than in the reverse direction. This means that we have found a highly polar process to transport genes from the mitochondria to the nuclei in S. cerevisiae. How general is this likely to be?
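The size of this asymmetry follows directly from the two rates quoted above. The short Python sketch below simply divides one by the other; the constant and function names are illustrative choices of ours, and the reverse rate is only an upper bound set by the detection limit of the experiment, so the result is a minimum estimate rather than a measured value.

```python
# Rates quoted in the text for wild-type S. cerevisiae, in events per generation.
MITO_TO_NUCLEUS_RATE = 1 / 1e5     # roughly 1 transfer per 10^5 generations
NUCLEUS_TO_MITO_UPPER = 1 / 1e10   # no transfer detected; upper bound only


def minimum_fold_bias(forward_rate: float, reverse_upper_bound: float) -> float:
    """Return the smallest forward/reverse ratio compatible with the data.

    Because the reverse rate is an upper bound, the true ratio can only be
    equal to or larger than the value returned here.
    """
    if reverse_upper_bound <= 0:
        raise ValueError("reverse_upper_bound must be positive")
    return forward_rate / reverse_upper_bound


fold = minimum_fold_bias(MITO_TO_NUCLEUS_RATE, NUCLEUS_TO_MITO_UPPER)
print(f"Transfer to the nucleus is favored at least {fold:.0e}-fold")
```

A lower detection limit for the reverse direction would raise this minimum; it could not lower it.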
One indication of the generality of the transfer process is that autophagic vacuoles are ubiquitous in eukaryotes. In addition, it seems that exogenous fragments of nucleic acids do not normally get into mitochondria. Thus, the experimental transfer to mitochondria of exogenous coding sequences such as those on plasmids has been accomplished to our knowledge only in S. cerevisiae. This transfer requires what is referred to as high-velocity microprojectile bombardment (66). In effect, this unique experimental system requires that the transferred sequences be shot into the cells to effect penetration of some mitochondria. It would seem that effective physical barriers normally prevent transfer of coding sequences into mitochondrial genomes.
Another argument favoring the asymmetric transfer to nuclei by vacuoles is that these universal organelles consume the mitochondria as well as peroxisomes in the normal course of their function. It is this degradation process that apparently releases the fragments of nucleic acids that are taken up by the nuclei (38, 82, 178, 190). Obviously, the destruction of nuclei by vacuoles would be lethal to a cell, which would prevent vacuole-dependent gene transfer from nuclei to mitochondria by this route. For these reasons, we are inclined to believe that preferential transfer of coding sequences to nuclei is the rule rather than the exception. This does not preclude the transfer of sequences from the nucleus to mitochondria by other transfer processes (123, 177).
In order to be recruited by the mitochondrion, a protein-coding sequence transferred to the nucleus often requires an addressing signal to direct its product back to the organelle (24, 138, 162, 163). Splicing pathways such as those that support exon shuffling might accelerate the tagging of newly transferred genes with appropriate addressing sequences. Indeed, the discovery of an intron between an addressing sequence and a mitochondrial gene that had been transferred to the nucleus in some plants confirms this expectation (45, 140). Accordingly, it is conceivable that one reason that splicing systems spread through primitive eukaryotic nuclear genomes was to satisfy the need to tag newly transferred genes from mitochondria and chloroplasts with addressing signals.
The process that transfers a gene to the nucleus can be envisioned as a neutral process with several identifiable states (27). First, an inactive gene is transferred to the nucleus, while the active version is retained by the mitochondrion. Then, the tagging process provides an addressing sequence to the nuclear version so that both mitochondrion and nucleus have active versions of the gene. Finally, mutation inactivates and eventually deletes the mitochondrial gene, and the nuclear allele takes over its function.
Recent events in the evolution of legumes have produced intermediate stages of this sort for the transfer of cox2 from mitochondria to the nucleus (1, 45, 140). The cox2 sequence in the nucleus of legumes seems not to have been copied from mitochondrial DNA. Since this nuclear sequence does not require editing for expression, we may infer that it was copied from an edited RNA fragment (45, 140). Therefore, this transferred gene was probably copied from an RNA fragment that was released by the destruction of mitochondria.
The neutral transfer model implies that eventually all of the genes of the mitochondria that can be transferred without ill effect will be transferred to the nucleus (27). Why then are there any genes left in contemporary mitochondrial genomes? Obviously, some genes remaining in mitochondrial genomes may be ones for which the transfer process is not neutral or for which the transfer requires very rare mutational events. If the entire group of mitochondrial genes is destined for transfer to the nucleus, sequence technology may have discovered the last ones poised for transfer. We recall here that only two such proteins are left in the mitochondrial genome with the most limited coding capacity. It has been suggested that these two genes must remain fixed in the organelle's genome because their protein products can regulate their own expression in tune with the redox potential of the mitochondria (5). Consistent with this interpretation is the identity of the two proteins encoded by all known mitochondrial genomes; both are cytochrome subunits (77).
In summary, we can identify evolutionary forces that tend to reduce the coding capacity of mitochondrial genomes. These evolutionary forces are either destructive, leading to the loss of coding sequences from the cell, or conservative, leading to the transfer of coding sequence to the nucleus. In the next section we also document the existence of an expansive evolutionary tendency that has supplemented the ancestral endosymbiont's proteome and transformed it into that of a cellular organelle.
THE YEAST MITOCHONDRIAL PROTEOME
The discovery of a major fraction of mitochondrial proteins that are not the descendants of a bacterial ancestor was entirely unexpected (95). This discovery required access to a definitive genome sequence and data defining the mitochondrial proteome of the same organism. For the moment, the yeast S. cerevisiae is the only eukaryote for which these two requirements have been met. In this section, we summarize and illustrate the phylogenetic reconstructions obtained for the mitochondrial proteome of S. cerevisiae.
The relative fraction of genes with homologues in bacteria and in eukaryotes for each of six functional categories is summarized in Fig. 1. Roughly one third of all the proteins are classified as ambiguous. These may cluster with bacterial taxa at the ends of long branches in the phylogenetic trees, or they may have homology to proteins from both bacterial and eukaryotic taxa (95). No attempt has been made to deduce their origins. In contrast, a cohort of circa 50 mitochondrial proteins that are clearly most closely related to α-proteobacteria has been identified. These support the identification of α-proteobacteria as ancestors of the mitochondria (95).
Surprisingly, half of the mitochondrial proteome of S. cerevisiae, circa 200 proteins, have no discernible alignments with any bacterial homologues (P < e^-10). They cluster exclusively as eukaryotic homologues (95). The presence of a sizable eukaryotic cohort contradicts an expectation of the endosymbiotic theory, which implies that the mitochondrial proteome is exclusively the descendant of an ancestral bacterial proteome. The data suggest that the endosymbiotic theory requires modification.
It is evident from Fig. 1 that the phylogenetic clustering of the mitochondrial proteins into α-proteobacterial and eukaryotic homologues goes hand in hand with the functional profiles of the clusters. Thus, the bacterial homologues seem to be mainly involved in translation and energy metabolism. In contrast, the eukaryotic proteins are typically associated with transport and regulatory functions.
A way of viewing the phylogenetic differences between the different functional categories of proteins is as follows (17, 95). First, we imagine that the ancestral bacterial symbiont introduced genes encoding proteins that have been perpetuated as an essential core of proteins functioning in aerobic respiration, the tricarboxylic acid (TCA) cycle, and gene expression. Furthermore, we suggest that the evolution of the mitochondria from the endosymbiont required that novel accessory proteins that arose in the eukaryotic genome complement such core functions. Some of these eukaryotic homologues augment the functions of the core proteins by participating in the assembly of complexes, while others function in regulation. In addition, there is a larger group of eukaryotic proteins with novel gene functions, such as ATP and protein transport. We suggest that the coevolution of the core α-proteobacterial components and the complementary eukaryotic nuclear components transformed the endosymbiont into an organelle.
We have also observed a number of bacterial protein homologues that may be examples of horizontal gene transfers from diverse bacteria. This means that there are proteins descended from three sorts of genomic ancestors in the yeast mitochondrial proteome: (i) homologues descended from an α-proteobacterium ancestor that encode core function, (ii) some ill-defined orthologues, some of which may have been recruited from a diverse group of bacteria through horizontal gene transfer, and (iii) the dominant group of eukaryotic proteins that have been recruited from the nuclear genome. In effect, the ancestral α-proteobacterial proteome consisting of 1,600 or more proteins has been reduced to roughly 50 proteins. This residual core has been complemented by circa 350 novel proteins, mostly recruited from the eukaryotic nuclear genome.
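The proportions implied by these round numbers can be made explicit with a few lines of Python. This is only a back-of-the-envelope sketch: the three input figures are the approximate counts quoted in the text ("1,600 or more", "roughly 50", "circa 350"), and the names are illustrative labels of ours.

```python
# Approximate counts quoted in the text; all three are rounded estimates.
ANCESTRAL_PROTEOME = 1600   # "1,600 or more" proteins in the ancestral proteome
RESIDUAL_CORE = 50          # proteins retained from that ancestor
NOVEL_PROTEINS = 350        # proteins added later, mostly from the nuclear genome


def proteome_summary(ancestral: int, core: int, novel: int) -> dict:
    """Summarize how much of the present proteome is ancestral versus novel."""
    total = core + novel
    return {
        "total": total,
        "core_share": core / total,              # fraction of today's proteome
        "novel_share": novel / total,
        "ancestral_retained": core / ancestral,  # fraction of the ancestor kept
    }


summary = proteome_summary(ANCESTRAL_PROTEOME, RESIDUAL_CORE, NOVEL_PROTEINS)
for key, value in summary.items():
    print(key, value)
```

With these inputs the residual core is about one eighth of a proteome of roughly 400 proteins, and it represents about 3% of the ancestral complement. Since the ancestral count is given as a lower bound, the retained fraction is, if anything, smaller.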
This interpretation rests on detailed phylogenetic analyses of the mitochondrial proteins of S. cerevisiae as well as of other eukaryotes that we present next. Here, all references to genes and their encoded proteins are for S. cerevisiae unless otherwise specified. Some relevant informational details may assist the reader in studying the phylogenetic reconstructions. All of the phylogenetic reconstructions are to be found at http://web1.ebc.uu.se/molev/publications/cfg2000 (95). The sequences of proteins of the yeast mitochondrial proteome are found in the Yeast Protein Database at http://www.proteome.com (87). The sequences of the mitochondrial genome of Reclinomonas americana are at http://megasun.bch.umontreal.ca/ogmp/projects/other/mtcomp.html. Sequences for the nematode Caenorhabditis elegans are in the database WormPep17 at ftp://ftp.sanger.ac.uk/pub/databases/wormpep/. Other sequences from the SwissProt and NCBI databases were also used for the phylogenetic reconstructions. Alignments of homologous proteins (P < e^-10) were used to construct phylogenetic trees (95, 174). To structure the discussion, we have followed the somewhat arbitrary biochemical classification used by Karlberg et al. (95).
Energy Metabolism
ATP production in eukaryotes from glucose and oxygen normally consists of two catabolic processes: the conversion of glucose to pyruvate via a glycolytic pathway, and the oxidative conversion of pyruvate to H2O and CO2. Both Rickettsia and mitochondria rely for pyruvate on the glycolytic systems of their host cells. More than 100 mitochondrial enzymes are involved in the oxidation of pyruvate. In S. cerevisiae, the relevant enzymes correspond to five proteins in the pyruvate dehydrogenase complex, 14 proteins in the ATP synthase complex, 16 proteins in the TCA cycle, and more than 70 proteins in the respiratory chain complexes. The ATP produced by this system is exported into the cytoplasm by the ATP/ADP translocases, which transport 1 molecule of ATP in exchange for 1 molecule of ADP.
The Rickettsia genome contains a similar cohort of proteins involved in the pyruvate dehydrogenase complex, the TCA cycle, the respiratory chain complex, and the ATP synthase complex. There are in addition five genes coding for ATP/ADP translocases, but these are required for the import of ATP from the host cell cytosol. Thus, Rickettsia and mitochondria have functionally related systems for ATP production (Fig. 2). However, ATP transport is somewhat different in these two systems (see below).
The pyruvate dehydrogenase complex that converts pyruvate to acetyl coenzyme A (acetyl-CoA) consists of multiple copies of each of three enzymatic components: pyruvate dehydrogenase (E1, two subunits), dihydrolipoamide acetyltransferase (E2), and lipoamide dehydrogenase (E3), as summarized in Fig. 3A. All three mitochondrial subunits are encoded by the nuclear genome and are more closely related to their homologs from α-proteobacteria than to those from other bacteria (Fig. 3B, C, and D). A notable exception is the pdx1 gene, which encodes a protein required for anchoring the E3 to the E2 component in the complex. This protein has no bacterial homologues, so it may be a relatively recent addition to the pyruvate dehydrogenase complex.
Acetyl-CoA, produced by the pyruvate dehydrogenase complex, is fed into the TCA cycle (Fig. 4A). There are eight enzyme complexes in the TCA cycle, all of which are encoded by the nucleus in at least some eukaryotes. Three enzymes display particularly strong relationships to the α-proteobacteria, whereas the evolutionary history of the others is more complex. Genes encoding succinyl-CoA synthetase, succinate dehydrogenase, and fumarase seem to descend from an ancestral α-proteobacterium, and these have been transferred subsequently into the nucleus. For example, α-proteobacterial enzymes of the succinate dehydrogenase complex (Sdh) are closely related to their mitochondrial homologs whether these are encoded in the nucleus or in the mitochondrion (Fig. 4B). However, the membrane-anchoring subunit of this complex, Sdh4, has so far been found only in eukaryotes.
Most if not all of the genes encoding enzymes in the TCA cycle have been duplicated. The resulting paralogues are often recruited to different subcellular compartments. For example, there are three genes encoding malate dehydrogenases in the nuclear genome of S. cerevisiae. These are targeted to the mitochondrion, the cytoplasm, and the peroxisome. All three enzymes form a phylogenetic cluster which is closely related to the mitochondrial malate dehydrogenases in other species (Fig. 4C). This suggests that the yeast cytosolic and peroxisomal forms are derived relatively recently from the mitochondrial malate dehydrogenase. On the other hand, the mouse cytosolic and mitochondrial malate dehydrogenases are highly divergent, which suggests that they arose in a more ancient gene duplication. The malate dehydrogenase in R. prowazekii seems not to be particularly closely related to either the mitochondrial or the cytoplasmic form of malate dehydrogenases in eukaryotes.
There are in S. cerevisiae at least two genes encoding aconitate hydratase. The bacterial homologues cluster more closely to the cytoplasmic homologues and are more distant from the mitochondrial ones (Fig. 4D). This suggests that the homologue recruited for cytoplasmic functions and that recruited to the mitochondrion may have had different ancestors. The history of isocitrate dehydrogenases is even more complex. There are two forms of isocitrate dehydrogenase, the NAD+- and the NADP+-specific forms, both of which have the same function in the TCA cycle. The NAD+-specific protein is a two-subunit protein in S. cerevisiae and a three-subunit protein in higher eukaryotes. The two yeast NAD+ subunits cluster closely together, suggesting that they have been derived from the same ancestral gene. However, the NADP+-specific isocitrate dehydrogenases are only remotely related to the NAD+-specific proteins. The phylogenetic analysis suggests that the cytoplasmic, mitochondrial, and chloroplast NADP+ homologues have a common origin but have been recruited to different subcellular compartments in a pattern that is species specific. The NADP+ homologues in mitochondria are most similar to those in Sphingomonas yanoikuyae (α-proteobacterium) and Mycobacterium tuberculosis, whereas more distantly related paralogues are found in all other bacteria. The isocitrate dehydrogenase found in Rickettsia prowazekii is highly divergent from all of these enzymes.
There are several other examples of mitochondrial and cytosolic isoforms that are highly divergent from bacterial homologues. For example, the mitochondrial citrate synthase seems to have originated from within the eukaryotic genome and displays no sequence identity with its bacterial analogue. The phylogenetic analysis of the enzymes of the TCA cycle reveals a complex evolutionary network. The complexities include (i) some genes that have been transferred from an α-proteobacterial ancestor into the nuclear genome and retargeted to the mitochondrion, (ii) others that have been transferred from an α-proteobacterial genome to the nucleus but targeted to other subcellular compartments, and (iii) still others that seem to have been recruited from different bacterial or eukaryotic ancestors.
The electron transport system consists of three energy-coupling sites: (i) the NADH dehydrogenase complex, (ii) the cytochrome bc1 complex, and (iii) the cytochrome oxidase complex (Fig. 5A). Although some subunits of the NADH dehydrogenases from diverse eukaryotes are encoded by mitochondrial genomes, most are encoded by nuclear genomes. For example, the mitochondrial genome of Reclinomonas americana contains as many as 10 genes encoding subunits of the NADH dehydrogenase complex. This arrangement is reminiscent of the nine NADH dehydrogenase subunits located in immediate proximity to each other in Escherichia coli. A reduced version of this arrangement is also seen in R. prowazekii. Phylogenetic reconstructions based on a combined set of mitochondrial NADH dehydrogenase subunits suggest that these are derived from the α-proteobacteria (Fig. 5B).
In contrast, only a single nuclear gene that encodes one of the
subunits in the NADH dehydrogenase complex has been identified in
S. cerevisiae. The corresponding protein is roughly
30% identical to the NADH dehydrogenase subunit of the
-proteobacteria, E. coli, and Haemophilus
influenzae. This low level of sequence identity suggests that the
yeast homologue may not be a descendant of bacterial orthologues.
Indeed, it seems likely that the bacterial NADH dehydrogenase genes may have been discarded from the yeast mitochondrial as well as
nuclear genomes and replaced by eukaryotic analogues that are too
dissimilar to be identified by current routines for identifying sequence similarity.
The second coupling site of the electron transport system is the cytochrome bc1 complex, which contains a core of proteins such as cytochrome b, cytochrome c1, and the Rieske iron-sulfur protein. This complex has been isolated from many α-proteobacterial species, such as Paracoccus denitrificans, Rhodobacter capsulatus, Rhodospirillum rubrum, and Bradyrhizobium japonicum. The gene encoding cytochrome b is located in the mitochondrial genome and is closely related to its α-proteobacterial relatives (Fig. 5C). The nuclear gene encoding cytochrome c1 (cyc1) is also related to one found in α-proteobacteria. Homologues for the Rieske iron-sulfur protein (rip1) have been found in
- and
-proteobacteria as well as in cyanobacteria and green sulfur
bacteria. All three mitochondrial proteins cluster closely with their α-proteobacterial relatives. In S. cerevisiae, the cytochrome bc1 complex is composed of as many as 10 subunits. The seven other components of this complex in yeast cells are related to eukaryotic homologues, but they are unrelated to any bacterial analogues. These seven appear to be later additions to the central core derived from α-proteobacteria.
The cytochrome oxidase complex in S. cerevisiae provides another example of a complex where proteins of eukaryotic descent have been added to a core of α-proteobacterial proteins. The core genes are coxI, coxII, and coxIII, encoding cytochrome c oxidase subunits I, II, and III, respectively. These three subunits are almost always mitochondrially encoded and cluster closely with their α-proteobacterial homologues (Fig. 5C). In contrast, none of the other yeast proteins in this complex can be aligned with bacterial homologues. These nonbacterial homologues are only occasionally found in other eukaryotes, which is consistent with the interpretation that they are recent nuclear contributions. Only 1 of the 13 yeast proteins responsible for the assembly of the cytochrome oxidase complex has a homologue in R. prowazekii. Four of these assembly proteins can be identified in the proteomes of other eukaryotes. The remaining assembly proteins for the cytochrome oxidase complex are specific for S. cerevisiae.
The gene order for the core components of the ATP synthase complex (Fig. 6A) is highly conserved among bacterial genomes (Fig. 6B). There are 14 proteins in the ATP synthase complex in S. cerevisiae, half of which can be identified as bacterial descendants. Three of the latter are encoded in yeast mitochondria (atp6, atp8, and atp9), and each displays a close phylogenetic relationship to α-proteobacterial homologues. Similarly, phylogenetic reconstruction with the concatenated alignments of proteins encoded by the nuclear genes for the α and γ subunits of the ATP synthase (atp1 and atp3) reveals a cluster with strong bootstrap support for mitochondrial and α-proteobacterial homologues (Fig. 6C). A similar cluster is also observed for the genes atp2 and atp5. Of the remaining seven yeast proteins in this complex, four are found only in eukaryotes and three appear to be specific to S. cerevisiae.
In summary, phylogenetic reconstructions for the core components of the respiratory chain complexes identify these as α-proteobacterial descendants that are often encoded by the mitochondrial genomes of contemporary eukaryotes. Other components of the mitochondrial proteome descended from α-proteobacteria, such as atp1 and atp3, have been transferred to nuclei. Associated with these core components are accessory proteins such as those that assist in the assembly of complexes for which no homologues exist in bacteria. These seem to have evolved after the ancestral endosymbiotic event that contributed the core components. In total, 42% of the proteins participating in the aerobic ATP-generating system of yeast mitochondria have been found so far only in S. cerevisiae and other eukaryotes.
Information Processes
Processes such as replication, transcription, and translation are supported by more than one third of the mitochondrial proteome. Genes encoding components of these systems are typically located in the nuclear genome (Fig. 7), although the mitochondrial genomes of many protists encode different subsets of ribosomal proteins. In particular, almost half of the mitochondrial genome of Reclinomonas americana consists of genes involved in sequence information processing.
rRNA sequences have served as the key reference sequences for phylogenetic reconstructions (197). Indeed, the very first molecular data supporting the notion that mitochondria are derived from α-proteobacteria were obtained from phylogenetic reconstructions based on rRNA sequences (75, 197). More refined studies based on larger data sets suggested that there may be a particularly close relationship with the group of bacteria to which the Rickettsia belong, the Rickettsiaceae (Fig. 8) (75, 141, 197).
The interpretation of these phylogenetic reconstructions is not free of ambiguities. That many mitochondrial genomes are organized with extreme economy provides one complication. Thus, there is a direct correlation between the size of the mitochondrial genomes and their rRNA genes (13). In particular, in animal mitochondrial genomes smaller than 20 kb, the small- and large-subunit rRNAs are as short as ca. 850 and ca. 1,500 nucleotides, respectively. These lengths can be compared to those of the plant mitochondrial genomes longer than 300 kb that encode rRNAs with ca. 1,500 and 3,000 nucleotides. Fortunately, a conserved core of nucleotide sequences retained in all rRNA sequences facilitates alignments of rRNA sequences from genomes as diverse as those of bacteria and animal mitochondria despite their substantial size variation.
Another complication is that mutation biases and rates of nucleotide substitutions vary markedly among the mitochondrial genomes. The small animal mitochondrial genomes have a strong A+T mutation bias and evolve more than 50 times faster than the large plant mitochondrial genomes. This may account for the particularly close phylogenetic relationship observed between α-proteobacteria and plant mitochondria, with their long rRNA sequences and slow rates of nucleotide substitution. Although the rapid, biased sequence evolution of rRNA in animal mitochondria complicates phylogenetic reconstruction, it is likely that these mitochondrial rRNAs also descend from those of the α-proteobacterial subdivision (75).
Only a small fraction of the genes for the ubiquitous core of mitochondrial ribosomal proteins are found in the organelle's genome. The largest ribosomal protein assembly in a mitochondrial genome is found in R. americana, which encodes 11 large-subunit ribosomal proteins and 7 small-subunit ribosomal proteins. The organization of genes for these proteins in the protist's genome resembles the super-ribosomal protein gene operon found in bacteria (Fig. 9A) (111). Indeed, much of the detail of bacterial gene order is, with some departures, very well conserved in the mitochondrial genome of R. americana (77). A similar string of genes has been retained as a contiguous segment in the Bacillus subtilis genome. In the E. coli genome, this string has been divided into two segments, with the rif region located at 89 min and the ribosomal protein gene operons str-S10-spc and α located at 74 min (165). So much gene order seems to have been preserved in R. americana that it is tempting to infer that the entire string of genes associated with its ribosomal protein cluster is as it was in the genome of the ancestral endosymbiont.
Nevertheless, R. americana is exceptional. Most other mitochondrial genomes appear to have lost all traces of the gene order of the putative bacterial ancestor. In effect, the gene order of the R. prowazekii genome is intermediate between that of R. americana and the more representative genomes of mitochondria (14, 171). Thus, ancestral sequence motifs for genes encoding translation and transcription components are recognizable in R. prowazekii. However, these gene orders are slightly scrambled (Fig. 9A), presumably as a result of intrachromosomal recombination within a genome that originally was arranged as in some modern bacteria (19, 171).
Phylogenetic reconstructions based on the concatenated sequences of eight small ribosomal proteins and three large ribosomal proteins strongly support a clustering of the mitochondrial and α-proteobacterial sequences (Fig. 9B). A particularly interesting feature of this tree is that the nucleus-encoded yeast ribosomal proteins cluster with the mitochondrion-encoded ribosomal proteins. This supports the interpretation that ribosomal protein genes from the α-proteobacterial ancestor were transferred to the nucleus from a mitochondrial-endosymbiont intermediate.
The very small sizes of the rRNA sequences and the unusually large protein-RNA ratios in some mitochondria suggest that functions previously provided by the missing parts in these rRNA sequences may have been taken over by nucleus-encoded ribosomal proteins (T. O'Brien, personal communication). Indeed, the yeast mitochondrial ribosome contains a total of 23 small-subunit and 37 large-subunit proteins, many of which have no homologues in bacteria or other eukaryotes. Similarly, the human mitochondrial ribosome contains a large number of ribosomal proteins not found in bacteria. These examples of species-specific ribosomal proteins seem to represent novel solutions to the problems created by the tendency of mitochondria to delete short stretches of sequence from rRNA genes. Here, specific protein structures may replace the functions of deleted rRNA patches in the ribosomes of some mitochondria.
Most of the mitochondrial tRNAs are encoded by the organelle's genome. They are presumably the descendants of tRNAs from the ancestral endosymbiont (18, 98). As an apparent exception, several plant mitochondrial tRNA genes appear to have been replaced rather recently by chloroplast tRNA genes (75). It is conceivable that tRNA genes have crossed organelle boundaries several times in the evolutionary past.
The total number of mitochondrial tRNA genes varies in different species, with a minimum of only 22 tRNA genes in some animal mitochondria. It has been suggested that this reduction in the diversity of the tRNA population was associated with the codon reassignments characteristic of animal mitochondria (13, 37, 108). For example, AUA specifies methionine rather than isoleucine in animal and yeast mitochondria. AGA and AGG encode serine instead of arginine in the insect and echinoderm mitochondrial genes, while these same codons are termination codons in mammalian mitochondria. The standard tRNAArg(AGR) is absent in animal mitochondria. In insect and echinoderm systems, the translation of AGR codons is performed by the tRNASer(AGY).
An absolute minimal tRNA set would theoretically consist of one tRNA per amino acid, i.e., 20 different tRNAs. However, since a minimum of two different tRNAs are required for the translation of arginine, isoleucine, serine, and leucine codons, a more realistic minimum of 24 different tRNAs might be required to translate the conventional genetic code. On the other hand, the diversity of the tRNA population could be further reduced if codon recognition patterns are altered so that a single tRNA can translate the amino acids encoded by three or six codons in the conventional genetic code (13, 108).
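The counting argument in this paragraph is simple enough to write out. The sketch below, with names chosen by us for illustration, adds one extra tRNA for each amino acid that the text says requires two under conventional decoding, and compares the result with the observed animal minimum of 22 quoted earlier.

```python
# One tRNA per amino acid is the theoretical floor.
N_AMINO_ACIDS = 20

# Amino acids that, according to the text, need two different tRNAs
# to read all of their codons in the conventional genetic code.
NEEDS_TWO_TRNAS = ("Arg", "Ile", "Ser", "Leu")

# Smallest tRNA gene count reported in the text for animal mitochondria.
OBSERVED_ANIMAL_MINIMUM = 22


def realistic_minimum(n_amino_acids: int, needs_two: tuple) -> int:
    """One tRNA per amino acid plus one extra for each that needs two."""
    return n_amino_acids + len(needs_two)


minimum = realistic_minimum(N_AMINO_ACIDS, NEEDS_TWO_TRNAS)
shortfall = minimum - OBSERVED_ANIMAL_MINIMUM
print(f"Conventional minimum: {minimum} tRNAs")
print(f"Animal mitochondria manage with {shortfall} fewer")
```

The gap of two tRNAs is what altered codon recognition patterns or codon reassignments would have to account for.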
Thus, extreme pressure to minimize the set of tRNA species encoded in the mitochondrial genomes might lead to the following sorts of codon reassignments: AUA can be reassigned from isoleucine to methionine, UUG and UUA from leucine to phenylalanine, and AGA as well as AGG from arginine to serine or, alternatively, AGU and AGC from serine to arginine. Each of these reassignments could reduce the number of tRNA species required to translate all 64 triplets of the genetic code. Obviously, the reduction in the minimum number of tRNA isoacceptors requires that the remaining tRNA species expand their codon degeneracy. This may be done by reducing the contribution of the third codon position to function in what has been referred to as hyperwobble (108). A conversion of any of these codons into termination codons would serve the same purpose. Instances of each of these sense codon reassignments in animal mitochondria have been observed. Accordingly, it was suggested that the observed codon reassignments in animal mitochondria are a consequence of the minimization of the canonical set of tRNA genes inherited from the ancestral endosymbiont (13, 108).
Point mutations such as those affecting tRNALeu(UUR) are particularly common in human mitochondrial diseases. A spontaneous suppressor of such a mutant has been identified as a heteroplasmic alteration of the anticodon of tRNALeu(CUN) that enables the suppressor to translate leucine codons of the form UUR (129). In this case the suppressor mutation is carried by approximately 10% of the mitochondrial genomes (129). A similar scenario might account for the loss of tRNA species and the evolution of codon reassignments during the evolution of animal mitochondria. Here, a mutant with an ancestral tRNA gene that has accumulated point mutations and/or deletions might be rescued by other mutant tRNA variants that expand their codon recognition range to compensate for the first tRNA variant's defects.
In the yeast Schizosaccharomyces pombe, a viable mutant tRNA isoacceptor with an altered codon recognition pattern has been identified (160). The existence of such viable mutants contradicts the common assumption that mutations that alter codon recognition patterns are lethal. Such tRNA variants with or without accompanying suppressor tRNA species may be fixed in a lineage of mitochondrial genomes. Here, the transition from the canonical tRNA ensemble to one with novel codon recognition patterns might require minimum levels of heteroplasmy to support a transition from the standard translation pattern to an atypical one.
The aminoacyl-tRNA synthetases of the mitochondria are largely
descendants of the α-proteobacterial ancestor that have been transferred to
nuclear genomes. However, the transfer pattern is not
without its complexities. In contrast to ribosomal components, each of
the aminoacyl-tRNA synthetases needs to interact with only a few
molecules, such as the amino acids, ATP, and the tRNAs. This, along
with the universality of their functions, may account for the fact that
many aminoacyl-tRNA synthetases are associated with complex
phylogenetic patterns that reflect the influence of horizontal gene
transfers (199). The mitochondrial synthetases present
opportunities for particularly complex phylogenetic patterns.
The initial association of the ancestral α-proteobacterium and its
host brought together two complete translation systems with a total of
40 different aminoacyl-tRNA synthetases. The combination of the
symbiont's synthetase complement in the presumptive organelle and the
host's complement in the cytoplasm might have provided opportunities
for extensive gene replacements and gene losses. In particular, after
two billion years it is conceivable that the number of synthetases
could have been reduced to a minimum of 20 that service both the
mitochondrial and cytoplasmic compartments.
What is observed is more complex and only partially consistent with this conjecture. In S. cerevisiae, for example, three different aminoacyl-tRNA synthetases descended from either the symbiont or the host have been duplicated to serve the mitochondrion and the cytoplasm, while the complementary synthetase has been lost. Likewise, a total of four synthetases are each encoded by a single gene whose product most likely functions in the cytoplasm as well as in the mitochondrion. Surprisingly, a majority of the 20 nominal aminoacyl-tRNA synthetases in S. cerevisiae have retained the presumed ancestral pattern, represented by the presence of both a mitochondrial and a cytoplasmic protein of distinct phylogenetic origin. It is possible that the exceptional structures of mitochondrial tRNA species (see, for example, reference 108) and the requirement for these to be recognized by a coadapted synthetase have in some cases constrained the evolution of cell compartment-specific synthetase homologues.
We can imagine that the aminoacyl-tRNA synthetases have evolved in a three-stage process. (i) In gene transfer, the bacterial gene is transferred from the mitochondrion to the nuclear genome, which already contains an ancestral eukaryotic synthetase gene of the same specificity. Both genes are expressed, and the bacterially derived synthetase is targeted back to the mitochondrion, while the eukaryotic enzyme is targeted to the cytoplasm. (ii) In gene duplication and replacement, the nuclear and/or bacterial gene is duplicated, and one of the ancestral genes is replaced by the new paralogous gene copy. (iii) In functional duality and gene loss, signal sequences are added to the duplicated pair of genes so that their products can be recruited to the cytoplasm as well as to the mitochondrion. Finally, all of the unnecessary gene copies are purged by random mutation.
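The bookkeeping implied by this three-stage scheme can be made explicit. The Python sketch below is only an illustration: the stage labels and the function `total_genes` are hypothetical names, and the per-stage gene counts are our reading of the scheme (two genes of distinct origin after transfer, two paralogous genes after duplication and replacement, one dual-targeted gene after gene loss). It reproduces the two bounds discussed above, 40 synthetase genes if every specificity remains at the first stage and 20 if every specificity completes the process.

```python
from collections import Counter

# Synthetase genes retained per amino acid specificity at each stage
# (an interpretation of the three-stage scheme, not data from the text).
GENES_PER_STAGE = {
    "i_gene_transfer": 2,                 # bacterial and eukaryotic genes, distinct origins
    "ii_duplication_and_replacement": 2,  # two paralogs sharing a single origin
    "iii_duality_and_gene_loss": 1,       # one gene serving both compartments
}


def total_genes(stage_counts):
    """Total synthetase genes, given how many specificities sit at each stage."""
    return sum(GENES_PER_STAGE[stage] * n for stage, n in stage_counts.items())


all_at_stage_i = Counter({"i_gene_transfer": 20})
all_at_stage_iii = Counter({"iii_duality_and_gene_loss": 20})

print(total_genes(all_at_stage_i))    # 40
print(total_genes(all_at_stage_iii))  # 20
```

Any mixture of stages across the 20 specificities gives a total between these two bounds.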
The first stage in this process seems to be represented by as many as
12 different aminoacyl-tRNA synthetases in S. cerevisiae, those for Glu, Phe, Leu, Met, Tyr, Asn, Asp, Trp, Ile, Lys, Ser, and
Pro (Fig. 10A). Each synthetase is
represented by a mitochondrial and a cytoplasmic form,
each with a separate origin. The mitochondrial synthetases are normally similar to their bacterial homologues, but
there is typically no specific clustering with the
rickettsiae.
The disparity between rickettsial and mitochondrial synthetases has
several sources. In the simplest case, Rickettsia does not
have an Asn tRNA synthetase. For other synthetases, the absence of this
specific relationship is explained by the exceptional placement of
Rickettsia in the phylogenetic tree of synthetases. Indeed,
several examples of putative horizontal gene transfer events have been
identified among the rickettsial aminoacyl-tRNA synthetases
(20). For example, phylogenetic reconstructions suggest that
the Rickettsia Met tRNA synthetase clusters more closely
with a Mycobacterium tuberculosis homologue than with other
α-proteobacterial versions of the enzyme. Rickettsia is also unusual in that it contains a class I Lys tRNA synthetase which is
more similar to the class I Lys tRNA synthetase in the archaea than to
the class II Lys tRNA synthetase of other proteobacteria. Finally,
Rickettsia may have recruited the gene encoding the
cytoplasmic Ile tRNA synthetase, since the rickettsial and the yeast
cytoplasmic Ile tRN