Microbiology and Molecular Biology Reviews, December 1999, p. 923-967, Vol. 63, No. 4
Wadsworth Center, New York State Department of Health, and Department of Biomedical Sciences, School of Public Health, The University at Albany, Albany, New York 12201-0509¹; Institute of Environmental Science and Engineering, The Technical University of Denmark, Lyngby, Denmark²; and School of Engineering and Applied Science, University of California at Los Angeles, Los Angeles, California 90005-1593³
1092-2172/99/$04.00+0
Copyright © 1999, American Society for Microbiology. All rights reserved.
Stress Genes and Proteins in the Archaea
SUMMARY
INTRODUCTION
STRESS, STRESS RESPONSE, STRESS GENES AND PROTEINS, HEAT SHOCK, MOLECULAR CHAPERONES, AND CHAPERONINS
Primer
Stress versus Heat Shock
Phylogenetic Domains
hsp70(dnaK) LOCUS
Structure and Organization
Expression
TATA-Binding Protein
ARCHAEAL hsp70 AND Hsp70
The Gene
The Protein
ARCHAEAL hsp40 AND Hsp40
The Gene
The Protein
ARCHAEAL grpE AND GrpE
The Gene
The Protein
OCCURRENCE OF hsp70 IN NATURE
The Archaeal Puzzle
Hsp70-Based Phylogenetic Trees
FUNCTIONS OF ARCHAEAL MOLECULAR CHAPERONES
Biochemistry
Regulation: More Archaeal Puzzles
CHAPERONINS
Chaperonin Systems I and II
Structure-Function
Evolution
Expression
Regulation
ORIGINS: VERTICAL VERSUS LATERAL
OTHER ARCHAEAL STRESS GENES AND PROTEINS
STRESSORS
STRATEGIES AND METHODS USED TO STUDY THE STRESS RESPONSE, GENES, AND PROTEINS IN THE ARCHAEA
STRESS TOLERANCE
Questions
Thermoprotectants
Multicellular Structures
Other Factors
PERSPECTIVES, OPEN QUESTIONS, AREAS FOR EXPLORATION, AND APPLICATIONS
Biochemistry and Function of Archaeal Chaperones and Chaperonins
Regulation
Cell Differentiation, Development, and Adaptation
Voids To Be Filled: Proteases and Auxiliary Factors
Beyond Chaperones and Chaperonins
CONCLUSION
ACKNOWLEDGMENTS
REFERENCES
SUMMARY
The field covered in this review is new; the first sequence of a gene encoding the molecular chaperone Hsp70 and the first description of a chaperonin in the archaea were reported in 1991. These findings boosted research in other areas beyond the archaea that were directly relevant to bacteria and eukaryotes, for example, stress gene regulation, the structure-function relationship of the chaperonin complex, protein-based molecular phylogeny of organisms and eukaryotic-cell organelles, molecular biology and biochemistry of life in extreme environments, and stress tolerance at the cellular and molecular levels. In the last 8 years, archaeal stress genes and proteins belonging to the families Hsp70, Hsp60 (chaperonins), Hsp40(DnaJ), and small heat-shock proteins (sHsp) have been studied. The hsp70(dnaK), hsp40(dnaJ), and grpE genes (the chaperone machine) have been sequenced in seven, four, and two species, respectively, but their expression has been examined in detail only in the mesophilic methanogen Methanosarcina mazei S-6. The proteins possess markers typical of bacterial homologs but none of the signatures distinctive of eukaryotes. In contrast, gene expression and transcription initiation signals and factors are of the eucaryal type, which suggests a hybrid archaeal-bacterial complexion for the Hsp70 system. Another remarkable feature is that several archaeal species in different phylogenetic branches do not have the gene hsp70(dnaK), an evolutionary puzzle that raises the important question of what replaces the product of this gene, Hsp70(DnaK), in protein biogenesis and refolding and for stress resistance. Although archaea are prokaryotes like bacteria, their Hsp60 (chaperonin) family is of type (group) II, similar to that of the eukaryotic cytosol; however, unlike the latter, which has several different members, the archaeal chaperonin system usually includes only two (in some species one and in others possibly three) related subunits of ~60 kDa. 
These form, in various combinations depending on the species, a large structure or chaperonin complex sometimes called the thermosome. This multimolecular assembly is similar to the bacterial chaperonin complex GroEL/S, but it is made of only the large, double-ring oligomers each with eight (or nine) subunits instead of seven as in the bacterial complex. Like Hsp70(DnaK), the archaeal chaperonin subunits are remarkable for their evolution, but for a different reason. Ubiquitous among archaea, the chaperonins show a pattern of recurrent gene duplication: hetero-oligomeric chaperonin complexes appear to have evolved several times independently. The stress response and stress tolerance in the archaea involve chaperones, chaperonins, other heat shock (stress) proteins including sHsp, thermoprotectants, the proteasome, as yet incompletely understood thermoresistant features of many molecules, and formation of multicellular structures. The latter structures include single- and mixed-species (bacterial-archaeal) types. Many questions remain unanswered, and the field offers extraordinary opportunities owing to the diversity, genetic makeup, and phylogenetic position of archaea and the variety of ecosystems they inhabit. Specific aspects that deserve investigation are elucidation of the mechanism of action of the chaperonin complex at different temperatures, identification of the partners and substitutes for the Hsp70 chaperone machine, analysis of protein folding and refolding in hyperthermophiles, and determination of the molecular mechanisms involved in stress gene regulation in archaeal species that thrive under widely different conditions (temperature, pH, osmolarity, and barometric pressure). These studies are now possible with uni- and multicellular archaeal models and are relevant to various areas of basic and applied research, including exploration and conquest of ecosystems inhospitable to humans and many mammals and plants.
INTRODUCTION
The purpose of this review is to examine the information available on archaeal stress genes and proteins, particularly those of the Hsp70 and Hsp60 families, while critically discussing the data in comparison with what is known for the bacterial and eukaryotic equivalents. The aim was to treat the specific topics of the review embedded in the framework of closely related areas of science. Cross-fertilization between research with archaea and research with bacteria and eukaryotes is highlighted to show how the study of archaea has contributed, and will continue to contribute, to other fields, both basic and applied.
A deliberate effort has been made to simplify the text and make it readable to a general audience. Consequently, terms are explained and the data and theories are presented within a historical perspective. A minimal amount of overlapping between related sections distant from one another in the body of the review is included to enhance the flow, particularly when a later section expands on an earlier one.
A comprehensive search of printed literature and databases was attempted. Colleagues were consulted. The majority of the data are displayed in tables and figures, but only illustrative cases are explained in the text. Reviews rather than original reports are cited for topics related to but not strictly dealing with archaeal genes, proteins, or organisms, to reduce the number of references and save space while providing access to a wealth of published information.
Archaea have been found in a wide variety of ecosystems with very different characteristics, very hot or very cold, temperate, anoxic, oxygenated, etc. (9, 12, 50, 59, 178, 297). Thus, what represents a stressor for a species may be a condition required for the optimal growth of another species. The term "stressor," therefore, must be understood in relation to a particular species or group of organisms that share similar living conditions, for example temperature. In this regard, organisms are classified into psychrophiles (optimal temperature for growth [OTG] 15°C or lower), psychrotolerant organisms (OTG, 20 to 30°C), mesophiles (OTG, 35 to 40°C), thermophiles (OTG, 50 to 70°C), and hyperthermophiles (OTG, 80°C or higher) (186). Temperatures higher or lower than the optimal may cause stress and induce a stress response. A temperature upshift causes a heat shock response (47, 90, 120, 128, 177, 252, 292), whereas a temperature downshift induces a cold stress or cold shock response (255, 275, 304). The latter is not dealt with in this review. Likewise, adaptation to high osmotic and barometric pressures and the response of the cell to their changes (33, 68, 77, 121, 138, 190, 201, 204, 232, 266, 271) are not treated in any detail.
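The temperature classes quoted above can be expressed as a small lookup, which makes the dependence of "stressor" on the organism's own optimum easier to see. The following is only an illustrative sketch: the function name `classify_by_otg` is hypothetical, and the label "unclassified" for values that fall between the quoted ranges is an assumption of the sketch, since the text does not assign those values to any class.

```python
def classify_by_otg(otg_celsius):
    """Map an optimal temperature for growth (OTG, in degrees C) to a class name.

    Thresholds follow the ranges quoted in the text. Values that fall in the
    gaps between those ranges are reported as "unclassified" (an assumption
    of this sketch; the text does not assign them to a class).
    """
    if otg_celsius <= 15:
        return "psychrophile"
    if 20 <= otg_celsius <= 30:
        return "psychrotolerant"
    if 35 <= otg_celsius <= 40:
        return "mesophile"
    if 50 <= otg_celsius <= 70:
        return "thermophile"
    if otg_celsius >= 80:
        return "hyperthermophile"
    return "unclassified"


# OTG values as given in this review for three methanogens.
for name, otg in [("M. mazei S-6", 37),
                  ("M. thermophila TM-1", 50),
                  ("M. jannaschii", 85)]:
    print(name, classify_by_otg(otg))
```

A temperature of 50°C is thus optimal for one of these species and a heat shock for another, which is the sense in which a stressor must be defined relative to the organism.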
We draw attention, though, to the formation of multicellular structures that result in improved cell resistance to physical, mechanical, and chemical stressors. These structures are of various types and have considerable potential for the biotechnology industry and for the exploration and conquest of inhospitable ecosystems, but their relevance to stress resistance is rarely discussed. We highlight the topics that future studies ought to address in relation to the proteins (and their genes) and other molecules that build the intercellular connective material to keep the cells together in a functional, three-dimensional arrangement.
STRESS, STRESS RESPONSE, STRESS GENES AND PROTEINS, HEAT SHOCK, MOLECULAR CHAPERONES, AND CHAPERONINS
Primer
A cell confronted with an abrupt change in its immediate surroundings suffers stress. The cause, or stressor, may be of various types, for example, physical (temperature elevation) or chemical (increase or decrease in pH, salinity, or oxygen concentration) (47, 113, 252, 292).
A key component of stress is protein denaturation (93, 96, 123, 124, 223, 293). Many proteins lose their native, functional configuration and tend to aggregate. The process may be reversible up to a degree, beyond which it becomes irreversible and generalized within the cell, which ultimately dies.
Another main component of stress is the down-regulation of many housekeeping genes, some of which are actually shut off. Whether this is all due to protein denaturation and represents just the breakdown of the cellular machinery or is an active, induced process by which genes are "told" to slow down or stop has not yet been elucidated. Perhaps both mechanisms, gene failure and regulated shutdown, participate.
Protein damage and gene down-regulation are part of the stress response. There is yet another important component of the stress response, i.e., activation of the stress genes (90, 120, 205, 208, 219, 253, 254, 279). The concentrations of the protein products of these genes increase in response to stressors, protecting the cell from the destructive effects of stress and enhancing post-stress recovery by promoting renaturation (refolding) of partially denatured proteins (93, 101, 223, 293).
Thus, stress inactivates or down-regulates many genes but activates others, whose function is to save the cell. Most stress genes also function in the absence of stress, namely, under normal physiological conditions. The proteins encoded by these genes play critical roles in physiological protein biogenesis. They assist in the folding, translocation, and assembly of other proteins (191, 214, 235, 236, 247). This is the reason why many stress proteins are also called molecular chaperones (72). They help other cellular proteins to (i) fold correctly during and after translation; (ii) migrate to the cell's locale, where they will reside and function; and (iii) assemble into the quaternary structure that will make them useful to the cell when the proteins function as polymers.
Furthermore, some stress proteins participate in the degradation of other polypeptides, for example when these are denatured beyond recovery and could pose a serious threat to the cell if they aggregated (39, 92, 97, 98, 122, 123, 133).
In summary, stress proteins, particularly those that are molecular chaperones, aid and protect other cellular proteins from their birth on, but they also contribute to the elimination of polypeptides that are no longer useful and endanger cell viability.
It is important to bear in mind that not all stress proteins are chaperones and, vice versa, that not every molecular chaperone is a stress protein.
The wide spectrum of activities of stress proteins is not limited to the chaperoning of other proteins as described above. These activities also include other functions, for example, modulation of their own synthesis (6, 20, 90), regulation of the stress kinase JNK (85), association with enzymes (for purposes yet to be determined) (43), and participation in signal transduction pathways (175) and in rRNA processing (249). It is therefore clear that stress proteins are multifunctional and ubiquitous. They play their roles in all cells, cell compartments, and organelles and are said to be promiscuous because they interact with a great variety of other molecules.
This diversity of functions is reflected in the structural features of the stress proteins, which are composed of domains and motifs with specific roles. As we discuss below, characterization of these domains and motifs has helped in the classification of newly found genes and proteins, identification of stress proteins in their various anatomical locations, and determination of their evolutionary origins.
Stress versus Heat Shock
Stress genes and proteins are often named heat shock genes and proteins in today's literature, for historical reasons. Because of this, they are represented by the acronyms hsp and Hsp, respectively.
hsp and Hsp were first observed in Drosophila exposed to a temperature higher than the optimum for growth (27°C) (reference 177 and references therein). The genes activated by the temperature upshift were called heat shock genes, and their products were called heat shock proteins. In this review, we use the terms "stress" and "heat shock" interchangeably to qualify the words response, gene, and protein, although we favor the use of "stress" rather than "heat shock" and reserve the latter for the specific instances in which the stressor is a temperature elevation.
Hsp (and their genes) are classified into groups or families according to their molecular mass in kilodaltons (Table 1). The proteins of the 55- to 64-kDa group, or Hsp60 family, are also called chaperonins and are included within the molecular chaperones, generally speaking. More specifically, the latter term is applied to the Hsp70 family. The genes and proteins belonging to the Hsp60 and Hsp70 families have been extensively studied in many bacterial and eukaryotic species.
Phylogenetic Domains
The classification of all living cells into three main evolutionary lines, or phylogenetic domains, Bacteria (eubacteria), Archaea (formerly archaebacteria), and Eucarya (eukaryotes) (11, 297, 298, 300), is still useful despite its limitations and the challenges generated by new findings and contrasting theories (66, 67, 80, 103, 105, 106, 109-111, 188, 189, 192, 198, 234, 286, 298). It helps us to visualize how evolution produced what we see today and to track down genes from the past to the present.
The overwhelming majority of information available on stress genes and proteins comes from studies of bacteria (e.g., Escherichia coli and Bacillus subtilis) and eukaryotes (e.g., Drosophila melanogaster, Saccharomyces cerevisiae, plants, and mammals including humans). The study of stress genes and proteins in organisms of the phylogenetic domain Archaea began only a decade ago and is much less advanced than in the other two domains.
hsp70(dnaK) LOCUS
Structure and Organization
The terms hsp70 for the gene and Hsp70 for the protein are used for eukaryotes, while the same gene and protein are called dnaK and DnaK, respectively, in bacteria. We use hsp70 and Hsp70 in most cases, regardless of the origin, for simplicity and because these terms are more widely known than dnaK and DnaK, and also because archaea are not bacteria.
The first hsp70 gene identified by cloning and sequencing within the domain Archaea was reported in 1991 (182). The gene was found in the mesophilic methanogen Methanosarcina mazei S-6 (OTG, 37°C). Shortly thereafter, in 1992, a homolog was cloned and sequenced from another archaeon, Halobacterium (Haloarcula) marismortui (110). This organism is also mesophilic but belongs to a group, the extreme halophiles, different from that of M. mazei S-6.
For a while, the above genes were the only two archaeal hsp70 genes known. In 1994, two additional homologs were reported: one in another mesophilic extreme halophile, Halobacterium cutirubrum (OTG, 45°C), and the other in a thermophile, Thermoplasma acidophilum (OTG, 55°C) (111).
These findings seemed to indicate that the hsp70 gene was present in archaea, confirming the widely held notion that this gene is one of the most highly conserved, occurring in all organisms. This idea was challenged in 1996, when the sequencing of the whole genome of Methanococcus jannaschii (OTG, 85°C) revealed the absence of the hsp70 gene in this archaeon (31), a result that confirmed observations in other laboratories (47, 164).
More recently, in 1997, the genome sequence of the thermophilic methanogen Methanobacterium thermoautotrophicum ΔH (OTG, 55°C) was published (263). The hsp70 gene is present in this methanogen, which in this regard is therefore similar to the mesophilic M. mazei S-6 but different from the hyperthermophilic methanogen M. jannaschii.
While the sequencing of the M. thermoautotrophicum ΔH genome was under way, another hsp70 gene was cloned and sequenced from a second Methanosarcina species, the thermophilic species Methanosarcina thermophila TM-1 (OTG, 50°C) (126). So by this time, it was clear that at least some mesophilic and thermophilic methanogens do have the gene but that some (perhaps all) hyperthermophiles do not.
After the discovery of an archaeal hsp70 gene in M. mazei S-6 in 1991, more sequencing up- and downstream of this gene revealed the other genes that accompany hsp70(dnaK) in bacteria: hsp40(dnaJ) and hsp23(grpE); these discoveries were made in 1993 (181) and 1994 (45), respectively.
It is pertinent to note here, as was done above for hsp70, that the hsp40 gene and its protein Hsp40 are named dnaJ and DnaJ, respectively, when they are from bacteria. Similarly, the bacterial hsp23 gene and its protein, Hsp23, are called grpE and GrpE, respectively.
We use the terms hsp40 and Hsp40 when referring to archaea, for the same reasons we use the terms hsp70 and Hsp70. However, for the hsp23 gene, we use the terms grpE and GrpE, because the alternative hsp23 and Hsp23 may be confusing since there are several different small heat shock proteins with a mass close to 23 kDa (76, 78, 156, 206, 227). Also, the archaeal GrpE molecule has a counterpart in bacteria and the eukaryotic organelles of bacterial origin but apparently not in the eukaryotic cytosol. The latter does not seem to have a GrpE protein but has some other alternative to exercise similar functions, although recent findings suggest that grpE homologs might also occur in the eukaryotic cytosol (211, 212).
As a result of the sequencing of the M. mazei S-6 and M. thermophila TM-1 hsp70 chromosomal regions and the sequencing of the M. thermoautotrophicum ΔH genome, there are today three archaeal hsp70 loci whose structure and organization have been determined (Fig. 1) (31, 126, 179). The gene order 5'-grpE-hsp70-hsp40-3' occurs in the three archaeal loci and is the same as that observed in many bacteria, particularly gram-positive bacteria (Fig. 2) (179). However, there are differences among the three archaeal loci. For example, they differ in the length of the 5'-grpE-hsp70-3' and 5'-hsp40-next gene-3' intergenic regions and in the gene that follows hsp40 downstream. This gene is the same in M. mazei S-6 and M. thermophila TM-1 but different in M. thermoautotrophicum ΔH.
The length of the intergenic region between hsp70 and hsp40 is conserved in the three loci, particularly in comparison with the other intergenic regions.
The meaning of these structural characteristics is not completely understood. They suggest, for example, that hsp70 and hsp40 may have evolved together, as a unit. This notion is also supported by the conservation of the homologous gene pair in many bacteria (see, for example, Fig. 2).
As discussed later in this review, there are indications that the hsp70 gene in archaea was received via lateral transfer from bacteria. Perhaps it was accompanied by hsp40 and grpE, since both are always present whenever hsp70 is, and the three genes appear next to each other in many bacteria. However, other structural characteristics and experimental results, to be discussed below (see "Occurrence of hsp70 in nature"), tend to make this notion less credible, at least in its simplest formulation. Analyses of the nucleotide sequences between the protein-coding regions of the genes do not reveal obvious similarities, except for the presence of putative archaea-type promoters and bacterial-type termination signals in the expected locations with regard to the translation start and stop codons, respectively (44, 181, 182). Other sequence features vary with the intergenic region and the species, but the regions upstream of hsp70 in two of the methanosarcinas possess a series of repeats and palindromes. They might be cis-acting signals, namely, binding sites for regulatory factors (168). In contrast, the region upstream of hsp70 in M. thermoautotrophicum ΔH is very short and lacks anything that might be a promoter or a cis-acting site.
No bacterial-type promoter sequences (52, 120, 219, 258, 259) are identifiable in these archaeal intergenic regions, nor are there bacterium-type regulatory elements, such as CIRCE (259, 310, 313) or ROSE (208, 209), that one can detect by sequence comparisons.
If one considers the high degree of conservation of these regulatory sequences among bacteria, it is reasonable to conclude that they do not occur in the three known archaeal loci and that regulation of the hsp70 locus genes in these organisms is mediated by factors different from those operating in bacteria. Thus, despite the similarities in organization between the archaeal and bacterial hsp70 loci, their mode of expression and regulatory mechanisms appear to be different.
Also remarkable is that the 5'-grpE-hsp70-3' intergenic region and the distance between grpE and the next gene upstream in M. mazei S-6 and M. thermophila TM-1 are considerably longer than the equivalent regions in bacteria. The latter have their genes closer to one another, in agreement with their polycistronic mode of transcription and their being regulated as a unit, or operon. Instead, the structure of the archaeal loci in the two methanosarcinas shown in Fig. 2 does not suggest the bacterial modes of transcription and regulation but different ones (see also experimental data, given below).
Expression
Functional analyses of the hsp70 genes have been carried out for M. mazei S-6 and to a lesser extent for M. thermophila TM-1. No functional information exists for the other four archaeal hsp70 genes that have been cloned and sequenced thus far.
M. mazei S-6 hsp70(dnaK), hsp40(dnaJ), and grpE respond to heat shock by an increase in the production of their transcripts (Fig. 3) (42, 44), as one would expect for stress genes. The transcripts are monocistronic as in eukaryotes (302) and in contrast to bacteria (10, 95, 120, 131, 132, 294, 313). Likewise, the peak response in terms of transcript levels is reached after heat shocks longer than those that would induce a peak response in bacteria (Fig. 4) (164), also in agreement with what is observed in eukaryotes. The transcription initiation sites map to positions reminiscent of the eukaryotic initiation sites with respect to the promoter (Fig. 5) (42, 44, 179). Furthermore, the genes respond to temperatures ranging from 45 to 60°C (Fig. 6) (164) and to other stressors such as cadmium (Cd2+) (Fig. 7) (179) and ammonia (165) as expected for heat shock genes. Thus, the data show that the archaeal hsp70 locus genes are stress or heat shock genes but have mixed bacterial and eucaryal characteristics.
The body of structural and functional data available at present suggests that the mechanism of transcription initiation for the archaeal hsp70 locus genes differs from those known to operate in bacteria (26, 27, 120, 132, 208-210, 242, 259, 310, 311, 313) and must involve factors which are not of a bacterial type, i.e., different from σ factors (180). These data, as well as the fact that all transcription initiation studies with archaeal systems (albeit none involving heat shock genes and practically all done with hyperthermophilic systems) have demonstrated transcription factors of the eucaryal type (16, 51, 94, 117, 118, 264, 274, 276, 312), force the prediction that initiation for the M. mazei S-6 hsp70, hsp40, and grpE genes involves eucaryal-type factors.
TATA-Binding Protein
The archaeal homologs of the eucaryal TATA-binding protein (TBP) and the transcription factor IIB (TFIIB) (aTFB and aTFA, respectively [117, 118]), have been identified and shown to be required for the transcription of archaeal, non-heat shock genes in vitro (57, 94, 117, 118, 276). There is no comparable information for archaeal stress genes, but one may hypothesize, based on the observations described in the previous section, that these genes will also require TBP and TFIIB as basal factors. Moreover, it is likely that other factors would also be necessary to induce the response to stressors and preferentially, or even specifically, start transcription of hsp70 and its teammates, hsp40 and grpE.
The tbp gene of M. mazei S-6 has been cloned and sequenced (51). The deduced amino acid sequence of the protein possesses some of the expected archaeal characteristics, but it also shows unique features. For example, like all archaeal TBPs known (reviewed in reference 264), the M. mazei S-6 protein is shorter than most eucaryal homologs, amounting to what is the C-terminal domain in eucaryal molecules. Also, the M. mazei S-6 protein is acidic, like the other known archaeal proteins, but it differs from them in that its N-terminal third is basic, not acidic. The direct, imperfect repeats found in all TBPs, archaeal and eucaryal, are also present in the M. mazei S-6 molecule. Repeats of approximately 42 amino acids separated by a spacing segment of 51 residues on average can be identified (51). The repeats are better conserved in the archaeal than in the eucaryal TBPs, and this is also true for the M. mazei S-6 homolog. A few archaeal TBPs have an acidic tail composed of a series of Glu residues in the C-terminal end. This acidic tail is not present in the M. mazei S-6 molecule.
The overall and regional characteristics of the M. mazei S-6 protein most probably determine its functional properties in what pertains to the binding to DNA at the promoter and to the potential interaction with other transcription factors, such as TFIIB and perhaps stress-specific factors (139a). These structure-function aspects of M. mazei S-6 TBP are being investigated at present. Purified TBP binds to the M. mazei S-6 hsp70 promoter, as demonstrated by the electrophoretic mobility shift assay (EMSA) (57a).
Research to determine how transcription initiation starts and proceeds under constitutive (basal) conditions and in response to stress (heat shock) is under way. In experiments with cell lysates from M. mazei S-6, it was demonstrated that TBP present in the lysates binds to the hsp70 promoter (51a). The phenomenon is observed with lysates from both unstressed and stressed cells. In the latter, a protein appears or increases in concentration or in its ability to bind DNA or TBP, which causes an additional shifted band in EMSA. The nature and role of the protein are under investigation. It might be a regulatory factor that binds near the hsp70 promoter.
ARCHAEAL hsp70 AND Hsp70
The Gene
The salient characteristics of the archaeal genes sequenced thus far are described in Table 2. The promoters, terminators, and ribosome-binding sequences or sites (RBS) are putative, except for the M. mazei S-6 promoter, for which preliminary experimental evidence supports the promoter shown (Fig. 5).
The Protein
The archaeal Hsp70 proteins are quite similar to each other and, most remarkably, equally so to proteins from gram-positive bacteria (Table 3). Among the archaeal proteins, the most similar pairs are those from the two methanosarcinas, the two extreme halophiles, and the two thermophiles (T. acidophilum and M. thermoautotrophicum ΔH) (see "Hsp70-based phylogenetic trees" below).
The archaeal proteins all have the universal markers for Hsp70 and DnaK and the bacterial markers (Table 4). However, they do not have any of the markers typical of eucaryal molecules. Thus, the archaeal Hsp70 is of bacterial type in sequence and in structural features that reflect its function.
A remarkable feature that appears to be distinctive for the archaeal Hsp70 is the absence of a stretch of 23 to 25 amino acids in the N-terminal quadrant, which became evident when the sequences were aligned with those of proteins from gram-negative bacteria (Fig. 8) (182). This major marker is shared with the DnaK proteins from gram-positive bacteria and is not present in eukaryotic homologs (109-111).
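A gap of this kind shows up in a pairwise or multiple alignment as a long run of gap characters in the shorter sequence. The sketch below is a minimal illustration of how such a run could be located. The function name `longest_gap_run` is hypothetical, the use of "-" as the gap character is an assumption about the alignment format, and the aligned string is an invented toy example, not an actual Hsp70 sequence.

```python
def longest_gap_run(aligned_seq, gap_char="-"):
    """Return (start_column, length) of the longest run of gap characters.

    Columns are 1-based. Returns (None, 0) if the sequence has no gaps.
    """
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for col, ch in enumerate(aligned_seq, start=1):
        if ch == gap_char:
            if run_len == 0:
                run_start = col  # a new gap run begins at this column
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # a residue ends the current run
    return best_start, best_len


# Invented toy row from an alignment: three residues, a 24-column gap,
# then more residues. It stands in for a sequence lacking the stretch.
toy_row = "MGK" + "-" * 24 + "AVGIDLGTTNS"
print(longest_gap_run(toy_row))
```

With real data, the same scan would be applied to each row of an alignment that includes gram-negative, gram-positive, archaeal, and eukaryotic sequences, and the runs compared by column position.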
The evolutionary and functional significance of this sequence gap in archaea and gram-positive bacteria has not been elucidated. However, it has given support to a phylogenetic classification that places the archaea closer to gram-positive bacteria than to eukaryotes (105, 106, 109-111), in contrast to the classical 16S-18S rRNA-based tree (11, 299, 300). In the Hsp70-based tree, the extant gram-negative bacteria would have separated from their ancestors, i.e., the ancestors of today's gram-positive bacteria, early in evolution. As this happened, or shortly thereafter, the gram-negative line acquired the 23 to 25 extra amino acids that characterize its Hsp70. Also, within the framework of this hypothesis, the eukaryotic nucleus would have arisen from a fusion of a primitive archaeon with a gram-negative ancestor. The gene that ultimately became established in the eukaryotic line was that which came from the bacterial partner.
These are speculations based on sequence comparisons and other data that are not completely satisfactory in view of all the information available today. Alternative explanations have been put forward and are discussed in some detail in subsequent sections of this review.
ARCHAEAL hsp40 AND Hsp40
The Gene
The four archaeal hsp40(dnaJ) genes sequenced thus far are described in Table 5. They are remarkably similar to each other, as are the proteins they encode (see below).
The Protein
The four archaeal Hsp40(DnaJ) proteins known at present are similar to one another and to their bacterial homologs (Table 6). As is the case for the Hsp70, the most similar pair is that of the two proteins from methanosarcinas. The universal motifs and signatures that characterize the Hsp40 molecule, whether from eukaryotes or from bacteria, also occur in the archaeal homologs, except those that are distinctive for the eucaryal molecules (Table 7).
The Gly-rich domain of the H. cutirubrum Hsp40 is longer and has a higher percentage of Gly than those of the three molecules from methanogens (Table 8). The H. cutirubrum molecule also differs from the methanogen molecules in the distribution of the CxxCxGxG motif (Table 9). In all four molecules, motifs 1 and 2 (counting from the N to the C terminus) are separated by 9 amino acids, motifs 2 and 3 by 18 amino acids, and motifs 3 and 4 by 6 amino acids. However, motif 1 begins farther from the N terminus in the H. cutirubrum molecule than in the others. In consequence, motif 4 lies closer to the C terminus in the molecule from the extreme halophile than in those from the methanogens. The question remains open whether these seemingly unique features of the molecule from the extreme halophile reflect an adaptation to life under high-salinity conditions and/or to changes in salinity.
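The motif comparison described above can be reproduced computationally. The following Python sketch is only an illustration: it locates CxxCxGxG motifs with a regular expression, reports the number of residues between consecutive motifs, and computes the Gly percentage of a segment. The function names and the `toy_sequence` helper are assumptions introduced here, and the demonstration sequence is synthetic (filler alanines around an invented motif instance), not a real Hsp40 sequence.

```python
import re

# CxxCxGxG: Cys, any two residues, Cys, any residue, Gly, any residue, Gly.
MOTIF = re.compile(r"C..C.G.G")


def find_motifs(seq):
    """Return 0-based (start, end) spans of non-overlapping CxxCxGxG matches."""
    return [m.span() for m in MOTIF.finditer(seq)]


def motif_spacings(spans):
    """Number of residues between the end of each motif and the start of the next."""
    return [nxt[0] - cur[1] for cur, nxt in zip(spans, spans[1:])]


def glycine_percent(segment):
    """Percentage of Gly (G) residues in a sequence segment."""
    return 100.0 * segment.count("G") / len(segment) if segment else 0.0


def toy_sequence(lead, gaps, tail, motif="CAKCHGSG", filler="A"):
    """Hypothetical helper: build a synthetic sequence with the given motif layout."""
    parts = [filler * lead]
    for gap in list(gaps) + [tail]:
        parts.append(motif)
        parts.append(filler * gap)
    return "".join(parts)


# Synthetic example with the 9 / 18 / 6 spacing pattern described in the text.
seq = toy_sequence(lead=5, gaps=[9, 18, 6], tail=12)
spans = find_motifs(seq)
print(motif_spacings(spans))        # [9, 18, 6]
print(spans[0][0])                  # residues preceding motif 1: 5
print(len(seq) - spans[-1][1])      # residues following motif 4: 12
```

Applied to real sequences, the offset of motif 1 from the N terminus and the number of residues after motif 4 would allow the comparison between the halophile and methanogen molecules to be made directly.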
Motif 1 in the M. thermoautotrophicum ΔH molecule is, barring a sequencing error, aberrant in that the last residue is Arg (R) instead of Gly (G).
ARCHAEAL GRPE AND GRPE
The Gene
The two archaeal grpE genes whose sequences have been determined are described in Table 10. The genes differ considerably in length; the M. mazei S-6 gene encodes a molecule 35 amino acids longer than that encoded by the M. thermoautotrophicum ΔH homolog. This disparity confirms the poor degree of conservation of grpE and predicts that it will be very difficult to identify homologs in nature on the basis of sequence comparisons alone. The failure to detect GrpE in the eukaryotic-cell cytosol, for example, may be due to its diversity. Methods other than structural analyses may be necessary to unveil the true spectrum of this molecule, as suggested by recent work (212) and by the data in Table 10 (see also below).
The Protein
The amino acid sequence of GrpE is not as highly conserved as that of Hsp70 or even Hsp40 (Table 11). However, if discrete regions, for example regions I and II (294), are compared, the similarity increases (Table 12). These regions and the GrpE motifs (45) are shown in Fig. 9. The functions of these structural features have not been determined. It has been suggested that they might be important portions of the molecule, involved in the interaction of GrpE with the other members of the chaperone machine, Hsp70 and Hsp40 (45, 294).
OCCURRENCE OF HSP70 IN NATURE
The Archaeal Puzzle
The absence of the hsp70 gene in some archaeal species was first noted in the early 1990s (47) and was reported again later, when the gene could not be detected in the hyperthermophiles Methanothermus fervidus, Sulfolobus sp., and M. jannaschii or in the mesophile Methanospirillum hungatei (47, 164). These reports, however, were based on negative results obtained by Northern, Southern, and Western blots with heterologous probes. Consequently, they could not be taken as proof of the absence of the gene.
A definitive confirmation came in 1996, when the sequencing of the M. jannaschii genome did indeed reveal that this organism does not contain hsp70 or the other two genes of the chaperone machine triad, hsp40 and grpE (31). Although this finding helped to give credence to previous negative results obtained by blotting procedures with heterologous nucleic acid and antibody probes and to reaffirm the idea that some organisms may indeed lack hsp70, it raised questions about the earlier finding of the gene in M. mazei S-6. This organism is a methanogen like M. jannaschii. Why is it, then, that the former contains hsp70 while the latter does not? Was the reported M. mazei S-6 gene real or artifactual?
Additional data, obtained before the M. jannaschii genome had been sequenced, indicated that hsp70 occurs in methanosarcinas other than M. mazei S-6 (42). However, once again, these data had been obtained by Northern and Southern blots with a probe for the M. mazei S-6 gene, and the possibility of nonspecific hybridizations could not be ruled out.
The situation was finally clarified when the full genome sequence of another methanogen, M. thermoautotrophicum ΔH, was reported in 1997 (263). Like M. mazei S-6, this methanogen contains hsp70, as well as hsp40 and grpE (Fig. 1).
As things stand today, it is clear that hsp70 occurs in some but not all methanogens. It also occurs in extreme halophiles, but it is not known whether there are organisms in this group that lack the gene; this remains to be demonstrated. The gene does not occur in any of the extreme thermoacidophilic archaea investigated up until now. This had been suggested, as mentioned above, by results obtained by blotting procedures (47, 164) and was confirmed for Archaeoglobus fulgidus (151) and other species by whole-genome sequencing (Table 13). In addition, a search for the hsp70(dnaK) relative hsc66, found in Escherichia coli and other bacteria (260), in the genomes of A. fulgidus, Pyrococcus horikoshii, M. jannaschii, and M. thermoautotrophicum did not reveal its presence (180a).
Several important conclusions may be derived from the data available at present: (i) the absence of hsp70 seems to be a characteristic of archaeal species that live at very high temperatures (hyperthermophiles); (ii) in sharp contrast, no hyperthermophilic bacterium has been found yet that lacks the gene; (iii) hsp70 is scattered among methanogenic archaea that are either mesophiles or thermophiles, like M. mazei S-6 and M. thermophila TM-1, but is absent in other methanogens; (iv) whenever hsp70 was present in a genome, hsp40 and grpE were also found if enough sequencing was done; (v) conversely, genome sequencing has demonstrated that if the hsp70 gene is absent, hsp40 and grpE are also absent; and (vi) the gene has been found in two extreme halophiles, but there are no reports of full-genome sequences for this group of organisms, and so it is not possible to be certain about the situation with them. Do they all have hsp70, and also hsp40 and grpE? We know that at least one of them, H. cutirubrum, has hsp40 in addition to hsp70 (32). It may be argued that archaea did not have the genes to begin with and that some of them received the genes via lateral gene transfer. However, the observations listed above, particularly (iv) and (v), and other data challenge the lateral-gene-transfer hypothesis, at least in its simplest form.
If the genes were acquired by lateral transfer, one must assume that the three genes jumped as a block, or unit, from a bacterium into an archaeon, particularly the pair hsp70 and hsp40. This implies that the unit also carried the intergenic regions. In agreement with the hypothesis, the proteins encoded by the genes, particularly Hsp70 and Hsp40, are of a bacterial type. Against this hypothesis stands the fact that there are no signal sequences of the bacterial type in the intergenic regions. They are typically archaeal. Hence, more data are needed to determine the origin of the archaeal hsp70 locus genes, i.e., archaeal or bacterial, or prearchaeal or prebacterial; and if the origin is archaeal or prearchaeal, more research is necessary to elucidate the mechanism by which these genes came to be in today's species that have them.
Hsp70-Based Phylogenetic Trees
The Hsp70 molecule lends itself to comparative analyses for making phylogenetic trees. It is widely distributed among organisms of the domains Bacteria and Eucarya, and it also occurs among the archaea. The molecule is long enough to allow for many mutations to be detected and for useful alignments, since about 500 residues are conserved in a molecule which on average is a little over 600 amino acids in length. Furthermore, Hsp70 has segments that are highly conserved and others that are less so, which allows the detection of variations while maintaining alignable portions and the identification of structural markers that can easily be seen in all members of the family.
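The notion of conserved residues can be made concrete with a small calculation on a multiple alignment. The Python sketch below is a minimal illustration, assuming that the sequences have already been aligned to equal length with "-" marking gaps. It counts the gap-free columns in which every sequence carries the same residue and computes pairwise percent identity. The function names and the three short sequences are invented for the example and are not real Hsp70 data. By the figures quoted above, about 500 conserved positions in a molecule of a little over 600 residues corresponds to roughly 80% of the molecule.

```python
def conserved_columns(alignment):
    """Indices of gap-free columns in which all aligned sequences share one residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if "-" not in column and len(set(column)) == 1
    ]


def percent_identity(a, b):
    """Identical residues as a percentage of positions where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Synthetic three-sequence alignment (illustrative only).
toy_alignment = [
    "MGKIIGIDLG",
    "MAKIIGIDLG",
    "MGKVIGIDLG",
]
print(len(conserved_columns(toy_alignment)))                 # 8
print(percent_identity(toy_alignment[0], toy_alignment[1]))  # 90.0
```

Requiring strict identity in every sequence is a deliberately simple criterion; published analyses generally also consider conservative substitutions.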
Several groups of investigators have used Hsp70 to make phylogenetic trees (22, 99, 105-111, 142, 239). Most of these do not agree with the classical tree based on comparisons of 16S-18S rRNA sequences (11, 299, 300). In the rRNA-based tree, archaea and eucarya have a common line up to a point at which they diverge. The archaeal-eucaryal and the bacterial lines are shown to branch off earlier from a primitive, common line. Hsp70-based trees do not support the archaea-eucarya sisterhood or the monophyletic character of archaea suggested by the rRNA-based tree. Some Hsp70 trees suggest a close relationship between archaea and gram-positive bacteria on one side and between eucarya and gram-negative bacteria on the other (105, 106).
The reliability of the trees has been questioned lately. This applies to both rRNA- and protein-based trees (66, 67, 80, 99, 105, 106, 188, 189, 192, 286, 298).
Evidence showing that lateral gene transfer events are more frequent and widespread than was previously realized has been accumulating in the last couple of years (1, 4, 66, 67, 213, 298). Thus, finding that the Hsp70 proteins, for example, of two organisms are very similar does not necessarily mean that the organisms are phylogenetically close. It only means that their Hsp70 molecules are close to each other and have a common ancestor. It does not prove that the ancestor molecule was present in a common ancestor of the two organisms. One of the two organisms may have received its Hsp70 via lateral gene transfer and thus mistakenly appears to be a close relative of the donor's ancestral lineage.
In summary, by studying amino acid sequences, one can follow the natural history of genes and their occurrence in nature, namely, their itinerary, as it were, along the series of organisms in which they are found.
A recent study addresses these points, taking advantage of the fact that considerably more Hsp70 sequences are known now than when previous studies were carried out (99). A systematic search for hsp70 among archaea was performed, and the gene was cloned from Aquifex pyrophilus and Thermotoga maritima. These two bacterial species represent the deepest offshoots in the rRNA-based tree. The gene was not found in 8 of the 10 archaea investigated (Table 13). Alignments of 70 Hsp70 sequences, including the 2 new ones from A. pyrophilus and T. maritima, confirmed the previous observations that the M. mazei S-6 protein clusters with the proteins from the low-G+C gram-positive bacteria while the proteins from the extreme halophiles cluster with the proteins from the high-G+C gram-positive bacteria (Fig. 10). Remarkably, the Hsp70 proteins from T. acidophilum and M. thermoautotrophicum ΔH clustered together (Table 3), along with those from the Aquifexales, Thermotogales, and green nonsulfur bacteria (Fig. 10). The two archaeal Hsp70 proteins in this group appeared to have an ancestor in common with T. maritima. In brief, the Hsp70 proteins from the archaeal species T. acidophilum and M. thermoautotrophicum ΔH did not cluster with the proteins from gram-positive species, as suggested by others, but with those from bacteria unrelated to the gram-positive ones. Another unexpected finding was that the Hsp70 from T. maritima did not have the 23-amino-acid insert that was believed to be characteristic of gram-negative bacteria. Thus, T. maritima Hsp70 possesses a structural feature (i.e., a 23-amino-acid gap in its N-terminal quadrant [Fig. 8]) that is assumed to be distinctive for gram-positive bacteria and archaea, despite the fact that this organism is neither a gram-positive bacterium nor an archaeon.
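The presence or absence of the insert can be scored from an alignment in a straightforward way. The Python sketch below is an illustration under stated assumptions: each sequence is given as a row of a multiple alignment with "-" marking gaps, the minimum run length of 23 is taken from the text, and the N-terminal quadrant is approximated as the first 25% of the aligned length. The function names and the test rows are hypothetical and synthetic.

```python
import re


def gap_runs(aligned_row):
    """Return (start, length) for every run of '-' in one row of an alignment."""
    return [(m.start(), m.end() - m.start()) for m in re.finditer(r"-+", aligned_row)]


def has_nterminal_gap(aligned_row, min_len=23, quadrant=0.25):
    """True if a gap run of at least min_len starts within the N-terminal quadrant."""
    limit = len(aligned_row) * quadrant
    return any(
        length >= min_len and start < limit
        for start, length in gap_runs(aligned_row)
    )


# Synthetic rows of aligned length 200 (illustrative only).
row_with_gap = "A" * 20 + "-" * 23 + "A" * 157
row_without_gap = "A" * 200
print(has_nterminal_gap(row_with_gap))     # True
print(has_nterminal_gap(row_without_gap))  # False
```

A row scored True lacks the insert relative to the sequences that carry it, which is the feature the text describes for T. maritima, the archaea, and the gram-positive bacteria.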
There are several possibilities to explain these observations. For example, if one assumes that the eucarya and archaea had a common ancestor that contained hsp70, it is possible that both received the gene but some archaea lost it afterwards. A second possibility is that there was no hsp70 in the common ancestor and the gene was acquired after the three lineages separated, with some archaea being excluded. A third possibility is that there was a common ancestor which contained hsp70 and that the three lineages received the gene vertically but the archaeal lineage lost it very early (the species that have the gene today received it via lateral transfer). Finally, if one disregards the common-ancestor idea, another possibility is that the archaea never had the gene while the bacteria and eucarya had it from the beginning. Here, also, archaea that have the gene today must have acquired it by lateral transfer.
The above possibilities and others one might easily think of are more or less improbable depending on (i) how one interprets available data from other studies; (ii) what molecule(s) and criteria were used to generate these data; (iii) what methods were applied to obtain, study, and statistically validate the data; and (iv) what classification scheme was adopted as a master scaffolding to compare with the Hsp70-based tree.
In any case, all the possibilities mentioned share an important characteristic: they stimulate research in this fascinating area of biology and evolution. Molecular phylogeny and detailed analyses of proteins and other macromolecules have already demonstrated their enormous value as tools for research. They are instrumental in uncovering relationships between organisms, the origins of the eukaryotic cell components, the functional meaning of structural motifs, and the role of domains in large proteins of eukaryotes whose ancestors are smaller proteins in more primitive organisms.
FUNCTIONS OF ARCHAEAL MOLECULAR CHAPERONES
Biochemistry
There is little information on the functions of the archaeal Hsp70, Hsp40, and GrpE molecules, as assessed in vitro or in vivo. Since they are so similar in sequence and/or structural features to the homologs from bacteria (Tables 4 and 7; Fig. 9), it may be assumed that both groups of proteins have the same functions and participate in the same mechanisms as molecular chaperones and regulators of their own synthesis.
For example, the bacterial Hsp70(DnaK) is an ATPase and binds ATP and unfolded polypeptides (substrate) (191, 214, 236). Hsp70(DnaK)-ATP has a lower affinity for substrate than Hsp70(DnaK)-ADP. Thus, ATP hydrolysis, which is enhanced by Hsp40(DnaJ) via interaction of its J domain with at least two sites on the Hsp70(DnaK) molecule (87, 270), promotes substrate binding, and the polypeptide is maintained in an extended form, avoiding aggregation. Interaction with GrpE, the nucleotide exchange factor, regenerates the Hsp70(DnaK)-ATP complex, lowers the affinity for the substrate, and releases it; the polypeptide may then be taken up by the chaperonin system for final folding (see "Chaperonins" below). Hsp40(DnaJ) is thought to also bind the substrate, before Hsp70(DnaK) does, and to tag the polypeptide so that the Hsp70(DnaK)-ATP complex "sees" it and binds it (176, 191, 236, 272). Also, E. coli DnaK interacts with ribosome-bound trigger factor (62). There is no experimental information on whether the archaeal Hsp70 system operates like that of bacteria and, if so, to what extent the details described above are similar or dissimilar. This area deserves to be explored. It has the potential for revealing how a bacterial-like molecular machine works in a cell whose genome bears eucaryal-like features and probably encodes accessory (regulatory, auxiliary) factors of the eucaryal type while lacking the complementary Hsp60 system of the bacterial type.
Regulation: More Archaeal Puzzles
It would be of particular interest to explore whether a self-regulating circuit similar to that described for E. coli (references 6, 20, 90, and 259 and references therein), or some variation of it, also operates in archaea. Hsp70(DnaK) in some bacteria participates along with Hsp40(DnaJ) and perhaps also GrpE in the degradation of σ32 as a way to down-regulate hsp70(dnaK) transcription. How much of this mechanism operates in archaea? We know that archaea do not have σ factors, and so regulatory circuits for Hsp70(DnaK) synthesis cannot include this factor. Do they include another?
We also know that in eukaryotes, the Hsp70 protein intervenes to prevent trimerization of the heat shock factor (HSF) and thus precludes induction of the hsp70 gene (205, 253). Archaea do not have an identifiable HSF, or heat shock element (75), in the hsp70 promoter region (180a). How, then, is the archaeal hsp70 gene regulated? Does Hsp70 participate in this process? If so, how? Does it interact with an archaeal equivalent of the eucaryal HSF or with another kind of regulator?
These are but some of the fascinating questions posed by recent research with the archaeal hsp70 locus genes. The answers to these questions will shed light on the details of the transcription initiation machinery for stress-inducible genes in archaea and will help us to understand the evolution and principles of the transcription mechanisms in the three phylogenetic domains, not just in the Archaea.
CHAPERONINS
Chaperonin Systems I and II
One of the most striking features of archaea is that although they are prokaryotes, they do not have a chaperonin (Hsp60) system like that of the other prokaryotes, the bacteria, but instead have a eukaryotic type of chaperonin complex. No exception to this rule has yet been reported; all archaea investigated have a chaperonin system which resembles that of the eukaryotic cytosol.
The bacteria have the "bacterial" type (group) I chaperonins, i.e., the genes/proteins groEL/GroEL and groES/GroES (191, 214, 235, 247). A three-dimensional (3-D) view of the barrel-like GroEL/S complex is shown in Fig. 11. What makes the lack of this chaperonin system in archaea very intriguing is that these organisms, at least some of them, have an Hsp70(DnaK) molecular chaperone machine very similar to the bacterial homolog, as described in previous sections. The absence of type I chaperonins and the presence of the Hsp70(DnaK) chaperone machine in a single organism is per