
Microbiology and Molecular Biology Reviews, March 2001, p. 44-79, Vol. 65, No. 1
1092-2172/01/$04.00+0   DOI: 10.1128/MMBR.65.1.44-79.2001

Genome of the Extremely Radiation-Resistant Bacterium Deinococcus radiodurans Viewed from the Perspective of Comparative Genomics

Kira S. Makarova (1, 2), L. Aravind (2), Yuri I. Wolf (2), Roman L. Tatusov (2), Kenneth W. Minton (1), Eugene V. Koonin (2), and Michael J. Daly (1, *)

Uniformed Services University of the Health Sciences, Bethesda, Maryland 20814-4799 (1), and National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20814 (2)

SUMMARY
INTRODUCTION
    Extreme Radiation Resistance
    Isolation
    Cell Structure
    DNA Damage Resistance
    Logistics of Extreme DNA Damage Resistance
    DNA Repair Pathways
SEQUENCE ANALYSIS
    Metabolic Pathways
        Energy production and conversion.
        Carbohydrate metabolism.
        Amino acid and nucleotide metabolism.
        Metabolism of lipids and cell wall components.
        Metabolism of coenzymes.
    Translation System
    Replication, Repair, and Recombination
    Stress Response and Signal Transduction Systems
    Distinctive Features of Predicted Operon Organization and Transcription Regulation
    Expansion of Specific Protein Families
    Proteins with Unusual Domain Architectures
    Horizontal Gene Transfer
    Mobile Genetic Elements
        Inteins.
        Insertional sequences.
        Small noncoding repeats.
        Prophages.
    Evolutionary Relationships to Other Bacteria and Phylogeny
CONCLUSIONS
AVAILABILITY OF COMPLETE RESULTS
ACKNOWLEDGMENTS
ADDENDUM IN PROOF
REFERENCES


SUMMARY

The bacterium Deinococcus radiodurans shows remarkable resistance to a range of damage caused by ionizing radiation, desiccation, UV radiation, oxidizing agents, and electrophilic mutagens. D. radiodurans is best known for its extreme resistance to ionizing radiation; not only can it grow continuously in the presence of chronic radiation (6 kilorads/h), but also it can survive acute exposures to gamma radiation exceeding 1,500 kilorads without dying or undergoing induced mutation. These characteristics were the impetus for sequencing the genome of D. radiodurans and the ongoing development of its use for bioremediation of radioactive wastes. Although it is known that these multiple resistance phenotypes stem from efficient DNA repair processes, the mechanisms underlying these extraordinary repair capabilities remain poorly understood. In this work we present an extensive comparative sequence analysis of the Deinococcus genome. Deinococcus is the first representative with a completely sequenced genome from a distinct bacterial lineage of extremophiles, the Thermus-Deinococcus group. Phylogenetic tree analysis, combined with the identification of several synapomorphies between Thermus and Deinococcus, supports the hypothesis that it is an ancient group with no clear affinities to any of the other known bacterial lineages. Distinctive features of the Deinococcus genome as well as features shared with other free-living bacteria were revealed by comparison of its proteome to the collection of clusters of orthologous groups of proteins. Analysis of paralogs in Deinococcus has revealed several unique protein families. In addition, specific expansions of several other families including phosphatases, proteases, acyltransferases, and Nudix family pyrophosphohydrolases were detected. Genes that potentially affect DNA repair and recombination and stress responses were investigated in detail. 
Some proteins appear to have been horizontally transferred from eukaryotes and are not present in other bacteria. For example, three proteins homologous to plant desiccation resistance proteins were identified, and these are particularly interesting because of the correlation between desiccation and radiation resistance. Compared to other bacteria, the D. radiodurans genome is enriched in repetitive sequences, namely, IS-like transposons and small intergenic repeats. In combination, these observations suggest that several different biological mechanisms contribute to the multiple DNA repair-dependent phenotypes of this organism.


INTRODUCTION

Extreme Radiation Resistance

The evolution of organisms that are able to grow continuously at 6 kilorads (60 Gy)/h (119) or survive acute irradiation doses of 1,500 kilorads (50-52) is remarkable, given the apparent absence of highly radioactive habitats on Earth over geologic times. Notwithstanding a few natural fission reactors like those that gave rise to the Oklo uranium deposits (Gabon) 2 billion years ago (151), the radiation levels in the Earth's surface environments, including its waters containing dissolved radionuclides, have provided only about 0.05 to 20 rads/year over the last 4 billion years (193). DNA damage is readily inflicted on organisms by a variety of other common physicochemical agents (e.g., UV light or oxidizing agents) or nonstatic environments (e.g., cycles of desiccation and hydration or cycles of high and low temperatures) and it seems more likely that radiation resistance evolved in response to chronic exposure to nonradioactive forms of DNA damage.
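The doses in this section are quoted in rads, kilorads, and megarads, with one gray equivalent given in the text. As a minimal sketch of the arithmetic, the snippet below converts the quoted doses to grays (1 Gy = 100 rad) and compares the chronic dose rate with the upper background estimate of 20 rads/year. The function names and the 365-day year are assumptions made for illustration only.

```python
RAD_PER_GRAY = 100.0       # 1 Gy = 100 rad (exact, by definition)
HOURS_PER_YEAR = 24 * 365  # assumption: a 365-day year


def kilorads_to_gray(kilorads):
    """Convert a dose in kilorads to grays."""
    return kilorads * 1000.0 / RAD_PER_GRAY


def chronic_to_background_ratio(kilorads_per_hour, background_rads_per_year):
    """Ratio of a chronic dose rate to a natural background dose rate."""
    chronic_rads_per_year = kilorads_per_hour * 1000.0 * HOURS_PER_YEAR
    return chronic_rads_per_year / background_rads_per_year


chronic_gy_per_h = kilorads_to_gray(6)     # chronic growth condition
acute_gy = kilorads_to_gray(1500)          # acute survival dose
ratio_vs_upper_background = chronic_to_background_ratio(6, 20)

print(f"6 kilorads/h   = {chronic_gy_per_h:g} Gy/h")
print(f"1,500 kilorads = {acute_gy:g} Gy")
print(f"6 kilorads/h is about {ratio_vs_upper_background:,.0f} times 20 rads/year")
```

Under these assumptions, the chronic dose rate tolerated by D. radiodurans is more than a million times the highest background rate cited above, which illustrates why natural radioactivity alone is an unlikely selective pressure.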

Isolation

Bacteria belonging to the family Deinococcaceae are some of the most radiation-resistant organisms discovered, and they are vegetative, easily cultured, and nonpathogenic (23, 137, 138). Despite their ubiquitous distribution and apparent ancient derivation, only seven species of Deinococcaceae have been described (69, 138, 145). Deinococcus radiodurans strain R1 was the first of the deinobacteria to be discovered and was isolated in Oregon in 1956 (7) from canned meat that had spoiled following exposure to X rays. Culture yielded a red-pigmented, nonsporulating, gram-positive coccus that was extremely resistant to ionizing radiation, UV light, hydrogen peroxide, and numerous other agents that damage DNA (119, 137, 142, 215), as well as being highly resistant to desiccation (135). It is an aerobic, large (1- to 2-µm) tetrad-forming soil bacterium that is best known for its supreme resistance to ionizing radiation. It not only can survive acute exposures to gamma radiation that exceed 1,500 kilorads without dying or undergoing induced mutation (53), but it also displays luxuriant growth in the presence of high-level chronic irradiation (6 kilorads/h) (119, 212) without there being any effect on its growth rate or ability to express cloned foreign genes (31). For comparison, Escherichia coli will not grow and is killed in the presence of 6 kilorads/h (119), and an acute dose of only 100 to 200 kilorads is needed to sterilize a culture. Similarly, vegetative cells of Bacillus spp. cannot grow at 6 kilorads/h, and Bacillus spores show a 5-order-of-magnitude decrease in viability following acute exposure to 200 to 1,000 kilorads (207).

Shortly after the isolation of D. radiodurans R1 in 1956, a second strain of D. radiodurans (SARK) was discovered as an air contaminant in a hospital in Ontario (R. G. E. Murray and C. F. Robinow, Seventh International Congress for Microbiology, 1958). Since then, six closely related radioresistant species have been identified: Deinococcus radiopugnans from haddock tissue (54), Deinococcus radiophilus from Bombay duck (122), Deinococcus proteolyticus from the feces of Lama glama (108), the rod-shaped Deinococcus grandis from elephant feces (158), and the two thermophilic species Deinococcus geothermalis and Deinococcus murrayi from hot springs in Portugal and Italy, respectively (69). These species together form a distinct eubacterial phylogenetic lineage, believed to be most closely related to the Thermus genus. Based on 16S rDNA sequence analysis, it has been proposed that Deinococcus and Thermus form a eubacterial phylum (168). To date, the natural distribution of the deinococci has not been explored systematically. Isolations have occurred worldwide but are diverse and patchy in distribution. In addition to those noted above, sites of isolation include damp soil near a lake in England (133), weathered granite from the Antarctic Dry Valleys (44), irradiated medical instruments, and air purification systems (10, 41, 114, 145). As suggested above, it is possible that their extreme proficiency at DNA repair is related to the selective advantage in environments where they are prone to damage during long periods of desiccation (135). More recently, it has been proposed that adaptation could also occur in permafrost or other semifrozen conditions where cryptobiotic microbes with extremely long generation times could be selected with metabolic processes able to repair the unavoidable accumulation of background radiation-induced DNA damage (171).

Of the deinococcal species, D. radiodurans (138) and D. geothermalis (48) are the only ones for which a system of genetic transformation and manipulation has been developed. Now adding to this genetic technology is the recent complete sequencing and annotation of the D. radiodurans genome (218). The D. radiodurans strain R1 genome consists of two chromosomes (DR_Main [2.65 Mbp] and DR412 [412 kbp]), one megaplasmid (DR177 [177 kbp]), and one plasmid (46 kbp) (218), carrying 3,195 predicted genes. This combination of factors has positioned D. radiodurans as a promising candidate for the study of mechanisms of DNA damage and repair, as well as its exploitation for practical purposes such as cleanup and stabilization of radioactive waste sites. For example, D. radiodurans is being engineered to express metal-detoxifying and organic compound-degrading functions in environments heavily contaminated by radiation; 7 × 10^7 m^3 of ground and 3 × 10^9 liters of groundwater were contaminated by radioactive waste generated in the United States during the Cold War (31, 48, 119).

Cell Structure

The cell envelope of D. radiodurans is unusual in terms of its structure and composition (3). Although the cell envelope of D. radiodurans is reminiscent of the cell walls of gram-negative organisms (32, 61, 208, 221), Deinococcus often stains gram positive; this may result from the inability of its thick peptidoglycan layer to decolorize. Its cell envelope consists of the plasma and outer membranes, which are separated by a 14- to 20-nm peptidoglycan layer and an uncharacterized "compartmentalized layer." At least six layers have been identified by electron microscopy, with the innermost layer being the plasma membrane. The next layer is a peptidoglycan-containing cell wall and appears to be perforated (the holey layer), but it has no known physiological significance. The third layer appears to be divided into numerous fine compartments (the compartmentalized layer). The fourth layer is the outer membrane, and the fifth layer is a distinct electrolucent zone. The sixth layer consists of regularly packed hexagonal protein subunits (the S-layer, or hexagonally packed intermediate layer), typical of other bacterial S-layers (26, 115, 206). A few strains of Deinococcus also exhibit a dense carbohydrate coat (25, 26, 118, 187, 205, 208, 221). Only the cytoplasmic membrane and the peptidoglycan layer are involved in septum formation during cell division. The other layers are regarded as a sheath, since they surround groups of cells and form on the surface of daughter cells as they separate (187, 208, 221).

The chemical structure of the peptidoglycan layer of D. radiodurans SARK has been investigated using mass spectrometry (165), and the structure obtained is consistent with the A3beta classification given to D. radiodurans (32, 176, 186). Thermus thermophilus HB8 (166) also has an A3beta murein chemotype, and its peptidoglycan is built from the same monomeric subunit, underscoring the phylogenetic relationship between these genera.

The plasma and outer membranes appear to have the same lipid composition (206), yet there is no evidence for conventional lipopolysaccharides. The fatty acid composition of D. radiodurans is distinctive (69); attempts to identify hydroxy fatty acids, lipid A, and heptoses have been unsuccessful (145). A mixture of 15-, 16-, 17-, and 18-carbon saturated and monounsaturated acids is present, while polyunsaturated, cyclopropyl, and branched-chain fatty acids are not detectable. D. radiodurans has the distinguishing characteristic of lacking conventional phospholipids found in other bacteria (204). Of the D. radiodurans membrane lipid, 43% is composed of phosphoglycolipids containing a series of alkylamines as structural components, hitherto unknown as lipid constituents (8, 9). These lipids appear to be derived from the same precursor, a novel phosphatidylglycerolalkylamine, and form when the precursor is glycosylated with galactose or glucosamine. Although glucosamine-containing lipids have been found in other species, notably members of the genus Thermus (160), these phosphoglycolipids are, at present, considered unique to D. radiodurans.

DNA Damage Resistance

The most extensively studied of the deinococci is D. radiodurans. Unlike other deinobacterial species, it is amenable to genetic manipulation due to its natural transformability by both high-molecular-weight chromosomal DNA and plasmid DNA (131, 143, 189). The natural transformability of D. radiodurans has facilitated the development of a variety of techniques for genetic manipulation of this organism (31, 49-52, 81-83, 119, 120, 131, 189-191), rendering it a highly tractable target for molecular investigation. Transformability, however, is not integral to DNA damage resistance, since the other deinobacterial species are no less radioresistant than D. radiodurans (142) but are not transformable by any forms of DNA (D. geothermalis, however, is an exception since it has recently been transformed with plasmid [48]). In the exponential growth phase, D. radiodurans does not die in response to ionizing irradiation up to 0.5 megarad and shows 10% survival at 0.8 megarad (142), while exponentially growing E. coli, for comparison, shows a very small shoulder of complete resistance and 10% survival at 15 kilorads, a 50-fold difference in resistance (188). In the stationary growth phase, D. radiodurans does not die until exposed to 1.5 megarads, over 100-fold greater resistance than stationary-phase E. coli (53, 137). In exponential phase, D. radiodurans is 33-fold more resistant to UV than is E. coli (197). The DNA of D. radiodurans is not unusually protected from damage: at high irradiation doses it sustains the amount of damage expected for other organisms in vivo, on the order of 150 to 200 double-stranded DNA breaks (DSBs) per haploid chromosome at 1.5 megarads under aerobic irradiation conditions, all of which are mended within hours following irradiation (53, 107, 123), and its DNA is no less susceptible than that of E. coli to UV in vivo (183). 
Furthermore, survivors of extreme ionizing radiation, UV, or bulky chemical-adduct exposures do not show any mutagenesis greater than that occurring after a single round of normal replication (197, 198). On the other hand, D. radiodurans is mutable by N-methyl-N'-nitro-N-nitrosoguanidine and other agents that can cause mispairing of bases during replication (197, 198). Of the many forms of damage imposed on DNA by ionizing radiation, DSBs are considered the most lethal due to the inherent difficulty in their repair, since no single-strand template for accurate repair remains in the double helix (117). Other organisms, such as E. coli, can repair at most a few DSBs per chromosome without dying (112).

Logistics of Extreme DNA Damage Resistance

D. radiodurans contains 8 to 10 haploid genome copies during exponential growth and 4 genome copies during stationary phase (87, 89). In comparison, E. coli contains four or five haploid chromosomes during vigorous exponential growth, and this multiplicity in E. coli has been shown to be necessary for repair of DSBs (112). However, multiplicity in itself is insufficient for radioresistance. Micrococcus luteus and Micrococcus sodonensis also contain multiple genome equivalents but are radiosensitive (142). Azotobacter vinelandii, which contains up to 80 chromosomes per cell (164, 172), is quite sensitive to UV damage (125), to which D. radiodurans is highly resistant. Using various growth media, Harsojo et al. (89) were able to vary the genomic complement of D. radiodurans between 5 and 10 during the exponential phase and demonstrated that there was no correlation between chromosome number and radioresistance. The authors concluded that if chromosome multiplicity is important in repair, five or fewer chromosomes are sufficient. On high-level irradiation (1.75 megarads), D. radiodurans can reconstitute its genome from 1,000 to 2,000 DSB fragments compared to the maximum capability of E. coli of restoring its genome from 10 to 15 DSB fragments. Since most recombination models postulate that all DSB fragments search all others for homology during repair, this would call for an astronomical number of combinations to ensure genome restoration in D. radiodurans. Therefore, it may be that D. radiodurans can use redundant information in ways that other organisms do not. An alternative repair model has been postulated for D. radiodurans in which its chromosomes are always aggregated and aligned, thus dramatically simplifying the search for repair templates (51, 139) following DNA damage.
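To put rough numbers on the homology search problem described above, the following sketch counts two quantities for the fragment numbers quoted in the text: the number of distinct fragment pairs, and the number of possible linear orderings of the fragments (reported as a base-10 logarithm). Both counting models are illustrative assumptions; the source does not specify how the "astronomical number of combinations" is to be counted.

```python
from math import comb, lgamma, log


def pairwise_comparisons(fragments):
    """Number of distinct fragment pairs if every fragment is checked against every other."""
    return comb(fragments, 2)


def log10_orderings(fragments):
    """log10 of n!, the number of linear orderings of n fragments (assumed model)."""
    return lgamma(fragments + 1) / log(10)


# Fragment counts quoted in the text: 10 to 15 for E. coli, 1,000 to 2,000 for D. radiodurans.
for n in (10, 15, 1000, 2000):
    print(f"{n:>5} fragments: {pairwise_comparisons(n):>9,} pairs, "
          f"about 10^{log10_orderings(n):.0f} orderings")
```

Pair counts grow quadratically (45 pairs for 10 fragments versus 499,500 for 1,000), whereas the number of orderings grows factorially, which is one way to read why a prealigned arrangement of chromosomes would simplify the search so dramatically.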

Repair of DNA damage in D. radiodurans follows an ordered series of events (137, 142). Physical repair of lesions requires conditions compatible with growth (212). For colony formation assays, this is simply achieved by plating on nutrient agar. For liquid cultures, this requires fresh nutrient medium and adjustment of cellular density to a level suitable for exponential growth. This has been demonstrated in liquid cultures for excision of pyrimidine dimers (30), repair of DSBs (56, 53), and recombinational repair of plasmids and chromosomes (51). While growth-promoting conditions are essential for removal of lesions from cellular DNA, the cells themselves do not immediately divide. Indeed, there is a dramatic inhibition of growth for extended durations following acute exposure to nonlethal (or partially lethal) DNA damage. This growth lag is associated with limited degradation of chromosomal DNA intrinsic to the DNA repair processes. Degradation proceeds at a rate independent of dose (the initial extent of damage), but its duration is positively correlated with dose (137; also see reference 142 and citations therein). Thus, the greater the dose, the longer the growth lag, which may exceed the duration of DNA degradation. Following a nonlethal exposure of stationary-phase D. radiodurans to 1.5 megarads under anoxic conditions, dilute liquid cultures of D. radiodurans show no growth for about 10 h and then resume rapid exponential growth (53). The dose-dependent delay of the onset of cellular replication suggests the existence of a checkpoint that monitors the extent of repair and accordingly controls the initiation of replicative DNA synthesis. During the period of stasis, it can be expected that the cell undergoes several phases of repair. The first can be termed cellular cleansing, and it involves several modalities, including the export of damaged DNA components. 
Initially, the products formed are DNA fragments about 2,000 bp long and consist of a mixture of damaged and undamaged nucleotides and nucleosides (22, 213). These products are found in the cytoplasm and also in the surrounding growth medium, suggesting that D. radiodurans exports the DNA degradation products once they are formed (reference 22 and citations therein). The removal of damaged nucleotides outside the cell might protect the organism from elevated levels of mutagenesis by preventing the reincorporation of damaged bases during DNA synthesis (22). Remaining intracellular mutagenic precursors could be sanitized via pyrophosphohydrolases of the Nudix superfamily (for "nucleoside diphosphate linked to some other moiety x"), the founding member of which is the repair enzyme MutT (28). MutT has an 8-oxo-dGTPase activity, which produces 8-oxo-dGMP plus inorganic pyrophosphate. Since 8-oxo-dGTP is highly mutagenic, the enzyme "sanitizes" the nucleoside triphosphate pool. D. radiodurans is markedly rich in Nudix proteins, some of which may act to sanitize other mutagenic DNA precursors (218). Finally, activated oxygen species with long half-lives may be eliminated by superoxide dismutases and catalases such as SodA and KatA (129). During this initial phase of cellular cleansing, amino acids, nucleotides, nucleosides, sugars, and phosphate may be imported into the cell while precursors for DNA synthesis are made by way of ribonucleoside diphosphate reductase (104). Subsequent phases of repair are genomic restoration and coordination of repair activities.

DNA Repair Pathways

D. radiodurans has repair pathways that include excision repair, mismatch repair, and recombinational repair. Generally, no marked error-prone SOS response is observed in D. radiodurans (142). However, there have been a few reports consistent with SOS response, where preexposure to low doses of ionizing radiation, UV, or hydrogen peroxide causes a low level of subsequent increased resistance to DNA damage (twofold or less) (199, 215). Since the SOS response is not always mutagenic, the absence of DNA damage-induced mutagenesis observed in D. radiodurans cannot be taken as evidence against the existence of the SOS response in this bacterium. Photoreactivation is not present (142), and it has been reported that the adaptive response to alkylation damage is also absent (170). It is known that following DNA damage, there are changes in the cellular abundance of proteins, with enhanced synthesis of four to nine proteins, as judged by sodium dodecyl sulfate-polyacrylamide protein gels (86, 200). Included in this group of proteins are probably RecA (36), elongation factor Tu (200), and KatA (129). While there are many predicted DNA repair genes and pathways in the D. radiodurans genome (218), only a few of its DNA repair enzymatic activities and/or genes have been evaluated for their biochemical activities. The UvrA protein and its gene have been detected (1, 149), and it has been identified as a component of nucleotide excision repair. UV endonuclease-beta has been purified and found to be a 36-kDa manganese-requiring protein, which is thus far only known to recognize UV-induced pyrimidine cyclobutane dimers, incising them as an endonuclease rather than as a glycosylase (63-65). Other repair-related activities detected in extracts of D. radiodurans include uracil DNA glycosylase (132), a thymine glycol glycosylase, and a deoxyribophosphodiesterase (144). DNA polymerase I activity is present and is necessary for resistance to both UV and ionizing radiation (81). 
Both UvrA and DNA polymerase I deficiencies can be fully complemented by the expression of E. coli UvrA and DNA polymerase I proteins in D. radiodurans mutants, respectively (1, 81). However, this is not the case for D. radiodurans recA, which appears to play a more important role in the extreme radiation resistance phenotype.

The D. radiodurans RecA protein has been detected and its gene has been sequenced; it shows greater than 50% identity to the E. coli RecA protein (81). Mutants with mutations in this gene are highly sensitive to UV and ionizing radiation. Unlike UvrA and DNA polymerase I proteins, expression of E. coli RecA in D. radiodurans does not complement the RecA deficiency and appears to have no effect on D. radiodurans (82, 36). Expression of D. radiodurans RecA in E. coli has been reported to be lethal (36); however, recently it has been successfully expressed in E. coli with less toxicity (M. M. Cox and K. W. Minton, unpublished data), and it has been reported to complement E. coli RecA deficiency (150).

D. radiodurans RecA has recently been purified and characterized (M. M. Cox, unpublished data). In vitro, it has been shown to catalyze the spectrum of activities classically attributed to RecA proteins: (i) it forms striated filaments on single-stranded DNA and double-stranded DNA; (ii) it promotes an efficient DNA strand exchange reaction; and (iii) it has a DNA-dependent nucleoside triphosphatase activity. However, D. radiodurans RecA is distinct from other well-characterized RecAs (e.g., from the gram-negative E. coli) in its nucleoside triphosphatase and DNA strand exchange activities. Unlike E. coli RecA, D. radiodurans RecA does not hydrolyze ATP at pH 7.5, although it exhibits some ATPase activity at lower pHs. In contrast, it is very effective at hydrolyzing dATP over a broad pH range.

The existence of a very efficient recA-independent single-stranded DNA annealing repair pathway has been reported for D. radiodurans (50). This pathway is active during and immediately after DNA damage and before the onset of recA-dependent repair. It can repair about one-third of the 150 to 200 DSBs per chromosome following exposure to 1.75 megarads (50). It has also been reported that unlike other organisms, D. radiodurans RecA is not present in the undamaged deinococcal cell but is synthesized only following DNA damage and following repair. D. radiodurans RecA is apparently expressed in D. radiodurans only following extreme DNA damage (36), and it is noteworthy that the recA-defective D. radiodurans strain rec30 is more radiation resistant than E. coli (138). It is possible that the greater resistance of rec30 arises from the presence of multiple copies of its genome in combination with the single-stranded DNA-annealing repair pathway, which is fully functional in this mutant (50). Together, this evidence supports the idea that D. radiodurans RecA is not necessary for the repair of nonextreme DNA damage (~10 DSB/chromosome, ~100 kilorads) and that D. radiodurans RecA may be activated only when DNA is highly damaged (>100 kilorads) (M. J. Daly, unpublished data).


SEQUENCE ANALYSIS

To further our understanding of the functions of individual genes and cellular systems in D. radiodurans as well as their relationship with other organisms, we undertook a detailed computational analysis of the D. radiodurans genome. In addition to the standard genome annotation procedure of The Institute for Genomic Research (218), we used several approaches for deeper protein characterization. In particular, we systematically applied sensitive profile-based methods that included PSI-BLAST, which constructs a position-dependent weight matrix from multiple alignments generated from the BLAST hits above a certain expectation value (e-value) and allows iterative database searches using the information derived from such a matrix (5, 6), IMPALA (175), which searches the matrix against profile databases, and SMART (179, 178), which uses a Hidden Markov Model algorithm (59) to search a sequence against a multiple-alignment database. In addition to the database of profiles included in the SMART system, two other profile collections were used: (i) 5,640 profiles derived from the structurally characterized domains contained in the SCOP database (100, 219), and (ii) 150 profiles for widespread domains primarily involved in different forms of signaling that were employed in previous genome comparisons (40, 163, 175).
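The idea of a position-dependent weight matrix derived from a multiple alignment can be illustrated with a deliberately simplified sketch. The code below builds log-odds column scores from a toy alignment using a flat pseudocount and a uniform background. These choices, the function names, and the toy sequences are all assumptions for illustration; PSI-BLAST itself applies sequence weighting and data-dependent pseudocounts that are not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a list of per-column log2-odds score tables from equal-length aligned sequences."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies
    pssm = []
    for column in zip(*alignment):
        # Count residues in this column, ignoring gaps and unknown characters.
        counts = Counter(residue for residue in column if residue in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        scores = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(scores)
    return pssm


def score_segment(pssm, segment):
    """Sum the position-specific scores of an ungapped segment of matching length."""
    return sum(column[residue] for column, residue in zip(pssm, segment))


# Hypothetical four-residue alignment, not taken from any real protein family.
toy_alignment = ["GDSL", "GDSV", "GESL", "GDTL"]
pssm = build_pssm(toy_alignment)

print(f"score of GDSL: {score_segment(pssm, 'GDSL'):.2f}")
print(f"score of WWWW: {score_segment(pssm, 'WWWW'):.2f}")
```

A segment resembling the alignment scores positively at each column, while an unrelated segment scores negatively. In an iterative search, hits scoring above a threshold would be added to the alignment and the matrix rebuilt for the next round.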

Paralogous families of proteins encoded in the D. radiodurans genome were initially identified by comparing the complete set of D. radiodurans proteins to itself (after filtering for low-complexity regions with the SEG program [220]) using the PSI-BLAST program run for three iterations and clustering proteins by single linkage (clustering threshold e-value, 0.001) using the GROUPER program (214). One sequence from each cluster was used to generate a position-specific matrix by running an iterative PSI-BLAST search first against a D. radiodurans protein and then against the nonredundant protein database. These profiles were used to search for additional family members in the D. radiodurans proteome. Families that were recognized by the same profile were joined into superfamilies.
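Single-linkage clustering with an e-value threshold can be sketched with a union-find structure: any two proteins joined by a hit at or below the threshold end up in the same cluster, even if they are linked only through intermediates. This is a minimal illustration and not the GROUPER implementation; the protein identifiers, the hit list, and the choice of a "less than or equal to" comparison are assumptions.

```python
def single_linkage_clusters(proteins, hits, threshold=1e-3):
    """Cluster proteins by single linkage over pairwise hits with e-value <= threshold."""
    parent = {protein: protein for protein in proteins}

    def find(protein):
        # Follow parent links to the cluster representative, shortening the path as we go.
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]
            protein = parent[protein]
        return protein

    for query, subject, evalue in hits:
        if evalue <= threshold:
            root_query, root_subject = find(query), find(subject)
            if root_query != root_subject:
                parent[root_subject] = root_query

    clusters = {}
    for protein in proteins:
        clusters.setdefault(find(protein), set()).add(protein)
    # Largest clusters first, members in sorted order for reproducible output.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: (-len(members), members))


# Hypothetical identifiers and e-values for illustration only.
proteins = ["P1", "P2", "P3", "P4", "P5"]
hits = [
    ("P1", "P2", 1e-20),
    ("P2", "P3", 5e-4),   # passes the 0.001 threshold, links P3 to P1 via P2
    ("P3", "P4", 0.5),    # too weak, so P4 is not pulled into the first cluster
    ("P4", "P5", 1e-10),
]
clusters = single_linkage_clusters(proteins, hits)
print(clusters)
```

Single linkage is permissive by design: P1 and P3 are clustered together without any direct hit between them, which is why families recovered this way are usually checked against profiles before being merged into superfamilies.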

The phylogenetic affinities of D. radiodurans were explored using the COGNITOR program. This program assigns query proteins to conserved protein families that consist of apparent orthologs, termed clusters of orthologous groups (COGs) (201, 202). The functional assignments embedded in the COG database were also used to reconstruct metabolic pathways and other functional systems in D. radiodurans together with the KEGG (105) and WIT (157) databases.

Analysis of the phyletic distribution of homologs of Deinococcus proteins detected in database searches was performed using the TAX_COLLECTOR program of the SEALS package (214). This was followed by phylogenetic tree construction for specific cases. Multiple alignments for phylogenetic reconstruction were generated using the ClustalW program (93) and, when necessary, further adjusted on the basis of PSI-BLAST search outputs. Phylogenetic trees were constructed using the neighbor-joining method with bootstrap replications as implemented in the NEIGHBOR program of the PHYLIP package (67).

Intergenic repeats were identified using the BLASTN program (6). As a result of this analysis, 2,007 D. radiodurans proteins were assigned to 1,272 COGs, which placed them into specific phylogenetic and functional contexts. In conjunction with profile analysis, this allowed us to define the domain architectures of multidomain proteins, to identify protein families that are unusually expanded in D. radiodurans, and to assign function and/or structure to a number of proteins previously described as hypothetical.

Below, we present an overview of the principal functional systems of D. radiodurans as determined by these analyses and describe unusual aspects of the genome that may be relevant to understanding the extreme resistance of this organism to radiation, desiccation, and other stress factors.

Metabolic Pathways

Analysis of the genome of D. radiodurans shows that it has a typical set of proteins for housekeeping and regulatory functions. As demonstrated by the COG analysis, the metabolic capabilities of D. radiodurans are similar to those of E. coli (152) but less diverse (Table 1); D. radiodurans is an obligatory heterotroph (212). Table 1 lists and compares the standard metabolic pathways of Deinococcus to the corresponding pathways in E. coli, Synechocystis, Bacillus subtilis, and Mycobacterium tuberculosis.

TABLE 1.   Basic metabolic pathways in D. radiodurans

Energy production and conversion. Probably the most interesting feature of the systems for energy production in D. radiodurans is that, unlike most other free-living bacteria, it uses the vacuolar type of proton ATP synthase instead of the F1F0 type. Vacuolar (V)-type H+-ATPase is typical of eukaryotes and archaea; all archaea have a conserved operon that consists of eight genes encoding the ATPase subunits. This operon is partially conserved (with some of the subunits missing) in a minority of characterized bacteria, where it replaces the F1F0 ATPase, e.g., in Deinococcus, Thermus, spirochetes, chlamydiae, and Enterococcus. The scattered distribution of the V-ATPase operon among bacteria, in contrast to its conservation in archaea, suggests that this operon has been disseminated in the bacterial world by horizontal transfer. The genes for the standard five complexes of electron transport and oxidative phosphorylation are present in D. radiodurans, with a few exceptions, but some genes of the cytochrome bd quinol oxidase complex are missing. Given that this complex is active predominantly under low-oxygen conditions in other bacteria, its apparent loss in Deinococcus is consistent with D. radiodurans being strictly aerobic. Interestingly, D. radiodurans encodes a multisubunit Na+/H+ antiporter (DR0880 to DR0886) that is characteristic of thermophiles and a few other bacteria (B. subtilis and Rickettsia prowazekii), but is absent in E. coli, Synechocystis, and Mycobacterium. It has been shown that this system is necessary for cells to grow under alkaline conditions (95).

Carbohydrate metabolism. The D. radiodurans genome appears to encode functional pathways for glycolysis, gluconeogenesis, the pentose phosphate shunt, and the tricarboxylic acid (TCA) cycle. A few genes are missing, but these may not be essential since they are also absent in some bacteria in which these pathways are functional (Table 1). The D. radiodurans Entner-Doudoroff pathway may be disrupted since a key enzyme, 2-keto-3-deoxy-6-phosphogluconate aldolase (an ortholog of E. coli Eda), is missing. However, this enzyme is also absent in archaea, where the Entner-Doudoroff pathway appears to be functional, and therefore the enzyme could be displaced by a nonorthologous aldolase in Deinococcus. The glyoxylate bypass, which has been described only for E. coli and M. tuberculosis, is present and complete in Deinococcus. It remains unclear, however, why some intermediates of the TCA cycle cannot support the growth of D. radiodurans (212). As expected of a heterotroph, Deinococcus encodes several enzymes for complex carbohydrate metabolism; for some of these, e.g., glycogen-debranching enzymes (DR0405 and DR0191), phylogenetic analysis suggests that horizontal transfer from eukaryotes has occurred (data not shown). Other enzymes for sugar conversion, as well as most of the known sugar transport systems, are encoded in D. radiodurans, and this is consistent with the observation that a variety of different sugars can be used by this bacterium as carbon and energy sources (212).

Amino acid and nucleotide metabolism. D. radiodurans is unable to use ammonia as a nitrogen source despite the presence of apparently functional genes for glutamate ammonia ligase and carbamoyl-phosphate synthase, which are key enzymes for ammonia utilization. While there is currently no explanation for this, it has been shown that D. radiodurans can use amino acids effectively as a nitrogen source and that sulfur-containing amino acids appear to be the most readily utilized form of nitrogen. Notably, D. radiodurans lacks the standard pathways for cysteine and methionine biosynthesis yet is able to produce these amino acids using unidentified biosynthetic pathways when provided with other amino acids (212). The absence of all key enzymes for lysine biosynthesis is another puzzling feature of Deinococcus metabolism since it does not require lysine for growth (212). All of the other standard amino acid pathways appear to be functional. Although a few genes seem to be missing from these pathways, they are also absent in some of the other free-living bacteria, where they probably have been displaced by paralogous or nonhomologous enzymes. Some of the genes for enzymes of arginine metabolism are likely to have been acquired by the common ancestor of the Thermus-Deinococcus group from archaea (see Tables 10 and 11).

Most of the known genes for nucleotide metabolism are present in D. radiodurans. The most conspicuous gap is the absence of purine nucleoside phosphorylase, a key enzyme of purine salvage, which has been found in all free-living organisms investigated. Another noteworthy absence is that of two related enzymes of pyrimidine salvage, cytidine deaminase and dUTPase (important in preventing DNA damage), which are present in most bacteria. As may be the case for absent amino acid biosynthetic genes, there might also be unidentified enzymes that compensate for these pyrimidine salvage activities.

Metabolism of lipids and cell wall components. D. radiodurans lacks only one gene from the standard bacterial set of genes coding for enzymes of lipid metabolism, namely, phosphatidylglycerophosphate synthase, which is involved in the biosynthesis of acidic phospholipids. With the exception of the archaeon Methanococcus jannaschii, phosphatidylglycerophosphate synthase has been detected in all organisms with completely sequenced genomes. Its absence in Deinococcus, therefore, is unexpected. Deinococcus encodes multiple copies of several fatty acid biosynthesis genes, of which some could have been transferred horizontally into Deinococcus from distant taxa (Table 1). Consistent with the unusual structure of the peptidoglycan layer in Deinococcus (see above), we identified all essential genes for ornithine metabolism but did not detect several key enzymes for diaminopimelic acid biosynthesis.

Metabolism of coenzymes. Our experimental data show that Deinococcus is capable of de novo biosynthesis of all principal coenzyme components except for nicotinic acid (212). Consistent with this result, we find that genes for several key enzymes of NAD biosynthesis are missing in the genome, which is unusual since this pathway is present in most free-living organisms. Several other conventional pathways for coenzyme biosynthesis are also not complete (Table 1), but, given the ability of Deinococcus to grow in the absence of these coenzymes, it probably encodes functional analogs of these.

Translation System

The translation apparatus is arguably the most highly conserved and uniform of cellular systems, and D. radiodurans is no exception. It contains a typical bacterial complement of translation machinery components. This general uniformity notwithstanding, there are several unique features in the translation apparatus of Deinococcus that have been revealed both experimentally and by genome analysis. In particular, Deinococcus has a unique repertoire of genes and reactions for the formation of glutaminyl-tRNA and asparaginyl-tRNA. Generally, there are two pathways for the activation of glutamine and asparagine: (i) direct charging of tRNAGln and tRNAAsn by glutaminyl- and asparaginyl-tRNA synthetase (Gln-RS and Asn-RS), respectively, and (ii) transamidation of Glu-tRNAGln and Asp-tRNAAsn by the respective amidotransferases (AdT), Glu-AdT and Asp-AdT (101). Usually, the two pathways and the corresponding genes are not present in the same organism. The transamidation pathway for glutamine is predominant in bacteria and archaea, whereas glutaminyl-tRNA synthetase is typical of eukaryotes and gamma proteobacteria (101). In the case of asparagine, archaea primarily use the transamidation pathway, eukaryotes use the direct pathway, and bacteria have a patchy distribution of both systems. Glu-AdT has been studied in detail; it consists of three subunits encoded by the gatABC genes (45). The nature of Asp-AdT is less clear; it has been suggested that it shares A and C subunits with Glu-AdT whereas the B subunit (the likely determinant of tRNA binding) is unique. D. radiodurans encodes Asn-RS, Gln-RS, and the GatABC proteins (45). A recent genome survey has shown that the two systems also coexist in several members of the proteobacteria (85), but Deinococcus is the only nonproteobacterial species with this combination of asparagine and glutamine activation systems. 
Furthermore, in addition to the intact GatB, Deinococcus encodes a C-terminal domain of this protein that is fused to Gln-RS (Fig. 1). The GatABC complex of D. radiodurans is capable of catalyzing the formation of both Gln-tRNAGln and Asn-tRNAAsn, but in vivo apparently only Asn-tRNAAsn is formed, since the discriminating Glu-RS of Deinococcus does not produce the mischarged Glu-tRNAGln (45). In contrast, Deinococcus encodes two copies of Asp-RS, a typical bacterial discriminating copy and a nondiscriminating copy that probably was acquired from the archaea by horizontal gene transfer (45) (see below). The nondiscriminating Asp-RS produces Asp-tRNAAsn, which serves as the substrate for the GatABC enzyme. It has been suggested that the main role of Asn-tRNAAsn formation in Deinococcus is the synthesis of asparagine, rather than its incorporation into proteins, since Deinococcus does not encode orthologs of known asparagine synthetases (45). Given that GatB is thought to be the tRNA-binding component of Glu-AdT and Asp-AdT, the C-terminal GatB-related domain in Deinococcus Gln-RS could enhance the specificity of this enzyme for tRNAGln. This domain is missing in other Gln-RSs, but the respective organisms do not encode GatB, which in Deinococcus could compete with Gln-RS for binding tRNAGln.


FIG. 1.   Examples of unique domain architectures of Deinococcus proteins.

The repertoire of aminoacyl-tRNA synthetases (aminoacyl-RSs) in Deinococcus also shows several other peculiarities. In addition to the corresponding functional enzymes, Deinococcus encodes truncated and apparently inactive forms of Glu-RS and Ala-RS, as well as apparently active paralogs of Trp-RS and His-RS. Possible horizontal transfer of these additional enzymes as well as other aminoacyl-RSs from archaea and thermophilic bacteria could be readily examined once more of these organisms are sequenced.

Replication, Repair, and Recombination

D. radiodurans contains all the typical bacterial genes that comprise the basal DNA replication machinery (Table 2). The number of paralogs and the domain organization of the DNA polymerase III α-subunit vary among the major bacterial divisions in terms of the presence of an active or inactivated PHP domain, which is predicted to possess phosphatase activity, and the proofreading 3'-5' exonuclease domain. D. radiodurans encodes a single α-subunit that is most similar to proteobacterial polymerases and does not contain the 3'-5' exonuclease, which is encoded by a separate gene orthologous to E. coli dnaQ. Unlike the proteobacterial orthologs, however, the Deinococcus polymerase contains an apparently active PHP domain. This appears to represent the ancestral bacterial state of the replicative DNA polymerase, which is also seen in bacteria like Synechocystis and Aquifex. In addition to typical proteins involved in replication, Deinococcus encodes DNA polymerase X, which is similar to the eukaryotic DNA polymerase β (references 27 and 217 and references therein) and is relatively uncommon in prokaryotes. Deinococcus polymerase X contains an N-terminal nucleotidyltransferase domain and a C-terminal PHP hydrolase domain, the same domain architecture that is seen in homologs from B. subtilis and Methanobacterium thermoautotrophicum; this conservation of domain organization suggests horizontal transfer of the polymerase X gene (13). Notably, along with a few other bacteria, such as Synechocystis and Aquifex, Deinococcus encodes three small nucleotidyltransferases (DR1806, DR0679, and DR0248), which are expanded in archaea (13). These "minimal" nucleotidyltransferases are typically accompanied by a small protein that is fused to the nucleotidyltransferase in the DR0248 protein; the function of this protein has not been characterized directly but is likely to be coupled to that of the nucleotidyltransferases.

                              
TABLE 2.   Genes coding for replication, repair, and recombination functions in D. radiodurans

The repertoire of DNA-associated proteins in Deinococcus is similar to that in other bacteria, but some unique features were noticed. Like other bacteria, Deinococcus encodes an ortholog of the chromosomal DNA-binding protein HU, which is believed to play a central role in DNA packaging and also as a cofactor in recombination (reference 184 and references therein). Interestingly, the sequenced genome of the Deinococcus R1 strain contains three adjacent open reading frames (ORFs) encoding fragments of the single-stranded DNA-binding protein (SSB) but lacks a complete gene for SSB; so far, all sequenced bacterial genomes encoded an intact SSB. Because of the 10-fold coverage during the TIGR sequencing project (218), two sequencing errors in this short gene would seem unlikely. Two explanations arise: (i) Deinococcus could encode an as yet unrecognized SSB analog (or an extremely diverged homolog), making the SSB gene expendable; or (ii) a tripartite SSB gene could be expressed by a translational readthrough mechanism or even a unique RNA-editing mechanism.

Bacterial DNA repair includes several partially redundant pathways and generally shows considerable flexibility (20, 60, 70). We investigated the predicted repair system components of D. radiodurans in detail, to detect any possible correlation with its exceptional radioresistant and desiccation-resistant phenotype. Generally, it appears that Deinococcus possesses a typical bacterial system for DNA repair and that, commensurate with the genome size, its repair pathways even appear to be less complex and diverse than those of bacteria with larger genomes, such as E. coli and B. subtilis. At the same time, there are several interesting and unusual aspects of the predicted layout of the repair systems in Deinococcus that may be linked to its phenotype (Table 2).

The nucleotide excision repair system that consists of the UvrABC excinuclease and the UvrD and Mfd (transcription-repair coupling factor) helicases is fully represented in D. radiodurans. Also present are the main components of the base excision repair system, including several nucleotide glycosylases and endonucleases, namely, MutM (formamidopyrimidine and 8-oxoguanine DNA glycosylase); MutY (8-oxoguanine DNA glycosylase and apurinic DNA endonuclease-lyase); two paralogous uracil DNA glycosylases (Ung homologs); an additional, recently identified enzyme that has the same activity but is unrelated to Ung (DR1751) (174); endonucleases III (Nth) and V (YjaF); and exonuclease III (XthA). Deinococcus lacks two key enzymes involved in the repair of UV-damaged DNA in other organisms, namely, endonuclease IV (AP-endonuclease) and photolyase. Instead, it encodes a typical bacterial UV endonuclease III (thymine glycol-DNA glycosylase) and, more unexpectedly, a TIM-barrel fold nuclease characteristic of eukaryotes and most closely related to the UV endonuclease of Neurospora (20, 223). Eukaryotic-type topoisomerase IB is a truly unexpected protein to be identified in the Deinococcus genome and also could play a role in UV resistance (see "Horizontal gene transfer" below).

The repertoire of recombinational repair genes in Deinococcus includes orthologs of most of the E. coli genes involved in this process (Table 2), but the RecBCD recombinase is missing. While this complex is not universal in bacteria, it is a major component of recombination systems in most free-living species. In Deinococcus, where recombination is thought to be an important contributor to damage-resistance, the absence of this ATP-dependent exonuclease is unexpected. Deinococcus does encode an apparent ortholog of one of the helicase-related subunits of this complex, RecD, but not the other subunits. The RecD protein in Deinococcus is unusual in that it contains an N-terminal region of about 200 amino acid residues that consist of three tandem predicted HhH DNA-binding domains; this unusual domain organization of the RecD protein is shared with B. subtilis and Chlamydia. Such dissociation of RecD from the RecB and RecC subunits is not unique to Deinococcus; "solo" RecD-related proteins are also present in M. jannaschii and in yeast. The function(s) of RecD, once outside the recombinase complex, is unknown.

Another component of the recombinational repair system in Deinococcus that has an unusual domain architecture is the RecQ helicase. It contains three tandem copies of the C-terminal helicase-RNase D (HRD) domain, instead of the single copy present in all other bacteria except Neisseria, which similarly possesses three copies (141) (also see below). RecQ sequences from Neisseria and Deinococcus are more similar to each other than to any other homologs, which, together with the distinctive triplication of the HRD domain, indicates that the recQ gene has been exchanged between bacteria from these two distant lineages. In addition, Deinococcus encodes a protein (DR2444) that contains an HRD domain and a domain homologous to cystathionine γ-lyase; this is the first example of an HRD domain that is not associated with either a helicase or a nuclease (although it is possible that the domain organization of this protein is an artifact caused by a frameshift). This propagation of the HRD domain in Deinococcus could contribute to the repair phenotype given the interactions of RecQ with RecA in recombination (88).

The methylation-dependent mismatch repair system of D. radiodurans includes the MutS and MutL ATPases and exonuclease VII (XseA). Orthologs of the site-specific methylases Dcm and Dam, which are associated with mismatch repair, are not readily detectable. It appears likely, however, that other distantly related DNA methylases predicted in D. radiodurans could perform similar functions.

Like other bacteria with large genomes, D. radiodurans encodes the LexA repressor-autoprotease (DRA0344), which in E. coli and B. subtilis controls the expression of the SOS regulon. In addition, unlike any of the other bacterial genomes studied, D. radiodurans encodes a second, diverged copy of LexA (DRA0074), which retains the same arrangement of the helix-turn-helix (HTH) DNA-binding domain and the autoprotease domain. Attempts to identify LexA-binding sites and the composition of the putative SOS regulon in D. radiodurans have been unsuccessful (M. S. Gelfand, personal communication). This suggests that D. radiodurans does not possess a functional SOS response system, which is in agreement with the results of previous experimental studies (142). Furthermore, Deinococcus does not encode proteins of the DinP/UmuC family, nonprocessive DNA polymerases that play a critical role in translesion DNA synthesis and associated error-prone repair such as SOS repair in E. coli (117).

In addition to orthologs of well-characterized repair proteins discussed in this section, Deinococcus encodes several unusual proteins and expanded protein families that are less confidently associated with repair but might contribute to the unusual effectiveness of the repair and recombination systems in this bacterium; these proteins are discussed below in the section on the unique features of the Deinococcus proteome.

Stress Response and Signal Transduction Systems

D. radiodurans encodes a broad spectrum of proteins that have been associated with various forms of stress response in other bacteria as well as several proteins that appear to be unique and could contribute to more specific forms of the stress response (Table 3). Orthologs of almost all known genes involved in different stress responses in other bacteria (109) are present in Deinococcus. The few stress response proteins that are missing are either specific to the adaptation of a particular organism to its environment or, when of more general significance, likely to be replaced by nonorthologous proteins with similar functions. For example, instead of using the OtsA and OtsB proteins for the synthesis of the osmoprotection disaccharide trehalose, Deinococcus probably uses an alternative pathway via trehalose synthase (DR0933), which has been recently characterized in Thermus (209). Trehalose plays a major role in the desiccation resistance of E. coli (216) and is also likely to be important in Deinococcus. Deinococcus has two additional genes for trehalose metabolism: maltooligosyl trehalose synthase (DR0463), which provides yet another route of trehalose formation, and trehalohydrolase (DR0464). These genes apparently form a mobile operon and probably have been acquired by Deinococcus through horizontal transfer, since their closest homologs are found in Rhizobium, where they appear to have the same operon organization (130).

                              
TABLE 3.   Stress response-related genes in D. radiodurans

Among the proteins associated with oxidative stress response, Deinococcus encodes three catalases (DR1998, DRA0259, and DRA0146), two of which are highly similar to one another and to catalases from other bacteria whereas the third is only distantly related to other catalases. The gene for this unusual predicted catalase (DRA0146) is closely linked to and probably forms an operon with a gene for a peroxidase (DRA0145). DRA0146 is most similar to its ortholog from Rhizobium, and these two proteins are, in turn, more closely related to eukaryotic catalases from plants than to bacterial catalases. This suggests that Deinococcus acquired the gene for this catalase from a nitrogen-fixing bacterium, which, in turn, had hijacked it from a plant. In contrast, DRA0145 is distinctly closer to certain peroxidases from fungi, such as Galactomyces geotrichum, than to bacterial forms from Neisseria, E. coli, and actinomycetes. Thus, the entire operon probably has been acquired horizontally. A broad spectrum of other genes that may be involved in the stress response include DRA0149 (agmatinase), DR1353 (an acid-inducible apolipoprotein amino-acetyltransferase), and DR2299, DR1605, and DR2245 (genes of the two-component response and cyclic diguanylate signaling system), which again are very similar to homologs from the family Rhizobiaceae, suggesting significant horizontal gene transfer between these distant bacteria.

In addition to the well-characterized components of stress response systems, Deinococcus encodes several proteins and entire protein families whose specific roles are unknown but are likely to be important for the multiple stress resistance phenotypes of the bacterium. An example of a poorly studied but potentially important system is the "addiction module" response (2), which is encoded by two genes, mazE and mazF (DR0416 and DR0417, respectively). MazF is a stable protein that is toxic to bacteria, whereas MazE protects cells from the toxic effect of MazF and is degraded by the ClpP serine protease. Expression of these two genes is regulated by ppGpp, which is produced by the RelA enzyme (or the bifunctional enzyme SpoT) in response to amino acid starvation. On the basis of these studies, Aizenman et al. (2) have proposed a model of programmed bacterial cell death dependent on the MazEF proteins. Currently, Deinococcus is the only bacterium other than E. coli, the model system in which the role of these proteins was elucidated, that has both genes and retains their operon organization. Another example of poorly characterized genes that are likely to be involved in stress response are two proteins (DR2056 and DR1940) that are homologous to the E. coli heat shock protein HslJ (42). One of these proteins, DR1940, contains three copies of the HslJ domain, a feature that has not yet been seen in this protein family. All the HslJ domains contain two conserved cysteines that could function as a redox pair, with the protein itself being a disulfide bond chaperone. The only prominent chaperone that is missing without an obvious replacement is HSP90, but this gene is also absent in archaea and bacterial thermophiles and therefore appears to be nonessential.

The signal transduction system of D. radiodurans has chimeric features of prokaryotic and eukaryotic systems. This form of chimerism in the signaling system is becoming increasingly evident in several bacterial lineages such as actinomycetes, myxobacteria, and spore-forming firmicutes that undergo cellular differentiation. The typically bacterial components of the signaling system include the two-component systems with the histidine kinase and receiver domains (159) and the cyclic diguanylate signaling system with the GGDEF, EAL, and HD-GYP domains, which appear to function as cyclases and phosphodiesterases (75). In addition, these signaling domains are typically combined with small molecule and protein-binding domains, such as PAS and GAF (17, 203), and the conformation-signaling HAMP domain (16). The two-component phosphorelay system is well developed in Deinococcus, which encodes 23 histidine kinase domains and 29 receiver domains that form several combinations with the GAF and PAS domains. This system is expected to play a major role in sensing redox, light, and other environmental stimuli. Consistent with this, DRA0050, which is orthologous to the cyanobacterial and plant phytochromes, has been shown to be a photoreceptor involved in the regulation of pigment biosynthesis (55), which is likely to affect resistance to DNA-damaging agents (35). Genes encoding two proteins that consist of a sensory transduction histidine kinase and a receiver domain (DRB0028 and DRB0029) appear to be coregulated with a σB operon (DRB0024 to DRB0027). This operon encodes the antisigma factor-regulatory system and is known to be involved in stress response in other bacteria (92, 109). As a whole, this array of six genes appears to comprise a stress response module unique to Deinococcus.

Deinococcus encodes 16 GGDEF domain-containing proteins, which suggests a major role for this uniquely bacterial module that is predicted to function as a cyclase in diguanylate signaling. The two predicted distinct phosphodiesterases of this system, the HD-GYP and EAL domains (six and four copies, respectively, in Deinococcus), complement each other in terms of their copy numbers, as has been observed for other bacterial genomes. These domains tend to combine with the stimulus-sensing PAS and GAF domains. One such interesting architecture is the combination of the GAF domain and the HD-GYP domain in two Deinococcus proteins (Fig. 2). The representation of this signaling system in Deinococcus is comparable to that in other bacteria with moderate-sized to large genomes.


FIG. 2.   Distinct domain architectures of selected proteins implicated in signal transduction in Deinococcus.

While Deinococcus lacks flagella and is unlikely to be capable of chemotactic motility, it possesses certain remnants of the chemotactic signaling system that are likely to signal through alternative pathways. In particular, there are three methyl-accepting chemotactic receptor proteins (DRA0352, DRA0353, and DRA0354), each containing two HAMP domains, but there is no methyltransferase of the chemotactic signaling pathway. These three proteins are encoded by genes located in the vicinity of genes for a CheA-like histidine kinase and a CheY-like receiver domain, which suggests that the methyl-accepting receptor forms a single functional unit with this two-component system protein. Given the apparent absence of chemotaxis, the methyl-accepting receptors could form a scaffold for binding of the CheA kinase, which might signal the availability of amino acids in the environment.

The tetratricopeptide repeats (TPR) seem to play a special role in Deinococcus signaling. In three distinct proteins, these repeats are combined with typically bacterial signaling modules (Fig. 2). The TPR modules are likely to mediate protein-protein interactions within molecular complexes involving these proteins, as documented in eukaryotic systems (113). WD40 proteins, which often serve as interaction partners to TPR in eukaryotes (210), are also expanded in Deinococcus and could cooperate with the TPR-containing proteins. Of particular interest is another group of at least four β-propeller proteins that appear to be closer to the YWTD class of propellers than to WD40s (DR0960, DR1725, DR2062, and DR2484). In actinomycetes, these propeller domains are fused to protein kinases and are likely to perform specific protein-protein interaction functions in signaling (163).

The prominence of the "eukaryotic" component of the signal transduction systems in Deinococcus is underscored by the fact that it encodes 11 Pkn2-type kinases and 1 kinase of the RIO1 family (DR2209), which is typical of archaea and eukaryotes (121) and was detected in bacteria for the first time. This number is greater than in most other prokaryotes (121), suggesting that protein-serine/threonine phosphorylation-dependent regulatory pathways play a major role in Deinococcus. Consistent with this, Deinococcus also encodes PP2C phosphatases and an FHA domain, which typically function in conjunction with the serine/threonine kinases.

Several protein families that have been implicated in stress response and signal transduction in other organisms have undergone specific expansion in Deinococcus; these are discussed in some detail below.

Distinctive Features of Predicted Operon Organization and Transcription Regulation

Generally, the genome organization of D. radiodurans is similar to that of other bacteria (218). Many functionally related genes are organized into clusters that are likely to comprise operons, including such common ones as ribosomal protein genes, ATP synthase, NADH dehydrogenase, and various ATP-binding cassette (ABC)-type transport systems. Beyond these generic operons, however, several unusual gene clusters were detected, and some of these are likely to be related to the unique features of Deinococcus (Table 4).

                              
TABLE 4.   Some unusual predicted operons in D. radiodurans

The first group of such unique gene arrays includes paralogous genes that encode protein families overrepresented in Deinococcus, such as amino-acetyltransferases, Nudix hydrolases, and genes of the TerE and DinB/YfiT families (see below). Some of these clusters appear to have evolved by tandem duplication within the Deinococcus lineage, e.g., an acetyltransferase cluster (DR2254 and DR2255) and a Nudix cluster (DR0783 and DR0784). Other clusters of paralogs clearly resulted from a single horizontal transfer event, e.g., the group of tellurium resistance genes (DR2220 to DR2226) that are related to the corresponding gene cluster on the broad-host-range plasmid R478. Finally, some clusters that consist of related genes with apparent phylogenetic affinities to different bacterial lineages (e.g., an acetyltransferase cluster [DR0675 to DR0677]) seem to have originated within the Deinococcus lineage through gene translocation. The second group of unusual predicted operons includes rare gene clusters that probably were acquired by horizontal transfer. Some of these operons could contribute to damage resistance, e.g., DNA repair-related functions (deoxypurine kinase operon [DR0298 and DR0299], eukaryotic-type uracil-DNA-glycosylase and topoisomerase IB [DR0689 and DR0690]), DNA transformation-related functions (competence genes [DR1854 and DR1855], restriction-modification system [DRB0143 and DRB0144]), stress response (DR0389 and DR0390; DR1160 and DR1161), and pigment biosynthesis (DR0861 and DR0862).

Two operons (DR0853 to DR0854 and DR2180 to DR2181) each consist of a gene for a small GTPase of the Ras/Rab family and a gene coding for a small protein of an uncharacterized family that is widespread in bacteria and archaea (L. Aravind and E. V. Koonin, unpublished data). The orthologous GTPase in Myxococcus is important for gliding motility (90), suggesting a role for these proteins in signaling. Expansion of the uncharacterized protein family encoded by the genes adjacent to the GTPase is seen in Streptomyces and Deinococcus and appears to result from relatively recent duplications (DR0616, DR0995, and DR1612), with three of these genes forming a cluster in the chromosome (DR0993 to DR0995). Juxtaposition of these genes with genes for Ras/Rab-GTPases is frequently observed in other genomes, including Myxococcus and archaeal and bacterial thermophiles, suggesting that they form a mobile operon, with the encoded proteins being functionally coupled.

Another predicted operon (DR0332 to DR0335) that could have been horizontally transferred from cyanobacteria encodes components of a protein kinase-dependent regulatory pathway. These include two active Pkn2-type serine/threonine protein kinases with Zn ribbons, a PP2C-type phosphatase with an N-terminally disrupted Pkn2 kinase domain, and a protein that contains a phosphoserine-binding FHA domain combined with a Zn ribbon domain orthologous to proteins from cyanobacteria (FraH) and actinomycetes (121). The phosphorylation system encoded by this operon may play a role in cellular differentiation, with the Zn-ribbon-FHA protein functioning as the downstream effector that regulates transcription.

The general picture of transcription regulation in Deinococcus emerging from genome analysis is similar to that seen in other bacteria. Among Deinococcus gene products, we detected 104 HTH domain-containing proteins that are predicted to function as transcriptional regulators. This number is close to those detected in other free-living bacteria with similar genome sizes (14); the repertoire of HTH-containing proteins identified in Deinococcus covers most of the diversity of prokaryotic transcriptional regulators. Deinococcus encodes seven members of the MerR/SoxR family of regulators (a greater number than in other characterized bacteria except B. subtilis), which could participate in the regulation of various stress response pathways (24, 155). Another family of predicted HTH regulators of unknown specificity that is expanded in Deinococcus consists of eight paralogs (e.g., DR1954); such an expansion is unprecedented in other bacteria and suggests a unique role in the regulation of a distinct set of genes.

Expansion of Specific Protein Families

Expansion of specific protein families has been observed for several complete genomes (43, 126, 194). Sometimes there is a clear relationship between the expansion of a particular protein family and the adaptation of the respective organism to its environment. Examples of such adaptive expansions include ferredoxins in autotrophic archaea (126), several families of enzymes involved in lipid degradation in M. tuberculosis (43), and c-type cytochromes in the metal-reducing bacteria Shewanella (148).

In the D. radiodurans genome, we detected several expansions, some of which appear to be related to stress response and damage control (Fig. 3). In particular, several different families of hydrolases are overrepresented compared to other sequenced genomes. These include MutT-like pyrophosphatases (Nudix), calcineurin-like phosphoesterases, lipase/epoxidase-like (α/β) hydrolases, subtilisin-like proteases, and sugar deacetylases. In addition to such specifically expanded families, several other families of hydrolases are present in Deinococcus in elevated numbers although they are also common in other bacteria and are not shown here. Some of these hydrolases are likely to be involved in the decomposition of damage products ("cell cleaning") under stress conditions. Independent expansions of certain families, such as α/β hydrolases in Deinococcus and Mycobacterium and subtilisin-like proteases in Deinococcus and Bacillus, are noteworthy and probably correlate with the adaptation of these organisms to the facultative or obligatory heterotrophic life-style (43, 116).


FIG. 3.   Specific protein family expansion in Deinococcus.

Expansion of the Nudix hydrolase protein superfamily is one of the most prominent features of the Deinococcus genome. The MutT protein, the prototype for this superfamily, has been identified as the central component of an antimutagenic system responsible for preventing incorporation of 8-oxo-dGTP into DNA (136). Subsequently, it has been shown that different MutT-like enzymes use a variety of substrates, and the Nudix pyrophosphohydrolases have been tentatively defined as a superfamily of "house-cleaning" enzymes that destroy potentially deleterious compounds (28). A detailed analysis of Nudix proteins in Deinococcus revealed five distinct multidomain proteins, in which the MutT domain is combined with other domains (Fig. 4). Orthologous proteins for three of them also exist in other bacteria. In particular, the family typified by E. coli YjaD contains a Zn ribbon module, which is probably involved in nucleic acid binding. Another Deinococcus protein contains an apparently inactivated (with the catalytic motif REXXEE missing) MutT domain combined with a TagD-like nucleotidyltransferase domain and is likely to perform a regulatory function. A second TagD-like nucleotidyltransferase from Deinococcus (DRA0273) is very similar, but the MutT domain has apparently eroded beyond recognition. Orthologs of a third Nudix protein, which contains an uncharacterized C-terminal domain, are present in Streptomyces, Mycobacterium, and Synechocystis. Again, in most of them, the Nudix pyrophosphohydrolase appears to be inactivated, suggesting a regulatory function.


FIG. 4.   Distinct domain architectures of proteins containing the MutT-like domain. aa, amino acids; SAM, S-adenosylmethionine.

Two closely related Deinococcus proteins contain a duplication of the MutT domain that has not yet been detected in any other organism. Three more Nudix proteins are specifically related to the proteins containing this duplication, and the genes for two of these are adjacent on the chromosome (DR0783 and DR0784). These seven related MutT domains appear to form a Deinococcus-specific family of Nudix hydrolases. Another Nudix protein consists of three domains, namely, S-adenosylmethionine (SAM)-dependent methylase, MutT, and cytosine deaminase (Fig. 4). This domain combination is unique to Deinococcus and suggests that the protein is involved in an as yet uncharacterized repair pathway.

Altogether, Deinococcus encodes 23 Nudix superfamily proteins that contain 25 individual MutT domains. Some of these proteins are likely to be repair enzymes with known activities, including the MutT ortholog (DR0261), while others will have novel functions, as suggested by the domain combinations discussed above. Other functions are likely to include utilization of damage products formed under various stress conditions. It is unlikely that a distant ancestor of the Deinococcus lineage encoded all these MutT-containing proteins. Rather, it appears that the heterogeneous collection of these proteins encoded by D. radiodurans was assembled via the mixed routes of serial duplication, particularly in the distinct deinococcal family of seven Nudix domains, and horizontal gene transfer.
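The domain counts quoted above can be reconciled with a short tally. The following is a minimal sketch in Python, not part of the original analysis; it assumes that every Nudix protein other than the two carrying the duplication contributes exactly one MutT domain, and the function names are hypothetical.

```python
# Counts are taken from the text; nothing here is measured independently.
# Assumption: every protein without the duplication carries one MutT domain.

def total_mutt_domains(n_proteins, n_with_duplication):
    """One domain per protein, plus one extra for each duplicated protein."""
    return n_proteins + n_with_duplication


def family_domains(n_with_duplication, n_single_domain):
    """Domains in the Deinococcus-specific family of related MutT domains."""
    return 2 * n_with_duplication + n_single_domain


# 23 Nudix proteins, 2 of which contain a duplicated MutT domain.
print(total_mutt_domains(23, 2))   # 25

# 2 duplicated proteins plus 3 related single-domain proteins.
print(family_domains(2, 3))        # 7
```

Under this assumption the tally reproduces both figures given in the text: 25 individual MutT domains overall and 7 domains in the deinococcal family.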

Amino group acetyltransferases comprise another family that appears to have undergone independent expansion in Deinococcus and in Bacillus. Acetyltransferases of this type participate in various metabolic pathways, including lipid biosynthesis, and in regulatory systems. Except for B. subtilis, other bacteria have fewer than half the number of these enzymes found in D. radiodurans. Like the acetylases in other bacteria, these enzymes are likely to participate in detoxification of antibiotics and possibly of toxic products that arise upon DNA damage, as well as in regulatory protein acetylation. A Deinococcus-specific family of acetyltransferases, which consists of at least 11 proteins, is most similar to acetyltransferases involved in peptide antibiotic resistance, such as streptothricin acetyltransferase of Streptomyces (98). These acetyltransferases might aid the survival of Deinococcus in the presence of peptide antibiotics secreted by other bacteria, with which it has to compete for nitrogen and carbon sources as a part of its heterotrophic life-style.

Enzymes of the α/β hydrolase superfamily are mainly neutral lipases or acetyl esterases, but some of them have unusual substrate specificity, e.g., heroin esterase from Rhodococcus (169) and antibiotic bialaphos acetyl esterase from Streptomyces (167); other proteins of this superfamily possess unexpected activities, e.g., metal ion-free oxidoreductase from Streptomyces (91). The expanded families of α/β hydrolases in Deinococcus could be exploited for xenobiotic metabolism and/or the biogenesis of the complex cell envelopes (see above).

In several cases, expansion of specific subfamilies within common protein families appears to be important. Deinococcus encodes three paralogous proteins (DR0202, DR0494, and DR2273) related to the FlaR protein from gram-positive bacteria. One of these proteins has been shown to affect DNA topology and is osmoregulated when expressed in E. coli (173). It also influences the expression of supercoiling-sensitive promoters and is considered to be a chromatin-associated protein (173). Topological changes of DNA could play a role in DNA repair of Deinococcus, and the FlaR homologs might be involved in these processes. The FlaR subfamily belongs to the P-loop-containing kinase superfamily that includes nucleotide, gluconate, and shikimate kinases (224). Deinococcus encodes three paralogous proteins (DR0609, DR2467, and DR2139) that belong to another uncharacterized subfamily of these kinases which is also represented in several other bacteria.

Another interesting case is the LigT protein family, which is found in several bacteria, archaea, and eukaryotes and includes RNA ligases and predicted 2',5'-cyclic nucleotide phosphodiesterases. In addition to the LigT ortholog (DR2339), Deinococcus encodes two predicted phosphodiesterases of this family (DR1000 and DR1814) that may participate in RNA metabolism or signaling.

Expansion of several other protein families is consistent with the unusual stress resistance capabilities of D. radiodurans. For example, Deinococcus encodes seven small nuclease domains related to the McrA endonuclease of E. coli (94). The McrA-like nuclease domain is part of three multidomain protein architectures that seem to be unique to Deinococcus (see below). This previously unreported propagation of McrA-like nucleases could make a contribution to the repair potential of Deinococcus. In evolutionary terms, the McrA domain, like the MutT domain, apparently has been expanded in Deinococcus through a recent duplication (DR1312 and DR2483 are 50% identical), as well as through acquisition of genes by horizontal gene transfer.
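A figure such as "50% identical" refers to the fraction of matching residues in a pairwise alignment. The sketch below shows one common way to compute it, counting only columns in which neither sequence has a gap; this denominator is an assumption, since the text does not state which convention was used. The two sequences are invented for illustration and are not the DR1312 or DR2483 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment: 8 ungapped columns, 4 of them identical.
toy_a = "MKTA-LVGE"
toy_b = "MRTSQLAGD"
print(percent_identity(toy_a, toy_b))  # 50.0
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different sources are comparable only when the denominator is the same.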

Expansion of proteins of the TerDEXZ/CABP family in Deinococcus is interesting because some of these proteins could confer resistance to a variety of DNA-damaging agents, including heavy-metal cations, methyl methanesulfonate, mitomycin C and UV (21, 103), and other forms of stress (11). Two members of this family, CABP1 and CABP2, are expressed during starvation in Dictyostelium and form a heterodimer that binds cyclic AMP (cAMP) (78), suggesting that other members of the family also bind various small-molecule ligands.

Deinococcus encodes the largest number of the pathogenesis-related 1 (PR1) family proteins (five members) among bacteria. These secreted proteins are widespread in eukaryotes but sporadic in bacteria (195); unlike the eukaryotic members of this family, the bacterial PR1-related proteins lack the disulfide bond-forming cysteines (68). Since they are predicted to be secreted, the bacterial PR1 family proteins might play a role in inhibiting extracellular enzymes or in interacting with other cells, as suggested by the known activities of their eukaryotic homologs (106).

The second largest protein expansion in Deinococcus is the family of uncharacterized small proteins whose prototype is B. subtilis DinB, a DNA damage-inducible gene product (39). Among bacteria, Deinococcus encodes the greatest number of these proteins, although comparable independent expansions are seen in B. subtilis and the actinomycetes (Fig. 3). Examination of the multiple alignment of this family (Fig. 5) reveals three conserved histidines that could form a catalytic triad of a novel metal-dependent enzyme, perhaps a hydrolase. The prediction of enzymatic activity of these proteins raises the possibility that they could be nucleases directly involved in DNA degradation, which begins in Deinococcus immediately after DNA damage (23, 211). This protein family may be particularly amenable to experimental studies, given its expansion in B. subtilis, a model for many DNA repair studies.
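The kind of inspection described above, looking for positions occupied by the same residue in every member of a family, can be sketched as a scan over alignment columns. This is a minimal illustration only, not the procedure used to produce Fig. 5; the function name is hypothetical and the toy alignment is invented, not taken from DinB/YfiT proteins.

```python
def conserved_columns(alignment, residue="H"):
    """Return 0-based column indices where every sequence has `residue`."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        col
        for col in range(width)
        if all(seq[col] == residue for seq in alignment)
    ]


# Invented toy alignment for illustration; NOT real DinB/YfiT sequences.
toy_alignment = [
    "MAHLE-WHAGQLRHI",
    "MSHVEAWHL-QVEHL",
    "LTHIDGWHYGELKHV",
]
print(conserved_columns(toy_alignment))  # [2, 7, 13]
```

Strict invariance is a deliberately conservative criterion; in a large, divergent family a threshold on the fraction of sequences sharing the residue would usually be more practical.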


FIG. 5.   Multiple alignment of the conserved core of the DinB/YfiT protein family. The alignment was generated by parsing the PSI-BLAST HSPs and realigning them with the ALITRE program (181). The numbers between aligned blocks indicate the lengths of variable inserts that are not shown; the numbers at the end of each sequence indicate the distances from