Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin
SUMMARY
INTRODUCTION
HISTORY OF THE CULTURE DIVIDE
  Early Microbiology and the Microscope
  Modern Microbiology: a Pure Culture Is Not Enough
THE PARADIGM SHIFT
  rRNA Analysis and Culturing
METAGENOMICS: CULTURE-INDEPENDENT INSIGHT
APPROACHES TO METAGENOMIC ANALYSIS
  Sequence-Based Analysis
  Functional Metagenomics
    Heterologous expression.
    Identifying active clones: screens, selections, and functional anchors.
ECOLOGICAL INFERENCE FROM METAGENOMICS
  Symbiosis
    Buchnera-aphid symbiosis.
    Proteobacterium-tube worm symbiosis.
  Competition and Communication
    What can metagenomics tell us about microbial competition and communication?
    Role of small molecules.
    Sequence-based screening for small molecules.
    Antibiotics as signal molecules.
  Biogeochemical Cycles
    Acid mine drainage.
    Sargasso Sea.
  Population Genetics and Microheterogeneity
CONCLUSIONS AND FUTURE DIRECTIONS
REFERENCES
| SUMMARY |
|---|
| INTRODUCTION |
|---|
This change fomented a revolution in microbiological thought. At the heart of this revolution was the convincing demonstration that the uncultured microbial world far outsized the cultured world and that this unseen world could be studied (105-108). This change in thinking was prompted by another, equally important realization: microorganisms underpin most of the geochemical cycles and many human health conditions that were previously thought to be driven by inorganic processes and stress, respectively. The glimmers of insight into the influence that microorganisms exert on the world propelled microbiologists to pursue the uncultured world. In 1931, Waksman optimistically believed that "a large body of information has accumulated that enables us to construct a clear picture...of...the microscopic population of the soil" (145), and in 1923 Bergey's Manual stated categorically that no organism could be classified without being cultured (133).
By the mid-1980s, however, microbiologists had lost this confidence, and the language and practice of microbiology changed to accommodate the vast unknown of uncultured life. Concepts, assumptions, images, and words needed to be replaced when it became evident that they were based upon the premise that microorganisms did not exist unless they could be cultured. Pace and colleagues highlighted the need for nontraditional techniques to understand the microbial world: "The simple morphology of most microbes provides few clues for their identification; physiological traits are often ambiguous. The microbial ecologist is particularly impeded by these constraints, since so many organisms resist cultivation, which is an essential prelude to characterization in the laboratory" (107).
In the ensuing years, microbiologists dedicated intense effort to describing the phylogenetic diversity of exotic and ordinary environments: ocean surfaces, deep sea vents, hot springs, soil, animal rumen and gut, human oral cavity and intestine. Many new lineages were classified based on their molecular signatures alone. The next challenge was to elucidate the functions of these new phylotypes and determine whether they represented new species, genera, or phyla of prokaryotic life. This challenge spawned various techniques, including metagenomics, the genomic analysis of assemblages of organisms. In a few years, the study of uncultured microorganisms has expanded beyond asking "Who is there?" to include the difficult question "What are they doing?"
The outcomes of the recognition of uncultured microorganisms are worthy of examination. One of these outcomes, metagenomics, is further shaping microbiology. Metagenomics has already opened new avenues of research by enabling unprecedented analyses of genome heterogeneity and evolution in environmental contexts and providing access to far more microbial diversity than has been viewed in the petri dish. This review will explore the origins of metagenomics and examine its recent application to microbial ecology and biotechnology.
| HISTORY OF THE CULTURE DIVIDE |
|---|
Among the advances during this period of microbiology was the work of botanist Ferdinand Cohn, who classified many bacteria and described the life cycle of Bacillus subtilis based on his microscopic observations (60). Although mycologists such as Franz Unger had understood the concept of pure culture as early as the 1850s, it was in large part the emphasis on disease causality that solidified pure culture as the standard bacteriological technique for laboratory microbiology (49, 96). Robert Koch's postulates and his own innovation in developing culture media were instrumental in this shift, and from the 1880s forward, the microbiological world was divided into the cultured and the uncultured. Microbiologists were attracted to the power and precision of studies of bacteria in pure culture, and as a result, most of the knowledge that fills modern microbiology textbooks is derived from organisms maintained in pure culture.
One of the indicators that cultured microorganisms did not represent much of the microbial world was the oft-observed "great plate count anomaly" (135), the discrepancy between the sizes of populations estimated by dilution plating and by microscopy. This discrepancy is particularly dramatic in some aquatic environments, in which plate counts and viable cells estimated by acridine orange staining can differ by four to six orders of magnitude (66), and in soil, in which 0.1 to 1% of bacteria are readily culturable on common media under standard conditions (138, 139).
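The size of the anomaly can be expressed either as a culturable fraction or as an order-of-magnitude gap between the two counts. The following is a minimal sketch of that arithmetic; the function name and the sample counts are hypothetical and are not taken from the cited studies.

```python
import math


def culturability(plate_count, direct_count):
    """Return (culturable fraction, orders-of-magnitude discrepancy).

    plate_count  -- cells per unit estimated by dilution plating
    direct_count -- cells per unit estimated by microscopy
    """
    if plate_count <= 0 or direct_count <= 0:
        raise ValueError("counts must be positive")
    fraction = plate_count / direct_count
    orders = math.log10(direct_count / plate_count)
    return fraction, orders


# Hypothetical soil sample (illustrative numbers only):
# 1e9 cells/g by microscopy, 5e6 colony-forming units/g by plating.
fraction, orders = culturability(5e6, 1e9)
print(f"culturable fraction: {fraction:.2%}")
print(f"discrepancy: {orders:.1f} orders of magnitude")
```

Under this arithmetic, the 0.1 to 1% culturability reported for soil corresponds to a gap of two to three orders of magnitude between the two counts.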
Brock and colleagues encountered microorganisms in Yellowstone hot springs that could not be cultured and others whose behavior in culture did not reflect their activities in situ. Many of the organisms could not be cultured on agar medium because their temperature requirements exceed the melting point of the agar. Therefore, elucidating the physiological function of microorganisms without culturing them required ingenuity. Brock's central technique involved the immersion of microscope slides in the spring for 1 to 7 days, followed by microscopic examination and often staining with fluorescent antibodies raised against cultured members of the taxonomic groups suspected to inhabit the environment (17, 21). This approach estimated in situ population sizes and growth rates, which indicated, for example, that certain strains of Sulfolobus grew in the hot springs at temperatures well below the optima in pure culture (103). The expanding body of evidence indicating that it was imperative to study physiology in the environment led Brock's group to determine which organisms in the hot spring were responsible for photosynthesis. To do so, they placed an opaque cover over the spring for a week. The spring lost its pink color, leading them to infer that the genus Synechococcus, typically pink in culture, was a major contributor to photosynthesis (25, 26).
Further evidence that drew attention to the uncultured world accumulated during the 1970s and 1980s. A study of oligotrophs indicated that incubation times longer than 25 days enhanced the recovery of certain organisms in culture (147). The food industry generated intense interest in "injured bacteria" in food: live organisms that cannot be cultured following stressful treatments such as heat, chilling, or desiccation but represent a significant risk to human health (41). The concept of organisms that were viable but not culturable emerged from the work of Colwell and colleagues, who showed that strains of Vibrio cholerae were indeed alive and virulent when isolated from aquatic environments (8) but did not grow in culture until after passage through a mouse or human intestine (35-37).
The confluence of these and many other scientific and technical advances steadily drew attention to the unculturable microbial world, but two discoveries figured significantly in the sharpened focus. The first was work on the diversity of soil bacteria, which demonstrated with DNA-DNA reassociation techniques that the complexity of the bacterial DNA in the soil was at least 100-fold greater than could be accounted for by culturing. This work suggested that the diversity of the uncultured world exceeded previous estimates (138). The second discovery was the demonstration that Helicobacter pylori causes gastric ulcers and cancer. Although spiral bacteria had been observed in the gastric mucosa of dogs in 1893 and in humans in 1906 (29), and correlations between the appearance of the bacteria and peptic ulcers were noted in 1938 (48), it was not until H. pylori was cultured that its role in disease was accepted (94, 95). Culturing was accompanied by the satisfaction of Koch's postulates on a human volunteer (94), providing definitive evidence for the causal relationship between the bacterium and ulcers.
Ironically, culturing was not that difficult. Plates accidentally incubated for 5 days instead of 3 revealed colonies later shown to be H. pylori (29). The fact that strong microscopic evidence for the role of H. pylori long preceded culturing and might have served as the basis for successful treatment decades earlier, perhaps reducing human suffering and mortality due to ulcers and cancer, did not escape the notice of microbiologists, medical practitioners, and the public. Whereas the studies of the complexity of the soil DNA demonstrated the diversity of the unknown world, the connection of uncultured bacteria and ulcers provided a striking example of the power of the undetected organisms. These discoveries provided compelling evidence that drew microbiologists to wrestle with the daunting challenge of devising strategies to access these organisms.
| THE PARADIGM SHIFT |
|---|
The application of PCR technology provided a view of microbial diversity that was not distorted by the culturing bias and revealed that the uncultured majority is highly diverse and contains members that diverge deeply from the readily culturable minority. Today, 52 phyla have been delineated, and most are dominated by uncultured organisms (Fig. 1) (114). The application of phylogenetic stains (Fig. 2), nucleic acid probes with fluorescent labels that facilitate visualization of single cells in situ, led to a recrudescence of microscopy as a central tool of microbiology and microbial phylogeny (43, 63). Whereas traditional microscopy provides little phylogenetic information and fluorescent antibody studies require prior knowledge and culturing of an organism or one closely related to it to raise antibodies (17, 18), phylogenetic stains require only an rRNA sequence, which can be derived from an environmental sample without culturing. Phylogenetic stains corroborated evidence from PCR-based studies but provided quantitative information as well, because the findings are based on direct observation that is not subject to the skewing of organism abundance potentially observed with PCR (137).
Nucleic acid probes labeled with fluorescent tags provide such an assay, facilitating quantitative assessment of enrichment and growth. As a result, culturing efforts have intensified recently, and successes have included pure cultures of members of the SAR11 clade, now termed the genus Pelagibacter (34, 38, 113), which represents more than one-third of the prokaryotic cells in the surface of the ocean but was known only by its 16S rRNA signature until 2002 (38, 102, 113). The corollary to SAR11 in terrestrial environments is the Acidobacteria phylum (76, 121). Acidobacteria are abundant in soil, typically representing 20 to 30% of the 16S rRNA sequences amplified by PCR from soil DNA, but until recently only three members had been cultured (7, 56, 79, 81, 89, 97, 129, 132). Once again, the culture-independent indications that it was prevalent in the environment led to intensive efforts to culture members of the Acidobacteria phylum. The current efforts to culture new microorganisms will be advanced by the information that metagenomics can reveal about uncultured organisms (87).
Given that many organisms will not be coaxed readily into pure culture, a critical advance is to extend the understanding of the uncultured world beyond cataloging 16S rRNA gene sequences, and microbiologists have striven to devise methods to analyze the physiology and ecology of these diverse, uncultured organisms.
| METAGENOMICS: CULTURE-INDEPENDENT INSIGHT |
|---|
The word metagenomics was coined (69) to capture the notion of analysis of a collection of similar but not identical items, as in a meta-analysis, which is an analysis of analyses (64). (Community genomics, environmental genomics, and population genomics are synonyms for the same approach.) The idea of cloning DNA directly from environmental samples was first proposed by Pace (108), and in 1991, the first such cloning in a phage vector was reported (126). The next advance was the construction of a metagenomic library with DNA derived from a mixture of organisms enriched on dried grasses in the laboratory (71). Clones expressing cellulolytic activity were found in these libraries, which were referred to as zoolibraries, a term that has not been used widely in the field (71). The work of DeLong's group defined the field when they reported libraries constructed from prokaryotes in seawater (136). They identified a 40-kb clone that contained a 16S rRNA gene indicating that the clone was derived from an archaeon that had never been cultured. Construction of libraries with DNA extracted from soil lagged due to difficulties associated with maintaining the integrity of DNA during its extraction and purification from a soil matrix (14, 69, 80, 118) but eventually produced analyses analogous to those from seawater (39, 72, 118).
| APPROACHES TO METAGENOMIC ANALYSIS |
|---|
γ-Proteobacteria. The sequence of flanking DNA revealed a bacteriorhodopsin-like gene. Its gene product was shown to be an authentic photoreceptor, leading to the insight that bacteriorhodopsin genes are not limited to Archaea but are in fact abundant among the Proteobacteria of the ocean (11, 12).
A promising application of phylogenetic anchor-guided sequencing is to collect and sequence many genomic fragments from one taxon. In more complex environments and taxa, reassembly of a genome may not be feasible, but inference about the physiology and ecology of the members of the groups can be gleaned from sequence data. This approach has been initiated with clones from diverse soils carrying 16S rRNA genes that affiliate with the Acidobacteria phylum, which is abundant in soil and highly diverse (7, 28, 58, 76, 121) and about which little is known (85, 112). Complete sequencing of the estimated 500 kb of Acidobacterium DNA in metagenomic libraries may provide insight into the subgroups of bacteria in this phylum that have never been cultured.
The alternative to a phylogenetic marker-driven approach is to sequence random clones, which has produced dramatic insights, especially when conducted on a massive scale. The distribution and redundancy of functions in a community, linkage of traits, genomic organization, and horizontal gene transfer can all be inferred from sequence-based analysis. The recent monumental sequencing efforts, which include reconstruction of the genomes of uncultured organisms in a community in acid mine drainage (141) and the Sargasso Sea (142), illustrate the power of large-scale sequencing efforts to enrich our understanding of uncultured communities. These studies have made new linkages between phylogeny and function, indicated the surprising abundance of certain types of genes, and reconstructed the genomes of organisms that have not been cultured. These studies will be discussed in detail in the section entitled Biogeochemical Cycles.
Phylogenetic markers serve either as the initial identifiers of DNA fragments to study or as indicators of taxonomic affiliation for DNA fragments carrying genes of interest because of their function. In both roles, their use is limited by the small number of available markers that provide reliable placement in the Tree of Life. If a fragment of DNA that is of interest for other reasons does not carry a dependable marker, its organism of origin remains unknown. The collection of phylogenetic markers is growing, and as the diversity of markers increases, the power of this approach will also increase, making it possible to assign more fragments of anonymous DNA to the organisms from which they were isolated. Moreover, as more genomes are reconstructed, more genes will be linked to phylogenetic markers even though they were not cloned initially on the same fragment (141, 142).
Identifying active clones: screens, selections, and functional anchors. The frequency of metagenomic clones that express any given activity is low. For example, in a search for lipolytic clones derived from German soil, only 1 in 730,000 clones showed activity (73). In a library of DNA from North American soil, 29 of a total of 25,000 clones expressed hemolytic activity (118). The scarcity of active clones therefore necessitates development of efficient screens and selections for discovery of new activities or molecules. Just as bacterial genetics relies on selections to detect low-frequency events, metagenomics will be advanced by seeking selectable phenotypes to increase the collection of active clones that can be compared, analyzed, and used to build a conceptual framework for functional analysis.
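To make the scale of the screening problem concrete, the hit frequencies quoted above can be converted into the number of clones that would need to be examined to have a given chance of finding at least one active clone. This is a minimal sketch that assumes each clone is independently active with a fixed probability, which is a simplification and not a model taken from the cited studies; the function name and the 95% target are illustrative choices.

```python
import math


def clones_to_screen(hit_frequency, confidence=0.95):
    """Smallest n with P(at least one active clone among n) >= confidence.

    Assumes independent clones, so P(no hit in n clones) = (1 - f) ** n
    and n = ln(1 - confidence) / ln(1 - f), rounded up.
    """
    if not 0 < hit_frequency < 1:
        raise ValueError("hit_frequency must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # log1p(-f) computes ln(1 - f) accurately for very small f.
    return math.ceil(math.log(1 - confidence) / math.log1p(-hit_frequency))


# Frequencies reported in the text above.
frequencies = {
    "lipolytic (1 in 730,000)": 1 / 730_000,
    "hemolytic (29 in 25,000)": 29 / 25_000,
}
for label, f in frequencies.items():
    print(f"{label}: screen about {clones_to_screen(f):,} clones for a 95% chance of one hit")
```

The roughly thousandfold difference between the two results illustrates why rare activities call for selections or high-throughput screens rather than clone-by-clone assays.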
Several selections have proved to be fruitful. For example, the Daniel group designed a clever selection for Na+(Li+)/H+ antiporters that requires complementation of an E. coli mutant deficient in the three Na+/H+ antiporters (nhaA, nhaB, and chaA) enabling growth on medium containing 7.5 mM LiCl (93). This powerful selection facilitated the discovery of two novel antiporter proteins in a library of 1,480,000 clones containing DNA isolated from soil. Another selection strategy involved complementation of an E. coli mutant deficient in biotin production, which led to the isolation of seven new operons for biotin synthesis from enrichment cultures derived from samples of soil or horse excrement (52).
Selection for antibiotic resistance led to the isolation of a tetracycline resistance determinant from samples of the microbiota from the human mouth (44) and aminoglycoside resistance determinants from soil (115). The selection for aminoglycoside resistance identified nine clones, six of which encoded 6'-acetyltransferases that formed a new cluster based on sequence analysis. These genes were discovered in libraries containing a total of 4 Gb of DNA, or approximately 1 million genes, and thus their infrequent representation would have made it prohibitively laborious to discover them by a screen without a selection. This example illustrates the power of functional metagenomics: genes that are expressed in an ordinary host such as E. coli may be extraordinary and novel.
High-throughput screens can substitute when the functions of interest do not provide the basis for selection. For example, on certain indicator media, active clones display a characteristic and easily distinguishable appearance even when plated at high density. With the indicator dye tetrazolium chloride, Henne et al. (72) detected clones that utilize 4-hydroxybutyrate in libraries of DNA from agricultural or river valley soil. Very rare lipolytic clones in the same libraries were detected by production of clear halos on media containing rhodamine and either triolein or tributyrin (73).
The discovery of new biological motifs will depend in part on functional analysis of metagenomic clones. Functional screens of metagenomic libraries have led to the assignment of functions to numerous "hypothetical proteins" in the databases. Innovation will be required to identify and overcome the barriers to heterologous gene expression and to detect rare clones efficiently in the immense libraries that are needed to represent all of the genomes in complex environments, such as soil. An emerging and powerful direction for metagenomic analysis is the use of functional anchors, which are the functional analogs of phylogenetic anchors. Functional anchors are functions that can be assessed rapidly in all of the clones in a library. When a collection of clones with a common function is assembled, they can be sequenced to find phylogenetic anchors and genomic structure in the flanking DNA. Such an analysis can provide a slice of the metagenome that cuts across clones with a different selective tool, determining the diversity of genomes that contain a particular function that can be expressed in the host carrying the library. Technological developments that promote functional expression and screening will advance this new frontier of functional genomics.
| ECOLOGICAL INFERENCE FROM METAGENOMICS |
|---|
Buchnera-aphid symbiosis. The first genome reconstruction of an uncultured organism was that of Buchnera aphidicola, the endosymbiont of aphids. The relationship between the bacterium and the insect is ancient, leaving each partner unable to function independently of the other, as is reflected in the genomic analysis. Moran's group isolated bacterial DNA from the insect and sequenced and reassembled the bacterial genome. The genus Buchnera contains a "reduced" genome of 564 open reading frames. Upon comparison with a reconstructed ancestral genome, 1,906 genes appear to have been lost. Most of the functions are associated with biosynthetic pathways contributed by the host, suggesting that the genome shrinkage is the result of the symbiotic lifestyle, which has become obligate because of gene loss (1, 42, 101).
The reconstruction of B. aphidicola's genome provided insights into the evolution of the symbiosis between the insect and bacterium, the biochemical mutual dependence that they have developed, and the mechanisms of genome shrinkage and rearrangement. The success of genome reconstruction with a single uncultured species provided part of the impetus needed to propose sequencing and reconstructing genomes in more complex assemblages.
Proteobacterium-tube worm symbiosis. Riftia pachyptila, the deep sea tube worm, lives 2,600 m below the ocean surface, near the thermal vents that are rich in sulfide and reach temperatures near 400°C. The tube worm does not have a mouth or digestive tract, and therefore it is entirely dependent on its symbiotic bacteria, which provide the worm with food. The bacteria live in the trophosome, a specialized feeding sac inside the worm (32). The bacteria and trophosome constitute more than half of the animal's body mass. The bacteria oxidize hydrogen sulfide, thereby producing the energy required to fix carbon from CO2, providing sugars and amino acids (predominantly as glutamate) that nourish the worm (55, 84). The worm contributes to the symbiosis by collecting hydrogen sulfide, oxygen, and carbon dioxide and transporting them to the bacteria on hemoglobin-like molecules (3, 46, 149-153).
The bacterium is a member of the γ-Proteobacteria, as identified by 16S rRNA gene sequence (47). The bacteria have not been grown in pure culture in laboratory media, but they provide an excellent substrate for metagenomics because they reach high population density in the trophosome and exist there as a single species. Hughes et al. (75) isolated DNA from the bacterial symbiont and constructed fosmid libraries from it that were used to understand the physiology of the bacteria. Robinson et al. (116) identified a gene with similarity to ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) from the same fosmid library. All of the residues associated with the active site are conserved in the protein sequence deduced from the DNA sequence, and it has highest similarity with the RubisCO from Rhodospirillum rubrum. The characterization of this gene lends further support to the premise that the chemoautotrophic bacterial symbiont in R. pachyptila fixes carbon for its host.
The libraries were also screened for two-component regulators with a labeled histidine kinase gene as a probe. They identified a two-component system whose components complemented an envZ and a phoR creC double mutant, respectively. The discovery of a functional envZ homologue indicates that the symbiont carries a response regulator that is typical of γ-Proteobacteria, although the signals eliciting responses from these proteins have not yet been identified.
Genomic analysis of the symbiont also led to the identification of a gene encoding flagellin, which was expressed in E. coli and shown to direct the synthesis of flagella that are immunologically cross-reactive with Salmonella flagella. The presence of genes for flagella suggested to the authors that the endosymbiont has a free-living stage in its life cycle and may infect each generation of tube worms rather than being passed maternally (99).
Genes for competition and cooperation are hard to recognize based on sequence alone because the utility of their functions is entirely dependent on ecosystem context and the nature of the resources that are limiting. Therefore, genomics by itself does not provide a means to test ecological hypotheses or identify genes that confer fitness, but it can provide the basis for forming hypotheses. Ecological hypotheses are difficult to test in microorganisms that cannot be cultured or for which there are no genetic tools; however, functional genomics coupled with chemical ecology can yield informative answers. Chemical ecology involves the identification of small molecules with biological activity and proposed ecological function. These compounds can be identified through a variety of methods, including metagenomics. The addition of these molecules to communities can provide the basis for postulating their ecological roles in the community by measuring perturbations of community function. The following sections explore the discovery of small molecules in metagenomic libraries and postulate the ecological functions of these molecules in the organisms producing them.
Role of small molecules. Small-molecule discovery by functional metagenomics has concentrated on antibiotics, which are of interest for their pharmaceutical applications as well as for their roles in ecosystem function. Traditional antibiotic screens for molecules that inhibit bacterial growth have led to the discovery of antibiotics in metagenomic libraries (Fig. 4). They have not been a rich source of novel antibiotics, likely because of the experimental limitations associated with the search. In studies that report frequencies, antibiotic-producing clones are detected at a frequency of approximately 1 producer per 10^4 clones (23, 24, 61). This low frequency hinders discovery because space and labor are required to conduct typical antimicrobial screens.
Sequence-based screening for small molecules. Polyketide synthases, enzymes involved in the synthesis of polyketides (the broad class of antibiotics that includes erythromycin, epothilone, and rifamycin), were first cloned from soil with a PCR-based approach. Seow et al. (128) designed primers that hybridize with the highly conserved region of polyketide synthase genes and amplified novel polyketide synthase homologues directly from soil. This approach was adapted for screening metagenomic libraries by Osburne's group, who screened a 5,000-member metagenomic library for conserved regions of genes encoding type I polyketide synthase. Primers directed toward a conserved region of polyketide synthase I genes that flanks the active site of the ketoacyl synthetase domain were used to screen pools of 96 clones. The screen yielded 11 new polyketide synthase homologues that contained significant sequence similarity to polyketide synthase genes from cultured organisms. In addition, screening clones in both E. coli and Streptomyces lividans by chemical means revealed two novel compounds, fatty dienic alcohol isomers (Fig. 4).
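Screening pools of clones rather than individual clones reduces the number of PCR assays when positives are rare. The sketch below illustrates one common way such a pooled screen can be organized, as a two-stage scheme in which each pool is tested once and only positive pools are retested clone by clone. It is not a description of the protocol used in the study; the positive clone IDs and the function names are hypothetical.

```python
def make_pools(clone_ids, pool_size=96):
    """Split a list of clone IDs into consecutive pools of at most pool_size."""
    return [clone_ids[i:i + pool_size] for i in range(0, len(clone_ids), pool_size)]


def two_stage_screen(pools, is_positive):
    """Test each pool once, then test individual clones only in positive pools.

    is_positive(clone_id) stands in for a PCR result on a single clone;
    a pool is treated as positive if any member is positive.
    Returns (list of positive clones, total number of assays performed).
    """
    hits, assays = [], 0
    for pool in pools:
        assays += 1  # one assay on the pooled DNA
        if any(is_positive(clone) for clone in pool):
            for clone in pool:
                assays += 1  # one assay per clone in a positive pool
                if is_positive(clone):
                    hits.append(clone)
    return hits, assays


# A 5,000-member library screened in pools of 96, as in the text.
library = list(range(5000))
pools = make_pools(library, pool_size=96)

# Hypothetical positives, chosen only to exercise the code.
positives = {10, 2000, 4999}
hits, assays = two_stage_screen(pools, lambda clone: clone in positives)
print(f"{len(pools)} pools, {assays} assays, positive clones: {hits}")
```

With these made-up positives the scheme needs a few hundred assays instead of 5,000, which is the practical motivation for pooling.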
Antibiotics as signal molecules. If antibiotics evolved as mediators of functions other than warfare (42a), such as communication, antibiotic discovery will be expedited by screening metagenomic clones for signaling compounds as well as inhibitory compounds. The challenge is to develop assays that detect signaling by many compounds. A surprising result from the Davies group indicated that subinhibitory concentrations of many antibiotics induce quorum sensing despite no resemblance in structure to the acylated homoserine lactones that appear to be the natural inducers (65). This result presents a propitious opportunity: a single screen might capture molecules that are quorum-sensing inducers as well as antibiotics.
This opportunity was investigated by designing a high-throughput screen to identify compounds that induce the expression of genes under the control of a quorum-sensing promoter. The screen is intracellular, meaning the metagenomic DNA is in the same cell as the sensor for quorum-sensing induction (Fig. 5). The sensor consists of the luxR promoter, which is induced by acylated homoserine lactones, linked to gfp, and resides on a plasmid in an E. coli strain that does not induce quorum sensing itself (2). If an inducer of the luxR-mediated transcription of gfp is expressed from the metagenomic DNA, the cell fluoresces and can be captured by fluorescence-activated cell sorting or as a colony observed by fluorescence microscopy. Conversely, this sensor system can detect inhibitors of quorum sensing if acylated homoserine lactone is added to the medium and fluorescence-activated cell sorting is set to collect the nonfluorescent cells (Fig. 5). Metagenomic libraries from microbiota of the soil and from the midgut of the gypsy moth have been subjected to this screen, and an array of genes have been identified. Their products are under analysis, and some appear to differ from previously described quorum sensing inducers (L. Williamson, C. Guan, B. Borlee, and J. Handelsman, unpublished data).
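The two modes of the screen differ only in which side of a fluorescence gate is collected. The sketch below expresses that logic under simple assumptions: a single fluorescence value per clone and one fixed threshold. The function name, clone labels, threshold, and readings are all hypothetical and do not come from the unpublished work cited above.

```python
def select_clones(fluorescence_by_clone, threshold, ahl_added):
    """Return the clone IDs that the sorter would collect.

    ahl_added=False: inducer screen, collect cells at or above the threshold.
    ahl_added=True:  inhibitor screen (acylated homoserine lactone in the
                     medium), collect cells below the threshold.
    """
    if ahl_added:
        return sorted(c for c, f in fluorescence_by_clone.items() if f < threshold)
    return sorted(c for c, f in fluorescence_by_clone.items() if f >= threshold)


# Hypothetical GFP readings in arbitrary units.
without_ahl = {"clone_A": 12.0, "clone_B": 850.0, "clone_C": 9.5}
with_ahl = {"clone_A": 900.0, "clone_B": 910.0, "clone_C": 15.0}

print("candidate inducers:", select_clones(without_ahl, threshold=100.0, ahl_added=False))
print("candidate inhibitors:", select_clones(with_ahl, threshold=100.0, ahl_added=True))
```

In practice a sorter gate would be set from control populations rather than a fixed number, so the threshold here should be read as a placeholder.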
The simple community structure made it possible for Tyson et al. (141) to clone total DNA and sequence most of the community with high coverage. The G+C content of each clone provided a good indicator of its source because the G+C content of the genomes of the dominant taxa in the mine differs substantially (19) (Fig. 2). Sequence alignment of 16S rRNA and tRNA synthetase genes confirmed the organismal origins of the clones. Nearly complete genomes of Leptospirillum group II and Ferroplasma type II were reconstructed, and substantial sequence information for the other community members was reported.
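The idea of using G+C content as an indicator of a fragment's source can be sketched as a nearest-mean assignment. This is only an illustration of the principle, not the binning procedure used by Tyson et al.; the bin names and mean G+C values below are hypothetical placeholders.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / acgt


def assign_bin(seq, bin_means):
    """Assign seq to the bin whose mean G+C content is closest to its own."""
    gc = gc_content(seq)
    return min(bin_means, key=lambda name: abs(bin_means[name] - gc))


# Hypothetical mean G+C fractions for two well-separated genome bins.
bin_means = {"low_GC_bin": 0.37, "high_GC_bin": 0.55}

fragment = "ATATTAGCATTAAATGCATA"  # made-up sequence for illustration
print(f"G+C = {gc_content(fragment):.2f} ->", assign_bin(fragment, bin_means))
```

A single statistic of this kind separates bins only when their G+C distributions barely overlap, which is why the text notes that marker genes were used to confirm the assignments.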
The metagenomic sequence substantiated a number of significant hypotheses (Fig. 6). First, it appears that Leptospirillum group III contains genes with similarity to those known to be involved in nitrogen fixation, suggesting that it provides the community with fixed nitrogen. This was a surprise because the previous supposition was that a numerically dominant member of the community, such as Leptospirillum group II, would be responsible for nitrogen fixation. However, no genes for nitrogen fixation were found in the Leptospirillum group II genome, leading the authors to suggest that the group III organism is a keystone species that has a low numerical representation but provides a service that is essential to community function. Ferroplasma type I and II genomes contain no genes associated with nitrogen fixation but contain many transporters that indicate that they likely import amino acids and other nitrogenous compounds from the environment.
All of the genomes in the acid mine drainage are rich in genes associated with removing potentially toxic elements from the cell. Proton efflux systems are likely responsible for maintaining the nearly neutral intracellular pH, and metal resistance determinants pump metals out of the cells, maintaining nontoxic levels in the interior of the cells.
The acid mine drainage community provides a model for the analysis of other communities. Determining the origin of DNA fragments and assigning functions may be more difficult for communities that are phylogenetically or physiologically more complex (59), but the approach will be useful for all communities.
Sargasso Sea. The Sargasso Sea is a complex and physically sprawling ecosystem compared with the contained acid mine drainage system. The inputs and outputs are more difficult to quantify, and the phylogeny of the community members has not been exhaustively surveyed. Venter et al. (142) embarked on the largest metagenomics project to date (affectionately dubbed "megagenomics"), in which they sequenced over 1 billion bp and claimed to have discovered 1.2 million new genes. Intriguing inferences could be drawn because of the sheer size of the data set. They placed 794,061 genes in a conserved hypothetical protein group, which contains genes to which functions could not be confidently assigned. The next most abundant group contained 69,718 genes apparently involved in energy transduction. Among these were 782 rhodopsin-like photoreceptors, increasing the number of sequenced proteorhodopsin genes by almost 10-fold. Linkage of the rhodopsin genes to genes that provide phylogenetic affiliations, such as genes encoding subunits of RNA polymerase, indicated that the proteorhodopsins were distributed among taxa that were not previously known to contain light-harvesting functions, including the Bacteroides phylum (142).
The Sargasso Sea data set is a gold mine for further analysis. Hints about many aspects of ecosystem function abound and await further exploration. For example, one initial observation is that many of the genomes in the Sargasso Sea contain genes with similarity to those involved in phosphonate uptake or utilization of polyphosphates and pyrophosphates, which are present in this extremely phosphate-limited ecosystem. The phosphorus cycle is not well understood, and this collection of genomes provides a new route for discovery of the mechanisms of phosphorus acquisition and transformation. The understanding of nutrient cycling will be advanced by reconstruction of the genomes and the type of function-species analysis that Tyson et al. applied to the acid mine drainage community. The sequence data set from the Sargasso Sea provided the means to reassemble a number of genomes with criteria that include the depth of sequencing coverage, oligonucleotide frequencies, and similarity to previously sequenced genomes. The structures of these genomes individually and collectively will no doubt inform the development of models for nutrient cycling.
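One of the criteria named above, oligonucleotide frequency, can be illustrated with a short sketch. It assumes tetranucleotides (k = 4) and a plain Euclidean distance between profiles; both choices, and the function names, are illustrative assumptions rather than the settings used for the Sargasso Sea assemblies, and reverse complements are not merged here for simplicity.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Normalized frequency vector over all 4**k k-mers of A, C, G, T.

    Windows containing any other character (e.g., N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        raise ValueError("sequence contains no complete unambiguous k-mer")
    return [counts[km] / total for km in kmers]


def profile_distance(p, q):
    """Euclidean distance between two frequency vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy fragments: two share composition, the third does not.
frag_a = kmer_profile("ACGTACGTACGTACGT")
frag_b = kmer_profile("CGTACGTACGTACGTA")
frag_c = kmer_profile("AAAAAAAATTTTTTTT")
same_source_likely = profile_distance(frag_a, frag_b) < profile_distance(frag_a, frag_c)
```

Fragments with similar profiles are candidates for the same genome; in practice such a signal would be weighed together with coverage depth and similarity to sequenced genomes, as the text describes.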
| CONCLUSIONS AND FUTURE DIRECTIONS |
|---|
Realizing the potential for discovery from metagenomics is dependent on the advancement of methods that are central to library construction and analysis. For sequence-based approaches, the speed and cost of nucleotide sequencing will be a barrier of rapidly diminishing significance as sequencing technology continues to improve. Sequence-based assignment of function will also benefit from advances in detection of homology, which will increasingly rely on the tertiary structures of predicted proteins rather than simply on primary sequence. Advances that will facilitate the management and analysis of large libraries include bioinformatics tools to analyze vast sequence databases and rapidly reassemble multiple genomes, as well as affordable gene chips for library profiling (127) or for readily distinguishing clones that are expressing genes from those that are silent. Functional analysis will require more innovation in method development. Most important among the needed innovations are strategies to improve heterologous gene expression and approaches for efficient screening of large libraries.
Microbiology has long relied on diverse methods for analysis, and metagenomics can provide the tools to balance the abundance of knowledge attained from culturing with an understanding of the uncultured majority of microbial life. Myriad environments on Earth have not been studied with culture-independent methods other than PCR-based 16S rRNA gene analysis, and they invite further analysis. Metagenomics may further our understanding of many of the exotic and familiar habitats that are attracting the attention of microbial ecologists, including deep sea thermal vents; acidic hot springs; permafrost, temperate, desert, and cold soils; Antarctic frozen lakes; and eukaryotic host organs (the human mouth and gut, termite and caterpillar guts, plant rhizospheres and phyllospheres, and fungi in lichen symbioses). With improved methods for analysis, funding stimulated by recent triumphs in the field, and attraction of diverse scientists to identify new problems and solve old ones, metagenomics will expand and continue to enrich our understanding of microorganisms.
| REFERENCES |
|---|
|
|
|---|
| 1. | Abbot, P., J. H. Withgott, and N. A. Moran. 2001. Genetic conflict and conditional altruism in social aphid colonies. Proc. Natl. Acad. Sci. USA 98:12068-12071. |
| 2. | Andersen, J. B., A. Heydorn, M. Hentzer, L. Eberl, O. Geisenberger, B. B. Christensen, S. Molin, and M. Givskov. 2001. Gfp-based N-acyl homoserine-lactone sensor systems for detection of bacterial communication. Appl. Environ. Microbiol. 67:575-585. |
| 3. | Arp, A. J., M. L. Doyle, E. Di Cera, and S. J. Gill. 1990. Oxygenation properties of the two co-occurring hemoglobins of the tube worm Riftia pachyptila. Respir. Physiol. 80:323-334. |
| 4. | August, P. R., T. H. Grossman, C. Minor, M. P. Draper, I. A. MacNeil, J. M. Pemberton, K. M. Call, D. Holt, and M. S. Osburne. 2000. Sequence analysis and functional characterization of the violacein biosynthetic pathway from Chromobacterium violaceum. J. Mol. Microbiol. Biotechnol. 2:513-519. |
| 5. | Baker, B. J., and J. F. Banfield. 2003. Microbial communities in acid mine drainage. FEMS Microbiol. Ecol. 44:139-152. |
| 6. | Barns, S. M., R. E. Fundyga, M. W. Jeffries, and N. R. Pace. 1994. Remarkable archaeal diversity detected in a Yellowstone National Park hot spring environment. Proc. Natl. Acad. Sci. USA 91:1609-1613. |
| 7. | Barns, S. M., S. L. Takala, and C. R. Kuske. 1999. Wide distribution and diversity of members of the bacterial kingdom Acidobacterium in the environment. Appl. Environ. Microbiol. 65:1731-1737. |
| 8. | Baya, A. M., P. R. Brayton, V. L. Brown, D. J. Grimes, E. Russek-Cohen, and R. R. Colwell. 1986. Coincident plasmids and antimicrobial resistance in marine bacteria isolated from polluted and unpolluted Atlantic Ocean samples. Appl. Environ. Microbiol. 51:1285-1292. |
| 9. | Beattie, G. A., and J. Handelsman. 1993. Evaluation of a strategy for identifying nodulation competitiveness genes in Rhizobium leguminosarum biovar phaseoli. J. Gen. Microbiol. 139:529-538. |
| 10. | Begum, A. A., S. Leibovitch, P. Migner, and F. Zhang. 2001. Specific flavonoids induced nod gene expression and pre-activated nod genes of Rhizobium leguminosarum increased pea (Pisum sativum L.) and lentil (Lens culinaris L.) nodulation in controlled growth chamber environments. J. Exp. Bot. 52:1537-1543. |
| 11. | Beja, O., L. Aravind, E. V. Koonin, M. T. Suzuki, A. Hadd, L. P. Nguyen, S. B. Jovanovich, C. M. Gates, R. A. Feldman, J. L. Spudich, E. N. Spudich, and E. F. DeLong. 2000. Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289:1902-1906. |
| 12. | Beja, O., E. N. Spudich, J. L. Spudich, M. Leclerc, and E. F. DeLong. 2001. Proteorhodopsin phototrophy in the ocean. Nature 411:786-789. |
| 13. | Bentley, S. D., M. Maiwald, L. D. Murphy, M. J. Pallen, C. A. Yeats, L. G. Dover, H. T. Norbertczak, G. S. Besra, M. A. Quail, and D. E. Harris. 2003. Sequencing and analysis of the genome of the Whipple's disease bacterium Tropheryma whipplei. Lancet 361:637-644. |
| 14. | Berry, A. E., C. Chiocchini, T. Selby, M. Sosio, and E. M. H. Wellington. 2003. Isolation of high molecular weight DNA from soil for cloning into BAC vectors. FEMS Microbiol. Lett. 223:15-20. |
| 15. | Bittinger, M. A., J. L. Milner, B. J. Saville, and J. Handelsman. 1997. rosR, a determinant of nodulation competitiveness in Rhizobium etli. Mol. Plant-Microbe Interact. 10:180-186. |
| 16. | Black, C., J. A. M. Fyfe, and J. K. Davies. 1995. A promoter associated with the neisserial repeat can be used to transcribe the uvrB gene from Neisseria gonorrhoeae. J. Bacteriol. 177:1952-1958. |
| 17. | Bohlool, B. B., and T. D. Brock. 1974. Immunofluorescence approach to the study of the ecology of Thermoplasma acidophilum in coal refuse material. Appl. Microbiol. 28:11-16. |
| 18. | Bohlool, B. B., and T. D. Brock. 1974. Population ecology of Sulfolobus acidocaldarius. II. Immunoecological studies. Arch. Microbiol. 97:181-194. |
| 19. | Bond, P. L., S. P. Smriga, and J. F. Banfield. 2000. Phylogeny of microorganisms populating a thick, subaerial, lithotrophic biofilm at an extreme acid mine drainage site. Appl. Environ. Microbiol. 66:3842-3849. |
| 20. | Borthakur, D., and X. Gao. 1996. A 150-megadalton plasmid in Rhizobium etli strain TAL182 contains genes for nodulation competitiveness on Phaseolus vulgaris L. Can. J. Microbiol. 42:903-910. |
| 21. | Bott, T. L., and T. D. Brock. 1969. Bacterial growth rates above 90 degrees C in Yellowstone hot springs. Science 164:1411-1412. |
| 22. | Brady, S. F., C. J. Chao, and J. Clardy. 2002. New natural product families from an environmental DNA (eDNA) gene cluster. J. Am. Chem. Soc. 124:9968-9969. |
| 23. | Brady, S. F., C. J. Chao, J. Handelsman, and J. Clardy. 2001. Cloning and heterologous expression of a natural product biosynthetic gene cluster from eDNA. Org. Lett. 3:1981-1984. |
| 24. | Brady, S. F., and J. Clardy. 2000. Long-chain N-acyl amino acid antibiotics isolated from heterologously expressed environmental DNA. J. Am. Chem. Soc. 122:12903-12904. |
| 25. | Brock, T. D. 1967. Life at high temperatures. Science 158:1012-1019. |
| 26. | Brock, T. D., and M. L. Brock. 1968. Measurement of steady-state growth rates of a thermophilic alga directly in nature. J. Bacteriol. 95:811-815. |
| 27. | Bromfield, E. S., D. M. Lewis, and L. R. Barran. 1985. Cryptic plasmid and rifampin resistance in Rhizobium meliloti influencing nodulation competitiveness. J. Bacteriol. 164:410-413. |
| 28. | Buckley, D. H., and T. M. Schmidt. 2003. Diversity and dynamics of microbial communities in soils from agro-ecosystems. Environ. Microbiol. 5:441-452. |
| 29. | Buckley, M. J. M., and C. A. O'Morain. 1998. Helicobacter biology: discovery. Br. Med. Bull. 54:7-16. |
| 30. | Bulloch, W. 1938. The history of bacteriology. Oxford University Press, New York, N.Y. |
| 31. | Cannon, W. R. 2003. Whipple's disease, genomics, and drug therapy. Lancet 361:1916. |
| 32. | Cary, S. C., W. Warren, E. Anderson, and S. J. Giovannoni. 1993. Identification and localization of bacterial endosymbionts in hydrothermal vent taxa with symbiont-specific polymerase chain reaction amplification and in situ hybridization techniques. Mol. Mar. Biol. Biotechnol. 2:51-62. |
| 33. | Chávez, S., J. C. Reyes, F. Chauvat, F. J. Florencio, and P. Candau. 1995. The NADP-glutamate dehydrogenase of the cyanobacterium Synechocystis 6803: cloning, transcriptional analysis and disruption of the gdhA gene. Plant Mol. Biol. 28:173-188. |
| 34. | Cho, J.-C., and S. J. Giovannoni. 2004. Cultivation and growth characteristics of a diverse group of oligotrophic marine gammaproteobacteria. Appl. Environ. Microbiol. 70:432-440. |
| 35. | Colwell, R. R., P. R. Brayton, D. Harrington, B. D. Tall, A. Huq, and M. M. Levine. 1996. Viable but non-culturable Vibrio cholerae O1 revert to a cultivable state in the human intestine. World J. Microbiol. Biotechnol. 12:28-31. |
| 36. | Colwell, R. R., and D. J. Grimes (ed.). 2000. Nonculturable microorganisms in the environment. ASM Press, Washington, D.C. |
| 37. | Colwell, R. R., M. L. Tamplin, P. R. Brayton, A. L. Gauzens, B. D. Tall, D. Harrington, M. M. Levine, S. Hall, A. Huq, and D. A. Sack. 1990. Environmental aspects of V. cholerae in transmission of cholera, p. 327-343. In R. B. Sack and Y. Zinnaka (ed.), Advances in research on cholera and related diarrhoeas, 7th ed. KTK Scientific Publications, Tokyo, Japan. |
| 38. | Connon, S. A., and S. J. Giovannoni. 2002. High-throughput methods for culturing microorganisms in very-low-nutrient media yield diverse new marine isolates. Appl. Environ. Microbiol. 68:3878-3885. |
| 39. | Courtois, S., C. M. Cappellano, M. Ball, F. X. Francou, P. Normand, G. Helynck, A. Martinez, S. J. Kolvek, J. Hopke, M. S. Osburne, P. R. August, R. Nalin, M. Guerineau, P. Jeannin, P. Simonet, and J. L. Pernodet. 2003. Recombinant environmental libraries provide access to microbial diversity for drug discovery from natural products. Appl. Environ. Microbiol. 69:49-55. |
| 40. | Currie, C. R. 2001. A community of ants, fungi, and bacteria: a multilateral approach to studying symbiosis. Annu. Rev. Microbiol. 55:357-380. |
| 41. | Dahl Sawyer, C. A., and J. J. Pestka. 1985. Foodservice systems: presence of injured bacteria in foods during food product flow. Annu. Rev. Microbiol. 39:51-67. |
| 42. | Daubin, V., N. A. Moran, and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829-832. |
| 42a. | Davies, J. 1990. What are antibiotics? Archaic functions for modern activities. Mol. Microbiol. 4:1227-1232. |
| 43. | DeLong, E. F., G. S. Wickham, and N. R. Pace. 1989. Phylogenetic stains: ribosomal RNA-based probes for the identification of single cells. Science 243:1360-1363. |
| 44. | Diaz-Torres, M. L., R. McNab, D. A. Spratt, A. Villedieu, N. Hunt, M. Wilson, and P. Mullany. 2003. Novel tetracycline resistance determinant from the oral metagenome. Antimicrob. Agents Chemother. 47:1430-1432. |