Microbiology and Molecular Biology Reviews, December 2008, p. 557-578, Vol. 72, No. 4
1092-2172/08/$08.00+0     doi:10.1128/MMBR.00009-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

A Bioinformatician's Guide to Metagenomics

Victor Kunin,1 Alex Copeland,2 Alla Lapidus,3 Konstantinos Mavromatis,4 and Philip Hugenholtz1*

Microbial Ecology Program,1 Quality Assurance Department,2 Microbial Genomics Department,3 and Genome Biology Program,4 DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California

Summary: As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, best practices for their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe, step by step, the chain of decisions accompanying a metagenomic project from the viewpoint of bioinformatic analysis. We guide the reader through a standard workflow for a metagenomic project, beginning with presequencing considerations, such as community composition and sequence data type, that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation, including sample and metadata collection, community profiling, construction of shotgun libraries, and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic data sets, in contrast to genome projects. Different types of data analyses particular to metagenomes are then presented, including binning, dominant population analysis, and gene-centric analysis. Finally, data management issues are discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions throughout a metagenomic project.


* Corresponding author. Mailing address: Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA. Phone: (925) 296-5725. Fax: (925) 296-5720. E-mail: phugenholtz{at}lbl.gov


