  • Review Article

Sequencing and beyond: integrating molecular 'omics' for microbial community profiling

Key Points

  • Advances in DNA sequencing have enabled culture-independent profiling of microbial community membership and function — the field of metagenomics. These approaches have rapidly expanded our knowledge of human-associated and environmental microbiomes.

  • Typical metagenomic studies profile community composition at the species level or above, but new methods are emerging that facilitate strain-level profiling. These methods enable researchers to explore the role of single-nucleotide polymorphisms, gene loss and horizontal gene transfer within microbial ecosystems.

  • DNA sequencing-based surveys of microbial communities yield a static view of community functional potential. Alternative, longitudinal study designs and high-throughput experimental assays capture the temporal dynamics of microbial community structure and activity.

  • Various multi-omic methods have been adapted to study microbial community functional activity, including transcriptomics, proteomics and metabolomics, each of which has strengths and weaknesses.

  • Measurements of microbial community functional activity are more powerful when integrated with traditional metagenomic sequencing, because comparing the two highlights genes and pathways that are over-, under- or non-expressed relative to their genomic abundance.

  • Multi-omic measurements derived from distinct studies and assays can be combined to build support for new biological hypotheses. These methods have been well developed in the context of model organisms and are highly suited for application to microbial communities.

  • Various statistical and computational methods exist for integrating high-dimensional multi-omic data sets in the search for descriptive and predictive models of microbial community function, as well as biomarkers for human diseases.
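
The comparison of functional activity with metagenomic abundance described in the key points above can be illustrated with a minimal sketch. It assumes per-gene DNA and RNA abundances are already available as plain dictionaries. The function names, the log2 threshold of 1.0 and the toy values are illustrative assumptions, not taken from the Review or from any specific tool.

```python
import math


def expression_ratios(dna_abundance, rna_abundance):
    """Return log2 of relative RNA over relative DNA abundance per gene.

    Genes that are encoded (DNA > 0) but have no RNA signal map to None.
    Genes with no DNA signal are skipped, since no ratio can be formed.
    """
    dna_total = sum(dna_abundance.values())
    rna_total = sum(rna_abundance.values())
    ratios = {}
    for gene, dna in dna_abundance.items():
        if dna <= 0:
            continue
        rna = rna_abundance.get(gene, 0)
        if rna <= 0:
            ratios[gene] = None  # encoded but not detectably expressed
            continue
        ratios[gene] = math.log2((rna / rna_total) / (dna / dna_total))
    return ratios


def classify(log_ratio, threshold=1.0):
    """Label a gene by its log2 RNA/DNA ratio (threshold is an assumption)."""
    if log_ratio is None:
        return "non-expressed"
    if log_ratio >= threshold:
        return "over-expressed"
    if log_ratio <= -threshold:
        return "under-expressed"
    return "proportional"


if __name__ == "__main__":
    # Hypothetical toy abundances (arbitrary units).
    dna = {"geneA": 100, "geneB": 100, "geneC": 100, "geneD": 100}
    rna = {"geneA": 300, "geneB": 25, "geneD": 75}
    for gene, ratio in expression_ratios(dna, rna).items():
        print(gene, classify(ratio))
```

Normalizing each data type to its own total before taking the ratio keeps differences in sequencing depth between the DNA and RNA libraries from being mistaken for differential expression.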


High-throughput DNA sequencing has proven invaluable for investigating diverse environmental and host-associated microbial communities. In this Review, we discuss emerging strategies for microbial community analysis that complement and expand traditional metagenomic profiling. These include novel DNA sequencing strategies for identifying strain-level microbial variation and community temporal dynamics; measuring multiple 'omic' data types that better capture community functional activity, such as transcriptomics, proteomics and metabolomics; and combining multiple forms of omic data in an integrated framework. We highlight studies in which the 'multi-omics' approach has led to improved mechanistic models of microbial community structure and function.

Figure 1: Optimizing experimental design.
Figure 2: Profiling strain-level variation in microbial communities.
Figure 3: Relating the metatranscriptome and metagenome in the human gut.
Figure 4: Integrating multi-omic data for deeper biological insights.

This work was funded in part by US National Institutes of Health grants R01HG005969 and U54DK102557 (to C.H. and R. J. Xavier); US National Science Foundation grant DBI-1053486 (to C.H.); US Army Research Office grant W911NF-11-1-0473 (to C.H.); and Danone Research grant PLF-5972-GD (to W. S. Garrett).

Metagenomics

The application of high-throughput DNA sequencing to profile the genomic composition of a microbial community in a culture-independent manner.


Microbiome

The community composition, biomolecular repertoire and ecology of microorganisms inhabiting particular environments.


Multi-omics

An experimental approach that combines two or more distinct high-throughput molecular biological (omics) assays, such as genomics, transcriptomics, proteomics and metabolomics. The resulting data are generally analysed and combined by integrative methods.

Low-error amplicon sequencing

(LEA–seq). An amplicon sequencing strategy designed to distinguish rare biological variation from sequencing errors, thus leading to more accurate profiling of low-abundance taxa in a community.


Reads

Short DNA or RNA sequences derived from a high-throughput sequencing experiment. Reads are often described as 'paired', which indicates that two sequences were derived from opposite ends of the same DNA or RNA fragment.

Single-nucleotide polymorphisms

(SNPs). Positions in a reference genome that occur in more than one nucleotide state (A, C, G or T) among the members of a population.
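
As a minimal sketch of this definition, the function below reports positions at which a set of sequences shows more than one nucleotide state. It assumes the sequences are already aligned to a common reference and have equal length. The function name and the toy sequences are hypothetical, and real SNP callers additionally model sequencing error and coverage.

```python
def find_snps(aligned_sequences):
    """Return {position: sorted nucleotide states} for polymorphic positions.

    Positions are 0-based. Characters other than A, C, G and T (for example
    gaps or N) are ignored when counting states.
    """
    if not aligned_sequences:
        return {}
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    snps = {}
    for pos in range(length):
        states = {seq[pos].upper() for seq in aligned_sequences} & set("ACGT")
        if len(states) > 1:
            snps[pos] = sorted(states)
    return snps


if __name__ == "__main__":
    # Hypothetical aligned sequences from four members of a population.
    population = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAT"]
    print(find_snps(population))
```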


Contig

An assemblage of overlapping DNA or RNA reads from a high-throughput sequencing experiment. Contigs capture larger, continuous sections of genomic (or transcript) material than those represented by individual reads.
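
The idea of joining overlapping reads into a longer sequence can be sketched as follows. This is a deliberately simplified illustration that assumes error-free reads supplied in their correct left-to-right order. Production assemblers do not work this way (they typically use overlap or de Bruijn graphs), and the function names and the minimum overlap of 3 are assumptions made for the example.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if k > 0 and left[-k:] == right[:k]:
            return k
    return 0


def merge_ordered_reads(reads, min_overlap=3):
    """Merge reads, given in order, into one contig via exact overlaps."""
    if not reads:
        raise ValueError("at least one read is required")
    contig = reads[0]
    for read in reads[1:]:
        k = overlap_length(contig, read, min_overlap)
        if k == 0:
            raise ValueError(f"read {read!r} does not overlap the contig")
        contig += read[k:]  # append only the non-overlapping tail
    return contig


if __name__ == "__main__":
    # Hypothetical reads that tile a 14-base region.
    print(merge_ordered_reads(["ATGGCGT", "GCGTACC", "ACCTTGA"]))
```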

Horizontal gene transfer

(HGT). A process in which genetic material is transferred from one cell to the genome of another cell by a method other than normal reproduction (that is, vertical transmission from a mother cell to a daughter cell). HGT is also referred to as lateral gene transfer (LGT).


Sporulation

A stress response mechanism used by (primarily Gram-positive) bacteria to survive periods of nutrient depletion.


Microbiota

The collection of microorganisms (of all types: bacteria, archaea, viruses and eukaryotes) inhabiting a particular environment.

Flux balance analysis

A computational method for representing the steady-state metabolic network of an organism or community and evaluating its ability to produce a set of target metabolites from a set of input metabolites.
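
In its commonly used form, flux balance analysis is posed as a linear programme. The formulation below is the standard textbook statement rather than anything specific to this Review, and the symbols are assumed notation: S is the stoichiometric matrix (rows are metabolites, columns are reactions), v is the vector of reaction fluxes, l and u are lower and upper flux bounds (which encode the available input metabolites through uptake limits), and c selects the reaction or reactions that produce the target metabolites.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l \le v \le u
\end{aligned}
```

The constraint S v = 0 expresses the steady-state assumption that no internal metabolite accumulates or is depleted. A strictly positive optimum indicates that, under the stated bounds, the network can produce the targets from the supplied inputs.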

