Microbiomes, localized communities of microorganisms that exist symbiotically with their immediate environment, can be found virtually anywhere: inside the human colon, around plant roots, inside coral reefs and even within ant colonies. These little microbial microcosms are a hot topic right now. Recent studies have implicated them in the wellbeing of people, animals, plants and entire oceans (1–3). Variations in microbial numbers and diversity within animals have been linked to auto-immune disease, obesity, acne, tooth decay, pregnancy and even brain chemistry (4–7). Plant studies have shown that symbiotic organisms impart resistance to drought, increase nutrient absorption and prevent attack by pathogens (8, 9).


Here, we discuss what happens after nucleic acids are extracted using QIAGEN’s sample prep kits. How can these conglomerations of microbial DNA/RNA be converted into useful, quantifiable information?


Amplicon and whole genome sequencing


First and foremost, to understand how microbiomes might be important, we need to know what microbes are present. Not too long ago, the only reasonable way to do this was to culture the microbes, growing enough of them that cell morphology, staining, antibodies and protein analysis could be used to identify the species. There are two major problems with this method. First, you lose all the information about the initial relative concentrations of the microbes. Second, because many microbes will not grow under typical laboratory conditions, their existence could be missed entirely. With the advent of faster and less expensive DNA sequencing, the culturing step can be eliminated. Microbial DNA and RNA can be extracted and isolated directly from a sample of soil, water, biofilm or stool, among other interesting bio-substances. Once the microbial nucleic acids are isolated, they can be sequenced to determine which species exist within the sample.


There are two common sequencing methods currently being used for this purpose. The first, targeted amplicon sequencing, requires knowing something about the community of microbes that might be in the sample to begin with. Scientists make use of the gene for the 16S ribosomal RNA (10). It is among the most highly conserved genes in bacteria and archaea, but it also contains several hypervariable regions that have diverged over time. By selecting primers that target the conserved regions flanking the variable regions, PCR can be used to amplify those divergent sections of the microbial genomic DNA, and the sequence differences make it possible to identify microbes. But while the 16S ribosomal RNA gene is slightly different for virtually all species, no single hypervariable region can be used to distinguish all bacteria. Scientists do the best they can by selecting sets of primers that will characterize the most common bacteria in their sample. These are often referred to as ‘universal’ primers, but they are not 100% universal, which is one drawback of the method.
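The primer logic described above can be illustrated with a toy "in-silico PCR". This is only a sketch: the primers and template sequences are invented for illustration (they are not real 16S primers), matching is exact, and real primer design has to deal with mismatches and degenerate bases.

```python
# Toy in-silico PCR. All sequences and primers here are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

FWD_PRIMER = "ACGTACGT"  # matches a conserved region upstream of the variable region
REV_PRIMER = "TTAAGGCC"  # anneals to the opposite strand, downstream of it


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_amplicon(template, fwd_primer, rev_primer):
    """Return the stretch bounded by both primers, or None if a site is missing."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # On the template strand the reverse primer site appears as its
    # reverse complement.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def variable_region(template):
    """Return the sequence between the two primer sites, or None."""
    amplicon = extract_amplicon(template, FWD_PRIMER, REV_PRIMER)
    if amplicon is None:
        return None
    return amplicon[len(FWD_PRIMER):-len(REV_PRIMER)]


# Two made-up "species" share the conserved flanks but differ in between.
SPECIES_A = "TTT" + "ACGTACGT" + "AAAATCTC" + "GGCCTTAA" + "CCC"
SPECIES_B = "GG" + "ACGTACGT" + "CATCATTA" + "GGCCTTAA" + "AT"
# A third one carries a mutation in the forward primer site.
SPECIES_C = "TTT" + "ACGAACGT" + "GGGGTTTT" + "GGCCTTAA" + "CCC"

if __name__ == "__main__":
    for name, seq in [("A", SPECIES_A), ("B", SPECIES_B), ("C", SPECIES_C)]:
        print(name, variable_region(seq))
```

The third template yields no amplicon at all, which is the toy version of a ‘universal’ primer set failing to pick up a microbe.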


In the second method, whole genome sequencing, DNA or RNA is isolated directly from environmental samples and sequenced without a PCR amplification step. Sampled fragments of the whole genome or transcriptome are used to create libraries, which are then sequenced in a pool. The genomic sequences are reconstructed and compared against databases of known microbial sequences, like MBGD or KEGG, using comparative sequence analysis programs like BLAST. A disadvantage of this method is that not all microbial species have been sequenced or accurately cataloged in these databanks. So again, some microbes can be missed.
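To make the database comparison step concrete, here is a deliberately simplified stand-in. It is not BLAST and does not query any real database: it assigns a read to whichever reference shares the most short subsequences (k-mers) with it. The reference names, sequences and the `min_shared` threshold are all assumptions made up for this sketch.

```python
# Toy reference matching by shared k-mers (a stand-in for alignment tools).
# Reference names and sequences are hypothetical.

REFERENCES = {
    "microbe_A": "ATGGCGTACGTTAGCCGATAGGCT",
    "microbe_B": "TTGACCGGTATTCCGAAGTCTGAA",
}


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(read, references, k=4, min_shared=3):
    """Return (reference name, shared k-mer count) for the best reference.

    The name is None when no reference shares at least min_shared k-mers,
    i.e. the read comes from something the "database" does not contain.
    """
    read_kmers = kmers(read, k)
    best_name, best_score = None, 0
    for name, ref_seq in references.items():
        score = len(read_kmers & kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_shared:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    print(best_match("GTACGTTAGCCG", REFERENCES))  # fragment of microbe_A
    print(best_match("CACACACACACA", REFERENCES))  # not in the references
```

The second read goes unassigned, which mirrors the drawback noted above: a microbe that is absent from the reference collection cannot be identified by comparison.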


Analysis of similarity


Once microbial populations are identified through nucleic acid isolation, sequencing and analysis, scientists then need to turn this into quantifiable information about what those population differences might mean. To achieve this goal, studies make comparisons between the microbial populations of very well-defined groups. Sometimes, the groups are defined by their state of health, for example, people with diabetes compared to those without. They could also be defined by where the samples were collected, like soil from the Antarctic versus soil from the Russian tundra.
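Before groups can be compared, the difference between any two samples has to be expressed as a number. The text does not name a particular measure, so the sketch below assumes Bray-Curtis dissimilarity, a common choice for abundance data that runs from 0 (identical composition) to 1 (no taxa in common). The taxon labels and counts are invented.

```python
# Bray-Curtis dissimilarity between two samples given as {taxon: count}.
# The choice of measure and the example counts are assumptions.


def bray_curtis(sample_a, sample_b):
    """Return 1 - 2 * (sum of shared counts) / (total counts in both samples)."""
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    taxa = set(sample_a) | set(sample_b)
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    return 1 - 2 * shared / total


# Hypothetical read counts per taxon in two samples.
SAMPLE_1 = {"taxon_A": 60, "taxon_B": 30, "taxon_C": 10}
SAMPLE_2 = {"taxon_A": 20, "taxon_B": 30, "taxon_D": 50}

if __name__ == "__main__":
    print(bray_curtis(SAMPLE_1, SAMPLE_2))
```

Computing this value for every pair of samples gives a dissimilarity matrix, which is the usual starting point for group-level comparisons.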


Scientists compare the microbial population data from these well-defined groups and perform what’s called an analysis of similarity (ANOSIM) to determine whether the groups differ more than they would simply by chance. This type of analysis can be tricky, and in research it’s not applied only to microbial populations. Basically, the dissimilarities between all pairs of samples are ranked, and the ranks of pairs from different groups are compared with the ranks of pairs from the same group. To judge significance, you first generate data assuming that group membership makes no difference, typically by randomly reshuffling the group labels many times. These random results are then compared to the actual data. From this analysis, a statistic called the R value is computed. When applied to microbial communities, the R value, a number between –1 and +1, indicates how well separated the microbial populations of the groups are. A value close to +1 indicates that samples within a group are more similar to each other than to samples from other groups, meaning the groups have clearly distinct microbial populations. An R value near zero indicates that samples are no more similar within groups than between them, so group membership tells you nothing about the populations. A negative value, which is uncommon, means that samples are more dissimilar within groups than between groups, and often points to a problem with how the groups were defined. The math enables us to tease out the information that’s hidden in the microbial mixture that is a microbiome. It enables us to go from extracted nucleic acids to PCR and sequencing data to insight into the microbial soup around us.
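As a rough illustration of the calculation, the sketch below computes the R statistic in its usual rank-based form, R = (mean between-group rank − mean within-group rank) / (M/2), where M is the number of sample pairs, and estimates a p-value by reshuffling the group labels. The dissimilarity values, group names, number of permutations and seed are made up for the example. For real analyses an established implementation (for instance in the vegan R package or scikit-bio) is the safer choice.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, labels):
    """Return the ANOSIM R statistic for a square dissimilarity matrix."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, labels, permutations=999, seed=0):
    """Return (R, p), with p estimated by shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical data: 4 + 4 samples, small dissimilarities within a group,
# large ones between groups.
LABELS = ["healthy"] * 4 + ["diabetic"] * 4
N = len(LABELS)
DIST = [[0.0 if i == j
         else (0.2 if LABELS[i] == LABELS[j] else 0.8) + 0.01 * abs(i - j)
         for j in range(N)] for i in range(N)]

if __name__ == "__main__":
    r_value, p_value = anosim(DIST, LABELS)
    print(f"R = {r_value:.2f}, p = {p_value:.3f}")
```

In this toy matrix every between-group pair is more dissimilar than every within-group pair, so R comes out at its maximum of +1. With so few samples, only a limited number of distinct label arrangements exist, which puts a floor under the achievable p-value.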


Check out QIAGEN’s newest products for your library prep needs:


QIAseq 1-Step Amplicon Library Kit


The QIAseq 1-Step Amplicon Library Kit combines end-repair and ligation into a single 30-minute, room-temperature incubation for the simplest, easiest way to prepare your amplicons for Illumina sequencing.


QIAseq FX DNA Library Kit


The QIAseq FX DNA Library Kit takes you from 1 ng – 1 μg of genomic DNA to sequencer-ready, whole genome libraries in just 2.5 hours.


QIAseq Ultralow Input Library Kit


The QIAseq Ultralow Input Library Kit enables the generation of high-quality libraries starting from just 10 pg – 100 ng of fragmented DNA and offers a robust solution for a wide range of research applications.



  1. Gilbert, J.A., Jansson, J.K., Knight, R. (2014) The earth microbiome project: successes and aspirations. BMC Biol. 12, doi:10.1186/s12915-014-0069-1.
  2. Hunter, P. (2016) Plant microbiomes and sustainable agriculture: Deciphering the plant microbiome and its role in nutrient supply and plant immunity has great potential to reduce the use of fertilizers and biocides in agriculture. EMBO Rep. 17, 1696–1699.
  3. Moran, M.A. (2015) The global ocean microbiome. Science 350, doi:10.1126/science.aac8455.
  4. Gilbert, J.A. et al. (2016) Microbiome-wide association studies link dynamic microbial consortia to disease. Nature 535, 94–103.
  5. Parekh, P.J., Balart, L.A., Johnson, D.A. (2015) The influence of the gut microbiome on obesity, metabolic syndrome and gastrointestinal disease. Clin. Transl. Gastroenterol. 6, doi:10.1038/ctg.2015.16.
  6. Koren, O. et al. (2012) Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470–480.
  7. Smith, P.A. (2015) The tantalizing links between gut microbes and the brain. Nature 526, 312–314.
  8. Schlaeppi, K., Bulgarelli, D. (2015) The plant microbiome at work. Mol. Plant Microbe Interact. 28, 212–217.
  9. Turner, T.R., James, E.K., Poole, P.S. (2013) The plant microbiome. Genome Biol. 14, 209.
  10. Weisburg, W.G., Barns, S.M., Pelletier, D.A., Lane, D.J. (1991) 16S ribosomal DNA amplification for phylogenetic study. J. Bacteriol. 173, 697–703.