Comparing the new 16S rRNA V4 and ITS primers to the old primers-RESULTS!

The Knight lab has been working hard testing new primers for 16S rRNA amplicon production, and it's time to share our progress.

So far, the 16S rRNA V4 region forward primer (designated 515f) has been paired with five different reverse primers (806r, 926r, 967r, 1048r, and 1391r) to amplify ribosomal RNA from bacteria, Archaea, and Fungi. Thanks to Jed Fuhrman and Amy Apprill, the 515f and 806r primers have also been modified to help minimize the amplification bias against Crenarchaeota/Thaumarchaeota (515f) and SAR11 (806r). All primer pairs have successfully yielded PCR amplicons, and the amplicons from the 515f/806r and 515f/926r constructs have been sequenced. The remaining primer pair constructs will be sequenced soon, with an update to follow once we have the results.

The differences between the old and new 515f and 806r constructs are described below:

Original 515f construct / modified construct (Jed Fuhrman, C to Y base change on the 5’ end)


Original 806r construct / modified construct (806rB, Amy Apprill, H to N base change mid-primer):


Why the new constructs, you ask?  And what does the added degeneracy mean?

  1. The barcodes, which were previously located on the reverse primer, are now located on the forward (515f) primer. This lets the user pair the forward primer with various reverse primer constructs to produce longer amplicons. We’ve tested the barcoded 515f primer with 806r and 926r. Importantly, the barcoded constructs were screened in silico for secondary structure against a number of longer constructs (967r, 1048r, 1391r). We have tested the performance of these longer constructs in PCR but have not validated the results on the MiSeq or HiSeq platforms.
  2. The degeneracy was added to the forward and reverse primers to minimize the bias against Crenarchaeota/Thaumarchaeota (515f modification) and the marine and freshwater alphaproteobacterial clade SAR11 (806r modification).
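
To make "added degeneracy" concrete: in the standard IUPAC nucleotide code, Y stands for C or T, H for A, C, or T, and N for any base, so each base change described above widens the set of template sequences a primer can match. The sketch below is illustrative only. The short fragments in it are hypothetical stand-ins, not the actual primer sequences, and the helper names `expand` and `degeneracy` are made up for this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer):
    """Return every concrete sequence a degenerate primer stands for."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer):
    """Count the concrete sequences without enumerating them."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


# Hypothetical fragments for illustration, NOT the real primer sequences.
old_fwd, new_fwd = "GTGCCA", "GTGYCA"   # C -> Y doubles the variants
old_rev, new_rev = "TACHVG", "TACNVG"   # H -> N: 3 choices become 4

print(expand(new_fwd))
print(degeneracy(old_fwd), degeneracy(new_fwd))
print(degeneracy(old_rev), degeneracy(new_rev))
```

A more degenerate primer is effectively a pool of closely related oligos, which is how a single base change can recover taxa whose template differs at that one position.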

To compare the new primer constructs to the old ones and thus confirm the performance of the new constructs, we sequenced amplicons produced from both constructs applied to a number of studies.  Our intent was to sample a wide range of sample types to confirm that the new primer constructs produce data comparable to that obtained using the old constructs on a variety of sample types.  The studies/sample types that the constructs were tested on are:

- 5 American Gut fecal samples
- 5 American Gut skin samples
- 5 Body farm control soil samples
- 6 Body farm paired skin/soil samples
- 33 Sloan house samples (various sites)
- 15 Mouse decomposition control soils
- 9 Rice rhizome samples
- 12 Agricultural soil samples

Below is a procrustes plot (using the unweighted UniFrac distance matrix on the left and the weighted UniFrac distance matrix to take into account taxa abundances on the right) comparing samples amplified using the original primer construct and the new, modified primer construct.  The calculated M2 value for the unweighted UniFrac based plot is 0.111 and for the weighted UniFrac based plot is 0.196.  With the exception of a few mouse decomposition and built environment samples, each sample produces extremely comparable results between the old and new primer constructs.  Importantly, very commonly studied sample types (stool, soil, skin) perform very well under the new constructs.
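
For readers unfamiliar with the M2 statistic: it is the sum of squared distances left over after one ordination has been translated, scaled, and rotated (or reflected) onto the other, with both configurations first standardized to unit size. A value of 0 means a perfect match, and larger values (up to 1) mean a poorer fit. The sketch below is not the pipeline used for the plots in this post. It assumes only two ordination axes, where the best fit has a closed form, and the function names and toy coordinates are invented for illustration.

```python
import math


def standardize(points):
    """Center a 2-D configuration and scale it to unit Frobenius norm."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    return [(x / norm, y / norm) for x, y in centered]


def procrustes_m2(ref, other):
    """Procrustes M^2 between two 2-D configurations of the same samples.

    After standardization, M^2 = 1 - s^2, where s is the sum of the
    singular values of the 2x2 cross-product matrix A = X^T Y. For a 2x2
    matrix, s^2 = ||A||_F^2 + 2 * |det A|, so no SVD routine is needed.
    """
    X, Y = standardize(ref), standardize(other)
    a = sum(x[0] * y[0] for x, y in zip(X, Y))
    b = sum(x[0] * y[1] for x, y in zip(X, Y))
    c = sum(x[1] * y[0] for x, y in zip(X, Y))
    d = sum(x[1] * y[1] for x, y in zip(X, Y))
    frob_sq = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return 1.0 - (frob_sq + 2.0 * abs(det))


# Toy coordinates (assumed): the same four samples under two constructs.
old_coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
rotated = [(5, 5), (5, 7), (3, 7), (3, 5)]      # rotated, scaled, shifted copy
distorted = [(0, 0), (2, 0), (1, 1), (0, 3)]    # genuinely different shape

print(procrustes_m2(old_coords, rotated))
print(procrustes_m2(old_coords, distorted))
```

Real ordinations usually keep three or more axes, in which case a general singular value decomposition replaces the 2x2 shortcut used here.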


Overall abundance comparisons at various taxonomic levels are shown in the pie charts below. For the purposes of clarity, low abundance taxa (less than 0.01% relative abundance) were removed before generating these charts.
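
The low-abundance filter mentioned above can be sketched as follows. This is a minimal illustration, not the code used for the charts: it assumes the 0.01% cutoff is applied to each taxon's share of the total counts, and the function name, taxon names, and counts are hypothetical.

```python
def drop_low_abundance(counts, min_fraction=0.0001):
    """Keep taxa whose relative abundance is at least min_fraction.

    counts maps taxon name -> read count; 0.0001 corresponds to 0.01%.
    """
    total = sum(counts.values())
    return {taxon: n for taxon, n in counts.items() if n / total >= min_fraction}


# Hypothetical counts for illustration only.
example = {"Bacteroidetes": 60000, "Firmicutes": 39995, "RareTaxon": 5}
print(drop_low_abundance(example))
```

Whether the threshold is applied per sample or across the whole dataset changes which taxa survive, so it is worth stating that choice when comparing charts.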


At the phylum level, the relative abundances of the major taxa present in the dataset vary at most by 2.5% (Bacteroidetes) between the old and new primer constructs. Importantly, the ratios between various taxa follow the same trend, indicating that the new constructs perform similarly to the old constructs. Four sample pairs (a total of 8 samples out of 86 samples tested) did not perform identically between the two constructs: one pair of mouse decomposition samples and one pair of human decomposition (body farm) samples. The taxonomic composition of these sample pairs varies greatly (see taxonomy plots below), with no obvious patterns. Importantly, these samples do not resemble typical samples of their type. Together, these observations suggest that the issue is likely not due to the new primer constructs. We hypothesize that the difference could be due either to sample mislabeling or to depletion of some sample DNA as we tested multiple primer constructs; however, we cannot confirm or refute this hypothesis. Nevertheless, we feel confident concluding that the new primer constructs perform comparably to the old constructs, producing data from which comparable conclusions can be drawn.


The R2 values for each taxonomic level for each sample type/study, with the “outlier” mouse decomp and body farm samples removed, are listed below, followed by the corresponding per-taxon scatter plots (genus level).

AG fecal

AG skin Agricultural Soils Rice rhizome Body farm 1 Body farm 2 Mouse decomp Sloan built environment (house)


0.8833 0.9546 0.9799 0.8630 0.9172 0.9075 0.9148


0.8283 0.8613 0.9398 0.5579 0.8982 0.6658




0.8928 0.9178 0.8291 0.2961 0.6644 0.6460




0.7712 0.9270 0.8942 0.6181 0.8454 0.7769



0.9400 0.7914 0.9082 0.8466 0.5092 0.8690 0.6794
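
As a rough guide to reading the R2 values above, the sketch below computes R2 as the squared Pearson correlation between per-taxon relative abundances from the two constructs. That definition is an assumption, since the post does not spell out how R2 was calculated, and the function names and numbers are hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def paired_r_squared(old, new):
    """Compare two taxon -> relative abundance tables.

    Taxa missing from one table are treated as having abundance 0.
    """
    taxa = sorted(set(old) | set(new))
    return r_squared([old.get(t, 0.0) for t in taxa],
                     [new.get(t, 0.0) for t in taxa])


# Hypothetical relative abundances for one sample type.
old_construct = {"TaxonA": 0.50, "TaxonB": 0.30, "TaxonC": 0.20}
new_construct = {"TaxonA": 0.48, "TaxonB": 0.33, "TaxonC": 0.19}
print(paired_r_squared(old_construct, new_construct))
```

Because a few dominant taxa can drive this statistic, a high R2 at the phylum level does not guarantee agreement among rarer genera, which is consistent with the lower values at finer taxonomic levels.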




Overall, the modified primers perform comparably to the old primers, especially when applied to commonly studied samples (soil, skin, feces, and even the built environment). We are confident that the vast majority of researchers who have been using the old constructs can transition to the new constructs without issue.

Additionally, we have tested (on the same studies mentioned above) and will send out ITS1 spanning constructs based upon the constructs by Smith and Peay (Smith DP & Peay KG (2014) Sequence Depth, Not PCR Replication, Improves Ecological Inference from Next Generation DNA Sequencing. PLoS ONE 9, e90234; For prior constructs, see ITS1F and ITS2 primers from table 1 of: Op De Beeck M, Lievens B, Busschaert P, Declerck S, Vangronsveld J & Colpaert JV (2014)). The new constructs gave improved yields compared to the old constructs (similar results with initial demultiplexing/quality filtering, but approximately 2X the number of reads clustering against the May 2014 release of the UNITE database). This appeared to impact the overall taxon abundances achieved with each primer construct, as seen in the taxonomy plots below (low abundance taxa (less than 0.01% relative abundance) were removed before generating the plots).


Importantly, even though the relative abundances varied greatly in some cases between the two primer constructs, the overall clustering patterns of specific sample types remained the same, as seen in the procrustes plot (produced from the Bray-Curtis distance matrix) below.


The R2 values for each taxonomic level for each sample type/study are listed below, followed by the corresponding per-taxon scatter plots (phylum and genus level).

*Note: rare taxa were filtered out of the dataset before R2 values were calculated.

AG skin

Agricultural Soils Rice rhizome Body farm 1 Body farm 2 Mouse decomp

Sloan built environment (house)



0.9017 0.9246 0.9018 0.9030 0.9245




0.6466 0.9389 0.5678 0.3587 0.6212




0.6331 0.9334 0.6620 0.5356 0.4103




0.7005 0.8828 0.4942 0.5281 0.4509




0.7084 0.8884 0.7411 0.3767 0.3560


The R2 values are higher in sample types expected to have a considerable amount of fungi present, such as agricultural soils, rice, and decomposition.


The performance of the new ITS1 spanning primer constructs compared to the old constructs varies depending on the sample type. Samples expected to possess a considerable fungal community (soil, rice, decomposition to some extent) perform more similarly between the two primer constructs. Importantly, the new constructs are more successful in that a larger number of reads produced through these constructs align against the UNITE fungal database, suggesting that the new constructs will be a good tool for researchers to characterize the fungal microbial communities present in their datasets. However, some researchers will need to take care when comparing datasets amplified using the different constructs, as their comparability depends on sample type and on the biological effect size of the phenomenon under investigation.

We are quite excited about the current work on these primers, which should greatly increase the PCR amplification options available to microbiome researchers working in a variety of environments, including the built environment.


I would like to thank Tony Walters, PhD, and Greg Humphreys for their help in generating the data for this blog post.  Many thanks also to Jed Fuhrman, Amy Apprill, Janet Jansson, Jack Gilbert, and of course, Rob Knight, for input, comments, and hard work producing the data.


What exactly is that sequencing data?

The idea for GenomePeek began two years ago when I was working with Karl Klose, Liz Dinsdale, and Rob Edwards to assemble a P. salmonis genome that was being particularly difficult, even though we had 9 gigabases of sequencing. To check whether it was a single isolated genome, I pulled out all the reads that hit to 16S and then assembled them. All of the assembled contigs hit to P. salmonis; it was only later that I found that the genome had been shredded by an overactive transposon. We still haven’t solved that problem ….

Looks like someone did not do a complete bacterial isolation.

GenomePeek lay dormant for a year, until a student from SDSU’s bacterial sequencing class came looking for help. In the class, the students were isolating a bacterium, then sequencing, assembling, and analyzing its genome. The student wanted to create a recA phylogeny and was having trouble with taxonomic assignment. Their question was: “Which RecA gene do I use?” This set off a red flag immediately, since RecA is essentially a single-copy housekeeping gene. On inspection, their assembled genome did indeed have two full copies of RecA: one that hit to a Vibrio and one that hit to a Photobacterium. At the time, both 16S rRNA genes hit to Vibrio, although they were clearly different from each other and hit to different species (since then, a representative Photobacterium 16S sequence has been added to the NCBI 16S database and is now the top hit). It turned out that this bacterium also had two copies of other single-copy housekeeping genes (I checked rpoB, groEL, nifD, gyrB, and fusA). One copy of each gene hit to a Vibrio species while the other was most similar to a Photobacterium species. Suffice it to say, the student was very disappointed after spending a few weeks analyzing the genome and writing up a paper. That gave me the idea for a tool where one could submit sequencing data and quickly get back a set of useful housekeeping genes for phylogenetic analysis. I thought this tool would save time otherwise wasted on assembly, annotation, and analysis. By quickly checking sequences, we could easily detect whether the original sequencing data was contaminated. I wrote (more…)


2014 Microbiology of the Built Environment Fellows

The Sloan Foundation recently announced its 2014 Microbiology of the Built Environment Postdoctoral Fellows. The awards, along with the titles of the projects, are below. Congrats all! Look forward to detailed blog posts from all the awardees describing their upcoming projects.


Huan Gu at Syracuse, along with Dacheng Ran: “Understanding and controlling biofilms in the built environment”

Sarah-Jane Haig at University of Michigan, along with Lutgarde Raskin and John LiPuma: “Regulation of the microbial community structures in drinking water, from source to tap”

Brian Klein at Forsyth Institute, along with Katherine Lemon: “Microbiomes of indoor track facilities and runners who train indoors vs outdoors”

Zachery Lewis at UC Davis, along with David Mills and Katie Hinde (Harvard): “Role of the built environment as a venue for microbial cross inoculation between infants”


New Paper : On the intrinsic sterility of 3D printing

As a biologist with a 3D printer, one of the questions I get most often about 3D printed parts is, “Can you autoclave these things?” As it turns out, no, not really. There are only a handful of thermoplastics that can survive the autoclave process, and most of them are not very good for 3D printing. With few exceptions, only polypropylene and blends of polypropylene hold up to repeated autoclave cycles, and polypropylene is, unfortunately, a very difficult material to print. It shrinks a lot when it cools, which causes a lot of warping during printing, and it is very difficult to get molten polypropylene to bond strongly to cooler, solid polypropylene.

It turns out that this is all unnecessary. Fused deposition modeling (FDM) 3D printing involves shoving a rod of thermoplastic into a hot nozzle until it melts and squirts out the nozzle. For most popular 3D printing plastics like ABS and PLA, the nozzle temperature is somewhere between 180C and 260C, and the plastic stays at that temperature for around a minute, depending on what the toolpath looks like. It’s actually a lot like Pasteurization, except way overkill. Get it? Overkill?

Anyway, here’s how FDM 3D printing compares to various Pasteurization (in black) and autoclave (in red) protocols:

FDM 3D printing compared to pasteurization (black) and autoclave (red) protocols.



“The Dirt on Antimicrobials”

Bill Walsh of The Healthy Building Network has posted a story, “The Dirt on Antimicrobials,” that covers concerns about the health effects of the chemicals themselves but does not address the currently popular subject of the health harms or benefits of exposure to the multitude of microbes in, on, and around us.

The post starts out:

“The infusion of antimicrobial materials into building products is on the rise. Manufacturers now routinely add substances such as nano-silver and the pesticide triclosan to paints, tiles and grouts, carpets, solid surfaces, faucets, elevator buttons and toilet seats. The dirty truth is: they do not make people healthier. They do cause environmental harm throughout their lifecycle. And their overuse, like the overuse of antibiotics, may contribute to the evolution of microbes that are more resistant to our known antimicrobial defenses.”

See the complete story here.
