The Knight lab has been working hard testing new primers for 16S rRNA amplicon production, and it's time to share our progress.
So far, the 16S rRNA V4 region forward primer (designated 515f) has been paired with five different reverse primers (806r, 926r, 967r, 1048r, and 1391r) to amplify ribosomal RNA from bacteria, archaea, and fungi. Thanks to Jed Fuhrman and Amy Apprill, the 515f and 806r primers have also been modified to help minimize the amplification bias against Crenarchaeota/Thaumarchaeota (515f) and SAR11 (806r). All primer pairs have successfully yielded PCR amplicons, and the amplicons from the 515f/806r and 515f/926r constructs have been sequenced. The remaining primer pair constructs will be sequenced soon, with an update to follow once we have the results.
The differences between the old and new 515f and 806r constructs are described below:
Original 515f construct / modified construct (Jed Fuhrman, C to Y base change on the 5′ end):
5′-GTGCCAGCMGCCGCGGTAA-3′ / 5′-GTGYCAGCMGCCGCGGTAA-3′
Original 806r construct / modified construct (806rB, Amy Apprill, H to N base change mid-primer):
5′-GGACTACHVGGGTWTCTAAT-3′ / 5′-GGACTACNVGGGTWTCTAAT-3′
Why the new constructs, you ask? And what does the added degeneracy mean?
- The barcodes, which were previously located on the reverse primer, are now located on the forward (515f) primer. This lets the user pair the barcoded forward primer with various reverse primer constructs to produce longer amplicons. We've tested the barcoded 515f primer with 806r and 926r. Importantly, the barcoded constructs were screened in silico for secondary structure against a number of longer constructs (967r, 1048r, 1391r). We have tested the performance of these longer constructs in PCR but have not yet validated the results on the MiSeq or HiSeq platforms.
- The degeneracy was added to the forward and reverse primers to minimize the bias against Crenarchaeota/Thaumarchaeota (515f modification) and the marine and freshwater alphaproteobacterial clade SAR11 (806r modification).
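To make the added degeneracy concrete, here is a minimal Python sketch that counts and lists the concrete sequences each degenerate primer stands for. It relies only on the standard IUPAC nucleotide ambiguity codes (for example, M = A/C, Y = C/T, H = A/C/T, N = any base). The function names `degeneracy` and `expand` are illustrative choices of ours, not part of any pipeline used for this work.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]

# Primer sequences as given in this post.
primers = {
    "515f original": "GTGCCAGCMGCCGCGGTAA",
    "515f modified": "GTGYCAGCMGCCGCGGTAA",
    "806r original": "GGACTACHVGGGTWTCTAAT",
    "806r modified": "GGACTACNVGGGTWTCTAAT",
}

for name, seq in primers.items():
    print(name, degeneracy(seq))
```

Under these codes, the 515f change takes the primer pool from 2 to 4 variants and the 806r change from 18 to 24, and every sequence in each original pool is still present in the corresponding modified pool.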
To confirm the performance of the new primer constructs, we sequenced amplicons produced with both the old and new constructs from samples drawn from a number of studies. Our intent was to cover a wide range of sample types and confirm that the new constructs produce data comparable to those obtained with the old constructs. The studies and sample types tested were:
- 5 American Gut fecal samples
- 5 American Gut skin samples
- 5 Body farm control soil samples
- 6 Body farm paired skin/soil samples
- 33 Sloan house samples (various sites)
- 15 Mouse decomposition control soils
- 9 Rice rhizome samples
- 12 Agricultural soil samples
Below is a Procrustes plot comparing samples amplified using the original primer construct and the new, modified primer construct, based on the unweighted UniFrac distance matrix (left) and on the weighted UniFrac distance matrix, which takes taxon abundances into account (right). The calculated M² value is 0.111 for the unweighted UniFrac plot and 0.196 for the weighted UniFrac plot. With the exception of a few mouse decomposition and built environment samples, each sample produces highly comparable results with the old and new primer constructs. Importantly, very commonly studied sample types (stool, soil, skin) perform very well with the new constructs.
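For readers unfamiliar with the M² statistic, the following is a rough sketch of how it can be computed for two matched two-dimensional ordinations. It is not the code used to produce the plots in this post; the function name `procrustes_m2` and the toy coordinates are hypothetical, and the complex-number shortcut only works in two dimensions. Each configuration is centered and scaled to unit size, then one is rotated (or reflected) and rescaled onto the other. M² is the sum of squared distances that remains, so 0 means a perfect match and values near 1 mean little agreement.

```python
def procrustes_m2(coords_a, coords_b):
    """M^2 for two matched 2-D point sets given as lists of (x, y) tuples."""
    if len(coords_a) != len(coords_b) or len(coords_a) < 2:
        raise ValueError("need two matched sets of at least two points")

    def standardize(coords):
        # Treat each 2-D point as a complex number, center on the centroid,
        # and scale so the sum of squared norms equals 1.
        pts = [complex(x, y) for x, y in coords]
        centroid = sum(pts) / len(pts)
        centered = [p - centroid for p in pts]
        size = sum(abs(p) ** 2 for p in centered) ** 0.5
        if size == 0:
            raise ValueError("all points coincide")
        return [p / size for p in centered]

    a = standardize(coords_a)
    b = standardize(coords_b)
    # Best fit using rotation + scaling only.
    rotation_fit = abs(sum(p * q.conjugate() for p, q in zip(a, b)))
    # Best fit when a reflection is also allowed.
    reflection_fit = abs(sum(p * q for p, q in zip(a, b)))
    return 1.0 - max(rotation_fit, reflection_fit) ** 2

# Toy example: "new" is "old" rotated by 90 degrees, doubled in size, and shifted.
old = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 3.0)]
new = [(5.0 - 2.0 * y, 2.0 * x - 1.0) for x, y in old]
print(procrustes_m2(old, new))  # essentially 0, up to floating-point rounding
```

In practice the comparison is run on more than two principal coordinate axes, which needs a general-purpose routine (for example, one based on a singular value decomposition) in place of this two-dimensional shortcut.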
Overall abundance comparisons at various taxonomic levels are shown in the pie charts below. For clarity, low-abundance taxa (less than 0.01% relative abundance) were removed before generating these charts.
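The low-abundance filter can be sketched as follows. This is an illustration under assumptions: the post does not say whether the 0.01% threshold was applied per sample or to the pooled dataset, so the hypothetical helper `filter_low_abundance` simply works on one dictionary of taxon counts, and the counts shown are invented.

```python
def filter_low_abundance(counts, min_percent=0.01):
    """Drop taxa whose relative abundance is below min_percent (in percent)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        taxon: count
        for taxon, count in counts.items()
        if 100.0 * count / total >= min_percent
    }

# Invented counts for illustration only.
example = {"Bacteroidetes": 60000, "Firmicutes": 39995, "RareTaxon": 5}
print(filter_low_abundance(example))  # RareTaxon (0.005%) is dropped
```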
At the phylum level, the relative abundances of the major taxa present in the dataset vary by at most 2.5% (Bacteroidetes) between the old and new primer constructs. Importantly, the ratios between taxa follow the same trend, indicating that the new constructs perform similarly to the old constructs. Four sample pairs (a total of 8 samples out of 86 samples tested) did not perform identically between the two constructs: one pair of mouse decomposition samples and one pair of human decomposition (body farm) samples. The taxonomic composition of these sample pairs varies greatly (see taxonomy plots below), with no obvious pattern. Importantly, these samples do not resemble typical samples of their type. Together, these observations suggest that the discrepancy is unlikely to be caused by the new primer constructs. We hypothesize that it could be due either to sample mislabeling or to depletion of some sample DNA as we tested multiple primer constructs; however, we cannot confirm or refute this hypothesis. Nevertheless, we feel confident concluding that the new primer constructs perform comparably to the old constructs, producing data from which comparable conclusions can be drawn.
The R² values for each taxonomic level for each sample type/study, with the “outlier” mouse decomp and body farm samples removed, are listed below, followed by the corresponding per-taxon scatter plots (genus level).
|AG skin||Agricultural Soils||Rice rhizome||Body farm 1||Body farm 2||Mouse decomp||Sloan built environment (house)|
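As a rough illustration of the comparison behind these R² values, the sketch below computes the squared Pearson correlation between per-taxon relative abundances obtained with the two constructs. The post does not state exactly how R² was calculated, so this is an assumption, and the function name `r_squared` and the abundance values are hypothetical.

```python
def r_squared(old_abundances, new_abundances):
    """Squared Pearson correlation between two matched lists of abundances."""
    if len(old_abundances) != len(new_abundances) or len(old_abundances) < 2:
        raise ValueError("need two matched lists with at least two taxa")
    n = len(old_abundances)
    mean_old = sum(old_abundances) / n
    mean_new = sum(new_abundances) / n
    cov = sum((a - mean_old) * (b - mean_new)
              for a, b in zip(old_abundances, new_abundances))
    var_old = sum((a - mean_old) ** 2 for a in old_abundances)
    var_new = sum((b - mean_new) ** 2 for b in new_abundances)
    if var_old == 0 or var_new == 0:
        raise ValueError("abundances must vary across taxa")
    return cov * cov / (var_old * var_new)

# Invented relative abundances for four taxa under each construct.
old = [0.40, 0.30, 0.20, 0.10]
new = [0.42, 0.28, 0.19, 0.11]
print(r_squared(old, new))
```

A value close to 1 means the two constructs rank and scale the taxa almost identically; it says nothing about taxa that were filtered out beforehand.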
Overall, the modified primers perform comparably to the old primers, especially when applied to commonly studied samples (soil, skin, feces, and even the built environment). We are confident that the vast majority of researchers who have been using the old constructs can transition to the new constructs without issue.
Additionally, we have tested (on the same studies mentioned above) and will send out ITS1-spanning constructs based on those of Smith and Peay (Smith DP & Peay KG (2014) Sequence Depth, Not PCR Replication, Improves Ecological Inference from Next Generation DNA Sequencing. PLoS ONE 9, e90234; for the prior constructs, see the ITS1F and ITS2 primers in Table 1 of: Op De Beeck M, Lievens B, Busschaert P, Declerck S, Vangronsveld J & Colpaert JV (2014)). The new constructs gave improved yields compared to the old constructs (similar results after initial demultiplexing and quality filtering, but approximately 2X the number of reads clustering against the May 2014 release of the UNITE database). This appeared to affect the overall taxon abundances obtained with each primer construct, as seen in the taxonomy plots below (low-abundance taxa, those below 0.01% relative abundance, were removed before generating the plots).
Importantly, even though the relative abundances varied greatly in some cases between the two primer constructs, the overall clustering patterns of specific sample types remained the same, as seen in the Procrustes plot (produced from the Bray-Curtis distance matrix) below.
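For reference, the Bray-Curtis dissimilarity between two samples is the sum of the absolute differences in taxon counts divided by the total count across both samples. A minimal sketch, with a hypothetical function name and invented counts:

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two matched lists of taxon counts."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same taxa in the same order")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total

# Invented counts for three taxa in two samples.
print(bray_curtis([10, 20, 30], [30, 20, 10]))
```

The value is 0 for identical samples and 1 for samples that share no taxa, which is why large shifts in individual abundances can still leave the overall clustering of sample types intact.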
The R² values for each taxonomic level for each sample type/study are listed below, followed by the corresponding per-taxon scatter plots (phylum and genus level).
*Note: rare taxa were filtered out of the dataset before R² values were calculated.
|Agricultural Soils||Rice rhizome||Body farm 1||Body farm 2||Mouse decomp||Sloan built environment (house)|
The R² values are higher in sample types expected to have a considerable amount of fungi present, such as agricultural soils, rice, and decomposition.
The performance of the new ITS1-spanning primer constructs compared to the old constructs varies depending on the sample type. Samples expected to harbor a considerable fungal community (soil, rice, and to some extent decomposition) perform more similarly between the two primer constructs. Importantly, the new constructs are more successful in that a larger number of the reads they produce align against the UNITE fungal database, suggesting that the new constructs will be a good tool for researchers to characterize the fungal communities present in their datasets. However, some researchers will need to take care when comparing datasets amplified using the different constructs, as their comparability depends on sample type and on the biological effect size of the phenomenon under investigation.
We are quite excited about the current work on these primers, which should greatly increase the PCR amplification options available to microbiome researchers working in a variety of environments, including the built environment.
I would like to thank Tony Walters, PhD, and Greg Humphreys for their help in generating the data for this blog post. Many thanks also to Jed Fuhrman, Amy Apprill, Janet Jansson, Jack Gilbert, and of course, Rob Knight, for input, comments, and hard work producing the data.