Well, I have been having many discussions recently about PCR amplification from “negative” controls to which no sample DNA was added. Such amplification is, alas, pretty common, usually due to contamination in some other material added to the PCR reaction. Obviously it would be best to eliminate all DNA contamination from all reagents and all PCRs. But when that does not happen, it is possible to try to detect contamination after the fact. Below I post some papers related to post-sequencing detection of contamination:
- Common Contaminants in Next-Generation Sequencing That Hinder Discovery of Low-Abundance Microbes
- Abundant Human DNA Contamination Identified in Non-Primate Genome Databases
- Fast identification and removal of sequence contamination from genomic and metagenomic datasets
- Mycoplasma contamination in the 1000 Genomes Project
- ContEst: estimating cross-contamination of human samples …
- DeconSeq @ SourceForge.net
- AlienTrimmer: A tool to quickly and accurately trim off …
- Blobology: exploring raw genome data for contaminants, symbionts, and parasites using taxon-annotated GC-coverage plots
Any other suggestions or comments would be welcome.
UPDATE 10:30 AM 7/25 –
Was reminded on Twitter of a new, critically relevant publication on this issue: Reagent and laboratory contamination can critically impact sequence-based microbiome analyses
@pathogenomenick @gregcaporaso crap – can’t believe I left that off
– Jonathan Eisen (@phylogenomics) July 25, 2014
10 thoughts on “Who are the contaminants in your sequencing project?”
SourceTracker is also really useful for this. I use it all the time to, e.g., determine if samples seem to have human skin contamination. We have a QIIME tutorial covering this (though it’s covering an older version of SourceTracker now).
Do you have a database one can include as part of a QIIME workflow that would include sequences from known reagent contaminants? Then you could run Sourcetracker and see if samples seem to have reagent contamination.
We are currently discussing publishing a standard database that could be used for this, but in the meantime I typically use the data from my PNAS 2011 paper, which contains human gut, human skin, human mouth, plus soil and other environmental samples. These give you a good range of potential contaminating environments, and were sequenced with the widely used 515F/806R primers on Illumina.
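To make the idea concrete: source tracking asks what fraction of a sample's reads plausibly derive from each candidate source environment (skin, gut, reagents, etc.). Below is only a naive toy sketch of that idea in Python, not SourceTracker's actual Bayesian/Gibbs method; the taxon names, read counts, and source profiles are all made up for illustration.

```python
# Toy illustration of the idea behind source tracking (NOT SourceTracker's
# algorithm): for each candidate source, compute the fraction of a sample's
# reads that come from taxa observed in that source environment.
# All names and counts here are hypothetical.

def naive_source_fractions(sample, sources):
    """sample: {taxon: read_count}; sources: {source_name: set of taxa}.
    Returns, per source, the fraction of the sample's reads belonging to
    taxa seen in that source. Taxa shared between sources are counted for
    each (a real method apportions them probabilistically instead)."""
    total = sum(sample.values())
    return {
        name: sum(reads for taxon, reads in sample.items() if taxon in taxa) / total
        for name, taxa in sources.items()
    }

sample = {"Propionibacterium": 300, "Staphylococcus": 200, "Bacteroides": 500}
sources = {
    "human_skin": {"Propionibacterium", "Staphylococcus"},
    "human_gut": {"Bacteroides"},
}
print(naive_source_fractions(sample, sources))
# {'human_skin': 0.5, 'human_gut': 0.5}
```

A sample with a large fraction attributable to, say, a skin profile (when skin taxa are not expected) would be flagged for possible contamination; the real SourceTracker does this with a Bayesian mixing model rather than simple overlap fractions.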
Greg – Do you exclude sequences that “look like” contamination, or re-process the sample? If you exclude the sequences, how can you be sure they are contamination, and isn’t this (somewhat) assuming the answer to your sequencing effort?
I am interested in how other groups handle this. We currently run negative PCRs for every reaction and a negative for every DNA extraction batch. If we could do this well post-sequencing, it would save a lot of effort.
I’m also a bit confused as to how you can use a database to screen out contamination. If you’re looking in a new environment, particularly a human-associated one… how do you decide which taxa don’t actually belong in those samples? It seems like you have to have actual wet-lab controls as Kyle described… though it’s not clear what the best way to deal with those is either.
Please comment on the Neanderthal and Denisovan genome projects in light of this issue.
Not sure what you are asking here …
I am in need of advice on how to “correct for contamination”. We are currently including non-template controls during our extraction process as well as our library prep process. My question now is, how do you correct for the “contamination signal”:
1. Do you remove the total number of reads present in non-template controls for specific taxa from all your samples in the run? Or do you calculate an average number of reads sequenced for the extraction non-template controls and for the library-prep non-template controls, and remove that number of reads for the respective taxa from all samples in the run?
2. Would you correct for contamination at the read or OTU level?
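The second option in question 1 (subtracting the mean per-taxon read count observed across non-template controls from every sample) can be sketched as a simple operation on a taxon-by-sample count table. This is only an illustrative Python sketch under that assumption, not an endorsed correction method; the sample names, taxa, and counts are hypothetical.

```python
# Hypothetical sketch: subtract the mean read count observed for each taxon
# across all non-template controls from every sample in the run, flooring
# at zero. All names and counts are made up for illustration.

def subtract_control_counts(samples, controls):
    """samples/controls: {name: {taxon: read_count}}.
    Returns corrected sample counts (mean control count per taxon
    subtracted, never going below zero)."""
    taxa = {t for counts in controls.values() for t in counts}
    n = len(controls)
    # Mean reads per taxon across the non-template controls
    mean_control = {t: sum(c.get(t, 0) for c in controls.values()) / n for t in taxa}

    corrected = {}
    for name, counts in samples.items():
        corrected[name] = {
            taxon: max(0.0, reads - mean_control.get(taxon, 0))
            for taxon, reads in counts.items()
        }
    return corrected

samples = {
    "sampleA": {"Ralstonia": 120, "Bacteroides": 5000},
    "sampleB": {"Ralstonia": 80, "Bacteroides": 3000},
}
controls = {
    "ntc_extraction": {"Ralstonia": 100},
    "ntc_libprep": {"Ralstonia": 60},
}

corrected = subtract_control_counts(samples, controls)
# Ralstonia mean across controls = 80, so sampleA keeps 40 reads, sampleB keeps 0
```

Note that flat subtraction ignores differences in sequencing depth between samples and controls, which is one reason people debate read-level versus OTU-level (presence/absence) correction.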
I’d be very hesitant to rely strictly on any ‘bioinformatic’ solution to removing contaminants. If the data are not trustworthy, it would make me nervous to remove contaminants using SourceTracker (or something equivalent). The key to avoiding problems with contamination is to do good lab work on the front end – otherwise it is garbage in, garbage out.
For example – this makes me very nervous: http://americangut.org/?page_id=277 as the assumption is that there is only a handful of bacteria associated with ‘blooms’ in samples stored improperly and the abundances of other taxa will not be unduly affected.
This paper from 2002 used culture-dependent methods to profile ultrapure water systems. Most of the contaminants they identified were also found in “Reagent and laboratory contamination can critically impact sequence-based microbiome analyses”.