Well, this is very interesting: Request for Information (RFI): Ecology Meets Genomics, posted on Federal Business Opportunities. IARPA, the intelligence agencies' equivalent of DARPA, is getting into microbial biogeography (for forensic purposes). They are seeking comments on the general topic. From the call:
We hypothesize that by correlating the genetic information within a metagenomic sample with information about species co-occurrence and range it may be possible to ascertain where, and perhaps when, a sample was collected. We are aware of a number of projects and databases across a spectrum of disciplines (genomics, biology, ecology, to name a few) that could be relevant to our objective. The purpose of this RFI is to learn more about the content and accuracy of the various relevant databases and their metadata, and to explore whether and how the community believes these databases could be used individually or together to determine the geospecificity of a metagenomic sample to a given degree of accuracy.
They then say “Responses to this RFI may address any or all of the following questions” and list five major areas:
1) Relying just on existing ecologic, biologic or other databases, public or private, is it possible to determine geospecificity of metagenomic sequences?
2) Which, if any, existing databases provide species-level identity with geographic and temporal metadata?
3) What ecological models currently exist that attempt to predict species or genome range on earth’s land surfaces? How accurate are these models and what are their limitations?
4) What techniques or practices could dramatically improve the quality of existing datasets so that they would be more useful to studies of geospecificity? Which database elements require additional standardization or data? What standardization efforts are underway?
5) Citizen scientist projects worldwide, e.g., Wild Lives of Our Homes (http://homes.yourwildlife.org/), Earth Microbiome Project (http://www.earthmicrobiome.org/), Microbiology of the Built Environment (https://microbe.net/), and the American Gut Project (http://americangut.org/), are anticipated to expand. How is quality assured for these databases? Is quality sufficient for scientific purposes? Are there novel bioinformatics techniques that could compensate for some amount of inaccuracy in the data?
Seems like some very interesting things could be proposed here. And it also seems like a good opportunity to integrate “natural” microbial ecology with “built environment” microbial ecology.
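To make the hypothesis in the call a bit more concrete, here is a toy sketch of what "correlating" a sample with species range information could look like in its simplest form: score each candidate region by how much its known taxa overlap with the taxa found in the sample. This is purely my own illustration, not anything from the RFI. The region names, taxon labels, and the choice of Jaccard similarity are all assumptions, and a real approach would need abundances, phylogeny, sampling bias corrections, and much more.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of taxa (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_regions(sample_taxa, reference_ranges):
    """Rank candidate regions by taxon overlap with the sample, best match first."""
    scores = {
        region: jaccard(sample_taxa, taxa)
        for region, taxa in reference_ranges.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical reference table: region -> taxa reported there (made-up labels).
reference_ranges = {
    "region_A": {"taxon_1", "taxon_2", "taxon_3"},
    "region_B": {"taxon_3", "taxon_4"},
    "region_C": {"taxon_5"},
}

# Hypothetical taxa detected in a metagenomic sample.
sample_taxa = {"taxon_1", "taxon_2", "taxon_4"}

ranking = rank_regions(sample_taxa, reference_ranges)
for region, score in ranking:
    print(f"{region}: {score:.2f}")
```

With this made-up data, region_A comes out on top because it shares two of the sample's three taxa. Presence/absence overlap is about the crudest signal one could use, which is partly why the questions in the call about database quality and metadata matter so much.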