Grid and cloud computing for high throughput assembly and annotation of (meta)genome sequences
04 / 2012 - 04 / 2013
Metagenomics is the study of consortia of micro-organisms that co-exist in many environments, e.g., sea, soil, gut, and skin. In these studies, next-generation sequencing technology is used to generate millions of sequence reads from these consortia. The quality control, handling, and assembly of reads into contigs, and the functional annotation of either contigs or reads, require significant computing resources for such large datasets. In this proposal, we aim to create Grid and Cloud computing solutions for metagenomic datasets within the NBiC E-BioScience taskforce. These solutions will be used to analyze datasets generated in public-private partnerships such as TI Food and Nutrition, the Kluyver Center for Genomics of Industrial Fermentation, and NIZO food research, as well as by academic groups. The E-BioScience metagenomics project will collaborate closely with the Metagenomics taskforce that is currently being set up at NBiC, SARA, BigGrid, and the NBiC next-generation-sequencing platform. The project objective is to publicize use cases of Grid and Cloud computing applied to metagenomic datasets.