We are pleased to announce, in collaboration with 12 other leading life science organizations, the Allele Frequency Community (#AlleleFreq), a landmark initiative formed to address a key challenge in interpreting sequencing data for research and clinical applications: the lack of an extensive, high-quality, ethnically diverse collection of human genomes as a reference set. Each lab that participates in #AlleleFreq has agreed to pool its human exome- and genome-wide variant call datasets in a secure, anonymized fashion. As of February 27, 2015, the #AlleleFreq database already holds more than 82,000 variant call datasets (including 13,983 whole genomes), a number that is expected to grow significantly as more labs upload their data. The founding collaborators of the Allele Frequency Community are:
- David Goldstein, Columbia University Institute for Genomic Medicine
- Madhuri Hegde, Emory Genetics Laboratory
- Peter van der Spek, Erasmus University Medical Center
- Eric Schadt, Icahn Institute for Genomics and Multiscale Biology at Mount Sinai
- Gustavo Glusman, The Institute for Systems Biology
- Greg Eley & Joe Vockley, Inova Translational Medicine Institute
- Tom Kaminski & Stan Letovsky, Enlighten Health Genomics, a business of Laboratory Corporation of America® Holdings (LabCorp®)
- Nathan Pearson, New York Genome Center
- Heidi Rehm, Partners Healthcare Personalized Medicine
- Doug Bassett, QIAGEN Bioinformatics
- Phil Hieter, University of British Columbia
- Jay Shendure, University of Washington
- Chris Mason, Weill Cornell Medical College
Researchers have already begun using the #AlleleFreq datasets, which in internal benchmarking studies produced a 43 percent average reduction in false positive rates in causal variant identification. As the #AlleleFreq database grows, this reduction is expected to increase correspondingly. The #AlleleFreq data is stored on QIAGEN’s secure, HIPAA- and Safe Harbor-compliant private IT infrastructure and is available for free to registered community members. Researchers can initially explore the data using QIAGEN’s Ingenuity® Variant Analysis™. Access through other analysis and data interpretation tools is planned for the future, including QIAGEN’s Ingenuity Clinical decision-support solution as well as CLC Cancer Research Workbench, CLC Genomics Workbench and other bioinformatics solutions. To learn more about the Allele Frequency Community and to register for access, please visit www.allelefrequencycommunity.org.