Genetic & genomic

dbGaP

Database of Genotypes and Phenotypes
"Controlled-access genotype-phenotype data for cohort and disease studies."
controlled-access, GWAS, cohorts

About the resource

dbGaP is NCBI's central repository for the results of studies that examine the relationship between genotype and phenotype: GWAS, large-scale sequencing studies (such as NHLBI TOPMed and partial submissions from the All of Us Research Program), and disease-cohort biobanks. It provides two tiers: summary-level data (open) and individual-level data (controlled access).

Individual-level access requires a Data Use Certification reviewed by the relevant Data Access Committee, and is governed by consent restrictions encoded in DUO (Data Use Ontology) terms. dbGaP is the primary U.S. infrastructure through which sequencing and genotyping data from disease-cohort studies become reusable under appropriate ethical oversight.

What you'd use it for

  1. Apply for access to individual-level GWAS or sequencing data for a disease cohort
  2. Identify summary-level statistics across many disease-association studies
  3. Plan secondary analysis of a published consortium dataset
  4. Cross-reference candidate gene findings against existing cohort data

How you access it

Web UI, Authorized-access downloads, dbGaP API, Cloud-hosted via AnVIL/STRIDES
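
For the open, summary-level tier, one programmatic route is NCBI's E-utilities, which expose dbGaP as the Entrez database `gap`. The sketch below only builds an ESearch URL and makes no network call. It assumes E-utilities is the API of interest (this entry does not say which API it means); the function name `build_gap_search_url` and the search term are illustrative, not part of any NCBI library.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gap_search_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that queries the Entrez 'gap' (dbGaP) database.

    Hypothetical helper for illustration: it only assembles the query string.
    """
    params = {
        "db": "gap",        # Entrez database name for dbGaP
        "term": term,       # free-text query, e.g. a disease name
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_gap_search_url("asthma")
print(url)
```

Fetching that URL returns matching record IDs only. Individual-level files still require an approved access request and are retrieved through the authorized-access route.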

Closely related resources