dbGaP
dbGaP (the NCBI database of Genotypes and Phenotypes) archives and distributes data from studies investigating the relationship between genotype and phenotype, including genome-wide association studies, medical sequencing, molecular diagnostics, and associations with non-clinical traits. The database provides study summaries, descriptions of measured variables, and supporting study documentation. Data are available through open and controlled access; controlled access permits approved users to obtain individual-level phenotype and genotype data.
Repository Scope
Research Areas
Life Sciences; Basic Biological and Medical Research; Medicine; Human Genetics; Biology
Data Types
Observational; Geospatial; Image; Genomic/Molecular; Biomedical
Data Types Encouraged/Permitted
Human genomic data: genotype array data, variant calls, sequencing-derived data, imputed and derived molecular measurements, and other -omics data.
Data Types Explicitly Prohibited
dbGaP does not accept non-human data, except as a supplement to human data. BAM, CRAM, or FASTQ files of molecular data should be submitted through the Sequence Read Archive (SRA). Raw proteomics files may need a specialized repository such as PRIDE.
Fee for JHU Researchers to Deposit
No
Data Limit
No stated limits. Contact dbGaP for very large deposits.
Data Access
Option for Data Access
Controlled Access
Open Access
Details on Data Access
Controlled access is managed by NCBI and requires requestor IRB approval or equivalent. Open access covers certain summary and aggregate data, metadata, and documentation. dbGaP is divided into public and authorized-access sections for aggregate and individual-level data, respectively.
Sensitive Data
Human Data Accepted
Yes
Level of Deidentification Required
Whole human genome (WGS) direct identifiers may require JHU institutional certification and use limitations unless commercial/sponsor use has been consented to. Safe Harbor de-identification is required for all other data.
Human Participant Data Sharing Policy
Administration
Required Funder
None listed
Persistent Identifier
Accession number
Data Retention Period
Indefinitely
AI LLM Policy
AI/LLM modeling of restricted data is permitted only on secure platforms approved by NCBI.
re3data
re3data Keywords
cells - morphology; genetic recombination; genotype-environment interaction; homology (Biology); molecular diagnosis; morphology; phenotype
re3data Repository Contact
https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=email&from=login