dbGaP

dbGaP (the NCBI database of Genotypes and Phenotypes) archives and distributes data from studies investigating the relationship between genotype and phenotype, including genome-wide association studies, medical sequencing, molecular diagnostics, and associations with non-clinical traits. The database provides study summaries, descriptions of measured variables, and supporting study documentation. Data are available through open and controlled access; controlled access permits approved users to obtain individual-level phenotype and genotype data.

Repository Website


Repository Scope

Data Collection Policy

Research Areas

Life Sciences; Basic Biological and Medical Research; Medicine; Human Genetics; Biology

Data Types

Observational; Geospatial; Image; Genomic/Molecular; Biomedical

Data Types Encouraged/Permitted

Human genomic data: genotype arrays, variant calls, sequencing-derived, imputed, and derived molecular measurements, and other -omics data.

Data Types Explicitly Prohibited

dbGaP does not accept non-human data, except as a supplement to human data. BAM, CRAM, and FASTQ files submitted as molecular data should go through the Sequence Read Archive (SRA). Raw proteomics files may need another specialized repository, such as PRIDE.

Fee for JHU Researchers to Deposit

No

Data Limit

No stated limits. Contact dbGaP for very large deposits.


Data Access

Data Access Policy

Option for Data Access

Controlled Access

Open Access

Details on Data Access

Controlled access is managed by NCBI and requires the requestor's IRB approval or equivalent. Open access covers certain summary and aggregated data, metadata, and documentation. dbGaP is divided into public and authorized-access sections for aggregate and individual-level data, respectively.


Sensitive Data

Human Data Accepted

Yes

Level of Deidentification Required

Whole human genome (WGS) data and direct identifiers may require JHU institutional certification and data use limitations unless commercial/sponsor use is consented. Safe Harbor de-identification is required for all other data.

Human Participant Data Sharing Policy

Sensitive Data Policy


Administration

Submission Policy

Required Funder

None listed

Persistent Identifier

Accession number

Data Retention Period

Indefinitely

AI LLM Policy

AI/LLM modeling of restricted data is permitted only on secure platforms approved by NCBI.


re3data

re3data Keywords

cells - morphology; genetic recombination; genotype-environment interaction; homology (Biology); molecular diagnosis; morphology; phenotype

re3data Repository Contact

https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?page=email&from=login

re3data Record