Cancer databases
TCGA
The Cancer Genome Atlas
"The landmark multi-omics atlas of 20k+ tumors across 33 cancer types."
About the resource
The Cancer Genome Atlas, a joint NCI/NHGRI program running 2006–2018, characterised more than 20,000 primary tumors and matched normals across 33 cancer types using whole-exome and whole-genome sequencing, RNA-seq, miRNA-seq, methylation, copy-number, reverse-phase protein arrays and clinical follow-up.
Its data sit in the Genomic Data Commons (GDC) as a harmonised, GRCh38-aligned reference, served through the GDC Data Portal and mirrored in dozens of downstream portals and cloud platforms (cBioPortal, the UCSC Xena browser, FireBrowse, ISB-CGC and Terra). TCGA remains among the most-analysed cancer datasets in the world: a generation of disease-subtype, biomarker and pan-cancer findings rests on it.
What you'd use it for
- 01 Run a pan-cancer or single-cancer-type multi-omics analysis
- 02 Pull harmonised TCGA data into a cloud workspace
- 03 Use as the reference cohort for biomarker discovery
- 04 Cross-reference candidate driver alterations against TCGA frequencies
How you access it
- GDC Data Portal
- GDC API
- Cloud workspaces (Terra/ISB-CGC)
- Mirrored derived data in cBioPortal/Xena
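As a rough illustration of the GDC API route, the sketch below builds a query against the public `files` endpoint for one TCGA project and one data type. The helper names (`build_file_query`, `fetch_files`), the example project `TCGA-BRCA` and the chosen fields are illustrative assumptions, not part of this page; check the GDC API documentation for the current field names and accepted values before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Public GDC endpoint for file metadata searches.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_file_query(project_id, data_type, size=5):
    """Build query parameters for the GDC files endpoint (hypothetical helper).

    The filter keeps files whose case belongs to `project_id` and whose
    `data_type` matches the requested value.
    """
    filters = {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.project_id",
                    "value": [project_id],
                },
            },
            {
                "op": "in",
                "content": {"field": "data_type", "value": [data_type]},
            },
        ],
    }
    return {
        "filters": json.dumps(filters),  # the API expects the filter as a JSON string
        "fields": "file_id,file_name,data_type",
        "format": "JSON",
        "size": str(size),
    }


def fetch_files(params, timeout=30):
    """Send the query and return the list of matching file records.

    Requires network access; open-access metadata needs no token.
    """
    url = GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["data"]["hits"]


# Example usage (uncomment to run against the live API):
# params = build_file_query("TCGA-BRCA", "Gene Expression Quantification")
# for hit in fetch_files(params):
#     print(hit["file_id"], hit["file_name"])
```

Keeping the query construction separate from the network call makes the filter easy to inspect or reuse. Controlled-access files would additionally need an authentication token, which this sketch does not cover.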