Part of the Genome in a Bottle (GIAB) Consortium hosted by NIST, dedicated to the authoritative characterization of benchmark cancer genomes. Complements NIST's current DNA copy number Reference Materials for HER2, as well as EGFR and MET. Sign up for the General GIAB and Analysis Team email lists.
Interested in job opportunities or collaborations with us? Contact justin.zook [at] nist.gov (Justin Zook).
This project is an extension of the Genome in a Bottle Consortium to develop the technical infrastructure (reference standards, reference methods, and reference data) to enable translation of cancer genome sequencing to clinical practice and innovations in technologies. The priority of GIAB is authoritative characterization of human genomes for use in benchmarking, including analytical validation and technology development, optimization, and demonstration.
NIST has been collaborating with Andrew Liss at MGH to develop new tumor cell lines with paired normal samples that are explicitly consented for fully public dissemination of genomic data and cell lines. The first tumor cell line (HG008-T) is from a pancreatic ductal adenocarcinoma, for which we have paired normal pancreatic (HG008-N-P) and duodenal (HG008-N-D) tissue for sequencing, but no normal cell line. We are currently collecting the extensive genomic data described below, and are working towards making these cell lines available in public repositories. We plan to have another pancreatic tumor cell line with a paired normal cell line in the near future, but these are still under development. We also welcome additional collaborations for tumor and normal cell line pairs that are explicitly consented for fully public dissemination of genomic data and cell lines.
We will be working with the GIAB community to develop benchmark variants for the tumor and normal samples, using assembly-based and mapping-based approaches. We welcome collaborations in this new project.
In Fall 2023, we began collecting diverse long- and short-read paired tumor and normal sequencing data for GIAB HG008 samples. We are making the data public, without embargo, as we collect them. The data being collected are described in Table 1 (long range) and Table 2 (short range). We welcome collaborations to analyze these data.
These contributed data can be accessed through the public GIAB FTP as they become available. We also provide the Cancer GIAB Data Manifest, which allows exploration of the tumor and normal data currently available on the FTP. To explore the manifest, you can create a filter view by 1) selecting the entire spreadsheet and 2) choosing Data → filter views → create new filter view. Please note that tumor data collected in year 1 (2022) are from a prior passage of tumor cells and are emphasized in RED. All tumor data currently being collected are from a large batch of tumor cells known as 0823p23. Please take these passages into consideration when choosing tumor datasets to explore.
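
The same passage-based filtering could also be done programmatically on a CSV export of the manifest. The following is only a minimal sketch in Python: the column names (`sample`, `passage`, `technology`, `filename`), the passage label `2022`, and the example rows are hypothetical placeholders, so the headers and values would need to be adapted to the actual manifest.

```python
import csv
import io

# Hypothetical CSV export of the manifest. Column names, passage labels,
# and file names are illustrative assumptions, not real manifest contents.
EXAMPLE_MANIFEST = """sample,passage,technology,filename
HG008-T,0823p23,PacBio HiFi (Revio),example_hifi.bam
HG008-T,2022,Illumina WGS,example_ilmn.bam
HG008-N-P,NA,PacBio HiFi (Revio),example_normal.bam
"""

def select_rows(manifest_text, sample, passage):
    """Return manifest rows that match both the sample and the passage."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    return [row for row in reader
            if row["sample"] == sample and row["passage"] == passage]

# Keep only tumor datasets from the large 0823p23 batch.
current_batch = select_rows(EXAMPLE_MANIFEST, "HG008-T", "0823p23")
for row in current_batch:
    print(row["technology"], row["filename"])
```

Filtering on sample and passage together keeps data from the earlier 2022 passage from being mixed with data from the 0823p23 batch.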

Table 1. Long range data

Estimated Read Lengths | Technology | HG008 tumor cell line (T) 2022 passages | HG008 tumor cell line (T) large batch 0823p23 | HG008 normal duodenal tissue (N-D) | HG008 normal pancreatic tissue (N-P) |
---|---|---|---|---|---|
~100 - 300 kb | Oxford Nanopore Technologies (UL) | NA | ~54X, N50 127kb | pending | pending |
~10 - 100 kb | Oxford Nanopore Technologies (duplex) | NA | pending | pending | pending |
~35 kb | Oxford Nanopore Technologies (std) | NA | ~63X, N50 35kb | pending | pending |
~10 - 20 kb | PacBio HiFi (Revio) | NA | ~116X, N50 18kb | pending | ~35X, N50 17kb |
150 kb - multi Mb | Bionano Optical Mapping | NA | available | NA | NA |
2x150 bp | Arima and Phase Genomics HiC-Illumina | Phase Genomics available | Arima in QC | Arima in QC | NA |
chromosomal | Karyologic karyotyping | available | NA | NA | NA |

Table 2. Short range data

Estimated Read Lengths | Technology | HG008 tumor cell line (T) 2022 passages | HG008 tumor cell line (T) large batch 0823p23 | HG008 normal duodenal tissue (N-D) | HG008 normal pancreatic tissue (N-P) |
---|---|---|---|---|---|
2x150 bp | Illumina WGS | (1) ~191X, 2x150bp (2) NA | (1) in QC (2) ~100X, 2x150bp (in final QC) | (1) in QC (2) ~100X, 2x150bp (in final QC) | (1) ~150X, 2x150bp (2) NA |
150 bp | Element AVITI - short insert (~350 bp) | NA | ~87X, 2x150bp | 61X, 2x150bp | NA |
150 bp | Element AVITI - long insert (1000+ bp) | NA | pending | pending | NA |
100 - 200 bp | PacBio Onso | NA | pending | pending | NA |
~300 bp | Ultima UG100 | NA | in QC | in QC | NA |
50 bp | BioSkryb single-cell WGS - Illumina | NA | <<1X, 120 cells, 2x50bp | NA | NA |
~300 bp | BioSkryb single-cell WGS - Ultima | NA | in QC | NA | NA |
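
Coverage entries in Tables 1 and 2 follow a loose pattern such as `~116X, N50 18kb`. As an illustration only, the Python sketch below pulls the approximate depth and read N50 out of such cells. The helper name `parse_cell` and its regular expressions are assumptions made for this example, not part of any GIAB tooling.

```python
import re

def parse_cell(cell):
    """Extract approximate coverage (X) and read N50 (kb) from a table cell.

    Returns (coverage, n50_kb). Either value is None when absent,
    for example in cells such as 'pending', 'in QC', or 'NA'.
    """
    # Upper-case X only, so read layouts like '2x150bp' are not matched.
    cov = re.search(r"(\d+(?:\.\d+)?)\s*X\b", cell)
    n50 = re.search(r"N50\s*(\d+(?:\.\d+)?)\s*kb", cell)
    return (float(cov.group(1)) if cov else None,
            float(n50.group(1)) if n50 else None)

# Long-read cells for the 0823p23 tumor batch, taken from Table 1.
tumor_long_read = ["~54X, N50 127kb", "pending", "~63X, N50 35kb", "~116X, N50 18kb"]
parsed = [parse_cell(c) for c in tumor_long_read]

# Combined approximate long-read depth across the datasets available so far.
total = sum(cov for cov, _ in parsed if cov is not None)
print(parsed, total)
```

Cells without a numeric depth return `None` and are skipped in the sum, so the total reflects only datasets with a reported depth.
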
NIST-NRC Postdoctoral Fellowship: a 2-year fellowship at NIST, open to U.S. citizens only, with a ~$75,000 salary plus benefits and relocation expenses included. Application deadlines are Feb. 1 and Aug. 1, and the application requires a 10-page research proposal. Contact Justin Zook if you are interested in writing a proposal on a genomics research project. We have opportunities posted for metrology in Cancer Genomics, Diploid Assembly, Epigenomics and Transcriptomics, Biological Data Science/Machine Learning, and Precision Medicine.