Explainable Artificial Intelligence Based Modeling Applied to OMICS Problems

Summary

New biological "omics" measurements, particularly sequencing-based methods, can produce 10³ to 10⁹ values from biological systems and are transforming the biosciences and biotechnology. However, large investments in such sequencing technologies by bioindustry also require advancing measurement confidence in standard reference materials (SRMs), such as Genome in a Bottle (GIAB) and mass spectrometry data catalogs. These SRMs provide reference values but not certified values, because the underlying biases are insufficiently understood. There is a need to design a metrological framework that accelerates the determination of ~10⁹ certified values and probabilistic uncertainties by integrating measurement methods, specifically cutting-edge artificial intelligence (AI) based modeling methods.

Description

The goal of the project is to produce billions of certified values, with well-characterized uncertainty, from multiple measurement methods in the "omics" fields (genomics, proteomics, transcriptomics). These values are prepared jointly by human experts and trained AI models that match gene sequences to a reference. For example, the genomics convolutional neural network (CNN) DeepVariant (based on the Inception AI model architecture) has been applied to the problem of sequence classification and matching, but without quantifying model prediction uncertainty.
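One way to attach a predictive uncertainty to a classifier of this kind is to collect repeated stochastic forward passes (e.g., Monte Carlo dropout or an ensemble) and summarize the spread of the resulting genotype probabilities. The sketch below is purely illustrative: `toy_variant_scores` is a hypothetical stand-in for a trained variant-calling network, not DeepVariant's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_variant_scores(n_samples=200):
    """Hypothetical stand-in for repeated stochastic forward passes of a
    variant-calling network (mimicking Monte Carlo dropout or an ensemble).
    Returns n_samples rows of genotype probabilities:
    P(homozygous-reference), P(heterozygous), P(homozygous-alternate)."""
    base = np.array([0.1, 0.7, 0.2])           # nominal genotype probabilities
    noise = rng.normal(scale=0.05, size=(n_samples, 3))
    logits = np.log(base) + noise              # jitter in log-probability space
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)  # renormalize each row

probs = toy_variant_scores()
mean = probs.mean(axis=0)   # point estimate of the genotype call
std = probs.std(axis=0)     # per-class predictive uncertainty
```

The mean gives the genotype call while the standard deviation across passes provides the probabilistic uncertainty that a certified reference value would need to report.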

Our goal in this project is to develop approaches that make the results of deep-learning based variant callers interpretable and enable an AI model and a human expert, working in tandem, to produce trusted reference values. Our approach to explainable AI is based on simulations at small scales (see the Related Publications), interactive visualization for interpreting sequencing data and genomic variation, and the design of multiple AI model metrics using perturbations.
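A perturbation-based metric of the kind mentioned above can be sketched as follows: substitute each base of an input sequence in turn and record how much the model's score drops. The `toy_model` below is a hypothetical scoring function (a fixed reference-motif match) standing in for a trained classifier; it is an assumption for illustration, not the project's actual model.

```python
import numpy as np

def toy_model(seq):
    """Hypothetical scoring function standing in for a trained classifier:
    the fraction of positions matching a fixed reference motif."""
    ref = "ACGTACGTAC"
    return sum(a == b for a, b in zip(seq, ref)) / len(ref)

def perturbation_importance(seq, model):
    """Per-position importance: average drop in model score when the base
    at that position is substituted with each alternative base."""
    base_score = model(seq)
    importance = []
    for i, b in enumerate(seq):
        drops = [base_score - model(seq[:i] + alt + seq[i + 1:])
                 for alt in "ACGT" if alt != b]
        importance.append(np.mean(drops))
    return np.array(importance)

seq = "ACGTACGTAC"
imp = perturbation_importance(seq, toy_model)  # → 0.1 at every position
```

Positions whose substitution causes a large score drop are the ones the model relies on, which gives the human expert a concrete, per-base explanation to review against the reference.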

Created May 27, 2021, Updated June 11, 2021