Proteomics experiments designed to identify protein markers specific to a particular disease or disease state rely on repeatable identification of proteins within a control set. Repeatability provides statistical assurance that clear differences between two sets (e.g. sick vs. healthy) reflect true biological differences. The degree to which technical variability thwarts this process is an underappreciated problem in the field. Monitoring key steps in these experimental workflows with quantitative metrics gives researchers the tools they need to evaluate variability and to determine the degree to which observed differences are due to biological changes. Use of such metrics should improve confidence in proteomics results and speed the search for new disease markers.
Mass spectrometry is a powerful technology that has been successfully adapted for studying the protein component of tissues and body fluids (i.e. proteomics). Technical variability is high in this emerging field because of the complexity of the experimental workflows. Understanding and measuring this variability is essential for firmly establishing proteomics as a competitive technology for tomorrow's health care in the United States.
The goals of this project are to identify and measure experimental variability in proteomics experiments as it affects:
- liquid chromatography
- mass spectrometric sampling of ions
- the identification of peptides by interpretation of tandem mass spectra
By systematically monitoring changes across these key areas, outlier data can be readily identified, problems diagnosed, and corrective measures taken.
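As an illustration of how outlier data might be identified from a monitored metric, the following minimal sketch flags runs whose value lies far from the median of a series of replicate runs. It uses a robust score based on the median absolute deviation (MAD), which is a common general-purpose choice and is not taken from the project's own software. The function name, the threshold of 3.5, and the example values (chromatographic peak widths in seconds) are assumptions made for illustration only.

```python
from statistics import median


def flag_outlier_runs(values, threshold=3.5):
    """Return the indices of runs whose metric value is a robust outlier.

    Hypothetical helper: `values` holds one metric value per run, in run order.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the runs equal the median; flag any run that differs.
        return [i for i, v in enumerate(values) if v != med]
    # The factor 0.6745 makes the score comparable to a z-score
    # when the underlying data are approximately normal.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Illustrative (made-up) peak widths in seconds for six replicate runs.
peak_widths = [12.1, 11.8, 12.4, 12.0, 19.5, 12.2]
print(flag_outlier_runs(peak_widths))  # prints [4]
```

A median-based score is used here rather than the mean and standard deviation because a single bad run would otherwise inflate the spread it is being compared against.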
Research Activities and Technical Approach
The NCI's Clinical Proteomic Technologies for Cancer (CPTC) initiative has identified technical variability as a major roadblock in the search for protein markers of cancer and in the development of the next generation of cancer detection and treatment technologies. Through an interagency agreement with the NCI, NIST researchers have developed a panel of 50+ metrics for evaluating reproducibility. The metrics are computed from raw mass spectrometry data and from peptide identifications obtained by spectrum library searching.
These metrics have been used successfully in several interlaboratory studies within NCI's Clinical Proteomic Technology Assessment for Cancer (CPTAC) program to study reproducibility within and between expert laboratories. In these studies, two complex reference samples (a mix of 20 human proteins, and yeast) were analyzed at several sites affiliated with each of the 5 CPTAC awardees. The data files were transmitted to NIST and analyzed by NIST researchers to evaluate reproducibility in each of the areas listed above. Results from these analyses, as well as from 'in-house' studies of variability, have been used to document reproducibility and to provide specific feedback for reducing variability and optimizing performance.
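One simple way to express reproducibility within and between laboratories for a single metric is the coefficient of variation (CV). The sketch below is an assumption-laden illustration and does not describe how the CPTAC studies were actually analyzed: the function names, the lab labels, and the numbers (peptide identifications per run) are all hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV: sample standard deviation relative to the mean."""
    return 100.0 * stdev(values) / mean(values)


def within_and_between_lab_cv(runs_by_lab):
    """Summarize one metric measured in replicate runs at several labs.

    `runs_by_lab` maps a lab label to its replicate values (hypothetical layout).
    Returns (per-lab CV dict, CV of the per-lab means).
    """
    within = {
        lab: coefficient_of_variation(vals)
        for lab, vals in runs_by_lab.items()
    }
    lab_means = [mean(vals) for vals in runs_by_lab.values()]
    between = coefficient_of_variation(lab_means)
    return within, between


# Illustrative (made-up) numbers of peptide identifications per run.
example = {
    "lab_A": [1510, 1480, 1530],
    "lab_B": [1210, 1190, 1250],
    "lab_C": [1620, 1590, 1640],
}
within_cv, between_cv = within_and_between_lab_cv(example)
for lab, cv in within_cv.items():
    print(f"{lab}: within-lab CV = {cv:.1f}%")
print(f"between-lab CV = {between_cv:.1f}%")
```

Comparing the two figures gives a rough sense of whether run-to-run variation at a single site or site-to-site differences dominate for that metric.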
These metrics have been implemented in a software pipeline available for download. NIST researchers are working with several universities and industrial partners to further develop these as community tools.