Modern mass spectrometers used in the fields of proteomics and glycomics can profile hundreds or even thousands of molecules in a single experiment. Each of these compounds is isolated and fragmented to form a mass spectrum, so interpretation of these mass spectra is a critical step in the experimental workflow. Because peptide and glycan mass spectra reflect physical properties of these molecules, standardized interpretation of these spectra has the potential to improve the success rate of all discovery experiments in proteomics and glycomics.
Biological mass spectrometry is a critical tool for understanding monoclonal antibodies. NIST researchers are using their expertise in building mass spectral libraries for other small molecules to compile a comprehensive library of consensus mass spectra of peptides, glycans, glycopeptides, and other important compounds. Developing a standard method for interpreting these mass spectral data, together with a comprehensive library of high-quality tandem mass spectra, is critical for establishing and advancing this technology.
- Develop a mass spectral library of all identifiable components of monoclonal antibodies, including peptide (and its modifications), semi-tryptic, glycan, and glycopeptide ions
- Provide the library in a form that is easily searchable using software tools
- Update and maintain the library as a reference data resource
Research Activities and Technical Approach
While the mass spectrometers used to identify peptides in proteomics have improved greatly over the past ten years, computer algorithms for peptide identification have not. Traditionally, this process involves a step in which theoretical peptide fragmentation spectra are predicted from protein sequences. These spectra typically contain peaks at the correct m/z values but carry little or no information about relative intensities (i.e., peak heights) or less common fragmentation products.
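To make the limitation of sequence-based prediction concrete, the following is a minimal Python sketch of how a theoretical fragmentation spectrum can be derived from a peptide sequence. It is an illustration under stated assumptions, not the method used by any particular search engine: it considers only singly charged b and y ions of an unmodified peptide, and the function name `theoretical_fragments` is hypothetical. Note that the output is a list of m/z values only; nothing in the calculation predicts peak heights.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def theoretical_fragments(peptide):
    """Return (b_ions, y_ions): singly charged m/z values, unmodified peptide.

    Hypothetical helper for illustration; real tools also handle
    modifications, higher charge states, and neutral losses.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)

    # b ions: N-terminal fragments b1 .. b(n-1).
    b_ions = []
    running = 0.0
    for i in range(n - 1):
        running += masses[i]
        b_ions.append(running + PROTON)

    # y ions: C-terminal fragments y1 .. y(n-1), which retain one water.
    y_ions = []
    running = 0.0
    for i in range(n - 1, 0, -1):
        running += masses[i]
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


b, y = theoretical_fragments("PEPTIDE")
for index, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"b{index} = {b_mz:.4f}   y{index} = {y_mz:.4f}")
```

Every predicted peak carries the same implicit weight, which is why such spectra say nothing about which fragments dominate in a measured spectrum.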
Screenshot of the MS Search 2.0 software showing a library match of an unknown glycan with a glycan in the library.
Ion components of a tryptic digest of cetuximab.
Moreover, hundreds of glycans may be detected in a single experiment but the identification of these compounds depends on the availability of libraries of standard reference mass spectra. Mass spectral libraries are built from measured spectra of known compounds and enable the use of sensitive search algorithms. The use of these algorithms and libraries (1) will lead to a higher percentage of identified spectra at the same level of reliability and (2) will greatly increase the robustness of the glycan identification step.
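As a rough illustration of how a library search can exploit measured intensities, the sketch below scores a query spectrum against reference spectra with a binned cosine (normalized dot product) similarity. This is a generic, simplified technique chosen for illustration; it is not a description of the scoring used by the NIST search software. The function names, the compound names, and all peak values are hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) pairs into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity between two spectra: 0 = no overlap, 1 = identical shape."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


def search_library(query, library, bin_width=1.0):
    """Return (best_score, best_name) over a dict of name -> peak list."""
    scored = [(cosine_score(query, peaks, bin_width), name)
              for name, peaks in library.items()]
    return max(scored) if scored else (0.0, None)


# Made-up reference spectra, for illustration only.
library = {
    "glycan_A": [(204.1, 100.0), (366.1, 60.0), (528.2, 20.0)],
    "glycan_B": [(204.1, 30.0), (292.1, 100.0), (657.2, 50.0)],
}
# Made-up query: same peak positions as glycan_A, slightly different heights.
query = [(204.1, 90.0), (366.1, 65.0), (528.2, 15.0)]

score, name = search_library(query, library)
print(f"best match: {name} (score {score:.3f})")
```

Because the score depends on relative peak heights as well as peak positions, two compounds that share fragment masses but differ in their intensity patterns can still be told apart, which is information a purely predicted spectrum does not provide.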
The data for this project are both generated in-house at NIST and collected from many outside sources. NIST also has data exchange agreements with several international proteomics data repositories in order to efficiently share the most relevant data.
To date, the small molecule mass spectral library contains >120,000 spectra for >15,000 ions of >7,000 compounds of biological and environmental relevance (including metabolites, bioactive peptides, amino acids and small peptides, sugars and glycans, lipids and phospholipids, drugs, pesticides, surfactants, and various contaminants). Several of the peptide libraries, including human, yeast, and E. coli, provide significant coverage of the corresponding proteomes and are suitable for routine use.
The spectra in these libraries are experimentally validated, critically evaluated, and annotated in great detail. Their use, in combination with or as an alternative to sequence-based identification methods, has been shown to double the number of peptide identifications for some data sets.