This work presents a method for creating a mass spectral library containing tandem spectra of identifiable peptide ions in the tryptic digestion of a single protein. Human serum albumin (HSA) was selected for this purpose owing to its ubiquity and high level of characterization. The underlying experimental data consisted of approximately 3,000 1D LC-ESI-MS/MS runs with ion-trap fragmentation. To generate a wide range of peptides, studies covered a broad set of instrument and digestion conditions using multiple sources of HSA and trypsin. Computer methods were developed to enable the reliable identification and reference spectrum extraction of all peptide ions identifiable by current sequence search methods. This process made use of both MS2 (tandem) spectra and MS1 (electrospray) data. Identified spectra were generated for 2,919 different peptide ions, using a variety of manually validated filters to ensure spectrum quality and identification reliability. The resulting library was composed of 10% conventional tryptic and 29% semitryptic peptide ions, along with 42% tryptic peptide ions with known or unknown modifications, which included both analytical artifacts and posttranslational modifications (PTMs) present in the original HSA. The remaining 19% contained unexpected missed cleavages or were under- or over-alkylated. The methods described can be extended to create equivalent spectral libraries for any target protein. Such libraries have a number of applications in addition to their known advantages of speed and sensitivity, including the ready re-identification of known PTMs, rejection of artifact spectra, and a means of assessing sample and digestion quality.
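The peptide classes named above (tryptic, semitryptic, missed cleavages) can be illustrated with a minimal sketch. It is not the authors' classification software. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R, except before P), ignores modifications and alkylation state entirely, and uses a made-up toy sequence rather than HSA. The function names `is_cleavage_site` and `classify_peptide` are hypothetical.

```python
def is_cleavage_site(protein, i):
    """True if trypsin could cleave between protein[i] and protein[i + 1].

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    if i < 0 or i >= len(protein) - 1:
        return False
    return protein[i] in "KR" and protein[i + 1] != "P"


def classify_peptide(protein, peptide):
    """Return (label, missed_cleavages) for a peptide within a protein.

    The label reflects how many peptide termini coincide with a tryptic
    cleavage site or a protein terminus. Only the first occurrence of the
    peptide in the protein is considered.
    """
    start = protein.find(peptide)
    if start < 0:
        return "not found", 0
    end = start + len(peptide)  # exclusive end index

    n_term_ok = start == 0 or is_cleavage_site(protein, start - 1)
    c_term_ok = end == len(protein) or is_cleavage_site(protein, end - 1)

    # Internal cleavage sites that were left uncut.
    missed = sum(is_cleavage_site(protein, i) for i in range(start, end - 1))

    if n_term_ok and c_term_ok:
        label = "tryptic"
    elif n_term_ok or c_term_ok:
        label = "semitryptic"
    else:
        label = "nonspecific"
    return label, missed


if __name__ == "__main__":
    toy_protein = "AAKGGRPLLKDDEEK"  # hypothetical sequence, not HSA
    for pep in ("GGRPLLK", "AAKGGRPLLK", "RPLLK", "GRPLL"):
        print(pep, classify_peptide(toy_protein, pep))
```

A real library build would also need to account for modifications, alkylation state, and charge, which this sketch leaves out.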
Citation: Molecular and Cellular Proteomics
Pub Type: Journals
Keywords: Tandem mass spectral libraries, Human Serum Albumin, tryptic digestion of a single protein, peptide classification, spectrum quality and identification reliability, analytical artifacts and posttranslational modifications