Comparative Serum Proteomics Project

Summary

Initially, the Comparative Mammalian Proteome Aggregator Resource (CoMPARe) Program will generate proteomic data from sera of 25 different species that currently have genome annotations. The resulting data will be publicly available, and a data tool will be developed to humanize protein identifications between species to facilitate direct comparisons. Once the data have been acquired and humanized, work will begin on a web portal to allow easy species-species comparisons by expert and non-expert end-users. This will create the foundation for the next phase, which will generate data on 50 additional mammalian species as genome annotations become available. This project will be the template for future comparative proteomic projects evaluating plasma and specific tissues of interest.

Description

Illustration showing a dendrogram classification graph, circular arrows pointing from DNA to a mass spectrum and back, and DaVinci’s drawing of a human figure, with the NIST logo centered on the illustration.

The Comparative Serum Proteomics Project will facilitate comparisons of proteomic data among mammalian species. 

Credit: NIST

State-of-the-art biomolecular analysis is no longer limited to model organisms and is becoming routine in non-model organisms. Major drivers of this emerging bioanalytical capacity include the increasing accessibility and quality of sequenced genomes as well as high-resolution, fast-duty-cycle mass spectrometers for proteomic analysis. In recent years there has been a dramatic decrease in sequencing cost along with an increase in the number of published eukaryotic genomes. Moreover, there are ongoing projects to sequence over 9,000 species (the G10K and Earth BioGenome Projects). Despite this, many organisms currently lack genome annotations. NIST will assist or lead development of high-quality genome assemblies and gene annotations with partners, industry and other agencies (such as our efforts related to the Atlantic bottlenose dolphin).

Using comparative proteomics to evaluate a large, diverse group of non-model organisms creates unique and exciting research questions, opportunities and downstream products. Developing high-quality proteomic data for each species requires quality samples, genomic databases, data acquisition on cutting-edge mass spectrometers, and management of the data into an easily accessible and usable product.

To make these results broadly applicable, blood will be used initially. Blood is routinely collected for health monitoring, making it a readily available and rich resource. Blood also has the advantage of being proximate to most tissues while remaining relatively stable in many of its major constituents. Further, blood protein constituents cannot be predicted by mRNA transcript abundance. Using modern proteomic analysis of non-depleted serum/plasma, it is possible to identify and provide relative quantification of 100 to 500 proteins with two hours of instrument time.

To take advantage of emerging proteomics techniques (such as data-independent acquisition), which may not be suitable for non-model organisms, NIST will work alongside software and algorithm developers to ensure that these platforms can be used beyond human data sets. To compile and compare data across species, data tools will be developed to enable comparisons of homologous proteins. These data tools and data sets will be made publicly available on ProteomeXchange and MassIVE, as well as through a web portal to aid in retrieval and species-species or protein-protein comparisons. This tool will allow researchers and comparative medicine departments to determine the suitability of a comparative model beyond the presence or absence of a specific gene and will allow consideration of phenotypic backgrounds to influence research choice.
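As a rough illustration of the species-species comparison described above, the following minimal Python sketch compares two sets of humanized protein identifications. It is not the project's data tool. The function names (`shared_proteins`, `jaccard`, `pairwise_overlap`), the choice of the Jaccard index as an overlap measure, and the placeholder species labels and toy inputs in the usage note are all assumptions made for illustration.

```python
from itertools import combinations


def shared_proteins(ids_a, ids_b):
    """Return the humanized identifications found in both species."""
    return set(ids_a) & set(ids_b)


def jaccard(ids_a, ids_b):
    """Overlap score: size of the intersection divided by size of the union."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    # Two empty lists have no overlap to measure; report 0.0 by convention.
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(ids_by_species):
    """Score every pair of species, keyed by (species_1, species_2) in sorted order."""
    return {
        (s1, s2): jaccard(ids_by_species[s1], ids_by_species[s2])
        for s1, s2 in combinations(sorted(ids_by_species), 2)
    }
```

For example, calling `pairwise_overlap` on a toy input such as `{"species_a": {"ALB", "TF", "HP"}, "species_b": {"ALB", "TF", "APOA1"}}` scores the single pair by the two shared symbols out of four in total. A presence/absence comparison like this ignores the relative quantification that the acquired data would also provide.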

Current Activity

Phase 1 Goals/Milestones:

  • Generate serum proteomic data from 25 different mammalian species that currently have genome annotations. One species (Atlantic bottlenose dolphin; Tursiops truncatus) will be selected to evaluate age and sex variability by analyzing 20 sera.
  • Develop a data tool to humanize identifications between species. NCBI RefSeq annotations already rely on homology for gene annotations, and this tool will determine approved HGNC gene names and symbols where appropriate.
  • Make data publicly available on ProteomeXchange and MassIVE.
  • Develop a web portal to allow for easy species-species comparisons directed toward end-users that are not experts in proteomic technology.
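The humanization step in the list above can be pictured with a minimal Python sketch. It assumes a hypothetical in-memory lookup table (`APPROVED_HGNC`) holding a few approved human gene symbols, and it is not the project's actual tool. A real implementation would load the full HGNC data set and use homology information rather than simple case-insensitive symbol matching.

```python
from typing import Dict, Iterable, Optional

# Hypothetical lookup: upper-cased symbol -> approved HGNC symbol.
# Only a few well-known serum proteins are listed, for illustration.
APPROVED_HGNC: Dict[str, str] = {
    "ALB": "ALB",      # albumin
    "TF": "TF",        # transferrin
    "APOA1": "APOA1",  # apolipoprotein A1
    "HP": "HP",        # haptoglobin
}


def humanize(symbol: str, approved: Dict[str, str] = APPROVED_HGNC) -> Optional[str]:
    """Return the approved HGNC symbol for a species gene symbol, or None."""
    key = symbol.strip().upper()
    # NCBI assigns "LOC" + GeneID placeholders to uncharacterized loci;
    # these have no approved human symbol to map to in this sketch.
    if key.startswith("LOC"):
        return None
    return approved.get(key)


def humanize_identifications(symbols: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each identified symbol to its humanized form (None if unmapped)."""
    return {s: humanize(s) for s in symbols}
```

In this sketch, a symbol written in another species' capitalization style, such as `Alb`, resolves to `ALB`, while unmapped symbols are kept with a value of `None` so they remain visible in the output for manual review.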

Phase 2 Goals/Milestones:

  • Generate proteomic data of serum from 50 additional mammalian species.
  • Within the original 25 species, add larger sample sets from at least 10 individuals across age and sex.
  • Begin generating plasma proteomes of the species in Phase 1.
  • Continue development of the web portal.

Created June 14, 2018, Updated March 19, 2021