Data-Driven and Peak-Based Feature Selection In Serum Protein Mass Spectrometry



Walter S. Liggett Jr, Peter E. Barker, O. J. Semmes, L. H. Cazares


Consider functional canonical correlation analysis (CCA) applied to disjoint sections of lengthy protein mass spectra for the purpose of finding long-distance correlation structure. The relations between the CCA weight functions, which are derived from the data, and spectral peaks, which can be traced to individual proteins, provide a basis for interpreting the structure. The data analyzed consist of repeated measurements of a human serum standard by surface-enhanced laser desorption/ionization (SELDI) time-of-flight (TOF) mass spectrometry. There are 88 spectra obtained from 11 protein chips each with 8 spots. The data-analysis goal is insight into the sample preparation step in such spectrometry, a step that involves the protein chip. We see that variation in this step has an outsized effect on a few proteins. We obtain this insight through interpretation of the long-distance correlation structure and through comparison of spectral variation from chip to chip with variation from spot to spot on single chips.
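The analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses an ordinary ridge-regularized CCA on discretized spectra as a stand-in for functional CCA, and it runs on synthetic data rather than the SELDI-TOF measurements. The function names, the ridge value, the section widths, and the simulated chip effect are all assumptions made for illustration. Only the layout of 11 chips with 8 spots each (88 spectra) comes from the abstract.

```python
import numpy as np

def first_canonical_pair(X, Y, ridge=1e-3):
    """Leading canonical correlation and weight vectors for two data blocks.

    Ridge-regularized CCA (a simple stand-in for functional CCA; the ridge
    value is an illustrative assumption, not taken from the paper).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each block with its Cholesky factor: M = Lx^{-1} Sxy Ly^{-T}
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the leading singular vectors back to weights on the original axes
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return s[0], a, b

def chip_vs_spot_variance(score, chip):
    """Variance of chip means (chip to chip) and mean within-chip variance
    (spot to spot) for a one-dimensional score."""
    chips = np.unique(chip)
    means = np.array([score[chip == c].mean() for c in chips])
    within = np.mean([score[chip == c].var(ddof=1) for c in chips])
    return means.var(ddof=1), within

# Synthetic stand-in data: 11 chips x 8 spots = 88 spectra (layout from the abstract).
rng = np.random.default_rng(0)
n_chips, n_spots = 11, 8
chip = np.repeat(np.arange(n_chips), n_spots)
chip_effect = rng.normal(size=n_chips)[chip]  # hypothetical shared chip-level effect

# Two disjoint spectral sections, each with one hypothetical Gaussian peak.
p, q = 20, 15
peak_x = np.exp(-0.5 * ((np.arange(p) - 6) / 1.5) ** 2)
peak_y = np.exp(-0.5 * ((np.arange(q) - 9) / 1.5) ** 2)
X = np.outer(chip_effect, peak_x) + 0.3 * rng.normal(size=(chip.size, p))
Y = np.outer(chip_effect, peak_y) + 0.3 * rng.normal(size=(chip.size, q))

rho, a, b = first_canonical_pair(X, Y)
score_x = (X - X.mean(axis=0)) @ a  # canonical variate for the first section
between, within = chip_vs_spot_variance(score_x, chip)
print(f"leading canonical correlation: {rho:.3f}")
print(f"chip-to-chip variance: {between:.3f}, spot-to-spot variance: {within:.3f}")
```

In this sketch, the weight vectors `a` and `b` play the role of the CCA weight functions; plotting them against the peak positions is one way to relate a data-derived weight function to individual spectral peaks. Comparing the two variance figures mirrors the chip-to-chip versus spot-to-spot comparison mentioned in the abstract.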
Clinical Chemistry


biomarker validation, SELDI-TOF, serum proteomics


Liggett Jr, W., Barker, P., Semmes, O. and Cazares, L. (2021), Data-Driven and Peak-Based Feature Selection In Serum Protein Mass Spectrometry, Clinical Chemistry (Accessed April 24, 2024)
Created October 12, 2021