Principal Component Analysis for Automated Classification of 2D Spectra and Interferograms of Protein Therapeutics: Influence of Noise, Reconstruction Details, and Data Preparation
Robert Brinson, Wade Elliott, Luke Arbogast, David Sheen, John Giddens, John Marino, Frank Delaglio
Protein therapeutics must retain their proper three-dimensional fold without forming aggregates for safe and effective use in the clinic. Therefore, the ability to monitor protein higher order structure (HOS) can be valuable throughout the lifecycle of a protein therapeutic, from development to manufacture. 2D NMR has been introduced as a robust and precise tool to assess the HOS of a protein biotherapeutic. A common use case is to decide whether two groups of spectra are substantially different, as an indicator of difference in HOS. Here, we demonstrate a quantitative use of principal component analysis (PCA) scores to perform this decision-making, and demonstrate the effect of acquisition and processing details on class separation using samples of NISTmAb monoclonal antibody Reference Material subjected to two different oxidative stress protocols. The work introduces an approach to computing similarity from PCA scores based upon the technique of histogram intersection, a method originally developed for retrieval of images from large databases. Results show that class separation can be robust with respect to random noise, reconstruction method, and analysis region selection. By contrast, details such as baseline distortion can have a pronounced effect, and so must be controlled carefully. Since the classification approach can be performed without the need to identify peaks, results suggest that it is possible to use even more efficient measurement strategies that do not produce spectra that can be analyzed visually, but nevertheless allow useful decision-making that is objective and automated.
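To make the idea of scoring class overlap from PCA scores concrete, the following is a minimal sketch and not the authors' implementation. It assumes each spectrum has been flattened into one row of a matrix, computes PCA scores with a plain SVD, and measures overlap between two groups as the generic histogram intersection (the sum of bin-wise minima of two normalized histograms). The function names `pca_scores` and `histogram_intersection`, the bin count, the use of only the first principal component, and the synthetic data are all illustrative assumptions; the paper's actual procedure may differ in every one of these choices.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores for the rows of X (hypothetical helper).

    Columns are mean-centered, then the rows are projected onto the
    leading right singular vectors, which equals U * S from the SVD.
    """
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def histogram_intersection(a, b, bins=10):
    """Overlap of two 1D samples in [0, 1] (hypothetical helper).

    Both samples are binned over a shared range, each histogram is
    normalized to unit sum, and the bin-wise minima are summed.
    1 means identical histograms, 0 means no shared bins.
    """
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    ha = ha / ha.sum()
    hb = hb / hb.sum()
    return float(np.minimum(ha, hb).sum())

# Synthetic stand-in data (an assumption, not NISTmAb measurements):
# group B carries one extra peak relative to group A, plus small noise.
rng = np.random.default_rng(0)
x = np.arange(200)
base = np.exp(-0.5 * ((x - 60) / 5.0) ** 2)
extra = np.exp(-0.5 * ((x - 140) / 5.0) ** 2)
group_a = base + 0.01 * rng.normal(size=(8, 200))
group_b = base + 0.5 * extra + 0.01 * rng.normal(size=(8, 200))

# Score all spectra together, then compare the two groups along PC1.
scores = pca_scores(np.vstack([group_a, group_b]), n_components=1)[:, 0]
overlap = histogram_intersection(scores[:8], scores[8:])
print(f"PC1 histogram intersection between groups: {overlap:.3f}")
```

In this sketch a small intersection indicates well-separated groups and a value near 1 indicates groups that are indistinguishable along the chosen component. Because the calculation operates on whole data vectors, it needs no peak identification, which is the property the abstract highlights.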
Brinson, R., Elliott, W., Arbogast, L., Sheen, D., Giddens, J., Marino, J. and Delaglio, F., Principal Component Analysis for Automated Classification of 2D Spectra and Interferograms of Protein Therapeutics: Influence of Noise, Reconstruction Details, and Data Preparation, Journal of Biomolecular NMR, [online], https://doi.org/10.1007/s10858-020-00332-y (Accessed September 22, 2023)