Determination of physical and chemical properties of petroleum derivatives using gas chromatography and mass spectrometry data.
Werickson Fortunato de Carvalho Rocha, David A. Sheen
The physical properties of a substance such as a fuel are in principle determined solely by its composition, which can be determined, or at least probed, using techniques such as gas chromatography coupled with mass spectrometry (GC-MS). Because the composition of fuels can vary widely, we compared the accuracy and robustness of two regression algorithms with uncertainty estimation, partial least squares (PLS) regression and support vector machine (SVM) regression, to determine how the physical properties depend on the composition. With this uncertainty estimate, it is possible to assess the trustworthiness of any prediction, which ensures that the chemometric models can be applied for general purposes. A set of hydrocarbon mixtures, including crude oil, oil, gasoline, and biofuel/biodiesel, was collected. GC-MS data were taken, and physical properties were measured for these mixtures using ASTM standard methods. PLS and SVM were used to develop predictive models of the physical properties. Uncertainty in the predicted property values was estimated using a bootstrapping technique. SVM was found to be generally better for predicting the physical properties, although we expect that with a more comprehensive data set the performance of the PLS models can be improved. We show in this work that PLS and SVM can be used to generate a predictive model of physical properties based on GC-MS data. Combined with uncertainty analysis, these models provide robust predictions that can be used for regulatory, economic, and safety purposes.