Combinatorial Testing Metrics for Machine Learning
Erin Lanus, Laura Freeman, D. Richard Kuhn, Raghu N. Kacker
This short paper defines a combinatorial coverage metric for comparing machine learning (ML) data sets and proposes characterizing the differences between data sets as a function of combinatorial coverage. Identifying and measuring such differences can be of significant value for ML problems, where the accuracy of a model depends heavily on how representative the training data are of the data that will be encountered in application. The utility of the method for evaluating and predicting model performance is illustrated for transfer learning, the problem of predicting the performance of a model trained on one data set when applied to another.
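The abstract does not give the metric's formula, so the following is only a minimal sketch of one plausible reading: collect the t-way combinations of feature values that appear in each data set, then report the fraction of the target set's combinations that never occur in the source set. The function names (`t_way_combinations`, `set_difference_coverage`), the column-index encoding, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def t_way_combinations(rows, t):
    """Return the set of t-way value combinations that appear in rows.

    Each combination is a tuple of (column_index, value) pairs drawn
    from t distinct columns of a single row.
    """
    seen = set()
    for row in rows:
        for cols in combinations(range(len(row)), t):
            seen.add(tuple((c, row[c]) for c in cols))
    return seen


def set_difference_coverage(source_rows, target_rows, t):
    """Fraction of the target's t-way combinations absent from the source.

    Hypothetical illustration only: 0.0 means every combination seen in
    the target also appears in the source; 1.0 means none of them do.
    """
    source = t_way_combinations(source_rows, t)
    target = t_way_combinations(target_rows, t)
    if not target:
        return 0.0
    return len(target - source) / len(target)


# Toy data sets with three categorical features per row (assumed values).
source = [("sunny", "day", "urban"), ("rainy", "day", "rural")]
target = [("sunny", "day", "urban"), ("sunny", "night", "rural")]

print(set_difference_coverage(source, target, 2))  # prints 0.5
```

In this toy case, half of the target's 2-way combinations are unseen in the source. Under the paper's framing, a larger value would suggest that a model trained on the source has been exposed to less of what it will meet in the target.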
April 12-16, 2021
IEEE International Conference on Software Testing, Verification and Validation Workshop (ICSTW)
Lanus, E., Freeman, L., Kuhn, D. and Kacker, R. (2021), Combinatorial Testing Metrics for Machine Learning, IEEE International Conference on Software Testing, Verification and Validation Workshop (ICSTW), Porto, [online], https://doi.org/10.1109/ICSTW52544.2021.00025, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=931825
(Accessed October 28, 2021)