Erin Lanus, Brian Lee, Jaganmohan Chandrasekaran, Laura Freeman, M S Raunak, Raghu Kacker, David Kuhn
Abstract
Artificial Intelligence (AI) models use statistical learning over data to solve complex problems for which straightforward rules or algorithms may be difficult or impossible to design; however, a side effect is that models that are complex enough to sufficiently represent the function may be uninterpretable. Combinatorial testing, a black-box approach arising from software testing, has been applied to test AI models. A key differentiator between traditional software and AI is that many traditional software faults are deterministic, requiring a failure-inducing combination of inputs to appear only once in the test set for it to be discovered. On the other hand, AI models learn statistically by reinforcing weights through repeated appearances in the training dataset, and the frequency of input combinations plays a significant role in influencing the model's behavior. Thus, a single occurrence of a combination of feature values may not be sufficient to influence the model's behavior. Consequently, measures like simple combinatorial coverage that are applicable to software testing do not capture the frequency with which interactions are covered in the AI model's input space. This work develops methods to characterize the data frequency coverage of feature interactions in training datasets and analyze the impact of imbalance, or skew, in the combinatorial frequency coverage of the training data on model performance. We demonstrate our methods with experiments on an open-source dataset using several classical machine learning algorithms. This pilot study makes three observations: performance may increase or decrease with data skew, feature importance methods do not predict skew impact, and adding more data may not mitigate skew effects.
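To illustrate the idea of combinatorial frequency coverage described above, the following is a minimal sketch (not the authors' implementation) of counting how often each t-way combination of feature values appears in a training dataset and summarizing the imbalance, or skew, across those combinations. The feature names and the helper functions frequency_coverage and skew_summary are hypothetical examples introduced here for illustration only.

```python
# Minimal sketch: count occurrences of each t-way feature-value combination
# in a training dataset and summarize the skew of those frequencies.
from itertools import combinations
from collections import Counter


def frequency_coverage(rows, feature_names, t=2):
    """Map each t-way feature-value combination to its occurrence count.

    `rows` is a list of dicts keyed by feature name; `t` is the interaction
    strength (2 = pairwise).
    """
    counts = Counter()
    for row in rows:
        for features in combinations(feature_names, t):
            key = tuple((f, row[f]) for f in features)
            counts[key] += 1
    return counts


def skew_summary(counts):
    """Crude imbalance summary: min, max, and mean frequency over the
    combinations observed in the data."""
    values = list(counts.values())
    return {
        "combinations": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


if __name__ == "__main__":
    # Toy dataset with hypothetical categorical features; a real study would
    # use a full training set.
    data = [
        {"color": "red", "shape": "circle", "size": "small"},
        {"color": "red", "shape": "square", "size": "small"},
        {"color": "blue", "shape": "circle", "size": "large"},
        {"color": "red", "shape": "circle", "size": "small"},
    ]
    pairwise_counts = frequency_coverage(data, ["color", "shape", "size"], t=2)
    print(skew_summary(pairwise_counts))
```

In this toy example, the pair (color=red, shape=circle) appears twice while (color=blue, shape=square) never appears, so simple combinatorial coverage would mark the latter as uncovered, whereas frequency coverage also distinguishes how often each covered combination occurs.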
Lanus, E., Lee, B., Chandrasekaran, J., Freeman, L., Raunak, M. S., Kacker, R. and Kuhn, D. (2025), Data Frequency Coverage Impact on AI Performance, International Workshop on Combinatorial Testing, Naples, Italy, [online], https://doi.org/10.1109/ICSTW64639.2025.10962464, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959549 (Accessed October 16, 2025)