A Method-Level Test Generation Framework for Debugging Big Data Applications
Raghu N. Kacker, David R. Kuhn, Huadong Feng, Yu J. Lei
Big data applications are now widely used to process the massive amounts of data we create every day. When a failure occurs in a big data application, debugging with the system-level input can be expensive because of the large amount of data being processed. This paper introduces a test generation framework for effectively generating method-level tests to facilitate debugging of big data applications. This is achieved by running a big data application with the real dataset and automatically recording the inputs to a small number of method executions, which we refer to as method-level tests, while preserving certain code coverage, e.g., line coverage. When debugging, a developer can inspect the execution of these method-level tests instead of the entire program execution with the real dataset, which could be time-consuming. We implemented the framework and applied it to seven algorithms in the WEKA tool. The initial results show that only a very small number of method-level tests need to be recorded to preserve code coverage. Furthermore, these tests could kill between 53.08% and 96.89% of the mutants generated using a third-party tool. This suggests that the framework could significantly reduce the effort required for debugging big data applications.
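The core recording idea in the abstract, keeping only those method executions that contribute new line coverage, can be sketched as follows. This is an illustrative Python snippet, not the authors' implementation (the paper targets Java applications such as WEKA); the function and variable names are hypothetical, and `sys.settrace` stands in for whatever instrumentation the framework actually uses.

```python
import sys

def record_coverage_preserving_tests(func, inputs):
    """Run func on each input; keep only calls that cover new lines of func.

    The kept inputs act as 'method-level tests': replaying them
    reproduces the same line coverage as the full run.
    """
    covered = set()   # line numbers of func executed so far
    recorded = []     # inputs retained as method-level tests

    def run_traced(arg):
        lines = set()
        def tracer(frame, event, _):
            # Record only line events inside func itself.
            if event == "line" and frame.f_code is func.__code__:
                lines.add(frame.f_lineno)
            return tracer
        old = sys.gettrace()
        sys.settrace(tracer)
        try:
            func(arg)
        finally:
            sys.settrace(old)
        return lines

    for arg in inputs:
        lines = run_traced(arg)
        if lines - covered:          # this call exercises uncovered lines
            covered |= lines
            recorded.append(arg)     # keep it as a method-level test
    return recorded

def classify(x):
    """Toy method under test with three branches."""
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"

# Of five executions, only the first input reaching each branch is kept.
tests = record_coverage_preserving_tests(classify, [5, 7, -3, 0, 9])
```

Here `tests` retains only three of the five executions (one per branch), illustrating how a small recorded subset can preserve the line coverage of a much larger run over the real dataset.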
Kacker, R., Kuhn, D., Feng, H. and Lei, Y. (2018), A Method-Level Test Generation Framework for Debugging Big Data Applications, IEEE International Conference on Big Data 2018, Seattle, WA, [online], https://doi.org/10.1109/BigData.2018.8622248
(Accessed March 2, 2024)