Author(s)
James R. Lyle, Graeme Horsman
Abstract
As the digital forensic field develops, taking steps towards ensuring a level of reliability in the processes implemented by its practitioners, emphasis on the need for effective testing has increased. In order to test, test datasets are required, but creating these is not a straightforward task. A poorly constructed and documented test dataset undermines any testing which has taken place using it, eroding the reliability of any subsequent test results. In essence, given the time, effort and knowledge required to generate datasets, the field must guide those carrying out this task to ensure that it is done right in the first instance without wasting resources. Yet, there are currently few standards and best practices defined for dataset creation in digital forensics. This work defines three categories of dataset which typically exist in digital forensics: tool/process evaluation datasets, actions datasets and scenario-based datasets. The minimum requirements for their creation are outlined and discussed to support those creating them and to help ensure that, where datasets are created, they offer maximum value to the field.
Citation
Forensic Science International: Digital Investigation
Keywords
Digital Forensics, Datasets, Testing, Tool-testing
Citation
Lyle, J. and Horsman, G. (2021), Dataset construction challenges for digital forensics, Forensic Science International: Digital Investigation, [online], https://doi.org/10.1016/j.fsidi.2021.301264, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=931168 (Accessed May 10, 2026)