Software Assurance Metrics and Tool Evaluation (SAMATE)
Michael J. Kass
The National Software Reference Library (NSRL) of the U.S. National Institute of Standards and Technology (NIST) collects software from various sources and publishes file profiles computed from this software (such as MD5 and SHA-1 hashes) as a Reference Data Set (RDS) of information. The RDS can be used in the forensic examination of file systems, for example, to speed the process of identifying unknown or suspicious files. This paper describes the cross-platform, public domain, Linux/Apache/MySQL/Perl (LAMP) framework with which we produce the RDS from acquired software. The framework is easily deployed (it has been packaged on a Knoppix-based live CD) and allows for the distributed processing of large numbers of files in a loose, heterogeneous computing cluster. We go on to suggest that the framework is sufficiently general in its implementation to be suitable for application to classes of problems quite beyond our original scope.