The NSRL RDS and its associated database contain file-based metadata about software derived from installation media. While this metadata is generally useful in identifying software installed on a computer, it tells only part of the story. As the sophistication of the forensic process has advanced, demand has burgeoned for a more complete description of the total effect software has on a system. The NSRL now augments the file metadata published in the RDS with data that catalog this effect. This is done by modifying known systems under controlled conditions and recording the effects using virtual machine installations.
Initially, only Windows(R)-based systems are being studied, specifically Windows(R) XP Professional, Windows(R) 7 Ultimate 32-bit, and Windows(R) 7 Ultimate 64-bit.
The NSRL uses virtual machine (VM) technology to investigate the forensics of the software life cycle. Our principal method of data collection is to act on a VM, pause the VM, copy off the paused VM, wake the VM, act on the VM again, and so on. We refer to the copied-off VMs as slices, as each one represents a slice of time in the software's life cycle on the system. The set of all slices for a package, together with various metadata that apply to the entire package life cycle, is in turn what we refer to as the application's diskprint.
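The act, pause, copy, wake cycle described above can be sketched in Python. This is only an illustration under stated assumptions, not the NSRL's actual tooling: SimulatedVM, collect_slices, and the four action functions are hypothetical names, and the "VM" here is an in-memory dictionary of files standing in for a real hypervisor.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SimulatedVM:
    """Hypothetical stand-in for a virtual machine (not a real hypervisor API)."""
    files: dict = field(default_factory=dict)  # path -> file content
    running: bool = True

    def pause(self):
        self.running = False

    def resume(self):
        self.running = True


def collect_slices(vm, actions):
    """Apply each (label, action) pair, copying off the paused VM after each one."""
    slices = []
    for label, action in actions:
        action(vm)                                       # act on the VM
        vm.pause()                                       # pause the VM
        slices.append((label, copy.deepcopy(vm.files)))  # copy off the paused state
        vm.resume()                                      # wake the VM
    return slices


# Hypothetical actions covering one package life cycle.
def install(vm):
    vm.files["C:/Program Files/Demo/demo.exe"] = b"binary"

def start(vm):
    vm.files["C:/Users/u/AppData/demo.log"] = b"started"

def close(vm):
    vm.files["C:/Users/u/AppData/demo.log"] += b" closed"

def uninstall(vm):
    del vm.files["C:/Program Files/Demo/demo.exe"]


vm = SimulatedVM()
slices = collect_slices(
    vm,
    [("install", install), ("start", start), ("close", close), ("uninstall", uninstall)],
)
```

Each entry in slices is an independent copy, so later actions on the VM do not alter earlier slices. In this toy run the log file written at startup is still present in the final slice, after the executable has been removed.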
Each package for which a diskprint is created is taken from the NSRL library. This ensures that the package and its provenance are known and that we have the metadata to describe the package (such as name, manufacturer, and version).
Each diskprint attempts to describe comprehensively the changes that a software package causes in a computer system. It contains, at a minimum, slices taken after the installation, startup, closing, and removal/uninstallation of the software. These changes are recorded as comparisons to a known baseline. The baseline itself consists of a "clean" installation of an operating system that has itself been through the diskprint quantification process.
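To make the idea of recording changes as comparisons to a known baseline concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a system state can be reduced to a mapping of file paths to contents and compared by SHA-1 digest; the function names and sample paths are hypothetical, and the actual NSRL comparison may work quite differently.

```python
import hashlib


def hash_files(files):
    """Map each path to the SHA-1 hex digest of its content."""
    return {path: hashlib.sha1(data).hexdigest() for path, data in files.items()}


def diff_from_baseline(baseline, snapshot):
    """Describe a snapshot as added, removed, and changed paths relative to the baseline."""
    base, snap = hash_files(baseline), hash_files(snapshot)
    return {
        "added": sorted(p for p in snap if p not in base),
        "removed": sorted(p for p in base if p not in snap),
        "changed": sorted(p for p in snap if p in base and snap[p] != base[p]),
    }


# Hypothetical "clean" baseline and a state captured after installation.
baseline = {
    "C:/Windows/notepad.exe": b"np",
    "C:/Windows/win.ini": b"[fonts]",
    "C:/Windows/temp.dat": b"x",
}
after_install = {
    "C:/Windows/notepad.exe": b"np",
    "C:/Windows/win.ini": b"[fonts]\ndemo=1",
    "C:/Program Files/Demo/demo.exe": b"binary",
}

delta = diff_from_baseline(baseline, after_install)
```

Storing only the delta keeps each record focused on what the package did, rather than repeating the unchanged operating system content.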
Each slice in a diskprint comprises a set of measurements which attempts to distill the state of the system at the time the slice was taken. The measurements taken are as follows (expressed as a difference from the baseline):
The diskprint as a whole contains the above measurements for all its constituent slices plus the name, version, manufacturer, etc. data for the package as published in the RDS.
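One way to picture a diskprint as a record is sketched below in Python. The class and field names (Slice, Diskprint, stage, measurements) and the stage labels are assumptions made for this example; they do not reflect a published NSRL schema. The sketch pairs package metadata with a list of slices and checks for the minimum set of life-cycle stages.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the minimum life-cycle stages a diskprint should cover.
REQUIRED_STAGES = ("install", "start", "close", "uninstall")


@dataclass
class Slice:
    stage: str          # life-cycle stage after which the slice was taken
    measurements: dict  # differences from the baseline (contents are illustrative)


@dataclass
class Diskprint:
    name: str
    version: str
    manufacturer: str
    slices: list = field(default_factory=list)

    def missing_stages(self):
        """Return the required stages that have no slice yet, in life-cycle order."""
        present = {s.stage for s in self.slices}
        return [stage for stage in REQUIRED_STAGES if stage not in present]


# Hypothetical package with only the first two slices collected so far.
dp = Diskprint(name="Demo Editor", version="1.0", manufacturer="Example Corp")
dp.slices.append(Slice("install", {"added": ["C:/Program Files/Demo/demo.exe"]}))
dp.slices.append(Slice("start", {"added": ["C:/Users/u/AppData/demo.log"]}))

missing = dp.missing_stages()
```

A check like missing_stages makes it easy to tell whether a record already covers installation, startup, closing, and removal before it is treated as complete.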
The collection of diskprint hashes is being distributed as part of the RDS distribution.
For an example of alternative output formats see A windows7 diskprint example.
A collection of diskprint data containing sector hashes and DFXML can be found on the diskprint downloads page.