This page provides a general overview of the NSRL: what it is, what it's for, and how it's used. If you're new to the NSRL, this is a good place to start.
What is the NSRL?
The NSRL is the National Software Reference Library, based at the National Institute of Standards and Technology, an agency of the U.S. Department of Commerce. It has three components:
- A large collection of software packages.
- A database containing detailed information, or metadata, about the files that make up those software packages.
- A public dataset, the NSRL Reference Data Set (RDS), which contains a subset of the metadata held in the database for each file in the collection. The RDS is published and updated every three months.
What's in the NSRL?
The NSRL RDS contains metadata on computer files that can be used to uniquely identify the files and their provenance. For each file in the NSRL collection, the following data are published:
- Cryptographic hash values (MD5 and SHA-1) of the file's content. These uniquely identify the file even if, for example, it has been renamed.
- Data about the file's origin, including the software package(s) containing the file and the manufacturer of the package.
- Other data about the file, including its original name and size.
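To illustrate why a content hash survives a rename, here is a minimal sketch that computes the MD5 and SHA-1 of a file using Python's standard `hashlib` module. The function name `file_digests` and the choice to return uppercase hexadecimal strings are assumptions made for this example, not part of the NSRL itself.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for the content of the file at `path`.

    Only the bytes of the file are hashed, so the result does not depend
    on the file's name or location. Uppercase hex is an assumption made
    here so that values compare consistently.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read in chunks so that large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest().upper(), sha1.hexdigest().upper()
```

Calling `file_digests` on a file before and after renaming it returns the same pair of values, because the file name is never part of the hashed input.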
The RDS is distributed as a collection of UTF-8 text files on a set of 4 CDs. Each file contains a set of records, one record per line, consisting of comma-separated fields. Details of the record formats can be found here (PDF file).
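As a rough sketch of how such a file might be read, the example below parses comma-separated records with Python's standard `csv` module. The header row and the column names (`SHA-1`, `MD5`, `FileName`, `FileSize`) are assumptions chosen for illustration, and the sample record is made up; the authoritative field list is in the record format documentation.

```python
import csv
import io

# Hypothetical sample data: a header line followed by one record per line.
SAMPLE = (
    '"SHA-1","MD5","FileName","FileSize"\n'
    '"A9993E364706816ABA3E25717850C26C9CD0D89D",'
    '"900150983CD24FB0D6963F7D28E17F72","a,b.txt",3\n'
)


def read_records(stream):
    """Yield one dict per record, keyed by the (assumed) header names."""
    for row in csv.DictReader(stream):
        row["FileSize"] = int(row["FileSize"])  # sizes are stored as text
        yield row


records = list(read_records(io.StringIO(SAMPLE)))
```

For a file on disk, the stream would be opened with `open(path, encoding="utf-8", newline="")`. Using a CSV parser rather than splitting on commas matters here because quoted fields, such as a file name, may themselves contain commas.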
What is the NSRL used for?
The data published by the NSRL is used to rapidly identify files on computer systems, based solely on the content of the files.
In most cases, NSRL file data is used to eliminate known files, such as operating system and application files, during criminal forensic investigations. This reduces the number of files which must be manually examined and thus increases the efficiency of the investigation.
However, the data can just as easily be used to target files of interest. Such uses include detection of unauthorized software installations (e.g., in corporate or web hosting environments or in intellectual property disputes), and discovery of exculpatory evidence by criminal defense teams.
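Both uses come down to set membership on hash values. The following is a minimal sketch, assuming the SHA-1 of each file on the examined system has already been computed and that the relevant hashes have been loaded into Python sets. The function name `triage`, the paths, and the placeholder hash `F` * 40 are hypothetical.

```python
def triage(files, known_hashes, target_hashes=frozenset()):
    """Split files into (ignore, flagged, examine) lists of paths.

    files:         mapping of path -> SHA-1 hex digest of the file's content
    known_hashes:  hashes of known files that can be eliminated
    target_hashes: hashes of files of interest that should be reported
    """
    ignore, flagged, examine = [], [], []
    for path, sha1 in files.items():
        digest = sha1.upper()  # normalise case before comparing
        if digest in target_hashes:
            flagged.append(path)   # a file of interest
        elif digest in known_hashes:
            ignore.append(path)    # known file, no manual review needed
        else:
            examine.append(path)   # unknown content, needs a human
    return ignore, flagged, examine


# Hypothetical inputs for illustration only.
files = {
    "/windows/notepad.exe": "a9993e364706816aba3e25717850c26c9cd0d89d",
    "/home/user/tool.exe": "DA39A3EE5E6B4B0D3255BFEF95601890AFD80709",
    "/home/user/notes.txt": "F" * 40,
}
known = {"A9993E364706816ABA3E25717850C26C9CD0D89D"}
targets = {"DA39A3EE5E6B4B0D3255BFEF95601890AFD80709"}

ignore, flagged, examine = triage(files, known, targets)
```

Because the lookup depends only on the hash, a file is matched by its content regardless of what it is called or where it is stored.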
Who uses the NSRL?
The primary focus of the NSRL is to aid computer forensics examiners in their investigations of computer systems. The majority of NSRL stakeholders are in federal, state and local law enforcement in the United States and internationally. These organizations typically use the NSRL data to aid in criminal investigations, as outlined above. Other stakeholders include businesses and other government agencies which may use the NSRL RDS as part of their routine IT operations.
How is the NSRL used?
Typically, the RDS data is imported into one of a number of commercial software packages.
How can I get the NSRL data?
NSRL RDS releases are available as a free download. Data sets are updated quarterly (March, June, September, December), typically on the first Friday of the month.