sdhash Datasets

NSRL sdhash 3.3 Data

NSRL has run sdhash version 3.3 (aka "similarity digest") against our unique file corpus. Default settings were used for small files, which were skipped, and for large files, which were processed in block mode.

One sdbf file was created for every 1,000 files in each corpus subdirectory.
Each zip file contains 500 sdbf files.
40 zip files are available for download, each one representing a span of 500,000 files in the unique file corpus.
(Each zipped collection contains fewer than 500,000 files, because small files were excluded.)
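
The packaging arithmetic above can be sketched in a few lines of Python. This is only an illustration: the constants come from the description, but the `locate` helper and the idea of a contiguous, zero-based corpus position are assumptions made for the example. The real sdbf files are grouped per corpus subdirectory, so actual boundaries may differ.

```python
# Constants taken from the description above.
FILES_PER_SDBF = 1_000   # files summarized by one sdbf file
SDBF_PER_ZIP = 500       # sdbf files packed into one zip file
ZIP_COUNT = 40           # zip files offered for download

# Each zip spans 1,000 * 500 = 500,000 corpus files.
FILES_PER_ZIP = FILES_PER_SDBF * SDBF_PER_ZIP
# All zips together span 500,000 * 40 = 20,000,000 corpus files.
TOTAL_SPAN = FILES_PER_ZIP * ZIP_COUNT


def locate(position):
    """Map a corpus position to (zip_index, sdbf_index), both zero-based.

    ASSUMPTION: 'position' is a hypothetical zero-based index into a
    contiguously numbered corpus; subdirectory boundaries are ignored.
    """
    if not 0 <= position < TOTAL_SPAN:
        raise ValueError("position is outside the span covered by the zips")
    zip_index, offset = divmod(position, FILES_PER_ZIP)
    sdbf_index = offset // FILES_PER_SDBF
    return zip_index, sdbf_index
```

Because small files were skipped, a given sdbf file may hold fewer than 1,000 digests even though it covers a span of 1,000 corpus files.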

The file names in the sdbf data can be cross-referenced with the corpus metadata.

Here is some example data.

The MD5, SHA1, SHA256 file signatures for the downloads are available here.
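
A downloaded zip file can be checked against the published signatures with Python's standard `hashlib` module. The sketch below is one minimal way to do that. The function names `file_digests` and `matches_published` are hypothetical, and the layout of the published signature list is not assumed, so the expected hex digests are passed in as a plain dictionary.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return the MD5, SHA1 and SHA256 hex digests of the file at 'path'.

    The file is read in chunks so that multi-gigabyte downloads do not
    need to fit in memory.
    """
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, published):
    """Compare computed digests with published ones.

    ASSUMPTION: 'published' is a dict such as {"sha256": "<hex digest>"},
    copied by hand from the signature list; keys must be "md5", "sha1"
    or "sha256".
    """
    actual = file_digests(path)
    return all(
        actual[name] == value.strip().lower()
        for name, value in published.items()
    )
```

A `False` result from `matches_published` suggests that the download is incomplete or corrupted and should be fetched again.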


Download sizes: 2.3G 3.3G 2.9G 1.9G 1.7G 2.0G 4.5G 4.7G 2.0G 1.9G 1.5G 1.1G 1.9G 2.1G 2.4G 1.9G 1.5G 1.2G 1.9G 1.7G 4.8G 1.3G 2.2G 4.6G 2.8G 1.3G 2.3G 1.6G 1.3G 2.9G 1.5G 1.4G 3.7G 2.3G 2.4G 2.8G 314M 36M 1.5G 1.8G 0.6G 1.1G 0.5G 1.1G 0.8G 1.4G



Created June 27, 2016, Updated November 15, 2019