NSRL sdhash 3.3 Data
NSRL has run sdhash version 3.3 (aka "similarity digest") against our unique file corpus. The sdhash default values were used throughout: small files were skipped, and large files were processed in block mode.
One sdbf file was created for every 1,000 files in each corpus subdirectory.
Each zip file contains 500 of these sdbf files.
46 zip files are listed below for download, each one representing a span of 500,000 files in the unique file corpus.
(Each zipped collection contains fewer than 500,000 files, because small files were excluded.)
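The packaging arithmetic above (1,000 files per sdbf file, 500 sdbf files per zip) can be sketched in a few lines of Python. The mapping from a corpus position to a zip name is an assumption inferred from the download names, not something this page states: the sketch assumes zero-based positions, with `NSRL_corp_NNN_0.zip` covering the first 500,000 files of the NNN-th million and `NSRL_corp_NNN_5.zip` covering the second. The function name `zip_for_position` is hypothetical.

```python
FILES_PER_SDBF = 1_000   # from the text: one sdbf file per 1,000 corpus files
SDBF_PER_ZIP = 500       # from the text: 500 sdbf files per zip file
FILES_PER_ZIP = FILES_PER_SDBF * SDBF_PER_ZIP  # 500,000 corpus files per zip


def zip_for_position(position: int) -> str:
    """Return the zip name assumed to cover a zero-based corpus position.

    ASSUMPTION: the NNN_0 / NNN_5 suffixes mark the first and second
    half-million of each million files. Verify against the real data.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    span = position // FILES_PER_ZIP      # index of the 500,000-file span
    million, half = divmod(span, 2)       # two spans per million files
    return f"NSRL_corp_{million:03d}_{5 if half else 0}.zip"


if __name__ == "__main__":
    for pos in (0, 499_999, 500_000, 1_000_000):
        print(pos, zip_for_position(pos))
```

Because small files were skipped, a position that maps to a given zip under this scheme may still have no digest inside it.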
The file names in the sdbf data can be cross-referenced with the corpus metadata.
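As a rough illustration of that cross-reference, the sketch below pulls the file name out of one line of sdbf data. It assumes the colon-delimited header layout `sdbf:03:<name length>:<name>:<file size>:...` (or `sdbf-dd:...` for block mode); check this against your own sdbf files before relying on it. The function name `sdbf_name` and the `metadata` dictionary are hypothetical, and the layout of the corpus metadata is not described on this page.

```python
def sdbf_name(line: str) -> str:
    """Extract the file name from one sdbf digest line.

    ASSUMPTION: the line starts with 'sdbf:03:<name length>:<name>:' or
    'sdbf-dd:03:<name length>:<name>:'. The length field is used so that
    names containing ':' are still recovered whole (ASCII names assumed).
    """
    magic, _version, name_len, rest = line.split(":", 3)
    if magic not in ("sdbf", "sdbf-dd"):
        raise ValueError(f"not an sdbf line: {magic!r}")
    return rest[: int(name_len)]


if __name__ == "__main__":
    # Synthetic line for illustration only; not real NSRL data.
    sample = "sdbf:03:7:a:b.txt:1234:sha1:256:5:7ff:160:1:100:AAAA"
    name = sdbf_name(sample)

    # Hypothetical metadata lookup keyed by file name.
    metadata = {"a:b.txt": {"note": "placeholder record"}}
    print(name, metadata.get(name))
```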
Here is some example data.
The MD5, SHA1, and SHA256 file signatures for the downloads are available here.
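One way to check a download against those signatures is to compute all three digests locally and compare them by hand. This is a minimal sketch using Python's standard `hashlib`; the function name `file_digests` is hypothetical, and the format of the published signature file is not assumed here.

```python
import hashlib
import sys


def file_digests(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute MD5, SHA1 and SHA256 of a file in a single pass.

    The file is read in chunks so multi-gigabyte zips are never held
    in memory all at once.
    """
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python verify.py NSRL_corp_000_0.zip
    for algorithm, digest in file_digests(sys.argv[1]).items():
        print(f"{algorithm}: {digest}")
```

Compare the printed values with the published ones case-insensitively, since hex digests may be listed in upper or lower case.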
Downloads:
2.3G NSRL_corp_000_0.zip
3.3G NSRL_corp_000_5.zip
2.9G NSRL_corp_001_0.zip
1.9G NSRL_corp_001_5.zip
1.7G NSRL_corp_002_0.zip
2.0G NSRL_corp_002_5.zip
4.5G NSRL_corp_003_0.zip
4.7G NSRL_corp_003_5.zip
2.0G NSRL_corp_004_0.zip
1.9G NSRL_corp_004_5.zip
1.5G NSRL_corp_005_0.zip
1.1G NSRL_corp_005_5.zip
1.9G NSRL_corp_006_0.zip
2.1G NSRL_corp_006_5.zip
2.4G NSRL_corp_007_0.zip
1.9G NSRL_corp_007_5.zip
1.5G NSRL_corp_008_0.zip
1.2G NSRL_corp_008_5.zip
1.9G NSRL_corp_009_0.zip
1.7G NSRL_corp_009_5.zip
4.8G NSRL_corp_010_0.zip
1.3G NSRL_corp_010_5.zip
2.2G NSRL_corp_011_0.zip
4.6G NSRL_corp_011_5.zip
2.8G NSRL_corp_012_0.zip
1.3G NSRL_corp_012_5.zip
2.3G NSRL_corp_013_0.zip
1.6G NSRL_corp_013_5.zip
1.3G NSRL_corp_014_0.zip
2.9G NSRL_corp_014_5.zip
1.5G NSRL_corp_015_0.zip
1.4G NSRL_corp_015_5.zip
3.7G NSRL_corp_016_0.zip
2.3G NSRL_corp_016_5.zip
2.4G NSRL_corp_017_0.zip
2.8G NSRL_corp_017_5.zip
314M NSRL_corp_018_0.zip
36M NSRL_corp_018_5.zip
1.5G NSRL_corp_019_0.zip
1.8G NSRL_corp_019_5.zip
0.6G NSRL_corp_020_0.zip
1.1G NSRL_corp_020_5.zip
0.5G NSRL_corp_021_0.zip
1.1G NSRL_corp_021_5.zip
0.8G NSRL_corp_022_0.zip
1.4G NSRL_corp_022_5.zip