A corpus of computer programs with known bugs is useful for determining how well tools find bugs. This article describes the content of NIST's Software Assurance Reference Dataset (SARD), a publicly available collection of thousands of programs with known weaknesses. SARD has programs in C, C++, Java, PHP, and C# covering over 150 classes of weaknesses. Most of the test cases are synthetic programs of a page or two of code, but there are over 7,000 full-size applications, mostly derived from a dozen base applications. The collection also includes buggy code used in Static Analysis Tool Expositions (SATE). Although not every bug is indicated in every program, the vast majority of weaknesses are noted in files that can be automatically processed. Many test cases are grouped into suites, such as CAS Juliet, IARPA STONESOUP, and Kratkiewicz's buffer overflow suite. Test cases and suites came from many software developers, tool developers, and academic researchers. Users can search for test cases by language, weakness type, and several other criteria, and can then browse, select, and download them. Analysts can cut months off the time needed to evaluate a tool or technique by using test cases from SARD.
Citation: Journal of Cyber Security and Information Systems
Pub Type: Journals
Keywords: software assurance, static analysis, programming language test material, software quality, cybersecurity