The "correct" classifications, the plurality classifications for the systems participating in the First Census OCR Systems Conference, and a key to the writer who produced each image on NIST TEST DATA 1 are in the file TEST1_AN.ZIP on this disk.

This file must be unzipped with the PKUNZIP utility, which is in the DOS directory on TEST DATA 1 and must be run on a computer under the MS-DOS operating system. The utility is licensed only for this purpose and may not be used for any other purpose without registering the software, as described when PKUNZIP is run without any command line arguments.

When PKUNZIP -D TEST1_AN.ZIP is run under MS-DOS, it will create the tree structure

    TEST1
    |________________
    |       |       |
    DIGIT   LOWER   UPPER

Each subdirectory will contain the CLS files for the MIS files on TEST DATA 1, the HYP files for the plurality hypotheses (vote_p) for the same MIS files, and the WRT files that identify the writer of each character in each MIS file image.

The resulting tree consists of a little over 1.6 megabytes. This is the minimum amount of disk space required; the actual amount will be larger because of allocation overhead, which can be surprisingly large (factors of 1.5 to 6, depending on hard disk organization).

The CLS and HYP files are formatted as described in the DOC directory on TEST DATA 1, and contain the type of information described therein.

The WRT files are in the MFS format, which is also described in the DOC directory on TEST DATA 1. These files provide an index that identifies the form that was the source of each segmented character on TEST DATA 1. Each form was filled out by a different writer, so this index separates all 500 writers into separate categories. Each line after the first in each WRT file has an entry such as:

    f0023_63

The 0023 identifies the writer (from writer 0000 through writer 0499) who printed the character on the same line in the MIS file. The 63 indicates the variant of the form that was actually filled out by writer 0023.
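
As an illustration of the WRT entry format described above, here is a minimal Python sketch that splits an entry such as f0023_63 into its writer number and form variant, and groups character positions by writer. It is not part of the distribution. The function names are hypothetical, and the exact layout (the letter "f", four digits, an underscore, then a numeric variant, with one entry per line after a first header line) is an assumption inferred from the single example above; the MFS description in the DOC directory is the authority.

```python
import re

# ASSUMPTION: entry layout inferred from the one example "f0023_63":
# "f", a four-digit writer number, "_", and a numeric form variant.
WRT_ENTRY = re.compile(r"^f(\d{4})_(\d+)$")


def parse_wrt_entry(entry):
    """Return (writer, variant) as integers for one WRT entry."""
    match = WRT_ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized WRT entry: {entry!r}")
    writer = int(match.group(1))
    variant = int(match.group(2))
    # Writers are numbered 0000 through 0499.
    if not 0 <= writer <= 499:
        raise ValueError(f"writer number out of range: {writer}")
    return writer, variant


def group_by_writer(lines):
    """Map each writer number to the 0-based positions of its characters.

    `lines` holds every line of a WRT file. The first line is skipped,
    since entries start on the line after it (ASSUMPTION: one entry per line).
    """
    groups = {}
    for position, line in enumerate(lines[1:]):
        writer, _variant = parse_wrt_entry(line)
        groups.setdefault(writer, []).append(position)
    return groups
```

Because each entry corresponds to the character on the same line of the MIS file, the positions returned by group_by_writer could be used to select all characters printed by one writer, for example to keep a writer's samples together when splitting the data.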