The "correct" classifications, the plurality classifications for the
systems participating in the First Census OCR Systems Conference, and
a key to the writer who produced each image on NIST TEST DATA 1 are in
the file TEST1_AN.ZIP on this disk.  This file must be unzipped with
the PKUNZIP utility, which is in the DOS directory on TEST DATA 1, and
which must be run on a computer under the MS-DOS operating system.
This utility is licensed only for this purpose and may not be used
for any other purpose without registering the software.  Registration
instructions are displayed when PKUNZIP is run without any command
line arguments.

When PKUNZIP -D TEST1_AN.ZIP is run under MS-DOS, it will create
the tree structure

TEST1
|________________
|       |       |
DIGIT   LOWER   UPPER

Each subdirectory will contain the CLS files for the MIS files on
TESTDATA1, the HYP files for the plurality hypotheses (vote_p) for
the same MIS files, and the WRT files that identify the writer of
each character in each MIS file image.  The resulting tree contains
a little over 1.6 Megabytes of data.  This is only the minimum amount
of disk space required; the space actually used will be larger because
of allocation overhead, which can be surprisingly large (factors of
1.5 to 6, depending on hard disk organization).
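
The effect of allocation overhead can be estimated with a short
calculation.  The Python sketch below is illustrative only and is not
part of the distribution.  It assumes that the file system allocates
space in whole fixed-size units (clusters), so each file occupies its
size rounded up to the next multiple of the cluster size.  The function
name, the cluster size, and the sample file sizes are all hypothetical,
and directory entries are ignored.

```python
def allocated_bytes(file_sizes, cluster_size):
    """Estimate disk usage when every file occupies whole clusters.

    file_sizes   -- iterable of file sizes in bytes
    cluster_size -- allocation unit in bytes (an assumption; it depends
                    on how the hard disk is organized)
    """
    total = 0
    for size in file_sizes:
        # Round up to a whole number of clusters using integer arithmetic.
        clusters = -(-size // cluster_size)
        total += clusters * cluster_size
    return total


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the rounding effect.
    sizes = [300, 1200, 2500]
    for cluster in (512, 2048, 8192):
        used = allocated_bytes(sizes, cluster)
        print(cluster, sum(sizes), used, round(used / sum(sizes), 2))
```

Many small files combined with a large allocation unit give the largest
overhead factors, which is why the space needed depends so strongly on
hard disk organization.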

The CLS and HYP files are formatted as described in the DOC directory
on TESTDATA1, and contain the type of information described therein.

The WRT files are in the MFS format, which is also described in the DOC
directory on TESTDATA1.  These files provide an index that identifies
the form that was the source of each segmented character on TESTDATA1.
Each form was filled out by a different writer, so this index places
each of the 500 writers in a separate category.  Each line after the
first in each WRT file has an entry such as:

       f0023_63

The 0023 identifies the writer (from writer 0000 through writer 0499)
who printed the character on the corresponding line in the MIS file.
The 63 indicates the variant of the form that was actually filled out
by writer 0023.
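
As an illustration, a WRT entry can be split into its writer number and
form variant with a few lines of Python.  This is a minimal sketch and
not part of the distribution.  It assumes that every entry matches the
pattern shown above (the letter f, four digits, an underscore, and a
numeric variant), and it skips the first line of the file without
interpreting it.  The function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed entry layout, taken from the example f0023_63:
# 'f', a four-digit writer number, '_', then the form variant.
ENTRY = re.compile(r"f(\d{4})_(\d+)")


def parse_entry(text):
    """Return (writer_number, form_variant) for one WRT entry."""
    match = ENTRY.search(text)
    if match is None:
        raise ValueError("not a recognizable WRT entry: %r" % text)
    return int(match.group(1)), match.group(2)


def group_by_writer(lines):
    """Map each writer number to the positions of that writer's characters.

    lines -- all lines of one WRT file; the first line is skipped,
             since entries begin on the line after it.
    Positions are counted from 0, starting at the first entry.
    """
    groups = defaultdict(list)
    for position, line in enumerate(lines[1:]):
        writer, _variant = parse_entry(line)
        groups[writer].append(position)
    return dict(groups)


if __name__ == "__main__":
    print(parse_entry("f0023_63"))  # (23, '63')
```

Grouping character positions by writer in this way makes it possible to
score or partition the characters in an MIS file one writer at a time.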
