ENCLOSURE 2: EXAMPLES OF FILE CONTENTS AND STRUCTURES 

A sample of the type of images that will be used in the testing phase of
the Conference can be obtained by anonymous ftp from

sequoyah.ncsl.nist.gov, IP 129.6.61.25. 

The images are in the files in the following directory structure: 

ind_occ
|
-----------------------------
|      |       |      |     |
data   dicts   docs   man   src
|
---------
|       |
d00 ... dYY
|
----------------
|              |
d00f00.mis ... d00f99.mis
d00f00.ref ... d00f99.ref

The directory ind_occ exactly mirrors the layout of the CD-ROM discs,
except that there may be more subdirectories dYY. The data
directory will contain the image (MIS) files and the reference (REF)
files. The dicts directory will contain the dictionaries that are
discussed later. The src directory will contain all the source code needed
to read the MIS files. This source code has been written and compiled on a
SUN workstation using SUN OS 4.1.1 and works on that platform. We cannot
guarantee that this code will work on any other platform or operating
system. The man directory contains the manual pages for the programs and
routines supplied in the src directory. 
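
The naming pattern in the tree above can be sketched in a few lines of
Python. This is only an illustration, not part of the supplied src code:
the function name sample_file_paths and its parameters are hypothetical,
and the count of 100 files per subdirectory is read off the f00 ... f99
range in the diagram. The number of dYY subdirectories is not fixed by
this enclosure, so it is left as a parameter.

```python
from pathlib import PurePosixPath


def sample_file_paths(root="ind_occ", num_dirs=1, files_per_dir=100):
    """Yield (mis_path, ref_path) pairs that follow the dNN/dNNfMM naming.

    num_dirs and files_per_dir are assumptions taken from the diagram;
    the real discs may contain a different number of dYY subdirectories.
    """
    for d in range(num_dirs):
        dir_name = f"d{d:02d}"
        base = PurePosixPath(root) / "data" / dir_name
        for f in range(files_per_dir):
            stem = f"{dir_name}f{f:02d}"
            yield base / f"{stem}.mis", base / f"{stem}.ref"


# Example: the first and last MIS files expected in d00.
pairs = list(sample_file_paths())
print(pairs[0][0])   # ind_occ/data/d00/d00f00.mis
print(pairs[-1][0])  # ind_occ/data/d00/d00f99.mis
```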

At the time of this mailing, d00f00.ref is the only reference file
available in the sample directory on the ftp site. Reference files will
be supplied with the training MIS files.

The five sample images shown in Enclosure 1 are in data/d00/d00f00.mis.

Examples of the MIS file and related hypothesis subdirectory structures
for the test CD-ROM and test results are shown below: 

ind_occ
|
-----------------------------
|      |       |      |     |
data   dicts   docs   man   src		SYSTEM_NAME
|					|
---------				--------------
|       |				|
d00 ... dXX				d00 ... dXX
|					|
----------------			------------------
|              |			|
d00f00.mis ... d00f99.mis		d00f00.hyp ... d00f99.hyp
					d00f00.con ... d00f99.con
					d00f00.rj0 ... d00f99.rj0
					 ...       ... ...
					d00f00.rj9 ... d00f99.rj9
					
XX is a two-digit number that may be different for the training and test
data than for the sample data, as was mentioned above in connection with
the sample data at the anonymous ftp site. 
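
As a rough sketch of how the result file names line up with an MIS file,
the following Python derives the names from the MIS path. The function
name result_paths is hypothetical, SYSTEM_NAME is kept as a placeholder,
and the sketch assumes that the "..." between rj0 and rj9 in the diagram
stands for rj1 through rj8.

```python
from pathlib import PurePosixPath

# Extensions as listed in the diagram; rj1..rj8 are an assumed expansion
# of the "..." shown between rj0 and rj9.
RESULT_EXTENSIONS = ["hyp", "con"] + [f"rj{i}" for i in range(10)]


def result_paths(mis_path, system_name="SYSTEM_NAME"):
    """Return the result file paths that correspond to one MIS file."""
    mis = PurePosixPath(mis_path)
    stem = mis.stem            # e.g. "d00f00"
    subdir = mis.parent.name   # e.g. "d00"
    return [
        PurePosixPath(system_name) / subdir / f"{stem}.{ext}"
        for ext in RESULT_EXTENSIONS
    ]


for path in result_paths("ind_occ/data/d00/d00f00.mis"):
    print(path)
```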

The contents of d00f00.hyp (on left) and d00f00.ref (on right) are:

r00_f00 MANAGER			r00_f00 MANAGER
r00_f01 MANAGER           	r00_f01 MANAGER
r00_f02 MHAGER			r00_f02 MANAGER
r01_f00 CDNSTRUCTION		r01_f00 CONSTRUCTION
r01_f01 HHIHHIIH		r01_f01 ALL MASONRY WORK AND EQUIPMENT OPER
r01_f02 HIIHIRIII		r01_f02 LAYING BLOCK FOUNDATIONS POURING CO
r02_f00 INSWRANCE		r02_f00 INSURANCE
r02_f01 FINANUAL ANALYST	r02_f01 FINANCIAL ANALYST
r02_f02 PREMRIW REPDRD		r02_f02 PREPARING REPORTS
r03_f00 BLANK			r03_f00 BLANK
r03_f01 PHH			r03_f01 PERSONNEL RECEPTIONIST
r03_f02 THIH			r03_f02 TYPING FILING
r04_f00 AERO SPME		r04_f00 AERO SPACE
r04_f01 MANSGEC			r04_f01 MANAGER
r04_f02 MAMALEWG		r04_f02 MANAGERUG

The letters H and I above have no significance; they merely represent
what some imaginary system might produce as classifications when its
hypothesized character segments are not isolated characters.
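
The line format shown above (a field identifier such as r00_f02, then
the field text) can be read with a short Python sketch like the one
below. The function names are hypothetical, the separator is assumed to
be whitespace, and the exact-match comparison is only an illustration; it
is not a statement of how the Conference scores results.

```python
def parse_fields(text):
    """Map each field identifier (e.g. "r00_f02") to its field text."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace: identifier, then text.
        parts = line.split(None, 1)
        fields[parts[0]] = parts[1] if len(parts) > 1 else ""
    return fields


def field_matches(hyp_text, ref_text):
    """For every reference field, report whether the hypothesis is identical."""
    hyp = parse_fields(hyp_text)
    ref = parse_fields(ref_text)
    return {field_id: hyp.get(field_id) == value
            for field_id, value in ref.items()}


# Three lines taken from the d00f00 example above.
hyp = "r00_f00 MANAGER\nr00_f02 MHAGER\nr02_f01 FINANUAL ANALYST\n"
ref = "r00_f00 MANAGER\nr00_f02 MANAGER\nr02_f01 FINANCIAL ANALYST\n"
print(field_matches(hyp, ref))
# {'r00_f00': True, 'r00_f02': False, 'r02_f01': False}
```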





