
							Jon Geist
							225/B063/NIST
							G'burg MD 20832 
							June 23, 1993
To whom it may concern: 

Subject: 2nd Census OCR Systems Conference

In May of 1992, the U.S. Bureau of the Census and the National Institute
of Standards and Technology (NIST) held a conference on optical character
recognition (OCR) of hand-printed characters. Various tests focused on the
recognition of individual characters, and the results were encouraging.
However, most participants agreed that larger samples of handwriting were
needed to tackle the more realistic problem of processing images of forms
that contain unconstrained hand print. 

Census and NIST are now preparing for a 2nd OCR Systems Conference to
further advance this research. They are constructing a database that will
contain images of handwriting from the 1990 Census forms; ASCII text
answers corresponding to each image; dictionaries of common words and
phrases found in the answers; and some generic image processing software.
Examples may be obtained from the ftp server at NIST (see enclosures for
details); larger samples for training and testing will be distributed on a
series of CD-ROMs. 

The test will measure the ability of OCR systems to perform in a "worst-case
scenario." The images are being digitized from microfilm and may be of
lower quality than images created from the original paper questionnaires.
Also, the 1990 Census questionnaires were designed for key entry data
capture of handwriting, and therefore do not contain any design features
that might facilitate machine recognition. Other tests using smaller
samples of images lifted from the original paper will help to gauge the
effect of image quality on OCR performance. 

This conference is being organized by the following Committee:
Bob Hammond, Robert Creecy, Norman W. Larsen, Randy M. Klear, and Mark J.
Matsko, US Bureau of the Census; Charles L. Wilson, Jon Geist, and
R. Allen Wilkinson, National Institute of Standards and Technology; 
Jonathan J. Hull, Center of Excellence for Document Analysis and 
Recognition; Thomas P. Vogl, Environmental Research Institute of 
Michigan; and Christopher J. C. Burges, AT&T Bell Laboratories.

The Committee is chaired by Jon Geist, and the Conference and related
activities will be run by NIST for the Committee, with R. Allen
Wilkinson serving as the technical liaison for the Conference.

The approximate schedule for the research and Conference follows:
Sample data on ftp server                   late June 1993
1st training data CD-ROM                    early August 1993
2nd training data CD-ROM                    early September 1993
Test data CD-ROM                            November 1993
Test results due from participants          November 1993
Conference to announce/discuss results      February 1994
Publish report                              June 1994

Seven enclosures provide more information about this research and how
to participate in the conference; ENCLOSURE 0 provides an overview. 

Sincerely, 

Jon Geist



