File:  aldistsm-1.0/README.txt
Date:  6/08/98

This directory contains the source code, documentation, and
example/test data for my software to do split/merge alignment
of two strings of words (or phones).

The main program, "tald3e_sm_export", reads in pairs of
strings to be aligned, marked as "REF:" and "HYP:", from
a file and writes out the aligned strings into another file.
All of the functions and data structures are made available here
so that you can package it differently if you want to.
The code was written in ANSI C on a SUN Workstation.
The installation procedure is basic: just execute

   gcc -o tald3e_sm_export tald3e_sm_export.c
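
If you do want to package the input handling differently, the
"REF:"/"HYP:" marking described above could be recognized roughly
as in the sketch below.  This is only an illustration, not code from
tald3e_sm_export.c: the names line_kind and classify_line are
hypothetical, and it assumes each string sits on its own line and
begins with its marker, which the real input format may not require.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of the distributed source. */
enum line_kind { LINE_OTHER, LINE_REF, LINE_HYP };

/* Classify one input line by its leading marker.  For REF/HYP lines,
   *text is set to the first non-blank character after the marker;
   for any other line, *text is left unchanged. */
static enum line_kind classify_line(const char *line, const char **text)
{
    enum line_kind kind = LINE_OTHER;
    const char *p = line;

    /* Assumption: leading white space before the marker is allowed. */
    while (isspace((unsigned char)*p)) p++;

    if (strncmp(p, "REF:", 4) == 0) kind = LINE_REF;
    else if (strncmp(p, "HYP:", 4) == 0) kind = LINE_HYP;

    if (kind != LINE_OTHER) {
        p += 4;                              /* skip the marker itself */
        while (isspace((unsigned char)*p)) p++;
        *text = p;
    }
    return kind;
}
```

A caller would read lines with fgets(), pass each one to
classify_line(), and pair every REF string with the HYP string that
follows it.  Note that any trailing newline is left in the text.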

To test the program, execute the batch procedure "test1.bat".
(With the test file supplied, you will get a warning message
about "the specification of a lower pcodeset" being overridden,
which you can ignore.)  When it has run successfully, look at
the output report file "test1.rpt" to check parameter values
and the split/merges found, and compare the output alignment
file "x_sm.aln" with the version produced here at
NIST ("x_sm_NIST.aln") to see if identical results were
produced.  There should also be no substantial differences
between the output file "test1.rpt" produced by your run and
the matching file "test1_NIST.rpt" produced here.

New words found in the input alignment but not in the
word-level pcode file "gpengw1.pcd" will get a pronunciation
produced by a default text-to-phone function.  These new
word prons will be written out in a file named "*.ADD",
and although it's not necessary, you can correct them
and merge them back into your version of "gpengw1.pcd"
if you want a little more accuracy the next time you run.

The 5th command-line parameter is a figure-of-merit
threshold used by the split/merge detecting logic
that you may want to play with.  For each candidate
split or merge, a figure of merit is calculated that
is proportional to the likelihood that the split/merge
is genuine; if this is greater than the threshold,
the split/merge candidate is accepted.  In previous
work (first citation below), a value of 1.7 was found
to be optimal.
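
The acceptance rule just described amounts to a single comparison.
Here is a minimal sketch of it in C; the names accept_sm_candidate
and DEFAULT_FOM_THRESHOLD are made up for this illustration and are
not taken from tald3e_sm_export.c, and the figure-of-merit
calculation itself is not shown.

```c
/* Hypothetical constant: the value reported as optimal in the
   first citation below.  The real program takes the threshold
   from its 5th command-line parameter. */
#define DEFAULT_FOM_THRESHOLD 1.7

/* Return 1 if a split/merge candidate should be accepted, else 0.
   The candidate is accepted only when its figure of merit is
   strictly greater than the threshold. */
static int accept_sm_candidate(double figure_of_merit, double threshold)
{
    return figure_of_merit > threshold;
}
```

Raising the threshold makes the program accept fewer split/merges;
lowering it makes the program accept more.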

For a few more details, see:

"Further Studies in Phonological Scoring", by W. M. Fisher,
 J. Fiscus, A. Martin, D. S. Pallett, and M. A. Przybocki,
 Proceedings of the Spoken Language Systems Technology Workshop,
 January 22-25, 1995, Austin, TX (sponsored by ARPA), Morgan
 Kaufmann Publishers, San Francisco, ISBN 1-55860-374-3, pp. 181-186.
 
"Better Alignment Procedures for Speech Recognition Evaluation",
 by W. M. Fisher and J. G. Fiscus, IEEE International Conference
 on Acoustics, Speech, and Signal Processing 1993, pp. II-59 - II-62.

| Version 1.0, released 6/08/98, incorporates bug fixes, some cosmetic
| changes, command-line parameter changes, and a greatly improved default
| TTP function that's used to get the prons for words not in the dictionary.
| It uses a file of TTP rules and decides to resort to "spell-mode"
| pronunciation (as in "FBI") iff the pronunciation produced by the general
| rules is unpronounceable (because it can't be syllabified). 

 Version 0.1, released 12/05/97, includes a fix for one bug that
 was triggered when two candidate s/m's were close enough to
 conflict, and fixes for some memory allocation and initialization
 problems.
 In addition, the alignment test case that triggered this bug
 has been added to the proof test set.

 - Bill Fisher / NIST
	