TDT2eval Revision History
Date: April 31, 1998


The following is the revision history for the TDT2eval package.

Version 0.1, Released April 30, 1998

Initial release of version 0.1 of TDT2seg.pl, TDT2trk.pl, TDT2det.pl and TDT2.pm.

Version 0.2, Released July 30, 1998

TDT2seg.pl V0.2
  • Modified the segmentation scoring to be MUCH faster. Removed the old 'delta' based code in favor of the new. Runtimes went from:
    Source   v0.1 (sec)   v0.2 (sec)   Rel. Improvement
    nw              840           67                92%
    bn              166           37                78%
    bsr             164           37                77%
  • During the above re-code, a bug was found in the previous delta-based scorer. The bug caused the final evaluation frame not to be checked. It had a negligible effect on scores (something like 0.000001 for P(miss) and/or P(fa)).
  • Non-story regions are now disregarded during scoring.
  • The evaluation frame size is now a command-line argument.
  • Blank lines after the initial comment lines in the system output file are treated as comments.
  • The index file format was changed so that file names explicitly indicate the directory and extension names.
  • Added computation of Cseg measure.
  • Examples updated.
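The "Rel. Improvement" column in the runtime table above appears to be the fractional reduction in runtime, (old - new) / old. A minimal Python sketch of that arithmetic, assuming this definition (the function name is illustrative and is not part of TDT2eval):

```python
def relative_improvement(old_sec, new_sec):
    """Runtime reduction as a percentage of the old runtime."""
    if old_sec <= 0:
        raise ValueError("old runtime must be positive")
    return 100.0 * (old_sec - new_sec) / old_sec


# Runtimes in seconds, copied from the v0.1 / v0.2 table above.
runtimes = {"nw": (840, 67), "bn": (166, 37), "bsr": (164, 37)}

# Round to whole percentages, as the table does.
improvements = {
    src: round(relative_improvement(old, new))
    for src, (old, new) in runtimes.items()
}
```

Under that assumption the rounded values match the percentages reported in the table.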
TDT2trk.pl V0.2
  • Added DET plot outputs, options -d, -a, -e, -f
  • Modified nomenclature to story rather than document
  • Reversed report output so that important numbers are at top
  • Eval function that maps system output to reference topics now uses point markers that mark the beginning of a score/decision point which is continued to the next marker. The default mapping function is the average over words.
  • Small speed up in the evaluation function. Runtimes on error example set went from 76 sec. to 67 sec. A small gain, but runtime will increase linearly with the number of system output points.
  • Updated the manual page and reports to refer to story, not document.
  • Extensively modified the tracking index file. Changes include: a list of discriminate training stories in the training epoch was added, a start recid was added to the source file record, and the full pathname within the TDT2 corpus is now specified for the testing source file.
  • Modified the scorer to exclude partial source files from scoring based on the start record number in the index file
  • Modified the program to handle the new index file format.
  • Blank lines after the initial comment lines in the system output file are treated as comments.
  • Updated examples.
  • Corrected a problem where brief documents were excluded from all topic scorings rather than only from the topic for which they were judged brief.
  • Added computation of Ctrack measure.
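The point-marker mapping described above can be read as a step function: each marker sets a score that stays in effect until the next marker, and a story's score is the average of that step function over the story's words. The Python sketch below illustrates that reading only. The data layout (word offsets, half-open story spans) and the name story_score are assumptions made for illustration and are not taken from TDT2trk.pl.

```python
def story_score(markers, begin, end):
    """Average a piecewise-constant score over the words in [begin, end).

    markers: list of (start_word, score) pairs sorted by start_word; each
             score holds from its start_word up to the next marker.
    begin, end: hypothetical word offsets of the reference story.
    """
    if end <= begin:
        raise ValueError("story must contain at least one word")
    if not markers or markers[0][0] > begin:
        raise ValueError("no marker is in effect at the start of the story")

    total = 0.0
    for i, (start, score) in enumerate(markers):
        # The segment runs to the next marker, or to infinity for the last one.
        seg_end = markers[i + 1][0] if i + 1 < len(markers) else float("inf")
        lo, hi = max(start, begin), min(seg_end, end)
        if hi > lo:
            total += score * (hi - lo)  # weight by the number of words covered
    return total / (end - begin)
```

For example, with markers [(0, 0.2), (10, 0.8)] a story covering words 5 through 14 has five words scored 0.2 and five scored 0.8, giving an average of 0.5.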
TDT2det.pl V0.2
  • Blank lines after the initial comment lines in the system output file are treated as comments.
  • Stories marked BRIEF, or stories marked YES for multiple topics, are excluded from scoring.
  • Correct percent false alarm to be (#fa)/(Nstory-(#stories on topic given the topic)).
  • Eval function that maps system output to reference topics now uses point markers that mark the beginning of a score/decision point which is continued to the next marker. The default mapping function is the average over words.
  • Ref to hyp cluster mapping function optimized by a cost function, command line configurable.
  • Modified nomenclature to story rather than document.
  • Allow reference topic cluster to map to NULL Hypothesis cluster.
  • Added a DET plot capability
  • Updated examples.
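The corrected false alarm ratio above, (#fa)/(Nstory-(#stories on topic given the topic)), can be sketched in Python as follows. The function and argument names are illustrative assumptions; the value is returned as a ratio and can be multiplied by 100 to report a percentage.

```python
def false_alarm_rate(n_fa, n_story, n_on_topic):
    """False alarms divided by the number of off-topic stories.

    n_fa:       number of false alarms for the topic
    n_story:    total number of scored stories
    n_on_topic: number of stories that are on topic for this topic
    """
    n_off_topic = n_story - n_on_topic
    if n_off_topic <= 0:
        raise ValueError("there must be at least one off-topic story")
    return n_fa / n_off_topic
```

The denominator counts only stories that could produce a false alarm, which is why the on-topic stories are subtracted from Nstory.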
TDT2.pm V0.2
  • Added Min/Max and Find_system_score_for_doc() functions.
  • Added DET plotting functions, ppndf(), write_gnuplot_DET_header(), write_tics(), and Compute_DET_points().
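DET plots conventionally place miss and false alarm probabilities on normal-deviate (probit) axes. The Python sketch below assumes that ppndf() plays this usual role; the Perl implementation in TDT2.pm is not shown here and may use a different approximation. The clipping bound and the det_points helper are hypothetical.

```python
from statistics import NormalDist

_EPS = 1e-12  # hypothetical clipping bound; keeps inputs strictly inside (0, 1)


def ppndf(p):
    """Map a probability to its standard normal deviate (probit)."""
    p = min(max(p, _EPS), 1.0 - _EPS)
    return NormalDist().inv_cdf(p)


def det_points(pairs):
    """Convert (p_fa, p_miss) pairs to normal-deviate plot coordinates."""
    return [(ppndf(p_fa), ppndf(p_miss)) for p_fa, p_miss in pairs]
```

On these axes a probability of 0.5 maps to 0, and system curves with normally distributed scores plot as approximately straight lines.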
TDT2BuildIndex.pl V0.1
  • This is a new script that generates evaluation index files from a file list.

Version 0.3, Released August 14, 1998

TDT2seg.pl
  • Verified hyp inputs are sorted in ascending order by the pointer.
  • Modified the default P(seg) to be 0.3
  • Re-worked the scoring report to include calculations for each source file type.
  • Added a check of TDT2.pm to make sure the right library is in use.
TDT2trk.pl
  • Corrected initializations of hash tables to '= {}'
  • Added a default topic decision to the beginning of each hyp file so that all reference stories will get mapped to "some" decision.
  • Set Cmiss default value to 1.0
  • Verified hyp inputs are sorted in ascending order by the pointer.
  • Modified the default P(topic) to be 0.02
  • Only NEWS stories are scored in the evaluation; all others are excluded.
  • Added a check of TDT2.pm to make sure the right library is in use.
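The defaults listed above (Cmiss = 1.0, P(topic) = 0.02) feed a cost measure. The Python sketch below assumes the usual TDT detection cost form, Cmiss * P(miss) * P(topic) + Cfa * P(fa) * (1 - P(topic)), which this history does not spell out. The Cfa default of 1.0 and the function name are also assumptions.

```python
def detection_cost(p_miss, p_fa, p_topic=0.02, c_miss=1.0, c_fa=1.0):
    """Expected cost of misses and false alarms given a topic prior.

    p_topic and c_miss defaults come from the entries above;
    the c_fa default is an assumption made for this sketch.
    """
    return c_miss * p_miss * p_topic + c_fa * p_fa * (1.0 - p_topic)
```

With a prior as small as 0.02, false alarms dominate the cost unless the miss rate is much larger than the false alarm rate.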
TDT2det.pl
  • Corrected initializations of hash tables to '= {}'
  • Added a default topic decision to the beginning of each hyp file so that all reference stories will get mapped to "some" decision.
  • Set Cmiss default value to 1.0
  • Verified hyp inputs are sorted in ascending order by the pointer.
  • Modified the default P(topic) to be 0.02
  • Only NEWS stories are scored in the evaluation; all others are excluded.
  • When mapping ref to hyp document clusters, the NULL topic cluster is the default match. The minimum score then is at least as small as the score of the NULL cluster.
  • Added a check of TDT2.pm to make sure the right library is in use.
TDT2.pm
  • Corrected initializations of hash tables to '= {}'
  • Corrected a problem in Find_system_score_for_doc(). If there were no decisions for a final reference story, the final hyp boundary was not in effect.
  • Corrected the load database function to properly erase TDTref so that more than one tracking index file could be created.
TDT2BuildIndex.pl
  • Use only NEWS stories in both the Training stories and NON-target training stories.
  • Corrected the load database function to properly erase TDTref so that more than one tracking index file could be created.
  • Modified the procedures for making tracking index files. A number of bugs were fixed, including: 1) the residual text from the final training files was omitted from test if the file was ABC; 2) the detection index is now used to search for training stories, so that the file lists for detection and tracking match; 3) XXXXXXX null length asrv news stories not scored.

Version 0.4, Released October 7, 1998

TDT2seg.pl
  • Modified the code to handle boundary records without recids
  • Added the -s switch. (uses all available speedups)
  • Replaced the -d option (dump the internal database and then exit) with -L
  • Added the -r option to write the summary report to a file
  • Added the -D option to write out detailed diagnostics
  • Added the ability to make DET dot plots of source show scores and of the overall scores via the -d and -t options.
TDT2trk.pl
  • Modified the code to handle boundary records without recids
  • Added P: and C: to the list of command-line options; they were documented but had been omitted
  • Added the -s switch. (uses all available speedups)
  • Added the -r option to write the summary report to a file
  • Added the -D option to write out detailed diagnostics
TDT2det.pl
  • Modified the code to handle boundary records without recids
  • Corrected the case of the option 'p:' to be upper case as documented.
  • Added the -s switch. (uses all available speedups)
  • Added the -r option to write the summary report to a file
  • Added the -D option to write out detailed diagnostics
  • Made the DET plots produce a topic score cloud and topic weighted score point.
  • Added the -T switch to select a set of topics to score over.
  • The -I option was replaced with the -i option. '-I' conflicts with the perl option to add a directory to the package search path.
TDT2.pm
  • Corrected subroutine Load_Boundaries_Into_TDTRef();
  • Changed Compute_DET_points() to output probability data points
TDT2BuildIndex.pl
  • Corrected a problem with tracking indexes: when the final training story is from a text source with either ccap or fdch transcripts, the rest of the file was not included in the test.
  • Added the -s switch. (uses all available speedups)
  • Added the -T switch to select a set of topics to build tracking indexes for.

Version 0.5, Released October 20, 1998

TDT2det.pl
  • Fixed a problem where the regular expression that restricts evaluated topics did not work if the regexp was not specified.

Version 0.6, Released December 3, 1998

TDT2seg.pl
  • Corrected a bug that had the effect of not producing identical denominators across systems. The bug was activated by adjacent Non-NEWS stories that were incorrectly scored. Basically, there were Nframes evaluated around an internal boundary.
TDT2det.pl
  • The fix on the topic regular expression didn't work properly if there was no topic regular expression used. It ran, but it included a 'n/a' topic cluster.
  • Put in an assumption check to make sure there is at least one reference and one hypothesized story cluster in the evaluation data.
TDT2trk.pl
  • Resolved an option clash between the speedup option -s and the option to skip system output files not in the index file.
TDT2BuildIndex.pl
  • Fixed a problem in seg_asr.ndx file generation. For some reason, the variable $src was changed to $source in the function to generate the file.
  • Fixed a problem in tracking non-target generation: non-training documents with multiple topic annotations were in the non-target training list twice.
  • Fixed a problem in the tracking index files: if the document was on-topic for the test topic and for another topic, the document occurred as both target and non-target training.