| TDT3.pm |
- Added "use strict" to improve error checking.
- Converted all glob aliases to references.
- Added code to the topic relevance table loader to ensure there is only one entry for each topic/document annotation. Version 0.6 did not perform this check, and as a consequence some documents were improperly ignored during scoring. This slightly changes the detection performance.
- Corrected problems with the automatic boundary scoring procedures. The function that maps story decisions onto the reference story segmentation made mistakes in two cases:
  - If the next-to-last hypothesis story occurred before the last reference story boundary, its mapping was incorrect.
  - For a reference story spanning multiple hypothesis story units (at least three stories), the mapped hypothesis cluster was always the last story within the reference story boundaries.
- Changed the tracking scoring structure to be more memory efficient.
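
The one-entry-per-annotation check described for the relevance table loader can be pictured with a minimal sketch. It is written in Python for illustration and is not the Perl code in TDT3.pm. The `(topic_id, doc_id, judgment)` row layout, the function name, the sample IDs and the choice to raise an error on a duplicate are all assumptions; the notes above do not say how the loader reacts when it finds a second entry.

```python
def load_relevance_table(rows):
    """Build a {(topic_id, doc_id): judgment} table with one entry per pair.

    `rows` is an iterable of (topic_id, doc_id, judgment) tuples. This layout
    is hypothetical; it is not the file format read by TDT3.pm.
    """
    table = {}
    for topic_id, doc_id, judgment in rows:
        key = (topic_id, doc_id)
        if key in table:
            # Assumed policy: refuse a second annotation for the same pair.
            raise ValueError(
                f"duplicate annotation for topic {topic_id}, document {doc_id}"
            )
        table[key] = judgment
    return table


# Hypothetical sample annotations.
rows = [
    ("topic_1", "doc_a", "YES"),
    ("topic_1", "doc_b", "BRIEF"),
    ("topic_2", "doc_a", "YES"),
]
table = load_relevance_table(rows)
```

Keying the table on the (topic, document) pair makes the duplicate test a single dictionary lookup, so the check adds little cost to loading.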
|
| TDT3seg.pl |
- Added "use strict" to improve error checking.
- Converted all glob aliases to references.
- Relaxed the checking of system output deferral times. Warnings are printed rather than fatal errors.
- Modified the accepted index file format. There are additional fields that specify the source file and language conditions of the test.
- Modified the default evaluation frame size to be 75 if and only if the evaluation source language is Mandarin and the POINTER type is a recid.
- For the evaluation of Mandarin ASR, the RECIDs are in terms of words, but the evaluation frame size for Mandarin is in terms of characters. Therefore, the program converts the word-based RECIDs into character-based RECIDs by reading the tokenized text file.
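
The word-to-character RECID conversion can be sketched as a running character count over the tokenized words. This is a Python illustration under stated assumptions, not the code in TDT3seg.pl: it assumes RECIDs are 1-based, that a word's character-based RECID is the position of its first character, and that the tokenized text has already been read into a list of words. The function name and sample tokens are hypothetical.

```python
def word_to_char_recids(tokens):
    """Map each 1-based word RECID to the 1-based index of its first character.

    `tokens` is the list of words from the tokenized text, in order.
    The 1-based, first-character convention is an assumption.
    """
    mapping = {}
    chars_seen = 0  # characters contained in all earlier words
    for word_id, token in enumerate(tokens, start=1):
        mapping[word_id] = chars_seen + 1
        chars_seen += len(token)
    return mapping


# Hypothetical tokenized sentence: words of 2, 1 and 2 characters.
tokens = ["北京", "的", "天气"]
mapping = word_to_char_recids(tokens)
```

With these three tokens, word 2 starts at character 3 and word 3 at character 4, so a boundary reported at a word RECID can be compared against a frame size measured in characters.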
|
| TDT3det.pl |
- Added "use strict" to improve error checking.
- Converted all glob aliases to references.
- Added a -S option to define independent subsets over which scores are computed.
- Modified the summary report to include story weighted Pmiss, Pfa and Cdet.
- Added the computation of Cost Weighted YDZ metrics.
- Designated a primary evaluation metric.
- Relaxed the checking of system output deferral periods. Warnings are printed rather than fatal errors.
- Modifications to TDT3.pm changed the scoring of the 1998 TDT2 evaluation. This document describes the changes in detection performance between TDT2eval V0.6 and TDT3eval V1.0.
- Defined macros for common topic sets. These macros can be used with the -T option to specify topic sets.
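
The difference between story weighted and topic weighted error rates, and how a detection cost is formed from them, can be illustrated with a short Python sketch. It is not the code in TDT3det.pl. It assumes the usual reading of the terms: story weighting pools the decision counts over all topics before dividing, topic weighting averages the per-topic rates with equal weight, and the cost is the weighted sum Cmiss * Pmiss * Ptarget + Cfa * Pfa * (1 - Ptarget). The count layout, the function names and the constants passed in the example are illustrative assumptions, not values taken from these notes.

```python
def story_weighted(counts):
    """Pool the counts over all topics, then compute (Pmiss, Pfa)."""
    misses = sum(c["misses"] for c in counts)
    targets = sum(c["targets"] for c in counts)
    false_alarms = sum(c["false_alarms"] for c in counts)
    nontargets = sum(c["nontargets"] for c in counts)
    return misses / targets, false_alarms / nontargets


def topic_weighted(counts):
    """Compute (Pmiss, Pfa) per topic, then average with equal topic weight."""
    n = len(counts)
    p_miss = sum(c["misses"] / c["targets"] for c in counts) / n
    p_fa = sum(c["false_alarms"] / c["nontargets"] for c in counts) / n
    return p_miss, p_fa


def detection_cost(p_miss, p_fa, c_miss, c_fa, p_target):
    """Weighted sum of miss and false-alarm probabilities."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


# Hypothetical per-topic counts: one large topic and one small topic.
counts = [
    {"targets": 100, "misses": 10, "nontargets": 1000, "false_alarms": 10},
    {"targets": 10, "misses": 5, "nontargets": 1000, "false_alarms": 30},
]
sw_pmiss, sw_pfa = story_weighted(counts)
tw_pmiss, tw_pfa = topic_weighted(counts)
# Illustrative cost parameters only.
tw_cost = detection_cost(tw_pmiss, tw_pfa, c_miss=1.0, c_fa=1.0, p_target=0.02)
```

In this made-up data the small topic has a high miss rate, so the topic weighted Pmiss (0.3) is much larger than the story weighted Pmiss (15/110), which is dominated by the large topic.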
|
| TDT3trk.pl |
- Added "use strict" to improve error checking.
- Converted all glob aliases to references.
- Modified the storage structure for the scores of each document to be more memory efficient. The structure is now an array rather than a hash list, and a single hash table records which value each array cell holds.
- Designated a primary evaluation metric.
- Topic weighted DET curves are now an option.
- Correcting the mapping of automatic story boundaries onto reference story boundaries changes the results of the 1998 TDT2 evaluation. The overall scores for the CMU no-boundary test changed as follows:
| Metric | TDT2eval_v0.6 | TDT3eval_v1.0 |
| Story Weighted Pmiss | 0.4050 | 0.4138 |
| Story Weighted Pfa | 0.0041 | 0.0040 |
| Story Weighted Ctrack | 0.0122 | 0.0122 |
| Topic Weighted Pmiss | 0.3820 | 0.3850 |
| Topic Weighted Pfa | 0.0047 | 0.0046 |
| Topic Weighted Ctrack | 0.0122 | 0.0122 |
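
The array-plus-lookup-table storage described for the per-document scores in TDT3trk.pl can be pictured as follows. This is a Python sketch rather than the toolkit's Perl, and the field names, helper functions and sample values are hypothetical. The idea it illustrates is that each document keeps only a compact positional array, while one shared table maps field names to array positions.

```python
# One shared table for all documents: field name -> array position.
# The field names are hypothetical.
FIELD_INDEX = {"decision": 0, "score": 1, "is_target": 2}


def make_record(decision, score, is_target):
    """Store one document's values in a plain positional list."""
    record = [None] * len(FIELD_INDEX)
    record[FIELD_INDEX["decision"]] = decision
    record[FIELD_INDEX["score"]] = score
    record[FIELD_INDEX["is_target"]] = is_target
    return record


def get_field(record, name):
    """Look up a value by name through the shared index table."""
    return record[FIELD_INDEX[name]]


# Hypothetical per-document records.
records = {
    "doc_a": make_record("YES", 0.91, True),
    "doc_b": make_record("NO", 0.12, False),
}
score_a = get_field(records["doc_a"], "score")
```

Because the field names are stored once rather than repeated as keys in every document's record, the per-document overhead is reduced, at the price of one extra lookup per access.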
|
| TDT3BuildIndex.pl |
- Mostly rewritten to support the TDT3 Evaluation.
- Defined macros for common topic sets. These macros can be used with the -T option to specify topic sets.
|