The 'TDT3trk.pl' program will score the output generated by a TDT3 tracking system. The program requires the directory path, 'Rootdir', to the LDC's TDT3 Test corpus. The corpus must be in the same structure as released by the LDC, with all file formats identical to their original form.
The tracking task requires independent system runs for each target topic. Since there are a number of topics in each test, file lists are used to specify indexes and system outputs. A file list is an ASCII file of filenames, each name separated with a newline. Comment lines begin with the '#' character and any text after the '#' is ignored.
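The file list format described above is simple enough to read in a few lines. The following is a minimal sketch in Python, not the code used by TDT3trk.pl; the function name `read_file_list` is hypothetical, and the sketch assumes that a '#' may also start a trailing comment after a filename and that blank lines are skipped.

```python
from typing import List


def read_file_list(text: str) -> List[str]:
    """Return the filenames in a file list, ignoring '#' comments and blank lines."""
    names = []
    for line in text.splitlines():
        # Assumption: everything from the first '#' to the end of the line is a comment.
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.append(entry)
    return names


sample = """# index files for one test condition
../indexes_devtest/trk_nwt_44.ndx
../indexes_devtest/trk_nwt_39.ndx  # topic 39
../indexes_devtest/trk_nwt_42.ndx
"""

for name in read_file_list(sample):
    print(name)
```

The same reader would serve for both 'TDT3_trk_index_list' and 'Trk_system_output_list', since both are plain file lists.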
The program reads 'TDT3_trk_index_list', a file list containing the names of the TDT3 Tracking index files provided with the test corpus. The index files are used to load the appropriate data from the corpus and to verify the completeness of the Tracking system output.
After loading the reference data, the program loads and scores the 'Trk_system_output_list' file list, which contains the filenames of the tracking system output, one file per topic. The order of the topic result files does not matter; the program matches topic-specific results to the topic-specific index. The format of the tracking system output is specified below.
In order to score a tracking system, the decisions output by the Automatic Topic Tracking System (ATTS) must be mapped onto the story boundaries annotated in the reference corpus. The TDT3trk.pl program can use either of two methods, majority vote or impulse vote, to determine this mapping. The two methods differ in the meaning implied by the decision markers output by the ATTS.
The following <Options> are recognized by the program:
| -C Cmiss:Cfa | -> | Set the cost of a missed detection and the cost of a false alarm to 'Cmiss' and 'Cfa' respectively. These numbers are used in the tracking cost function. Default values are Cmiss=1.0 and Cfa=0.1. |
| -D Detail | -> | Write the internally organized evaluation corpus and pertinent statistics to 'Detail' for debugging purposes. This report, though voluminous, is intended to help researchers debug their internal versions of evaluation code. |
| -E SubsetFile | -> | Compute performance excluding source files in the subset definition file. The application of this filter is global, in that the source file is ignored prior to establishing subsets defined by the -U option. NOTE: Only the first set defined in the subset definition file is used for the filter. All others are ignored. |
| -j topicrel[:topicrel]* | -> | Specify alternative topic relevance files via the command line. More than one can be specified by concatenating the file names using a colon ':' separator. |
| -m func | -> | Set the system output to story mapping function to either 'majority' or 'impulse'. Default is 'majority'. |
| -P P(topic) | -> | Use P(topic) for the tracking cost function. Default is 0.02. |
| -r Report | -> | Write the summary report to 'Report' rather than STDOUT, the default. |
| -s | -> | Use all available speedups. Currently, the only speedup is to read the TDT3 Corpus files WITHOUT using the 'nsgmls' parser and the 'SGMLS.pm' Perl library. |
| -S | -> | Skip the source files in the system output that were NOT loaded via the index files. Using this option in conjunction with modified/reduced index files provides the capability of computing performance statistics of subsets of an evaluation set. See the FAQ entry regarding performance statistics of tracking evaluation subsets. |
| -U SSDFile | -> | Use the Source file Subset Definition file 'SSDFile' to generate performance statistics on subsets of the tracked source files. The subsets are independent, and unlimited in number. |
| -v num | -> | Set the verbose level to 'num'. Default 1. ==0 None, ==1 Normal, >5 Slight, >10 way too much, >15 not even funny |
| -Z uncompress | -> | Specify the command for uncompressing the system output files prior to scoring. The decompression applies to ONLY the system decision files, not the file lists. The command is executed by opening a pipe command if the system output file ends with a .Z or .gz suffix. The command is required to read a compressed stream from STDIN, and write the uncompressed stream to STDOUT. |
| -o LBL | -> | Treat the stories annotated as level 'LBL' as on topic. The default value is 'YES', but the value can also be 'YES+BRIEF', or 'BRIEF'. |
| | ||
| -d DETfile | -> | Create a DET plot in GNUplot format with the file root 'DETfile'. The program makes several files each with additional extensions. The file 'DETfile'.plt is a command file for GNUplot and can be printed using the command "gnuplot 'DETfile'.plt | lpr". The default plot produces a line trace for each line in the 'Trk_system_output_list' list. See the discussion below on DET Plots for additional information. The options below modify this. |
| -t title | -> | Set the title line for the plot to 'title'. |
| -n | -> | If the topic weighted DET trace is plotted, the 90% confidence interval will also be plotted when this option is used. |
| -p | -> | Produce a single story-weighted DET line trace for all the system output files in 'Trk_system_output_list'. This will only be made if Nt is constant for all the system outputs. * |
| -w | -> | Produce a single topic-weighted DET line trace for all the system output files in 'Trk_system_output_list'. This will only be made if Nt is constant for all the system outputs. * |
| -e | -> | Produce the default output of a line trace for each line in 'Trk_system_output_list'. * |
| -f | -> | Force the program to make a pooled plot even if Nt isn't constant. |
| -u 1|Many | -> | Also produce a DET plot for the subsets defined via the -U option. Either '1' or 'Many' may be used as an argument. The argument '1' produces one DET plot containing a single pooled or topic-weighted DET line (depending on the use of the -p and -w options) for each subset. The root filename for this plot will be 'DETfile'_subsets. The 'Many' argument builds a separate DET plot file for each subset. The plotted traces are controlled by the -p, -t, -e, and -n options. The root filename for the plots will be 'DETfile'_subset=<SubsetHeading>. |
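The -C and -P options in the table above set the parameters of the tracking cost function. The sketch below assumes the cost has the usual TDT form, Cmiss * P(Miss) * P(topic) + Cfa * P(Fa) * (1 - P(topic)); this manual names the parameters but does not spell out the formula, so treat that form as an assumption. The function names `parse_costs` and `tracking_cost` are hypothetical and are not part of TDT3trk.pl.

```python
from typing import Tuple


def parse_costs(arg: str) -> Tuple[float, float]:
    """Split a 'Cmiss:Cfa' argument, as given to -C, into two floats."""
    c_miss, c_fa = arg.split(":")
    return float(c_miss), float(c_fa)


def tracking_cost(p_miss: float, p_fa: float,
                  c_miss: float = 1.0, c_fa: float = 0.1,
                  p_topic: float = 0.02) -> float:
    """Assumed cost form: Cmiss*P(Miss)*P(topic) + Cfa*P(Fa)*(1 - P(topic)).

    The defaults mirror the documented defaults of -C and -P.
    """
    return c_miss * p_miss * p_topic + c_fa * p_fa * (1.0 - p_topic)


c_miss, c_fa = parse_costs("1.0:0.1")
print(tracking_cost(p_miss=0.10, p_fa=0.01, c_miss=c_miss, c_fa=c_fa))
```

With the default P(topic) of 0.02, a miss is weighted far less often than a false alarm is, which is why the two costs are set an order of magnitude apart.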
The BNF structure of the tracking index file is:
Where:
| <HEADER_LINE> | :== | # TRACKING <POINTER_TYPE> TOPIC=N, where 'N' is the topic number under test. |
| <POINTER_TYPE> | :== | RECID | TIME. POINTER_TYPE is the type of boundaries to be output by the system. The possible values are RECID for text stream tracking or TIME for audio tracking. |
| <TRAIN_STORY> | :== | # Training_docno=Nt <DOCNO> <DOCFILE>, where Nt is the ordinal number of the training story. For Nt = X training conditions, use the stories numbered 1 through X inclusive. |
| <DOCNO> | :== | Document number from the corpus. |
| <DOCFILE> | :== | Corpus filename, with directory and filename extensions relative to the corpus root directory, of the tokenized file that contains the training story's text. |
| <SOURCE> | :== | <DOCFILE> <STARTPOSITION>, where <DOCFILE> is the tokenized text file in the same format as above. |
| <STARTPOSITION> | :== | A RECID or TIME indicating the starting point of the evaluation. System outputs before this position are ignored. |
# TRACKING RECID TOPIC=39
#
# Training stories
# Training_docno=1 APW19980304.0300 tkntext/19980304_0555_0642_APW_ENG.tkn
# Training_docno=2 NYT19980303.0324 tkntext/19980303_2112_2200_NYT_NYT.tkn
# Training_docno=3 PRI19980303.2000.0432 tkntext/19980303_2000_2100_PRI_TWD.tkn
# Training_docno=4 CNN19980303.1130.0639 tkntext/19980303_1130_1200_CNN_HDL.tkn
# Training_docno=5 NYT19980302.0439 tkntext/19980302_2052_2146_NYT_NYT.tkn
# Training_docno=6 PRI19980302.2000.3319 tkntext/19980302_2000_2100_PRI_TWD.tkn
# Training_docno=7 PRI19980302.2000.2038 tkntext/19980302_2000_2100_PRI_TWD.tkn
...
#
# Discriminate_Training_docno=1 APW19980301.0161 tkntext/19980301_0553_0719_APW_ENG.tkn
# Discriminate_Training_docno=2 APW19980301.0171 tkntext/19980301_0553_0719_APW_ENG.tkn
...
asrtext/19980302_1830_1900_ABC_WNT.asr 1
asrtext/19980304_1130_1200_CNN_HDL.asr 1
asrtext/19980304_1600_1630_CNN_HDL.asr 1
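As an illustration of the index file format, the following Python sketch reads the header line, the training story lines, and the source lines. It is not the parser used by TDT3trk.pl. The names `TrackingIndex`, `parse_index`, and `training_for` are hypothetical, and the sketch assumes that any other '#' line (including the Discriminate_Training_docno lines, which the BNF above does not define) can be skipped.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TrackingIndex:
    pointer_type: str = ""
    topic: int = -1
    # (Nt, DOCNO, DOCFILE) for each training story
    training: List[Tuple[int, str, str]] = field(default_factory=list)
    # (DOCFILE, STARTPOSITION); the position is kept as text (RECID or TIME)
    sources: List[Tuple[str, str]] = field(default_factory=list)


HEADER = re.compile(r"^#\s*TRACKING\s+(RECID|TIME)\s+TOPIC=(\d+)\s*$")
TRAIN = re.compile(r"^#\s*Training_docno=(\d+)\s+(\S+)\s+(\S+)\s*$")


def parse_index(text: str) -> TrackingIndex:
    idx = TrackingIndex()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = HEADER.match(line)
        if m:
            idx.pointer_type, idx.topic = m.group(1), int(m.group(2))
            continue
        m = TRAIN.match(line)
        if m:
            idx.training.append((int(m.group(1)), m.group(2), m.group(3)))
            continue
        if line.startswith("#"):
            continue  # assumption: all other '#' lines are treated as comments
        docfile, start = line.split()
        idx.sources.append((docfile, start))
    return idx


def training_for(idx: TrackingIndex, nt: int) -> List[Tuple[int, str, str]]:
    """Stories numbered 1 through nt inclusive, for the Nt = nt condition."""
    return [t for t in idx.training if t[0] <= nt]


sample = """# TRACKING RECID TOPIC=39
#
# Training stories
# Training_docno=1 APW19980304.0300 tkntext/19980304_0555_0642_APW_ENG.tkn
# Training_docno=2 NYT19980303.0324 tkntext/19980303_2112_2200_NYT_NYT.tkn
#
# Discriminate_Training_docno=1 APW19980301.0161 tkntext/19980301_0553_0719_APW_ENG.tkn
asrtext/19980302_1830_1900_ABC_WNT.asr 1
asrtext/19980304_1130_1200_CNN_HDL.asr 1
"""

index = parse_index(sample)
print(index.topic, index.pointer_type, len(index.training), len(index.sources))
```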
The BNF structure of the tracking system output file is:
Where:
| <HEADER_LINE> | :== | <SYSTEM> <BOUNDARIES> <Nt> <TOPIC> <POINTER_TYPE> |
| <SYSTEM> | :== | SYSTEM is an alphanumeric character string that uniquely identifies the system being tested. (E.g., CDM_P05-8.v37) |
| <BOUNDARIES> | :== | BOUNDARIES is either YES or NO, where YES indicates that story boundaries are supplied to the system being tested and NO indicates that they are not. |
| <Nt> | :== | Number of training stories used. |
| <TOPIC> | :== | TOPIC is the topic id under test. |
| <POINTER_TYPE> | :== | RECID | TIME. POINTER_TYPE is the type of boundaries to be output by the system. The possible values are RECID for text stream detection or TIME for audio detection. |
| <DECISION_LINE> | :== | <SOURCE> <POINTER> <DECISION> <SCORE> |
| <SOURCE> | :== | TDT3 corpus filename with directory and extension names relative to the TDT3 root directory specified on the command line. |
| <POINTER> | :== | POINTER is a hypothesized decision point. For text files, POINTER is the index number of the first word in the hypothesized segment, in the range {1, 2, . . .}. For audio files, POINTER is the time of the beginning of the segment {0.0, . . .}. (It isn't necessary to output the beginning of the first segment.) The hypothesized decision points must occur in chronological order. |
| <DECISION> | :== | DECISION is either YES or NO, where YES indicates that the system believes that the story being processed discusses the target topic, and NO indicates that it does not. |
| <SCORE> | :== | SCORE is a real number which indicates how confident the system is that the story being processed discusses the associated topic. More positive values indicate greater confidence. |
# Degnerate Tracking Results, Errors, Brecid
corrtrack YES 16 39 RECID
asrtext/19980302_1830_1900_ABC_WNT.asr 1 NO 0.0341789461672306
asrtext/19980302_1830_1900_ABC_WNT.asr 72 NO 0.247018221765757
asrtext/19980302_1830_1900_ABC_WNT.asr 581 NO 0.23052775207907
asrtext/19980302_1830_1900_ABC_WNT.asr 606 NO 0.00333382189273834
asrtext/19980302_1830_1900_ABC_WNT.asr 948 NO 0.919592500664294
asrtext/19980302_1830_1900_ABC_WNT.asr 1019 NO 0.93581769708544
asrtext/19980302_1830_1900_ABC_WNT.asr 1092 NO 0.471104943193495
asrtext/19980302_1830_1900_ABC_WNT.asr 1186 NO 0.0925928736105561
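Before submitting a system output file for scoring, it can be useful to check it against the format above. The following Python sketch parses the header and decision lines and verifies that the pointers for each source file are in chronological order. It is an illustration only and not the validation performed by TDT3trk.pl; the names `Header`, `Decision`, `parse_system_output`, and `pointers_in_order` are hypothetical, and skipping lines that begin with '#' is an assumption based on the sample file.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Header(NamedTuple):
    system: str
    boundaries: str
    nt: int
    topic: int
    pointer_type: str


class Decision(NamedTuple):
    source: str
    pointer: float  # word index for RECID, seconds for TIME
    decision: str
    score: float


def parse_system_output(text: str) -> Tuple[Optional[Header], List[Decision]]:
    header: Optional[Header] = None
    decisions: List[Decision] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: '#' lines are comments or descriptions
        fields = line.split()
        if header is None:
            system, boundaries, nt, topic, ptype = fields
            if boundaries not in ("YES", "NO") or ptype not in ("RECID", "TIME"):
                raise ValueError("bad header line: " + line)
            header = Header(system, boundaries, int(nt), int(topic), ptype)
            continue
        source, pointer, decision, score = fields
        if decision not in ("YES", "NO"):
            raise ValueError("bad decision line: " + line)
        decisions.append(Decision(source, float(pointer), decision, float(score)))
    return header, decisions


def pointers_in_order(decisions: List[Decision]) -> bool:
    """True if, within each source file, pointers never decrease."""
    last: Dict[str, float] = {}
    for d in decisions:
        if d.source in last and d.pointer < last[d.source]:
            return False
        last[d.source] = d.pointer
    return True


sample = """# Degnerate Tracking Results, Errors, Brecid
corrtrack YES 16 39 RECID
asrtext/19980302_1830_1900_ABC_WNT.asr 1 NO 0.0341789461672306
asrtext/19980302_1830_1900_ABC_WNT.asr 72 NO 0.247018221765757
asrtext/19980302_1830_1900_ABC_WNT.asr 581 NO 0.23052775207907
"""

header, decisions = parse_system_output(sample)
print(header.system, header.topic, len(decisions), pointers_in_order(decisions))
```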
-------------------------------------------------------------------------------
-------------------- TDT Tracking Task Performance Report ------------------
Story Weighted (Pooled) Tracking:   P(Miss) = 0.0000
                                    P(Fa)   = 0.0991
Topic Weighted Tracking:            P(Miss) = 0.0000
                                    P(Fa)   = 0.0939
Tracking Performance Calculations:
Filename         Topic   Train    Test    Corr    Corr    Miss     F/A    Pct.    Pct.
                         Story   Story    Det.  ! Det.   Story   Story    Miss     F/A
--------------  ------  ------  ------  ------  ------  ------  ------  ------  ------
trk_nwt_39.trk      39      16    1200      11    1070       0     119  0.0000  0.1001
trk_nwt_42.trk      42      16      59       0      54       0       5  0.0000  0.0847
trk_nwt_44.trk      44      16     126       2     112       0      12  0.0000  0.0968
==============  ======  ======  ======  ======  ======  ======  ======  ======  ======
Sums                              1385      13    1236       0     136
Means                              461       4     412       0      45  0.0000  0.0939
Execution parameters:
    LDC TDT Corpus Root Dir:   ../../..
    Index File list:           trk_nwt_indexes
    Index Files:               ../indexes_devtest/trk_nwt_44.ndx
                               ../indexes_devtest/trk_nwt_39.ndx
                               ../indexes_devtest/trk_nwt_42.ndx
    System Output File List:   trk_nwt_outputs
    System Output File:        trk_nwt_39.trk Name: corrtrack Desc: Degnerate Tracking Results, Errors, Brecid
    System Output File:        trk_nwt_42.trk Name: corrtrack Desc: Degnerate Tracking Results, Errors, Brecid
    System Output File:        trk_nwt_44.trk Name: corrtrack Desc: Degnerate Tracking Results, Errors, Brecid
    Pointer Type:              RECID
    System Output to Story Mapping Function: 'majority'
----------------- End of TDT Tracking Task Performance Report ---------------
-------------------------------------------------------------------------------
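The difference between the story-weighted (pooled) and topic-weighted figures in the sample report can be checked by hand from the per-topic counts. The Python sketch below does so under two assumptions that the report itself does not state: that 'Corr Det.' and 'Miss Story' count on-topic stories while 'Corr ! Det.' and 'F/A Story' count off-topic stories, and that a topic with no on-topic test stories contributes P(Miss) = 0. The names `rates`, `story_weighted`, and `topic_weighted` are hypothetical.

```python
# Per-topic counts copied from the sample report:
# (filename, topic, corr_det, corr_not_det, miss, fa)
rows = [
    ("trk_nwt_39.trk", 39, 11, 1070, 0, 119),
    ("trk_nwt_42.trk", 42, 0, 54, 0, 5),
    ("trk_nwt_44.trk", 44, 2, 112, 0, 12),
]


def rates(corr_det, corr_not_det, miss, fa):
    """Return (P(Miss), P(Fa)) for one set of counts."""
    targets = corr_det + miss          # assumed: on-topic test stories
    non_targets = corr_not_det + fa    # assumed: off-topic test stories
    p_miss = miss / targets if targets else 0.0  # assumption for empty topics
    p_fa = fa / non_targets if non_targets else 0.0
    return p_miss, p_fa


def story_weighted(rows):
    """Pool the counts over all topics, then compute the rates once."""
    sums = [sum(r[i] for r in rows) for i in range(2, 6)]
    return rates(*sums)


def topic_weighted(rows):
    """Compute the rates per topic, then average them with equal weight."""
    per_topic = [rates(*r[2:]) for r in rows]
    n = len(per_topic)
    return (sum(p[0] for p in per_topic) / n,
            sum(p[1] for p in per_topic) / n)


print("story weighted: P(Miss)=%.4f P(Fa)=%.4f" % story_weighted(rows))
print("topic weighted: P(Miss)=%.4f P(Fa)=%.4f" % topic_weighted(rows))
```

Under these assumptions the sketch reproduces the report's P(Fa) values of 0.0991 (story weighted) and 0.0939 (topic weighted). Topic 39 dominates the pooled figure because it has by far the most test stories.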
Preparing DET Curve.
This program computes three types of DET line traces: by-topic, story-weighted (formerly pooled), and topic-weighted.
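For readers unfamiliar with DET traces, the sketch below shows the general idea behind a single trace: sweep a decision threshold over the system scores and record P(Fa) and P(Miss) at each step. This is a generic illustration in Python and not the procedure implemented in TDT3trk.pl; the function name `det_points` and the (score, is_target) input format are hypothetical. A DET plot would then draw these points on normal deviate axes, which the sketch does not attempt.

```python
from typing import Iterable, List, Tuple


def det_points(trials: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (P(Fa), P(Miss)) pairs, one per distinct score threshold.

    Each trial is (score, is_target). A story is accepted when its score is
    at or above the threshold; thresholds are swept from high to low.
    """
    ordered = sorted(trials, key=lambda t: t[0], reverse=True)
    n_target = sum(1 for _, is_target in ordered if is_target)
    n_non = len(ordered) - n_target
    if n_target == 0 or n_non == 0:
        raise ValueError("need at least one target and one non-target trial")

    points = [(0.0, 1.0)]  # threshold above every score: nothing is accepted
    hits = false_alarms = 0
    i = 0
    while i < len(ordered):
        score = ordered[i][0]
        # Accept all trials that share this score in one step (ties).
        while i < len(ordered) and ordered[i][0] == score:
            if ordered[i][1]:
                hits += 1
            else:
                false_alarms += 1
            i += 1
        points.append((false_alarms / n_non, (n_target - hits) / n_target))
    return points


trials = [(0.9, True), (0.8, False), (0.7, True), (0.1, False)]
for p_fa, p_miss in det_points(trials):
    print("P(Fa)=%.2f  P(Miss)=%.2f" % (p_fa, p_miss))
```

A by-topic trace would apply this to the trials of one topic at a time, and a story-weighted trace would pool the trials of all topics first. How the program combines topics for the topic-weighted trace is not covered by this sketch.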