Developers are expected to build systems according to the published evaluation plan. The significant changes between OpenKWS14 and OpenKWS15 are:
- OpenKWS volunteer teams will be supported in multilingual KWS by the release of Build Packs for Cantonese, Pashto, Tagalog, and Turkish in addition to the Vietnamese and Tamil data.
- Evaluation condition changes:
  - Introduction of the "Very Limited Language Pack" (VLLP) training condition, where the build pack consists of 80 hours of audio with call metadata, no phonetic lexicon, and 3 hours of transcription excerpts sampled from many training calls. As in prior OpenKWS evaluations, a Language Specific Peculiarities (LSP) document will be provided.
  - Additional web text will be provided with the build pack.
  - The Limited Language Pack training condition will be discontinued.
  - The phonetic lexicons will be released only after the evaluation to support analysis.
  - VLLP is required for volunteers who choose to participate in the OpenKWS evaluation.
- Build pack changes:
  - A portion of the build pack will be designated as the 'tuning set' for setting system parameters such as weights.
- Evaluation schedule changes:
  - Reduction of KWS system build time from 3 weeks to 2 weeks.
  - Two rounds of testing:
    - 1 week to process the evaluation pack for the VLLP condition.
    - 3 weeks to train and test systems for the Full LP condition and additional contrastive system runs.
Schedule
December 1, 2014 | OpenKWS15 registration closes |
April 22, 2015 | Surprise language build pack download begins |
April 28, 2015, 2:00 PM EDT | NIST sends password for surprise language build pack |
May 5, 2015 | Surprise language evaluation pack download begins |
May 12, 2015, 2:00 PM EDT | NIST sends password for surprise language evaluation pack and keywords |
May 19, 2015, 2:00 PM EDT | Sites submit surprise language system VLLP outputs to NIST |
May 20, 2015 | NIST posts FullLP Transcripts results |
June 2, 2015 | NIST reports surprise language VLLP scores without reference transcripts |
June 10, 2015 | Sites submit surprise language FullLP output to NIST |
June 12, 2015 | NIST reports surprise language FullLP results |
July 21, 2015 | OpenKWS15 meeting in Washington DC metro area |
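As a quick sanity check, the intervals between the milestones above can be computed directly. The sketch below copies four dates from the schedule; the milestone names are hypothetical labels, and it assumes that each window is measured from the date NIST sends the relevant password, which the schedule does not state explicitly.

```python
from datetime import date

# Dates copied from the schedule above; the keys are hypothetical labels.
milestones = {
    "build_pack_password": date(2015, 4, 28),
    "eval_pack_password": date(2015, 5, 12),
    "vllp_submission": date(2015, 5, 19),
    "fulllp_submission": date(2015, 6, 10),
}


def days_between(start: str, end: str) -> int:
    """Return the number of calendar days from one milestone to another."""
    return (milestones[end] - milestones[start]).days


# Assumed window boundaries (not stated explicitly in the schedule).
build_days = days_between("build_pack_password", "eval_pack_password")
vllp_days = days_between("eval_pack_password", "vllp_submission")
fulllp_days = days_between("vllp_submission", "fulllp_submission")

print(build_days, vllp_days, fulllp_days)  # 14 7 22
```

Under these assumptions the build window is 14 days and the VLLP processing window is 7 days, matching the 2-week and 1-week figures listed in the schedule changes. The gap between the VLLP and FullLP submission dates is 22 days, roughly the stated 3 weeks.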
Documentation
- The latest OpenKWS15 evaluation plan is KWS-evalplan-V05.
- For convenience, here are a couple of 'diffs' with previous versions
- Babel Data Specification, August 26, 2013: This document describes the structure of the Babel data supplied to the participants.
- Language Specific Peculiarities Documents:
Data Resources
The OpenKWS15 Data Resources page contains four types of resources: build packs, evaluation packs, IndusDB releases, and Language Pack Resources. Teams must complete both the registration form and the data license in order to receive data. Resources will be distributed as follows:
- Build packs:
  - Build packs for Cantonese, Pashto, Tagalog, Turkish, Vietnamese, and Tamil will be distributed to teams that sign up for OpenKWS15.
  - The 2015 Surprise Language build pack will be distributed according to the schedule above.
- Evaluation packs:
  - Evaluation packs for Vietnamese and Tamil will be distributed to teams that sign up for OpenKWS15.
  - Part of the evaluation pack reference transcripts will be distributed to OpenKWS14 participants and to new participants that successfully submit results on the Vietnamese or Tamil evaluation pack.
  - The 2015 Surprise Language evaluation pack will be distributed according to the schedule above.
- IndusDB: IndusDB releases will be issued as appropriate for the team.
- Language Pack Resources: Language Pack Resources will be provided as appropriate for the team.
Evaluation tools
The NIST-provided tools are described in the Evaluation Infrastructure Setup Instructions.