- OpenKWS volunteer teams will be supported in multilingual KWS by the release of Build Packs for Cantonese, Pashto, Tagalog, and Turkish in addition to the Vietnamese and Tamil data.
- Evaluation condition changes:
- Introduction of the "Very Limited Language Pack" (VLLP) training condition, where the build pack consists of 80 hours of audio with call metadata, no phonetic lexicon, and 3 hours of transcription excerpts sampled from many training calls. As in prior OpenKWS evaluations, a Language Specific Peculiarities (LSP) document will be provided.
- Additional web text will be provided with the build pack.
- The Limited Language Pack training condition will be discontinued.
- The phonetic lexicons will be released only after the evaluation to support analysis.
- VLLP is required for volunteers who choose to participate in the OpenKWS evaluation.
- Build pack changes:
- A portion of the build pack will be designated as the 'tuning set' for setting weights etc.
- Evaluation schedule changes:
- Reduction of KWS system build time from 3 to 2 weeks.
- Two rounds of testing:
- 1 week to process the evaluation pack for the VLLP condition.
- 3 weeks to train and test systems for the Full LP condition and additional contrastive system runs.
- Read the evaluation plan to become familiar with the evaluation.
- Sign and return the OpenKWS15 Registration Form to firstname.lastname@example.org.
- Sign and return the OpenKWS15 Evaluation Data Agreement to email@example.com.
- Complete a Dry Run Evaluation.
- The dry run is an opportunity for developers to make sure they are able to generate valid system output that can be scored with the NIST scoring tools. The actual performance of the system is not of interest during the dry run so developers may feel free to use any method to generate their system output, e.g., a random system, training on the dry run data, etc. The Evaluation Infrastructure Setup Instructions enumerate the steps to complete a dry run.
- NIST highly encourages new teams to build a Vietnamese system as part of their dry run, to familiarize themselves with the Babel resources in advance of the surprise language evaluation.
|December 1, 2014||OpenKWS15 registration closes|
|April 22, 2015||Surprise language build pack download begins|
|April 28, 2015, 2:00 PM EDT||NIST sends password for surprise language build pack|
|May 5, 2015||Surprise language evaluation pack download begins|
|May 12, 2015, 2:00 PM EDT||NIST sends password for surprise language evaluation pack and keywords|
|May 19, 2015, 2:00 PM EDT||Sites submit surprise language system VLLP outputs to NIST|
|May 20, 2015||NIST posts FullLP Transcripts results|
|June 2, 2015||NIST reports surprise language VLLP scores without reference transcripts|
|June 10, 2015||Sites submit surprise language FullLP output to NIST|
|June 12, 2015||NIST reports surprise language FullLP results|
|July 21, 2015||OpenKWS14 meeting in Washington DC metro area|
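The stated durations (2 weeks of build time, 1 week for the VLLP round, 3 weeks for the Full LP round) can be checked against the schedule with simple date arithmetic. The sketch below is only an illustration: the dictionary keys are hypothetical names, and which pair of milestones bounds each period is an assumption made here, not something the schedule states explicitly.

```python
from datetime import date

# Milestone dates copied from the schedule table above.
# The key names are hypothetical labels chosen for this sketch.
milestones = {
    "build_pack_password": date(2015, 4, 28),
    "eval_pack_password": date(2015, 5, 12),
    "vllp_submission": date(2015, 5, 19),
    "fulllp_transcripts_row": date(2015, 5, 20),
    "fulllp_submission": date(2015, 6, 10),
}


def days_between(start: str, end: str) -> int:
    """Return the number of calendar days from one milestone to another."""
    return (milestones[end] - milestones[start]).days


# Assumed pairing of milestones with the periods named in the schedule changes.
build_days = days_between("build_pack_password", "eval_pack_password")
vllp_days = days_between("eval_pack_password", "vllp_submission")
fulllp_days = days_between("fulllp_transcripts_row", "fulllp_submission")

print(f"build period:   {build_days} days")
print(f"VLLP round:     {vllp_days} days")
print(f"Full LP round:  {fulllp_days} days")
```

Under this pairing the intervals come out to 14, 7, and 21 days, which is consistent with the 2-week, 1-week, and 3-week periods listed under the schedule changes.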
- The latest OpenKWS15 evaluation plan is KWS-evalplan-V05.
- For convenience, here are a couple of 'diffs' with previous versions.
- Babel Data Specification, August 26, 2013: This document describes the structure of the Babel data supplied to the participants.
- Language Specific Peculiarities Documents:
- See the OpenKWS15 Data Resources page. (password provided at registration)
The OpenKWS15 Data Resources page contains four types of resources: build packs, evaluation packs, IndusDB releases, and Language Pack Resources. Teams must complete both the registration form and the data license in order to receive data. Resources will be distributed as follows:
- Build packs:
- Build packs for Cantonese, Pashto, Tagalog, Turkish, Vietnamese, and Tamil will be distributed to teams that sign up for OpenKWS15.
- The 2015 Surprise Language build pack will be distributed according to the schedule above.
- Evaluation packs:
- Evaluation packs for Vietnamese and Tamil will be distributed to teams that sign up for OpenKWS15.
- Part of the Evaluation Pack reference transcripts will be distributed to OpenKWS '14 participants and to new participants that successfully submit results on the Vietnamese or Tamil evaluation pack.
- The 2015 Surprise Language evaluation pack will be distributed according to the schedule above.
- IndusDB: IndusDB releases will be issued as appropriate for the team.
- Language Pack Resources: Language Pack Resources will be provided as appropriate for the team.
NIST-provided tools are described in the Evaluation Infrastructure Setup Instructions.