The next evaluation will be the OpenSAT20 Evaluation. OpenSAT20 will continue with the same tasks: Automatic Speech Recognition (ASR), Speech Activity Detection (SAD), and Keyword Search (KWS).
The data domain for the OpenSAT20 Evaluation will be simulated public safety communications spoken in English. The evaluation data will be extracted from unexposed portions of the SAFE-T corpus, which was collected by the Linguistic Data Consortium (LDC) and initially made available for the OpenSAT19 Evaluation. The audio recordings in the SAFE-T corpus contain speech, potentially with increased vocal effort induced by first-responder-type background noise conditions, and are expected to be challenging for systems to process with a high degree of accuracy.
NIST intends to continue with this public-safety speech corpus in the OpenSAT series to measure year-to-year progress in system performance. The goal of the NIST Speech Analytic Technologies evaluation series (OpenSAT) is to provide broad support for the advancement of speech analytic technologies by including multiple speech analytic tasks and multiple data domains. Developers can choose from one to all tasks and from one to all data domains.
OpenSAT20 will be organized in a similar manner to OpenSAT19, except as follows:
OpenSAT20 registration available until July 31, 2020
Training, Development, and Evaluation data available until July 31
Scores posted to leaderboards for the Progress set portion of the evaluation data until July 31
Last date to upload system output to NIST for scoring is July 31
Scores for the Test set portion of the evaluation data made available after July 31 (if a system description is uploaded)
Virtual Workshop will be held on September 16-17, 2020. Last day to register is September 4
Go to the OpenSAT website for more information and to register.
Download the 2020 OpenSAT Evaluation Plan V1.6 (pdf). Updated July 1, 2020
Send email to opensat_poc [at] nist.gov to request to be added to the mailing list, to receive updates, to ask questions, or to leave comments.
OpenSAT19 Evaluation Plan (Updated 3/28/2019)
03/29/2018 - 06/14/2019 Development data release (updated dates)
06/17/2019 - 07/01/2019 Evaluation data release (updated date)
08/20/2019 - 08/21/2019 Post Evaluation Workshop
Tasks
Speech Activity Detection (SAD)
Automatic Speech Recognition (ASR)
Key Word Search (KWS)
Data
For SAD, ASR, KWS tasks: Low Resource Language - from the IARPA Babel collection (Pashto language)
For SAD, KWS tasks: Audio extracted from amateur online videos - from the Video Annotation for Speech Technologies (VAST) collection (English language)
For SAD, ASR, KWS tasks: Simulated public safety communications - from the PSC collection (English language)
Tasks
Speech Activity Detection (SAD)
Automatic Speech Recognition (ASR)
Key Word Search (KWS)
Data
For SAD, ASR, KWS tasks: Low Resource Language - from the IARPA Babel collection (Pashto language)
For SAD task only: Audio extracted from YouTube videos - from the Video Annotation for Speech Technologies (VAST) collection (Arabic, Mandarin and English languages)
For SAD, ASR, KWS tasks: First responder/dispatcher operational recordings - from the June 18, 2007 Sofa Super Store fire in Charleston, South Carolina (English language)
Documentation
Open Speech Analytic Technologies Pilot (OpenSAT Pilot) Evaluation Plan
Open Speech Analytic Technologies Pilot (OpenSAT Pilot) Evaluation Report