Automatic Speech Recognition – Large Vocabulary Transcription
In the ASR task, systems are required to transcribe audio recordings of Italian parliament sessions. Two subtasks are defined, and participants may choose to take part in either or both:
- transcription
- constrained transcription, using the accompanying minutes
Two modalities are allowed:
- closed: only distributed data are allowed for training and tuning the system
- open: the participant can use any type of data for system training, declaring and describing the proposed setup in the final report
The evaluation is based on Word Accuracy, computed from the minimum edit distance between the recognizer output and the reference annotation. Training and development material extracted from wide-band (16 kHz) corpora will be provided.
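As an illustration of the metric, the following is a minimal sketch of Word Accuracy derived from a word-level minimum edit distance with unit costs for substitutions, deletions, and insertions. The function names, the whitespace tokenization, and the lowercasing step are assumptions made for this example; the official scoring tool and its text normalization may differ.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word Accuracy = 1 - (edit distance / number of reference words).

    Assumption: words are separated by whitespace and compared
    case-insensitively; this normalization is hypothetical.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_accuracy("la seduta è aperta", "la seduta aperta"))
```

Under this definition the score can drop below zero when the hypothesis contains many inserted words, since the edit distance is normalized by the reference length only.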
Data
- Training data:
  - about 30 hours of parliament session audio, along with related (automatic) transcriptions
  - one year of minutes of parliament sessions
  - a lexicon covering the acoustic data and part of the language model data
- Dev data: a 1-hour parliament session, with minutes and reference transcription
- Test data: about 1 hour of audio from parliament sessions
Data distribution
Test data [04/10/2011] – Training data are available. Please contact: Marco Matassoni, matasso[at]fbk.eu
The distributed data may be used only within the Evalita evaluation; no fee is required.
Task materials
Detailed Guidelines [22/08/2011]
Organizers
- Fabio Brugnara (FBK-irst, Trento)
- Roberto Gretter (FBK-irst, Trento)
- Marco Matassoni (FBK-irst, Trento)