A VM that contains a complete setup for training and testing a speech recognizer.

Download

Author(s)

Eric Riebling and Florian Metze

Description

The TEDLIUM VM runs a complete training experiment using training data and transcripts from TED talks released in the TEDLIUM_release1 data set. This is an advanced VM that requires a LOT of resources; it produces pretty good (but still quite large) acoustic and language models.

Availability

Virtual Machine in OVA format.

Support

Supported as part of the "Speech Recognition Virtual Kitchen".

IP Agreement

Prerequisites

Required Acknowledgment

Readme

TEDLIUM README

This is a full Kaldi model training experiment including (and requiring) a lot of resources. Before we get started, in case you should need it, the sudo password for this VM is ‘?1zza4All’.

1. Download the Training Data

This machine uses a VirtualBox Shared Folder to hold the open source TEDLIUM_release1 training data. Please download this data (as a zip file) and unzip it into a folder on your host computer. This creates a folder named “db”, which the TEDLIUM VM will then mount and use for its “immutable” training data. The zip file is 16 GB and unzips into another 16 GB, so you’ll need at least 32 GB of free disk space just for this part(!); afterwards you can delete the zip file and get some of that space back.
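Before downloading, it can help to confirm that the host really has that much room. The following is only a sketch for a Linux or macOS host with a POSIX shell; the function name have_free_gb is made up for illustration, and the 32 GB figure is the one quoted in step 1.

```shell
# Sketch: succeed if the filesystem holding DIR has at least NEED_GB free.
# have_free_gb is a hypothetical helper name, not part of the VM or of Kaldi.
have_free_gb() {
    dir=$1
    need_gb=$2
    # df -P prints one POSIX-format line per filesystem, -k uses 1 KiB blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Check the current directory against the 32 GB mentioned in step 1.
if have_free_gb . 32; then
    echo "Enough free space for the zip file plus the unzipped data."
else
    echo "Less than 32 GB free here; free up space before downloading."
fi
```

Run it from the folder where you plan to unzip the data, since free space is reported per filesystem.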

2. Shared Folder Setup

You will have to set up the shared folder in the guest VM to point to this folder on your host. Assuming you’ve downloaded and imported the VM, click the Tedlium VM in your VirtualBox Manager window, and open its settings. Don’t worry about the “Invalid settings detected” warnings; they appear only because we are running at the maximum permitted settings for a VM. Choose Shared Folders from the Settings dialog box, and note under Machine Folders the shared folder with name ‘data’ and Path ‘/usr0/home/er1k/db’. Right-click it and choose Edit Shared Folder. Change the Folder Path so that it points to the location where you unzipped the training data (‘db’) in step 1 above. (Folder Name should read ‘db’.) Make sure that Read-only is unchecked and that Auto-mount and Make Permanent are checked, then click OK.
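Once the VM is running, a quick sanity check from inside the guest can confirm that the shared folder is visible and writable. This is a sketch only: check_data_dir is a hypothetical helper, and the path /media/sf_db used in the note below is the usual VirtualBox auto-mount location on Linux guests for a folder named ‘db’, which is an assumption about how this VM is configured.

```shell
# Sketch: verify that a directory exists, is not empty, and is writable.
# check_data_dir is a hypothetical helper name.
check_data_dir() {
    dir=$1
    # The mount point must exist as a directory.
    [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
    # ls -A lists all entries except . and .., so empty output means
    # the folder is empty (or cannot be read).
    [ -n "$(ls -A "$dir" 2>/dev/null)" ] || { echo "empty or unreadable: $dir"; return 1; }
    # Read-only should be unchecked, so the folder should be writable.
    [ -w "$dir" ] || { echo "read-only: $dir"; return 1; }
    echo "ok: $dir"
}
```

For example, check_data_dir /media/sf_db should print "ok: /media/sf_db" if the folder is mounted at that location; if it reports "missing", look for the mount point the VM actually uses.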

3. Running the Experiment

First, shut down as many programs on your host computer as you can; VirtualBox will need that memory. Start the Tedlium Virtual Machine and, once it shows the desktop, open a Terminal window. Everything happens here, so we suggest running in a shell within an editor so that you capture a log of everything, with these commands:

cd /kaldi-trunk/egs/tedlium/s5
emacs -nw -e shell
./run.sh

This assumes you have some knowledge of emacs, but really, it just gives you another shell window with captured history and the ability to edit. The run.sh command sets off a VERY LONG process with many stages of data processing. It can easily take 24 hours to complete(!) The shell script run.sh contains a series of steps: shell commands that run more shell commands that format the data, perform training and alignment, build models, decode, perform scoring, etc. Lots of stuff here!
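If you would rather not use emacs, the same goal (a saved log of everything run.sh prints) can be reached with tee. This is a minimal sketch, not part of the VM: run_logged is a hypothetical helper, and the log file name in the usage comment is just an example, chosen so that the existing shell.log is not overwritten.

```shell
# Sketch: run a command, show its output on screen, and save it to a log file.
# run_logged is a hypothetical helper name.
run_logged() {
    log=$1
    shift
    # Merge stderr into stdout so errors land in the log too, and record the
    # command's exit status as the last line (the pipeline itself returns
    # the exit status of tee, not of the command).
    { "$@" 2>&1; echo "exit status: $?"; } | tee "$log"
}

# Usage inside the VM (the log file name is only an example):
#   cd /kaldi-trunk/egs/tedlium/s5
#   run_logged my_run.log ./run.sh
```

After a long run, the last line of the log tells you whether run.sh finished with status 0.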

This is an example of a successful run of the TEDLIUM experiment, and can also be found in the file /kaldi-trunk/egs/tedlium/s5/shell.log:

TedliumOutput.txt

Results are placed in the folder exp/, and each stage builds upon the results of previous stages. For example, the final stage puts its results in the folder exp/tri3_mmi_b0.1.

Scoring

One way to find scores is to change directory to the above-named folder, then issue the command:

grep "Percent Total Error" */score_*/ctm.filt.filt.dtl

Results will look like:

ScoreOutput
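With many decode and score folders, the grep output gets long. The sketch below picks the line with the lowest error. It assumes each matching line has the form "Percent Total Error = NN.N% (count)", as sclite normally writes it; best_wer is a hypothetical helper name.

```shell
# Sketch: print the "Percent Total Error" line with the lowest percentage
# among the given .dtl files. best_wer is a hypothetical helper name.
best_wer() {
    # The extra /dev/null argument makes grep prefix every match with its
    # file name, even when only one file is given.
    grep "Percent Total Error" "$@" /dev/null |
        # Split on '=' and sort numerically on what follows it (the percentage).
        # LC_ALL=C keeps '.' as the decimal point regardless of locale.
        LC_ALL=C sort -t '=' -k 2 -n |
        head -n 1
}
```

From the same folder as the grep command above, best_wer */score_*/ctm.filt.filt.dtl prints the file name and error line of the best-scoring configuration.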

Another way to look at scores is to look in files with names like:

/kaldi-trunk/egs/tedlium/s5/exp/tri3_mmi_b0.1/decode_test_it4/score_20/ctm.filt.filt.sys

Results will look like this:

SYSTEM SUMMARY PERCENTAGES by SPEAKER
,-------------------------------------------------------------------------------------------.
|                    exp/tri3_mmi_b0.1/decode_test_it4/score_20/ctm.filt                    |
|-------------------------------------------------------------------------------------------|
| SPKR                    | # Snt  # Wrd | Corr    Sub    Del    Ins    Err  S.Err |  NCE   |
|-------------------------+--------------+-----------------------------------------+--------|
| aimeemullins            |   129   2897 | 82.1   13.7    4.2    2.9   20.8   93.0 | -0.021 |
|-------------------------+--------------+-----------------------------------------+--------|
| billgates               |   151   4344 | 82.2   13.2    4.6    3.0   20.9   96.0 | -0.020 |
|-------------------------+--------------+-----------------------------------------+--------|
| s182                    |    14    302 | 72.2   18.5    9.3    0.7   28.5   92.9 |  0.178 |
|-------------------------+--------------+-----------------------------------------+--------|
| danbarber               |   235   2384 | 70.6   21.2    8.2    3.3   32.7   84.7 | -0.050 |
|-------------------------+--------------+-----------------------------------------+--------|
| danbarber_2010_s103     |     1     24 | 50.0   33.3   16.7    0.0   50.0  100.0 | -1.265 |
|-------------------------+--------------+-----------------------------------------+--------|
| danielkahneman          |   127   3016 | 81.5   15.5    3.0    2.1   20.6   91.3 |  0.177 |
|-------------------------+--------------+-----------------------------------------+--------|
| s164                    |    11    168 | 45.8   23.2   31.0    1.8   56.0   90.9 |  0.169 |
|-------------------------+--------------+-----------------------------------------+--------|
| ericmead_2009p_ericmead |    52   1511 | 80.3   13.4    6.3    2.6   22.3   96.2 | -0.028 |
|-------------------------+--------------+-----------------------------------------+--------|
| garyflake               |    35   1102 | 82.6   13.6    3.8    3.5   21.0   94.3 |  0.079 |
|-------------------------+--------------+-----------------------------------------+--------|
| jamescameron            |    95   2972 | 81.0   13.3    5.7    3.1   22.1  100.0 |  0.091 |
|-------------------------+--------------+-----------------------------------------+--------|
| janemcgonigal           |   108   3821 | 80.5   14.9    4.7    3.2   22.7   99.1 | -0.023 |
|-------------------------+--------------+-----------------------------------------+--------|
| michaelspecter          |   124   2969 | 80.1   13.5    6.4    2.1   22.0   94.4 |  0.064 |
|-------------------------+--------------+-----------------------------------------+--------|
| robertgupta             |    37    878 | 83.4   12.9    3.8    3.3   19.9   91.9 |  0.351 |
|-------------------------+--------------+-----------------------------------------+--------|
| robertgupta_2010u_s57   |     1      2 |  0.0    0.0  100.0    0.0  100.0  100.0 |  0.000 |
|-------------------------+--------------+-----------------------------------------+--------|
| tomwujec                |    35   1122 | 79.6   15.2    5.2    3.1   23.5   94.3 | -0.109 |
|===========================================================================================|
| Sum/Avg                 |  1155  27512 | 80.0   14.7    5.3    2.8   22.9   93.0 |  0.036 |
|===========================================================================================|
| Mean                    |  77.0 1834.1 | 70.1   15.7   14.2    2.3   32.2   94.6 | -0.027 |
| S.D.                    |  68.0 1451.0 | 22.7    7.0   24.8    1.2   21.8    4.2 |  0.362 |
| Median                  |  52.0 1511.0 | 80.3   13.7    5.7    2.9   22.3   94.3 |  0.000 |
`-------------------------------------------------------------------------------------------'
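In this table, Err is the word error rate: the sum of Sub, Del and Ins, up to rounding (for aimeemullins, 13.7 + 4.2 + 2.9 = 20.8). To pull the overall figure out of a .sys file without reading the whole table, something like the following sketch may be enough. It assumes the column order shown above; sys_wer is a hypothetical helper name.

```shell
# Sketch: print the overall Err value (word error rate, in percent) from the
# Sum/Avg row of an sclite .sys file. sys_wer is a hypothetical helper name.
sys_wer() {
    # awk splits on whitespace, so the '|' separators count as fields:
    # $1="|" $2="Sum/Avg" $3="|" $4=#Snt $5=#Wrd $6="|"
    # $7=Corr $8=Sub $9=Del $10=Ins $11=Err $12=S.Err
    awk '$2 == "Sum/Avg" { print $11 }' "$1"
}
```

Applied to the file shown above, sys_wer /kaldi-trunk/egs/tedlium/s5/exp/tri3_mmi_b0.1/decode_test_it4/score_20/ctm.filt.filt.sys prints 22.9.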

