Interaction in Virtual Worlds VM
Natural Language Processing/Computational Linguistics
Information Retrieval, Text Mining and Analytics
Spoken Interfaces and Dialogue Processing
<p>The “Interaction in Virtual Worlds” VM allows you to “play” with an existing (English) speech recognizer supporting live decoding, and to experience an open-source speech dialog system in a virtual world. The VM contains everything you need, except for a “viewer” for the OpenSim-based virtual world spawned by the VM.</p>
Availability (e.g. source code, binary only, XML file, etc.)
<p>Virtual Machine in OVA format.</p>
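An OVA package is, per the OVF standard, a plain tar archive bundling an .ovf descriptor with the disk images, so a downloaded .ova can be sanity-checked before import. A minimal sketch in Python (the helper name check_ova is illustrative, not part of the VM distribution):

```python
import tarfile

def check_ova(path):
    """Return the member names of an OVA (a tar archive) and whether it
    contains the .ovf descriptor that VirtualBox needs for import."""
    with tarfile.open(path) as tar:
        names = tar.getnames()
    return names, any(n.endswith(".ovf") for n in names)
```

For the VM described here, check_ova("Mario2-IVW.ova") should report an .ovf descriptor among the members; if it does not, the download is likely corrupt or incomplete.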
Support Status (e.g. as-is, maintained, etc.)
<p>Supported as part of the “Speech Recognition Virtual Kitchen”; see the <a href="http://speechkitchen.org/forums/forum/virtual-worlds-forum/">forum</a>.</p>
Prerequisites (e.g. Windows XP, Java 1.6, etc.)
<p>Virtualbox 4.3.16, Singularity viewer (or equivalent)</p>
Required Acknowledgement (e.g. paper to cite)
<p>See <a href="http://speechkitchen.org/legal/">http://speechkitchen.org/legal/</a>. </p>
License
<p><a href="http://www.ubuntu.com/about/about-ubuntu/licensing">Ubuntu licensing approach</a>, see <a href="http://speechkitchen.org/legal/">http://speechkitchen.org/legal/</a>.</p>
Contact (e.g. e-mail)
<ul> <li><a href="http://www.cs.cmu.edu/directory/florian-metze">http://www.cs.cmu.edu/directory/florian-metze</a></li> <li><a href="http://www.cs.cmu.edu/directory/eric-riebling">http://www.cs.cmu.edu/directory/eric-riebling</a> </li> </ul>
<header class="entry-header"> <h1 class="entry-title">Interaction in Virtual Worlds README</h1> </header> <div class="entry-content"><address>This README corresponds to Mario2-IVW.ova, current as of 20140911.</address> <p>There are a couple of experiments that can be performed with this virtual machine. The first is real-time speech recognition (‘live decoding’) using the Kaldi online decoder. The second is a full “interaction in virtual worlds” speech dialog system, which you can fully control.</p> <hr /> <h3>Table of Contents</h3> <p>I. Installation<br />II. Running the system<br />III. Customizing the system<br />IV. Bot Actions</p> <hr /> <h3>I. Installation</h3> <ol> <li>Install Oracle VirtualBox (<a href="http://virtualbox.org/">http://virtualbox.org/</a>), along with the matching Extension Pack. The VM works best with version 4.3.16.</li> <li>Set up a host-only network in the VirtualBox graphical user interface, via “File” → “Settings” (or “Preferences”) → “Network”. Click the “Host-only Networks” tab; if no networks are listed yet, click the network-card icon with the green plus sign on the right. The resulting new default network should appear with the name ‘vboxnet0’.</li> <li>Import the Mario2-IVW.ova file into VirtualBox and run it.</li> <li>Make sure that your microphone can be used from inside the virtual machine. First check that it works outside the VM (levels present, but not distorting), then check it inside the VM (via the VM’s “System Settings” → “Sound”). The effective level is a combination of the two.</li> <li>Install the Singularity Viewer from <a href="http://www.singularityviewer.org/downloads">http://www.singularityviewer.org/downloads</a>.</li> <li>While running Singularity, create a new grid. 
Click the “Grid Manager” button, then click the “Create” button. In the “Grid Name” field, enter any name you like, and in the “Login URI” field enter http://192.168.56.101:9000. Click “OK” to close the window.</li> <li>The username is “World Master”, and the password is “avatar”. You will use these, together with the grid name you just chose, to log into the virtual world, but not until you have started the server within the VM, as described next.</li> <li>The password for the virtual machine is ‘?1zza4All’, should you need it.</li> </ol> <p>NOTE: if you are trying to run this from a Windows host, there is a bug in VirtualBox that prevents the microphone from working with an Ubuntu guest operating system. See this forum post for more information: <a href="http://speechkitchen.org/forums/topic/new-readme-for-mario2-ivw-vm/#post-748">http://speechkitchen.org/forums/topic/new-readme-for-mario2-ivw-vm/#post-748</a></p> <h3>II. Running the system</h3> <p>Once the VM is running…</p> <ol> <li>Make sure that your microphone works inside the virtual machine (see Installation, Step 4).</li> <li>To test <strong>“online/live decoding”</strong>, do the following: <ol> <li>Open a terminal.</li> <li>Run “cd /kaldi-trunk/egs/voxforge/online_demo”.</li> <li>Run “./run.sh --test-mode live”.</li> <li>Try speaking into the microphone. 
The quality is terrible, but you should see speech recognition output appear.</li> </ol> <p>You might see some error messages on the console; the system can still work in spite of these:</p> <pre>ALSA lib pcm_dsnoop.c:612:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib setup.c:565:(add_elem) Cannot obtain info for CTL elem (MIXER,'IEC958 Playback Default',0,0,0): No such file or directory
ALSA lib setup.c:565:(add_elem) Cannot obtain info for CTL elem (MIXER,'IEC958 Playback Default',0,0,0): No such file or directory
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:957:(snd_pcm_dmix_open) The dmix plugin supports only playback stream</pre> </li> <li>To start the “<strong>Interaction in Virtual Worlds</strong>” system, double-click the “START” desktop icon with the headphones. This starts four processes: <ol> <li>The OpenSim world server</li> <li>The Kaldi Online Decoder speech recognizer</li> <li>The Stanford CoreNLP Parser</li> <li>The Kaldi/Parser client</li> </ol> </li> <li>You can read the individual commands in startBackend.sh if you want to run them separately or display their terminals. These processes take a long time to start up. 
You will know that all the processes have started when the “Kaldi Online Decoder” terminal window displays several lines of “ALSA lib pcm” warning messages.</li> <li>Now you can log the World Master avatar in to Singularity, as in Installation Steps 6 and 7.</li> <li>Once the world loads, open MonoDevelop (third icon down the taskbar on the left of the VM screen), click on the IVW solution, and run it with the play button. This starts the SampleBot project within the IVW solution and logs the “Friend Bot”, the computer-controlled character that you can talk to, into the virtual world. It also connects to the Kaldi Parser Wrapper and receives text from the speech recognizer. The IVW solution contains another project, “DogBot”, which has lots of sample code that can be used to extend the system.</li> </ol> <p>Warning: If the parser times out, both the parser and the wrapper will throw an error (if both are running). In that case, please restart first the parser, then the wrapper. (See below.)</p> <h3>III. Customizing the system</h3> <p>Here are the locations of all the code:<br />The START icon runs startBackend.sh, which is located on the Desktop. There you can find the commands to run parts of the system individually.</p> <ol> <li>Kaldi Online Decoder: /kaldi-trunk/egs/voxforge/online_demo<br />We use this code, but plug in models trained on TED talks (tedlium).<br />This is the speech recognition system.</li> <li>OpenSim: ~/Desktop/opensim-0.8/<br />This sets up the virtual world. 
If you cannot log in to Singularity, it is probably because you need to wait for this to finish starting.</li> <li>Stanford Parser: ~/Desktop/parse/corenlp.py<br />This is the server for the parser.</li> <li>Kaldi Wrapper: ~/Desktop/parse/wrapKaldiLive.py<br />This code takes the output of the Kaldi Online Decoder, feeds it into the Stanford Parser, and makes the results available to the bot code as a TCP socket service on localhost port 9999.<br />Make sure that the Kaldi and Stanford servers are running before you start this code. If this script throws a timeout error, please restart the Stanford Parser and then restart this client.</li> <li>Communicating with the bot: ~/Desktop/Bot Development/SampleBot<br />Accessible in MonoDevelop (IVW solution, SampleBot project), this code logs the bot into the virtual world and polls the Kaldi wrapper. Make sure that the Kaldi online decoder and the wrapper are running before you start it. NOTE: if this code is suspended in the debugger for long enough, the bot will disappear from the virtual world. (The virtual-world protocol requires ‘keep-alive’ messages behind the scenes.)</li> </ol> <h3>IV. Bot Actions</h3> <p>At the moment, the bot is very limited in its capabilities. It can tell you where certain objects are located: it scans its surroundings for an object’s name and, if it finds it, tells you how many meters away the object is. The search is limited to a radius of 20 meters; you can experiment with this.</p> <p>There are a LOT more things you could add to extend this system, some of which exist in the DogBot project (in Mono). Feel free to play and drop us a line. 
More details about some of the included technologies can be found here:</p> <ul> <li><a href="http://opensimulator.org/wiki/Main_Page">OpenSimulator</a></li> <li><a href="http://secondlife.com/">Second Life</a></li> <li><a href="http://lib.openmetaverse.org/wiki/Main_Page">LibOpenMetaverse</a></li> <li><a href="http://nlp.stanford.edu/software/lex-parser.shtml">Stanford CoreNLP Parser</a></li> </ul> <h2>Advanced Topics</h2> <h4>Kaldi Online Decoder Model Files</h4> <p>The Kaldi language/acoustic model graphs produced by the training examples (“egs”, such as egs/tedlium) consist of several files:</p> <pre>HCLG.fst, matrix, model, phones.txt, tree, words.txt</pre> <p>This set of files makes up a ‘model’ in the Kaldi online decode example. Models are located in named folders under</p> <pre>/kaldi-trunk/egs/voxforge/online_demo/online_data/models/</pre> <p>They come from the output of running other Kaldi experiments, such as</p> <pre>/kaldi-trunk/egs/tedlium</pre> <p>Here is a mapping of model files, and their origins, for the final stage of the ‘tedlium’ experiment. On the left are the names in the online-decode models folder; on the right is where each file originates:</p> <pre>HCLG.fst    egs/tedlium/s5/exp/tri3_denlats/dengraph/HCLG.fst
matrix      egs/tedlium/s5/exp/tri3_mmi_b0.1/final.mat
model       egs/tedlium/s5/exp/tri3_mmi_b0.1/final.mdl
phones.txt  egs/tedlium/s5/exp/tri3_denlats/dengraph/phones.txt
tree        egs/tedlium/s5/exp/tri3_mmi_b0.1/tree
words.txt   egs/tedlium/s5/exp/tri3_denlats/dengraph/words.txt</pre> <p>The training example above has “tri3_mmi_b0.1” as the name of its final stage (training tends to build upon previous stages), and each stage gets a new folder name under exp/. You can usually get the name of the final stage by looking at the end of run.sh.</p> </div>
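The file mapping above can be turned into a small staging script. A minimal sketch in Python, assuming the stock /kaldi-trunk layout used throughout this README; the function name install_model and the parameterized egs_root are illustrative, not part of the VM:

```python
import shutil
from pathlib import Path

# Mapping from the table above: online-decode model file -> where it
# originates in the tedlium experiment output (relative to egs/).
MODEL_FILES = {
    "HCLG.fst":   "tedlium/s5/exp/tri3_denlats/dengraph/HCLG.fst",
    "matrix":     "tedlium/s5/exp/tri3_mmi_b0.1/final.mat",
    "model":      "tedlium/s5/exp/tri3_mmi_b0.1/final.mdl",
    "phones.txt": "tedlium/s5/exp/tri3_denlats/dengraph/phones.txt",
    "tree":       "tedlium/s5/exp/tri3_mmi_b0.1/tree",
    "words.txt":  "tedlium/s5/exp/tri3_denlats/dengraph/words.txt",
}

def install_model(egs_root, model_name="tedlium"):
    """Copy trained model files into a named model folder for the
    online demo. egs_root is the Kaldi egs/ directory."""
    egs = Path(egs_root)
    dst = egs / "voxforge" / "online_demo" / "online_data" / "models" / model_name
    dst.mkdir(parents=True, exist_ok=True)
    for name, origin in MODEL_FILES.items():
        shutil.copy(egs / origin, dst / name)
    return dst
```

On the VM, install_model("/kaldi-trunk/egs") would populate the models/tedlium folder for the online demo; pointing egs_root at a different Kaldi checkout lets you stage models from other experiments the same way.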