Running scripts creation

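Two training methods are used: 1. monophone training, 2. simple triphone training. The decoding results show the difference between the two methods.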

Task

In the kaldi-trunk/egs/digits directory, create three scripts:

a.) cmd.sh

Runs the jobs on the local machine.
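A minimal sketch of what cmd.sh could contain, assuming all jobs run on the local CPU through run.pl, the local job runner shipped with Kaldi recipes in utils/. The variable names train_cmd and decode_cmd follow the convention of the Kaldi example recipes; treat the exact content as an assumption rather than the required file.

```shell
# cmd.sh (sketch): run every job on the local machine, no cluster.
# Assumption: run.pl is reachable through PATH (set up in path.sh).
export train_cmd=run.pl
export decode_cmd=run.pl
```

On a cluster these two variables would point at a queue wrapper instead; for a small digits task the local runner should be enough.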

b.) path.sh

Adds the required paths.
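A hedged sketch of a path.sh, assuming the file lives in kaldi-trunk/egs/digits so that the Kaldi root is two directories up. The list of binary directories is an assumption; which ones you need depends on your build and on the tools your run.sh calls.

```shell
# path.sh (sketch): make Kaldi binaries and helper scripts reachable.
# Assumption: this file sits in kaldi-trunk/egs/digits, so the Kaldi
# root directory is two levels up from the current directory.
export KALDI_ROOT="$(pwd)/../.."

# Assumption: these are the tool directories the recipe needs.
export PATH="$PWD/utils:$KALDI_ROOT/src/bin:$KALDI_ROOT/src/featbin:$KALDI_ROOT/src/gmmbin:$KALDI_ROOT/src/fstbin:$KALDI_ROOT/src/latbin:$KALDI_ROOT/tools/openfst/bin:$PWD:$PATH"

# Pick up optional tool settings (for example SRILM) if the file exists.
if [ -f "$KALDI_ROOT/tools/env.sh" ]; then
  . "$KALDI_ROOT/tools/env.sh"
fi

# Kaldi expects its data files to be sorted in the C locale.
export LC_ALL=C
```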

c.) run.sh
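The two training passes can be sketched as follows. This is an illustration, not the complete run.sh: it assumes that the data directories, the features, and data/lang are already prepared, that nj=1, and that the tree sizes 2000 and 11000 are placeholder values. The run helper and the train_and_decode function are hypothetical names introduced here; by default the sketch only prints the commands so that it can be tried outside a Kaldi recipe directory.

```shell
# run.sh (sketch): monophone pass, then a simple triphone pass.
# Assumptions: data/train, data/test and data/lang (with features)
# already exist; nj and the 2000/11000 sizes are illustrative only.

# Print each command by default; set DRY_RUN=0 inside a real Kaldi
# recipe directory to actually execute the steps.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@" || exit 1
  fi
}

train_and_decode() {
  nj=1
  train_cmd=${train_cmd:-run.pl}
  decode_cmd=${decode_cmd:-run.pl}

  # 1. Monophone training and decoding.
  #    (Older Kaldi versions need "--mono" in this mkgraph.sh call.)
  run steps/train_mono.sh --nj "$nj" --cmd "$train_cmd" data/train data/lang exp/mono
  run utils/mkgraph.sh data/lang exp/mono exp/mono/graph
  run steps/decode.sh --nj "$nj" --cmd "$decode_cmd" exp/mono/graph data/test exp/mono/decode

  # 2. Align with the monophone model, then train and decode triphones.
  run steps/align_si.sh --nj "$nj" --cmd "$train_cmd" data/train data/lang exp/mono exp/mono_ali
  run steps/train_deltas.sh --cmd "$train_cmd" 2000 11000 data/train data/lang exp/mono_ali exp/tri1
  run utils/mkgraph.sh data/lang exp/tri1 exp/tri1/graph
  run steps/decode.sh --nj "$nj" --cmd "$decode_cmd" exp/tri1/graph data/test exp/tri1/decode
}

train_and_decode
```

A real run.sh would normally source path.sh and cmd.sh first and would also contain the data preparation and feature extraction stages that this sketch leaves out.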

Getting results

Now all you have to do is run the run.sh script.

Go to the newly created kaldi-trunk/egs/digits/exp directory. There you will find folders with the mono and tri1 results; their directory structures are the same.

Go to the mono/decode directory. Here you will find the result files (named wer_{number}).
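To compare the result files quickly, a small helper like the one below can pick the file with the lowest word error rate. It is a sketch that assumes each wer_{number} file contains a line of the form "%WER 12.50 [ 10 / 80, 2 ins, 3 del, 5 sub ]"; best_wer is a hypothetical function name, not part of Kaldi.

```shell
# Print "<WER> <file>" for the result file with the lowest WER.
# Assumption: every wer_* file has a line starting with "%WER"
# whose second field is the error rate in percent.
best_wer() {
  decode_dir=$1
  awk '/^%WER/ { print $2, FILENAME }' "$decode_dir"/wer_* \
    | LC_ALL=C sort -n \
    | head -n 1
}
```

Calling best_wer exp/mono/decode and then best_wer exp/tri1/decode from kaldi-trunk/egs/digits gives one line per system, which makes the difference between monophone and triphone training easy to see.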

Summary

This is just an example. The point of this short tutorial is to show you how to create 'anything' in Kaldi and to give you a better understanding of how to think while using this toolkit. Personally, I started by looking for tutorials made by the Kaldi authors/developers. After a successful Kaldi installation, I launched some example scripts (Yesno, Voxforge, LibriSpeech). They are relatively easy and have free acoustic/language data to download, and I used these three as a base for my own scripts.

Make sure you follow http://kaldi-asr.org/ - the official project website. It has two very useful sections for beginners:
a.) Kaldi tutorial - an almost step-by-step tutorial on how to set up an ASR system; up to some point this can be done without the RM dataset. It is worth reading.
b.) Data preparation - a very detailed explanation of how to use your own data in Kaldi.

More useful links about Kaldi I found:
https://sites.google.com/site/dpovey/kaldi-lectures - Kaldi lectures created by the main author
http://www.superlectures.com/icassp2011/category.php?lang=en&id=131 - similar; video version
http://www.diplomovaprace.cz/133/thesis_oplatek.pdf - a master's thesis about speech recognition using Kaldi

This is all from my side. Good luck!
