- kidi@kidi-ThinkPad-T420s:~/kaldi-trunk/egs/setup_base_files$ ./workspace_setup.sh "start" /home/kidi/kaldi-trunk/egs/setup_base_files/database/an4/data/ /home/kidi/kaldi-trunk/egs/setup_base_files/database/an4/utterance/transcript
- Data TRAIN/TEST SPLITTED
- UTT/wav.scp/utt2spk created for train and test!
- Text and utt2spk sorted
- wav.scp sorted
- spk2utt created!
- --2016-09-11 18:40:18-- http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/sphinxdict/cmudict.0.7a_SPHINX_40
- Resolving svn.code.sf.net (svn.code.sf.net)... 216.34.181.157
- Connecting to svn.code.sf.net (svn.code.sf.net)|216.34.181.157|:80... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: 3231422 (3.1M) [text/plain]
- Saving to: ‘lexicon.txt’
- 100%[======================================>] 3,231,422 975KB/s in 3.2s
- 2016-09-11 18:40:22 (975 KB/s) - ‘lexicon.txt’ saved [3231422/3231422]
- Filtered lexicon!
- Checking data/local/lang/silence_phones.txt ...
- --> reading data/local/lang/silence_phones.txt
- --> data/local/lang/silence_phones.txt is OK
- Checking data/local/lang/optional_silence.txt ...
- --> reading data/local/lang/optional_silence.txt
- --> data/local/lang/optional_silence.txt is OK
- Checking data/local/lang/nonsilence_phones.txt ...
- --> reading data/local/lang/nonsilence_phones.txt
- --> data/local/lang/nonsilence_phones.txt is OK
- Checking disjoint: silence_phones.txt, nonsilence_phones.txt
- --> disjoint property is OK.
- Checking data/local/lang/lexicon.txt
- --> reading data/local/lang/lexicon.txt
- --> data/local/lang/lexicon.txt is OK
- Checking data/local/lang/extra_questions.txt ...
- --> data/local/lang/extra_questions.txt is empty (this is OK)
- --> SUCCESS [validating dictionary directory data/local/lang]
- **Creating data/local/lang/lexiconp.txt from data/local/lang/lexicon.txt
- fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
- prepare_lang.sh: validating output directory
- utils/validate_lang.pl data/lang
- Checking data/lang/phones.txt ...
- --> data/lang/phones.txt is OK
- Checking words.txt: #0 ...
- --> data/lang/words.txt is OK
- Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
- --> silence.txt and nonsilence.txt are disjoint
- --> silence.txt and disambig.txt are disjoint
- --> disambig.txt and nonsilence.txt are disjoint
- --> disjoint property is OK
- Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
- --> summation property is OK
- Checking data/lang/phones/context_indep.{txt, int, csl} ...
- --> 5 entry/entries in data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.{txt, int, csl} are OK
- Checking data/lang/phones/nonsilence.{txt, int, csl} ...
- --> 136 entry/entries in data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.{txt, int, csl} are OK
- Checking data/lang/phones/silence.{txt, int, csl} ...
- --> 5 entry/entries in data/lang/phones/silence.txt
- --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
- --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
- --> data/lang/phones/silence.{txt, int, csl} are OK
- Checking data/lang/phones/optional_silence.{txt, int, csl} ...
- --> 1 entry/entries in data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.{txt, int, csl} are OK
- Checking data/lang/phones/disambig.{txt, int, csl} ...
- --> 4 entry/entries in data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.{txt, int, csl} are OK
- Checking data/lang/phones/roots.{txt, int} ...
- --> 35 entry/entries in data/lang/phones/roots.txt
- --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
- --> data/lang/phones/roots.{txt, int} are OK
- Checking data/lang/phones/sets.{txt, int} ...
- --> 35 entry/entries in data/lang/phones/sets.txt
- --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
- --> data/lang/phones/sets.{txt, int} are OK
- Checking data/lang/phones/extra_questions.{txt, int} ...
- --> 9 entry/entries in data/lang/phones/extra_questions.txt
- --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
- --> data/lang/phones/extra_questions.{txt, int} are OK
- Checking data/lang/phones/word_boundary.{txt, int} ...
- --> 141 entry/entries in data/lang/phones/word_boundary.txt
- --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
- --> data/lang/phones/word_boundary.{txt, int} are OK
- Checking optional_silence.txt ...
- --> reading data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.txt is OK
- Checking disambiguation symbols: #0 and #1
- --> data/lang/phones/disambig.txt has "#0" and "#1"
- --> data/lang/phones/disambig.txt is OK
- Checking topo ...
- Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
- --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
- --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
- --> data/lang/phones/word_boundary.txt is OK
- Checking word-level disambiguation symbols...
- --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
- Checking word_boundary.int and disambig.int
- --> generating a 73 word sequence
- --> resulting phone sequence from L.fst corresponds to the word sequence
- --> L.fst is OK
- --> generating a 88 word sequence
- --> resulting phone sequence from L_disambig.fst corresponds to the word sequence
- --> L_disambig.fst is OK
- Checking data/lang/oov.{txt, int} ...
- --> 1 entry/entries in data/lang/oov.txt
- --> data/lang/oov.int corresponds to data/lang/oov.txt
- --> data/lang/oov.{txt, int} are OK
- --> data/lang/L.fst is olabel sorted
- --> data/lang/L_disambig.fst is olabel sorted
- --> SUCCESS [validating lang directory data/lang]
- utils/validate_data_dir.sh: Successfully validated data-directory data/train
- Lexicon and phones generated! - Run train scripts.
- kidi@kidi-ThinkPad-T420s:~/kaldi-trunk/egs/setup_base_files$ rm -rf ../start/
- kidi@kidi-ThinkPad-T420s:~/kaldi-trunk/egs/setup_base_files$ ./workspace_setup.sh "start" /home/kidi/kaldi-trunk/egs/setup_base_files/database/an4/data/ /home/kidi/kaldi-trunk/egs/setup_base_files/database/an4/utterance/transcript
- Data TRAIN/TEST SPLITTED
- UTT/wav.scp/utt2spk created for train and test!
- Text and utt2spk sorted
- wav.scp sorted
- spk2utt created!
- --2016-09-11 18:40:43-- http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/sphinxdict/cmudict.0.7a_SPHINX_40
- Resolving svn.code.sf.net (svn.code.sf.net)... 216.34.181.157
- Connecting to svn.code.sf.net (svn.code.sf.net)|216.34.181.157|:80... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: 3231422 (3.1M) [text/plain]
- Saving to: ‘lexicon.txt’
- 100%[======================================>] 3,231,422 1.08MB/s in 2.8s
- 2016-09-11 18:40:46 (1.08 MB/s) - ‘lexicon.txt’ saved [3231422/3231422]
- Filtered lexicon!
- Checking data/local/lang/silence_phones.txt ...
- --> reading data/local/lang/silence_phones.txt
- --> data/local/lang/silence_phones.txt is OK
- Checking data/local/lang/optional_silence.txt ...
- --> reading data/local/lang/optional_silence.txt
- --> data/local/lang/optional_silence.txt is OK
- Checking data/local/lang/nonsilence_phones.txt ...
- --> reading data/local/lang/nonsilence_phones.txt
- --> data/local/lang/nonsilence_phones.txt is OK
- Checking disjoint: silence_phones.txt, nonsilence_phones.txt
- --> disjoint property is OK.
- Checking data/local/lang/lexicon.txt
- --> reading data/local/lang/lexicon.txt
- --> data/local/lang/lexicon.txt is OK
- Checking data/local/lang/extra_questions.txt ...
- --> data/local/lang/extra_questions.txt is empty (this is OK)
- --> SUCCESS [validating dictionary directory data/local/lang]
- **Creating data/local/lang/lexiconp.txt from data/local/lang/lexicon.txt
- fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
- prepare_lang.sh: validating output directory
- utils/validate_lang.pl data/lang
- Checking data/lang/phones.txt ...
- --> data/lang/phones.txt is OK
- Checking words.txt: #0 ...
- --> data/lang/words.txt is OK
- Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
- --> silence.txt and nonsilence.txt are disjoint
- --> silence.txt and disambig.txt are disjoint
- --> disambig.txt and nonsilence.txt are disjoint
- --> disjoint property is OK
- Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
- --> summation property is OK
- Checking data/lang/phones/context_indep.{txt, int, csl} ...
- --> 5 entry/entries in data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
- --> data/lang/phones/context_indep.{txt, int, csl} are OK
- Checking data/lang/phones/nonsilence.{txt, int, csl} ...
- --> 136 entry/entries in data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
- --> data/lang/phones/nonsilence.{txt, int, csl} are OK
- Checking data/lang/phones/silence.{txt, int, csl} ...
- --> 5 entry/entries in data/lang/phones/silence.txt
- --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
- --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
- --> data/lang/phones/silence.{txt, int, csl} are OK
- Checking data/lang/phones/optional_silence.{txt, int, csl} ...
- --> 1 entry/entries in data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.{txt, int, csl} are OK
- Checking data/lang/phones/disambig.{txt, int, csl} ...
- --> 4 entry/entries in data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
- --> data/lang/phones/disambig.{txt, int, csl} are OK
- Checking data/lang/phones/roots.{txt, int} ...
- --> 35 entry/entries in data/lang/phones/roots.txt
- --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
- --> data/lang/phones/roots.{txt, int} are OK
- Checking data/lang/phones/sets.{txt, int} ...
- --> 35 entry/entries in data/lang/phones/sets.txt
- --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
- --> data/lang/phones/sets.{txt, int} are OK
- Checking data/lang/phones/extra_questions.{txt, int} ...
- --> 9 entry/entries in data/lang/phones/extra_questions.txt
- --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
- --> data/lang/phones/extra_questions.{txt, int} are OK
- Checking data/lang/phones/word_boundary.{txt, int} ...
- --> 141 entry/entries in data/lang/phones/word_boundary.txt
- --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
- --> data/lang/phones/word_boundary.{txt, int} are OK
- Checking optional_silence.txt ...
- --> reading data/lang/phones/optional_silence.txt
- --> data/lang/phones/optional_silence.txt is OK
- Checking disambiguation symbols: #0 and #1
- --> data/lang/phones/disambig.txt has "#0" and "#1"
- --> data/lang/phones/disambig.txt is OK
- Checking topo ...
- Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
- --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
- --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
- --> data/lang/phones/word_boundary.txt is OK
- Checking word-level disambiguation symbols...
- --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
- Checking word_boundary.int and disambig.int
- --> generating a 71 word sequence
- --> resulting phone sequence from L.fst corresponds to the word sequence
- --> L.fst is OK
- --> generating a 96 word sequence
- --> resulting phone sequence from L_disambig.fst corresponds to the word sequence
- --> L_disambig.fst is OK
- Checking data/lang/oov.{txt, int} ...
- --> 1 entry/entries in data/lang/oov.txt
- --> data/lang/oov.int corresponds to data/lang/oov.txt
- --> data/lang/oov.{txt, int} are OK
- --> data/lang/L.fst is olabel sorted
- --> data/lang/L_disambig.fst is olabel sorted
- --> SUCCESS [validating lang directory data/lang]
- utils/validate_data_dir.sh: file data/train/utt2spk is not in sorted order or has duplicates
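The second run stops at the final validate_data_dir.sh message, which reports that the first column of data/train/utt2spk (the utterance ID) is either out of order or repeated. Kaldi's data-directory checks compare keys in the C locale, so a file sorted under another locale can still fail. The snippet below is a minimal sketch for telling the two cases apart. The function names check_sorted_uniq and duplicate_ids are hypothetical helpers, not part of Kaldi or of workspace_setup.sh, and the default path is simply the one named in the log.

```shell
#!/usr/bin/env bash
# Kaldi's validation compares keys byte-wise, so use the C locale here as well.
export LC_ALL=C

# check_sorted_uniq FILE (hypothetical helper): succeeds only if the first
# column of FILE is strictly increasing, i.e. sorted and free of duplicates.
check_sorted_uniq() {
  awk '{print $1}' "$1" | sort -uC
}

# duplicate_ids FILE (hypothetical helper): prints each first-column key
# that occurs more than once in FILE.
duplicate_ids() {
  awk '{print $1}' "$1" | sort | uniq -d
}

# Default path taken from the log; pass another file as the first argument.
f=${1:-data/train/utt2spk}
if [ -f "$f" ]; then
  if check_sorted_uniq "$f"; then
    echo "$f: sorted, no duplicate utterance IDs"
  else
    echo "$f: not sorted or has duplicates; repeated IDs (if any):"
    duplicate_ids "$f"
  fi
fi
```

If no repeated IDs are printed, the problem is ordering only, and re-sorting the file with LC_ALL=C exported (for example `sort -k1,1 -o data/train/utt2spk data/train/utt2spk`) is the likely fix. If IDs are printed, the duplication probably originates in the step that builds utt2spk, and is better fixed there than by deleting lines afterwards.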