21 changes: 21 additions & 0 deletions egs/cmu_cslu_kids/README
@@ -0,0 +1,21 @@
This is an ASR recipe for children's speech using cmu_kids and cslu_kids.
Both of the corpora can be found on LDC:
- cmu_kids : https://catalog.ldc.upenn.edu/LDC97S63
- cslu_kids: https://catalog.ldc.upenn.edu/LDC2007S18

To run this recipe, you'll need a copy of both corpora:
./run.sh --cmu_kids <path_to_cmu_corpus> --cslu_kids <path_to_cslu_corpus>

By default, this recipe will download an LM pretrained on LibriSpeech from
lm_url=www.openslr.org/resources/11. If you already have a copy of this LM
and do not wish to redownload, you can specify the LM path using the --lm_src option:
./run.sh --cmu_kids <path_to_cmu_corpus> --cslu_kids <path_to_cslu_corpus> \
--lm_src <path_to_librispeech_lm>

This recipe will also download and clean CMU_Dict by default. If you have a clean copy
already, or wish to use your own dictionary, simply copy your version of the dict to
data/local/dict.

To compute extra features for the triphone models or to run VTLN, set the corresponding options to true:
./run.sh --cmu_kids <path_to_cmu_corpus> --cslu_kids <path_to_cslu_corpus> \
--vtln true --extra_features true
23 changes: 23 additions & 0 deletions egs/cmu_cslu_kids/s5/cmd.sh
@@ -0,0 +1,23 @@
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
can change all instances of 'queue.pl' to 'run.pl' (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.

export train_cmd=queue.pl
export decode_cmd="queue.pl --mem 2G"
# The use of cuda_cmd is deprecated; it is used only in 'nnet1'.
export cuda_cmd="queue.pl --gpu 1"

if [[ "$(hostname -f)" == *".fit.vutbr.cz" ]]; then
queue_conf=$HOME/queue_conf/default.conf # see example /homes/kazi/iveselyk/queue_conf/default.conf,
export train_cmd="queue.pl --config $queue_conf --mem 2G --matylda 0.2"
export decode_cmd="queue.pl --config $queue_conf --mem 3G --matylda 0.1"
export cuda_cmd="queue.pl --config $queue_conf --gpu 1 --mem 10G --tmp 40G"
fi
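As the comments above suggest, on a machine with no queueing system every 'queue.pl' becomes 'run.pl'. A minimal local-machine cmd.sh might look like this (a sketch, not part of the recipe):

```shell
# Minimal local-machine cmd.sh sketch: run.pl executes jobs directly on
# this host, so keep --nj modest to avoid exhausting memory.
export train_cmd=run.pl
export decode_cmd="run.pl --mem 2G"
export cuda_cmd="run.pl --gpu 1"
echo "$train_cmd"
```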
4 changes: 4 additions & 0 deletions egs/cmu_cslu_kids/s5/conf/decode.config
@@ -0,0 +1,4 @@
# Use wider-than-normal decoding beams (settings inherited from the RM recipe).
first_beam=16.0
beam=20.0
lattice_beam=10.0
8 changes: 8 additions & 0 deletions egs/cmu_cslu_kids/s5/conf/decode_dnn.config
@@ -0,0 +1,8 @@
# These settings come from the RM recipe, where the optimal decode LMWT is in the
# range 2..5 rather than the usual 10..15 (RM uses a simple rule-based LM instead of an n-gram LM).
scoring_opts="--min-lmwt 2 --max-lmwt 10"
# Still, it is better to use --acwt 0.1, both for decoding and sMBR,
acwt=0.1
# For this small task we can afford to have large beams,
beam=30.0 # beam for decoding. Was 13.0 in the scripts.
lattice_beam=18.0 # this has most effect on size of the lattices.
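The scoring_opts range above tells scoring to sweep the language-model weight and keep the best result; schematically (the WER values below are made up for illustration):

```python
# Schematic of the LMWT sweep done at scoring time: the lattices are scored
# once per language-model weight in [min-lmwt, max-lmwt], and the weight
# giving the lowest WER is reported. Hypothetical WER numbers.
wers = {2: 14.2, 4: 12.9, 6: 12.5, 8: 12.8, 10: 13.4}
best_lmwt = min(wers, key=wers.get)
print(best_lmwt, wers[best_lmwt])  # 6 12.5
```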
2 changes: 2 additions & 0 deletions egs/cmu_cslu_kids/s5/conf/mfcc.conf
@@ -0,0 +1,2 @@
--use-energy=false # only non-default option.
--allow-downsample=true
11 changes: 11 additions & 0 deletions egs/cmu_cslu_kids/s5/conf/mfcc_hires.conf
@@ -0,0 +1,11 @@
# config for high-resolution MFCC features, intended for neural network training
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--num-mel-bins=40 # similar to Google's setup.
--num-ceps=40 # there is no dimensionality reduction.
--low-freq=20 # low cutoff frequency for mel bins... this is high-bandwidth data, so
# there might be some information at the low end.
--high-freq=-400 # high cutoff frequency, relative to Nyquist of 8000 (=7600)
--allow-downsample=true
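As the comment on --high-freq notes, a non-positive value is interpreted relative to the Nyquist frequency; for 16 kHz audio the effective cutoff works out as follows (a sketch of the convention, not Kaldi code):

```python
# Kaldi convention: a --high-freq value <= 0 is an offset from the Nyquist
# frequency rather than an absolute cutoff.
sample_rate = 16000.0
nyquist = sample_rate / 2.0          # 8000 Hz
high_freq = -400.0                   # from the config above
effective_cutoff = nyquist + high_freq if high_freq <= 0 else high_freq
print(effective_cutoff)  # 7600.0
```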
1 change: 1 addition & 0 deletions egs/cmu_cslu_kids/s5/conf/online_cmvn.conf
@@ -0,0 +1 @@
# configuration file for apply-cmvn-online, used in the script ../local/run_online_decoding.sh
2 changes: 2 additions & 0 deletions egs/cmu_cslu_kids/s5/conf/plp.conf
@@ -0,0 +1,2 @@
# No non-default options for now.
--allow-downsample=true
137 changes: 137 additions & 0 deletions egs/cmu_cslu_kids/s5/local/chain/compare_wer.sh
@@ -0,0 +1,137 @@
#!/bin/bash

# this script is used for comparing decoding results between systems.
# e.g. local/chain/compare_wer.sh exp/chain/tdnn_{c,d}_sp
# For use with discriminatively trained systems you specify the epochs after a colon:
# for instance,
# local/chain/compare_wer.sh exp/chain/tdnn_c_sp exp/chain/tdnn_c_sp_smbr:{1,2,3}


if [ $# == 0 ]; then
echo "Usage: $0: [--looped] [--online] <dir1> [<dir2> ... ]"
echo "e.g.: $0 exp/chain/tdnn_{b,c}_sp"
echo "or (with epoch numbers for discriminative training):"
echo "$0 exp/chain/tdnn_b_sp_disc:{1,2,3}"
exit 1
fi

echo "# $0 $*"

include_looped=false
if [ "$1" == "--looped" ]; then
include_looped=true
shift
fi
include_online=false
if [ "$1" == "--online" ]; then
include_online=true
shift
fi


used_epochs=false

# this function set_names is used to separate the epoch-related parts of the name
# [for discriminative training] and the regular parts of the name.
# If called with a colon-free directory name, like:
# set_names exp/chain/tdnn_lstm1e_sp_bi_smbr
# it will set dir=exp/chain/tdnn_lstm1e_sp_bi_smbr and epoch_infix=""
# If called with something like:
# set_names exp/chain/tdnn_d_sp_smbr:3
# it will set dir=exp/chain/tdnn_d_sp_smbr and epoch_infix="_epoch3"


set_names() {
if [ $# != 1 ]; then
echo "compare_wer_general.sh: internal error"
exit 1 # exit the program
fi
dirname=$(echo $1 | cut -d: -f1)
epoch=$(echo $1 | cut -s -d: -f2)
if [ -z "$epoch" ]; then
epoch_infix=""
else
used_epochs=true
epoch_infix=_epoch${epoch}
fi
}



echo -n "# System "
for x in $*; do printf "% 10s" " $(basename $x)"; done
echo

strings=(
"#WER dev_clean_2 (tgsmall) "
"#WER dev_clean_2 (tglarge) ")

for n in 0 1; do
echo -n "${strings[$n]}"
for x in $*; do
set_names $x # sets $dirname and $epoch_infix
decode_names=(tgsmall_dev_clean_2 tglarge_dev_clean_2)

wer=$(cat $dirname/decode_${decode_names[$n]}/wer_* | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo
if $include_looped; then
echo -n "# [looped:] "
for x in $*; do
set_names $x # sets $dirname and $epoch_infix
wer=$(cat $dirname/decode_looped_${decode_names[$n]}/wer_* | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo
fi
if $include_online; then
echo -n "# [online:] "
for x in $*; do
set_names $x # sets $dirname and $epoch_infix
wer=$(cat ${dirname}_online/decode_${decode_names[$n]}/wer_* | utils/best_wer.sh | awk '{print $2}')
printf "% 10s" $wer
done
echo
fi
done


if $used_epochs; then
exit 0; # the diagnostics aren't comparable between regular and discriminatively trained systems.
fi


echo -n "# Final train prob "
for x in $*; do
prob=$(grep Overall $x/log/compute_prob_train.final.log | grep -v xent | awk '{printf("%.4f", $8)}')
printf "% 10s" $prob
done
echo

echo -n "# Final valid prob "
for x in $*; do
prob=$(grep Overall $x/log/compute_prob_valid.final.log | grep -v xent | awk '{printf("%.4f", $8)}')
printf "% 10s" $prob
done
echo

echo -n "# Final train prob (xent)"
for x in $*; do
prob=$(grep Overall $x/log/compute_prob_train.final.log | grep -w xent | awk '{printf("%.4f", $8)}')
printf "% 10s" $prob
done
echo

echo -n "# Final valid prob (xent)"
for x in $*; do
prob=$(grep Overall $x/log/compute_prob_valid.final.log | grep -w xent | awk '{printf("%.4f", $8)}')
printf "% 10s" $prob
done
echo

echo -n "# Num-params "
for x in $*; do
printf "% 10s" $(grep num-parameters $x/log/progress.1.log | awk '{print $2}')
done
echo
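The colon parsing that set_names relies on can be seen in isolation (the directory name below is a hypothetical example):

```shell
# Demo of the cut-based parsing used by set_names above.
# With a colon, -f2 yields the epoch; without one, `cut -s` prints nothing,
# which is why the function can test $epoch with -z.
arg="exp/chain/tdnn_d_sp_smbr:3"
dirname=$(echo "$arg" | cut -d: -f1)
epoch=$(echo "$arg" | cut -s -d: -f2)
echo "$dirname"   # exp/chain/tdnn_d_sp_smbr
echo "$epoch"     # 3
```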
1 change: 1 addition & 0 deletions egs/cmu_cslu_kids/s5/local/chain/run_tdnnf.sh
82 changes: 82 additions & 0 deletions egs/cmu_cslu_kids/s5/local/chain/tdnnf_decode.sh
@@ -0,0 +1,82 @@
#!/bin/bash

# Copyright Johns Hopkins University
# 2019 Fei Wu

# Decode on new data set using trained model.
# The data directory should be prepared in kaldi style.
# Usage:
# ./local/chain/tdnnf_decode.sh --data_src <prepared_data_dir>

set -euo pipefail
echo "$0 $@"

stage=0
decode_nj=10
data_src=
affix=
tree_affix=
nnet3_affix=

. ./cmd.sh
. ./path.sh
. ./utils/parse_options.sh

if ! cuda-compiled; then
cat <<EOF && exit 1
This script is intended to be used with GPUs, but you have not compiled Kaldi with CUDA.
If you want to use GPUs (and have them), go to src/, and configure and make on a machine
where "nvcc" is installed.
EOF
fi
data_name=$(basename $data_src)
data_hires="data/${data_name}_hires"
ivect_dir=exp/nnet3${nnet3_affix}/ivector_$data_name
tree_dir=exp/chain${nnet3_affix}/tree_sp${tree_affix:+_$tree_affix}
dir=exp/chain${nnet3_affix}/tdnn${affix}_sp

mfcc=mfcc_hires_$data_name
chunk_width=140,100,160
reporting_email=

if [ $stage -le 0 ]; then
rm -rf $data_hires
cp -r $data_src $data_hires
fi
# High resolution mfcc
if [ $stage -le 1 ]; then
steps/make_mfcc.sh --nj 10 --mfcc-config conf/mfcc_hires.conf \
--cmd "$train_cmd" $data_hires \
exp/$data_name/make_feat_hires $mfcc || exit 1;
steps/compute_cmvn_stats.sh $data_hires || exit 1;
utils/fix_data_dir.sh $data_hires || exit 1;
fi

# Extract i-vector
if [ $stage -le 2 ]; then
steps/online/nnet2/extract_ivectors_online.sh --cmd "$train_cmd" --nj 20 \
$data_hires exp/nnet3${nnet3_affix}/extractor $ivect_dir
fi

if [ $stage -le 3 ]; then
frames_per_chunk=$(echo $chunk_width | cut -d, -f1)
rm $dir/.error 2>/dev/null || true

(
nspk=$(wc -l <$data_hires/spk2utt)
steps/nnet3/decode.sh \
--acwt 1.0 --post-decode-acwt 10.0 \
--frames-per-chunk $frames_per_chunk \
--nj $nspk --cmd "$decode_cmd" --num-threads 4 \
--online-ivector-dir $ivect_dir \
$tree_dir/graph_tgsmall $data_hires ${dir}/decode_tgsmall_$data_name || exit 1

steps/lmrescore_const_arpa.sh --cmd "$decode_cmd" \
data/lang_test_{tgsmall,tglarge} \
$data_hires ${dir}/decode_{tgsmall,tglarge}_$data_name || exit 1
) || touch $dir/.error &

wait
[ -f $dir/.error ] && echo "$0: there was a problem while decoding" && exit 1
fi
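The decode step uses only the first of the comma-separated chunk widths; that cut idiom in isolation:

```shell
# frames_per_chunk is the first entry of chunk_width, extracted with cut,
# exactly as in the script above.
chunk_width=140,100,160
frames_per_chunk=$(echo $chunk_width | cut -d, -f1)
echo "$frames_per_chunk"  # 140
```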
