Open
84 commits
b9d4b76
[egs] Fix to BSD compatibility of TIMIT data prep (#2966)
danpovey Jan 5, 2019
6b17571
[scripts] Fix RNNLM training script problem (chunk_length was ignored…
hainan-xv Jan 5, 2019
32b8cf1
[src] Fix bug in lattice-1best.cc RE removing insertion penalty (#2970)
freewym Jan 6, 2019
1079922
[src] Compute a separate avg (start, end) interval for each sausage w…
dogancan Jan 6, 2019
205dbd8
[build] Move nvcc verbose flag to proper location (#2962)
ryanleary Jan 7, 2019
1cac236
[egs] Fix mini_librispeech download_lm.sh crash; thx:chris.keith.john…
danpovey Jan 7, 2019
c55002a
[src] Add some code for Batch-renorm alternative
danpovey Jan 8, 2019
6f33848
[src] Various bug fixes in normalization components; fixes to test code.
danpovey Jan 8, 2019
6c6b9b5
[src] remove unused header
danpovey Jan 8, 2019
37d6950
[egs] minor fixes related to python2 vs python3 differences (#2977)
david-ryan-snyder Jan 8, 2019
7a89036
[src] Fix to VarNormComponent
danpovey Jan 9, 2019
58f8729
[src] fix more bugs in VarNorm/MeanNormComponent
danpovey Jan 9, 2019
a7f00ba
[src] Fix further backprop bug in VarNormComponent
danpovey Jan 9, 2019
8686698
[src] Fix further backprop bug in VarNormComponent
danpovey Jan 9, 2019
a6aa269
[src] Small fix in test code, avoid spurious failure (#2978)
danpovey Jan 9, 2019
2864465
[egs] Fix CSJ data-prep; minor path fix for USB version of data (#2979)
feddybear Jan 9, 2019
f02d2a3
[egs] Add paper ref to README.txt in reverb example (#2982)
sas91 Jan 10, 2019
a0790fb
[src] Fix bug in VarNormComponent::Add
danpovey Jan 10, 2019
ff6ddf7
[egs] Minor fixes to sitw recipe (fix problem introduced in #2925) (…
david-ryan-snyder Jan 11, 2019
9b6fbdd
[scripts] Fix bug introduced in #2957, RE integer division (#2986)
aarora8 Jan 11, 2019
c017268
[egs] Update WSJ flat-start chain recipes to use TDNN-F not TDNN+LSTM…
hhadian Jan 12, 2019
c631fcb
[scripts] Fix typo introduced in #2925 (#2989)
desh2608 Jan 13, 2019
9f981d0
[build] Modify Makefile and travis script to fix Travis failures (#2987)
galv Jan 14, 2019
ae573c9
[src] Simplification and efficiency improvement in ivector-plda-scori…
david-ryan-snyder Jan 16, 2019
50af3fc
[egs] Update madcat Arabic and Chinese egs, IAM (#2964)
aarora8 Jan 16, 2019
f90a98c
[src] Fix overflow bug in convolution code (#2992)
ChunChiehChang Jan 16, 2019
fd0aca9
[src] Fix nan issue in ctm times introduced in #2972, thx: @vesis84 (…
vimalmanohar Jan 16, 2019
e8d1287
[src] Fix 'sausage-time' issue which occurs with disabled MBR decodin…
KarelVesely84 Jan 18, 2019
99dc4d8
[egs] Add scripts for yomdle Russian (OCR task) (#2953)
aarora8 Jan 21, 2019
7e529ed
[egs] Simplify lexicon preparation in Fisher callhome Spanish (#2999)
GoVivace Jan 21, 2019
25f09e8
[egs] Update GALE Arabic recipe (#2934)
aarora8 Jan 22, 2019
4338004
[egs] Remove outdated NN results from Gale Arabic recipe (#3002)
aarora8 Jan 22, 2019
05d9a3d
[egs] Add RESULTS file for the tedlium s5_r3 (release 3) setup (#3003)
huangruizhe Jan 23, 2019
1dcdf80
[src] Fixes to grammar-fst code to handle LM-disambig symbols properl…
danpovey Jan 26, 2019
6f56512
[src] Cosmetic change to mel computation (fix option string) (#3011)
boeddeker Jan 30, 2019
56cfb95
[src] Fix Visual Studio error due to alternate syntactic form of nore…
daanzu Feb 1, 2019
9e35898
[egs] Fix location of sequitur installation (#3017)
jybaek Feb 1, 2019
a51bd96
[src] Fix w/ ifdef Visual Studio error from alternate syntactic form …
daanzu Feb 3, 2019
41ea8cf
[egs] Some fixes to getting data in heroico recipe (#3021)
danpovey Feb 3, 2019
fb514dc
[egs] BABEL script fix: avoid make_L_align.sh generating invalid file…
jtrmal Feb 4, 2019
afc5e78
[src] Fix to older online decoding code in online/ (OnlineFeInput; wa…
jdieguez Feb 6, 2019
226cbf7
[script] Fix unset bash variable in make_mfcc.sh (#3030)
oplatek Feb 8, 2019
6fc4c60
[scripts] Extend limit_num_gpus.sh to support --num-gpus 0. (#3027)
oplatek Feb 8, 2019
2f92bd9
[scripts] fix bug in utils/add_lex_disambig.pl when sil-probs and pro…
Teddyang Feb 15, 2019
403c5ee
[egs] Fix path in Tedlium r3 rnnlm training script (#3039)
francoishernandez Feb 18, 2019
abfbc56
[src] Thread-safety for GrammarFst (thx:armando.muscariello@gmail.com…
danpovey Feb 20, 2019
f09d48a
[scripts] Cosmetic fix to get_degs.sh (#3045)
Teddyang Feb 21, 2019
b0fc09d
[egs] Small bug fixes for IAM and UW3 recipes (#3048)
ChunChiehChang Feb 21, 2019
4494a85
[scripts] Nnet3 segmentation: fix default params (#3051)
danpovey Feb 26, 2019
bf33f1f
[scripts] Allow perturb_data_dir_speed.sh to work with utt2lang (#3055)
igrinis Feb 26, 2019
5f05d59
[scripts] Make beam in monophone training configurable (#3057)
xiaohui-zhang Feb 27, 2019
c0a555e
[scripts] Allow reverberate_data_dir.py to support unicode filenames …
rezame Feb 27, 2019
2e26464
[scripts] Make some cleanup scripts work with python3 (#3054)
vimalmanohar Mar 1, 2019
d21be2d
[scripts] bug fix to nnet2->3 conversion, fixes #886 (#3071)
jfainberg Mar 4, 2019
8fa9648
[src] Make copies occur in per-thread default stream (for GPUs) (#3068)
luitjens Mar 4, 2019
bd326dc
[src] Add GPU version of MergeTaskOutput().. relates to batch decodin…
luitjens Mar 4, 2019
17b7f3f
[src] Add device options to enable tensor core math mode. (#3066)
luitjens Mar 4, 2019
0a1f827
[src] Log nnet3 computation to VLOG, not std::cout (#3072)
kkm000 Mar 5, 2019
f2a89c2
[src] Allow upsampling in compute-mfcc-feats, etc. (#3014)
danpovey Mar 5, 2019
98b45c8
[src] fix problem with rand_r being undefined on Android (#3037)
keli78 Mar 5, 2019
197214d
[egs] Update swbd1_map_words.pl, fix them_1's -> them's (#3052)
Mar 5, 2019
991a75c
[src] Add const overload OnlineNnet2FeaturePipeline::IvectorFeature (…
kkm000 Mar 6, 2019
4432371
[src] Fix syntax error in egs/bn_music_speech/v1/local/make_musan.py …
antonstakhouski Mar 6, 2019
8460fa3
[src] Memory optimization for online feature extraction of long recor…
pzelasko Mar 6, 2019
b801b98
[build] fixed a bug in linux_configure_redhat_fat when use_cuda=no (#…
kan-bayashi Mar 7, 2019
ce97c47
[scripts] Add missing '. ./path.sh' to get_utt2num_frames.sh (#3076)
hhadian Mar 7, 2019
4d61452
[src,scripts,egs] Add count-based biphone tree tying for flat-start c…
hhadian Mar 7, 2019
01cef69
[scripts,egs] Remove sed from various scripts (avoid compatibility pr…
desh2608 Mar 8, 2019
2f95609
[src] Rework error logging for safety and cleanliness (#3064)
kkm000 Mar 8, 2019
bcfe3f8
[src] Change warp-synchronous to cub::BlockReduce (safer but slower) …
desh2608 Mar 10, 2019
1209c07
[src] Fix && and || uses where & and | intended, and other weird erro…
kkm000 Mar 11, 2019
5a5696f
[build] Some fixes to Makefiles (#3088)
kkm000 Mar 11, 2019
abd4869
[src] Fixed -Wreordered warnings in feat (#3090)
pzelasko Mar 12, 2019
9c8ba0f
[egs] Replace bc with perl -e (#3093)
entn-at Mar 12, 2019
8cbd582
[scripts] Fix python3 compatibility issue in data-perturbing script (…
nikhilm16 Mar 12, 2019
7435661
[doc] fix some typos in doc. (#3097)
csukuangfj Mar 12, 2019
5bdea69
[build] Make sure expf() speed probe times sensibly (#3089)
kkm000 Mar 12, 2019
b7a4fec
[scripts] Make sure merge_targets.py works in python3 (#3094)
XIAOYixuan Mar 12, 2019
94475d6
[src] ifdef to fix compilation failure on CUDA 8 and earlier (#3103)
desh2608 Mar 13, 2019
fc8c17b
[doc] fix typos and broken links in doc. (#3102)
csukuangfj Mar 13, 2019
3f8b6b2
[scripts] Fix frame_shift bug in egs/swbd/s5c/local/score_sclite_conf…
freewym Mar 13, 2019
633e61c
[src] Fix wrong assertion failure in nnet3-am-compute (#3106)
MartinKocour Mar 14, 2019
8cafd32
[src] Cosmetic changes to natural-gradient code (#3108)
danpovey Mar 14, 2019
cbee719
Merge branch 'master' into svd_draft_normalize
danpovey Mar 14, 2019
1 change: 1 addition & 0 deletions .gitignore
@@ -83,6 +83,7 @@ GSYMS
/tools/ATLAS/
/tools/atlas3.8.3.tar.gz
/tools/irstlm/
/tools/mitlm/
/tools/openfst
/tools/openfst-1.3.2.tar.gz
/tools/openfst-1.3.2/
2 changes: 1 addition & 1 deletion .travis.yml
@@ -49,7 +49,7 @@ script:
# for the explanation why extra switches needed for clang with ccache.
- CXX="ccache clang++-3.8 -Qunused-arguments -fcolor-diagnostics -Wno-tautological-compare"
CFLAGS=""
LDFLAGS="-llapack"
LDFLAGS="-llapack -Wl,-fuse-ld=gold"
INCDIRS="$XROOT/usr/include"
LIBDIRS="$XROOT/usr/lib"
tools/extras/travis_script.sh
13 changes: 5 additions & 8 deletions egs/ami/s5/local/ami_ihm_scoring_data_prep.sh
@@ -87,18 +87,15 @@ sort -k 2 $dir/utt2spk | utils/utt2spk_to_spk2utt.pl > $dir/spk2utt || exit 1;
join $dir/utt2spk $dir/segments | \
perl -ne '{BEGIN{$pu=""; $pt=0.0;} split;
if ($pu eq $_[1] && $pt > $_[3]) {
print "$_[0] $_[2] $_[3] $_[4]>$_[0] $_[2] $pt $_[4]\n"
print "s/^$_[0] $_[2] $_[3] $_[4]\$/$_[0] $_[2] $pt $_[4]/;\n"
}
$pu=$_[1]; $pt=$_[4];
$pu=$_[1]; $pt=$_[4];
}' > $dir/segments_to_fix
if [ `cat $dir/segments_to_fix | wc -l` -gt 0 ]; then

if [ -s $dir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $dir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s!$p1!$p2!" $dir/segments
done < $dir/segments_to_fix
perl -i -pf $dir/segments_to_fix $dir/segments
fi

# Copy stuff into its final locations
10 changes: 3 additions & 7 deletions egs/ami/s5/local/ami_mdm_scoring_data_prep.sh
@@ -94,19 +94,15 @@ awk '{print $1}' $tmpdir/segments | \
join $tmpdir/utt2spk_stm $tmpdir/segments | \
awk '{ utt=$1; spk=$2; wav=$3; t_beg=$4; t_end=$5;
if(spk_prev == spk && t_end_prev > t_beg) {
print utt, wav, t_beg, t_end">"utt, wav, t_end_prev, t_end;
print "s/^"utt, wav, t_beg, t_end"$/"utt, wav, t_end_prev, t_end"/;";
}
spk_prev=spk; t_end_prev=t_end;
}' > $tmpdir/segments_to_fix

if [ `cat $tmpdir/segments_to_fix | wc -l` -gt 0 ]; then
if [ -s $tmpdir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $tmpdir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s:$p1:$p2:" $tmpdir/segments
done < $tmpdir/segments_to_fix
perl -i -pf $tmpdir/segments_to_fix $tmpdir/segments
fi

# Copy stuff into its final locations [this has been moved from the format_data
10 changes: 3 additions & 7 deletions egs/ami/s5/local/ami_sdm_scoring_data_prep.sh
@@ -101,19 +101,15 @@ awk '{print $1}' $tmpdir/segments | \
join $tmpdir/utt2spk_stm $tmpdir/segments | \
awk '{ utt=$1; spk=$2; wav=$3; t_beg=$4; t_end=$5;
if(spk_prev == spk && t_end_prev > t_beg) {
print utt, wav, t_beg, t_end">"utt, wav, t_end_prev, t_end;
print "s/^"utt, wav, t_beg, t_end"$/"utt, wav, t_end_prev, t_end"/;";
}
spk_prev=spk; t_end_prev=t_end;
}' > $tmpdir/segments_to_fix

if [ `cat $tmpdir/segments_to_fix | wc -l` -gt 0 ]; then
if [ -s $tmpdir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $tmpdir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s:$p1:$p2:" $tmpdir/segments
done < $tmpdir/segments_to_fix
perl -i -pf $tmpdir/segments_to_fix $tmpdir/segments
fi

# Copy stuff into its final locations [this has been moved from the format_data
11 changes: 4 additions & 7 deletions egs/ami/s5b/local/ami_ihm_scoring_data_prep.sh
@@ -93,18 +93,15 @@ sort -k 2 $dir/utt2spk | utils/utt2spk_to_spk2utt.pl > $dir/spk2utt || exit 1;
join $dir/utt2spk $dir/segments | \
perl -ne '{BEGIN{$pu=""; $pt=0.0;} split;
if ($pu eq $_[1] && $pt > $_[3]) {
print "$_[0] $_[2] $_[3] $_[4]>$_[0] $_[2] $pt $_[4]\n"
print "s/^$_[0] $_[2] $_[3] $_[4]\$/$_[0] $_[2] $pt $_[4]/;\n"
}
$pu=$_[1]; $pt=$_[4];
}' > $dir/segments_to_fix
if [ `cat $dir/segments_to_fix | wc -l` -gt 0 ]; then

if [ -s $dir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $dir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s!$p1!$p2!" $dir/segments
done < $dir/segments_to_fix
perl -i -pf $dir/segments_to_fix $dir/segments
fi

# Copy stuff into its final locations
10 changes: 3 additions & 7 deletions egs/ami/s5b/local/ami_mdm_scoring_data_prep.sh
@@ -99,19 +99,15 @@ awk '{print $1}' $tmpdir/segments | \
join $tmpdir/utt2spk_stm $tmpdir/segments | \
awk '{ utt=$1; spk=$2; wav=$3; t_beg=$4; t_end=$5;
if(spk_prev == spk && t_end_prev > t_beg) {
print utt, wav, t_beg, t_end">"utt, wav, t_end_prev, t_end;
print "s/^"utt, wav, t_beg, t_end"$/"utt, wav, t_end_prev, t_end"/;";
}
spk_prev=spk; t_end_prev=t_end;
}' > $tmpdir/segments_to_fix

if [ `cat $tmpdir/segments_to_fix | wc -l` -gt 0 ]; then
if [ -s $tmpdir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $tmpdir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s:$p1:$p2:" $tmpdir/segments
done < $tmpdir/segments_to_fix
perl -i -pf $tmpdir/segments_to_fix $tmpdir/segments
fi

# Copy stuff into its final locations [this has been moved from the format_data
12 changes: 4 additions & 8 deletions egs/ami/s5b/local/ami_sdm_scoring_data_prep.sh
@@ -111,25 +111,21 @@ awk '{print $1}' $tmpdir/segments | \
join $tmpdir/utt2spk_stm $tmpdir/segments | \
awk '{ utt=$1; spk=$2; wav=$3; t_beg=$4; t_end=$5;
if(spk_prev == spk && t_end_prev > t_beg) {
print utt, wav, t_beg, t_end">"utt, wav, t_end_prev, t_end;
print "s/^"utt, wav, t_beg, t_end"$/"utt, wav, t_end_prev, t_end"/;";
}
spk_prev=spk; t_end_prev=t_end;
}' > $tmpdir/segments_to_fix

if [ `cat $tmpdir/segments_to_fix | wc -l` -gt 0 ]; then
if [ -s $tmpdir/segments_to_fix ]; then
echo "$0. Applying following fixes to segments"
cat $tmpdir/segments_to_fix
while read line; do
p1=`echo $line | awk -F'>' '{print $1}'`
p2=`echo $line | awk -F'>' '{print $2}'`
sed -ir "s:$p1:$p2:" $tmpdir/segments
done < $tmpdir/segments_to_fix
perl -i -pf $tmpdir/segments_to_fix $tmpdir/segments
fi

# Copy stuff into its final locations [this has been moved from the format_data
# script]
mkdir -p $dir
for f in spk2utt utt2spk utt2spk_stm wav.scp text segments reco2file_and_channel; do
for f in segments_to_fix spk2utt utt2spk utt2spk_stm wav.scp text segments reco2file_and_channel; do
cp $tmpdir/$f $dir/$f || exit 1;
done

@@ -114,7 +114,7 @@ cp ${output_dir}_non_normalized/info/* $output_dir/info

# rename file location in the noise-rir pairing files
for file in `ls $output_dir/info/noise_impulse*`; do
sed -i "s/_non_normalized//g" $file
perl -i -pe "s/_non_normalized//g" $file
done

# generating the rir-list with probabilities alloted for each rir
3 changes: 1 addition & 2 deletions egs/babel/s5c/local/syllab/generate_syllable_lang.sh
@@ -118,8 +118,7 @@ ln -s lex.syllabs2phones.disambig.fst $out/L_disambig.fst
echo "Validating the output lang dir"
utils/validate_lang.pl $out || exit 1

sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
perl -i -pe 's/#1$//g' $lout/lexicon.txt $lout/lexiconp.txt

echo "Done OK."
exit 0
4 changes: 2 additions & 2 deletions egs/babel/s5d/conf/lang/404-georgian.FLP.official.conf
@@ -75,8 +75,8 @@ unsup_data_list=./conf/lists/404-georgian/untranscribed-training.list
unsup_nj=32


lexicon_file=
lexiconFlags="--romanized --oov <unk>"
lexicon_file=/export/corpora/LDC/LDC2016S12/IARPA_BABEL_OP3_404/conversational/reference_materials/lexicon.txt
lexiconFlags=" --romanized --oov <unk>"



14 changes: 10 additions & 4 deletions egs/babel/s5d/local/make_L_align.sh
@@ -34,18 +34,24 @@ tmpdir=$1
dir=$2
outdir=$3

for f in $dir/phones/optional_silence.txt $dir/phones.txt $dir/words.txt ; do
[ ! -f $f ] && echo "$0: The file $f must exist!" && exit 1
done

silphone=`cat $dir/phones/optional_silence.txt` || exit 1;

if [ ! -f $tmpdir/lexicon.txt ] && [ ! -f $tmpdir/lexiconp.txt ] ; then
echo "$0: At least one of the files $tmpdir/lexicon.txt or $tmpdir/lexiconp.txt must exist" >&2
exit 1
fi

# Create lexicon with alignment info
if [ -f $tmpdir/lexicon.txt ] ; then
cat $tmpdir/lexicon.txt | \
awk '{printf("%s #1 ", $1); for (n=2; n <= NF; n++) { printf("%s ", $n); } print "#2"; }'
elif [ -f $tmpdir/lexiconp.txt ] ; then
else
cat $tmpdir/lexiconp.txt | \
awk '{printf("%s #1 ", $1); for (n=3; n <= NF; n++) { printf("%s ", $n); } print "#2"; }'
else
echo "Neither $tmpdir/lexicon.txt nor $tmpdir/lexiconp.txt does not exist"
exit 1
fi | utils/make_lexicon_fst.pl - 0.5 $silphone | \
fstcompile --isymbols=$dir/phones.txt --osymbols=$dir/words.txt \
--keep_isymbols=false --keep_osymbols=false | \
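
Note on the hunk above: the script now checks up front that at least one of `lexicon.txt` or `lexiconp.txt` exists, then wraps each pronunciation in `#1 ... #2` markers, skipping the probability column when only `lexiconp.txt` is available. A minimal sketch of that selection logic, leaving out the FST stages of the pipeline; the function name `make_align_lexicon` and the toy lexicon entries are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an alignment lexicon for the directory in $1.
make_align_lexicon() {
  local d=$1
  if [ ! -f "$d/lexicon.txt" ] && [ ! -f "$d/lexiconp.txt" ]; then
    echo "$0: need $d/lexicon.txt or $d/lexiconp.txt" >&2
    return 1
  fi
  if [ -f "$d/lexicon.txt" ]; then
    # Columns: word phone1 phone2 ...
    awk '{printf("%s #1 ", $1); for (n=2; n <= NF; n++) { printf("%s ", $n); } print "#2"; }' "$d/lexicon.txt"
  else
    # Columns: word prob phone1 phone2 ... (the probability is dropped)
    awk '{printf("%s #1 ", $1); for (n=3; n <= NF; n++) { printf("%s ", $n); } print "#2"; }' "$d/lexiconp.txt"
  fi
}

with_lex=$(mktemp -d)
with_lexp=$(mktemp -d)
empty=$(mktemp -d)
printf 'hello h e l o\n' > "$with_lex/lexicon.txt"
printf 'hello 1.0 h e l o\n' > "$with_lexp/lexiconp.txt"

make_align_lexicon "$with_lex"
make_align_lexicon "$with_lexp"
make_align_lexicon "$empty" || echo "no lexicon found, as expected"
```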
3 changes: 1 addition & 2 deletions egs/babel/s5d/local/syllab/generate_phone_lang.sh
@@ -122,8 +122,7 @@ ln -s lex.syllabs2phones.disambig.fst $out/L_disambig.fst
echo "Validating the output lang dir"
utils/validate_lang.pl $out || exit 1

sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
perl -i -pe 's/#1$//g' $lout/lexicon.txt $lout/lexiconp.txt

echo "Done OK."
exit 0
3 changes: 1 addition & 2 deletions egs/babel/s5d/local/syllab/generate_syllable_lang.sh
@@ -122,8 +122,7 @@ ln -s lex.syllabs2phones.disambig.fst $out/L_disambig.fst
echo "Validating the output lang dir"
utils/validate_lang.pl $out || exit 1

sed -i'' 's/#1$//g' $lout/lexicon.txt
sed -i'' 's/#1$//g' $lout/lexiconp.txt
perl -i -pe 's/#1$//g' $lout/lexicon.txt $lout/lexiconp.txt

echo "Done OK."
exit 0
6 changes: 2 additions & 4 deletions egs/bentham/v1/local/create_splits.sh
@@ -27,10 +27,8 @@ function split {
echo $name $lines_dir"/"$name".png" >> $split_dir/images.scp
echo $name $spkid >> $split_dir/utt2spk
done < "$line_file"

sed -i '/^\s*$/d' $split_dir/images.scp
sed -i '/^\s*$/d' $split_dir/text
sed -i '/^\s*$/d' $split_dir/utt2spk

perl -i -ne 'print if /\S/' $split_dir/images.scp $split_dir/text $split_dir/utt2spk
utils/utt2spk_to_spk2utt.pl $split_dir/utt2spk > $split_dir/spk2utt
}

6 changes: 3 additions & 3 deletions egs/bn_music_speech/v1/local/make_musan.py
@@ -45,7 +45,7 @@ def prepare_music(root_dir, use_vocals):
else:
print("Missing file {}".format(utt))
num_bad_files += 1
print(("In music directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
print("In music directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def prepare_speech(root_dir):
@@ -71,7 +71,7 @@ def prepare_speech(root_dir):
else:
print("Missing file {}".format(utt))
num_bad_files += 1
print(("In speech directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
print("In speech directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def prepare_noise(root_dir):
@@ -97,7 +97,7 @@ def prepare_noise(root_dir):
else:
print("Missing file {}".format(utt))
num_bad_files += 1
print(("In noise directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
print("In noise directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def main():
@@ -102,7 +102,7 @@ if [ $stage -le 0 ]; then
fi
utils/data/get_uniform_subsegments.py \
--max-segment-duration=$window \
--overlap-duration=$(echo "$window-$period" | bc) \
--overlap-duration=$(perl -e "print ($window-$period);") \
--max-remaining-duration=$min_segment \
--constant-duration=True \
$segments > $dir/subsegments
6 changes: 3 additions & 3 deletions egs/callhome_diarization/v1/local/make_musan.py
@@ -45,7 +45,7 @@ def prepare_music(root_dir, use_vocals):
else:
print("Missing file: {}".format(utt))
num_bad_files += 1
print("In music directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files)
print("In music directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def prepare_speech(root_dir):
@@ -71,7 +71,7 @@ def prepare_speech(root_dir):
else:
print("Missing file: {}".format(utt))
num_bad_files += 1
print("In speech directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files)
print("In speech directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def prepare_noise(root_dir):
@@ -97,7 +97,7 @@ def prepare_noise(root_dir):
else:
print("Missing file: {}".format(utt))
num_bad_files += 1
print("In noise directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files)
print("In noise directory, processed {} files: {} had missing wav data".format(num_good_files, num_bad_files))
return utt2spk_str, utt2wav_str

def main():
2 changes: 1 addition & 1 deletion egs/callhome_diarization/v1/run.sh
@@ -188,7 +188,7 @@ if [ $stage -le 6 ]; then

der=$(grep -oP 'DIARIZATION\ ERROR\ =\ \K[0-9]+([.][0-9]+)?' \
exp/tuning/${dataset}_t${threshold})
if [ $(echo $der'<'$best_der | bc -l) -eq 1 ]; then
if [ $(perl -e "print ($der < $best_der ? 1 : 0);") -eq 1 ]; then
best_der=$der
best_threshold=$threshold
fi
10 changes: 5 additions & 5 deletions egs/callhome_diarization/v2/run.sh
@@ -115,7 +115,7 @@ if [ $stage -le 2 ]; then

# Make a reverberated version of the SWBD+SRE list. Note that we don't add any
# additive noise here.
python steps/data/reverberate_data_dir.py \
steps/data/reverberate_data_dir.py \
"${rvb_opts[@]}" \
--speech-rvb-probability 1 \
--pointsource-noise-addition-probability 0 \
@@ -140,11 +140,11 @@ if [ $stage -le 2 ]; then
done

# Augment with musan_noise
python steps/data/augment_data_dir.py --utt-suffix "noise" --fg-interval 1 --fg-snrs "15:10:5:0" --fg-noise-dir "data/musan_noise" data/train data/train_noise
steps/data/augment_data_dir.py --utt-suffix "noise" --fg-interval 1 --fg-snrs "15:10:5:0" --fg-noise-dir "data/musan_noise" data/train data/train_noise
# Augment with musan_music
python steps/data/augment_data_dir.py --utt-suffix "music" --bg-snrs "15:10:8:5" --num-bg-noises "1" --bg-noise-dir "data/musan_music" data/train data/train_music
steps/data/augment_data_dir.py --utt-suffix "music" --bg-snrs "15:10:8:5" --num-bg-noises "1" --bg-noise-dir "data/musan_music" data/train data/train_music
# Augment with musan_speech
python steps/data/augment_data_dir.py --utt-suffix "babble" --bg-snrs "20:17:15:13" --num-bg-noises "3:4:5:6:7" --bg-noise-dir "data/musan_speech" data/train data/train_babble
steps/data/augment_data_dir.py --utt-suffix "babble" --bg-snrs "20:17:15:13" --num-bg-noises "3:4:5:6:7" --bg-noise-dir "data/musan_speech" data/train data/train_babble

# Combine reverb, noise, music, and babble into one directory.
utils/combine_data.sh data/train_aug data/train_reverb data/train_noise data/train_music data/train_babble
@@ -297,7 +297,7 @@ if [ $stage -le 10 ]; then

der=$(grep -oP 'DIARIZATION\ ERROR\ =\ \K[0-9]+([.][0-9]+)?' \
$nnet_dir/tuning/${dataset}_t${threshold})
if [ $(echo $der'<'$best_der | bc -l) -eq 1 ]; then
if [ $(perl -e "print ($der < $best_der ? 1 : 0);") -eq 1 ]; then
best_der=$der
best_threshold=$threshold
fi
5 changes: 2 additions & 3 deletions egs/callhome_egyptian/s5/local/callhome_prepare_dict.sh
@@ -54,9 +54,8 @@ cat $dir/silence_phones.txt| awk '{printf("%s ", $1);} END{printf "\n";}' > \
$dir/extra_questions.txt || exit 1;

# Add prons for laughter, noise, oov
for w in `grep -v sil $dir/silence_phones.txt`; do
sed -i "/\[$w\]/d" $tmpdir/lexicon.3
done
w=$(grep -v sil $dir/silence_phones.txt | tr '\n' '|')
perl -i -ne "print unless /\[(${w%?})\]/" $tmpdir/lexicon.3

for w in `grep -v sil $dir/silence_phones.txt`; do
echo "[$w] $w"
6 changes: 3 additions & 3 deletions egs/callhome_egyptian/s5/local/ctm.sh
@@ -18,9 +18,9 @@ fi
steps/get_ctm.sh $data_dir $lang_dir $decode_dir

# Make sure that channel markers match
#sed -i "s:\s.*_fsp-([AB]): \1:g" data/dev/stm
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s1\s:fsp A :g' {}
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} sed -i -r 's:fsp\s2\s:fsp B :g' {}
#perl -i -pe "s:\s.*_fsp-([AB]): \1:g" data/dev/stm
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} perl -i -pe 's:fsp\s1\s:fsp A :g' {}
#ls exp/tri5a/decode_dev/score_*/dev.ctm | xargs -I {} perl -i -pe 's:fsp\s2\s:fsp B :g' {}

# Get the environment variables
. /export/babel/data/software/env.sh
2 changes: 1 addition & 1 deletion egs/csj/s5/local/csj_make_trans/csj_autorun.sh
@@ -61,7 +61,7 @@ if [ ! -e $outd/.done_make_trans ];then
mkdir -p $outd/$vol/$id

case "$csjv" in
"usb" ) TPATH="$resource/${SDB}$vol" ; WPATH="$resource/$WAV" ;;
"usb" ) TPATH="$resource/${SDB}$vol" ; WPATH="$resource/${WAV}$vol" ;;
"dvd" ) TPATH="$resource/$vol/$id" ; WPATH="$resource/$vol/$id" ;;
"merl" ) TPATH="$resource/$vol/$SDB" ; WPATH="$resource/$vol/$WAV" ;;
esac