51 commits
1cc8fb6
Bug fix in nnet3-latgen-faster which missed uttspk option
vimalmanohar Nov 23, 2016
ac84a2b
Bug fix in sparse-matrix.cc
vimalmanohar Nov 23, 2016
d544317
asr_diarization: Adding get_frame_shift.sh
vimalmanohar Nov 24, 2016
c2eab80
Pass --no-text option to validate data dir in speed perturbation
vimalmanohar Nov 24, 2016
cd967d6
Print Cuda profile in nnet3-compute
vimalmanohar Nov 23, 2016
892afd1
Bug fix in xconfig/basic_layers.py
vimalmanohar Dec 13, 2016
90d0f3e
asr_diarization: Making xconfigs support more general networks
vimalmanohar Dec 13, 2016
c9d91b8
asr_diarization: Fix stats printing
vimalmanohar Nov 6, 2016
88189bd
asr_diarization: Add --skip-dims option to apply-cmvn-sliding
vimalmanohar Nov 24, 2016
7b1171f
asr_diarization: Adding weights and length-tolerance to extract ivecto…
vimalmanohar Nov 22, 2016
9b4ed5b
asr_diarization: Adding --do-average option to matrix-sum-rows
vimalmanohar Nov 25, 2016
283bf4c
asr_diarization: Added weight-pdf-post, vector-to-feat, kaldi-matrix …
vimalmanohar Sep 24, 2016
6456875
asr_diarization: Modify subsegment_feats and add fix_subsegmented_fea…
vimalmanohar Sep 25, 2016
b4cd841
asr_diarization: Utility scripts get_reco2utt, get_utt2dur and get_se…
vimalmanohar Aug 30, 2016
c5c37b0
asr_diarization: SAD post-processing
vimalmanohar Nov 25, 2016
44d3eb2
asr_diarization: Modify modify_speaker_info to add --respect-recordin…
vimalmanohar Nov 24, 2016
73daaaf
asr_diarization: Modify subset_data_dir.sh, copy_data_dir.sh to copy …
vimalmanohar Nov 24, 2016
fd56cb9
asr_diarization: Moved evaluate_segmentation.pl to steps/segmentation
vimalmanohar Nov 24, 2016
be83341
asr_diarization: Modify perturb_data_dir_volume.sh to write reco2vol …
vimalmanohar Nov 24, 2016
cfb5369
asr_diarization: Get reverberated version of scp
vimalmanohar Nov 24, 2016
e11a4a1
asr_diarization: Adding script split_data_on_reco.sh
vimalmanohar Nov 24, 2016
e760139
asr_diarization: add per-reco option to split_data.sh
vimalmanohar Nov 24, 2016
fc328b9
asr_diarization: Added deriv weights and xent per dim objective
vimalmanohar Nov 23, 2016
2d11a82
asr_diarization: Adding compress format option
vimalmanohar Nov 23, 2016
efe987a
asr_diarization: nnet3-get-egs etc. modified with deriv weights and c…
vimalmanohar Nov 23, 2016
1503c21
asr_diarization: Log and Exp component
vimalmanohar Dec 7, 2016
089345a
asr_diarization: Adding ScaleGradientComponent
vimalmanohar Nov 23, 2016
30ebd93
asr_diarization: Adding AddGradientSacaleLayer to components.py
vimalmanohar Nov 24, 2016
ba75a09
asr_diarization: Adding get_egs changes into get_egs_targets
vimalmanohar Nov 24, 2016
700a9fa
asr_diarization: Multiple outputs in nnet3
vimalmanohar Nov 23, 2016
58ba175
raw_python_script: Made LSTM and TDNN raw configs similar
vimalmanohar Nov 24, 2016
169b350
asr_diarization: Create prepare_unsad_data.sh
vimalmanohar Nov 23, 2016
26e5f5f
asr_diarization: Temporary changes to mfcc_hires_bp.conf and path.sh …
vimalmanohar Nov 24, 2016
599950b
asr_diarization: Modified reverberation script by moving some functio…
vimalmanohar Nov 24, 2016
fe49dfd
asr_diarization: Add extra_egs_copy_cmd
vimalmanohar Nov 24, 2016
e00fd31
asr_diarization: Create get_egs.py supporting multiple targets
vimalmanohar Nov 24, 2016
dba8e55
asr_diarization: Modify the egs binaries and utilities to support mul…
vimalmanohar Nov 24, 2016
1c68613
asr_diarization: Adding local/snr/make_sad_tdnn_configs.py and stats …
vimalmanohar Nov 24, 2016
ddb993b
asr_diarization: compute_output.sh, SAD decoding scripts and do_segme…
vimalmanohar Nov 24, 2016
c36817a
asr_diarization: Adding min-extra-left-context
vimalmanohar Nov 19, 2016
1022659
asr_diarization: Segmentation tools
vimalmanohar Nov 27, 2016
e5f69fd
asr_diarization: Adding do_corruption_data_dir.sh for corruption with…
vimalmanohar Nov 30, 2016
2ae59ac
asr_diarization: Add do_corruption_data_dir_music.sh for corruption w…
vimalmanohar Nov 30, 2016
b5d55c7
asr_diarization: Recipe for music-id on broadcast news
vimalmanohar Nov 30, 2016
d5f3084
asr_diarization: Utilities invert_vector.pl and vector_get_max.pl
vimalmanohar Nov 25, 2016
76db612
asr_diarization: Recipe for segmentation on AMI SDM dev set
vimalmanohar Nov 30, 2016
f5d2284
asr_diarization: Fisher recipe from data preparation, training nnet a…
vimalmanohar Nov 30, 2016
08d8894
asr_diarization: created compute-snr-targets
vimalmanohar Nov 24, 2016
9a9b54b
asr_diarization: make_snr_targets.sh
vimalmanohar Nov 24, 2016
fd121b8
asr_diarization: Added script to get DCT matrix
vimalmanohar Nov 24, 2016
8ca6a31
asr_diarization_clean: Adding run_train_sad.sh
vimalmanohar Nov 30, 2016
24 changes: 12 additions & 12 deletions egs/ami/s5b/local/prepare_parallel_train_data.sh
@@ -5,6 +5,10 @@
# but the wav data is copied from data/ihm. This is a little tricky because the
# utterance ids are different between the different mics

+train_set=train
+
+. utils/parse_options.sh
+

if [ $# != 1 ]; then
echo "Usage: $0 [sdm1|mdm8]"
@@ -18,12 +22,10 @@ if [ $mic == "ihm" ]; then
exit 1;
fi

-train_set=train
-
. cmd.sh
. ./path.sh

-for f in data/ihm/train/utt2spk data/$mic/train/utt2spk; do
+for f in data/ihm/${train_set}/utt2spk data/$mic/${train_set}/utt2spk; do
if [ ! -f $f ]; then
echo "$0: expected file $f to exist"
exit 1
@@ -32,12 +34,12 @@ done

set -e -o pipefail

-mkdir -p data/$mic/train_ihmdata
+mkdir -p data/$mic/${train_set}_ihmdata

# the utterance-ids and speaker ids will be from the SDM or MDM data
-cp data/$mic/train/{spk2utt,text,utt2spk} data/$mic/train_ihmdata/
+cp data/$mic/${train_set}/{spk2utt,text,utt2spk} data/$mic/${train_set}_ihmdata/
# the recording-ids will be from the IHM data.
-cp data/ihm/train/{wav.scp,reco2file_and_channel} data/$mic/train_ihmdata/
+cp data/ihm/${train_set}/{wav.scp,reco2file_and_channel} data/$mic/${train_set}_ihmdata/

# map sdm/mdm segments to the ihm segments

@@ -47,19 +49,17 @@ mic_base_upcase=$(echo $mic | sed 's/[0-9]//g' | tr 'a-z' 'A-Z')
# It has lines like:
# AMI_EN2001a_H02_FEO065_0021133_0021442 AMI_EN2001a_SDM_FEO065_0021133_0021442

-tmpdir=data/$mic/train_ihmdata/
+tmpdir=data/$mic/${train_set}_ihmdata/

-awk '{print $1, $1}' <data/ihm/train/utt2spk | \
+awk '{print $1, $1}' <data/ihm/${train_set}/utt2spk | \
sed -e "s/_H[0-9][0-9]_/_${mic_base_upcase}_/" | \
awk '{print $2, $1}' >$tmpdir/ihmutt2utt

# Map the 1st field of the segments file from the ihm data (the 1st field being
# the utterance-id) to the corresponding SDM or MDM utterance-id. The other
# fields remain the same (e.g. we want the recording-ids from the IHM data).
-utils/apply_map.pl -f 1 $tmpdir/ihmutt2utt <data/ihm/train/segments >data/$mic/train_ihmdata/segments
-
-utils/fix_data_dir.sh data/$mic/train_ihmdata
+utils/apply_map.pl -f 1 $tmpdir/ihmutt2utt <data/ihm/${train_set}/segments >data/$mic/${train_set}_ihmdata/segments

-rm $tmpdir/ihmutt2utt
+utils/fix_data_dir.sh data/$mic/${train_set}_ihmdata

exit 0;
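Reviewer note: the utterance-id mapping in this script can be checked in isolation. The sketch below runs the same awk/sed/awk pipeline on a single made-up `utt2spk` line. The sample line and the `mapping` variable are assumptions for illustration; only the pipeline itself comes from the script.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the IHM -> SDM/MDM utterance-id mapping on one sample line.
# The sample utt2spk line below is an illustrative assumption, not real data.
set -e -o pipefail

mic=sdm1
# Strip the digits and upper-case the mic name: sdm1 -> SDM, mdm8 -> MDM.
mic_base_upcase=$(echo "$mic" | sed 's/[0-9]//g' | tr 'a-z' 'A-Z')

sample_utt2spk_line="AMI_EN2001a_H02_FEO065_0021133_0021442 AMI_EN2001a_H02_FEO065"

# Same three steps as the script:
#  1. duplicate the utterance-id,
#  2. rewrite the headset tag (_H02_) in the first copy only (sed without 'g'),
#  3. swap the columns so the IHM id comes first.
mapping=$(echo "$sample_utt2spk_line" | \
  awk '{print $1, $1}' | \
  sed -e "s/_H[0-9][0-9]_/_${mic_base_upcase}_/" | \
  awk '{print $2, $1}')

echo "$mapping"
```

The printed line has the same shape as the example in the script's comment: IHM utterance-id first, SDM utterance-id second. Because `train_set` is now defined before `utils/parse_options.sh` is sourced, it should be overridable from the command line as `--train-set <name>` ahead of the mic argument, following the usual Kaldi option convention.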
13 changes: 13 additions & 0 deletions egs/aspire/s5/conf/mfcc_hires_bp.conf
@@ -0,0 +1,13 @@
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=8000 # Switchboard is sampled at 8kHz
--num-mel-bins=28
--num-ceps=28
--cepstral-lifter=0
--low-freq=330 # low cutoff frequency for mel bins
--high-freq=-1000 # high cutoff frequency, relative to Nyquist of 4000 (=3000)
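
Reviewer note: the `(=3000)` in the comment above follows from treating a non-positive `--high-freq` as an offset from the Nyquist frequency. A minimal sketch of that arithmetic, using hypothetical variable names rather than anything from Kaldi itself:

```shell
#!/usr/bin/env bash
# Sketch: how a non-positive --high-freq is read as an offset from Nyquist.
# Variable names are illustrative; the two values come from the config above.
sample_frequency=8000
high_freq=-1000

nyquist=$((sample_frequency / 2))
if [ "$high_freq" -le 0 ]; then
  # Non-positive value: count back from Nyquist.
  effective_high_freq=$((nyquist + high_freq))
else
  # Positive value: use it as an absolute cutoff in Hz.
  effective_high_freq=$high_freq
fi

echo "Nyquist: ${nyquist} Hz, effective high cutoff: ${effective_high_freq} Hz"
```

With `--low-freq=330`, the mel bins in this config therefore cover roughly 330 Hz to 3000 Hz.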


136 changes: 136 additions & 0 deletions egs/aspire/s5/local/segmentation/do_corruption_data_dir.sh
@@ -0,0 +1,136 @@
#! /bin/bash

# Copyright 2016 Vimal Manohar
# Apache 2.0

set -e
set -u
set -o pipefail

. path.sh

stage=0
corruption_stage=-10
corrupt_only=false

# Data options
data_dir=data/train_si284 # Expecting whole data directory.
speed_perturb=true
num_data_reps=5 # Number of corrupted versions
snrs="20:10:15:5:0:-5"
foreground_snrs="20:10:15:5:0:-5"
background_snrs="20:10:15:5:0:-5"
base_rirs=simulated

# Parallel options
reco_nj=40
cmd=queue.pl

# Options for feature extraction
mfcc_config=conf/mfcc_hires_bp.conf
feat_suffix=hires_bp

reco_vad_dir= # Output of prepare_unsad_data.sh.
# If provided, the speech labels and deriv weights will be
# copied into the output data directory.

. utils/parse_options.sh

if [ $# -ne 0 ]; then
echo "Usage: $0"
exit 1
fi

data_id=`basename ${data_dir}`

rvb_opts=()
if [ "$base_rirs" == "simulated" ]; then
# This is the config for the system using simulated RIRs and point-source noises
rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")
rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/mediumroom/rir_list")
rvb_opts+=(--noise-set-parameters RIRS_NOISES/pointsource_noises/noise_list)
else
# This is the config for the JHU ASpIRE submission system
rvb_opts+=(--rir-set-parameters "1.0, RIRS_NOISES/real_rirs_isotropic_noises/rir_list")
rvb_opts+=(--noise-set-parameters RIRS_NOISES/real_rirs_isotropic_noises/noise_list)
fi

corrupted_data_id=${data_id}_corrupted

if [ $stage -le 1 ]; then
python steps/data/reverberate_data_dir.py \
"${rvb_opts[@]}" \
--prefix="rev" \
--foreground-snrs=$foreground_snrs \
--background-snrs=$background_snrs \
--speech-rvb-probability=1 \
--pointsource-noise-addition-probability=1 \
--isotropic-noise-addition-probability=1 \
--num-replications=$num_data_reps \
--max-noises-per-minute=1 \
data/${data_id} data/${corrupted_data_id}
fi

corrupted_data_dir=data/${corrupted_data_id}

if $speed_perturb; then
if [ $stage -le 2 ]; then
## Assuming whole data directories
for x in $corrupted_data_dir; do
cp $x/reco2dur $x/utt2dur
utils/data/perturb_data_dir_speed_3way.sh $x ${x}_sp
done
fi

corrupted_data_dir=${corrupted_data_dir}_sp
corrupted_data_id=${corrupted_data_id}_sp

if [ $stage -le 3 ]; then
utils/data/perturb_data_dir_volume.sh --scale-low 0.03125 --scale-high 2 \
${corrupted_data_dir}
fi
fi

if $corrupt_only; then
echo "$0: Got corrupted data directory in ${corrupted_data_dir}"
exit 0
fi

mfccdir=`basename $mfcc_config`
mfccdir=${mfccdir%%.conf}

if [[ $(hostname -f) == *.clsp.jhu.edu ]] && [ ! -d $mfccdir/storage ]; then
utils/create_split_dir.pl \
/export/b0{3,4,5,6}/$USER/kaldi-data/egs/aspire-$(date +'%m_%d_%H_%M')/s5/$mfccdir/storage $mfccdir/storage
fi

if [ $stage -le 4 ]; then
utils/copy_data_dir.sh $corrupted_data_dir ${corrupted_data_dir}_$feat_suffix
corrupted_data_dir=${corrupted_data_dir}_$feat_suffix
steps/make_mfcc.sh --mfcc-config $mfcc_config \
--cmd "$cmd" --nj $reco_nj \
$corrupted_data_dir exp/make_${feat_suffix}/${corrupted_data_id} $mfccdir
steps/compute_cmvn_stats.sh --fake \
$corrupted_data_dir exp/make_${feat_suffix}/${corrupted_data_id} $mfccdir
else
corrupted_data_dir=${corrupted_data_dir}_$feat_suffix
fi

if [ $stage -le 8 ]; then
if [ ! -z "$reco_vad_dir" ]; then
if [ ! -f $reco_vad_dir/speech_feat.scp ]; then
echo "$0: Could not find file $reco_vad_dir/speech_feat.scp"
exit 1
fi

cat $reco_vad_dir/speech_feat.scp | \
steps/segmentation/get_reverb_scp.pl -f 1 $num_data_reps | \
sort -k1,1 > ${corrupted_data_dir}/speech_feat.scp

cat $reco_vad_dir/deriv_weights.scp | \
steps/segmentation/get_reverb_scp.pl -f 1 $num_data_reps | \
sort -k1,1 > ${corrupted_data_dir}/deriv_weights.scp
fi
fi

exit 0
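Reviewer note: the script renames `corrupted_data_dir` several times, so it is easy to lose track of where the features end up. The sketch below replays only the naming logic with the script's default option values. It runs none of the Kaldi tools and is not a substitute for the script.

```shell
#!/usr/bin/env bash
# Sketch: replay the directory-naming logic of do_corruption_data_dir.sh
# using its default option values. No Kaldi commands are executed.
set -e -u -o pipefail

data_dir=data/train_si284
speed_perturb=true
mfcc_config=conf/mfcc_hires_bp.conf
feat_suffix=hires_bp

data_id=$(basename "$data_dir")

# Stage 1: reverberated and noise-corrupted copy of the data.
corrupted_data_dir=data/${data_id}_corrupted

# Stages 2-3: speed (and volume) perturbation adds the _sp suffix.
if $speed_perturb; then
  corrupted_data_dir=${corrupted_data_dir}_sp
fi

# Stage 4: features are extracted into a copy carrying the feature suffix.
corrupted_data_dir=${corrupted_data_dir}_${feat_suffix}

# The MFCC storage directory is named after the config file, minus ".conf".
mfccdir=$(basename "$mfcc_config")
mfccdir=${mfccdir%%.conf}

echo "final data dir: $corrupted_data_dir"
echo "mfcc dir:       $mfccdir"
```

With the defaults, `speech_feat.scp` and `deriv_weights.scp` from `reco_vad_dir` would be written into the final directory printed above.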
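Reviewer note: `rvb_opts` is built as a bash array because values such as `"0.5, RIRS_NOISES/..."` contain a space and must reach `reverberate_data_dir.py` as a single argument. The sketch below illustrates the difference between quoted array expansion and a flattened string. `count_args` is a hypothetical helper used only for this demonstration.

```shell
#!/usr/bin/env bash
# Sketch: why the script passes "${rvb_opts[@]}" rather than a plain string.
# count_args is a hypothetical helper, not part of the script.
set -e -u

count_args() { echo $#; }

rvb_opts=()
rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")
rvb_opts+=(--noise-set-parameters RIRS_NOISES/pointsource_noises/noise_list)

# Quoted array expansion keeps each element as exactly one argument.
quoted_count=$(count_args "${rvb_opts[@]}")

# Flattening to a string and expanding it unquoted splits on the embedded space.
flat="${rvb_opts[*]}"
unquoted_count=$(count_args $flat)

echo "quoted: $quoted_count arguments, unquoted: $unquoted_count arguments"
```

The quoted form yields four arguments, while the flattened form yields five because the RIR parameter string is split in two.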