clean up scripts in #2572 #2643
Conversation
Speed (GFLOPS): test, shape (rows, cols), old, new, speedup
CuVector::AddDiagMat2Shapes<double>[no-trans], (1048576, 32), 2.13 8.04 3.77x
CuVector::AddDiagMat2Shapes<double>[no-trans], (524288, 64), 4.12 7.27 1.77x
CuVector::AddDiagMat2Shapes<double>[no-trans], (262144, 128), 7.66 8.56 1.12x
CuVector::AddDiagMat2Shapes<double>[no-trans], (131072, 256), 13.50 13.50 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (65536, 512), 22.29 22.32 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (32768, 1024), 32.26 32.35 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (16384, 2048), 32.48 32.47 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (8192, 4096), 32.54 32.57 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (4096, 8192), 32.52 32.55 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (2048, 16384), 32.46 32.49 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (1024, 32768), 32.30 32.34 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (512, 65536), 31.77 31.89 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (256, 131072), 31.74 31.71 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (128, 262144), 31.64 31.67 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (64, 524288), 32.36 32.37 1.00x
CuVector::AddDiagMat2Shapes<double>[no-trans], (32, 1048576), 30.94 30.92 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (1048576, 32), 1.10 8.61 7.84x
CuVector::AddDiagMat2Shapes<double>[trans], (524288, 64), 2.19 8.61 3.94x
CuVector::AddDiagMat2Shapes<double>[trans], (262144, 128), 4.41 8.67 1.97x
CuVector::AddDiagMat2Shapes<double>[trans], (131072, 256), 8.64 8.56 0.99x
CuVector::AddDiagMat2Shapes<double>[trans], (65536, 512), 15.72 8.57 0.55x
CuVector::AddDiagMat2Shapes<double>[trans], (32768, 1024), 26.09 26.07 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (16384, 2048), 31.51 31.26 0.99x
CuVector::AddDiagMat2Shapes<double>[trans], (8192, 4096), 27.93 28.35 1.02x
CuVector::AddDiagMat2Shapes<double>[trans], (4096, 8192), 31.56 31.52 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (2048, 16384), 31.21 31.20 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (1024, 32768), 31.40 31.36 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (512, 65536), 31.52 31.55 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (256, 131072), 30.96 30.95 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (128, 262144), 30.00 29.99 1.00x
CuVector::AddDiagMat2Shapes<double>[trans], (64, 524288), 28.43 28.78 1.01x
CuVector::AddDiagMat2Shapes<double>[trans], (32, 1048576), 24.95 24.93 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (1048576, 32), 2.92 15.87 5.44x
CuVector::AddDiagMat2Shapes<float>[no-trans], (524288, 64), 5.70 14.27 2.51x
CuVector::AddDiagMat2Shapes<float>[no-trans], (262144, 128), 11.04 16.65 1.51x
CuVector::AddDiagMat2Shapes<float>[no-trans], (131072, 256), 21.12 21.15 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (65536, 512), 38.60 38.67 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (32768, 1024), 57.21 57.29 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (16384, 2048), 63.39 63.50 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (8192, 4096), 62.63 62.71 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (4096, 8192), 63.60 63.71 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (2048, 16384), 63.07 63.09 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (1024, 32768), 62.47 62.64 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (512, 65536), 61.80 61.86 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (256, 131072), 61.03 60.99 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (128, 262144), 60.22 59.81 0.99x
CuVector::AddDiagMat2Shapes<float>[no-trans], (64, 524288), 62.09 61.87 1.00x
CuVector::AddDiagMat2Shapes<float>[no-trans], (32, 1048576), 52.96 53.01 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (1048576, 32), 1.25 16.44 13.19x
CuVector::AddDiagMat2Shapes<float>[trans], (524288, 64), 2.48 17.15 6.91x
CuVector::AddDiagMat2Shapes<float>[trans], (262144, 128), 4.92 17.14 3.49x
CuVector::AddDiagMat2Shapes<float>[trans], (131072, 256), 9.55 18.27 1.91x
CuVector::AddDiagMat2Shapes<float>[trans], (65536, 512), 17.90 18.30 1.02x
CuVector::AddDiagMat2Shapes<float>[trans], (32768, 1024), 31.49 31.48 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (16384, 2048), 34.38 34.38 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (8192, 4096), 51.61 51.59 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (4096, 8192), 48.60 48.87 1.01x
CuVector::AddDiagMat2Shapes<float>[trans], (2048, 16384), 57.47 57.52 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (1024, 32768), 56.30 56.38 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (512, 65536), 55.83 56.24 1.01x
CuVector::AddDiagMat2Shapes<float>[trans], (256, 131072), 55.35 55.81 1.01x
CuVector::AddDiagMat2Shapes<float>[trans], (128, 262144), 54.26 54.56 1.01x
CuVector::AddDiagMat2Shapes<float>[trans], (64, 524288), 52.88 53.00 1.00x
CuVector::AddDiagMat2Shapes<float>[trans], (32, 1048576), 47.55 47.44 1.00x
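For context: AddDiagMat2 accumulates the diagonal of M M^T (no-trans) or M^T M (trans) into a vector, i.e. a per-row (or per-column) sum of squares scaled by alpha, with the existing vector scaled by beta. The largest speedups in the table are on tall, narrow matrices, where each output element only has a short reduction and a naive one-thread-per-output kernel would leave most of the GPU idle. The CUDA sketch below is a minimal illustration of the one-block-per-output, shared-memory reduction pattern for the no-trans case only; it is not Kaldi's actual kernel, and the name add_diag_mat2_sketch and the launch parameters are invented for the example (it also assumes blockDim.x is a power of two).

// Sketch of v(i) = alpha * sum_j M(i,j)^2 + beta * v(i)  (the no-trans case).
// Not Kaldi's implementation; names and parameters are illustrative only.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One block per row: threads stride across the row accumulating squared
// entries in registers, then tree-reduce the per-thread partial sums in
// shared memory. Requires blockDim.x to be a power of two.
__global__ void add_diag_mat2_sketch(float *v, const float *M,
                                     int num_cols, float alpha, float beta) {
  extern __shared__ float partial[];
  const int row = blockIdx.x;
  float sum = 0.0f;
  for (int col = threadIdx.x; col < num_cols; col += blockDim.x) {
    const float m = M[row * num_cols + col];
    sum += m * m;
  }
  partial[threadIdx.x] = sum;
  __syncthreads();
  for (int s = blockDim.x / 2; s > 0; s >>= 1) {
    if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
    __syncthreads();
  }
  if (threadIdx.x == 0) v[row] = alpha * partial[0] + beta * v[row];
}

int main() {
  const int rows = 1024, cols = 4096, threads = 256;
  std::vector<float> h_M(size_t(rows) * cols, 1.0f), h_v(rows, 0.0f);
  float *d_M, *d_v;
  cudaMalloc(&d_M, h_M.size() * sizeof(float));
  cudaMalloc(&d_v, h_v.size() * sizeof(float));
  cudaMemcpy(d_M, h_M.data(), h_M.size() * sizeof(float),
             cudaMemcpyHostToDevice);
  cudaMemcpy(d_v, h_v.data(), h_v.size() * sizeof(float),
             cudaMemcpyHostToDevice);
  // One block per row; dynamic shared memory holds one partial sum per thread.
  add_diag_mat2_sketch<<<rows, threads, threads * sizeof(float)>>>(
      d_v, d_M, cols, 1.0f, 0.0f);
  cudaMemcpy(h_v.data(), d_v, h_v.size() * sizeof(float),
             cudaMemcpyDeviceToHost);
  printf("v[0] = %.1f (expected %d, since every entry of M is 1.0)\n",
         h_v[0], cols);
  cudaFree(d_M);
  cudaFree(d_v);
  return 0;
}

A real implementation would also need a trans variant that reduces down columns with coalesced loads, and would choose block/grid sizes based on the matrix shape; that kind of shape-dependent tuning is presumably where the tall-skinny wins in the table come from.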
…encies.sh failure
Hi, I noticed that you didn't add a recipe for Librispeech. Is that because it doesn't give any improvement on that dataset? I tried it myself and found it's not as good as pure TDNN-F.
We didn't run it on Librispeech. When you say you ran it on Librispeech, can you be more specific about the configuration you ran? We'd tend to use slightly larger models for Librispeech, and other settings like num-epochs may be different there too.
@danpovey Sorry, the configurations are: I tried different numbers of TDNN-F layers (17/18/19) and bottleneck dims (256/160). I always set the number of epochs to 4 so that I could compare the loss with pure TDNN-F. The relative loss is (first column is TDNN-F, the others are CNN-TDNN-F): 2.95 | 3.16 | 3.18
Don't focus on that one test condition. The compare_wer.sh script compares about 16 different test conditions. Is there degradation consistently across those, or at least on average? The configuration seems reasonable.
…On Tue, Sep 18, 2018 at 9:58 PM YangXuerui ***@***.***> wrote:
@danpovey <https://github.com/danpovey> Sorry, the configurations are:
input dim=100 name=ivector
input dim=40 name=input
# MFCC to filterbank
idct-layer name=idct input=input dim=40 cepstral-lifter=22 affine-transform-file=$dir/configs/idct.mat
linear-component name=ivector-linear $ivector_affine_opts dim=200 input=ReplaceIndex(ivector, t, 0)
batchnorm-component name=ivector-batchnorm target-rms=0.025
batchnorm-component name=idct-batchnorm input=idct
combine-feature-maps-layer name=combine_inputs input=Append(idct-batchnorm, ivector-batchnorm) num-filters1=1 num-filters2=5 height=40
conv-relu-batchnorm-layer name=cnn1 $cnn_opts height-in=40 height-out=40 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=64
conv-relu-batchnorm-layer name=cnn2 $cnn_opts height-in=40 height-out=40 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=64
conv-relu-batchnorm-layer name=cnn3 $cnn_opts height-in=40 height-out=20 height-subsample-out=2 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=128
conv-relu-batchnorm-layer name=cnn4 $cnn_opts height-in=20 height-out=20 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=128
conv-relu-batchnorm-layer name=cnn5 $cnn_opts height-in=20 height-out=10 height-subsample-out=2 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=256
conv-relu-batchnorm-layer name=cnn6 $cnn_opts height-in=10 height-out=10 time-offsets=-1,0,1 height-offsets=-1,0,1 num-filters-out=256
# the first TDNN-F layer has no bypass
tdnnf-layer name=tdnnf7 $tdnnf_first_opts dim=1536 bottleneck-dim=256 time-stride=0
tdnnf-layer name=tdnnf8 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf9 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf10 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf11 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf12 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf13 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf14 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf15 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf16 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf17 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf18 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
tdnnf-layer name=tdnnf19 $tdnnf_opts dim=1536 bottleneck-dim=160 time-stride=3
linear-component name=prefinal-l dim=256 $linear_opts
prefinal-layer name=prefinal-chain input=prefinal-l $prefinal_opts big-dim=1536 small-dim=256
output-layer name=output include-log-softmax=false dim=$num_targets $output_opts
prefinal-layer name=prefinal-xent input=prefinal-l $prefinal_opts big-dim=1536 small-dim=256
output-layer name=output-xent dim=$num_targets learning-rate-factor=$learning_rate_factor $output_opts
Different numbers of TDNN-F layers were tried (17/18/19), with bottleneck dims of 256/160. I always set the number of epochs to 4 so that I could compare the loss with pure TDNN-F. The relative loss is as follows (first column is TDNN-F, the others are CNN-TDNN-F):
2.95 | 3.16 | 3.18
2.94 | 3.14 | 3.18
2.93 | 3.13 | 3.17
2.93 | 3.13 | 3.16
2.94 | 3.12 | 3.14
2.92 | 3.12 | 3.13
2.92 | 3.12 | 3.07
2.89 | 3.11 | 3.1
2.91 | 3.12 | 3.1
2.92 | 3.12 | 3.1
2.89 | 3.12 | 3.08
2.9 | 3.09 | 3.08
2.86 | 3.08 | 3.06
2.88 | 3.07 | 3.05
2.85 | 3.07 | 3.05
2.9 | 3.09 | 3.06
2.88 | 3.05 | 3.04
2.85 | 3.08 | 3.06
2.85 | 3.05 | 3.03
2.83 | 3.03 | 3.07
2.83 | 3.05 | 3.02