Evaluate results using pre-trained ASR model from JSALT 2015 #15

mim · 2017-02-22T21:59:23Z

Use the Kaldi recognizer in /scratch/mim/jsalt/code/asr/kaldi-jsalt/egs/jsalt15-ffs/s5 to recognize the enhanced utterances. It was copied from another system, so there might be errors caused by incorrectly set paths.

You should also modify local/publish_results.sh to use the Google spreadsheet we set up.

Here is an example of how to run it, but using the paths from the other system:

# Run recognizer on non-WPE replays at 9db
SHORTNAME=xcPreIpd09db  # Set to a reasonable name for the system being evaluated
LONGNAME=out_beamformit/replayXcPreIpdMaxSup09db   # Actual path to separations
cd /home/ws15mmandel/code/asr/kaldi-jsalt/egs/jsalt15-ffs/s5/
mkdir -p data/chime3/$SHORTNAME
pushd data/chime3/$SHORTNAME
ln -s /export/ws15/ws15-ffs-data2/mmandel/data/chime3/$LONGNAME/wav/ wav
popd
./run.sh --do-ami false --do-reverb false --stage 3 --enhan-chime3 $SHORTNAME

# Publish results to google spreadsheet
local/publish_results.sh --comment "CHiME3 only, MESSL-MVDR, mask-driven noise, MESSL IPD look direction, cross-correlation initialization, 9db max suppression" exp/mdm8/dnn4_pretrain-dbn_dnn/ $SHORTNAME $SHORTNAME $SHORTNAME

This might work on crescent:

# Create a link from where the output wav files are to where kaldi expects them to be
SHORTNAME=xcPreIpd09db  # Set to a reasonable name for the system being evaluated
LONGNAME=replayMesslXcPreIpdMaxSup09db   # Actual path to separations
cd /scratch/mim/jsalt/code/asr/kaldi-jsalt/egs/jsalt15-ffs/s5
mkdir -p data/chime3/$SHORTNAME
pushd data/chime3/$SHORTNAME
ln -s /scratch/near/chime3/$LONGNAME/wav/ wav
popd

# Run recognizer
./run.sh --do-ami false --do-reverb false --stage 3 --enhan-chime3 $SHORTNAME

# Publish results to google spreadsheet
local/publish_results.sh --comment "[Comment describing the system that you ran, to be included in spreadsheet]" exp/mdm8/dnn4_pretrain-dbn_dnn/ $SHORTNAME $SHORTNAME $SHORTNAME
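
The linking step above fails quietly if the source directory is wrong, and the problem only surfaces later in run.sh. A minimal sketch of a guard for that step follows; `link_enhanced` is a hypothetical helper name, not part of the recipe, and it assumes the enhanced files end in `.wav`.

```shell
# Hypothetical helper (not part of the JSALT recipe): link a directory of
# enhanced wavs into the recipe's data/chime3 tree, checking it first.
# Usage: link_enhanced S5_DIR SHORTNAME WAV_DIR
link_enhanced() {
    s5_dir=$1
    shortname=$2
    wav_dir=$3

    # Refuse to link a source directory that does not exist
    if [ ! -d "$wav_dir" ]; then
        echo "missing wav directory: $wav_dir" >&2
        return 1
    fi

    # Refuse to link a directory that contains no .wav files (assumed extension)
    if [ -z "$(find -L "$wav_dir" -name '*.wav' | head -n 1)" ]; then
        echo "no .wav files under: $wav_dir" >&2
        return 1
    fi

    mkdir -p "$s5_dir/data/chime3/$shortname" || return 1

    # -sfn replaces a stale link instead of creating a new link inside it
    ln -sfn "$wav_dir" "$s5_dir/data/chime3/$shortname/wav"
}

# Example call, using the paths from the crescent example above:
# link_enhanced /scratch/mim/jsalt/code/asr/kaldi-jsalt/egs/jsalt15-ffs/s5 \
#     "$SHORTNAME" "/scratch/near/chime3/$LONGNAME/wav"
```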

nateanl · 2017-03-01T01:07:48Z

Can I have write permission in your directory? ...

mim · 2017-03-01T01:48:22Z

Why don't you copy /scratch/mim/jsalt/code/asr/kaldi-jsalt to your own /scratch directory instead? You might need to recompile it, etc. It's 14GB, but we have plenty of storage space.

nateanl · 2017-03-01T01:48:57Z

Got it.

nateanl · 2017-03-01T19:57:15Z

I still need your permission to copy everything. Is there an option to change the permissions so that I can copy it?

mim · 2017-03-01T21:09:47Z

It looks like you have read permissions on all files and directories and execute permissions on all directories, which is what you should need. Which directory/file are you getting an error on?

nateanl · 2017-03-02T03:20:19Z

I got an error like this just now:
cp: cannot access '/scratch/mim/jsalt/code/asr/kaldi-jsalt/tools/openfst-1.3.4': Permission denied

mim · 2017-03-02T03:24:25Z

Ok, I just modified the permissions on everything in /scratch/mim/jsalt/code/asr/kaldi-jsalt, so try it again now.
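
The thread does not record how the permissions were changed. One common way to find and fix this kind of problem is sketched below; the function names are hypothetical, and the check looks only at the "other" permission bits, which is an assumption about how the two accounts relate.

```shell
# Hypothetical helpers, not taken from the thread.

# List entries under a tree that "other" users cannot read or traverse:
# directories need r and x, everything else needs r.
list_unreadable() {
    find "$1" \( -type d ! -perm -o=rx \) -o \( ! -type d ! -perm -o=r \)
}

# Grant read to everyone. Capital X adds execute only to directories and to
# files that are already executable by someone.
open_tree() {
    chmod -R a+rX "$1"
}

# Example, using the directory from the error message above:
# list_unreadable /scratch/mim/jsalt/code/asr/kaldi-jsalt
# open_tree /scratch/mim/jsalt/code/asr/kaldi-jsalt
```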

nateanl · 2017-03-02T03:33:35Z

Nice, everything copied.

nateanl · 2017-03-02T18:46:11Z

I got an error like this:


loadtxt_ram()
1-grams: reading 4989 entries
done level 1
2-grams: reading 1639687 entries
done level 2
3-grams: reading 2684151 entries
done level 3
done
starting to use OOV words [<unk>]
OOV code is 4989
OOV code is 4989
OOV code is 4989
pruning LM with thresholds:
 1e-07 1e-07
ng: \<s\> 0 nextlevel_ts=1.99968 nextlevel_tbs=0.931817 k=1 ns=4206
savetxt: /scratch/near/kaldi-jsalt/egs/chime3/s5/data/local/nist_lm/lm_tgpr_5k.arpa
save: 4989 1-grams
save: 473820 2-grams
save: 656493 3-grams
done
Data preparation succeeded
Checked out revision 13265.
Dictionary preparation succeeded
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]
**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstcompile: error while loading shared libraries: libfstscript.so.1: cannot open shared object file: No such file or directory
fstarcsort: error while loading shared libraries: libfstscript.so.1: cannot open shared object file: No such file or directory
+ nj=30
+ enhan=enhanced
+ enhan_data=/scratch/near/CHiME3/v2/replayMessl/average/
+ '[' '!' -d data/lang ']'
+ local/real_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
local/real_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
cat: tr05_real.dot: No such file or directory
cat: tr05_real.dot: No such file or directory
cat: dt05_real.dot: No such file or directory
cat: dt05_real.dot: No such file or directory
cat: et05_real.dot: No such file or directory
cat: et05_real.dot: No such file or directory
Data preparation succeeded
+ local/simu_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
local/simu_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
cat: dt05_simu.dot: No such file or directory
cat: dt05_simu.dot: No such file or directory
cat: et05_simu.dot: No such file or directory
cat: et05_simu.dot: No such file or directory
Data preparation succeeded
+ mfccdir=mfcc/enhanced
+ for x in 'dt05_real_$enhan' 'et05_real_$enhan' 'tr05_real_$enhan' 'dt05_simu_$enhan' 'et05_simu_$enhan' 'tr05_simu_$enhan'
+ steps/make_mfcc.sh --nj 10 --cmd 'queue.pl -l arch=*64* -q all.q' data/dt05_real_enhanced exp/make_mfcc/dt05_real_enhanced mfcc/enhanced
steps/make_mfcc.sh --nj 10 --cmd queue.pl -l arch=*64* -q all.q data/dt05_real_enhanced exp/make_mfcc/dt05_real_enhanced mfcc/enhanced
utils/validate_data_dir.sh: Error: in data/dt05_real_enhanced, utterance lists extracted from utt2spk and text
utils/validate_data_dir.sh: differ, partial diff is:   
1,1640d0
< F01_050C0101_PED_REAL
< F01_050C0102_CAF_REAL
< F01_050C0102_STR_REAL
< F01_050C0103_BUS_REAL
< F01_050C0103_STR_REAL
...
< M04_423C0211_PED_REAL
< M04_423C0212_BUS_REAL
< M04_423C0213_CAF_REAL
< M04_423C0214_STR_REAL
< M04_423C0215_STR_REAL
< M04_423C0216_BUS_REAL
[Lengths are kaldi.oYmm/utts=1640 versus kaldi.oYmm/utts.txt=0]
+ nj=30
+ enhan=enhanced
+ enhan_data=/scratch/near/CHiME3/v2/replayMessl/average/
+ '[' '!' -d data/lang ']'
+ '[' '!' -d exp/tri3b_tr05_multi_enhanced ']'
+ echo 'error, execute local/run_gmm.sh, first'
error, execute local/run_gmm.sh, first
+ exit 1

mim · 2017-03-02T18:48:46Z

I would execute local/run_gmm.sh first, like the error message says. Unless you called it incorrectly. What was the command you used to call it?
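
The log above also shows fstcompile and fstarcsort failing to load libfstscript.so.1, which may need fixing before a rerun can get further. A minimal sketch of how to check for that is below; `missing_libs` is a hypothetical helper, and the library directory in the commented example is an assumption about where OpenFst lives in the copied tree.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# shared libraries that the loader could not find.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example (requires the binary to be on PATH):
# ldd "$(command -v fstcompile)" | missing_libs
#
# If libfstscript.so.1 is listed, one possible fix is to point the loader at
# the OpenFst lib directory of the copied tree (path is an assumption):
# export LD_LIBRARY_PATH=/scratch/near/kaldi-jsalt/tools/openfst-1.3.4/lib:$LD_LIBRARY_PATH
```

Recompiling the tools in the new location, as suggested earlier in the thread, is the other option.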

nateanl · 2017-03-02T18:49:41Z

./run.sh --do-ami false --do-reverb false --stage 3 --enhan-chime3 $SHORTNAME

where SHORTNAME=lstmc2Avg

mim · 2017-03-02T18:52:34Z

And you did the linking and everything to create the appropriate directory in data/chime3?

nateanl · 2017-03-02T18:54:25Z

Yes. I did change one line in run.sh, though:
chime3_data=/export/ws15-ffs-data/corpora/chime3/CHiME3
I changed it to chime3_data=/home/data/CHiME3
Is this correct?

mim · 2017-03-02T20:19:49Z

That should be correct, unless there's an extra subdirectory in our chime3 directory that this should point to.
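
One way to check for an extra subdirectory is to search the CHiME3 tree for the transcription files that the data preparation step failed to find (the `cat: dt05_real.dot: No such file or directory` lines in the log). The sketch below is only an illustration: `transcript_dirs` is a hypothetical helper, and the `*.dot*` pattern is an assumption based on the file names in that log.

```shell
# Hypothetical helper: print every directory under ROOT that contains
# transcription files matching *.dot* (pattern assumed from the log above).
transcript_dirs() {
    find -L "$1" -type f -name '*.dot*' -exec dirname {} \; | sort -u
}

# Example, using the path from the comment above:
# transcript_dirs /home/data/CHiME3
```

If this prints nothing, chime3_data probably points at the wrong level of the tree.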

nateanl · 2017-03-07T17:29:09Z

Now I'm getting closer. The next error is:
gmm dir not found: /export/ws15-ffs-data/swatanabe/tools/kaldi-trunk/egs/ami/s5/exp/mdm8/tri4a

Where can I find this model?

mim · 2017-03-07T17:41:10Z

Good question. I've copied it to /scratch/mim/kaldi/ami/exp/mdm8/tri4a/

nateanl · 2017-03-07T17:52:36Z

Can we submit the job to the server now? If not, I need to modify the code.
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
queue.pl: error submitting jobs to queue (return status was 256)
queue log file is data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log -l arch=*64* -t 1:4 /scratch/near/kaldi-jsalt/egs/jsalt15-ffs/s5/data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.sh >>data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log 2>&1
queue.pl: error submitting jobs to queue (return status was 256)
queue log file is data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log -l arch=*64* -t 1:4 /scratch/near/kaldi-jsalt/egs/jsalt15-ffs/s5/data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.sh >>data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log 2>&1
Unable to run job: warning: near your job is not allowed to run in any queue
Your job-array 6.1-4:1 ("make_mfcc_dt05_real_lstmc2Avg.sh") has been submitted.
Exiting.
Unable to run job: warning: near your job is not allowed to run in any queue
Your job-array 5.1-4:1 ("make_mfcc_et05_real_lstmc2Avg.sh") has been submitted.
Exiting.

nateanl · 2017-03-08T03:32:16Z

I found what's wrong.
When I try to submit a sample script to SGE, I get this error:

Unable to run job: warning: near your job is not allowed to run in any queue
Your job 16 ("Sleeper1") has been submitted.
Exiting.

I think I need to be added to some group to submit my job?

mim · 2017-03-08T03:47:30Z

@arsyed is there a group to add Zhaoheng to for sge?

arsyed · 2017-03-08T17:15:52Z

Thanks for spotting that. There's an "arusers" group attached to the "mainqueue". I've added user "near" to this group (using the qmon tool). That's the only difference I noticed with my username, so hopefully this works. Can you try submitting a job again?

If this works, we can document this on the wiki.
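
For the wiki, the check and the fix could be sketched roughly as follows. The change in this thread was made with qmon; the `qconf` commands in the comments are the usual command-line counterparts and are shown as an assumption, not as what was actually run. `queue_access_denied` is a hypothetical helper.

```shell
# Hypothetical helper: read qsub output on stdin and succeed if it contains
# the queue-access warning quoted earlier in this thread.
queue_access_denied() {
    grep -q 'not allowed to run in any queue'
}

# Submit a trivial test job and look for the warning:
# echo 'sleep 10' | qsub -N Sleeper1 -cwd -j y 2>&1 | queue_access_denied \
#     && echo "user has no queue access"
#
# Inspect and extend the access list (needs SGE manager or operator rights):
# qconf -su arusers        # show the members of the "arusers" list
# qconf -au near arusers   # add user "near" to "arusers"
```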

nateanl · 2017-03-08T17:20:21Z

I submitted the job successfully. It works now. Thanks a lot.

nateanl · 2017-03-09T16:37:10Z

I also need the dnn directory:
steps/nnet/decode.sh: missing file /export/ws15-ffs-data/swatanabe/tools/kaldi-trunk/egs/ami/s5/exp/mdm8/dnn4_pretrain-dbn_dnn/final.nnet

I think I found it, is it in /scratch/mim/kaldi/jsaltRecognizer/exp/mdm8/dnn4_pretrain-dbn_dnn?

mim · 2017-03-09T17:31:47Z

I've copied it over.
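
Both missing-model errors came from paths that still point at the workshop cluster under /export/ws15-ffs-data. A quick way to list any that remain in the recipe's shell scripts is sketched below; `stale_paths` is a hypothetical helper, and it only looks at `*.sh` files, so paths set in other config files would be missed.

```shell
# Hypothetical helper: list lines in shell scripts under DIR that mention
# PREFIX, printed as file:line:text.
# Usage: stale_paths DIR PREFIX
stale_paths() {
    # /dev/null forces grep to print the file name even for a single match
    find "$1" -type f -name '*.sh' -exec grep -n -- "$2" /dev/null {} +
}

# Example, run from the s5 directory:
# stale_paths . /export/ws15
```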

nateanl · 2017-03-09T21:38:38Z

Got it to work.

nateanl · 2017-03-10T02:14:35Z

The spreadsheet here: Spreadsheet

nateanl · 2017-03-10T02:23:30Z

There is no word error rate for the simulated data set, right?

mim · 2017-03-10T04:34:53Z

I think I only ran MESSL on the real test and dev files originally, not the simulated ones. But you ran it on the simulated files, right? If so, we can measure WER on them too. Just keep them separate.

nateanl · 2017-03-10T04:40:11Z

The resulting WER is higher than the result you got using MVDR. But I think that one uses channel 0 in the model, while we don't in our experiments. So our model is still comparable to the previous ones, right?

mim · 2017-03-10T15:02:14Z

The results in my paper didn't use channel 0, just like here. So it's doing similarly to before, i.e., hasn't improved on it.

nateanl · 2017-03-16T01:20:53Z

Why did I get such a high WER on the MESSL files? It's 72.59% on the dev set. I just ran replayMessl on it. What did you get in your previous experiment?

mim · 2017-03-16T01:32:23Z

My files for the best-performing system in those experiments (19.7% WER on dev and 32.6% WER on test, both on the real data only) are in /scratch/mim/kaldi/ami/exp/ihm/tri3_ali/replayXcMaskSoudenMaxSup09db, so you can try running it on them directly. You can also listen to yours and to those and check whether they sound the same, and check whether any files are missing from yours (or mine).

nateanl · 2017-03-16T01:41:13Z

There is no directory called replayXcMaskSoudenMaxSup09db. I assume you use the system variable for the directory, right?

mim · 2017-03-16T01:52:36Z

Oops, it's here: /scratch/mim/jsalt/data/chime3/out_beamformit/replayXcMaskSoudenMaxSup09db/
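
To check whether any files are missing from one set or the other, the two trees can be compared by relative file name. This is a minimal sketch; `missing_wavs` is a hypothetical helper, and it assumes both trees use the same relative layout and the `.wav` extension.

```shell
# Hypothetical helper: print the relative paths of .wav files that exist
# under REF_DIR but not under MY_DIR.
# Usage: missing_wavs REF_DIR MY_DIR
missing_wavs() {
    ref_list=$(mktemp) || return 1
    mine_list=$(mktemp) || return 1

    # Sorted lists of relative paths, one per tree
    (cd "$1" && find . -name '*.wav' | sort) > "$ref_list"
    (cd "$2" && find . -name '*.wav' | sort) > "$mine_list"

    # comm -23 keeps lines that appear only in the first list
    comm -23 "$ref_list" "$mine_list"

    rm -f "$ref_list" "$mine_list"
}

# Example (the second path is an assumption; swap the arguments to check the
# other direction):
# missing_wavs /scratch/mim/jsalt/data/chime3/out_beamformit/replayXcMaskSoudenMaxSup09db \
#     /scratch/near/CHiME3/v2/replayMessl/average
```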

mim · 2017-05-22T18:18:44Z

Done by Zhaoheng

nateanl · 2017-06-29T23:12:23Z

Edit run.sh and local/publish_results.sh to calculate WER score for the CHiME-3 simulated dataset.

nateanl · 2017-06-30T22:03:12Z

New link here:
https://docs.google.com/spreadsheets/d/1bnr5zlsEeTLsMY1l3QTGfzCtFMlQezD5QviJ3yNXJc4/edit#gid=132513526

Need to figure out what else can be added to the form, so I can find a way to upload all of them at once.

Perhaps open a new issue to do this.

mim · 2017-07-03T01:46:59Z

Thanks. So why did you reopen this? What are the new files you're evaluating? Can you create a new issue specifying what information you want to add to the form/spreadsheet?

nateanl · 2017-07-10T17:21:16Z

I just added WER for the simulated set. Last time we added the other columns manually. If we can add them when running the code, it'll be easier.

mim · 2017-07-11T02:29:52Z

Ah, ok, thanks. I think the code from JSALT should automatically upload both the real and synthetic test results, but maybe not. If not, you can make a new google form / spreadsheet that can accept it.

mim closed this as completed May 22, 2017

nateanl reopened this Jun 29, 2017
