Skip to content

Conversation

@jtrmal
Copy link
Contributor

@jtrmal jtrmal commented Nov 22, 2016

I did some changes in order to allow me to train on all data.
depends on @naxingyu and @danpovey if they decide merge or if a better strategy would be only cherry-pick some files or changes.
y.

@danpovey
Copy link
Contributor

I don't like the weird RESULTS filename.

@jtrmal
Copy link
Contributor Author

jtrmal commented Nov 22, 2016

that's how these filenames have been always generated for gale setups -- see arabic gale also...
I wanted to provide some numbers for @naxingyu as a reference.
I don't think it matters that much, unless you wanna merge it -- I'd suggest waiting how @naxingyu feels better solution for him.

@danpovey
Copy link
Contributor

I guess there's no hurry to merge, but do you see a reason why we would not
merge this? Does it improve results?
I don't like the idea of having a separate convention for writing results
for particular setups. I must have not reviewed the original commits for
that setup very well. I'd rather just have it as a RESULTS file, and
remove any old results files. People can figure out from the git log when
and who committed it, if they need.

On Tue, Nov 22, 2016 at 5:09 PM, jtrmal [email protected] wrote:

that's how these filenames have been always generated for gale setups --
see arabic gale also...
I wanted to provide some numbers for @naxingyu
https://github.com/naxingyu as a reference.
I don't think it matters that much, unless you wanna merge it -- I'd
suggest waiting how @naxingyu https://github.com/naxingyu feels better
solution for him.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ADJVu10pCCrfuY2J316-T2SmdwXVY8Qrks5rA2gogaJpZM4K5nPZ
.

@jtrmal
Copy link
Contributor Author

jtrmal commented Nov 22, 2016

I can, of course, change it -- no issue in that. Maybe the original idea was that the RESULTS file won't be overwritten by people running the recipes and generating their own results and if the results should be used as a reference, they should be manually renamed to RESULTS -- in that case I'm at fault, I wasn't sure of the reasons.
y.

@naxingyu
Copy link
Contributor

Thank you Yenda for the updated numbers. I was curious why there wasn't
a RESULT file, now it explains. There are several issues with the
original gale_mandarin recipe. It details WER of "report" and
"conversation" but the references don't exist. Maybe it was not
finished. And it confuses "test" with dev. But I saw that Yenda has
reshaped the data structure so it should be addressed by now. Some of
the example scripts are out-dated, using unavailable options of the
steps scripts. I'll see if there is anything I can do with this, but I
think we should agree on a suitable data structure. What Yenda proposes
seems reasonable.

Actually, just think out loud, there is no such thing as 'eval2000' and
'rt03' for mandarin/chinese. All the test sets are just dev sets. Is
there a possibility that we find a suitable chinese test set?

Xingyu

On 2016/11/23 6:09, jtrmal wrote:

that's how these filenames have been always generated for gale setups
-- see arabic gale also...
I wanted to provide some numbers for @naxingyu
https://github.com/naxingyu as a reference.
I don't think it matters that much, unless you wanna merge it -- I'd
suggest waiting how @naxingyu https://github.com/naxingyu feels
better solution for him.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ADKpxJVR8Vw4xnn7bsQeFBSjCdEcqA9Pks5rA2gogaJpZM4K5nPZ.

@danpovey
Copy link
Contributor

Xingyu, I'd say go ahead and do what you think is best, taking into account
Yenda's changes. This recipe is not properly maintained by anyone else;
don't assume that whatever you see there makes sense. When and if you
think I should merge Yenda's PR, let me know. I don't have time to think
about this much.

On Tue, Nov 22, 2016 at 9:33 PM, Xingyu Na [email protected] wrote:

Thank you Yenda for the updated numbers. I was curious why there wasn't
a RESULT file, now it explains. There are several issues with the
original gale_mandarin recipe. It details WER of "report" and
"conversation" but the references don't exist. Maybe it was not
finished. And it confuses "test" with dev. But I saw that Yenda has
reshaped the data structure so it should be addressed by now. Some of
the example scripts are out-dated, using unavailable options of the
steps scripts. I'll see if there is anything I can do with this, but I
think we should agree on a suitable data structure. What Yenda proposes
seems reasonable.

Actually, just think out loud, there is no such thing as 'eval2000' and
'rt03' for mandarin/chinese. All the test sets are just dev sets. Is
there a possibility that we find a suitable chinese test set?

Xingyu

On 2016/11/23 6:09, jtrmal wrote:

that's how these filenames have been always generated for gale setups
-- see arabic gale also...
I wanted to provide some numbers for @naxingyu
https://github.com/naxingyu as a reference.
I don't think it matters that much, unless you wanna merge it -- I'd
suggest waiting how @naxingyu https://github.com/naxingyu feels
better solution for him.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment),
or mute the thread
<https://github.com/notifications/unsubscribe-auth/
ADKpxJVR8Vw4xnn7bsQeFBSjCdEcqA9Pks5rA2gogaJpZM4K5nPZ>.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ADJVu5mFCQyszhxzG03Z5rYMhJO1xR_Wks5rA6YCgaJpZM4K5nPZ
.

@naxingyu
Copy link
Contributor

OK.

On 2016/11/23 10:39, Daniel Povey wrote:

Xingyu, I'd say go ahead and do what you think is best, taking into
account
Yenda's changes. This recipe is not properly maintained by anyone else;
don't assume that whatever you see there makes sense. When and if you
think I should merge Yenda's PR, let me know. I don't have time to think
about this much.

On Tue, Nov 22, 2016 at 9:33 PM, Xingyu Na [email protected]
wrote:

Thank you Yenda for the updated numbers. I was curious why there wasn't
a RESULT file, now it explains. There are several issues with the
original gale_mandarin recipe. It details WER of "report" and
"conversation" but the references don't exist. Maybe it was not
finished. And it confuses "test" with dev. But I saw that Yenda has
reshaped the data structure so it should be addressed by now. Some of
the example scripts are out-dated, using unavailable options of the
steps scripts. I'll see if there is anything I can do with this, but I
think we should agree on a suitable data structure. What Yenda proposes
seems reasonable.

Actually, just think out loud, there is no such thing as 'eval2000' and
'rt03' for mandarin/chinese. All the test sets are just dev sets. Is
there a possibility that we find a suitable chinese test set?

Xingyu

On 2016/11/23 6:09, jtrmal wrote:

that's how these filenames have been always generated for gale setups
-- see arabic gale also...
I wanted to provide some numbers for @naxingyu
https://github.com/naxingyu as a reference.
I don't think it matters that much, unless you wanna merge it -- I'd
suggest waiting how @naxingyu https://github.com/naxingyu feels
better solution for him.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment),
or mute the thread
<https://github.com/notifications/unsubscribe-auth/
ADKpxJVR8Vw4xnn7bsQeFBSjCdEcqA9Pks5rA2gogaJpZM4K5nPZ>.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub

#1207 (comment),
or mute
the thread

https://github.com/notifications/unsubscribe-auth/ADJVu5mFCQyszhxzG03Z5rYMhJO1xR_Wks5rA6YCgaJpZM4K5nPZ
.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ADKpxJTVXKxEzjNe3an9ZnbNwprQPy9nks5rA6dogaJpZM4K5nPZ.

@naxingyu
Copy link
Contributor

@jtrmal I don't see where you use all data to train. It's still running on LDC2013S08 and LDC2013T20.

@jtrmal
Copy link
Contributor Author

jtrmal commented Nov 23, 2016

Let me check, maybe I didn't commit the changes to run.sh

On Nov 23, 2016 8:11 AM, "Xingyu Na" [email protected] wrote:

@jtrmal https://github.com/jtrmal I don't see where you use all data to
train. It's still running on LDC2013S08 and LDC2013T20.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AKisX0iVGq2UY6WibNGMW5zUoID1K6itks5rBDuKgaJpZM4K5nPZ
.

@jtrmal
Copy link
Contributor Author

jtrmal commented Nov 23, 2016

I commit the run.sh and path.sh. Also renamed the RESULTS file. Sorry about
that.
y.

On Wed, Nov 23, 2016 at 8:26 AM, Jan Trmal [email protected] wrote:

Let me check, maybe I didn't commit the changes to run.sh

On Nov 23, 2016 8:11 AM, "Xingyu Na" [email protected] wrote:

@jtrmal https://github.com/jtrmal I don't see where you use all data
to train. It's still running on LDC2013S08 and LDC2013T20.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1207 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AKisX0iVGq2UY6WibNGMW5zUoID1K6itks5rBDuKgaJpZM4K5nPZ
.

exit 1
fi
fi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed? g2p.py is used later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a helpful error message if g2p is not found.

@jtrmal
Copy link
Contributor Author

jtrmal commented Nov 24, 2016 via email

Copy link
Contributor

@naxingyu naxingyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the comments, the RESULT file should be renamed. And did you plan to commit the changes you made on the scoring script?

exit 1
fi
fi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a helpful error message if g2p is not found.


wait
local/nnet/run_dnn.sh

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in local/nnet/run_dnn.sh, there is a invalid option to steps/nnet/train.sh (--use-gpu-id) and a missing done near the end of the script.

@@ -0,0 +1,61 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new script is not called in run.sh. We may think of deprecate the original split_wer.sh and rename this one as split_wer.sh. That one was errornous anyway.

@danpovey
Copy link
Contributor

danpovey commented Nov 30, 2016 via email

@naxingyu
Copy link
Contributor

naxingyu commented Dec 1, 2016 via email

This was referenced Dec 7, 2016
@jtrmal
Copy link
Contributor Author

jtrmal commented Dec 8, 2016

This is being handled by #1253, so closing...

@jtrmal jtrmal closed this Dec 8, 2016
@jtrmal jtrmal deleted the gale_recipe_fix branch February 10, 2017 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants