WIP: modify WaveData::Read(std::istream &is) function to deal with data re…#338
WIP: modify WaveData::Read(std::istream &is) function to deal with data re…#338Minhua722 wants to merge 6 commits intokaldi-asr:masterfrom
Conversation
…ad in stream mode when we don't know its exact size in advance
|
@kkm000 , can you please comment on this? |
src/feat/wave-reader.cc
Outdated
There was a problem hiding this comment.
rather performing the same check as above again, just use a bool like is_stream_mode which is set the first time you detect possible stream mode.
There was a problem hiding this comment.
I added "is_stream_mode" in the new commit
|
@Minhua722: This module has no tests. I started writing them but never actually got anywhere far on that. Do you think you would be able to create a few test cases? All it should take is put some known data into different wave files and then read it and verify it is read correctly. There is already the The machinery being added is quite complex and has some edge cases. A real danger for code like this (and any signal processing code in general) is that quite serious bugs in it can go undiscovered for years, as ML algorithms would often show only a very slight decrease in performance on quite mangled data. I do not know how much time do you have on hand, but adding tests would be a very big improvement for this part of kaldi. I am keeping #88 open to keep that on my radar. |
src/feat/wave-reader.cc
Outdated
There was a problem hiding this comment.
Can you maybe rewrite this branch as well? I understand it is old code, but it is extremely ugly. No need to allocate a kerzillion of small buffers, read into them, then merge into a large one. Just read all into its final location in one single read! You should actually get 3 lines of code instead of 40 some. Something to the effect of
chunk_data_vec.resize(data_chunk_size);
is.read(&chunk_data_vec.first(), data_chunk_size);
uint32 num_bytes_read = is.gcount();This is actually all it should take to read the chunk!
|
@kkm000 I've rewritten the original wave read code. For the test module staff, I'm not sure if I understand the problem properly and I've sent you another email separately. |
|
I've added my current understanding of the test staff |
|
@Minhua722: It might be really unsimple to properly test this streaming code. The idea I have in mind is to prepare a couple files (representing egde cases, like finite stream and infinite stream) of known simple contents and compare the result. You are on the right way, but I do not think the matlab ASCII file is the right data set to compare. But in any case, if you are not comfortable doing that, do not worry if you decide to drop it on the floor; I'll pick it up! :) |
|
One way to do this might be to have a function like this: void CreateWaveTestCase(std::ostream &os, that will create some random wave data and write it in a randomized way Dan On Sun, Nov 22, 2015 at 11:30 PM, Kirill Katsnelson <
|
|
Sorry. I'm still confused of what I should do... |
|
Kirill, how about I wait until the test code is done before merging this On Mon, Nov 23, 2015 at 12:52 AM, Minhua722 notifications@github.com
|
|
@danpovey, @Minhua722: No problem, I'll pick it up from this point. |
|
I am marking this original PR WIP since I am pulling in this branch to work on it. |
|
@danpovey: I have been working on the tests tonight, and wanted to come up with a test scaffolding to easily code different structure of wave files we support (and there is a lot of edge cases already). What I am thinking of is a simple syntax to express the prototype file structure right in code, and then allowing tests with different combinations of byte ordering and streaming/non-streaming encoding produced from this prototype. Essentially I am thinking along the lines of TestWaveData wave_file(
Chunk("fmt ", L(1), S(2), S(3)),
Chunk("data", S(4), S(5), S(6)));
istream& stm = wave_file.RenderStream(byte_order_flag, infinite_length_flag, ...);
This all is cute and nice. C++ however is not the easiest language to express these things (ah, Lisp would ve very natural!). My first attempt at a preprocessor abuse did not end up well: I came up with a horrific combo of macros and classes to make a nicer syntax possible. I do not really like it. My working implementation of the exact syntax above uses some C++11 features, and they are disabled in the compilation by default. What do you think of this--would you be ok with enabling -std=c++0x for tests only? They are not part of the Kaldi main library code anyway. This should be fine with gcc 4.7+, and MSVC and Intel are fine with that too, if anyone ever runs tests under these at all except me. I do not really know right now how to express these test cases with binary streams concisely without the new language features. |
|
Why don't you just #ifdef the tests? Kald may still get built on systems Dan On Sat, Nov 28, 2015 at 3:59 AM, Kirill Katsnelson <notifications@github.com
|
|
Actually, I think it makes sense to do both. Even the latest gcc 4.9 does not enable C++11 features without the switch. AFAIK, clang and msvc are free of this idiosyncrasy. |
|
I don't think it makes sense to activate flags for the test programs that On Sat, Nov 28, 2015 at 4:54 PM, Kirill Katsnelson <notifications@github.com
|
|
Well, I am frankly torn on this. I also do not entirely appreciate the idea of enabling the new language features even for the tests, as this opens a whole fresh can of worms. In addition, having to run the tests manually is a sure way to have it abandoned. On the other hand, we'll have enabled them on Travis in any case, so this may be of a lesser concern. I think I should look more into the preprocessor + template solution not requiring C++11. Maybe I'll come up with something more elegant than what I had. It is not going to be typesafe and more prone to runtime errors in defining the test, but we should not really care, I believe, as long as the tests cases are defined correctly with the limitations in mind. What do you think is better approach of the two? And I am implicitly dismissing creating tests with wave files defined with just |
|
It's up to you-- it's fine with me if these particular tests get executed On Sat, Nov 28, 2015 at 8:46 PM, Kirill Katsnelson <notifications@github.com
|
|
SGTM, thanks. Ceteris paribus, I'll stick to the least ugly approach. |
|
@kkm000, did you actually finish this? I though this pull request was waiting on something from you, but I may be wrong. |
|
@kkm000, are you still going to finish this pull request, or should I find someone else to take ownership? |
|
@danpovey: I'll do it, it's on my list. Not the first priority. I think I should find some time in February. |
|
Kirill, remember this is on your TODO list! |
|
I do! I'll get to it soon. |
|
@kkm000, any update on this? Can I just ask someone else to look at this? |
|
@kkm000, what do you think should be done about this? Will you do it eventually? |
|
Hi everybody, is there any update on this? Now that latest release of ffmpeg supports sphere files (with mulaw/Alaw/shorten encodings), being able to replace sph2pipe with ffmpeg without workarounds seems very appealing. |
|
I think there isn't an update. I'll try to think of someone to take it However, this PR is orthogonal to the question of replacing sph2pipe with Dan On Tue, Jul 19, 2016 at 8:01 AM, Giulio Paci notifications@github.com
|
|
@LvHang, can you please look into this PR, resolve the conflicts and update it, check the code for obvious bugs, and try to test that it doesn't break reading normal wav files? Kirill has been saying that he would write test code but I think we should accept that that isn't going to happen. |
|
Ok, get it
Daniel Povey <notifications@github.com>于2017年4月13日 周四19:57写道:
… @LvHang <https://github.com/LvHang>, can you please look into this PR,
resolve the conflicts and update it, check the code for obvious bugs, and
try to test that it doesn't break reading normal wav files? Kirill has been
saying that he would write test code but I think we should accept that that
isn't going to happen.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#338 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ATlBoJtZJ7XGu3gGjKZdBE_e8jZM-yspks5rvraDgaJpZM4GkRGA>
.
|
|
Sorry. Let me get back on it right now. I'll give an update later today. |
|
Thanks!
…On Fri, Apr 14, 2017 at 11:58 AM, Kirill Katsnelson < ***@***.***> wrote:
Sorry. Let me get back on it right now. I'll give an update later today.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#338 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ADJVux0gwIuWdcH2SfTj--G69x2PPZPNks5rv8HOgaJpZM4GkRGA>
.
|
|
So, that means kkm000 will write the test code and I will fix the conflicts, right? |
|
He may just finish the whole thing himself, but if there isn't progress
within a day or two I may ask you to do do it yourself.
…On Fri, Apr 14, 2017 at 12:00 PM, LvHang ***@***.***> wrote:
So, that means kkm000 will write the test code and I will fix the
conflicts, right?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#338 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ADJVu16xzI8QeeSDFYQmkzAoPAn6qfsiks5rv8JGgaJpZM4GkRGA>
.
|
…asr#88) Slightly modified work of @Minhua722 lifted from PR kaldi-asr#338
…asr#88) Slightly modified work of @Minhua722 lifted from PR kaldi-asr#338
|
Continued in #1546, hence closing. |
…asr#88) Slightly modified work of @Minhua722 lifted from PR kaldi-asr#338
…ats (#1546) Based on some earlier work of @Minhua722 lifted from PR #338.
Slightly modified work of @Minhua722 lifted from PR kaldi-asr#338. Key changes: * Only 16-bit signed linear PCM sample format is supported. For everything else, there is SoX. * class WaveInfo separated from WaveData, and now used by wav-to-duration. * Added basic wave reader tests. Closes: kaldi-asr#88 Closes: kaldi-asr#338 Closes: kaldi-asr#1567
…ats (kaldi-asr#1546) Based on some earlier work of @Minhua722 lifted from PR kaldi-asr#338.
modify WaveData::Read(std::istream &is) function to deal with data read in stream mode when we don't know its exact size in advance.
If data are read in stream mode, we will get riff_chunk_size, data_chunk_size either 0 or max of Uint32. The function is modified to deal with this case separately.