Skip to content

WIP: modify WaveData::Read(std::istream &is) function to deal with data re…#338

Closed
Minhua722 wants to merge 6 commits intokaldi-asr:masterfrom
Minhua722:wav-read
Closed

WIP: modify WaveData::Read(std::istream &is) function to deal with data re…#338
Minhua722 wants to merge 6 commits intokaldi-asr:masterfrom
Minhua722:wav-read

Conversation

@Minhua722
Copy link
Contributor

modify WaveData::Read(std::istream &is) function to deal with data read in stream mode when we don't know its exact size in advance.

If data are read in stream mode, we will get riff_chunk_size, data_chunk_size either 0 or max of Uint32. The function is modified to deal with this case separately.

…ad in stream mode when we don't know its exact size in advance
@danpovey
Copy link
Contributor

@kkm000 , can you please comment on this?
Vijay will also check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather performing the same check as above again, just use a bool like is_stream_mode which is set the first time you detect possible stream mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added "is_stream_mode" in the new commit

@kkm000
Copy link
Contributor

kkm000 commented Nov 20, 2015

@Minhua722: This module has no tests. I started writing them but never actually got anywhere far on that. Do you think you would be able to create a few test cases? All it should take is put some known data into different wave files and then read it and verify it is read correctly. There is already the test_data suborder for these

The machinery being added is quite complex and has some edge cases. A real danger for code like this (and any signal processing code in general) is that quite serious bugs in it can go undiscovered for years, as ML algorithms would often show only a very slight decrease in performance on quite mangled data. I do not know how much time do you have on hand, but adding tests would be a very big improvement for this part of kaldi. I am keeping #88 open to keep that on my radar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe rewrite this branch as well? I understand it is old code, but it is extremely ugly. No need to allocate a kerzillion of small buffers, read into them, then merge into a large one. Just read all into its final location in one single read! You should actually get 3 lines of code instead of 40 some. Something to the effect of

chunk_data_vec.resize(data_chunk_size);
is.read(&chunk_data_vec.first(), data_chunk_size);
uint32 num_bytes_read = is.gcount();

This is actually all it should take to read the chunk!

@Minhua722
Copy link
Contributor Author

@kkm000 I've rewritten the original wave read code.

For the test module staff, I'm not sure if I understand the problem properly and I've sent you another email separately.

@Minhua722
Copy link
Contributor Author

I've added my current understanding of the test staff

@kkm000
Copy link
Contributor

kkm000 commented Nov 23, 2015

@Minhua722: It might be really unsimple to properly test this streaming code. The idea I have in mind is to prepare a couple files (representing egde cases, like finite stream and infinite stream) of known simple contents and compare the result. You are on the right way, but I do not think the matlab ASCII file is the right data set to compare. But in any case, if you are not comfortable doing that, do not worry if you decide to drop it on the floor; I'll pick it up! :)

@danpovey
Copy link
Contributor

One way to do this might be to have a function like this:

void CreateWaveTestCase(std::ostream &os,
WaveData *data);

that will create some random wave data and write it in a randomized way
(i.e. possibly with zero or infinite or inaccurate header and possibly
multiple data chunks), and for reference will write the data that the
stream is supposed to represent.
Then the wave-reading code could read that stream and make sure the
resulting object is the same as it is supposed to be.

Dan

On Sun, Nov 22, 2015 at 11:30 PM, Kirill Katsnelson <
notifications@github.com> wrote:

@Minhua722 https://github.com/Minhua722: It might be really unsimple to
properly test this streaming code. The idea I have in mind is to prepare a
couple files (representing egde cases, like finite stream and infinite
stream) of known simple contents and compare the result. You are on the
right way, but I do not think the matlab ASCII file is the right data set
to compare. But in any case, if you are not comfortable doing that, do not
worry if you decide to drop it on the floor; I'll pick it up! :)


Reply to this email directly or view it on GitHub
#338 (comment).

@Minhua722
Copy link
Contributor Author

Sorry. I'm still confused of what I should do...
@kkm000 Maybe I could work on it after you push an initial version.

@danpovey
Copy link
Contributor

Kirill, how about I wait until the test code is done before merging this
pull request.. I don't think it's urgent.

On Mon, Nov 23, 2015 at 12:52 AM, Minhua722 notifications@github.com
wrote:

Sorry. I'm still confused of what I should do...
@kkm000 https://github.com/kkm000 Maybe I could work on it after you
push an initial version.


Reply to this email directly or view it on GitHub
#338 (comment).

@kkm000
Copy link
Contributor

kkm000 commented Nov 24, 2015

@danpovey, @Minhua722: No problem, I'll pick it up from this point.

@kkm000 kkm000 self-assigned this Nov 24, 2015
@kkm000 kkm000 changed the title modify WaveData::Read(std::istream &is) function to deal with data re… WIP: modify WaveData::Read(std::istream &is) function to deal with data re… Nov 27, 2015
@kkm000
Copy link
Contributor

kkm000 commented Nov 27, 2015

I am marking this original PR WIP since I am pulling in this branch to work on it.

@kkm000
Copy link
Contributor

kkm000 commented Nov 28, 2015

@danpovey: I have been working on the tests tonight, and wanted to come up with a test scaffolding to easily code different structure of wave files we support (and there is a lot of edge cases already). What I am thinking of is a simple syntax to express the prototype file structure right in code, and then allowing tests with different combinations of byte ordering and streaming/non-streaming encoding produced from this prototype. Essentially I am thinking along the lines of

TestWaveData wave_file(
  Chunk("fmt ",  L(1), S(2), S(3)),
  Chunk("data", S(4), S(5), S(6)));
istream& stm = wave_file.RenderStream(byte_order_flag, infinite_length_flag, ...);

L and S are for a "long" (32-bit) and "short" data words, and the combinations of flags would provide multiple different streams to feed WaveReader.

This all is cute and nice. C++ however is not the easiest language to express these things (ah, Lisp would ve very natural!). My first attempt at a preprocessor abuse did not end up well: I came up with a horrific combo of macros and classes to make a nicer syntax possible. I do not really like it. My working implementation of the exact syntax above uses some C++11 features, and they are disabled in the compilation by default.

What do you think of this--would you be ok with enabling -std=c++0x for tests only? They are not part of the Kaldi main library code anyway. This should be fine with gcc 4.7+, and MSVC and Intel are fine with that too, if anyone ever runs tests under these at all except me. I do not really know right now how to express these test cases with binary streams concisely without the new language features.

@danpovey
Copy link
Contributor

Why don't you just #ifdef the tests? Kald may still get built on systems
without c++11.

Dan

On Sat, Nov 28, 2015 at 3:59 AM, Kirill Katsnelson <notifications@github.com

wrote:

@danpovey https://github.com/danpovey: I have been working on the tests
tonight, and wanted to come up with a test scaffolding to easily code
different structure of wave files we support (and there is a lot of edge
cases already). What I am thinking of is a simple syntax to express the
prototype file structure right in code, and then allowing tests with
different combinations of byte ordering and streaming/non-streaming
encoding produced from this prototype. Essentially I am thinking along the
lines of

TestWaveData wave_file(
Chunk("fmt ", L(1), S(2), S(3)),
Chunk("data", S(4), S(5), S(6)));
istream& stm = wave_file.RenderStream(byte_order_flag, infinite_length_flag, ...);

L and S are for a "long" (32-bit) and "short" data words, and the
combinations of flags would provide multiple different streams to feed
WaveReader.

This all is cute and nice. C++ however is not the easiest language to
express these things (ah, Lisp would ve very natural!). My first attempt at
a preprocessor abuse did not end up well: I came up with a horrific combo
of macros and classes to make a nicer syntax possible. I do not really like
it. My working implementation of the exact syntax above uses some C++11
features, and they are disabled in the compilation by default.

What do you think of this--would you be ok with enabling -std=c++0x for
tests only? They are not part of the Kaldi main library code anyway. This
should be fine with gcc 4.7+, and MSVC and Intel are fine with that too, if
anyone ever runs tests under these at all except me. I do not really know
right now how to express these test cases with binary streams concisely
without the new language features.


Reply to this email directly or view it on GitHub
#338 (comment).

@kkm000
Copy link
Contributor

kkm000 commented Nov 28, 2015

Actually, I think it makes sense to do both. Even the latest gcc 4.9 does not enable C++11 features without the switch. AFAIK, clang and msvc are free of this idiosyncrasy.

@danpovey
Copy link
Contributor

I don't think it makes sense to activate flags for the test programs that
are not being used in the main program-- it's a mismatch. And this isn't
really a good enough reason to modify the Makefiles. You can just #ifdef
the test and test yourself that it runs correctly by running with c++0x in
your local environment- the test won't run by default.
Dan

On Sat, Nov 28, 2015 at 4:54 PM, Kirill Katsnelson <notifications@github.com

wrote:

Actually, I think it makes sense to do both. Even the latest gcc 4.9 does
not enable C++11 features without the switch. AFAIK, clang and msvc are
free of this idiosyncrasy.


Reply to this email directly or view it on GitHub
#338 (comment).

@kkm000
Copy link
Contributor

kkm000 commented Nov 29, 2015

Well, I am frankly torn on this. I also do not entirely appreciate the idea of enabling the new language features even for the tests, as this opens a whole fresh can of worms. In addition, having to run the tests manually is a sure way to have it abandoned. On the other hand, we'll have enabled them on Travis in any case, so this may be of a lesser concern.

I think I should look more into the preprocessor + template solution not requiring C++11. Maybe I'll come up with something more elegant than what I had. It is not going to be typesafe and more prone to runtime errors in defining the test, but we should not really care, I believe, as long as the tests cases are defined correctly with the limitations in mind.

What do you think is better approach of the two?

And I am implicitly dismissing creating tests with wave files defined with just char data[] = { 0x... }; or checking in binary files. That would be really ugly,

@danpovey
Copy link
Contributor

It's up to you-- it's fine with me if these particular tests get executed
only by you for the time being. We can just run them again if we ever
change that code again. I don't think it's super critical.
Dan

On Sat, Nov 28, 2015 at 8:46 PM, Kirill Katsnelson <notifications@github.com

wrote:

Well, I am frankly torn on this. I also do not entirely appreciate the
idea of enabling the new language features even for the tests, as this
opens a whole fresh can of worms. In addition, having to run the tests
manually is a sure way to have it abandoned. On the other hand, we'll have
enabled them on Travis in any case, so this may be of a lesser concern.

I think I should look more into the preprocessor + template solution not
requiring C++11. Maybe I'll come up with something more elegant that what I
had. It is not going to be typesafe and more prone to runtime errors in
defining the test, but we should not really care, I believe, as long as the
tests cases are defined correctly with the limitations in mind.

What do you think is better approach of the two?

And I am implicitly dismissing creating tests with wave files defined with
just char data[] = { 0x... }; or checking in binary files. That would be
really ugly,


Reply to this email directly or view it on GitHub
#338 (comment).

@kkm000
Copy link
Contributor

kkm000 commented Nov 29, 2015

SGTM, thanks. Ceteris paribus, I'll stick to the least ugly approach.

@danpovey
Copy link
Contributor

danpovey commented Jan 9, 2016

@kkm000, did you actually finish this? I though this pull request was waiting on something from you, but I may be wrong.

@danpovey
Copy link
Contributor

@kkm000, are you still going to finish this pull request, or should I find someone else to take ownership?

@kkm000
Copy link
Contributor

kkm000 commented Jan 31, 2016

@danpovey: I'll do it, it's on my list. Not the first priority. I think I should find some time in February.

@danpovey
Copy link
Contributor

Kirill, remember this is on your TODO list!

@kkm000
Copy link
Contributor

kkm000 commented Mar 28, 2016

I do! I'll get to it soon.

@danpovey
Copy link
Contributor

danpovey commented May 7, 2016

@kkm000, any update on this? Can I just ask someone else to look at this?

@danpovey
Copy link
Contributor

@kkm000, what do you think should be done about this? Will you do it eventually?

@giuliopaci
Copy link
Contributor

Hi everybody, is there any update on this? Now that latest release of ffmpeg supports sphere files (with mulaw/Alaw/shorten encodings), being able to replace sph2pipe with ffmpeg without workarounds seems very appealing.

@danpovey
Copy link
Contributor

I think there isn't an update. I'll try to think of someone to take it
over from Kirill, since he doesn't seem to have time.

However, this PR is orthogonal to the question of replacing sph2pipe with
ffmpeg. I'd rather wait a while to do that, until the latest release of
ffmpeg has propagated to all the repositories of the standard distributions.

Dan

On Tue, Jul 19, 2016 at 8:01 AM, Giulio Paci notifications@github.com
wrote:

Hi everybody, is there any update on this? Now that latest release of
ffmpeg http://ffmpeg.org/download.html#release_3.1 supports sphere
files (with mulaw/Alaw/shorten encodings), being able to replace sph2pipe
with ffmpeg without workarounds seems very appealing.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#338 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ADJVu1UEnL8-nlOeZTbNhq1husb2on9iks5qXOa3gaJpZM4GkRGA
.

@danpovey
Copy link
Contributor

@LvHang, can you please look into this PR, resolve the conflicts and update it, check the code for obvious bugs, and try to test that it doesn't break reading normal wav files? Kirill has been saying that he would write test code but I think we should accept that that isn't going to happen.

@LvHang
Copy link
Contributor

LvHang commented Apr 13, 2017 via email

@kkm000
Copy link
Contributor

kkm000 commented Apr 14, 2017

Sorry. Let me get back on it right now. I'll give an update later today.

@danpovey
Copy link
Contributor

danpovey commented Apr 14, 2017 via email

@LvHang
Copy link
Contributor

LvHang commented Apr 14, 2017

So, that means kkm000 will write the test code and I will fix the conflicts, right?

@danpovey
Copy link
Contributor

danpovey commented Apr 14, 2017 via email

kkm000 pushed a commit to kkm000/kaldi that referenced this pull request Apr 15, 2017
kkm000 pushed a commit to kkm000/kaldi that referenced this pull request Apr 15, 2017
@kkm000
Copy link
Contributor

kkm000 commented Apr 15, 2017

Continued in #1546, hence closing.

@kkm000 kkm000 closed this Apr 15, 2017
kkm000 pushed a commit to kkm000/kaldi that referenced this pull request Apr 16, 2017
danpovey pushed a commit that referenced this pull request Apr 17, 2017
kkm000 pushed a commit to kkm000/kaldi that referenced this pull request Apr 28, 2017
Slightly modified work of @Minhua722 lifted from PR kaldi-asr#338. Key changes:

 * Only 16-bit signed linear PCM sample format is supported. For
   everything else, there is SoX.
 * class WaveInfo separated from WaveData, and now used by wav-to-duration.
 * Added basic wave reader tests.

Closes: kaldi-asr#88
Closes: kaldi-asr#338
Closes: kaldi-asr#1567
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants