-
-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create wav file source and sink #576
Comments
This is pseudo-code that updates WAV header. It can be invoked each time after we write more samples to file, and at the very beginning when opening file. #define WAVE_FORMAT_PCM 1
#define WAVE_FORMAT_IEEE_FLOAT 3
struct WavHeader
{
uint32_t chunk_id;
uint32_t chunk_size;
uint32_t format;
uint32_t subchunk1_id;
uint32_t subchunk1_size;
uint16_t audio_format;
uint16_t num_channels;
uint32_t sample_rate;
uint32_t byte_rate;
uint16_t block_align;
uint16_t bits_per_sample;
uint32_t subchunk2_id;
uint32_t subchunk2_size;
};
void update_header(FILE* file, size_t num_samples, size_t num_channels, size_t sample_rate)
{
const uint32_t data_len = uint32_t(num_samples * num_channels) * sizeof(sample_t);
WavHeader hdr = {};
memcpy(&hdr.chunk_id, "RIFF", 4);
hdr.chunk_size = to_little_endian_32(data_len + 36);
memcpy(&hdr.format, "WAVE", 4);
memcpy(&hdr.subchunk1_id, "fmt ", 4);
hdr.subchunk1_size = to_little_endian_32(16);
hdr.audio_format = to_little_endian_16(WAVE_FORMAT_PCM);
hdr.num_channels = to_little_endian_16((uint16_t)num_channels);
hdr.sample_rate = to_little_endian_32((uint32_t)sample_rate));
hdr.byte_rate =
to_little_endian_32((uint32_t)(sample_rate * num_channels) * sizeof(sample_t));
hdr.block_align = to_little_endian_16((uint16_t)num_channels * sizeof(sample_t));
hdr.bits_per_sample = to_little_endian_16(8 * sizeof(sample_t));
memcpy(&hdr.subchunk2_id, "data", 4);
hdr.subchunk2_size = to_little_endian_32(data_len);
if (fseek(file, 0, SEEK_SET) != 0) {
/* error */
}
if (fwrite(&hdr, sizeof(hdr), 1, file) != 1) {
/* error */
}
if (fseek(file, 0, SEEK_END) != 0) {
/* error */
}
} |
Hello, I'd like to try my strengths by resolving this issue, of course if it's still available for assignment. |
You're welcome, thank you! |
I have a question regarding a couple of things I've had issues deciding on what should be done:
roc_log(LogInfo,
"wav source:"
" in_bits=%lu out_bits=%lu in_rate=%lu out_rate=%lu"
" in_ch=%lu out_ch=%lu",
(unsigned long)wav_.bitsPerSample, (unsigned long)wav_.bitsPerSample,
(unsigned long)wav_.sampleRate, (unsigned long)wav_.sampleRate,
(unsigned long)wav_.channels, (unsigned long)wav_.channels); vs roc_log(LogInfo,
"sox source:"
" in_bits=%lu out_bits=%lu in_rate=%lu out_rate=%lu"
" in_ch=%lu out_ch=%lu is_file=%d",
(unsigned long)input_->encoding.bits_per_sample,
(unsigned long)in_signal_.precision, (unsigned long)input_->signal.rate,
(unsigned long)in_signal_.rate, (unsigned long)input_->signal.channels,
(unsigned long)in_signal_.channels, (int)is_file_); I assume that previously the config defined also output parameters for conversion. I'm not sure about the conversion with regard to bits per sample. It was not mentioned but I'd want to make sure that this conversion should be a possibility.
I'm not sure if there's an issue made for 3rd party libs sources, but I've had to change slightly the script to be able to perform a complete build, namely for If you'd like to follow the development, hopefully this week it will be major, please follow the fork fork. It may make some of my points clearer if looked into WIP commits. Cosmetic issues will be resolved after 'WIP stage'. |
Hi! Good catches. I think for wav source, there are only two valid uses of If user provided Note that for wav sink, the behavior is different: if AFAIK, WAV format does not specify semantics & order of channels, it just has channel count, and intepretation of channels is up to the user. However, I think common convention is to use SMPTE order (example). Roc supports two channel layouts (ChannelSet.layout):
In surround layout, different orders are possible (ChannelSet.order), and by default roc uses SMPTE order. By default, we can assume that WAV file has surround layout with SMPTE order. We should set corresponding values to sample spec. And if WAV file has N channels, we can just enable first N bits of channel set (ChannelSet.set_channel_range). We also need to allow the user to explicitly specify what channels are stored in the WAV file to override default behavior. I think we can do it later. In next release (I hope) we'll replace
Not sure why it happens, what OS are you using? Anyway, the change makes sense, feel free to create a PR with that.
A few notes:
|
Hi, thanks for your insights! Regarding the points:
I'm using Ubuntu 22.04.3 LTS running on WSL2, though in this case it was a problem with my Internet connection as I'm mostly operating on a network with blocked FTP ports. I should've noticed that, so I'm sorry for the confusion. I've verified the links on my mobile connection and they work perfectly fine, so I'll reverse the change as it's not a part of this issue, unless requested otherwise.
It should be achievable with |
I see. I don't know about you case, but quick googling suggests that ISP providers sometimes limit FTP ports, so apparently your changes can be helpful. (and also ftp is insecure) |
Hi, I have another couple of questions: I'm struggling with endian conversion for the purposes of writing the WAV data. For now I'm using Should we implement any conversion between sample sizes, ie. bits per sample from 8b to 16/24/32b etc. or down from 16b to 8b. If a parameter of bits per sample is not divisible by 8 starting from a value of 8, should it be validated and rejected? Panic or error? The type of Are there any plans for handling WAV files larger than 2^32-1 bytes? I think there are typos for Relevant code is available in the fork. |
We could use std::endian if it was available, because we only avoid "STL" part of stdlib (template containers, iterators, etc). However we use C++98 so it's not available. Also we support a lot of (embedded) targets that don't have C++20 compiler yet. For endian conversions, we already have two headers:
For reading, there is For writing, we can just always produce WAV files with 32-bit floats (format = WAVE_FORMAT_IEEE_FLOAT = 3), and don't support other formats, for now.
Conversion is done by scaling each sample. E.g. to convert from float32 (in range -1 to +1) to int16 (in range -32768 to 32767), you multiply each float by 32767 and round it to integer. In case of WAV backend, as I wrote above, conversion should be needed only when reading, and dr_wav should handle it for us. Though, if we would need it, we have audio::PcmMapper class that can convert between different types and endians.
For 32-bit stereo with 48Khz sample rate, 4 GB is 3 hours of sound, which is not that huge as one could think. If support for large files does not need extra efforts, it would be nice to have it. But I don't think it's essential.
Yep, indeed, thanks. |
FYI: #246 (comment) |
Small notice: since we removed |
What kind of driver should be pushed if even it should be pushed in void discover_drivers(core::Array<DriverInfo, MaxDrivers>& driver_list) If Aside from that the only things left for me is to test changes, fix problems if there are any, optimize and clean-up the code. Thanks again for your insights! |
|
Landed, thanks! |
- make wav backend lower priority than sndio backend, as sndio can handle more variation of WAV/WAVEX files - don't try wav backend if either driver or file extension is not "wav" - use SampleSpec::use_defaults()
- make wav backend lower priority than sndio backend, as sndio can handle more variation of WAV/WAVEX files - don't try wav backend if either driver or file extension is not "wav" - use SampleSpec::use_defaults()
Add new source and sink implementations:
WavSource
andWavSink
. Also addWavBackend
corresponding to them and register it inBackendDispatcher
.WavSource
should read wav file using dr_wav library. The library is single header, which we can just copy to3rdparty/dr_wav
and add to include path in 3rdparty/SConscript the same way as we do it for hedley.WavSink
should write wav file. But instead of using dr_wav, we should implement writing manually. The reason is that we need to update wav file header with the new size each time when we append more samples to file. This feature makes output file usable at any point of time. If we used dr_wav instead, the file would be usable only after closing the file.The new backend should be employed when user specifies
file://<path>.wav
as--input
parameter ofroc-recv
or--output
parameter ofroc-send
. We should ensure thatroc-recv
androc-send
will employ this backend instead of sox for wav files, and will work correctly.Wav source or sink should report that it doesn't have clock and latency. Wav source should detect sample rate and channel count from the file.
Source, sink, and backend interfaces are here: src/internal_modules/roc_sndio (see
ISource
,ISink
, andIBackend
)Example of backend implementation can be found here: src/internal_modules/roc_sndio/target_sox/roc_sndio
Documentation for CLI tools is here: https://roc-streaming.org/toolkit/docs/tools/command_line_tools.html
See also #246 for additional hints.
The text was updated successfully, but these errors were encountered: