Possible bug in the audio part of SelectRangeEvery #232
Comments
Without digging deeper into the problem, let me ask a question.
Thanks for your reply. It doesn't; I believe it's a bug in how exactly the audio is processed when audio=true. It will often even work (seemingly) well for the first few seconds or so (at least in my case), which is why it might be easy to overlook. The most noticeable kind of glitch is when the output becomes silent, but I reckon this doesn't tend to happen at the beginning of the output because the offsets are still so small that even when jumping too far ahead, you still end up with some kind of audio instead of silence. I reckon the silence happens when the error becomes so large that it tries to query source audio past its end. Here's an example with SelectRangeEvery(240,60). And here's the result, rendered twice. Nothing was changed, it was just rendered out twice from VirtualDub: As you can see, the two results are different. I think this is because VirtualDub won't always query the audio in the exact same order, so the position of the errors shifts. Some parts remain stable across the two attempts, others change, move around the range borders, etc. But both have one thing in common: the beginning of the audio seems okay at first glance if you just take a quick listen.
What happens with libavformat, though? It could mostly (or entirely) be a bug in VirtualDub, or a consequence of VDub probably having to go through the ACM to decode/render audio (or even the interference of accessing AviSynth through Video for Windows).
Sorry, I don't know what libavformat is, but I've also tried ffmpeg, if that answers the question. Same issue. I also tried rendering only audio versus audio with video, which changes the durations requested in one go. With video rendering, the artifacts become shorter and more frequent, because the number of samples requested is typically exactly one video frame (in my example it was 1920 samples per request iirc, but of course that depends on the frame rate and sample rate). I should perhaps mention that I made my own version of the SelectRangeEvery filter: https://github.com/TomArrow/SelectRangeEveryReversing That is how I found the error. Making the suggested change fixed the issue for me. If you look at the logic of the function, I think it's pretty clear what it should be. Initially, startframe is set with: int startframe = vi.FramesFromAudioSamples(start); Then: const int iteration = startframe / length; And lastly this is called to advance the "pointer": startframe = (iteration+1) * every; So we first divide by length and then multiply by every, which doesn't really make a whole lot of sense.
libavformat is the (de)muxing library in FFmpeg. The AviSynth support in it is implemented by accessing the AviSynth library through the C interface directly, so there's no middleman.
Ah, gotcha. Well, I never had any comparable issues with VirtualDub in the past, and I also have the issue with ffmpeg, as stated above.
Can you check this build (with your proposed change; local build, no commit yet)? Thanks.
Sure, I can test it, thanks. Where do I put all the files? In my AviSynth+ installation folder there is no AviSynth.dll and no plugins and system folder etc.
Usually I just overwrite the existing avisynth.dll in system32 with the one in the x64 folder.
In avs_core/filters/field.cpp, line 1085:
startframe = (iteration+1) * every;
This seems to be a bug to me. I believe startframe is supposed to track the frame position relative to the output video, but multiplying by every makes it relative to the source video.
As a result, I believe that there are countless glitches in the generated audio, depending on how exactly an application requests the audio and in which order. Misplaced segments, silent areas, etc.
I believe the correct way to do this would be:
startframe = (iteration+1) * length;
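To make the difference concrete, here is a minimal sketch of the range-walking logic. It is not the AviSynth+ source: it is a simplified model that works in whole frames instead of audio samples, and the function name map_output_to_source and the use_proposed_fix flag are made up for illustration. It only reuses the iteration = startframe / length step and the two candidate ways of advancing startframe.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, frame-based model of the range-walking loop (NOT the AviSynth+ source).
// For each requested output frame, records the source frame it would be read from.
std::vector<int> map_output_to_source(int start, int count, int every, int length,
                                      bool use_proposed_fix) {
    std::vector<int> source_frames;
    int startframe = start;  // position in the OUTPUT clip
    int filled = 0;
    while (filled < count) {
        const int iteration = startframe / length;       // which selected range we are in
        const int iteration_into = startframe % length;  // offset inside that range
        const int take = std::min(length - iteration_into, count - filled);
        for (int i = 0; i < take; ++i)
            source_frames.push_back(iteration * every + iteration_into + i);
        filled += take;
        // Advance to the start of the next range, in output-clip coordinates
        // (proposed fix) or in source-clip coordinates (the line in question).
        startframe = (iteration + 1) * (use_proposed_fix ? length : every);
    }
    return source_frames;
}
```

In this model, with every=240 and length=60, a 120-frame request starting at output frame 0 reads source frame 240 for output frame 60 when advancing by length, but source frame 960 when advancing by every. A request that stays inside a single range gives the same result either way, which would be consistent with the glitches depending on how large the requests are and in which order they arrive.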
The audio code in general could maybe use a revamp to be based solely on audio samples, without bringing frames into it. It currently works in whole frames, but the number of audio samples per frame isn't guaranteed to be an integer. Perhaps it could be solved by converting the length and every parameters into their corresponding audio sample equivalents in the constructor and using those for the audio? However, I can also see how, in the worst case, that could cause a very slight drift over extremely long videos. But as it is now, I think the current code can result in lost or unpredictable individual samples at the borders of the ranges.
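The drift concern can be illustrated with a small sketch. The helper names samples_from_frames and drift_after_ranges are hypothetical, and the truncating integer formula is an assumption about how such a conversion would typically be written, not a quote from the AviSynth+ code. The sketch compares converting length to samples once and multiplying, against converting the accumulated frame position each time.

```cpp
#include <cstdint>

// Hypothetical helper: frames -> audio samples using truncating integer arithmetic.
// The frame rate is given as the fraction fps_num / fps_den.
std::int64_t samples_from_frames(std::int64_t frames, std::int64_t sample_rate,
                                 std::int64_t fps_num, std::int64_t fps_den) {
    return frames * sample_rate * fps_den / fps_num;
}

// Drift in samples after `ranges` ranges when the range length is converted to
// samples once (as a constructor might do) and then multiplied, compared with
// converting the accumulated frame position directly.
std::int64_t drift_after_ranges(std::int64_t ranges, std::int64_t length,
                                std::int64_t sample_rate,
                                std::int64_t fps_num, std::int64_t fps_den) {
    const std::int64_t once = samples_from_frames(length, sample_rate, fps_num, fps_den);
    const std::int64_t exact = samples_from_frames(ranges * length, sample_rate, fps_num, fps_den);
    return exact - ranges * once;
}
```

As an assumed example, 44100 Hz audio at 30000/1001 fps with length=60 gives 88288.2 samples per range, truncated to 88288. After 1000 ranges the precomputed value is 200 samples behind, which is a few milliseconds. With 48000 Hz audio at 25 fps the ratio is an integer and there is no drift.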