[Cherry-picked 2.0.1] Properly set #samples passed to encoder #3204

mthrok · 2023-03-24T17:06:33Z

Some audio encoders expect each frame to contain an exact number of samples, as given by AVCodecContext.frame_size.

AVFrame.nb_samples is set for the frames passed to AVFilterGraph, but frames coming out of the graph do not necessarily have the same number of samples.

This causes issues when encoding OPUS.

This commit fixes it by inserting asetnsamples into the filter graph when a fixed number of samples is requested.
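To illustrate what regrouping to a fixed number of samples means, here is a minimal pure-Python sketch. It is not the code in this PR and it does not call FFmpeg; `rechunk`, its `pad` flag, and the list-of-floats representation are hypothetical names chosen for illustration. It only mimics the idea of turning chunks of arbitrary length into frames of exactly `frame_size` samples.

```python
from typing import Iterable, Iterator, List


def rechunk(
    chunks: Iterable[List[float]], frame_size: int, pad: bool = True
) -> Iterator[List[float]]:
    """Regroup variable-length chunks into frames of exactly `frame_size` samples.

    Hypothetical helper: a conceptual stand-in for a fixed-size regrouping
    step, not the torchaudio or FFmpeg implementation.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full frames as the buffered samples allow.
        while len(buffer) >= frame_size:
            yield buffer[:frame_size]
            buffer = buffer[frame_size:]
    if buffer:
        # Leftover samples form a final partial frame, optionally zero-padded.
        if pad:
            buffer = buffer + [0.0] * (frame_size - len(buffer))
        yield buffer


# Three uneven chunks (5, 3 and 4 samples) regrouped into frames of 4 samples.
frames = list(rechunk([[1.0] * 5, [2.0] * 3, [3.0] * 4], frame_size=4))
frame_lengths = [len(frame) for frame in frames]
```

Whatever sizes the caller writes, the encoder side of such a step only ever sees frames of `frame_size` samples, apart from the final frame when padding is disabled.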

Follow-up:

Issue a warning when encoding OPUS with FFmpeg 4.1.

Check whether the AAC issue is caused by this.
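The first follow-up could look roughly like the sketch below. This is only an assumption about the shape of such a check, not code from this PR: the function name, the substring match on the encoder name, and passing the FFmpeg version as a `(major, minor)` tuple are all hypothetical, and how the version is actually obtained at runtime is left out.

```python
import warnings
from typing import Tuple


def warn_if_opus_on_ffmpeg_4_1(encoder: str, ffmpeg_version: Tuple[int, int]) -> bool:
    """Warn when an OPUS encoder is used with FFmpeg 4.1.

    Hypothetical helper: returns True if a warning was issued.
    """
    # Assumption: any encoder whose name contains "opus" is affected.
    is_opus = "opus" in encoder.lower()
    affected = is_opus and tuple(ffmpeg_version[:2]) == (4, 1)
    if affected:
        warnings.warn(
            "Encoding OPUS with FFmpeg 4.1 may not discard some samples properly; "
            "consider using FFmpeg 4.2 or later.",
            stacklevel=2,
        )
    return affected


# Check a few encoder/version combinations, capturing any warnings raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    opus_on_4_1 = warn_if_opus_on_ffmpeg_4_1("libopus", (4, 1))
    opus_on_4_2 = warn_if_opus_on_ffmpeg_4_1("libopus", (4, 2))
    aac_on_4_1 = warn_if_opus_on_ffmpeg_4_1("aac", (4, 1))
num_warnings = len(caught)
```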

audio/examples/tutorials/streamwriter_basic_tutorial.py

Lines 506 to 546 in d8a37a2

    ######################################################################
    # Note on slicing and AAC
    # ~~~~~~~~~~~~~~~~~~~~~~~
    #
    # .. warning::
    #
    #    FFmpeg's native AAC encoder (which is used by default when
    #    saving video with MP4 format) has a bug that affects the audibility.
    #
    #    Please refer to the examples below.
    #
    def test_slice(audio_encoder, slice_size, ext="mp4"):
        path = get_path(f"slice_{slice_size}.{ext}")
        s = StreamWriter(dst=path)
        s.add_audio_stream(SAMPLE_RATE, NUM_CHANNELS, encoder=audio_encoder)
        with s.open():
            for start in range(0, NUM_FRAMES, slice_size):
                end = start + slice_size
                s.write_audio_chunk(0, WAVEFORM[start:end, ...])
        return path

    ######################################################################
    #
    # This causes some artifacts.
    # note:
    # Chrome does not support playing AAC audio directly while Safari does.
    # Using MP4 container and specifying AAC allows Chrome to play it.
    Video(test_slice(audio_encoder="aac", slice_size=8000, ext="mp4"), embed=True)

    ######################################################################
    #
    # It is more noticeable when using smaller slice.
    Video(test_slice(audio_encoder="aac", slice_size=512, ext="mp4"), embed=True)

    ######################################################################
    #
    # Lame MP3 encoder works fine for the same slice size.
    Audio(test_slice(audio_encoder="libmp3lame", slice_size=512, ext="mp3"))

facebook-github-bot · 2023-03-24T17:08:01Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-03-24T18:56:03Z

This pull request was exported from Phabricator. Differential Revision: D44374668

Summary: Some audio encoders expect each frame to contain an exact number of samples, as given by `AVCodecContext.frame_size`. `AVFrame.nb_samples` is set for the frames passed to `AVFilterGraph`, but frames coming out of the graph do not necessarily have the same number of samples. This causes issues when encoding OPUS. This commit fixes it by inserting `asetnsamples` into the filter graph when a fixed number of samples is requested.

Pull Request resolved: pytorch#3204

Differential Revision: D44374668

Pulled By: mthrok

fbshipit-source-id: 444f296c288bd7fee4172e89e2f465e0f2762d48

facebook-github-bot · 2023-03-25T03:02:09Z

This pull request was exported from Phabricator. Differential Revision: D44374668

Summary: Some audio encoders expect each frame to contain an exact number of samples, as given by `AVCodecContext.frame_size`. `AVFrame.nb_samples` is set for the frames passed to `AVFilterGraph`, but frames coming out of the graph do not necessarily have the same number of samples. This causes issues when encoding OPUS. This commit fixes it by inserting `asetnsamples` into the filter graph when a fixed number of samples is requested.

Pull Request resolved: pytorch#3204

Reviewed By: nateanl

Differential Revision: D44374668

Pulled By: mthrok

fbshipit-source-id: 3025ed586a010b87e7870220bfa0dadf2317ba87

facebook-github-bot · 2023-03-25T04:28:33Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: Some audio encoders expect each frame to contain an exact number of samples, as given by `AVCodecContext.frame_size`. `AVFrame.nb_samples` is set for the frames passed to `AVFilterGraph`, but frames coming out of the graph do not necessarily have the same number of samples. This causes issues when encoding OPUS (among others). This commit fixes it by inserting `asetnsamples` into the filter graph when a fixed number of samples is requested.

Note: It turned out that FFmpeg 4.1 has an issue with OPUS encoding: it does not properly discard some samples. We should probably raise the minimum required FFmpeg version to 4.2, but I am not sure if we can enforce that via ABI. The workaround will be to issue a warning when encoding OPUS with 4.1 (follow-up).

Pull Request resolved: pytorch#3204

Reviewed By: nateanl

Differential Revision: D44374668

Pulled By: mthrok

fbshipit-source-id: 723131e4f9b1979928f3ea2eddda17b1180b1a27

facebook-github-bot · 2023-03-25T14:59:19Z

This pull request was exported from Phabricator. Differential Revision: D44374668

facebook-github-bot · 2023-03-25T18:47:12Z

@mthrok merged this pull request in d8a37a2.

github-actions · 2023-03-25T18:47:21Z

Hey @mthrok.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py)

Summary: Some audio encoders expect each frame to contain an exact number of samples, as given by `AVCodecContext.frame_size`. `AVFrame.nb_samples` is set for the frames passed to `AVFilterGraph`, but frames coming out of the graph do not necessarily have the same number of samples. This causes issues when encoding OPUS (among others). This commit fixes it by inserting `asetnsamples` into the filter graph when a fixed number of samples is requested.

Note: It turned out that FFmpeg 4.1 has an issue with OPUS encoding: it does not properly discard some samples. We should probably raise the minimum required FFmpeg version to 4.2, but I am not sure if we can enforce that via ABI. The workaround will be to issue a warning when encoding OPUS with 4.1 (follow-up).

Pull Request resolved: pytorch#3204

Reviewed By: nateanl

Differential Revision: D44374668

Pulled By: mthrok

fbshipit-source-id: 10ef5333dc0677dfb83c8e40b78edd8ded1b21dc


facebook-github-bot added the CLA Signed label Mar 24, 2023

mthrok force-pushed the fix-opus branch from 01d50d3 to f1324ae Compare March 24, 2023 18:56

mthrok force-pushed the fix-opus branch from f1324ae to 7651aed Compare March 25, 2023 03:02

mthrok force-pushed the fix-opus branch from 502f513 to e8f896f Compare March 25, 2023 14:59

facebook-github-bot closed this in d8a37a2 Mar 25, 2023

facebook-github-bot added the Merged label Mar 25, 2023

mthrok deleted the fix-opus branch March 25, 2023 18:56

mthrok added module: IO bug fix labels Mar 25, 2023

mthrok mentioned this pull request Apr 4, 2023

[v2.0.1] Release Tracker #3237

Closed

mthrok mentioned this pull request Apr 4, 2023

[Cherry-pick 2.0.1] Properly set #samples passed to encoder (#3204) #3239

Merged

mthrok changed the title ~~Properly set #samples passed to encoder~~ [Cherry-picked 2.0.1] Properly set #samples passed to encoder Apr 6, 2023
