Currently, VideoDecoder.get_frame_at() returns a Frame which has the decoded data as a tensor, the pts in seconds and the duration in seconds. We should also return the frame index.
Motivation, pitch
No response
Ideally, we would always return the frame index whenever we return a frame and its metadata. Doing that is not as simple as just plumbing some values through, because in the C++ code we return a DecodedOutput object whenever we want to return any decoded frame. We can easily add a field to that object for the frame index, but the challenge is that we return this object in situations where we don't (currently) know the frame index, such as getNextFrameNoDemux(). (That is the core of the implementation for the Python core.get_next_frame() function; see the entry point.)
The simplest solution I can think of is to build up a reverse pts-to-index mapping when we scan a file. That would easily enable getNextFrameNoDemux() to cheaply look up the appropriate index for a pts value. (We always know the pts value, because it's part of what FFmpeg gives us when we decode the frame.)
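To make the idea concrete, here is a minimal Python sketch of such a reverse mapping. It is only an illustration, not the actual C++ implementation: the function name `build_pts_to_index` and the sample pts values are hypothetical, and it assumes that the frame index means the frame's position in presentation (pts) order and that pts values are unique within the stream.

```python
def build_pts_to_index(scanned_pts):
    """Return a dict mapping each pts to its frame index.

    Assumption: the index is the frame's position in presentation order,
    so we sort by pts before numbering. A scan may see packets in decode
    order, which can differ from presentation order.
    """
    sorted_pts = sorted(scanned_pts)
    return {pts: index for index, pts in enumerate(sorted_pts)}


# Hypothetical pts values (in stream time-base units), as a scan might see them.
scanned_pts = [0, 3000, 1000, 2000, 4000]
pts_to_index = build_pts_to_index(scanned_pts)

# Looking up the index for a decoded frame is then a single dict access.
index_of_frame = pts_to_index[3000]
```

The mapping costs one extra entry per frame and is built once during the scan, so the per-frame lookup afterwards is a constant-time operation.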
Of course, the challenge here is that we have several entry points that do not currently assume we have scanned the file, and core.get_next_frame() is one of them. We should preserve that. This feature is then connected to our desire for an approximate mode (which we need to create an issue for). In getNextFrameNoDemux(), when we're in exact mode (meaning that we have scanned the file), we can just use the reverse pts-to-index mapping. When in approximate mode, we would just do whatever math we do elsewhere.
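The split between the two modes could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function `frame_index_for_pts` is hypothetical, and the approximate branch guesses that the math is "pts in seconds times the average frame rate, rounded", which may not match what the decoder does elsewhere.

```python
from fractions import Fraction


def frame_index_for_pts(pts, time_base, average_fps, pts_to_index=None):
    """Return the frame index for a decoded frame's pts.

    pts          -- integer pts in stream time-base units
    time_base    -- seconds per pts unit, as a Fraction
    average_fps  -- average frame rate, as a Fraction
    pts_to_index -- reverse mapping built by a scan, or None if no scan was done
    """
    if pts_to_index is not None:
        # Exact mode: the file was scanned, so the mapping is authoritative.
        return pts_to_index[pts]
    # Approximate mode (assumed formula): convert pts to seconds, then scale
    # by the average frame rate and round to the nearest index.
    return round(pts * time_base * average_fps)


# Hypothetical stream: 30000/1001 fps with a 1/30000 time base.
time_base = Fraction(1, 30000)
average_fps = Fraction(30000, 1001)

exact = frame_index_for_pts(3003, time_base, average_fps,
                            pts_to_index={0: 0, 1001: 1, 2002: 2, 3003: 3})
approximate = frame_index_for_pts(3003, time_base, average_fps)
```

For a constant-frame-rate stream the two branches agree; for variable-frame-rate content the approximate branch can drift from the true index, which is the trade-off of skipping the scan.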
However, this may mean we should implement an approximate mode first, before trying to address this issue.