Can we utilize multiple processors/GPU to fasten up rendering? #101
Comments
I'd like to vote up for this. |
This would be awesome. My main gripe with rendering using MIDIVisualizer is how slow it is, even on my reasonably fast laptop. |
If I'm reading the code correctly, this is how MIDIVisualizer generates and exports the MIDI visualizations: "[MIDIVisualizer issues OpenGL draw commands to draw a given frame] -> [OpenGL executes the draw commands, waits for the frame to be fully rendered to the user] -> [MIDIVisualizer then exports the GUI's framebuffer contents to the exported video frame]". Note that this is done sequentially, frame by frame. So, refactoring to better utilize the GPU could involve the following. You'd want to refactor the code to issue batched render requests to minimize CPU<->GPU data transfer. To illustrate, rather than asking the GPU to draw each frame one at a time ("please draw frame 0. wait. receive frame 0's rendered pixels. please draw frame 1. wait. receive frame 1's rendered pixels. please draw frame 2..."), we'd want to ask the GPU to render N frames at a time, say N=64 (N has to be small enough that we don't exceed the host's available GPU memory). At first glance, this seems like a fairly non-trivial refactor. Definitely doable though! Perhaps an easier alternative is a "divide-and-conquer" CPU-based approach: If you're interested, here's the entry point for exporting a MIDI file visualization from the CLI: https://github.com/kosua20/MIDIVisualizer/blob/master/src/rendering/Renderer.cpp#L1365 |
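To make the batching idea above concrete, here is a minimal sketch of how a frame range could be cut into batches of at most N frames. It only shows the bookkeeping; `frame_batches` is a hypothetical helper, not a function in MIDIVisualizer, and the actual GPU submission and readback are left out.

```python
from typing import Iterator, Tuple


def frame_batches(total_frames: int, batch_size: int = 64) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) frame ranges of at most batch_size frames."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_frames, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size, total_frames)


# Example: 150 frames in batches of 64.
for start, end in frame_batches(150, 64):
    print(f"render frames {start}..{end - 1} in one request")
```

The batch size would have to be tuned so that N frames at the export resolution fit in GPU memory, as noted above.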
By checking my resource utilization, I found out that MIDIVisualizer is already using the GPU during rendering (likely because OpenGL is able to auto-utilize the GPU if one is available, neat!). Here are some measurements: Interestingly, GPU utilization is quite low, ~30%. CPU utilization is also quite low: ~6% for the MIDIVisualizer process. So, the bottleneck is elsewhere. I'm willing to bet that the bottleneck is disk writing. If you look at the second image above, during rendering my system is writing ~3.5 MB/sec to the output video file, which is the max write speed of my drive (Samsung SSD 970 EVO Plus 1TB). Based on the above, I wonder if the code is indeed being bottlenecked by disk writes? If so, I wonder if there's a way to improve pipelining so that we decouple disk writing (e.g. video encoding) from rendering. Since CPU/GPU utilization is low, it appears that rendering can outpace disk writing. |
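The decoupling suggested above is usually done with a producer/consumer queue. The following is a minimal sketch under stated assumptions: `run_pipeline`, `render`, and `encode` are hypothetical stand-ins, not MIDIVisualizer functions, and there is no error handling (if `encode` raised, the producer could block forever on a full queue).

```python
import queue
import threading


def run_pipeline(num_frames, render, encode, max_pending=8):
    """Render on the calling thread while a worker thread encodes/writes."""
    # Bounded queue: caps the memory held by frames waiting to be encoded.
    pending = queue.Queue(maxsize=max_pending)
    encoded = []
    done = object()  # sentinel that tells the worker to stop

    def encoder_worker():
        while True:
            frame = pending.get()
            if frame is done:
                break
            encoded.append(encode(frame))

    worker = threading.Thread(target=encoder_worker)
    worker.start()
    for index in range(num_frames):
        pending.put(render(index))  # blocks only when the encoder falls behind
    pending.put(done)
    worker.join()
    return encoded


# Stand-in render/encode functions, just to show the flow.
result = run_pipeline(5, render=lambda i: f"frame-{i}", encode=lambda f: f.upper())
print(result)
```

With a single consumer the frames are encoded in the order they were rendered, which matters for video output.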
First, let me say that an NVMe SSD can do way more than 3.5 MB/s. Based on your now seemingly deleted comment, you were using MPEG4, which would be a sequential write, and that drive should be more than capable of 3000 MB/s. If it really can only do 3.5 MB/s, then something is really wrong with your system.
Even with a clean 100% separation of rendering and encoding, the most you get out of that is a 2x speed-up.
The limitation comes from the encoding of the video, which isn't easy to multithread. Adding threads to the encoding process tends to reduce quality and cause artifacting, as the data for each frame depends on the frame before it.
The screenshot pretty much says it all: it is limited by how fast a single core of your computer can encode, shown as a flat line in the CPU usage. A simple test you could do is to start multiple recording processes. An example is provided below, based on the specific command in your now deleted comment. You should find that the time it takes to record scales very well, with each instance writing an additional ~3.5 MB/s.
Examples (the former should be around 3x slower than the latter):
example1.bat
./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video1.mp4 --format MPEG4
./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video2.mp4 --format MPEG4
./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video3.mp4 --format MPEG4
example2.bat
START /B ./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video1.mp4 --format MPEG4
START /B ./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video2.mp4 --format MPEG4
START /B ./MIDIVisualizer --midi 'C:\Users\Eric\Documents\REAPER Media\youtube_video_record_settings\youtube_video_record_settings_keyboard\youtube_video_record_settings_keyboard_bpm240.mid' --size 1920 1080 --export video3.mp4 --format MPEG4
As for splitting up a single MIDI file, rendering the parts separately and merging them afterwards: the merging will take just as long as the encoding step, because the video will basically need to be re-encoded to merge properly, not to mention the artifacting that comes with re-encoding. If you really want to try it, there is the midicopy program that can do the splitting for you.
As for PNG multithreading, where a write bottleneck is more likely as it's basically random writes, it's already done in 92f86f0 |
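The 2x bound mentioned above follows from simple arithmetic: a sequential export costs render time plus encode time per frame, while a perfectly overlapped one costs the larger of the two. A small sketch (`pipeline_speedup` is a hypothetical helper, and the timings are made-up inputs, not measurements):

```python
def pipeline_speedup(render_time: float, encode_time: float) -> float:
    """Best-case speed-up from fully overlapping rendering and encoding."""
    if render_time <= 0 or encode_time <= 0:
        raise ValueError("times must be positive")
    sequential = render_time + encode_time      # one stage after the other
    overlapped = max(render_time, encode_time)  # limited by the slower stage
    return sequential / overlapped


# Equal stage costs give the maximum gain of 2x.
print(pipeline_speedup(1.0, 1.0))
# If encoding dominates, overlapping helps much less.
print(pipeline_speedup(1.0, 9.0))
```

The ratio can never exceed 2, and it approaches 1 as one stage dominates, which is the situation described above when single-core encoding is the bottleneck.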
Ah yup, you're right, I was misreading the drive specs: >3000MBps is indeed my max write speed, not 3MBps. So it's not disk-write-bottlenecked, and it seems plausible from your info that video encoding is the bottleneck (as it's not able to effectively utilize all cores). Thanks for the insights! Very helpful. Regarding splitting the MIDI and concatenating the N video files: I'm not an expert on video codec formats, but it seems that it's possible to do an efficient concatenation (without any re-encoding) for certain video formats (MPEG-2 seems to be one), while for other video formats a re-encode is seemingly required (e.g. MPEG-4): https://stackoverflow.com/a/11175851 So, there could be a possible route forward with the split+merge approach for certain video formats. I'm not sure what the pros and cons of each format are yet, though. Regarding the difficulty of multi-threading video encoding: I see your points. I think I'd have to dig deeper into video encoding implementations/algorithms (particularly for various popular video formats, e.g. MPEG-2 vs MPEG-4), and see what the industry-standard high-performance encoding techniques are these days. Maybe there are some new tricks/libraries that can greatly accelerate things? Or maybe we should be considering other video formats that are more performant? Then, there's the option of doing GPU video encoding, which I'm not sure MIDIVisualizer allows right now. FFmpeg does allow GPU encoding, but it seems that the user would have to configure their system to enable FFmpeg+GPU encoding: https://stackoverflow.com/questions/44510765/gpu-accelerated-video-processing-with-ffmpeg At the end of the day, the current performance isn't a deal breaker. By tuning some export settings, I was able to get export time down quite a bit (e.g. for a 40 sec MIDI clip, it takes ~17 secs to export) with still acceptable quality. It's a fun little rabbit hole to go into though! |
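For anyone who wants to try the split+merge route, here is a rough sketch using FFmpeg's concat demuxer with stream copy (`-f concat ... -c copy`), which joins files without re-encoding when the segments share the same codec and parameters. Whether segments exported by MIDIVisualizer join cleanly this way is an assumption that would need testing. `build_concat_command` and the file names are hypothetical, and the sketch only builds the command rather than running it.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence


def build_concat_command(segments: Sequence[str], list_file: Path, output: str) -> List[str]:
    """Write the segment list file and return the ffmpeg argument list."""
    # One "file '<path>'" line per segment; paths containing single
    # quotes would need escaping, which this sketch does not handle.
    lines = [f"file '{Path(s).as_posix()}'" for s in segments]
    Path(list_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # "-c copy" copies the streams as-is instead of re-encoding them.
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(output)]


with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "segments.txt"
    cmd = build_concat_command(["part1.mp4", "part2.mp4"], list_path, "merged.mp4")
    listing = list_path.read_text(encoding="utf-8")
    print(" ".join(cmd))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` once real segment files exist.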
We can divide the frames amongst the CPU/GPU cores and then combine them together.