While the new SDF-based UI looks really beautiful, it introduces new performance requirements that weren't there before. The previous UI used a 9-patch; it was less flexible, but it worked great alongside bitmap fonts, since the current KorGE batcher supports up to 4 textures in a single batch. So we could render buttons and text in a single batch call.
Some optimizations were introduced to cache whole sub-graphics, so we can, for example, cache a whole UI window, or all or part of the UI, so that it doesn't affect the rest of the game updating. Still, we want to be faster.
Now, rendering the background of the button and the text itself takes two separate batches. We need to set the vertices, the uniforms and the data for the background, and then go back to the text rendering.
Each batch is slow because it requires setting the vertex data, attributes and uniforms. Profiling with VisualVM, we can identify the hot spots here:
Since we are already using rendering lists that would allow us to execute things in parallel, we can start optimizing while staying future-proof for other backends.
The idea here is to use proper VAOs (vertex array objects) and UBOs (uniform buffer objects), and also to buffer everything until the end of the list.
This will require either the relevant extensions to be available, or WebGL >= 2 and OpenGL ES >= 3.0.
You can read more about this here: https://webgl2fundamentals.org/webgl/lessons/webgl2-whats-new.html
On desktop, we should have that functionality already.
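As a rough illustration of the requirement above, here is a minimal sketch of a capability check. The `GlCaps` and `BufferFeatures` types and the `detectBufferFeatures` function are hypothetical and are not KorGE APIs; the extension names are the standard Khronos ones. Note that, as far as I know, WebGL 1 and OpenGL ES 2.0 can get VAOs through an extension, but UBOs need the newer versions.

```typescript
// Hypothetical description of what a backend reports; not a KorGE type.
interface GlCaps {
  api: "webgl" | "gles" | "desktop-gl";
  major: number;
  minor: number;
  extensions: string[];
}

interface BufferFeatures {
  vao: boolean;
  ubo: boolean;
}

// True when the reported version is at least major.minor.
function atLeast(caps: GlCaps, major: number, minor: number): boolean {
  return caps.major > major || (caps.major === major && caps.minor >= minor);
}

function detectBufferFeatures(caps: GlCaps): BufferFeatures {
  const has = (name: string): boolean => caps.extensions.indexOf(name) !== -1;
  if (caps.api === "webgl") {
    // WebGL 2 has both in core; WebGL 1 only offers VAOs via an extension.
    return {
      vao: atLeast(caps, 2, 0) || has("OES_vertex_array_object"),
      ubo: atLeast(caps, 2, 0),
    };
  }
  if (caps.api === "gles") {
    // OpenGL ES 3.0 has both in core; ES 2.0 only offers VAOs via an extension.
    return {
      vao: atLeast(caps, 3, 0) || has("GL_OES_vertex_array_object"),
      ubo: atLeast(caps, 3, 0),
    };
  }
  // Desktop GL: VAOs are core since 3.0 and UBOs since 3.1, or via ARB extensions.
  return {
    vao: atLeast(caps, 3, 0) || has("GL_ARB_vertex_array_object"),
    ubo: atLeast(caps, 3, 1) || has("GL_ARB_uniform_buffer_object"),
  };
}
```

A backend that reports `ubo: false` would presumably have to keep the current per-batch uniform path as a fallback.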
On Discord, people are strongly in favor of going forward with this:
So, for example, if we know the layout of the attributes and uniforms beforehand, we can construct a single vertex buffer and a single uniform buffer with everything, upload them once to the GPU, and then execute really small commands that select memory areas in those buffers to do small render batches.
This should substantially increase the number of batches we can do per frame.
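To make the single-buffer idea more concrete, here is a minimal CPU-side sketch that packs several batches into one vertex array and one uniform array, and records a small command per batch pointing into them. `FramePacker` and `DrawCommand` are hypothetical names, not KorGE APIs, and the 256-byte uniform alignment is an assumed value: on a real context it would be queried as `UNIFORM_BUFFER_OFFSET_ALIGNMENT`, since uniform buffer range offsets have to be a multiple of it.

```typescript
// Hypothetical types for illustration; none of these are KorGE APIs.
interface DrawCommand {
  firstVertex: number;       // index of the first vertex in the shared vertex buffer
  vertexCount: number;
  uniformByteOffset: number; // start of this batch's uniform block in the shared uniform buffer
  uniformByteSize: number;
}

class FramePacker {
  readonly commands: DrawCommand[] = [];
  private vertices: number[] = [];
  private uniforms: number[] = [];
  private floatsPerVertex: number;
  private uniformAlignment: number;

  // uniformAlignment is in bytes. 256 is an assumption; query the real value at runtime.
  constructor(floatsPerVertex: number, uniformAlignment: number = 256) {
    this.floatsPerVertex = floatsPerVertex;
    this.uniformAlignment = uniformAlignment;
  }

  addBatch(vertexData: number[], uniformData: number[]): void {
    if (vertexData.length % this.floatsPerVertex !== 0) {
      throw new Error("vertexData does not contain whole vertices");
    }
    // Pad with zeros so this batch's uniform block starts at an aligned byte offset.
    const alignFloats = this.uniformAlignment / 4;
    while (this.uniforms.length % alignFloats !== 0) this.uniforms.push(0);

    this.commands.push({
      firstVertex: this.vertices.length / this.floatsPerVertex,
      vertexCount: vertexData.length / this.floatsPerVertex,
      uniformByteOffset: this.uniforms.length * 4,
      uniformByteSize: uniformData.length * 4,
    });
    for (const v of vertexData) this.vertices.push(v);
    for (const u of uniformData) this.uniforms.push(u);
  }

  // The two arrays that would each be uploaded to the GPU once per frame.
  build(): { vertexBuffer: Float32Array; uniformBuffer: Float32Array } {
    return {
      vertexBuffer: new Float32Array(this.vertices),
      uniformBuffer: new Float32Array(this.uniforms),
    };
  }
}

// Example: a button background and its text as two batches in the same buffers.
const packer = new FramePacker(2);
packer.addBatch([0, 0, 1, 0, 0, 1], [1, 0, 0, 1]);
packer.addBatch([0, 0, 1, 1, 1, 0], [0, 1, 0, 1]);
const frame = packer.build();
```

On WebGL 2, each recorded command could then be executed with little more than a `gl.bindBufferRange` call for the uniform range followed by a `gl.drawArrays` call using `firstVertex` and `vertexCount`, instead of re-uploading data per batch.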
In addition, if we can keep the code that produces the commands in a separate thread (K/N should support that now), with the rendering code in the UI thread consuming those commands, we should improve performance a lot. We can probably reach one to two orders of magnitude, which will pave the way for future needs, especially 3D later.
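The hand-off between the two sides could look roughly like the following double-buffered command list. This is a single-threaded sketch of the data flow only: `CommandListSwap` and `RenderCommand` are hypothetical names, and a real implementation would need proper synchronization or message passing between the producing thread and the UI thread.

```typescript
// Hypothetical command shape for illustration; not a KorGE type.
type RenderCommand =
  | { kind: "clear" }
  | { kind: "draw"; firstVertex: number; vertexCount: number };

class CommandListSwap {
  private recording: RenderCommand[] = [];
  private ready: RenderCommand[] | null = null;

  // Producer side: append a command to the frame being built.
  record(cmd: RenderCommand): void {
    this.recording.push(cmd);
  }

  // Producer side: publish the finished frame and start a fresh list.
  // An unconsumed previous frame is simply replaced by the newer one.
  submit(): void {
    this.ready = this.recording;
    this.recording = [];
  }

  // Consumer side (UI thread): take the latest finished frame, or null if there is none.
  take(): RenderCommand[] | null {
    const frame = this.ready;
    this.ready = null;
    return frame;
  }
}

const swap = new CommandListSwap();
swap.record({ kind: "clear" });
swap.record({ kind: "draw", firstVertex: 0, vertexCount: 6 });
swap.submit();
swap.record({ kind: "clear" }); // the next frame is already being recorded
const firstFrame = swap.take();
```

The point of the design is that the UI thread only replays small, already-prepared commands, while all the work of building them happens elsewhere.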
This is an epic that will require a lot of small tasks to get there. Since this is an optimization, it shouldn't affect other areas of work. Even if the code is slow for now, this will eventually allow it to be faster, as we are already doing indirect buffered rendering through rendering lists.
soywiz changed the title from "Optimize draw batches" to "Optimize draw batches by using VAOs, UBOs, single buffer uploads and separate thread for rendering" on Nov 17, 2022
soywiz changed the title from "Optimize draw batches by using VAOs, UBOs, single buffer uploads and separate thread for rendering" to "Optimize draw batches by using VAOs, UBOs, buffering, single buffer uploads and separate thread for rendering" on Nov 17, 2022