Understand performance characteristics #983
Replies: 5 comments
-
JIT enabled, GPU, materials, texture bandwidth, ... It can also be shortcuts we don't take, as in this issue: #205. Also take care with VSync: if the browser runs at 60 Hz and BN runs at 30 Hz with the same PG, we have an issue. I usually try to push the PG to take more than 16 ms per frame in the browser so that timings are not hidden behind vsync.
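To illustrate the vsync point, here is a minimal sketch. It assumes a deliberately simplified model in which a finished frame is only presented at the next vsync boundary; the function names and the sample workloads are hypothetical and not taken from BN or any browser.

```cpp
#include <cmath>
#include <cstdio>

// Simplified model (an assumption, not how any specific compositor works):
// a frame that needs workMs of CPU/GPU time is presented at the next vsync
// boundary, so the observed frame time is rounded up to a whole multiple of
// the vsync interval.
double presentedFrameMs(double workMs, double vsyncIntervalMs)
{
    return std::ceil(workMs / vsyncIntervalMs) * vsyncIntervalMs;
}

// Prints the observed frame time for a few hypothetical workloads at 60 Hz.
void printVsyncExample()
{
    const double vsync60 = 1000.0 / 60.0; // roughly 16.67 ms per refresh
    const double works[] = {4.0, 12.0, 20.0, 30.0};
    for (double w : works) {
        std::printf("work %.1f ms -> presented every %.2f ms\n",
                    w, presentedFrameMs(w, vsync60));
    }
}
```

Under this model, any workload below one refresh interval looks identical from the outside, which is why a PG that fits inside 16 ms says little about relative cost. Once the work exceeds one interval, differences start to show up in the observed frame time.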
-
With the latest bgfx version, MSAA for OpenGL is detected differently. The way it was detected before (in current BN master) leaves it disabled: the maximum MSAA sample count was 0, so there was no MSAA on Android. Once fixed, MSAA may be active on Android and performance will be lower. At full device resolution, this means far more samples to process (and more bandwidth).
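For a rough sense of scale, here is a small sketch that counts color samples per frame. The function name is hypothetical and the resolution in the usage note is only an example. It counts samples only; actual bandwidth cost depends on the GPU (tile-based mobile GPUs can often keep multisample data in tile memory), so treat it as an upper-bound intuition rather than a measurement.

```cpp
#include <cstdint>

// Number of color samples shaded/stored per frame for a render target.
// A sample count of 0 or 1 is treated as "no MSAA" (one sample per pixel).
std::uint64_t samplesPerFrame(std::uint32_t width,
                              std::uint32_t height,
                              std::uint32_t msaaSamples)
{
    const std::uint64_t perPixel = msaaSamples > 1 ? msaaSamples : 1;
    return static_cast<std::uint64_t>(width) * height * perPixel;
}
```

For example, `samplesPerFrame(1080, 2340, 4)` is four times `samplesPerFrame(1080, 2340, 0)`, so enabling 4x MSAA at full device resolution multiplies the sample count by four.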
-
iOS
The following are some high-level observations from the Babylon React Native Playground on iOS while profiling a 5x5x5 grid of spheres (non-instanced). First, unsurprisingly, we are much more CPU bound than GPU bound. CPU usage is approximately:
The majority of the 50% in native code starts with calls to
Within those code paths, around 25% of the time is spent in
Other items of note are:
UPDATE - new investigation 9 months later:
Here is some more detailed info for my scenario (single mesh with ~250 submeshes loaded from a glb on iOS without XR): On iOS, I'm seeing about 63% of time spent in calls into native code, spread across the following:
setTexture ~20%
The time spent in these function calls can mostly be attributed to:
The only exception to this is drawIndexed, in which case about 50% is the combined cost of NativeEngine::Draw and FrameBuffer::Submit.
On the pure JS side, there are a few functions that collectively account for 27% of CPU time:
-
In terms of next steps, we should:
-
Just to clarify, JSI does have weak objects, but they work like a std::weak_ptr, which requires obtaining a strong reference before calling any members. Using them to accurately represent what NAPI expects will likely make the situation worse.
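For readers less familiar with the std::weak_ptr pattern referred to here, a minimal sketch using only the C++ standard library follows. It does not use the JSI API; the `Resource` type and the `nameOrExpired` helper are hypothetical names for illustration.

```cpp
#include <memory>
#include <string>

// Hypothetical payload type, used only for this illustration.
struct Resource {
    std::string name;
};

// A weak_ptr cannot be dereferenced directly. Each access first has to
// promote it to a shared_ptr via lock(), which yields an empty pointer if
// the object has already been destroyed.
std::string nameOrExpired(const std::weak_ptr<Resource>& weak)
{
    if (std::shared_ptr<Resource> strong = weak.lock()) {
        return strong->name; // object is kept alive for the duration of this scope
    }
    return "<expired>";
}
```

The extra promotion step on every access is the overhead the comment above is pointing at: a weak handle that must be locked before each member call adds work rather than removing it.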
-
Performance is not great in certain scenarios, especially when you have many distinct (non-instanced) pieces of geometry (e.g. more than 50 or 100). The "spheres test" is a good example:
We should try to better understand the performance characteristics, whether there is low-hanging fruit, and what kind of investments we need to make to improve perf. We should at least understand how these things impact perf: