Understand performance characteristics #983

ryantrem · 2020-11-16T23:12:26Z

ryantrem
Nov 16, 2020
Maintainer

Performance is not great in certain scenarios, especially when you have many distinct (non-instanced) pieces of geometry (e.g. more than 50 or 100). The "spheres test" is a good example:

function CreateSpheres(scene: Scene, size: number) {
  for (let i = 0; i < size; i++) {
    for (let j = 0; j < size; j++) {
      for (let k = 0; k < size; k++) {
        // Each sphere is a distinct (non-instanced) mesh with its own geometry.
        const sphere = Mesh.CreateSphere("sphere" + i + j + k, 32, 0.9, scene);
        sphere.position.x = i;
        sphere.position.y = j;
        sphere.position.z = k;
      }
    }
  }
}
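
Because the test nests three loops, the mesh count grows with the cube of size, so even small values cross the 50 to 100 mesh range mentioned above. Here is an engine-agnostic sketch of that growth; gridPositions is a hypothetical helper for illustration and is not part of Babylon.js:

```typescript
// Hypothetical helper: enumerates the positions the spheres test would fill.
type GridPosition = { x: number; y: number; z: number };

function gridPositions(size: number): GridPosition[] {
  const positions: GridPosition[] = [];
  for (let i = 0; i < size; i++) {
    for (let j = 0; j < size; j++) {
      for (let k = 0; k < size; k++) {
        positions.push({ x: i, y: j, z: k });
      }
    }
  }
  return positions;
}

// One distinct mesh per position, so the mesh count is size cubed.
for (const size of [3, 4, 5]) {
  console.log(`size ${size}: ${gridPositions(size).length} meshes`);
}
```

With size = 5 this gives the 125 distinct meshes of a 5x5x5 grid.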

We should try to better understand the performance characteristics, whether there is low-hanging fruit, and what kind of investments we need to make to improve perf. At a minimum, we should understand how these things impact perf:

Multiple pieces of distinct geometry.
Screen resolution.
What else?

CedricGuillemet · 2020-11-17T13:44:16Z

CedricGuillemet
Nov 17, 2020
Maintainer

Whether JIT is enabled, the GPU, materials and texture bandwidth, ...
Also, if we find something weird performance-wise, the first thing to do is to compare with a browser.
If we get lower perf than a browser, we should start worrying and launch the profiler.

It can also be a shortcut we don't take, as in this issue: #205

Also watch out for VSync. If the browser runs at 60 Hz and BN runs at 30 Hz with the same PG, we have an issue.
But we can also have an issue if both run at 60 Hz, with the browser frame taking 2 ms and BN taking 16 ms.

I usually try to push the PG past 16 ms per frame in the browser so that timings are not hidden behind VSync.
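
To illustrate why VSync can hide a large difference in frame cost, here is a minimal sketch of a simplified model in which a frame is presented on the next refresh boundary after its work finishes. The function names and the 60 Hz display are assumptions for illustration, not measurements from BN:

```typescript
// Refresh interval for an assumed 60 Hz display, in milliseconds.
const REFRESH_INTERVAL_MS = 1000 / 60;

// Simplified model: with VSync on, the observed frame interval is the work
// time rounded up to a whole number of refresh intervals.
function observedFrameIntervalMs(workMs: number, refreshMs: number = REFRESH_INTERVAL_MS): number {
  const intervals = Math.max(1, Math.ceil(workMs / refreshMs));
  return intervals * refreshMs;
}

function observedFps(workMs: number): number {
  return Math.round(1000 / observedFrameIntervalMs(workMs));
}

console.log(observedFps(2));  // 2 ms of work per frame
console.log(observedFps(16)); // 16 ms of work per frame
console.log(observedFps(20)); // 20 ms of work per frame
```

In this model, 2 ms and 16 ms of work both report 60 fps, and the difference only becomes visible once the work exceeds one refresh interval, which is why pushing the scene past 16 ms makes comparisons meaningful.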

CedricGuillemet · 2020-11-20T10:40:03Z

CedricGuillemet
Nov 20, 2020
Maintainer

With the latest bgfx version, MSAA for OpenGL is detected differently. The way it was detected before (in current BN master) left it disabled: the MSAA max sample count was 0, so there was no MSAA on Android. Once fixed, MSAA may be active on Android and performance will be lower. At full device resolution, this means far more samples to process (and more bandwidth).
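
As a rough illustration of how the sample count scales, here is a sketch that multiplies resolution by samples per pixel. The 1080x1920 resolution and the 4x sample count are assumed values for illustration only, and the result is a sample count, not a measured cost:

```typescript
// Number of color samples in a render target of the given size.
function sampleCount(width: number, height: number, msaaSamples: number): number {
  // A sample count of 0 or 1 means MSAA is effectively off: one sample per pixel.
  const samplesPerPixel = Math.max(1, msaaSamples);
  return width * height * samplesPerPixel;
}

// Assumed full device resolution, for illustration only.
const width = 1080;
const height = 1920;

const withoutMsaa = sampleCount(width, height, 0);
const withMsaa4x = sampleCount(width, height, 4);

console.log(`no MSAA: ${withoutMsaa} samples`);
console.log(`4x MSAA: ${withMsaa4x} samples (${withMsaa4x / withoutMsaa}x)`);
```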

ryantrem · 2020-12-14T21:11:50Z

ryantrem
Dec 14, 2020
Maintainer Author

iOS

The following are some high-level observations from the Babylon React Native Playground on iOS while profiling a 5x5x5 grid of spheres (non-instanced).

First, unsurprisingly, we are much more CPU bound than GPU bound. CPU usage is approximately:

34% in JavaScript execution
16% in the JavaScript-to-native communication layer (of JSC)
50% in native code

The majority of the 50% in native code starts with calls to setMatrices, bindVertexArray, drawIndexed, and setState.

Within those code paths, around 25% of the time is spent in JSValueProtect and JSValueUnprotect due to JSI not having the concept of weak references to JS objects.

Other items of note are:

Napi::ObjectWrap<Babylon::NativeEngine>::Unwrap(...) which accounts for around 9%
Napi::TypedArray::GetTypedArrayInfo(...) which accounts for around 6%
Napi::TypedArrayOf<float>::TypedArrayOf(...) which accounts for around 5%

UPDATE - new investigation 9 months later:

Here is some more detailed info for my scenario (single mesh with ~250 submeshes loaded from a glb on iOS without XR):

On iOS, I'm seeing about 63% of time spent in calls into native code, spread across the following:

setTexture ~20%
setTextureWrapMode ~10%
setMatrices ~18%
drawIndexed ~5%
setState ~5%
bindVertexArray ~5%

The time spent in these function calls can mostly be attributed to:

JS -> native overhead (as described in my previous message)
Object::Unwrap
Reading the Napi::CallbackInfo
ArrayBuffer::Data
JSI::Value construction/destruction (JSC value protect/unprotect on iOS)

The only exception to this is drawIndexed, in which case about 50% is the combined cost of NativeEngine::Draw and FrameBuffer::Submit.
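
As a sanity check, the per-call shares above can be summed and split into real work and overhead. This is a rough sketch, assuming everything in these calls other than the drawIndexed work is call overhead (the numbers above only say it is "mostly" overhead):

```typescript
// Approximate share of total CPU time per native call, as listed above.
const nativeCalls = [
  { name: "setTexture", share: 20 },
  { name: "setTextureWrapMode", share: 10 },
  { name: "setMatrices", share: 18 },
  { name: "drawIndexed", share: 5 },
  { name: "setState", share: 5 },
  { name: "bindVertexArray", share: 5 },
];

let totalNativeShare = 0;
let drawIndexedShare = 0;
for (const call of nativeCalls) {
  totalNativeShare += call.share;
  if (call.name === "drawIndexed") {
    drawIndexedShare = call.share;
  }
}

// About half of drawIndexed is NativeEngine::Draw plus FrameBuffer::Submit.
const drawWorkShare = drawIndexedShare * 0.5;

// Assumption: the remainder is treated as JS-to-native call overhead.
const estimatedOverheadShare = totalNativeShare - drawWorkShare;

console.log(`total in native calls: ~${totalNativeShare}%`);
console.log(`actual draw work: ~${drawWorkShare}%`);
console.log(`estimated call overhead: ~${estimatedOverheadShare}%`);
```

Under that assumption, only a few percent of total CPU time is actual draw submission, and the rest of the time in these calls is the cost of crossing from JS into native code.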

On the pure JS side, there are a few functions that collectively account for 27% of CPU time:

BoundingInfo.update ~10% (~80% of this is in BoundingBox.update and ~20% is in BoundingSphere.update)
PBRBaseMaterial.isReadyForSubmesh ~7% (85% of this is in _prepareDefines, ~10% is self time, and ~5% is in the isDirty getter)
PBRSubSurfaceConfiguration.bindForSubmesh ~5% (~25% self time, ~20% in Matrix.decompose, rest spread across smaller bits)
Scene._activeMesh ~5% (~50% in isInFrustum, ~25% in RenderingManager.dispatch, rest spread across small bits)
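
The nested percentages above are relative to their parent function, so it can help to convert them into shares of total CPU time. A small sketch of that conversion follows; absoluteShare is a hypothetical helper, and the inputs are the approximate figures from the list:

```typescript
// Convert "child is X% of parent, parent is Y% of total" into a share of total.
function absoluteShare(parentPercent: number, childPercentOfParent: number): number {
  return (parentPercent * childPercentOfParent) / 100;
}

// Top-level JS hotspots, as shares of total CPU time.
const jsHotspots = [
  { name: "BoundingInfo.update", percent: 10 },
  { name: "PBRBaseMaterial.isReadyForSubmesh", percent: 7 },
  { name: "PBRSubSurfaceConfiguration.bindForSubmesh", percent: 5 },
  { name: "Scene._activeMesh", percent: 5 },
];

let jsTotal = 0;
for (const hotspot of jsHotspots) {
  jsTotal += hotspot.percent;
}

console.log(`JS hotspots total: ~${jsTotal}%`);
console.log(`BoundingBox.update: ~${absoluteShare(10, 80)}% of total`);
console.log(`BoundingSphere.update: ~${absoluteShare(10, 20)}% of total`);
console.log(`_prepareDefines: ~${absoluteShare(7, 85).toFixed(2)}% of total`);
console.log(`isInFrustum: ~${absoluteShare(5, 50)}% of total`);
```

Seen this way, BoundingBox.update and _prepareDefines are the largest individual JS costs in this scenario.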

ryantrem · 2020-12-14T21:15:30Z

ryantrem
Dec 14, 2020
Maintainer Author

In terms of next steps, we should:

Do a similar analysis for Android+Hermes and see if there are common performance costs across the two that could be reduced (for example, can we reduce the number of calls to Napi::ObjectWrap<Babylon::NativeEngine>::Unwrap(...), etc.).
See if it is possible (for testing purposes only) to enable JSC JIT (probably just in the regular Babylon Native Playground) and understand how much of the JS execution cost is due to lack of JIT.
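
One possible direction for reducing per-call costs such as Unwrap is to cross the JS/native boundary less often. The sketch below only illustrates that idea by counting crossings. FakeNativeEngine and its methods are hypothetical stand-ins, not Babylon Native's API, and batching is an assumption on my part rather than something measured here:

```typescript
// Hypothetical stand-in for the native side. It only counts how many times
// JS crosses into "native" code and records which commands were executed.
class FakeNativeEngine {
  boundaryCrossings = 0;
  executed: string[] = [];

  // One crossing per command: the pattern profiled above.
  call(command: string): void {
    this.boundaryCrossings++;
    this.executed.push(command);
  }

  // One crossing for a whole batch of commands.
  submit(commands: string[]): void {
    this.boundaryCrossings++;
    for (const command of commands) {
      this.executed.push(command);
    }
  }
}

const perMeshCommands = ["setState", "setMatrices", "bindVertexArray", "drawIndexed"];
const meshCount = 125; // the 5x5x5 grid of spheres

// Direct calls: every command crosses the boundary.
const direct = new FakeNativeEngine();
for (let i = 0; i < meshCount; i++) {
  for (const command of perMeshCommands) {
    direct.call(command);
  }
}

// Batched: commands are queued in JS and submitted once per frame.
const batched = new FakeNativeEngine();
const queue: string[] = [];
for (let i = 0; i < meshCount; i++) {
  for (const command of perMeshCommands) {
    queue.push(command);
  }
}
batched.submit(queue);

console.log(`direct crossings: ${direct.boundaryCrossings}`);
console.log(`batched crossings: ${batched.boundaryCrossings}`);
```

Both variants execute the same commands; only the number of crossings differs. Whether this would pay off in practice depends on the cost of encoding and decoding the batch, which this sketch does not model.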

bghgary · 2020-12-14T21:33:21Z

bghgary
Dec 14, 2020
Maintainer

due to JSI not having the concept of weak references to JS objects

Just to clarify, JSI does have weak objects, but it works like a std::weak_ptr which requires getting the strong ref before calling any members. Using this to accurately represent what NAPI expects will likely worsen the situation.
