400 less fps when drawing circles/arcs in 3.3 #47824

Closed
ghost opened this issue Apr 12, 2021 · 19 comments

Comments

@ghost

ghost commented Apr 12, 2021

No description provided.

@akien-mga
Member

akien-mga commented Apr 12, 2021

This seems to be device/driver-specific as I can't reproduce the issue on Linux with Intel HD Graphics 630 and AMD Radeon RX Vega M, nor on another laptop with AMD Radeon RX Vega 56. Details below.

The main change between 3.2.3 and 3.3 which might affect this is the extension of 2D batching to GLES3 (in 3.2.3 batching was only for GLES2).

Can you see if you get different results if you turn off batching in the project settings? (Rendering > Batching). There are also various options you can play with to see if they make a difference.


Test results

Note that to avoid having a performance hit from running the editor itself, I'm running the project directly from the command line, not from the editor.

Both tests done using the official Linux 64-bit editor builds for 3.2.3-stable and 3.3 RC 8.

Intel 630 + discrete AMD Radeon Vega M

$ inxi -CSG
System:    Host: cauldron Kernel: 5.11.13-desktop-2.mga9 x86_64 bits: 64 Desktop: KDE Plasma 5.21.3 Distro: Mageia 9 mga9 
CPU:       Info: Quad Core model: Intel Core i7-8705G bits: 64 type: MT MCP cache: L2: 8 MiB 
           Speed: 3590 MHz min/max: 800/4100 MHz Core speeds (MHz): 1: 3590 2: 3863 3: 3775 4: 3840 5: 3837 6: 3894 7: 3880 
           8: 3860 
Graphics:  Device-1: Intel HD Graphics 630 driver: i915 v: kernel 
           Device-2: Advanced Micro Devices [AMD/ATI] Polaris 22 XL [Radeon RX Vega M GL] driver: amdgpu v: kernel 
           Device-3: Cheng Uei Precision Industry (Foxlink) HP Wide Vision FHD Camera type: USB driver: uvcvideo 
           Display: x11 server: Mageia X.org 1.20.10 driver: loaded: intel,v4l resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa Intel HD Graphics 630 (KBL GT2) v: 4.6 Mesa 21.0.2

I seem to get more or less the same metrics with both GLES2 and GLES3 (maybe +20 FPS for AMD with GLES2 on both 3.2.3 and 3.3 RC 8 compared to GLES3).

| GPU | 3.2.3 stable | 3.3 RC 8 |
|---|---|---|
| Intel | 210 FPS | 380 FPS |
| AMD | 400 FPS | 530 FPS |

AMD Radeon Vega 56

$ inxi -CSG
System:    Host: helios Kernel: 5.10.27-desktop-1.mga8 x86_64 bits: 64 Desktop: KDE Plasma 5.20.4 Distro: Mageia 8 mga8 
CPU:       Info: 8-Core model: AMD Ryzen 7 2700 bits: 64 type: MT MCP L2 cache: 4 MiB 
           Speed: 1613 MHz min/max: 1550/3200 MHz Core speeds (MHz): 1: 1613 2: 1731 3: 1473 4: 1408 5: 1519 6: 1697 7: 2669 
           8: 3393 9: 2743 10: 3407 11: 2840 12: 2436 13: 2174 14: 2239 15: 2170 16: 1742 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] driver: amdgpu v: kernel 
           Device-2: Quanta HD Webcam type: USB driver: uvcvideo 
           Display: x11 server: Mageia X.org 1.20.10 driver: amdgpu,v4l resolution: 1920x1080~144Hz 
           OpenGL: renderer: AMD VEGA10 (DRM 3.40.0 5.10.27-desktop-1.mga8 LLVM 11.0.1) v: 4.6 Mesa 21.0.1

| GPU | 3.2.3 stable | 3.3 RC 8 |
|---|---|---|
| AMD | 240 FPS | 890 FPS |

@akien-mga
Member

akien-mga commented Apr 12, 2021

Testing on Windows 10 with my laptop with the AMD Radeon RX Vega 56, I seem to be able to reproduce a performance drop in 3.3 RC 8 compared to 3.2.3 stable, though both give me surprisingly bad performance metrics on this repo, compared to Linux drivers on the same hardware.

The performance regression seems to already happen in 3.2.4 beta 1, so it must be due to a change between 3.2.3-stable and 3.2.4 beta 1.

| GPU | 3.2.3 stable | 3.3 RC 8 |
|---|---|---|
| AMD | 188 FPS | 80 FPS |

I'm not so familiar with debugging GPU drivers on Windows, but I can see in the Task Manager's Performance tab that the GPU is 100% busy on 3.2.3 stable, and only 66% busy on 3.3 RC 8. In both releases the CPU is only 5% busy.

Disabling batching doesn't seem to make a difference.

Here are some performance metrics from Radeon Software:

image

@YuriSizov
Contributor

YuriSizov commented Apr 12, 2021

I can confirm similar results on another AMD GPU, 315 fps vs 170 fps on average (also on Windows). Interestingly enough, the number of draw calls is the same for both versions (400), and there is no output in the console from batching doing anything. So my assumption is just general degradation for some reason.

Frame times tell a different story. In 3.2.3 lows are lower, but highs tend to be higher: it goes from 2.9-3 ms to 7 ms on average, and sometimes up to 10+ ms. In 3.3 the averages are more stable, fluctuating from 5.9 ms to 7+ ms with no outliers in either direction.
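For reference, the FPS averages and the frame times quoted above are two views of the same number. A minimal sketch of the conversion, using only the 315 and 170 FPS averages from this comment (the helper name `fps_to_ms` is made up for illustration):

```python
def fps_to_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Average FPS values reported in this comment for the two versions.
for label, fps in [("3.2.3", 315), ("3.3", 170)]:
    print(f"{label}: {fps} FPS is about {fps_to_ms(fps):.2f} ms per frame")
```

This puts 315 FPS at roughly 3.2 ms and 170 FPS at roughly 5.9 ms per frame, which lines up with the 2.9-3 ms and 5.9 ms figures read off the graphs.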

image
image

Radeon 5700 XT, driver version 21.3.1.

@lawnjelly
Member

lawnjelly commented Apr 12, 2021

Note that neither circles nor arcs (polyline) are batched, so there shouldn't be many differences code-wise (that probably explains why you get the same performance with batching on or off: they are using the same path). It's possible that an extra state being set (possibly to fix a bug) may have introduced slower performance on some hardware, or that one version is using line smoothing and the other is not. Although the drop in performance might seem serious, this is unlikely to be a major bug.

I'll try and investigate, I just have a couple more crucial bugs to fix first (or clayjohn or anyone interested, it's not a massive amount of code to compare).

Doing it manually BTW will probably be batched, hence the improvement in speed.

@lawnjelly
Member

If anyone getting this drop can try out 3.2.4 beta 2, and try these settings and see if there is any effect, that will be useful:
#43038 (comment)

@YuriSizov
Contributor

YuriSizov commented Apr 12, 2021

> If anyone getting this drop can try out 3.2.4 beta 2, and try these settings and see if there is any effect, that will be useful:
> #43038 (comment)

@lawnjelly Tested. Only turning rendering->options->api_usage_legacy->flag_stream on has any effect, but it's a positive one. I get an fps boost even compared to 3.2.3, over 410 fps on average in the same sample project. Frame times are also better, giving a constant stream of 2.3 ms with almost no deviations:

image

@lawnjelly
Member

See #47833 (comment) in the other issue, as although the observed results are slightly different, the underlying problems / discussion are similar.

@akien-mga
Member

I can confirm @pycbouh's results with the new options in #47864, turning ON rendering/2d/legacy_stream makes performance go from 80 FPS to 315 FPS (compared with 3.2.3-stable which had 188 FPS).

@lawnjelly
Member

> I can confirm @pycbouh's results with the new options in #47864, turning ON rendering/2d/legacy_stream makes performance go from 80 FPS to 315 FPS (compared with 3.2.3-stable which had 188 FPS).

Out of interest what are the results of this one with legacy_orphan set to off, and legacy_stream set to off?

@YuriSizov
Contributor

I think I've tried all combinations between each pair of settings from 3.2.4b2 and nothing else provided any meaningful difference. Including both set to off.

@akien-mga
Member

akien-mga commented Apr 14, 2021

> I can confirm @pycbouh's results with the new options in #47864, turning ON rendering/2d/legacy_stream makes performance go from 80 FPS to 315 FPS (compared with 3.2.3-stable which had 188 FPS).

Testing with the same device (Radeon RX Vega 56) on Linux, I get:

  • Default:
Project FPS: 876 (1.1 mspf)
Project FPS: 874 (1.1 mspf)
Project FPS: 857 (1.1 mspf)
Project FPS: 831 (1.2 mspf)
Project FPS: 801 (1.2 mspf)
Project FPS: 869 (1.1 mspf)
Project FPS: 926 (1.0 mspf)
Project FPS: 831 (1.2 mspf)
Project FPS: 787 (1.2 mspf)
Project FPS: 789 (1.2 mspf)
Project FPS: 840 (1.1 mspf)
Project FPS: 842 (1.1 mspf)
  • Legacy Stream ON:
Project FPS: 954 (1.0 mspf)
Project FPS: 922 (1.0 mspf)
Project FPS: 924 (1.0 mspf)
Project FPS: 946 (1.0 mspf)
Project FPS: 924 (1.0 mspf)
Project FPS: 928 (1.0 mspf)
Project FPS: 908 (1.1 mspf)
Project FPS: 930 (1.0 mspf)
Project FPS: 910 (1.0 mspf)
Project FPS: 847 (1.1 mspf)
Project FPS: 923 (1.0 mspf)
Project FPS: 911 (1.0 mspf)

Fluctuates a lot but I guess that can be summarized as a +50-80 FPS gain (or -0.1 mspf).
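To put a number on that estimate, the two logs above can be averaged directly. This is a small sketch, not part of the original test: the regular expression assumes every line has exactly the `Project FPS: N (X mspf)` shape shown above, and the names `parse_fps`, `DEFAULT_LOG` and `LEGACY_STREAM_LOG` are made up for illustration.

```python
import re
from statistics import mean

# Matches lines such as "Project FPS: 876 (1.1 mspf)".
FPS_LINE = re.compile(r"Project FPS: (\d+) \(([\d.]+) mspf\)")


def parse_fps(log: str) -> list:
    """Return the FPS value of every matching line in the log text."""
    return [int(match.group(1)) for match in FPS_LINE.finditer(log)]


DEFAULT_LOG = """
Project FPS: 876 (1.1 mspf)
Project FPS: 874 (1.1 mspf)
Project FPS: 857 (1.1 mspf)
Project FPS: 831 (1.2 mspf)
Project FPS: 801 (1.2 mspf)
Project FPS: 869 (1.1 mspf)
Project FPS: 926 (1.0 mspf)
Project FPS: 831 (1.2 mspf)
Project FPS: 787 (1.2 mspf)
Project FPS: 789 (1.2 mspf)
Project FPS: 840 (1.1 mspf)
Project FPS: 842 (1.1 mspf)
"""

LEGACY_STREAM_LOG = """
Project FPS: 954 (1.0 mspf)
Project FPS: 922 (1.0 mspf)
Project FPS: 924 (1.0 mspf)
Project FPS: 946 (1.0 mspf)
Project FPS: 924 (1.0 mspf)
Project FPS: 928 (1.0 mspf)
Project FPS: 908 (1.1 mspf)
Project FPS: 930 (1.0 mspf)
Project FPS: 910 (1.0 mspf)
Project FPS: 847 (1.1 mspf)
Project FPS: 923 (1.0 mspf)
Project FPS: 911 (1.0 mspf)
"""

default_fps = parse_fps(DEFAULT_LOG)
legacy_fps = parse_fps(LEGACY_STREAM_LOG)
gain = mean(legacy_fps) - mean(default_fps)

print(f"Default mean:       {mean(default_fps):.0f} FPS")
print(f"Legacy Stream mean: {mean(legacy_fps):.0f} FPS")
print(f"Gain:               {gain:+.0f} FPS")
```

Over these twelve samples each, the means differ by about 75 FPS, so the +50-80 FPS summary holds up.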

@lawnjelly
Member

I was just wondering because I hadn't yet worked out what had caused the drop in FPS from 3.2.3 to 3.3, as aside from the orphaning the code was mostly the same if I remember correctly. Maybe there is some other change since 3.2.3.

@YuriSizov
Contributor

Something at least improves frame times in 3.3 overall. Not the average, maybe, but the spread. I definitely see it going off both ends in 3.2.4, but not so in 3.3 even with default/auto settings.

If there is anything else we can track, do tell.

@bruvzg
Member

bruvzg commented Apr 14, 2021

Some tests all done on the same machine with AMD FirePro D300 GPU:

| OS / Driver | Godot 3.2.3-stable | Godot 3.3-rc8 | Godot 3.x + 47864 and legacy options enabled |
|---|---|---|---|
| macOS 11.2.3 (20D91) | 105 FPS | 470 FPS | 460 FPS |
| Windows 10 (Radeon Software 21.3.1) | 230 FPS | 120 FPS | 210 FPS |
| Linux (Mesa 20.2.6/AMDGPU) | 270 FPS | 350 FPS | 350 FPS |
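Expressed as relative change from 3.2.3-stable to 3.3-rc8, the numbers above make the per-OS split easier to see. A quick sketch using only the figures from this table (the helper `percent_change` is just for illustration):

```python
# (3.2.3-stable FPS, 3.3-rc8 FPS) per OS, copied from the table above.
results = {
    "macOS": (105, 470),
    "Windows": (230, 120),
    "Linux": (270, 350),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


for os_name, (stable, rc8) in results.items():
    print(f"{os_name}: {percent_change(stable, rc8):+.1f}%")
```

On this machine only the Windows driver regresses (roughly half the frame rate), while macOS and Linux both improve.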

@lawnjelly
Member

As an aside, something I had noted a while ago is that the legacy draw_polygon and draw_generic etc functions are very inefficient API usage wise. (I think these are used to draw the circles).

They call glBufferSubData multiple times per call which could well be not good. I did make a bit of an attempt to improve this to a better method a while ago but made a mistake and reverted, but I've just not had it as a priority to fix. If we pre-prepared the whole buffer before uploading as a one off these calls would probably be a lot more efficient.

It's just not been a massive priority as I'm not sure how much these legacy primitives are used (they may be important in some editor modes though).
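The "pre-prepare the whole buffer, then upload once" idea can be illustrated without any real OpenGL. The sketch below is not Godot's code: `FakeBuffer` is a hypothetical stand-in whose `sub_data` method only plays the role of `glBufferSubData` and counts calls, and the attribute layout (positions, then colors, then UVs) is an assumption made for the example.

```python
import struct


class FakeBuffer:
    """Hypothetical stand-in for a GPU vertex buffer; counts uploads."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.upload_calls = 0

    def sub_data(self, offset: int, payload: bytes) -> None:
        # Plays the role of one glBufferSubData call.
        self.data[offset:offset + len(payload)] = payload
        self.upload_calls += 1


def pack_floats(values) -> bytes:
    """Pack a sequence of floats as little-endian 32-bit values."""
    return struct.pack(f"<{len(values)}f", *values)


def upload_per_attribute(buf, positions, colors, uvs) -> None:
    """One upload per attribute array: several calls per primitive."""
    offset = 0
    for attribute in (positions, colors, uvs):
        payload = pack_floats(attribute)
        buf.sub_data(offset, payload)
        offset += len(payload)


def upload_prepared(buf, positions, colors, uvs) -> None:
    """Build the whole payload on the CPU first, then upload it once."""
    payload = b"".join(pack_floats(a) for a in (positions, colors, uvs))
    buf.sub_data(0, payload)


# A single triangle: 3 vertices with 2D position, RGBA color and UV.
positions = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
colors = [1.0, 1.0, 1.0, 1.0] * 3
uvs = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
size = 4 * (len(positions) + len(colors) + len(uvs))

separate = FakeBuffer(size)
upload_per_attribute(separate, positions, colors, uvs)

prepared = FakeBuffer(size)
upload_prepared(prepared, positions, colors, uvs)

print(separate.upload_calls, prepared.upload_calls)
```

Both paths end up with identical buffer contents; the difference is only the number of upload calls per primitive. Whether that translates into a measurable gain depends on the driver.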

@YuriSizov
Contributor

Out of curiosity I've also tried this project on my GTX 960m laptop. In 3.2.3 I get anywhere between 200 and 300 fps, sometimes dipping below 200. In 3.2.4 beta 2/3.3 RC 8 I get values between 35 and 50 fps, varying run to run. Changing flags in 3.2.4 beta 2 doesn't have any effect.

That's on Windows 10 and docked (not on battery). I didn't update either drivers or Windows on that laptop for some time though.

@bruvzg
Member

bruvzg commented Apr 15, 2021

RC9 tests on the same machine (average over 1 minute of measurement, with the first and last 5 seconds excluded), using official binaries for all tests. On Linux the FPS fluctuates hugely for some reason:

| Godot version and config | macOS 11.2.3 (20D91) | Windows 10 (Radeon Software 21.3.1) | Ubuntu 20.10 (Mesa 20.2.6) |
|---|---|---|---|
| 3.3 RC9, orphan_buffers=2, stream=1 | 450 ± 20 | 120 ± 10 | 465 ± 200 |
| 3.3 RC9, orphan_buffers=1, stream=2 | 470 ± 20 | 205 ± 10 | 430 ± 200 |
| 3.3 RC9, orphan_buffers=2, stream=2 | 445 ± 20 | 205 ± 10 | 460 ± 200 |
| 3.3 RC9, orphan_buffers=1, stream=1 | 440 ± 20 | 120 ± 2 | 430 ± 100 |
| 3.2.3 stable | 110 ± 2 | 210 ± 2 | 265 ± 2 |
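For anyone wanting to repeat this kind of measurement, the averaging described above (1 minute of samples, first and last 5 seconds dropped) could look roughly like the sketch below. It is an assumption about the method, not the script actually used: the `summarize` helper is hypothetical, the spread is reported as a population standard deviation (the ± values in the table may have been derived differently), and the sample data is synthetic.

```python
from statistics import mean, pstdev


def summarize(samples, warmup=5.0, cooldown=5.0):
    """Average FPS over a run, ignoring the start and end of the capture.

    `samples` is a list of (time_in_seconds, fps) pairs. Returns a
    (mean, spread) tuple, where spread is the population standard deviation.
    """
    end = max(t for t, _ in samples)
    kept = [fps for t, fps in samples if warmup <= t <= end - cooldown]
    return mean(kept), pstdev(kept)


# Synthetic one-minute capture, one sample per second: slow while the
# project starts up and shuts down, steady in between.
samples = [(t, 100 if t < 5 or t > 55 else 450) for t in range(61)]

average, spread = summarize(samples)
print(f"{average:.0f} ± {spread:.0f} FPS")
```

Trimming the ends keeps startup and shutdown hitches out of the average, which matters when comparing configurations that differ by only a few percent.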

@lawnjelly
Member

Those results look encouraging, it should be good with the defaults we set in RC9.

I'm not sure we can do anything obvious about the fluctuation on Linux; diagnosing it would probably need internal knowledge of the driver. Interesting that on Windows the performance is halved with the same workload.

@akien-mga
Member

Fixed by #47864. Please comment if you still reproduce this specific issue in 3.3 RC 9 or stable.

@ghost ghost changed the title 400 less fps when drawing circles/arcs in 3.3 . Jun 5, 2021
@akien-mga akien-mga changed the title . 400 less fps when drawing circles/arcs in 3.3 Sep 14, 2021