Skip to content

Rewrite Radiance and Reflection probes to use Octahedral maps.#107902

Merged
Repiteo merged 1 commit into
godotengine:masterfrom
DarioSamo:sky-octahedral
Dec 3, 2025
Merged

Rewrite Radiance and Reflection probes to use Octahedral maps.#107902
Repiteo merged 1 commit into
godotengine:masterfrom
DarioSamo:sky-octahedral

Conversation

@DarioSamo
Copy link
Copy Markdown
Contributor

@DarioSamo DarioSamo commented Jun 23, 2025

Reimplements Sky Radiance and Reflection Probes by using Octahedral maps instead of Cubemaps. This drops the requirement of needing cubemap array support on the target platform.

Octahedral maps have pretty big benefits over cubemaps when it comes to memory usage and sampling performance. It also leads to a heavy reduction in the amount of compute dispatches or render passes necessary to compute the mipmaps and the rough versions of the reflections.

Performance benefits are expected across the board for a very small difference in quality. I'll gather some comparisons while the PR is in progress.

TODO

  • Check if coefficients for roughness filters need to be recomputed or if they can be used as is.
  • Fix Update Once behavior.
  • Test for regressions.
  • Gather some performance comparisons against master.
  • Track error where shaders were never freed (CubeToOctmapShaderRD).
  • Investigate seam.
  • Investigate toon shader radiance error.
  • Replicate border generation on raster version.
  • Replicate border generation on fast filter version.

@Calinou
Copy link
Copy Markdown
Member

Calinou commented Jun 26, 2025

Out of curiosity, would this PR make it easier to implement godotengine/godot-proposals#11530 in the future?

@clayjohn
Copy link
Copy Markdown
Member

Out of curiosity, would this PR make it easier to implement godotengine/godot-proposals#11530 in the future?

Nope, it's unrelated

@DarioSamo DarioSamo force-pushed the sky-octahedral branch 8 times, most recently from 946df48 to 6632eaa Compare June 27, 2025 18:09
@DarioSamo DarioSamo marked this pull request as ready for review June 27, 2025 18:12
@DarioSamo DarioSamo requested a review from a team as a code owner June 27, 2025 18:12
@DarioSamo
Copy link
Copy Markdown
Contributor Author

I've finished my self-review pass on this, but we'll probably leave this parked for a proper review and merge until we start 4.6, as it's far too risky of a change for 4.5.

Copy link
Copy Markdown
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected.

Testing project: https://0x0.st/8lkg.zip
Press R to toggle all ReflectionProbes in the scene between the Once update mode (default) and Always.

However, I can notice seams in the Material Testers demo when I disable reflection probes in the in-game menu:

image

The size of pixels on the seams depends on viewport resolution, not reflection resolution. This makes seams more noticeable at small viewport resolutions. The seam is still here when rendering with llvmpipe, so it's not NVIDIA-specific.

Seams are not visible with reflection probes enabled:

image

When reflection probes are enabled, the Toon material at the end has a visible radiance seam:

This is also visible on the first material when you run the project, but it's more noticeable on Toon.

Sharpness comparison when zooming on the mirror material in 4K:

Before  After
image image

Performance-wise, I can't spot a difference on my PC with any combination of settings I've tried (viewport resolution or probe update mode), even though I'm GPU-bound in this scene. I will try to test on slower devices when I have more time.

PC specifications
  • CPU: AMD Ryzen 9 9950X3D
  • GPU: NVIDIA GeForce RTX 5090
  • RAM: 64 GB (2×32 GB DDR5-6000 CL30)
  • SSD: Solidigm P44 Pro 2 TB
  • OS: Linux (Fedora 42)

@clayjohn
Copy link
Copy Markdown
Member

clayjohn commented Jun 30, 2025

Performance-wise, I can't spot a difference on my PC with any combination of settings I've tried (viewport resolution or probe update mode), even though I'm GPU-bound in this scene. I will try to test on slower devices when I have more time.

This is an optimization for TBDR architecture devices. You won't notice a difference on an RTX 5090 or any other desktop chip (high end or low end)

Also, keep in mind that this optimizes the filtering of the radiance map. So when using a panoramasky material this code only runs once at load time

@clayjohn
Copy link
Copy Markdown
Member

I've done a quick test on an M2 macbook using a single sky with a radiance size of 1024 and the sky update mode set to "high quality".

4.5-beta1: 9 FPS (~111 mspf)
this PR: 48 FPS (~21 mspf)

For this simple case it is a difference of about 5x

This PR does give the following error though:

ERROR: 1 shaders of type CubeToOctmapShaderRD were never freed

@Calinou
Copy link
Copy Markdown
Member

Calinou commented Jun 30, 2025

For this simple case it is a difference of about 5x

I'm surprised the difference is so large. Were you doing something that causes the radiance map to update constantly (e.g. sky shader using TIME or rotating a DirectionalLight3D that's used in the sky shader)?

Can you share a MRP?

@DarioSamo
Copy link
Copy Markdown
Contributor Author

I'm surprised the difference is so large. Were you doing something that causes the radiance map to update constantly (e.g. sky shader using TIME or rotating a DirectionalLight3D that's used in the sky shader)?

This is one of the cases we're optimizing for, yes. The GD Quest TPS Demo does that with the sky.

@Calinou
Copy link
Copy Markdown
Member

Calinou commented Jun 30, 2025

This is one of the cases we're optimizing for, yes. The GD Quest TPS Demo does that with the sky.

That makes sense. I was initially benchmarking this on the "render-time" aspect, rather than the "generate-time" aspect which is much more often a bottleneck (sometimes by accident in projects that don't notice they're updating the radiance map every frame).

See also #105889 and #55933. The latter in particular could be helpful on low-end platforms, even after this PR in scenes where sky animation is subtle in ambient/reflected light (e.g. clouds).

Edit: I've made a testing project to benchmark radiance map generation rather than its rendering:

test_pr_107902.zip

On my RTX 5090, it's consistently 0.01 ms faster in terms of frametime with this PR compared to master (reproduced across various resolutions). Running through llvmpipe gives me a 2.33 mspf frametime improvement with this PR (from 101 FPS to 132 FPS at the default window size).

The benefit is more noticeable at lower resolutions, since the radiance map generation cost is independent of viewport resolution.

@clayjohn
Copy link
Copy Markdown
Member

For this simple case it is a difference of about 5x

I'm surprised the difference is so large. Were you doing something that causes the radiance map to update constantly (e.g. sky shader using TIME or rotating a DirectionalLight3D that's used in the sky shader)?

Yes, as I said above. This optimization is for the environment map generation, so you need it to regenerate it each frame to profile its performance.

Can you share a MRP?

octa.tscn.txt

@DarioSamo
Copy link
Copy Markdown
Contributor Author

I pushed a fix for the border scale depending on the mipmap level. Looking into the other changes proposed by Clay as well.

@DarioSamo DarioSamo force-pushed the sky-octahedral branch 3 times, most recently from c073649 to 559a16f Compare October 20, 2025 18:28
@Repiteo Repiteo requested review from Calinou and clayjohn November 19, 2025 02:04
@Repiteo
Copy link
Copy Markdown
Contributor

Repiteo commented Nov 19, 2025

Needs rebase

Comment thread servers/rendering/renderer_rd/effects/copy_effects.cpp Outdated
Comment thread servers/rendering/renderer_rd/effects/copy_effects.h Outdated
Comment thread servers/rendering/renderer_rd/effects/copy_effects.cpp Outdated
Comment thread servers/rendering/renderer_rd/effects/copy_effects.cpp Outdated
Comment thread servers/rendering/renderer_rd/effects/copy_effects.h Outdated
Co-authored-by: clayjohn <claynjohn@gmail.com>
@clayjohn
Copy link
Copy Markdown
Member

clayjohn commented Dec 3, 2025

Just pushed a rebase + update. The big change in the update is to use a fixed width border (1 pix at the smallest mip, but doubling with each mip level). This takes up more memory than the single pixel border in the old method, but it preserves filtering when mipmapping, which is a hard requirement to avoid filtering artifacts. Additionally, the cost to read from the sky is now cheaper since the border calculation is a single FMA instead of a few multiplications, a pow() call and an addition.

This push also addresses the review comments from ATS!

With this, the PR should be ready to merge.

I measured performance on 3 devices:

  1. Intel iGPU laptop
  2. M2 MBP
  3. Quest 3

For a testing scene, I forced a sky to update every frame by using TIME in the shader.

With the Intel iGPU, I also tested every combination of texture_use_array_layers project setting, realtime update, high_quality update, incremental update, and various radiance sizes. In all cases I saw a roughly 2X improvement. In many configurations this saves several ms.

With the MBP there was only a modest improvement of about 15%

On the Quest, I tested in the context of the GDQuest TPS demo and saw a fixed performance bump of 2ms per frame.

Note for release team

This PR is high risk, but has been in development for a long time and is the combined efforts of both Dario and I. Most of the work is done by Dario and has been very carefully reviewed by me. I think we should go ahead and merge this for 4.6 since rebasing it has been a huge pain and will continue to be a big pain.

Given my already extensive review. I am pretty confident that this is well done and that any issues from it can be fixed during the Beta phase

@clayjohn clayjohn modified the milestones: 4.x, 4.6 Dec 3, 2025
@Repiteo Repiteo merged commit 62affe0 into godotengine:master Dec 3, 2025
20 checks passed
@Repiteo
Copy link
Copy Markdown
Contributor

Repiteo commented Dec 3, 2025

Thanks!

@dsnopek
Copy link
Copy Markdown
Contributor

dsnopek commented Dec 3, 2025

Thanks, this looks great!

I did some benchmarking with a modified GDQuest TPS demo on Quest 3, and I can confirm a nice reproducible improvement, but I'm seeing more like 1ms (rather than 2ms):

Marker Before PR After PR
1 16.507 ms 15.545 ms
2 18.505 ms 17.797 ms
3 13.895 ms 13.887 ms

My test fixes the camera at 3 different places in the scene. Marker 3 shows no change, because it was already at frame rate.

@kubaofc123
Copy link
Copy Markdown
Contributor

Not sure yet if directly related to this commit but there is regression in visual quality with the default Sky and reflections. The seam goes away if used with reflection probes. It's not related in any way to meshes themselves as rotating them keeps the seam in same world-space
image

@clayjohn
Copy link
Copy Markdown
Member

@kubaofc123 That's a known issue, reported here #113676

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants