-
-
Notifications
You must be signed in to change notification settings - Fork 21.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3.x] Implement LOD range in GeometryInstance #53778
Conversation
71cb8d3
to
c58391c
Compare
Thank you! |
instance->lod_begin = p_min; | ||
instance->lod_end = p_max; | ||
instance->lod_begin_hysteresis = p_min_margin; | ||
instance->lod_end_hysteresis = p_max_margin; | ||
} | ||
void VisualServerScene::instance_geometry_set_as_instance_lod(RID p_instance, RID p_as_lod_of_instance) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should be removed, nothing ever calls it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That method is exposed to the scripting API – wouldn't removing it technically break compatibility?
58dffe9
to
eecc9cd
Compare
// Calculate instance->depth from the camera. | ||
const Vector3 aabb_center = ins->transformed_aabb.position + (ins->transformed_aabb.size * 0.5); | ||
ins->depth = near_plane.distance_to(aabb_center); | ||
ins->depth_layer = CLAMP(int(ins->depth * 16 / z_far), 0, 15); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This calculation seems only required here if LOD is active for this instance. This could be set efficiently with e.g. a bool when setting the LOD params for the instance. i.e.:
if (ins->lod_active) ... do lod calcs
In addition the depth calculated here could potentially be reused in many cases later in the line 3305 part, to save doing the same calculation twice. Although that would need some perf testing to check it is a win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This calculation seems only required here if LOD is active for this instance. This could be set efficiently with e.g. a bool when setting the LOD params for the instance. i.e.:
Do we need a dedicated boolean field for this? We can check if LOD is enabled by checking whether the min range or max range is greater than 0 (exclusive). We can also check for hysteresis being greater than 0 (exclusive) if any of those conditions is true, but lazy evaluation can take care of that for us anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't profiled this, but imo a good aim is to make any new code as cheap as possible for instances that aren't using LOD. We essentially should aim not to be doing extra unnecessary calculations for the 99% of times where people aren't using LOD on an instance.
Doing unneeded distance calculations, and checking two float values against CMP_EPSILON
is not as cheap as just detecting a bool is set to false which will likely be the cheapest conditional (bear in mind on some mobile floating point math could even be emulated). This double float check is additionally now not required at runtime should the bool be set to true.
There are lots of occasions where just doing things unoptimized are fine (where code clarity is important, and it isn't performance sensitive), but inside the renderer imo we should strive to make it as efficient as possible as there is a fair amount of churn, and we are just wasting cycles and battery on mobile.
Looking at the Instance : RasterizerScene::InstanceBase
structure, it looks like RID lod_instance
is unused so could be removed. You could for instance add a bool up with the update_materials
bools, and you could possibly make them 1 bit too (e.g. bool update_aabb : 1;
), so they would likely be no more than 32 bit between them.
In instance_geometry_set_draw_range()
you could just do something like this:
instance->using_lod = (instance->lod_begin > CMP_EPSILON) || (instance->lod_end > CMP_EPSILON);
Therefore the float check is done once (rarely) when setting the params, and the cheap bool test can be done at runtime on each frame on each prepare_scene
.
One thing I'm mostly unsure about is the use of a hashmap to store the hysteresis state of each instance on each camera. Even with the improvements to hashmap in 4.x (which aren't available in 3.x), this seems like a sub-optimal way of storing this for realtime use. I wonder whether we could simply store a vector of instance IDs per camera that are in a "far" hysteresis state. You could run through this vector at the start of The only snag here is that objects that move outside the view frustum would lose their hysteresis state (but maybe that would not matter? not sure). An alternative approach might be to store a bitfield on each instance of the hysteresis state for each camera (say 32 bit?). Cameras could be assigned a hysteresis ID as they are created, with a limit of 32 current for the hysteresis to work. I don't know if anyone would need more than 32 cameras... These are just some ideas but personally I would really try and avoid the hashmap for this kind of thing, if possible. |
eecc9cd
to
ca7bfa0
Compare
Similar to the one present in `master`, this provides a LOD system for GeometryInstance-derived nodes such as MeshInstance and MultiMeshInstance. This LOD system currently doesn't obey shadow rendering, but it can still provide substantial performance improvements when used well. Coupled with the occlusion culling systems (portals/rooms or occluder spheres), this makes possible to achieve much greater performance in large detailed scenes. Co-authored-by: Florian Jung <[email protected]>
ca7bfa0
to
55c258e
Compare
Just doing some more testing. I've profiled with 100,000 boxes (not all at the same time, it was kind of derived from your test project, but this should be indicative), with most of them being LOD culled: So as expected the bottlenecks look to be
Within |
As I was saying on rocket chat, for the extreme situation of an open world, where the user simply wants some far away stuff not to show up, we should probably look at modifying An alternative approach (like in the addon) is to maintain a list of currently LODed instances on the camera, and run through this list round robin fashion, and not necessarily do the whole list every frame. If this was done over say 10 frames, you have a 1/10th of the checks necessary (at a cost of a slight lag to the LOD change). This would be in exchange for a slight complexity of adding LODed instances to the relevant camera lists as cameras are added / destroyed, and instances are added / destroyed. Having said that, I don't think the above is necessary for a first implementation, provided that we make the LOD check highly efficient in the code as I outlined earlier, by checking e.g. a bool. See https://github.com/lawnjelly/godot/tree/calinou_lod_range_modified . I'll try and push some commits for possible changes. |
The changes I added to https://github.com/lawnjelly/godot/tree/calinou_lod_range_modified now seem to approx double the frame rate in the extreme test case (from 60fps to 120fps).
The downside (versus the HashMap) is that the bitset is currently hardcoded to a max of 32 current cameras (above that, hysteresis will not work on added cameras). There is a typedef so it can be changed to e.g. 64, but 32 should be fine imo. UsabilityAlthough most comments I made earlier are to do with performance, given we have a bit of time before 3.6 we should consider whether there is anything we can do to increase the usability (especially regarding visibility parents, and auto-generating LODs / merging as a menu option). I'm not super familiar with the LOD in 4.x, but hopefully some of the guys that worked on that there can chime in such as @JFonS . Self shadowingThinking ahead one of problems that might be faced is the difference between the LOD used for shadows and rendering which can cause nasty artefacts. This is probably an issue just using a low LOD for the shadow map and a higher LOD for rendering. However it can potentially be reduced if you can use the same LOD for shadow map and rendering (except in the case of fading between LODs). I don't know how 4.x deals with this currently. However, this PR is doing LOD and storing hysteresis states independently for different cameras, whereas shadow maps are rendered once per light afaik. This means you no longer have the option to just pick the same LOD for the shadow map, because with a split screen game, different screens may be rendering a different LOD from the camera view. An alternative might be to make the decision of which LOD to use globally (i.e. not per camera). You would probably have to do a step ahead of time to calculate the closest camera and choose the highest LOD (or average or something) determined by the closest camera. This might lead to some weirdness in split screen as far off objects move in and out of the view of a secondary camera, but it could help with the shadow artefact problem. Anyway this isn't saying we should necessarily use the other approach, as there are problems with both. You could also render different shadow maps per camera, but this would be slow and quite complex to change. |
Does this have all the features of multilod? It's already well tested and stable. Eg, the plugin manages shadow impostors, sends signals on lod changes, and adjusts lights and MMI. |
See the first post in #58512 for light LOD. I started working on a Shadow meshes aren't supported in MultiMeshInstance might work already as the LOD properties used in this PR are available on any node that inherits from GeometryInstance, not just MeshInstance. As for signals on LOD changes, I'm not sure if it's worth the added complexity. We try to keep the feature set as consistent as possible between |
Alright, thanks. Signals on lod change are essential for us. It serves as a proximity detector, similar to visibility notifier but for multiple distances. We'll continue to use puchik's multi-lod in 3.x until we upgrade to GD4. |
The godot https://github.com/CaptainProton42/GodotHoloPlay uses 35 cameras I think. @CaptainProton42 |
This case looks more suitable for manual LOD though (if indeed it is required). A situation like that will likely benefit more from performing LOD once per object, rather than once per camera. |
Will this be merged before 3.5 stable releases? |
3.5 is nearing release, so I'm afraid not. This will have to wait for 3.6, but nothing prevents you from building your own editor and export template binaries with this PR included or sticking to godot-lod. |
Just want to throw in my own 2 cents on this. But I do personally think its a very bad look to have the LOD properties available in the inspector while this isn't actually implemented in the engine, especially considering Godot 3 is labeled as a LTS version now. Considering this is probably not going to be merged before 3.6 is released, why not just decide to hide these properties until this is actually implemented, to avoid confusion? This is the type of thing that confuses and potentially turns off new users. Am I wrong? https://www.reddit.com/r/godot/comments/180jtdh/does_anyone_know_when_is_this_actually_making_it/ |
I wanted to pitch this as well. It's in the UI, clearly someone has already deemed it worthy of proposing and a whole lot of other people have already approved it to warrant adding. LOD is kind of a must-have and there's no way that a GDScript implementation is anywhere as performant as a C++ implementation (considering that allegedly, GDscript had a speed boost in Godot 4). Adding either a script to run logic or a whole node just to get LOD in seems bloated. Not to mention that you can't use this LOD node with GLTF animated meshes like characters that have baked in LOD meshes. This is a must have for any engine, especially for those who have existing projects in Godot 3 and a lot of props. |
Superseded by #85437. |
Salvaged version of #20489. Thanks @Windfisch for the implementation 🙂
Similar to the one present in
master
, this provides a LOD system for GeometryInstance-derived nodes such as MeshInstance and MultiMeshInstance.This LOD system currently doesn't obey shadow rendering, but it can still provide substantial performance improvements when used well. (To prevent multi-LOD objects from casting shadows over each other, only make one of the LOD levels cast shadows by configuring other meshes' Shadow Casting property to Off.)
On the other hand, this LOD system should be able to handle split screen, unlike the one currently present in
master
.Coupled with the occlusion culling systems (portals/rooms or occluder spheres), this makes possible to achieve much greater performance in large detailed scenes.
Testing project: test_lod_3.x.zip
Rationale
As you may have seen in #20489's comments, the PR was closed since a different approach was adopted in
master
. The closing comment refers to the automatic LOD system, which can't be backported to3.x
as it would require too many backwards-incompatible changes. The automatic LOD system is a good start as it requires no manual setup, but it's not suited to every use case and can't perform HLOD (merging several distant objects into a single draw call).However, the manual (H)LOD system can still be backported or reimplemented in a limited fashion in
3.x
, while still providing substantial performance improvements with some manual setup.The existing LOD properties can be read by a Godot 4.0 converter to allow LOD to work automatically when porting a project from Godot 3.x to Godot 4.0.
Preview
The testing project has 1,134 boxes with 16,000 triangles each (20×20×20 subdivision).
No LOD
The video is stuttery due to how I recorded it, but it's much smoother in practice.
no_lod.mp4
With LOD
The video is stuttery due to how I recorded it, but it's much smoother in practice.
with_lod.mp4
TODO
3.x
🙂