Skip to content

Conversation

@pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Apr 6, 2025

Currently, RenderMaterialInstances and RenderMeshMaterialIds are very similar render-world resources: the former maps main world meshes to typed material asset IDs, and the latter maps main world meshes to untyped material asset IDs. This is needlessly-complex and wasteful, so this patch unifies the two in favor of a single untyped RenderMaterialInstances resource.

This patch also fixes a subtle issue that could cause mesh materials to be incorrect if a MeshMaterial3d<A> was removed and replaced with a MeshMaterial3d<B> material in the same frame. The problematic pattern looks like:

  1. extract_mesh_materials<B> runs and, seeing the Changed<MeshMaterial3d<B>> condition, adds an entry mapping the mesh to the new material to the untyped RenderMeshMaterialIds.

  2. extract_mesh_materials<A> runs and, seeing that the entity is present in RemovedComponents<MeshMaterial3d<A>>, removes the entry from RenderMeshMaterialIds.

  3. The material slot is now empty, and the mesh will show up as whatever material happens to be in slot 0 in the material data slab.

This commit fixes the issue by splitting out extract_mesh_materials into three phases: extraction, early sweeping, and late sweeping, which run in that order:

  1. The extraction system, which runs for each material, updates RenderMaterialInstances records whenever MeshMaterial3d components change, and updates a change tick so that the following system will know not to remove it.

  2. The early sweeping system, which runs for each material, processes entities present in RemovedComponents<MeshMaterial3d> and removes each such entity's record from RenderMeshInstances only if the extraction system didn't update it this frame. This system runs after all extraction systems have completed, fixing the race condition.

  3. The late sweeping system, which runs only once regardless of the number of materials in the scene, processes entities present in RemovedComponents<ViewVisibility> and, as in the early sweeping phase, removes each such entity's record from RenderMeshInstances only if the extraction system didn't update it this frame. At the end, the late sweeping system updates the change tick.

Because this pattern happens relatively frequently, I think this PR should land for 0.16.

associated race condition.

Currently, `RenderMaterialInstances` and `RenderMeshMaterialIds` are
very similar render-world resources: the former maps main world meshes
to typed material asset IDs, and the latter maps main world meshes to
untyped material asset IDs. This is needlessly-complex and wasteful, so
this patch unifies the two in favor of a single untyped
`RenderMaterialInstances` resource.

This patch also fixes a subtle issue that could cause mesh materials to
be incorrect if a `MeshMaterial3d<A>` was removed and replaced with a
`MeshMaterial3d<B>` material in the same frame. The problematic pattern
looks like:

1. `extract_mesh_materials<B>` runs and, seeing the
   `Changed<MeshMaterial3d<B>>` condition, adds an entry mapping the
   mesh to the new material to the untyped `RenderMeshMaterialIds`.

2. `extract_mesh_materials<A>` runs and, seeing that the entity is
   present in `RemovedComponents<MeshMaterial3d<A>>`, removes the entry
   from `RenderMeshMaterialIds`.

3. The material slot is now empty, and the mesh will show up as whatever
   material happens to be in slot 0 in the material data slab.

This commit fixes the issue by splitting out `extract_mesh_materials`
into *three* phases: *extraction*, *early sweeping*, and *late
sweeping*, which run in that order:

1. The *extraction* system, which runs for each material, updates
   `RenderMaterialInstances` records whenever `MeshMaterial3d`
   components change, and updates a change tick so that the following
   system will know not to remove it.

2. The *early sweeping* system, which runs for each material, processes
   entities present in `RemovedComponents<MeshMaterial3d>` and removes
   each such entity's record from `RenderMeshInstances` only if the
   extraction system didn't update it this frame. This system runs after
   *all* extraction systems have completed, fixing the race condition.

3. The *late sweeping* system, which runs only once regardless of the
   number of materials in the scene, processes entities present in
   `RemovedComponents<ViewVisibility>` and, as in the early sweeping
   phase, removes each such entity's record from `RenderMeshInstances`
   only if the extraction system didn't update it this frame. At the
   end, the late sweeping system updates the change tick.

Because this pattern happens relatively frequently, I think this PR
should land for 0.16.
@pcwalton pcwalton added this to the 0.16 milestone Apr 6, 2025
@pcwalton pcwalton added A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior labels Apr 6, 2025
@pcwalton pcwalton added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label Apr 6, 2025
Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not dropping a final review yet as CI is broken and I need to take another look at a couple of bits when I have more time but what I see all looks fine

Copy link
Member

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nasty bug, great step towards #18075.

return RenderCommandResult::Skip;
};
let Some(material) = materials.get(*material_asset_id) else {
let Ok(material_asset_id) = material_instance.asset_id.try_typed::<M>() else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for myself, but this is basically guaranteed to succeed because the phase item was queued in a queue_material_meshes::<M> and so we would never look up a material instance whose asset id's material type doesn't match the draw command's,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to add an error! there?

pub(crate) fn mesh_material(&self, entity: MainEntity) -> UntypedAssetId {
match self.instances.get(&entity) {
Some(render_instance) => render_instance.asset_id,
None => AssetId::<StandardMaterial>::invalid().into(),
Copy link
Member

@tychedelia tychedelia Apr 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should extract this to a constant (in another PR) as we're using this pattern in a few places and the use of StandardMaterial is potentially confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we fix checkdocs then this can be merged.

@pcwalton pcwalton added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Apr 9, 2025
@superdump superdump added this pull request to the merge queue Apr 9, 2025
Merged via the queue into bevyengine:main with commit 6c61939 Apr 9, 2025
34 checks passed
@github-project-automation github-project-automation bot moved this to Done in Rendering Apr 9, 2025
mockersf pushed a commit that referenced this pull request Apr 9, 2025
…an associated race condition. (#18734)

Currently, `RenderMaterialInstances` and `RenderMeshMaterialIds` are
very similar render-world resources: the former maps main world meshes
to typed material asset IDs, and the latter maps main world meshes to
untyped material asset IDs. This is needlessly-complex and wasteful, so
this patch unifies the two in favor of a single untyped
`RenderMaterialInstances` resource.

This patch also fixes a subtle issue that could cause mesh materials to
be incorrect if a `MeshMaterial3d<A>` was removed and replaced with a
`MeshMaterial3d<B>` material in the same frame. The problematic pattern
looks like:

1. `extract_mesh_materials<B>` runs and, seeing the
`Changed<MeshMaterial3d<B>>` condition, adds an entry mapping the mesh
to the new material to the untyped `RenderMeshMaterialIds`.

2. `extract_mesh_materials<A>` runs and, seeing that the entity is
present in `RemovedComponents<MeshMaterial3d<A>>`, removes the entry
from `RenderMeshMaterialIds`.

3. The material slot is now empty, and the mesh will show up as whatever
material happens to be in slot 0 in the material data slab.

This commit fixes the issue by splitting out `extract_mesh_materials`
into *three* phases: *extraction*, *early sweeping*, and *late
sweeping*, which run in that order:

1. The *extraction* system, which runs for each material, updates
`RenderMaterialInstances` records whenever `MeshMaterial3d` components
change, and updates a change tick so that the following system will know
not to remove it.

2. The *early sweeping* system, which runs for each material, processes
entities present in `RemovedComponents<MeshMaterial3d>` and removes each
such entity's record from `RenderMeshInstances` only if the extraction
system didn't update it this frame. This system runs after *all*
extraction systems have completed, fixing the race condition.

3. The *late sweeping* system, which runs only once regardless of the
number of materials in the scene, processes entities present in
`RemovedComponents<ViewVisibility>` and, as in the early sweeping phase,
removes each such entity's record from `RenderMeshInstances` only if the
extraction system didn't update it this frame. At the end, the late
sweeping system updates the change tick.

Because this pattern happens relatively frequently, I think this PR
should land for 0.16.
github-merge-queue bot pushed a commit that referenced this pull request Apr 14, 2025
There's still a race resulting in blank materials whenever a material of
type A is added on the same frame that a material of type B is removed.
PR #18734 improved the situation, but ultimately didn't fix the race
because of two issues:

1. The `late_sweep_material_instances` system was never scheduled. This
PR fixes the problem by scheduling that system.

2. `early_sweep_material_instances` needs to be called after *every*
material type has been extracted, not just when the material of *that*
type has been extracted. The `chain()` added during the review process
in PR #18734 broke this logic. This PR reverts that and fixes the
ordering by introducing a new `SystemSet` that contains all material
extraction systems.

I also took the opportunity to switch a manual reference to
`AssetId::<StandardMaterial>::invalid()` to the new
`DUMMY_MESH_MATERIAL` constant for clarity.

Because this is a bug that can affect any application that switches
material types in a single frame, I think this should be uplifted to
Bevy 0.16.
mockersf pushed a commit that referenced this pull request Apr 14, 2025
There's still a race resulting in blank materials whenever a material of
type A is added on the same frame that a material of type B is removed.
PR #18734 improved the situation, but ultimately didn't fix the race
because of two issues:

1. The `late_sweep_material_instances` system was never scheduled. This
PR fixes the problem by scheduling that system.

2. `early_sweep_material_instances` needs to be called after *every*
material type has been extracted, not just when the material of *that*
type has been extracted. The `chain()` added during the review process
in PR #18734 broke this logic. This PR reverts that and fixes the
ordering by introducing a new `SystemSet` that contains all material
extraction systems.

I also took the opportunity to switch a manual reference to
`AssetId::<StandardMaterial>::invalid()` to the new
`DUMMY_MESH_MATERIAL` constant for clarity.

Because this is a bug that can affect any application that switches
material types in a single frame, I think this should be uplifted to
Bevy 0.16.
jf908 pushed a commit to jf908/bevy that referenced this pull request May 12, 2025
…an associated race condition. (bevyengine#18734)

Currently, `RenderMaterialInstances` and `RenderMeshMaterialIds` are
very similar render-world resources: the former maps main world meshes
to typed material asset IDs, and the latter maps main world meshes to
untyped material asset IDs. This is needlessly-complex and wasteful, so
this patch unifies the two in favor of a single untyped
`RenderMaterialInstances` resource.

This patch also fixes a subtle issue that could cause mesh materials to
be incorrect if a `MeshMaterial3d<A>` was removed and replaced with a
`MeshMaterial3d<B>` material in the same frame. The problematic pattern
looks like:

1. `extract_mesh_materials<B>` runs and, seeing the
`Changed<MeshMaterial3d<B>>` condition, adds an entry mapping the mesh
to the new material to the untyped `RenderMeshMaterialIds`.

2. `extract_mesh_materials<A>` runs and, seeing that the entity is
present in `RemovedComponents<MeshMaterial3d<A>>`, removes the entry
from `RenderMeshMaterialIds`.

3. The material slot is now empty, and the mesh will show up as whatever
material happens to be in slot 0 in the material data slab.

This commit fixes the issue by splitting out `extract_mesh_materials`
into *three* phases: *extraction*, *early sweeping*, and *late
sweeping*, which run in that order:

1. The *extraction* system, which runs for each material, updates
`RenderMaterialInstances` records whenever `MeshMaterial3d` components
change, and updates a change tick so that the following system will know
not to remove it.

2. The *early sweeping* system, which runs for each material, processes
entities present in `RemovedComponents<MeshMaterial3d>` and removes each
such entity's record from `RenderMeshInstances` only if the extraction
system didn't update it this frame. This system runs after *all*
extraction systems have completed, fixing the race condition.

3. The *late sweeping* system, which runs only once regardless of the
number of materials in the scene, processes entities present in
`RemovedComponents<ViewVisibility>` and, as in the early sweeping phase,
removes each such entity's record from `RenderMeshInstances` only if the
extraction system didn't update it this frame. At the end, the late
sweeping system updates the change tick.

Because this pattern happens relatively frequently, I think this PR
should land for 0.16.
jf908 pushed a commit to jf908/bevy that referenced this pull request May 12, 2025
…engine#18825)

There's still a race resulting in blank materials whenever a material of
type A is added on the same frame that a material of type B is removed.
PR bevyengine#18734 improved the situation, but ultimately didn't fix the race
because of two issues:

1. The `late_sweep_material_instances` system was never scheduled. This
PR fixes the problem by scheduling that system.

2. `early_sweep_material_instances` needs to be called after *every*
material type has been extracted, not just when the material of *that*
type has been extracted. The `chain()` added during the review process
in PR bevyengine#18734 broke this logic. This PR reverts that and fixes the
ordering by introducing a new `SystemSet` that contains all material
extraction systems.

I also took the opportunity to switch a manual reference to
`AssetId::<StandardMaterial>::invalid()` to the new
`DUMMY_MESH_MATERIAL` constant for clarity.

Because this is a bug that can affect any application that switches
material types in a single frame, I think this should be uplifted to
Bevy 0.16.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants