Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider switching to aborting on panic in hot ECS functions #12107

Open
james7132 opened this issue Feb 25, 2024 · 2 comments
Open

Consider switching to aborting on panic in hot ECS functions #12107

james7132 opened this issue Feb 25, 2024 · 2 comments
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Performance A change motivated by improving speed, memory usage or compile times P-Unsound A bug that results in undefined compiler behavior X-Controversial There is active debate or serious implications around merging this PR

Comments

@james7132
Copy link
Member

james7132 commented Feb 25, 2024

What problem does this solve or what need does it fill?

Some potentially very hot and heavily monomorphized ECS functions still have panics in the case something goes wrong. Panicking results in a lot of codegen, which can make binary sizes much larger, and can affect instruction fetch locality and thus performance in very hot ECS functions.

As seen via the asembly generation tests, there are still many locations where panics may crop up.

What solution would you like?

Create a utility called AbortOnPanic which turns all panics in the local scope into aborts, only in release mode.

This technique is used by a number of fairly low level crates like tokio or async-task, where a panic would otherwise be an artifact of incorrect safety or correctness-related invariants not being satisfied. Any location where it's calling into user provided code (i.e. trait functions turned into function pointers) must be wrapped in std::panic::catch_unwind to avoid aborting on user-provided code.

This may also be a correctness and soundness issue, since panicking on unsatisfied invariant can be caught via catch_unwind, and leave the ECS in an invalid and unsound state.

Note that this likely won't affect builds using panic = "abort" in their build profiles, but will likely improve the default --release configuration that cargo provides.

This should only be used in locations where we know that we're not calling into user provided code, and we should only be panicking

What alternative(s) have you considered?

Leaving it as is, panicking instead of aborting.

The other alternative is to use std::hint::unreachable_unchecked, which is both unsafe and very easy to result in undefined behavior is used incorrectly.

@james7132 james7132 added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times X-Controversial There is active debate or serious implications around merging this PR P-Unsound A bug that results in undefined compiler behavior labels Feb 25, 2024
@Bluefinger
Copy link
Contributor

If we have methods that have explicit panicking paths, we could reduce the inlining of panic paths by splitting out the panicking code into functions that are never inlined. Hitting a panic case is not something we want to optimise for or have the compiler inline that code so it should be fine. Using Entity::from_bits as an example to change:

    #[inline]
    pub const fn from_bits(bits: u64) -> Self {
        #[inline(never)]
        #[cold]
        #[track_caller]
        const fn invalid_entity() -> ! {
            panic!("Attempted to initialise invalid bits as an entity");
        }

        // Construct an Identifier initially to extract the kind from.
        let id = Self::try_from_bits(bits);

        match id {
            Ok(entity) => entity,
            Err(_) => invalid_entity(),
        }
    }

Doing this should prevent the inlining of the panic case and thus reduce the amount of assembly overall in the hot path. The panic won't get inlined so it just becomes a case where all instances of from_bits will make use of the panic function. And as a whole, should reduce the amount of code being generated while still having panicking behaviour.

Of course, we could still incorporate both approaches, as this is not a mutually exclusive approach.

@james7132
Copy link
Member Author

james7132 commented Feb 28, 2024

For the user-facing functions that panicking is part of their public interface, that makes sense.

This issue focuses on internal panics that are really there to check invariants that should never be violated like bounds checking a ArchetypeId versus Archetypes it is allocated from. In reality, we can replace the panic with std::hint::unreachable_unchecked, and, so long as the proper invariants hold, it would be sound.

Likewise, panics that can happen during structural mutations to the World must also checked as well, particularly where we're calling back into user-provided code (i.e. Droping components and resources, component hooks, etc.).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Performance A change motivated by improving speed, memory usage or compile times P-Unsound A bug that results in undefined compiler behavior X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

No branches or pull requests

2 participants