Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fallible allocator experiment #111970

Closed
wants to merge 15 commits into from
Closed

Conversation

pitaj
Copy link
Contributor

@pitaj pitaj commented May 26, 2023

No description provided.

@rustbot
Copy link
Collaborator

rustbot commented May 26, 2023

r? @thomcc

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 26, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@pitaj pitaj marked this pull request as ready for review June 6, 2023 01:44
@pitaj pitaj marked this pull request as draft June 6, 2023 02:21
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors

This comment was marked as outdated.

@lqd
Copy link
Member

lqd commented Jun 8, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 8, 2023
@bors
Copy link
Contributor

bors commented Jun 8, 2023

⌛ Trying commit 0357918 with merge 1a267bc685998caa7666938acb14302f1cd729b6...

@bors
Copy link
Contributor

bors commented Jun 8, 2023

☀️ Try build successful - checks-actions
Build commit: 1a267bc685998caa7666938acb14302f1cd729b6 (1a267bc685998caa7666938acb14302f1cd729b6)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (1a267bc685998caa7666938acb14302f1cd729b6): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [0.2%, 10.8%] 202
Regressions ❌
(secondary)
0.8% [0.1%, 2.4%] 87
Improvements ✅
(primary)
-0.9% [-1.3%, -0.3%] 6
Improvements ✅
(secondary)
-0.6% [-1.0%, -0.3%] 8
All ❌✅ (primary) 1.2% [-1.3%, 10.8%] 208

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.1% [1.1%, 9.1%] 21
Regressions ❌
(secondary)
2.1% [1.2%, 2.8%] 4
Improvements ✅
(primary)
-4.4% [-9.6%, -0.9%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.5% [-9.6%, 9.1%] 26

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.3% [1.2%, 9.6%] 59
Regressions ❌
(secondary)
2.2% [1.0%, 3.3%] 15
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 3.3% [1.2%, 9.6%] 59

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.1% [0.1%, 4.9%] 114
Regressions ❌
(secondary)
1.4% [0.5%, 2.2%] 83
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.0% [-0.3%, 4.9%] 115

Bootstrap: 648.715s -> 661.923s (2.04%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 8, 2023
@CAD97
Copy link
Contributor

CAD97 commented Jul 3, 2023

(drive-by comment, saw this since I'm involved with storages)

This is, indeed, very cool, but also a notable increase in incidental complexity that everybody needs to deal with, not just those using fallible-mode allocation. Everyone gets exposed to it, because the docs for

fn push_back(&mut self, value: T)

will now be instead docs for

fn push_back(
    &mut self,
    value: T,
) -> <<A as Allocator>::ErrorHandling as ErrorHandling>::Result<(), TryReserveError>

(See for example [Iterator::try_reduce], which uses a similar type alias to hide the double type projection with Try/Residual; the nonexported type alias is inlined.)

This sort of double projection is already currently under fire as needlessly complex and unprecedented (in std) with Residual. Here it is perhaps a bit more justified1, but it is also pretty significantly more egregious, since it impacts normal (and stable) functions.

It is certainly very cool that double type projection can enable Rust to model monadic type rebinding, but including it into stable Rust will be an uphill battle at best. -> A::Result<T, E> would be at least more palatable, since what that means is more immediately obvious to the ready than the double projection (thus the use of the private AllocResult alias here).

Footnotes

  1. With try_ style methods, there's always the option available to reify whatever Try type into Result temporarily to avoid the need to be monad-generic. Whereas here we're selecting between returning Result<T, E> or just T.

@pitaj
Copy link
Contributor Author

pitaj commented Jul 3, 2023

-> <<A as Allocator>::ErrorHandling as ErrorHandling>::Result<(), TryReserveError>

-> A::Result<T, E> would be at least more palatable

To be fair, I see no reason why A::Error handling::Result<T, E> shouldn't work eventually (compiler should see the bounds in the associated type and assume the trait is implemented in generic contexts at least) which is much cleaner. But yes, as it stands currently it's pretty painful.

I originally started with A::Result<T, E> but adding it to impl<A: Allocator + ?Sized> Allocator for &A seemed to be impossible.

@tgross35
Copy link
Contributor

Agreed that they look messy, but I'm wondering if this might be something that rustdoc could help make easier. Maybe something like #[doc(use_generic_defaults)] could be used to indicate that rustdoc should use the A = Global default when creating docs so the ugly boilerplate doesn't confuse a new user, but allow a viewer to check a box or something to see the fully generic, more verbose return values.

@CAD97
Copy link
Contributor

CAD97 commented Aug 7, 2023

I ended up rambling about possibilities in the storages RFC thread; linking that here in the possibility that it's of some use to someone. rust-lang/rfcs#3446 (comment)

TL;DR: documenting with a stricter bound for Box in std (e.g. A: Allocator<ErrorHandling = Fatal>) than is used for documenting Box in alloc (i.e. A: Allocator) builds off of existing concepts (i.e. that std is a facade crate around alloc and core) instead of introducing new ones (e.g. toggling rustdoc between documentation "layers" of being generic or (partially) instantiated).

To be fair, I see no reason why A::ErrorHandling::Result<T, E> shouldn't work eventually (compiler should see the bounds in the associated type and assume the trait is implemented in generic contexts at least)

I agree — qualifying the first projection already isn't necessary, but the compiler always complains that the second projection is ambiguous despite only one version of the name being resolvable — but it's worth noting that rustdoc currently just always uses the qualified form, because it doesn't have any concept of "in scope" traits for which to permit the unqualified projection. So unless rustdoc changes that rule somehow (e.g. considering generic bounds as the "in scope" traits to elide qualification for (which might be what you were saying) or just matching whatever the source uses), it's going to continue documenting using the qualified form even if the source uses an unqualified projection.

Projections aren't always guaranteed to be unambiguous even when they're on a generic type, because there could be a blanket-implemented trait in scope which is satisfied by the bound and provides the same associated name.

@ojeda
Copy link
Contributor

ojeda commented Aug 8, 2023

  1. An example alternate doc-only solution would be the ability to toggle the rustdoc API documentation between showing the signatures derived from A = Global, A: Allocator<INFALLIBLE = true>, A: Allocator<INFALLIBLE = false>, and the fully generic A: Allocator.

"Doc switches" are quite interesting because they could potentially be used for other purposes too. I suggested groups for hiding/collapsing particular API sets by default (e.g. for try_* methods), and @tgross35 and you mention filtering based on generic parameters which would be nice too. But it could also be used for non-allocator-related APIs cases, like hiding/collapsing by default advanced/uncommon/"low-level" APIs in libraries. It would be nice knowing if rustdoc would be willing to provide something like that, because it could impact this kind of decisions / API designs.

An interesting generalized version of this would be to have a wasm-compiled item name resolution engine (probably rust-analyzer's instead of rustc's) to be able to substitute in specific types and/or add bound refinements on generics and show how that impacts signatures directly in the documentation. That would certainly be an interesting project,

Sounds like providing sample template arguments for C++ IntelliSense in Visual Studio, but for Rust docs. There is a UI GIF at https://devblogs.microsoft.com/cppblog/template-intellisense/. Especially useful if one could save/pass a URL with those filters set in a fragment.

@pitaj
Copy link
Contributor Author

pitaj commented Aug 8, 2023

I think one thing we could do is just have separate impl blocks for the different forms of error handling:

impl<T, A> Vec<T, A>
where A: Allocator<ErrorHandling = Fatal>
{
    fn push(&mut self) { ... }
}

impl<T, A> Vec<T, A>
where A: Allocator<ErrorHandling = Fallible>
{
    fn push(&mut self) -> Result<(), TryReserveError> { ... }
}

That would keep the signatures more sane.

Rustdoc already splits function signatures by impl block. Maybe they could add a #[doc(collapsed))] attribute or something that we could put on the second impl block to collapse it by default.

(The above code currently results in a duplicate definition error but I don't see why it must)

@matthieu-m
Copy link
Contributor

Has any thought been given in moving fallibility outside of the Allocator trait?

How do deal with a failed allocation seems to be more of a matter of usecase, than a matter of allocator. Certainly, one can wrap a fallible allocator into an infallible one (aborting on failure) and vice versa (just Ok wrapping) but... considering that most allocators are fallible in the first place, this seems like a round-about way to go about things, and a conflation of responsibilities.

Mixing multiple responsibilities in a single entity generally violates the Single Responsibility Principle, which typically leads to undesirable consequences.

What if, instead, a policy-based design was used? That is:

  • All allocators may fail. No allocator can provides memory aligned to (1 << 63) bytes, even though it's a technically legal alignment.
  • A separator AllocationPolicy parameter is passed, which "mediates" allocation failure.

The policy can be as simple as taking the result of the allocation, and either returning just the pointer (infallible) or returning a Result. In terms of signature, this would mean:

impl<T, A, AFP> Vec<T, A, AFP>
where
     A: Allocator,
     AFP: AllocationFailurePolicy<Result = NonNull<u8>>,
{
    fn push(&mut self) { ... }
}

impl<T, A, AFP> Vec<T, A, AFP>
where
    A: Allocator,
    AFP: AllocationFailurePolicy<Result = Result<NonNull<u8>, AllocError>>,
{
    fn push(&mut self) -> Result<(), AllocError> { ... }
}

Which has the advantage that writers of allocators don't have to worry about, and users of allocators only need to learn about a handful of allocation failure policies no matter how many allocators they use: a typical M + N (vs M x N) benefit of separating orthogonal concepts.

Finally std::vec::Vec can be defined as:

#[doc(inline)]
pub Vec<T, A> = alloc::Vec<T, A, Abort>;

So that by default the doc for std remains approachable, while alloc::Vec exposes all the details.

@CAD97
Copy link
Contributor

CAD97 commented Aug 14, 2023

Error handling is restricted to be either Fatal or Fallible with no other available option by this RFC, so there's no M×N, just M+2. I've written A::Result<T>, but that's just shorthand for A::ErrorHandling::Result<T> (which is itself shorthand for the qualified projections).

Additionally, while all allocators can fail, certain allocators are more likely to fail than others. While the ErrorHandling associated type could be separated from the allocator, it's still a very related concept, and having a default associated eh mode isn't unreasonable; it's roughly like saying Vec<T, A = Global, EH = A::ErrorHandling> if using a separate strategy generic. It's also not sufficient to just have a non-GAT switch between NonNull<u8> and Result<NonNull<u8>, AllocError>; the ability to define GAT generic uses is the primary purpose of this experiment (at least specifically this iteraton thereof). The simple approach also has the problem of not working particularly well for #[cfg(no_global_oom_handling)].

A separate param does have the advantage of being easier to file off for the std facade, since it's just fixing a generic instead of refining a trait bound.

@CAD97
Copy link
Contributor

CAD97 commented Aug 14, 2023

I was tempted into trying that kind of scheme out, though, and the following worked, or at least compiled:

#![feature(never_type, try_blocks, yeet_expr)]

pub trait ErrorHandling: Sealed {
    #[must_use]
    type FailWith<E>;
    #[must_use]
    type Output<T, E>;

    fn cover<T, E>(result: Result<T, Self::FailWith<E>>) -> Self::Output<T, E>;
    fn fail<E>(error: E) -> Self::FailWith<E>;
    fn unwrap_with<T, E>(
        result: Result<T, E>,
        op: impl FnOnce(E) -> !,
    ) -> Result<T, Self::FailWith<E>>;
}

impl ErrorHandling for Fatal {
    type FailWith<E> = !;
    type Output<T, E> = T;

    fn cover<T, E>(result: Result<T, !>) -> Self::Output<T, E> {
        match result {
            Ok(result) => result,
            Err(never) => never,
        }
    }

    fn fail<E>(_: E) -> Self::FailWith<E> {
        unreachable!()
    }

    fn unwrap_with<T, E>(
        result: Result<T, E>,
        op: impl FnOnce(E) -> !,
    ) -> Result<T, Self::FailWith<E>> {
        result.map_err(op)
    }
}

impl ErrorHandling for Nonfatal {
    type FailWith<E> = E;
    type Output<T, E> = Result<T, E>;

    fn cover<T, E>(result: Result<T, E>) -> Self::Output<T, E> {
        result
    }

    fn fail<E>(error: E) -> Self::FailWith<E> {
        error
    }

    fn unwrap_with<T, E>(
        result: Result<T, E>,
        _: impl FnOnce(E) -> !,
    ) -> Result<T, Self::FailWith<E>> {
        result
    }
}

#[cfg(not(no_global_oom_handling))]
type DefaultEh = Fatal;
#[cfg(no_global_oom_handling)]
type DefaultEh = Nonfatal;

pub struct Vec<T, A: Allocator = Global, EH: ErrorHandling = DefaultEh> {
    ptr: NonNull<T>,
    len: usize,
    cap: usize,
    alloc: A,
    eh: PhantomData<EH>,
}

impl<T, A: Allocator, EH: ErrorHandling> Vec<T, A, EH> {
    pub fn with_capacity_in(cap: usize, alloc: A) -> EH::Output<Self, A> {
        EH::cover(
            try {
                let Ok(layout) = EH::unwrap_with(Layout::array::<T>(cap), |_| capacity_overflow())
                else {
                    do yeet EH::fail(alloc);
                };
                let Ok(ptr) =
                    EH::unwrap_with(alloc.allocate(layout), |_| handle_alloc_error(layout))
                else {
                    do yeet EH::fail(alloc);
                };
                Self {
                    ptr: ptr.cast(),
                    len: 0,
                    cap,
                    alloc,
                    eh: PhantomData,
                }
            },
        )
    }
}

It's not perfectly pretty, but it's another potential direction. Also I did try but #[doc(inline)] does nothing for type aliases currently, and is in fact a forwards-compat warning.

@Dylan-DPC Dylan-DPC added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 14, 2023
@Dylan-DPC
Copy link
Member

Closing this as it was an experiment and has conflicts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.