Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocking get #92

Closed
wants to merge 1 commit into from
Closed

Blocking get #92

wants to merge 1 commit into from

Conversation

matklad
Copy link
Owner

@matklad matklad commented Feb 5, 2020

The current API guarantees that .get never blocks, but I think that was a mistake.

Specifically, if code spawns threads in get_or_init and the spawned thread does get.unwrap, it migth panic, while it is reasonable to expect that it'll always see the initialized value (b/c initialization started before the thread was spawned). This sounds esoteric, but I did found a well motivated instance this in a real code base.

So, we should fix .get such that it blocks. The docstring for .get says that it is non-blocking, but I am willing to make a technically breaking change to the semantics. I feel that more code will be fixed by this change, and the fix does not worth causing the whole ecosystem to update Cargo.toml to 2.0.

@matklad
Copy link
Owner Author

matklad commented Feb 5, 2020

cc @KodrAus, this is probably a mistake that we don't want to do in stdlib 😅

Turns our, real world code can has **really** subtle bugs with
non-blocking semantics
@pitdicker
Copy link
Contributor

If I understand your change, you want to make get block if an initialization is running, and return None when the initialization is complete?

I can imagine that not making get blocking can cause really subtle bugs such as the one you encountered (?). On the other hand it also feels a bit like asking for it when spawning threads in the initialization closure...

Would there be any negative consequence to this change besides it being a 'technically' breaking change?

@manuthambi
Copy link
Contributor

manuthambi commented Feb 9, 2020

I can see that a blocking get can be useful sometimes. But personally, I like .get() being non-blocking. If the user wants blocking behavior, why not always use get_or_init()?

Regardless of the merits of whether get() should be blocking, I think changing the semantics without a semver change is very bad. It will subtly break (deadlock) reentrant code written carefully based on the previous documentation. Also implementations of PartialEq, Clone, Debug will all become blocking (These can be fixed to be non-blocking of course).

I feel like the change reduces bugs for the case when threads are created in the init block, and increases bugs for the case where threads are not created. The latter case is more common.

If the change is made, then we should provide a non-blocking version of get. Otherwise dealing with reentrancy will be much harder.

@matklad
Copy link
Owner Author

matklad commented Feb 9, 2020

Yeah, I am on the fence about this. I wouldn’t worry to much about backwards compatibility, I believe no one relies on the non-blocking guarantee, and I also now that this fixes at least one actual bug in existing code.

What I am worried more is that, if get is on the hot path, we inflate code size quite a bit, as we turn a load and branch into a function call. OTOH, the hot path is most likely get_or_init anyway.

I am also worried about making Clone, Debug and friends blocking.

But also note that get_or_init (or, more specifically, get_or_try_init) are not an answer as well, as the problem is within the Lazy type.

@matklad
Copy link
Owner Author

matklad commented Feb 9, 2020

Which actually makes me think: perhaps we should fix only Lazy?

@manuthambi
Copy link
Contributor

I just double checked. Lazy is not using get, only get_or_init.

I am curious as to why the code which had the bug you mentioned is using get instead of get_or_init?

@matklad
Copy link
Owner Author

matklad commented Feb 10, 2020

The code actually uses Lazy, which uses get_or_init, so the original code is in fact not buggy, it's just that I apparently don't know how Lazy works.

Still, I'd like to think more about this whole situation and, in particular, what should be the semantics of

cell.get_or_init(|| { cell.clone(); ... })

@manuthambi
Copy link
Contributor

Haha. Happens to me all the time :)

@RReverser
Copy link

I was looking through issues for a different reason, but this seems the most relevant. I was wondering if it would make sense to add some general handle type that would combine OnceCell and Condvar and make it possible to give a OnceCell to a different thread, and later .wait() on a handle until the OnceCell is populated.

It seems like OnceCell already has most of the necessary bits in place, but creating such wrapper requires access to internals, as Condvar can only wait on a mutex guard.

@matklad
Copy link
Owner Author

matklad commented May 13, 2020

@RReverser good suggestion! I've opened #102, and I think we should have it. I think we don't even need cond-var, as we already have thread::park based synchronisation internally.

@matklad
Copy link
Owner Author

matklad commented Jul 10, 2020

I think it is clear that we don't want to pursue this.

@matklad matklad closed this Jul 10, 2020
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 18, 2020
…crum

 Add lazy initialization primitives to std

Follow-up to rust-lang#68198

Current RFC: rust-lang/rfcs#2788

Rebased and fixed up a few of the dangling comments. Some notes carried over from the previous PR:

- [ ] Naming. I'm ok to just roll with the `Sync` prefix like `SyncLazy` for now, but [have a personal preference for `Atomic`](rust-lang/rfcs#2788 (comment)) like `AtomicLazy`.
- [x] [Poisoning](rust-lang/rfcs#2788 (comment)). It seems like there's [some regret around poisoning in other `std::sync` types that we might want to just avoid upfront for `std::lazy`, especially if that would align with a future `std::mutex` that doesn't poison](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/parking_lot.3A.3AMutex.20in.20std/near/190331199). Personally, if we're adding these types to `std::lazy` instead of `std::sync`, I'd be on-board with not worrying about poisoning in `std::lazy`, and potentially deprecating `std::sync::Once` and `lazy_static` in favour of `std::lazy` down the track if it's possible, rather than attempting to replicate their behavior. cc @Amanieu @sfackler.
- [ ] [Consider making`SyncOnceCell::get` blocking](matklad/once_cell#92). There doesn't seem to be consensus in the linked PR on whether or not that's strictly better than the non-blocking variant.

In general, none of these seem to be really blocking an initial unstable merge, so we could possibly kick off a FCP if y'all are happy?

cc @matklad @pitdicker have I missed anything, or were there any other considerations that have come up since we last looked at this?
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 18, 2020
…crum

 Add lazy initialization primitives to std

Follow-up to rust-lang#68198

Current RFC: rust-lang/rfcs#2788

Rebased and fixed up a few of the dangling comments. Some notes carried over from the previous PR:

- [ ] Naming. I'm ok to just roll with the `Sync` prefix like `SyncLazy` for now, but [have a personal preference for `Atomic`](rust-lang/rfcs#2788 (comment)) like `AtomicLazy`.
- [x] [Poisoning](rust-lang/rfcs#2788 (comment)). It seems like there's [some regret around poisoning in other `std::sync` types that we might want to just avoid upfront for `std::lazy`, especially if that would align with a future `std::mutex` that doesn't poison](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/parking_lot.3A.3AMutex.20in.20std/near/190331199). Personally, if we're adding these types to `std::lazy` instead of `std::sync`, I'd be on-board with not worrying about poisoning in `std::lazy`, and potentially deprecating `std::sync::Once` and `lazy_static` in favour of `std::lazy` down the track if it's possible, rather than attempting to replicate their behavior. cc @Amanieu @sfackler.
- [ ] [Consider making`SyncOnceCell::get` blocking](matklad/once_cell#92). There doesn't seem to be consensus in the linked PR on whether or not that's strictly better than the non-blocking variant.

In general, none of these seem to be really blocking an initial unstable merge, so we could possibly kick off a FCP if y'all are happy?

cc @matklad @pitdicker have I missed anything, or were there any other considerations that have come up since we last looked at this?
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 18, 2020
…crum

 Add lazy initialization primitives to std

Follow-up to rust-lang#68198

Current RFC: rust-lang/rfcs#2788

Rebased and fixed up a few of the dangling comments. Some notes carried over from the previous PR:

- [ ] Naming. I'm ok to just roll with the `Sync` prefix like `SyncLazy` for now, but [have a personal preference for `Atomic`](rust-lang/rfcs#2788 (comment)) like `AtomicLazy`.
- [x] [Poisoning](rust-lang/rfcs#2788 (comment)). It seems like there's [some regret around poisoning in other `std::sync` types that we might want to just avoid upfront for `std::lazy`, especially if that would align with a future `std::mutex` that doesn't poison](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/parking_lot.3A.3AMutex.20in.20std/near/190331199). Personally, if we're adding these types to `std::lazy` instead of `std::sync`, I'd be on-board with not worrying about poisoning in `std::lazy`, and potentially deprecating `std::sync::Once` and `lazy_static` in favour of `std::lazy` down the track if it's possible, rather than attempting to replicate their behavior. cc @Amanieu @sfackler.
- [ ] [Consider making`SyncOnceCell::get` blocking](matklad/once_cell#92). There doesn't seem to be consensus in the linked PR on whether or not that's strictly better than the non-blocking variant.

In general, none of these seem to be really blocking an initial unstable merge, so we could possibly kick off a FCP if y'all are happy?

cc @matklad @pitdicker have I missed anything, or were there any other considerations that have come up since we last looked at this?
@matklad matklad deleted the badtest branch November 11, 2020 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants