Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove std::io::lazy::Lazy in favour of SyncOnceCell #77154

Merged
merged 5 commits into from
Sep 27, 2020

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Sep 24, 2020

The (internal) std::io::lazy::Lazy was used to lazily initialize the stdout and stdin buffers (and mutexes). It uses atexit() to register a destructor to flush the streams on exit, and mark the streams as 'closed'. Using the stream afterwards would result in a panic.

Stdout uses a LineWriter which contains a BufWriter that will flush the buffer on drop. This one is important to be executed during shutdown, to make sure no buffered output is lost. It also forbids access to stdout afterwards, since the buffer is already flushed and gone.

Stdin uses a BufReader, which does not implement Drop. It simply forgets any previously read data that was not read from the buffer yet. This means that in the case of stdin, the atexit() function's only effect is making stdin inaccessible to the program, such that later accesses result in a panic. This is uncessary, as it'd have been safe to access stdin during shutdown of the program.


This change removes the entire io::lazy module in favour of SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic operation) than locking a sys_common::Mutex on every access like Lazy did.

However, SyncOnceCell does not use atexit() to drop the contained object during shutdown.

As noted above, this is not a problem for stdin. It simply means stdin is now usable during shutdown.

The atexit() call for stdout is moved to the stdio module. Unlike the now-removed Lazy struct, SyncOnceCell does not have a 'gone and unusable' state that panics. Instead of adding this again, this simply replaces the buffer with one with zero capacity. This effectively flushes the old buffer and makes any writes afterwards pass through directly without touching a buffer, making print!() available during shutdown without panicking.


In addition, because the contents of the SyncOnceCell are no longer dropped, we can now use &'static instead of Arc in Stdout and Stdin. This also saves two levels of indirection in stdin() and stdout(), since Lazy effectively stored a Box<Arc<T>>, and SyncOnceCell stores the T directly.

@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 24, 2020
The (internal) std::io::lazy::Lazy was used to lazily initialize the
stdout and stdin buffers (and mutexes). It uses atexit() to register a
destructor to flush the streams on exit, and mark the streams as
'closed'. Using the stream afterwards would result in a panic.

Stdout uses a LineWriter which contains a BufWriter that will flush the
buffer on drop. This one is important to be executed during shutdown,
to make sure no buffered output is lost. It also forbids access to
stdout afterwards, since the buffer is already flushed and gone.

Stdin uses a BufReader, which does not implement Drop. It simply forgets
any previously read data that was not read from the buffer yet. This
means that in the case of stdin, the atexit() function's only effect is
making stdin inaccessible to the program, such that later accesses
result in a panic. This is uncessary, as it'd have been safe to access
stdin during shutdown of the program.

---

This change removes the entire io::lazy module in favour of
SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic
operation) than locking a sys_common::Mutex on every access like Lazy
did.

However, SyncOnceCell does not use atexit() to drop the contained object
during shutdown.

As noted above, this is not a problem for stdin. It simply means stdin
is now usable during shutdown.

The atexit() call for stdout is moved to the stdio module. Unlike the
now-removed Lazy struct, SyncOnceCell does not have a 'gone and
unusable' state that panics. Instead of adding this again, this simply
replaces the buffer with one with zero capacity. This effectively
flushes the old buffer *and* makes any writes afterwards pass through
directly without touching a buffer, making print!() available during
shutdown without panicking.
@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 24, 2020

Two interesting minor differences before/after:


1:

use std::{thread, io, mem, time::Duration};

fn main() {
    print!("hello");
    thread::spawn(|| mem::forget(io::stdout()));
    thread::sleep(Duration::from_secs(1));
}

Before this change, this outputs nothing: The Arc count does not drop back to 0, the stdout buffer is not dropped/flushed.
After this change, this properly flushes the stream, and outputs hello. Improvement!


2:

use std::{thread, io, mem, time::Duration};

fn main() {
    print!("hello");
    thread::spawn(|| mem::forget(io::stdout().lock()));
    thread::sleep(Duration::from_secs(1));
}

Before this change, this outputs hello, while after this change, this no longer flushes the stream and outputs nothing. :( However: This only worked because the whole ReentrantMutex in the Arc was dropped (thus dropping/flushing the contained buffer), even though the ReentrantMutex inside there was still locked (since a guard was leaked). Strictly speaking, this would've been unsound: destroying a locked mutex is undefined behaviour according to posix. (Although, this seems safe on all existing implementations. See #31936.)

(Adding a print!() at the end of main here would've deadlocked, both before and after this change.)

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 24, 2020

@rustbot modify labels: +T-libs +A-io

@rustbot rustbot added A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 24, 2020
Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amazing to me. Thank you @m-ou-se!

@dtolnay
Copy link
Member

dtolnay commented Sep 25, 2020

To get some eyes on the externally observable behavior changes in #77154 (comment):
@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Sep 25, 2020

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Sep 25, 2020
@withoutboats
Copy link
Contributor

IMO all the externally visible differences are bug fixes. This is great!

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Sep 25, 2020
@rfcbot
Copy link

rfcbot commented Sep 25, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@dtolnay
Copy link
Member

dtolnay commented Sep 25, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Sep 25, 2020

📌 Commit 6f9c132 has been approved by dtolnay

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 25, 2020
@RalfJung
Copy link
Member

Maybe add comments explaining why this is ignored?

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 26, 2020

Maybe add comments explaining why this is ignored?

Done.

It might be the case that adding this in the test would also solve it:

#![cfg_attr(target_os = "emscripten", feature(link_args))]
#[cfg(target_os = "emscripten")]
#[allow(unused_attributes)]
#[link_args = "-s EXIT_RUNTIME=1"]
extern "C" {}

But I'm unable to test this locally. And I don't think this test really matters on emscripten.

Edit: Tested it. Kind of works, but still fails: with EXIT_RUNTIME enabled Emscripten appends a newline by itself, failing the test. So I'll just leave the // ignore-emscripten there.

@RalfJung
Copy link
Member

@bors r=dtolnay rollup=iffy

@bors
Copy link
Contributor

bors commented Sep 26, 2020

📌 Commit 6b8b9c4 has been approved by dtolnay

@bors
Copy link
Contributor

bors commented Sep 27, 2020

⌛ Testing commit 6b8b9c4 with merge c9e5e6a...

@bors
Copy link
Contributor

bors commented Sep 27, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: dtolnay
Pushing c9e5e6a to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 27, 2020
@bors bors merged commit c9e5e6a into rust-lang:master Sep 27, 2020
@rustbot rustbot added this to the 1.48.0 milestone Sep 27, 2020
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Oct 5, 2020
@spastorino spastorino removed the to-announce Announce this issue on triage meeting label Oct 8, 2020
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Oct 13, 2020
…mutex-attr-cleanup, r=Mark-Simulacrum

Remove unnecessary rustc_const_stable attributes.

These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up.

@rustbot modify labels: +T-libs +C-cleanup
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Oct 13, 2020
…mutex-attr-cleanup, r=Mark-Simulacrum

Remove unnecessary rustc_const_stable attributes.

These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up.

@rustbot modify labels: +T-libs +C-cleanup
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Oct 13, 2020
…mutex-attr-cleanup, r=Mark-Simulacrum

Remove unnecessary rustc_const_stable attributes.

These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up.

@rustbot modify labels: +T-libs +C-cleanup
@m-ou-se m-ou-se deleted the lazy-stdio branch November 23, 2020 22:17
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 10, 2020
…=Mark-Simulacrum

Enforce no-move rule of ReentrantMutex using Pin and fix UB in stdio

A `sys_common::ReentrantMutex` may not be moved after initializing it with `.init()`. This was not enforced, but only stated as a requirement in the comments on the unsafe functions. This change enforces this no-moving rule using `Pin`, by changing `&self` to a `Pin` in the `init()` and `lock()` functions.

This uncovered a bug I introduced in rust-lang#77154: stdio.rs (the only user of ReentrantMutex) called `init()` on its ReentrantMutexes while constructing them in the intializer of `SyncOnceCell::get_or_init`, which would move them afterwards. Interestingly, the ReentrantMutex unit tests already had the same bug, so this invalid usage has been tested on all (CI-tested) platforms for a long time. Apparently this doesn't break badly on any of the major platforms, but it does break the rules.\*

To be able to keep using SyncOnceCell, this adds a `SyncOnceCell::get_or_init_pin` function, which makes it possible to work with pinned values inside a (pinned) SyncOnceCell. Whether this function should be public or not and what its exact behaviour and interface should be if it would be public is something I'd like to leave for a separate issue or PR. In this PR, this function is internal-only and marked with `pub(crate)`.

\* Note: That bug is now included in 1.48, while this patch can only make it to ~~1.49~~ 1.50. We should consider the implications of 1.48 shipping with a wrong usage of `pthread_mutex_t` / `CRITICAL_SECTION` / .. which technically invokes UB according to their specification. The risk is very low, considering the objects are not 'used' (locked) before the move, and the ReentrantMutex unit tests have verified this works fine in practice.

Edit: This has been backported and included in 1.48. And soon 1.49 too.

---

In future changes, I want to push this usage of Pin further inside `sys` instead of only `sys_common`, and apply it to all 'unmovable' objects there (`Mutex`, `Condvar`, `RwLock`). Also, while `sys_common`'s mutexes and condvars are already taken care of by rust-lang#77147 and rust-lang#77648, its `RwLock` should still be made movable or get pinned.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants