-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Unsafe fields #3458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Unsafe fields #3458
Changes from 1 commit
95dd295
fa712be
46766ed
bb6e76a
aafd5f4
ed5ceb4
3f851ae
9908419
247bc41
2d7a512
000075f
1d0d8be
e8aebe2
9521d12
27628e0
f04701b
ed920e9
cf0516b
aa0c333
322c4a8
a5cc19f
b229496
f4426ea
f4e85fb
94c173a
462cee4
d1d7388
f5a82f8
d4cfd14
7769117
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,137 @@ | ||
| - Feature Name: `unsafe_fields` | ||
| - Start Date: 2023-07-13 | ||
| - RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
| - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
|
||
| # Summary | ||
|
|
||
| Fields may be declared `unsafe`. Unsafe fields may only be mutated (excluding interior mutability) | ||
jswrenn marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| or initialized in an unsafe context. Reading the value of an unsafe field may occur in either safe | ||
| or unsafe contexts. An unsafe field may be relied upon as a safety invariant in other unsafe code. | ||
|
|
||
| # Motivation | ||
|
|
||
| Emphasis on safety is a key strength of Rust. A major point here is that any code path that can | ||
| result in undefined behavior must be explicitly marked with the `unsafe` keyword. However, the | ||
| current system is insufficient. While Rust provides the `unsafe` keyword at the function level, | ||
| there is currently no mechanism to mark fields as `unsafe`. | ||
|
|
||
| For a real-world example, consider the `Vec` type in the standard library. It has a `len` field that | ||
| is used to store the number of elements present. Setting this field is exposed publicly in the | ||
| `Vec::set_len` method, which has safety requirements: | ||
|
|
||
| - `new_len` must be less than or equal to `capacity()`. | ||
| - The elements at `old_len..new_len` must be initialized. | ||
|
|
||
| This field is safe to read, but unsafe to mutate or initialize due to the invariants. These | ||
| invariants cannot be expressed in the type system, so they must be enforced manually. Failure to do | ||
| so may result in undefined behavior elsewhere in `Vec`. | ||
|
|
||
| By introducing unsafe fields, Rust can improve the situation where a field that is otherwise safe is | ||
| used as a safety invariant. | ||
|
|
||
| # Guide-level explanation | ||
|
|
||
| Fields may be declared `unsafe`. Unsafe fields may only be initialized or accessed mutably in an | ||
| unsafe context. Reading the value of an unsafe field may occur in either safe or unsafe contexts. An | ||
| unsafe field may be relied upon as a safety invariant in other unsafe code. | ||
|
|
||
| Here is an example to illustrate usage: | ||
|
|
||
| ```rust | ||
| struct Foo { | ||
| safe_field: u32, | ||
| /// Safety: Value must be an odd number. | ||
| unsafe unsafe_field: u32, | ||
| } | ||
|
|
||
| // Unsafe field initialization requires an `unsafe` block. | ||
| // Safety: `unsafe_field` is odd. | ||
| let mut foo = unsafe { | ||
|
||
| Foo { | ||
| safe_field: 0, | ||
| unsafe_field: 1, | ||
| } | ||
| }; | ||
|
|
||
| // Safe field: no unsafe block. | ||
| foo.safe_field = 1; | ||
|
|
||
| // Unsafe field with mutation: unsafe block is required. | ||
| // Safety: The value is odd. | ||
| unsafe { foo.unsafe_field = 3; } | ||
|
|
||
| // Unsafe field without mutation: no unsafe block. | ||
| println!("{}", foo.unsafe_field); | ||
| ``` | ||
|
|
||
| For a full description of where a mutable access is considered to have occurred (and why), see | ||
| [RFC 3323]. Keep in mind that due to reborrowing, a mutable access of an unsafe field is not | ||
| necessarily explicit. | ||
|
|
||
| [RFC 3323]: https://rust-lang.github.io/rfcs/3323-restrictions.html#where-does-a-mutation-occur | ||
|
|
||
| ```rust | ||
| fn change_unsafe_field(foo: &mut Foo) { | ||
| // Safety: An odd number plus two remains an odd number. | ||
| unsafe { foo.unsafe_field += 2; } | ||
jswrenn marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| } | ||
| ``` | ||
|
|
||
| # Reference-level explanation | ||
|
|
||
| ## Syntax | ||
|
|
||
| Using the syntax from [the reference for structs][struct syntax], the change needed to support | ||
| unsafe fields is minimal. | ||
|
|
||
| [struct syntax]: https://doc.rust-lang.org/stable/reference/items/structs.html#structs | ||
|
|
||
| ```diff | ||
| StructField : | ||
| OuterAttribute* | ||
| Visibility? | ||
| + unsafe? | ||
| IDENTIFIER : Type | ||
|
|
||
| TupleField : | ||
| OuterAttribute* | ||
| Visibility? | ||
| + unsafe? | ||
| Type | ||
| ``` | ||
|
|
||
| ## Behavior | ||
|
|
||
| An unsafe field may only be mutated or initialized in an unsafe context. Failure to do so is a compile error. | ||
|
|
||
| ## "Mutable use" in the compiler | ||
|
|
||
| The concept of a "mutable use" [already exists][mutating use] within the compiler. This catches all | ||
| situations that are relevant here, including `ptr::addr_of_mut!`, `&mut`, and direct assignment to a | ||
|
||
| field, while excluding interior mutability. As such, formal semantics of what constitutes a "mutable | ||
| use" are not stated here. | ||
|
|
||
| [mutating use]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/visit/enum.PlaceContext.html#method.is_mutating_use | ||
|
|
||
| # Drawbacks | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would add something about the proliferation of This isn't necessarily a bad thing, but it does seem like it starts uplifting There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If a crate uses no unsafe, then it has no need for unsafe fields. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The proposal is guided toward more broadly preventing logic errors rather than strictly what is now known as UB via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I stand corrected on the flags case - but the RFC example is using these fields to ensure that a value is odd with no other There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fair. The example of an even number isn't the best, as it's contrived. There's an unstated assumption that it's being relied upon as a safety invariant somewhere else in the program. I can definitely improve that example. |
||
|
|
||
| - Additional syntax for macros to handle | ||
| - More syntax to learn | ||
|
|
||
| # Prior art | ||
|
|
||
| Some items in the Rust standard library have `#[rustc_layout_scalar_valid_range_start]`, | ||
| `#[rustc_layout_scalar_valid_range_end]`, or both. These items have identical behavior to that of | ||
| unsafe fields described here. It is likely (though not required by this RFC) that these items will | ||
| be required to use unsafe fields, which would reduce special-casing of the standard library. | ||
jswrenn marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| # Unresolved questions | ||
|
|
||
| - If the syntax for restrictions does not change, what is the ordering of keywords on a field that | ||
| is both unsafe and mut-restricted? | ||
| - Are there any interactions or edge cases with other language features that need to be considered? | ||
|
|
||
| # Future possibilities | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As this RFC is written, and as unstably implemented,
I scanned the discussion but did not see any mention of supporting an unsafe-to-modify-only variety. Is that something that could be a future possibility? Or has it been considered and rejected? (One could argue that it is pointless because, in such cases, one can always provide a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I argue the same thing here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That is true. Since this RFC is the first major extension of field safety tooling since 1.0, we take a syntactically and semantically conservative approach: The keyword for denoting an invariant is simply
It is true that there are a lot of use-cases in which mutation is specifically problematic, but there are also a lot of use-cases for which other uses are problematic, too. Lately, I've been reviewing more of the latter than the former! In any case, this RFC is aimed at folks who want to be very diligent about their use of fields that carry invariants. I've found that it's no so hard to write "
Whoops, I thought I had written about this somewhere. An unsafe-to-modify-only variety is difficult to support well due to the possibility of interior mutation.
Perhaps! If Rust has more information about the kind of invariant the field has, its safety analysis can be more refined and tailored to the restrictions imposed by that invariant. We might also pair additional syntactic information with facts known about the type, like whether or not it contains There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. see also: #3458 (comment) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That (and the thing about interior mutability being a hazard) makes sense to me. However, getters impede borrow splitting (unless and until we get some kind of subsetting type for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @moulins It's impossible to determine whether or not a field can be written through an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In fact, because of the moving-out problem, the wrapper struct solution can enable better scoped Compare these two examples (for simplicity, we assume that // With unsafe-to-read fields
mod string_stuff {
/// Safe to call even if `input` is not valid UTF-8
pub fn process_sketchy_string(input: String) {
if core::str::from_utf8(input.as_bytes()).is_ok() {
dbg!(input);
} else {
println!("not valid utf8");
}
}
}
struct Foo {
unsafe maybe_invalid_utf8: String,
}
fn do_thing(foo: Foo) {
// SAFETY: we immediately pass to `process_sketchy_string()`,
// which is documented safe to call even when `s` is invalid UTF-8
let s = unsafe { foo.maybe_invalid_utf8 };
process_sketchy_string(s);
}
fn just_drop_the_string(foo: Foo) {
// SAFETY: we immediately drop the `String` without reading its contents
let s = unsafe { foo.maybe_invalid_utf8 };
drop(s);
}// With unsafe-to-write fields + wrapper struct
mod string_stuff {
pub fn process_sketchy_string(input: UnsafeReadWrapper<String>) {
let string: Option<String> = {
// SAFETY: we immediately check for UTF-8 validity
// before doing anything else with the string
let input = unsafe { input.into_inner() };
core::str::from_utf8(input.as_bytes())
.is_ok()
.then_some(input)
};
if let Some(string) = string {
dbg!(string);
} else {
println!("not valid utf8");
}
}
}
struct Foo {
unsafe maybe_invalid_utf8: UnsafeReadWrapper<String>,
}
fn do_thing(foo: Foo) {
let s = foo.maybe_invalid_utf8;
process_sketchy_string(s);
}
fn just_drop_the_string(foo: Foo) {
let s = foo.maybe_invalid_utf8;
drop(s);
}With the wrapper type solution, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Would be good if the RFC would contain this example, since the question is bound to come up again :)
This only works if you have a type that exactly matches the invariant you care for. Presumably There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The RFC now documents the alternatives of making moving safe, and copying safe: cf0516b There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Even if the call to |
||
|
|
||
| ?? | ||
|
||
Uh oh!
There was an error while loading. Please reload this page.