From eb65e394f6f5161031fd077d7c2299eac987eb87 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 6 Jan 2015 21:52:46 +0200 Subject: [PATCH 01/11] Added RFC: Lifetime parameters for unsafe pointer conversions --- text/0000-raw-lifetime-anchors.md | 136 ++++++++++++++++++++++++++++++ 1 file changed, 136 insertions(+) create mode 100644 text/0000-raw-lifetime-anchors.md diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md new file mode 100644 index 00000000000..456a914120b --- /dev/null +++ b/text/0000-raw-lifetime-anchors.md @@ -0,0 +1,136 @@ +- Start Date: 2015-01-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Provide more flexible, less error-prone lifetime support +for unsafe pointer conversions throughout the libraries. + +The currently tedious: +```rust +let s = std::slice::from_raw_buf(&ptr, len) +std::mem::copy_lifetime(self, s) +``` +Becomes: +```rust +std::slice::from_raw_buf_with_lifetime(ptr, len, self) +``` +Or, after a breaking change of the current API: +```rust +std::slice::from_raw_buf(ptr, len, self) +``` + +Static lifetime can be easy too: +```rust +std::slice::from_raw_buf(ptr, len, std::mem::STATIC) +``` + +# Motivation + +The current library convention on functions constructing lifetime-bound +values from raw pointers has the pointer passed by reference, which +reference's lifetime is carried over to the return value. +Unfortunately, the lifetime of a raw pointer is often not indicative +of the lifetime of the pointed-to data. This may lead to mistakes +since the acquired reference crosses into the "safe" domain without +much indication in the code, while the pointed-to value may become +invalid at any time. + +A typical use case where the lifetime needs to be adjusted is +in bindings to a foregn library, when returning a reference to an object's +inner value (we know from the library's API contract that +the inner data's lifetime is bound to the containing object): +```rust +impl Outer { + fn inner_str(&self) -> &[u8] { + unsafe { + let p = ffi::outer_get_inner_str(&self.raw); + let s = std::slice::from_raw_buf(p, libc::strlen(p)); + std::mem::copy_lifetime(self, s) + } + } +} +``` + +And here's a plausible gotcha: +```rust +let foo = unsafe { ffi::new_foo() }; +let s = unsafe { std::slice::from_raw_buf(&foo.data, foo.len) }; +// s lives as long as foo + +// some lines later + +unsafe { ffi::free_foo(foo) }; + +// more lines later, in perfectly safe-looking code: + +let guess_what = s[0]; +``` + +# Detailed design + +The signature of `from_raw*` constructors is changed: the raw pointer is +passed by value, and a generic reference argument is appended to work as a +lifetime anchor (the value can be anything and is ignored): + +```rust +fn from_raw_buf_with_lifetime<'a, T, Sized? U>(ptr: *const T, len: uint, + life_anchor: &'a U) + -> &'a [T] +``` +```rust +fn from_raw_mut_buf_with_lifetime<'a, T, Sized? U>(ptr: *mut T, len: uint, + life_anchor: &'a U) + -> &'a mut [T] +``` + +The existing constructors can be deprecated, to open a migration +path towards reusing their shorter names with the new signatures +when the time to break the API is right. + +The current usage can be mechanically changed: +```rust +let s = std::slice::from_raw_buf(&ptr, len) +``` +to: +```rust +std::slice::from_raw_buf_with_lifetime(ptr, len, &ptr) +``` +However, it's better to try to find a more appropriate lifetime anchor +for each use. + +## Fix copy_mut_lifetime + +While we are at it, the first parameter of `std::mem::copy_mut_lifetime` +could be made a non-mutable reference. There is no reason for the lifetime +anchor to be mutable: the pointer's mutability is usually the relevant +question, and it's an unsafe function to begin with. This wart makes +code tedious, mut-happy, or transmute-happy in cases when a container +providing the lifetime for a mutable view into its contents is not itself +necessarily mutable. + +## Anchor for static references + +To facilitate conversion of pointers to static references, an anchor constant +is provided in `std::mem`: + +```rust +pub const STATIC: &'static () = &(); +``` + +# Drawbacks + +The proposal adds new functions to the library for something that is +already doable, albeit with some inconvenience. Replacement of the existing +constructors, if approved, is a breaking change. + +# Alternatives + +Not adding the lifetime-anchored conversion functions, continuing to use the +current convention despite its problems and inconvenience. + +# Unresolved questions + +Should the existing by-reference constructors be deprecated and eventually +removed? If yes, should this change be done before 1.0? From f5872d5d814695d6c4e714e6f23a466721f9d403 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 6 Jan 2015 22:13:22 +0200 Subject: [PATCH 02/11] Small rewordings --- text/0000-raw-lifetime-anchors.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index 456a914120b..f057b48be02 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -28,7 +28,7 @@ std::slice::from_raw_buf(ptr, len, std::mem::STATIC) # Motivation -The current library convention on functions constructing lifetime-bound +The current library convention on functions constructing borrowed values from raw pointers has the pointer passed by reference, which reference's lifetime is carried over to the return value. Unfortunately, the lifetime of a raw pointer is often not indicative @@ -105,8 +105,8 @@ for each use. While we are at it, the first parameter of `std::mem::copy_mut_lifetime` could be made a non-mutable reference. There is no reason for the lifetime anchor to be mutable: the pointer's mutability is usually the relevant -question, and it's an unsafe function to begin with. This wart makes -code tedious, mut-happy, or transmute-happy in cases when a container +question, and it's an unsafe function to begin with. This wart may +breed tedious, mut-happy, or transmute-happy code, when e.g. a container providing the lifetime for a mutable view into its contents is not itself necessarily mutable. From 858613615074bd3ccbd574201a407777d65b67df Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 6 Jan 2015 22:15:15 +0200 Subject: [PATCH 03/11] A tab crept in --- text/0000-raw-lifetime-anchors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index f057b48be02..4c2f4a4fbdc 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -46,7 +46,7 @@ impl Outer { fn inner_str(&self) -> &[u8] { unsafe { let p = ffi::outer_get_inner_str(&self.raw); - let s = std::slice::from_raw_buf(p, libc::strlen(p)); + let s = std::slice::from_raw_buf(p, libc::strlen(p)); std::mem::copy_lifetime(self, s) } } From 6fd5affcb1e4972cf45ba19eca6bcec5e695580e Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 6 Jan 2015 22:47:25 +0200 Subject: [PATCH 04/11] Small readability change --- text/0000-raw-lifetime-anchors.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index 4c2f4a4fbdc..a3c506ebe99 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -89,11 +89,12 @@ The existing constructors can be deprecated, to open a migration path towards reusing their shorter names with the new signatures when the time to break the API is right. -The current usage can be mechanically changed: +The current usage can be mechanically changed. + ```rust let s = std::slice::from_raw_buf(&ptr, len) ``` -to: +becomes: ```rust std::slice::from_raw_buf_with_lifetime(ptr, len, &ptr) ``` From ab84b27e8fc29e03099dfdfdfd489f9119ea9981 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 10 Jan 2015 21:06:03 +0200 Subject: [PATCH 05/11] Cleared the placeholder text in the header --- text/0000-raw-lifetime-anchors.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index a3c506ebe99..6f5ea58735c 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-06 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: +- Rust Issue: # Summary From 839e6a325e843d305208e78c23730353fbfb1247 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 10 Jan 2015 21:10:15 +0200 Subject: [PATCH 06/11] Added a note about using `""` as the lifetime anchor --- text/0000-raw-lifetime-anchors.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index 6f5ea58735c..c9c3ba85ca1 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -120,6 +120,9 @@ is provided in `std::mem`: pub const STATIC: &'static () = &(); ``` +A simple `""` works just as well, while not being as descriptive +regarding its purpose. + # Drawbacks The proposal adds new functions to the library for something that is From c9e5704204ddeee593af84bdbd654bbf82fc2a89 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 24 Jan 2015 07:56:37 +0200 Subject: [PATCH 07/11] Described the alternative of not providing input lifetimes At the behest of Alex Crichton. --- text/0000-raw-lifetime-anchors.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index c9c3ba85ca1..7367a2d141b 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -131,8 +131,16 @@ constructors, if approved, is a breaking change. # Alternatives -Not adding the lifetime-anchored conversion functions, continuing to use the -current convention despite its problems and inconvenience. +The `from_raw*` functions can lose input-derived lifetimes altogether, +reverting to an earlier design: +```rust +pub unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: usize) -> &'a T +``` +Such functions would be usable without explicit type annotation and unelided +lifetime parameters only in another function's return value context. For other +uses, wrapper functions would often be created as workarounds. + +The status quo convention can be used despite its problems and inconvenience. # Unresolved questions From b3c34dbfcafd047fe5c15ff62d972e454ab02ca1 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 26 Jan 2015 19:25:36 +0200 Subject: [PATCH 08/11] Corrected the analysis on the anchor-less alternative --- text/0000-raw-lifetime-anchors.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md index 7367a2d141b..90f941cbb5e 100644 --- a/text/0000-raw-lifetime-anchors.md +++ b/text/0000-raw-lifetime-anchors.md @@ -136,9 +136,9 @@ reverting to an earlier design: ```rust pub unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: usize) -> &'a T ``` -Such functions would be usable without explicit type annotation and unelided -lifetime parameters only in another function's return value context. For other -uses, wrapper functions would often be created as workarounds. +In such form the lifetime can be inferred from the use of the return value. +This has even worse potential for unintentional misuse than the current +design. The status quo convention can be used despite its problems and inconvenience. From 4d67b849dd839f4e628d6b70f65b315c92622f5f Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 31 Jan 2015 11:36:30 +0200 Subject: [PATCH 09/11] Drop the anchors! Rewritten the RFC to reflect the emerging consensus. --- text/0000-raw-lifetime-anchors.md | 148 ------------------------------ text/0000-raw-lifetime.md | 130 ++++++++++++++++++++++++++ 2 files changed, 130 insertions(+), 148 deletions(-) delete mode 100644 text/0000-raw-lifetime-anchors.md create mode 100644 text/0000-raw-lifetime.md diff --git a/text/0000-raw-lifetime-anchors.md b/text/0000-raw-lifetime-anchors.md deleted file mode 100644 index 90f941cbb5e..00000000000 --- a/text/0000-raw-lifetime-anchors.md +++ /dev/null @@ -1,148 +0,0 @@ -- Start Date: 2015-01-06 -- RFC PR: -- Rust Issue: - -# Summary - -Provide more flexible, less error-prone lifetime support -for unsafe pointer conversions throughout the libraries. - -The currently tedious: -```rust -let s = std::slice::from_raw_buf(&ptr, len) -std::mem::copy_lifetime(self, s) -``` -Becomes: -```rust -std::slice::from_raw_buf_with_lifetime(ptr, len, self) -``` -Or, after a breaking change of the current API: -```rust -std::slice::from_raw_buf(ptr, len, self) -``` - -Static lifetime can be easy too: -```rust -std::slice::from_raw_buf(ptr, len, std::mem::STATIC) -``` - -# Motivation - -The current library convention on functions constructing borrowed -values from raw pointers has the pointer passed by reference, which -reference's lifetime is carried over to the return value. -Unfortunately, the lifetime of a raw pointer is often not indicative -of the lifetime of the pointed-to data. This may lead to mistakes -since the acquired reference crosses into the "safe" domain without -much indication in the code, while the pointed-to value may become -invalid at any time. - -A typical use case where the lifetime needs to be adjusted is -in bindings to a foregn library, when returning a reference to an object's -inner value (we know from the library's API contract that -the inner data's lifetime is bound to the containing object): -```rust -impl Outer { - fn inner_str(&self) -> &[u8] { - unsafe { - let p = ffi::outer_get_inner_str(&self.raw); - let s = std::slice::from_raw_buf(p, libc::strlen(p)); - std::mem::copy_lifetime(self, s) - } - } -} -``` - -And here's a plausible gotcha: -```rust -let foo = unsafe { ffi::new_foo() }; -let s = unsafe { std::slice::from_raw_buf(&foo.data, foo.len) }; -// s lives as long as foo - -// some lines later - -unsafe { ffi::free_foo(foo) }; - -// more lines later, in perfectly safe-looking code: - -let guess_what = s[0]; -``` - -# Detailed design - -The signature of `from_raw*` constructors is changed: the raw pointer is -passed by value, and a generic reference argument is appended to work as a -lifetime anchor (the value can be anything and is ignored): - -```rust -fn from_raw_buf_with_lifetime<'a, T, Sized? U>(ptr: *const T, len: uint, - life_anchor: &'a U) - -> &'a [T] -``` -```rust -fn from_raw_mut_buf_with_lifetime<'a, T, Sized? U>(ptr: *mut T, len: uint, - life_anchor: &'a U) - -> &'a mut [T] -``` - -The existing constructors can be deprecated, to open a migration -path towards reusing their shorter names with the new signatures -when the time to break the API is right. - -The current usage can be mechanically changed. - -```rust -let s = std::slice::from_raw_buf(&ptr, len) -``` -becomes: -```rust -std::slice::from_raw_buf_with_lifetime(ptr, len, &ptr) -``` -However, it's better to try to find a more appropriate lifetime anchor -for each use. - -## Fix copy_mut_lifetime - -While we are at it, the first parameter of `std::mem::copy_mut_lifetime` -could be made a non-mutable reference. There is no reason for the lifetime -anchor to be mutable: the pointer's mutability is usually the relevant -question, and it's an unsafe function to begin with. This wart may -breed tedious, mut-happy, or transmute-happy code, when e.g. a container -providing the lifetime for a mutable view into its contents is not itself -necessarily mutable. - -## Anchor for static references - -To facilitate conversion of pointers to static references, an anchor constant -is provided in `std::mem`: - -```rust -pub const STATIC: &'static () = &(); -``` - -A simple `""` works just as well, while not being as descriptive -regarding its purpose. - -# Drawbacks - -The proposal adds new functions to the library for something that is -already doable, albeit with some inconvenience. Replacement of the existing -constructors, if approved, is a breaking change. - -# Alternatives - -The `from_raw*` functions can lose input-derived lifetimes altogether, -reverting to an earlier design: -```rust -pub unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: usize) -> &'a T -``` -In such form the lifetime can be inferred from the use of the return value. -This has even worse potential for unintentional misuse than the current -design. - -The status quo convention can be used despite its problems and inconvenience. - -# Unresolved questions - -Should the existing by-reference constructors be deprecated and eventually -removed? If yes, should this change be done before 1.0? diff --git a/text/0000-raw-lifetime.md b/text/0000-raw-lifetime.md new file mode 100644 index 00000000000..50592df2f93 --- /dev/null +++ b/text/0000-raw-lifetime.md @@ -0,0 +1,130 @@ +- Start Date: 2015-01-06 +- RFC PR: +- Rust Issue: + +# Summary + +Establish a convention throughout the core libraries for unsafe functions +constructing references out of raw pointers. The goal is to improve usability +while promoting awareness of possible pitfalls with inferred lifetimes. + +# Motivation + +The current library convention on functions constructing borrowed +values from raw pointers has the pointer passed by reference, which +reference's lifetime is carried over to the return value. +Unfortunately, the lifetime of a raw pointer is often not indicative +of the lifetime of the pointed-to data. So the status quo eschews the +flexibility of inferring the lifetime from the usage, while falling short +of providing useful safety semantics in exchange. + +A typical case where the lifetime needs to be adjusted is in bindings +to a foregn library, when returning a reference to an object's +inner value (we know from the library's API contract that +the inner data's lifetime is bound to the containing object): +```rust +impl Outer { + fn inner_str(&self) -> &[u8] { + unsafe { + let p = ffi::outer_get_inner_str(&self.raw); + let s = std::slice::from_raw_buf(&p, libc::strlen(p)); + std::mem::copy_lifetime(self, s) + } + } +} +``` +Raw pointer casts also discard the lifetime of the original pointed-to value. + +# Detailed design + +The signature of `from_raw*` constructors will be changed back to what it +once was, passing a pointer by value: +```rust +unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: uint) -> &'a [T] +``` +The lifetime on the return value is inferred from the call context. + +The current usage can be mechanically changed, while keeping an eye on +possible lifetime leaks which need to be worked around by e.g. providing +safe helper functions establishing lifetime guarantees, as described below. + +## Document the unsafety + +In many cases, the lifetime parameter will come annotated or elided from the +call context. The example above, adapted to the new convention, is safe +despite lack of any explicit annotation: +```rust +impl Outer { + fn inner_str(&self) -> &[u8] { + unsafe { + let p = ffi::outer_get_inner_str(&self.raw); + std::slice::from_raw_buf(p, libc::strlen(p)) + } + } +} +``` + +In other cases, the inferred lifetime will not be correct: +```rust + let foo = unsafe { ffi::new_foo() }; + let s = unsafe { std::slice::from_raw_buf(foo.data, foo.len) }; + + // Some lines later + unsafe { ffi::free_foo(foo) }; + + // More lines later + let guess_what = s[0]; + // The lifetime of s is inferred to extend to the line above. + // That code told you it's unsafe, didn't it? +``` + +Given that the function is unsafe, the code author should exercise due care +when using it. However, the pitfall here is not readily apparent at the +place where the invalid usage happens, so it can be easily committed by an +inexperienced user, or inadvertently slipped in with a later edit. + +To mitigate this, the documentation on the reference-from-raw functions +should include caveats warning about possible misuse and suggesting ways to +avoid it. When an 'anchor' object providing the lifetime is available, the +best practice is to create a safe helper function or method, taking a +reference to the anchor object as input for the lifetime parameter, like in +the example above. The lifetime can also be explicitly assigned with +`std::mem::copy_lifetime` or `std::mem::copy_lifetime_mut`, or annotated when +possible. + +## Fix copy_mut_lifetime + +To improve composability in cases when the lifetime does need to be assigned +explicitly, the first parameter of `std::mem::copy_mut_lifetime` +should be made an immutable reference. There is no reason for the lifetime +anchor to be mutable: the pointer's mutability is usually the relevant +question, and it's an unsafe function to begin with. This wart may +breed tedious, mut-happy, or transmute-happy code, when e.g. a container +providing the lifetime for a mutable view into its contents is not itself +necessarily mutable. + +# Drawbacks + +The implicitly inferred lifetimes are unsafe in sneaky ways, so care is +required when using the borrowed values. + +Changing the existing functions is an API break. + +# Alternatives + +An earlier revision of this RFC proposed adding a generic input parameter to +determine the lifetime of the returned reference: +```rust +unsafe fn from_raw_buf<'a, T, U: Sized?>(ptr: *const T, len: uint, + life_anchor: &'a U) + -> &'a [T] +``` +However, an object with a suitable lifetime is not always available +in the context of the call. In line with the general trend in Rust libraries +to favor composability, `std::mem::copy_lifetime` and +`std::mem::copy_lifetime_mut` should be the principal methods to explicitly +adjust a lifetime. + +# Unresolved questions + +Should the change in function parameter signatures be done before 1.0? From dcf608f8ed86c7eb20a30662d9818255266d0b59 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 31 Jan 2015 11:58:07 +0200 Subject: [PATCH 10/11] Credit where due --- text/0000-raw-lifetime.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-raw-lifetime.md b/text/0000-raw-lifetime.md index 50592df2f93..b598613fbb3 100644 --- a/text/0000-raw-lifetime.md +++ b/text/0000-raw-lifetime.md @@ -128,3 +128,8 @@ adjust a lifetime. # Unresolved questions Should the change in function parameter signatures be done before 1.0? + +# Acknowledgements + +Thanks to Alex Crichton for shepherding this proposal in a constructive and +timely manner. He has in fact rationalized the convention in its current form. From 184daeecb08416b60d2243aee89fd5ccb2f08c5f Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 31 Jan 2015 12:04:53 +0200 Subject: [PATCH 11/11] Tiny editorial --- text/0000-raw-lifetime.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-raw-lifetime.md b/text/0000-raw-lifetime.md index b598613fbb3..b0c01d174b3 100644 --- a/text/0000-raw-lifetime.md +++ b/text/0000-raw-lifetime.md @@ -132,4 +132,4 @@ Should the change in function parameter signatures be done before 1.0? # Acknowledgements Thanks to Alex Crichton for shepherding this proposal in a constructive and -timely manner. He has in fact rationalized the convention in its current form. +timely manner. He has in fact rationalized the convention in its present form.