Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for str_char stabilization #27754

Closed
aturon opened this issue Aug 12, 2015 · 15 comments
Closed

Tracking issue for str_char stabilization #27754

aturon opened this issue Aug 12, 2015 · 15 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@aturon
Copy link
Member

aturon commented Aug 12, 2015

The str type offers a number of char-oriented methods, many of which can be replaced by uses of char_indices.

It's not clear how widely used these methods are, or whether there are significant downsides over using char_indices. This API area needs a comprehensive re-evaluation.

cc @SimonSapin

@aturon aturon added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Aug 12, 2015
@murarth
Copy link
Contributor

murarth commented Oct 29, 2015

This may be obvious, but I just want to point out that is_char_boundary is much more efficient than a solution involving iteration over char_indices. If nothing else under this feature gate is stabilized, I think at least this should method should be.

@SimonSapin
Copy link
Contributor

I’ve used str::is_char_boundary in low-level string manipulation: rust-lang/rfcs#1432. In this case I’d like this functionality to move into std where stability is not a problem, but the fact remains that is_char_boundary is important in unsafe code dealing with strings.

Now, most (all?) of its uses are inside assert!, which is possibly to do in stable Rust by relying on the side effect of some other operation. For example, s[..i]; is equivalent to assert!(s.is_char_boundary(i)); but it’s much less self-explanatory. It definitely needs a comment.

I’d like to nominate is_char_boundary for stabilization. It’s needed internally anyway, so it’s not like it has extra maintenance cost.

(I care less about the other str_char methods. I think they predate and are largely made obsolete by external iterators.)

@huonw
Copy link
Member

huonw commented Jan 5, 2016

Some discussion about CharRange on #9387.

@aturon
Copy link
Member Author

aturon commented Jan 27, 2016

Nominating for discussion -- need a direction here.

@dhardy
Copy link
Contributor

dhardy commented Jan 27, 2016

I found a use for is_char_boundary recently, in order to implement a variant of truncate:

fn truncate_at_boundary<'a>(s: &'a str, len: usize) -> &'a str {
    let mut len = min(len, s.len());
    while !s.is_char_boundary(len) { len -= 1; }
    &s[0..len]
}

on play.rust-lang.org

@bluss
Copy link
Member

bluss commented Jan 27, 2016

is_char_boundary is very useful, as soon as you implement your own low level string utils. I've replicated in stable rust for use as well.

@alexcrichton
Copy link
Member

Unfortunately the libs team didn't get a chance to talk about this in terms of stabilization for 1.8, but I'm going to leave the nominated tag as I think we should chat about this regardless.

@alexcrichton
Copy link
Member

The libs team discussed this during triage today, and the conclusion was:

  • It seems prudent to stabilize is_char_boundary (definitely seems desired at least)
  • We may be able to deprecate all other methods, they're all expressible in terms of slicing an iterators, and there doesn't seem to be much of a burning desire to keep them!

@alexcrichton
Copy link
Member

🔔 This issue is now entering its cycle-long final comment period 🔔

Specifically, the libs team would like to:

  • Stabilize is_char_boundary
  • Deprecate char_range_at - slicing plus chars() plus len_utf8 should suffice
  • Deprecate char_range_at_reverse - slicing plus chars().rev() plus len_utf8 should suffice
  • Deprecate char_at - slicing plus chars() should suffice
  • Deprecate char_at_reverse - slicing plus chars().rev() should suffice
  • Deprecate slice_shift_char - using chars() should suffice (plus the Chars::as_str method)

@alexcrichton alexcrichton added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed I-nominated labels Mar 11, 2016
@aturon
Copy link
Member Author

aturon commented Mar 11, 2016

Some more detail from the meeting:

Clearly the deprecations here represent an ergonomic regression, but the general feeling is that most of these are quite niche methods, and as long as std can express them efficiently, external crates can build up whatever ergonomic wrappers are needed.

If you disagree with this characterization, please speak up during FCP!

@bluss
Copy link
Member

bluss commented Mar 11, 2016

@alexcrichton I think you mean len_utf8 which is a handy lil method.

str docs will improve when these unstable ones are gone, less noise!

@alexcrichton
Copy link
Member

Oops yes indeed!

@thijsc
Copy link

thijsc commented Mar 22, 2016

👍 On the deprecations. I was actually just bitten by char_at because I did not properly understood how it worked. My assumption was that the position you provide was for the position of the character, while actually of course it's the position of the byte you're trying to use as the first byte of a character. To me this was counterintuitive.

I think we should encourage using char_indexes as much as possible, that provides an actually safe api. If you know what you're doing the slicing approach seems fine.

@BurntSushi
Copy link
Member

Also 👍 on the deprecations. I do a ton of work with strings and I can't recall ever feeling like I was missing them.

@alexcrichton
Copy link
Member

The libs team discussed this during triage yesterday and the conclusion was to move forward with the previously proposed plan.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Apr 11, 2016
This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes rust-lang#27719
cc rust-lang#27751 (deprecating the `Slice` bits)
Closes rust-lang#27754
Closes rust-lang#27780
Closes rust-lang#27809
Closes rust-lang#27811
Closes rust-lang#27830
Closes rust-lang#28050
Closes rust-lang#29453
Closes rust-lang#29791
Closes rust-lang#29935
Closes rust-lang#30014
Closes rust-lang#30752
Closes rust-lang#31262
cc rust-lang#31398 (still need to deal with `before_exec`)
Closes rust-lang#31405
Closes rust-lang#31572
Closes rust-lang#31755
Closes rust-lang#31756
bors added a commit that referenced this issue Apr 12, 2016
std: Stabilize APIs for the 1.9 release

This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes #27719
cc #27751 (deprecating the `Slice` bits)
Closes #27754
Closes #27780
Closes #27809
Closes #27811
Closes #27830
Closes #28050
Closes #29453
Closes #29791
Closes #29935
Closes #30014
Closes #30752
Closes #31262
cc #31398 (still need to deal with `before_exec`)
Closes #31405
Closes #31572
Closes #31755
Closes #31756
alexcrichton added a commit to alexcrichton/rust that referenced this issue Apr 12, 2016
This commit applies all stabilizations, renamings, and deprecations that the
library team has decided on for the upcoming 1.9 release. All tracking issues
have gone through a cycle-long "final comment period" and the specific APIs
stabilized/deprecated are:

Stable

* `std::panic`
* `std::panic::catch_unwind` (renamed from `recover`)
* `std::panic::resume_unwind` (renamed from `propagate`)
* `std::panic::AssertUnwindSafe` (renamed from `AssertRecoverSafe`)
* `std::panic::UnwindSafe` (renamed from `RecoverSafe`)
* `str::is_char_boundary`
* `<*const T>::as_ref`
* `<*mut T>::as_ref`
* `<*mut T>::as_mut`
* `AsciiExt::make_ascii_uppercase`
* `AsciiExt::make_ascii_lowercase`
* `char::decode_utf16`
* `char::DecodeUtf16`
* `char::DecodeUtf16Error`
* `char::DecodeUtf16Error::unpaired_surrogate`
* `BTreeSet::take`
* `BTreeSet::replace`
* `BTreeSet::get`
* `HashSet::take`
* `HashSet::replace`
* `HashSet::get`
* `OsString::with_capacity`
* `OsString::clear`
* `OsString::capacity`
* `OsString::reserve`
* `OsString::reserve_exact`
* `OsStr::is_empty`
* `OsStr::len`
* `std::os::unix::thread`
* `RawPthread`
* `JoinHandleExt`
* `JoinHandleExt::as_pthread_t`
* `JoinHandleExt::into_pthread_t`
* `HashSet::hasher`
* `HashMap::hasher`
* `CommandExt::exec`
* `File::try_clone`
* `SocketAddr::set_ip`
* `SocketAddr::set_port`
* `SocketAddrV4::set_ip`
* `SocketAddrV4::set_port`
* `SocketAddrV6::set_ip`
* `SocketAddrV6::set_port`
* `SocketAddrV6::set_flowinfo`
* `SocketAddrV6::set_scope_id`
* `<[T]>::copy_from_slice`
* `ptr::read_volatile`
* `ptr::write_volatile`
* The `#[deprecated]` attribute
* `OpenOptions::create_new`

Deprecated

* `std::raw::Slice` - use raw parts of `slice` module instead
* `std::raw::Repr` - use raw parts of `slice` module instead
* `str::char_range_at` - use slicing plus `chars()` plus `len_utf8`
* `str::char_range_at_reverse` - use slicing plus `chars().rev()` plus `len_utf8`
* `str::char_at` - use slicing plus `chars()`
* `str::char_at_reverse` - use slicing plus `chars().rev()`
* `str::slice_shift_char` - use `chars()` plus `Chars::as_str`
* `CommandExt::session_leader` - use `before_exec` instead.

Closes rust-lang#27719
cc rust-lang#27751 (deprecating the `Slice` bits)
Closes rust-lang#27754
Closes rust-lang#27780
Closes rust-lang#27809
Closes rust-lang#27811
Closes rust-lang#27830
Closes rust-lang#28050
Closes rust-lang#29453
Closes rust-lang#29791
Closes rust-lang#29935
Closes rust-lang#30014
Closes rust-lang#30752
Closes rust-lang#31262
cc rust-lang#31398 (still need to deal with `before_exec`)
Closes rust-lang#31405
Closes rust-lang#31572
Closes rust-lang#31755
Closes rust-lang#31756
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

9 participants