The string API should be more generic. #10827

Closed
Kimundi opened this issue Dec 5, 2013 · 7 comments

Comments

@Kimundi
Member

Kimundi commented Dec 5, 2013

Current situation

Right now, many operations you can perform on a string are hard-coded to one specific type, like &str or char. For example:

let mut s = ~"";

s.push_str("foo");
s.push_char('a');

s.contains("foo");
s.contains_char('a');

Improving the API

Using traits and generics, these two examples could get boiled down into one contains and one push, which would accept either type while still being just as efficient as before.

There are quite a few methods that would benefit from such genericity, and so far I've identified two different kinds of traits:

  1. StrPushable - Anything that can be pushed to a string. String slices and char would implement this, but also things like Ascii could.
    Functions that would benefit from this include:

    push()
    replace()
    ...

    This might be implementable with fmt::Default instead, but I'm not sure if it's a good idea to do that, as it would mean you could push anything to a string that implements that trait.

  2. StrMatcher - Anything that could be used to find a string pattern in a string.
    Again, string slices and char would implement it, as well as predicate functions like |char| -> bool, the Ascii type, or things like a regex type.
    A trait like this already exists in a limited form with the CharEq trait, and would, if properly extended, be useful for a number of functions:

    split() // all variants
    replace()
    find()
    contains()
    ...
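As a rough illustration of the first kind of trait, here is a minimal sketch in present-day Rust syntax (not the `~""` syntax used in this issue). The names `push_onto`, `PushAny`, and `push_any` are hypothetical and chosen only for this example; they are not part of any proposal or of the standard library. Each implementation forwards to the existing type-specific method, so the generic entry point should cost nothing extra once inlined.

```rust
/// Hypothetical trait: anything that can be appended to a `String`.
trait StrPushable {
    fn push_onto(self, target: &mut String);
}

impl StrPushable for &str {
    fn push_onto(self, target: &mut String) {
        // Forward to the existing slice-specific method.
        target.push_str(self);
    }
}

impl StrPushable for char {
    fn push_onto(self, target: &mut String) {
        // Forward to the existing char-specific method.
        target.push(self);
    }
}

/// Hypothetical extension trait providing one generic entry point.
trait PushAny {
    fn push_any<P: StrPushable>(&mut self, value: P);
}

impl PushAny for String {
    fn push_any<P: StrPushable>(&mut self, value: P) {
        value.push_onto(self);
    }
}

/// Builds a string using the single generic method for both argument types.
fn demo_push() -> String {
    let mut s = String::new();
    s.push_any("foo");
    s.push_any('a');
    s
}
```

The dispatch is static, so the compiler generates one specialised copy of `push_any` per argument type.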

How it could look

let mut s = ~"";

s.push("Hello");
s.push('!');
s.push(" aaa bbb ccc".as_ascii().to_upper());

assert_eq!(s, ~"Hello! AAA BBB CCC");

assert!(s.contains('!'));
assert!(s.contains(char::is_whitespace));
assert!(s.contains(regex("AA+")));

assert_eq!(s.split('!').collect(), ~["Hello", " AAA BBB CCC"]);
assert_eq!(s.split(char::is_whitespace).collect(), ~["Hello!", "AAA", "BBB", "CCC"]);
assert_eq!(s.split(regex("AA+")).collect(), ~["Hello! ", "A BBB CCC"]);
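For comparison, the matcher half of this idea can be tried in present-day Rust, where `str::contains` and `str::split` accept a `char`, a `&str`, or a `char` predicate as the pattern argument. The sketch below uses only the standard library; the function name `demo_patterns` is made up for this example. Regex patterns are left out because they are not part of the standard library.

```rust
/// Exercises the same method names with different kinds of pattern argument.
fn demo_patterns() -> (bool, bool, Vec<&'static str>, Vec<&'static str>) {
    let s = "Hello! AAA BBB CCC";

    // `contains` with a char and with a predicate function.
    let has_bang = s.contains('!');
    let has_space = s.contains(char::is_whitespace);

    // `split` with a char and with a predicate function.
    let by_char: Vec<&str> = s.split('!').collect();
    let by_pred: Vec<&str> = s.split(char::is_whitespace).collect();

    (has_bang, has_space, by_char, by_pred)
}
```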

Status

I'm currently trying out this approach to see if there are any rough edges or issues I haven't thought of, but I think this would bring the string API a great step forward.

@huonw
Member

huonw commented Dec 5, 2013

It's not immediately clear to me how one can merge e.g. split_str and split and maintain efficiency and avoid repetition/ridiculous type signatures.

@Kimundi
Member Author

Kimundi commented Dec 6, 2013

@huonw Both &str and char would implement the StrMatcher trait, which would have methods for finding self in a provided string. The definition of split would become something like this:

fn split<M: StrMatcher>(m: M) -> SplitIterator { ... }

And the SplitIterator impls would then use the methods on m to look for a match.

Provided the interface is defined in the right way, it should inline to efficient code.

There might be a bit more repetition and a few ridiculous type signatures on the library side, but the user-facing API would become smaller.
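To make the shape of this concrete, here is a minimal sketch in present-day Rust of a matcher trait and a single split iterator that is generic over it. The trait is called `StrMatcher` here, following the issue's opening post. The method `find_in`, the `Pred` wrapper, the `Split` struct, and the free function `split` are all hypothetical names chosen for this example, not an actual or proposed library API. Predicates are wrapped in a newtype only to keep the trait implementations obviously non-overlapping.

```rust
/// Hypothetical matcher trait: returns the (start, end) byte range of the
/// first match in `haystack`. Matches are assumed to be non-empty.
trait StrMatcher {
    fn find_in(&mut self, haystack: &str) -> Option<(usize, usize)>;
}

impl StrMatcher for &str {
    fn find_in(&mut self, haystack: &str) -> Option<(usize, usize)> {
        // An empty needle is treated as "no match" in this sketch.
        if self.is_empty() {
            return None;
        }
        haystack.find(*self).map(|start| (start, start + self.len()))
    }
}

impl StrMatcher for char {
    fn find_in(&mut self, haystack: &str) -> Option<(usize, usize)> {
        haystack.find(*self).map(|start| (start, start + self.len_utf8()))
    }
}

/// Hypothetical wrapper that turns a `char` predicate into a matcher.
struct Pred<F>(F);

impl<F: FnMut(char) -> bool> StrMatcher for Pred<F> {
    fn find_in(&mut self, haystack: &str) -> Option<(usize, usize)> {
        haystack
            .char_indices()
            .find(|&(_, c)| (self.0)(c))
            .map(|(i, c)| (i, i + c.len_utf8()))
    }
}

/// One iterator type serves every matcher.
struct Split<'a, M> {
    rest: Option<&'a str>,
    matcher: M,
}

impl<'a, M: StrMatcher> Iterator for Split<'a, M> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let rest = self.rest?;
        match self.matcher.find_in(rest) {
            Some((start, end)) => {
                // Yield the text before the match, continue after it.
                self.rest = Some(&rest[end..]);
                Some(&rest[..start])
            }
            None => {
                // No more matches: yield the remainder and finish.
                self.rest = None;
                Some(rest)
            }
        }
    }
}

fn split<'a, M: StrMatcher>(haystack: &'a str, matcher: M) -> Split<'a, M> {
    Split { rest: Some(haystack), matcher }
}
```

Because `M` is a type parameter rather than a trait object, each call such as `split("a, b, c", ", ")` or `split(text, '!')` is compiled against the concrete matcher, which is what would let the search inline.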

@metajack
Contributor

metajack commented Dec 8, 2013

This may be useful in Servo. We're going to need functions to work with UTF16 strings (required for JavaScript) and it would be nice if these were the same string API functions we were used to.

@thestinger
Contributor

@metajack: AFAIK JavaScript really treats them as UCS2 strings though, and a UTF-16 string type would be too strict. Of course, UCS2 strings could just be a type in Servo.

@Kimundi
Member Author

Kimundi commented Dec 8, 2013

@metajack Hm, this is actually not the kind of genericity I thought of here - I'm thinking of making a few functions that work with &str more generic.

But defining the traits in a way that would be reusable for custom string types makes sense too, it just would be a different issue.

@metajack
Contributor

metajack commented Dec 9, 2013

@thestinger Yes, UCS2 is what I meant.

@Kimundi Ah, I see.

@aturon aturon added the A-libs label Jun 2, 2014
@alexcrichton
Member

Closing in favor of your RFC: rust-lang/rfcs#528
