Feature request: Better conversions between Utf32String and Utf32Str #50

bugnano opened this issue Sep 6, 2024 · 0 comments
To maximise matching performance, I keep a Vec<Utf32String> that I want to match against.
It's fairly big (hundreds of thousands of records).
To match, I do something like:

    let mut scores: Vec<(Utf32String, usize, u32, usize)> = entries
        .iter()
        .enumerate()
        .filter_map(|(i, file_name)| {
            let score = pattern.score(file_name.slice(..), matcher);

            score.map(|score| (file_name.clone(), i, score, file_name.len()))
        })
        .collect();

So that I can sort it like so:

    scores.sort_by(|(_file1, i1, score1, len1), (_file2, i2, score2, len2)| {
        score2.cmp(score1).then(len1.cmp(len2)).then(i1.cmp(i2))
    });

And finally, construct a new Vec<Utf32String> with the filtered result:

    scores
        .iter()
        .map(|(file_name, _i, _score, _len)| file_name.clone())
        .collect();

Notice that I do file_name.clone() twice.

To avoid at least one of those clones, I was thinking of refactoring the scoring vector like so:

    let mut scores: Vec<(Utf32Str, usize, u32, usize)> = entries
        .iter()
        .enumerate()
        .filter_map(|(i, file_name)| {
            let slice = file_name.slice(..);
            let score = pattern.score(slice, matcher);

            score.map(|score| (slice, i, score, file_name.len()))
        })
        .collect();

So far so good, but then I found no way of converting a Utf32Str back into a Utf32String for the final Vec<Utf32String>.
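As a stopgap, the index already stored in each tuple could be used to look the owned entry up again, so that only the final clone remains. This is a minimal sketch, not tested against the real crate: plain `String`/`&str` stand in for `Utf32String`/`Utf32Str`, and the `score` function is a hypothetical placeholder for `pattern.score`.

```rust
/// Hypothetical stand-in for `pattern.score`: matches when `needle` is a
/// substring of `haystack`, and scores shorter haystacks higher.
fn score(haystack: &str, needle: &str) -> Option<u32> {
    haystack
        .contains(needle)
        .then(|| 1000u32.saturating_sub(haystack.len() as u32))
}

fn filter_sorted(entries: &[String], needle: &str) -> Vec<String> {
    // Borrow each entry instead of cloning it while scoring.
    let mut scores: Vec<(&str, usize, u32, usize)> = entries
        .iter()
        .enumerate()
        .filter_map(|(i, file_name)| {
            let slice = file_name.as_str();
            score(slice, needle).map(|score| (slice, i, score, file_name.len()))
        })
        .collect();

    // Same ordering as above: best score, then shortest, then original order.
    scores.sort_by(|(_f1, i1, score1, len1), (_f2, i2, score2, len2)| {
        score2.cmp(score1).then(len1.cmp(len2)).then(i1.cmp(i2))
    });

    // Use the stored index to clone each owned entry exactly once.
    scores
        .iter()
        .map(|(_slice, i, _score, _len)| entries[*i].clone())
        .collect()
}

fn main() {
    let entries: Vec<String> = ["main.rs", "lib.rs", "README.md", "rs"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("{:?}", filter_sorted(&entries, "rs"));
}
```

This works, but it relies on keeping `entries` and the index around, which is why a direct conversion would be nicer.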

It would be nice to have, first of all, an AsRef<Utf32Str> for Utf32String, so that I don't have to call .slice(..). Next, it would be awesome to have a From<Utf32Str> for Utf32String, so that I can write something like Utf32String::from(file_name) when building the final Vec<Utf32String>.
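To illustrate the shape of the From conversion I have in mind, here is a rough sketch using local stand-in types. `OwnedUtf32` and `BorrowedUtf32` are hypothetical names, and their variant layout is an assumption made for the example, not the crate's actual definition.

```rust
/// Hypothetical stand-in for the owned string type (layout is assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
enum OwnedUtf32 {
    Ascii(Box<str>),
    Unicode(Box<[char]>),
}

/// Hypothetical stand-in for the borrowed string type (layout is assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BorrowedUtf32<'a> {
    Ascii(&'a str),
    Unicode(&'a [char]),
}

impl OwnedUtf32 {
    /// Borrow the whole string, similar in spirit to calling `.slice(..)`.
    fn as_borrowed(&self) -> BorrowedUtf32<'_> {
        match self {
            OwnedUtf32::Ascii(s) => BorrowedUtf32::Ascii(s),
            OwnedUtf32::Unicode(c) => BorrowedUtf32::Unicode(c),
        }
    }
}

/// The requested conversion: copy the borrowed data into a new owned value.
impl From<BorrowedUtf32<'_>> for OwnedUtf32 {
    fn from(s: BorrowedUtf32<'_>) -> Self {
        match s {
            BorrowedUtf32::Ascii(a) => OwnedUtf32::Ascii(a.into()),
            BorrowedUtf32::Unicode(c) => OwnedUtf32::Unicode(c.into()),
        }
    }
}

fn main() {
    let owned = OwnedUtf32::Unicode(vec!['ä', 'b'].into_boxed_slice());
    // Borrow, then convert back into an owned value.
    let round_trip = OwnedUtf32::from(owned.as_borrowed());
    assert_eq!(owned, round_trip);
}
```

With something like this in the crate, the final step would be a plain `.map(|(file_name, ..)| Utf32String::from(*file_name))` over the sorted tuples.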
