perf: use str instead of Cow to reduce Token size #958

Merged
merged 2 commits into from
Jan 4, 2024

Conversation

sssooonnnggg
Contributor

@sssooonnnggg sssooonnnggg commented Jan 3, 2024

"Cow" has a larger layout size than &str, which can impact performance.

  • size_of::<Option<Cow<str>>>() == 32
  • size_of::<Option<&str>>() == 16

It seems that using '&str' to represent the tag is sufficient.
For my USD parser, the parsing time for a 50MB USD file decreased from 800 milliseconds to 600 milliseconds.
https://github.com/sssooonnnggg/rusd
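
The size comparison above can be checked with a minimal sketch like the following. The helper name `tag_sizes` is hypothetical and not part of pest. The exact size of `Option<Cow<str>>` depends on the target and compiler version, so the sketch only prints the values rather than assuming the figures quoted in the comment.

```rust
use std::borrow::Cow;
use std::mem::size_of;

/// Returns (size of Option<&str>, size of Option<Cow<str>>) in bytes.
/// `tag_sizes` is a hypothetical helper used only for this illustration.
fn tag_sizes() -> (usize, usize) {
    (
        size_of::<Option<&str>>(),
        size_of::<Option<Cow<'static, str>>>(),
    )
}

fn main() {
    let (slice, cow) = tag_sizes();
    // A borrowed slice is a pointer plus a length; Cow must also be able
    // to hold an owned String (pointer, capacity, length).
    println!("Option<&str>:     {slice} bytes");
    println!("Option<Cow<str>>: {cow} bytes");
}
```

Because a `Cow<str>` has to be able to store an owned `String`, it can never be smaller than a `&str`, whatever the compiler's layout optimizations do.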

@sssooonnnggg sssooonnnggg requested a review from a team as a code owner January 3, 2024 11:35
@sssooonnnggg sssooonnnggg requested review from NoahTheDuke and removed request for a team January 3, 2024 11:35
@sssooonnnggg sssooonnnggg changed the title perf: use str instead of Cow to reduse Token size perf: use str instead of Cow to reduce Token size Jan 3, 2024
Contributor

@tomtau tomtau left a comment

pest_vm fails to compile: https://github.com/pest-parser/pest/actions/runs/7396775314/job/20127215025?pr=958#step:4:238

I'm not sure if a string slice can be used there.

pest/src/parser_state.rs (review thread resolved)
@tomtau
Contributor

tomtau commented Jan 3, 2024

Perhaps this optimization (if it can't be made to work in a non-semver-breaking way and within pest_vm) can be feature-guarded under #[cfg(not(feature = "std"))], with the original code under #[cfg(feature = "std")]?
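
A minimal sketch of what such feature-guarding could look like, assuming a hypothetical `Tag` type alias and `make_tag` helper (neither exists in pest). With the `std` feature enabled the alias keeps the Cow-based representation; without it, the borrowed slice is used.

```rust
// Hypothetical alias: the original Cow-based tag when `std` is enabled.
#[cfg(feature = "std")]
pub type Tag<'i> = std::borrow::Cow<'i, str>;

// Hypothetical alias: the smaller borrowed slice otherwise.
#[cfg(not(feature = "std"))]
pub type Tag<'i> = &'i str;

/// Builds a tag from a borrowed name (Cow variant).
#[cfg(feature = "std")]
pub fn make_tag(name: &str) -> Tag<'_> {
    std::borrow::Cow::Borrowed(name)
}

/// Builds a tag from a borrowed name (slice variant).
#[cfg(not(feature = "std"))]
pub fn make_tag(name: &str) -> Tag<'_> {
    name
}

fn main() {
    // Both representations dereference to `str`, so callers that only
    // read the tag can stay unchanged.
    let tag = make_tag("expr");
    println!("{}", &*tag);
}
```

The drawback of this design is that the public type differs between feature sets, so code that names the type explicitly has to handle both configurations.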

@pest-parser pest-parser deleted a comment from coderabbitai bot Jan 3, 2024
@NoahTheDuke
Member

Please don't use AI tools to make PRs to Pest.

@tomtau
Contributor

tomtau commented Jan 3, 2024

> Please don't use AI tools to make PRs to Pest.

@NoahTheDuke I think @sssooonnnggg didn't use an AI tool to make this PR. Those coderabbitai comments were meant to "code review" this PR: I enabled that bot some time ago to see what it's like (and it seems it's not particularly useful for small PRs like this). Would you prefer to disable it? Another option is to limit it to reviewing only PRs that we put a special label on.

@NoahTheDuke
Member

NoahTheDuke commented Jan 3, 2024

Oh dang, my apologies. I thought that was brought in by the OP. If you enabled it, then I'll leave it be in the future. (I'm not a fan of such AI tools but that's just personal preference.) I only ask that folks don't come to us with their own AI tools.

I think the label idea is good, that will make it more deliberate.

@tomtau
Contributor

tomtau commented Jan 3, 2024

No problem. We'll see how it goes; I've configured it so that AI code review comments are only added if the "pr" label is applied.

Contributor

coderabbitai bot commented Jan 4, 2024

Important

Auto Review Skipped

Auto reviews are limited to the following labels: pr. Please add one of these labels to enable auto reviews.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

@sssooonnnggg
Contributor Author

> pest_vm fails to compile: https://github.com/pest-parser/pest/actions/runs/7396775314/job/20127215025?pr=958#step:4:238
>
> I'm not sure if a string slice can be used there.

Thank you for the review.
I fixed the compilation error in vm/src/lib.rs. I think we can use a string slice here, as it's used similarly in lib.rs:188 with match_string.

@sssooonnnggg
Contributor Author

sssooonnnggg commented Jan 4, 2024

It's an API-breaking change, but this API is usually not called directly by end users. In practice, it's used within the code generated by the derive macro, so I don't think it's a major issue.

According to the documentation, the APIs actually used by the users are as_node_tag, find_first_tagged, and find_tagged. These interfaces have not changed.

Additionally, this optimization significantly improves execution speed and reduces memory usage. I believe it's unrelated to the std feature.
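
To illustrate why a signature change from `Cow` to `&str` is breaking for direct callers, here is a minimal sketch with stand-in types. `State` and `tag_from_cow` are hypothetical and do not reproduce pest's real `ParserState` or its `tag_node` signature; the sketch only assumes that the tag parameter moved from a `Cow` to a slice tied to the input lifetime.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a parser state; not pest's real type.
struct State<'i> {
    tag: Option<&'i str>,
}

impl<'i> State<'i> {
    /// Slice-based variant: the tag must borrow for the lifetime `'i`.
    fn tag_node(mut self, tag: &'i str) -> Self {
        self.tag = Some(tag);
        self
    }
}

/// Hypothetical adapter for a caller that still holds a `Cow`.
/// Only the borrowed variant can be forwarded; an owned string would be
/// dropped at the end of this function, so it cannot live for `'i`.
fn tag_from_cow<'i>(state: State<'i>, tag: Cow<'i, str>) -> Option<State<'i>> {
    match tag {
        Cow::Borrowed(s) => Some(state.tag_node(s)),
        Cow::Owned(_) => None,
    }
}

fn main() {
    let state = State { tag: None };
    let tagged = tag_from_cow(state, Cow::Borrowed("lhs"))
        .expect("borrowed tags pass through");
    println!("{:?}", tagged.tag);
}
```

Generated code that always passes string literals is unaffected by this restriction, which is consistent with the argument that typical users would not notice the change.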

Contributor

@tomtau tomtau left a comment

> Thank you for the review.
> I fixed the compilation error in vm/src/lib.rs. I think we can use a string slice here, as it's used similarly in lib.rs:188 with match_string.

It still doesn't compile, unfortunately:

https://github.com/pest-parser/pest/actions/runs/7405863941/job/20155605432?pr=958#step:4:226

@sssooonnnggg
Contributor Author

> > Thank you for the review.
> > I fixed the compilation error in vm/src/lib.rs. I think we can use a string slice here, as it's used similarly in lib.rs:188 with match_string.
>
> It still doesn't compile, unfortunately:
>
> https://github.com/pest-parser/pest/actions/runs/7405863941/job/20155605432?pr=958#step:4:226

My fault :) I forgot to check all features. It should now pass all checks with the cargo check --all --features pretty-print,const_prec_climber,memchr,grammar-extras --all-targets command. Please review again.

Contributor

@tomtau tomtau left a comment

> My fault :) I forgot to check all features. It should now pass all checks with the cargo check --all --features pretty-print,const_prec_climber,memchr,grammar-extras --all-targets command. Please review again.

Thanks for fixing it!

That tag_node function signature change and those lifetime parameter changes in pest_vm may be technically semver-breaking, but hopefully they're benign.

> It's an API-breaking change, but this API is usually not called directly by end users. In practice, it's used within the code generated by the derive macro, so I don't think it's a major issue.

I'm not worried about the end user here, but more worried about the weird situations where cargo version resolution picks up different combinations of pest 2.X crates, such as older pest_generator 2.7 and the latest pest 2.7, which happened in the past and some users had compilation issues because of that.

Anyway, I manually ran cargo-semver-checks (because CI seems broken at the moment) between 2.5.0 and this PR, and between 2.7.0 and this PR... and it didn't report any new issues from this PR, so I assume it should be ok. I'll go ahead and merge it then; once the new version is out and if any users report issues with it, we could potentially yank it and try some feature-guarding workaround, but hopefully no need for it.

@tomtau tomtau merged commit 6ea9523 into pest-parser:master Jan 4, 2024
9 checks passed
3 participants