Refactor to use an arena #83
Comments
Re: lazy regex compilation: Just the change to an arena of some sort should be enough to make It would be a bit too much to hope to combine What's the |
Disregard that part about using |
You're right, it would have the advantage of making it |
Hey, looked into this a bit. I have some thoughts and wanted to check if they make sense: Should we split the types so that "syntax definition read from yaml" and "linked syntax definition" are different types? Let's call the first one
I'm not sure how the (Edit: This also means splitting types such as |
@robinst Yah I looked at the code a bit and that seems like a good idea. It'd entail some duplication in the structs but it'd buy a lot of type safety and thus avoid lots of runtime checks that things are linked.

I'm still unsure of whether borrows and lifetime parameters or vec arenas would be better. Borrows might give more compile-time safety to the API (can't use the wrong arena with a

As for dumps, I'm unsure of which way is better. Loading pre-linked dumps would be faster, but it makes it so it isn't possible to load new syntaxes into a set afterwards. The binary is pretty large so I only want to choose one set of dumps to include. I bet the linking step will be really fast so probably best to load unlinked ones.

So probably the API (with probably helpers to avoid unnecessary steps in simple cases) will be something like load a |
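To make the "unlinked vs. linked types" idea concrete, here is a rough sketch of what a linking step could look like. Everything in it (`UnlinkedContext`, `LinkedContext`, the `link` function, the `includes` field) is a hypothetical name made up for illustration, not syntect's actual API: the unlinked type refers to other contexts by name, and linking resolves those names into indices once, so nothing downstream needs a runtime "is this linked yet?" check.

```rust
use std::collections::HashMap;

/// Hypothetical "as read from YAML" form: references are still names.
struct UnlinkedContext {
    name: String,
    includes: Vec<String>,
}

/// Hypothetical linked form: references are indices into the linked Vec.
#[derive(Debug, PartialEq)]
struct LinkedContext {
    name: String,
    includes: Vec<usize>,
}

/// Resolve every name reference to an index, failing on unknown names.
fn link(unlinked: Vec<UnlinkedContext>) -> Result<Vec<LinkedContext>, String> {
    // Map each context name to its position in the input Vec.
    let index_of: HashMap<String, usize> = unlinked
        .iter()
        .enumerate()
        .map(|(i, c)| (c.name.clone(), i))
        .collect();

    let mut linked = Vec::with_capacity(unlinked.len());
    for c in unlinked {
        let mut includes = Vec::with_capacity(c.includes.len());
        for name in &c.includes {
            let idx = *index_of
                .get(name)
                .ok_or_else(|| format!("unknown context: {}", name))?;
            includes.push(idx);
        }
        linked.push(LinkedContext { name: c.name, includes });
    }
    Ok(linked)
}

fn main() {
    let unlinked = vec![
        UnlinkedContext { name: "main".to_string(), includes: vec!["string".to_string()] },
        UnlinkedContext { name: "string".to_string(), includes: vec![] },
    ];
    match link(unlinked) {
        Ok(linked) => println!("{:?}", linked),
        Err(e) => eprintln!("link error: {}", e),
    }
}
```

Because `link` consumes the unlinked values and returns a different type, code that needs resolved references can only accept `LinkedContext`, which is where the type safety mentioned above would come from.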
Just a heads up: I have something that passes most tests but has lots of open questions! I'm going to raise a PR so we can discuss those soon. (One of the things I didn't end up doing was splitting the types, but it should be possible to do that on top of my changes.) |
This has been implemented in 3.0.0 🎉: https://github.com/trishume/syntect/blob/master/CHANGELOG.md#version-30 |
This would be a large project, and may never happen, but the architecture would work much better with an arena of `Context`s instead of using `Rc<RefCell<>>` everywhere. Not only would the code be cleaner, but the lifetimes of `SyntaxDefinition` and `SyntaxSet` would be more correct, avoiding the footgun of dropping a `SyntaxSet` and keeping a cloned `SyntaxDefinition` that no longer has references to nested languages.

This would look something like using a `Vec<Context>` and indices everywhere.

It may also be easier to parallelize and make thread-safe, but that still requires extra work because of lazy regex compilation. See extensive discussion in #20.
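As a minimal sketch of the `Vec<Context>` plus indices idea, assuming a simplified `Context` with just a name and a list of contexts it can push: the `ContextId` and `ContextArena` types and their methods are hypothetical names for illustration, not what syntect has or would necessarily end up with.

```rust
/// Hypothetical index into the arena (a plain newtype over usize).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ContextId(usize);

/// Simplified stand-in for a real context.
#[derive(Debug)]
struct Context {
    name: String,
    /// Other contexts are referenced by index, not by Rc<RefCell<Context>>.
    pushes: Vec<ContextId>,
}

/// The arena owns every context; ids are only meaningful together with it.
#[derive(Debug, Default)]
struct ContextArena {
    contexts: Vec<Context>,
}

impl ContextArena {
    /// Add a context and return its index.
    fn add(&mut self, name: &str) -> ContextId {
        let id = ContextId(self.contexts.len());
        self.contexts.push(Context { name: name.to_string(), pushes: Vec::new() });
        id
    }

    /// Look up a context by index (panics on an id from another arena
    /// if it is out of bounds).
    fn get(&self, id: ContextId) -> &Context {
        &self.contexts[id.0]
    }

    /// Record that `from` can push `to`.
    fn link(&mut self, from: ContextId, to: ContextId) {
        self.contexts[from.0].pushes.push(to);
    }
}

fn main() {
    let mut arena = ContextArena::default();
    let main_ctx = arena.add("main");
    let string_ctx = arena.add("string");

    // A reference cycle is just two indices; no Rc or RefCell involved.
    arena.link(main_ctx, string_ctx);
    arena.link(string_ctx, main_ctx);

    let target = arena.get(main_ctx).pushes[0];
    println!("{} can push {}", arena.get(main_ctx).name, arena.get(target).name);
}
```

With this shape, the arena owns all the data, so a context cannot outlive the set it belongs to. The tradeoff is that a plain index says nothing about which arena it came from, so using an id with the wrong arena is only caught at runtime, if at all.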