Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change some std types to core equivalent #337

Merged

Conversation

stevefan1999-personal
Copy link
Contributor

@stevefan1999-personal stevefan1999-personal commented Apr 5, 2023

Part 2 of the #294 series, this PR just simply replace some std types that can use core as its substitution. This will greatly help no_std compatibility, although there are still some issue unresolved, for example, the vec! macro is subtlety used and this is undefined in no_std context -- although adding use alloc::vec; statement back in would compile well, but this would imply the use of alloc, and I think it is okay for most situations.

My own anecdotal use case is for a core part of a language parser that I want it to be no_std so it can be embedded easily into a ESP32 target to parse some Lispy command from serial console for configurating sensor states and setup WiFi information in flash memory -- and I don't want to use LALR parser there because memory access time is slow on that MCU. Although I can have a memory allocator there because each ESP32 device have their memory map well-defined and a global allocator is usually provided for free using their SDK. This solved my problem well and I guess 99% of the people can relax nicely here.

...But I have another use case: I'm writing a kernel command-line parser and the existence of a global allocator is not guaranteed at all time, so I can't use alloc at all most of the time especially when this parser ran before the virtual memory heap is initialized yet.

To be honest the implicit vec problem comes from here:

expression* - Repeat: match zero or more repetitions of expression and return the results as a Vec.
expression+ - One-or-more: match one or more repetitions of expression and return the results as a Vec.
expression*<n,m> - Range repeat: match between n and m repetitions of expression return the results as a Vec. [(details)](https://docs.rs/peg/latest/peg/#repeat-ranges)
expression ** delim - Delimited repeat: match zero or more repetitions of expression delimited with delim and return the results as a Vec.
expression **<n,m> delim - Delimited repeat (range): match between n and m repetitions of expression delimited with delim and return the results as a Vec. [(details)](https://docs.rs/peg/latest/peg/#repeat-ranges)
expression ++ delim - Delimited repeat (one or more): match one or more repetitions of expression delimited with delim and return the results as a Vec.

These operators are merely tail-recursive optimization for the equivalent recursion expansion, and it can be worked-around by going back to doing recursive expansions instead.

For now, any user that does not use alloc but no_std must be aware of this issue, since we cannot disable these operators as a feature.

Maybe we should document it instead and tell any user who wants to preserve these operators, they must define their own Vec type and implement their own vec macro that does static size allocation -- you can wrap the peg grammar inside a mod, and do whatever customization you want, so the freedom is up to what you defined in the super crate.

However, if you are not using those repeating operators at all, you don't have to worry: your code is already no_std ready now.

@kevinmehall kevinmehall merged commit f62301b into kevinmehall:master Apr 6, 2023
@kevinmehall
Copy link
Owner

For the repeat operators, one case that is going to be quite limiting is wanting to use them for repetition, without building a Vec, for instance [' ' | '\n']* or $(['0'..='9']+). We already avoid allocating there because it ends up using Vec<()>, but that still requires the Vec type to exist.

I turned down a PR to special case that last year, but supporting no-alloc seems like a a sufficient justification that didn't come up at the time.

You could also imagine allowing a custom expression instead of the vec![] constructor (and duck-type assuming that .push() and .len() work on it). I don't know what the syntax would look like for that; e **<{lo},{hi}> delim is already crazy enough. It does probably want to be per-instance though, not a grammar-wide option, because you might want to set ArrayVec sizes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants