#![no_std] where possible? #144

Open
CAD97 opened this issue Sep 15, 2017 · 7 comments
Labels: A: lib-impl (Library Implementation), enhancement (Enhancements to existing features)

Comments

@CAD97
Collaborator

CAD97 commented Sep 15, 2017

Many Unicode crates are no_std, or make std an opt-out feature.

Do we want to support the no_std use case? If we do, we should do so soon to avoid including std things in our crates.

For the UCD at the very least, no_std does not seem difficult. char_property works as-is with #![no_std] use core as std. (Caveat: my small test didn't cover the macro...) char_range works so long as the Bound-based construction API is gated on std support. utils still has iter_all_chars (why? It should probably be removed now that we have CharRange), which returns a Box, but utils works if that is dropped.
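
One way the std gating described above could look, as a minimal sketch and not the actual char_range code. `ToyCharRange`, `from_bounds`, and the `std` Cargo feature name are all hypothetical, and the crate-level attributes are shown as comments so the snippet also compiles outside a crate root.

```rust
// In a real crate root these two lines would come first (assumed setup):
// #![no_std]
// #[cfg(feature = "std")] extern crate std;

/// Hypothetical stand-in for a closed range of characters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ToyCharRange {
    pub low: char,
    pub high: char,
}

impl ToyCharRange {
    /// Always available: needs nothing outside `core`.
    pub fn closed(low: char, high: char) -> ToyCharRange {
        ToyCharRange { low, high }
    }

    /// Only compiled when the (assumed) `std` Cargo feature is enabled.
    /// Excluded bounds are rejected to keep the sketch short.
    #[cfg(feature = "std")]
    pub fn from_bounds(
        low: std::ops::Bound<char>,
        high: std::ops::Bound<char>,
    ) -> Option<ToyCharRange> {
        use std::ops::Bound::{Included, Unbounded};
        let low = match low {
            Included(c) => c,
            Unbounded => '\0',
            _ => return None,
        };
        let high = match high {
            Included(c) => c,
            Unbounded => char::MAX,
            _ => return None,
        };
        Some(ToyCharRange { low, high })
    }

    /// True if `ch` lies within the closed range.
    pub fn contains(&self, ch: char) -> bool {
        self.low <= ch && ch <= self.high
    }
}
```

With the feature off, only `closed` and `contains` exist, so nothing from std is pulled in. On current Rust, `Bound` is also exported from `core::ops`, so this particular gate may no longer be needed.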

A quick search of the ucd directory shows one use of std::collections, which is in a test. Other than that, I don't think any non-core APIs are used in ucd. The std::ascii fns that are in use (I know there are some in places) can easily be shimmed. I mean, we're writing a text-processing library, I hope we could shim it 😆.
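
A rough idea of how small such a shim could be, assuming only case-insensitive ASCII comparison is needed. Both function names are made up for illustration and nothing here is taken from the UNIC sources.

```rust
/// Lowercases one ASCII byte; every other byte passes through unchanged.
fn ascii_lower(byte: u8) -> u8 {
    if byte >= b'A' && byte <= b'Z' {
        byte + (b'a' - b'A')
    } else {
        byte
    }
}

/// Hypothetical core-only replacement for a case-insensitive ASCII compare.
/// Non-ASCII bytes must match exactly.
pub fn eq_ignore_ascii_case_shim(a: &str, b: &str) -> bool {
    a.len() == b.len()
        && a.bytes()
            .zip(b.bytes())
            .all(|(x, y)| ascii_lower(x) == ascii_lower(y))
}
```

The shim works on bytes and never allocates, so it fits the no-allocation guideline as well as no_std.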

Working in a no_std environment would also force us to think Iterator-first, as we would no longer have allocating APIs available at all.
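
To illustrate the Iterator-first point, here is a sketch of an all-characters iterator that returns a lazy iterator instead of a Box or a Vec. The name `all_chars` is hypothetical and this is not the existing iter_all_chars implementation.

```rust
/// Lazily yields every Unicode scalar value in code point order.
/// `char::from_u32` returns `None` for surrogate code points, so
/// `filter_map` skips them without any allocation.
pub fn all_chars() -> impl Iterator<Item = char> {
    (0u32..=0x10_FFFF).filter_map(char::from_u32)
}
```

A caller that really wants a collection can still `collect()` on the std side, so the allocation decision moves to the user.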

normal has one use of VecDeque. Other than that, String, and std::ascii, I don't think we're using any non-core APIs in the libraries. (I of course exclude the source generation tools.)

CAD97 added the discussion (Discussions) and enhancement (Enhancements to existing features) labels Sep 15, 2017
@behnam
Member

behnam commented Sep 15, 2017

Thanks for filing this, @CAD97! I've been thinking about it and am more inclined towards not making it a priority, especially because alternative solutions are already available for critical algorithms.

Also, during implementation we already think as if we're no_std and, as you've mentioned, we're not allocating memory at runtime (with possibly one or two exceptions), but maintaining two variations makes everything much slower.

Do you really think it's going to be harder to do so later if we don't start now? I think if we do that later, we can better draw the line for what needs a local implementation (like std::ascii) and what can be completely dropped or conditioned.

What do you think?

@CAD97
Collaborator Author

CAD97 commented Sep 15, 2017

I think, for the UCD at least, we should #![no_std]. It's basically no-effort (except FromStr using eq_ignore_ascii_case). For the larger algorithm crates, I'm fine with punting for now. They should definitely all get a no_std-reconsider before 1.0 though.

At the same time, there's a voice in the back of my head saying "what use case is there for unicode processing in a no-std environment?" But I still think it is worth it to enable no_std in the UCD crates to enforce the no-allocation guideline.

It's definitely easier to turn no_std off than to turn it on. And I think that, in this project at least, we should definitely not no_std if it means shimming anything other than std::ascii.

TL;DR no_std in UCD crates, punt the rest for consideration closer to 1.0.

@behnam
Member

behnam commented Sep 15, 2017

Yeah, agreed generally. Although it needs some CI configuration to test everything properly.

When no_std, we can indeed add unic::ascii to fill in the gap where needed.

behnam added this to the UNIC-1.0 milestone Sep 15, 2017
@CAD97
Collaborator Author

CAD97 commented Sep 15, 2017

Given #21, unic::ascii was going to happen at some point, or some form of it in any case: ASCII-only algorithms in addition to the full-Unicode tables.

@behnam
Member

behnam commented Sep 15, 2017

Re #21, the idea is to change the table-creation and data-table matching code to perform ASCII lookups via a direct array lookup, and all the rest of Unicode via b-tree search.
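
A toy sketch of that two-tier lookup, under stated assumptions: the property (a made-up "is uppercase" subset), the table contents, and all names are illustrative only, and a binary search over a sorted range table stands in for the "b-tree search" mentioned above. It is not the UNIC table format.

```rust
use core::cmp::Ordering;

/// Builds the 128-entry ASCII table at compile time.
const fn build_ascii_table() -> [bool; 128] {
    let mut table = [false; 128];
    let mut i = b'A';
    while i <= b'Z' {
        table[i as usize] = true;
        i += 1;
    }
    table
}

/// Tier 1: direct array lookup for ASCII.
static ASCII_UPPER: [bool; 128] = build_ascii_table();

/// Tier 2: sorted, non-overlapping closed ranges for everything else.
/// Toy data: basic Greek and Cyrillic capital letters only.
static NON_ASCII_UPPER: &[(char, char)] = &[
    ('\u{0391}', '\u{03A1}'),
    ('\u{03A3}', '\u{03A9}'),
    ('\u{0410}', '\u{042F}'),
];

/// Hypothetical lookup: array index for ASCII, binary search otherwise.
pub fn is_upper_toy(ch: char) -> bool {
    let cp = ch as u32;
    if cp < 128 {
        return ASCII_UPPER[cp as usize];
    }
    NON_ASCII_UPPER
        .binary_search_by(|&(low, high)| {
            if high < ch {
                Ordering::Less
            } else if low > ch {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}
```

Both tiers are plain statics and use only `core`, so the approach carries over to no_std unchanged.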

@CAD97
Collaborator Author

CAD97 commented Sep 15, 2017

It seems we could eventually even drop the AsciiExt shim, since those fns look set to migrate onto the primitives. rust-lang/rust#44042
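
On current Rust the ASCII helpers are inherent methods on u8, char, str, and [u8] and are available from core, so a no_std FromStr can call them directly. A minimal sketch, where `ToyBidiClass` is a hypothetical two-value enum for illustration and not the real UNIC type:

```rust
use core::str::FromStr;

/// Hypothetical property enum with short and long value names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ToyBidiClass {
    LeftToRight,
    RightToLeft,
}

impl FromStr for ToyBidiClass {
    type Err = ();

    /// Matches value names case-insensitively, with no allocation
    /// and no import beyond `core`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.eq_ignore_ascii_case("L") || s.eq_ignore_ascii_case("Left_To_Right") {
            Ok(ToyBidiClass::LeftToRight)
        } else if s.eq_ignore_ascii_case("R") || s.eq_ignore_ascii_case("Right_To_Left") {
            Ok(ToyBidiClass::RightToLeft)
        } else {
            Err(())
        }
    }
}
```

For example, `"left_to_right".parse::<ToyBidiClass>()` yields `Ok(ToyBidiClass::LeftToRight)`, while an unknown name yields `Err(())`.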

behnam self-assigned this Nov 6, 2017
behnam modified the milestones: UNIC-1.0, UNIC-0.7 Nov 6, 2017
bors bot added a commit that referenced this issue Nov 6, 2017
180: [ucd] [char] Make libraries no_std r=behnam a=behnam

Follow up on <#179>, marking more libraries as `no_std`.

Rename `unic-ucd-core` to `unic-ucd-version`, as `core` is a language library and `version` represents better what the component is providing.

Tracker: <#144>
behnam added the A: lib-impl (Library Implementation) label and removed the discussion (Discussions) label Nov 6, 2017
@behnam
Member

behnam commented Feb 7, 2018

Many of the packages are now no_std. Let's track down the rest for the next release.

behnam modified the milestones: UNIC-0.7, UNIC-0.8 Feb 7, 2018
behnam modified the milestones: UNIC-0.8, UNIC-0.X Jan 2, 2019
2 participants