Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add no_std support #145

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Add no_std support #145

wants to merge 7 commits into from

Conversation

mullr
Copy link
Contributor

@mullr mullr commented Aug 30, 2019

This is a series of changes to add no_std support to quick-protobuf. It is unfortunately a breaking change; this can't be avoided, as the existing API directly uses the std-only std::io::Write trait. But it only requires regenerating the stubs; after that, everything works the same way it used to.

There are two major changes here:

  • You can now choose to use ArrayVec instead of Vec for a given field, using the [rust_gen_arrayvec = <capacity>] option in the schema. This works for regular builds as well, but is required for no_std.
  • There is a new WriterBackend trait, removing the hard dependency on std::io::Write.

@mullr
Copy link
Contributor Author

mullr commented Aug 30, 2019

Appveyor failure looks unrelated:

'cargo-fmt.exe' is not installed for the toolchain 'stable-x86_64-pc-windows-msvc'

@mullr
Copy link
Contributor Author

mullr commented Aug 30, 2019

Travis passes on stable; the other failures appear to be because the test script always wants to be on stable.

@nerdrew
Copy link
Collaborator

nerdrew commented Sep 4, 2019

This diff is really large :) Mind breaking the "cleanup" commits into a separate PR so we can quickly review and merge them? I think I have many of the same cleanups on a branch somewhere that I've been meaning to PR. Did you run cargo clippy by any chance for the cleanups?

@@ -0,0 +1,41 @@
#!/bin/env bash

set -e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm in the habit of using set -eu -o pipefail, or at least set -eu

for proto in $ps; do
(
cd pb-rs
cargo run "${base_dir}"/$proto
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quote $proto

@nerdrew
Copy link
Collaborator

nerdrew commented Sep 4, 2019

What are people's thoughts on gitignore-ing all the generated files? I find they cause PRs to be quite a bit larger than they should be. I like to see some of the generated code if a PR is changing the code generation, but I don't know if we need to check it all in. Thoughts?

@@ -265,6 +265,12 @@ named!(
.find(|&&(k, _)| k == "deprecated")
.map_or(false, |&(_, v)| str::FromStr::from_str(v)
.expect("Cannot parse Deprecated value")),
gen_arrayvec: key_vals
.iter()
.find(|&&(k, _)| k == "rust_gen_arrayvec")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when you run this option through protoc? It looks like you are using this as an option extension without defining it. Is that valid in protos? I haven't tried it yet, but I'll give it a try later this week.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what happens with protoc, didn't try it. Since you're asking I assume it's not going to work :) For the custom option business, how do you think that would be idiomatically managed? Presumably we need this code

import "google/protobuf/descriptor.proto";
extend google.protobuf.FieldOptions {
  uint32 rust_gen_arrayvec = 55555; 
}

to live somewhere; would it be in a .proto that ships with this library?

(presumably we'd register a number at https://github.com/protocolbuffers/protobuf/blob/master/docs/options.md)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL. I wasn't sure if it'd work. I thought of some cool things I want to try if it had worked ;)

Error I get:

% protoc --python_out=out quick-protobuf/no-std-example/src/no_std.proto
quick-protobuf/no-std-example/src/no_std.proto:17:30: Option "rust_gen_arrayvec" unknown. Ensure that your proto definition file imports the proto which defines the option.
quick-protobuf/no-std-example/src/no_std.proto:19:42: Option "rust_gen_arrayvec" unknown. Ensure that your proto definition file imports the proto which defines the option.

I've never registered a number upstream, but it looks like there is a process. I'm also not sure how to include an always-imported dependency. It's possible that we'd need to bundle the google/protobuf/descriptor.proto and another proto with this extension along with pb-rs. It's possible that users of rust_gen_arrayvec would need to import our extension file in order to use it. That doesn't feel like the worst API in the world, though if the proto is embedded in pb-rs, the protoc won't work anyway since it won't be able to find the proto with our extension anyway...

Also, to get it to work I needed to change it to this syntax, note the (...) around the field option name:

message NoStdMessage {
  fixed32 num = 1;
  repeated fixed32 nums = 2 [(rust_gen_arrayvec) = 16];
  EmbeddedMessage message = 3;
  repeated EmbeddedMessage messages = 4 [(rust_gen_arrayvec) = 16];
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never done it on my side as well ... I am not a big fan of it to be honest because the messages should not be rust specific IMHO.

How do they actually compute the necessary allocation on C side?

I think I would prefer some specific pb-rs parameters do deal with it (so we can build a dictionary of message/field arrayvec length, with an optional default length?).

They could also decide not to write the message and just use a regular &[u8] for reading it.

Copy link
Contributor Author

@mullr mullr Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do they actually compute the necessary allocation on C side?

Usually in C you have a heap, so you can just malloc. I'm not aware of any C implementations that work without a heap, which is why ArrayVec is used here.

I think I would prefer some specific pb-rs parameters do deal with it (so we can build a dictionary of message/field arrayvec length, with an optional default length?).

The way I approached this was to add the features necessary to use this in a no_std context, but then allow the user to choose what they want to use on a piecemeal basis. With annotations, you can easily choose to use an ArrayVec for one field and a regular Vec for another, if they so desire.

The protobuf field option system seems to have been designed for exactly this kind of use. I agree it's a little bit distasteful, but I think it's far better than trying to thread field-specific code down from the codegen command line.

w.r.t. registering the extension: I was thinking it could be a little bit more generic: (rust.max_length) = 123 could conceivably apply to String (=> ArrayString) or bytes fields.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the adding the field option is probably considered the "official" way to doing this. protobuf already has a lot of options that are c++ or java specific. That said, we'd need to work out some usability issues to make it work.

If we are going to define a rust specific field / message option extension, then we need some way of importing that options. We could have pb-rs special case a rust-options.proto import and make sure it is on the import path, but then the protos wouldn't work with protoc. We'd also probably need to get extensions generally working (at least the proto3 version of extensions) in order to parse the base descriptor proto; I haven't tested if pb-rs works with the google descriptor.proto.

IIRC proto3 allows extensions for descriptor options only, whereas proto2 allows extensions to arbitrary messages.

So... while adding the field options is more official, adding command line options to do this would probably be more expedient.

I have a branch that extends the custom_struct_derive so you can specify derives for specific messges only using something like --custom_struct_derive mypackage.MyProtoMessage=Hash,Eq. I plan on opening a PR with that soon (probably tonight).

@mullr
Copy link
Contributor Author

mullr commented Sep 4, 2019

What are people's thoughts on gitignore-ing all the generated files? I find they cause PRs to be quite a bit larger than they should be. I like to see some of the generated code if a PR is changing the code generation, but I don't know if we need to check it all in. Thoughts?

+1 on this; it was pretty awkward dealing with all the generated code diffs in this commit.

@mullr
Copy link
Contributor Author

mullr commented Sep 4, 2019

This diff is really large :) Mind breaking the "cleanup" commits into a separate PR so we can quickly review and merge them? I think I have many of the same cleanups on a branch somewhere that I've been meaning to PR. Did you run cargo clippy by any chance for the cleanups?

That's a good idea. I'll make a separate PR for that and base this on that branch. I didn't clippy, but I will.

@nerdrew
Copy link
Collaborator

nerdrew commented Sep 8, 2019

@tafia Do you have an opinion on gitignoring the generated proto rust files? If you don't object, I'll open a PR.

Copy link
Owner

@tafia tafia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First thanks a lot for the PR. no_std is something I was wanting to do since a long time!
This is indeed a massive PR.

What are people's thoughts on gitignore-ing all the generated files? I find they cause PRs to be quite a bit larger than they should be. I like to see some of the generated code if a PR is changing the code generation, but I don't know if we need to check it all in. Thoughts?

I agree that it adds lot of unnecessary noise. I'd like to keep at least one of them in check so we can see "in practice" what's happening.

@mullr, apart from the rust_gen_arrayvec comment, I have a another one: I'd like the default behavior to be as close as today if possible, which means using std::io::Read and not having WriterBackend per default. The reason is to keep the generated code as lean as possible. Similarly I believe we should not have the arrayvec crate in default features. (default should be std only).

@@ -265,6 +265,12 @@ named!(
.find(|&&(k, _)| k == "deprecated")
.map_or(false, |&(_, v)| str::FromStr::from_str(v)
.expect("Cannot parse Deprecated value")),
gen_arrayvec: key_vals
.iter()
.find(|&&(k, _)| k == "rust_gen_arrayvec")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never done it on my side as well ... I am not a big fan of it to be honest because the messages should not be rust specific IMHO.

How do they actually compute the necessary allocation on C side?

I think I would prefer some specific pb-rs parameters do deal with it (so we can build a dictionary of message/field arrayvec length, with an optional default length?).

They could also decide not to write the message and just use a regular &[u8] for reading it.

"Cannot read next bytes",
))
})?;
let b = bytes.get(self.start).ok_or(Error::UnexpectedEndOfBuffer)?;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change, not an IO error anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. std::io doesn't exist in core, so io::error can't really be used in a no_std compatible library.

@mullr
Copy link
Contributor Author

mullr commented Sep 9, 2019

@mullr, apart from the rust_gen_arrayvec comment, I have a another one: I'd like the default behavior to be as close as today if possible, which means using std::io::Read and not having WriterBackend per default. The reason is to keep the generated code as lean as possible.

That would add a fair bit of complexity to the codegen; you'd have to choose regular mode, and no_std mode. I don't think you'd gain very much at all from it, either: all of the write calls are using static dispatch (not trait objects), so the everything will get monomorphized. S if the caller is using the std::io backend, the compiler will end up generating the exact same code.

Similarly I believe we should not have the arrayvec crate in default features. (default should be std only).

Agreed. I tried to do that, but I couldn't figure out how to enable non-default features for tests. Is there a way?

This replaces all public uses of the Writer trait with the new WriterBackend,
which defines only the interaction points which are actually used. It also
introduces the BytesWriter struct as a new implementation of WriterBackend that
will work in no_std, along with the serialize_into_slice convenience fn.

This is technically a breaking change, as it will require stubs to be
regenerated.
- Add the `std` feature to the quick-protobuf runtime crate
- Put uses of Vec and std::io::Write behind the `std` feature
- Change other uses of `std` to `core`
- Add the `quick-protobuf/no-std-example` compile test
@tafia
Copy link
Owner

tafia commented Sep 10, 2019

That would add a fair bit of complexity to the codegen.

Wouldn't it be just replace WriterBackend with std::io::Write and change the use ... at the top?
In terms of knowing when to use what, it could be either an explicit flag or by checking features.

But I agree that this is not super important. @nerdrew, any preference?

I couldn't figure out how to enable non-default features for tests

Haven't tried but something like cargo test --no-default-features --feature xyz should work?

@nerdrew
Copy link
Collaborator

nerdrew commented Sep 10, 2019

Wouldn't it be just replace WriterBackend with std::io::Write and change the use ... at the top? In terms of knowing when to use what, it could be either an explicit flag or by checking features.

This makes sense to me. I think you'd need a flag for pb-rs and then a feature for quick-protobuf.

By default I like keeping std and std::io::{Error, Read, Write} for ease of use in the common (or should I say... "standard") case.

@nerdrew
Copy link
Collaborator

nerdrew commented Sep 10, 2019

Question about ArrayVec: it looks like you can use Vec in a no_std environment. I don't really know much about it, but this doc says you can if you can add alloc and collections. When people say no_std do they also usually mean no alloc and no global allocator?

@mullr
Copy link
Contributor Author

mullr commented Sep 10, 2019

Question about ArrayVec: it looks like you can use Vec in a no_std environment. I don't really know much about it, but this doc says you can if you can add alloc and collections. When people say no_std do they also usually mean no alloc and no global allocator?

Yes, that's right.

@mullr
Copy link
Contributor Author

mullr commented Sep 10, 2019

Ok, I've rebased this branch on master, with one small change: the field option is now '(rust_max_length)', so it could in principle be used for other kinds of fields, works with protoc, and could in principle be registered as an official extension.

w.r.t. the other requested design changes: I'm out of time to work on this. At this point it's good enough for the use case I have, and I can't really justify any more time spent on reworking it. You guys are welcome to do with it what you want. Of course I'd be thrilled if there was upstream support for no_std.

@tafia
Copy link
Owner

tafia commented Sep 11, 2019

w.r.t. the other requested design changes: I'm out of time to work on this. At this point it's good enough for the use case I have, and I can't really justify any more time spent on reworking it. You guys are welcome to do with it what you want. Of course I'd be thrilled if there was upstream support for no_std.

Thanks a lot for your work!

@xoloki
Copy link
Contributor

xoloki commented Sep 11, 2019

Hi @nerdrew @tafia,
I'm thinking about taking this branch, removing the arrayvec code, and using the alloc collections when in no_std mode. This removes the need for the length annotations as well.
Would you be interested in a PR with those changes?

@tafia
Copy link
Owner

tafia commented Sep 12, 2019

Would you be interested in a PR with those changes?

Certainly!

jonlamb-gh and others added 2 commits November 26, 2019 10:32
Adjust the parsers to:
* treat `import public` equivalent to `import`
* treat `(rust_ext.rust_max_length)` equivalent to `(rust_max_length)`
@nerdrew
Copy link
Collaborator

nerdrew commented Sep 14, 2020

Looks like this can be closed? no_std support looks like it was merged separately.

@MathiasKoch
Copy link

I don't think this should be closed entirely yet, as this PR (If i'm reading it correctly) not only provides no_std, but also no_alloc usage?

This is a huge advantage for embedded use-cases, where an allocator is broadly not available or not desired.

@quentinmit
Copy link

What's blocking this PR? Is it just the merge conflicts? It sounds like all that's left for no_alloc support is to merge the rust_max_length option with arrayvec support. (And yes, no_alloc is very important for embedded applications.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants