Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: error: consider naming for fns that parse C bindings' return values #342

Closed
ojeda opened this issue Jun 3, 2021 · 17 comments
Closed
Labels
• lib Related to the `rust/` library.

Comments

@ojeda
Copy link
Member

ojeda commented Jun 3, 2021

Related: #324, #335.
Orthogonal: whether we need the safe version of these, see #283.

Thinking about the pub(crate) helpers that help us "parse" return values from C functions, I think we can also improve the names for all these helpers:

  • They are not constructors, so from_* does not fit too well (i.e. do not confuse with the Error constructors).
  • _kernel_ does not add information -- we are also the kernel! Nothing instead (or perhaps _c_ or _bindings_) would be better. I think nothing is OK because these functions are internal to rust/kernel/ and will be relatively well-known, since they will be used by most abstractions, so smaller names are better.

We need names for at least 3 functions:

  • decode_ret_ptr() -> Result<*mut T>: Decodes the return value of C functions that return a pointer or an error value (i.e. the C side uses IS_ERR()/ERR_PTR() etc.).
  • decode_ret() -> Result: Decodes the return value of C functions that only return 0 on success, but otherwise do not use the positive side.
  • decode_ret_value() -> Result<c_uint>: Decodes the return value of C functions that return a positive value on success.
    • Ideally we will specialize this one if some patterns repeat themselves, so that we can directly return the proper Rust-side type (e.g. the unsigned integer is a file descriptor which is then fed into a File, so we could instead have parse_result_fd() or parse_result_file() etc.).

Possible prefixes:

  • parse
  • interpret
  • transform
  • to_result
  • decode
  • ...

Possible middle nouns:

  • ret
  • result
  • return
  • c_return
  • creturn
  • retval
  • cretval
  • ...

Opinions, suggestions, etc. welcome!

@ojeda ojeda added • lib Related to the `rust/` library. prio: high labels Jun 3, 2021
@nbdd0121
Copy link
Member

nbdd0121 commented Jun 3, 2021

I don't think parse is a good name; for me parsing should only be used when the source form is textual, binary or some other serialized data.

I think it's okay to use from in the name, as long as it's clear it gives a Result instead of an Error. Maybe result_from_return_value?

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

result_from_retval sounds good (I do not mind shortening to retval or ret because it is fairly used in the C side).

I don't think parse is a good name; for me parsing should only be used when the source form is textual, binary or some other serialized data.

That is fair -- the meaning I was going for with parse is that we are "splitting" the retvals into their "two sides" (success, error), because they come "encoded" in a single integer.

Actually, decode could be a good prefix.

I think it's okay to use from in the name, as long as it's clear it gives a Result instead of an Error.

I was referring to from_ as the prefix (i.e. not in the middle) -- isn't from_*() usually "reserved" for constructors in Rust?

Maybe result_from_return_value?

I do not mind shortening return_value to e.g. ret because those are fairly used in the C side (13k uses vs. retval with 1k and result with 1k).

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

We can have a live poll in the meeting! :)

@TheSven73
Copy link
Collaborator

TheSven73 commented Jun 4, 2021

Agreed 100% that we need better naming. Part of the problem here is that these C "result mechanisms" are not named or differentiated explicitly in the kernel itself.

In a lot of cases. the error return behaviour isn't even documented in the function docs or signature. I often find myself grepping through a function's existing call sites to figure out how to handle its error return.

I think there are three main C "result mechanisms" that need a good name:

  • C returns an "pointer result", called "error pointer" in the kernel (maps to Result<*mut T>)
  • C returns an int result which is negative error or zero=success (maps to Result)
  • C returns an int result which is negative error or non-negative=success (maps to Result<c_uint>)

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

In a lot of cases. the error return behaviour isn't even documented in the function docs or signature. I often find myself grepping through a function's existing call sites to figure out how to handle its error return.

Agreed! It is a mess. :-/

Which is why I am a quite scared of using unsafe everywhere for these. It is quite hard to review every single call manually, and worse, it will happen sooner or later that the C side changes and they forget to update the Rust side, and if it happens to compile, boom.

So I think either 1) we use safe calls by default (unless we have performance numbers to tell us otherwise), or 2) we document the C side if it is not documented (including adding an explicit Rust safety marker so that people know editing the function implies reviewing the Rust side too, and not just C callers).

I think there are three main C "result mechanisms" that need a good name:

These three are the ones I mention in the OP, taken from another comment of yours. I will edit with the "maps to" to the post.

@TheSven73
Copy link
Collaborator

Interesting discussion, Miguel 💯

I can see why you'd want to be paranoid about these. But it's certainly not a good demonstration of "The Power of Rust" for the upstream community, rather the opposite! We'd basically be runtime-checking something that should really be a build time check. It could be perceived as going backwards rather than forwards.

That said, if we do end up enforcing these invariants with a runtime check (and/or warning), we should still mark these functions as unsafe. Because we still want to rigorously enforce and document the expected result behaviour of C functions, it's just good practice. And this will allow us to transition to unchecked functions in the future, when circumstances permit us to become less paranoid :)

@TheSven73
Copy link
Collaborator

Suggestions:
result_from_int_retval()
result_uint_from_int_retval()
result_ptr_from_err_ptr()

Could replace retval by ret. But I don't think that's as clear.

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

We'd basically be runtime-checking something that should really be a build time check.

Well, for this case, we could do in a semi-automated way :)

i.e. we could propose upstream to have these marked somehow in the C side, e.g.

int * my_c_function(void) __ret(err_ptr) { ... }
int my_c_function(void) __ret(zero_only) { ... }
int my_c_function(void) __ret(value) { ... }

which then would be parsed by a script or perhaps libclang (which we already depend on for Rust) which would save the list of known functions like this into a file.

Then a Rust lint (e.g. in klint) would read the file and check that any call to the decode_*() functions matches the expected values.

@TheSven73
Copy link
Collaborator

We could suggest something along these lines upstream, but I'm not holding my breath.

If we decide to go that way, I believe it's worth talking to the author of Coccinelle (Julia Lawall) first. She shares our interest in making sure that return values are handled correctly. Perhaps there are Coccinelle scripts that already verify this, which we may piggy-back onto somehow?

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

Indeed, talking to Julia would be great, since she probably knows what is already there etc. I can drop her an email. Since we are raising this tomorrow, perhaps somebody else knows too.

In any case, given there are already annotations for other things/tools (e.g. for locks, for sparse, etc.), we could try to attack this from that angle too. It could also be useful for C too.

@TheSven73
Copy link
Collaborator

Please do! This is of great interest to me as well. Feel free to put me in CC.

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

Yeah, I will Cc you!

Actually, I think having a technical meeting on this would be great. We can ask Julia Lawall to join, plus Luc Van Oostenryck (sparse), Joe Perches (checkpatch.pl), etc. too.

@TheSven73
Copy link
Collaborator

👍 Not sure if a technical meeting is "a thing" in the kernel world. It may have to take the form of an LKML thread. I guess it depends on who is backing you, and how big they are :)

@bjorn3
Copy link
Member

bjorn3 commented Jun 4, 2021

i.e. we could propose upstream to have these marked somehow in the C side, e.g.

int * my_c_function(void) __ret(err_ptr) { ... }
int my_c_function(void) __ret(zero_only) { ... }
int my_c_function(void) __ret(value) { ... }

At the risk of being more magic it could also be something like:

__err_ptr(int) my_c_function(void) { ... }
__err_void my_c_function(void) { ... }
__err_int my_c_function(void) { ... }

Where __err_void and __err_int expand to int and __err_ptr(ty) expands to ty*.

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

Not sure if a technical meeting is "a thing" in the kernel world. It may have to take the form of an LKML thread. I guess it depends on who is backing you, and how big they are :)

It isn't; but Rust, GitHub, Zulip, etc. aren't either; and I don't mind being a bit unorthodox ;)

More seriously: the discussion in the LKML definitely would happen, but that would come after we have an implementation working, and for that, in turn, I think it is best to ask some folks first to see what they think about it etc.

@ojeda
Copy link
Member Author

ojeda commented Jun 4, 2021

At the risk of being more magic it could also be something like:

That has some advantages indeed; but the downside is it may annoy people that do not care about the annotation, specially if only used for the benefit of Rust :( Ideally it could be used for some C tooling too, but we will see...

@ojeda ojeda changed the title rust: error: Consider naming for fns that parse C bindings' return values rust: error: consider naming for fns that parse C bindings' return values Jun 4, 2021
@ojeda ojeda removed the prio: high label Feb 17, 2023
@ojeda
Copy link
Member Author

ojeda commented Oct 29, 2024

Closing -- it was more of a discussion-starter rather than an issue (we typically have those in Zulip nowadays), and it is very old anyway.

@ojeda ojeda closed this as completed Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
• lib Related to the `rust/` library.
Development

No branches or pull requests

4 participants