-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rust: check range and add type invariant to Error (issue #295) #324
Conversation
This comment has been minimized.
This comment has been minimized.
rust/kernel/error.rs
Outdated
/// 1) convert it to EINVAL | ||
/// 2) print a warning message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe swap the order of these two statements? Printing the warning happens before returning EINVAL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
89cf0c7
to
0bc70a3
Compare
This comment has been minimized.
This comment has been minimized.
rust/kernel/error.rs
Outdated
/// INVARIANT: make sure Error is initialized with a sane value | ||
/// When an invalid errno is found, it will | ||
/// 1) print a warning message | ||
/// 2) convert it to EINVAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at one of the other type invariants we have and how they are documented :)
rust/kernel/error.rs
Outdated
/// 2) convert it to EINVAL | ||
pub fn new(errno: c_types::c_int) -> Error { | ||
if errno < -(bindings::MAX_ERRNO as i32) || errno >= 0 { | ||
crate::pr_warn!("Creating Error with an invalid errno {}, convert it to EINVAL", errno); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is an OK approach to avoid a panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a caveat. printk
will grab the console_lock. This is generally ok, but when one is writing a console-related thing that may lead to a potential deadlock. So I think we surely need an unchecked version from_kernel_errno_unchecked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mistaken. I've checked printk code. If it failed to grab console_lock, it would use another approach. It won't be a problem.
0bc70a3
to
780a935
Compare
This comment has been minimized.
This comment has been minimized.
rust/kernel/error.rs
Outdated
return Error::EINVAL; | ||
} | ||
|
||
return Error(errno); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Error(errno); | |
Error(errno) |
rust/kernel/error.rs
Outdated
/// when errno given is invalid, | ||
/// it will print a warning message and convert it to EINVAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please capitalize the start of a sentence. Also maybe use the passive form. Eg "will be printed" instead of "it will print" and "the error will be converted" instead of "convert it".
780a935
to
f0b11ea
Compare
This comment has been minimized.
This comment has been minimized.
f0b11ea
to
5cb307f
Compare
This comment has been minimized.
This comment has been minimized.
pub fn from_kernel_errno(errno: c_types::c_int) -> Error { | ||
if errno < -(bindings::MAX_ERRNO as i32) || errno >= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this implementation of #295 is not zero-cost, is that correct? I am wondering if a blanket runtime check to enforce a type invariant is overkill in this context?
The kernel people can do this in C, at runtime, if they wanted to, e.g. by adding an if/pr_warn
(or even if/WARN
) in the PTR_ERR()
macro. But they don't. Does anyone know if this is a conscious choice, or an oversight? Did anyone check if the kernel people care? What happens in the kernel if you do return an negative error int
outside -MAX_ERRNO:-1
?
Is it a good use of "The Power of Rust" to insert non-zero-cost runtime checks that are possibly too paranoid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTR_ERR() usually works in tandem with IS_ERR which does a runtime check like this. Because it's for a different semantic (either pointer or error), they don't print warnings.
But I think your point is valid. It may be overkill in some situations, so we have an unchecked version that relies on users to guarantee validity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in the kernel if you do return an negative error
int
outside-MAX_ERRNO:-1
?
This could lead to a memory safety issue if not careful. If we have a Result<SomePointer>::Err(e)
and e
is outside MAX_ERRNO
range, when we convert this Result
into pointer, this will be recognized as a success, and the pointer points to some random kernel memory location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could lead to a memory safety issue if not careful. If we have a
Result<SomePointer>::Err(e)
ande
is outsideMAX_ERRNO
range, when we convert thisResult
into pointer, this will be recognized as a success, and the pointer points to some random kernel memory location.
Kernel C has exactly the same issue: if you call ERR_PTR(x)
where x
is "negative-er" than -MAX_ERRNO
, then you end up with a random pointer that's not IS_ERR()
. My point is: it's perfectly possible to check/warn/fix these issues at runtime in C, yet this has not been done by the kernel community.
So my question is: are we "fixing" things that are not seen as a problem upstream? And are we adding non-zero cost runtime checks to accomplish this? How does that help us, or the upstream community?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really a contract between the caller and callee doesn't it? In C there are no simple ways to represent this invariant in the type system, so pretty much the caller just assumes that the return value is sane, so there're no checks. In Rust we have the opportunity to represent this in the type system.
However, I do think that in many cases this check may not be desirable. In that case the unsafe unchecked constructor should be used:
pub fn getrandom(dest: &mut [u8]) -> error::Result {
let res = unsafe { bindings::wait_for_random_bytes() };
if res != 0 {
return Err(error::Error::from_kernel_errno(res));
}
/* ... */
}
would be:
pub fn getrandom(dest: &mut [u8]) -> error::Result {
let res = unsafe { bindings::wait_for_random_bytes() };
if res != 0 {
// SAFETY: `wait_for_random_bytes` returns 0 on success and -errno on failure.
return Err(unsafe { error::Error::from_kernel_errno_unchecked(res) });
}
/* ... */
}
I think this is actually good. It would prevent people from just try to convert an arbitrary int to errno, and force the caller to reason about whether the value is indeed an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Off-topic: @TheSven73 do we need the extern "C"
calls to IS_PTR/ERR_PTR
? Since we will be having checks against MAX_ERRNO
anyway we can just cast them to usize
and check them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Replying to @TheSven73 -- again GitHub did not update with the new replies :)
Those are good questions. Let me write some context (which you already know, but I am saving these explanations for future discussions) first.
There are two ways we can introduce Rust into the kernel. One is as a C-like replacement, keeping semantics as close as possible (e.g. in the case here). This is what many people may expect, but that would mean we need to go the unsoundness-way, and therefore one of the major advantages of Rust is gone. There are other advantages to Rust, of course, but many may wonder whether "just a nicer language" is worth it.
The other way is to be fully sound. Of course, this means we cannot take for granted even "trivial" things (e.g. like the case here), where the caller is "supposed to do the right thing". Thus, to remain sound, we need to either take a runtime hit or have unsafe
blocks (plus their proofs, plus *_unchecked()
functions). Nobody likes either, but the upside is huge: it is what leads us to the "holy grail" of being able to write kernel-space memory-safe code; i.e. without the risk of exploits and without moving things to user-space.
So, with that context in mind, the answers would be:
are we "fixing" things that are not seen as a problem upstream?
It is not so much about fixing, but about being a requirement for us in order to remain sound.
And are we adding non-zero cost runtime checks to accomplish this?
That depends: in each instance, we have to decide whether to take a runtime hit or go the unsafe
/*_unchecked()
way.
I think, in general, the latter is best kept within rust/kernel/
code (since we reap the benefits across many modules) and/or when performance is critical, while the former is best for "leaf" modules (to be able to write them in the safe subset as much as possible) and when performance does not matter much (e.g. initialization of a driver).
In some cases, though, the only way is the runtime hit, because there may be no way to prove the unsafe
call is sound.
How does that help us, or the upstream community?
For us, it is what enables us to claim that Rust allows us to write UB-free and memory-safe code/drivers.
For the upstream community, it gives Linux the ability to have kernel-space memory-safe drivers without having to move everything into user-space (or into a multi-rings model). This should allow Linux to improve its security & reliability (due to no UB), maintainability (due to easier reviews of "leaf" modules), etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Off-topic: @TheSven73 do we need the
extern "C"
calls toIS_PTR/ERR_PTR
? Since we will be having checks againstMAX_ERRNO
anyway we can just cast them tousize
and check them?
Calling the C macros from Rust is probably more robust - if the "error pointer" implementation changes in C, things will continue to work.
Also, the "error pointer" mechanism could differ per-architecture:
Lines 10 to 18 in 7884043
/* | |
* Kernel pointers have redundant information, so we can use a | |
* scheme where we can return either an error code or a normal | |
* pointer with the same return value. | |
* | |
* This should be a per-architecture thing, to allow different | |
* error and pointer decisions. | |
*/ | |
#define MAX_ERRNO 4095 |
5cb307f
to
e021924
Compare
This comment has been minimized.
This comment has been minimized.
e021924
to
5ce8496
Compare
This comment has been minimized.
This comment has been minimized.
Can you see if existing calls to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please add the # Invariants
section to Error
explaining the type invariant.
@ojeda and @nbdd0121 I see and hear your arguments, and some are even quite convincing to me :) I do not like the blanket choice for a non-zero cost abstraction here at all, though. It might definitely be worth exploring Gary's suggestion to eliminate (some of them) with proofs + unsafe unchecked constructor. So yeah, in which cases can we replace pub fn getrandom(dest: &mut [u8]) -> error::Result {
let res = unsafe { bindings::wait_for_random_bytes() };
if res != 0 {
return Err(error::Error::from_kernel_errno(res));
}
/* ... */
} becomes pub fn getrandom(dest: &mut [u8]) -> error::Result {
// from_kernel_int_err() internally uses unsafe, unchecked so it's zero cost
unsafe { from_kernel_int_err(bindings::wait_for_random_bytes()) }
} I kind of like the analogy between |
If the C side guarantees it, of course, we can (should) use the unchecked call. In fact, since our policy is that only In other words, drivers etc. should only need to use the pre-created |
Yes, I'd be fully on board with this. One question though - why must the unchecked variant of the constructor be |
The downside is, of course, that the C side may change without notice. It would be best to document the assumptions we rely on in the C side if they are not already. Of course, that means touching the C side which is not great until we are in mainline, so perhaps we can wait on that (i.e. on going around documenting C functions). |
Because while it is true that there is no UB, we would be breaking the type invariant, which other safe code may be relying upon to remain UB-free. |
For instance, someone could write this safe but unsound function, and you could not blame them: fn f(e: Error) {
let x = e.to_kernel_errno();
// SAFETY: `x` is non-zero by the `Error` type invariant.
unsafe {
boom_if_x_is_zero(x);
}
} where: /// ...
///
/// # Safety
///
/// `x` must be non-zero.
fn boom_if_x_is_zero(x: ...) { ... } (Edit: the |
Very good point, thanks! And other safe code relies on the error value being in range. But that code can't see the error value, except via |
Yeah, see my message above with the example.
Definitely, that can be done and it would be perfectly sound too.
Well, it would allow us to remove On the other hand, you lose the type invariant. For types that e.g. have many methods, it is normally much better to model things using type invariants, because then you do not need to check on every method that requires it to be true. It is also harder to understand for readers, because they need to look into pre/post-conditions on methods, instead of "just knowing" something is true as soon as they see the type. So, in general, I would say it is best to design with type invariants where they make sense, unless there is a strong reason not to. |
@ojeda thanks for the explanation, really appreciate it!! Most of this is crystal clear to me. With the exception of the requirement to use Are you saying we could go from this: Lines 76 to 88 in ac536cc
To this? // SAFETY:
// - `this.pdrv` lives at least until the call to `platform_driver_unregister()` returns.
// - `name` pointer has static lifetime.
// - `module.0` lives at least as long as the module.
// - `probe()` and `remove()` are static functions.
// - `of_match_table` is either:
// - a raw pointer which lives until after the call to
// `bindings::platform_driver_unregister()`, or
// - null.
unsafe {
// SAFETY: `__platform_driver_register()` returns zero on success,
// a negative error code otherwise, so we can use `from_kernel_retval()`.
from_kernel_retval(bindings::__platform_driver_register(&mut this.pdrv, module.0))?
} where |
You're welcome! I would say it may be best to split the Although, to be clear on the "zero-cost": in this example, we still need the So with your helper and // SAFETY: ...
let ret = unsafe { bindings::__platform_driver_register(&mut this.pdrv, module.0) };
// SAFETY: ...
unsafe { from_kernel_retval(ret)? }; |
Does it mean we don't need this PR anymore? We will ensure type invariants inside the wrapper (from_kernel_err_ptr, from_kernel_retval...). And we use predefined const in rust side (e.g. Error::EINVAL, Error::ENOMEM.....) |
We still need this PR :-) i.e. regardless of which wrappers we create, the wrappers would use the methods of |
Great :D Lines 209 to 233 in ac536cc
If this pr is going to get accepted, we may need to use Let me include it in this PR. |
Please do not forget to add the Also, if it is already possible (not sure), feel free to go ahead and make the |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! A few constructive remarks:
- merge heads are not permitted in PRs. Each time you make a change, use
git rebase
to create only "clean commits". For this PR, you probably need only a single commit. - before pushing a new version, check if the CI will build without errors, by enabling && running CI in your own fork of the Rust-for-Linux repo. It can save a lot of time.
@ojeda would it make sense to make ksquirrel complain about merge heads in PRs? |
Sorry about this. :( |
fbf8cb5
to
00145c5
Compare
This comment has been minimized.
This comment has been minimized.
Definitely! +1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits.
We will need to make sure that no Error with out of range error code can be constructed. This commit 1. Add errno check in from_kernel_errno() 2. Provides a unchecked version from_kernel_errno_unchecked() And when an invalid errno is found, it will 1) Print a warning. 2) Convert it to EINVAL. Signed-off-by: Fox Chen <[email protected]>
00145c5
to
d032957
Compare
Review of
|
It may be a bit late to say this, but I was wondering would it be better if we return Option instead of converting it to a valid errno like this? |
Yes, that is why I asked you to put But indeed, returning a So this goes back into the error handling discussion, panics, etc. (#283). I said the approach you had was OK because we have not decided what to do yet (what you did is basically the " Perhaps we could propose upstream to have an |
Thanks for explaining. It's clearer to me. :)
I'm down to this. Ultimately, a unique error number is necessary if we take this approach, and it can be used in other similar cases. |
You're welcome! :) Let me merge this one, since it is already an improvement, and then you can send another PR to change it to |
Thank you |
I've got the following splat after enabling preemption: [ 3.724721] BUG: using __this_cpu_add() in preemptible [00000000] code: swapper/0/1 [ 3.734630] caller is __this_cpu_preempt_check+0x38/0x50 [ 3.740635] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc4-64bit+ Rust-for-Linux#324 [ 3.744605] Hardware name: 9000/785/C8000 [ 3.744605] Backtrace: [ 3.744605] [<00000000401d9d58>] show_stack+0x74/0xb0 [ 3.744605] [<0000000040c27bd4>] dump_stack_lvl+0x10c/0x188 [ 3.744605] [<0000000040c27c84>] dump_stack+0x34/0x48 [ 3.744605] [<0000000040c33438>] check_preemption_disabled+0x178/0x1b0 [ 3.744605] [<0000000040c334f8>] __this_cpu_preempt_check+0x38/0x50 [ 3.744605] [<00000000401d632c>] flush_tlb_all+0x58/0x2e0 [ 3.744605] [<00000000401075c0>] 0x401075c0 [ 3.744605] [<000000004010b8fc>] 0x4010b8fc [ 3.744605] [<00000000401080fc>] 0x401080fc [ 3.744605] [<00000000401d5224>] do_one_initcall+0x128/0x378 [ 3.744605] [<0000000040102de8>] 0x40102de8 [ 3.744605] [<0000000040c33864>] kernel_init+0x60/0x3a8 [ 3.744605] [<00000000401d1020>] ret_from_kernel_thread+0x20/0x28 [ 3.744605] Fix this by moving the __inc_irq_stat() into the locked section. Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Helge Deller <[email protected]>
We will need to make sure that no Error with out of
range error code can be constructed.
This commit
And when an invalid errno is found, it will
Signed-off-by: Fox Chen [email protected]
Issue #295