Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: check range and add type invariant to Error (issue #295) #324

Merged
merged 1 commit into from
Jun 3, 2021

Conversation

foxhlchen
Copy link

@foxhlchen foxhlchen commented May 31, 2021

We will need to make sure that no Error with out of
range error code can be constructed.

This commit

  1. Add errno check in from_kernel_errno()
  2. Provides a unchecked version from_kernel_errno_unchecked()

And when an invalid errno is found, it will

  1. Print a warning.
  2. Convert it to EINVAL.

Signed-off-by: Fox Chen [email protected]

Issue #295

@ksquirrel

This comment has been minimized.

@foxhlchen foxhlchen changed the title rust: check range and add type invariant to Error #295 rust: check range and add type invariant to Error (issue #295) May 31, 2021
Comment on lines 78 to 79
/// 1) convert it to EINVAL
/// 2) print a warning message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe swap the order of these two statements? Printing the warning happens before returning EINVAL.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from 89cf0c7 to 0bc70a3 Compare May 31, 2021 07:48
@ksquirrel

This comment has been minimized.

@foxhlchen foxhlchen requested a review from bjorn3 May 31, 2021 07:51
rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
Comment on lines 76 to 79
/// INVARIANT: make sure Error is initialized with a sane value
/// When an invalid errno is found, it will
/// 1) print a warning message
/// 2) convert it to EINVAL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at one of the other type invariants we have and how they are documented :)

rust/kernel/error.rs Outdated Show resolved Hide resolved
/// 2) convert it to EINVAL
pub fn new(errno: c_types::c_int) -> Error {
if errno < -(bindings::MAX_ERRNO as i32) || errno >= 0 {
crate::pr_warn!("Creating Error with an invalid errno {}, convert it to EINVAL", errno);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is an OK approach to avoid a panic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a caveat. printk will grab the console_lock. This is generally ok, but when one is writing a console-related thing that may lead to a potential deadlock. So I think we surely need an unchecked version from_kernel_errno_unchecked.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mistaken. I've checked printk code. If it failed to grab console_lock, it would use another approach. It won't be a problem.

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from 0bc70a3 to 780a935 Compare May 31, 2021 09:15
@ksquirrel

This comment has been minimized.

@foxhlchen foxhlchen requested a review from ojeda May 31, 2021 09:16
return Error::EINVAL;
}

return Error(errno);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Error(errno);
Error(errno)

Comment on lines 60 to 61
/// when errno given is invalid,
/// it will print a warning message and convert it to EINVAL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please capitalize the start of a sentence. Also maybe use the passive form. Eg "will be printed" instead of "it will print" and "the error will be converted" instead of "convert it".

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from 780a935 to f0b11ea Compare May 31, 2021 13:00
@ksquirrel

This comment has been minimized.

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from f0b11ea to 5cb307f Compare May 31, 2021 13:21
@ksquirrel

This comment has been minimized.

pub fn from_kernel_errno(errno: c_types::c_int) -> Error {
if errno < -(bindings::MAX_ERRNO as i32) || errno >= 0 {
Copy link
Collaborator

@TheSven73 TheSven73 May 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this implementation of #295 is not zero-cost, is that correct? I am wondering if a blanket runtime check to enforce a type invariant is overkill in this context?

The kernel people can do this in C, at runtime, if they wanted to, e.g. by adding an if/pr_warn (or even if/WARN) in the PTR_ERR() macro. But they don't. Does anyone know if this is a conscious choice, or an oversight? Did anyone check if the kernel people care? What happens in the kernel if you do return an negative error int outside -MAX_ERRNO:-1?

Is it a good use of "The Power of Rust" to insert non-zero-cost runtime checks that are possibly too paranoid?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTR_ERR() usually works in tandem with IS_ERR which does a runtime check like this. Because it's for a different semantic (either pointer or error), they don't print warnings.

But I think your point is valid. It may be overkill in some situations, so we have an unchecked version that relies on users to guarantee validity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in the kernel if you do return an negative error int outside -MAX_ERRNO:-1?

This could lead to a memory safety issue if not careful. If we have a Result<SomePointer>::Err(e) and e is outside MAX_ERRNO range, when we convert this Result into pointer, this will be recognized as a success, and the pointer points to some random kernel memory location.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to a memory safety issue if not careful. If we have a Result<SomePointer>::Err(e) and e is outside MAX_ERRNO range, when we convert this Result into pointer, this will be recognized as a success, and the pointer points to some random kernel memory location.

Kernel C has exactly the same issue: if you call ERR_PTR(x) where x is "negative-er" than -MAX_ERRNO, then you end up with a random pointer that's not IS_ERR(). My point is: it's perfectly possible to check/warn/fix these issues at runtime in C, yet this has not been done by the kernel community.

So my question is: are we "fixing" things that are not seen as a problem upstream? And are we adding non-zero cost runtime checks to accomplish this? How does that help us, or the upstream community?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really a contract between the caller and callee doesn't it? In C there are no simple ways to represent this invariant in the type system, so pretty much the caller just assumes that the return value is sane, so there're no checks. In Rust we have the opportunity to represent this in the type system.

However, I do think that in many cases this check may not be desirable. In that case the unsafe unchecked constructor should be used:

pub fn getrandom(dest: &mut [u8]) -> error::Result {
    let res = unsafe { bindings::wait_for_random_bytes() };
    if res != 0 {
        return Err(error::Error::from_kernel_errno(res));
    }
	/* ... */
}

would be:

pub fn getrandom(dest: &mut [u8]) -> error::Result {
    let res = unsafe { bindings::wait_for_random_bytes() };
    if res != 0 {
		// SAFETY: `wait_for_random_bytes` returns 0 on success and -errno on failure.
        return Err(unsafe { error::Error::from_kernel_errno_unchecked(res) });
    }
	/* ... */
}

I think this is actually good. It would prevent people from just try to convert an arbitrary int to errno, and force the caller to reason about whether the value is indeed an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-topic: @TheSven73 do we need the extern "C" calls to IS_PTR/ERR_PTR? Since we will be having checks against MAX_ERRNO anyway we can just cast them to usize and check them?

Copy link
Member

@ojeda ojeda May 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Replying to @TheSven73 -- again GitHub did not update with the new replies :)

Those are good questions. Let me write some context (which you already know, but I am saving these explanations for future discussions) first.

There are two ways we can introduce Rust into the kernel. One is as a C-like replacement, keeping semantics as close as possible (e.g. in the case here). This is what many people may expect, but that would mean we need to go the unsoundness-way, and therefore one of the major advantages of Rust is gone. There are other advantages to Rust, of course, but many may wonder whether "just a nicer language" is worth it.

The other way is to be fully sound. Of course, this means we cannot take for granted even "trivial" things (e.g. like the case here), where the caller is "supposed to do the right thing". Thus, to remain sound, we need to either take a runtime hit or have unsafe blocks (plus their proofs, plus *_unchecked() functions). Nobody likes either, but the upside is huge: it is what leads us to the "holy grail" of being able to write kernel-space memory-safe code; i.e. without the risk of exploits and without moving things to user-space.


So, with that context in mind, the answers would be:

are we "fixing" things that are not seen as a problem upstream?

It is not so much about fixing, but about being a requirement for us in order to remain sound.

And are we adding non-zero cost runtime checks to accomplish this?

That depends: in each instance, we have to decide whether to take a runtime hit or go the unsafe/*_unchecked() way.

I think, in general, the latter is best kept within rust/kernel/ code (since we reap the benefits across many modules) and/or when performance is critical, while the former is best for "leaf" modules (to be able to write them in the safe subset as much as possible) and when performance does not matter much (e.g. initialization of a driver).

In some cases, though, the only way is the runtime hit, because there may be no way to prove the unsafe call is sound.

How does that help us, or the upstream community?

For us, it is what enables us to claim that Rust allows us to write UB-free and memory-safe code/drivers.

For the upstream community, it gives Linux the ability to have kernel-space memory-safe drivers without having to move everything into user-space (or into a multi-rings model). This should allow Linux to improve its security & reliability (due to no UB), maintainability (due to easier reviews of "leaf" modules), etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-topic: @TheSven73 do we need the extern "C" calls to IS_PTR/ERR_PTR? Since we will be having checks against MAX_ERRNO anyway we can just cast them to usize and check them?

Calling the C macros from Rust is probably more robust - if the "error pointer" implementation changes in C, things will continue to work.

Also, the "error pointer" mechanism could differ per-architecture:

/*
* Kernel pointers have redundant information, so we can use a
* scheme where we can return either an error code or a normal
* pointer with the same return value.
*
* This should be a per-architecture thing, to allow different
* error and pointer decisions.
*/
#define MAX_ERRNO 4095

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from 5cb307f to e021924 Compare May 31, 2021 15:51
@ksquirrel

This comment has been minimized.

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from e021924 to 5ce8496 Compare May 31, 2021 16:21
@ksquirrel

This comment has been minimized.

@nbdd0121
Copy link
Member

Can you see if existing calls to from_kernel_errno can be replaced with the unchecked variant? The call within from_kernel_err_ptr certainly can because it's checked by IS_ERR/PTR_ERR already.

Copy link
Member

@ojeda ojeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please add the # Invariants section to Error explaining the type invariant.

rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
@TheSven73
Copy link
Collaborator

@ojeda and @nbdd0121 I see and hear your arguments, and some are even quite convincing to me :)

I do not like the blanket choice for a non-zero cost abstraction here at all, though. It might definitely be worth exploring Gary's suggestion to eliminate (some of them) with proofs + unsafe unchecked constructor.

So yeah, in which cases can we replace from_kernel_errno with the unchecked variant? Are we able to assume that certain kernel functions always return zero-or-valid-error? In that case, kernel function return values can always use an unsafe error constructor. We could even create a separate function for this, e.g.:

pub fn getrandom(dest: &mut [u8]) -> error::Result {
    let res = unsafe { bindings::wait_for_random_bytes() };
    if res != 0 {
        return Err(error::Error::from_kernel_errno(res));
    }
	/* ... */
}

becomes

pub fn getrandom(dest: &mut [u8]) -> error::Result {
    // from_kernel_int_err() internally uses unsafe, unchecked so it's zero cost
    unsafe { from_kernel_int_err(bindings::wait_for_random_bytes()) }
}

I kind of like the analogy between from_kernel_int_err() and from_kernel_err_ptr().

@ojeda
Copy link
Member

ojeda commented May 31, 2021

Are we able to assume that certain kernel functions always return zero-or-valid-error?

If the C side guarantees it, of course, we can (should) use the unchecked call.

In fact, since our policy is that only rust/kernel/ code calls bindings, we should perhaps make the constructors private for code outside the crate. Then using the unchecked one always or most of the time is not a very big deal because we are inside rust/kernel/.

In other words, drivers etc. should only need to use the pre-created const ones (e.g. when failing some request) or use the ones they got from elsewhere (e.g. when using a Result from rust/kernel/).

@TheSven73
Copy link
Collaborator

Yes, I'd be fully on board with this.

One question though - why must the unchecked variant of the constructor be unsafe? What is the benefit of it here?

@ojeda
Copy link
Member

ojeda commented May 31, 2021

The downside is, of course, that the C side may change without notice. It would be best to document the assumptions we rely on in the C side if they are not already. Of course, that means touching the C side which is not great until we are in mainline, so perhaps we can wait on that (i.e. on going around documenting C functions).

@ojeda
Copy link
Member

ojeda commented May 31, 2021

One question though - why must the unchecked variant of the constructor be unsafe? What is the benefit of it here?

Because while it is true that there is no UB, we would be breaking the type invariant, which other safe code may be relying upon to remain UB-free.

@ojeda
Copy link
Member

ojeda commented May 31, 2021

For instance, someone could write this safe but unsound function, and you could not blame them:

fn f(e: Error) {
    let x = e.to_kernel_errno();

    // SAFETY: `x` is non-zero by the `Error` type invariant.
    unsafe {
        boom_if_x_is_zero(x);
    }
}

where:

/// ...
///
/// # Safety
///
/// `x` must be non-zero.
fn boom_if_x_is_zero(x: ...) { ... }

(Edit: the SAFETY comment should actually talk about to_kernel_errno()'s post-conditions which would be the ones referring to the type invariant, but I wrote it like this to clearly show why it would be unsound if we do not uphold the type invariant).

@TheSven73
Copy link
Collaborator

TheSven73 commented May 31, 2021

One question though - why must the unchecked variant of the constructor be unsafe? What is the benefit of it here?

Because while it is true that there is no UB, we would be breaking the type invariant, which other safe code may be relying upon to remain UB-free.

Very good point, thanks! And other safe code relies on the error value being in range. But that code can't see the error value, except via to_kernel_errno() right? So maybe we could enforce the invariant only on the return value of that function? For example, by moving the "if saturation" inside to_kernel_errno()? Is there a benefit to doing that?

@ojeda
Copy link
Member

ojeda commented May 31, 2021

Very good point, thanks! And other safe code relies on the error value being in range. But that code can't see the error value, except via to_kernel_errno() right?

Yeah, see my message above with the example.

So maybe we could enforce the invariant only on the return value of that function? For example, by moving the "if saturation" inside to_kernel_errno()?

Definitely, that can be done and it would be perfectly sound too.

Is there a benefit to doing that?

Well, it would allow us to remove unsafe blocks to create Errors, which may be interesting if e.g. to_kernel_errno() is almost never used.

On the other hand, you lose the type invariant. For types that e.g. have many methods, it is normally much better to model things using type invariants, because then you do not need to check on every method that requires it to be true.

It is also harder to understand for readers, because they need to look into pre/post-conditions on methods, instead of "just knowing" something is true as soon as they see the type.

So, in general, I would say it is best to design with type invariants where they make sense, unless there is a strong reason not to.

@TheSven73
Copy link
Collaborator

@ojeda thanks for the explanation, really appreciate it!! Most of this is crystal clear to me. With the exception of the requirement to use unsafe when trusting kernel return values.

Are you saying we could go from this:

// SAFETY:
// - `this.pdrv` lives at least until the call to `platform_driver_unregister()` returns.
// - `name` pointer has static lifetime.
// - `module.0` lives at least as long as the module.
// - `probe()` and `remove()` are static functions.
// - `of_match_table` is either:
// - a raw pointer which lives until after the call to
// `bindings::platform_driver_unregister()`, or
// - null.
let ret = unsafe { bindings::__platform_driver_register(&mut this.pdrv, module.0) };
if ret < 0 {
return Err(Error::from_kernel_errno(ret));
}

To this?

        // SAFETY:
        //   - `this.pdrv` lives at least until the call to `platform_driver_unregister()` returns.
        //   - `name` pointer has static lifetime.
        //   - `module.0` lives at least as long as the module.
        //   - `probe()` and `remove()` are static functions.
        //   - `of_match_table` is either:
        //      - a raw pointer which lives until after the call to
        //       `bindings::platform_driver_unregister()`, or
        //      - null.
        unsafe {
            // SAFETY: `__platform_driver_register()` returns zero on success,
            // a negative error code otherwise, so we can use `from_kernel_retval()`.
            from_kernel_retval(bindings::__platform_driver_register(&mut this.pdrv, module.0))?
        }

where from_kernel_retval() is a zero-cost unsafe function.

@ojeda
Copy link
Member

ojeda commented Jun 2, 2021

You're welcome! I would say it may be best to split the unsafe block so that the SAFETY comments are more fine-grained, but yes.

Although, to be clear on the "zero-cost": in this example, we still need the < 0 check somewhere even if we "ignore" the MAX_ERROR check, whether it is inside a helper or not, because we need to branch, after all. (Edit: clarified that this is about the __platform_driver_register example).

So with your helper and ?, and with the unsafe zero-cost version, and split into two unsafes, it would look like:

// SAFETY: ...
let ret = unsafe { bindings::__platform_driver_register(&mut this.pdrv, module.0) };
// SAFETY: ...
unsafe { from_kernel_retval(ret)? };

@foxhlchen
Copy link
Author

We can “wrap” all three types of functions into a wrapper (one of these already exists: from_kernel_err_ptr()). Inside the wapper, we can create the Error using unsafe/unchecked, since we trust kernel C error codes. This is zero-cost.

Exactly!

And if we only allow to do such conversions within rust/kernel/, then all other Rust code should only see already-constructed Errors.

But inside Rust code, we also need to create Errors. Most of the time, we know in advance which error we will pass to the kernel. And so we can use something like return Err(Error::EINVAL); which is also zero-cost.

Indeed.

Does it mean we don't need this PR anymore? We will ensure type invariants inside the wrapper (from_kernel_err_ptr, from_kernel_retval...). And we use predefined const in rust side (e.g. Error::EINVAL, Error::ENOMEM.....)

@ojeda
Copy link
Member

ojeda commented Jun 2, 2021

Does it mean we don't need this PR anymore? We will ensure type invariants inside the wrapper (from_kernel_err_ptr, from_kernel_retval...). And we use predefined const in rust side (e.g. Error::EINVAL, Error::ENOMEM.....)

We still need this PR :-) i.e. regardless of which wrappers we create, the wrappers would use the methods of Error.

@foxhlchen
Copy link
Author

foxhlchen commented Jun 2, 2021

Does it mean we don't need this PR anymore? We will ensure type invariants inside the wrapper (from_kernel_err_ptr, from_kernel_retval...). And we use predefined const in rust side (e.g. Error::EINVAL, Error::ENOMEM.....)

We still need this PR :-) i.e. regardless of which wrappers we create, the wrappers would use the methods of Error.

Great :D

linux/rust/kernel/error.rs

Lines 209 to 233 in ac536cc

pub(crate) fn from_kernel_err_ptr<T>(ptr: *mut T) -> Result<*mut T> {
extern "C" {
#[allow(improper_ctypes)]
fn rust_helper_is_err(ptr: *const c_types::c_void) -> bool;
#[allow(improper_ctypes)]
fn rust_helper_ptr_err(ptr: *const c_types::c_void) -> c_types::c_long;
}
// CAST: casting a pointer to `*const c_types::c_void` is always valid.
let const_ptr: *const c_types::c_void = ptr.cast();
// SAFETY: the FFI function does not deref the pointer.
if unsafe { rust_helper_is_err(const_ptr) } {
// SAFETY: the FFI function does not deref the pointer.
let err = unsafe { rust_helper_ptr_err(const_ptr) };
// CAST: if `rust_helper_is_err()` returns `true`,
// then `rust_helper_ptr_err()` is guaranteed to return a
// negative value greater-or-equal to `-bindings::MAX_ERRNO`,
// which always fits in an `i16`, as per the invariant above.
// And an `i16` always fits in an `i32`. So casting `err` to
// an `i32` can never overflow, and is always valid.
return Err(Error::from_kernel_errno(err as i32));
}
Ok(ptr)
}

If this pr is going to get accepted, we may need to use
return Err(Error::from_kernel_errno_unchecked(err as i32));
in from_kernel_err_ptr() since we've already verified that by rust_helper_is_err.

Let me include it in this PR.

@ojeda
Copy link
Member

ojeda commented Jun 2, 2021

Please do not forget to add the # Invariants section to Error (I mentioned this previously, but it seems it got lost).

Also, if it is already possible (not sure), feel free to go ahead and make the Error constructors private to the crate.

@ksquirrel

This comment has been minimized.

Copy link
Collaborator

@TheSven73 TheSven73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! A few constructive remarks:

  1. merge heads are not permitted in PRs. Each time you make a change, use git rebase to create only "clean commits". For this PR, you probably need only a single commit.
  2. before pushing a new version, check if the CI will build without errors, by enabling && running CI in your own fork of the Rust-for-Linux repo. It can save a lot of time.

@TheSven73
Copy link
Collaborator

@ojeda would it make sense to make ksquirrel complain about merge heads in PRs?

@foxhlchen
Copy link
Author

Thanks for the contribution! A few constructive remarks:

  1. merge heads are not permitted in PRs. Each time you make a change, use git rebase to create only "clean commits". For this PR, you probably need only a single commit.
  2. before pushing a new version, check if the CI will build without errors, by enabling && running CI in your own fork of the Rust-for-Linux repo. It can save a lot of time.

Sorry about this. :(
I accidentally clicked merge button on my branch's front page.
I'm working on fixing it.

@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from fbf8cb5 to 00145c5 Compare June 2, 2021 13:27
@ksquirrel

This comment has been minimized.

@ojeda
Copy link
Member

ojeda commented Jun 2, 2021

@ojeda would it make sense to make ksquirrel complain about merge heads in PRs?

Definitely! +1

Copy link
Collaborator

@TheSven73 TheSven73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

rust/kernel/error.rs Outdated Show resolved Hide resolved
Copy link
Member

@ojeda ojeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits.

rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Outdated Show resolved Hide resolved
rust/kernel/error.rs Show resolved Hide resolved
We will need to make sure that no Error with out of
range error code can be constructed.

This commit
1. Add errno check in from_kernel_errno()
2. Provides a unchecked version from_kernel_errno_unchecked()

And when an invalid errno is found, it will
1) Print a  warning.
2) Convert it to EINVAL.

Signed-off-by: Fox Chen <[email protected]>
@foxhlchen foxhlchen force-pushed the 295-ERROR-invariant branch from 00145c5 to d032957 Compare June 3, 2021 01:46
@ksquirrel
Copy link
Member

Review of d0329579a9c1:

  • ✔️ Commit d032957: Looks fine!

@foxhlchen
Copy link
Author

It may be a bit late to say this, but I was wondering would it be better if we return Option instead of converting it to a valid errno like this?
We convert it to Error::EINVAL, what if it happens to have other meanings for the caller?? For example, a caller may expect EINVAL for "wrong parameters". In this situation, the caller has no way to know whether it's an invalid error number from the c side or simply "wrong parameters". It's too ambiguous. What's even worse, new developers might not notice this behavior of 'Error' and blindly treat EINVAL as it is, this would lead to subtle bugs.

@ojeda
Copy link
Member

ojeda commented Jun 3, 2021

Yes, that is why I asked you to put It is a bug to call this ..., so that it was 100% clear it is not supposed to be used like that.

But indeed, returning a Result (not an Option -- it is an error to call it with an integer, after all) is a more flexible approach because it allows callers to do something different if needed. The downside is that then we are dealing with a fallible API which is never supposed to fail with all its problems, which is why in "normal Rust" it would be a panic.

So this goes back into the error handling discussion, panics, etc. (#283). I said the approach you had was OK because we have not decided what to do yet (what you did is basically the "WARN() and continue" approach).

Perhaps we could propose upstream to have an EBUG or an EIKE ("Internal Kernel Error") for these cases ;-)

@foxhlchen
Copy link
Author

Thanks for explaining. It's clearer to me. :)

Perhaps we could propose upstream to have an EBUG or an EIKE ("Internal Kernel Error") for these cases ;-)

I'm down to this. Ultimately, a unique error number is necessary if we take this approach, and it can be used in other similar cases.

@ojeda
Copy link
Member

ojeda commented Jun 3, 2021

You're welcome! :)

Let me merge this one, since it is already an improvement, and then you can send another PR to change it to Result<Error>.

@ojeda ojeda merged commit 5c65d82 into Rust-for-Linux:rust Jun 3, 2021
@foxhlchen
Copy link
Author

You're welcome! :)

Let me merge this one, since it is already an improvement, and then you can send another PR to change it to Result<Error>.

Thank you
Will do!

mdaverde pushed a commit to mdaverde/linux that referenced this pull request Nov 7, 2021
I've got the following splat after enabling preemption:

[    3.724721] BUG: using __this_cpu_add() in preemptible [00000000] code: swapper/0/1
[    3.734630] caller is __this_cpu_preempt_check+0x38/0x50
[    3.740635] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc4-64bit+ Rust-for-Linux#324
[    3.744605] Hardware name: 9000/785/C8000
[    3.744605] Backtrace:
[    3.744605]  [<00000000401d9d58>] show_stack+0x74/0xb0
[    3.744605]  [<0000000040c27bd4>] dump_stack_lvl+0x10c/0x188
[    3.744605]  [<0000000040c27c84>] dump_stack+0x34/0x48
[    3.744605]  [<0000000040c33438>] check_preemption_disabled+0x178/0x1b0
[    3.744605]  [<0000000040c334f8>] __this_cpu_preempt_check+0x38/0x50
[    3.744605]  [<00000000401d632c>] flush_tlb_all+0x58/0x2e0
[    3.744605]  [<00000000401075c0>] 0x401075c0
[    3.744605]  [<000000004010b8fc>] 0x4010b8fc
[    3.744605]  [<00000000401080fc>] 0x401080fc
[    3.744605]  [<00000000401d5224>] do_one_initcall+0x128/0x378
[    3.744605]  [<0000000040102de8>] 0x40102de8
[    3.744605]  [<0000000040c33864>] kernel_init+0x60/0x3a8
[    3.744605]  [<00000000401d1020>] ret_from_kernel_thread+0x20/0x28
[    3.744605]

Fix this by moving the __inc_irq_stat() into the locked section.

Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants