
rust: thread: Add Thread wrapper #109

Open · wants to merge 2 commits into base: rust
Conversation


@fbq fbq commented Mar 15, 2021

It works (on my machine), but there is surely a lot to improve.

Updates (v9):

  • Fix the potential memory leak found by @wedsonaf.
  • We now have two thread structures: RawThread and Thread<ThreadArg>; each has its own try_new function. try_new_closure is now kernel::thread::spawn.
  • Other minor fixes.

Updates (v8):

  • Rebase to the latest rust/rust.

Updates (v7):

  • Rebase to the latest rust/rust to adopt sample code changes.

Updates (v6):

  • Fix style and typos as suggested by @ojeda.
  • Add a case where a DST is shared between threads to the examples, where try_new_closure() is slightly more appropriate than try_new().
  • Fix a bug: the ThreadArg parameter of try_new() must have a 'static bound, because the new thread may outlive the caller.
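As an aside for readers, the 'static requirement in the last bullet mirrors what std::thread::spawn enforces in userspace Rust. A minimal std-only sketch (not kernel code) of why the spawned thread's argument must be owned:

```rust
use std::thread;

fn main() {
    let local = String::from("hello");

    // A spawned thread may outlive the spawning scope, so the closure
    // must own its captures (satisfying the 'static bound): moving
    // `local` in is what makes this compile.
    let handle = thread::spawn(move || {
        assert_eq!(local, "hello");
    });
    handle.join().unwrap();

    // By contrast, capturing a non-'static borrow would be rejected:
    // let s = String::from("oops");
    // thread::spawn(|| println!("{}", s)); // error: closure may outlive `s`
}
```

The kernel's try_new faces the same issue: the new kthread may run after the creator's stack frame is gone, so borrowed arguments cannot be allowed.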

Updates (v5):

  • Make ThreadArg easier to merge with PointerWrapper.
  • Take the suggestion from @ojeda on the function names (I still keep the try_ prefixes, as that indicates the return type is KernelResult, not Thread).
  • Add a "Memory Cost" documentation section for try_new_closure, so that users will know the current drawbacks.

Updates (v4):

  • Don't use a macro to generate bridge functions.
  • Still surface try_new_c_style.

Updates (v3):

  • Multiple documentation and style fixes.
  • Rename Thread::new to Thread::try_new because, as the return type suggests, creation may fail.
  • Add Thread::try_new_c_style to allow people to use a C function pointer for thread creation.
  • Move the name parameter to be the first parameter of Thread::try_new.

Update:

  • Documentation is added ;-)
  • Thread::stop is added, to support waiting for a thread to finish.
  • Correctly handle the refcount of the task_struct during Thread::new and Thread::drop.
  • Now Thread::new accepts FnOnce() -> KernelResult<()> as the closure for kernel threads.

@wedsonaf wedsonaf left a comment

This is very cool. Once this is in, we can add better examples for mutex/spinlock/condvar to the examples.

@ojeda ojeda left a comment

Thanks a lot for this! Having new contributors is definitely great :)


ojeda commented Mar 16, 2021

I think I'd like to see first workqueues etc. before lower-level APIs, but nevertheless any proof of concept is welcome :)

/// )?;
///
/// t.wake_up();
pub fn new<F>(f: F, name: CStr) -> KernelResult<Self>


How hard would it be to have a C-style non-allocating version? The thread function would be a static function that doesn't capture any environment and takes a single argument no bigger than the size of a pointer.

Member Author

You mean something like the code below, which is basically a wrapper around kthread_create_on_node? And if you want to share data between threads, you still need to do the Box/Arc dance yourself.

    pub unsafe fn new_unsafe(
        name: CStr,
        f: extern "C" fn(*mut c_types::c_void) -> i32,
        arg: *mut c_types::c_void,
    ) -> KernelResult<Self> {
        let task = bindings::kthread_create_on_node(
            Some(f),
            arg,
            bindings::NUMA_NO_NODE,
            c_str!("%s").as_ptr() as _,
            name.as_ptr(),
        );

        ptr_to_result(task)?;

        Ok(Thread { task })
    }


That's very close to what I was suggesting. By C-style I just meant that it should take one pointer-sized argument: it doesn't have to be extern "C", and it can use usize or perhaps some arbitrary type T whose size is less than or equal to size_of(usize) -- perhaps we could enforce this with a static assert.
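One possible shape for the static assert mentioned above, as a standalone sketch (the type and function names here are illustrative, not part of the PR):

```rust
use core::marker::PhantomData;
use core::mem::size_of;

// Compile-time check that `T` is no larger than a pointer, so it could be
// smuggled through the C `void *` thread argument.
struct AssertPointerSized<T>(PhantomData<T>);

impl<T> AssertPointerSized<T> {
    const CHECK: () = assert!(size_of::<T>() <= size_of::<usize>());
}

fn spawn_with_arg<T>(_arg: T) {
    // Referencing the constant forces its evaluation at monomorphization
    // time; a too-large `T` becomes a compile-time error rather than a
    // runtime surprise. The real function would then pack `_arg` into the
    // pointer-sized C argument.
    let () = AssertPointerSized::<T>::CHECK;
}

fn main() {
    spawn_with_arg(7u32);   // fine: u32 fits in a pointer
    spawn_with_arg(0usize); // fine
    // spawn_with_arg([0u64; 4]); // compile-time error: larger than a pointer
}
```

This is the usual monomorphization-time static-assert trick; it needs a Rust version where `assert!` works in const contexts (1.57+).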

Member Author

@wedsonaf Could we do function pointer conversion from Rust to C? If not, what's the function pointer that we are going to pass to kthread_create_on_node?

Member

What do you mean? Whether it is safe to cast a C function pointer to a Rust function pointer?

Member Author

And I pushed the old version, the one with the thread creation macro, here: https://github.com/fbq/linux-rust/tree/dev/thread-old, for discussion purposes.

Member Author

@ojeda I uploaded an updated version:

  • Took your suggestion on the names, but I still keep the try_ prefixes since the return type is KernelResult. And instead of hiding try_new_closure, I added a "Memory Cost" section to the documentation. I think the API itself doesn't have problems; it's the implementation that we are more worried about, so I keep it public but don't recommend it to users unless they are aware of the cost.
  • Did some preparatory work for the future merge with PointerWrapper.

@ojeda ojeda Apr 1, 2021

Looks better! Thanks a lot, and sorry again for the delay.

Is there any use case you can think of where try_new() cannot (reasonably easily) cover try_new_closure()? If so, it would be good to include it as an example.

Member Author

One limitation is that ThreadArg has to be Sized, otherwise we cannot convert it to *const (). So if there is a DST shared between threads, we either use try_new_closure() or do our own double boxing and stay with try_new(). A somewhat pointless example is as follows:

let boxed_slice: Box<[i32]> = Box::try_new([1, 2, 3, 4])?;

Thread::try_new_closure(move || {
    println!("{:?}", boxed_slice);
});

// This will fail to compile:
// Thread::try_new::<...>(boxed_slice);

// Do your own double boxing and it will work:
Thread::try_new::<...>(Box::try_new(boxed_slice)?);
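For readers outside the kernel tree, the same fat-vs-thin pointer issue can be demonstrated with plain std Rust. This standalone sketch (not the PR's API) shows why a Box<[i32]> cannot ride in a single pointer-sized argument while a double box can:

```rust
use std::mem::size_of;

fn main() {
    let slice: Box<[i32]> = vec![1, 2, 3, 4].into_boxed_slice();

    // A Box<[i32]> is a fat pointer (data pointer + length): two usizes,
    // so it cannot be passed through a single `void *`-sized argument.
    assert_eq!(size_of::<Box<[i32]>>(), 2 * size_of::<usize>());

    // Double boxing yields a thin pointer that fits in one usize.
    let double: Box<Box<[i32]>> = Box::new(slice);
    assert_eq!(size_of::<Box<Box<[i32]>>>(), size_of::<usize>());

    // This thin pointer is what could cross the C boundary...
    let raw: *mut Box<[i32]> = Box::into_raw(double);

    // ...and the thread entry point would recover and use it.
    let recovered = unsafe { Box::from_raw(raw) };
    assert_eq!(recovered.iter().sum::<i32>(), 10);
}
```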

Member

I think in that case doing the double boxing "by hand" is fine and it can be explained in try_new's docs.

But I also think showing how closures could be done may be worth it since it is early days. So I think we can keep it, and see what other folks think (and perhaps someone comes up with a way to support closures with a single allocation etc.).

In the end, I suspect only the faster and non-/less-allocating APIs will survive considering this is the kernel, but we will see!


fbq commented Mar 19, 2021

https://github.com/Rust-for-Linux/linux/pull/109/checks?check_run_id=2147388614#step:35:26

Is this a false positive? Because the code is in an unsafe block:

5d3fd95#diff-67dbdd410624adbd4c7ff8aac814d8aa148376945f04c4d690b4d947809d17c4R73

Besides, the raw pointer is not actually dereferenced in the function.


wedsonaf commented Mar 19, 2021 via email


fbq commented Mar 19, 2021

@wedsonaf But I don't dereference the pointer in the function. So should the following function be unsafe?

pub fn f(ptr: *mut i32) -> i32 {
    ptr as usize as i32
}


fbq commented Mar 19, 2021

And if I assign arg to a local binding, the problem is gone:

--- a/rust/kernel/thread.rs
+++ b/rust/kernel/thread.rs
@@ -62,18 +62,21 @@ impl Thread {
     ) -> KernelResult<Self> {
         let task;

+        // To make clippy happy.
+        let arg_copy = arg;
+
         // SAFETY:
         //
         // - `kthread_create_on_node` won't use `f` or dereference `arg`, if `arg` is an invalid
        // pointer or `f` doesn't handle `arg` as it should, the new thread will cause unsafe
         // behaviors.

         // - `kthread_create_on_node` will copy the content of `name`, so we don't need to make the
         // `name` live longer.
         unsafe {
             task = ptr_to_result(bindings::kthread_create_on_node(
                 Some(f),
-                arg,
+                arg_copy,


wedsonaf commented Mar 19, 2021

@wedsonaf But I don't dereference the pointer in the function.

Well, you call an unsafe function defined in another language. The compiler doesn't know what the function does nor what its unsafe requirements are, so it can't convince itself that the pointer is not dereferenced.

So should the following function be unsafe?

No, because the compiler can convince itself that you don't dereference it in this case.

@wedsonaf

And if I assign arg to a local binding, the problem is gone:

I guess clippy is not smart enough to follow the usage of the copy...

But let's consider why we mark a function unsafe. I see at least two distinct reasons:

  1. Because it genuinely does something that violates Rust's safety rules.
  2. Because it has extra requirements beyond the ordinary ones on the caller, and we'd like to make the caller aware of these requirements.

I think this function falls in the second category. WDYT?
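A userspace sketch of that second category (all names illustrative): the unsafe method below performs no unsafe operation itself, but callers must uphold an invariant that later safe code relies on, much like Pin::get_unchecked_mut in the standard library.

```rust
struct SelfRef {
    data: String,
    // Invariant: `ptr` points into `data`'s heap buffer (set by `new`).
    ptr: *const u8,
}

impl SelfRef {
    fn new(s: &str) -> Box<Self> {
        let mut b = Box::new(SelfRef {
            data: s.to_string(),
            ptr: std::ptr::null(),
        });
        b.ptr = b.data.as_ptr();
        b
    }

    /// Category 2: the body is trivial and does nothing unsafe, but the
    /// caller must promise not to move or reallocate `data`, or `ptr`
    /// (which `first_byte` relies on) becomes dangling *later*.
    unsafe fn data_mut(&mut self) -> &mut String {
        &mut self.data
    }

    fn first_byte(&self) -> u8 {
        // SAFETY: the invariant guarantees `ptr` points into `data`.
        unsafe { *self.ptr }
    }
}

fn main() {
    let mut s = SelfRef::new("hi");
    assert_eq!(s.first_byte(), b'h');
    // SAFETY: an in-place byte write neither moves nor reallocates the
    // backing buffer, and 'H' keeps the string valid UTF-8.
    unsafe {
        s.data_mut().as_mut_vec()[0] = b'H';
    }
    assert_eq!(s.first_byte(), b'H');
}
```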


fbq commented Mar 19, 2021

But let's consider why we mark a function unsafe. I see at least two distinct reasons:

  1. Because it genuinely does something that violates Rust's safety rules.
  2. Because it has extra requirements beyond the ordinary ones on the caller, and we'd like to make the caller aware of these requirements.

I think this function falls in the second category. WDYT?

I did consider making it unsafe at first, but the function itself doesn't dereference the arg pointer or call the f function, so I made it safe. Now that I think about it, though, it makes more sense to make it unsafe. Let me send an updated version.

@ojeda ojeda left a comment

Nit-review.

To clarify on the whole newline vs. no newline thing around safety comments:

  • /// # Safety goes with newlines because it is a Markdown section (#) for an unsafe function/method docs (i.e. 3 slashes: ///). It explains the preconditions the caller must uphold.
  • // SAFETY: goes without newlines because it is a code comment (i.e. 2 slashes //, no Markdown section) on the following unsafe block that explains why it is sound.
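A small sketch illustrating both conventions together (the function is illustrative, not from the PR):

```rust
/// Reads the value behind `ptr`.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and point to a live `i32`.
pub unsafe fn read_i32(ptr: *const i32) -> i32 {
    // SAFETY: the caller upholds the contract documented above.
    unsafe { *ptr }
}

fn main() {
    let x = 42;
    // SAFETY: `&x` is a valid, aligned pointer to a live `i32`.
    assert_eq!(unsafe { read_i32(&x) }, 42);
}
```

The `/// # Safety` section states the caller's obligations; each `// SAFETY:` comment explains why a particular unsafe block discharges them.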


ojeda commented Mar 19, 2021

Because it has extra requirements beyond the ordinary ones on the caller, and we'd like to make the caller aware of these requirements.

A function should be qualified as unsafe whenever there is a path of execution that triggers UB. If there is no way to trigger UB with it, it shouldn't be marked as unsafe, even if it panics or breaks the logic of the program.

I am not sure we want to mark things as unsafe just to make the caller aware of extra preconditions (the same way, we don't want to mark things as safe just because it is safe "in most cases" or "if you do the right thing").

Some people may want to have an "almost safe" qualifier, though. For instance, when the only real way to break a function is to pass an invalid pointer but otherwise there is no other UB. I mention this and a few other things in N2659 (feedback welcome!), although I only propose [[safe]] for C.

Well, you call an unsafe function defined in another language. The compiler doesn't know what the function does nor what its unsafe requirements are, so it can't convince itself that the pointer is not dereferenced.

FFI is already unsafe: when we call an external function, we are promising that function won't trigger UB in any way with the arguments we are passing. The compiler does not need to know whether it is safe or not: we are stating it is.

In summary: if kthread_create_on_node can trigger UB, and we have no way of preventing that (e.g. by checking arguments before calling it), then try_new_c_style should be qualified as unsafe.

@wedsonaf

A function should be qualified as unsafe whenever there is a path of execution that triggers UB. If there is no way to trigger UB with it, it shouldn't be marked as unsafe, even if it panics or breaks the logic of the program.

This is not true. A function needs to be marked unsafe even if the function itself cannot lead to UB, but something later on can. An example of this from the standard library is Pin::get_unchecked_mut: the function is trivial and obviously doesn't cause UB, but it has extra safety requirements that are imposed on the caller so as to avoid future UBs.

FFI is already unsafe: when we call an external function, we are promising that function won't trigger UB in any way with the arguments we are passing. The compiler does not need to know whether it is safe or not: we are stating it is.

This is also not true. Our "promise" can be conditional: if we mark the caller as unsafe as well, we're saying that calling an unsafe FFI won't trigger UB as long as the caller of the function also abides by the additional safety requirements.

In summary: if kthread_create_on_node can trigger UB, and we have no way of preventing that (e.g. by checking arguments before calling it), then try_new_c_style should be unsafe.

We are in agreement on this part. :)


ojeda commented Mar 19, 2021

This is not true. A function needs to be marked unsafe even if the function itself cannot lead to UB, but something later on can. An example of this from the standard library is Pin::get_unchecked_mut: the function is trivial and obviously doesn't cause UB, but it has extra safety requirements that are imposed on the caller so as to avoid future UBs.

Yes, by "any path of execution" and "If there is no way to trigger UB with it", I'm including the context, not just the operations inside the function, e.g. if somewhere else you are relying on an invariant to avoid UB and your function may break that invariant, then it is unsafe.

The same thing applies in reverse: a function may contain a fundamentally unsafe operation in it, e.g. a dereference, and yet be marked as safe as long as something else in the program ensures that dereference is always valid (i.e. that there is no path of execution that may arrive at the dereference triggering UB).
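A userspace sketch of this reverse case (names illustrative): a safe method containing a raw dereference whose validity is guaranteed by a constructor-established invariant.

```rust
// `get` contains a fundamentally unsafe operation (a raw dereference),
// yet is safely callable: the constructor establishes an invariant that
// makes every path to the dereference valid.
struct Checked {
    ptr: *const i32, // invariant: always points to a live 'static i32
}

impl Checked {
    fn new(v: &'static i32) -> Self {
        Checked { ptr: v }
    }

    fn get(&self) -> i32 {
        // SAFETY: `ptr` came from a `&'static i32`, so it is always valid.
        unsafe { *self.ptr }
    }
}

static VALUE: i32 = 5;

fn main() {
    let c = Checked::new(&VALUE);
    assert_eq!(c.get(), 5);
}
```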

This is also not true. Our "promise" can be conditional: if we mark the caller as unsafe as well, we're saying that calling an unsafe FFI won't trigger UB as long as the caller of the function also abides by the additional safety requirements.

Of course, I am talking about the case when the outer function is qualified as safe. If it isn't, the language does not require to use unsafe blocks to begin with (although it is still a good idea).

@ojeda ojeda left a comment

Bits here and there.


fbq commented Mar 20, 2021

My thought is that if we don't make try_new_c_style unsafe, then we need to make wake_up unsafe, because wake_up is the function that potentially triggers the call of the unsafe C function, and that's undesirable, because calling wake_up on a Thread created by try_new is safe. If you all agree on this, I will add the reasoning to the comments for try_new_c_style, so that the future me (e.g. the six-months-later version) will be grateful ;-)

@fbq fbq force-pushed the dev/thread branch 4 times, most recently from 77c70dc to 6a9da4b Compare March 22, 2021 07:47

fbq commented Sep 17, 2021

v9 -> v10:

  • Use a single-boxing implementation for closures as thread functions.
  • Remove the ThreadFunctor way of thread creation, since using closures no longer requires double boxing.

@fbq fbq force-pushed the dev/thread branch 2 times, most recently from 3b618d5 to 413a7b9 Compare September 17, 2021 14:35

fbq commented Sep 17, 2021

Pushing a temporary version to see which commit triggered the "Inconsistent kallsym data", as I failed to reproduce this locally.

@fbq fbq force-pushed the dev/thread branch 9 times, most recently from db49a24 to a885525 Compare September 22, 2021 17:28

fbq commented Sep 23, 2021

Seems like I'm hitting a rarely seen "bug" of scripts/link-vmlinux.sh: currently file sizes are used to decide whether the kallsym results of step 1 and step 2 are different, however, I'm hitting a case where the file sizes are the same but the content is different...

@fbq fbq force-pushed the dev/thread branch 2 times, most recently from 657b015 to b207507 Compare September 28, 2021 04:56

fbq commented Sep 28, 2021

Seems like I'm hitting a rarely seen "bug" of scripts/link-vmlinux.sh: currently file sizes are used to decide whether the kallsym results of step 1 and step 2 are different, however, I'm hitting a case where the file sizes are the same but the content is different...

And the problem is gone, maybe because we update to a new version of toolchain.

@fbq fbq requested review from ojeda and wedsonaf September 28, 2021 05:15

ojeda commented Sep 28, 2021

Seems like I'm hitting a rarely seen "bug" of scripts/link-vmlinux.sh: currently file sizes are used to decide whether the kallsym results of step 1 and step 2 are different, however, I'm hitting a case where the file sizes are the same but the content is different...

Hmm... perhaps we should put this in an issue even if it is gone.


fbq commented Sep 29, 2021

Seems like I'm hitting a rarely seen "bug" of scripts/link-vmlinux.sh: currently file sizes are used to decide whether the kallsym results of step 1 and step 2 are different, however, I'm hitting a case where the file sizes are the same but the content is different...

Hmm... perhaps we should put this in an issue even if it is gone.

Done: #502

@fbq fbq changed the title [RFC] rust: thread: Add Thread wrapper rust: thread: Add Thread wrapper Oct 23, 2021
@wedsonaf wedsonaf left a comment

This is looking really good, thank you Boqun! And apologies for taking so long to get back to this.

/// The safety requirements of calling this function are:
///
/// - Make sure `arg` is a proper pointer that points to a valid memory
/// location when the new thread begins to run.

Technically, arg could be a pointer-sized integer, couldn't it?

Also, when it actually is a pointer, I think we need to be more explicit about its lifetime requirements: in addition to being valid when the new thread begins to run, it must remain valid for as long as f needs it. (IOW, if f needs it forever, it should stay alive forever.)

/// # Context
///
/// This function might sleep in `kthread_create_on_node` due to the memory
/// allocation and waiting for the completion, therefore do not call this

nit about wording: I would prefer that we avoid using the imperative mood. So instead of saying "do not call this in atomic context", perhaps say something like "Callers must not be in atomic context".

Some(f),
arg,
bindings::NUMA_NO_NODE,
c_str!("%s").as_ptr() as _,

Probably something for another PR, but we should probably allow callers to specify Rust-style formats when calling this function, so that they can do something similar to C when choosing names: something like ("my-thread-{}", index), where in C we'd see ("my-thread-%d", index).

Member Author

Agreed, but that requires the API implemented as a macro, right?


Agreed, but that requires the API implemented as a macro, right?

Maybe not. Instead of taking a &CStr as argument, you'd take a core::fmt::Arguments. Callers would have to use the format_args! macro to create the name, but they already had to use c_str! anyway; if format_args! is too long, perhaps we could define a new one called fmt! that just calls format_args!.

And the implementation would pass c_str!("%pA") as the format, with a pointer to the Arguments arg as the first (and only) format argument.

Perhaps you could do it in this PR if you're so inclined. But I'm fine with deferring it too.
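A userspace sketch of the Arguments-based signature (the function name and the String return are illustrative; the real implementation would hand the formatted name to the C side via the `%pA` specifier as described above):

```rust
use std::fmt::Arguments;

// Accept `fmt::Arguments` instead of a pre-formatted C string, so
// callers can write Rust-style formats for thread names.
fn spawn_named(name: Arguments<'_>) -> String {
    // Stand-in for handing the name to `kthread_create_on_node`:
    // here we just render it.
    format!("{}", name)
}

fn main() {
    let n = spawn_named(format_args!("my-thread-{}", 3));
    assert_eq!(n, "my-thread-3");
}
```

Callers pass `format_args!(...)` directly at the call site, which is what the suggested `fmt!` shorthand would wrap.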

/// # Context
///
/// This function might sleep, don't call it in atomic contexts.
pub fn wake_up(&self) {

Are there any functions from Task that we don't want exposed by RawThread?

If not, I think we should have RawThread implement Deref<Target = Task> so that we expose all Task methods directly without having to write these helper routines.
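A minimal sketch of the Deref suggestion, with illustrative stand-in types for the kernel's `Task` and `RawThread`:

```rust
use std::ops::Deref;

struct Task {
    id: u32,
}

impl Task {
    fn wake_up(&self) -> u32 {
        self.id
    }
}

struct RawThread {
    task: Task,
}

// Deref lets every `Task` method be called on a `RawThread` directly,
// with no hand-written forwarding helpers.
impl Deref for RawThread {
    type Target = Task;

    fn deref(&self) -> &Task {
        &self.task
    }
}

fn main() {
    let t = RawThread { task: Task { id: 7 } };
    // `wake_up` resolves through `Deref`.
    assert_eq!(t.wake_up(), 7);
}
```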

/// Stops the thread.
///
/// - If the thread hasn't been woken up after creation, the thread function
/// won't be called, and this function will return `-EINTR`. Note that a

One of the selling points of Rust is that we don't mix errors and proper return values. In this instance, it seems like we're violating this by returning an error when the function actually succeeded.

I think we should try to avoid this. Perhaps return Result<bool>: an Err indicates that something went wrong in our attempt to stop the thread; Ok(b) means that we successfully stopped the thread, and b is true if it (the thread) managed to run at some point, false if we stopped it before it ever ran. Of course, I'm not insisting on this solution, but I'd like to clearly separate success and failure cases. WDYT?
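A toy sketch of the proposed Result<bool> shape (the error type and the boolean inputs are stand-ins; a real stop would query the thread's state):

```rust
#[derive(Debug, PartialEq)]
enum Error {
    Intr, // stand-in for -EINTR
}

// Errors and the "did the thread ever run" outcome stay separate,
// instead of overloading -EINTR for a successful early stop.
fn stop(ever_ran: bool, failed: bool) -> Result<bool, Error> {
    if failed {
        Err(Error::Intr)
    } else {
        Ok(ever_ran)
    }
}

fn main() {
    assert_eq!(stop(true, false), Ok(true));   // ran, then stopped
    assert_eq!(stop(false, false), Ok(false)); // stopped before it ever ran
    assert_eq!(stop(false, true), Err(Error::Intr)); // genuine failure
}
```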

/// # Examples
///
/// ```rust
/// #![feature(allocator_api)]

Do we actually need this in the sample?

/// # Context
///
/// This function might sleep, don't call it in atomic contexts.
pub fn wake_up(&self) {

Same question here wrt methods from Task/RawThread -- can we use Deref to avoid having to forward these calls to the wrapped object?

// `TASK_DEAD`, and then `schedule` will not come. Currently we don't have a
// way to do this safely in Rust, and in the future, we probably still won't
// allow it.
unsafe {

We also have cond_resched in rust/kernel/sync/mod.rs.

I think these two need to be together. I'm trying to think of a good place for them, one thing that comes to mind is Task, so one would call Task::schedule() and Task::cond_reschedule (no self involved here). WDYT?

Thread::try_new(c_str!("rust-thread0"), || {
pr_info!("Hello Rust Thread");
Ok(())
})?

Looking at this, the following thought came to mind: what if we never call wake_up (either on purpose or by omission, i.e., not having realised that it needs to be called)?

The thread and its state would be stuck in no-man's land, right? I think we should make it harder for such a state to be reached; another way to put it: a better design would be one in which this no-man's-land state is reached only when the user explicitly intends it.

Two things come to mind:

  1. We should offer a constructor that automatically starts the thread, in addition to the one we have now. In fact, I think try_new should automatically wake up the thread, and the other one could be something like try_new_paused (or something to that effect).

  2. Do thread creation in two stages with different types. The first stage is one in which the thread was never started; there are three ways to come out of it (they would all consume this initial type and optionally return a new one): starting the thread (returns the Thread as you have here); stopping the thread (doesn't return anything); upgrading (doesn't start the thread, but returns the Thread as you have here). If none of these actions is taken, we default to stopping.

Some examples to illustrate what I mean:

Thread::try_new(...); // automatically runs the closure

Thread::try_new_paused(...).run(); // automatically runs the closure

// Creates the thread but never runs it and automatically cleans
// everything up (no accidental leaks).
Thread::try_new_paused();

// Same as above
Thread::try_new_paused().stop();

// `upgrade` is not the best name, we should think of something better.
//
// creates the thread, never runs it, but doesn't do any cleanup beyond
// `put_task_struct`, which is basically what we have today but with an
// explicit action from the user (calling `upgrade`).
Thread::try_new_paused().upgrade(); 

To implement the above, Thread::try_new_paused would return a different type, something like InitialThread, that would have the run, upgrade, and stop methods plus a Drop implementation that calls stop (and would be inhibited by the calls to run and upgrade).

WDYT?
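A userspace toy of the typestate idea (the AtomicBool stands in for actually stopping a kernel thread; all names follow the discussion but are illustrative):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the implicit/explicit "stop" ever happened.
static STOPPED: AtomicBool = AtomicBool::new(false);

struct Thread;

struct InitialThread {
    consumed: bool,
}

impl InitialThread {
    fn try_new_paused() -> InitialThread {
        InitialThread { consumed: false }
    }

    fn run(mut self) -> Thread {
        self.consumed = true; // Drop still runs, but now skips the stop
        Thread
    }

    fn stop(mut self) {
        self.consumed = true;
        STOPPED.store(true, Ordering::Relaxed);
    }
}

impl Drop for InitialThread {
    fn drop(&mut self) {
        if !self.consumed {
            // Default: a never-started thread is stopped, not leaked.
            STOPPED.store(true, Ordering::Relaxed);
        }
    }
}

fn main() {
    {
        let t = InitialThread::try_new_paused();
        let _running: Thread = t.run(); // explicit start: no implicit stop
    }
    assert!(!STOPPED.load(Ordering::Relaxed));

    {
        let _t = InitialThread::try_new_paused();
        // dropped without `run`: cleaned up automatically
    }
    assert!(STOPPED.load(Ordering::Relaxed));
}
```

`upgrade` is omitted for brevity; it would look like `run` but leave the underlying thread unstarted.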


Two more examples to make the role of InitialThread clearer:

fn example1() -> Result {
    let t = Thread::try_new_paused(...);
    // Do other stuff. If it fails, and we return, `t` never runs and is cleaned up.
    let t = t.run();
    // Now we can use `t` as a `Thread`.
}

fn example2() -> Result {
    let t = Thread::try_new_paused(...).upgrade();
    // Do other stuff. If it fails, and we return, `t` never runs and leaks memory.
    let t = t.run();
    // Now we can use `t` as a `Thread`.
}

Member Author

Looking at this, the following thought came to mind: what if we never call wake_up (either on purpose or by omission, i.e., not having realised that it needs to called)?

The Thread::drop will handle it in that case. So if you do

fn do_something() {
    let t = Thread::try_new(...)?;
    // never call t.wake_up();
}

the Thread will be freed when do_something() returns. IOW, the lifetime of a kernel thread is the same as the lifetime of its Thread. For a (kernel) thread that runs forever, the kernel must keep a reference to it somewhere, otherwise the kernel won't have a way to manipulate it.

The thread and its state would be stuck in no-man's land, right? I think we should make it harder for such a state to be reached; another way to put it: a better design would one in which this no-man's state is reached only when the user explicitly intends it.

The interface that I want to provide to users is: if a Thread is alive, then the underlying kernel thread must exist. And if users want to keep a thread alive for a long time (maybe forever), they are required to manage the lifetime of the Thread.


The Thread::drop will handle it in that case. So if you do

fn do_something() {
    let t = Thread::try_new(...)?;
    // never call t.wake_up();
}

the Thread will be freed when do_something() returns. IOW, the lifetime of a kernel thread is the same as the lifetime of a Thread. For a (kernel) thread that runs forever, the kernel must keep a reference to it somewhere, otherwise the kernel won't have a way to manipulate it.

If we never call t.wake_up, do_something() will never run and therefore never return, right? (Or am I missing something here?)

IOW, it looks to me that we're leaking by inaction (i.e., not calling t.wake_up). I'm saying that we should design things to avoid accidental leaks like this one. If users actually do want to leak, they should do it explicitly (e.g., by calling some function that will result in the leak).


The Thread::drop will handle it in that case. So if you do

fn do_something() {
    let t = Thread::try_new(...)?;
    // never call t.wake_up();
}

the Thread will be freed when do_something() returns. IOW, the lifetime of a kernel thread is the same as the lifetime of a Thread. For a (kernel) thread that runs forever, the kernel must keep a reference to it somewhere, otherwise the kernel won't have a way to manipulate it.

If we never call t.wake_up, do_something() will never run and therefore never return, right? (Or am I missing something here?)

No, please see the comments below (I adjusted the return type of do_something() a little).

fn do_something() -> Result {
     let t = Thread::try_new(...)?; // create a thread; it doesn't start running yet
     // if t.wake_up() here, the thread will start running
     return Ok(()); // here `drop(t)` is called; if `t` hasn't started, it gets recycled; otherwise, we wait for `t` to finish and then recycle it
}

IOW, it looks to me that we're leaking by inaction (i.e., not calling t.wake_up). I'm saying that we should design things to avoid accidental leaks like this one. If users actually do want to leak, they should do it explicitly (e.g., by calling some function that will result in the leak).

I don't think we leak anything here: all thread-related resources are bound to Thread, and they get recycled when the Thread reaches the end of its life; that's how I want to design the API.


My point is that even if we call t.wake_up(), we could still "leak" memory,

Sure, but I think that the fact that we may 'leak' later has no bearing on the fact that we definitely leak between try_new and wake_up.

I agree that we should resolve this, but I'm not sure Thread is the proper layer at which to resolve it. The current design of Thread is a combination of JoinHandle in Rust's std and the Linux kernel's C API, both of which are already well known and reasonably well defined. What we don't know is the usage pattern Rust users will choose to manage kernel threads: for example, will they prefer one-shot threads (run once and return) or long-lived threads? Given this uncertainty, I'm not sure what the best API for users is. The current rule of Thread is simple: if one wants to make sure all the resources of a thread are recycled, one must call stop; otherwise, there is no guarantee. On top of Thread, we can build other layers to resolve the various leaking issues.

It seems to me that part of your argument is that the situation above is unique. So let me show 3 examples from the project that, in my view, show a pattern that I think we should follow here:

  1. Reference-counted object:
    let r = UniqueRef::try_new(...)?;
    do_something()?; // <- if this fails, we automatically free the object.
    r.into() // <- converts UniqueRef<T> to Ref<T> -- users may leak this by creating
             // circular references and many other ways, but we still clean up if
             // do_something fails.
  2. Red-black tree nodes:
    let r = RBTree::try_reserve_node()?;
    do_something()?; // <- if this fails, we automatically free the node reservation.
    tree.insert(r.into_node()); // <- this converts a reservation into a node and inserts it
                                // into the tree -- users may "leak" it by never removing it from the
                                // tree but we still clean up the reservation if do_something above
                                // fails.
  3. File descriptor:
    let r = FileDescriptorReservation::new(...)?;
    do_something()?; // <- if this fails, we automatically free the fd reservation.
    r.commit(file); // <- ties this fd reservation to the given file -- users may "leak" the
                    // file descriptors by never closing it, but we still clean up the reservation
                    // if do_something fails.

It makes more sense considering this is a widely used pattern ;-)

So perhaps one way to address your concern wrt keeping things simple would be something like this: Thread::try_new() creates and automatically runs a thread -- I suspect this would be the usual way of doing things, and would be similar to userspace and kthread_run.

I'd probably name it Thread::try_run().

For the more unusual scenario where one needs to do some work before letting the thread run, we'd have a type 'PausedThread' (or something similar, but would play the same role as UniqueRef, FdReservation and NodeReservation in the examples above). PausedThread::try_new creates a paused thread that is automatically cleaned up if it goes out of scope. Eventually the users would call paused_thread.into_thread() which would consume the paused thread and turn it into a regular Thread by just waking it up. (We may even avoid the extra reference count increment on the new task if we have PausedThread::wake_up that consumes the PausedThread but doesn't return anything -- we don't need to increment the refcount because we're giving up any pointers to the new task).

So we keep Thread as simple as it gets. The extra 'complexity' only comes in when users want the less common scenarios, and even then it is contained in PausedThread. And we never leak paused threads if an error occurs before they run.

OK, I will update this PR accordingly.
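A rough userspace sketch of that shape (hypothetical names; Arc stands in for the task_struct refcount, so the "no extra increment" point is checkable):

```rust
use std::sync::Arc;

// Stand-in for the kernel's task_struct; Arc models get/put_task_struct.
pub struct Task;

pub struct Thread {
    pub task: Arc<Task>,
}

pub struct PausedThread {
    task: Arc<Task>,
}

impl PausedThread {
    // kthread_create: the thread exists but does not run yet. A real
    // version would also stop the thread in Drop if it is never woken.
    pub fn try_new() -> Result<Self, ()> {
        Ok(PausedThread { task: Arc::new(Task) })
    }

    /// Wake the thread and keep using it as a regular `Thread`.
    /// The reference simply moves, so no refcount increment is needed.
    pub fn into_thread(self) -> Thread {
        // wake_up_process(...) would happen here.
        Thread { task: self.task }
    }

    /// Wake the thread and give up all pointers to it.
    pub fn wake_up(self) {
        // wake_up_process(...); our reference is dropped on return.
    }
}
```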


So we keep Thread as simple as it gets. The extra 'complexity' only comes in when users want the less common scenarios, and even then it is contained in PausedThread. And we never leak paused threads if an error occurs before they run.

@wedsonaf there is an interesting limitation: should PausedThread impl Deref<Target=Task>? It makes sense that it should, because most operations between try_new and wake_up are task-related, e.g. setting the nice value. However, this conflicts with having a Task::wake_up(), because directly calling wake_up on a PausedThread violates its type invariants. But having a Task::wake_up() is quite natural, and I already did that in this PR. Any ideas?


That's indeed problematic. I agree that we should have Task::wake_up, especially because it's used/useful in other scenarios, but that seems to prevent us from implementing Deref<Target = Task> for PausedThread. I can think of two ways of going about this, none of them ideal:

  1. Simply not implement Deref<Target = Task>, which means that we need to implement a bunch of Task methods in PausedThread and forward them.
  2. Implement an unsafe variant of Deref (e.g., an as_task_unsafe() method) with the safety requirement being that the caller must not call wake_up on the returned task.

Or perhaps we can do a combination of both, for example, we only implement the most common Task methods in PausedThread (e.g., nice()). Other users can go the unsafe route of calling as_task_unsafe() or simply add the method to PausedThread if it is common enough. WDYT?
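Option 2 might look roughly like this (userspace sketch, hypothetical names):

```rust
pub struct Task {
    nice: i32,
}

impl Task {
    pub fn nice(&self) -> i32 {
        self.nice
    }
    pub fn wake_up(&self) {
        // Must never be called on a task still owned by a PausedThread.
    }
}

pub struct PausedThread {
    task: Task,
}

impl PausedThread {
    pub fn new(nice: i32) -> Self {
        PausedThread { task: Task { nice } }
    }

    // Common, safe accessors are forwarded explicitly...
    pub fn nice(&self) -> i32 {
        self.task.nice()
    }

    /// ...and the rest of `Task` is reachable through an unsafe escape hatch.
    ///
    /// # Safety
    ///
    /// The caller must not call `wake_up` on the returned task; doing so
    /// would break the invariant that this thread has not started running.
    pub unsafe fn as_task_unsafe(&self) -> &Task {
        &self.task
    }
}
```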


That's indeed problematic. I agree that we should have Task::wake_up, especially because it's used/useful in other scenarios, but that seems to prevent us from implementing Deref<Target = Task> for PausedThread. I can think of two ways of going about this, none of them ideal:

  1. Simply not implement Deref<Target = Task>, which means that we need to implement a bunch of Task methods in PausedThread and forward them.
  2. Implement an unsafe variant of Deref (e.g., an as_task_unsafe() method) with the safety requirement being that the caller must not call wake_up on the returned task.

Or perhaps we can do a combination of both, for example, we only implement the most common Task methods in PausedThread (e.g., nice()). Other users can go the unsafe route of calling as_task_unsafe() or simply add the method to PausedThread if it is common enough. WDYT?

How about an AutoStopThread instead of PausedThread? AutoStopThread impls Deref<Target=Task> and can convert to or from Thread, which means AutoStopThread can represent a thread that is already running; the only difference is that AutoStopThread calls kthread_stop when it goes out of scope (i.e. in AutoStopThread::drop) while Thread::drop is just put_task_struct.

AutoStopThread does the same thing as PausedThread: cleaning up on drop. The only downside is that we need to get_task_struct when we create a new AutoStopThread, for the same reason as we do for Thread.

Thoughts?
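A sketch of the AutoStopThread shape (userspace Rust, hypothetical names; an atomic counter stands in for kthread_stop so the drop behavior is observable):

```rust
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts kthread_stop calls, so the Drop behavior is testable.
static KTHREAD_STOPS: AtomicUsize = AtomicUsize::new(0);

pub struct Task;
impl Task {
    pub fn set_nice(&self, _nice: i32) {}
}

// May refer to a paused *or* already-running thread; the only difference
// from `Thread` is that dropping it calls kthread_stop, not just
// put_task_struct.
pub struct AutoStopThread {
    task: Task,
}

impl AutoStopThread {
    pub fn try_new() -> Result<Self, ()> {
        // kthread_create + get_task_struct.
        Ok(AutoStopThread { task: Task })
    }
}

// Deref is fine here: waking a task is harmless when the thread is
// already allowed to run.
impl Deref for AutoStopThread {
    type Target = Task;
    fn deref(&self) -> &Task {
        &self.task
    }
}

impl Drop for AutoStopThread {
    fn drop(&mut self) {
        // kthread_stop(...) + put_task_struct(...).
        KTHREAD_STOPS.fetch_add(1, Ordering::SeqCst);
    }
}
```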


Oh, I really like this! It also addresses the problem of storing a pointer to the task somewhere else before running the thread (from a cursory look at the uses of kthread_create, it seems to be fairly common).

I'm not too concerned about having to call get_task_struct. Creating a thread is already an expensive operation anyway.

ojeda pushed a commit that referenced this pull request May 4, 2022
Reading EEPROM fails with following warning:

[   16.357496] ------------[ cut here ]------------
[   16.357529] fsl_spi b01004c0.spi: rejecting DMA map of vmalloc memory
[   16.357698] WARNING: CPU: 0 PID: 371 at include/linux/dma-mapping.h:326 fsl_spi_cpm_bufs+0x2a0/0x2d8
[   16.357775] CPU: 0 PID: 371 Comm: od Not tainted 5.16.11-s3k-dev-01743-g19beecbfe9d6-dirty #109
[   16.357806] NIP:  c03fbc9c LR: c03fbc9c CTR: 00000000
[   16.357825] REGS: e68d9b20 TRAP: 0700   Not tainted  (5.16.11-s3k-dev-01743-g19beecbfe9d6-dirty)
[   16.357849] MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24002282  XER: 00000000
[   16.357931]
[   16.357931] GPR00: c03fbc9c e68d9be0 c26d06a0 00000039 00000001 c0d36364 c0e96428 00000027
[   16.357931] GPR08: 00000001 00000000 00000023 3fffc000 24002282 100d3dd6 100a2ffc 00000000
[   16.357931] GPR16: 100cd280 100b0000 00000000 aff54f7e 100d0000 100d0000 00000001 100cf328
[   16.357931] GPR24: 100cf328 00000000 00000003 e68d9e30 c156b410 e67ab4c0 e68d9d38 c24ab278
[   16.358253] NIP [c03fbc9c] fsl_spi_cpm_bufs+0x2a0/0x2d8
[   16.358292] LR [c03fbc9c] fsl_spi_cpm_bufs+0x2a0/0x2d8
[   16.358325] Call Trace:
[   16.358336] [e68d9be0] [c03fbc9c] fsl_spi_cpm_bufs+0x2a0/0x2d8 (unreliable)
[   16.358388] [e68d9c00] [c03fcb44] fsl_spi_bufs.isra.0+0x94/0x1a0
[   16.358436] [e68d9c20] [c03fd970] fsl_spi_do_one_msg+0x254/0x3dc
[   16.358483] [e68d9cb0] [c03f7e50] __spi_pump_messages+0x274/0x8a4
[   16.358529] [e68d9ce0] [c03f9d30] __spi_sync+0x344/0x378
[   16.358573] [e68d9d20] [c03fb52c] spi_sync+0x34/0x60
[   16.358616] [e68d9d30] [c03b4dec] at25_ee_read+0x138/0x1a8
[   16.358667] [e68d9e50] [c04a8fb8] bin_attr_nvmem_read+0x98/0x110
[   16.358725] [e68d9e60] [c0204b14] kernfs_fop_read_iter+0xc0/0x1fc
[   16.358774] [e68d9e80] [c0168660] vfs_read+0x284/0x410
[   16.358821] [e68d9f00] [c016925c] ksys_read+0x6c/0x11c
[   16.358863] [e68d9f30] [c00160e0] ret_from_syscall+0x0/0x28
...
[   16.359608] ---[ end trace a4ce3e34afef0cb5 ]---
[   16.359638] fsl_spi b01004c0.spi: unable to map tx dma

This is due to the AT25 driver using buffers on stack, which is not
possible with CONFIG_VMAP_STACK.

As mentioned in the kernel documentation (Documentation/spi/spi-summary.rst):

  - Follow standard kernel rules, and provide DMA-safe buffers in
    your messages.  That way controller drivers using DMA aren't forced
    to make extra copies unless the hardware requires it (e.g. working
    around hardware errata that force the use of bounce buffering).

Modify the driver to use a buffer located in the at25 device structure
which is allocated via kmalloc during probe.

Protect writes in this new buffer with the driver's mutex.
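The same pattern can be sketched in Rust terms (names hypothetical; the real driver is C): the scratch buffer is owned by the device state, heap-allocated once at probe time, and the mutex serializes writers.

```rust
use std::sync::Mutex;

// Device state: the transfer buffer lives here, heap-allocated once at
// "probe" time, so it is never a stack address (stacks may be vmalloc'ed
// under CONFIG_VMAP_STACK and are not DMA-safe).
pub struct At25 {
    buf: Mutex<Vec<u8>>,
}

impl At25 {
    pub fn probe(buf_size: usize) -> Self {
        At25 { buf: Mutex::new(vec![0u8; buf_size]) }
    }

    pub fn write_cmd(&self, cmd: &[u8]) -> usize {
        // The mutex protects writes into the shared buffer.
        let mut buf = self.buf.lock().unwrap();
        buf[..cmd.len()].copy_from_slice(cmd);
        // spi_sync(...) would transfer out of `buf` here; the mapping is
        // DMA-safe because it points into a stable heap allocation.
        cmd.len()
    }
}
```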

Fixes: b587b13 ("[PATCH] SPI eeprom driver")
Cc: stable <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
Link: https://lore.kernel.org/r/230a9486fc68ea0182df46255e42a51099403642.1648032613.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ojeda pushed a commit that referenced this pull request Jul 19, 2023
The following crash happens for me when running the -mm selftests (below).
Specifically, it happens while running the uffd-stress subtests:

kernel BUG at mm/hugetlb.c:7249!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109
Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018
RIP: 0010:huge_pte_alloc+0x12c/0x1a0
...
Call Trace:
 <TASK>
 ? __die_body+0x63/0xb0
 ? die+0x9f/0xc0
 ? do_trap+0xab/0x180
 ? huge_pte_alloc+0x12c/0x1a0
 ? do_error_trap+0xc6/0x110
 ? huge_pte_alloc+0x12c/0x1a0
 ? handle_invalid_op+0x2c/0x40
 ? huge_pte_alloc+0x12c/0x1a0
 ? exc_invalid_op+0x33/0x50
 ? asm_exc_invalid_op+0x16/0x20
 ? __pfx_put_prev_task_idle+0x10/0x10
 ? huge_pte_alloc+0x12c/0x1a0
 hugetlb_fault+0x1a3/0x1120
 ? finish_task_switch+0xb3/0x2a0
 ? lock_is_held_type+0xdb/0x150
 handle_mm_fault+0xb8a/0xd40
 ? find_vma+0x5d/0xa0
 do_user_addr_fault+0x257/0x5d0
 exc_page_fault+0x7b/0x1f0
 asm_exc_page_fault+0x22/0x30

That happens because a BUG() statement in huge_pte_alloc() attempts to
check that a pte, if present, is a hugetlb pte, but it does so in a
non-lockless-safe manner that leads to a false BUG() report.

We got here due to a couple of bugs, each of which by itself was not quite
enough to cause a problem:

First of all, before commit c33c794 ("mm: ptep_get() conversion"), the
BUG() statement in huge_pte_alloc() was itself fragile: it relied upon
compiler behavior to only read the pte once, despite using it twice in the
same conditional.

Next, commit c33c794 ("mm: ptep_get() conversion") broke that
delicate situation, by causing all direct pte reads to be done via
READ_ONCE().  And so READ_ONCE() got called twice within the same BUG()
conditional, leading to comparing (potentially, occasionally) different
versions of the pte, and thus to false BUG() reports.

Fix this by taking a single snapshot of the pte before using it in the
BUG conditional.
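The general hazard can be sketched like this (a Rust stand-in, where a value that changes on every read plays the part of a concurrently-modified pte):

```rust
use std::cell::Cell;

// Stand-in for a pte that another CPU keeps modifying: every read
// observes a different value.
struct RacyPte {
    val: Cell<u64>,
}

impl RacyPte {
    // Models READ_ONCE(*pte): each call re-reads (changing) memory.
    fn read_once(&self) -> u64 {
        let v = self.val.get();
        self.val.set(v + 1);
        v
    }
}

// Buggy shape: two reads inside one conditional can compare different
// values, triggering a false BUG().
fn buggy_is_consistent(pte: &RacyPte) -> bool {
    pte.read_once() == pte.read_once()
}

// Fixed shape: take a single snapshot, then use it as often as needed.
fn fixed_is_consistent(pte: &RacyPte) -> bool {
    let snapshot = pte.read_once();
    snapshot == snapshot
}
```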

Now, that commit is only partially to blame here, but people doing
bisections will invariably land there, so this will help them find a fix
for a real crash.  And also, the previous behavior was unlikely to ever
expose this bug--it was fragile, yet not actually broken.

So that's why I chose this commit for the Fixes tag, rather than the
commit that created the original BUG() statement.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: c33c794 ("mm: ptep_get() conversion")
Signed-off-by: John Hubbard <[email protected]>
Acked-by: James Houghton <[email protected]>
Acked-by: Muchun Song <[email protected]>
Reviewed-by: Ryan Roberts <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dimitri Sivanich <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oleksandr Tyshchenko <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: SeongJae Park <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
ojeda pushed a commit that referenced this pull request Aug 5, 2024
The following bug was triggered on a system built with
CONFIG_DEBUG_PREEMPT=y:

 # echo p > /proc/sysrq-trigger

 BUG: using smp_processor_id() in preemptible [00000000] code: sh/117
 caller is perf_event_print_debug+0x1a/0x4c0
 CPU: 3 UID: 0 PID: 117 Comm: sh Not tainted 6.11.0-rc1 #109
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x4f/0x60
  check_preemption_disabled+0xc8/0xd0
  perf_event_print_debug+0x1a/0x4c0
  __handle_sysrq+0x140/0x180
  write_sysrq_trigger+0x61/0x70
  proc_reg_write+0x4e/0x70
  vfs_write+0xd0/0x430
  ? handle_mm_fault+0xc8/0x240
  ksys_write+0x9c/0xd0
  do_syscall_64+0x96/0x190
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

This is because the commit d4b294b ("perf/x86: Hybrid PMU support
for counters") took smp_processor_id() outside the irq critical section.
If a preemption occurs in perf_event_print_debug() and the task is
migrated to another cpu, we may get incorrect pmu debug information.
Move smp_processor_id() back inside the irq critical section to fix this
issue.

Fixes: d4b294b ("perf/x86: Hybrid PMU support for counters")
Signed-off-by: Li Huafei <[email protected]>
Reviewed-and-tested-by: K Prateek Nayak <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]