-
Notifications
You must be signed in to change notification settings - Fork 350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calling from Rust to C++ is not safe #1
Comments
100% of C++ code is unsafe. When auditing a project, you would be on the hook for auditing all the unsafe Rust code and all the C++ code. The core safety claim under this new model is that auditing just the C++ side would be sufficient to catch all problems, i.e. the Rust side can be 100% safe. |
I understand. Making the rust side safe could make for example |
The user of cargo-geiger would need to know that if there is non-Rust code in the project then Rust's safety guarantees obviously do not apply to the non-Rust code. I don't know much about cargo-geiger but hopefully there would be some way for it to flag the presence of non-Rust code. |
There's an interpretation of safe Rust code which allows to draw a clear boundary of what should be marked unsafe, which is that it should not be possible to cause undefined behavior with safe Rust code. Do I understand correctly that the intended usage of cxx is to write the C++ side in such a way as to satisfy this Rust side requirement, rather than marking the functions unsafe from the Rust side? |
That's certainly the case here. You can't cause undefined behavior with just safe Rust code. You need to add some unsafe C++ code in order to get undefined behavior. Consider the following: suppose you were using traditional bindgen to expose an unsafe FFI which you then wrapped in a safe Rust API, as is common practice. That doesn't give you the right to later mess up the C++ code arbitrarily badly without getting UB... With CXX you can mess up the Rust code arbitrarily badly without getting UB, so the Rust code is safe. |
Yes, but in that case you uses |
I understand what you're saying but I don't find it to be a distinction that would be valuable for maintaining correctness of a codebase in practice. My approach is as follows:
|
Hello, I'm the author of the tool cargo-geiger. Thank you @dtolnay for yet another significant open source Rust contribution, your work is amazing, period. This approach to C++ FFI looks wonderful from my perspective. @bjorn3 I think I can see the point you are making, that it should be easy to find out if a crate depends on any unsafe code including FFI? Let's continue that discussion including detection of unsafe code using macro expansion over at the cargo-geiger repos. |
But by that definition, extern functions should not be unsafe. But they are: extern { fn init_stuff(); }
fn main() {
init_stuff(); // Error: call to unsafe function is unsafe and requires unsafe function or block
} Why is calling |
That's already not true, though. However I have to admit that this particular refutation has little bearing on the discussion at hand. I think @bjorn3 is following the interpretation that when running a Rust program, all possible UB must arise from some unsafe block. This property is violated by cxx, because C++ code is not inside a literal unsafe block. Usually, writing
No, it doesn't -- so? The @dtolnay on the other hand views C++ code as implicitly carrying an unsafe block all around it. I can understand this view, but I think the issue with this is the lack of a clear proof obligation. With cxx, what has to be proven about the C++ code is determined very non-locally by the cxx bridge file. The C++ type system is too weak to tell you what you have to prove. Crucially, that cxx bridge file doesn't even have any |
Indeed
Maybe add a |
@RalfJung I think your post you linked reinforces my perspective. The emphasis is that "unsafe is the minimal set of code that you need to audit to discover all sources of undefined behavior. As with the Vec implementation you talk about, looking at one piece of unsafe code can expose other code not necessarily bearing the unsafe keyword that also needs to be considered, but through the process you discover all problems and can convince yourself that the whole system is good. Your wording of "C++ code as implicitly carrying an unsafe block all around it" is accurate. This is just how that language works, and how audits of FFI to it always work. https://www.reddit.com/r/rust/comments/elvfyn/ffi_like_its_2020_announcing_safe_ffi_for_rust_c/fdm8olr/ covers this in more detail from the perspective of real practical users of FFI. C++ isn't about clear formal proof obligations, it relies on the programmer to understand what they are looking at in broader context; this is pervasive to the language and applies here. I would add that in addition to C++ code carrying an implicit unsafe block all around it, Cargo build scripts also always carry an implicit unsafe block all around and must be audited to discover the necessary understanding of the system and its safety requirements. |
Moderator note: I am deleting comments because GitHub's linear comment structure makes it unmanageable to hold a complex forking discussion here. Discussions that are not the dominant thread would be better to hold here. |
In the case of a handwritten extern, the two reasons it's unsafe are:
CXX is designed for C++ functions that are fine to call with any possible value of their arguments and this would be something that an audit of the C++ side of the bridge would be responsible for verifying (but this definitely should be called out better in the documentation). |
I don't find this to be a practical litmus test nor helpful for upholding the correctness of a codebase in practice. I've called out that all C++ and all build scripts implicitly carry the unsafe you are looking for. To drive this home, consider: // build.rs
fn main() {
let f = curl("https://gist.github.com/dtolnay/.../mystery.rs");
let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
fs::write(out_dir.join("mystery.rs"), f).unwrap();
} // src/lib.rs
include!(concat!(env!("OUT_DIR"), "/mystery.rs")); What do we grep for to find the undefined behavior in this project? The solution isn't to make curling unsafe, or fs::write unsafe, or |
Does CXX codegen automatically prove this? If not, why generated bindings are safe? |
I would extend this to say "to discover all sources of undefined behavior and the proof obligations to avoid it". And that's where cxx's approach falls short: even if we accept that C++ has an implicitly unsafe block all around it, that tells us nothing about what the actual proof obligation is. And you are right that build scripts (and most forms of code generation) break this. There's not much we can do about that. Item-level proc-macros sometimes have a similar problem, and the usual approach seems to be to call them something with
The part about any possible value might be worth calling out more explicitly in the docs, maybe? |
The C++ side may do arbitrary unsafe things like dereferencing dangling pointers.
The text was updated successfully, but these errors were encountered: