make runtime library loading safer #15040

Closed
emberian opened this issue Jun 19, 2014 · 10 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot.

Comments

@emberian
Member

We currently have an interface for runtime library loading, using the dynamic linker, but it has the problem that it interacts poorly with name mangling and is entirely unsafe. Figure out how to make it type-safe, and preferably memory-safe too.

@emberian
Member Author

I believe @Kimundi had some ideas here.

@Kimundi
Member

Kimundi commented Jun 25, 2014

Dynamic loading of non-Rust libraries can probably not be made safe in general because of missing type metadata, so I'll only talk about dynamically loaded Rust crates.

Additionally, dynamically loaded crates and dynamically compiled crates (such as those produced by a JIT compiler) both only start to exist at runtime and can get cleaned up, so I see this as an issue for any kind of native code with non-'static lifetimes.

Requirements

In an ideal world, you would have a safe way to dynamically load a library, use its symbols, and then afterwards unload the library again, likely implemented with some kind of RAII scheme.

The interface should be typesafe, which implies that communication between host executable and dynamic library can only happen through types known to both.

And it should be memory safe, which implies that no dependencies on dynamically loaded code should escape the lifetime of the dynamic library.

However, there are three problems that need to be solved for this to happen:

  • Typechecking
    This one is comparatively easy, but probably needs to have an intrinsic added. It could be implemented by having a generic function fn load_symbol<T>(path: &str), used like load_symbol::<fn(uint, Bar) -> Baz>("a::b::c::bar_adder").
    It would create a globally unique identifier for the type it got instantiated with, read the metadata of the dynamic library, try to resolve the given path in it, and then confirm that it points at an item with the same unique type identifier. This might be implementable with the current type_id intrinsic, but you might want to have an intrinsic that gives a bit more information.

  • Don't let function pointers escape
    A problem in the Rust language right now is that function pointers are always implicitly assumed to be equivalent to something like &'static code, that is, they are copyable and can outlive any scope. This is true for direct function types like fn(Foo) -> Bar, as well as for types that implicitly contain them, like &Trait or |Foo| -> Bar, where the function pointers are hidden in vtables or similar.
    If all code is valid for the duration of the process, this is not a problem and makes things easier, but with dynamically loaded libraries you have a problem. Say you have the symbol fn() -> Box<Trait>. It would allow you to get a new owned trait object with a vtable that becomes invalid as soon as the dynamic library gets unloaded. Bad!
    Even if the function pointer itself gets a lifetime, you could have a type like fn() -> fn() -> fn() -> (), where each return value is a new independent value that also needs to be lifetime restricted. So you can't just look at the type itself, but also at all types derivable by function calls or similar.

  • Not letting 'static lifetimes escape
    This is similar to the problem above: Say you have a symbol fn() -> &'static Foo. Normally you could assume from such a signature that there is a constant of type Foo that you are returning a reference to, which you can continue using afterwards without any lifetime restrictions.

    However, in the case of a dynamic library such a type would lie: The lifetime of the reference is not 'static, it's actually a 'a that is the intersection of 'static and the lifetime of the crate it got loaded from.
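
The type-checking step described in the first point can be sketched in present-day Rust. This is only an illustration under stated assumptions: `MockLibrary`, `LoadError`, `export`, and `bar_adder` are hypothetical names, the "library" is an in-memory table rather than a real dynamic library, and the type comparison relies on `std::any::Any` (which compares `TypeId` values) rather than on any new intrinsic. Note that `TypeId` is not guaranteed to be stable across compiler versions, so a real loader would need additional metadata checks.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical stand-in for a loaded crate's symbol metadata.
/// A real loader would read this from the library file; an in-memory
/// table keeps the sketch runnable without a dynamic linker.
pub struct MockLibrary {
    symbols: HashMap<&'static str, Box<dyn Any>>,
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    /// No item is exported under the requested path.
    NotFound,
    /// An item exists, but its type differs from the requested `T`.
    TypeMismatch,
}

impl MockLibrary {
    pub fn new() -> Self {
        MockLibrary { symbols: HashMap::new() }
    }

    /// Registers an item under a path, remembering its concrete type.
    pub fn export<T: Any + Copy>(&mut self, path: &'static str, item: T) {
        self.symbols.insert(path, Box::new(item));
    }

    /// Resolves `path` and hands the item back only if it was exported
    /// with exactly the type `T`. `downcast_ref` compares `TypeId`s.
    pub fn load_symbol<T: Any + Copy>(&self, path: &str) -> Result<T, LoadError> {
        let entry = self.symbols.get(path).ok_or(LoadError::NotFound)?;
        (**entry)
            .downcast_ref::<T>()
            .copied()
            .ok_or(LoadError::TypeMismatch)
    }
}

/// Example item that the mock library exports.
pub fn bar_adder(a: u32, b: u32) -> u32 {
    a + b
}
```

With `lib.export::<fn(u32, u32) -> u32>("a::b::c::bar_adder", bar_adder)`, a later `lib.load_symbol::<fn(u32, u32) -> u32>("a::b::c::bar_adder")` succeeds, while asking for `fn(u32) -> u32` under the same path yields `LoadError::TypeMismatch`. The sketch addresses type checking only; it does nothing about the two lifetime problems.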

Possible solutions:

  1. Don't change the language, and always leak the loaded dynamic code so that lifetimes never end.
    This is a valid approach, but ugly because it leaks memory, which restricts the possible use cases it applies to. For example, a simple plugin system might not have a problem with this, as the number of plugins is limited, but a system that repeatedly jits new code could get into trouble fast with this approach.

  2. Don't change the language, but restrict the set of possible loadable symbols.
    This would involve the symbol loader figuring out if a given symbol can lead to lifetime-leaking function pointers or &'statics, and prohibit those symbols from being loaded, and otherwise only expose the actual function pointer through a wrapper struct that ensures a lifetime dependency on it.
    This might be enough for many cases, but probably would have to block language features like type erasure in general to be made sound.

  3. Extend the typesystem with an optional annotation to "taint" any 'static lifetime in a type graph with a smaller lifetime 'a. I'm not sure exactly how and if that could work, but I imagine it being similar to the built-in traits. The dynamic code loader could then use that annotation to limit the graph starting with the symbol function pointer itself.
    This approach has probably the most promise, but it's also the hardest to implement as it involves language changes.
    Imaginary example syntax:

    fn load_symbol<'a, T>(&'a self) -> T<crate: 'a>;
    let x = c.load_symbol::<fn() -> fn() -> ()>();
    let f: (fn() -> fn() -> ())<crate: 'a> = x; // OK
    let g: (fn() -> ())<crate: 'a> = x(); // OK
    let h: fn() -> () = x(); // ERROR!
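
The wrapper struct mentioned in solution 2 can be approximated in today's Rust without language changes. The following is a minimal sketch, assuming a mock `Library` that holds one function pointer in place of a real loaded image; `Library`, `Symbol`, `open_mock`, `load_double`, and `call` are hypothetical names. The raw pointer is kept private and the wrapper borrows the library, so the borrow checker rejects any use of the symbol after the library is dropped.

```rust
use std::marker::PhantomData;

/// Hypothetical loaded library. Dropping it stands in for unloading.
pub struct Library {
    double_ptr: fn(u32) -> u32,
}

/// A symbol that cannot outlive the library it came from.
/// The function pointer is private, so it cannot be copied out.
pub struct Symbol<'lib, T> {
    item: T,
    _lib: PhantomData<&'lib Library>,
}

fn double(x: u32) -> u32 {
    x * 2
}

impl Library {
    /// Builds the mock; a real loader would call the dynamic linker here.
    pub fn open_mock() -> Self {
        Library { double_ptr: double }
    }

    /// Returns the symbol tied to the borrow of `self`.
    pub fn load_double<'lib>(&'lib self) -> Symbol<'lib, fn(u32) -> u32> {
        Symbol { item: self.double_ptr, _lib: PhantomData }
    }
}

impl<'lib, A, R> Symbol<'lib, fn(A) -> R> {
    /// Calls through the wrapper without exposing the pointer itself.
    pub fn call(&self, arg: A) -> R {
        (self.item)(arg)
    }
}

// The following would be rejected at compile time, because `lib` is
// dropped while `sym` still borrows it:
//
//     let sym;
//     {
//         let lib = Library::open_mock();
//         sym = lib.load_double();
//     }
//     sym.call(1);
```

This only restricts the symbol itself. As noted above, a return type `R` that contains a function pointer, a trait object, or a `&'static` reference would still escape, which is why solution 2 has to prohibit such symbols from being loaded.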
    

@emberian emberian added the E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. label Oct 21, 2014
@emberian emberian changed the title Add type-safe runtime library loading Add type-safe, memory-safe, runtime library loading Oct 21, 2014
@PatrLind

How about keeping track of all things related to the library and not allow unload until it is safe to do so?
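
One way to read this suggestion is reference counting: every symbol keeps the library alive, and the unload happens only when the last holder is gone. A minimal sketch, assuming a mock handle whose `Drop` stands in for the real unload; `LibraryHandle`, `CountedSymbol`, `load_mock`, and `load_increment` are hypothetical names, and the `AtomicBool` flag exists only to make the unload observable.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical library handle. Its `Drop` stands in for the unload call.
pub struct LibraryHandle {
    unloaded: Arc<AtomicBool>,
}

impl Drop for LibraryHandle {
    fn drop(&mut self) {
        // Runs only once the last `Arc<LibraryHandle>` is dropped.
        self.unloaded.store(true, Ordering::SeqCst);
    }
}

/// A symbol that owns a share of the library, keeping it loaded.
pub struct CountedSymbol {
    func: fn(u32) -> u32,
    _keep_alive: Arc<LibraryHandle>,
}

impl CountedSymbol {
    pub fn call(&self, x: u32) -> u32 {
        (self.func)(x)
    }
}

fn increment(x: u32) -> u32 {
    x + 1
}

/// Creates the mock library; `unloaded` is flipped when it is dropped.
pub fn load_mock(unloaded: Arc<AtomicBool>) -> Arc<LibraryHandle> {
    Arc::new(LibraryHandle { unloaded })
}

/// Hands out a symbol that shares ownership of the library.
pub fn load_increment(lib: &Arc<LibraryHandle>) -> CountedSymbol {
    CountedSymbol { func: increment, _keep_alive: Arc::clone(lib) }
}
```

Dropping the caller's own `Arc` does not unload anything while a `CountedSymbol` is still alive. The sketch tracks only the wrapper, so values returned by the loaded code (function pointers, vtables, `&'static` data) would still need the same treatment.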

@thestinger
Contributor

It's fundamentally impossible for runtime library loading to be safe. It can be made safer via verification of the metadata in the crate, but loading a library from a path dynamically will always be unsafe. It's not sane to trust files at arbitrary paths on the filesystem and there is no race-free way to do something like verifying that the library is controlled by root or the current user.

A library runs arbitrary initialization code upon loading, and nothing prevents the code from having invalid metadata or simply being memory unsafe. It would be trivial to bypass safety restrictions by outputting arbitrary code in a plugin and running it.

@thestinger thestinger removed the B-RFC label Oct 21, 2014
@thestinger thestinger changed the title Add type-safe, memory-safe, runtime library loading make runtime library loading safer Oct 21, 2014
@emberian
Member Author

Yes, of course the actual loading would be unsafe for obvious reasons, but
it could still be type safe and memory safe after it's loaded, given an
assumption that the library is "trusted", even if that's only verified by a
hash or signature. At some point you need to give up on absolute security
if you want the flexibility afforded by runtime reloading, and assume the
system is hostile.




@thestinger
Contributor

It can only be verified by a hash / signature if you load the entire thing into memory first and then somehow load it from there without touching the file again. It's never going to be appropriate to remove the unsafe marker from the API because running arbitrary code from an arbitrary path is fundamentally memory unsafe.

@Kimundi
Member

Kimundi commented Oct 22, 2014

I fail to see how dynamically loading a library manually is any more unsafe than dynamically loading a library by having it dynamically linked to from the beginning. In both cases there is a point where the compiler/usercode has to trust that the file being loaded does actually contain what is expected.

@mzabaluev
Contributor

Code trust issues aside, an operation certainly unsafe for the process image integrity is unloading of libraries. The problems and the per-platform variability here are such that all cross-platform programming environments I've seen that allow loading of arbitrary dynamic libraries have chosen to disable or discourage unloading. There hasn't been much demand for the feature either; memory is cheap, and in systems supporting dynamic libraries it is almost certainly virtually mapped. Being able to change parts of the program at runtime is neat, but I don't think it is compatible with what Rust is aimed to be otherwise.

To be safely unloaded, a module and the program using it must ensure that none of the in-image data it adds to the process is referenced elsewhere. This extends to any dynamic libraries the module might be linked to as the single consumer in the process, because the libraries are unloaded together with it (any alternatives to this behavior are not portable to the best of my knowledge). Even if Rust provides a solution for safe code with lifetimes, I expect the issues with foreign libraries will be too numerous and hard to debug. Heaven help you if any of those libraries is in C++; implementations of language features there are not safe with regard to unloading.

I think it best for everyone's sanity if the unload operation is kept unsafe, or not provided at all. This will remove the problem with soundness of 'static, because any in-binary data added to the process image is assumed to stay there forever.
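
An API with no unload operation can be sketched by leaking the handle, which makes `'static` honest again. This is a minimal illustration, assuming a mock `Library` built in memory; `Library`, `load_forever`, and `greeting` are hypothetical names, and the loader is marked `unsafe` only to mirror the point made earlier in the thread that loading itself cannot be safe.

```rust
/// Hypothetical handle for a library that is never unloaded.
pub struct Library {
    greeting: String,
}

/// Loads the (mock) library and leaks it, so it lives until process exit.
///
/// # Safety
/// A real implementation would run arbitrary initialization code from
/// `path`; the caller must trust that file. The mock does nothing unsafe.
pub unsafe fn load_forever(path: &str) -> &'static Library {
    let lib = Library { greeting: format!("hello from {}", path) };
    // `Box::leak` never frees the allocation, so the reference is
    // genuinely valid for the rest of the program.
    Box::leak(Box::new(lib))
}

impl Library {
    /// Data borrowed from the library can be given a true `'static`
    /// lifetime because the library is never dropped.
    pub fn greeting(&'static self) -> &'static str {
        &self.greeting
    }
}
```

The cost is the one already named under solution 1: every load leaks, which is tolerable for a bounded set of plugins and problematic for code that is generated repeatedly.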

@Kimundi
Member

Kimundi commented Jan 13, 2015

Yeah, supporting unloading for arbitrary libraries is a tricky issue for sure. But I still think it could be made safe in controlled environments like Rust-only libraries explicitly compiled as unloadable plugins.

@steveklabnik
Member

I'm pulling a massive triage effort to get us ready for 1.0. As part of this, I'm moving stuff that's wishlist-like to the RFCs repo, as that's where major new things should get discussed/prioritized.

This issue has been moved to the RFCs repo: rust-lang/rfcs#661
