Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[eRFC] Include call graph information in LLVM IR #59412

Open
japaric opened this issue Mar 25, 2019 · 18 comments
Open

[eRFC] Include call graph information in LLVM IR #59412

japaric opened this issue Mar 25, 2019 · 18 comments
Labels
A-CLI Area: Command-line interface (CLI) to the compiler A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems

Comments

@japaric
Copy link
Member

japaric commented Mar 25, 2019

Summary

Add an experimental compiler feature / flag to add call graph information, in
the form of LLVM metadata, to the LLVM IR (.ll) files produced by the
compiler.

Motivation

(This section ended up being a bit long winded. The TL;DR is improving existing
stack analysis usage tools.)

Stack usage analysis is a hard requirement for certifying safety critical
(embedded) applications. This analysis is usually implemented as a static
analysis tool that computes the worst case stack usage of an application. The
information provided by this tool is used in the certification process to
demonstrate that the application won't run into a stack overflow at runtime.

Several such tools exist for C/C++ programs, mainly in commercial
and closed source forms. A few months ago the Rust compiler gained a feature
that's a pre-requisite for implementing such tools in the Rust world: -Z emit-stack-sizes. This flag makes stack usage information about every Rust
function available in the binary artifacts (object files) produced by the
compiler.

And just recently a tool that uses this flag and call graph analysis to perform
whole program stack usage analysis has been developed: cargo-call-stack
(full disclaimer: I'm the author of said tool). The tool does OK when dealing
with programs that only uses direct function calls but it's lacking (either
over-pessimistic or flat out incorrect) when analyzing programs that contain
indirect function calls, that is function pointer calls and/or dynamic
dispatch.

Call graph analysis in the presence of indirect function calls is notoriously
hard, but Rust strong typing makes the problem tractable -- in fact, dynamic
dispatch is easier to reason about than function pointer calls. However, this
last part only holds true when Rust type information is available to the tool,
which is not the case today.

To elaborate: it's' important that the call graph is extracted from
post-LLVM-optimization output as that greatly reduces the chance of inserting
invalid edges. For example, what appears to be a function call at the (Rust)
source level (e.g. let x = foo();) may not actually exist in the final binary
due to inlining or dead code elimination. For this reason, Rust stack usage
analysis tools are limited to two sources of call graph information: the machine
code in the final executable and post-optimization LLVM IR (rustc's
--emit=llvm-ir). The former contains no type information and the latter
contains LLVM type information, not Rust type information.

cargo-call-stack currently uses the type information available in the
LLVM IR of a crate to reason about indirect function calls. However, LLVM type
information is not as complete as Rust type information because the conversion
is lossy. Consider the following Rust source code and corresponding LLVM IR.

#[no_mangle] // shorter name
static F: AtomicPtr<fn() -> u32> = AtomicPtr::new(foo as *mut _);

fn main() {
    if let Some(f) = unsafe { F.load(Ordering::Relaxed).as_ref() } {
        f(); // function pointer call
    }

    // volatile load to preserve the return value of `bar`
    unsafe {
        core::ptr::read_volatile(&baz());
    }
}

#[no_mangle]
fn foo() -> u32 {
    F.store(bar as *mut _, Ordering::Relaxed);

    0
}

#[no_mangle]
fn bar() -> u32 {
    1
}

#[inline(never)]
#[no_mangle]
fn baz() -> i32 {
    F.load(Ordering::Relaxed) as usize as i32
}
define void @main() unnamed_addr #3 !dbg !1240 {
start:
  %_14 = alloca i32, align 4
  %0 = load atomic i32, i32* bitcast (<{ i8*, [0 x i8] }>* @F to i32*) monotonic, align 4
  %1 = icmp eq i32 %0, 0, !dbg !1251
  br i1 %1, label %bb5, label %bb3, !dbg !1251

bb3:                                              ; preds = %start
  %2 = inttoptr i32 %0 to i32 ()**, !dbg !1252
  %3 = load i32 ()*, i32 ()** %2, align 4, !dbg !1254, !nonnull !46
  %4 = tail call i32 %3() #9, !dbg !1254
  br label %bb5, !dbg !1255

; ..
}

; Function Attrs: norecurse nounwind
define internal i32 @foo() unnamed_addr #0 !dbg !1189 {
; ..
}

; Function Attrs: norecurse nounwind readnone
define internal i32 @bar() unnamed_addr #1 !dbg !1215 {
; ..
}

; Function Attrs: noinline norecurse nounwind
define internal fastcc i32 @baz() unnamed_addr #2 !dbg !1217 {
; ..
}

Note how in the LLVM IR output foo, bar and baz all have the same function
signature: i32 (), which is the LLVM version of fn() -> i32. There are no
unsigned integer types in LLVM IR so both Rust types, i32 and u32, get
converted to i32 in the LLVM IR.

Line %4 = .. in the LLVM IR is the function pointer call. This too,
incorrectly, indicates that a function pointer with signature i32 () (fn() -> i32) is being invoked.

This lossy conversion leads cargo-call-stack to incorrectly add an edge
between the node that represents the function pointer call and baz. See below:

fn-wrong

If the tool had access to call graph information from the compiler it would have
produced the following accurate call graph.

fn-right

This eRFC proposes adding a feature to aid call graph and stack usage analysis.

(For a more in depth explanation of how cargo-call-stack works please refer to
this blog post: https://blog.japaric.io/stack-analysis/)

Design

We propose that call graph information is added to the LLVM IR that rustc
produces in the form of LLVM metadata when the unstable -Z call-metadata
rustc flag is used.

Function pointer calls

Functions that are converted into function pointers in Rust source (e.g. let x: fn() -> i32 = foo) will get extra LLVM metadata in their definitions (IR:
define). The metadata will have the form !{!"fn", !"fn() -> i32"}, where the
second node represents the signature of the function. Likewise, function pointer
calls will get similar LLVM metadata at call site (IR: call/ invoke).
Revisiting the previous example, the IR would change as shown below:

define void @main() unnamed_addr #3 !dbg !1240 {
; ..
  %4 = tail call i32 %3() #9, !dbg !1254, !rust !0
; ..                                      ^^^^^^^^ (ADDED)
}

; Function Attrs: norecurse nounwind
define internal i32 @foo() unnamed_addr #0 !dbg !1189 !rust !0 {
; ..                                                  ^^^^^^^^ (ADDED)
}

; Function Attrs: norecurse nounwind readnone
define internal i32 @bar() unnamed_addr #1 !dbg !1215 !rust !0 {
; ..                                                  ^^^^^^^^ (ADDED)
}

; Function Attrs: noinline norecurse nounwind
define internal fastcc i32 @baz() unnamed_addr #2 !dbg !1217 {
; ..
}

; ..

; (ADDED) at the bottom of the file
!0 = !{!"fn", "fn() -> i32"}
; ..

Note how main and baz didn't get the extra !rust metadata because they are
never converted into function pointers. Whereas both foo and bar got the
same metadata because they have the same signature and are converted into
function pointers in the source code (lines static F and F.store).

When tools parse this LLVM IR they will know that line %4 = .. can invoke
foo or bar (!rust !0), but not baz or main because the latter two
don't have the same "fn" metadata.

This -Z flag only promises two things with respect to "fn" metadata:

  • Only functions that are converted (coerced) into function pointers in the
    source code will get "fn" metadata -- note that this does not necessarily mean that
    function will be called via a function pointer call

  • That the string node that comes after the !"fn" node will be unique for
    each function type -- the flag does not make any promise about the contents
    or syntax of this string node. (Having a stringified version of the function
    signature in the LLVM IR would be nice to have but it's not required to
    produce an accurate call graph.)

Adding this kind of metadata doesn't affect LLVM optimization passes and more
importantly our previous experiments show that this custom metadata is not
removed by LLVM passes.

Trait objects

There's one more of bit of information we can encode in the metadata to make the
analysis less pessimistic: information about trait objects.

Consider the following Rust source code and corresponding LLVM IR.

static TO: Mutex<&'static (dyn Foo + Sync)> = Mutex::new(&Bar);
static X: AtomicI32 = AtomicI32::new(0);

fn main() {
    (*TO.lock()).foo();

    if X.load(Ordering::Relaxed).quux() {
        // side effect to keep `quux`'s return value
        unsafe { asm!("" : : : "memory" : "volatile") }
    }
}

trait Foo {
    fn foo(&self) -> bool {
        false
    }
}

struct Bar;

impl Foo for Bar {}

struct Baz;

impl Foo for Baz {
    fn foo(&self) -> bool {
        true
    }
}

trait Quux {
    fn quux(&self) -> bool;
}

impl Quux for i32 {
    #[inline(never)]
    fn quux(&self) -> bool {
        *TO.lock() = &Baz;

        unsafe { core::ptr::read_volatile(self) > 0 }
    }
}
; Function Attrs: noinline noreturn nounwind
define void @main() unnamed_addr #2 !dbg !1418 {
; ..
  %8 = tail call zeroext i1 %7({}* nonnull align 1 %4) #8, !dbg !1437, !rust !0
; ..                                                                   ^^^^^^^^
}

; app::Foo::foo
; Function Attrs: norecurse nounwind readnone
define internal zeroext i1 @_ZN3app3Foo3foo17h5a849e28d8bf9a2eE(
  %Bar* noalias nocapture nonnull readonly align 1
) unnamed_addr #0 !dbg !1224 !rust !1 {
; ..                         ^^^^^^^^
}

; <app::Baz as app::Foo>::foo
; Function Attrs: norecurse nounwind readnone
define internal zeroext i1
  @"_ZN37_$LT$app..Baz$u20$as$u20$app..Foo$GT$3foo17h9e4a36340940b841E"(
    %Baz* noalias nocapture nonnull readonly align 1
  ) unnamed_addr #0 !dbg !1236 !rust !2 {
; ..                           ^^^^^^^^
}

; <i32 as app::Quux>::quux
; Function Attrs: noinline norecurse nounwind
define internal fastcc zeroext i1
  @"_ZN33_$LT$i32$u20$as$u20$app..Quux$GT$4quux17haf5232e76b46052fE"(
    i32* noalias readonly align 4 dereferenceable(4)
  ) unnamed_addr #1 !dbg !1245 !rust !3 {
; ..                           ^^^^^^^^
}

; ..

!0 = "fn(*mut ()) -> bool"
!1 = "fn(&Bar) -> bool"
!2 = "fn(&Baz) -> bool"
!3 = "fn(&i32) -> bool"
; ..

In this case we have dynamic dispatch, which shows up in the LLVM IR at line
%8 as a function pointer call where the signature of the function pointer is
i1 ({}*), which is more or less equivalent to Rust's fn(*mut ()) -> bool --
the {}* denotes an "erased" type.

With just the function signature metadata tools could at best assume that the
dynamic dispatch could invoke Bar.foo(), Baz.foo() or i32.quux() resulting
in the following, incorrect call graph.

trait-wrong

Thus, we also propose that the -Z call-metadata flag adds trait-method
information to trait method implementations (IR: define) of traits that are
converted into trait objects
, and to dynamic dispatch sites (IR: call _ %_({}* _, ..)) using the following metadata syntax: !{!"dyn", !"Foo", !"foo"}, where
the second node represents the trait and the third node represents the method
being dispatched / defined.

Building upon the previous example, here's how the "dyn" metadata would be
emitted by the compiler:

; Function Attrs: noinline noreturn nounwind
define void @main() unnamed_addr #2 !dbg !1418 {
; ..
  %8 = tail call zeroext i1 %7({}* nonnull align 1 %4) #8, !dbg !1437, !rust !0
; ..
}

; app::Foo::foo
; Function Attrs: norecurse nounwind readnone
define internal zeroext i1 @_ZN3app3Foo3foo17h5a849e28d8bf9a2eE(
  %Bar* noalias nocapture nonnull readonly align 1
) unnamed_addr #0 !dbg !1224 !rust !0 {
; ..                         ^^^^^^^^ (CHANGED)
}

; <app::Baz as app::Foo>::foo
; Function Attrs: norecurse nounwind readnone
define internal zeroext i1
  @"_ZN37_$LT$app..Baz$u20$as$u20$app..Foo$GT$3foo17h9e4a36340940b841E"(
    %Baz* noalias nocapture nonnull readonly align 1
  ) unnamed_addr #0 !dbg !1236 !rust !0 {
; ..                           ^^^^^^^^ (CHANGED)
}

; <i32 as app::Quux>::quux
; Function Attrs: noinline norecurse nounwind
define internal fastcc zeroext i1
  @"_ZN33_$LT$i32$u20$as$u20$app..Quux$GT$4quux17haf5232e76b46052fE"(
    i32* noalias readonly align 4 dereferenceable(4)
  ) unnamed_addr #1 !dbg !1245          {
; ..                           ^^^^^^^^ (REMOVED)
}

; ..

!0 = !{!"dyn", !"Foo", !"foo"}" ; CHANGED
; ..

Note that <i32 as Quux>::quux loses its !rust metadata because there's no
dyn Quux in the source code.

With trait-method information tools would be able to limit the candidates of
dynamic dispatch to the actual implementations of the trait being dispatched.
Thus the call graph produced by the tools would become:

trait-right

Like "fn" metadata, "dyn" metadata only promises two things:

  • Only trait method implementations (including default implementations) of
    traits that appear as trait objects (e.g. &dyn Foo, Box<dyn Bar>) in the
    source code will get this kind of metadata

  • That the string nodes that come after the !"dyn" node will be unique for
    each trait and method -- the flag does not make any promise about the
    contents or syntax of these string nodes.

Destructors

Calling the destructor of a trait object (e.g. Box<dyn Foo>) can result in the
destructor of any Foo implementer being invoked. This information will also be
encoded in the LLVM IR using "drop" metadata of the form: !{!"drop", !"Foo"}
where the second node represents the trait.

Here's an example of this kind of metadata:

trait Foo {
    fn foo(&self) {}
}

struct Bar;

impl Foo for Bar {}

struct Baz;

impl Foo for Baz {}

static MAYBE: AtomicBool = AtomicBool::new(false);

fn main() {
    let mut x: Box<dyn Foo> = Box::new(Bar);

    if MAYBE.load(Ordering::Relaxed) {
        x = Box::new(Baz);
    }

    drop(x);
}

Unoptimized LLVM IR:

; core::ptr::real_drop_in_place
define internal void @_(%Baz* nonnull align 1) unnamed_addr #4 !rust !199 {
  ; ..
}

; core::ptr::real_drop_in_place
define internal void @_(%Bar* nonnull align 1) unnamed_addr #4 !rust !199 {
  ; ..
}

; hello::main
define internal void @() {
  ; ..

  ; `drop(x)`
  invoke void @_ZN4core3ptr18real_drop_in_place17h258eb03c50ca2fcaE(..)

  ; ..
}

; core::ptr::real_drop_in_place
define internal void @_ZN4core3ptr18real_drop_in_place17h258eb03c50ca2fcaE(..) {
  ; ..

  ; calls destructor on the concrete type behind the trait object
  invoke void %8({}* align 1 %3)
          to label %bb3 unwind label %cleanup, !dbg !209, !rust !199

  ; ..
}

!199 = !{!"drop", !"Foo"}

Here dropping x can result in Bar's or Baz's destructor being invoked (see !199).

Multiple metadata

Some function definitions may get more than one different metadata kind or
different instances of the same kind. In that case we'll use a metadata tuple
(e.g. !{!1, !2}) to group the different instances. An example:

trait Foo {
    fn foo(&self) -> bool;
}

trait Bar {
    fn bar(&self) -> i32 {
        0
    }
}

struct Baz;

impl Foo for Baz {
    fn foo(&self) -> bool {
        false
    }
}

impl Bar for Baz {}

fn main() {
    let x: &Foo = &Baz;
    let y: &Bar = &Baz;

    let z: fn(&Baz) -> bool = Baz::foo;
}

Unoptimized LLVM IR:

; core::ptr::real_drop_in_place
define internal void @_(%Baz* nonnull align 1) unnamed_addr #2 !rust !107 {
  ; ..
}

; <hello::Baz as hello::Foo>::foo
define internal zeroext i1 @_(%Baz* noalias nonnull readonly align 1) unnamed_addr #2 !rust !157 {
  ; ..
}

!105 = !{!"drop", !"Foo"}
!106 = !{!"drop", !"Bar"}
!107 = !{!105, !106}
; ..
!155 = !{!"dyn", !"Foo", !"foo"}
!156 = !{!"fn", !"fn(&Baz) -> bool"}
!157 = !{!155, !156}

Summary

In summary, these are the proposed changes:

  • Add an unstable -Z call-metadata flag

  • Using this flag adds extra LLVM metadata to the LLVM IR produced by rustc
    (--emit=llvm-ir). Three kinds of metadata will be added:

    • !{!"fn", !"fn() -> i32"} metadata will be added to the definitions of
      functions (IR: define) that are coerced into function pointers in the
      source code
      and to function pointer calls (IR: call _ %_(..)). The second
      node is a string that uniquely identifies the signature (type) of the
      function.

    • !{!"dyn", !"Trait", !"method"} metadata will be added to the trait method
      implementations (IR: define) of traits that appear as trait objects in
      the source code
      and to dynamic dispatch sites (IR: call _ %_({}* _, ..)).
      The second node is a string that uniquely identifies the trait and the third
      node is a string that uniquely identifies the trait method.

    • !{!"drop", "Trait"} metadata will be added to destructors (IR: define)
      of types that implement traits that appear as trait objects in the source
      code
      and to the invocations of trait object destructors. The second node is
      a string that uniquely identifies the implemented trait / trait object.

Alternatives

An alternative would be to make type information available in the final binary
artifact, that is in the executable, rather than in the LLVM IR. This would make
the feature harder to implement and less portable. Making the type information
available in, say, the ELF format would require designing a (binary) format
to encode the information in a linker section plus non-trivial implementation
work. Making this feature available in other formats (Mach-O, WASM, etc.) would
only multiply the amount of required work, likely leading to this feature being
implemented for some formats but not others.

Drawbacks

LLVM IR is tied to the LLVM backend; this makes the proposed feature hard, or
maybe even impossible, to port to other backends like cranelift. I don't think
this is much of an issue as this is an experimental feature; backend portability
can, and should, be revisited when we consider stabilizing this feature (if
ever).


Since this is a (hopefully small) experimental compiler feature (along the lines
of -Z emit-stack-sizes) and not a language (semantics or syntax)
change I'm posting this in rust-lang/rust for FCP consideration. If this
warrants a formal RFC I'd be happy to repost this in rust-lang/rfcs.

cc @rust-lang/compiler @oli-obk

@japaric japaric added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 25, 2019
@Centril
Copy link
Contributor

Centril commented Mar 26, 2019

To that end we propose that Rust type information is added to the LLVM IR
that rustc produces in the form of LLVM metadata when the unstable -Z rust-types-in-llvm-ir rustc flag is used.

LLVM IR is tied to the LLVM backend; this makes the proposed feature hard, or
maybe even impossible, to port to other backends like cranelift. I don't think
this is much of an issue as this is an experimental feature; backend portability
can, and should, be revisited when we consider stabilizing this feature (if
ever).

Do you expect this to become stable one day? Under what circumstances...?

!0 = "fn()"

Shouldn't this be fn() -> ()?

Having a stringified version of the function signature in the LLVM IR would be
nice to have but it's not required to produce an accurate call graph.

So it is safe to say that you don't expect stability wrt. the type name outputs?

@Centril Centril added needs-fcp This change is insta-stable, so needs a completed FCP to proceed. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels Mar 26, 2019
@japaric
Copy link
Member Author

japaric commented Mar 26, 2019

Do you expect this to become stable one day? Under what circumstances...?

Not in the form proposed here, which is tied to LLVM. Perhaps it would make
sense to encode the info in the object file, or perhaps encoding this info in
MIR would be a good compromise. Experience with the feature proposed here
(perhaps there are other use cases for this kind of feature?) would inform such
decision.

Speaking as the author of cargo-call-stack to me it doesn't make sense to
stabilize this before stabilizing -Z emit-stack-sizes. The latter is a hard
dependency of the tool and it's also deeply tied to LLVM; this feature is more
of an accuracy improvement (though a rather welcome one).

!0 = "fn()"

Shouldn't this be fn() -> ()?

Either it's fine.

So it is safe to say that you don't expect stability wrt. the type name
outputs?

Again, speaking as the author of cargo-call-stack, the value (string) behind
the metadata node is not important. As long as I can use the metadata identifier
(the number) to match function calls to their definitions any value will do.

But perhaps there are use cases for this feature to make the LLVM IR more
readable? In that case having the type name in the string would be a hard
requirement but if it is for human consumption the stability of the output still
wouldn't be too important.

@eddyb
Copy link
Member

eddyb commented Mar 27, 2019

I'm not seeing any mentions of debuginfo or DWARF, could they be used for this?

But also, I'm not sure what you need can be easily obtained at this moment.

It (almost) requires @rust-lang/wg-unsafe-code-guidelines to come up with some predicate for "definitely UB" fn ABI mismatches. Otherwise, you'd need to work with these, not the Rust signature:

(click to expand)

pub struct FnType<'a, Ty> {
/// The LLVM types of each argument.
pub args: Vec<ArgType<'a, Ty>>,
/// LLVM return type.
pub ret: ArgType<'a, Ty>,
pub c_variadic: bool,
pub conv: Conv,
}

/// Information about how to pass an argument to,
/// or return a value from, a function, under some ABI.
#[derive(Debug)]
pub struct ArgType<'a, Ty> {
pub layout: TyLayout<'a, Ty>,
/// Dummy argument, which is emitted before the real argument.
pub pad: Option<Reg>,
pub mode: PassMode,
}

For example, you bring up fn() -> i32 vs fn() -> u32, but transmuting between those is not UB right now AFAIK (modulo sign/zero-extension ABI details on some targets).

Btw, I have a WIP tool that tries really hard to find dynamic targets, I should talk to you about it! (it's more precise than a signature-based analysis, while still being an approximation - should work well on heap-less no-arbitrary-recursion binaries though)

@japaric
Copy link
Member Author

japaric commented Mar 27, 2019

I'm not seeing any mentions of debuginfo or DWARF, could they be used for this?

The debuginfo certainly contains type information but I doubt it includes trait information as in "this function is method foo of impl Trait for Type" (besides symbol names which are not reliable because of #[link_name]).

And if the info you can access from GDB is any indication of the information available in the final executable then most value-level debuginfo seems to be discarded by the optimization passes.

A quick look at the debug metadata (!dbg !123) in the LLVM IR indicates that there's type information on function pointers but there seem to be several levels of indirection between the debug metadata attached to the function pointer value and the metadata node that contains the type information. I doubt this information is going to be easy to extract from DWARF.

For example, you bring up fn() -> i32 vs fn() -> u32, but transmuting between those is not UB right now AFAIK

core::fmt transmutes method pointers to erase the type of the receiver and has not exploded (yet) so there must be cases where this operation is not UB. This pattern does make the signature based call graph analysis fail, but if an application is aiming for certification I can easily see "do not transmute function pointers" being part of the coding guidelines it needs to adhere to (see MISRA). In general, transmutes would need to be very well motivated if used in safety critical software so that kind of software will likely actively avoid them.

TL; DR I'm not too concerned about this pattern breaking the analysis as we'll likely be linting against it for the main use case of the tool.

@eddyb
Copy link
Member

eddyb commented Mar 27, 2019

@japaric Ah, in that case, may I suggest making this less about type information and more about "dynamic call target sets"? The compiler could even output several sets at once (e.g. one with very strict type-matching, and one with "the right size and number of register arguments/returns"), and it would be one general system instead of having both !fn and !dyn.

Also, have you looked at doing this on the MIR? The "monomorphization collector" effectively finds the callgraph (of instances), and if you're using the compiler API you can look at the entire list of reified¹ functions, compare the types and whatnot.
But I guess this can't really work as an external CLI tool (then again, you need nightly anyway for -Z).

¹i.e. turned into pointers - functions which aren't reified don't need to be dynamically callable! (at the very least, the "target sets" idea above would take this into account)

@oli-obk
Copy link
Contributor

oli-obk commented Mar 27, 2019

The reason this system is not built on top of the MIR is

it's' important that the call graph is extracted from
post-LLVM-optimization output as that greatly reduces the chance of inserting
invalid edges. For example, what appears to be a function call at the (Rust)
source level (e.g. let x = foo();) may not actually exist in the final binary
due to inlining or dead code elimination. For this reason, Rust stack usage
analysis tools are limited to two sources of call graph information: the machine
code in the final executable and post-optimization LLVM IR (rustc's
--emit=llvm-ir). The former contains no type information and the latter
contains LLVM type information, not Rust type information.

@japaric
Copy link
Member Author

japaric commented Mar 27, 2019

Ah, in that case, may I suggest making this less about type information and more about "dynamic call target sets"?

That would be even better but since this is an eRFC with no commitment towards stabilization I think the implementation and maintenance effort should be kept as small. I don't know how much effort would "dynamic call target sets" take but I have implemented a PoC of the !fn metadata before in a diff of less of 50 LoC.

Also, have you looked at doing this on the MIR?

The other reason I'm avoiding MIR is that I don't think its textual representation is stable and even though my tool requires nightly I'd like to minimize nightly breakage (the .ll and .stack_sizes formats are relatively stable and llvm upgrades are infrequent). I certainly want to avoid linking the tool to any of the rustc crates.

you can look at the entire list of reified¹ functions

That's an interesting data point. That would help narrow the targets of function pointer calls. Now if I could get access to that data without linking to rustc that'd be great :-).

@oli-obk
Copy link
Contributor

oli-obk commented Mar 27, 2019

The other reason I'm avoiding MIR is that I don't think its textual representation is stable and even though my tool requires nightly I'd like to minimize nightly breakage

We could add an unstable flag to rustc that dumps exactly the info you need (instead of you having to process MIR).

@Centril
Copy link
Contributor

Centril commented Mar 27, 2019

But perhaps there are use cases for this feature to make the LLVM IR more
readable? In that case having the type name in the string would be a hard
requirement but if it is for human consumption the stability of the output still
wouldn't be too important.

Alright; as long as we don't do anything here to make it more difficult to change the type names and such I'm happy from a T-Lang perspective.

core::fmt transmutes method pointers to erase the type of the receiver and has not exploded (yet) so there must be cases where this operation is not UB.

General note (that is probably not highly relevant to the rest of that paragraph...): Remember that just because core gets to do this it doesn't mean user-space does. core is effectively part of the language in some parts and can assume things about the compiler others cannot.

@eddyb
Copy link
Member

eddyb commented Mar 27, 2019

Oh, right, you can't do this on MIR because you need the LLVM frame sizes, so you need to emit some information into LLVM IR, or something like that.

As for the monomorphization collector being able to, you'd want to record the is_direct_call == false cases in this function, in order to be able to annotate the IR later:

fn visit_instance_use<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
instance: ty::Instance<'tcx>,
is_direct_call: bool,
output: &mut Vec<MonoItem<'tcx>>)
{

which only come from this call (so I guess you could record things here, instead):
mir::Rvalue::Cast(mir::CastKind::ReifyFnPointer, ref operand, _) => {
let fn_ty = operand.ty(self.mir, self.tcx);
let fn_ty = self.tcx.subst_and_normalize_erasing_regions(
self.param_substs,
ty::ParamEnv::reveal_all(),
&fn_ty,
);
visit_fn_use(self.tcx, fn_ty, false, &mut self.output);
}

Vtables are separate (although it'd be nicer if const-eval generated them):

/// Creates a `MonoItem` for each method that is referenced by the vtable for
/// the given trait/impl pair.
fn create_mono_items_for_vtable_methods<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
trait_ty: Ty<'tcx>,
impl_ty: Ty<'tcx>,
output: &mut Vec<MonoItem<'tcx>>) {

As for the "dynamic call target sets" idea, I suppose it depends what's merging LLVM IR modules (or is all the IR created at once, with -Z miri-only-rlibs and 1 CGU or w/e?), because rustc might not have the full set when it codegens.

Your notion of relying on strings, to group everything that refers to the same (conceptual) "set", would work even across multiple CGUs (as long as e.g. the type names are global/"absolute").

So maybe the main things I have to propose are:

  • take into account refication (the creation of fn / dyn Trait)
  • don't require the "group ID" value be a string, but allow any metadata, and rely on LLVM's deduplication (or reconstruct a tree after the fact) to compare them
    • this would allow a richer representation, closer to DWARF in LLVM IR
  • replace !dyn with !fn by having a syntax for the "group ID" that is unambiguous
  • allow a define to be in multiple !fn "groups" (like your !fn !2, !dyn !1 examples) by taking a metadata list of "group ID" instead of just a single "group ID"

@japaric
Copy link
Member Author

japaric commented Mar 27, 2019

EDIT: nevermind, eddyb's comment above (which was posted as I was going to hit the comment button) answers my questions below.

> We could add an unstable flag to rustc that dumps exactly the info you need (instead of you having to process MIR).

Would it be possible to emit the LLVM metadata on the function definition (IR: define) iff that function gets reified (i.e. the compiler knows it will be called via a function pointer)? If so, I think that would be simpler than emitting the information separately.

If we go with a separate flag for the list of reified functions then the list would need to include the signatures of the functions so that tools can match the function names in that list to the signatures of function pointer calls in the LLVM IR (call %_(..) !fn !0). As in:

$ # following the first example; note that baz and main are not listed here
$ rustc -Z print-reified-functions (..)
app::foo: fn() -> u32
app::bar: fn() -> u32

LLVM-IR:

define void @main() unnamed_addr #3 !dbg !1240, !fn !0 {
; ..
  %4 = tail call i32 %3() #9, !dbg !1254, !fn !1
; ..                                      ^^^^^^
}

!0 = "fn()"
!1 = "fn() -> u32" ; tool needs to compare this string against the output of `-Z print-reified-functions`

Also, is there a similar concept for trait methods? As in does the compiler knows which trait implementations will be used in dynamic dispatch? I assume it does since it has to produce a vtable for those. Or maybe it always produces a vtable and lets the linker remove the vtables that will never be used?

@japaric
Copy link
Member Author

japaric commented Apr 7, 2019

I have implemented this eRFC with @eddyb suggestions (#59412 (comment)) in
#59777. There were some extra deviations from the eRFC though:

Turns out it's not possible to attach multiple !rust metadata nodes to a
single define item / call instruction so I had to adjust the metadata syntax
a bit:

; I was hoping this (the double `!rust` bit) worked
; <hello::Baz as hello::Bar<bool>>::bar
define i1 @_(%Baz*) !rust !1 !rust !2 {
  ; ..
}

!1 = !{!"fn", !"fn(&Baz) -> bool"} ; function pointer call
!2 = !{!"dyn", !"Bar<bool>", !"bar"} ; dynamic dispatch

; but it didn't so instead I'm doing this
; <hello::Baz as hello::Bar<bool>>::bar
define i1 @_(%Baz*) !rust !3 {
  ; ..
}

; the two nodes get mixed into one
!3 = !{!"fn", !"fn(&Baz) -> bool", !"dyn", !"Bar<bool>", !"bar"}

This has an impact on tools. With the original approach I was aiming for tools
only having to look at the number in the call-site / define metadata (e.g. the
1 in !rust !1); now they'll have to look at the actual definition of the
named node (the !{"fn", "fn(&Baz) -> bool", "dyn", "Bar<bool>", "bar"} bit at
the bottom) which may contain multiple bits of information (in this case, it
says that the function may be called via a function pointer call or dynamic
dispatch). I think that even in this case we don't have to commit to a
particular text representation for types / traits; we can say that "the text
after !"fn" (!"dyn") uniquely represents a function type (trait) but its
syntax is not stable" with the goal of having tools exactly match the string
rather than parse it.

The other deviation from the eRFC is that I did not consider trait object
destructors at all in the eRFC, but I have implemented a "drop" metadata in
the PR.

Later today I'll amend the eRFC text to include these deviations and @eddyb's
suggestion which I have also implemented.

@eddyb
Copy link
Member

eddyb commented Apr 7, 2019

You can do !3 = !{!1, !2} to make it set-like.

@japaric
Copy link
Member Author

japaric commented Apr 7, 2019

I have updated the eRFC text.

@eddyp good idea. I have included that in the updated text (I have not yet added to the PR though)

japaric added a commit to japaric/rust that referenced this issue Apr 7, 2019
@japaric japaric changed the title [eRFC] Include (some) Rust type information in LLVM IR [eRFC] Include call graph information in LLVM IR Apr 7, 2019
@yzhaiustc
Copy link

Mark... Seems very useful.

@jonas-schievink jonas-schievink added the WG-embedded Working group: Embedded systems label Nov 9, 2019
@steveklabnik
Copy link
Member

Triage: looks like there was a bunch of activity, and then nothing... not sure what the next steps here are.

@hudson-ayers
Copy link
Contributor

Just commenting that this seems very useful, and that any mechanism that allows for mapping LLVM-IR to rust types could also be useful for symbolic execution of Rust binaries. I see that #59777 was closed, but I would be interested in contributing to some continuation of it if possible.

@workingjubilee workingjubilee added the A-CLI Area: Command-line interface (CLI) to the compiler label Mar 5, 2023
@cameronelliott
Copy link

This would probably be super useful to a lot of people, myself included, but it seems the original author @japaric ran into a show-stopper, and didn't figure a way around it.
It is mentioned here in the sample pull request:

#59777 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-CLI Area: Command-line interface (CLI) to the compiler A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems
Projects
None yet
Development

No branches or pull requests

10 participants