-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DWARF for enums could be improved #32920
Comments
Yes, that would be nice. This strange encoding was chosen because it only contains things that also occur in C, so that existing debuggers can work with it. LLDB (and maybe GDB too) don't allow discriminated unions to be accessed via the Python API we use for pretty printing. Once we get proper support for Rust in debuggers, it would definitely better to encode enums differently. About |
I'd like to explore how DWARF's discriminated unions would work for Rust. Sticking to what DWARF currently specifies, we could make enums look like this: enum RegularEnum {
One { field1: u32, field2: f32 },
Two(i32),
} would be represented as
This representation would also scale to enums with optimized memory layout: enum OptimizedEnum {
Some(i64, &u32),
None,
} would be represented as
To stay consistent with that, C-like enums could also be represented like this: enum CLikeEnum {
One,
Two,
Three,
} would become
Now, this would work, but it would make some concessions to what is allowed in DWARF right now:
Regarding (1), we could use
Regarding (2), in a future version of Rust, where structs and enums are unified in one way or another, it might be possible that only part of a data-type is variant, so omitting the With these restrictions in mind, I would be OK with saying that enums are represented in DWARF as structs that have a variant part. Another question would be whether to represent Rust's C-like enums as actual cc @Manishearth |
This seems reasonable to me. I think having the variant part refer to one of the members is an extension to DWARF. Currently I think this has to refer to part of the structure (well, there is also some confusing text about a type, that didn't make sense to me). But, I think this is no big deal, we can document this as an extension and perhaps file a DWARF issue about it (DWARF issues being one of the two main non-coding deliverables from this project... do we need some kind of tag for this?) This approach seems to handle the non-zero optimzation reasonably well -- in fact it's nicer than what we have today, since it's a simple offset and dereference. |
I think it should be supported even without any extensions. From the DWARF 4 standard, "5.5.9 Variant Entries":
It's surprisingly unambiguous |
Ok, I can see how that could be read that way. I think the discriminant is supposed to come from outside the variants. See the introductory text to 5.5:
Also, in the text you quoted, there is a requirement for A note to the DWARF list asking for clarification would be a good idea. |
Or, no, I just wasn't reading it properly. |
You're right, I didn't see that. At least it's just in the informal description, not in the normative part. We could try to have that changed to something like:
That would seem unobjectionable to me. |
For the NonZero thing, the actual location of the discriminant can be several structures deep -- does this new scheme let us specify that? |
Yes, I would think so: The discriminant has its own |
Will this be prone to LLVM's optimizations reshuffling fields? Not sure if it does that now, but IIRC it reserves the power to do so. |
That's definitely something that needs to be investigated. I doubt that LLVM is allowed to change data-type layouts but there are optimizations like SROA that might cause problems. |
I have been looking into this, now that rustc has more space optimizations for enums. LLVM doesn't have debuginfo support for variant records, so adding that is the first step. I found this thread asking for clarifications about the DWARF standard in this area: http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2006-August/thread.html#3086 Also, while neither gdb nor gcc currently read |
Started LLVM work here: https://github.com/tromey/llvm/tree/discriminated-unions. I'll post a link to the Phabricator request when that is ready; but I'll probably be writing all the patches first before trying to land any of them, to make sure it all works ok. |
Started the rust compiler work here: https://github.com/tromey/rust/tree/enum-debug-info |
Today I got the compiler generating the output I want. Next up, tests and the gdb patch.
|
I'm going to be doing the gdb work here: https://github.com/tromey/gdb/tree/variant-parts |
It all seems to be working, so next I'm going to write an LLVM test and submit the LLVM patch. |
LLVM review request here: https://reviews.llvm.org/D42082 This differs from the branch I published, because it is based on LLVM master, not the branch Rust uses. Once LLVM accepts something, I will back-port it for submission to rust-llvm. |
Great to see you making progress |
The LLVM patch has landed. I've back-ported it to the 4.x branch of rust-llvm (though I can't seem to link tests there...). Now I'm updating the rustc patch to work with the changes I made to the patch during the LLVM review process. One unknown to me is whether I should also backport this to the rust-llvm 6.x branch. |
I have rustc working again. However I'm having some second thoughts about trying to land this in the near term. On the one hand, debuginfo for some enums is currently broken; but on the other hand, because lldb doesn't understand DWARF variants, landing this will break more cases for lldb (while fixing all cases in gdb once those patches land). |
gdb patches submitted for review: https://sourceware.org/ml/gdb-patches/2018-02/msg00269.html |
The gdb patches have landed upstream now. |
I don't really know this code either. It is a "pretty printer" for Visual Studio's debugger -- which you probably already knew. Do your DWARF extension already map to a CodeView/PDB equivalent? |
LLVM doesn't emit pdb for this construct. It seems like pdb might have something relevant here, since F# has a discriminated union type, but so far I haven't been able to find it. So maybe the principled approach is out of reach - I will see if I can find someone to ask. I suppose if there's no good way, then removing these visualizers is at least reflecting the underlying reality. Another possibility is to fall back to the current approach when not generating DWARF. However, the current approach is also already known to be broken in some cases. |
Actually I think we need the fallback regardless, because rustc still supports LLVM 5.0. |
@tromey: I've finally gotten around to figure out what's up with https://bugs.llvm.org/show_bug.cgi?id=36654. Looks like what is throwing LLVM off is union variants that are smaller than the entire union. This patch appears to fix it. |
I have a patch to change this to use However, there are some drawbacks:
So, one idea is to have a fallback mode -- basically, keep the existing code (even though it is broken in some situations), and use it for Windows and for LLVM 5/6. Recently, though, my manager was telling me not to bother and that I should try to only support LLVM 7. This is somewhat attractive since it would mean not having to rewrite my patch again. But, I don't know if it is really a good idea, certainly my natural inclination is more toward the fallback idea. What do you think we should do? Just today I rebased my existing patch. I plan to submit it very soon, assuming the above decision can is made. All that really remains is fixing up the testing situation -- mostly updating existing tests (and if the fallback route is not taken, removing some obsolete code from tests). |
Would that mean dropping Rust support for LLVM 5/6 altogether? Or just degrading their enum debuginfo? Seems preferable to keep the imperfect status quo while improving LLVM 7, if that's feasible. |
I would prefer to keep the option of generating old-style enum debug info, at least until rustc ships with a customized version of lldb. And even then, people might have a reason to keep using whatever gdb/lldb version comes with their distro. On the flip side, rustc would need flag(s) for choosing version of the emitted debug info :-/ |
In this scenario it would mean degraded DWARF for LLVM 5/6 users.
It will not emit correct information about enums. I don't really know what exactly will be produced. As far as I am aware there is no way to represent Rust enums in pdb.
I really do not want to do this. lldb will be in rustup soon. The needed gdb support is already committed and will be in gdb 8.2, which is due out soon and which anyway is trivial to build. So, if this is a blocker, I'd rather just wait until the debugger situation is considered good enough... this will not be any worse than the fallback plan anyway. I guess I'm ok with choosing the fallback path for pdb and for older versions of LLVM. Though I question if it's better to emit bad debug info rather than just not emit it at all. The upside of missing debug info is that at least the debugger doesn't lie to you. Future debuginfo changes are going to face these same issues. However in some of those cases, the fallback is more clearly to do nothing, since rustc already doesn't emit some kinds of debuginfo. Still, this is something to consider when choosing our path. |
@vadimcn My patch already includes something close to what your patch has, and so I think as part of the fallback I will be fixing this as well. |
The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR rust-lang#45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes rust-lang#32920 Closes rust-lang#32924 Closes rust-lang#52762 Closes rust-lang#53153
Fix DWARF generation for enums The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR #45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes #32920 Closes #32924 Closes #52762 Closes #53153
The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR rust-lang#45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes rust-lang#32920 Closes rust-lang#32924 Closes rust-lang#52762 Closes rust-lang#53153
Fix DWARF generation for enums The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR #45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes #32920 Closes #32924 Closes #52762 Closes #53153
The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR rust-lang#45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes rust-lang#32920 Closes rust-lang#32924 Closes rust-lang#52762 Closes rust-lang#53153
Fix DWARF generation for enums The DWARF generated for Rust enums was always somewhat unusual. Rather than using DWARF constructs directly, it would emit magic field names like "RUST$ENCODED$ENUM$0$Name" and "RUST$ENUM$DISR". Since PR #45225, though, even this has not worked -- the ad hoc scheme was not updated to handle the wider variety of niche-filling layout optimizations now available. This patch changes the generated DWARF to use the standard tags meant for this purpose; namely, DW_TAG_variant and DW_TAG_variant_part. The patch to implement this went in to LLVM 7. In order to work with older versions of LLVM, and because LLVM doesn't do anything here for PDB, the existing code is kept as a fallback mode. Support for this DWARF is in the Rust lldb and in gdb 8.2. Closes #32920 Closes #32924 Closes #52762 Closes #53153
Suppose you have a "non-trivial" enum like:
The DWARF looks like:
Then looking at the type of
One
, aka DIE 0xc0:This is peculiar DWARF in a few ways.
First, this emits a separate structure type for each branch. DWARF has a notion of a discriminated union that might be a closer fit semantically. See
DW_TAG_variant_part
,DW_AT_discr
, etc. (gdb's DWARF reader doesn't handle these at all, but it certainly could be made to.)Second, I think at the very least the discriminant should be marked
DW_AT_artificial
.The text was updated successfully, but these errors were encountered: