- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
sess: default to v0 symbol mangling #89917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
sess: default to v0 symbol mangling #89917
Conversation
| It looks like the valgrind and LLVM tool patches were only just merged in the last couple months - are they included in published releases? It'll probably take at least a year or two for them to get into e.g. LTS releases on distros, which we may not want to wait for, but having at least some release seems reasonable to me and is probably not that long a wait. | 
| @bors try @rust-timer queue | 
| Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf | 
| ⌛ Trying commit 4f1bf2a with merge d6cf5a91f80fba50e1594bddce52ee42486468c6... | 
| 
 As far as I can tell, the LLVM patch is in LLVM 13 (GitHub says llvm/llvm-project@0a2d4f3 is on the branch at least) and valgrind 3.18.0 has support (release notes). | 
| ☀️ Try build successful - checks-actions | 
| Queued d6cf5a91f80fba50e1594bddce52ee42486468c6 with parent af9b508, future comparison URL. | 
| Finished benchmarking commit (d6cf5a91f80fba50e1594bddce52ee42486468c6): comparison url. Summary: This change led to very large relevant mixed results 🤷 in compiler performance. 
 If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with  @bors rollup=never | 
| 
 FTR, on the just released Ubuntu 21.10, the llvm available in the repos is 13.0, and the valgrind available is 3.17.0. Debian sid however is still on LLVM 11.0. Nix OS has just updated to the new valgrind on their master branch, and I think it'll take a few days until it arrives on hydra. Stable users on 21.05 still get 3.16.1, but 21.11 seems to be on track to getting the new version. | 
| As discussed in the compiler team triage meeting a while ago, we will do a @rust-lang/compiler team FCP on this PR. That will take at least 10 days, so that the new default would make it to stable 1.58 (released January 13th 2022). With that timeline in mind, do you still think it is necessary to wait for a valgrind release, @Mark-Simulacrum? | 
| I think the latest valgrind release already contains the relevant patches. I'm not sure how I feel about the default being not supported by tooling on LTS distros, but I don't know that we want to wait for the potentially long time that would take -- it seems unlikely to be too impactful -- rustfilt is pretty fast and easy to install. I think we can see how much trouble we might run into by flipping the default for the compiler itself now (perhaps even before an FCP concludes) so that we can get some sense of the impact -- to maximize the time for identifying any problems we can observe in the wild from users and compiler developers using these. Another concern I'll note that as of now this increases the compiler's install size by ~27 MB (from 378 to 405 MB) -- not a huge increase, but also not particularly small. The download size only goes up by ~5MB so I guess the extended symbols compress well. The stripped binaries avoid this extra cost, but those aren't what we ship today. Maybe it's worth blocking this on stable support for split-debuginfo, which would give a smoother path toward stripping binaries (right?) for most users. But this is not any hard blockers IMO, just some concerns to think through as we make this decision. I think a writeup of each of these and why we're making a particular tradeoff here would be helpful to me at least. On the other hand, there are definite benefits to the new mangling -- e.g., I've been using it for some of the performance triage + improvement work I'm doing locally -- so generally I think we should move ahead. Timeline wise, I don't know whether we expect much breakage or issues filed, but I'll note that shipping in early January probably means most of the beta/nightly cycles will be during holidays and generally we don't have as much speed. So maybe delaying 1 releases (to ~1.59 or 1.60) would be of benefit regardless. | 
| I don't think split-debuginfo can handle linking (import/export) symbols - and I've only heard of a way to split the latter on Windows (mapping names to "ordinals"). (I sometimes "symbols" used ambiguously - the distinction I'm making is that debuginfo is non-semantic, but "linking names" are semantic and allow less wiggle room) It would be great to keep linking names compressed (esp. since a lot of names would share very similar crate roots, path components, etc.) until some  Sadly I'm worried platform tooling is decades behind having this kind of infrastructure directly supported, and we'd have to do a lot of it ourselves (and in brittle ways). | 
| 
 Yes, that sounds like a good idea. I'll open a PR shortly. 
 What format would you like that to have? Extending the PR message maybe? 
 I think that's an important point: there's also a cost to keeping the old mangling scheme around. We developed the new scheme because the old one is lacking in a number of ways. 
 That's good point. | 
| 
 Implemented in PR #90054. | 
| Yeah, I was thinking about adding to the PR message. I skimmed the RFC but I think it didn't directly address the tradeoffs I mentioned. Another aspect which seems like it's worth including is our plans for eventual removal of the old mangling scheme. I could also imagine that some users might be interested in an alternate mangling that is just a hash and so is as short as possible. 
 Hm, I guess. I think there's probably still an increase in debuginfo size with the new mangling but I suppose it's probably not as important. Seems good to call out this support / lack thereof, though. | 
| 
 Yes, we might want to extend the new scheme with a well-defined "hash mode" for cases like this. E.g.  I would also be interested in a "zstd mode" with a predefined dictionary. That might give additional compression without losing information. But we'd need to collect numbers for that and make sure zstd dictionaries are standardized enough for something like that. But both things would need a proper RFC amendment. | 
| I've labeled the PR as blocked for now because of @Mark-Simulacrum's point about this becoming stable right after the holiday season. I'll look into a writeup about tradeoffs some time next week. | 
…piler, r=Mark-Simulacrum Make new symbol mangling scheme default for compiler itself. As suggest in rust-lang#89917 (comment), this PR enables the new symbol mangling scheme for the compiler itself. The standard library is still compiled using the legacy mangling scheme so that the new symbol format does not show up in user code (yet). r? `@Mark-Simulacrum`
…ler, r=Mark-Simulacrum Make new symbol mangling scheme default for compiler itself. As suggest in rust-lang#89917 (comment), this PR enables the new symbol mangling scheme for the compiler itself. The standard library is still compiled using the legacy mangling scheme so that the new symbol format does not show up in user code (yet). r? `@Mark-Simulacrum`
| 
 I think this is a really good point! Perhaps it would make sense to land this change on nightly and disable it once beta branches for a few cycles? That would give us extra testing time on nightly while not putting us on a path toward shipping in 1.58 when many people would have been on holiday. | 
…g-version, r=wesleywiser Stabilize -Z symbol-mangling-version=v0 as -C symbol-mangling-version=v0 This allows selecting `v0` symbol-mangling without an unstable option. Selecting `legacy` still requires -Z unstable-options. This does not change the default symbol-mangling-version. See rust-lang#89917 for a pull request changing the default. Rationale, from rust-lang#89917: Rust's current mangling scheme depends on compiler internals; loses information about generic parameters (and other things) which makes for a worse experience when using external tools that need to interact with Rust symbol names; is inconsistent; and can contain . characters which aren't universally supported. Therefore, Rust has defined its own symbol mangling scheme which is defined in terms of the Rust language, not the compiler implementation; encodes information about generic parameters in a reversible way; has a consistent definition; and generates symbols that only use the characters A-Z, a-z, 0-9, and _. Support for the new Rust symbol mangling scheme has been added to upstream tools that will need to interact with Rust symbols (e.g. debuggers). This pull request allows enabling the new v0 symbol-mangling-version. See rust-lang#89917 for references to the implementation of v0, and for references to the tool changes to decode Rust symbols.
| Discussed on 2022-01-13 during T-compiler meeting (see Zulip thread) @rustbot label -I-compiler-nominated | 
| I think this can be revived, though I'm not sure what the conclusion was regarding how long we should wait for external tools to pick up and distribute the (now fixed) v0. I like the aforementioned idea of having a "hash-only" symbol mode for performance; maybe we could even have Cargo automatically turn it on when the "strip" profile option is enabled, since by that point we know you won't be doing any debugging anyway. | 
| It looks like @nnethercote's Valgrind fix made it into the 3.19 release. Looking around at some distros, I see the following versions of Valgrind packaged: 
 | 
| T-compiler discussed this in a dedicated steering meeting on Zulip. The following action items were noted: 
 | 
…rister Add documentation on v0 symbol mangling. This adds official documentation for the v0 symbol mangling format, migrating the documentation from [RFC 2603](https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html). The format was originally stabilized as the `-C symbol-mangling-version` option, but the specifics were not stabilized (per rust-lang#90128 (comment)). Per the discussion at rust-lang#93661 (comment) this adds those specifics as an official description of the format. cc rust-lang#89917
…ster Add documentation on v0 symbol mangling. This adds official documentation for the v0 symbol mangling format, migrating the documentation from [RFC 2603](https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html). The format was originally stabilized as the `-C symbol-mangling-version` option, but the specifics were not stabilized (per rust-lang#90128 (comment)). Per the discussion at rust-lang#93661 (comment) this adds those specifics as an official description of the format. cc rust-lang#89917
| For info, T-compiler held a design meeting 2 weeks ago about updating v0 symbol mangling (on Zulip), specifically 
 It was decided to go for option 1, therefore closing MCP#737. Actionables: 
 (please feel free to add more details if I forgot anything important) | 
| To clarify: the design meeting was about setting a policy for dealing with updates to the v0 mangling scheme (which sometimes are necessary when new language features are added). The meeting was not about switching from legacy- to v0-mangling. | 
| Removing the assignee. If anyone reading this is interested in moving this forward (see previous comment), feel free to self-assign :-) | 
4f1bf2a    to
    1174ea7      
    Compare
  
    
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
1174ea7    to
    36de2f7      
    Compare
  
    
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
36de2f7    to
    a473676      
    Compare
  
    
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
a473676    to
    85fa90a      
    Compare
  
    | Some changes occurred in tests/codegen-llvm/sanitizer cc @rcvalle | 
      
        
              This comment was marked as resolved.
        
        
      
    
  This comment was marked as resolved.
Rust's current mangling scheme depends on compiler internals; loses information about generic parameters (and other things) which makes for a worse experience when using external tools that need to interact with Rust symbol names; is inconsistent; and can contain `.` characters which aren't universally supported. Therefore, Rust has defined its own symbol mangling scheme which is defined in terms of the Rust language, not the compiler implementation; encodes information about generic parameters in a reversible way; has a consistent definition; and generates symbols that only use the characters `A-Z`, `a-z`, `0-9`, and `_`. Support for the new Rust symbol mangling scheme has been added to upstream tools that will need to interact with Rust symbols (e.g. debuggers). This commit changes the default symbol mangling scheme from the legacy scheme to the new Rust mangling scheme. Signed-off-by: David Wood <[email protected]>
85fa90a    to
    83ca9cf      
    Compare
  
    | This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. | 
Closes #60705.
Rust's current mangling scheme depends on compiler internals; loses information about generic parameters (and other things) which makes for a worse experience when using external tools that need to interact with Rust symbol names; is inconsistent; and can contain
.characters which aren't universally supported. Therefore, Rust has defined its own symbol mangling scheme which is defined in terms of the Rust language, not the compiler implementation; encodes information about generic parameters in a reversible way; has a consistent definition; and generates symbols that only use the charactersA-Z,a-z,0-9, and_.Support for the new Rust symbol mangling scheme has been added to upstream tools that will need to interact with Rust symbols (e.g. debuggers).
This pull request changes the default symbol mangling scheme from the legacy scheme to the new Rust mangling scheme.
The following pull requests implemented v0 mangling in rustc (if I'm missing any, let me know):
charin v0 mangling #77554Rust's symbol mangling scheme has support in the following external tools:
binutils/gdb(GNUlibiberty)committed as gcc-mirror/gcc@979526c
committed as gcc-mirror/gcc@42bf58b
committed as gcc-mirror/gcc@e1cb00d
(original submission) committed as gcc-mirror/gcc@32fc371
(original submission) committed as gcc-mirror/gcc@8409649
lldb/llvm-objdump/llvm-nm/llvm-symbolizer/llvm-cxxfilt/etcperfvalgrind#85530 (comment) contains a summary of the most recent crater run of the v0 mangling, and the remaining issues from that were fixed by #87194 (confirmed by follow-up crater run, #85530 (comment)).
@rustbot label +T-compiler
r? @michaelwoerister