Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check rustc version before loading a proc-macro #6174

Closed
bjorn3 opened this issue Oct 8, 2020 · 25 comments · Fixed by #6822
Closed

Check rustc version before loading a proc-macro #6174

bjorn3 opened this issue Oct 8, 2020 · 25 comments · Fixed by #6822
Labels
E-has-instructions Issue has some instructions and pointers to code to get started E-medium good first issue S-actionable Someone could pick this issue up and work on it right now

Comments

@bjorn3
Copy link
Member

bjorn3 commented Oct 8, 2020

The .rustc section contains the rustc version in a very stable format. It is used to report the original rustc version when attempting to use a crate compiled with a different rustc version. You will first need to use snap to decompress the section, then you can look at get_rustc_version to see how to decode it. METADATA_HEADER is defined at https://github.com/rust-lang/rust/blob/382848989f91fa2c25390f0d5c1e8b1ae94f47df/compiler/rustc_metadata/src/rmeta/mod.rs#L54

@matklad matklad added E-has-instructions Issue has some instructions and pointers to code to get started E-medium good first issue S-actionable Someone could pick this issue up and work on it right now labels Oct 14, 2020
@jsomedon
Copy link
Contributor

jsomedon commented Dec 6, 2020

Hi, can I take this issue if no one's working on this? I am kinda new to the community but I want to start contributing. :-)

@lnicola
Copy link
Member

lnicola commented Dec 6, 2020

It's yours :-)

@jsomedon
Copy link
Contributor

jsomedon commented Dec 7, 2020

Might not be really related to this issue but I have several questions about proc macro feature in general:

  • Is there any reason for spawning a whole new process upon "proc_macro" arg, instead of, say using a thread, like ProcMacroProcessSrv::run spawn the ProcMacroProcessSrv::client_loop as a jod_thread?
  • I see no help string for "proc_macro" arg, is it like temporary design that might change in future?

And regarding the actual bug in this issue: I think I wrapped my mind over how this proc macro thing works in rust analyzer but I only took a peek to the rustc code mentioned by @bjorn3. So looks like the MetadataBlob type has the decoding logic implemented in the get_rustc_version method.

  • Is there anyway I can reuse this method somehow(I mean if the user always has rust toolchains installed, does rustc somehow expose something like this as some api, so I can, u know, kinda just call that api on user's rustc? Just asking.) or I better implement this similar logic myself?
  • Are we talking about checking the version of rustc used to compile binary crate that contains the proc macro definition, then see if it's the same as rustc installed on user machine? And do you mean the binary crate contains .rustc section, and I should decompress the section then decode?
  • Looks like the ProcMacroProcessSrv::find_proc_macros checks what macros are in the crate binary in ProcMacroClient::by_dylib_path method call. So my thought is, I should do version check inside ProcMacroClient::by_dylib_path, and proceed to ProcMacroProcessSrv::find_proc_macros if version check succeeds?
  • If version check fails, I probably should tell the user somehow. Should I do something like log::error!("The proc macro crate's version doesn't match.")?

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 7, 2020

Is there any reason for spawning a whole new process upon "proc_macro" arg, instead of, say using a thread, like ProcMacroProcessSrv::run spawn the ProcMacroProcessSrv::client_loop as a jod_thread?

To prevent bugs and version incompatibilities from brining down rust-analyzer as a whole.

Is there anyway I can reuse this method somehow(I mean if the user always has rust toolchains installed, does rustc somehow expose something like this as some api, so I can, u know, kinda just call that api on user's rustc? Just asking.) or I better implement this similar logic myself?

If you are using nightly, you can directly call into rustc_metadata functions. Rust-analyzer works with stable though, so you can't reuse it directly. You could copy it though.

Are we talking about checking the version of rustc used to compile binary crate that contains the proc macro definition, then see if it's the same as rustc installed on user machine?

You need to check the rustc version that compiled the proc macro to see if it is compatible with the proc_macro version included in rust-analyzer.

And do you mean the binary crate contains .rustc section, and I should decompress the section then decode?

Yes

Looks like the ProcMacroProcessSrv::find_proc_macros checks what macros are in the crate binary in ProcMacroClient::by_dylib_path method call. So my thought is, I should do version check inside ProcMacroClient::by_dylib_path, and proceed to ProcMacroProcessSrv::find_proc_macros if version check succeeds?

Makes sense

If version check fails, I probably should tell the user somehow. Should I do something like log::error!("The proc macro crate's version doesn't match.")?

log::error! is only visible to the user when they look at the source. You could show an lsp notification and/or add a diagnostic to each use of the proc macro.

@jsomedon
Copy link
Contributor

jsomedon commented Dec 7, 2020

And do you mean the binary crate contains .rustc section, and I should decompress the section then decode?

Yes

The only feasible way I can come up with is: look for bytes 2e 72 75 73 74 63 which is ascii representation of ".rustc", then bytes from this point upto but excluding 00 (that would be delimiter between sections I think?) would be the bytes I need to decompress & decode. Wonder if there is simpler way of doing this, and also is it safe to treat 00 as end mark of data? I mean could the decompressed data happen to contain 00 in between?

Also I just built and took a look at the proc_macro_test crate, to see what a .rustc section look like, and I only see .rustc text with no more data but followed by 00 then .debug_arranges directly?

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 7, 2020

You can use the object crate to read the .rustc section.

@jsomedon
Copy link
Contributor

jsomedon commented Dec 8, 2020

I went through decompressing, and kinda stuck at figuring out how decoding actually works once I jumped forward the 8(the METADATA_HEADER)+4(the crate root position) bytes offset. I expect to see logic like how many bytes to read and how to decode a string from them being defined somewhere, but couldn't find such thing. The only thing that looks related remotely is this decode method impled for Lazy<T>, but the signature seem don't match?

    crate fn get_rustc_version(&self) -> String {
        Lazy::<String>::from_position(NonZeroUsize::new(METADATA_HEADER.len() + 4).unwrap())
        // ^ this line above should give me a `Lazy<String>` value
            .decode(self)
            // so decode should be `fn (self, some_other_arg)`-> String?
            // but the `define` from the link above is `fn (some_non_self) -> Result<Self, String> `
            // I was asking fellas in discord channel if functions with such two signature match with
            // some advanced syntax that I never heard about, but that doesn't seem to be the case..
    }

EDIT:
I think I found the correct one, I am still trying to figure out like how many bytes decode and in what scheme though.

@lnicola
Copy link
Member

lnicola commented Dec 8, 2020

Not sure if it helps, but that section looks like this:

$ objdump -s -j .rustc deps/libserde_derive-86e2a18a1c01b046.so                                                                          2 ✘ 

deps/libserde_derive-86e2a18a1c01b046.so:     file format elf64-x86-64

Contents of section .rustc:
 0000 72757374 00000005 ff060000 734e6150  rust........sNaP
 0010 70590049 0d0060c9 7a44d5a1 01307275  pY.I..`.zD...0ru
 0020 73740000 00050000 50242b01 0df06563  st......P$+...ec
 0030 20312e35 302e302d 6e696768 746c7920   1.50.0-nightly 
 0040 28666539 38323331 39612032 3032302d  (fe982319a 2020-
 0050 31312d31 39290000 002b3cf3 ac05acc3  11-19)...+<.....
 0060 62294f2f 342953bd d8010004 10646572  b)O/4)S......der

For some reason the header is included twice and the version string isn't where I'd expect. There's also a sNaPpY magic, presumably because the section is compressed (bjorn3 already mentioned this). The version string shows up in clear, but that might happen even with compression (I don't know the specifics of Snappy).

Also, this seems like a really round-about way to check the version. We compile the proc macros ourselves by running cargo check so we already know the rustc version, right? I thought this was the proc macro ABI version (which we might be interested in), but it's just the compiler version.

@jsomedon
Copy link
Contributor

jsomedon commented Dec 8, 2020

ha, it looks like you didn't even decompress it and it can still show enough information..

@lnicola
Copy link
Member

lnicola commented Dec 8, 2020

Here it is after decompression:

┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 72 75 73 74 00 00 00 05 ┊ 00 00 50 24 2b 72 75 73 │rust000•┊00P$+rus│
│00000010│ 74 63 20 31 2e 35 30 2e ┊ 30 2d 6e 69 67 68 74 6c │tc 1.50.┊0-nightl│
│00000020│ 79 20 28 66 65 39 38 32 ┊ 33 31 39 61 20 32 30 32 │y (fe982┊319a 202│
│00000030│ 30 2d 31 31 2d 31 39 29 ┊ 00 00 00 2b 3c f3 ac 05 │0-11-19)┊000+<×ו│
│00000040│ ac c3 62 29 4f 2f 34 29 ┊ 53 bd d8 01 00 04 10 64 │××b)O/4)┊S×ו0••d│
│00000050│ 65 72 69 76 65 5f 73 65 ┊ 72 69 61 6c 69 7a 65 00 │erive_se┊rialize0│
│00000060│ f2 41 1c 1a b2 17 9c f9 ┊ 03 cb d2 4a c3 35 3c 1b │×A••×•××┊•××J×5<•│
│00000070│ 01 00 04 12 64 65 72 69 ┊ 76 65 5f 64 65 73 65 72 │•0••deri┊ve_deser│
│00000080│ 69 61 6c 69 7a 65 00 08 ┊ 83 15 1d a4 49 29 4d 23 │ialize0•┊ו•×I)M#│
│00000090│ 64 3d 9d 53 37 e6 c1 00 ┊ 00 00 14 02 00 00 00 a6 │d=×S7××0┊00••000×│

@jsomedon
Copy link
Contributor

jsomedon commented Dec 8, 2020

bjorn3 replied me a while ago on zulip, and I am posting his message here in case someone interested wants to know:

The metadata starts with 4 bytes that should be equal to b"rust", then 4 bytes for the crate root position. Followed by that is 4 bytes with the length of the rustc version string. Next comes the rustc version string itself encoded as UTF-8.

@jsomedon
Copy link
Contributor

jsomedon commented Dec 8, 2020

Here it is after decompression:

┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 72 75 73 74 00 00 00 05 ┊ 00 00 50 24 2b 72 75 73 │rust000•┊00P$+rus│
│00000010│ 74 63 20 31 2e 35 30 2e ┊ 30 2d 6e 69 67 68 74 6c │tc 1.50.┊0-nightl│
│00000020│ 79 20 28 66 65 39 38 32 ┊ 33 31 39 61 20 32 30 32 │y (fe982┊319a 202│
│00000030│ 30 2d 31 31 2d 31 39 29 ┊ 00 00 00 2b 3c f3 ac 05 │0-11-19)┊000+<×ו│
│00000040│ ac c3 62 29 4f 2f 34 29 ┊ 53 bd d8 01 00 04 10 64 │××b)O/4)┊S×ו0••d│
│00000050│ 65 72 69 76 65 5f 73 65 ┊ 72 69 61 6c 69 7a 65 00 │erive_se┊rialize0│
│00000060│ f2 41 1c 1a b2 17 9c f9 ┊ 03 cb d2 4a c3 35 3c 1b │×A••×•××┊•××J×5<•│
│00000070│ 01 00 04 12 64 65 72 69 ┊ 76 65 5f 64 65 73 65 72 │•0••deri┊ve_deser│
│00000080│ 69 61 6c 69 7a 65 00 08 ┊ 83 15 1d a4 49 29 4d 23 │ialize0•┊ו•×I)M#│
│00000090│ 64 3d 9d 53 37 e6 c1 00 ┊ 00 00 14 02 00 00 00 a6 │d=×S7××0┊00••000×│

What command was used for this? Very nice textual layout!

@lnicola
Copy link
Member

lnicola commented Dec 8, 2020

I think it's rust\x00\x00\x00\x05, then the crate root offset, the length and the string. But the version string is still present twice. I skipped over the first one before uncompressing the rest of the section. The Snappy header is \xff\06\x00\x00sNaPpY, visible in the first dump.

What command was used for this? Very nice textual layout!

https://github.com/sharkdp/hexyl

@jsomedon
Copy link
Contributor

I thought I went through the decompression, but realized that I was working with the, un-decompressed section all the time... Basically, I got the ".rsutc" section using object crate, then I tried to decompress the section's bytes using snap crate. I tried snap::read::FrameDecoder and got error like Custom { kind: Other, error: StreamHeader { byte: 114 } }( thrown when the decoder expects header type but different chunk type was read, in this case, the byte read is 114, that is 'r', my guess is the error was thrown at the very first byte r of rust........sNaP), I also tried the snap::raw::Decoder and got error with invalid offset(thrown on offset that's 0 or out of bound, mine wasn't 0 so probably out of bound).

I see that my original section bytes contain the snappy magic word as well, so I should be able to somehow decompres it but have no idea where did I get it wrong. I see that's @lnicola actually decompressed it, wonder how did you do that?

@lnicola
Copy link
Member

lnicola commented Dec 11, 2020

You need to skip the first 8 bytes:

Contents of section .rustc:
 0000 72757374 00000005 ff060000 734e6150  rust........sNaP
 0010 70590049 0d0060c9 7a44d5a1 01307275  pY.I..`.zD...0ru
 0020 73740000 00050000 50242b01 0df06563  st......P$+...ec
 0030 20312e35 302e302d 6e696768 746c7920   1.50.0-nightly 
 0040 28666539 38323331 39612032 3032302d  (fe982319a 2020-
 0050 31312d31 39290000 002b3cf3 ac05acc3  11-19)...+<.....
 0060 62294f2f 342953bd d8010004 10646572  b)O/4)S......der

The compressed data starts at ff060000 734e6150. I mentioned this before, the Rust metadata version header is store twice, once before the compressed data and once at the beginning of the compressed data.

@jsomedon
Copy link
Contributor

Just finished checking verision from .rustc, and I can confirm that the layout is just like what @lnicola said. Now moving into comparing crate's version with rust-analyzer's version..

1.How do I check what proc_macro version rust-analyzer uses inside rust-analyzer code?
2. Should loading happen only if both versions are the same? Or is it ok to have the crate's version lower than rust-analyzer's verison?(because rust is generally backward compatible I assume?)

@jsomedon
Copy link
Contributor

Also, this seems like a really round-about way to check the version. We compile the proc macros ourselves by running cargo check so we already know the rustc version, right? I thought this was the proc macro ABI version (which we might be interested in), but it's just the compiler version.

Does this mean the crate's version I've been trying to parse out is not the one we are interested??

@lnicola
Copy link
Member

lnicola commented Dec 11, 2020

I'm not sure. I don't think ABI changes are backwards compatible, so we might want to either:

  • support multiple ABI versions (the code is mostly copied from rustc, we could create a new module on each change and pick the one we need at run-time)
  • support a single version and include in the source code the commit (or date) to which it corresponds.

Ideally we'd support at least nightly, beta and stable.

Does this mean the crate's version I've been trying to parse out is not the one we are interested??

It's not that we are not interested in it, but AFAICT it's pretty much the same one that rustc -V will return. The hard part is not getting the version, but deciding what to do based on it.

@jsomedon
Copy link
Contributor

jsomedon commented Dec 11, 2020

Sounds like we are talking about implementing a whole new feature, not just a simple bug fix. Not sure how planning and decision making go through in this project, I guess a new contributor like me might not get to say much.. nevertheless, I imagine a typical user would definitely like to see rust analyzer supporting as many versions as possible.

That being said, I will go by whatever version support you guys think is reasonable to implement at this point.

@lnicola
Copy link
Member

lnicola commented Dec 11, 2020

There's more discussion in #6448.

To close this I suppose we could add a method somewhere that would return the version string from a proc macro shared library. Actually using it would be a different issue (to avoid scope creep in a new contributor's work).

@jsomedon
Copy link
Contributor

a method somewhere that would return the version string from a proc macro shared library

I think that's exactly what I have implemented so far. To double check, the version string in the compressed part of the crate's .rustc section, right?

@lnicola
Copy link
Member

lnicola commented Dec 11, 2020

To double check, the version string in the compressed part of the crate's .rustc section, right?

Yes, this part:

image

(Unrelated, but is hexyl really printing \x00 as 0 😕 ?)

image

@jsomedon
Copy link
Contributor

(Unrelated, but is hexyl really printing \x00 as 0 confused ?)

(yeah on my laptop it prints 0 too)

@jsomedon
Copy link
Contributor

Okay then I will add a read_version method to ProcMacroClient in the proc_macro_api/src/lib.rs if that's okay.

And before that I will try to figure out how to make a pull request first 😅

@bjorn3
Copy link
Member Author

bjorn3 commented Dec 11, 2020

Hexyl should use a different color for b'\0x00' and b'0'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-has-instructions Issue has some instructions and pointers to code to get started E-medium good first issue S-actionable Someone could pick this issue up and work on it right now
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants