Support AVR for inline asm! #91224

couchand · 2021-11-25T16:04:45Z

A first pass at support for the AVR platform in inline asm!. Passes the initial compiler tests, have not yet done more complete verification.

In particular, the register classes could use a lot more fleshing out, this draft PR so far only includes the most basic.

cc @Amanieu @dylanmckay

rust-highfive · 2021-11-25T16:04:48Z

r? @wesleywiser

(rust-highfive has picked a reviewer for you, use r? to override)

Amanieu

Great work! Regarding register classes, just make sure you have enough to accurate describe the constraints that would be needed for various instructions. See https://www.nongnu.org/avr-libc/user-manual/inline_asm.html for a mostly complete list of constraint code used by avr-gcc and LLVM. For example, I can see some instructions which only accept a "high" register (but then again I'm not very familiar with AVR assembly).

Amanieu · 2021-11-25T20:47:12Z

compiler/rustc_target/src/asm/avr.rs

+        r31: reg = ["r31", "ZH"],
+
+        X: reg_xyz = ["r27r26", "X"],
+        Y: reg_xyz = ["r29r28", "Y"],


If you have a look at lib/Target/AVR/AVRRegisterInfo.cpp in the LLVM source you will see that Y is actually reserved for the stack pointer. Also r0 and r1 are reserved.

It looks like for Y the way to go is adding the #error pragma here to keep the user's hands off of it.

I think we'd need some way to name r0 and r1. For instance, to get the result of a MUL:

let a = 1u8; let b = 1u8; let res: u16; asm!( "mul {}, {}", in(reg) a, in(reg) b, lateout("r1r0") res, );

Is there another architecture that I can look to for an example of how to handle this?

(and just to clarify because it confused me for a moment, Y is reserved for the frame pointer, the stack pointer has a dedicated register SPH:SPL)

I think we'd need some way to name r0 and r1.

Or just leave it to the user to reassemble the value from r0 and r1 themselves. In the end it should be optimized the same way by the compiler.

Sorry, I wasn't clear. If r0 and r1 are reserved, and thus probably shouldn't be nameable for inline asm!, how is the user expected to get the results from them?

Or to put it another way, what should I do with the information that lib/Target/AVR/AVRRegisterInfo.cpp reserves these registers if not exclude them from use?

Sorry, I wasn't clear. If r0 and r1 are reserved, and thus probably shouldn't be nameable for inline asm!, how is the user expected to get the results from them?

Both registers r1 and r0 should be accessible. Any author of some piece Rust code targeting AVR should be able to name these registers in inline ASM.

The zero register should always be accessible even in inline asm so that the dev can pass 0 to an instruction which only supports registers as inputs, not immediates.

See this avr-gcc wiki doc https://gcc.gnu.org/wiki/avr-gcc#Fixed_Registers which shows that both registers should be available in inline asm.

It is a little bit finicky with LLVM instruction reordering, but it's always like that and I don't see a problem with your approach. You won't be able to say, run a piece of C code (i.e. something that executes MUL) and then use inline ASM to reassemble r1/r0 because LLVM inline assembler does not parse/inspect into the raw assembly text (only the constraints/list), so LLVM won't know you're referring to r1/r0 and so will not be able to preserve it during transformation. You would be able to use inline ASM to reassemble r1/r0 from the result of an instruction that is also inside the inline asm block, because the internal contents of any inline asm block would not be reordered. In summary, in this paragraph all I am trying to say that it is perfectly acceptable and allowable to use r1/r0 in inline ASM, but you cannot rely on their values across inline asm block boundaries

It sounds like there should be a register class just for r1/r0, they shouldn't be in reg because we wouldn't want to pick either of them for "any old register".

The zero register should always be accessible

Thanks for bringing this up. I'll need to add to the documentation that if a programmer writes to r1 in any way that they must restore it to zero before the end of the asm! block.

it is perfectly acceptable and allowable to use r1/r0 in inline ASM, but you cannot rely on their values across inline asm block boundaries.

Does that mean that you shouldn't be able to use them as inputs or outputs to the block? For instance, would my code sample up thread be expected to work, or does that count as "across block boundaries"?

I agree that r0 and r1 should be usable as operands in asm!. However the reality is that LLVM marks these as reserved registers, which causes LLVM's register allocator to ignore these and assume they are never touched by inline assembly. If you try to actually use r0/r1 as operands then things will break in unpredictable ways, which is why we always prevent reserved registers from being used in asm!.

The only "correct" way to use r0 and r1 in asm! is to save their value to another register (which should be properly marked as clobbered) and then restoring these values at the end of the assembly snippet. This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.

Source: https://github.com/llvm/llvm-project/blob/27c9e8b45b25614a92539ac6787dbb5670d950b3/llvm/lib/Target/AVR/AVRRegisterInfo.cpp#L53

I am having a hard time understanding exactly what LLVM means by reserved. The only authority I've found is the comment above getReservedRegs here: https://github.com/llvm/llvm-project/blob/e356027016c6365b3d8924f54c33e2c63d931492/llvm/include/llvm/CodeGen/TargetRegisterInfo.h#L524-L527

/// A reserved register: /// - is not allocatable /// - is considered always live /// - is ignored by liveness tracking

I don't think I know enough to tell if that should be applied to r0 and r1 or not. I'm inclined to trust your judgement there @Amanieu.

This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.

If this is where we're going, I'm happy to submit this change to LLVM. @dylanmckay, since I suspect you'll be reviewing the differential, does that change make sense to you?

In the meantime, I'll make r0 and r1 error out here with a helpful message.

This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.

If this is where we're going, I'm happy to submit this change to LLVM. @dylanmckay, since I suspect you'll be reviewing the differential, does that change make sense to you?

From my point of view, if AVR-GCC compatibility is kept, then the change is definitely good. If not, maybe it's possible but certainly more complicated 👍

src/doc/unstable-book/src/library-features/asm.md

compiler/rustc_target/src/asm/avr.rs

src/doc/unstable-book/src/library-features/asm.md

src/test/assembly/asm/avr-types.rs

compiler/rustc_codegen_llvm/src/asm.rs

wesleywiser · 2021-11-29T18:43:12Z

r? @Amanieu

couchand · 2021-11-29T19:23:18Z

Thanks for your review @Amanieu! I've replied in the places where I have questions.

Is there any style guidance on the translation from constraint codes/descriptions to register class names? Here is my proposal for a minimal set of constraint codes to support:

r for any 8-bit register
w for any 16-bit register pair
d for an 8-bit upper register
e for any pointer register pair (just X and Z I guess)
z for the Z register (and also any base pointer register since Y is not usable) [does this even need to be a class?]

The first two could possibly be named reg and wreg to reuse symbols from other architectures, but I'm not sure what the others should be named.

couchand · 2021-11-29T22:24:33Z

I've pushed up a variety of changes that addresses most (but not all) of the outstanding items. Please take a look when you have availability @Amanieu. Thanks!

couchand · 2021-11-29T23:03:10Z

I think I'm not entirely clear on how I should be handling the register pairs. Any pair can be used as an argument to the 16-bit MOVW instruction, and the upper pairs can be an argument to the 16-bit immediate instructions like ADIW.

Also, each of the pointer pairs, which have special behavior for the indirect memory operations, can be used as two general-purpose 8-bit registers when passed to any other instruction.

The 8-bit and 16-bit operation expect the lower register of the pair to be named, with the upper register implicit.

Are the changes I've made here consistent with that usage?

compiler/rustc_codegen_llvm/src/asm.rs

dylanmckay · 2021-12-01T11:38:51Z

I think I'm not entirely clear on how I should be handling the register pairs. Any pair can be used as an argument to the 16-bit MOVW instruction, and the upper pairs can be an argument to the 16-bit immediate instructions like ADIW.

Also, each of the pointer pairs, which have special behavior for the indirect memory operations, can be used as two general-purpose 8-bit registers when passed to any other instruction.

The 8-bit and 16-bit operation expect the lower register of the pair to be named, with the upper register implicit.

Are the changes I've made here consistent with that usage?

One point I'll make is that the instruction set is pretty ad-hoc/bolted on parts in ways, with even the registers supported for an "instruction" being target MCU specific. You will see instructions like: LPM: only ever works on the Z register, therefore, to use in inline ASM there needs to be a register class that can only ever resolve to the Z register[0] so GCC has an inline asm reg class constraint named z.

Another instruction complicating it is STD/LDD (store/load to memory with pointer offset/displacement). On some obscure AVR chips, they are only defined for Y and Z pointer registers and not the X pointer register (but you can probably ignore this case because the LLVM backend doesn't even make this distinction). But if the backend did make the distinction in the future as it should already be doing, then for the Rust code to work correct across AVR chips, the Rust code would need to explicitly opt-in to one of the x, y, z reg class constraints (redundantly excluding registers on the high end device that could be used, from being used), or the Rust compiler itself would need to expose an inline asm reg class constraint whose included registers were target device specific so that the Rust could could just use that and have it just work.

Suffice to say, I think it's best to land a core set like you're doing and add the others later. It's probably worth adding 3 more extra register classes in this PR (x, y, z, that can only resolve to their own respective implied registers) which will allow Rust code to at least use instructions like LPM and also work around any chip-specific/subtarget feature issues.

EDIT: I remembered you're intentionally not allowing frame pointer Y to be used so in that case obviously there's no need no add a register class for it

[0] Inline asm list: https://www.nongnu.org/avr-libc/user-manual/inline_asm.html

couchand · 2021-12-01T19:25:43Z

Thanks for the review @dylanmckay!

The point is well received that every AVR device is a new beast. For myself, I'm usually working on ones that don't even have a hardware MUL (so that conversation is all a bit theoretical!). A very conservative use of asm! might have a good chance of being portable across cores, but anything very complicated will need special attention paid to the specific core(s) in use. I think that's a reasonable trade off to make.

It's probably worth adding 3 more extra register classes in this PR (x, y, z, that can only resolve to their own respective implied registers) which will allow Rust code to at least use instructions like LPM and also work around any chip-specific/subtarget feature issues.

Are all the use-cases for these handled by the specific-register syntax of asm!? That is, is it sufficient that the user can write

let mut byte = MaybeUninit::uninit();
let input = 42u8;
asm!(
    "st X, {}",
    in("X") byte.as_mut_ptr() as u16,
    in(reg) input,
);

or should you also be able to say

let mut byte = MaybeUninit::uninit();
let input = 42u8;
asm!(
    "st {}, {}",
    in(x) byte.as_mut_ptr() as u16,
    in(reg) input,
);

Amanieu · 2021-12-02T02:15:40Z

I would recommend adding template modifiers for register pairs to print just the high/low register of a pair. Have a look at how it's done for ARM where the e and f modifiers allow printing the low/high d-reg component of a q-reg. In the AVR backend the equivalent modifiers are A and B.

compiler/rustc_target/src/asm/avr.rs

couchand · 2021-12-02T17:45:57Z

I would recommend adding template modifiers for register pairs to print just the high/low register of a pair. Have a look at how it's done for ARM where the e and f modifiers allow printing the low/high d-reg component of a q-reg. In the AVR backend the equivalent modifiers are A and B.

That's a great idea. I'll go ahead and add those.

couchand · 2021-12-02T22:25:56Z

Thank you @Amanieu and @dylanmckay for your continued reviews and many thoughtful comments. I have incorporated all of the feedback thus far and rebased this branch. Please take another look when you have an opportunity.

Amanieu · 2021-12-04T00:11:36Z

compiler/rustc_codegen_llvm/src/asm.rs

@@ -322,6 +322,7 @@ impl AsmBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> {
                InlineAsmArch::SpirV => {}
                InlineAsmArch::Wasm32 | InlineAsmArch::Wasm64 => {}
                InlineAsmArch::Bpf => {}
+                InlineAsmArch::Avr => {}


I had to look this up in LLVM's internals, but the clobber for the status register is "~{sreg}", which should be marked as clobbered when preserves_flags is not set. Also the documentation in the unstable book should be updated to reflect this.

I've added that.

So many of the other platforms have an empty branch here it was easy to not spend much thought on this clause. But the realization that I missed this made me worried there might be more. I just went through again and checked for other copy-pasta, didn't notice any concerns.

The one other place we have a null implementation that could potentially have code is suggest_modifier -- that one I did consider if there was a heuristic to use to know which byte you mean. But in the end it seems equally likely you'd want either byte, and there doesn't seem to be a way to suggest two modifiers.

src/doc/unstable-book/src/library-features/asm.md

Amanieu · 2021-12-06T15:42:33Z

@bors r+ rollup=never

bors · 2021-12-06T15:42:35Z

📌 Commit c6e8ae1 has been approved by Amanieu

bors · 2021-12-07T14:23:05Z

⌛ Testing commit c6e8ae1 with merge 0b6f079...

bors · 2021-12-07T17:26:29Z

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing 0b6f079 to master...

rust-timer · 2021-12-07T18:57:17Z

Finished benchmarking commit (0b6f079): comparison url.

Summary: This change led to moderate relevant regressions 😿 in compiler performance.

Moderate regression in instruction counts (up to 0.8% on incr-unchanged builds of externs)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

couchand · 2021-12-08T01:18:07Z

I have no idea what needs to be done to investigate this performance issue. From the linked results I can see there are five benchmark/profile/scenario rows with issues, and four of them are the externs benchmark.

But I don't know what to do next. What are these externs benchmarks? Why would inline assembler changes for a platform that presumably isn't exercised by the benchmark make such a significant change? What's the usual range on these results?

What does "sufficient written justification" require? Are there any examples or documentation that explain what's needed?

Amanieu · 2021-12-08T01:19:07Z

The performance issue is almost certainly not related to this PR.

couchand · 2021-12-08T01:31:04Z

The performance issue is almost certainly not related to this PR.

Good to hear. Is there anything actionable for me here, then?

pnkfelix · 2021-12-15T05:03:35Z

@couchand no, I don't think there's anything actionable for you here. If anything, the performance team might want to double check what's going on with that benchmark, and/or our noise suppression techniques.

pnkfelix · 2021-12-15T05:16:36Z

I think we can ignore this regression. externs is a pathological benchmark (3000 extern function declarations) and I do not think we should worry about it going up by 0.8%, especially since a different like that could easily be due to memory alignment effects in a benchmark like this.

@rustbot label: +perf-regression-triaged

rust-highfive assigned wesleywiser Nov 25, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 25, 2021

This comment has been minimized.

Sign in to view

Amanieu reviewed Nov 25, 2021

View reviewed changes

rust-highfive assigned Amanieu and unassigned wesleywiser Nov 29, 2021

couchand force-pushed the 2021-11/avr-asm branch from ca56851 to babde3c Compare November 29, 2021 22:17

dylanmckay reviewed Dec 1, 2021

View reviewed changes

compiler/rustc_codegen_llvm/src/asm.rs Outdated Show resolved Hide resolved

Amanieu reviewed Dec 2, 2021

View reviewed changes

compiler/rustc_target/src/asm/avr.rs Outdated Show resolved Hide resolved

Amanieu reviewed Dec 2, 2021

View reviewed changes

compiler/rustc_target/src/asm/avr.rs Outdated Show resolved Hide resolved

couchand mentioned this pull request Dec 2, 2021

generic: delay: update to asm! Rahix/avr-hal#229

Closed

6 tasks

couchand force-pushed the 2021-11/avr-asm branch 2 times, most recently from e3ef486 to 1e4e1c3 Compare December 2, 2021 22:23

couchand marked this pull request as ready for review December 2, 2021 22:24

Amanieu mentioned this pull request Dec 3, 2021

Tracking Issue for inline assembly (asm!) #72016

Closed

4 tasks

Amanieu reviewed Dec 4, 2021

View reviewed changes

Implement inline asm! for AVR platform

c6e8ae1

couchand force-pushed the 2021-11/avr-asm branch from 1e4e1c3 to c6e8ae1 Compare December 6, 2021 06:10

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 6, 2021

bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 7, 2021

bors merged commit 0b6f079 into rust-lang:master Dec 7, 2021

rustbot added this to the 1.59.0 milestone Dec 7, 2021

bors mentioned this pull request Dec 7, 2021

Remove the reg_thumb register class for asm! on ARM #90796

Merged

rustbot added the perf-regression Performance regression. label Dec 7, 2021

couchand deleted the 2021-11/avr-asm branch December 8, 2021 00:55

couchand restored the 2021-11/avr-asm branch December 8, 2021 00:55

couchand deleted the 2021-11/avr-asm branch December 8, 2021 00:55

couchand restored the 2021-11/avr-asm branch December 8, 2021 00:55

rustbot added the perf-regression-triaged The performance regression has been triaged. label Dec 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support AVR for inline asm! #91224

Support AVR for inline asm! #91224

couchand commented Nov 25, 2021

rust-highfive commented Nov 25, 2021

This comment has been minimized.

Amanieu left a comment

Amanieu Nov 25, 2021

couchand Nov 29, 2021

Amanieu Nov 29, 2021

couchand Nov 29, 2021

dylanmckay Dec 1, 2021

couchand Dec 1, 2021

Amanieu Dec 2, 2021

Amanieu Dec 2, 2021

couchand Dec 2, 2021

dylanmckay Dec 15, 2021

wesleywiser commented Nov 29, 2021

couchand commented Nov 29, 2021

couchand commented Nov 29, 2021

couchand commented Nov 29, 2021 •

edited

Loading

dylanmckay commented Dec 1, 2021 •

edited

Loading

couchand commented Dec 1, 2021 •

edited

Loading

Amanieu commented Dec 2, 2021

couchand commented Dec 2, 2021

couchand commented Dec 2, 2021

Amanieu Dec 4, 2021

couchand Dec 6, 2021 •

edited

Loading

Amanieu commented Dec 6, 2021

bors commented Dec 6, 2021

bors commented Dec 7, 2021

bors commented Dec 7, 2021

rust-timer commented Dec 7, 2021

couchand commented Dec 8, 2021

Amanieu commented Dec 8, 2021

couchand commented Dec 8, 2021

pnkfelix commented Dec 15, 2021

pnkfelix commented Dec 15, 2021

Support AVR for inline asm! #91224

Support AVR for inline asm! #91224

Conversation

couchand commented Nov 25, 2021

rust-highfive commented Nov 25, 2021

This comment has been minimized.

Amanieu left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wesleywiser commented Nov 29, 2021

couchand commented Nov 29, 2021

couchand commented Nov 29, 2021

couchand commented Nov 29, 2021 • edited Loading

dylanmckay commented Dec 1, 2021 • edited Loading

couchand commented Dec 1, 2021 • edited Loading

Amanieu commented Dec 2, 2021

couchand commented Dec 2, 2021

couchand commented Dec 2, 2021

Choose a reason for hiding this comment

couchand Dec 6, 2021 • edited Loading

Choose a reason for hiding this comment

Amanieu commented Dec 6, 2021

bors commented Dec 6, 2021

bors commented Dec 7, 2021

bors commented Dec 7, 2021

rust-timer commented Dec 7, 2021

couchand commented Dec 8, 2021

Amanieu commented Dec 8, 2021

couchand commented Dec 8, 2021

pnkfelix commented Dec 15, 2021

pnkfelix commented Dec 15, 2021

couchand commented Nov 29, 2021 •

edited

Loading

dylanmckay commented Dec 1, 2021 •

edited

Loading

couchand commented Dec 1, 2021 •

edited

Loading

couchand Dec 6, 2021 •

edited

Loading