support #[target_feature(enable = ...)] on #[naked] functions #137720

Open
wants to merge 12 commits into
base: master
from
230 changes: 230 additions & 0 deletions compiler/rustc_codegen_ssa/src/mir/naked_asm.rs
@@ -1,6 +1,8 @@
use object::{Architecture, SubArchitecture};
use rustc_abi::{BackendRepr, Float, Integer, Primitive, RegKind};
use rustc_attr_parsing::InstructionSetAttr;
use rustc_hir::def_id::DefId;
use rustc_middle::middle::codegen_fn_attrs::{CodegenFnAttrs, TargetFeature};
use rustc_middle::mir::mono::{Linkage, MonoItem, MonoItemData, Visibility};
use rustc_middle::mir::{Body, InlineAsmOperand};
use rustc_middle::ty::layout::{FnAbiOf, HasTyCtxt, HasTypingEnv, LayoutOf};
@@ -104,6 +106,221 @@ fn inline_to_global_operand<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
}
}

// FIXME share code with `create_object_file`
fn parse_architecture(
Comment on lines +109 to +110
Contributor Author


Is there some standard way of getting the architecture that I'm missing? I'd like the rest of the code to operate on an enum rather than doing a bunch of manual string matching.
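
As a rough illustration of the enum-based approach asked about here, the sketch below maps a target's arch string and pointer width to a small enum once, so that later code can match on variants instead of strings. `Arch` and `parse_arch` are hypothetical names made up for this example (they are not rustc or `object` APIs), and only a handful of architectures are covered.

```rust
/// Hypothetical, trimmed-down stand-in for `object::Architecture`.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arch {
    Arm,
    Aarch64,
    Aarch64Ilp32,
    X86,
    X86_64,
    X86_64X32,
    Riscv32,
    Riscv64,
}

/// Map an arch string plus pointer width (in bits) to an `Arch`.
/// Returns `None` for anything this sketch does not know about.
pub fn parse_arch(arch: &str, pointer_width: u32) -> Option<Arch> {
    let parsed = match arch {
        "arm" => Arch::Arm,
        // The 32-bit pointer variants are checked first via match guards.
        "aarch64" if pointer_width == 32 => Arch::Aarch64Ilp32,
        "aarch64" => Arch::Aarch64,
        "x86" => Arch::X86,
        "x86_64" if pointer_width == 32 => Arch::X86_64X32,
        "x86_64" => Arch::X86_64,
        "riscv32" => Arch::Riscv32,
        "riscv64" => Arch::Riscv64,
        // Unsupported architecture.
        _ => return None,
    };
    Some(parsed)
}
```

With this shape, a caller would do the string matching exactly once and then `match` on `Arch` everywhere else, which is the property the comment is after.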

sess: &rustc_session::Session,
) -> Option<(Architecture, Option<SubArchitecture>)> {
let (architecture, subarchitecture) = match &sess.target.arch[..] {
"arm" => (Architecture::Arm, None),
"aarch64" => (
if sess.target.pointer_width == 32 {
Architecture::Aarch64_Ilp32
} else {
Architecture::Aarch64
},
None,
),
"x86" => (Architecture::I386, None),
"s390x" => (Architecture::S390x, None),
"mips" | "mips32r6" => (Architecture::Mips, None),
"mips64" | "mips64r6" => (Architecture::Mips64, None),
"x86_64" => (
if sess.target.pointer_width == 32 {
Architecture::X86_64_X32
} else {
Architecture::X86_64
},
None,
),
"powerpc" => (Architecture::PowerPc, None),
"powerpc64" => (Architecture::PowerPc64, None),
"riscv32" => (Architecture::Riscv32, None),
"riscv64" => (Architecture::Riscv64, None),
"sparc" => {
if sess.unstable_target_features.contains(&sym::v8plus) {
// Target uses V8+, aka EM_SPARC32PLUS, aka 64-bit V9 but in 32-bit mode
(Architecture::Sparc32Plus, None)
} else {
// Target uses V7 or V8, aka EM_SPARC
(Architecture::Sparc, None)
}
}
"sparc64" => (Architecture::Sparc64, None),
"avr" => (Architecture::Avr, None),
"msp430" => (Architecture::Msp430, None),
"hexagon" => (Architecture::Hexagon, None),
"bpf" => (Architecture::Bpf, None),
"loongarch64" => (Architecture::LoongArch64, None),
"csky" => (Architecture::Csky, None),
"arm64ec" => (Architecture::Aarch64, Some(SubArchitecture::Arm64EC)),

// added here
"wasm32" => (Architecture::Wasm32, None),
"wasm64" => (Architecture::Wasm64, None),
"m68k" => (Architecture::M68k, None),

// Unsupported architecture.
_ => return None,
};

Some((architecture, subarchitecture))
}

/// Enable the function's target features in the body of the function, then disable them again
fn enable_disable_target_features<'tcx>(
tcx: TyCtxt<'tcx>,
attrs: &CodegenFnAttrs,
) -> Option<(String, String)> {
use std::fmt::Write;

let mut begin = String::new();
let mut end = String::new();

let (architecture, _subarchitecture) = parse_architecture(tcx.sess)?;
let features = attrs.target_features.iter().filter(|attr| !attr.implied);

match architecture {
Architecture::X86_64 | Architecture::X86_64_X32 | Architecture::I386 => {
// no action is needed, all instructions are accepted regardless of target feature
}

Architecture::Aarch64 | Architecture::Aarch64_Ilp32 | Architecture::Arm => {
Member


On Arm (AArch32), some target features, such as v8 and neon, don't seem to work with .arch_extension: https://godbolt.org/z/GTK3vqfrh

The v* features do work with .arch armv*: https://godbolt.org/z/7oj9njWG6

Contributor Author


The docs list only three features for .arch_extension on aarch32: crc, fp16 and ras. I could see how the architectures are special, but even dotprod does not seem to work?!

Also, regarding .arch: because there is no push/pop, we can't get back to the original arch once we set it, right? I did figure out that .fpu neon enables neon, but as with .arch, there is no way to reset the fpu once you set it.
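
For contrast, here is a minimal sketch of what a push/pop mechanism buys you, using the RISC-V `.option push` / `.option arch, +ext` / `.option pop` directives that this PR emits for RISC-V. `wrap_riscv` is a hypothetical helper written for this illustration, and the extension name in the usage note is only an example.

```rust
use std::fmt::Write;

/// Wrap `body` so that `features` are enabled only while it is assembled.
/// The assembler state is saved before and restored after, so nothing
/// needs to be known about the previous configuration.
pub fn wrap_riscv(body: &str, features: &[&str]) -> String {
    let mut out = String::new();
    // Save the current assembler options.
    writeln!(out, ".option push").unwrap();
    for feature in features {
        writeln!(out, ".option arch, +{feature}").unwrap();
    }
    writeln!(out, "{body}").unwrap();
    // Restore whatever was active before the push.
    writeln!(out, ".option pop").unwrap();
    out
}
```

Calling `wrap_riscv("ret", &["zbb"])` yields the body bracketed by the push and pop lines. With only a set-style directive such as .arch or .fpu, there is no equivalent of the final restore line.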

Contributor Author


I did some further work categorizing the features that Rust/LLVM supports:

https://godbolt.org/z/46s7zxvfo

Contributor Author


So, good news: I think there is a way forward here.

global_asm! being stateful (as mentioned here #137720 (comment)) cannot be relied on by users because of this rule in the reference:

https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.rules.not-successive

Therefore, we just need to make sure that at the end of a naked function, the settings are back to what the global target features configure. It's less elegant than the push/pop mechanism, and for Arm it will still be a bit of a mess, but it seems workable.
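
A minimal sketch of that idea, assuming every feature can be toggled with `.arch_extension name` / `.arch_extension noname` (which, per the discussion above, is not true for all Arm features): enable only what the function adds on top of the global configuration, and turn exactly those features off again at the end. `enable_then_restore` and its parameters are hypothetical names for this illustration.

```rust
use std::fmt::Write;

/// Build `(begin, end)` directive strings for one naked function.
/// `global` is the set of features enabled for the whole compilation,
/// `local` is the set requested on the function itself.
pub fn enable_then_restore(global: &[&str], local: &[&str]) -> (String, String) {
    let mut begin = String::new();
    let mut end = String::new();
    for feature in local {
        // Already on globally: enabling is redundant, and disabling it
        // afterwards would leave the state different from the global one.
        if global.contains(feature) {
            continue;
        }
        writeln!(begin, ".arch_extension {feature}").unwrap();
        writeln!(end, ".arch_extension no{feature}").unwrap();
    }
    (begin, end)
}
```

The key difference from push/pop is that the end sequence has to be computed from knowledge of the global feature set, rather than simply restoring saved state.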

Contributor Author


Well, after looking at it a bit, I still think it's possible, but hard. I have some partial code here https://gist.github.com/folkertdev/fe99874c466e598d0fb2dadf13b91b6f but I think this really needs someone more familiar with the Arm target features to do a good job.

(Or, like, can we get LLVM to add a push/pop mechanism for us?)

So, given that target features are unstable on Arm anyway, I would prefer to skip that logic for now, keep this PR straightforward, and get target features working on targets with stable asm support first.

// https://developer.arm.com/documentation/100067/0611/armclang-Integrated-Assembler/AArch32-Target-selection-directives?lang=en

for feature in features {
writeln!(begin, ".arch_extension {}", feature.name).unwrap();

writeln!(end, ".arch_extension no{}", feature.name).unwrap();
}
}
Architecture::Riscv32 | Architecture::Riscv64 => {
// https://github.com/riscv-non-isa/riscv-asm-manual/blob/ad0de8c004e29c9a7ac33cfd054f4d4f9392f2fb/src/asm-manual.adoc#arch

writeln!(begin, ".option push").unwrap();
for feature in features {
writeln!(begin, ".option arch, +{}", feature.name).unwrap();
}

writeln!(end, ".option pop").unwrap();
}
Architecture::Mips | Architecture::Mips64 | Architecture::Mips64_N32 => {
// https://sourceware.org/binutils/docs/as/MIPS-ISA.html
// https://sourceware.org/binutils/docs/as/MIPS-ASE-Instruction-Generation-Overrides.html

writeln!(begin, ".set push").unwrap();
for feature in features {
writeln!(begin, ".set {}", feature.name).unwrap();
}

writeln!(end, ".set pop").unwrap();
}

Architecture::S390x => {
// https://sourceware.org/binutils/docs/as/s390-Directives.html

// based on src/llvm-project/llvm/lib/Target/SystemZ/SystemZFeatures.td
let isa_revision_for_feature = |feature: &TargetFeature| match feature.name.as_str() {
"backchain" => None, // does not define any instructions
"deflate-conversion" => Some(13),
"enhanced-sort" => Some(13),
"guarded-storage" => Some(12),
"high-word" => None, // technically 9, but LLVM supports only >= 10
"nnp-assist" => Some(14),
"transactional-execution" => Some(10),
"vector" => Some(11),
"vector-enhancements-1" => Some(12),
"vector-enhancements-2" => Some(13),
"vector-packed-decimal" => Some(12),
"vector-packed-decimal-enhancement" => Some(13),
"vector-packed-decimal-enhancement-2" => Some(14),
_ => None,
};

if let Some(minimum_isa) = features.filter_map(isa_revision_for_feature).max() {
writeln!(begin, ".machine arch{minimum_isa}").unwrap();

// NOTE: LLVM does not support `.machine push` and `.machine pop`, so we rely on these
// target features only being applied to this ASM block (LLVM clears them for the next)
//
// https://github.com/llvm/llvm-project/blob/74306afe87b85cb9b5734044eb6c74b8290098b3/llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp#L1362
}
}
Architecture::PowerPc | Architecture::PowerPc64 => {
// https://www.ibm.com/docs/en/ssw_aix_71/assembler/assembler_pdf.pdf

// based on src/llvm-project/llvm/lib/Target/PowerPC/PPC.td
let isa_revision_for_feature = |feature: &TargetFeature| match feature.name.as_str() {
"altivec" => Some(7),
"partword-atomics" => Some(8),
"power10-vector" => Some(10),
"power8-altivec" => Some(8),
"power8-crypto" => Some(8),
"power8-vector" => Some(9),
"power9-altivec" => Some(9),
"power9-vector" => Some(9),
"quadword-atomics" => Some(8),
"vsx" => Some(7),
_ => None,
};

if let Some(minimum_isa) = features.filter_map(isa_revision_for_feature).max() {
writeln!(begin, ".machine push").unwrap();

// LLVM currently ignores the .machine directive, and allows all instructions regardless
// of the machine. This may be fixed in the future.
//
// https://github.com/llvm/llvm-project/blob/74306afe87b85cb9b5734044eb6c74b8290098b3/llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp#L1799
writeln!(begin, ".machine pwr{minimum_isa}").unwrap();

writeln!(end, ".machine pop").unwrap();
}
}

Architecture::M68k => {
// https://sourceware.org/binutils/docs/as/M68K_002dDirectives.html#index-directives_002c-M680x0

// FIXME support m68k
// return None;
}

Architecture::Wasm32 | Architecture::Wasm64 => {
// LLVM does not appear to accept any directive to enable target features
//
// https://github.com/llvm/llvm-project/blob/74306afe87b85cb9b5734044eb6c74b8290098b3/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp#L909

/* fallthrough */
}

Architecture::LoongArch64 => {
// LLVM does not appear to accept any directive to enable target features
//
// https://github.com/llvm/llvm-project/blob/74306afe87b85cb9b5734044eb6c74b8290098b3/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp#L1918

/* fallthrough */
}

// FIXME: support naked_asm! on more architectures
Architecture::Avr => return None,
Architecture::Bpf => return None,
Architecture::Csky => return None,
Architecture::E2K32 => return None,
Architecture::E2K64 => return None,
Architecture::Hexagon => return None,
Architecture::Msp430 => return None,
Architecture::Sbf => return None,
Architecture::Sharc => return None,
Architecture::Sparc => return None,
Architecture::Sparc32Plus => return None,
Architecture::Sparc64 => return None,
Architecture::Xtensa => return None,

// the Architecture enum is non-exhaustive
Architecture::Unknown | _ => return None,
}

Some((begin, end))
}

fn prefix_and_suffix<'tcx>(
tcx: TyCtxt<'tcx>,
instance: Instance<'tcx>,
@@ -186,6 +403,12 @@ fn prefix_and_suffix<'tcx>(
Ok(())
};

let Some((target_feature_begin, target_feature_end)) =
enable_disable_target_features(tcx, attrs)
else {
panic!("target features on naked functions are not supported for this architecture");
};

let mut begin = String::new();
let mut end = String::new();
match asm_binary_format {
@@ -205,6 +428,8 @@ fn prefix_and_suffix<'tcx>(
writeln!(begin, ".pushsection {section},\"ax\", {progbits}").unwrap();
writeln!(begin, ".balign {align}").unwrap();
write_linkage(&mut begin).unwrap();
begin.push_str(&target_feature_begin);

if let Visibility::Hidden = item_data.visibility {
writeln!(begin, ".hidden {asm_name}").unwrap();
}
@@ -215,6 +440,7 @@ fn prefix_and_suffix<'tcx>(
writeln!(begin, "{asm_name}:").unwrap();

writeln!(end).unwrap();
end.push_str(&target_feature_end);
writeln!(end, ".size {asm_name}, . - {asm_name}").unwrap();
writeln!(end, ".popsection").unwrap();
if !arch_suffix.is_empty() {
@@ -226,12 +452,14 @@ fn prefix_and_suffix<'tcx>(
writeln!(begin, ".pushsection {},regular,pure_instructions", section).unwrap();
writeln!(begin, ".balign {align}").unwrap();
write_linkage(&mut begin).unwrap();
begin.push_str(&target_feature_begin);
if let Visibility::Hidden = item_data.visibility {
writeln!(begin, ".private_extern {asm_name}").unwrap();
}
writeln!(begin, "{asm_name}:").unwrap();

writeln!(end).unwrap();
end.push_str(&target_feature_end);
writeln!(end, ".popsection").unwrap();
if !arch_suffix.is_empty() {
writeln!(end, "{}", arch_suffix).unwrap();
@@ -242,13 +470,15 @@ fn prefix_and_suffix<'tcx>(
writeln!(begin, ".pushsection {},\"xr\"", section).unwrap();
writeln!(begin, ".balign {align}").unwrap();
write_linkage(&mut begin).unwrap();
begin.push_str(&target_feature_begin);
writeln!(begin, ".def {asm_name}").unwrap();
writeln!(begin, ".scl 2").unwrap();
writeln!(begin, ".type 32").unwrap();
writeln!(begin, ".endef {asm_name}").unwrap();
writeln!(begin, "{asm_name}:").unwrap();

writeln!(end).unwrap();
end.push_str(&target_feature_end);
writeln!(end, ".popsection").unwrap();
if !arch_suffix.is_empty() {
writeln!(end, "{}", arch_suffix).unwrap();