Feat llvm aarch64 relocs #2599

Merged
merged 26 commits into from
Oct 19, 2021
Changes from 25 commits
Commits
12135eb
fix(compiler) macOS Aarch64 ABI is not SystemV
ptitSeb Sep 28, 2021
21660e6
feat(compiler) Added preliminary support for Arm64Call relocation
ptitSeb Sep 28, 2021
4f3b0a9
feat(compiler) - Fixed Arm64Call relocation
ptitSeb Sep 28, 2021
9b6a9ad
feat(compiler) Fix Arm64Call relocation with negative offset
ptitSeb Sep 28, 2021
20f0c66
feat(compiler) Added Trampolines and more Relocations for Arm64 (llvm…
ptitSeb Oct 5, 2021
44eef49
feat(compiler) Fixed single-pass build
ptitSeb Oct 5, 2021
9cf3605
feat(compiler) Don't try to use macOS Aarch64 specific ABI for now (a…
ptitSeb Oct 7, 2021
204238c
feat(compiler) Fixed linting
ptitSeb Oct 7, 2021
840d90b
Merge branch 'master' into feat_llvm_aarch64_relocs
ptitSeb Oct 8, 2021
0c6010c
feat(compiler) Use x17 as scratch instead of x16 on Aarch64 to help w…
ptitSeb Oct 8, 2021
7965180
feat(compiler) Added CHANGELOG note about Linux/Aarch64 Universal eng…
ptitSeb Oct 8, 2021
8aeccc3
Update lib/engine-universal/src/link.rs
ptitSeb Oct 11, 2021
95d332e
Fixed cargo audit
ptitSeb Oct 11, 2021
541ee00
Fix build
ptitSeb Oct 11, 2021
b54780c
feat(compiler) Refactor the new ARM Reloc and Trampoline to avoid a &mut
ptitSeb Oct 12, 2021
e1f8346
feat(compiler) removed useless commented code
ptitSeb Oct 12, 2021
dc2ed56
feat(compiler) Small refactor to make fill_trampolines_map more clear
ptitSeb Oct 12, 2021
a408cdf
feat(compiler) Removed useless commented code
ptitSeb Oct 12, 2021
8c9082c
Merge branch 'master' into feat_llvm_aarch64_relocs
syrusakbary Oct 12, 2021
89cd4f9
Merge branch 'master' into feat_llvm_aarch64_relocs
syrusakbary Oct 12, 2021
eccc9dc
Removed unused commented code
ptitSeb Oct 12, 2021
2ac7948
feat(compiler) Another small refactor in arm trampoline handling
ptitSeb Oct 12, 2021
a270270
Merge branch 'master' into feat_llvm_aarch64_relocs
ptitSeb Oct 12, 2021
8c24ab4
Some last small changes
ptitSeb Oct 18, 2021
2419398
Merge branch 'master' into feat_llvm_aarch64_relocs
syrusakbary Oct 19, 2021
b39a0cf
Merge branch 'master' into feat_llvm_aarch64_relocs
syrusakbary Oct 19, 2021
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -22,6 +22,7 @@ Looking for changes that affect our C API? See the [C API Changelog](lib/c-api/C
- [#2478](https://github.com/wasmerio/wasmer/pull/2478) Rename `wasm_instance_new()`’s “traps” argument to “trap”.

### Fixed
- [#2599](https://github.com/wasmerio/wasmer/pull/2599) Fixed Universal engine for Linux/Aarch64 target.
- [#2587](https://github.com/wasmerio/wasmer/pull/2587) Fixed deriving `WasmerEnv` when aliasing `Result`.
- [#2518](https://github.com/wasmerio/wasmer/pull/2518) Remove temporary file used when creating a Dylib engine artifact.
- [#2494](https://github.com/wasmerio/wasmer/pull/2494) Fixed `WasmerEnv` access when using `call_indirect` with the Singlepass compiler.
1 change: 1 addition & 0 deletions lib/compiler-cranelift/src/compiler.rs
@@ -294,6 +294,7 @@ impl Compiler for CraneliftCompiler {
function_call_trampolines,
dynamic_function_trampolines,
dwarf,
None,
))
}
}
39 changes: 36 additions & 3 deletions lib/compiler-llvm/src/compiler.rs
@@ -12,9 +12,10 @@ use rayon::iter::ParallelBridge;
use rayon::prelude::{IntoParallelIterator, IntoParallelRefIterator, ParallelIterator};
use std::sync::Arc;
use wasmer_compiler::{
Compilation, CompileError, CompileModuleInfo, Compiler, CustomSection, CustomSectionProtection,
Dwarf, FunctionBodyData, ModuleMiddleware, ModuleTranslationState, RelocationTarget,
SectionBody, SectionIndex, Symbol, SymbolRegistry, Target,
Architecture, Compilation, CompileError, CompileModuleInfo, Compiler, CustomSection,
CustomSectionProtection, Dwarf, FunctionBodyData, ModuleMiddleware, ModuleTranslationState,
RelocationTarget, SectionBody, SectionIndex, Symbol, SymbolRegistry, Target,
TrampolinesSection,
};
use wasmer_types::entity::{EntityRef, PrimaryMap};
use wasmer_types::{FunctionIndex, LocalFunctionIndex, SignatureIndex};
@@ -303,6 +304,37 @@ impl Compiler for LLVMCompiler {
})
.collect::<PrimaryMap<LocalFunctionIndex, _>>();

let trampolines = match target.triple().architecture {
Architecture::Aarch64(_) => {
let nj = 16;
Review comment (Contributor): Use onejump.len() instead of hard-coding.
Review comment (Contributor): Why is the number of slots hard-coded to 16?
Reply (Contributor, Author): Because I need to hard-code some value; the section can't be resized later.

// We create a jump to an absolute 64-bit address using x17 as a scratch register.
// SystemV declares both x16 and x17 as intra-procedure-call scratch registers,
// but Apple asks not to use x16.
// LDR x17, #8      51 00 00 58
// BR x17           20 02 1f d6
// JMPADDR          00 00 00 00 00 00 00 00
let onejump = [
0x51, 0x00, 0x00, 0x58, 0x20, 0x02, 0x1f, 0xd6, 0, 0, 0, 0, 0, 0, 0, 0,
];
let trampolines = Some(TrampolinesSection::new(
SectionIndex::from_u32(module_custom_sections.len() as u32),
nj,
onejump.len(),
));
let mut alljmps = vec![];
for _ in 0..nj {
alljmps.extend(onejump.iter().copied());
}
module_custom_sections.push(CustomSection {
protection: CustomSectionProtection::ReadExecute,
bytes: SectionBody::new_with_vec(alljmps),
relocations: vec![],
});
trampolines
}
_ => None,
};

let dwarf = if !frame_section_bytes.is_empty() {
let dwarf = Some(Dwarf::new(SectionIndex::from_u32(
module_custom_sections.len() as u32,
@@ -367,6 +399,7 @@ impl Compiler for LLVMCompiler {
function_call_trampolines,
dynamic_function_trampolines,
dwarf,
trampolines,
))
}
}
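The trampoline section built in the hunk above is just the same 16-byte jump slot repeated `nj` times, with the last 8 bytes of each slot left as a placeholder for the absolute address. A minimal sketch of that layout follows, assuming the slot bytes quoted in the diff. `build_trampoline_section` and `patch_slot` are hypothetical names used only for illustration; the code that actually fills the slots lives in `lib/engine-universal/src/link.rs`, which is not shown on this page.

```rust
/// Bytes of one jump slot, copied from the diff above:
/// `LDR x17, #8`, then `BR x17`, then an 8-byte absolute address placeholder.
const ONE_JUMP: [u8; 16] = [
    0x51, 0x00, 0x00, 0x58, // LDR x17, #8 (loads the 8 bytes that follow BR)
    0x20, 0x02, 0x1f, 0xd6, // BR x17
    0, 0, 0, 0, 0, 0, 0, 0, // JMPADDR placeholder
];

/// Builds the body of a section holding `slots` identical jump slots.
fn build_trampoline_section(slots: usize) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(slots * ONE_JUMP.len());
    for _ in 0..slots {
        bytes.extend_from_slice(&ONE_JUMP);
    }
    bytes
}

/// Hypothetical helper (not from the PR): writes `target` into the address
/// field of slot `index`. Returns `None` if the slot lies outside `section`.
fn patch_slot(section: &mut [u8], index: usize, target: u64) -> Option<()> {
    // The address field starts 8 bytes into the slot, after the two instructions.
    let start = index.checked_mul(ONE_JUMP.len())?.checked_add(8)?;
    let field = section.get_mut(start..start.checked_add(8)?)?;
    field.copy_from_slice(&target.to_le_bytes());
    Some(())
}
```

With 16 slots of 16 bytes each the section is 256 bytes long, which is why the slot count has to be decided before the section is emitted.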
15 changes: 9 additions & 6 deletions lib/compiler-llvm/src/config.rs
@@ -102,11 +102,6 @@ impl LLVM {
fn target_triple(&self, target: &Target) -> TargetTriple {
// Hack: we're using is_pic to determine whether this is a native
// build or not.
let binary_format = if self.is_pic {
target.triple().binary_format
} else {
target_lexicon::BinaryFormat::Elf
};
let operating_system = if target.triple().operating_system
== wasmer_compiler::OperatingSystem::Darwin
&& !self.is_pic
@@ -117,10 +112,18 @@ impl LLVM {
// MachO, they check whether the OS is set to Darwin.
//
// Since both linux and darwin use SysV ABI, this should work.
wasmer_compiler::OperatingSystem::Linux
// but not in the case of Aarch64, there the ABI is slightly different
ptitSeb marked this conversation as resolved.
match target.triple().architecture {
_ => wasmer_compiler::OperatingSystem::Linux,
}
} else {
target.triple().operating_system
};
let binary_format = if self.is_pic {
target.triple().binary_format
} else {
target_lexicon::BinaryFormat::Elf
};
let triple = Triple {
architecture: target.triple().architecture,
vendor: target.triple().vendor.clone(),
13 changes: 13 additions & 0 deletions lib/compiler-llvm/src/object_file.rs
@@ -168,6 +168,19 @@ where
(object::RelocationKind::Elf(object::elf::R_X86_64_PC64), 0) => {
RelocationKind::X86PCRel8
}
(object::RelocationKind::Elf(object::elf::R_AARCH64_MOVW_UABS_G0_NC), 0) => {
RelocationKind::Arm64Movw0
}
(object::RelocationKind::Elf(object::elf::R_AARCH64_MOVW_UABS_G1_NC), 0) => {
RelocationKind::Arm64Movw1
}
(object::RelocationKind::Elf(object::elf::R_AARCH64_MOVW_UABS_G2_NC), 0) => {
RelocationKind::Arm64Movw2
}
(object::RelocationKind::Elf(object::elf::R_AARCH64_MOVW_UABS_G3), 0) => {
RelocationKind::Arm64Movw3
}
(object::RelocationKind::PltRelative, 26) => RelocationKind::Arm64Call,
_ => {
return Err(CompileError::Codegen(format!(
"unknown relocation {:?}",
1 change: 1 addition & 0 deletions lib/compiler-singlepass/src/compiler.rs
@@ -167,6 +167,7 @@ impl Compiler for SinglepassCompiler {
function_call_trampolines,
dynamic_function_trampolines,
None,
None,
))
}
}
37 changes: 37 additions & 0 deletions lib/compiler/src/function.rs
@@ -109,6 +109,33 @@ impl Dwarf {
}
}

/// Trampolines section used by ARM short jump (26bits)
#[cfg_attr(feature = "enable-serde", derive(Deserialize, Serialize))]
#[cfg_attr(
feature = "enable-rkyv",
derive(RkyvSerialize, RkyvDeserialize, Archive)
)]
#[derive(Debug, PartialEq, Eq, Clone, MemoryUsage)]
pub struct TrampolinesSection {
/// SectionIndex for the actual Trampolines code
pub section_index: SectionIndex,
/// Number of jump slots in the section
pub slots: usize,
/// Slot size
pub size: usize,
}

impl TrampolinesSection {
/// Creates a `TrampolinesSection` with the index of its section, the number of slots, and the size of each slot
pub fn new(section_index: SectionIndex, slots: usize, size: usize) -> Self {
Self {
section_index,
slots,
size,
}
}
}
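Because the section cannot be resized after compilation (see the review reply in the LLVM compiler hunk), whoever hands out slots has to work within the fixed `slots` count. One way this bookkeeping could look is sketched here, assuming one slot per distinct target address. `SlotMap` and `offset_for` are hypothetical names, not part of the PR; the real logic is in the linker code that this page does not show.

```rust
use std::collections::HashMap;

/// Hypothetical bookkeeping for a fixed-size trampoline section.
struct SlotMap {
    /// Total number of jump slots (`TrampolinesSection::slots`).
    slots: usize,
    /// Size in bytes of one slot (`TrampolinesSection::size`).
    size: usize,
    /// Target address -> index of the slot already assigned to it.
    by_target: HashMap<u64, usize>,
}

impl SlotMap {
    fn new(slots: usize, size: usize) -> Self {
        Self { slots, size, by_target: HashMap::new() }
    }

    /// Returns the byte offset of the slot for `target` inside the section.
    /// A target that was seen before reuses its slot; `None` means every
    /// slot is taken and the section cannot grow.
    fn offset_for(&mut self, target: u64) -> Option<usize> {
        if let Some(&index) = self.by_target.get(&target) {
            return Some(index * self.size);
        }
        let next = self.by_target.len();
        if next >= self.slots {
            return None;
        }
        self.by_target.insert(target, next);
        Some(next * self.size)
    }
}
```

Reusing a slot for a repeated target keeps the hard-coded count of 16 from being exhausted by many calls to the same far function.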

/// The result of compiling a WebAssembly module's functions.
#[cfg_attr(feature = "enable-serde", derive(Deserialize, Serialize))]
#[derive(Debug, PartialEq, Eq)]
@@ -155,6 +182,9 @@ pub struct Compilation {

/// Section ids corresponding to the Dwarf debug info
debug: Option<Dwarf>,

/// Trampolines for the architectures that need them
trampolines: Option<TrampolinesSection>,
}

impl Compilation {
@@ -165,13 +195,15 @@ impl Compilation {
function_call_trampolines: PrimaryMap<SignatureIndex, FunctionBody>,
dynamic_function_trampolines: PrimaryMap<FunctionIndex, FunctionBody>,
debug: Option<Dwarf>,
trampolines: Option<TrampolinesSection>,
) -> Self {
Self {
functions,
custom_sections,
function_call_trampolines,
dynamic_function_trampolines,
debug,
trampolines,
}
}

@@ -249,6 +281,11 @@ impl Compilation {
pub fn get_debug(&self) -> Option<Dwarf> {
self.debug.clone()
}

/// Returns the Trampolines info.
pub fn get_trampolines(&self) -> Option<TrampolinesSection> {
self.trampolines.clone()
}
}

impl<'a> IntoIterator for &'a Compilation {
2 changes: 1 addition & 1 deletion lib/compiler/src/lib.rs
@@ -74,7 +74,7 @@ pub use crate::error::{
};
pub use crate::function::{
Compilation, CompiledFunction, CompiledFunctionFrameInfo, CustomSections, Dwarf, FunctionBody,
Functions,
Functions, TrampolinesSection,
};
pub use crate::jump_table::{JumpTable, JumpTableOffsets};
pub use crate::module::CompileModuleInfo;
26 changes: 25 additions & 1 deletion lib/compiler/src/relocation.rs
@@ -50,6 +50,14 @@ pub enum RelocationKind {
Arm32Call,
/// Arm64 call target
Arm64Call,
/// Arm64 movk/z part 0
Arm64Movw0,
/// Arm64 movk/z part 1
Arm64Movw1,
/// Arm64 movk/z part 2
Arm64Movw2,
/// Arm64 movk/z part 3
Arm64Movw3,
// /// RISC-V call target
// RiscvCall,
/// Elf x86_64 32 bit signed PC relative offset to two GOT entries for GD symbol.
@@ -72,6 +80,10 @@ impl fmt::Display for RelocationKind {
Self::X86CallPLTRel4 => write!(f, "CallPLTRel4"),
Self::X86GOTPCRel4 => write!(f, "GOTPCRel4"),
Self::Arm32Call | Self::Arm64Call => write!(f, "Call"),
Self::Arm64Movw0 => write!(f, "Arm64MovwG0"),
Self::Arm64Movw1 => write!(f, "Arm64MovwG1"),
Self::Arm64Movw2 => write!(f, "Arm64MovwG2"),
Self::Arm64Movw3 => write!(f, "Arm64MovwG3"),
Self::ElfX86_64TlsGd => write!(f, "ElfX86_64TlsGd"),
// Self::MachOX86_64Tlv => write!(f, "MachOX86_64Tlv"),
}
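The four `Arm64Movw0` to `Arm64Movw3` kinds correspond to the `R_AARCH64_MOVW_UABS_G0_NC` to `R_AARCH64_MOVW_UABS_G3` ELF relocations, each of which places one 16-bit group of an absolute 64-bit address into the immediate field of a `MOVZ`/`MOVK` instruction. A minimal sketch of that bit manipulation is given here, assuming the standard AArch64 encoding where the 16-bit immediate occupies bits 5 to 20 of the instruction word. The function names are illustrative and are not taken from the PR.

```rust
/// Extracts the 16-bit group `g` (0..=3) of an absolute address:
/// group 0 is bits 0..16, group 3 is bits 48..64.
fn movw_chunk(addr: u64, g: u32) -> u32 {
    assert!(g < 4, "only four 16-bit groups fit in a 64-bit address");
    ((addr >> (16 * g)) & 0xffff) as u32
}

/// Writes `chunk` into the imm16 field (bits 5..=20) of a MOVZ/MOVK
/// instruction word, leaving the opcode, shift and register bits untouched.
fn patch_movw(insn: u32, chunk: u32) -> u32 {
    (insn & !(0xffff << 5)) | ((chunk & 0xffff) << 5)
}
```

Applying `patch_movw` once per relocation, with `g` chosen from the relocation kind, rebuilds the full address across a four-instruction `MOVZ`/`MOVK` sequence.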
@@ -121,7 +133,11 @@ impl Relocation {
/// The function returns the relocation address and the delta.
pub fn for_address(&self, start: usize, target_func_address: u64) -> (usize, u64) {
match self.kind {
RelocationKind::Abs8 => {
RelocationKind::Abs8
| RelocationKind::Arm64Movw0
| RelocationKind::Arm64Movw1
| RelocationKind::Arm64Movw2
| RelocationKind::Arm64Movw3 => {
let reloc_address = start + self.offset as usize;
let reloc_addend = self.addend as isize;
let reloc_abs = target_func_address
@@ -155,6 +171,14 @@ impl Relocation {
.wrapping_add(reloc_addend as u32);
(reloc_address, reloc_delta_u32 as u64)
}
RelocationKind::Arm64Call => {
let reloc_address = start + self.offset as usize;
let reloc_addend = self.addend as isize;
let reloc_delta_u32 = target_func_address
.wrapping_sub(reloc_address as u64)
.wrapping_add(reloc_addend as u64);
(reloc_address, reloc_delta_u32)
}
// RelocationKind::X86PCRelRodata4 => {
// (start, target_func_address)
// }
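The `Arm64Call` arm returns a 64-bit PC-relative delta (despite the `reloc_delta_u32` variable name), and a `B`/`BL` instruction can only hold a signed 26-bit word offset, that is, a range of about ±128 MiB. A sketch of how such a delta could be checked and encoded follows, assuming the standard `B`/`BL` encoding with imm26 in the low 26 bits. Returning `None` marks the case where a trampoline slot would be needed instead. `call_delta` and `encode_imm26` are hypothetical helpers, not functions from the PR.

```rust
/// Computes the PC-relative delta the same way the `Arm64Call` arm does.
fn call_delta(reloc_address: u64, target: u64, addend: i64) -> u64 {
    target
        .wrapping_sub(reloc_address)
        .wrapping_add(addend as u64)
}

/// Tries to encode `delta` into the imm26 field of a `B`/`BL` instruction.
/// Returns `None` when the delta is not 4-byte aligned or falls outside the
/// signed 28-bit byte range, in which case a direct branch cannot reach it.
fn encode_imm26(insn: u32, delta: u64) -> Option<u32> {
    let d = delta as i64; // reinterpret so backward branches become negative
    if d % 4 != 0 || d < -(1 << 27) || d >= (1 << 27) {
        return None;
    }
    let imm26 = ((d >> 2) as u32) & 0x03ff_ffff;
    Some((insn & !0x03ff_ffff) | imm26)
}
```

Interpreting the delta as signed before shifting is what makes negative (backward) offsets encode correctly, the case one of the commits in this PR mentions fixing.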
3 changes: 3 additions & 0 deletions lib/engine-universal/src/artifact.rs
@@ -122,6 +122,7 @@ impl UniversalArtifact {
custom_sections: compilation.get_custom_sections(),
custom_section_relocations: compilation.get_custom_section_relocations(),
debug: compilation.get_debug(),
trampolines: compilation.get_trampolines(),
};
let serializable = SerializableModule {
compilation: serializable_compilation,
@@ -194,6 +195,7 @@ impl UniversalArtifact {
serializable.compilation.function_relocations.clone(),
&custom_sections,
&serializable.compilation.custom_section_relocations,
&serializable.compilation.trampolines,
);

// Compute indices into the shared signature table.
@@ -221,6 +223,7 @@ impl UniversalArtifact {
}
None => None,
};

// Make all code compiled thus far executable.
inner_engine.publish_compiled_code();
