Update comments
rui314 committed Jul 28, 2024
1 parent 61af6f3 commit 766ce80
Showing 2 changed files with 58 additions and 57 deletions.
56 changes: 3 additions & 53 deletions elf/arch-riscv.cc
@@ -12,58 +12,8 @@
// From the linker's point of view, the RISC-V's psABI is unique because
// sections in input object files can be shrunk while being copied to the
// output file. That is contrary to other psABIs in which sections are an
// atomic unit of copying. Let me explain it in more detail.
//
// Since RISC-V instructions are 16-bit or 32-bit long, there's no way to
// embed a very large immediate into a branch instruction. In fact, JAL
// (jump and link) instruction can jump to only within PC ± 1 MiB because
// its immediate is only 21 bits long. If the destination is out of its
// reach, we need to use two instructions instead; the first instruction
// being AUIPC, which sets the upper 20 bits in a register, and the second
// being JALR with a 12-bit immediate and the register. Combined, they
// specify a 32-bit displacement.
//
// Other RISC ISAs have the same limitation, and they solved the problem by
// letting the linker create so-called "range extension thunks". It works as
// follows: the compiler optimistically emits single jump instructions for
// function calls. If the linker finds that a branch target is out of reach,
// it emits a small piece of machine code near the branch instruction and
// redirects the branch to the linker-synthesized code. The code
// constructs a full 32-bit address in a register and jumps to the
// destination. That linker-synthesized code is called a "range extension
// thunk" or just a "thunk".
//
// The RISC-V psABI is unique in that it works the other way around. That
// is, for RISC-V, the compiler always emits two instructions (AUIPC + JALR)
// for function calls. If the linker finds the destination is reachable with
// a single instruction, it replaces the two instructions with the one and
// shrinks the section size by one instruction length, instead of filling
// the gap with a nop.
//
// With the presence of this relaxation, sections can no longer be
// considered as an atomic unit. If we delete 4 bytes from the middle of a
// section, all contents after that point need to be shifted by 4. Symbol
// values and relocation offsets have to be adjusted accordingly if they
// refer to locations past the deleted bytes.
//
// In mold, we use `r_deltas` to record how many bytes have been adjusted
// for relocations. For symbols, we directly mutate their `value` member.
//
// RISC-V object files tend to have way more relocations than those for
// other targets. This is because all branches, including ones that jump
// within the same section, are explicitly expressed with relocations.
// Here is why we need them: all control-flow statements such as `if` or
// `for` are implemented using branch instructions. For other targets, the
// compiler doesn't emit relocations for such branches because it knows
// at compile-time exactly how many bytes have to be skipped. That's not
// true for RISC-V because the linker may delete bytes between a branch and
// its destination. Therefore, all branches including in-section ones have
// to be explicitly expressed with relocations.
//
// Note that this mechanism only shrinks sections and never enlarges
// them, as the compiler always emits the longest instruction sequence.
// This makes the linker implementation a bit simpler because we don't
// need to worry about oscillation.
// atomic unit of copying. See file comments in shrink-sections.cc for
// details.
//
// https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc

@@ -912,7 +862,7 @@ static i64 compute_distance(Context<E> &ctx, Symbol<E> &sym,
return S + A - P;
}

// Scan relocations to shrink sections.
// Scan relocations to shrink a given section.
template <>
void shrink_section(Context<E> &ctx, InputSection<E> &isec, bool use_rvc) {
std::span<const ElfRel<E>> rels = isec.get_rels(ctx);
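
The reach limits described in these comments can be illustrated with a small standalone sketch. It is not mold's code: `fits_in_jal` and `split_hi_lo` are hypothetical helpers introduced only for illustration, and the sketch assumes the displacement has already been computed (for example as `S + A - P`). It shows when a single JAL suffices and how a larger displacement would be split between AUIPC (upper 20 bits) and JALR (sign-extended lower 12 bits).

```cpp
#include <cstdint>

// Hypothetical helper: true if `disp` fits in JAL's 21-bit signed
// immediate, i.e. the target lies within PC ± 1 MiB.
constexpr bool fits_in_jal(int64_t disp) {
  return -(int64_t(1) << 20) <= disp && disp < (int64_t(1) << 20);
}

// Hypothetical helper: the two immediates of an AUIPC + JALR pair.
struct HiLo {
  int64_t hi20; // AUIPC immediate, in units of 4096 bytes
  int64_t lo12; // JALR immediate, a signed 12-bit value
};

// Split a displacement so that hi20 * 4096 + lo12 == disp.
// JALR sign-extends its immediate, so lo12 may be negative, in which
// case hi20 is rounded up to compensate.
constexpr HiLo split_hi_lo(int64_t disp) {
  int64_t lo = ((disp & 0xfff) ^ 0x800) - 0x800; // sign-extend low 12 bits
  int64_t hi = (disp - lo) / 4096;               // exact: multiple of 4096
  return {hi, lo};
}

static_assert(fits_in_jal(0xffffe), "just inside +1 MiB");
static_assert(!fits_in_jal(0x100000), "+1 MiB itself is out of reach");
static_assert(split_hi_lo(0x12345678).hi20 == 0x12345, "upper part");
static_assert(split_hi_lo(0x12345678).lo12 == 0x678, "lower part");
static_assert(split_hi_lo(0x1800).hi20 == 2, "hi rounds up");
static_assert(split_hi_lo(0x1800).lo12 == -0x800, "lo is negative");
```

Under these assumptions, a relaxing linker would replace the two-instruction pair with a single JAL whenever `fits_in_jal` holds for the final displacement, and keep the pair otherwise.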
59 changes: 55 additions & 4 deletions elf/shrink-sections.cc
@@ -1,5 +1,56 @@
// Shrink sections by interpreting relocations.
// See file comments in arch-riscv.cc for details.
// Since RISC instructions are generally at most 32 bits long, there's no
// way to embed very large immediates into their branch instructions. For
// example, RISC-V's JAL (jump and link) instruction can jump only to
// within PC ± 1 MiB because its immediate is 21 bits long. If the
// destination is further away, we need to use two instructions instead:
// the first is AUIPC, which sets the upper 20 bits of a displacement in
// a register, and the second is JALR, which specifies the lower 12 bits
// and the register. Combined, they specify a 32-bit displacement, which
// is sufficient to support the medium code model.
//
// However, always using two or more instructions for function calls is a
// waste of time and space if the branch target is within a single
// instruction's reach. There are two approaches to address this problem:
//
// 1. The compiler optimistically emits a single branch instruction for
// all function calls. The linker then checks if the branch target is
// reachable, and if not, redirects the branch to a linker-synthesized
// code sequence that uses two or more instructions to branch further.
// That linker-synthesized code is called a "thunk". All RISC psABIs
// except RISC-V and LoongArch take this approach.
//
// 2. The compiler pessimistically emits two instructions to branch
// anywhere in PC ± 2 GiB, and the linker rewrites them with a single
// instruction if the branch target is close enough. RISC-V and
// LoongArch take this approach.
//
// This file contains functions to support (2). For (1), see thunks.cc.
//
// With the presence of this code-shrinking relaxation, sections can no
// longer be considered as an atomic unit. If we delete 4 bytes from the
// middle of a section, section contents after that point need to be
// shifted by 4. Symbol values and relocation offsets have to be shifted
// too if they refer to locations past the deleted bytes.
//
// In mold, we use `r_deltas` to record how many bytes have been shifted
// for relocations. For symbols, we directly mutate their `value` member.
//
// RISC-V and LoongArch object files tend to have way more relocations
// than those for other targets. This is because all branches, including
// ones that jump within the same section, are explicitly expressed with
// relocations. Here is why we need them: all control-flow statements such
// as `if` or `for` are implemented using branch instructions. For other
// targets, the compiler doesn't emit relocations for such branches
// because it knows at compile-time exactly how many bytes have to be
// skipped. That's not true for RISC-V or LoongArch because the linker may
// delete bytes between a branch and its destination. Therefore, all
// branches including in-section ones have to be explicitly expressed with
// relocations.
//
// Note that this mechanism only shrinks sections and never enlarges
// them, as the compiler always emits the longest instruction sequence.
// This makes the linker implementation a bit simpler because we don't
// need to worry about oscillation.

#if MOLD_RV64LE || MOLD_RV64BE || MOLD_RV32LE || MOLD_RV32BE || \
MOLD_LOONGARCH64 || MOLD_LOONGARCH32
@@ -27,8 +78,8 @@ i64 shrink_sections<E>(Context<E> &ctx) {
if constexpr (is_riscv<E>)
use_rvc = get_eflags(ctx) & EF_RISCV_RVC;

// Find all the relocations that can be relaxed.
// This step should only shrink sections.
// Find all relaxable relocations and record in r_deltas how many bytes
// we can save.
tbb::parallel_for_each(ctx.objs, [&](ObjectFile<E> *file) {
for (std::unique_ptr<InputSection<E>> &isec : file->sections)
if (is_resizable(isec.get()))
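
The bookkeeping that the file comment attributes to `r_deltas` can be modeled with a simplified, standalone sketch. This is an assumption-laden illustration, not mold's implementation: the `Relax` struct and the functions `compute_deltas` and `adjust_value` are hypothetical names, and relocations are assumed to be sorted by offset. The idea is to keep a running total of bytes removed before each relocation, then subtract the appropriate total from any symbol value or relocation offset that lies past the deleted bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified record of one relaxable location.
struct Relax {
  uint64_t offset; // offset of the instruction pair in the original section
  int64_t removed; // number of bytes deleted at that location
};

// Returns rels.size() + 1 entries. Entry i is the total number of bytes
// removed before relocation i; the last entry is the total shrinkage.
std::vector<int64_t> compute_deltas(const std::vector<Relax> &rels) {
  std::vector<int64_t> deltas(rels.size() + 1);
  int64_t delta = 0;
  for (size_t i = 0; i < rels.size(); i++) {
    // Precondition assumed by this sketch: relocations sorted by offset.
    assert(i == 0 || rels[i - 1].offset <= rels[i].offset);
    deltas[i] = delta;
    delta += rels[i].removed;
  }
  deltas[rels.size()] = delta;
  return deltas;
}

// Shift a symbol value by the bytes removed at all relaxed locations
// that start before it. A linear scan keeps the sketch short.
uint64_t adjust_value(const std::vector<Relax> &rels,
                      const std::vector<int64_t> &deltas, uint64_t value) {
  size_t i = 0;
  while (i < rels.size() && rels[i].offset < value)
    i++;
  return value - deltas[i];
}
```

In this model, relocation `i` itself moves to `rels[i].offset - deltas[i]`, and because every `removed` value is non-negative, sections can only shrink, which matches the no-oscillation property noted in the file comment.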