Additional wide instructions #3

dthaler · 2022-09-01T00:17:38Z

Quentin writes:

The dest = imm (0x18) and call (0x85) instructions have different semantics when their src register is set to a special flag. I think this is also part of the ISA and should be documented? See commits 2 to 7 of iovisor/bpf-docs#26 (and their description) for a quick reference.

The PR has:

But what does "map[0]" mean? What does "insn[0]" mean, is that relative to the PC or absolute from the start of the program or what?
Also the ISA does not currently define the existence / meaning of a "map fd" or a "BTF id of var" or a "map index" or a "BPF callback". I'm concerned about adding these to the ISA without definitions.

dthaler · 2022-09-13T18:22:03Z

@qmonnet

0x18 (src == 1) | lddw dst, map | dst = imm with imm == map fd

Does "lddw dst, map" mean to set dst to the memory address of the map, or to the id of the map, or to the fd of the map, or what?

0x18 (src == 2) | lddw dst, map value | dst = map[0] + insn[1].imm with insn[0] == map fd

To confirm, this has to be a map fd, not a map id? So runtimes without a map fd available cannot use this instruction? (E.g., Windows has map ids that are u32 but no map fds in the kernel.)

0x18 (src == 3) | lddw dst, kernel var | dst = imm with imm == BTF id of var

By "lddw dst, kernel var", does that mean the 64-bit value of the variable? Or the BTF id of the variable? Or the memory address of the variable?

0x18 (src == 4) | lddw dst, BPF func | dst = imm with imm == insn offset of BPF callback

Similar question, what does this mean?

0x18 (src == 5) | lddw dst, imm | dst = imm with imm == map index

Other than what the verifier does, offhand this looks to be the same as src == 0, meaning "lddw dst, imm". Correct?

0x18 (src == 6) | lddw dst, map value | dst = map[0] + insn[1].imm with insn[0] == map index

Is a "map index" the same thing as the map id as used by libbpf/bpftool/etc? Or is it the index into the maps defined by the executing program?

qmonnet · 2022-09-13T23:37:26Z

@qmonnet
0x18 (src == 1) | lddw dst, map | dst = imm with imm == map fd
Does "lddw dst, map" mean to set dst to the memory address of the map, or to the id of the map, or to the fd of the map, or what?

The instruction is converted into a load of the memory address of the map into the destination register:

			addr = (unsigned long)map;
			[...]
			insn[0].imm = (u32)addr;
			insn[1].imm = addr >> 32;

thus becoming a regular lddw, loading the pointer to the map into dst.
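As a minimal userspace sketch of the rewrite quoted above: the 64-bit address is split across the `imm` fields of the two instruction slots that make up a wide instruction, and a plain lddw reads it back by joining the two halves. The `insn_slot` struct and both function names are hypothetical simplifications, not kernel definitions, and clearing `src_reg` is only how this sketch models "becoming a regular lddw".

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for one 64-bit instruction slot.
 * The kernel declares imm as a signed 32-bit field; an unsigned field
 * is used here only to keep the arithmetic free of sign conversions. */
struct insn_slot {
    uint8_t  opcode;   /* 0x18 for the first slot of lddw */
    uint8_t  src_reg;  /* pseudo flag, e.g. 1 for "imm is a map fd" */
    uint32_t imm;
};

/* Store a 64-bit address in the two slots of a wide instruction:
 * low 32 bits in insn[0].imm, high 32 bits in insn[1].imm. */
static void patch_map_address(struct insn_slot insn[2], uint64_t addr)
{
    insn[0].imm = (uint32_t)addr;
    insn[1].imm = (uint32_t)(addr >> 32);
    insn[0].src_reg = 0; /* assumption: model the result as a plain lddw */
}

/* Value that a plain lddw (src == 0) loads into dst. */
static uint64_t lddw_value(const struct insn_slot insn[2])
{
    return (uint64_t)insn[0].imm | ((uint64_t)insn[1].imm << 32);
}
```

Calling `patch_map_address()` and then `lddw_value()` on the same pair of slots returns the original address, which is the property the rewrite relies on.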

0x18 (src == 2) | lddw dst, map value | dst = map[0] + insn[1].imm with insn[0] == map fd
To confirm, this has to be a map fd, not a map id? So runtimes without a map fd available cannot use this instruction? (E.g., Windows has map ids that are u32 but no map fds in the kernel.)

The kernel uses a file descriptor here. It could probably work just the same with another identifier. However, given that Linux also has a notion of map ids, this could lead to confusion. Maybe call it an "identifier of the map" (or "reference" or "descriptor"?), with a note stating that on Linux this identifier is a file descriptor (not the map id)?

0x18 (src == 3) | lddw dst, kernel var | dst = imm with imm == BTF id of var
By "lddw dst, kernel var", does that mean the 64-bit value of the variable? Or the BTF id of the variable? Or the memory address of the variable?

This converts into a load of the memory address of the kernel variable into the destination register.

0x18 (src == 4) | lddw dst, BPF func | dst = imm with imm == insn offset of BPF callback

Similar question, what does this mean?

When the source register of a "load immediate" instruction is set to 4,
the verifier considers the immediate value as an instruction offset for
a BPF function in the program. Once loaded into the destination
register, this function can be passed to the bpf_for_each_map_elem()
helper function and used as a callback.

In that case, this loads a pointer to a function in the BPF program, which cannot be reused directly but can be passed to bpf_for_each_map_elem(). When it runs, this helper calls the function on each entry of the provided map successively.
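To illustrate the callback behavior just described, here is a userspace model only, not BPF code and not the helper's real signature. `for_each_elem`, `elem_cb`, `sum_cb`, and `stop_cb` are hypothetical names, the "map" is a plain array, and the convention that a nonzero return from the callback stops the walk is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns 0 to continue, nonzero to stop. */
typedef long (*elem_cb)(uint32_t key, uint64_t *value, void *ctx);

/* Model of a "for each element" helper: calls cb on each entry in turn
 * and returns the number of entries visited. */
static long for_each_elem(uint64_t *values, size_t n, elem_cb cb, void *ctx)
{
    long visited = 0;

    for (size_t i = 0; i < n; i++) {
        visited++;
        if (cb((uint32_t)i, &values[i], ctx) != 0)
            break;
    }
    return visited;
}

/* Example callback: accumulate every value into *ctx. */
static long sum_cb(uint32_t key, uint64_t *value, void *ctx)
{
    (void)key;
    *(uint64_t *)ctx += *value;
    return 0;
}

/* Example callback: stop after the first entry. */
static long stop_cb(uint32_t key, uint64_t *value, void *ctx)
{
    (void)key; (void)value; (void)ctx;
    return 1;
}
```

In a real BPF program the callback is not an ordinary C function pointer: the function reference in dst is produced by the lddw with src == 4, and the verifier checks how it is used.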

0x18 (src == 5) | lddw dst, imm | dst = imm with imm == map index
Other than what the verifier does, offhand this looks to be the same as src == 0, meaning "lddw dst, imm". Correct?

No, this would be the same as src == 1 (BPF_PSEUDO_MAP_FD), but expects a map index instead of a map fd. It does not load a random immediate, but converts into a load of the memory address for this map.

0x18 (src == 6) | lddw dst, map value | dst = map[0] + insn[1].imm with insn[0] == map index
Is a "map index" the same thing as the map id as used by libbpf/bpftool/etc? Or is it the index into the maps defined by the executing program?

For src == 5, as for src == 6, the “map index” is not the map id. It is the index of the map in an array of maps used by the program, fd_array, which is passed in the union bpf_attr to the bpf() syscall. The commit log explains the motivation:

    Typical program loading sequence involves creating bpf maps and applying
    map FDs into bpf instructions in various places in the bpf program.
    This job is done by libbpf that is using compiler generated ELF relocations
    to patch certain instruction after maps are created and BTFs are loaded.
    The goal of fd_idx is to allow bpf instructions to stay immutable
    after compilation. At load time the libbpf would still create maps as usual,
    but it wouldn't need to patch instructions. It would store map_fds into
    __u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD).
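The difference between the fd-based and index-based flags can be sketched as follows. This is a simplified userspace model under stated assumptions: `resolve_map_fd` is a hypothetical function, the `PSEUDO_*` constants only mirror the src values discussed in this thread (1, 2, 5, 6), and the explicit `fd_array_len` bound is an addition of this sketch, not something the thread says is passed to the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* src values as discussed in this thread (local names for the sketch). */
enum {
    PSEUDO_MAP_FD        = 1, /* imm is a map fd */
    PSEUDO_MAP_VALUE     = 2, /* imm is a map fd, insn[1].imm is an offset */
    PSEUDO_MAP_IDX       = 5, /* imm is an index into fd_array */
    PSEUDO_MAP_IDX_VALUE = 6  /* same as 2, but with an index instead of an fd */
};

/* Return the map fd designated by the first slot of an lddw, or -1 if
 * src_reg is not a map flag or the index is out of range. */
static int resolve_map_fd(uint8_t src_reg, uint32_t imm,
                          const uint32_t *fd_array, size_t fd_array_len)
{
    switch (src_reg) {
    case PSEUDO_MAP_FD:
    case PSEUDO_MAP_VALUE:
        return (int)imm;               /* the fd is patched into the insn */
    case PSEUDO_MAP_IDX:
    case PSEUDO_MAP_IDX_VALUE:
        if (fd_array == NULL || imm >= fd_array_len)
            return -1;
        return (int)fd_array[imm];     /* the insn stays immutable */
    default:
        return -1;
    }
}
```

With the index-based flags, only `fd_array` changes between loads, so the instruction stream produced by the compiler does not need to be patched.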

dthaler · 2022-09-14T11:12:39Z

0x18 (src == 5) | lddw dst, imm | dst = imm with imm == map index
Other than what the verifier does, offhand this looks to be the same as src == 0, meaning "lddw dst, imm". Correct?
No, this would be the same as src == 1 (BPF_PSEUDO_MAP_FD), but expects a map index instead of a map fd. It does not load a random immediate, but converts into a load of the memory address for this map.

Then shouldn't your PR instead say:

0x18 (src == 5) | lddw dst, map | dst = imm with imm == map index

for consistency with src == 1?

dthaler · 2022-09-14T17:50:43Z

@qmonnet Please check PR #4 for correctness

dthaler · 2022-09-14T18:12:54Z

@qmonnet Do you know if all these are supported in all ISA versions, i.e. even with -mcpu=v1 in clang?

qmonnet · 2022-09-20T17:04:04Z

@qmonnet Do you know if all these are supported in all ISA versions, i.e. even with -mcpu=v1 in clang?

These instructions were added a few years apart, so I'm pretty sure they're not all in the same ISA version. I would need to check the kernel versions and compare them against the dates of the different ISA versions in clang, I suppose, but I haven't had time to do it yet.

qmonnet · 2022-09-21T11:00:21Z

So I don't think that the v1, v2 and v3 as known to LLVM strictly relate to any accurate state of the ISA in the kernel. Mostly, v3 was LLVM's support for ALU32 and for the 32-bit jump instructions.

Here's a chronology of the different LLVM -mcpu values and the flags for the extended lddw and call instructions in Linux, if you want to “match” the features with the -mcpu versions that existed at the time. Most flags were added after the introduction of LLVM's v3.

BPF_PSEUDO_MAP_FD: commit, September 2014
-mcpu=v2 (and v1) introduction in LLVM: commit, August 2017
BPF_PSEUDO_CALL: commit, December 2017
-mcpu=v3 introduction in LLVM: commit, February 2019
BPF_PSEUDO_MAP_VALUE: commit, April 2019
BPF_PSEUDO_BTF_ID: commit, October 2020
BPF_PSEUDO_FUNC: commit, February 2021
BPF_PSEUDO_KFUNC_CALL: commit, March 2021
BPF_PSEUDO_MAP_IDX, BPF_PSEUDO_MAP_IDX_VALUE: commit, May 2021
New atomic instructions: commit and later, January 2021

dthaler mentioned this issue Sep 14, 2022

Address feedback from LPC #5

Merged

dthaler added a commit that referenced this issue Jul 5, 2023

Merge pull request #3 from qmonnet/pr/ebpf-checks

4986c4f

tools/ebpf-checks: Multiple improvements to the scripts
