
@kernel-patches-daemon-bpf

Pull request for series with
subject: BPF indirect jumps
version: 9
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1018485

aspsk added 11 commits November 1, 2025 04:15
During the bpf(BPF_PROG_LOAD) syscall, user-supplied BPF programs are
translated by the verifier into "xlated" BPF programs. During this
process the original instruction offsets might be adjusted and/or
individual instructions might be replaced by new sets of instructions,
or deleted.

Add a new BPF map type for keeping track of how, for a given program,
the original instructions were relocated during verification. Besides
keeping track of the original -> xlated mapping, make the x86 JIT
build the xlated -> jitted mapping for every instruction listed in an
instruction array. This is required for every future application of
instruction arrays: static keys, indirect jumps and indirect calls.

A map of the BPF_MAP_TYPE_INSN_ARRAY type must be created with u32
keys and values of size 8. The values have different semantics for
userspace and for BPF space. For userspace a value consists of two
u32 values: xlated and jitted offsets. On the BPF side the value is
a real pointer to a jitted instruction.

On map creation/initialization, before loading the program, each
element of the map should be initialized to point to an instruction
offset within the program. Such maps must be frozen before the
program load. After program verification the xlated and jitted
offsets can be read via the bpf(2) syscall.

If a tracked instruction is removed by the verifier, then the xlated
offset is set to (u32)-1 which is considered to be too big for a valid
BPF program offset.

One such map can, obviously, be used to track one and only one BPF
program.  If the verification process was unsuccessful, the same
map can be re-used to verify the program with a different log level.
However, if the program was loaded successfully, then such a map,
being frozen in any case, can't be reused by other programs even
after the program is released.

Example. Consider the following original and xlated programs:

    Original prog:                      Xlated prog:

     0:  r1 = 0x0                        0: r1 = 0
     1:  *(u32 *)(r10 - 0x4) = r1        1: *(u32 *)(r10 -4) = r1
     2:  r2 = r10                        2: r2 = r10
     3:  r2 += -0x4                      3: r2 += -4
     4:  r1 = 0x0 ll                     4: r1 = map[id:88]
     6:  call 0x1                        6: r1 += 272
                                         7: r0 = *(u32 *)(r2 +0)
                                         8: if r0 >= 0x1 goto pc+3
                                         9: r0 <<= 3
                                        10: r0 += r1
                                        11: goto pc+1
                                        12: r0 = 0
     7:  r6 = r0                        13: r6 = r0
     8:  if r6 == 0x0 goto +0x2         14: if r6 == 0x0 goto pc+4
     9:  call 0x76                      15: r0 = 0xffffffff8d2079c0
                                        17: r0 = *(u64 *)(r0 +0)
    10:  *(u64 *)(r6 + 0x0) = r0        18: *(u64 *)(r6 +0) = r0
    11:  r0 = 0x0                       19: r0 = 0x0
    12:  exit                           20: exit

An instruction array map containing, e.g., instructions [0,4,7,12]
will be translated by the verifier to [0,4,13,20]. A map containing
index 5 (the middle of a 16-byte instruction) or an index greater
than 12 (outside the program boundaries) would be rejected.
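
To make the userspace flow above concrete, here is a minimal sketch
using libbpf's low-level wrappers. The value struct and the convention
that the pre-load offset goes into the first u32 are assumptions based
on this description, not verbatim from the patch:

    #include <unistd.h>
    #include <bpf/bpf.h>

    /* Userspace view of a value: two u32 offsets (names assumed). */
    struct insn_array_value {
            __u32 xlated_off;
            __u32 jitted_off;
    };

    /* Create a map tracking original offsets, e.g. {0, 4, 7, 12}. */
    static int make_insn_array(const __u32 *insn_offs, __u32 n)
    {
            struct insn_array_value val = {};
            __u32 key;
            int fd;

            fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "jt",
                                sizeof(__u32), sizeof(val), n, NULL);
            if (fd < 0)
                    return fd;

            for (key = 0; key < n; key++) {
                    /* before load, point each element at an original offset */
                    val.xlated_off = insn_offs[key];
                    if (bpf_map_update_elem(fd, &key, &val, 0))
                            goto err;
            }
            if (bpf_map_freeze(fd))         /* required before program load */
                    goto err;
            return fd;
    err:
            close(fd);
            return -1;
    }

After a successful load, looking the elements up through the same
syscall interface returns the final xlated and jitted offsets.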

The functionality provided by this patch will be extended in subsequent
patches to implement BPF Static Keys, indirect jumps, and indirect calls.

Signed-off-by: Anton Protopopov <[email protected]>
Reviewed-by: Eduard Zingerman <[email protected]>
Add the following selftests for the new insn_array map:

  * Incorrect instruction indexes are rejected
  * Two programs can't use the same map
  * BPF programs can't operate on the map
  * No changes to the code => the map stays the same
  * Expected changes when instructions are added
  * Expected changes when instructions are deleted
  * Expected changes when multiple functions are present

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
When bpf_jit_harden is enabled, all constants in the BPF code are
blinded to prevent JIT spraying attacks. This happens during the JIT
phase. Adjust all the related instruction arrays accordingly.

Signed-off-by: Anton Protopopov <[email protected]>
Reviewed-by: Eduard Zingerman <[email protected]>
Add a specific test for instruction arrays with blinding enabled.

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Currently the emit_indirect_jump() function only accepts one of the
RAX, RCX, ..., RBP registers as the destination. Make it accept
R8, R9, ..., R15 as well, and make callers pass BPF registers, not
native registers. This is required to enable indirect jump support
in eBPF.
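
At the encoding level the change amounts to emitting a REX.B prefix
for the extended registers. A standalone illustration of the encoding
(not the kernel's actual emitter):

    #include <stddef.h>
    #include <stdint.h>

    /* Encode "jmp *%reg" for reg in 0..15 (0=rax ... 15=r15).
     * r8..r15 need a REX.B prefix before the ff /4 opcode;
     * low registers keep the two-byte form. */
    static size_t emit_jmp_reg(uint8_t *buf, unsigned int reg)
    {
            size_t n = 0;

            if (reg >= 8)
                    buf[n++] = 0x41;             /* REX.B selects r8..r15 */
            buf[n++] = 0xff;                     /* group-5 opcode */
            buf[n++] = 0xe0 | (reg & 7);         /* ModRM: mod=11, /4 */
            return n;
    }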

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Add support for a new instruction

    BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0

which does an indirect jump to a location stored in Rx.  The register
Rx must have type PTR_TO_INSN. This new type ensures that the Rx
register contains a value (or a range of values) loaded from a
correct jump table, i.e., a map of type instruction array.

For example, for a C switch LLVM will generate the following code:

    0:   r3 = r1                    # "switch (r3)"
    1:   if r3 > 0x13 goto +0x666   # check r3 boundaries
    2:   r3 <<= 0x3                 # adjust to an index in array of addresses
    3:   r1 = 0xbeef ll             # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
    5:   r1 += r3                   # r1 inherits boundaries from r3
    6:   r1 = *(u64 *)(r1 + 0x0)    # r1 now has type PTR_TO_INSN
    7:   gotox r1                   # jit will generate proper code

Here the gotox instruction corresponds to one particular map. It is
possible, however, to have a gotox instruction whose target can be
loaded from different maps, e.g.

    0:   r1 &= 0x1
    1:   r2 <<= 0x3
    2:   r3 = 0x0 ll                # load from map M_1
    4:   r3 += r2
    5:   if r1 == 0x0 goto +0x4
    6:   r1 <<= 0x3
    7:   r3 = 0x0 ll                # load from map M_2
    9:   r3 += r1
    A:   r1 = *(u64 *)(r3 + 0x0)
    B:   gotox r1                   # jump to target loaded from M_1 or M_2

During the check_cfg stage the verifier collects all the maps which
point inside the subprog being verified. When building the CFG,
the high 16 bits of insn_state are used, so this patch
(theoretically) supports jump tables of up to 2^16 slots.
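
Illustratively, the packing described above looks like this (macro and
function names are assumptions; only the "high 16 bits of insn_state"
detail comes from the text):

    #define INSN_JT_SHIFT   16

    /* Low bits keep the usual CFG traversal flags; the upper half
     * stores the index of the jump table covering this instruction,
     * hence the 2^16 limit on distinct tables. */
    static inline u32 insn_state_with_jt(u32 state, u16 jt_index)
    {
            return (state & 0xffffu) | ((u32)jt_index << INSN_JT_SHIFT);
    }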

At a later stage, in check_indirect_jump, it is checked that the
register Rx was loaded from a particular instruction array.

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Add support for indirect jump instruction.

Example output from bpftool:

   0: (79) r3 = *(u64 *)(r1 +0)
   1: (25) if r3 > 0x4 goto pc+666
   2: (67) r3 <<= 3
   3: (18) r1 = 0xffffbeefspameggs
   5: (0f) r1 += r3
   6: (79) r1 = *(u64 *)(r1 +0)
   7: (0d) gotox r1

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
For the v4 instruction set LLVM is allowed to generate indirect jumps
for switch statements and for 'goto *rX' assembly. Every such jump
will be accompanied by the necessary metadata, e.g. (`llvm-objdump -Sr ...`):

       0:       r2 = 0x0 ll
                0000000000000030:  R_BPF_64_64  BPF.JT.0.0

Here BPF.JT.0.0 is a symbol residing in the .jumptables section:

    Symbol table:
       4: 0000000000000000   240 OBJECT  GLOBAL DEFAULT     4 BPF.JT.0.0

The -bpf-min-jump-table-entries llvm option may be used to control the
minimal size of a switch which will be converted to an indirect jump.
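
For reference, the kind of source that triggers this (a sketch;
whether LLVM actually emits a jump table depends on the threshold
option above and the optimization level):

    /* With enough dense cases, LLVM targeting BPF v4 may lower this
     * switch to a BPF.JT.* load from .jumptables plus a gotox. */
    int handle(unsigned int cmd)
    {
            switch (cmd) {
            case 0: return 10;
            case 1: return 11;
            case 2: return 12;
            case 3: return 13;
            case 4: return 14;
            default: return -1;
            }
    }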

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Teach bpftool to recognize the instruction array map type.

Signed-off-by: Anton Protopopov <[email protected]>
Acked-by: Quentin Monnet <[email protected]>
Add a set of tests to validate core gotox functionality without the
need to rely on compilers.

Signed-off-by: Anton Protopopov <[email protected]>
Add C-level selftests for indirect jumps to validate LLVM and libbpf
functionality. The tests are intentionally disabled: they are meant
to be run locally by developers and will not make the CI red.

Signed-off-by: Anton Protopopov <[email protected]>
@kernel-patches-daemon-bpf
Author

Upstream branch: ab01bfa
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1018485
version: 9

@kernel-patches-review-bot

commit 77dc1e7091f8bbd42ffa1730c0339642333528c6
Author: Anton Protopopov <[email protected]>

bpf, x86: add new map type: instructions array

This patch adds BPF_MAP_TYPE_INSN_ARRAY to track instruction offset
mappings (original -> xlated -> jitted) during BPF program verification
and JIT compilation.

> diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
> new file mode 100644
> index 000000000..d29f38721
> --- /dev/null
> +++ b/kernel/bpf/bpf_insn_array.c

[ ... ]

> @@ -0,0 +1,286 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2025 Isovalent */
> +
> +#include <linux/bpf.h>
> +
> +struct bpf_insn_array {
> +	struct bpf_map map;
> +	atomic_t used;
> +	long *ips;
> +	DECLARE_FLEX_ARRAY(struct bpf_insn_array_value, values);
> +};
> +
> +#define cast_insn_array(MAP_PTR) \
> +	container_of((MAP_PTR), struct bpf_insn_array, map)
> +
> +#define INSN_DELETED ((u32)-1)
> +
> +static inline u32 insn_array_alloc_size(u32 max_entries)
> +{
> +	const u32 base_size = sizeof(struct bpf_insn_array);
> +	const u32 entry_size = sizeof(struct bpf_insn_array_value);
> +
> +	return base_size + max_entries * (entry_size + sizeof(long));
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +}

Can this overflow u32? With entry_size=16 and sizeof(long)=8 on 64-bit,
max_entries around 200 million gives 200M * 24 = 4.8B, which wraps to
around 504MB when truncated to u32. The subsequent bpf_map_area_alloc()
would then allocate only ~504MB, but later accesses to, say,
values[150000000] would be out of bounds. Should this function return
u64 instead, or should insn_array_alloc_check() validate an upper
bound on max_entries?
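
A sketch of the u64 direction suggested here (illustrative, not a
tested patch):

    static inline u64 insn_array_alloc_size(u32 max_entries)
    {
            const u64 base_size = sizeof(struct bpf_insn_array);
            const u64 entry_size = sizeof(struct bpf_insn_array_value);

            /* widen before multiplying so the product cannot wrap */
            return base_size + (u64)max_entries * (entry_size + sizeof(long));
    }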

[ ... ]

> +static int insn_array_alloc_check(union bpf_attr *attr)
> +{
> +	u32 value_size = sizeof(struct bpf_insn_array_value);
> +
> +	if (attr->max_entries == 0 || attr->key_size != 4 ||
> +	    attr->value_size != value_size || attr->map_flags != 0)
> +		return -EINVAL;
> +
> +	return 0;
> +}

[ ... ]

> +void bpf_prog_update_insn_ptrs(struct bpf_prog *prog, u32 *offsets, void *image)
> +{
> +	struct bpf_insn_array *insn_array;
> +	struct bpf_map *map;
> +	u32 xlated_off;
> +	int i, j;
> +
> +	if (!offsets || !image)
> +		return;
> +
> +	for (i = 0; i < prog->aux->used_map_cnt; i++) {
> +		map = prog->aux->used_maps[i];
> +		if (!is_insn_array(map))
> +			continue;
> +
> +		insn_array = cast_insn_array(map);
> +		for (j = 0; j < map->max_entries; j++) {
> +			xlated_off = insn_array->values[j].xlated_off;
> +			if (xlated_off == INSN_DELETED)
> +				continue;
> +			if (xlated_off < prog->aux->subprog_start)
> +				continue;
> +			xlated_off -= prog->aux->subprog_start;
> +			if (xlated_off >= prog->len)
> +				continue;
> +
> +			insn_array->values[j].jitted_off = offsets[xlated_off];
> +			insn_array->ips[j] = (long)(image + offsets[xlated_off]);
> +		}
> +	}
> +}

Does this handle programs with function calls correctly? When
jit_subprogs() in the verifier splits a program with function calls, it
creates separate bpf_prog structures for the main program and each
subprogram. During JIT compilation, bpf_int_jit_compile() calls
bpf_prog_update_insn_ptrs() for each subprogram separately.

However, subprograms don't have used_maps copied to their aux structure
(see jit_subprogs() in verifier.c). So when bpf_prog_update_insn_ptrs()
is called on a subprogram, prog->aux->used_map_cnt is 0 and the loop
doesn't execute.

When called on the main program (func[0]), prog->len only covers the
main program's instructions (not subprograms), so any tracked instruction
with xlated_off >= prog->len (i.e., in a subprogram) hits the continue
statement and doesn't get its ips[] field set.

Then bpf_insn_array_ready() checks if ips[i] is NULL and returns -EFAULT,
causing program load to fail. Should this function iterate through all
func[] entries and handle each subprogram, or should the main program's
used_maps be copied to subprograms so this works as-is?

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c

[ ... ]

> @@ -2853,6 +2854,23 @@ static int bpf_prog_verify_signature(struct bpf_prog *prog, union bpf_attr *attr
>  	return err;
>  }
>
> +static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
> +{
> +	int err;
> +	int i;
> +
> +	for (i = 0; i < prog->aux->used_map_cnt; i++) {
> +		if (prog->aux->used_maps[i]->map_type != BPF_MAP_TYPE_INSN_ARRAY)
> +			continue;
> +
> +		err = bpf_insn_array_ready(prog->aux->used_maps[i]);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}

[ ... ]



AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf, x86: add new map type: instructions array
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18995945884

@kernel-patches-review-bot

commit cb92a900026e1efa9f89af1f2682f1b544cc7e22
Author: Anton Protopopov <[email protected]>

bpf, x86: add support for indirect jumps

This patch adds support for indirect jumps (gotox instruction) in BPF
programs, introducing a new PTR_TO_INSN register type to ensure jump
targets are loaded from verified instruction array maps.

> diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
> index d29f38721..7bf345a87 100644
> --- a/kernel/bpf/bpf_insn_array.c
> +++ b/kernel/bpf/bpf_insn_array.c
> @@ -114,6 +114,20 @@ static u64 insn_array_mem_usage(const struct bpf_map *map)
>  	return insn_array_alloc_size(map->max_entries);
>  }
>
> +static int insn_array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
> +{
> +	struct bpf_insn_array *insn_array = cast_insn_array(map);
> +
> +	if ((off % sizeof(long)) != 0 ||
> +	    (off / sizeof(long)) >= map->max_entries)
> +		return -EINVAL;
> +
> +	/* from BPF's point of view, this map is a jump table */
> +	*imm = (unsigned long)insn_array->ips + off;
> +
> +	return 0;
> +}

The insn_array_alloc_size() function can overflow. It computes:

  base_size + max_entries * (entry_size + sizeof(long))

and returns u32. With entry_size=16 and sizeof(long)=8, if max_entries
exceeds ~178 million, the multiplication overflows u32 before the
result is widened to u64 in insn_array_alloc(). This could result in
allocating a buffer much smaller than intended.

While BPF programs have size limits, max_entries validation in
insn_array_alloc_check() only checks != 0 without an upper bound. Should
insn_array_alloc_size() return u64 or add overflow checking?

[ ... ]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2b771e2bf..3fdabae07 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c

[ ... ]

> @@ -17833,6 +17861,197 @@ static struct bpf_iarray *iarray_realloc(struct bpf_iarray *old, size_t n_elem)
>  	return new;
>  }
>
> +static int copy_insn_array(struct bpf_map *map, u32 start, u32 end, u32 *items)
> +{
> +	struct bpf_insn_array_value *value;
> +	u32 i;
> +
> +	for (i = start; i <= end; i++) {
> +		value = map->ops->map_lookup_elem(map, &i);
> +		if (!value)
> +			return -EINVAL;
> +		items[i - start] = value->xlated_off;
> +	}
> +	return 0;
> +}
> +
> +static int cmp_ptr_to_u32(const void *a, const void *b)
> +{
> +	return *(u32 *)a - *(u32 *)b;
> +}
> +
> +static int sort_insn_array_uniq(u32 *items, int cnt)
> +{
> +	int unique = 1;
> +	int i;
> +
> +	sort(items, cnt, sizeof(items[0]), cmp_ptr_to_u32, NULL);
> +
> +	for (i = 1; i < cnt; i++)
> +		if (items[i] != items[unique - 1])
> +			items[unique++] = items[i];
> +
> +	return unique;
> +}
> +
> +/*
> + * sort_unique({map[start], ..., map[end]}) into off
> + */
> +static int copy_insn_array_uniq(struct bpf_map *map, u32 start, u32 end, u32 *off)
> +{
> +	u32 n = end - start + 1;
> +	int err;
> +
> +	err = copy_insn_array(map, start, end, off);
> +	if (err)
> +		return err;
> +
> +	return sort_insn_array_uniq(off, n);
> +}
> +
> +/*
> + * Copy all unique offsets from the map
> + */
> +static struct bpf_iarray *jt_from_map(struct bpf_map *map)
> +{
> +	struct bpf_iarray *jt;
> +	int n;
> +
> +	jt = iarray_realloc(NULL, map->max_entries);
> +	if (!jt)
> +		return ERR_PTR(-ENOMEM);
> +
> +	n = copy_insn_array_uniq(map, 0, map->max_entries - 1, jt->items);
> +	if (n < 0) {
> +		kvfree(jt);
> +		return ERR_PTR(n);
> +	}
> +
> +	return jt;
> +}
                  ^^^^

The iarray_realloc() call sets jt->cnt = map->max_entries, but
copy_insn_array_uniq() returns n (the actual number of unique items)
which could be less. The function returns without updating jt->cnt to n.

In jt_from_subprog()->jt_from_map(), at the check:

  if (jt_cur->items[0] >= subprog_start && jt_cur->items[0] < subprog_end)

Can items[0] be accessed when n=0? If the map contains no valid entries,
jt_cur->cnt still equals max_entries but no items were populated.

Later at:

  memcpy(jt->items + old_cnt, jt_cur->items, jt_cur->cnt << 2);

This copies jt_cur->cnt elements, but cnt=max_entries, not the actual
count. Does this copy uninitialized memory from the jt_cur->items array?
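
If that is confirmed, the minimal fix implied by the question would be
to record the deduplicated count before returning (a sketch, not a
tested patch):

    n = copy_insn_array_uniq(map, 0, map->max_entries - 1, jt->items);
    if (n < 0) {
            kvfree(jt);
            return ERR_PTR(n);
    }
    jt->cnt = n;    /* shrink from max_entries to the unique count */

    return jt;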

[ ... ]

> @@ -18855,6 +19074,9 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>  		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
>  	case PTR_TO_ARENA:
>  		return true;
> +	case PTR_TO_INSN:
> +		return (rold->off == rcur->off && range_within(rold, rcur) &&
> +			tnum_in(rold->var_off, rcur->var_off));
>  	default:
>  		return regs_exact(rold, rcur, idmap);
>  	}
                     ^^^^

Should PTR_TO_INSN check map_ptr equality like PTR_TO_MAP_VALUE does?

The PTR_TO_MAP_VALUE case uses memcmp() which compares all fields
including map_ptr. But this PTR_TO_INSN case only checks off, range, and
var_off. If a BPF program uses multiple INSN_ARRAY maps with different
jump tables, could state pruning incorrectly consider two PTR_TO_INSN
registers equivalent when they point to different maps?
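
If the answer is yes, the fix direction suggested by the question would
add a map comparison before the range checks (a sketch; assumes
PTR_TO_INSN registers carry map_ptr like other map-derived pointers):

    case PTR_TO_INSN:
            return rold->map_ptr == rcur->map_ptr &&
                   rold->off == rcur->off && range_within(rold, rcur) &&
                   tnum_in(rold->var_off, rcur->var_off);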



AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf, x86: add support for indirect jumps
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18995945884

@kernel-patches-daemon-bpf
Author

Forwarding comment 3476280931 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

@kernel-patches-daemon-bpf
Author

Forwarding comment 3476281416 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

kernel-patches-daemon-bpf bot deleted the series/1018485=>bpf-next branch November 5, 2025 05:09