Skip to content

Conversation

@jsitnicki
Copy link
Contributor

No description provided.

This series continues the effort to provide reliable access to xdp/skb
metadata from BPF context on the receive path.

Currently skb metadata location is tied to the MAC header offset, which
breaks on L2 decapsulation (VLAN, GRE, etc.) when the MAC offset is
reset. The naive fix is to memmove metadata on every decap path, but we can
avoid this cost by tracking metadata position independently.

Introduce a dedicated meta_end field in skb_shared_info that records where
metadata ends relative to skb->head. This allows BPF dynptr
access (bpf_dynptr_from_skb_meta()) to work without memmove. For
skb->data_meta pointer access, which expects metadata immediately before
skb->data, the verifier injects realignment code in TC BPF prologue.

Patches 1-8 enforce the calling convention: skb_metadata_set() must be
called after skb->data points past the metadata area, ensuring meta_end
captures the correct position. Patch 9 implements the core change.
Patches 10-13 extend the verifier to track data_meta usage, and patch 14
adds the realignment logic.

Note: This series does not address moving metadata on L2 encapsulation
(forwarding path). VLAN and QinQ have already been patched when fixing TC
BPF helpers [1], but other tagging/tunnel code still requires changes.

Selftests are missing. The series has been developed against an out-of-tree
shell-based test suite at [2].

Note to maintainers: This not a typical series, in the sense that it
touches both for the networking drivers and the BPF verifier. The
preparatory changes for the drivers could be split out, if it makes things
easier.

Thanks,
-jkbs

[1] https://lore.kernel.org/all/[email protected]/
[2] https://github.com/jsitnicki/skb-metadata-tests/blob/main/rx_loopback_test.sh

# Describe the purpose of this series. The information you put here
# will be used by the project maintainer to make a decision whether
# your patches should be reviewed, and in what priority order. Please be
# very detailed and link to any relevant discussions or sites that the
# maintainer can review to better understand your proposed changes. If you
# only have a single patch in your series, the contents of the cover
# letter will be appended to the "under-the-cut" portion of the patch.

# Lines starting with # will be removed from the cover letter. You can
# use them to add notes or reminders to yourself. If you want to use
# markdown headers in your cover letter, start the line with ">#".

# You can add trailers to the cover letter. Any email addresses found in
# these trailers will be added to the addresses specified/generated
# during the b4 send stage. You can also run "b4 prep --auto-to-cc" to
# auto-populate the To: and Cc: trailers based on the code being
# modified.

Signed-off-by: Jakub Sitnicki <[email protected]>

--- b4-submit-tracking ---
# This section is used internally by b4 prep for tracking purposes.
{
  "series": {
    "revision": 1,
    "change-id": "20251123-skb-meta-safeproof-netdevs-rx-only-3f2d20d15eda",
    "prefixes": [
      "RFC bpf-next"
    ],
    "prerequisites": [
      "message-id: [email protected]"
    ]
  }
}
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
points right past the metadata area.

veth, unlike any other driver, calls skb_metadata_set() only after
extracting EtherType with eth_type_trans(), which pulls the MAC
header.

Adjust the driver to pull the MAC header after calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust AF_XDP to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <[email protected]>
Prepare to track skb metadata location independently of MAC header offset.

Following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
points just past the metadata area.

Tweak XDP generic mode accordingly.

Signed-off-by: Jakub Sitnicki <[email protected]>
Currently skb metadata location is derived from the MAC header offset.
This breaks when L2 tunnel/tagging devices (VLAN, GRE, etc.) reset the MAC
offset after pulling the encapsulation header, making the metadata
inaccessible.

A naive fix would be to move metadata on every skb_pull() path. However, we
can avoid a memmove on L2 decapsulation if we can locate metadata
independently of the MAC offset.

Introduce a meta_end field in skb_shared_info to track where metadata ends,
decoupling it from mac_header. The new field takes 2 bytes out of the
existing 4 byte hole, with structure size unchanged if we reorder the
gso_type field.

Update skb_metadata_set() to record meta_end at the time of the call, and
adjust skb_data_move() and pskb_expand_head() to keep meta_end in sync with
head buffer layout.

Remove the now-unneeded metadata adjustment in skb_reorder_vlan_header().

Note that this breaks BPF skb metadata access through skb->data_meta when
there is a gap between meta_end and skb->data. Following BPF verifier
changes address this.

Also, we still need to relocate the metadata on encapsulation on forward
path. VLAN and QinQ have already been patched when fixing TC BPF helpers
[1], but other tagging/tunnel code still requires similar changes. This
will be done as a follow up.

Signed-off-by: Jakub Sitnicki <[email protected]>
The may_access_direct_pkt_data() helper sets env->seen_direct_write as a
side effect, which creates awkward calling patterns:

- check_special_kfunc() has a comment warning readers about the side effect
- specialize_kfunc() must save and restore the flag around the call

Make the helper a pure function by moving the seen_direct_write flag
setting to call sites that need it.

Signed-off-by: Jakub Sitnicki <[email protected]>
Convert seen_direct_write from a boolean to a bitmap (seen_packet_access)
in preparation for tracking additional packet access patterns.

No functional change.

Signed-off-by: Jakub Sitnicki <[email protected]>
Change gen_prologue() to accept the packet access flags bitmap. This allows
gen_prologue() to inspect multiple access patterns when needed.

No functional change.

Signed-off-by: Jakub Sitnicki <[email protected]>
Introduce PA_F_DATA_META_LOAD flag to track when a BPF program loads the
skb->data_meta pointer.

This information will be used by gen_prologue() to handle cases where there
is a gap between metadata end and skb->data, requiring metadata to be
realigned.

Signed-off-by: Jakub Sitnicki <[email protected]>
@jsitnicki jsitnicki force-pushed the skb-meta/safeproof-netdevs-rx-only branch 2 times, most recently from 2adec4e to 815642b Compare November 24, 2025 14:18
After decoupling metadata location from MAC header offset, a gap can appear
between metadata and skb->data on L2 decapsulation (e.g., VLAN, GRE). This
breaks the BPF data_meta pointer which assumes metadata is directly before
skb->data.

Introduce bpf_skb_meta_realign() kfunc to close the gap by moving metadata
to immediately precede the MAC header. Inject a call to it in
tc_cls_act_prologue() when the verifier detects data_meta access
(PA_F_DATA_META_LOAD flag).

Update skb_data_move() to handle the gap case: on skb_push(), move metadata
to the top of the head buffer; on skb_pull() where metadata is already
detached, leave it in place.

This restores data_meta functionality for TC programs while keeping the
performance benefit of avoiding memmove on L2 decapsulation for programs
that don't use data_meta.

Signed-off-by: Jakub Sitnicki <[email protected]>
@jsitnicki jsitnicki force-pushed the skb-meta/safeproof-netdevs-rx-only branch from 815642b to 2044203 Compare November 24, 2025 14:45
@kernel-patches-daemon-bpf
Copy link

Automatically cleaning up stale PR; feel free to reopen if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant