Netlink event API: add SUBFLOW_CREATED event #222

matttbe · 2021-08-04T18:11:43Z

For the moment, there is no event when a subflow has been created, only when it has been established.

If we have this situation:

the userspace PM creates a subflow -- when the command will be available, see Add netlink command support #186 -- without specifying the src port/address.
the subflow is never established, e.g. SYN are dropped
the userspace is not able to destroy the subflow -- again, when the command will be available -- because the userspace doesn't have the 5-tuples.
the userspace will see SUBFLOW_CLOSED events for unknown subflow

Solution: have a SUBFLOW_CREATED event.

The text was updated successfully, but these errors were encountered:

fw-strlen · 2021-08-11T10:30:16Z

We could suppress SF_CLOSED for subflows that were never announced.
I think that for purpose of 'canceling' a subflow connect request it would make more sense to me to support
the NLM_F_ECHO flag in the 'create' api so that mptcpd learns some identifier that it can use to
terminate the connection request when it sends the appropriate create command.

matttbe · 2021-08-11T14:05:02Z

We could suppress SF_CLOSED for subflows that were never announced.

Indeed. For me, receiving an event for an unknown subflow is not a big deal: anyway the userspace app has to manage this case in case events are lost, the app is restarted, etc.

I think that for purpose of 'canceling' a subflow connect request it would make more sense to me to support
the NLM_F_ECHO flag in the 'create' api so that mptcpd learns some identifier that it can use to
terminate the connection request when it sends the appropriate create command.

Good point, I didn't think about that!
Will the advantage be that only the creator of the subflow will get the notification?
From what I see; the creator will not have to "block" so it seems good.

I was thinking about a new dedicated event because on the kernel side, we send the same content: the implementation is easy. And this could benefit to others, e.g. if we only have a listener (ip mptcp monitor) or if the subflow has been created by one part the events are processed by another entity. We can have two channels of communications: one for the events and one for the commands (and their ACKs): one part is in charge or managing new events and the other side is in charge of sending commands and wait for ACK. Not sure it is the proper way to do such thing.

VenkateswaranJ · 2022-08-29T14:26:05Z

@matttbe what do you mean by when the command will be available? how does the user's space know whether the command is available or not?

matttbe · 2022-08-29T14:36:49Z

@VenkateswaranJ it was just to say that this task depends on #186 (now done).

The userspace can know if a command is supported by trying it and check the return code.

The following warning was reported when running "./test_progs -t test_bpf_ma/percpu_free_through_map_free": ------------[ cut here ]------------ WARNING: CPU: 1 PID: 68 at kernel/bpf/memalloc.c:342 CPU: 1 PID: 68 Comm: kworker/u16:2 Not tainted 6.6.0-rc2+ #222 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:bpf_mem_refill+0x21c/0x2a0 ...... Call Trace: <IRQ> ? bpf_mem_refill+0x21c/0x2a0 irq_work_single+0x27/0x70 irq_work_run_list+0x2a/0x40 irq_work_run+0x18/0x40 __sysvec_irq_work+0x1c/0xc0 sysvec_irq_work+0x73/0x90 </IRQ> <TASK> asm_sysvec_irq_work+0x1b/0x20 RIP: 0010:unit_free+0x50/0x80 ...... bpf_mem_free+0x46/0x60 __bpf_obj_drop_impl+0x40/0x90 bpf_obj_free_fields+0x17d/0x1a0 array_map_free+0x6b/0x170 bpf_map_free_deferred+0x54/0xa0 process_scheduled_works+0xba/0x370 worker_thread+0x16d/0x2e0 kthread+0x105/0x140 ret_from_fork+0x39/0x60 ret_from_fork_asm+0x1b/0x30 </TASK> ---[ end trace 0000000000000000 ]--- The reason is simple: __bpf_obj_drop_impl() does not know the freeing field is a per-cpu pointer and it uses bpf_global_ma to free the pointer. Because bpf_global_ma is not a per-cpu allocator, so ksize() is used to select the corresponding cache. The bpf_mem_cache with 16-bytes unit_size will always be selected to do the unmatched free and it will trigger the warning in free_bulk() eventually. Because per-cpu kptr doesn't support list or rb-tree now, so fix the problem by only checking whether or not the type of kptr is per-cpu in bpf_obj_free_fields(), and using bpf_global_percpu_ma to these kptrs. Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

matttbe added the enhancement label Aug 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Netlink event API: add SUBFLOW_CREATED event #222

Netlink event API: add SUBFLOW_CREATED event #222

matttbe commented Aug 4, 2021

fw-strlen commented Aug 11, 2021

matttbe commented Aug 11, 2021

VenkateswaranJ commented Aug 29, 2022

matttbe commented Aug 29, 2022

Netlink event API: add SUBFLOW_CREATED event #222

Netlink event API: add SUBFLOW_CREATED event #222

Comments

matttbe commented Aug 4, 2021

fw-strlen commented Aug 11, 2021

matttbe commented Aug 11, 2021

VenkateswaranJ commented Aug 29, 2022

matttbe commented Aug 29, 2022