Mention move to own git hosting #5
Comments
This is the canonical repository for SPDK's dpdk submodule. The dpdk submodule never points to master - it always points to a spdk-xx.yy branch. These branches start with the DPDK xx.yy release, and add a few extra commits, typically disabling libraries and PMD drivers that SPDK does not require. You will see much more recent commits on these branches. For anyone who wants latest DPDK master, it's suggested to point to the DPDK repository directly.
Thanks for the very quick response and explanation.
Ah ok. Perhaps this should be mentioned in the README on the master branch then, and master shouldn't have any files other than the README so people don't get confused? There is also the option of specifying in the GitHub settings which branch is the default.
I've changed the default branch to spdk-21.05. @tomzawadzki, we will want to remember to update this when we move the DPDK submodule to spdk-21.08 (and others) in the future.
Thanks @zeenix!
Caught with ASan:
==9727==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f0daa2fc0d0 at pc 0x7f0daeefacb2 bp 0x7f0daa2fadd0 sp 0x7f0daa2fa578
READ of size 1 at 0x7f0daa2fc0d0 thread T1
    #0 0x7f0daeefacb1 (/lib64/libasan.so.5+0xbacb1)
    #1 0x115eba1 in dev_uev_parse ../lib/eal/linux/eal_dev.c:167
    #2 0x115f281 in dev_uev_handler ../lib/eal/linux/eal_dev.c:248
    #3 0x1169b91 in eal_intr_process_interrupts ../lib/eal/linux/eal_interrupts.c:1026
    #4 0x116a3a2 in eal_intr_handle_interrupts ../lib/eal/linux/eal_interrupts.c:1100
    #5 0x116a7f0 in eal_intr_thread_main ../lib/eal/linux/eal_interrupts.c:1172
    #6 0x112640a in ctrl_thread_init ../lib/eal/common/eal_common_thread.c:202
    #7 0x7f0dade27159 in start_thread (/lib64/libpthread.so.0+0x8159)
    #8 0x7f0dadb58f72 in clone (/lib64/libc.so.6+0xfcf72)
Address 0x7f0daa2fc0d0 is located in stack of thread T1 at offset 4192 in frame
    #0 0x115f0c9 in dev_uev_handler ../lib/eal/linux/eal_dev.c:226
  This frame has 2 object(s):
    [32, 48) 'uevent'
    [96, 4192) 'buf' <== Memory access at offset 4192 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Thread T1 created by T0 here:
    #0 0x7f0daee92ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)
    #1 0x1126542 in rte_ctrl_thread_create ../lib/eal/common/eal_common_thread.c:228
    #2 0x116a8b5 in rte_eal_intr_init ../lib/eal/linux/eal_interrupts.c:1200
    #3 0x1159dd1 in rte_eal_init ../lib/eal/linux/eal.c:1044
    #4 0x7a22f8 in main ../app/test-pmd/testpmd.c:4105
    #5 0x7f0dada7f802 in __libc_start_main (/lib64/libc.so.6+0x23802)
Bugzilla ID: 792
Fixes: 0d0f478 ("eal/linux: add uevent parse and process")
Cc: [email protected]
Signed-off-by: David Marchand <[email protected]>
Tested-by: Yan Xia <[email protected]>
Reviewed-by: Maxime Coquelin <[email protected]>
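The report above shows a one-byte read at offset 4192, exactly one past the end of 'buf'. As a rough illustration of how that class of bug is usually avoided, here is a minimal sketch of a bounds-checked walk over a buffer of NUL-separated fields. It is not the DPDK patch: count_fields_with_prefix is a hypothetical helper, and the assumption that the message is a sequence of NUL-separated KEY=value strings is made only for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK function): counts the NUL-separated
 * fields inside buf[0..length) that begin with `prefix`. Every read is
 * preceded by a bound check, so buf[length] is never touched.
 */
static size_t
count_fields_with_prefix(const char *buf, size_t length, const char *prefix)
{
	size_t plen = strlen(prefix);
	size_t count = 0;
	size_t i = 0;

	while (i < length) {
		/* Skip separator bytes, checking the bound before each read. */
		while (i < length && buf[i] == '\0')
			i++;
		if (i >= length)
			break;

		/* Find the end of the field without leaving the buffer. */
		const char *end = memchr(buf + i, '\0', length - i);
		size_t flen = end != NULL ? (size_t)(end - (buf + i)) : length - i;

		if (flen >= plen && memcmp(buf + i, prefix, plen) == 0)
			count++;
		i += flen;
	}
	return count;
}
```

The point of the sketch is that the last field may lack a terminating NUL when the message fills the whole buffer, so the length has to bound every access rather than relying on a terminator.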
If DPDK is built with thread sanitizer it reports a race in setting of multiprocess file descriptor. The fix is to use atomic operations when updating mp_fd.
Build:
$ meson -Db_sanitize=address build
$ ninja -C build
Simple example:
$ .build/app/dpdk-testpmd -l 1-3 --no-huge
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
==================
WARNING: ThreadSanitizer: data race (pid=87245)
  Write of size 4 at 0x558e04d8ff70 by main thread:
    #0 rte_mp_channel_cleanup <null> (dpdk-testpmd+0x1e7d30c)
    #1 rte_eal_cleanup <null> (dpdk-testpmd+0x1e85929)
    #2 rte_exit <null> (dpdk-testpmd+0x1e5bc0a)
    #3 mbuf_pool_create.cold <null> (dpdk-testpmd+0x274011)
    #4 main <null> (dpdk-testpmd+0x5cc15d)
  Previous read of size 4 at 0x558e04d8ff70 by thread T2:
    #0 mp_handle <null> (dpdk-testpmd+0x1e7c439)
    #1 ctrl_thread_init <null> (dpdk-testpmd+0x1e6ee1e)
  As if synchronized via sleep:
    #0 nanosleep libsanitizer/tsan/tsan_interceptors_posix.cpp:366
    #1 get_tsc_freq <null> (dpdk-testpmd+0x1e92ff9)
    #2 set_tsc_freq <null> (dpdk-testpmd+0x1e6f2fc)
    #3 rte_eal_timer_init <null> (dpdk-testpmd+0x1e931a4)
    #4 rte_eal_init.cold <null> (dpdk-testpmd+0x29e578)
    #5 main <null> (dpdk-testpmd+0x5cbc45)
  Location is global 'mp_fd' of size 4 at 0x558e04d8ff70 (dpdk-testpmd+0x000003122f70)
  Thread T2 'rte_mp_handle' (tid=87248, running) created by main thread at:
    #0 pthread_create libsanitizer/tsan/tsan_interceptors_posix.cpp:969
    #1 rte_ctrl_thread_create <null> (dpdk-testpmd+0x1e6efd0)
    #2 rte_mp_channel_init.cold <null> (dpdk-testpmd+0x29cb7c)
    #3 rte_eal_init <null> (dpdk-testpmd+0x1e8662e)
    #4 main <null> (dpdk-testpmd+0x5cbc45)
SUMMARY: ThreadSanitizer: data race (app/dpdk-testpmd+0x1e7d30c) in rte_mp_channel_cleanup
==================
ThreadSanitizer: reported 1 warnings
Fixes: bacaa27 ("eal: add channel for multi-process communication")
Cc: [email protected]
Signed-off-by: Stephen Hemminger <[email protected]>
Acked-by: Anatoly Burakov <[email protected]>
Reviewed-by: Chengwen Feng <[email protected]>
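The commit message above says the fix is to use atomic operations when updating mp_fd, without showing them. A minimal sketch of that idea with C11 <stdatomic.h> follows. The names chan_fd, chan_fd_get, chan_fd_set and chan_fd_close are hypothetical, and the choice of C11 atomics with relaxed ordering is an assumption for illustration; the actual patch may use different primitives.

```c
#include <stdatomic.h>
#include <unistd.h>

/* Stand-in for the shared descriptor; -1 means "no channel open". */
static atomic_int chan_fd = -1;

/* Reader side: take one atomic snapshot and use only that value. */
static int
chan_fd_get(void)
{
	return atomic_load_explicit(&chan_fd, memory_order_relaxed);
}

/* Init side: publish a freshly opened descriptor. */
static void
chan_fd_set(int fd)
{
	atomic_store_explicit(&chan_fd, fd, memory_order_relaxed);
}

/*
 * Cleanup side: swap in -1 and close whatever was stored before.
 * Returns the previous value so the caller can tell whether a
 * descriptor was actually closed.
 */
static int
chan_fd_close(void)
{
	int fd = atomic_exchange_explicit(&chan_fd, -1, memory_order_relaxed);

	if (fd >= 0)
		close(fd);
	return fd;
}
```

Using an exchange on the cleanup path means only one caller ever sees the old descriptor, so it is closed at most once even if cleanup is reached from more than one place.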
Devices can end up without driver assigned after probing. The cleanup function from patch below does not free devices that do not have it assigned. The devices reported to leak are not the ones that are the object of the hotplug test. This issue appears only when using shared objects build, and on a specific set of CI machines. More debugging needs to happen to fully understand the issue here.
Fixes patch below: (1cab1a4)bus: cleanup devices on shutdown
Errors from ASAN:
00:16:25.099 ==48971==ERROR: LeakSanitizer: detected memory leaks
00:16:25.099
00:16:25.099 Indirect leak of 11544 byte(s) in 37 object(s) allocated from:
00:16:25.099     #0 0x7f4d00f1a6af in __interceptor_malloc (/usr/lib64/libasan.so.8+0xba6af)
00:16:25.099     #1 0x7f4d0017c4a7 in pci_scan_one ../drivers/bus/pci/linux/pci.c:218
00:16:25.099     #2 0x7f4d0017ceb5 in rte_pci_scan ../drivers/bus/pci/linux/pci.c:471
00:16:25.099     #3 0x7f4d0002394c in rte_bus_scan ../lib/eal/common/eal_common_bus.c:56
00:16:25.099     #4 0x7f4d00053847 in rte_eal_init ../lib/eal/linux/eal.c:1065
00:16:25.099     #5 0x7f4d00256c01 in spdk_env_init /var/jenkins/workspace/hw-nvme-hotplug/spdk/lib/env_dpdk/init.c:585
00:16:25.099     #6 0x40709c in main /var/jenkins/workspace/hw-nvme-hotplug/spdk/examples/nvme/hotplug/hotplug.c:571
00:16:25.099     #7 0x7f4cff50150f in __libc_start_call_main (/usr/lib64/libc.so.6+0x2750f)
Change-Id: I78252587a0930a15097ce16227a4935d34871b75
Signed-off-by: Krzysztof Karas <[email protected]>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/dpdk/+/16443
Reviewed-by: Tomasz Zawadzki <[email protected]>
Reviewed-by: Konrad Sztyber <[email protected]>
Tested-by: SPDK CI Jenkins <[email protected]>
The net/vhost pmd currently provides a -1 vid when disabling interrupt after a virtio port got disconnected. This can be caught when running with ASan.
First, start dpdk-l3fwd-power in interrupt mode with a net/vhost port.
$ ./build-clang/examples/dpdk-l3fwd-power -l0,1 --in-memory \
    -a 0000:00:00.0 \
    --vdev net_vhost0,iface=plop.sock,client=1\
    -- \
    -p 0x1 \
    --interrupt-only \
    --config '(0,0,1)' \
    --parse-ptype 0
Then start testpmd with virtio-user.
$ ./build-clang/app/dpdk-testpmd -l0,2 --single-file-segment --in-memory \
    -a 0000:00:00.0 \
    --vdev net_virtio_user0,path=plop.sock,server=1 \
    -- \
    -i
Finally stop testpmd. ASan then splats in dpdk-l3fwd-power:
=================================================================
==3641005==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000005ed0778 at pc 0x000001270f81 bp 0x7fddbd2eee20 sp 0x7fddbd2eee18
READ of size 8 at 0x000005ed0778 thread T2
    #0 0x1270f80 in get_device .../lib/vhost/vhost.h:801:27
    #1 0x1270f80 in rte_vhost_get_vhost_vring .../lib/vhost/vhost.c:951:8
    #2 0x3ac95cb in eth_rxq_intr_disable .../drivers/net/vhost/rte_eth_vhost.c:647:8
    #3 0x170e0bf in rte_eth_dev_rx_intr_disable .../lib/ethdev/rte_ethdev.c:5443:25
    #4 0xf72ba7 in turn_on_off_intr .../examples/l3fwd-power/main.c:881:4
    #5 0xf71045 in main_intr_loop .../examples/l3fwd-power/main.c:1061:6
    #6 0x17f9292 in eal_thread_loop .../lib/eal/common/eal_common_thread.c:210:9
    #7 0x18373f5 in eal_worker_thread_loop .../lib/eal/linux/eal.c:915:2
    #8 0x7fddc16ae12c in start_thread (/lib64/libc.so.6+0x8b12c) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
    #9 0x7fddc172fbbf in __GI___clone3 (/lib64/libc.so.6+0x10cbbf) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
0x000005ed0778 is located 8 bytes to the left of global variable 'vhost_devices' defined in '.../lib/vhost/vhost.c:24' (0x5ed0780) of size 8192
0x000005ed0778 is located 20 bytes to the right of global variable 'vhost_config_log_level' defined in '.../lib/vhost/vhost.c:2174' (0x5ed0760) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow .../lib/vhost/vhost.h:801:27 in get_device
Shadow bytes around the buggy address:
  0x000080bd2090: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x000080bd20a0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x000080bd20b0: f9 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9 00 f9 f9 f9
  0x000080bd20c0: 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 04 f9 f9 f9
  0x000080bd20d0: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
=>0x000080bd20e0: 00 f9 f9 f9 f9 f9 f9 f9 04 f9 f9 f9 04 f9 f9[f9]
  0x000080bd20f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080bd2130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
Thread T2 created by T0 here:
    #0 0xe98996 in __interceptor_pthread_create (.examples/dpdk-l3fwd-power+0xe98996) (BuildId: d0b984a3b0287b9e0f301b73426fa921aeecca3a)
    #1 0x1836767 in eal_worker_thread_create .../lib/eal/linux/eal.c:952:6
    #2 0x1834b83 in rte_eal_init .../lib/eal/linux/eal.c:1257:9
    #3 0xf68902 in main .../examples/l3fwd-power/main.c:2496:8
    #4 0x7fddc164a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) (BuildId: 81daba31ee66dbd63efdc4252a872949d874d136)
==3641005==ABORTING
More generally, any application passing an incorrect vid would trigger such an OOB access.
Fixes: 4796ad6 ("examples/vhost: import userspace vhost application")
Cc: [email protected]
Signed-off-by: David Marchand <[email protected]>
Reviewed-by: Maxime Coquelin <[email protected]>
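The report above shows a read 8 bytes to the left of the device table, which is what indexing a pointer array with -1 produces. A minimal sketch of a range-checked lookup is given below. The names device, devices, MAX_DEVICES and lookup_device are hypothetical, and the table size of 1024 is illustrative; this shows the general guard, not the code of the patch.

```c
#include <stddef.h>

#define MAX_DEVICES 1024 /* illustrative table size */

struct device {
	int id;
};

/* Stand-in for a global table of device pointers, indexed by id. */
static struct device *devices[MAX_DEVICES];

/*
 * Returns the device registered under `vid`, or NULL when `vid` is
 * negative, too large, or unused. The range check runs before the
 * array is indexed, so an invalid id never reaches the table.
 */
static struct device *
lookup_device(int vid)
{
	if (vid < 0 || vid >= MAX_DEVICES)
		return NULL;
	return devices[vid];
}
```

Callers then only need a single NULL check, whether the id was never valid or the slot was already cleared on disconnect.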
getline() may allocate a buffer even though it returns -1:
"""
If *lineptr is set to NULL before the call, then getline() will allocate a buffer for storing the line. This buffer should be freed by the user program even if getline() failed.
"""
This leak has been observed on a RHEL8 system with two CX5 PF devices (no VFs).
ASan reports:
==8899==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 120 byte(s) in 1 object(s) allocated from:
    #0 0x7fe58576aba8 in __interceptor_malloc (/lib64/libasan.so.5+0xefba8)
    #1 0x7fe583e866b2 in __getdelim (/lib64/libc.so.6+0x886b2)
    #2 0x327bd23 in mlx5_sysfs_switch_info ../drivers/net/mlx5/linux/mlx5_ethdev_os.c:1084
    #3 0x3271f86 in mlx5_os_pci_probe_pf ../drivers/net/mlx5/linux/mlx5_os.c:2282
    #4 0x3273c83 in mlx5_os_pci_probe ../drivers/net/mlx5/linux/mlx5_os.c:2497
    #5 0x327475f in mlx5_os_net_probe ../drivers/net/mlx5/linux/mlx5_os.c:2578
    #6 0xc6eac7 in drivers_probe ../drivers/common/mlx5/mlx5_common.c:937
    #7 0xc6f150 in mlx5_common_dev_probe ../drivers/common/mlx5/mlx5_common.c:1027
    #8 0xc8ef80 in mlx5_common_pci_probe ../drivers/common/mlx5/mlx5_common_pci.c:168
    #9 0xc21b67 in rte_pci_probe_one_driver ../drivers/bus/pci/pci_common.c:312
    #10 0xc2224c in pci_probe_all_drivers ../drivers/bus/pci/pci_common.c:396
    #11 0xc222f4 in pci_probe ../drivers/bus/pci/pci_common.c:423
    #12 0xb71fff in rte_bus_probe ../lib/eal/common/eal_common_bus.c:78
    #13 0xbe6888 in rte_eal_init ../lib/eal/linux/eal.c:1300
    #14 0x5ec717 in main ../app/test-pmd/testpmd.c:4515
    #15 0x7fe583e38d84 in __libc_start_main (/lib64/libc.so.6+0x3ad84)
As far as why getline() errors, strace gives a hint:
8516 openat(AT_FDCWD, "/sys/class/net/enp130s0f0/phys_port_name", O_RDONLY) = 34
8516 fstat(34, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
8516 read(34, 0x621000098900, 4096) = -1 EOPNOTSUPP (Operation not supported)
Fixes: f8a226e ("net/mlx5: fix sysfs port name translation")
Cc: [email protected]
Signed-off-by: David Marchand <[email protected]>
Acked-by: Viacheslav Ovsiienko <[email protected]>
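The rule quoted above, that the buffer must be freed even when getline() fails, can be followed by putting a single free() on the common exit path. The sketch below assumes a POSIX system; read_first_line is a hypothetical helper written for this example and is not the mlx5 function named in the trace.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copies the first line of `path` (without the
 * trailing newline) into `out`. Returns 0 on success, -1 on failure.
 */
static int
read_first_line(const char *path, char *out, size_t out_len)
{
	FILE *f;
	char *line = NULL; /* NULL asks getline() to allocate */
	size_t cap = 0;
	int ret = -1;

	if (out_len == 0)
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return -1;

	if (getline(&line, &cap, f) != -1) {
		line[strcspn(line, "\n")] = '\0';
		snprintf(out, out_len, "%s", line);
		ret = 0;
	}

	/*
	 * Reached on success and on failure: getline() may have allocated
	 * `line` even when it returned -1, and free(NULL) is a no-op.
	 */
	free(line);
	fclose(f);
	return ret;
}
```

Keeping the free() outside the success branch is the whole point: an early return inside the failure branch is the pattern that leaks.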
In function mlx5_dev_configure, dev->data->tx_queues is assigned to priv->txqs. When a member is removed from a bond, the function eth_dev_tx_queue_config is called to release dev->data->tx_queues. However, function mlx5_dev_close then accesses priv->txqs again, which causes a use-after-free. In function mlx5_dev_close, before freeing priv->txqs, we add a check that dev->data->tx_queues is not NULL.
build/app/dpdk-testpmd -c7 -a 0000:08:00.2 -- -i --nb-cores=2 --total-num-mbufs=2048
testpmd> port stop 0
testpmd> create bonding device 4 0
testpmd> add bonding member 0 1
testpmd> remove bonding member 0 1
testpmd> quit
ASan reports:
==2571911==ERROR: AddressSanitizer: heap-use-after-free on address 0x000174529880 at pc 0x0000113c8440 bp 0xffffefae0ea0 sp 0xffffefae0eb0
READ of size 8 at 0x000174529880 thread T0
    #0 0x113c843c in mlx5_txq_release ../drivers/net/mlx5/mlx5_txq.c:1203
    #1 0xffdb53c in mlx5_dev_close ../drivers/net/mlx5/mlx5.c:2286
    #2 0xe12dc0 in rte_eth_dev_close ../lib/ethdev/rte_ethdev.c:1877
    #3 0x6bac1c in close_port ../app/test-pmd/testpmd.c:3540
    #4 0x6bc320 in pmd_test_exit ../app/test-pmd/testpmd.c:3808
    #5 0x6c1a94 in main ../app/test-pmd/testpmd.c:4759
    #6 0xffff9328f038 (/usr/lib64/libc.so.6+0x2b038)
    #7 0xffff9328f110 in __libc_start_main (/usr/lib64/libc.so.6+0x2b110)
Fixes: 6e78005 ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: [email protected]
Reported-by: Yunjian Wang <[email protected]>
Signed-off-by: Pengfei Sun <[email protected]>
Acked-by: Dariusz Sosnowski <[email protected]>
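The aliasing described above, where a driver-private pointer refers to an array owned and freed by a generic layer, can be modelled in a few lines. The sketch below uses hypothetical types and names (dev_data, priv, queue_array_release, driver_close) and assumes the owner sets its pointer to NULL after freeing; it illustrates the guard the message describes rather than reproducing the mlx5 code.

```c
#include <stdlib.h>

/* Generic layer: owns the queue array. */
struct dev_data {
	void **tx_queues;
	unsigned int nb_tx_queues;
};

/* Driver-private state: `txqs` aliases dev_data.tx_queues, not owned. */
struct priv {
	void **txqs;
	unsigned int txqs_n;
};

/* Generic layer releases the array and clears its own pointer. */
static void
queue_array_release(struct dev_data *data)
{
	free(data->tx_queues);
	data->tx_queues = NULL;
	data->nb_tx_queues = 0;
}

/*
 * Driver close path. The alias is dereferenced only while the owner's
 * pointer is still non-NULL. Returns how many queue slots were visited.
 */
static unsigned int
driver_close(struct dev_data *data, struct priv *priv)
{
	unsigned int visited = 0;

	if (data->tx_queues != NULL) {
		for (unsigned int i = 0; i < priv->txqs_n; i++) {
			priv->txqs[i] = NULL; /* stand-in for a per-queue release */
			visited++;
		}
	}
	priv->txqs = NULL;
	priv->txqs_n = 0;
	return visited;
}
```

The guard only works because the owner clears its pointer on release; the stale alias in priv cannot be tested for validity by itself.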
I thought this was the canonical repository until I noticed the 3-year-old timestamp on the latest commit. It would save a lot of people time if the move to its own Git hosting were documented in the README here. I can provide a PR for that if you like?