netdev CI testing #6666

kuba-moo · 2024-03-27T20:02:33Z

Reusable PR for hooking netdev CI to BPF testing.

Report if/when qcom-ethqos changes the PCS configuration. With phylink now setting the PCS configuration, there should be no need for drivers to change this. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: NipaLocal <nipa@local>

There are two different ways that LLVM can expand kCFI operand bundles in LLVM IR: generically in the middle end or using an architecture specific sequence when lowering LLVM IR to machine code in the backend. The generic pass allows any architecture to take advantage of kCFI but the expansion of these bundles in the middle end can mess with optimizations that may turn indirect calls into direct calls when the call target is known at compile time, such as after inlining. Add __nocfi_generic, dependent on an architecture selecting CONFIG_ARCH_USES_CFI_GENERIC_LLVM_PASS, to disable kCFI bundle generation in functions where only the generic kCFI pass may cause problems. Link: ClangBuiltLinux/linux#2124 Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Prior to clang 22.0.0 [1], ARM did not have an architecture specific kCFI bundle lowering in the backend, which may cause issues. Select CONFIG_ARCH_USES_CFI_GENERIC_LLVM_PASS to enable use of __nocfi_generic. Link: llvm/llvm-project@d130f40 [1] Link: ClangBuiltLinux/linux#2124 Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: NipaLocal <nipa@local>

When building drivers/net/ethernet/intel/idpf/xsk.c for ARCH=arm with CONFIG_CFI=y using a version of LLVM prior to 22.0.0, there is a BUILD_BUG_ON failure: $ cat arch/arm/configs/repro.config CONFIG_BPF_SYSCALL=y CONFIG_CFI=y CONFIG_IDPF=y CONFIG_XDP_SOCKETS=y $ make -skj"$(nproc)" ARCH=arm LLVM=1 clean defconfig repro.config drivers/net/ethernet/intel/idpf/xsk.o In file included from drivers/net/ethernet/intel/idpf/xsk.c:4: include/net/libeth/xsk.h:205:2: error: call to '__compiletime_assert_728' declared with 'error' attribute: BUILD_BUG_ON failed: !__builtin_constant_p(tmo == libeth_xsktmo) 205 | BUILD_BUG_ON(!__builtin_constant_p(tmo == libeth_xsktmo)); | ^ ... libeth_xdp_tx_xmit_bulk() indirectly calls libeth_xsk_xmit_fill_buf() but these functions are marked as __always_inline so that the compiler can turn these indirect calls into direct ones and see that the tmo parameter to __libeth_xsk_xmit_fill_buf_md() is ultimately libeth_xsktmo from idpf_xsk_xmit(). Unfortunately, the generic kCFI pass in LLVM expands the kCFI bundles from the indirect calls in libeth_xdp_tx_xmit_bulk() in such a way that later optimizations cannot turn these calls into direct ones, making the BUILD_BUG_ON fail because it cannot be proved at compile time that tmo is libeth_xsktmo. Disable the generic kCFI pass for libeth_xdp_tx_xmit_bulk() to ensure these indirect calls can always be turned into direct calls to avoid this error. Closes: ClangBuiltLinux/linux#2124 Fixes: 9705d65 ("idpf: implement Rx path for AF_XDP") Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: NipaLocal <nipa@local>

…ify updated Using atomic to protect the send_peer_notif instead of rtnl_mutex. This approach allows safe updates in both interrupt and process contexts, while avoiding code complexity. In LACP mode, the RTNL might be locked, preventing ad_cond_set_peer_notif from acquiring the lock and updating send_peer_notif. This patch addresses the issue by using a atomic. Since updating send_peer_notif does not require high real-time performance, such atomic updates are acceptable. Cc: Jay Vosburgh <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Simon Horman <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Nikolay Aleksandrov <[email protected]> Cc: Hangbin Liu <[email protected]> Suggested-by: Jay Vosburgh <[email protected]> Signed-off-by: Tonghao Zhang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Commit 267bca0 ("dt-bindings: net: sparx5: correct LAN969x register space windows") said that LAN969x has exactly two address spaces ("reg" property) but implemented it as 2 or more. Narrow the constraint to properly express that only two items are allowed, which also matches Linux driver. Fixes: 267bca0 ("dt-bindings: net: sparx5: correct LAN969x register space windows") Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

The LED registers on YT8531 are exactly same as YT8521, so simply reuse yt8521_led_hw_* functions. Tested on OrangePi R1 Plus LTS and Zero3. Signed-off-by: Tianling Shen <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Fix an issue detected by syzbot: KMSAN reported an uninitialized-value access in sctp_inq_pop BUG: KMSAN: uninit-value in sctp_inq_pop The issue is actually caused by skb trimming via sk_filter() in sctp_rcv(). In the reproducer, skb->len becomes 1 after sk_filter(), which bypassed the original check: if (skb->len < sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) + skb_transport_offset(skb)) To handle this safely, a new check should be performed after sk_filter(). Reported-by: [email protected] Tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=d101e12bccd4095460e7 Fixes: 1da177e ("Linux-2.6.12-rc2") Suggested-by: Xin Long <[email protected]> Signed-off-by: Ranganath V N <[email protected]> Signed-off-by: NipaLocal <nipa@local>

The code did not check the return value of usbnet_get_endpoints. Add checks and return the error if it fails to transfer the error. Found via static anlaysis and this is similar to commit 07161b2 ("sr9800: Add check for usbnet_get_endpoints"). Fixes: 933a27d ("USB: asix - Add AX88178 support and many other changes") Fixes: 2e55cc7 ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters") Cc: [email protected] Signed-off-by: Miaoqian Lin <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Socket APIs like recvfrom(), accept(), and getsockname() expect socklen_t* arg, but tests were using int variables. This causes -Wpointer-sign warnings on platforms where socklen_t is unsigned. Change the variable type from int to socklen_t to resolve the warning and ensure type safety across platforms. warning fixed: sctp_collision.c:62:70: warning: passing 'int *' to parameter of type 'socklen_t *' (aka 'unsigned int *') converts between pointers to integer types with different sign [-Wpointer-sign] 62 | ret = recvfrom(sd, buf, sizeof(buf), 0, (struct sockaddr *)&daddr, &len); | ^~~~ /usr/include/sys/socket.h:165:27: note: passing argument to parameter '__addr_len' here 165 | socklen_t *__restrict __addr_len); | ^ Signed-off-by: Ankit Khushwaha <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Update tls_offload_rx_resync_async_request_start() and tls_offload_rx_resync_async_request_end() to get a struct tls_offload_resync_async parameter directly, rather than extracting it from struct sock. This change aligns the function signatures with the upcoming tls_offload_rx_resync_async_request_cancel() helper, which will be introduced in a subsequent patch. Signed-off-by: Shahar Shitrit <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

When a netdev issues a RX async resync request for a TLS connection, the TLS module handles it by logging record headers and attempting to match them to the tcp_sn provided by the device. If a match is found, the TLS module approves the tcp_sn for resynchronization. While waiting for a device response, the TLS module also increments rcd_delta each time a new TLS record is received, tracking the distance from the original resync request. However, if the device response is delayed or fails (e.g due to unstable connection and device getting out of tracking, hardware errors, resource exhaustion etc.), the TLS module keeps logging and incrementing, which can lead to a WARN() when rcd_delta exceeds the threshold. To address this, introduce tls_offload_rx_resync_async_request_cancel() to explicitly cancel resync requests when a device response failure is detected. Call this helper also as a final safeguard when rcd_delta crosses its threshold, as reaching this point implies that earlier cancellation did not occur. Signed-off-by: Shahar Shitrit <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

When device loses track of TLS records, it attempts to resync by monitoring records and requests an asynchronous resynchronization from software for this TLS connection. The TLS module handles such device RX resync requests by logging record headers and comparing them with the record tcp_sn when provided by the device. It also increments rcd_delta to track how far the current record tcp_sn is from the tcp_sn of the original resync request. If the device later responds with a matching tcp_sn, the TLS module approves the tcp_sn for resync. However, the device response may be delayed or never arrive, particularly due to traffic-related issues such as packet drops or reordering. In such cases, the TLS module remains unaware that resync will not complete, and continues performing unnecessary work by logging headers and incrementing rcd_delta, which can eventually exceed the threshold and trigger a WARN(). For example, this was observed when the device got out of tracking, causing mlx5e_ktls_handle_get_psv_completion() to fail and ultimately leading to the rcd_delta warning. To address this, call tls_offload_rx_resync_async_request_cancel() to cancel the resync request and stop resync tracking in such error cases. Also, increment the tls_resync_req_skip counter to track these cancellations. Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Shahar Shitrit <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Alex will send phylink patches soon which will make us link up on QEMU again, but for now let's hack up the link. Gives us a chance to add another QEMU NIC test to "HW" runners in the CI. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Let's see if this increases stability of timing-related results.. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

These are unlikely to matter for CI testing and they slow things down. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: NipaLocal <nipa@local>

We exclusively use headless VMs today, don't waste time compiling sound and GPU drivers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

kmemleak auto scan could be a source of latency for the tests. We run a full scan after the tests manually, we don't need the autoscan thread to be enabled. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

… HEAD

kuba-moo force-pushed the to-test branch from 6bd5e75 to bdd05e2 Compare March 27, 2024 21:49

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 4f22ee0 to 8a9a8e0 Compare March 28, 2024 04:46

kuba-moo force-pushed the to-test branch 11 times, most recently from 64c403f to 8da1f58 Compare March 29, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 78ebb17 to 9325308 Compare March 29, 2024 02:14

kuba-moo force-pushed the to-test branch 6 times, most recently from c8c7b2f to a71aae6 Compare March 29, 2024 18:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 9325308 to 7940ae1 Compare March 29, 2024 18:12

kuba-moo force-pushed the to-test branch 2 times, most recently from d8feb00 to b16a6b9 Compare March 30, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 7940ae1 to 8f1ff3c Compare March 30, 2024 00:21

kuba-moo force-pushed the to-test branch 2 times, most recently from 4164329 to c5cecb3 Compare March 30, 2024 06:00

Russell King (Oracle) and others added 29 commits October 26, 2025 14:01

net: stmmac: add support specifying PCS supported interfaces

ee07ed6

Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: NipaLocal <nipa@local>

net: phy: motorcomm: Add support for PHY LEDs on YT8531

ee28bcf

The LED registers on YT8531 are exactly same as YT8521, so simply reuse yt8521_led_hw_* functions. Tested on OrangePi R1 Plus LTS and Zero3. Signed-off-by: Tianling Shen <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: disable random kunit tests

86f6eba

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: disable 6.17's merge window kunit tests

6271b17

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: x86: use periodic HZ tick

4fc0312

Let's see if this increases stability of timing-related results.. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: profile (time) test output

46f0122

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: timestamp - try waking

d47cd1a

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: dbg: tests: bonding: print info on failure

bcf9fff

Signed-off-by: NipaLocal <nipa@local>

nipa: selftests: net: enable profiling

291346c

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: tc_action dbg

7f47d91

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: disable CPU_MITIGATIONS

884c678

These are unlikely to matter for CI testing and they slow things down. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: forwarding: set timeout to 3 hours

45e9470

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

nipa: drv: net: add timeout

0872853

Signed-off-by: NipaLocal <nipa@local>

nipa: config: x86: disable GPUs and sound

3e5c9be

We exclusively use headless VMs today, don't waste time compiling sound and GPU drivers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

nipa: config: disable kmemleak auto scan

db402aa

kmemleak auto scan could be a source of latency for the tests. We run a full scan after the tests manually, we don't need the autoscan thread to be enabled. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Merge remote-tracking branch 'origin/net-next-2025-10-26--21-00' into…

a9380a4

… HEAD

kuba-moo force-pushed the to-test branch from e298d56 to a9380a4 Compare October 26, 2025 21:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

netdev CI testing #6666

netdev CI testing #6666

kuba-moo commented Mar 27, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

50 participants

netdev CI testing #6666

Are you sure you want to change the base?

netdev CI testing #6666

Conversation

kuba-moo commented Mar 27, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

50 participants