kernel - Hosts the 0x221E linux distro kernel.

Age	Commit message (Collapse)	Author
2025-07-11	net_sched: act_csum: use RCU in tcf_csum_dump()	Eric Dumazet
	Also storing tcf_action into struct tcf_csum_params makes sure there is no discrepancy in tcf_csum_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11	net_sched: act_connmark: use RCU in tcf_connmark_dump()	Eric Dumazet
	Also storing tcf_action into struct tcf_connmark_parms makes sure there is no discrepancy in tcf_connmark_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11	net_sched: act: annotate data-races in tcf_lastuse_update() and tcf_tm_dump()	Eric Dumazet
	tcf_tm_dump() reads fields that can be changed concurrently, and tcf_lastuse_update() might race against itself. Add READ_ONCE() and WRITE_ONCE() annotations. Fetch jiffies once in tcf_tm_dump(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.16-rc6-2). No conflicts. Adjacent changes: drivers/net/wireless/mediatek/mt76/mt7925/mcu.c c701574c5412 ("wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan") b3a431fe2e39 ("wifi: mt76: mt7925: fix off by one in mt7925_mcu_hw_scan()") drivers/net/wireless/mediatek/mt76/mt7996/mac.c 62da647a2b20 ("wifi: mt76: mt7996: Add MLO support to mt7996_tx_check_aggr()") dc66a129adf1 ("wifi: mt76: add a wrapper for wcid access with validation") drivers/net/wireless/mediatek/mt76/mt7996/main.c 3dd6f67c669c ("wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()") 8989d8e90f5f ("wifi: mt76: mt7996: Do not set wcid.sta to 1 in mt7996_mac_sta_event()") net/mac80211/cfg.c 58fcb1b4287c ("wifi: mac80211: reject VHT opmode for unsupported channel widths") 037dc18ac3fb ("wifi: mac80211: add support for storing station S1G capabilities") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11	bpf: Remove location field in tcx_link	Tao Chen
	Use attach_type in bpf_link to replace the location filed, and remove location field in tcx_link. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20250710032038.888700-5-chen.dylane@linux.dev
2025-07-10	Merge tag 'nf-next-25-07-10' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next (v2) The following series contains an initial small batch of Netfilter updates for net-next: 1) Remove DCCP conntrack support, keep DCCP matches around in order to avoid breakage when loading ruleset, add Kconfig to wrap the code so it can be disabled by distributors. 2) Remove buggy code aiming at shrinking netlink deletion event, then re-add it correctly in another patch. This is to prevent -stable to pick up on a fix that breaks old userspace. From Phil Sutter. 3) Missing WARN_ON_ONCE() to check for lockdep_commit_lock_is_held() to uncover bugs. From Fedor Pchelkin. * tag 'nf-next-25-07-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: adjust lockdep assertions handling netfilter: nf_tables: Reintroduce shortened deletion notifications netfilter: nf_tables: Drop dead code from fill_*_info routines netfilter: conntrack: remove DCCP protocol support ==================== Link: https://patch.msgid.link/20250710010706.2861281-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge tag 'wireless-next-2025-07-10' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Quite a bit more work, notably: - mt76: firmware recovery improvements, MLO work - iwlwifi: use embedded PNVM in (to be released) FW images to fix compatibility issues - cfg80211/mac80211: extended regulatory info support (6 GHz) - cfg80211: use "faux device" for regulatory * tag 'wireless-next-2025-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (48 commits) wifi: mac80211: don't complete management TX on SAE commit wifi: cfg80211/mac80211: implement dot11ExtendedRegInfoSupport wifi: mac80211: send extended MLD capa/ops if AP has it wifi: mac80211: copy first_part into HW scan wifi: cfg80211: add a flag for the first part of a scan wifi: mac80211: remove DISALLOW_PUNCTURING_5GHZ code wifi: cfg80211: only verify part of Extended MLD Capabilities wifi: nl80211: make nl80211_check_scan_flags() type safe wifi: cfg80211: hide scan internals wifi: mac80211: fix deactivated link CSA wifi: mac80211: add mandatory bitrate support for 6 GHz wifi: mac80211: remove spurious blank line wifi: mac80211: verify state before connection wifi: mac80211: avoid weird state in error path wifi: iwlwifi: mvm: remove support for iwl_wowlan_info_notif_v4 wifi: iwlwifi: bump minimum API version in BZ wifi: iwlwifi: mvm: remove unneeded argument wifi: iwlwifi: mvm: remove MLO GTK rekey code wifi: iwlwifi: pcie: rename iwl_pci_gen1_2_probe() argument wifi: iwlwifi: match discrete/integrated to fix some names ... ==================== Link: https://patch.msgid.link/20250710123113.24878-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	netfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto()	Eric Dumazet
	syzbot found a potential access to uninit-value in nf_flow_pppoe_proto() Blamed commit forgot the Ethernet header. BUG: KMSAN: uninit-value in nf_flow_offload_inet_hook+0x7e4/0x940 net/netfilter/nf_flow_table_inet.c:27 nf_flow_offload_inet_hook+0x7e4/0x940 net/netfilter/nf_flow_table_inet.c:27 nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline] nf_hook_slow+0xe1/0x3d0 net/netfilter/core.c:623 nf_hook_ingress include/linux/netfilter_netdev.h:34 [inline] nf_ingress net/core/dev.c:5742 [inline] __netif_receive_skb_core+0x4aff/0x70c0 net/core/dev.c:5837 __netif_receive_skb_one_core net/core/dev.c:5975 [inline] __netif_receive_skb+0xcc/0xac0 net/core/dev.c:6090 netif_receive_skb_internal net/core/dev.c:6176 [inline] netif_receive_skb+0x57/0x630 net/core/dev.c:6235 tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485 tun_get_user+0x4ee0/0x6b40 drivers/net/tun.c:1938 tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1984 new_sync_write fs/read_write.c:593 [inline] vfs_write+0xb4b/0x1580 fs/read_write.c:686 ksys_write fs/read_write.c:738 [inline] __do_sys_write fs/read_write.c:749 [inline] Reported-by: syzbot+bf6ed459397e307c3ad2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/686bc073.a00a0220.c7b3.0086.GAE@google.com/T/#u Fixes: 87b3593bed18 ("netfilter: flowtable: validate pppoe header") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20250707124517.614489-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	net: replace ND_PRINTK with dynamic debug	Wang Liang
	ND_PRINTK with val > 1 only works when the ND_DEBUG was set in compilation phase. Replace it with dynamic debug. Convert ND_PRINTK with val <= 1 to net_{err,warn}_ratelimited, and convert the rest to net_dbg_ratelimited. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250708033342.1627636-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.16-rc6). No conflicts. Adjacent changes: Documentation/devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml 0a12c435a1d6 ("dt-bindings: net: sun8i-emac: Add A100 EMAC compatible") b3603c0466a8 ("dt-bindings: net: sun8i-emac: Rename A523 EMAC0 to GMAC0") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt	Jason Xing
	This patch provides a setsockopt method to let applications leverage to adjust how many descs to be handled at most in one send syscall. It mitigates the situation where the default value (32) that is too small leads to higher frequency of triggering send syscall. Considering the prosperity/complexity the applications have, there is no absolutely ideal suggestion fitting all cases. So keep 32 as its default value like before. The patch does the following things: - Add XDP_MAX_TX_SKB_BUDGET socket option. - Set max_tx_budget to 32 by default in the initialization phase as a per-socket granular control. - Set the range of max_tx_budget as [32, xs->tx->nentries]. The idea behind this comes out of real workloads in production. We use a user-level stack with xsk support to accelerate sending packets and minimize triggering syscalls. When the packets are aggregated, it's not hard to hit the upper bound (namely, 32). The moment user-space stack fetches the -EAGAIN error number passed from sendto(), it will loop to try again until all the expected descs from tx ring are sent out to the driver. Enlarging the XDP_MAX_TX_SKB_BUDGET value contributes to less frequency of sendto() and higher throughput/PPS. Here is what I did in production, along with some numbers as follows: For one application I saw lately, I suggested using 128 as max_tx_budget because I saw two limitations without changing any default configuration: 1) XDP_MAX_TX_SKB_BUDGET, 2) socket sndbuf which is 212992 decided by net.core.wmem_default. As to XDP_MAX_TX_SKB_BUDGET, the scenario behind this was I counted how many descs are transmitted to the driver at one time of sendto() based on [1] patch and then I calculated the possibility of hitting the upper bound. Finally I chose 128 as a suitable value because 1) it covers most of the cases, 2) a higher number would not bring evident results. After twisting the parameters, a stable improvement of around 4% for both PPS and throughput and less resources consumption were found to be observed by strace -c -p xxx: 1) %time was decreased by 7.8% 2) error counter was decreased from 18367 to 572 [1]: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/ Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20250704160138.48677-1-kerneljasonxing@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10	net/sched: sch_qfq: Fix null-deref in agg_dequeue	Xiang Mei
	To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c) when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return value before using it, similar to the existing approach in sch_hfsc.c. To avoid code duplication, the following changes are made: 1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static inline function. 2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to include/net/pkt_sched.h so that sch_qfq can reuse it. 3. Applied qdisc_peek_len in agg_dequeue to avoid crashing. Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-09	devlink: Add new "clock_id" generic device param	Ivan Vecera
	Add a new device generic parameter to specify clock ID that should be used by the device for registering DPLL devices and pins. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-5-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	devlink: Add support for u64 parameters	Ivan Vecera
	Only 8, 16 and 32-bit integers are supported for numeric devlink parameters. The subsequent patch adds support for DPLL clock ID that is defined as 64-bit number. Add support for u64 parameter type. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-4-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	wifi: mac80211: don't complete management TX on SAE commit	Johannes Berg
	When SAE commit is sent and received in response, there's no ordering for the SAE confirm messages. As such, don't call drivers to stop listening on the channel when the confirm message is still expected. This fixes an issue if the local confirm is transmitted later than the AP's confirm, for iwlwifi (and possibly mt76) the AP's confirm would then get lost since the device isn't on the channel at the time the AP transmit the confirm. For iwlwifi at least, this also improves the overall timing of the authentication handshake (by about 15ms according to the report), likely since the session protection won't be aborted and rescheduled. Note that even before this, mgd_complete_tx() wasn't always called for each call to mgd_prepare_tx() (e.g. in the case of WEP key shared authentication), and the current drivers that have the complete callback don't seem to mind. Document this as well though. Reported-by: Jan Hendrik Farr <kernel@jfarr.cc> Closes: https://lore.kernel.org/all/aB30Ea2kRG24LINR@archlinux/ Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213232.12691580e140.I3f1d3127acabcd58348a110ab11044213cf147d3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09	wifi: cfg80211: add a flag for the first part of a scan	Benjamin Berg
	When there are no non-6 GHz channels, then the 6 GHz scan is the first part of a split scan. Add a boolean denoting whether the scan is the first part of a scan as it might be useful to drivers for internal bookkeeping. This flag is also set if the scan is not split. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213231.07e5a8a452ec.Ibf18f513e507422078fb31b28947e582a20df87a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09	wifi: mac80211: remove DISALLOW_PUNCTURING_5GHZ code	Johannes Berg
	Since iwlwifi was the only driver using this and no longer does, we can remove all this code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213231.4dff5fb8890f.Ie531f912b252a0042c18c0734db50c3afe1adfb5@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09	wifi: cfg80211: hide scan internals	Johannes Berg
	Hide the internal scan fields from mac80211 and drivers, the 'notified' variable is for internal tracking, and the 'info' is output that's passed to cfg80211_scan_done() and stored only for delayed userspace notification. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213231.6a62e41858e2.I004f66e9c087cc6e6ae4a24951cf470961ee9466@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09	wifi: mac80211: avoid weird state in error path	Miri Korenblit
	If we get to the error path of ieee80211_prep_connection, for example because of a FW issue, then ieee80211_vif_set_links is called with 0. But the call to drv_change_vif_links from ieee80211_vif_update_links will probably fail as well, for the same reason. In this case, the valid_links and active_links bitmaps will be reverted to the value of the failing connection. Then, in the next connection, due to the logic of ieee80211_set_vif_links_bitmaps, valid_links will be set to the ID of the new connection assoc link, but the active_links will remain with the ID of the old connection's assoc link. If those IDs are different, we get into a weird state of valid_links and active_links being different. One of the consequences of this state is to call drv_change_vif_links with new_links as 0, since the & operation between the bitmaps will be 0. Since a removal of a link should always succeed, ignore the return value of drv_change_vif_links if it was called to only remove links, which is the case for the ieee80211_prep_connection's error path. That way, the bitmaps will not be reverted to have the value from the failing connection and will have 0, so the next connection will have a good state. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20250609213231.ba2011fb435f.Id87ff6dab5e1cf757b54094ac2d714c656165059@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-08	af_unix: Introduce SO_INQ.	Kuniyuki Iwashima
	We have an application that uses almost the same code for TCP and AF_UNIX (SOCK_STREAM). TCP can use TCP_INQ, but AF_UNIX doesn't have it and requires an extra syscall, ioctl(SIOCINQ) or getsockopt(SO_MEMINFO) as an alternative. Let's introduce the generic version of TCP_INQ. If SO_INQ is enabled, recvmsg() will put a cmsg of SCM_INQ that contains the exact value of ioctl(SIOCINQ). The cmsg is also included when msg->msg_get_inq is non-zero to make sockets io_uring-friendly. Note that SOCK_CUSTOM_SOCKOPT is flagged only for SOCK_STREAM to override setsockopt() for SOL_SOCKET. By having the flag in struct unix_sock, instead of struct sock, we can later add SO_INQ support for TCP and reuse tcp_sk(sk)->recvmsg_inq. Note also that supporting custom getsockopt() for SOL_SOCKET will need preparation for other SOCK_CUSTOM_SOCKOPT users (UDP, vsock, MPTCP). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-7-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Use cached value for SOCK_STREAM in unix_inq_len().	Kuniyuki Iwashima
	Compared to TCP, ioctl(SIOCINQ) for AF_UNIX SOCK_STREAM socket is more expensive, as unix_inq_len() requires iterating through the receive queue and accumulating skb->len. Let's cache the value for SOCK_STREAM to a new field during sendmsg() and recvmsg(). The field is protected by the receive queue lock. Note that ioctl(SIOCINQ) for SOCK_DGRAM returns the length of the first skb in the queue. SOCK_SEQPACKET still requires iterating through the queue because we do not touch functions shared with unix_dgram_ops. But, if really needed, we can support it by switching __skb_try_recv_datagram() to a custom version. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-5-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Revert "xfrm: destroy xfrm_state synchronously on net exit path"	Sabrina Dubroca
	This reverts commit f75a2804da391571563c4b6b29e7797787332673. With all states (whether user or kern) removed from the hashtables during deletion, there's no need for synchronous destruction of states. xfrm6_tunnel states still need to have been destroyed (which will be the case when its last user is deleted (not destroyed)) so that xfrm6_tunnel_free_spi removes it from the per-netns hashtable before the netns is destroyed. This has the benefit of skipping one synchronize_rcu per state (in __xfrm_state_destroy(sync=true)) when we exit a netns. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-07-08	xfrm: delete x->tunnel as we delete x	Sabrina Dubroca
	The ipcomp fallback tunnels currently get deleted (from the various lists and hashtables) as the last user state that needed that fallback is destroyed (not deleted). If a reference to that user state still exists, the fallback state will remain on the hashtables/lists, triggering the WARN in xfrm_state_fini. Because of those remaining references, the fix in commit f75a2804da39 ("xfrm: destroy xfrm_state synchronously on net exit path") is not complete. We recently fixed one such situation in TCP due to defered freeing of skbs (commit 9b6412e6979f ("tcp: drop secpath at the same time as we currently drop dst")). This can also happen due to IP reassembly: skbs with a secpath remain on the reassembly queue until netns destruction. If we can't guarantee that the queues are flushed by the time xfrm_state_fini runs, there may still be references to a (user) xfrm_state, preventing the timely deletion of the corresponding fallback state. Instead of chasing each instance of skbs holding a secpath one by one, this patch fixes the issue directly within xfrm, by deleting the fallback state as soon as the last user state depending on it has been deleted. Destruction will still happen when the final reference is dropped. A separate lockdep class for the fallback state is required since we're going to lock x->tunnel while x is locked. Fixes: 9d4139c76905 ("netns xfrm: per-netns xfrm_state_all list") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-07-08	net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit()	Eric Dumazet
	tc_action_net_exit() got an rtnl exclusion in commit a159d3c4b829 ("net_sched: acquire RTNL in tc_action_net_exit()") Since then, commit 16af6067392c ("net: sched: implement reference counted action release") made this RTNL exclusion obsolete for most cases. Only tcf_action_offload_del() might still require it. Move the rtnl locking into tcf_idrinfo_destroy() when an offload action is found. Most netns do not have actions, yet deleting them is adding a lot of pressure on RTNL, which is for many the most contended mutex in the kernel. We are moving to a per-netns 'rtnl', so tc_action_net_exit() will not be able to grab 'rtnl' a single time for a batch of netns. Before the patch: perf probe -a rtnl_lock perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.305 MB perf.data (25 samples) ] After the patch: perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.304 MB perf.data (9 samples) ] Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://patch.msgid.link/20250702071230.1892674-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	net: mctp: add gateway routing support	Jeremy Kerr
	This change allows for gateway routing, where a route table entry may reference a routable endpoint (by network and EID), instead of routing directly to a netdevice. We add support for a RTM_GATEWAY attribute for netlink route updates, with an attribute format of: struct mctp_fq_addr { unsigned int net; mctp_eid_t eid; } - we need the net here to uniquely identify the target EID, as we no longer have the device reference directly (which would provide the net id in the case of direct routes). This makes route lookups recursive, as a route lookup that returns a gateway route must be resolved into a direct route (ie, to a device) eventually. We provide a limit to the route lookups, to prevent infinite loop routing. The route lookup populates a new 'nexthop' field in the dst structure, which now specifies the key for the neighbour table lookup on device output, rather than using the packet destination address directly. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-13-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	net: mctp: separate cb from direct-addressing routing	Jeremy Kerr
	Now that we have the dst->haddr populated by sendmsg (when extended addressing is in use), we no longer need to stash the link-layer address in the skb->cb. Instead, only use skb->cb for incoming lladdr data. While we're at it: remove cb->src, as was never used. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-4-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	net: mctp: separate routing database from routing operations	Jeremy Kerr
	This change adds a struct mctp_dst, representing the result of a routing lookup. This decouples the struct mctp_route from the actual implementation of a routing operation. This will allow for future routing changes which may require more involved lookup logic, such as gateway routing - which may require multiple traversals of the routing table. Since we only use the struct mctp_route at lookup time, we no longer hold routes over a routing operation, as we only need it to populate the dst. However, we do hold the dev while the dst is active. This requires some changes to the route test infrastructure, as we no longer have a mock route to handle the route output operation, and transient dsts are created by the routing code, so we can't override them as easily. Instead, we use kunit->priv to stash a packet queue, and a custom dst_output function queues into that packet queue, which we can use for later expectations. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-3-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	net: bonding: add broadcast_neighbor option for 802.3ad	Tonghao Zhang
	Stacking technology is a type of technology used to expand ports on Ethernet switches. It is widely used as a common access method in large-scale Internet data center architectures. Years of practice have proved that stacking technology has advantages and disadvantages in high-reliability network architecture scenarios. For instance, in stacking networking arch, conventional switch system upgrades require multiple stacked devices to restart at the same time. Therefore, it is inevitable that the business will be interrupted for a while. It is for this reason that "no-stacking" in data centers has become a trend. Additionally, when the stacking link connecting the switches fails or is abnormal, the stack will split. Although it is not common, it still happens in actual operation. The problem is that after the split, it is equivalent to two switches with the same configuration appearing in the network, causing network configuration conflicts and ultimately interrupting the services carried by the stacking system. To improve network stability, "non-stacking" solutions have been increasingly adopted, particularly by public cloud providers and tech companies like Alibaba, Tencent, and Didi. "non-stacking" is a method of mimicing switch stacking that convinces a LACP peer, bonding in this case, connected to a set of "non-stacked" switches that all of its ports are connected to a single switch (i.e., LACP aggregator), as if those switches were stacked. This enables the LACP peer's ports to aggregate together, and requires (a) special switch configuration, described in the linked article, and (b) modifications to the bonding 802.3ad (LACP) mode to send all ARP/ND packets across all ports of the active aggregator. Note that, with multiple aggregators, the current broadcast mode logic will send only packets to the selected aggregator(s). +-----------+ +-----------+ \| switch1 \| \| switch2 \| +-----------+ +-----------+ ^ ^ \| \| +-----------------+ \| bond4 lacp \| +-----------------+ \| \| \| NIC1 \| NIC2 +-----------------+ \| server \| +-----------------+ - https://www.ruijie.com/fr-fr/support/tech-gallery/de-stack-data-center-network-architecture/ Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Link: https://patch.msgid.link/84d0a044514157bb856a10b6d03a1028c4883561.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-07	Merge tag 'for-net-2025-07-03' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: Fix not disabling advertising instance - hci_core: Remove check of BDADDR_ANY in hci_conn_hash_lookup_big_state - hci_sync: Fix attempting to send HCI_Disconnect to BIS handle - hci_event: Fix not marking Broadcast Sink BIS as connected * tag 'for-net-2025-07-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_event: Fix not marking Broadcast Sink BIS as connected Bluetooth: hci_sync: Fix attempting to send HCI_Disconnect to BIS handle Bluetooth: hci_core: Remove check of BDADDR_ANY in hci_conn_hash_lookup_big_state Bluetooth: hci_sync: Fix not disabling advertising instance ==================== Link: https://patch.msgid.link/20250703160409.1791514-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07	page_pool: make page_pool_get_dma_addr() just wrap ↵	Byungchul Park
	page_pool_get_dma_addr_netmem() The page pool members in struct page cannot be removed unless it's not allowed to access any of them via struct page. Do not access 'page->dma_addr' directly in page_pool_get_dma_addr() but just wrap page_pool_get_dma_addr_netmem() safely. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20250702053256.4594-6-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07	netmem: use _Generic to cover const casting for page_to_netmem()	Byungchul Park
	The current page_to_netmem() doesn't cover const casting resulting in trying to cast const struct page * to const netmem_ref fails. To cover the case, change page_to_netmem() to use macro and _Generic. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://patch.msgid.link/20250702053256.4594-5-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07	vsock: fix `vsock_proto` declaration	Stefano Garzarella
	From commit 634f1a7110b4 ("vsock: support sockmap"), `struct proto vsock_proto`, defined in af_vsock.c, is not static anymore, since it's used by vsock_bpf.c. If CONFIG_BPF_SYSCALL is not defined, `make C=2` will print a warning: $ make O=build C=2 W=1 net/vmw_vsock/ ... CC [M] net/vmw_vsock/af_vsock.o CHECK ../net/vmw_vsock/af_vsock.c ../net/vmw_vsock/af_vsock.c:123:14: warning: symbol 'vsock_proto' was not declared. Should it be static? Declare `vsock_proto` regardless of CONFIG_BPF_SYSCALL, since it's defined in af_vsock.c, which is built regardless of CONFIG_BPF_SYSCALL. Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20250703112329.28365-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-04	af_unix/scm: fix whitespace errors	Alexander Mikhalitsyn
	Fix whitespace/formatting errors. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-5-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-03	Bluetooth: hci_core: Remove check of BDADDR_ANY in ↵	Luiz Augusto von Dentz
	hci_conn_hash_lookup_big_state The check for destination to be BDADDR_ANY is no longer necessary with the introduction of BIS_LINK. Fixes: 23205562ffc8 ("Bluetooth: separate CIS_LINK and BIS_LINK link types") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-03	netfilter: conntrack: remove DCCP protocol support	Pablo Neira Ayuso
	The DCCP socket family has now been removed from this tree, see: 8bb3212be4b4 ("Merge branch 'net-retire-dccp-socket'") Remove connection tracking and NAT support for this protocol, this should not pose a problem because no DCCP traffic is expected to be seen on the wire. As for the code for matching on dccp header for iptables and nftables, mark it as deprecated and keep it in place. Ruleset restoration is an atomic operation. Without dccp matching support, an astray match on dccp could break this operation leaving your computer with no policy in place, so let's follow a more conservative approach for matches. Add CONFIG_NFT_EXTHDR_DCCP which is set to 'n' by default to deprecate dccp extension support. Similarly, label CONFIG_NETFILTER_XT_MATCH_DCCP as deprecated too and also set it to 'n' by default. Code to match on DCCP protocol from ebtables also remains in place, this is just a few checks on IPPROTO_DCCP from _check() path which is exercised when ruleset is loaded. There is another use of IPPROTO_DCCP from the _check() path in the iptables multiport match. Another check for IPPROTO_DCCP from the packet in the reject target is also removed. So let's schedule removal of the dccp matching for a second stage, this should not interfer with the dccp retirement since this is only matching on the dccp header. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-02	devlink: Extend devlink rate API with traffic classes bandwidth management	Carolina Jubran
	Introduce support for specifying relative bandwidth shares between traffic classes (TC) in the devlink-rate API. This new option allows users to allocate bandwidth across multiple traffic classes in a single command. This feature provides a more granular control over traffic management, especially for scenarios requiring Enhanced Transmission Selection. Users can now define a relative bandwidth share for each traffic class. For example, assigning share values of 20 to TC0 (TCP/UDP) and 80 to TC5 (RoCE) will result in TC0 receiving 20% and TC5 receiving 80% of the total bandwidth. The actual percentage each class receives depends on the ratio of its share value to the sum of all shares. Example: DEV=pci/0000:08:00.0 $ devlink port function rate add $DEV/vfs_group tx_share 10Gbit \ tx_max 50Gbit tc-bw 0:20 1:0 2:0 3:0 4:0 5:80 6:0 7:0 $ devlink port function rate set $DEV/vfs_group \ tc-bw 0:20 1:0 2:0 3:0 4:0 5:20 6:60 7:0 Example usage with ynl: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml \ --do rate-set --json '{ "bus-name": "pci", "dev-name": "0000:08:00.0", "port-index": 1, "rate-tc-bws": [ {"rate-tc-index": 0, "rate-tc-bw": 50}, {"rate-tc-index": 1, "rate-tc-bw": 50}, {"rate-tc-index": 2, "rate-tc-bw": 0}, {"rate-tc-index": 3, "rate-tc-bw": 0}, {"rate-tc-index": 4, "rate-tc-bw": 0}, {"rate-tc-index": 5, "rate-tc-bw": 0}, {"rate-tc-index": 6, "rate-tc-bw": 0}, {"rate-tc-index": 7, "rate-tc-bw": 0} ] }' ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml \ --do rate-get --json '{ "bus-name": "pci", "dev-name": "0000:08:00.0", "port-index": 1 }' output for rate-get: {'bus-name': 'pci', 'dev-name': '0000:08:00.0', 'port-index': 1, 'rate-tc-bws': [{'rate-tc-bw': 50, 'rate-tc-index': 0}, {'rate-tc-bw': 50, 'rate-tc-index': 1}, {'rate-tc-bw': 0, 'rate-tc-index': 2}, {'rate-tc-bw': 0, 'rate-tc-index': 3}, {'rate-tc-bw': 0, 'rate-tc-index': 4}, {'rate-tc-bw': 0, 'rate-tc-index': 5}, {'rate-tc-bw': 0, 'rate-tc-index': 6}, {'rate-tc-bw': 0, 'rate-tc-index': 7}], 'rate-tx-max': 0, 'rate-tx-priority': 0, 'rate-tx-share': 0, 'rate-tx-weight': 0, 'rate-type': 'leaf'} Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	netlink: introduce type-checking attribute iteration for nlmsg	Carolina Jubran
	Add the nlmsg_for_each_attr_type() macro to simplify iteration over attributes of a specific type in a Netlink message. Convert existing users in vxlan and nfsd to use the new macro. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250629142138.361537-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	ipv6: adopt skb_dst_dev() and skb_dst_dev_net[_rcu]() helpers	Eric Dumazet
	Use the new helpers as a step to deal with potential dst->dev races. v2: fix typo in ipv6_rthdr_rcv() (kernel test robot <lkp@intel.com>) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-10-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	ipv6: adopt dst_dev() helper	Eric Dumazet
	Use the new helper as a step to deal with potential dst->dev races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-9-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu]	Eric Dumazet
	Use the new helpers as a first step to deal with potential dst->dev races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: add four helpers to annotate data-races around dst->dev	Eric Dumazet
	dst->dev is read locklessly in many contexts, and written in dst_dev_put(). Fixing all the races is going to need many changes. We probably will have to add full RCU protection. Add three helpers to ease this painful process. static inline struct net_device dst_dev(const struct dst_entry dst) { return READ_ONCE(dst->dev); } static inline struct net_device skb_dst_dev(const struct sk_buff skb) { return dst_dev(skb_dst(skb)); } static inline struct net skb_dst_dev_net(const struct sk_buff skb) { return dev_net(skb_dst_dev(skb)); } static inline struct net skb_dst_dev_net_rcu(const struct sk_buff skb) { return dev_net_rcu(skb_dst_dev(skb)); } Fixes: 4a6ce2b6f2ec ("net: introduce a new function dst_dev_put()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: annotate data-races around dst->output	Eric Dumazet
	dst_dev_put() can overwrite dst->output while other cpus might read this field (for instance from dst_output()) Add READ_ONCE()/WRITE_ONCE() annotations to suppress potential issues. We will likely need RCU protection in the future. Fixes: 4a6ce2b6f2ec ("net: introduce a new function dst_dev_put()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: annotate data-races around dst->input	Eric Dumazet
	dst_dev_put() can overwrite dst->input while other cpus might read this field (for instance from dst_input()) Add READ_ONCE()/WRITE_ONCE() annotations to suppress potential issues. We will likely need full RCU protection later. Fixes: 4a6ce2b6f2ec ("net: introduce a new function dst_dev_put()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: annotate data-races around dst->lastuse	Eric Dumazet
	(dst_entry)->lastuse is read and written locklessly, add corresponding annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: annotate data-races around dst->expires	Eric Dumazet
	(dst_entry)->expires is read and written locklessly, add corresponding annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: dst: annotate data-races around dst->obsolete	Eric Dumazet
	(dst_entry)->obsolete is read locklessly, add corresponding annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250630121934.3399505-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	udp: move udp_memory_allocated into net_aligned_data	Eric Dumazet
	____cacheline_aligned_in_smp attribute only makes sure to align a field to a cache line. It does not prevent the linker to use the remaining of the cache line for other variables, causing potential false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250630093540.3052835-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	tcp: move tcp_memory_allocated into net_aligned_data	Eric Dumazet
	____cacheline_aligned_in_smp attribute only makes sure to align a field to a cache line. It does not prevent the linker to use the remaining of the cache line for other variables, causing potential false sharing. Move tcp_memory_allocated into a dedicated cache line. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250630093540.3052835-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: move net_cookie into net_aligned_data	Eric Dumazet
	Using per-cpu data for net->net_cookie generation is overkill, because even busy hosts do not create hundreds of netns per second. Make sure to put net_cookie in a private cache line to avoid potential false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250630093540.3052835-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-02	net: add struct net_aligned_data	Eric Dumazet
	This structure will hold networking data that must consume a full cache line to avoid accidental false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250630093540.3052835-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>