summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2026-03-06drm/msm/dpu: Correct the SA8775P intr_underrun/intr_underrun indexAbhinav Kumar
The intr_underrun and intr_vsync indices have been swapped, just simply corrects them. Cc: stable@vger.kernel.org Fixes: b139c80d181c ("drm/msm/dpu: Add SA8775P support") Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Yongxing Mou <yongxing.mou@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/709209/ Link: https://lore.kernel.org/r/20260305-mdss_catalog-v5-2-06678ac39ac7@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-03-05drm/msm/a8xx: Fix ubwc config related to swizzlingAkhil P Oommen
To disable l2/l3 swizzling in A8x, set the respective bits in both GRAS_NC_MODE_CNTL and RB_CCU_NC_MODE_CNTL registers. This is required for Glymur where it is recommended to keep l2/l3 swizzling disabled. Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Message-ID: <20260305-a8xx-ubwc-fix-v1-1-d99b6da4c5a9@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2026-03-05accel: ethosu: Handle possible underflow in IFM size calculationsRob Herring (Arm)
If the command stream has larger padding sizes than the IFM and OFM diminsions, then the calculations will underflow to a negative value. The result is a very large region bounds which is caught on submit, but it's better to catch it earlier. Current mesa ethosu driver has a signedness bug which resulted in padding of 127 (the max) and triggers this issue. Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-3-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalarRob Herring (Arm)
The NPU_OP_ELEMENTWISE instruction uses a scalar value for IFM2 if the IFM2_BROADCAST "scalar" mode is set. It is a bit (7) on the u65 and part of a field (bits 3:0) on the u85. The driver was hardcoded to the u85. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-2-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05accel: ethosu: Fix job submit error clean-up refcount underflowsRob Herring (Arm)
If the job submit fails before adding the job to the scheduler queue such as when the GEM buffer bounds checks fail, then doing a ethosu_job_put() results in a pm_runtime_put_autosuspend() without the corresponding pm_runtime_resume_and_get(). The dma_fence_put()'s are also unnecessary, but seem to be harmless. Split the ethosu_job_cleanup() function into 2 parts for the before and after the job is queued. Fixes: 5a5e9c0228e6 ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-1-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-05scftorture: Update due to x86 not supporting none/voluntary preemptionPaul E. McKenney
As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such scftorture scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/20260303235903.1967409-4-paulmck@kernel.org
2026-03-05refscale: Update due to x86 not supporting none/voluntary preemptionPaul E. McKenney
As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such refscale scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/20260303235903.1967409-3-paulmck@kernel.org
2026-03-05rcuscale: Update due to x86 not supporting none/voluntary preemptionPaul E. McKenney
As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such rcuscale scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/20260303235903.1967409-2-paulmck@kernel.org
2026-03-05rcutorture: Update due to x86 not supporting none/voluntary preemptionPaul E. McKenney
As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such rcutorture scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/bfe89f6c-3b63-40c6-aa6d-5f523e3e9a31@paulmck-laptop
2026-03-05tools headers UAPI: Update tools' copy of linux/coresight-pmu.hArnaldo Carvalho de Melo
To get the comment changes in this commit: 171efc70097a9f5f ("x86/ibs: Fix typo in dc_l2tlb_miss comment") This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/amd/ibs.h arch/x86/include/asm/amd/ibs.h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05tools headers: Update the syscall tables and unistd.h, to support the new ↵Arnaldo Carvalho de Melo
'rseq_slice_yield' syscall Picking up the changes from these csets: 2153b2e8917b73e9 ("sparc: Add architecture support for clone3") 99d2592023e5d0a3 ("rseq: Implement sys_rseq_slice_yield()") 4ac286c4a8d904c8 ("s390/syscalls: Switch to generic system call table generation") This makes 'perf trace' support it, now its possible, for instance, to do: # perf trace -e rseq_slice_yield --max-stack=16 Here is an example with the 'sendmmsg' syscall: root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1 0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1 syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) [0x117ce7] (/usr/lib64/libc.so.6 (deleted)) root@x1:~# To do a system wide tracing of the new 'rseq_slice_yield' syscall with a backtrace of at most 16 entries. This addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/scripts/syscall.tbl scripts/syscall.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl diff -u tools/perf/arch/arm/entry/syscalls/syscall.tbl arch/arm/tools/syscall.tbl diff -u tools/perf/arch/sh/entry/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/sparc/entry/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/xtensa/entry/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl Cc: Andreas Larsson <andreas@gaisler.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linuxLinus Torvalds
Pull fsverity fix from Eric Biggers: "Prevent CONFIG_FS_VERITY from being enabled when the page size is 256K, since it doesn't work in that case" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: add dependency on 64K or smaller pages
2026-03-05perf disasm: Fix off-by-one bug in outside checkPeter Collingbourne
If a branch target points to one past the end of a function, the branch should be treated as a branch to another function. This can happen e.g. with a tail call to a function that is laid out immediately after the caller. Fixes: 751b1783da784299 ("perf annotate: Mark jumps to outher functions with the call arrow") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Peter Collingbourne <pcc@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://linux-review.googlesource.com/id/Ide471112e82d68177e0faf08ca411d9fcf0a7bdf Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05tools arch x86: Sync msr-index.h to pick ↵Arnaldo Carvalho de Melo
MSR_{OMR_[0-3],CORE_PERF_GLOBAL_STATUS_SET} To pick up the changes in: 4e955c08d6dc76fb ("perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL") 736a2dcfdae72483 ("x86/CPU/AMD: Simplify the spectral chicken fix") 56bb2736975068cc ("KVM: x86/pmu: Load/put mediated PMU context when entering/exiting guest") Addressing this tools/perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before.txt $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after.txt $ diff -u before.txt after.txt --- before.txt 2026-03-04 17:21:39.165956041 -0300 +++ after.txt 2026-03-04 17:21:52.479191640 -0300 @@ -130,6 +130,11 @@ [0x0000038e] = "CORE_PERF_GLOBAL_STATUS", [0x0000038f] = "CORE_PERF_GLOBAL_CTRL", [0x00000390] = "CORE_PERF_GLOBAL_OVF_CTRL", + [0x00000391] = "CORE_PERF_GLOBAL_STATUS_SET", + [0x000003e0] = "OMR_0", + [0x000003e1] = "OMR_1", + [0x000003e2] = "OMR_2", + [0x000003e3] = "OMR_3", [0x000003f1] = "IA32_PEBS_ENABLE", [0x000003f2] = "PEBS_DATA_CFG", [0x000003f4] = "IA32_PEBS_BASE", $ Now one can use those strings in 'perf trace' to do filtering, e.g.: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==CORE_PERF_GLOBAL_STATUS_SET" Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-03-05Merge tag 'libcrypto-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fixes from Eric Biggers: - Several test fixes: - Fix flakiness in the interrupt context tests in certain VMs - Make the lib/crypto/ KUnit tests depend on the corresponding library options rather than selecting them. This follows the standard KUnit convention, and it fixes an issue where enabling CONFIG_KUNIT_ALL_TESTS pulled in all the crypto library code - Add a kunitconfig file for lib/crypto/ - Fix a couple stale references to "aes-generic" that made it in concurrently with the rename to "aes-lib" - Update the help text for several CRYPTO kconfig options to remove outdated information about users that now use the library instead * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: crypto: testmgr - Fix stale references to aes-generic crypto: Clean up help text for CRYPTO_CRC32 crypto: Clean up help text for CRYPTO_CRC32C crypto: Clean up help text for CRYPTO_XXHASH crypto: Clean up help text for CRYPTO_SHA256 crypto: Clean up help text for CRYPTO_BLAKE2B lib/crypto: tests: Add a .kunitconfig file lib/crypto: tests: Depend on library options rather than selecting them kunit: irq: Ensure timer doesn't fire too frequently
2026-03-05Merge tag 'acpi-7.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support fixes from Rafael Wysocki: - Revert a commit related to ACPI device power management that was not supposed to make any functional difference, but it did so and introduced a regression (Rafael Wysocki) - Update the _CPC object definition in ACPICA to match ACPI 6.6 and prevent the kernel from printing a false-positive warning regarding _CPC output package format on platforms shipping with firmware based on ACPI 6.6 (Saket Dumbre) * tag 'acpi-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: PM: Let acpi_dev_pm_attach() skip devices without ACPI PM" ACPICA: Update the _CPC definition to match ACPI 6.6
2026-03-05Merge tag 'net-7.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, netfilter and wireless. Current release - new code bugs: - sched: cake: fixup cake_mq rate adjustment for diffserv config - wifi: fix missing ieee80211_eml_params member initialization Previous releases - regressions: - tcp: give up on stronger sk_rcvbuf checks (for now) Previous releases - always broken: - net: fix rcu_tasks stall in threaded busypoll - sched: - fq: clear q->band_pkt_count[] in fq_reset() - only allow act_ct to bind to clsact/ingress qdiscs and shared blocks - bridge: check relevant per-VLAN options in VLAN range grouping - xsk: fix fragment node deletion to prevent buffer leak Misc: - spring cleanup of inactive maintainers" * tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits) xdp: produce a warning when calculated tailroom is negative net: enetc: use truesize as XDP RxQ info frag_size libeth, idpf: use truesize as XDP RxQ info frag_size i40e: use xdp.frame_sz as XDP RxQ info frag_size i40e: fix registering XDP RxQ info ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz ice: fix rxq info registering in mbuf packets xsk: introduce helper to determine rxq->frag_size xdp: use modulo operation to calculate XDP frag tailroom selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour net/sched: act_ife: Fix metalist update behavior selftests: net: add test for IPv4 route with loopback IPv6 nexthop net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled MAINTAINERS: remove Thomas Falcon from IBM ibmvnic MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC ...
2026-03-05Merge branch 'acpica'Rafael J. Wysocki
Merge a fix updating the _CPC object definition in ACPICA to avoid printing a false-positive output package format warning on new platforms (Saket Dumbre) * acpica: ACPICA: Update the _CPC definition to match ACPI 6.6
2026-03-05MAINTAINERS: Orphan Altera PCIe controller driverDave Hansen
Email to Joyce Ooi <joyce.ooi@intel.com> now bounces. Remove the address and mark the Altera PCIe controller driver as an orphan for now. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260305171852.3114177-1-dave.hansen@linux.intel.com
2026-03-05workqueue: Add stall detector sample moduleBreno Leitao
Add a sample module under samples/workqueue/stall_detector/ that reproduces a workqueue stall caused by PF_WQ_WORKER misuse. The module queues two work items on the same per-CPU pool, then clears PF_WQ_WORKER and sleeps in wait_event_idle(), hiding from the concurrency manager and stalling the second work item indefinitely. This is useful for testing the workqueue watchdog stall diagnostics. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05workqueue: Show all busy workers in stall diagnosticsBreno Leitao
show_cpu_pool_hog() only prints workers whose task is currently running on the CPU (task_is_running()). This misses workers that are busy processing a work item but are sleeping or blocked — for example, a worker that clears PF_WQ_WORKER and enters wait_event_idle(). Such a worker still occupies a pool slot and prevents progress, yet produces an empty backtrace section in the watchdog output. This is happening on real arm64 systems, where toggle_allocation_gate() IPIs every single CPU in the machine (which lacks NMI), causing workqueue stalls that show empty backtraces because toggle_allocation_gate() is sleeping in wait_event_idle(). Remove the task_is_running() filter so every in-flight worker in the pool's busy_hash is dumped. The busy_hash is protected by pool->lock, which is already held. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05workqueue: Show in-flight work item duration in stall diagnosticsBreno Leitao
When diagnosing workqueue stalls, knowing how long each in-flight work item has been executing is valuable. Add a current_start timestamp (jiffies) to struct worker, set it when a work item begins execution in process_one_work(), and print the elapsed wall-clock time in show_pwq(). Unlike current_at (which tracks CPU runtime and resets on wakeup for CPU-intensive detection), current_start is never reset because the diagnostic cares about total wall-clock time including sleeps. Before: in-flight: 165:stall_work_fn [wq_stall] After: in-flight: 165:stall_work_fn [wq_stall] for 100s Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05workqueue: Rename pool->watchdog_ts to pool->last_progress_tsBreno Leitao
The watchdog_ts name doesn't convey what the timestamp actually tracks. This field tracks the last time a workqueue got progress. Rename it to last_progress_ts to make it clear that it records when the pool last made forward progress (started processing new work items). No functional change. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05workqueue: Use POOL_BH instead of WQ_BH when checking pool flagsBreno Leitao
pr_cont_worker_id() checks pool->flags against WQ_BH, which is a workqueue-level flag (defined in workqueue.h). Pool flags use a separate namespace with POOL_* constants (defined in workqueue.c). The correct constant is POOL_BH. Both WQ_BH and POOL_BH are defined as (1 << 0) so this has no behavioral impact, but it is semantically wrong and inconsistent with every other pool-level BH check in the file. Fixes: 4cb1ef64609f ("workqueue: Implement BH workqueues to eventually replace tasklets") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05accel/amdxdna: Split mailbox channel create functionLizhi Hou
The management channel used for firmware control command submission is currently created after the firmware is started. If channel creation fails (for example, due to memory allocation failure or workqueue creation interruption), the firmware remains in a pending state and is unable to receive any control commands. To avoid leaving the firmware in this inconsistent state, split xdna_mailbox_create_channel() into two separate functions so that resource allocation can be completed before interacting with the hardware. xdna_mailbox_alloc_channel() Allocates memory and initializes the workqueue. This can be called earlier, before interacting with the hardware. xdna_mailbox_start_channel() Performs the hardware interaction required to start the channel. Rename xdna_mailbox_destroy_channel() to xdna_mailbox_free_channel(). Ensure that xdna_mailbox_stop_channel() and xdna_mailbox_free_channel() properly unwind the corresponding start and allocation steps, respectively. Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260305062041.3954024-1-lizhi.hou@amd.com
2026-03-05remoteproc: imx_rproc: Fix unreachable platform prepare_opsPeng Fan
Smatch reports unreachable code in imx_rproc_prepare(), where an early return inside the reserved-memory parsing loop prevents platform prepare_ops from being executed. When of_reserved_mem_region_to_resource() fails, imx_rproc_prepare() returns immediately, so the platform-specific prepare callback is never called. As a result, prepare_ops such as imx_rproc_sm_lmm_prepare() on i.MX95 have no chance to run. This is problematic when Linux controls the M7 Logical Machine and is responsible for preparing resources such as TCM. Without running the platform prepare callback, loading the M7 ELF into TCM may fail if the bootloader did not power up and initialize TCM. Fix this by breaking out of the reserved-memory loop instead of returning, allowing the platform prepare_ops to be executed as intended. Fixes: edd2a9956055 ("remoteproc: imx_rproc: Introduce prepare ops for imx_rproc_dcfg") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-remoteproc/aYYXAa2Fj36XG4yQ@p14s/T/#t Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Link: https://lore.kernel.org/r/20260208-imx-rproc-fix-v1-1-ad74555eb9a4@nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2026-03-05remoteproc: mediatek: Unprepare SCP clock during system suspendTzung-Bi Shih
Prior to commit d935187cfb27 ("remoteproc: mediatek: Break lock dependency to prepare_lock"), `scp->clk` was prepared and enabled only when it needs to communicate with the SCP. The commit d935187cfb27 moved the prepare operation to remoteproc's prepare(), keeping the clock prepared as long as the SCP is running. The power consumption due to the prolonged clock preparation can be negligible when the system is running, as SCP is designed to be a very power efficient processor. However, the clock remains prepared even when the system enters system suspend. This prevents the underlying clock controller (and potentially the parent PLLs) from shutting down, which increases power consumption and may block the system from entering deep sleep states. Add suspend and resume callbacks. Unprepare the clock in suspend() if it was active and re-prepare it in resume() to ensure the clock is properly disabled during system suspend, while maintaining the "always prepared" semantics while the system is active. The driver doesn't implement .attach() callback, hence it only checks for RPROC_RUNNING. Fixes: d935187cfb27 ("remoteproc: mediatek: Break lock dependency to prepare_lock") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20260206033034.3031781-1-tzungbi@kernel.org Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2026-03-05drm/panthor: Correct the order of arguments passed to gem_syncAkash Goel
This commit corrects the order of arguments passed to panthor_gem_sync() function, called when the SYNC_WAIT condition has to be evaluated for a blocked GPU queue. Fixes: cd2c9c3015e6 ("drm/panthor: Add flag to map GEM object Write-Back Cacheable") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patch.msgid.link/20260305110723.2871733-1-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2026-03-05fbdev: au1100fb: Fix build on MIPS64Helge Deller
Fix an error reported by the kernel test robot: au1100fb.c: error: implicit declaration of function 'KSEG1ADDR'; did you mean 'CKSEG1ADDR'? arch/mips/include/asm/addrspace.h defines KSEG1ADDR only for 32 bit configurations. So provide its compile-test stub also for 64bit mips builds. Fixes: 6f366e86481a ("fbdev: au1100fb: Make driver compilable on non-mips platforms") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603042127.PT6LuKqi-lkp@intel.com/ Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
2026-03-05KVM: arm64: Fix page leak in user_mem_abort() on atomic faultFuad Tabba
When a guest performs an atomic/exclusive operation on memory lacking the required attributes, user_mem_abort() injects a data abort and returns early. However, it fails to release the reference to the host page acquired via __kvm_faultin_pfn(). A malicious guest could repeatedly trigger this fault, leaking host page references and eventually causing host memory exhaustion (OOM). Fix this by consolidating the early error returns to a new out_put_page label that correctly calls kvm_release_page_unused(). Fixes: 2937aeec9dc5 ("KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory") Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com> Link: https://patch.msgid.link/20260304162222.836152-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-05Merge tag 'asoc-fix-v7.0-rc2' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v7.0 A moderately large pile of fixes, though none of them are super major, plus a few new quirks and device IDs.
2026-03-05sched_ext: Document task ownership state machineAndrea Righi
The task ownership state machine in sched_ext is quite hard to follow from the code alone. The interaction of ownership states, memory ordering rules and cross-CPU "lock dancing" makes the overall model subtle. Extend the documentation next to scx_ops_state to provide a more structured and self-contained description of the state transitions and their synchronization rules. The new reference should make the code easier to reason about and maintain and can help future contributors understand the overall task-ownership workflow. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05sched_ext: Use READ_ONCE() for lock-free reads of module param variableszhidao su
bypass_lb_cpu() reads scx_bypass_lb_intv_us and scx_slice_bypass_us without holding any lock, in timer callback context where module parameter writes via sysfs can happen concurrently: min_delta_us = scx_bypass_lb_intv_us / SCX_BYPASS_LB_MIN_DELTA_DIV; ^^^^^^^^^^^^^^^^^^^^ plain read -- KCSAN data race if (delta < DIV_ROUND_UP(min_delta_us, scx_slice_bypass_us)) ^^^^^^^^^^^^^^^^^ plain read -- KCSAN data race scx_bypass_lb_intv_us already uses READ_ONCE() in scx_bypass_lb_timerfn() and scx_bypass() for its other lock-free read sites, leaving bypass_lb_cpu() inconsistent. scx_slice_bypass_us has the same lock-free access pattern in the same function. Fix both plain reads by using READ_ONCE() to complete the concurrent access annotation and make the code KCSAN-clean. Signed-off-by: zhidao su <suzhidao@xiaomi.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-05Merge tag 'trace-v7.0-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix thresh_return of function graph tracer The update to store data on the shadow stack removed the abuse of using the task recursion word as a way to keep track of what functions to ignore. The trace_graph_return() was updated to handle this, but when function_graph tracer is using a threshold (only trace functions that took longer than a specified time), it uses trace_graph_thresh_return() instead. This function was still incorrectly using the task struct recursion word causing the function graph tracer to permanently set all functions to "notrace" - Fix thresh_return nosleep accounting When the calltime was moved to the shadow stack storage instead of being on the fgraph descriptor, the calculations for the amount of sleep time was updated. The calculation was done in the trace_graph_thresh_return() function, which also called the trace_graph_return(), which did the calculation again, causing the time to be doubled. Remove the call to trace_graph_return() as what it needed to do wasn't that much, and just do the work in trace_graph_thresh_return(). - Fix syscall trace event activation on boot up The syscall trace events are pseudo events attached to the raw_syscall tracepoints. When the first syscall event is enabled, it enables the raw_syscall tracepoint and doesn't need to do anything when a second syscall event is also enabled. When events are enabled via the kernel command line, syscall events are partially enabled as the enabling is called before rcu_init. This is due to allow early events to be enabled immediately. Because kernel command line events do not distinguish between different types of events, the syscall events are enabled here but are not fully functioning. After rcu_init, they are disabled and re-enabled so that they can be fully enabled. The problem happened is that this "disable-enable" is done one at a time. If more than one syscall event is specified on the command line, by disabling them one at a time, the counter never gets to zero, and the raw_syscall is not disabled and enabled, keeping the syscall events in their non-fully functional state. Instead, disable all events and re-enabled them all, as that will ensure the raw_syscall event is also disabled and re-enabled. - Disable preemption in ftrace pid filtering The ftrace pid filtering attaches to the fork and exit tracepoints to add or remove pids that should be traced. They access variables protected by RCU (preemption disabled). Now that tracepoint callbacks are called with preemption enabled, this protection needs to be added explicitly, and not depend on the functions being called with preemption disabled. - Disable preemption in event pid filtering The event pid filtering needs the same preemption disabling guards as ftrace pid filtering. - Fix accounting of the memory mapped ring buffer on fork Memory mapping the ftrace ring buffer sets the vm_flags to DONTCOPY. But this does not prevent the application from calling madvise(MADVISE_DOFORK). This causes the mapping to be copied on fork. After the first tasks exits, the mapping is considered unmapped by everyone. But when he second task exits, the counter goes below zero and triggers a WARN_ON. Since nothing prevents two separate tasks from mmapping the ftrace ring buffer (although two mappings may mess each other up), there's no reason to stop the memory from being copied on fork. Update the vm_operations to have an ".open" handler to update the accounting and let the ring buffer know someone else has it mapped. - Add all ftrace headers in MAINTAINERS file The MAINTAINERS file only specifies include/linux/ftrace.h But misses ftrace_irq.h and ftrace_regs.h. Make the file use wildcards to get all *ftrace* files. * tag 'trace-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Add MAINTAINERS entries for all ftrace headers tracing: Fix WARN_ON in tracing_buffers_mmap_close tracing: Disable preemption in the tracepoint callbacks handling filtered pids ftrace: Disable preemption in the tracepoint callbacks handling filtered pids tracing: Fix syscall events activation by ensuring refcount hits zero fgraph: Fix thresh_return nosleeptime double-adjust fgraph: Fix thresh_return clear per-task notrace
2026-03-05Merge branch 'Address-XDP-frags-having-negative-tailroom'Jakub Kicinski
Larysa Zaremba says: ==================== Address XDP frags having negative tailroom Aside from the issue described below, tailroom calculation does not account for pages being split between frags, e.g. in i40e, enetc and AF_XDP ZC with smaller chunks. These series address the problem by calculating modulo (skb_frag_off() % rxq->frag_size) in order to get data offset within a smaller block of memory. Please note, xskxceiver tail grow test passes without modulo e.g. in xdpdrv mode on i40e, because there is not enough descriptors to get to flipped buffers. Many ethernet drivers report xdp Rx queue frag size as being the same as DMA write size. However, the only user of this field, namely bpf_xdp_frags_increase_tail(), clearly expects a truesize. Such difference leads to unspecific memory corruption issues under certain circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses all DMA-writable space in 2 buffers. This would be fine, if only rxq->frag_size was properly set to 4K, but value of 3K results in a negative tailroom, because there is a non-zero page offset. We are supposed to return -EINVAL and be done with it in such case, but due to tailroom being stored as an unsigned int, it is reported to be somewhere near UINT_MAX, resulting in a tail being grown, even if the requested offset is too much(it is around 2K in the abovementioned test). This later leads to all kinds of unspecific calltraces. [ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6 [ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4 [ 7340.338179] in libc.so.6[61c9d,7f4161aaf000+160000] [ 7340.339230] in xskxceiver[42b5,400000+69000] [ 7340.340300] likely on CPU 6 (core 0, socket 6) [ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe [ 7340.340888] likely on CPU 3 (core 0, socket 3) [ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7 [ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI [ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy) [ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80 [ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89 [ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202 [ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010 [ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff [ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0 [ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0 [ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500 [ 7340.418229] FS: 0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000 [ 7340.419489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0 [ 7340.421237] PKRU: 55555554 [ 7340.421623] Call Trace: [ 7340.421987] <TASK> [ 7340.422309] ? softleaf_from_pte+0x77/0xa0 [ 7340.422855] swap_pte_batch+0xa7/0x290 [ 7340.423363] zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270 [ 7340.424102] zap_pte_range+0x281/0x580 [ 7340.424607] zap_pmd_range.isra.0+0xc9/0x240 [ 7340.425177] unmap_page_range+0x24d/0x420 [ 7340.425714] unmap_vmas+0xa1/0x180 [ 7340.426185] exit_mmap+0xe1/0x3b0 [ 7340.426644] __mmput+0x41/0x150 [ 7340.427098] exit_mm+0xb1/0x110 [ 7340.427539] do_exit+0x1b2/0x460 [ 7340.427992] do_group_exit+0x2d/0xc0 [ 7340.428477] get_signal+0x79d/0x7e0 [ 7340.428957] arch_do_signal_or_restart+0x34/0x100 [ 7340.429571] exit_to_user_mode_loop+0x8e/0x4c0 [ 7340.430159] do_syscall_64+0x188/0x6b0 [ 7340.430672] ? __do_sys_clone3+0xd9/0x120 [ 7340.431212] ? switch_fpu_return+0x4e/0xd0 [ 7340.431761] ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0 [ 7340.432498] ? do_syscall_64+0xbb/0x6b0 [ 7340.433015] ? __handle_mm_fault+0x445/0x690 [ 7340.433582] ? count_memcg_events+0xd6/0x210 [ 7340.434151] ? handle_mm_fault+0x212/0x340 [ 7340.434697] ? do_user_addr_fault+0x2b4/0x7b0 [ 7340.435271] ? clear_bhb_loop+0x30/0x80 [ 7340.435788] ? clear_bhb_loop+0x30/0x80 [ 7340.436299] ? clear_bhb_loop+0x30/0x80 [ 7340.436812] ? clear_bhb_loop+0x30/0x80 [ 7340.437323] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 7340.437973] RIP: 0033:0x7f4161b14169 [ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f. [ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169 [ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990 [ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff [ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0 [ 7340.444586] </TASK> [ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg [ 7340.449650] ---[ end trace 0000000000000000 ]--- The issue can be fixed in all in-tree drivers, but we cannot just trust OOT drivers to not do this. Therefore, make tailroom a signed int and produce a warning when it is negative to prevent such mistakes in the future. The issue can also be easily reproduced with ice driver, by applying the following diff to xskxceiver and enjoying a kernel panic in xdpdrv mode: diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.c b/tools/testing/selftests/bpf/prog_tests/test_xsk.c index 5af28f359cfd..042d587fa7ef 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_xsk.c +++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.c @@ -2541,8 +2541,8 @@ int testapp_adjust_tail_grow_mb(struct test_spec *test) { test->mtu = MAX_ETH_JUMBO_SIZE; /* Grow by (frag_size - last_frag_Size) - 1 to stay inside the last fragment */ - return testapp_adjust_tail(test, (XSK_UMEM__MAX_FRAME_SIZE / 2) - 1, - XSK_UMEM__LARGE_FRAME_SIZE * 2); + return testapp_adjust_tail(test, XSK_UMEM__MAX_FRAME_SIZE * 100, + 6912); } int testapp_tx_queue_consumer(struct test_spec *test) If we print out the values involved in the tailroom calculation: tailroom = rxq->frag_size - skb_frag_size(frag) - skb_frag_off(frag); 4294967040 = 3456 - 3456 - 256 I personally reproduced and verified the issue in ice and i40e, aside from WiP ixgbevf implementation. ==================== Link: https://patch.msgid.link/20260305111253.2317394-1-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05xdp: produce a warning when calculated tailroom is negativeLarysa Zaremba
Many ethernet drivers report xdp Rx queue frag size as being the same as DMA write size. However, the only user of this field, namely bpf_xdp_frags_increase_tail(), clearly expects a truesize. Such difference leads to unspecific memory corruption issues under certain circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses all DMA-writable space in 2 buffers. This would be fine, if only rxq->frag_size was properly set to 4K, but value of 3K results in a negative tailroom, because there is a non-zero page offset. We are supposed to return -EINVAL and be done with it in such case, but due to tailroom being stored as an unsigned int, it is reported to be somewhere near UINT_MAX, resulting in a tail being grown, even if the requested offset is too much (it is around 2K in the abovementioned test). This later leads to all kinds of unspecific calltraces. [ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6 [ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4 [ 7340.338179] in libc.so.6[61c9d,7f4161aaf000+160000] [ 7340.339230] in xskxceiver[42b5,400000+69000] [ 7340.340300] likely on CPU 6 (core 0, socket 6) [ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe [ 7340.340888] likely on CPU 3 (core 0, socket 3) [ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7 [ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI [ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy) [ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80 [ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89 [ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202 [ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010 [ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff [ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0 [ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0 [ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500 [ 7340.418229] FS: 0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000 [ 7340.419489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0 [ 7340.421237] PKRU: 55555554 [ 7340.421623] Call Trace: [ 7340.421987] <TASK> [ 7340.422309] ? softleaf_from_pte+0x77/0xa0 [ 7340.422855] swap_pte_batch+0xa7/0x290 [ 7340.423363] zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270 [ 7340.424102] zap_pte_range+0x281/0x580 [ 7340.424607] zap_pmd_range.isra.0+0xc9/0x240 [ 7340.425177] unmap_page_range+0x24d/0x420 [ 7340.425714] unmap_vmas+0xa1/0x180 [ 7340.426185] exit_mmap+0xe1/0x3b0 [ 7340.426644] __mmput+0x41/0x150 [ 7340.427098] exit_mm+0xb1/0x110 [ 7340.427539] do_exit+0x1b2/0x460 [ 7340.427992] do_group_exit+0x2d/0xc0 [ 7340.428477] get_signal+0x79d/0x7e0 [ 7340.428957] arch_do_signal_or_restart+0x34/0x100 [ 7340.429571] exit_to_user_mode_loop+0x8e/0x4c0 [ 7340.430159] do_syscall_64+0x188/0x6b0 [ 7340.430672] ? __do_sys_clone3+0xd9/0x120 [ 7340.431212] ? switch_fpu_return+0x4e/0xd0 [ 7340.431761] ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0 [ 7340.432498] ? do_syscall_64+0xbb/0x6b0 [ 7340.433015] ? __handle_mm_fault+0x445/0x690 [ 7340.433582] ? count_memcg_events+0xd6/0x210 [ 7340.434151] ? handle_mm_fault+0x212/0x340 [ 7340.434697] ? do_user_addr_fault+0x2b4/0x7b0 [ 7340.435271] ? clear_bhb_loop+0x30/0x80 [ 7340.435788] ? clear_bhb_loop+0x30/0x80 [ 7340.436299] ? clear_bhb_loop+0x30/0x80 [ 7340.436812] ? clear_bhb_loop+0x30/0x80 [ 7340.437323] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 7340.437973] RIP: 0033:0x7f4161b14169 [ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f. [ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169 [ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990 [ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff [ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0 [ 7340.444586] </TASK> [ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg [ 7340.449650] ---[ end trace 0000000000000000 ]--- The issue can be fixed in all in-tree drivers, but we cannot just trust OOT drivers to not do this. Therefore, make tailroom a signed int and produce a warning when it is negative to prevent such mistakes in the future. Fixes: bf25146a5595 ("bpf: add frags support to the bpf_xdp_adjust_tail() API") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-10-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05net: enetc: use truesize as XDP RxQ info frag_sizeLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects truesize instead of DMA write size. Different assumptions in enetc driver configuration lead to negative tailroom. Set frag_size to the same value as frame_sz. Fixes: 2768b2e2f7d2 ("net: enetc: register XDP RX queues with frag_size") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-9-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05libeth, idpf: use truesize as XDP RxQ info frag_sizeLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buffer size instead of DMA write size. Different assumptions in idpf driver configuration lead to negative tailroom. To make it worse, buffer sizes are not actually uniform in idpf when splitq is enabled, as there are several buffer queues, so rxq->rx_buf_size is meaningless in this case. Use truesize of the first bufq in AF_XDP ZC, as there is only one. Disable growing tail for regular splitq. Fixes: ac8a861f632e ("idpf: prepare structures to support XDP") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-8-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05i40e: use xdp.frame_sz as XDP RxQ info frag_sizeLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buffer size instead of DMA write size. Different assumptions in i40e driver configuration lead to negative tailroom. Set frag_size to the same value as frame_sz in shared pages mode, use new helper to set frag_size when AF_XDP ZC is active. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-7-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05i40e: fix registering XDP RxQ infoLarysa Zaremba
Current way of handling XDP RxQ info in i40e has a problem, where frag_size is not updated when xsk_buff_pool is detached or when MTU is changed, this leads to growing tail always failing for multi-buffer packets. Couple XDP RxQ info registering with buffer allocations and unregistering with cleaning the ring. Fixes: a045d2f2d03d ("i40e: set xdp_rxq_info::frag_size") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-6-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05ice: change XDP RxQ frag_size from DMA write length to xdp.frame_szLarysa Zaremba
The only user of frag_size field in XDP RxQ info is bpf_xdp_frags_increase_tail(). It clearly expects whole buff size instead of DMA write size. Different assumptions in ice driver configuration lead to negative tailroom. This allows to trigger kernel panic, when using XDP_ADJUST_TAIL_GROW_MULTI_BUFF xskxceiver test and changing packet size to 6912 and the requested offset to a huge value, e.g. XSK_UMEM__MAX_FRAME_SIZE * 100. Due to other quirks of the ZC configuration in ice, panic is not observed in ZC mode, but tailroom growing still fails when it should not. Use fill queue buffer truesize instead of DMA write size in XDP RxQ info. Fix ZC mode too by using the new helper. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-5-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05ice: fix rxq info registering in mbuf packetsLarysa Zaremba
XDP RxQ info contains frag_size, which depends on the MTU. This makes the old way of registering RxQ info before calculating new buffer sizes invalid. Currently, it leads to frag_size being outdated, making it sometimes impossible to grow tailroom in a mbuf packet. E.g. fragments are actually 3K+, but frag size is still as if MTU was 1500. Always register new XDP RxQ info after reconfiguring memory pools. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-4-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05xsk: introduce helper to determine rxq->frag_sizeLarysa Zaremba
rxq->frag_size is basically a step between consecutive strictly aligned frames. In ZC mode, chunk size fits exactly, but if chunks are unaligned, there is no safe way to determine accessible space to grow tailroom. Report frag_size to be zero, if chunks are unaligned, chunk_size otherwise. Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-3-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05xdp: use modulo operation to calculate XDP frag tailroomLarysa Zaremba
The current formula for calculating XDP tailroom in mbuf packets works only if each frag has its own page (if rxq->frag_size is PAGE_SIZE), this defeats the purpose of the parameter overall and without any indication leads to negative calculated tailroom on at least half of frags, if shared pages are used. There are not many drivers that set rxq->frag_size. Among them: * i40e and enetc always split page uniformly between frags, use shared pages * ice uses page_pool frags via libeth, those are power-of-2 and uniformly distributed across page * idpf has variable frag_size with XDP on, so current API is not applicable * mlx5, mtk and mvneta use PAGE_SIZE or 0 as frag_size for page_pool As for AF_XDP ZC, only ice, i40e and idpf declare frag_size for it. Modulo operation yields good results for aligned chunks, they are all power-of-2, between 2K and PAGE_SIZE. Formula without modulo fails when chunk_size is 2K. Buffers in unaligned mode are not distributed uniformly, so modulo operation would not work. To accommodate unaligned buffers, we could define frag_size as data + tailroom, and hence do not subtract offset when calculating tailroom, but this would necessitate more changes in the drivers. Define rxq->frag_size as an even portion of a page that fully belongs to a single frag. When calculating tailroom, locate the data start within such portion by performing a modulo operation on page offset. Fixes: bf25146a5595 ("bpf: add frags support to the bpf_xdp_adjust_tail() API") Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-2-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05selftests/tc-testing: Add tests exercising act_ife metalist replace behaviourVictor Nogueira
Add 2 test cases to exercise fix in act_ife's internal metalist behaviour. - Update decode ife action into encode with tcindex metadata - Update decode ife action into encode with multiple metadata Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20260304140603.76500-2-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05net/sched: act_ife: Fix metalist update behaviorJamal Hadi Salim
Whenever an ife action replace changes the metalist, instead of replacing the old data on the metalist, the current ife code is appending the new metadata. Aside from being innapropriate behavior, this may lead to an unbounded addition of metadata to the metalist which might cause an out of bounds error when running the encode op: [ 138.423369][ C1] ================================================================== [ 138.424317][ C1] BUG: KASAN: slab-out-of-bounds in ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.424906][ C1] Write of size 4 at addr ffff8880077f4ffe by task ife_out_out_bou/255 [ 138.425778][ C1] CPU: 1 UID: 0 PID: 255 Comm: ife_out_out_bou Not tainted 7.0.0-rc1-00169-gfbdfa8da05b6 #624 PREEMPT(full) [ 138.425795][ C1] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 138.425800][ C1] Call Trace: [ 138.425804][ C1] <IRQ> [ 138.425808][ C1] dump_stack_lvl (lib/dump_stack.c:122) [ 138.425828][ C1] print_report (mm/kasan/report.c:379 mm/kasan/report.c:482) [ 138.425839][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425844][ C1] ? __virt_addr_valid (./arch/x86/include/asm/preempt.h:95 (discriminator 1) ./include/linux/rcupdate.h:975 (discriminator 1) ./include/linux/mmzone.h:2207 (discriminator 1) arch/x86/mm/physaddr.c:54 (discriminator 1)) [ 138.425853][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425859][ C1] kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:597) [ 138.425868][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425878][ C1] kasan_check_range (mm/kasan/generic.c:186 (discriminator 1) mm/kasan/generic.c:200 (discriminator 1)) [ 138.425884][ C1] __asan_memset (mm/kasan/shadow.c:84 (discriminator 2)) [ 138.425889][ C1] ife_tlv_meta_encode (net/ife/ife.c:168) [ 138.425893][ C1] ? ife_tlv_meta_encode (net/ife/ife.c:171) [ 138.425898][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425903][ C1] ife_encode_meta_u16 (net/sched/act_ife.c:57) [ 138.425910][ C1] ? __pfx_do_raw_spin_lock (kernel/locking/spinlock_debug.c:114) [ 138.425916][ C1] ? __asan_memcpy (mm/kasan/shadow.c:105 (discriminator 3)) [ 138.425921][ C1] ? __pfx_ife_encode_meta_u16 (net/sched/act_ife.c:45) [ 138.425927][ C1] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:221) [ 138.425931][ C1] tcf_ife_act (net/sched/act_ife.c:847 net/sched/act_ife.c:879) To solve this issue, fix the replace behavior by adding the metalist to the ife rcu data structure. Fixes: aa9fd9a325d51 ("sched: act: ife: update parameters via rcu handling") Reported-by: Ruitong Liu <cnitlrt@gmail.com> Tested-by: Ruitong Liu <cnitlrt@gmail.com> Co-developed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260304140603.76500-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05Merge branch ↵Jakub Kicinski
'net-ipv6-fix-panic-when-ipv4-route-references-loopback-ipv6-nexthop-and-add-selftest' Jiayuan Chen says: ==================== net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop and add selftest syzbot reported a kernel panic [1] when an IPv4 route references a loopback IPv6 nexthop object: BUG: unable to handle page fault for address: ffff8d069e7aa000 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 6aa01067 P4D 6aa01067 PUD 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 2 UID: 0 PID: 530 Comm: ping Not tainted 6.19.0+ #193 PREEMPT RIP: 0010:ip_route_output_key_hash_rcu+0x578/0x9e0 RSP: 0018:ffffd2ffc1573918 EFLAGS: 00010286 RAX: ffff8d069e7aa000 RBX: ffffd2ffc1573988 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffd2ffc1573978 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d060d496000 R13: 0000000000000000 R14: ffff8d060399a600 R15: ffff8d06019a6ab8 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8d069e7aa000 CR3: 0000000106eb0001 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> ip_route_output_key_hash+0x86/0x1a0 __ip4_datagram_connect+0x2b5/0x4e0 udp_connect+0x2c/0x60 inet_dgram_connect+0x88/0xd0 __sys_connect_file+0x56/0x90 __sys_connect+0xa8/0xe0 __x64_sys_connect+0x18/0x30 x64_sys_call+0xfb9/0x26e0 do_syscall_64+0xd3/0x1510 entry_SYSCALL_64_after_hwframe+0x76/0x7e Reproduction: ip -6 nexthop add id 100 dev lo ip route add 172.20.20.0/24 nhid 100 ping -c1 172.20.20.1 # kernel crash Problem Description When a standalone IPv6 nexthop object is created with a loopback device, fib6_nh_init() misclassifies it as a reject route. Nexthop objects have no destination prefix (fc_dst=::), so fib6_is_reject() always matches any loopback nexthop. The reject path skips fib_nh_common_init(), leaving nhc_pcpu_rth_output unallocated. When an IPv4 route later references this nexthop and triggers a route lookup, __mkroute_output() calls raw_cpu_ptr(nhc->nhc_pcpu_rth_output) on a NULL pointer, causing a page fault. The reject classification was designed for regular IPv6 routes to prevent kernel routing loops, but nexthop objects should not be subject to this check since they carry no destination information. Loop prevention is handled separately when the route itself is created. [1] https://syzkaller.appspot.com/bug?extid=334190e097a98a1b81bb ==================== Link: https://patch.msgid.link/20260304113817.294966-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05selftests: net: add test for IPv4 route with loopback IPv6 nexthopJiayuan Chen
Add a regression test for a kernel panic that occurs when an IPv4 route references an IPv6 nexthop object created on the loopback device. The test creates an IPv6 nexthop on lo, binds an IPv4 route to it, then triggers a route lookup via ping to verify the kernel does not crash. ./fib_nexthops.sh Tests passed: 249 Tests failed: 0 Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260304113817.294966-3-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthopJiayuan Chen
When a standalone IPv6 nexthop object is created with a loopback device (e.g., "ip -6 nexthop add id 100 dev lo"), fib6_nh_init() misclassifies it as a reject route. This is because nexthop objects have no destination prefix (fc_dst=::), causing fib6_is_reject() to match any loopback nexthop. The reject path skips fib_nh_common_init(), leaving nhc_pcpu_rth_output unallocated. If an IPv4 route later references this nexthop, __mkroute_output() dereferences NULL nhc_pcpu_rth_output and panics. Simplify the check in fib6_nh_init() to only match explicit reject routes (RTF_REJECT) instead of using fib6_is_reject(). The loopback promotion heuristic in fib6_is_reject() is handled separately by ip6_route_info_create_nh(). After this change, the three cases behave as follows: 1. Explicit reject route ("ip -6 route add unreachable 2001:db8::/64"): RTF_REJECT is set, enters reject path, skips fib_nh_common_init(). No behavior change. 2. Implicit loopback reject route ("ip -6 route add 2001:db8::/32 dev lo"): RTF_REJECT is not set, takes normal path, fib_nh_common_init() is called. ip6_route_info_create_nh() still promotes it to reject afterward. nhc_pcpu_rth_output is allocated but unused, which is harmless. 3. Standalone nexthop object ("ip -6 nexthop add id 100 dev lo"): RTF_REJECT is not set, takes normal path, fib_nh_common_init() is called. nhc_pcpu_rth_output is properly allocated, fixing the crash when IPv4 routes reference this nexthop. Suggested-by: Ido Schimmel <idosch@nvidia.com> Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Reported-by: syzbot+334190e097a98a1b81bb@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/698f8482.a70a0220.2c38d7.00ca.GAE@google.com/T/ Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260304113817.294966-2-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabledFernando Fernandez Mancera
When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. If an IPv6 packet is injected into the interface, route_shortcircuit() is called and a NULL pointer dereference happens on neigh_lookup(). BUG: kernel NULL pointer dereference, address: 0000000000000380 Oops: Oops: 0000 [#1] SMP NOPTI [...] RIP: 0010:neigh_lookup+0x20/0x270 [...] Call Trace: <TASK> vxlan_xmit+0x638/0x1ef0 [vxlan] dev_hard_start_xmit+0x9e/0x2e0 __dev_queue_xmit+0xbee/0x14e0 packet_sendmsg+0x116f/0x1930 __sys_sendto+0x1f5/0x200 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x12f/0x1590 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix this by adding an early check on route_shortcircuit() when protocol is ETH_P_IPV6. Note that ipv6_mod_enabled() cannot be used here because VXLAN can be built-in even when IPv6 is built as a module. Fixes: e15a00aafa4b ("vxlan: add ipv6 route short circuit support") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260304120357.9778-2-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>