summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2026-02-24RDMA/ionic: Fix potential NULL pointer dereference in ionic_query_portKamal Heib
The function ionic_query_port() calls ib_device_get_netdev() without checking the return value which could lead to NULL pointer dereference, Fix it by checking the return value and return -ENODEV if the 'ndev' is NULL. Fixes: 2075bbe8ef03 ("RDMA/ionic: Register device ops for miscellaneous functionality") Signed-off-by: Kamal Heib <kheib@redhat.com> Link: https://patch.msgid.link/20260220222125.16973-2-kheib@redhat.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24pinctrl: sunxi: Implement gpiochip::get_direction()Chen-Yu Tsai
After commit 471e998c0e31 ("gpiolib: remove redundant callback check"), a warning will be printed if the gpio driver does not implement this callback. The warning was added in commit e623c4303ed1 ("gpiolib: sanitize the return value of gpio_chip::get_direction()"), but was masked by the "redundant" check. The warning can be triggered by any action that calls the callback, such as dumping the GPIO state from /sys/kernel/debug/gpio. Implement it for the sunxi driver. This is simply a matter of reading out the mux value from the registers, then checking if it is one of the GPIO functions and which direction it is. Signed-off-by: Chen-Yu Tsai <wens@kernel.org> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: rockchip: Fix configuring a deferred pinKrzysztof Kozlowski
Commit e2c58cbe3aff ("pinctrl: rockchip: Simplify locking with scoped_guard()") added a scoped_guard() over existing code containing a "break" instruction. That "break" was for the outer (existing) for-loop, which now exits inner, scoped_guard() loop. If GPIO driver did not probe, then driver will not bail out, but instead continue to configure the pin. Fix the issue by simplifying the code - the break in original code was leading directly to end of the function returning 0, thus we can simply return here rockchip_pinconf_defer_pin status. Reported-by: David Lechner <dlechner@baylibre.com> Closes: https://lore.kernel.org/r/f5b38942-a584-4e78-a893-de4a219070b2@baylibre.com/ Fixes: e2c58cbe3aff ("pinctrl: rockchip: Simplify locking with scoped_guard()") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: cirrus: cs42l43: Fix double-put in cs42l43_pin_probe()Felix Gu
devm_add_action_or_reset() already invokes the action on failure, so the explicit put causes a double-put. Fixes: 9b07cdf86a0b ("pinctrl: cirrus: Fix fwnode leak in cs42l43_pin_probe()") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: meson: amlogic-a4: Fix device node reference leak in ↵Felix Gu
aml_dt_node_to_map_pinmux() The of_get_parent() function returns a device_node with an incremented reference count. Use the __free(device_node) cleanup attribute to ensure of_node_put() is automatically called when pnode goes out of scope, fixing a reference leak. Fixes: 6e9be3abb78c ("pinctrl: Add driver support for Amlogic SoCs") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: qcom: sdm660-lpass-lpi: Make groups and functions variables staticKrzysztof Kozlowski
File-scope 'sdm660_lpi_pinctrl_groups' and 'sdm660_lpi_pinctrl_functions' are not used outside of this unit, so make them static to silence sparse warnings: pinctrl-sdm660-lpass-lpi.c:79:27: warning: symbol 'sdm660_lpi_pinctrl_groups' was not declared. Should it be static? pinctrl-sdm660-lpass-lpi.c:116:27: warning: symbol 'sdm660_lpi_pinctrl_functions' was not declared. Should it be static? Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: cix: sky1: Unexport sky1_pinctrl_pm_opsKrzysztof Kozlowski
File-scope 'sky1_pinctrl_pm_ops' is not used outside of this unit (and it should not be!), so unexport it and make it static to silence sparse warning: pinctrl-sky1.c:525:25: warning: symbol 'sky1_pinctrl_pm_ops' was not declared. Should it be static? Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: amdisp: Make amdisp_pinctrl_ops variable staticKrzysztof Kozlowski
File-scope 'amdisp_pinctrl_ops' is not used outside of this unit, so make it static to silence sparse warning: pinctrl-amdisp.c:83:26: warning: symbol 'amdisp_pinctrl_ops' was not declared. Should it be static? Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24pinctrl: pinconf-generic: Fix memory leak in pinconf_generic_parse_dt_config()Felix Gu
In pinconf_generic_parse_dt_config(), if parse_dt_cfg() fails, it returns directly. This bypasses the cleanup logic and results in a memory leak of the cfg buffer. Fix this by jumping to the out label on failure, ensuring kfree(cfg) is called before returning. Fixes: 90a18c512884 ("pinctrl: pinconf-generic: Handle string values for generic properties") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Antonio Borneo <antonio.borneo@foss.st.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-02-24netconsole: avoid OOB reads, msg is not nul-terminatedJakub Kicinski
msg passed to netconsole from the console subsystem is not guaranteed to be nul-terminated. Before recent commit 7eab73b18630 ("netconsole: convert to NBCON console infrastructure") the message would be placed in printk_shared_pbufs, a static global buffer, so KASAN had harder time catching OOB accesses. Now we see: printk: console [netcon_ext0] enabled BUG: KASAN: slab-out-of-bounds in string+0x1f7/0x240 Read of size 1 at addr ffff88813b6d4c00 by task pr/netcon_ext0/594 CPU: 65 UID: 0 PID: 594 Comm: pr/netcon_ext0 Not tainted 6.19.0-11754-g4246fd6547c9 Call Trace: kasan_report+0xe4/0x120 string+0x1f7/0x240 vsnprintf+0x655/0xba0 scnprintf+0xba/0x120 netconsole_write+0x3fe/0xa10 nbcon_emit_next_record+0x46e/0x860 nbcon_kthread_func+0x623/0x750 Allocated by task 1: nbcon_alloc+0x1ea/0x450 register_console+0x26b/0xe10 init_netconsole+0xbb0/0xda0 The buggy address belongs to the object at ffff88813b6d4000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 0 bytes to the right of allocated 3072-byte region [ffff88813b6d4000, ffff88813b6d4c00) Fixes: c62c0a17f9b7 ("netconsole: Append kernel version to message") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260219195021.2099699-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-24net: wan: farsync: Fix use-after-free bugs caused by unfinished taskletsDuoming Zhou
When the FarSync T-series card is being detached, the fst_card_info is deallocated in fst_remove_one(). However, the fst_tx_task or fst_int_task may still be running or pending, leading to use-after-free bugs when the already freed fst_card_info is accessed in fst_process_tx_work_q() or fst_process_int_work_q(). A typical race condition is depicted below: CPU 0 (cleanup) | CPU 1 (tasklet) | fst_start_xmit() fst_remove_one() | tasklet_schedule() unregister_hdlc_device()| | fst_process_tx_work_q() //handler kfree(card) //free | do_bottom_half_tx() | card-> //use The following KASAN trace was captured: ================================================================== BUG: KASAN: slab-use-after-free in do_bottom_half_tx+0xb88/0xd00 Read of size 4 at addr ffff88800aad101c by task ksoftirqd/3/32 ... Call Trace: <IRQ> dump_stack_lvl+0x55/0x70 print_report+0xcb/0x5d0 ? do_bottom_half_tx+0xb88/0xd00 kasan_report+0xb8/0xf0 ? do_bottom_half_tx+0xb88/0xd00 do_bottom_half_tx+0xb88/0xd00 ? _raw_spin_lock_irqsave+0x85/0xe0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? __pfx___hrtimer_run_queues+0x10/0x10 fst_process_tx_work_q+0x67/0x90 tasklet_action_common+0x1fa/0x720 ? hrtimer_interrupt+0x31f/0x780 handle_softirqs+0x176/0x530 __irq_exit_rcu+0xab/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 ... Allocated by task 41 on cpu 3 at 72.330843s: kasan_save_stack+0x24/0x50 kasan_save_track+0x17/0x60 __kasan_kmalloc+0x7f/0x90 fst_add_one+0x1a5/0x1cd0 local_pci_probe+0xdd/0x190 pci_device_probe+0x341/0x480 really_probe+0x1c6/0x6a0 __driver_probe_device+0x248/0x310 driver_probe_device+0x48/0x210 __device_attach_driver+0x160/0x320 bus_for_each_drv+0x101/0x190 __device_attach+0x198/0x3a0 device_initial_probe+0x78/0xa0 pci_bus_add_device+0x81/0xc0 pci_bus_add_devices+0x7e/0x190 enable_slot+0x9b9/0x1130 acpiphp_check_bridge.part.0+0x2e1/0x460 acpiphp_hotplug_notify+0x36c/0x3c0 acpi_device_hotplug+0x203/0xb10 acpi_hotplug_work_fn+0x59/0x80 ... Freed by task 41 on cpu 1 at 75.138639s: kasan_save_stack+0x24/0x50 kasan_save_track+0x17/0x60 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x43/0x70 kfree+0x135/0x410 fst_remove_one+0x2ca/0x540 pci_device_remove+0xa6/0x1d0 device_release_driver_internal+0x364/0x530 pci_stop_bus_device+0x105/0x150 pci_stop_and_remove_bus_device+0xd/0x20 disable_slot+0x116/0x260 acpiphp_disable_and_eject_slot+0x4b/0x190 acpiphp_hotplug_notify+0x230/0x3c0 acpi_device_hotplug+0x203/0xb10 acpi_hotplug_work_fn+0x59/0x80 ... The buggy address belongs to the object at ffff88800aad1000 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 28 bytes inside of freed 1024-byte region The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xaad0 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x100000000000040(head|node=0|zone=1) page_type: f5(slab) raw: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000 head: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000 head: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000 head: 0100000000000003 ffffea00002ab401 00000000ffffffff 00000000ffffffff head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88800aad0f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800aad0f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88800aad1000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88800aad1080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88800aad1100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fix this by ensuring that both fst_tx_task and fst_int_task are properly canceled before the fst_card_info is released. Add tasklet_kill() in fst_remove_one() to synchronize with any pending or running tasklets. Since unregister_hdlc_device() stops data transmission and reception, and fst_disable_intr() prevents further interrupts, it is appropriate to place tasklet_kill() after these calls. The bugs were identified through static analysis. To reproduce the issue and validate the fix, a FarSync T-series card was simulated in QEMU and delays(e.g., mdelay()) were introduced within the tasklet handler to increase the likelihood of triggering the race condition. Fixes: 2f623aaf9f31 ("net: farsync: Fix kmemleak when rmmods farsync") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260219124637.72578-1-duoming@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-24RDMA/bng_re: Unwind bng_re_dev_init properlySiva Reddy Kallam
Fix below smatch warning: drivers/infiniband/hw/bng_re/bng_dev.c:270 bng_re_dev_init() warn: missing unwind goto? Current bng_re_dev_init function is not having clear unwinding code. So, added proper unwinding with ladder. Fixes: 4f830cd8d7fe ("RDMA/bng_re: Add infrastructure for enabling Firmware channel") Reported-by: Simon Horman <horms@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202601010413.sWadrQel-lkp@intel.com/ Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Link: https://patch.msgid.link/20260218091246.1764808-3-siva.kallam@broadcom.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-24RDMA/bng_re: Remove unnessary validity checksSiva Reddy Kallam
Fix below smatch warning: drivers/infiniband/hw/bng_re/bng_dev.c:113 bng_re_net_ring_free() warn: variable dereferenced before check 'rdev' (see line 107) current driver has unnessary validity checks. So, removing these unnessary validity checks. Fixes: 4f830cd8d7fe ("RDMA/bng_re: Add infrastructure for enabling Firmware channel") Fixes: 745065770c2d ("RDMA/bng_re: Register and get the resources from bnge driver") Fixes: 04e031ff6e60 ("RDMA/bng_re: Initialize the Firmware and Hardware") Fixes: d0da769c19d0 ("RDMA/bng_re: Add Auxiliary interface") Reported-by: Simon Horman <horms@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202601010413.sWadrQel-lkp@intel.com/ Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Link: https://patch.msgid.link/20260218091246.1764808-2-siva.kallam@broadcom.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-24RDMA/core: Fix stale RoCE GIDs during netdev events at registrationJiri Pirko
RoCE GID entries become stale when netdev properties change during the IB device registration window. This is reproducible with a udev rule that sets a MAC address when a VF netdev appears: ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth4", \ RUN+="/sbin/ip link set eth4 address 88:22:33:44:55:66" After VF creation, show_gids displays GIDs derived from the original random MAC rather than the configured one. The root cause is a race between netdev event processing and device registration: CPU 0 (driver) CPU 1 (udev/workqueue) ────────────── ────────────────────── ib_register_device() ib_cache_setup_one() gid_table_setup_one() _gid_table_setup_one() ← GID table allocated rdma_roce_rescan_device() ← GIDs populated with OLD MAC ip link set eth4 addr NEW_MAC NETDEV_CHANGEADDR queued netdevice_event_work_handler() ib_enum_all_roce_netdevs() ← Iterates DEVICE_REGISTERED ← Device NOT marked yet, SKIP! enable_device_and_get() xa_set_mark(DEVICE_REGISTERED) ← Too late, event was lost The netdev event handler uses ib_enum_all_roce_netdevs() which only iterates devices marked DEVICE_REGISTERED. However, this mark is set late in the registration process, after the GID cache is already populated. Events arriving in this window are silently dropped. Fix this by introducing a new xarray mark DEVICE_GID_UPDATES that is set immediately after the GID table is allocated and initialized. Use the new mark in ib_enum_all_roce_netdevs() function to iterate devices instead of DEVICE_REGISTERED. This is safe because: - After _gid_table_setup_one(), all required structures exist (port_data, immutable, cache.gid) - The GID table mutex serializes concurrent access between the initial rescan and event handlers - Event handlers correctly update stale GIDs even when racing with rescan - The mark is cleared in ib_cache_cleanup_one() before teardown This also fixes similar races for IP address events (inetaddr_event, inet6addr_event) which use the same enumeration path. Fixes: 0df91bb67334 ("RDMA/devices: Use xarray to store the client_data") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260127093839.126291-1-jiri@resnulli.us Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24HID: multitouch: new class MT_CLS_EGALAX_P80H84Ian Ray
Fixes: f9e82295eec1 ("HID: multitouch: add eGalaxTouch P80H84 support") Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2026-02-24HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2Julius Lehmann
Battery reporting does not work for the Apple Magic Trackpad 2 if it is connected via USB. The current hid descriptor fixup code checks for a hid descriptor length of exactly 83 bytes. If the hid descriptor is larger, which is the case for newer apple mice, the fixup is not applied. This fix checks for hid descriptor sizes greater/equal 83 bytes which applies the fixup for newer devices as well. Signed-off-by: Julius Lehmann <lehmanju@devpi.de> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2026-02-24drm/msm/dsi: fix hdisplay calculation when programming dsi registersPengyu Luo
Recently, the hdisplay calculation is working for 3:1 compressed ratio only. If we have a video panel with DSC BPP = 8, and BPC = 10, we still use the default bits_per_pclk = 24, then we get the wrong hdisplay. We can draw the conclusion by cross-comparing the calculation with the calculation in dsi_adjust_pclk_for_compression(). Since CMD mode does not use this, we can remove !(msm_host->mode_flags & MIPI_DSI_MODE_VIDEO) safely. Fixes: efcbd6f9cdeb ("drm/msm/dsi: Enable widebus for DSI") Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/704822/ Link: https://lore.kernel.org/r/20260214105145.105308-1-mitltlatltl@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-23dpll: zl3073x: fix REF_PHASE_OFFSET_COMP register width for some chip IDsIvan Vecera
The REF_PHASE_OFFSET_COMP register is 48-bit wide on most zl3073x chip variants, but only 32-bit wide on chip IDs 0x0E30, 0x0E93..0x0E97 and 0x1F60. The driver unconditionally uses 48-bit read/write operations, which on 32-bit variants causes reading 2 bytes past the register boundary (corrupting the value) and writing 2 bytes into the adjacent register. Fix this by storing the chip ID in the device structure during probe and adding a helper to detect the affected variants. Use the correct register width for read/write operations and the matching sign extension bit (31 vs 47) when interpreting the phase compensation value. Fixes: 6287262f761e ("dpll: zl3073x: Add support to adjust phase") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260220155755.448185-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-23gve: fix incorrect buffer cleanup in gve_tx_clean_pending_packets for QPLAnkit Garg
In DQ-QPL mode, gve_tx_clean_pending_packets() incorrectly uses the RDA buffer cleanup path. It iterates num_bufs times and attempts to unmap entries in the dma array. This leads to two issues: 1. The dma array shares storage with tx_qpl_buf_ids (union). Interpreting buffer IDs as DMA addresses results in attempting to unmap incorrect memory locations. 2. num_bufs in QPL mode (counting 2K chunks) can significantly exceed the size of the dma array, causing out-of-bounds access warnings (trace below is how we noticed this issue). UBSAN: array-index-out-of-bounds in drivers/net/ethernet/drivers/net/ethernet/google/gve/gve_tx_dqo.c:178:5 index 18 is out of range for type 'dma_addr_t[18]' (aka 'unsigned long long[18]') Workqueue: gve gve_service_task [gve] Call Trace: <TASK> dump_stack_lvl+0x33/0xa0 __ubsan_handle_out_of_bounds+0xdc/0x110 gve_tx_stop_ring_dqo+0x182/0x200 [gve] gve_close+0x1be/0x450 [gve] gve_reset+0x99/0x120 [gve] gve_service_task+0x61/0x100 [gve] process_scheduled_works+0x1e9/0x380 Fix this by properly checking for QPL mode and delegating to gve_free_tx_qpl_bufs() to reclaim the buffers. Cc: stable@vger.kernel.org Fixes: a6fb8d5a8b69 ("gve: Tx path for DQO-QPL") Signed-off-by: Ankit Garg <nktgrg@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260220215324.1631350-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-24ata: libata-core: fix cancellation of a port deferred qc workDamien Le Moal
cancel_work_sync() is a sleeping function so it cannot be called with the spin lock of a port being held. Move the call to this function in ata_port_detach() after EH completes, with the port lock released, together with other work cancellation calls. Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
2026-02-24ata: libata-eh: correctly handle deferred qc timeoutsDamien Le Moal
A deferred qc may timeout while waiting for the device queue to drain to be submitted. In such case, since the qc is not active, ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(), which frees the qc. But as the port deferred_qc field still references this finished/freed qc, the deferred qc work may eventually attempt to call ata_qc_issue() against this invalid qc, leading to errors such as reported by UBSAN (syzbot run): UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24 shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int' ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120 ubsan_epilogue+0xa/0x30 lib/ubsan.c:233 __ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494 ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166 ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679 process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275 process_scheduled_works kernel/workqueue.c:3358 [inline] worker_thread+0x5da/0xe40 kernel/workqueue.c:3439 kthread+0x370/0x450 kernel/kthread.c:467 ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Fix this by checking if the qc of a timed out SCSI command is a deferred one, and in such case, clear the port deferred_qc field and finish the SCSI command with DID_TIME_OUT. Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
2026-02-24drm/msm/dpu: Don't use %pK through printk (again)Thomas Weißschuh
In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. This was previously fixed in this driver in commit 1ba9fbe40337 ("drm/msm: Don't use %pK through printk") but an additional usage was reintroduced in commit 39a750ff5fc9 ("drm/msm/dpu: Add DSPP GC driver to provide GAMMA_LUT DRM property") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Fixes: 39a750ff5fc9 ("drm/msm/dpu: Add DSPP GC driver to provide GAMMA_LUT DRM property") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/706229/ Link: https://lore.kernel.org/r/20260223-restricted-pointers-msm-v1-1-14c0b451e372@linutronix.de Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-24Revert "drm/msm/dpu: try reserving the DSPP-less LM first"Dmitry Baryshkov
This reverts commit 42f62cd79578 ("drm/msm/dpu: try reserving the DSPP-less LM first"). It seems on later DPUs using higher LMs require some additional setup or conflicts with the hardware defaults. Val (and other developers) reported blue screen on Hamoa (X1E80100) laptops. Revert the offending commit until we understand, what is the issue. Fixes: 42f62cd79578 ("drm/msm/dpu: try reserving the DSPP-less LM first") Reported-by: Val Packett <val@packett.cool> Closes: https://lore.kernel.org/r/33424a9d-10a6-4479-bba6-12f8ce60da1a@packett.cool Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Tested-by: Manivannan Sadhasivam <mani@kernel.org> # T14s Patchwork: https://patchwork.freedesktop.org/patch/704814/ Link: https://lore.kernel.org/r/20260214-revert-dspp-less-v1-1-be0d636a2a6e@oss.qualcomm.com
2026-02-24drm/msm/dpu: Fix smatch warnings about variable dereferenced before checksunliming
Fix below smatch warnings: drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp_v13.c:161 dpu_hw_sspp_setup_pe_config_v13() warn: variable dereferenced before check 'ctx' (see line 159) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202601252214.oEaY3UZM-lkp@intel.com/ Signed-off-by: sunliming <sunliming@kylinos.cn> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/701853/ Link: https://lore.kernel.org/r/20260130053615.24886-1-sunliming@linux.dev Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-24drm/msm: Adjust msm_iommu_pagetable_prealloc_allocate() allocation typeKees Cook
In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void *", which can be implicitly cast to any pointer type.) The assigned type is "void **" but the returned type will be "void ***". These are the same allocation size (pointer size), but the types do not match. Adjust the allocation type to match the assignment. Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/703588/ Link: https://lore.kernel.org/r/20260206222151.work.016-kees@kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-24drm/msm/dpu: Fix LM size on a number of platformsKonrad Dybcio
The register space has grown with what seems to be DPU8. Bump up the .len to match. Fixes: e3b1f369db5a ("drm/msm/dpu: Add X1E80100 support") Fixes: 4a352c2fc15a ("drm/msm/dpu: Introduce SC8280XP") Fixes: efcd0107727c ("drm/msm/dpu: add support for SM8550") Fixes: 100d7ef6995d ("drm/msm/dpu: add support for SM8450") Fixes: 178575173472 ("drm/msm/dpu: add catalog entry for SAR2130P") Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/701063/ Link: https://lore.kernel.org/r/20260127-topic-lm_size_fix-v1-1-25f88d014dfd@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-24drm/msm/adreno: Add GPU to MODULE_DEVICE_TABLEAkhil P Oommen
Since it is possible to independently probe Adreno GPU, add GPU match table to MODULE_DEVICE_TABLE to allow auto-loading of msm module. Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/700656/ Link: https://lore.kernel.org/r/20260124-adreno-module-table-v1-1-9c2dbb2638b4@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2026-02-23Input: synaptics-rmi4 - fix a locking bug in an error pathBart Van Assche
Lock f54->data_mutex when entering the function statement since jumping to the 'error' label when checking report_size fails causes that mutex to be unlocked. This bug has been detected by the Clang thread-safety checker. Fixes: 3a762dbd5347 ("[media] Input: synaptics-rmi4 - add support for F54 diagnostics") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20260223215118.2154194-16-bvanassche@acm.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-02-23drm/xe/sync: Fix user fence leak on alloc failureShuicheng Lin
When dma_fence_chain_alloc() fails, properly release the user fence reference to prevent a memory leak. Fixes: 0995c2fc39b0 ("drm/xe: Enforce correct user fence signaling order using") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260219233516.2938172-6-shuicheng.lin@intel.com (cherry picked from commit a5d5634cde48a9fcd68c8504aa07f89f175074a0) Cc: stable@vger.kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-23drm/xe/sync: Cleanup partially initialized sync on parse failureShuicheng Lin
xe_sync_entry_parse() can allocate references (syncobj, fence, chain fence, or user fence) before hitting a later failure path. Several of those paths returned directly, leaving partially initialized state and leaking refs. Route these error paths through a common free_sync label and call xe_sync_entry_cleanup(sync) before returning the error. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260219233516.2938172-5-shuicheng.lin@intel.com (cherry picked from commit f939bdd9207a5d1fc55cced5459858480686ce22) Cc: stable@vger.kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-23Input: i8042 - add TUXEDO InfinityBook Max 16 Gen10 AMD to i8042 quirk tableChristoffer Sandberg
The device occasionally wakes up from suspend with missing input on the internal keyboard and the following suspend attempt results in an instant wake-up. The quirks fix both issues for this device. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://patch.msgid.link/20260223142054.50310-1-wse@tuxedocomputers.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-02-23Bluetooth: hci_qca: Cleanup on all setup failuresJinwang Li
The setup process previously combined error handling and retry gating under one condition. As a result, the final failed attempt exited without performing cleanup. Update the failure path to always perform power and port cleanup on setup failure, and reopen the port only when retrying. Fixes: 9e80587aba4c ("Bluetooth: hci_qca: Enhance retry logic in qca_setup") Signed-off-by: Jinwang Li <jinwang.li@oss.qualcomm.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-02-23remoteproc: qcom_wcnss: Fix reserved region mapping failureRob Herring (Arm)
Commit c70b9d5fdcd7 ("remoteproc: qcom: Use of_reserved_mem_region_* functions for "memory-region"") switched from devm_ioremap_wc() to devm_ioremap_resource_wc(). The difference is devm_ioremap_resource_wc() also requests the resource which fails. Testing of both fixed and dynamic reserved regions indicates that requesting the resource should work, so I'm not sure why it doesn't work in this case. Fix the issue by reverting back to devm_ioremap_wc(). Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Reported-by: André Apitzsch <git@apitzsch.eu> Fixes: c70b9d5fdcd7 ("remoteproc: qcom: Use of_reserved_mem_region_* functions for "memory-region"") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: André Apitzsch <git@apitzsch.eu> # on BQ Aquaris M5 Link: https://lore.kernel.org/r/20260128220243.3018526-1-robh@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-02-23RDMA/uverbs: select CONFIG_DMA_SHARED_BUFFERArnd Bergmann
The addition of dmabuf support in uverbs means that it is no longer possible to build infiniband support if that is disabled: arm-linux-gnueabi-ld: drivers/infiniband/core/ib_core_uverbs.o: in function `rdma_user_mmap_entry_remove.part.0': ib_core_uverbs.c:(.text+0x508): undefined reference to `dma_buf_move_notify' (dma_buf_move_notify): Unknown destination type (ARM/Thumb) in drivers/infiniband/core/ib_core_uverbs.o ib_core_uverbs.c:(.text+0x518): undefined reference to `dma_resv_wait_timeout' (dma_resv_wait_timeout): Unknown destination type (ARM/Thumb) in drivers/infiniband/core/ib_core_uverbs.o Select this from Kconfig, as we do for the other users. Fixes: 0ac6f4056c4a ("RDMA/uverbs: Add DMABUF object type and operations") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260216121213.2088910-1-arnd@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-23usb: yurex: fix race in probeOliver Neukum
The bbu member of the descriptor must be set to the value standing for uninitialized values before the URB whose completion handler sets bbu is submitted. Otherwise there is a window during which probing can overwrite already retrieved data. Cc: stable <stable@kernel.org> Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20260209143720.1507500-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-23usb: gadget: f_ncm: Fix atomic context locking issueKuen-Han Tsai
The ncm_set_alt function was holding a mutex to protect against races with configfs, which invokes the might-sleep function inside an atomic context. Remove the struct net_device pointer from the f_ncm_opts structure to eliminate the contention. The connection state is now managed by a new boolean flag to preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error"). BUG: sleeping function called from invalid context Call Trace: dump_stack_lvl+0x83/0xc0 dump_stack+0x14/0x16 __might_resched+0x389/0x4c0 __might_sleep+0x8e/0x100 ... __mutex_lock+0x6f/0x1740 ... ncm_set_alt+0x209/0xa40 set_config+0x6b6/0xb40 composite_setup+0x734/0x2b40 ... Fixes: 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle with bind/unbind") Cc: stable@kernel.org Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260221-legacy-ncm-v2-2-dfb891d76507@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-23usb: legacy: ncm: Fix NPE in gncm_bindKuen-Han Tsai
Commit 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle with bind/unbind") deferred the allocation of the net_device. This change leads to a NULL pointer dereference in the legacy NCM driver as it attempts to access the net_device before it's fully instantiated. Store the provided qmult, host_addr, and dev_addr into the struct ncm_opts->net_opts during gncm_bind(). These values will be properly applied to the net_device when it is allocated and configured later in the binding process by the NCM function driver. Fixes: 56a512a9b410 ("usb: gadget: f_ncm: align net_device lifecycle with bind/unbind") Cc: stable@kernel.org Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202602181727.fd76c561-lkp@intel.com Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Link: https://patch.msgid.link/20260221-legacy-ncm-v2-1-dfb891d76507@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-23usb: gadget: f_tcm: Fix NULL pointer dereferences in nexus handlingJiasheng Jiang
The `tpg->tpg_nexus` pointer in the USB Target driver is dynamically managed and tied to userspace configuration via ConfigFS. It can be NULL if the USB host sends requests before the nexus is fully established or immediately after it is dropped. Currently, functions like `bot_submit_command()` and the data transfer paths retrieve `tv_nexus = tpg->tpg_nexus` and immediately dereference `tv_nexus->tvn_se_sess` without any validation. If a malicious or misconfigured USB host sends a BOT (Bulk-Only Transport) command during this race window, it triggers a NULL pointer dereference, leading to a kernel panic (local DoS). This exposes an inconsistent API usage within the module, as peer functions like `usbg_submit_command()` and `bot_send_bad_response()` correctly implement a NULL check for `tv_nexus` before proceeding. Fix this by bringing consistency to the nexus handling. Add the missing `if (!tv_nexus)` checks to the vulnerable BOT command and request processing paths, aborting the command gracefully with an error instead of crashing the system. Fixes: c52661d60f63 ("usb-gadget: Initial merge of target module for UASP + BOT") Cc: stable <stable@kernel.org> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://patch.msgid.link/20260219023834.17976-1-jiashengjiangcool@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-23Remove WARN_ALL_UNSEEDED_RANDOM kernel config optionLinus Torvalds
This config option goes way back - it used to be an internal debug option to random.c (at that point called DEBUG_RANDOM_BOOT), then was renamed and exposed as a config option as CONFIG_WARN_UNSEEDED_RANDOM, and then further renamed to the current CONFIG_WARN_ALL_UNSEEDED_RANDOM. It was all done with the best of intentions: the more limited rate-limited reports were reporting some cases, but if you wanted to see all the gory details, you'd enable this "ALL" option. However, it turns out - perhaps not surprisingly - that when people don't care about and fix the first rate-limited cases, they most certainly don't care about any others either, and so warning about all of them isn't actually helping anything. And the non-ratelimited reporting causes problems, where well-meaning people enable debug options, but the excessive flood of messages that nobody cares about will hide actual real information when things go wrong. I just got a kernel bug report (which had nothing to do with randomness) where two thirds of the the truncated dmesg was just variations of random: get_random_u32 called from __get_random_u32_below+0x10/0x70 with crng_init=0 and in the process early boot messages had been lost (in addition to making the messages that _hadn't_ been lost harder to read). The proper way to find these things for the hypothetical developer that cares - if such a person exists - is almost certainly with boot time tracing. That gives you the option to get call graphs etc too, which is likely a requirement for fixing any problems anyway. See Documentation/trace/boottime-trace.rst for that option. And if we for some reason do want to re-introduce actual printing of these things, it will need to have some uniqueness filtering rather than this "just print it all" model. Fixes: cc1e127bfa95 ("random: remove ratelimiting for in-kernel unseeded randomness") Acked-by: Jason Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-23USB: add QUIRK_NO_BOS for video capture several devicesA1RM4X
Several USB capture devices also need the USB_QUIRK_NO_BOS set for them to work properly, odds are they are all the same chip inside, just different vendor/product ids. This fixes up: - ASUS TUF 4K PRO - Avermedia Live Gamer Ultra 2.1 (GC553G2) - UGREEN 35871 to now run at full speed (10 Gbps/4K 60 fps mode.) Link: https://lore.kernel.org/r/CACy+XB-f-51xGpNQFCSm5pE_momTQLu=BaZggHYU1DiDmFX=ug@mail.gmail.com Cc: stable <stable@kernel.org> Signed-off-by: A1RM4X <dev@a1rm4x.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-23drm/xe/wa: Steer RMW of MCR registers while building default LRCMatt Roper
When generating the default LRC, if a register is not masked, we apply any save-restore programming necessary via a read-modify-write sequence that will ensure we only update the relevant bits/fields without clobbering the rest of the register. However some of the registers that need to be updated might be MCR registers which require steering to a non-terminated instance to ensure we can read back a valid, non-zero value. The steering of reads originating from a command streamer is controlled by register CS_MMIO_GROUP_INSTANCE_SELECT. Emit additional MI_LRI commands to update the steering before any RMW of an MCR register to ensure the reads are performed properly. Note that needing to perform a RMW of an MCR register while building the default LRC is pretty rare. Most of the MCR registers that are part of an engine's LRCs are also masked registers, so no MCR is necessary. Fixes: f2f90989ccff ("drm/xe: Avoid reading RMW registers in emit_wa_job") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patch.msgid.link/20260206223058.387014-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 6c2e331c915ba9e774aa847921262805feb00863) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-23Merge branch '7.0/scsi-queue' into 7.0/scsi-fixesMartin K. Petersen
Pull in remaining fixes from 7.0/scsi-queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2026-02-23cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.koDave Jiang
Moving the symbol devm_cxl_add_nvdimm_bridge() to drivers/cxl/cxl_pmem.c, so that cxl_pmem can export a symbol that gives cxl_acpi a depedency on cxl_pmem kernel module. This is a prepatory patch to resolve the issue of a race for nvdimm_bus object that is created during cxl_acpi_probe(). No functional changes besides moving code. Suggested-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com?> Link: https://patch.msgid.link/20260205001633.1813643-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-23accel/amdxdna: Validate command buffer payload countLizhi Hou
The count field in the command header is used to determine the valid payload size. Verify that the valid payload does not exceed the remaining buffer space. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260219211946.1920485-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Prevent ubuf size overflowLizhi Hou
The ubuf size calculation may overflow, resulting in an undersized allocation and possible memory corruption. Use check_add_overflow() helpers to validate the size calculation before allocation. Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260217192815.1784689-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Fix out-of-bounds memset in command slot handlingLizhi Hou
The remaining space in a command slot may be smaller than the size of the command header. Clearing the command header with memset() before verifying the available slot space can result in an out-of-bounds write and memory corruption. Fix this by moving the memset() call after the size validation. Fixes: 3d32eb7a5ecf ("accel/amdxdna: Fix cu_idx being cleared by memset() during command setup") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260217185415.1781908-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Fix command hang on suspended hardware contextLizhi Hou
When a hardware context is suspended, the job scheduler is stopped. If a command is submitted while the context is suspended, the job is queued in the scheduler but aie2_sched_job_run() is never invoked to restart the hardware context. As a result, the command hangs. Fix this by modifying the hardware context suspend routine to keep the job scheduler running so that queued jobs can trigger context restart properly. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211205341.722982-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Fix suspend failure after enabling turbo modeLizhi Hou
Enabling turbo mode disables hardware clock gating. Suspend requires hardware clock gating to be re-enabled, otherwise suspend will fail. Fix this by calling aie2_runtime_cfg() from aie2_hw_stop() to re-enable clock gating during suspend. Also ensure that firmware is initialized in aie2_hw_start() before modifying clock-gating settings during resume. Fixes: f4d7b8a6bc8c ("accel/amdxdna: Enhance power management settings") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211204716.722788-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Fix dead lock for suspend and resumeLizhi Hou
When an application issues a query IOCTL while auto suspend is running, a deadlock can occur. The query path holds dev_lock and then calls pm_runtime_resume_and_get(), which waits for the ongoing suspend to complete. Meanwhile, the suspend callback attempts to acquire dev_lock and blocks, resulting in a deadlock. Fix this by releasing dev_lock before calling pm_runtime_resume_and_get() and reacquiring it after the call completes. Also acquire dev_lock in the resume callback to keep the locking consistent. Fixes: 063db451832b ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211204644.722758-1-lizhi.hou@amd.com
2026-02-23accel/amdxdna: Reduce log noise during process terminationMario Limonciello
During process termination, several error messages are logged that are not actual errors but expected conditions when a process is killed or interrupted. This creates unnecessary noise in the kernel log. The specific scenarios are: 1. HMM invalidation returns -ERESTARTSYS when the wait is interrupted by a signal during process cleanup. This is expected when a process is being terminated and should not be logged as an error. 2. Context destruction returns -ENODEV when the firmware or device has already stopped, which commonly occurs during cleanup if the device was already torn down. This is also an expected condition during orderly shutdown. Downgrade these expected error conditions from error level to debug level to reduce log noise while still keeping genuine errors visible. Fixes: 97f27573837e ("accel/amdxdna: Fix potential NULL pointer dereference in context cleanup") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260210164521.1094274-3-mario.limonciello@amd.com