kernel - Hosts the 0x221E linux distro kernel.

Age	Commit message (Collapse)	Author
2026-04-09	Merge tag 'iommu-fixes-v7.0-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull IOMMU fix from Will Deacon: - Fix regression introduced by the empty MMU gather fix in -rc7, where the ->iotlb_sync() callback can be elided incorrectly, resulting in boot failures (hangs), crashes and potential memory corruption. * tag 'iommu-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu: Ensure .iotlb_sync is called correctly
2026-04-09	iommu: Ensure .iotlb_sync is called correctly	Robin Murphy
	Many drivers have no reason to use the iotlb_gather mechanism, but do still depend on .iotlb_sync being called to properly complete an unmap. Since the core code is now relying on the gather to detect when there is legitimately something to sync, it should also take care of encoding a successful unmap when the driver does not touch the gather itself. Fixes: 90c5def10bea ("iommu: Do not call drivers for empty gathers") Reported-by: Jon Hunter <jonathanh@nvidia.com> Closes: https://lore.kernel.org/r/8800a38b-8515-4bbe-af15-0dae81274bf7@nvidia.com Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Will Deacon <will@kernel.org>
2026-04-02	Merge tag 'iommu-fixes-v7.0-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - IOMMU-PT related compile breakage in for AMD driver - IOTLB flushing behavior when unmapped region is larger than requested due to page-sizes - Fix IOTLB flush behavior with empty gathers * tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline iommupt: Fix short gather if the unmap goes into a large mapping iommu: Do not call drivers for empty gathers
2026-03-27	iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline	Sherry Yang
	After enabling CONFIG_GCOV_KERNEL and CONFIG_GCOV_PROFILE_ALL, following build failure is observed under GCC 14.2.1: In function 'amdv1pt_install_leaf_entry', inlined from '__do_map_single_page' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:650:3, inlined from '__map_single_page0' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:661:1, inlined from 'pt_descend' at drivers/iommu/generic_pt/fmt/../pt_iter.h:391:9, inlined from '__do_map_single_page' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:657:10, inlined from '__map_single_page1.constprop' at drivers/iommu/generic_pt/fmt/../iommu_pt.h:661:1: ././include/linux/compiler_types.h:706:45: error: call to '__compiletime_assert_71' declared with attribute error: FIELD_PREP: value too large for the field 706 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ...... drivers/iommu/generic_pt/fmt/amdv1.h:220:26: note: in expansion of macro 'FIELD_PREP' 220 \| FIELD_PREP(AMDV1PT_FMT_OA, \| ^~~~~~~~~~ In the path '__do_map_single_page()', level 0 always invokes 'pt_install_leaf_entry(&pts, map->oa, PAGE_SHIFT, …)'. At runtime that lands in the 'if (oasz_lg2 == isz_lg2)' arm of 'amdv1pt_install_leaf_entry()'; the contiguous-only 'else' block is unreachable for 4 KiB pages. With CONFIG_GCOV_KERNEL + CONFIG_GCOV_PROFILE_ALL, the extra instrumentation changes GCC's inlining so that the "dead" 'else' branch still gets instantiated. The compiler constant-folds the contiguous OA expression, runs the 'FIELD_PREP()' compile-time check, and produces: FIELD_PREP: value too large for the field gcov-enabled builds therefore fail even though the code path never executes. Fix this by marking amdv1pt_install_leaf_entry as __always_inline. Fixes: dcd6a011a8d5 ("iommupt: Add map_pages op") Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sherry Yang <sherry.yang@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-27	iommupt: Fix short gather if the unmap goes into a large mapping	Jason Gunthorpe
	unmap has the odd behavior that it can unmap more than requested if the ending point lands within the middle of a large or contiguous IOPTE. In this case the gather should flush everything unmapped which can be larger than what was requested to be unmapped. The gather was only flushing the range requested to be unmapped, not extending to the extra range, resulting in a short invalidation if the caller hits this special condition. This was found by the new invalidation/gather test I am adding in preparation for ARMv8. Claude deduced the root cause. As far as I remember nothing relies on unmapping a large entry, so this is likely not a triggerable bug. Cc: stable@vger.kernel.org Fixes: 7c53f4238aa8 ("iommupt: Add unmap_pages op") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-26	Merge tag 'dma-mapping-7.0-2026-03-25' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping fixes from Marek Szyprowski: "A set of fixes for DMA-mapping subsystem, which resolve false- positive warnings from KMSAN and DMA-API debug (Shigeru Yoshida and Leon Romanovsky) as well as a simple build fix (Miguel Ojeda)" * tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-mapping: add missing `inline` for `dma_free_attrs` mm/hmm: Indicate that HMM requires DMA coherency RDMA/umem: Tell DMA mapping that UMEM requires coherency iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute dma-direct: prevent SWIOTLB path when DMA_ATTR_REQUIRE_COHERENT is set dma-mapping: Introduce DMA require coherency attribute dma-mapping: Clarify valid conditions for CPU cache line overlap dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output dma-debug: Allow multiple invocations of overlapping entries dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
2026-03-20	iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute	Leon Romanovsky
	Add support for the DMA_ATTR_REQUIRE_COHERENT attribute to the exported functions. This attribute indicates that the SWIOTLB path must not be used and that no sync operations should be performed. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-6-1dde90a7f08b@nvidia.com
2026-03-17	iommu/amd: Block identity domain when SNP enabled	Joe Damato
	Previously, commit 8388f7df936b ("iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled") prevented users from changing the IOMMU domain to identity if SNP was enabled. This resulted in an error when writing to sysfs: # echo "identity" > /sys/kernel/iommu_groups/50/type -bash: echo: write error: Cannot allocate memory However, commit 4402f2627d30 ("iommu/amd: Implement global identity domain") changed the flow of the code, skipping the SNP guard and allowing users to change the IOMMU domain to identity after a machine has booted. Once the user does that, they will probably try to bind and the device/driver will start to do DMA which will trigger errors: iommu ivhd3: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=0000:43:00.0 pasid=0x00000 address=0x3737b01000 flags=0x0020] iommu ivhd3: AMD-Vi: Control Reg : 0xc22000142148d AMD-Vi: DTE[0]: 6000000000000003 AMD-Vi: DTE[1]: 0000000000000001 AMD-Vi: DTE[2]: 2000003088b3e013 AMD-Vi: DTE[3]: 0000000000000000 bnxt_en 0000:43:00.0 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg {0x0 0x0} len:0 iommu ivhd3: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=0000:43:00.0 pasid=0x00000 address=0x3737b01000 flags=0x0020] iommu ivhd3: AMD-Vi: Control Reg : 0xc22000142148d AMD-Vi: DTE[0]: 6000000000000003 AMD-Vi: DTE[1]: 0000000000000001 AMD-Vi: DTE[2]: 2000003088b3e013 AMD-Vi: DTE[3]: 0000000000000000 bnxt_en 0000:43:00.0: probe with driver bnxt_en failed with error -16 To prevent this from happening, create an attach wrapper for identity_domain_ops which returns EINVAL if amd_iommu_snp_en is true. With this commit applied: # echo "identity" > /sys/kernel/iommu_groups/62/type -bash: echo: write error: Invalid argument Fixes: 4402f2627d30 ("iommu/amd: Implement global identity domain") Signed-off-by: Joe Damato <joe@dama.to> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-17	iommu/sva: Fix crash in iommu_sva_unbind_device()	Lizhi Hou
	domain->mm->iommu_mm can be freed by iommu_domain_free(): iommu_domain_free() mmdrop() __mmdrop() mm_pasid_drop() After iommu_domain_free() returns, accessing domain->mm->iommu_mm may dereference a freed mm structure, leading to a crash. Fix this by moving the code that accesses domain->mm->iommu_mm to before the call to iommu_domain_free(). Fixes: e37d5a2d60a3 ("iommu/sva: invalidate stale IOTLB entries for kernel address space") Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-17	iommu: Fix mapping check for 0x0 to avoid re-mapping it	Antheas Kapenekakis
	Commit 789a5913b29c ("iommu/amd: Use the generic iommu page table") introduces the shared iommu page table for AMD IOMMU. Some bioses contain an identity mapping for address 0x0, which is not parsed properly (e.g., certain Strix Halo devices). This causes the DMA components of the device to fail to initialize (e.g., the NVMe SSD controller), leading to a failed post. Specifically, on the GPD Win 5, the NVME and SSD GPU fail to mount, making collecting errors difficult. While debugging, it was found that a -EADDRINUSE error was emitted and its source was traced to iommu_iova_to_phys(). After adding some debug prints, it was found that phys_addr becomes 0, which causes the code to try to re-map the 0 address and fail, causing a cascade leading to a failed post. This is because the GPD Win 5 contains a 0x0-0x1 identity mapping for DMA devices, causing it to be repeated for each device. The cause of this failure is the following check in iommu_create_device_direct_mappings(), where address aliasing is handled via the following check: ``` phys_addr = iommu_iova_to_phys(domain, addr); if (!phys_addr) { map_size += pg_size; continue; } ```` Obviously, the iommu_iova_to_phys() signature is faulty and aliases unmapped and 0 together, causing the allocation code to try to re-allocate the 0 address per device. However, it has too many instantiations to fix. Therefore, use a ternary so that when addr is 0, the check is done for address 1 instead. Suggested-by: Robin Murphy <robin.murphy@arm.com> Fixes: 789a5913b29c ("iommu/amd: Use the generic iommu page table") Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-17	iommu/vt-d: Only handle IOPF for SVA when PRI is supported	Lu Baolu
	In intel_svm_set_dev_pasid(), the driver unconditionally manages the IOPF handling during a domain transition. However, commit a86fb7717320 ("iommu/vt-d: Allow SVA with device-specific IOPF") introduced support for SVA on devices that handle page faults internally without utilizing the PCI PRI. On such devices, the IOMMU-side IOPF infrastructure is not required. Calling iopf_for_domain_replace() on these devices is incorrect and can lead to unexpected failures during PASID attachment or unwinding. Add a check for info->pri_supported to ensure that the IOPF queue logic is only invoked for devices that actually rely on the IOMMU's PRI-based fault handling. Fixes: 17fce9d2336d ("iommu/vt-d: Put iopf enablement in domain attach path") Cc: stable@vger.kernel.org Suggested-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20260310075520.295104-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-03-17	iommu/vt-d: Fix intel iommu iotlb sync hardlockup and retry	Guanghui Feng
	During the qi_check_fault process after an IOMMU ITE event, requests at odd-numbered positions in the queue are set to QI_ABORT, only satisfying single-request submissions. However, qi_submit_sync now supports multiple simultaneous submissions, and can't guarantee that the wait_desc will be at an odd-numbered position. Therefore, if an item times out, IOMMU can't re-initiate the request, resulting in an infinite polling wait. This modifies the process by setting the status of all requests already fetched by IOMMU and recorded as QI_IN_USE status (including wait_desc requests) to QI_ABORT, thus enabling multiple requests to be resubmitted. Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per qi_submit_sync()") Cc: stable@vger.kernel.org Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com> Tested-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Link: https://lore.kernel.org/r/20260306101516.3885775-1-guanghuifeng@linux.alibaba.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per qi_submit_sync()") Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-02-22	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses	Kees Cook
	Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments	Linus Torvalds
	This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument	Linus Torvalds
	This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument	Linus Torvalds
	This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21	treewide: Replace kmalloc with kmalloc_obj for non-scalar types	Kees Cook
	This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-12	Merge tag 'riscv-for-linus-7.0-mw1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Add support for control flow integrity for userspace processes. This is based on the standard RISC-V ISA extensions Zicfiss and Zicfilp - Improve ptrace behavior regarding vector registers, and add some selftests - Optimize our strlen() assembly - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI volume mounting - Clean up some code slightly, including defining copy_user_page() as copy_page() rather than memcpy(), aligning us with other architectures; and using max3() to slightly simplify an expression in riscv_iommu_init_check() * tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) riscv: lib: optimize strlen loop efficiency selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function selftests: riscv: verify ptrace accepts valid vector csr values selftests: riscv: verify ptrace rejects invalid vector csr inputs selftests: riscv: verify syscalls discard vector context selftests: riscv: verify initial vector state with ptrace selftests: riscv: test ptrace vector interface riscv: ptrace: validate input vector csr registers riscv: csr: define vtype register elements riscv: vector: init vector context with proper vlenb riscv: ptrace: return ENODATA for inactive vector extension kselftest/riscv: add kselftest for user mode CFI riscv: add documentation for shadow stack riscv: add documentation for landing pad / indirect branch tracking riscv: create a Kconfig fragment for shadow stack and landing pad support arch/riscv: add dual vdso creation logic and select vdso based on hw arch/riscv: compile vdso with landing pad and shadow stack note riscv: enable kernel access to shadow stack memory via the FWFT SBI call riscv: add kernel command line option to opt out of user CFI riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe ...
2026-02-12	Merge tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio	Linus Torvalds
	Pull VFIO updates from Alex Williamson: "A small cycle with the bulk in selftests and reintroducing poison handling in the nvgrace-gpu driver. The rest are fixes, cleanups, and some dmabuf structure consolidation. - Update outdated mdev comment referencing the renamed mdev_type_add() function (Julia Lawall) - Introduce selftest support for IOMMU mapping of PCI MMIO BARs (Alex Mastro) - Relax selftest assertion relative to differences in huge page handling between legacy (v1) TYPE1 IOMMU mapping behavior and the compatibility mode supported by IOMMUFD (David Matlack) - Reintroduce memory poison handling support for non-struct-page- backed memory in the nvgrace-gpu variant driver (Ankit Agrawal) - Replace dma_buf_phys_vec with phys_vec to avoid duplicate structure and semantics (Leon Romanovsky) - Add missing upstream bridge locking across PCI function reset, resolving an assertion failure when secondary bus reset is used to provide that reset (Anthony Pighin) - Fixes to hisi_acc vfio-pci variant driver to resolve corner case issues related to resets, repeated migration, and error injection scenarios (Longfang Liu, Weili Qian) - Restrict vfio selftest builds to arm64 and x86_64, resolving compiler warnings on 32-bit archs (Ted Logan) - Un-deprecate the fsl-mc vfio bus driver as a new maintainer has stepped up (Ioana Ciornei)" * tag 'vfio-v7.0-rc1' of https://github.com/awilliam/linux-vfio: vfio/fsl-mc: add myself as maintainer vfio: selftests: only build tests on arm64 and x86_64 hisi_acc_vfio_pci: fix the queue parameter anomaly issue hisi_acc_vfio_pci: resolve duplicate migration states hisi_acc_vfio_pci: update status after RAS error hisi_acc_vfio_pci: fix VF reset timeout issue vfio/pci: Lock upstream bridge for vfio_pci_core_disable() types: reuse common phys_vec type instead of DMABUF open‑coded variant vfio/nvgrace-gpu: register device memory for poison handling mm: add stubs for PFNMAP memory failure registration functions vfio: selftests: Drop IOMMU mapping size assertions for VFIO_TYPE1_IOMMU vfio: selftests: Add vfio_dma_mapping_mmio_test vfio: selftests: Align BAR mmaps for efficient IOMMU mapping vfio: selftests: Centralize IOMMU mode name definitions vfio/mdev: update outdated comment
2026-02-11	Merge tag 'driver-core-7.0-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "Bus: - Ensure bus->match() is consistently called with the device lock held - Improve type safety of bus_find_device_by_acpi_dev() Devtmpfs: - Parse 'devtmpfs.mount=' boot parameter with kstrtoint() instead of simple_strtoul() - Avoid sparse warning by making devtmpfs_context_ops static IOMMU: - Do not register the qcom_smmu_tbu_driver in arm_smmu_device_probe() MAINTAINERS: - Add the new driver-core mailing list (driver-core@lists.linux.dev) to all relevant entries - Add missing tree location for "FIRMWARE LOADER (request_firmware)" - Add driver-model documentation to the "DRIVER CORE" entry - Add missing driver-core maintainers to the "AUXILIARY BUS" entry Misc: - Change return type of attribute_container_register() to void; it has always been infallible - Do not export sysfs_change_owner(), sysfs_file_change_owner() and device_change_owner() - Move devres_for_each_res() from the public devres header to drivers/base/base.h - Do not use a static struct device for the faux bus; allocate it dynamically Revocable: - Patches for the revocable synchronization primitive have been scheduled for v7.0-rc1, but have been reverted as they need some more refinement Rust: - Device: - Support dev_printk on all device types, not just the core Device struct; remove now-redundant .as_ref() calls in dev_* print calls - Devres: - Introduce an internal reference count in Devres<T> to avoid a deadlock condition in case of (indirect) nesting - DMA: - Allow drivers to tune the maximum DMA segment size via dma_set_max_seg_size() - I/O: - Introduce the concept of generic I/O backends to handle different kinds of device shared memory through a common interface. This enables higher-level concepts such as register abstractions, I/O slices, and field projections to be built generically on top. In a first step, introduce the Io, IoCapable<T>, and IoKnownSize trait hierarchy for sharing a common interface supporting offset validation and bound-checking logic between I/O backends. - Refactor MMIO to use the common I/O backend infrastructure - Misc: - Add __rust_helper annotations to C helpers for inlining into Rust code - Use "kernel vertical" style for imports - Replace kernel::c_str! with C string literals - Update ARef imports to use sync::aref - Use pin_init::zeroed() for struct auxiliary_device_id and debugfs file_operations initialization - Use LKMM atomic types in debugfs doc-tests - Various minor comment and documentation fixes - PCI: - Implement PCI configuration space accessors using the common I/O backend infrastructure - Document pci::Bar device endianness assumptions - SoC: - Abstractions for struct soc_device and struct soc_device_attribute - Sample driver for soc::Device" * tag 'driver-core-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (79 commits) rust: devres: fix race condition due to nesting rust: dma: add missing __rust_helper annotations samples: rust: pci: Remove some additional `.as_ref()` for `dev_` print Revert "revocable: Revocable resource management" Revert "revocable: Add Kunit test cases" Revert "selftests: revocable: Add kselftest cases" driver core: remove device_change_owner() export sysfs: remove exports of sysfs_change_owner() driver core: disable revocable code from build revocable: Add KUnit test for concurrent access revocable: fix SRCU index corruption by requiring caller-provided storage revocable: Add KUnit test for provider lifetime races revocable: Fix races in revocable_alloc() using RCU driver core: fix inverted "locked" suffix of driver_match_device() rust: io: move MIN_SIZE and io_addr_assert to IoKnownSize rust: pci: re-export ConfigSpace rust: dma: allow drivers to tune max segment size gpu: tyr: remove redundant `.as_ref()` for `dev_*` print rust: auxiliary: use `pin_init::zeroed()` for device ID rust: debugfs: use pin_init::zeroed() for file_operations ...
2026-02-11	Merge tag 'iommu-updates-v7.0' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core changes: - Rust bindings for IO-pgtable code - IOMMU page allocation debugging support - Disable ATS during PCI resets Intel VT-d changes: - Skip dev-iotlb flush for inaccessible PCIe device - Flush cache for PASID table before using it - Use right invalidation method for SVA and NESTED domains - Ensure atomicity in context and PASID entry updates AMD-Vi changes: - Support for nested translations - Other minor improvements ARM-SMMU-v2 changes: - Configure SoC-specific prefetcher settings for Qualcomm's "MDSS" ARM-SMMU-v3 changes: - Improve CMDQ locking fairness for pathetically small queue sizes - Remove tracking of the IAS as this is only relevant for AArch32 and was causing C_BAD_STE errors - Add device-tree support for NVIDIA's CMDQV extension - Allow some hitless transitions for the 'MEV' and 'EATS' STE fields - Don't disable ATS for nested S1-bypass nested domains - Additions to the kunit selftests" * tag 'iommu-updates-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (54 commits) iommupt: Always add IOVA range to iotlb_gather in gather_range_pages() iommu/amd: serialize sequence allocation under concurrent TLB invalidations iommu/amd: Fix type of type parameter to amd_iommufd_hw_info() iommu/arm-smmu-v3: Do not set disable_ats unless vSTE is Translate iommu/arm-smmu-v3-test: Add nested s1bypass/s1dssbypass coverage iommu/arm-smmu-v3: Mark EATS_TRANS safe when computing the update sequence iommu/arm-smmu-v3: Mark STE MEV safe when computing the update sequence iommu/arm-smmu-v3: Add update_safe bits to fix STE update sequence iommu/arm-smmu-v3: Add device-tree support for CMDQV driver iommu/tegra241-cmdqv: Decouple driver from ACPI iommu/arm-smmu-qcom: Restore ACTLR settings for MDSS on sa8775p iommu/vt-d: Fix race condition during PASID entry replacement iommu/vt-d: Clear Present bit before tearing down context entry iommu/vt-d: Clear Present bit before tearing down PASID entry iommu/vt-d: Flush piotlb for SVM and Nested domain iommu/vt-d: Flush cache for PASID table before using it iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in scalable mode iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode rust: iommu: fix `srctree` link warning rust: iommu: fix Rust formatting ...
2026-02-10	Merge tag 'x86-irq-2026-02-09' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Trivial cleanups for the posted MSI interrupt handling" * tag 'x86-irq-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq_remapping: Sanitize posted_msi_supported() x86/irq: Cleanup posted MSI code
2026-02-10	Merge tag 'irq-cleanups-2026-02-09' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq cleanups from Thomas Gleixner: "A series of treewide cleanups to ensure interrupt request consistency. - Add the missing IRQF_COND_ONESHOT flag to devm_request_irq() This is inconsistent vs request_irq() and causes the same issues which where addressed with the introduction of this flag - Cleanup IRQF_ONESHOT and IRQF_NO_THREAD usage Quite some drivers have inconsistent interrupt request flags related to interrupt threading namely IRQF_ONESHOT and IRQF_NO_THREAD. This leads to warnings and/or malfunction when forced interrupt threading is enabled. - Remove stub primary (hard interrupt) handlers A bunch of drivers implement a stub primary (hard interrupt) handler which just returns IRQ_WAKE_THREAD. The same functionality is provided by the core code when the primary handler argument of request_thread_irq() is set to NULL" * tag 'irq-cleanups-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: media: pci: mg4b: Use IRQF_NO_THREAD mfd: wm8350-core: Use IRQF_ONESHOT thermal/qcom/lmh: Replace IRQF_ONESHOT with IRQF_NO_THREAD rtc: amlogic-a4: Remove IRQF_ONESHOT usb: typec: fusb302: Remove IRQF_ONESHOT EDAC/altera: Remove IRQF_ONESHOT char: tpm: cr50: Remove IRQF_ONESHOT ARM: versatile: Remove IRQF_ONESHOT scsi: efct: Use IRQF_ONESHOT and default primary handler Bluetooth: btintel_pcie: Use IRQF_ONESHOT and default primary handler bus: fsl-mc: Use default primary handler mailbox: bcm-ferxrm-mailbox: Use default primary handler iommu/amd: Use core's primary handler and set IRQF_ONESHOT platform/x86: int0002: Remove IRQF_ONESHOT from request_irq() genirq: Set IRQF_COND_ONESHOT in devm_request_irq().
2026-02-06	Merge branches 'fixes', 'arm/smmu/updates', 'intel/vt-d', 'amd/amd-vi' and ↵	Joerg Roedel
	'core' into next
2026-02-06	iommu/vt-d: Treat PAGE_SNOOP and PWSNP separately	Viktor Kleen
	The PASID_FLAG_PAGE_SNOOP and PASID_FLAG_PWSNP constants are identical. This will cause the pasid code to always set both or neither of the PGSNP and PWSNP bits in PASID table entries. However, PWSNP is a reserved bit if SMPWC is not set in the IOMMU's extended capability register, even if SC is supported. This has resulted in DMAR errors when testing the iommufd code on an Arrow Lake platform. With this patch, those errors disappear and the PASID table entries look correct. Fixes: 101a2854110fa ("iommu/vt-d: Follow PT_FEAT_DMA_INCOHERENT into the PASID entry") Cc: stable@vger.kernel.org Signed-off-by: Viktor Kleen <viktor@kleen.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20260202192109.1665799-1-viktor@kleen.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-02-03	iommupt: Always add IOVA range to iotlb_gather in gather_range_pages()	Yu Zhang
	Add current (iova, len) to the iotlb gather, regardless of the setting of PT_FEAT_FLUSH_RANGE or PT_FEAT_FLUSH_RANGE_NO_GAPS. In gather_range_pages(), the current IOVA range is only added to iotlb_gather when PT_FEAT_FLUSH_RANGE is set. Yet a virtual IOMMU with NpCache uses only PT_FEAT_FLUSH_RANGE_NO_GAPS. In that case, iotlb_gather will stay empty (start=ULONG_MAX, end=0) after initialization, and the current (iova, len) will not be added to the iotlb_gather, causing subsequent iommu_iotlb_sync() to perform IOTLB invalidation with wrong parameters (e.g., amd_iommu_iotlb_sync() computes size from gather->end - gather->start + 1, leading to an invalid range). The disjoint check and sync for PT_FEAT_FLUSH_RANGE_NO_GAPS remain unchanged: when the new range is disjoint from the existing gather, we still sync first and then add the new range, so semantics for NO_GAPS are preserved. Fixes: 7c53f4238aa8 ("iommupt: Add unmap_pages op") Cc: stable@vger.kernel.org Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-02-03	iommu/amd: serialize sequence allocation under concurrent TLB invalidations	Ankit Soni
	With concurrent TLB invalidations, completion wait randomly gets timed out because cmd_sem_val was incremented outside the IOMMU spinlock, allowing CMD_COMPL_WAIT commands to be queued out of sequence and breaking the ordering assumption in wait_on_sem(). Move the cmd_sem_val increment under iommu->lock so completion sequence allocation is serialized with command queuing. And remove the unnecessary return. Fixes: d2a0cac10597 ("iommu/amd: move wait_on_sem() out of spinlock") Tested-by: Srikanth Aithal <sraithal@amd.com> Reported-by: Srikanth Aithal <sraithal@amd.com> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-02-01	iommu/amd: Use core's primary handler and set IRQF_ONESHOT	Sebastian Andrzej Siewior
	request_threaded_irq() is invoked with a primary and a secondary handler and no flags are passed. The primary handler is the same as irq_default_primary_handler() so there is no need to have an identical copy. The lack of the IRQF_ONESHOT can be dangerous because the interrupt source is not masked while the threaded handler is active. This means, especially on LEVEL typed interrupt lines, the interrupt can fire again before the threaded handler had a chance to run. Use the default primary interrupt handler by specifying NULL and set IRQF_ONESHOT so the interrupt source is masked until the secondary handler is done. Fixes: 72fe00f01f9a3 ("x86/amd-iommu: Use threaded interupt handler") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260128095540.863589-4-bigeasy@linutronix.de
2026-01-31	Merge tag 'iommu-fixes-v6.19-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - Fix a performance regression cause by the new Generic IO-Page-Table code detected in Intel VT-d driver - Command queue flushing fix for NVidia version of the ARM-SMMU-v3 * tag 'iommu-fixes-v6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/tegra241-cmdqv: Reset VCMDQ in tegra241_vcmdq_hw_init_user() iommupt: Only cache flush memory changed by unmap
2026-01-31	iommu/tegra241-cmdqv: Reset VCMDQ in tegra241_vcmdq_hw_init_user()	Nicolin Chen
	The Enable bits in CMDQV/VINTF/VCMDQ_CONFIG registers do not actually reset the HW registers. So, the driver explicitly clears all the registers when a VINTF or VCMDQ is being initialized calling its hw_deinit() function. However, a userspace VCMDQ is not properly reset, unlike an in-kernel VCMDQ getting reset in tegra241_vcmdq_hw_init(). Meanwhile, tegra241_vintf_hw_init() calling tegra241_vintf_hw_deinit() will not deinit any VCMDQ, since there is no userspace VCMDQ mapped to the VINTF at that stage. Then, this may result in dirty VCMDQ registers, which can fail the VM. Like tegra241_vcmdq_hw_init(), reset a VCMDQ in tegra241_vcmdq_hw_init() to fix this bug. This is required by a host kernel. Fixes: 6717f26ab1e7 ("iommu/tegra241-cmdqv: Add user-space use support") Cc: stable@vger.kernel.org Reported-by: Bao Nguyen <ncqb@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-28	iommufd: Initialize batch->kind in batch_clear()	Deepanshu Kartikey
	KMSAN reported an uninitialized value when batch_add_pfn_num() reads batch->kind. This occurs because batch_clear() does not initialize the kind field. When batch_add_pfn_num() checks "if (batch->kind != kind)", it reads this uninitialized value, triggering KMSAN warnings. However the algorithm is fine with any value in kind at this point as the batch is always empty and it always corrects kind if wrong. Initialize batch->kind to zero in batch_clear() to silence the KMSAN warning. Link: https://patch.msgid.link/r/20260124132214.624041-1-kartikey406@gmail.com Reported-by: syzbot+df28076a30d726933015@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=df28076a30d726933015 Fixes: f394576eb11db ("iommufd: PFN handling for iopt_pages") Tested-by: syzbot+df28076a30d726933015@syzkaller.appspotmail.com Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: syzbot+a0c841e02f328005bbcc@syzkaller.appspotmail.com Reported-by: syzbot+a0c841e02f328005bbcc@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-01-28	iommupt: Only cache flush memory changed by unmap	Jason Gunthorpe
	The cache flush was happening on every level across the whole range of iteration, even if no leafs or tables were cleared. Instead flush only the sub range that was actually written. Overflushing isn't a correctness problem but it does impact the performance of unmap. After this series the performance compared to the original VT-d implementation with cache flushing turned on is: map_pages pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 253,266 , 213,227 , 6.06 2^21, 246,244 , 221,219 , 0.00 2^30, 231,240 , 209,217 , 3.03 2562^12, 2604,2668 , 2415,2540 , 4.04 2562^21, 2495,2824 , 2390,2734 , 12.12 2562^30, 2542,2845 , 2380,2718 , 12.12 unmap_pages pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 259,292 , 222,251 , 11.11 2^21, 255,259 , 227,236 , 3.03 2^30, 238,254 , 217,230 , 5.05 2562^12, 2751,2620 , 2417,2437 , 0.00 2562^21, 2461,2526 , 2377,2423 , 1.01 2562^30, 2498,2543 , 2370,2404 , 1.01 Fixes: efa03dab7ce4 ("iommupt: Flush the CPU cache after any writes to the page table") Reported-by: Francois Dugast <francois.dugast@intel.com> Closes: https://lore.kernel.org/all/20260121130233.257428-1-francois.dugast@intel.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-28	iommu/amd: Fix type of type parameter to amd_iommufd_hw_info()	Nathan Chancellor
	When building with -Wincompatible-function-pointer-types-strict, a warning designed to catch kernel control flow integrity (kCFI) issues at build time, there is an instance around amd_iommufd_hw_info(): drivers/iommu/amd/iommu.c:3141:13: error: incompatible function pointer types initializing 'void ()(struct device , u32 , enum iommu_hw_info_type )' (aka 'void ()(struct device , unsigned int , enum iommu_hw_info_type )') with an expression of type 'void (struct device , u32 , u32 )' (aka 'void (struct device , unsigned int , unsigned int )') [-Werror,-Wincompatible-function-pointer-types-strict] 3141 \| .hw_info = amd_iommufd_hw_info, \| ^~~~~~~~~~~~~~~~~~~ While 'u32 ' and 'enum iommu_hw_info_type ' are ABI compatible, hence no regular warning from -Wincompatible-function-pointer-types, the mismatch will trigger a kCFI violation when amd_iommufd_hw_info() is called indirectly. Update the type parameter of amd_iommufd_hw_info() to be 'enum iommu_hw_info_type *' to match the prototype in 'struct iommu_ops', clearing up the warning and kCFI violation. Fixes: 7d8b06ecc45b ("iommu/amd: Add support for hw_info for iommu capability query") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-26	Merge tag 'driver-core-6.19-rc7-deferred' into driver-core-next	Danilo Krummrich
	Driver core fixes deferred from 6.19-rc7 [1, 2] were originally intended for -rc7. Patch [1] uncovered potential deadlocks that require a few driver fixes; [2] is one such fix. [1] https://patch.msgid.link/20260113162843.12712-1-hanguidong02@gmail.com [2] https://patch.msgid.link/20260121141215.29658-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-25	iommu/riscv: Simplify maximum determination in riscv_iommu_init_check()	Markus Elfring
	Reduce nested max() calls by a single max3() call in this function implementation. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://patch.msgid.link/d1a384c9-f154-4537-94d6-c3613f4167bc@web.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-23	Merge tag 'iommu-fixes-v6.19-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - AMD IOMMU: Fix potential NULL-ptr dereference in error path of amd_iommu_probe_device() - Generic IOMMUPT: Fix another compiler issue seen with older compiler versions - Fix signedness issue in ARM IO-PageTable code * tag 'iommu-fixes-v6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/io-pgtable-arm: fix size_t signedness bug in unmap path iommupt: Make it clearer to the compiler that pts.level == 0 for single page iommu/amd: Fix error path in amd_iommu_probe_device()
2026-01-23	iommu/arm-smmu-v3: Do not set disable_ats unless vSTE is Translate	Nicolin Chen
	A vSTE may have three configuration types: Abort, Bypass, and Translate. An Abort vSTE wouldn't enable ATS, but the other two might. It makes sense for a Transalte vSTE to rely on the guest vSTE.EATS field. For a Bypass vSTE, it would end up with an S2-only physical STE, similar to an attachment to a regular S2 domain. However, the nested case always disables ATS following the Bypass vSTE, while the regular S2 case always enables ATS so long as arm_smmu_ats_supported(master) == true. Note that ATS is needed for certain VM centric workloads and historically non-vSMMU cases have relied on this automatic enablement. So, having the nested case behave differently causes problems. To fix that, add a condition to disable_ats, so that it might enable ATS for a Bypass vSTE, aligning with the regular S2 case. Fixes: f27298a82ba0 ("iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED") Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-23	iommu/arm-smmu-v3-test: Add nested s1bypass/s1dssbypass coverage	Nicolin Chen
	STE in a nested case requires both S1 and S2 fields. And this makes the use case different from the existing one. Add coverage for previously failed cases shifting between S2-only and S1+S2 STEs. Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-23	iommu/arm-smmu-v3: Mark EATS_TRANS safe when computing the update sequence	Jason Gunthorpe
	If VM wants to toggle EATS_TRANS off at the same time as changing the CFG, hypervisor will see EATS change to 0 and insert a V=0 breaking update into the STE even though the VM did not ask for that. In bare metal, EATS_TRANS is ignored by CFG=ABORT/BYPASS, which is why this does not cause a problem until we have the nested case where CFG is always a variation of S2 trans that does use EATS_TRANS. Relax the rules for EATS_TRANS sequencing, we don't need it to be exact as the enclosing code will always disable ATS at the PCI device when changing EATS_TRANS. This ensures there are no ATS transactions that can race with an EATS_TRANS change so we don't need to carefully sequence these bits. Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED") Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-23	iommu/arm-smmu-v3: Mark STE MEV safe when computing the update sequence	Jason Gunthorpe
	Nested CD tables set the MEV bit to try to reduce multi-fault spamming on the hypervisor. Since MEV is in STE word 1 this causes a breaking update sequence that is not required and impacts real workloads. For the purposes of STE updates the value of MEV doesn't matter, if it is set/cleared early or late it just results in a change to the fault reports that must be supported by the kernel anyhow. The spec says: Note: Software must expect, and be able to deal with, coalesced fault records even when MEV == 0. So mark STE MEV safe when computing the update sequence, to avoid creating a breaking update. Fixes: da0c56520e88 ("iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations") Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-23	iommu/arm-smmu-v3: Add update_safe bits to fix STE update sequence	Jason Gunthorpe
	C_BAD_STE was observed when updating nested STE from an S1-bypass mode to an S1DSS-bypass mode. As both modes enabled S2, the used bit is slightly different than the normal S1-bypass and S1DSS-bypass modes. As a result, fields like MEV and EATS in S2's used list marked the word1 as a critical word that requested a STE.V=0. This breaks a hitless update. However, both MEV and EATS aren't critical in terms of STE update. One controls the merge of the events and the other controls the ATS that is managed by the driver at the same time via pci_enable_ats(). Add an arm_smmu_get_ste_update_safe() to allow STE update algorithm to relax those fields, avoiding the STE update breakages. After this change, entry_set has no caller checking its return value, so change it to void. Note that this change is required by both MEV and EATS fields, which were introduced in different kernel versions. So add get_update_safe() first. MEV and EATS will be added to arm_smmu_get_ste_update_safe() separately. Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED") Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-22	iommu/arm-smmu-v3: Add device-tree support for CMDQV driver	Ashish Mhetre
	Add device tree support to the CMDQV driver to enable usage on Tegra264 SoCs. The implementation parses the nvidia,cmdqv phandle from the SMMU device tree node to associate each SMMU with its corresponding CMDQV instance based on compatible string. Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-22	iommu/tegra241-cmdqv: Decouple driver from ACPI	Nicolin Chen
	A platform device is created by acpi_create_platform_device() per CMDQV's adev. That means there is no point in going through _CRS of ACPI. Replace all the ACPI functions with standard platform functions. And drop all ACPI dependencies. This will make the driver compatible with DT also. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-22	iommu/arm-smmu-qcom: do not register driver in probe()	Danilo Krummrich
	Commit 0b4eeee2876f ("iommu/arm-smmu-qcom: Register the TBU driver in qcom_smmu_impl_init") intended to also probe the TBU driver when CONFIG_ARM_SMMU_QCOM_DEBUG is disabled, but also moved the corresponding platform_driver_register() call into qcom_smmu_impl_init() which is called from arm_smmu_device_probe(). However, it neither makes sense to register drivers from probe() callbacks of other drivers, nor does the driver core allow registering drivers with a device lock already being held. The latter was revealed by commit dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") leading to a deadlock condition described in [1]. Additionally, it was noted by Robin that the current approach is potentially racy with async probe [2]. Hence, fix this by registering the qcom_smmu_tbu_driver from module_init(). Unfortunately, due to the vendoring of the driver, this requires an indirection through arm-smmu-impl.c. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/lkml/7ae38e31-ef31-43ad-9106-7c76ea0e8596@sirena.org.uk/ Link: https://lore.kernel.org/lkml/DFU7CEPUSG9A.1KKGVW4HIPMSH@kernel.org/ [1] Link: https://lore.kernel.org/lkml/0c0d3707-9ea5-44f9-88a1-a65c62e3df8d@arm.com/ [2] Fixes: dc23806a7c47 ("driver core: enforce device_lock for driver_match_device()") Fixes: 0b4eeee2876f ("iommu/arm-smmu-qcom: Register the TBU driver in qcom_smmu_impl_init") Acked-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Bjorn Andersson <andersson@kernel.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Acked-by: Konrad Dybcio <konradybcio@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> #LX2160ARDB Tested-by: Wang Jiayue <akaieurus@gmail.com> Reviewed-by: Wang Jiayue <akaieurus@gmail.com> Tested-by: Mark Brown <broonie@kernel.org> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Link: https://patch.msgid.link/20260121141215.29658-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-01-22	iommu/arm-smmu-qcom: Restore ACTLR settings for MDSS on sa8775p	Bibek Kumar Patro
	The ACTLR configuration for the sa8775p MDSS client was inadvertently dropped while reworking the commit f91879fdf70b ("iommu/arm-smmu-qcom: Add actlr settings for mdss on Qualcomm platforms"). Without this entry, the sa8775p MDSS block does not receive the intended default ACTLR configuration. Restore the missing compatible entry so that the platform receives the expected behavior. Fixes: f91879fdf70b ("iommu/arm-smmu-qcom: Add actlr settings for mdss on Qualcomm platforms") Signed-off-by: Bibek Kumar Patro <bibek.patro@oss.qualcomm.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-22	iommu/vt-d: Fix race condition during PASID entry replacement	Lu Baolu
	The Intel VT-d PASID table entry is 512 bits (64 bytes). When replacing an active PASID entry (e.g., during domain replacement), the current implementation calculates a new entry on the stack and copies it to the table using a single structure assignment. struct pasid_entry pte, new_pte; pte = intel_pasid_get_entry(dev, pasid); pasid_pte_config_first_level(iommu, &new_pte, ...); pte = new_pte; Because the hardware may fetch the 512-bit PASID entry in multiple 128-bit chunks, updating the entire entry while it is active (Present bit set) risks a "torn" read. In this scenario, the IOMMU hardware could observe an inconsistent state — partially new data and partially old data — leading to unpredictable behavior or spurious faults. Fix this by removing the unsafe "replace" helpers and following the "clear-then-update" flow, which ensures the Present bit is cleared and the required invalidation handshake is completed before the new configuration is applied. Fixes: 7543ee63e811 ("iommu/vt-d: Add pasid replace helpers") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260120061816.2132558-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22	iommu/vt-d: Clear Present bit before tearing down context entry	Lu Baolu
	When tearing down a context entry, the current implementation zeros the entire 128-bit entry using multiple 64-bit writes. This creates a window where the hardware can fetch a "torn" entry — where some fields are already zeroed while the 'Present' bit is still set — leading to unpredictable behavior or spurious faults. While x86 provides strong write ordering, the compiler may reorder writes to the two 64-bit halves of the context entry. Even without compiler reordering, the hardware fetch is not guaranteed to be atomic with respect to multiple CPU writes. Align with the "Guidance to Software for Invalidations" in the VT-d spec (Section 6.5.3.3) by implementing the recommended ownership handshake: 1. Clear only the 'Present' (P) bit of the context entry first to signal the transition of ownership from hardware to software. 2. Use dma_wmb() to ensure the cleared bit is visible to the IOMMU. 3. Perform the required cache and context-cache invalidation to ensure hardware no longer has cached references to the entry. 4. Fully zero out the entry only after the invalidation is complete. Also, add a dma_wmb() to context_set_present() to ensure the entry is fully initialized before the 'Present' bit becomes visible. Fixes: ba39592764ed2 ("Intel IOMMU: Intel IOMMU driver") Reported-by: Dmytro Maluka <dmaluka@chromium.org> Closes: https://lore.kernel.org/all/aTG7gc7I5wExai3S@google.com/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260120061816.2132558-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22	iommu/vt-d: Clear Present bit before tearing down PASID entry	Lu Baolu
	The Intel VT-d Scalable Mode PASID table entry consists of 512 bits (64 bytes). When tearing down an entry, the current implementation zeros the entire 64-byte structure immediately using multiple 64-bit writes. Since the IOMMU hardware may fetch these 64 bytes using multiple internal transactions (e.g., four 128-bit bursts), updating or zeroing the entire entry while it is active (P=1) risks a "torn" read. If a hardware fetch occurs simultaneously with the CPU zeroing the entry, the hardware could observe an inconsistent state, leading to unpredictable behavior or spurious faults. Follow the "Guidance to Software for Invalidations" in the VT-d spec (Section 6.5.3.3) by implementing the recommended ownership handshake: 1. Clear only the 'Present' (P) bit of the PASID entry. 2. Use a dma_wmb() to ensure the cleared bit is visible to hardware before proceeding. 3. Execute the required invalidation sequence (PASID cache, IOTLB, and Device-TLB flush) to ensure the hardware has released all cached references. 4. Only after the flushes are complete, zero out the remaining fields of the PASID entry. Also, add a dma_wmb() in pasid_set_present() to ensure that all other fields of the PASID entry are visible to the hardware before the Present bit is set. Fixes: 0bbeb01a4faf ("iommu/vt-d: Manage scalalble mode PASID tables") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260120061816.2132558-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22	iommu/vt-d: Flush piotlb for SVM and Nested domain	Yi Liu
	Besides the paging domains that use FS, SVM and Nested domains need to use piotlb invalidation descriptor as well. Fixes: b33125296b50 ("iommu/vt-d: Create unique domain ops for each stage") Cc: stable@vger.kernel.org Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20251223065824.6164-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-22	iommu/vt-d: Flush cache for PASID table before using it	Dmytro Maluka
	When writing the address of a freshly allocated zero-initialized PASID table to a PASID directory entry, do that after the CPU cache flush for this PASID table, not before it, to avoid the time window when this PASID table may be already used by non-coherent IOMMU hardware while its contents in RAM is still some random old data, not zero-initialized. Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency") Signed-off-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20251221123508.37495-1-dmaluka@chromium.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>