summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
AgeCommit message (Collapse)Author
2026-03-30drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KBDonet Tom
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with 4K pages, both values match (8KB), so allocation and reserved space are consistent. However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB, while the reserved trap area remains 8KB. This mismatch causes the kernel to crash when running rocminfo or rccl unit tests. Kernel attempted to read user page (2) - exploit attempt? (uid: 1001) BUG: Kernel NULL pointer dereference on read at 0x00000002 Faulting instruction address: 0xc0000000002c8a64 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY Tainted: [E]=UNSIGNED_MODULE Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730 REGS: c0000001e0957580 TRAP: 0300 Tainted: G E MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268 XER: 00000036 CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000 IRQMASK: 1 GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100 c00000013d814540 GPR04: 0000000000000002 c00000013d814550 0000000000000045 0000000000000000 GPR08: c00000013444d000 c00000013d814538 c00000013d814538 0000000084002268 GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff 0000000000020000 GPR16: 0000000000000000 0000000000000002 c00000015f653000 0000000000000000 GPR20: c000000138662400 c00000013d814540 0000000000000000 c00000013d814500 GPR24: 0000000000000000 0000000000000002 c0000001e0957888 c0000001e0957878 GPR28: c00000013d814548 0000000000000000 c00000013d814540 c0000001e0957888 NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0 LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00 Call Trace: 0xc0000001e0957890 (unreliable) __mutex_lock.constprop.0+0x58/0xd00 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu] kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu] kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu] kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu] kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu] kfd_ioctl+0x514/0x670 [amdgpu] sys_ioctl+0x134/0x180 system_call_exception+0x114/0x300 system_call_vectored_common+0x15c/0x2ec This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve 64 KB for the trap in the address space, but only allocate 8 KB within it. With this approach, the allocation size never exceeds the reserved area. Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb) Cc: stable@vger.kernel.org
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-01-14drm/amdkfd: Add domain parameter to alloc kernel BOPhilip Yang
To allocate kernel BO from VRAM domain for MQD in the following patch. No functional change because kernel BO allocate all from GTT domain. Rename amdgpu_amdkfd_alloc_gtt_mem to amdgpu_amdkfd_alloc_kernel_mem Rename amdgpu_amdkfd_free_gtt_mem to amdgpu_amdkfd_free_kernel_mem Rename mem_kfd_mem_obj gtt_mem to mem Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-14drm/amdkfd: kfd driver supports hot unplug/replug amdgpu devicesXiaogang Chen
This patch allows kfd driver function correctly when AMD gpu devices got unplug/replug at run time. When an AMD gpu device got unplug kfd driver gracefully terminates existing kfd processes after stops all queues by sending SIGBUS to user process. After that user space can still use remaining AMD gpu devices. When all AMD gpu devices at system got removed kfd driver will not response new requests. Unplugged AMD gpu devices can be re-plugged. kfd driver will use added devices to function as usual. The purpose of this patch is having kfd driver behavior as expected during and after AMD gpu devices unplug/replug at run time. Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add metadata ring buffer for computeDavid Yat Sin
Add support for separate ring-buffer for metadata packets when using compute queues. Userspace application allocate the metadata ring-buffer and the queue ring-buffer with a single allocation. The metadata ring-buffer starts after the queue ring-buffer. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-10drm/amdgpu: Update MES VM_CNTX_CNTL for XNACK off for GFX 12.1Mukul Joshi
Currently, we do not turn off retry faults in VM_CONTEXT_CNTL value when passing it to MES if XNACK is off. This creates a situation where XNACK is disabled in SQ but enabled in UTCL2, which is not recommended. As a result, turn off/on retry faults in both SQ and UTCL2 when passing vm_context_cntl value to MES if XNACK is disabled/enabled. Suggested-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-10drm/amdkfd: Enable per-process XNACK for GFX 12.1.0Mukul Joshi
GFX 12.1.0 will support enabling/disabling XNACK on a per- process basis. This change enables the per process XNACK feature. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESSZhu Lingshan
This commit implemetns a new ioctl AMDKFD_IOC_CREATE_PROCESS that creates a new secondary kfd_progress on the FD. To keep backward compatibility, userspace programs need to invoke this ioctl explicitly on a FD to create a secondary kfd_process which replacing its primary kfd_process. This commit bumps ioctl minor version. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Add interrupt handling for GFX 12.1.0Mukul Joshi
Add interrupt handling for GFX 12.1.0 similar to what is done for GFX 9.4.3. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Add MQD manager for GFX 12.1.0Mukul Joshi
This patch adds the following functionality for GFX 12.1.0: 1. Add a new MQD manager for GFX v12.1.0. 2. Add a new 12.1.0 specific device queue manager file. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: introduce new helper kfd_lookup_process_by_idZhu Lingshan
This commit introduces a new helper function kfd_lookup_process_by_id which can find a kfd process that identified by its context id from the kfd process table Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: process pointer of a HIQ should be NULLZhu Lingshan
In kq_initialize, queue->process of a HIQ should be NULL as initialized, because it does not belong to any kfd_process. This commit decommisions the function kfd_get_process() because it can not locate a specific kfd_process among multiple contexts and not any code path calls it after this commit. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: remove DIQ supportZhu Lingshan
This commit remove DIQ support because it has been marked as DEPRECATED since 2022 Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: identify a secondary kfd process by its idZhu Lingshan
This commit introduces a new id field for struct kfd process, which helps identify a kfd process among multiple contexts that all belong to a single user space program. The sysfs entry of a secondary kfd process is placed under the sysfs entry folder of its primary kfd process. The naming format of the sysfs entry of a secondary kfd process is "context_%u" where %u is the context id. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: destroy kfd secondary contexts through fd closeZhu Lingshan
Life cycle of a KFD secondary context(kfd_process) is tied to the opened file. Therefore this commit destroy a kfd secondary context when close the fd it belonging to. This commit extracts the code removing the kfd_process from the kfd_process_table to a separate function and call it in kfd_process_notifier_release_internal unconditionally. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: Introduce kfd_create_process_sysfs as a separate functionZhu Lingshan
KFD creates sysfs entries for a kfd_process in function kfd_create_process when creating it. This commit extracts the code creating sysfs entries to a separate function because it would be invoked in other code path like creating secondary kfd contexts (kfd_process). Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: mark the first kfd_process as the primary oneZhu Lingshan
The first kfd_process is created through open(), this commit marks it as the primary kfd_process by assigning a primary id for its context_id. Only the primary process should register the mmu_notifier. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08amdkfd: enlarge the hashtable of kfd_processZhu Lingshan
This commit enlarges the hashtable size of kfd_process to 256, because of the multiple contexts feature allowing each application create multiple kfd_processes Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Remove hard‑coded GC IP version checks from kfd_node_by_irq_idsSreekant Somasekharan
Replace the GC IP version hard-coded check with multi-aid check in kfd_node_by_irq_ids(). If aid_mask is not set, we immediately return dev->nodes[0] otherwise we iterate and match using kfd_irq_is_from_node(). Signed-off-by: Sreekant Somasekharan <Sreekant.Somasekharan@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amdkfd: Rework reserved SDMA queue handlingMukul Joshi
We would need to reserve SDMA queues per KFD node. As a result, rework the SDMA reserved queue handling to make it per KFD node. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-25amd/amdkfd: enhance kfd process check in switch partitionYifan Zhang
current switch partition only check if kfd_processes_table is empty. kfd_prcesses_table entry is deleted in kfd_process_notifier_release, but kfd_process tear down is in kfd_process_wq_release. consider two processes: Process A (workqueue) -> kfd_process_wq_release -> Access kfd_node member Process B switch partition -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw -> kfd_node tear down. Process A and B may trigger a race as shown in dmesg log. This patch is to resolve the race by adding an atomic kfd_process counter kfd_processes_count, it increment as create kfd process, decrement as finish kfd_process_wq_release. v2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds and bug fix. (Philip Yang) [3966658.307702] divide error: 0000 [#1] SMP NOPTI [3966658.350818] i10nm_edac [3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted [3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu] [3966658.362839] nfit [3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu] [3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 <f7> be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00 [3966658.380967] x86_pkg_temp_thermal [3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246 [3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000 [3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00 [3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4 [3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000 [3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800 [3966658.391533] FS: 0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000 [3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0 [3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [3966658.391536] PKRU: 55555554 [3966658.391536] Call Trace: [3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu] [3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu] [3966658.399754] intel_powerclamp [3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu] [3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu] [3966658.410516] coretemp [3966658.434016] process_one_work+0x1ad/0x380 [3966658.434021] worker_thread+0x49/0x310 [3966658.438963] kvm_intel [3966658.446041] ? process_one_work+0x380/0x380 [3966658.446045] kthread+0x118/0x140 [3966658.446047] ? __kthread_bind_mask+0x60/0x60 [3966658.446050] ret_from_fork+0x1f/0x30 [3966658.446053] Modules linked in: kpatch_20765354(OEK) [3966658.455310] kvm [3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK) [3966658.473462] idxd_mdev [3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqos_hotfixes(OE) rfkill uio_pci_generic uio cuse fuse nf_tables nfnetlink intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm idxd_mdev [3966658.491237] vfio_pci [3966658.501196] vfio_pci vfio_virqfd mdev vfio_iommu_type1 vfio iax_crypto intel_pmt_telemetry iTCO_wdt intel_pmt_class iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_seq [3966658.508537] vfio_virqfd [3966658.517569] snd_seq_device ipmi_ssif isst_if_mbox_pci isst_if_mmio pcspkr snd_pcm idxd intel_uncore ses isst_if_common intel_vsec idxd_bus enclosure snd_timer mei_me snd i2c_i801 i2c_smbus mei i2c_ismt soundcore joydev acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad vfat fat [3966658.526851] mdev [3966658.536096] nfsd auth_rpcgss nfs_acl lockd grace slb_vtoa(OE) sunrpc dm_mod hookers mlx5_ib(OE) ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ttm mlx5_core(OE) mlxfw(OE) [3966658.540381] vfio_iommu_type1 [3966658.544341] nvme mpt3sas tls drm nvme_core pci_hyperv_intf raid_class psample libcrc32c crc32c_intel mlxdevm(OE) i2c_core [3966658.551254] vfio [3966658.558742] scsi_transport_sas wmi pinctrl_emmitsburg sd_mod t10_pi sg ahci libahci libata rdma_ucm(OE) ib_uverbs(OE) rdma_cm(OE) iw_cm(OE) ib_cm(OE) ib_umad(OE) ib_core(OE) ib_ucm(OE) mlx_compat(OE) [3966658.563004] iax_crypto [3966658.570988] [last unloaded: diagnose] [3966658.571027] ---[ end trace cc9dbb180f9ae537 ]--- Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip.Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-08-27drm/amdkfd: Tie UNMAP_LATENCY to queue_preemptionAmber Lin
When KFD asks CP to preempt queues, other than preempt CP queues, CP also requests SDMA to preempt SDMA queues with UNMAP_LATENCY timeout. Currently queue_preemption_timeout_ms is 9000 ms by default but can be configured via module parameter. KFD_UNMAP_LATENCY_MS is hard coded as 4000 ms though. This patch ties KFD_UNMAP_LATENCY_MS to queue_preemption_timeout_ms so in a slow system such as emulator, both CP and SDMA slowness are taken into account. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-18drm/amdkfd: allow compute partition mode switch with cgroup exclusionsJonathan Kim
The KFD currently bars a compute partition mode switch while a KFD process exists. Since cgroup excluded devices remain excluded for the lifetime of a KFD process and user space is able to mode switch single devices, allow users to mode switch a device with any running process that has been cgroup excluded from this device. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07amd/amdkfd: Trigger segfault for early userptr unmmappingShane Xiao
If applications unmap the memory before destroying the userptr, it needs trigger a segfault to notify user space to correct the free sequence in VM debug mode. v2: Send gpu access fault to user space v3: Report gpu address to user space, remove unnecessary params v4: update pr_err into one line, remove userptr log info Signed-off-by: Shane Xiao <shane.xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-30drm/amdkfd: add pasid debugfs entriesEric Huang
the entries will be appearing at /sys/kernel/debug/kfd/proc/<pid>/pasid_<gpuid>. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-10drm/amdkfd: Add pm_config_dequeue_wait_counts APIHarish Kasiviswanathan
Update pm_update_grace_period() to more cleaner pm_config_dequeue_wait_counts(). Previously, grace_period variable was overloaded as a variable and a macro, making it inflexible to configure additional dequeue wait times. pm_config_dequeue_wait_counts() now takes in a cmd / variable. This allows flexibility to update different dequeue wait times. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07drm/amdkfd: remove unused debug gws support status variableJonathan Kim
Remove unused declaration of gws_debug_workaround. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Amber Lin <amber.lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05drm/amdgpu: Remove unused pqm_get_kernel_queueDr. David Alan Gilbert
pqm_get_kernel_queue() has been unused since 2022's commit 5bdd3eb25354 ("drm/amdkfd: Remove unused old debugger implementation") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdkfd: Remove unused functionsDr. David Alan Gilbert
kfd_device_by_pci_dev(), kfd_get_pasid_limit() and kfd_set_pasid_limit() have been unused since 2023's commit c99a2e7ae291 ("drm/amdkfd: drop IOMMUv2 support") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdkfd: Have kfd driver use same PASID values from graphic driverXiaogang Chen
Current kfd driver has its own PASID value for a kfd process and uses it to locate vm at interrupt handler or mapping between kfd process and vm. That design is not working when a physical gpu device has multiple spatial partitions, ex: adev in CPX mode. This patch has kfd driver use same pasid values that graphic driver generated which is per vm per pasid. These pasid values are passed to fw/hardware. We do not need change interrupt handler though more pasid values are used. Also, pasid values at log are replaced by user process pid; pasid values are not exposed to user. Users see their process pids that have meaning in user space. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amdkfd: always include uapi header in priv.hZhu Lingshan
The header usr/linux/kfd_ioctl.h is a duplicate of uapi/linux/kfd_ioctl.h. And it is actually not a file in the source code tree. Ideally, the usr version should be updated whenever the source code is recompiled. However, I have noticed a discrepancy between the two headers even after rebuilding the kernel. This commit modifies kfd_priv.h to always include the header from uapi to ensure the latest changes are reflected. We should always include the source code header other than the duplication. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: Queue interrupt work to different CPUPhilip Yang
For CPX mode, each KFD node has interrupt worker to process ih_fifo to send events to user space. Currently all interrupt workers of same adev queue to same CPU, all workers execution are actually serialized and this cause KFD ih_fifo overflow when CPU usage is high. Use per-GPU unbounded highpri queue with number of workers equals to number of partitions, let queue_work select the next CPU round robin among the local CPUs of same NUMA. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Optimize gfx v9 GPU page fault handlingPhilip Yang
After GPU page fault, there are lots of page fault interrupts generated at short period even with CAM filter enabled because the fault address is different. Each page fault copy to KFD ih fifo to send event to user space by KFD interrupt worker, this could cause KFD ih fifo overflow while other processes generate events at same time. KFD process is aborted after GPU page fault, we only need one GPU page fault interrupt sent to KFD ih fifo to send memory exception event to user space. Incease KFD ih fifo size to 2 times of IH primary ring size, to handle the burst events case. This patch handle the gfx v9 path, cover retry on/off and CAM filter on/off cases. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdkfd: add gc 9.5.0 support on kfdAlex Sierra
Initial support for GC 9.5.0. v2: squash in pqm_clean_queue_resource() fix from Lijo Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-15drm/amdkfd: Accounting pdd vram_usage for svmPhilip Yang
Process device data pdd->vram_usage is read by rocm-smi via sysfs, this is currently missing the svm_bo usage accounting, so "rocm-smi --showpids" per process VRAM usage report is incorrect. Add pdd->vram_usage accounting when svm_bo allocation and release, change to atomic64_t type because it is updated outside process mutex now. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdkfd: Remove an unused parameter in queue creationLang Yu
struct file *f is unused in queue creation, remove it. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: Surface svm_default_granularity, a RW module parameterRamesh Errabolu
Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page faults. Param value is set in terms of log(numPages(buffer)), e.g. 9 for a 2 MIB buffer Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23drm/amdkfd: Change kfd/svm page fault drain handlingXiaogang Chen
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and not handle any incoming pages fault of this process until a deferred work item got executed by default system wq. The time period of "not handle page fault" can be long and is unpredicable. That is advese to kfd performance on page faults recovery. This patch uses time stamp of incoming page fault to decide to drop or recover page fault. When app unmap vm ranges kfd records each gpu device's ih ring current time stamp. These time stamps are used at kfd page fault recovery routine. Any page fault happened on unmapped ranges after unmap events is application bug that accesses vm range after unmap. It is not driver work to cover that. By using time stamp of page fault do not need drain page faults at deferred work. So, the time period that kfd does not handle page faults is reduced and can be controlled. Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-20drm/amdkfd: Update BadOpcode Interrupt handling with MESMukul Joshi
Based on the recommendation of MEC FW, update BadOpcode interrupt handling by unmapping all queues, removing the queue that got the interrupt from scheduling and remapping rest of the queues back when using MES scheduler. This is done to prevent the case where unmapping of the bad queue can fail thereby causing a GPU reset. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdkfd: Handle queue destroy buffer access racePhilip Yang
Add helper function kfd_queue_unreference_buffers to reduce queue buffer refcount, separate it from release queue buffers. Because it is circular locking to hold dqm_lock to take vm lock, kfd_ioctl_destroy_queue should take vm lock, unreference queue buffers first, but not release queue buffers, to handle error in case failed to hold vm lock. Then hold dqm_lock to remove queue from queue list and then release queue buffers. Restore process worker restore queue hold dqm_lock, will always find the queue with valid queue buffers. v2 (Felix): - renamed kfd_queue_unreference_buffer(s) to kfd_queue_unref_bo_va(s) - added two FIXME comments for follow up Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06drm/amdkfd: fix debug watchpoints for logical devicesJonathan Kim
The number of watchpoints should be set and constrained per logical partition device, not by the socket device. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-06drm/amdkfd: support per-queue reset on gfx9Jonathan Kim
Support per-queue reset for GFX9. The recommendation is for the driver to target reset the HW queue via a SPI MMIO register write. Since this requires pipe and HW queue info and MEC FW is limited to doorbell reports of hung queues after an unmap failure, scan the HW queue slots defined by SET_RESOURCES first to identify the user queue candidates to reset. Only signal reset events to processes that have had a queue reset. If queue reset fails, fall back to GPU reset. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-25drm/amdkfd: allow users to target recommended SDMA enginesJonathan Kim
Certain GPUs have better copy performance over xGMI on specific SDMA engines depending on the source and destination GPU. Allow users to create SDMA queues on these recommended engines. Close to 2x overall performance has been observed with this optimization. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-24drm/amdkfd: Store queue cwsr area size to node propertiesPhilip Yang
Use the queue eop buffer size, cwsr area size, ctl stack size calculation from Thunk, store the value to KFD node properties. Those will be used to validate queue eop buffer size, cwsr area size, ctl stack size when creating KFD user compute queue. Those will be exposed to user space via sysfs KFD node properties, to remove the duplicate calculation code from Thunk. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23drm/amdkfd: Ensure user queue buffers residencyPhilip Yang
Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap BO from the GPU if the bo_va queue_refcount is not zero. Create queue to increase the bo_va queue_refcount, destroy queue to decrease the bo_va queue_refcount, to ensure the queue buffers mapped on the GPU when queue is active. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23drm/amdkfd: Validate user queue buffersPhilip Yang
Find user queue rptr, ring buf, eop buffer and cwsr area BOs, and check BOs are mapped on the GPU with correct size and take the BO reference. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-23drm/amdkfd: Refactor queue wptr_bo GART mappingPhilip Yang
Add helper function kfd_queue_acquire_buffers to get queue wptr_bo reference from queue write_ptr if it is mapped to the KFD node with expected size. Add wptr_bo to structure queue_properties because structure queue is allocated after queue buffers are validated, then we can remove wptr_bo parameter from pqm_create_queue. Rename structure queue wptr_bo_gart to hold wptr_bo reference for GART mapping and umapping. Move MES wptr_bo_gart mapping to init_user_queue, the same location with queue ctx_bo GART mapping. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14drm/amdgpu/kfd: remove is_hws_hang and is_resettingYunxiang Li
is_hws_hang and is_resetting serves pretty much the same purpose and they all duplicates the work of the reset_domain lock, just check that directly instead. This also eliminate a few bugs listed below and get rid of dqm->ops.pre_reset. kfd_hws_hang did not need to avoid scheduling another reset. If the on-going reset decided to skip GPU reset we have a bad time, otherwise the extra reset will get cancelled anyway. remove_queue_mes forgot to check is_resetting flag compared to the pre-MES path unmap_queue_cpsch, so it did not block hw access during reset correctly. Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02drm/amdkfd: Added MQD manager files for GFX12.David Belanger
Initial implementation, based on GFX11. v2: Removed dbg_wa code as not needed on GFX12. v3: squash in SDMA queue fixes (Alex) v4: rebase (Alex) Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>