summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2026-01-05drm/amdgpu: Add sysfs up clean for gfx_v12_1Asad Kamal
Add sysfs clean up for gfx_v12_1 during gfx fini sequence. This will prevent following crash while reloading driver 2645.490824] R13: 000055d0cb186330 R14: 000055d0cb185ed0 R15: 000055d0cb188f40 [ 2645.490825] </TASK> [ 2645.490836] amdgpu 0000:02:00.0: amdgpu: failed to create xcp sysfs files [ 2645.490937] amdgpu 0000:02:00.0: amdgpu: sw_init of IP block <gfx_v12_1> failed -17 [ 2645.491018] amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_init failed [ 2645.491098] amdgpu 0000:02:00.0: amdgpu: Fatal error during GPU init [ 2645.491547] amdgpu 0000:02:00.0: amdgpu: amdgpu: finishing device. [ 2648.549939] ------------[ cut here ]------------ [ 2648.549942] WARNING: CPU: 0 PID: 2459 at /tmp/amd.aIpOeG3c/amd/amdgpu/amdgpu_irq.c: Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add metadata ring buffer for computeDavid Yat Sin
Add support for separate ring-buffer for metadata packets when using compute queues. Userspace application allocate the metadata ring-buffer and the queue ring-buffer with a single allocation. The metadata ring-buffer starts after the queue ring-buffer. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12_1Shaoyun Liu
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Update TCP Control register on GFX 12.1Mukul Joshi
Update TCP CNTL register to disable some features not supported on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add back CWSR trap handler for GFX 12.1Mukul Joshi
CWSR Trap handler for GFX 12.1 was missed when merging changes from 6.14 NPI branch to 6.16 NPI branch. This change adds back the CWSR trap handler for GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Cleanup gmc_v12_1 after 6.16 mergeMukul Joshi
After the 6.16 merge, some changes not applicable to GFX 12.1 were added in the gmc_v12_1_get_vm_pte function. Additionally, add the case for MTYPE RW for GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Disable TCP Early Write Ack for GFX 12.1Mukul Joshi
Disable the TCP Early Write Ack feature on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: enable precise memory operations for gfx1250Jonathan Kim
Enable precise memory for GFX 1250. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: fix partitioned gfx12 address watch enablementJonathan Kim
GFX 12 devices that support spatial partitioning should use the WREG32 per XCC macro when updating address watch settings, similar to GFX 9 devices that support spatial partitioning. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Implement CU Masking for GFX 12.1Mukul Joshi
Add CU masking implementation for GFX 12.1. Add a local implementation for GFX 12.1 instead of using the generic function defined in kfd_mqd_manager.c because of some quirks in the way CU mask is handled on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: skip gfxhub tlb flush if gfx is power offLikun Gao
Skip for gfxhub tlb flush for gc v12_1 if gfx is not poweron. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add gfx_v12_1_kfd2kgd interface for GFX12_1Likun Gao
Create new kfd2kgd interface for gfx v12_1, based on gfx v12. Support register program accoding to xcc id. V2: Fix SDMA register address for muti-xcc. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: update mcm_addr_lut data for imu v12_1Likun Gao
Support for partition mode to program MCM_ADDR_LUT. v2: clean up (Alex) Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Init mcm_addr look up tableHawking Zhang
Encode mcm address look up table in SPX mode as a temp solution. v2: fill in when interface is ready (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Always set PTE.B for device memory on GFX 12.1Mukul Joshi
On GFX 12.1, we need to set the atomics bit (PTE.B) always for device memory. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu/gfx12.1: Don't fetch default register values from hardware in mqd ↵Lang Yu
init 1. We can't assure the fetched values are always default register values. Observing non-zero cp_hqd_pq_rptr in mes_v12_1_self_test->init_mqd() where no GRBM_GFX_CNTL is specified. 2. See commit fc3c139cf043 ("drm/amdgpu/gfx12: don't read registers in mqd init"). Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Convert DRM_*() to drm_*()Mario Limonciello (AMD)
The drm_*() macros include the device which is helpful for debugging issues in multi-GPU systems. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Drop amdgpu prefix from message printsMario Limonciello (AMD)
Hardcoding the prefix isn't necessary when using drm_* or dev_* message prints. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Convert amdgpu_display from DRM_* to drm_ macrosMario Limonciello (AMD)
drm_* macros show the device they were called with which is helpful in multi-GPU systems. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd/display: Fix DPMS log printingMario Limonciello (AMD)
[Why] Spaces before newline are not necessary. Inserting newlines in multi-line strings are harder to follow when tracing messages. [How] Drop extra new lines and split multi-line messages into one print per line. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Drop dev_fmt prefixMario Limonciello (AMD)
The `amdgpu:` prefix in dev_fmt() isn't needed because the core already includes the driver in the print. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Pass `adev` to amdgpu_gfx_parse_disable_cu()Mario Limonciello (AMD)
In order for messages to be attribute to the correct device amdgpu_gfx_parse_disable_cu() needs to know what device is being operated on. Pass the argument in. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Add correct prefix for VBIOS messageMario Limonciello (AMD)
It's not obvious which GPU the ATOM BIOS message goes with. Use drm_info() to show the correct one. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Correct the topology message for APUsMario Limonciello (AMD)
At bootup on a Strix machine the following message comes up: ``` amdgpu: Topology: Add dGPU node [0x150e:0x1002] ``` This is an APU though. Clarify the messaging by only offer a "CPU node" or "GPU node" message. Also set the message as VID:DID instead which is how other messages work. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Fix signal_eviction_fence() bool return valueSrinivasan Shanmugam
signal_eviction_fence() is declared to return bool, but returns -EINVAL when no eviction fence is present. This makes the "no fence" or "the NULL-fence" path evaluate to true and triggers a Smatch warning. v2: Return true instead to explicitly indicate that there is no eviction fence to signal and that eviction is already complete. This matches the existing caller logic where a NULL fence means "nothing to do" and allows restore handling to proceed normally. (Christian) Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:2099 signal_eviction_fence() warn: '(-22)' is not bool drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c 2090 static bool signal_eviction_fence(struct kfd_process *p) ^^^^ 2091 { 2092 struct dma_fence *ef; 2093 bool ret; 2094 2095 rcu_read_lock(); 2096 ef = dma_fence_get_rcu_safe(&p->ef); 2097 rcu_read_unlock(); 2098 if (!ef) --> 2099 return -EINVAL; This should be either true or false. Probably true because presumably it has been tested? 2100 2101 ret = dma_fence_check_and_signal(ef); 2102 dma_fence_put(ef); 2103 2104 return ret; 2105 } Fixes: 37865e02e6cc ("drm/amdkfd: Fix eviction fence handling") Reported by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Philip Yang <Philip.Yang@amd.com> Cc: Gang BA <Gang.Ba@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd/pm: fix wrong pcie parameter on navi1xYang Wang
fix wrong pcie dpm parameter on navi1x Fixes: 1a18607c07bb ("drm/amd/pm: override pcie dpm parameters only if it is necessary") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4671 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Co-developed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd: Drop "amdgpu kernel modesetting enabled" messageMario Limonciello (AMD)
The behavior for amdgpu was changed with commit e00e5c223878 ("drm/amdgpu: adjust drm_firmware_drivers_only() handling") to potentially allow loading even if nomodeset was set, so the message is no longer accurate. Just drop it to avoid confusion. Fixes: e00e5c223878 ("drm/amdgpu: adjust drm_firmware_drivers_only() handling") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Add address checking for unirasJinzhou Su
Add address checking for uniras Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/radeon: Remove __counted_by from ClockInfoArray.clockInfo[]Alex Deucher
clockInfo[] is a generic uchar pointer to variable sized structures which vary from ASIC to ASIC. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4374 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for MMHUB IP version 3.4.0Tim Huang
This initializes MMHUB IP version 3.4.0. v2: squash in clients table update (Alex) Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for HDP IP version 6.1.1Tim Huang
This initializes HDP IP version 6.1.1. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for IH IP version 6.1.1Tim Huang
This initializes IH IP version 6.1.1. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for NBIO IP version 7.11.4Tim Huang
This initializes NBIO IP version 7.11.4. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for SDMA IP version 6.1.4Tim Huang
This initializes SDMA IP version 6.1.4. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: add support for GC IP version 11.5.4Tim Huang
This initializes GC IP version 11.5.4. v2: squash in RLC offset fix Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Fix xcc_id input for soc_v1_0_grbm_selectHawking Zhang
Ensure the GRBM_GFX_CNTL is programmed correctly Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Do not initialize imu callback for vfHawking Zhang
Not needed in guest environment Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: make normalize reg addr to common func for soc v1Likun Gao
Normalize registers address to local xcc address for sdma v7_1. Merge normalize register address function to an common function for soc v1. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Setup MTYPE on SOC models for GFX 12.1Mukul Joshi
Fix it to apply for all models. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Report correct compute partition mode on GFX 12.1Mukul Joshi
PSP programs the NBIO partition status register. In the absence of PSP, read the current compute partition from the GFX IMU register instead of NBIO. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Send MES packets on correct XCC on GFX 12.1Mukul Joshi
Send the Set_Shader_Debugger packet on the correct MES pipe when partition mode is set to non-SPX mode. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add/remove queues on the correct XCC on GFX 12.1Mukul Joshi
On GFX 12.1, pass the xcc id of the master XCC to choose the correct MES Pipe to send the add_queue/remove_queue requests to MES. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Don't partition VMID space on GFX 12.1Mukul Joshi
There is no need to partition VMID space on GFX 12.1 when operating in CPX mode as SDMA is not sharing MMHUB on GFX 12.1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Rework MES initialization on GFX 12.1Mukul Joshi
Currently, only SPX mode works on GFX 12.1. This patch reworks the MES initialization to get other non-SPX modes working. For example, for CPX mode, coop_enable bit needs to be set to 0. The shared command buffer initialization is also not needed in CPX mode. The shared command buffer initialization needs further improvements which will be handled in later patches. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: Use correct MES pipe in non-SPX mode on GFX 12.1Mukul Joshi
On GFX 12.1, use the correct MES pipe instance for readiness before sending MES commands on that pipe. Additionally, send the TLB requests on the correct MES pipe in non-SPX modes. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: adjust xcc_id program logic for sdma v7_1Likun Gao
Adjust program logic for sdam v7_1, only use physical xcc_id when program register to support compute partition. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: adjust xcc logic for gfxhub v12_1Likun Gao
Adjust xcc_id logic to only use physical xcc_id when program register, (use logic xcc_id by default), to fit for compute partition. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdgpu: adjust xcc_cp_resume function for gfx_v12_1Likun Gao
Adjust gfx_v12_1_xcc_cp_resume function to program cp resume per xcc_id (logic xcc number) to fix for xcp_resume. V2: Allocate compute microcode bo when sw init Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Add SDMA queue quantum support for GFX12.1Gang Ba
program SDMAx_QUEUEx_SCHEDULE_CNTL for context switch due to quantum in KFD for GFX12.1 Signed-off-by: Gang Ba <Gang.Ba@amd.com> Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amdkfd: Set SDMA_QUEUEx_IB_CNTL/SWITCH_INSIDE_IBGang Ba
When submitting MQD to CP, set SDMA_QUEUEx_IB_CNTL/SWITCH_INSIDE_IB bit so it'll allow SDMA preemption if there is a massive command buffer of long-running SDMA commands. Signed-off-by: Gang Ba <Gang.Ba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>