summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2025-11-04drm/amd/display: Persist stream refcount through restoreJoshua Aberback
[Why & How] Overwriting the refcount on stream restore can lead to double-free errors or memory leaks if an unbalanced number of retains and releases occurs between a backup and restore. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Add Pstate viewport reductionAustin Zheng
[Why/How] Add struct to hold calculated reduced viewport pstate recout reduction lines per plane Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Refactor VActive implementationAustin Zheng
[Why & How] Refactors VActive accounting in PMO, and breaks down fill time requirement by P-State type as it can result in drasitcally different bandwidth requirements depending on the blackout length. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Update P-state naming for clarity.Austin Zheng
[Why & How] P-state can refer to different things like UCLK P-state, PPT, or temp read Update naming for clarity Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Remove old PMO optionsAustin Zheng
[Why & How] Removes deprecated or unused PMO options. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Add pte_buffer_mode and force_one_row_for_frame in dchub regAustin Zheng
[Why & How] Update structs for rq regs Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Extend inbox0 lock to run Replay/PSRAndrew Mazour
[Why] The inbox1 infrastructure is deprecated, so to support display power features requiring a DMUB interlock moving forward extend the inbox0 locking conditions to also include Replay or PSR. [How] Implemented a series of changes to improve HW lock handling: - Deprecated should_use_dmub_inbox1_lock() and guarded it with DCN401 flag. - Migrated lock checks into inbox0 helpers and added PSR/Replay enablement checks to ensure correct behavior. - Updated HWSS fast update path to acquire HW lock as needed using the new helpers. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Andrew Mazour <Andrew.Mazour@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: fw locality check refactorsWenjing Liu
[why] There are some new changes for HDCP2 firmware locality check. The implementation doesn't perfectly fit the intended design and clarity. 1. Clarify and consolidate variable responsibilities. The previous implementation introduced the following variables: - config.ddc.funcs.atomic_write_poll_read_i2c (optional pointer) - hdcp->config.ddc.funcs.atomic_write_poll_read_aux (optional pointer) - hdcp->connection.link.adjust.hdcp2.force_sw_locality_check (bool) - hdcp->config.debug.lc_enable_sw_fallback (bool) - use_fw (bool) They will be used together to determine two operations: - Whether to use FW locality check - Whether to use SW fallback on FW locality check failure The refactor streamlines this by introducing two variables in the hdcp2 link adjustment, while ensuring function pointers are always assigned and remain independent from policy decisions: - use_fw_locality_check (bool) -> true if fw locality should be used. - use_sw_locality_fallback (bool) -> true to reset use_fw_locality_check back to false and retry on fw locality check failure. 2. Mixed meanings of l_prime_read transition input l_prime_read originally means if l_prime is read when sw locality check is used. When FW locality check is used, l_prime_read means if lc init write, l prime poll and l_prime read combo operation is successful. The mix of meanings is confusing. The refactor introduces a new variable l_prime_combo_read to isolate the second meaning into its own variable. 3. Missing specific error code on firmware locality error. The original change reuses the generic DDC failure error code when firmware fails to return locality check result. This is not ideal as DDC failure indicates an error occurred during an I2C/AUX transaction. FW locality failure could be caused by polling timeout in firmware or failure to acquire firmware access. Which sits at a higher level of abstraction above DDC hardware. An incorrect error code could mislead the debug into a wrong direction. 4. Correcting misplaced comments. The previous implementation of the firmware locality check resulted in some comments in hdcp2_transition being incorrectly positioned. This refactor relocates those comments to their appropriate locations for better clarity. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: Implement user queue reset functionalityJesse.Zhang
This patch adds robust reset handling for user queues (userq) to improve recovery from queue failures. The key components include: 1. Queue detection and reset logic: - amdgpu_userq_detect_and_reset_queues() identifies failed queues - Per-IP detect_and_reset callbacks for targeted recovery - Falls back to full GPU reset when needed 2. Reset infrastructure: - Adds userq_reset_work workqueue for async reset handling - Implements pre/post reset handlers for queue state management - Integrates with existing GPU reset framework 3. Error handling improvements: - Enhanced state tracking with HUNG state - Automatic reset triggering on critical failures - VRAM loss handling during recovery 4. Integration points: - Added to device init/reset paths - Called during queue destroy, suspend, and isolation events - Handles both individual queue and full GPU resets The reset functionality works with both gfx/compute and sdma queues, providing better resilience against queue failures while minimizing disruption to unaffected queues. v2: add detection and reset calls when preemption/unmaped fails. add a per device userq counter for each user queue type.(Alex) v3: make sure we hold the adev->userq_mutex when we call amdgpu_userq_detect_and_reset_queues. (Alex) warn if the adev->userq_mutex is not held. v4: make sure we have all of the uqm->userq_mutex held. warn if the uqm->userq_mutex is not held. v5: Use array for user queue type counters.(Alex) all of the uqm->userq_mutex need to be held when calling detect and reset. (Alex) v6: fix lock dep warning in amdgpu_userq_fence_dence_driver_process v7: add the queue types in an array and use a loop in amdgpu_userq_detect_and_reset_queues (Lijo) v8: remove atomic_set(&userq_mgr->userq_count[i], 0). it should already be 0 since we kzalloc the structure (Alex) v9: For consistency with kernel queues, We may want something like: amdgpu_userq_is_reset_type_supported (Alex) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Don't stretch non-native images by default in eDPMario Limonciello (AMD)
commit 978fa2f6d0b12 ("drm/amd/display: Use scaling for non-native resolutions on eDP") started using the GPU scaler hardware to scale when a non-native resolution was picked on eDP. This scaling was done to fill the screen instead of maintain aspect ratio. The idea was supposed to be that if a different scaling behavior is preferred then the compositor would request it. The not following aspect ratio behavior however isn't desirable, so adjust it to follow aspect ratio and still try to fill screen. Note: This will lead to black bars in some cases for non-native resolutions. Compositors can request the previous behavior if desired. Fixes: 978fa2f6d0b1 ("drm/amd/display: Use scaling for non-native resolutions on eDP") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4538 Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdkfd: Fix Unchecked Return ValuesSunday Clement
Properly Check for return values from calls to debug functions in runtime_disable(). v2: storing the last non zero returned value from the loop. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd: Unwind for failed device suspendMario Limonciello (AMD)
If device suspend has failed, add a recovery flow that will attempt to unwind the suspend and get things back up and running. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4627 Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd: Add an unwind for failures in amdgpu_device_ip_suspend_phase2()Mario Limonciello (AMD)
If any hardware IPs involved with the second phase of suspend fail, unwind all steps to restore back to original state. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd: Add an unwind for failures in amdgpu_device_ip_suspend_phase1()Mario Limonciello (AMD)
If any hardware IPs involved with the first phase of suspend fail, unwind all steps to restore back to original state. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: Drop PMFW RLC notifier from amdgpu_device_suspend()Alex Deucher
For S3 on vangogh, PMFW needs to be notified before the driver powers down RLC. This already happens in smu_disable_dpms() so drop the superfluous call in amdgpu_device_suspend(). Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Use correct severity for BP threshold exceed eventXiang Liu
The severity of CPER for BP threshold exceed event should be set as FATAL to match the OOB implementation. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Correct info field of bad page threshold exceed CPERXiang Liu
Correct valid_bits and ms_chk_bits of section info field for bad page threshold exceed CPER to match OOB's behavior. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: remove unneeded semicolonJiapeng Chong
No functional modification involved. ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:7392:3-4: Unneeded semicolon. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=26821 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: remove unneeded semicolonJiapeng Chong
No functional modification involved. ./drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c:1850:3-4: Unneeded semicolon. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=26821 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: remove unneeded semicolonJiapeng Chong
No functional modification involved. ./drivers/gpu/drm/amd/display/dc/resource/dcn401/dcn401_resource.c:1674:3-4: Unneeded semicolon. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=26821 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/pm/si: Delete unused structs and fieldsTimur Kristóf
The contents of si_dpm.h seem to have been copied from the old radeon driver, including a lot of structs and fields which were only relevant to GPU generations even older than SI. A lot of these can be deleted without causing much churn to the actual SI DPM code. Let's delete them to make the code easier to understand. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/pm: Use gpu metrics 1.9 for SMUv13.0.6Lijo Lazar
Fill and publish GPU metrics in v1.9 format for SMUv13.0.6 SOCs Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/pm: Add helper functions for gpu metricsLijo Lazar
Add helper macros to define metrics struct definitions. It will define structs with field type followed by actual field. A helper macro is also added to initialize the field encoding for all fields and to initialize the field members to 0xFFs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/pm: fix missing device_attr cleanup in amdgpu_pm_sysfs_init()Yang Wang
Use the correct label to complete all cleanup work. Fixes: 4d154b1ca580 ("drm/amd/pm: Add support for DPM policies") Fixes: 25e82f2e2c59 ("drm/amd/pm: Add temperature metrics sysfs entry") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/pm: fix the issue of size calculation error for smu 13.0.6Yang Wang
v1: the driver should handle return value of smu_v13_0_6_printk_clk_levels() to return the correct size for sysfs reads. v2: fix the issue of size calculation error in smu_v13_0_6_print_clks() Fixes: cdfdec6f1608 ("drm/amd/pm: Avoid writing nulls into `pp_od_clk_voltage`") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Update IPID value for bad page threshold CPERXiang Liu
The IPID register value for bad page threshold CPER holds socket_id info now according to the latest definition. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/display: Fix null pointer on analog detectionHarry Wentland
Check if we have an amdgpu_dm_connector->dc_sink first before adding common modes for analog outputs. If we don't have a sink yet we can safely skip this. Fixes: 70181ad96ec2 ("drm/amd/display: Add common modes to analog displays without EDID") Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: Fix error injection parameter errorYiPeng Chai
Fix error injection parameter error. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Fix the error of undefined reference to `__udivdi3'YiPeng Chai
Fix the error: drivers/gpu/drm/amd/amdgpu/../ras/ras_mgr/amdgpu_ras_mgr.c:132:undefined reference to `__udivdi3' Fixes: fa0b203cd902 ("drm/amd/ras: Add amdgpu ras management function.") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510272144.6SUHUoWx-lkp@intel.com/ Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: Remove invalidate and flush hdp macrosAsad Kamal
Remove amdgpu_asic_flush_hdp & amdgpu_asic_invalidate_hdp functions and directly use the mapped ones Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Add CPER ring read for unirasXiang Liu
Read CPER raw data from debugfs node "/sys/kernel/debug/dri/*/ amdgpu_ring_cper". Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: move reset debug disable handlingAlex Deucher
Move everything to the supported resets masks rather than having an explicit misc checks for this. Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: Update invalidate and flush hdp functionAsad Kamal
Update asic_invalidate_hdp and asic_flush_hdp function to check if ip function exist, if not return void v2: Use else/if (Kevin) Update function name (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: caller should make sure not to double freeSunil Khatri
Remove the NULL check from amdgpu_hmm_range_free for hmm_pfns as caller is responsible not to call amdgpu_hmm_range_free more than once. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: set default gfx reset masks for gfx6-8Alex Deucher
These were not set so soft recovery was inadvertantly disabled. Fixes: 6ac55eab4fc4 ("drm/amdgpu: move reset support type checks into the caller") Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdkfd: clean up the code to free hmm_rangeSunil Khatri
a. hmm_range is either NULL or a valid pointer so we do not need to set range to NULL ever. b. keep the hmm_range_free in the end irrespective of the other conditions to avoid some additional checks and also avoid double free issue. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/xe: Remove last fence dependency check from binds and execsMatthew Brost
Eliminate redundant last fence dependency checks in exec and bind jobs, as they are now equivalent to xe_exec_queue_is_idle. Simplify the code by removing this dead logic. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-7-matthew.brost@intel.com
2025-11-04drm/xe: Disallow input fences on zero batch execs and zero bindsMatthew Brost
Prevent input fences from being installed on zero batch execs or zero binds, which were originally added to support queue idling in Mesa via output fences. Although input fence support was introduced for interface consistency, it leads to incorrect behavior due to chained composite fences, which are disallowed. Avoid the complexity of fixing this by removing support, as input fences for these cases are not used in practice. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-6-matthew.brost@intel.com
2025-11-04drm/xe: Skip TLB invalidation waits in page fault bindsMatthew Brost
Avoid waiting on unrelated TLB invalidations when servicing page fault binds. Since the migrate queue is shared across processes, TLB invalidations triggered by other processes may occur concurrently but are not relevant to the current bind. Teach the bind pipeline to skip waits on such invalidations to prevent unnecessary serialization. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-5-matthew.brost@intel.com
2025-11-04drm/xe: Decouple bind queue last fence from TLB invalidationsMatthew Brost
Separate the bind queue’s last fence to apply exclusively to the bind job, avoiding unnecessary serialization on prior TLB invalidations. Preserve correct user fence signaling by merging bind and TLB invalidation fences later in the pipeline. v3: - Fix lockdep assert for migrate queues (CI) - Use individual dma fence contexts for array out fences (Testing) - Don't set last fence with arrays (Testing) - Move TLB invalid last fence under migrate lock (Testing) - Don't set queue last for migrate queues (Testing) Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6047 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-4-matthew.brost@intel.com
2025-11-04drm/xe: Attach last fence to TLB invalidation job queuesMatthew Brost
Add support for attaching the last fence to TLB invalidation job queues to address serialization issues during bursts of unbind jobs. Ensure that user fence signaling for a bind job reflects both the bind job itself and the last fences of all related TLB invalidations. Maintain submission order based solely on the state of the bind and TLB invalidation queues. Introduce support functions for last fence attachment to TLB invalidation queues. v3: - Fix assert in xe_exec_queue_tlb_inval_last_fence_set (CI) - Ensure migrate lock held for migrate queues (Testing) v5: - Style nits (Thomas) - Rewrite commit message (Thomas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-3-matthew.brost@intel.com
2025-11-04drm/xe: Enforce correct user fence signaling order usingMatthew Brost
Prevent application hangs caused by out-of-order fence signaling when user fences are attached. Use drm_syncobj (via dma-fence-chain) to guarantee that each user fence signals in order, regardless of the signaling order of the attached fences. Ensure user fence writebacks to user space occur in the correct sequence. v7: - Skip drm_syncbj create of error (CI) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com
2025-11-04drm/etnaviv: add HWDB entry for GC8000 Nano Ultra VIP r6205Marek Vasut
This is the GPU/NPU combined device found on the ST STM32MP25 SoC. Feature bits taken from the downstream kernel driver 6.4.21. Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Acked-by: Christian Gmeiner <cgmeiner@igalia.com> Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Link: https://patch.msgid.link/20250919183042.273687-1-marek.vasut@mailbox.org
2025-11-04drm/xe: Do clean shutdown also when using flrJouni Högander
Currently Xe driver is triggering flr without any clean-up on shutdown. This is causing random warnings from pending related works as the underlying hardware is reset in the middle of their execution. Fix this by performing clean shutdown also when using flr. Fixes: 501d799a47e2 ("drm/xe: Wire up device shutdown handler") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://patch.msgid.link/20251031122312.1836534-1-jouni.hogander@intel.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-11-04drm/displayid: add quirk to ignore DisplayID checksum errorsJani Nikula
Add a mechanism for DisplayID specific quirks, and add the first quirk to ignore DisplayID section checksum errors. It would be quite inconvenient to pass existing EDID quirks from drm_edid.c for DisplayID parsing. Not all places doing DisplayID iteration have the quirks readily available, and would have to pass it in all places. Simply add a separate array of DisplayID specific EDID quirks. We do end up checking it every time we iterate DisplayID blocks, but hopefully the number of quirks remains small. There are a few laptop models with DisplayID checksum failures, leading to higher refresh rates only present in the DisplayID blocks being ignored. Add a quirk for the panel in the machines. Reported-by: Tiago Martins Araújo <tiago.martins.araujo@gmail.com> Closes: https://lore.kernel.org/r/CACRbrPGvLP5LANXuFi6z0S7XMbAG4X5y2YOLBDxfOVtfGGqiKQ@mail.gmail.com Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14703 Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Tiago Martins Araújo <tiago.martins.araujo@gmail.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/c04d81ae648c5f21b3f5b7953f924718051f2798.1761681968.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-04drm/displayid: pass iter to drm_find_displayid_extension()Jani Nikula
It's more convenient to pass iter than a handful of its members to drm_find_displayid_extension(), especially as we're about to add another member. Rename the function find_next_displayid_extension() while at it, to be more descriptive. Cc: Tiago Martins Araújo <tiago.martins.araujo@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Tiago Martins Araújo <tiago.martins.araujo@gmail.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/3837ae7f095e77a082ac2422ce2fac96c4f9373d.1761681968.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-04drm/hyperv: include drm_print.h where neededJani Nikula
hyperv_drm_drv.c and hyperv_drm_modeset.c depend on drm_print.h being indirectly included via drm_buddy.h, drm_mm.h, or ttm/ttm_resource.h. Include drm_print.h explicitly. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20251104101158.1cc9abcd@canb.auug.org.au Fixes: f6e8dc9edf96 ("drm: include drm_print.h where needed") Cc: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251104100253.646577-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-11-04drm/i915/display: Extend i915_display_info with Type-C port detailsKhaled Almahallawy
Expose key Type-C port data in i915_display_info to make it easier to understand the port configuration and active mode, especially whether the link is in DP-Alt or TBT-Alt, without having to scan kernel logs. Tested in DP-Alt, TBT-Alt, SST, and MST. Expected output: [CONNECTOR:290:DP-2]: status: connected TC Port: E/TC#2 mode: tbt-alt pin assignment: - max lanes: 4 physical dimensions: 600x340mm ... [CONNECTOR:263:DP-5]: status: connected TC Port: G/TC#4 mode: dp-alt pin assignment: C max lanes: 4 physical dimensions: 610x350mm v2: Use drm_printer (Ville) Lock/Unlock around the printf (Imre) v3: Forward Declaration drm_printer struct (Jani) v4: Handle MST connector with no active encoder (Imre) Add a delimiter between fields and ":" after the port name (Imre) v5: Init dig_port and use it in intel_encorder_is_tc and tc_info (Imre) Move tc->port_name to a newline (Imre) v6: Use intel_tc_port_lock/Unlock (Imre) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20251028190753.3089937-1-khaled.almahallawy@intel.com
2025-11-04drm/vkms: Fix use after frees on error pathsDan Carpenter
These error paths free a pointer and then dereference it on the next line to get the error code. Save the error code first and then free the memory. Fixes: 3e4d5b30d2b2 ("drm/vkms: Allow to configure multiple CRTCs via configfs") Fixes: 2f1734ba271b ("drm/vkms: Allow to configure multiple planes via configfs") Fixes: 67d8cf92e13e ("drm/vkms: Allow to configure multiple encoders via configfs") Fixes: 272acbca96a3 ("drm/vkms: Allow to configure multiple connectors via configfs") Fixes: 13fc9b9745cc ("drm/vkms: Add and remove VKMS instances via configfs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: José Expósito <jose.exposito89@gmail.com> Link: https://lore.kernel.org/r/aPtfy2jCI_kb3Df7@stanley.mountain Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-11-04gpu: nova-core: use `try_from` instead of `as` for u32 conversionsAlexandre Courbot
There are a few situations in the driver where we convert a `usize` into a `u32` using `as`. Even though most of these are obviously correct, use `try_from` and let the compiler optimize wherever it is safe to do so. Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251029-nova-as-v3-3-6a30c7333ad9@nvidia.com>