path: root/drivers/gpu
Age | Commit message | Author
2025-12-15 | Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes | Maarten Lankhorst
Pull in rc1 to include all changes since the merge window closed, and grab all fixes and changes from drm/drm-next. Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-12-15 | drm/gem: fix build for mm_get_unmapped_area() call after backmerge | Jani Nikula
Commit 9ac09bb9feac ("mm: consistently use current->mm in mm_get_unmapped_area()") upstream dropped a parameter from mm_get_unmapped_area() while commit 99bda20d6d4c ("drm/gem: Introduce drm_gem_get_unmapped_area() fop") in drm-misc-next added a new user. Drop the extra parameter from the call. Fixes: 7f790dd21a93 ("Merge drm/drm-next into drm-misc-next") Cc: Maxime Ripard <mripard@kernel.org> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20251215092706.3218018-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/i915/display: group and sort the parent interface wrappers better | Jani Nikula
Aligning with the parent interface struct definitions, also group and sort the parent interface wrappers to improve clarity on where to add new stuff. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/b61af1d33d0448cd904cccccb2714f0d07d85b07.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/xe: sort parent interface initialization | Jani Nikula
Sort the member initializers to improve clarity. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/0af6654afb2174c472f75710cea328eb443f4b73.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/i915: sort parent interface initialization | Jani Nikula
Sort the member initializers to improve clarity. Separate individual function initializers with a blank line in between. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/7f5deefc30703006bc2daa1ce1093a4947f6e049.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | Merge drm/drm-next into drm-misc-next | Maxime Ripard
Let's kickstart the v6.20 (7.0?) release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-12-13 | drm/sched: Add pending job list iterator | Matthew Brost
Stop open coding pending job list in drivers. Add pending job list iterator which safely walks DRM scheduler list asserting DRM scheduler is stopped. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20251209200039.1366764-3-matthew.brost@intel.com
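As a rough illustration of the idea in this commit message (not the DRM scheduler API itself), a pending-list walk that refuses to run unless the scheduler is stopped could look like the sketch below. Every name here (`toy_sched`, `toy_job`, `toy_for_each_pending_job`) is hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a scheduler and its jobs; not the DRM types. */
struct toy_job {
    int id;
    struct toy_job *next;
};

struct toy_sched {
    bool stopped;            /* set while submission is paused */
    struct toy_job *pending; /* head of the pending job list */
};

/* Return the list head, asserting that the scheduler is stopped first. */
static struct toy_job *toy_pending_first(const struct toy_sched *sched)
{
    assert(sched->stopped);
    return sched->pending;
}

/*
 * Walk the pending list. The next pointer is cached before the loop body
 * runs, so the body may unlink or free the current job.
 */
#define toy_for_each_pending_job(job, tmp, sched)          \
    for ((job) = toy_pending_first(sched),                 \
         (tmp) = (job) ? (job)->next : NULL;               \
         (job) != NULL;                                    \
         (job) = (tmp), (tmp) = (job) ? (job)->next : NULL)

/* Example user: count pending jobs without touching list internals. */
static int toy_count_pending(const struct toy_sched *sched)
{
    struct toy_job *job, *tmp;
    int n = 0;

    toy_for_each_pending_job(job, tmp, sched)
        n++;
    return n;
}
```

Keeping the "is stopped" assertion inside the iterator means a driver cannot walk the list while jobs may still be added or retired concurrently.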
2025-12-13 | drm/sched: Add several job helpers to avoid drivers touching scheduler state | Matthew Brost
In the past, drivers used to reach into scheduler internals—this must end because it makes it difficult to change scheduler internals, as driver-side code must also be updated. Add helpers to check if the scheduler is stopped and to query a job’s signaled state to avoid reaching into scheduler internals. These are expected to be used driver-side in recovery and debug flows. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20251209200039.1366764-2-matthew.brost@intel.com
2025-12-13 | Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
Pull more drm fixes from Dave Airlie: "These are the enqueued fixes that ended up in our fixes branch, nouveau mostly, along with some small fixes in other places. plane: - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties() ttm: - fix devcoredump for evicted bos panel: - Fix stack usage warning in novatek-nt35560 nouveau: - alloc fwsec sb at boot to avoid s/r problems - fix strcpy usage - fix i2c encoder crash bridge: - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83 mgag200: - Fix bigendian handling in mgag200 tilcdc: - Fix probe failure in tilcdc" * tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel: drm/mgag200: Fix big-endian support drm/tilcdc: Fix removal actions in case of failed probe drm/ttm: Avoid NULL pointer deref for evicted BOs drm: nouveau: Replace sprintf() with sysfs_emit() drm/nouveau: fix circular dep oops from vendored i2c encoder drm/nouveau: refactor deprecated strcpy drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties() drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors drm/nouveau/gsp: Allocate fwsec-sb at boot drm/panel: novatek-nt35560: avoid on-stack device structure
2025-12-13 | Merge tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
Pull drm fixes from Dave Airlie: "This is the weekly fixes for what is in next tree, mostly amdgpu and some i915, panthor and a core revert. core: - revert dumb bo 8 byte alignment amdgpu: - SI fix - DC reduce stack usage - HDMI fixes - VCN 4.0.5 fix - DP MST fix - DC memory allocation fix amdkfd: - SVM fix - Trap handler fix - VGPR fixes for GC 11.5 i915: - Fix format string truncation warning - Fix runtime PM reference during fbdev BO creation panthor: - fix UAF renesas: - fix sync flag handling" * tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/amd/display: Fix pbn to kbps Conversion" drm/amd: Fix unbind/rebind for VCN 4.0.5 drm/i915: Fix format string truncation warning drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation drm/amd/display: Improve HDMI info retrieval drm/amdkfd: bump minimum vgpr size for gfx1151 drm/amd/display: shrink struct members drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace drm/amd/display: Refactor dml_core_mode_support to reduce stack frame drm/amdgpu: don't attach the tlb fence for SI drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state() drm/amdkfd: Trap handler support for expert scheduling mode drm/amdkfd: Use huge page size to check split svm range alignment drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC drm/gem-shmem: revert the 8-byte alignment constraint drm/gem-dma: revert the 8-byte alignment constraint drm/panthor: Prevent potential UAF in group creation
2025-12-12 | drm/xe/lnl: Drop pre-production workaround support | Matt Roper
LNL has been out long enough that all of our internal usage of pre-production hardware has been phased out and we no longer need to maintain workarounds that were exclusive to pre-production parts. Production LNL hardware always has B0 or later steppings for both graphics and media IP. Eliminate all workarounds that were exclusive to A-step hardware and set the 'has_prod_wa_only' device flag for LNL to make sure we warn and taint if someone tries to load the driver on an old pre-production part. Bspec: 70821 Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patch.msgid.link/20251212181411.294854-4-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12 | drm/xe: Track pre-production workaround support | Matt Roper
When we're initially enabling driver support for a new platform/IP, we usually implement all workarounds documented in the WA database in the driver. Many of those workarounds are restricted to early steppings that only showed up in pre-production hardware (i.e., internal test chips that are not available to the general public). Since the workarounds for early, pre-production steppings tend to be some of the ugliest and most complicated workarounds, we generally want to eliminate them and simplify the code once the platform has launched and our internal usage of those pre-production parts has been phased out. Let's add a flag to the device info that tracks which platforms still have support for pre-production workarounds so that we can print a warning and taint if someone tries to load the driver on a pre-production part for a platform without pre-production workarounds. This will help our internal users understand the likely problems they'll encounter if they try to load the driver on an old pre-production device. The Xe behavior here is similar to what we've done for many years on i915 (see intel_detect_preproduction_hw()), except that instead of manually coding up ranges of device steppings that we believe to be pre-production hardware, Xe will use the hardware's own production vs pre-production fusing status, which we can read from the FUSE2 register. This fuse didn't exist on older Intel hardware, but should be present on all platforms supported by the Xe driver. Going forward, let's set the expectation that we'll start looking into removing pre-production workarounds for a platform around the time that platforms of the next major IP stepping are having their force_probe requirement lifted. This timing is just a rough guideline; there may be cases where some instances of pre-production parts are still being actively used in CI farms, internal device pools, etc. and we'll need to wait a bit longer for those to be swapped out.
v2: - Fix inverted forcewake check v3: - Invert flag and add it to the platforms on which we still have pre-prod workarounds. (Jani, Lucas) v4: - Avoid checking pre-production on VF since they don't have access to the FUSE2 register. Bspec: 78271, 52544 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251212181411.294854-3-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
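The decision described in this commit message can be summarized in a small sketch. This is only an illustration under stated assumptions: the struct, its field names and the enum are hypothetical, and the real driver reads the fuse from the FUSE2 register rather than from a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the check; the driver itself warns and taints. */
enum toy_preprod_action {
    TOY_PREPROD_OK,         /* nothing to report */
    TOY_PREPROD_WARN_TAINT  /* pre-production part without workarounds */
};

/* Hypothetical device description; field names are illustrative only. */
struct toy_device {
    bool is_vf;             /* VFs cannot read the fuse, so skip the check */
    bool fused_preprod;     /* decoded production vs pre-production fuse */
    bool keeps_preprod_wa;  /* platform still carries pre-prod workarounds */
};

static enum toy_preprod_action toy_check_preprod(const struct toy_device *dev)
{
    assert(dev != NULL);

    if (dev->is_vf)
        return TOY_PREPROD_OK;      /* no fuse access from a VF (v4 note) */
    if (!dev->fused_preprod)
        return TOY_PREPROD_OK;      /* production hardware */
    if (dev->keeps_preprod_wa)
        return TOY_PREPROD_OK;      /* workarounds are still in the driver */

    return TOY_PREPROD_WARN_TAINT;
}
```

Only the combination of a pre-production fuse and a platform whose pre-production workarounds have been removed triggers the warning.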
2025-12-12 | drm/xe: Add debugfs support for page reclamation | Brian Nguyen
Allow for runtime modification to page reclamation feature through debugfs configuration. This parameter will only take effect if the platform supports the page reclamation feature by default. v2: - Minor comment tweaks. (Shuicheng) - Convert to kstrtobool_from_user. (Michal) - Only expose page reclaim file if page reclaim flag initially supported and with that, remove xe_match_desc usage. (Michal) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-22-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim | Brian Nguyen
Some L2$ flushing is additionally managed by hardware, such as for the transient display. In those scenarios, page reclamation is unnecessary and results in redundant cacheline flushes, so skip over the corresponding ranges. v2: - Elaborated on reasoning for page reclamation skip based on Tejas's discussion. (Matthew A, Tejas) v3: - Removed MEDIA_IS_ON due to racy condition resulting in removal of relevant registers and values. (Matthew A) - Moved l3 policy access to xe_pat. (Matthew A) v4: - Updated comments based on previous change. (Tejas) - Move back PAT index macros to xe_pat.c. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-21-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Append page reclamation action to tlb inval | Brian Nguyen
Add page reclamation action to tlb inval backend. The page reclamation action is paired with range tlb invalidations so both are issued at the same time. Page reclamation will issue the TLB invalidation with an invalid seqno and a H2G page reclamation action with the fence's corresponding seqno and handle the fence accordingly on page reclaim action done handler. If page reclamation fails, tlb timeout handler will be responsible for signalling fence and cleaning up. v2: - add send_page_reclaim to patch. - Remove flush_cache and use prl_sa pointer to determine PPC flush instead of explicit bool. Add NULL as fallback for others. (Matthew B) v3: - Add comments for flush_cache with media. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-20-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Prep page reclaim in tlb inval job | Brian Nguyen
Use page reclaim list as indicator if page reclaim action is desired and pass it to tlb inval fence to handle. Job will need to maintain its own embedded copy to ensure the PRL's lifetime extends until the job has run. v2: - Use xe variant of WARN_ON (Michal) v3: - Add comments for PRL tile handling and flush behavior with media. (Matthew Brost) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-19-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Suballocate BO for page reclaim | Brian Nguyen
Page reclamation feature needs the PRL to be suballocated into a GGTT-mapped BO. On allocation failure, fallback to default tlb invalidation with full PPC flush. PRL's BO allocation is managed in separate pool to ensure 4K alignment for proper GGTT address. With BO, pass into TLB invalidation backend and modify fence to accommodate accordingly. v2: - Removed page reclaim related variables from TLB fence. (Matthew B) - Allocate PRL bo size to num_entries. (Matthew B) - Move PRL bo allocation to tlb_inval run_job. (Matthew B) v5: - Use xe_page_reclaim_list_valid. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-18-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Create page reclaim list on unbind | Brian Nguyen
Page reclaim list (PRL) is preparation work for the page reclaim feature. The PRL is first owned by pt_update_ops and all other page reclaim operations will point back to this PRL. PRL generates its entries during the unbind page walker, updating the PRL. This PRL is restricted to a 4K page, so 512 page entries at most. v2: - Removed unused function. (Shuicheng) - Compacted warning checking, update commit message, spelling, etc. (Shuicheng, Matthew B) - Fix kernel docs - Moved PRL max entries overflow handling out from generate_reclaim_entry to caller (Shuicheng) - Add xe_page_reclaim_list_init for clarity. (Matthew B) - Modify xe_guc_page_reclaim_entry to use macros for greater flexibility. (Matthew B) - Add fallback for PTE outside of page reclaim supported 4K, 64K, 2M pages (Matthew B) - Invalidate PRL for early abort page walk. - Removed page reclaim related variables from tlb fence (Matthew Brost) - Remove error handling in *alloc_entries failure. (Matthew B) v3: - Fix NULL pointer dereference check. - Modify reclaim_entry to QW and bitfields accordingly. (Matthew B) - Add vm_dbg prints for PRL generation and invalidation. (Matthew B) v4: - s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI) v5: - Addition of xe_page_reclaim_list_is_new() to avoid continuous allocation of PRL if consecutive VMAs cause a PRL invalidation. - Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B) - Move xe_page_reclaim_list_entries_put in xe_page_reclaim_list_invalidate. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-17-brian3.nguyen@intel.com
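The 512-entry limit follows from the sizes mentioned in the message: a 4K page divided by one quadword (8 bytes) per entry. The sketch below works that out and shows one plausible way a caller could handle overflow by invalidating the list. The `toy_prl` type and its helpers are hypothetical and are not the driver's `xe_page_reclaim_list` functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRL_PAGE_SIZE   4096u             /* one 4K page backs the list */
#define TOY_PRL_ENTRY_SIZE  sizeof(uint64_t)  /* one quadword per entry */
#define TOY_PRL_MAX_ENTRIES (TOY_PRL_PAGE_SIZE / TOY_PRL_ENTRY_SIZE)

/* 4096 bytes / 8 bytes per entry = 512 entries. */
_Static_assert(TOY_PRL_MAX_ENTRIES == 512, "PRL must hold 512 entries");

/* Hypothetical list type; not the driver's structure. */
struct toy_prl {
    uint64_t entries[TOY_PRL_MAX_ENTRIES];
    size_t num_entries;
    bool valid;     /* false means: fall back to a full flush */
};

static void toy_prl_init(struct toy_prl *prl)
{
    assert(prl != NULL);
    prl->num_entries = 0;
    prl->valid = true;
}

/*
 * Append one entry. On overflow the list is invalidated so that the
 * caller can fall back to the non-selective path.
 */
static bool toy_prl_add(struct toy_prl *prl, uint64_t entry)
{
    assert(prl != NULL);

    if (!prl->valid)
        return false;
    if (prl->num_entries == TOY_PRL_MAX_ENTRIES) {
        prl->valid = false;
        return false;
    }
    prl->entries[prl->num_entries++] = entry;
    return true;
}
```

Treating overflow as "invalidate and fall back" keeps the fast path simple: a list is either complete and usable, or ignored entirely.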
2025-12-12 | drm/xe/guc: Add page reclamation interface to GuC | Brian Nguyen
Add page reclamation related changes to GuC interface, handlers, and senders to support page reclamation. Currently TLB invalidations will perform an entire PPC flush in order to prevent stale memory access for noncoherent system memory. Page reclamation is an extension of the typical TLB invalidation workflow, allowing the full PPC flush to be disabled and selective PPC flushing to be enabled. Selective flushing will be decided by a list of pages whose address is passed to GuC at time of action. Page reclamation interfaces require at least GuC FW ver 70.31.0. v2: - Moved send_page_reclaim to first patch usage. - Add comments explaining shared done handler. (Matthew B) - Add FW version fallback to disable page reclaim on older versions. (Matthew B, Shuicheng) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-16-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Add page reclamation info to device info | Oak Zeng
Starting from Xe3p, HW adds a feature assisting range based page reclamation. Introduce a bit in device info to indicate whether device has such capability. Signed-off-by: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-15-brian3.nguyen@intel.com
2025-12-12 | drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush | Brian Nguyen
Allow tlb_invalidation to control when driver wants to flush the Private Physical Cache (PPC) as part of the tlb invalidation process. Default behavior is still to always flush the PPC but driver now has the option to disable it. v2: - Revise commit/kernel doc descriptions. (Shuicheng) - Remove unused function. (Shuicheng) - Remove bool flush_cache parameter from fence, and various function inputs. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-14-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers | Matthew Brost
Certain TLB invalidation operations send multiple H2G messages per seqno with only the final H2G containing the valid seqno - the others carry an invalid seqno. The G2H handler drops these invalid seqnos to avoid prematurely signaling a TLB invalidation fence. With TLB_INVALIDATION_SEQNO_INVALID used to indicate in progress multi-step TLB invalidations, reset tdr to ensure that timeout won't prematurely trigger when G2H actions are still ongoing. v2: Remove lock from xe_tlb_inval_reset_timeout. (Matthew B) v3: Squash with dependent patch from Matthew Brost's series. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-13-brian3.nguyen@intel.com
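A minimal sketch of the handler behavior described here, assuming a sentinel value of 0 for the invalid seqno (the real value of `TLB_INVALIDATION_SEQNO_INVALID` is not given in this message). The `toy_` names are hypothetical and only model the two outcomes: drop and push out the timeout, or signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel for this example only. */
#define TOY_SEQNO_INVALID 0u

/* Hypothetical bookkeeping for one TLB invalidation client. */
struct toy_tlb_inval {
    uint32_t last_signaled;       /* seqno of the last fence signaled */
    unsigned int timeout_resets;  /* times the timeout was pushed out */
};

/*
 * Handle one "done" notification. Intermediate steps of a multi-step
 * invalidation carry the invalid seqno: they must not signal a fence,
 * but they do show progress, so the timeout is reset.
 * Returns true if a fence was signaled.
 */
static bool toy_handle_done(struct toy_tlb_inval *ti, uint32_t seqno)
{
    assert(ti != NULL);

    if (seqno == TOY_SEQNO_INVALID) {
        ti->timeout_resets++;
        return false;
    }

    ti->last_signaled = seqno;
    return true;
}
```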
2025-12-13 | Merge tag 'drm-misc-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes | Dave Airlie
drm-misc-fixes for v6.19-rc1: - Fix stack usage warning in novatek-nt35560. - Fix s/r, i2c issues in nouveau and update string handling. - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83. - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties(). - Fix devcoredump crash on reading evicted bo's. - Fix bigendian handling in mgag200. - Fix probe failure in tilcdc. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com
2025-12-12 | drm/xe: Restore engine registers before restarting schedulers after GT reset | Jan Maslak
During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored. This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet. Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak <jan.maslak@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
2025-12-12 | drm/xe: Increase TDF timeout | Jagmeet Randhawa
There are some corner cases where flushing transient data may take slightly longer than the 150us timeout we currently allow. Update the driver to use a 300us timeout instead based on the latest guidance from the hardware team. An update to the bspec to formally document this is expected to arrive soon. Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush") Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12 | drm/xe/cri: Enable I2C controller | Raag Jadav
Enable I2C controller for Crescent Island and while at it, rely on has_i2c flag instead of manual platform checks. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251128084414.306265-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12 | drm: Fix object leak in DRM_IOCTL_GEM_CHANGE_HANDLE | Karol Wachowski
Add missing drm_gem_object_put() call when drm_gem_object_lookup() successfully returns an object. This fixes a GEM object reference leak that can prevent driver modules from unloading when using prime buffers. Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle") Cc: <stable@vger.kernel.org> # v6.18+ Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
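The class of bug fixed here is a lookup that takes a reference which no exit path drops. The sketch below models it with a toy handle table; `toy_lookup()` and `toy_put()` only stand in for the roles of `drm_gem_object_lookup()` and `drm_gem_object_put()`, and none of this is the actual ioctl code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object; not a GEM object. */
struct toy_obj {
    int refcount;
};

/* Look up a handle and take a reference, or return NULL. */
static struct toy_obj *toy_lookup(struct toy_obj *table[], size_t n,
                                  size_t handle)
{
    if (handle >= n || table[handle] == NULL)
        return NULL;
    table[handle]->refcount++;
    return table[handle];
}

/* Drop a reference taken by toy_lookup(). */
static void toy_put(struct toy_obj *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    obj->refcount--;
}

/*
 * Move an object from old_h to new_h. The table keeps its own reference,
 * so the temporary lookup reference is dropped on every path after a
 * successful lookup, including the success path.
 */
static int toy_change_handle(struct toy_obj *table[], size_t n,
                             size_t old_h, size_t new_h)
{
    struct toy_obj *obj = toy_lookup(table, n, old_h);
    int ret = 0;

    if (obj == NULL)
        return -1;      /* nothing was looked up, nothing to put */

    if (new_h >= n || table[new_h] != NULL) {
        ret = -2;       /* target handle unusable */
        goto out;
    }

    table[new_h] = obj;
    table[old_h] = NULL;
out:
    toy_put(obj);
    return ret;
}
```

Routing both the error and success paths through a single `out:` label makes it harder to forget the put when new exit paths are added later.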
2025-12-12 | drm/{i915, xe}/panic: move panic handling to parent interface | Jani Nikula
Move the panic handling to the display parent interface, making display more independent of i915 and xe driver implementations. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/e27eca5424479e8936b786018d0af19a34f839f6.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12 | drm/i915/panic: move i915 specific panic implementation to i915 | Jani Nikula
The intel_panic.c implementation is i915 specific, and xe has its own. Move it to i915 core as i915_panic.c. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/8dc7af0ae1f859d17b0be269a545146c5536d8fc.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12 | drm/tests: Handle EDEADLK in set_up_atomic_state() | José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_validate_modeset test [1]: # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:162 Expected ret == 0, but ret == -35 (0xffffffffffffffdd) Change the set_up_atomic_state() helper function to return on error and restart the atomic sequence when the returned error is EDEADLK. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com
2025-12-12 | drm/tests: Handle EDEADLK in drm_test_check_valid_clones() | José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_test_check_valid_clones() KUnit test. The error log can be either [1]: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == 0 (0x0) Or [2] depending on the test case: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == -22 (0xffffffffffffffea) Restart the atomic sequence when EDEADLK is returned. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log [2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com
2025-12-12 | drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning EDEADLK | José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the KUnit tests present in drm_hdmi_state_helper_test.c [1]. While the specific test causing the failure changes between runs, all of them are caused by drm_kunit_helper_enable_crtc_connector() returning -EDEADLK. The error trace always follows this structure: # <test name>: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line> Expected ret == 0, but ret == -35 (0xffffffffffffffdd) As documented, if the drm_kunit_helper_enable_crtc_connector() function returns -EDEADLK (-35), the entire atomic sequence must be restarted. Handle this error code for all function calls. Closes: https://datawarehouse.cki-project.org/issue/4039 [1] Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper") Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
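The three test fixes above share one pattern: when a step reports -EDEADLK (-35 in the logs), back off and restart the whole sequence instead of failing. The sketch below shows that control flow with a toy context; `toy_commit()` and `toy_backoff()` are hypothetical stand-ins, and in kernel code the backoff step is typically `drm_modeset_backoff()` on the acquire context.

```c
#include <assert.h>
#include <stddef.h>

/* Matches the -35 seen in the CI logs; defined locally for portability. */
#define TOY_EDEADLK 35

/* Hypothetical context: how often contention will still be hit. */
struct toy_ctx {
    int contended;  /* remaining attempts that will report a deadlock */
    int backoffs;   /* number of times the sequence was restarted */
};

/* Stand-in for an atomic check or commit that may hit lock contention. */
static int toy_commit(struct toy_ctx *ctx)
{
    if (ctx->contended > 0) {
        ctx->contended--;
        return -TOY_EDEADLK;
    }
    return 0;
}

/* Stand-in for dropping the held locks before retrying. */
static void toy_backoff(struct toy_ctx *ctx)
{
    ctx->backoffs++;
}

/* Run the sequence, restarting it from the top on -EDEADLK. */
static int toy_commit_with_retry(struct toy_ctx *ctx)
{
    int ret;

    assert(ctx != NULL);
retry:
    ret = toy_commit(ctx);
    if (ret == -TOY_EDEADLK) {
        toy_backoff(ctx);
        goto retry;
    }
    return ret;
}
```

Any other error is returned unchanged, so a test can still assert on the final result after the retry loop.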
2025-12-12 | Merge tag 'drm-intel-next-fixes-2025-12-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next | Dave Airlie
drm/i915 fixes for v6.19-rc1: - Fix format string truncation warning - Fix runtime PM reference during fbdev BO creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com
2025-12-11 | drm/xe/doc: Add documentation for Multi Queue Group GuC interface | Niranjana Vishwanathapura
Add kernel documentation for Multi Queue group GuC interface. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-36-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/doc: Add documentation for Multi Queue Group | Niranjana Vishwanathapura
Add kernel documentation for Multi Queue group and update the corresponding rst. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-35-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Support active group after primary is destroyed | Niranjana Vishwanathapura
Add support to keep the group active after the primary queue is destroyed. Instead of killing the primary queue during exec_queue destroy ioctl, kill it when all the secondary queues of the group are killed. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-34-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Tracepoint support | Niranjana Vishwanathapura
Add xe_exec_queue_create_multi_queue event with multi-queue information. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-33-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Teardown group upon job timeout | Niranjana Vishwanathapura
Upon a job timeout, tear down the multi-queue group by triggering TDR on all queues of the multi-queue group and by skipping timeout checks in them. v5: Ban the group while triggering TDR for the guc reported errors Add FIXME in TDR to take multi-queue group off HW (Matt Brost) v6: Trigger cleanup of group only for multi-queue case Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-32-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Reset GT upon CGP_SYNC failure | Niranjana Vishwanathapura
If GuC doesn't respond to CGP_SYNC message, trigger GT reset and cleanup of all the queues of the multi queue group. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-31-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Handle CGP context error | Niranjana Vishwanathapura
Trigger multi-queue context cleanup upon CGP context error notification from GuC. v4: Fix error message Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-30-niranjana.vishwanathapura@intel.com
2025-12-11 | drm/xe/multi_queue: Set QUEUE_DRAIN_MODE for Multi Queue batches | Niranjana Vishwanathapura
To properly support soft light restore between batches being arbitrated at the CFEG, PIPE_CONTROL instructions have a new bit in the first DW, QUEUE_DRAIN_MODE. When set, this indicates to the CFEG that it should only drain the current queue. Additionally we no longer want to set the CS_STALL bit for these multi queue queues as this causes the entire pipeline to stall waiting for completion of the prior batch, preventing this soft light restore from occurring between queues in a queue group. v4: Assert !multi_queue where applicable (Matt Roper) Bspec: 56551 Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-29-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Handle tearing down of a multi queueNiranjana Vishwanathapura
All queues of a multi queue group use the primary queue of the group to interface with GuC, so there is a dependency between the queues of the group. When the primary queue of a multi queue group is cleaned up, also trigger a cleanup of the secondary queues. During cleanup, stop and re-start submission for all queues of a multi queue group to avoid any submission happening in parallel when a queue is being cleaned up. v2: Initialize group->list_lock, add fs_reclaim dependency, remove unwanted secondary queues cleanup (Matt Brost) v3: Properly handle cleanup of multi-queue group (Matt Brost) v4: Fix IS_ENABLED(CONFIG_LOCKDEP) check (Matt Brost) Revert stopping/restarting of submissions on queues of the group in TDR as it is not needed. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-28-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi queue information to guc_info dumpNiranjana Vishwanathapura
Dump multi queue specific information in the guc exec queue dump. v2: Move multi queue related fields inside the multi_queue sub-structure (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-27-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add support for multi queue dynamic priority changeNiranjana Vishwanathapura
Support dynamic priority change for multi queue group queues via exec queue set_property ioctl. Issue CGP_SYNC command to GuC through the drm scheduler message interface for priority to take effect. v2: Move is_multi_queue check to exec_queue layer and assert is_multi_queue being set in guc submission layer (Matt Brost) v3: Assert CGP_SYNC message length is valid (Matt Brost) Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-26-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add exec_queue set_property ioctl supportNiranjana Vishwanathapura
This patch adds support for exec_queue set_property ioctl. It is derived from the original work which is part of https://patchwork.freedesktop.org/series/112188/ Currently only DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY property can be dynamically set. v2: Check for and update kernel-doc which property this ioctl supports (Matt Brost) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-25-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Handle invalid exec queue property settingNiranjana Vishwanathapura
Only MULTI_QUEUE_PRIORITY property is valid for secondary queues of a multi queue group. MULTI_QUEUE_PRIORITY only applies to multi queue group queues. Detect invalid user queue property setting and return error. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-24-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi queue priority propertyNiranjana Vishwanathapura
Add support for queues of a multi queue group to set their priority within the queue group by adding property DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY. This is the only other property supported by secondary queues of a multi queue group, other than DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE. v2: Add kernel doc for enum xe_multi_queue_priority, Add assert for priority values, fix includes and declarations (Matt Brost) v3: update uapi kernel-doc (Matt Brost) v4: uapi change due to rebase Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-23-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add GuC interface for multi queue supportNiranjana Vishwanathapura
Implement GuC commands and responses along with the Context Group Page (CGP) interface for multi queue support. Ensure that only the primary queue (q0) of a multi queue group communicates with GuC. The secondary queues of the group only need to maintain LRCA and interface with drm scheduler. Use the primary queue's submit_wq for all secondary queues of a multi queue group. This serialization avoids any locking around CGP synchronization with GuC. v2: Fix G2H_LEN_DW_MULTI_QUEUE_CONTEXT value, add more comments (Matt Brost) v3: Minor code refactor, use xe_gt_assert v4: Use xe_guc_ct_wake_waiters(), remove vf recovery support (Matt Brost) Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-22-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add user interface for multi queue supportNiranjana Vishwanathapura
Multi Queue is a new mode of execution supported by the compute and blitter copy command streamers (CCS and BCS, respectively). It is an enhancement of the existing hardware architecture and leverages the same submission model. It enables support for efficient, parallel execution of multiple queues within a single context. All the queues of a group must use the same address space (VM). The new DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE execution queue property supports creating a multi queue group and adding queues to a queue group. All queues of a multi queue group share the same context. An exec queue create ioctl call with the above property specified with value DRM_XE_SUPER_GROUP_CREATE will create a new multi queue group with the queue being created as the primary queue (aka q0) of the group. To add secondary queues to the group, they need to be created with the above property with the id of the primary queue as the value. The properties of the primary queue (like priority, timeslice) apply to the whole group, so these properties can't be set for secondary queues of a group. Once destroyed, the secondary queues of a multi queue group can't be replaced. However, they can be dynamically added to the group up to a total of 64 queues per group. Once the primary queue is destroyed, secondary queues can't be added to the queue group. v2: Remove group->lock, fix xe_exec_queue_group_add()/delete() function semantics, add additional comments, remove unused group->list_lock, add XE_BO_FLAG_GGTT_INVALIDATE for cgp bo, Assert LRC is valid, update uapi kernel doc. (Matt Brost) v3: Use XE_BO_FLAG_PINNED_LATE_RESTORE/USER_VRAM/GGTT_INVALIDATE flags for cgp bo (Matt) v4: Ensure queue is not a vm_bind queue uapi change due to rebase Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-21-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi_queue_enable_mask to gt informationNiranjana Vishwanathapura
Add multi_queue_enable_mask field to the gt information structure, which is a bitmask of all engine classes with multi queue support enabled. v2: Rename multi_queue_enable_mask to multi_queue_engine_class_mask (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-20-niranjana.vishwanathapura@intel.com