path: root/drivers/gpu/drm
Age | Commit message | Author
2025-12-15 | drm/xe/vf: Fix queuing of recovery work | Satyanarayana K V P
Ensure VF migration recovery work is only queued when no recovery is already queued and teardown is not in progress. Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload") Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com (cherry picked from commit 8d8cf42b03f149dcb545b547906306f3b474565e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
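The guard described in this commit can be sketched in plain userspace C as follows. This is only an illustration: `struct vf_recovery` and `vf_queue_recovery()` are hypothetical names, not the Xe driver's, and the locking or atomics a real driver would need are left out.

```c
#include <stdbool.h>

/* Hypothetical state; not the Xe driver's actual structure. */
struct vf_recovery {
	bool queued;      /* recovery work is already pending */
	bool teardown;    /* driver unload is in progress */
	int queue_count;  /* how many times work was actually queued */
};

/*
 * Queue recovery work only if none is pending and teardown has not
 * started. Returns true when work was queued by this call.
 */
static bool vf_queue_recovery(struct vf_recovery *r)
{
	if (r->teardown || r->queued)
		return false;

	r->queued = true;
	r->queue_count++;
	return true;
}
```

A second call while work is pending, or any call after teardown begins, leaves `queue_count` unchanged.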
2025-12-15 | drm/xe/bo: Don't include the CCS metadata in the dma-buf sg-table | Thomas Hellström
Some Xe bos are allocated with extra backing-store for the CCS metadata. It's never been the intention to share the CCS metadata when exporting such bos as dma-buf. Don't include it in the dma-buf sg-table. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20251209204920.224374-1-thomas.hellstrom@linux.intel.com (cherry picked from commit a4ebfb9d95d78a12512b435a698ee6886d712571) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 | drm/me/gsc: mei interrupt top half should be in irq disabled context | Junxiao Chang
MEI GSC interrupt comes from i915 or xe driver. It has top half and bottom half. Top half is called from i915/xe interrupt handler. It should be in irq disabled context. With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half. This change fixes B580 GPU boot issue with RT enabled. Fixes: e02cea83d32d ("drm/xe/gsc: add Battlemage support") Tested-by: Baoli Zhang <baoli.zhang@intel.com> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> (cherry picked from commit 3efadf028783a49ab2941294187c8b6dd86bf7da) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 | drm/xe/vf: Stop waiting for ring space on VF post migration recovery | Tomasz Lis
If wait for ring space started just before migration, it can delay the recovery process, by waiting without bailout path for up to 2 seconds. Two second wait for recovery is not acceptable, and if the ring was completely filled even without the migration temporarily stopping execution, then such a wait will result in up to a thousand new jobs (assuming constant flow) being added while the wait is happening. While this will not cause data corruption, it will lead to warning messages getting logged due to reset being scheduled on a GT under recovery. Also several seconds of unresponsiveness, as the backlog of jobs gets progressively executed. Add a bailout condition, to make sure the recovery starts without much delay. The recovery is expected to finish in about 100 ms when under moderate stress, so the condition verification period needs to be below that - settling at 64 ms. The theoretical max time which the recovery can take depends on how many requests can be emitted to engine rings and be pending execution. While stress testing, it was possible to reach 10k pending requests on rings when a platform with two GTs was used. This resulted in max recovery time of 5 seconds. But in real life situations, it is very unlikely that the amount of pending requests will ever exceed 100, and for that the recovery time will be around 50 ms - well within our claimed limit of 100ms. Fixes: a4dae94aad6a ("drm/xe/vf: Wakeup in GuC backend on VF post migration recovery") Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251204200820.2206168-1-tomasz.lis@intel.com (cherry picked from commit a00e305fba02a915cf2745bf6ef3f55537e65d57) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
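A rough sketch of a wait loop with a bailout condition, using simulated time so it runs in userspace. The 2000 ms total and 64 ms re-check period come from the commit message; the types, the function name, and the idea of modelling events as timestamps are assumptions made for illustration only.

```c
#include <stdbool.h>

#define WAIT_TOTAL_MS  2000  /* overall wait quoted in the commit message */
#define WAIT_PERIOD_MS   64  /* re-check period quoted in the commit message */

enum wait_result { WAIT_OK, WAIT_BAILOUT, WAIT_TIMEOUT };

/* Hypothetical simulation input: when each event happens, or -1 for never. */
struct ring_state {
	int space_at_ms;     /* time at which ring space becomes available */
	int recovery_at_ms;  /* time at which post-migration recovery starts */
};

/*
 * Poll every WAIT_PERIOD_MS. Recovery is checked first, so the wait
 * ends within one period of recovery starting instead of running for
 * the full WAIT_TOTAL_MS.
 */
static enum wait_result wait_for_ring_space(const struct ring_state *s)
{
	for (int now = 0; now <= WAIT_TOTAL_MS; now += WAIT_PERIOD_MS) {
		if (s->recovery_at_ms >= 0 && now >= s->recovery_at_ms)
			return WAIT_BAILOUT;
		if (s->space_at_ms >= 0 && now >= s->space_at_ms)
			return WAIT_OK;
	}
	return WAIT_TIMEOUT;
}
```

With a 64 ms period, the delay added before recovery can begin stays below the roughly 100 ms recovery budget the message mentions.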
2025-12-15 | drm/xe/throttle: Skip reason prefix while emitting array | Raag Jadav
The newly introduced "reasons" attribute already signifies possible reasons for throttling and makes the prefix in individual attribute names redundant while emitting them as an array. Skip the prefix. Fixes: 83ccde67a3f7 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Sk Anirban <sk.anirban@intel.com> Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit b64a14334ef3ebbcf70d11bc67d0934bdc0e390d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 | drm/xe: fix drm_gpusvm_init() arguments | Arnd Bergmann
The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits:

  In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8:
  drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init':
  include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion]
  drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL'
    217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
        |                                      ^~~~
  In file included from drivers/gpu/drm/xe/xe_bo_types.h:11,
                   from drivers/gpu/drm/xe/xe_bo.h:11,
                   from drivers/gpu/drm/xe/xe_vm_madvise.c:11:
  include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *'
    254 |                     unsigned long mm_start, unsigned long mm_range,
        |                     ~~~~~~~~~~~~~~^~~~~~~~
  In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14:
  drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11
    216 |         return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm,
        |                ^~~~~~~~~~~~~~~
    217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
        |                                ~
  include/drm/drm_gpusvm.h:251:5: note: declared here

Adapt the caller to the new argument list by removing the extraneous NULL argument.

Fixes: 9e9787414882 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Fixes: 10aa5c806030 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org
(cherry picked from commit 29bce9c8b41d5c378263a927acb9a9074d0e7a0e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 | drm/xe: Do not reference loop variable directly | Matthew Brost
Do not reference the loop variable job after the loop has exited. Instead, save the job from the last iteration of the loop. Fixes: 3d98a7164da6 ("drm/xe/vf: Start re-emission from first unsignaled job during VF migration") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202511291102.jnnKP6IB-lkp@intel.com/ Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patch.msgid.link/20251203011809.968893-1-matthew.brost@intel.com (cherry picked from commit 76ce2313709f13a6adbcaa1a43a8539c8f509f6a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
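The pattern can be sketched in userspace C with a hand-rolled singly linked list. `struct job` and `last_job()` are hypothetical stand-ins, not the driver's types. With the kernel's list iteration macros, the cursor does not point at a valid entry once the loop has run to completion, which is why the result is carried in a separate variable.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scheduler job; not the Xe driver's type. */
struct job {
	int id;
	struct job *next;
};

/*
 * Return the job seen in the last loop iteration, or NULL for an empty
 * list. The cursor `pos` is scoped to the loop and never read after it;
 * `last` is the only value that escapes.
 */
static struct job *last_job(struct job *head)
{
	struct job *last = NULL;

	for (struct job *pos = head; pos; pos = pos->next)
		last = pos;

	return last;
}
```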
2025-12-15 | drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg() | Vinay Belgaumkar
Wa_14020316580 was getting clobbered by power gating init code later in the driver load sequence. Move the Wa so that it applies correctly. Fixes: 7cd05ef89c9d ("drm/xe/xe2hpm: Add initial set of workarounds") Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 8b5502145351bde87f522df082b9e41356898ba3) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 | drm/xe: Fix freq kobject leak on sysfs_create_files failure | Shuicheng Lin
Ensure gt->freq is released when sysfs_create_files() fails in xe_gt_freq_init(). Without this, the kobject would leak. Add kobject_put() before returning the error. Fixes: fdc81c43f0c1 ("drm/xe: use devm_add_action_or_reset() helper") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Alex Zuo <alex.zuo@intel.com> Reviewed-by: Xin Wang <x.wang@intel.com> Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 251be5fb4982ebb0f5a81b62d975bd770f3ad5c2) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
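The error-path fix can be illustrated with a userspace mock. `struct fake_kobj`, `fake_kobj_put()` and `freq_init()` are hypothetical names that only mimic reference counting; they are not the kernel's kobject API, and the `create_files_rc` parameter merely simulates the result of the file-creation step.

```c
/* Hypothetical stand-in for a reference-counted kobject. */
struct fake_kobj {
	int refcount;
};

static void fake_kobj_put(struct fake_kobj *k)
{
	k->refcount--;
}

/*
 * Mimics an init routine that creates an object and then its files.
 * create_files_rc simulates the file-creation result (0 on success,
 * negative on failure).
 */
static int freq_init(struct fake_kobj *k, int create_files_rc)
{
	k->refcount = 1;  /* reference held once the object is created */

	if (create_files_rc) {
		fake_kobj_put(k);  /* drop the reference before bailing out */
		return create_files_rc;
	}

	return 0;
}
```

Without the put on the failure path, the reference count would stay at 1 with nothing left to release it.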
2025-12-15 | drm/sitronix/st7571-spi: add support for SPI interface | Marcus Folkesson
Add support for ST7561/ST7571 connected to SPI bus. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-6-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | drm/sitronix/st7571: split up the driver into a common and an i2c part | Marcus Folkesson
Split up the driver to make it possible to add support for hw interfaces other than I2C. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-5-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | drm/sitronix/st7571-i2c: make probe independent of hw interface | Marcus Folkesson
Create an interface independent layer for the probe function. This is to make it possible to add support for other interfaces. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-4-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | drm/sitronix/st7571-i2c: move common structures to st7571.h | Marcus Folkesson
Move all structures that will be common for all interfaces (SPI/I2C) to a separate header file. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-3-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | drm/sitronix/st7571-i2c: add 'struct device' to st7571_device | Marcus Folkesson
Keep a copy of the device structure instead of referring to i2c_client. This is a preparation step to separate the generic part from all i2c stuff. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-2-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | drm/sitronix/st7571-i2c: rename 'struct drm_device' in st7571_device | Marcus Folkesson
Rename st7571_device.dev to st7571_device.drm in preparation to introduce a 'struct device' member to this structure. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://patch.msgid.link/20251215-st7571-split-v3-1-d5f3205c3138@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-12-15 | Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes | Maarten Lankhorst
Pull in rc1 to include all changes since the merge window closed, and grab all fixes and changes from drm/drm-next. Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-12-15 | drm/gem: fix build for mm_get_unmapped_area() call after backmerge | Jani Nikula
Commit 9ac09bb9feac ("mm: consistently use current->mm in mm_get_unmapped_area()") upstream dropped a parameter from mm_get_unmapped_area() while commit 99bda20d6d4c ("drm/gem: Introduce drm_gem_get_unmapped_area() fop") in drm-misc-next added a new user. Drop the extra parameter from the call. Fixes: 7f790dd21a93 ("Merge drm/drm-next into drm-misc-next") Cc: Maxime Ripard <mripard@kernel.org> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20251215092706.3218018-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/i915/display: group and sort the parent interface wrappers better | Jani Nikula
Aligning with the parent interface struct definitions, also group and sort the parent interface wrappers to improve clarity on where to add new stuff. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/b61af1d33d0448cd904cccccb2714f0d07d85b07.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/xe: sort parent interface initialization | Jani Nikula
Sort the member initializers to improve clarity. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/0af6654afb2174c472f75710cea328eb443f4b73.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | drm/i915: sort parent interface initialization | Jani Nikula
Sort the member initializers to improve clarity. Separate individual function initializers with a blank line in between. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/7f5deefc30703006bc2daa1ce1093a4947f6e049.1765548786.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-15 | Merge drm/drm-next into drm-misc-next | Maxime Ripard
Let's kickstart the v6.20 (7.0?) release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-12-13 | drm/sched: Add pending job list iterator | Matthew Brost
Stop open coding pending job list in drivers. Add pending job list iterator which safely walks DRM scheduler list asserting DRM scheduler is stopped. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20251209200039.1366764-3-matthew.brost@intel.com
2025-12-13 | drm/sched: Add several job helpers to avoid drivers touching scheduler state | Matthew Brost
In the past, drivers used to reach into scheduler internals—this must end because it makes it difficult to change scheduler internals, as driver-side code must also be updated. Add helpers to check if the scheduler is stopped and to query a job’s signaled state to avoid reaching into scheduler internals. These are expected to be used driver-side in recovery and debug flows. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20251209200039.1366764-2-matthew.brost@intel.com
2025-12-13 | Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
Pull more drm fixes from Dave Airlie:
"These are the enqueued fixes that ended up in our fixes branch, nouveau mostly, along with some small fixes in other places.
plane:
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties()
ttm:
- fix devcoredump for evicted bos
panel:
- Fix stack usage warning in novatek-nt35560
nouveau:
- alloc fwsec sb at boot to avoid s/r problems
- fix strcpy usage
- fix i2c encoder crash
bridge:
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83
mgag200:
- Fix bigendian handling in mgag200
tilcdc:
- Fix probe failure in tilcdc"
* tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  drm/mgag200: Fix big-endian support
  drm/tilcdc: Fix removal actions in case of failed probe
  drm/ttm: Avoid NULL pointer deref for evicted BOs
  drm: nouveau: Replace sprintf() with sysfs_emit()
  drm/nouveau: fix circular dep oops from vendored i2c encoder
  drm/nouveau: refactor deprecated strcpy
  drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties()
  drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors
  drm/nouveau/gsp: Allocate fwsec-sb at boot
  drm/panel: novatek-nt35560: avoid on-stack device structure
2025-12-13 | Merge tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds
Pull drm fixes from Dave Airlie:
"This is the weekly fixes for what is in next tree, mostly amdgpu and some i915, panthor and a core revert.
core:
- revert dumb bo 8 byte alignment
amdgpu:
- SI fix
- DC reduce stack usage
- HDMI fixes
- VCN 4.0.5 fix
- DP MST fix
- DC memory allocation fix
amdkfd:
- SVM fix
- Trap handler fix
- VGPR fixes for GC 11.5
i915:
- Fix format string truncation warning
- Fix runtime PM reference during fbdev BO creation
panthor:
- fix UAF
renesas:
- fix sync flag handling"
* tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  Revert "drm/amd/display: Fix pbn to kbps Conversion"
  drm/amd: Fix unbind/rebind for VCN 4.0.5
  drm/i915: Fix format string truncation warning
  drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
  drm/amd/display: Improve HDMI info retrieval
  drm/amdkfd: bump minimum vgpr size for gfx1151
  drm/amd/display: shrink struct members
  drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
  drm/amd/display: Refactor dml_core_mode_support to reduce stack frame
  drm/amdgpu: don't attach the tlb fence for SI
  drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
  drm/amdkfd: Trap handler support for expert scheduling mode
  drm/amdkfd: Use huge page size to check split svm range alignment
  drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC
  drm/gem-shmem: revert the 8-byte alignment constraint
  drm/gem-dma: revert the 8-byte alignment constraint
  drm/panthor: Prevent potential UAF in group creation
2025-12-12 | drm/xe/lnl: Drop pre-production workaround support | Matt Roper
LNL has been out long enough that all of our internal usage of pre-production hardware has been phased out and we no longer need to maintain workarounds that were exclusive to pre-production parts. Production LNL hardware always has B0 or later steppings for both graphics and media IP. Eliminate all workarounds that were exclusive to A-step hardware and set the 'has_prod_wa_only' device flag for LNL to make sure we warn and taint if someone tries to load the driver on an old pre-production part. Bspec: 70821 Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patch.msgid.link/20251212181411.294854-4-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12 | drm/xe: Track pre-production workaround support | Matt Roper
When we're initially enabling driver support for a new platform/IP, we usually implement all workarounds documented in the WA database in the driver. Many of those workarounds are restricted to early steppings that only showed up in pre-production hardware (i.e., internal test chips that are not available to the general public). Since the workarounds for early, pre-production steppings tend to be some of the ugliest and most complicated workarounds, we generally want to eliminate them and simplify the code once the platform has launched and our internal usage of those pre-production parts has been phased out. Let's add a flag to the device info that tracks which platforms still have support for pre-production workarounds so that we can print a warning and taint if someone tries to load the driver on a pre-production part for a platform without pre-production workarounds. This will help our internal users understand the likely problems they'll encounter if they try to load the driver on an old pre-production device. The Xe behavior here is similar to what we've done for many years on i915 (see intel_detect_preproduction_hw()), except that instead of manually coding up ranges of device steppings that we believe to be pre-production hardware, Xe will use the hardware's own production vs pre-production fusing status, which we can read from the FUSE2 register. This fuse didn't exist on older Intel hardware, but should be present on all platforms supported by the Xe driver. Going forward, let's set the expectation that we'll start looking into removing pre-production workarounds for a platform around the time that platforms of the next major IP stepping are having their force_probe requirement lifted. This timing is just a rough guideline; there may be cases where some instances of pre-production parts are still being actively used in CI farms, internal device pools, etc. and we'll need to wait a bit longer for those to be swapped out.
v2: - Fix inverted forcewake check v3: - Invert flag and add it to the platforms on which we still have pre-prod workarounds. (Jani, Lucas) v4: - Avoid checking pre-production on VF since they don't have access to the FUSE2 register. Bspec: 78271, 52544 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251212181411.294854-3-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12 | drm/xe: Add debugfs support for page reclamation | Brian Nguyen
Allow for runtime modification to page reclamation feature through debugfs configuration. This parameter will only take effect if the platform supports the page reclamation feature by default. v2: - Minor comment tweaks. (Shuicheng) - Convert to kstrtobool_from_user. (Michal) - Only expose page reclaim file if page reclaim flag initially supported and with that, remove xe_match_desc usage. (Michal) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-22-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim | Brian Nguyen
There is additional hardware-managed L2$ flushing, such as for the transient display. In those scenarios, page reclamation is unnecessary and results in redundant cacheline flushes, so skip over the corresponding ranges. v2: - Elaborated on reasoning for page reclamation skip based on Tejas's discussion. (Matthew A, Tejas) v3: - Removed MEDIA_IS_ON due to racy condition resulting in removal of relevant registers and values. (Matthew A) - Moved l3 policy access to xe_pat. (Matthew A) v4: - Updated comments based on previous change. (Tejas) - Move back PAT index macros to xe_pat.c. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-21-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Append page reclamation action to tlb inval | Brian Nguyen
Add page reclamation action to tlb inval backend. The page reclamation action is paired with range tlb invalidations so both are issued at the same time. Page reclamation will issue the TLB invalidation with an invalid seqno and a H2G page reclamation action with the fence's corresponding seqno and handle the fence accordingly on page reclaim action done handler. If page reclamation fails, tlb timeout handler will be responsible for signalling fence and cleaning up. v2: - add send_page_reclaim to patch. - Remove flush_cache and use prl_sa pointer to determine PPC flush instead of explicit bool. Add NULL as fallback for others. (Matthew B) v3: - Add comments for flush_cache with media. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-20-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Prep page reclaim in tlb inval job | Brian Nguyen
Use the page reclaim list as an indicator of whether a page reclaim action is desired and pass it to the tlb inval fence to handle. The job needs to maintain its own embedded copy to ensure the PRL stays alive until the job has run. v2: - Use xe variant of WARN_ON (Michal) v3: - Add comments for PRL tile handling and flush behavior with media. (Matthew Brost) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-19-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Suballocate BO for page reclaim | Brian Nguyen
The page reclamation feature needs the PRL to be suballocated into a GGTT-mapped BO. On allocation failure, fall back to the default TLB invalidation with a full PPC flush. The PRL's BO allocation is managed in a separate pool to ensure 4K alignment for a proper GGTT address. Pass the BO into the TLB invalidation backend and modify the fence to accommodate it. v2: - Removed page reclaim related variables from TLB fence. (Matthew B) - Allocate PRL bo size to num_entries. (Matthew B) - Move PRL bo allocation to tlb_inval run_job. (Matthew B) v5: - Use xe_page_reclaim_list_valid. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-18-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Create page reclaim list on unbind | Brian Nguyen
Page reclaim list (PRL) is preparation work for the page reclaim feature. The PRL is first owned by pt_update_ops and all other page reclaim operations will point back to this PRL. PRL entries are generated during the unbind page walk. This PRL is restricted to a 4K page, so 512 page entries at most. v2: - Removed unused function. (Shuicheng) - Compacted warning checking, update commit message, spelling, etc. (Shuicheng, Matthew B) - Fix kernel docs - Moved PRL max entries overflow handling out from generate_reclaim_entry to caller (Shuicheng) - Add xe_page_reclaim_list_init for clarity. (Matthew B) - Modify xe_guc_page_reclaim_entry to use macros for greater flexibility. (Matthew B) - Add fallback for PTE outside of page reclaim supported 4K, 64K, 2M pages (Matthew B) - Invalidate PRL for early abort page walk. - Removed page reclaim related variables from tlb fence (Matthew Brost) - Remove error handling in *alloc_entries failure. (Matthew B) v3: - Fix NULL pointer dereference check. - Modify reclaim_entry to QW and bitfields accordingly. (Matthew B) - Add vm_dbg prints for PRL generation and invalidation. (Matthew B) v4: - s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI) v5: - Addition of xe_page_reclaim_list_is_new() to avoid continuous allocation of PRL if consecutive VMAs cause a PRL invalidation. - Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B) - Move xe_page_reclaim_list_entries_put in xe_page_reclaim_list_invalidate. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-17-brian3.nguyen@intel.com
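The 512-entry limit follows from the sizes quoted in the message: a 4K page holding quadword (8-byte) entries gives 4096 / 8 = 512. A small userspace sketch of that bound is below. The `prl_*` names are hypothetical, and marking the list invalid on overflow is an assumption about the fallback behaviour, not something the commit message spells out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PRL_PAGE_BYTES  4096u             /* the list lives in one 4K page */
#define PRL_ENTRY_BYTES sizeof(uint64_t)  /* one quadword per entry */
#define PRL_MAX_ENTRIES (PRL_PAGE_BYTES / PRL_ENTRY_BYTES)  /* 512 */

/* Hypothetical list type; not the Xe driver's structure. */
struct prl {
	uint64_t entries[PRL_MAX_ENTRIES];
	size_t count;
	bool valid;
};

static void prl_init(struct prl *p)
{
	p->count = 0;
	p->valid = true;
}

/*
 * Append one entry. When the page is full the list is marked invalid
 * (assumed fallback) and this and all later calls return false.
 */
static bool prl_add(struct prl *p, uint64_t entry)
{
	if (!p->valid)
		return false;

	if (p->count == PRL_MAX_ENTRIES) {
		p->valid = false;
		return false;
	}

	p->entries[p->count++] = entry;
	return true;
}
```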
2025-12-12 | drm/xe/guc: Add page reclamation interface to GuC | Brian Nguyen
Add page reclamation related changes to GuC interface, handlers, and senders to support page reclamation. Currently TLB invalidations will perform an entire PPC flush in order to prevent stale memory access for noncoherent system memory. Page reclamation is an extension of the typical TLB invalidation workflow, allowing the full PPC flush to be disabled and selective PPC flushing to be enabled. Selective flushing will be decided by a list of pages whose address is passed to GuC at time of action. Page reclamation interfaces require at least GuC FW ver 70.31.0. v2: - Moved send_page_reclaim to first patch usage. - Add comments explaining shared done handler. (Matthew B) - Add FW version fallback to disable page reclaim on older versions. (Matthew B, Shuicheng) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-16-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Add page reclamation info to device info | Oak Zeng
Starting from Xe3p, HW adds a feature assisting range based page reclamation. Introduce a bit in device info to indicate whether device has such capability. Signed-off-by: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-15-brian3.nguyen@intel.com
2025-12-12 | drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush | Brian Nguyen
Allow tlb_invalidation to control when the driver wants to flush the Private Physical Cache (PPC) as part of the TLB invalidation process. The default behavior is still to always flush the PPC, but the driver now has the option to disable it. v2: - Revise commit/kernel doc descriptions. (Shuicheng) - Remove unused function. (Shuicheng) - Remove bool flush_cache parameter from fence, and various function inputs. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-14-brian3.nguyen@intel.com
2025-12-12 | drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers | Matthew Brost
Certain TLB invalidation operations send multiple H2G messages per seqno with only the final H2G containing the valid seqno - the others carry an invalid seqno. The G2H handler drops these invalid seqnos to avoid prematurely signaling a TLB invalidation fence. With TLB_INVALIDATION_SEQNO_INVALID used to indicate in progress multi-step TLB invalidations, reset tdr to ensure that timeout won't prematurely trigger when G2H actions are still ongoing. v2: Remove lock from xe_tlb_inval_reset_timeout. (Matthew B) v3: Squash with dependent patch from Matthew Brost's series. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-13-brian3.nguyen@intel.com
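The handler behaviour described above might look roughly like the following userspace sketch. `SEQNO_INVALID`, `struct inval_state` and `handle_done()` are hypothetical; in particular the sentinel value chosen here is an assumption and not the value of TLB_INVALIDATION_SEQNO_INVALID in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQNO_INVALID UINT32_MAX  /* hypothetical sentinel value */

/* Hypothetical bookkeeping used only for this illustration. */
struct inval_state {
	uint32_t last_signaled;  /* last seqno passed to upper layers */
	int timer_resets;        /* how often the timeout was pushed out */
};

/*
 * Handle one "done" response. An invalid seqno marks an intermediate
 * step of a multi-step invalidation: push the timeout out and do not
 * signal anything. Only a valid seqno is forwarded. Returns true when
 * a fence would be signaled.
 */
static bool handle_done(struct inval_state *s, uint32_t seqno)
{
	if (seqno == SEQNO_INVALID) {
		s->timer_resets++;
		return false;
	}

	s->last_signaled = seqno;
	return true;
}
```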
2025-12-13Merge tag 'drm-misc-fixes-2025-12-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.19-rc1: - Fix stack usage warning in novatek-nt35560. - Fix s/r, i2c issues in nouveau and update string handling. - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83. - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties(). - Fix devcoredump crash on reading evicted bo's. - Fix bigendian handling in mgag200. - Fix probe failure in tilcdc. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com
2025-12-12drm/xe: Restore engine registers before restarting schedulers after GT resetJan Maslak
During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored. This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet. Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak <jan.maslak@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
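The ordering problem described in the entry above can be modeled in a few lines of user-space C. This is a sketch under stated assumptions, not the driver code: `struct toy_gt` and the `toy_*` helpers are hypothetical, and the model takes the worst case in which a job runs the moment the scheduler starts.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy GT state; every name here is invented for illustration. */
struct toy_gt {
	bool regs_restored;     /* engine registers written back after reset */
	int jobs_without_regs;  /* jobs that ran before the registers were restored */
};

/* Stands in for the MMIO register restoration step. */
static void toy_restore_regs(struct toy_gt *gt)
{
	gt->regs_restored = true;
}

/* Stands in for starting the scheduler; assume a job runs immediately. */
static void toy_start_sched(struct toy_gt *gt)
{
	if (!gt->regs_restored)
		gt->jobs_without_regs++;
}

/* Old order: scheduler first, registers second (race window). */
static void toy_restart_old(struct toy_gt *gt)
{
	toy_start_sched(gt);
	toy_restore_regs(gt);
}

/* New order: all register state first, scheduler last. */
static void toy_restart_new(struct toy_gt *gt)
{
	toy_restore_regs(gt);
	toy_start_sched(gt);
}
```

With the old order the model records one job that ran against unrestored registers; with the new order it records none.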
2025-12-12drm/xe: Increase TDF timeoutJagmeet Randhawa
There are some corner cases where flushing transient data may take slightly longer than the 150us timeout we currently allow. Update the driver to use a 300us timeout instead based on the latest guidance from the hardware team. An update to the bspec to formally document this is expected to arrive soon. Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush") Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12drm/xe/cri: Enable I2C controllerRaag Jadav
Enable I2C controller for Crescent Island and while at it, rely on has_i2c flag instead of manual platform checks. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251128084414.306265-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12drm: Fix object leak in DRM_IOCTL_GEM_CHANGE_HANDLEKarol Wachowski
Add missing drm_gem_object_put() call when drm_gem_object_lookup() successfully returns an object. This fixes a GEM object reference leak that can prevent driver modules from unloading when using prime buffers. Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle") Cc: <stable@vger.kernel.org> # v6.18+ Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
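The leak fixed in the entry above follows a common reference-counting pattern: a lookup that succeeds returns the object with an extra reference, and the caller has to drop it. The sketch below models that pattern in user-space C. `struct toy_obj` and the `toy_*` functions are hypothetical stand-ins, not the DRM API, and the handle reassignment itself is elided.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy reference-counted object; not the kernel's GEM object. */
struct toy_obj {
	int refcount;
};

/* Models a lookup: on success the caller owns one extra reference. */
static struct toy_obj *toy_lookup(struct toy_obj *obj)
{
	if (obj)
		obj->refcount++;
	return obj;
}

/* Models dropping a reference taken by toy_lookup(). */
static void toy_put(struct toy_obj *obj)
{
	if (obj)
		obj->refcount--;
}

/* Leaky variant: returns without dropping the lookup reference. */
static int toy_change_handle_leaky(struct toy_obj *obj)
{
	struct toy_obj *found = toy_lookup(obj);

	if (!found)
		return -ENOENT;
	/* ... handle reassignment would happen here ... */
	return 0;
}

/* Fixed variant: every successful lookup is paired with a put. */
static int toy_change_handle_fixed(struct toy_obj *obj)
{
	struct toy_obj *found = toy_lookup(obj);

	if (!found)
		return -ENOENT;
	/* ... handle reassignment would happen here ... */
	toy_put(found);
	return 0;
}
```

In the leaky variant the count never returns to its starting value, which is the kind of lingering reference that can keep a module from unloading.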
2025-12-12drm/{i915, xe}/panic: move panic handling to parent interfaceJani Nikula
Move the panic handling to the display parent interface, making display more independent of i915 and xe driver implementations. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/e27eca5424479e8936b786018d0af19a34f839f6.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12drm/i915/panic: move i915 specific panic implementation to i915Jani Nikula
The intel_panic.c implementation is i915 specific, and xe has its own. Move it to i915 core as i915_panic.c. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/8dc7af0ae1f859d17b0be269a545146c5536d8fc.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12drm/tests: Handle EDEADLK in set_up_atomic_state()José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_validate_modeset test [1]: # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:162 Expected ret == 0, but ret == -35 (0xffffffffffffffdd) Change the set_up_atomic_state() helper function to return on error and restart the atomic sequence when the returned error is EDEADLK. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com
2025-12-12drm/tests: Handle EDEADLK in drm_test_check_valid_clones()José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_test_check_valid_clones() KUnit test. The error log can be either [1]: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == 0 (0x0) Or [2] depending on the test case: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == -22 (0xffffffffffffffea) Restart the atomic sequence when EDEADLK is returned. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log [2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com
2025-12-12drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning ↵José Expósito
EDEADLK Fedora/CentOS/RHEL CI is reporting intermittent failures while running the KUnit tests present in drm_hdmi_state_helper_test.c [1]. While the specific test causing the failure changes between runs, all of them are caused by drm_kunit_helper_enable_crtc_connector() returning -EDEADLK. The error trace always follows this structure: # <test name>: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line> Expected ret == 0, but ret == -35 (0xffffffffffffffdd) As documented, if the drm_kunit_helper_enable_crtc_connector() function returns -EDEADLK (-35), the entire atomic sequence must be restarted. Handle this error code for all function calls. Closes: https://datawarehouse.cki-project.org/issue/4039 [1] Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper") Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
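The three drm/tests entries above share one idea: when a step of an atomic sequence returns -EDEADLK, back off and restart the whole sequence instead of treating the error as a test failure. The sketch below shows that retry shape in user-space C. `struct toy_ctx` and the `toy_*` functions are hypothetical; in the kernel the back-off step also involves releasing the contended locks and clearing the atomic state, which this model reduces to a counter.

```c
#include <assert.h>
#include <errno.h>

/* Toy context; all names are invented for illustration. */
struct toy_ctx {
	int contended_left; /* attempts that will still hit lock contention */
	int attempts;       /* how many times the sequence was started */
	int backoffs;       /* how many times we backed off before retrying */
};

/* Models one run of the atomic sequence; may report a would-be deadlock. */
static int toy_atomic_sequence(struct toy_ctx *ctx)
{
	ctx->attempts++;
	if (ctx->contended_left > 0) {
		ctx->contended_left--;
		return -EDEADLK;
	}
	return 0;
}

/* Models the back-off step taken before the sequence is restarted. */
static void toy_backoff(struct toy_ctx *ctx)
{
	ctx->backoffs++;
}

/* Restart the whole sequence for as long as it reports -EDEADLK. */
static int toy_run_with_retry(struct toy_ctx *ctx)
{
	int ret;

retry:
	ret = toy_atomic_sequence(ctx);
	if (ret == -EDEADLK) {
		toy_backoff(ctx);
		goto retry;
	}
	return ret;
}
```

A test written this way only asserts on the final return value, so a transient -EDEADLK no longer shows up as an intermittent CI failure.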
2025-12-12Merge tag 'drm-intel-next-fixes-2025-12-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 fixes for v6.19-rc1: - Fix format string truncation warning - Fix runtime PM reference during fbdev BO creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com
2025-12-11drm/xe/doc: Add documentation for Multi Queue Group GuC interfaceNiranjana Vishwanathapura
Add kernel documentation for Multi Queue group GuC interface. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-36-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/doc: Add documentation for Multi Queue GroupNiranjana Vishwanathapura
Add kernel documentation for Multi Queue group and update the corresponding rst. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-35-niranjana.vishwanathapura@intel.com