summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2025-11-14PCI: Fix restoring BARs on BAR resize rollback pathIlpo Järvinen
BAR resize operation is implemented in the pci_resize_resource() and pbus_reassign_bridge_resources() functions. pci_resize_resource() can be called either from __resource_resize_store() from sysfs or directly by the driver for the Endpoint Device. The pci_resize_resource() requires that caller has released the device resources that share the bridge window with the BAR to be resized as otherwise the bridge window is pinned in place and cannot be changed. pbus_reassign_bridge_resources() rolls back resources if the resize operation fails, but rollback is performed only for the bridge windows. Because releasing the device resources are done by the caller of the BAR resize interface, these functions performing the BAR resize do not have access to the device resources as they were before the resize. pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources() after rolling back the bridge windows as they were, however, it will not guarantee the resource are assigned due to differences in how FW and the kernel assign the resources (alignment of the start address and tail). To perform rollback robustly, the BAR resize interface has to be altered to also release the device resources that share the bridge window with the BAR to be resized. Also, remove restoring from the entries failed list as saved list should now contain both the bridge windows and device resources so the extra restore is duplicated work. Some drivers (currently only amdgpu) want to prevent releasing some resources. Add exclude_bars param to pci_resize_resource() and make amdgpu pass its register BAR (BAR 2 or 5), which should never be released during resize operation. Normally 64-bit prefetchable resources do not share a bridge window with the 32-bit only register BAR, but there are various fallbacks in the resource assignment logic which may make the resources share the bridge window in rare cases. This change (together with the driver side changes) is to counter the resource releases that had to be done to prevent resource tree corruption in the ("PCI: Release assigned resource before restoring them") change. As such, it likely restores functionality in cases where device resources were released to avoid resource tree conflicts which appeared to be "working" when such conflicts were not correctly detected by the kernel. Reported-by: Simon Richter <Simon.Richter@hogyros.de> Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/ Reported-by: Alex Bennée <alex.bennee@linaro.org> Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash amdgpu BAR selection from https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
2025-11-14drm/tegra: dsi: Calculate packet parameters for video modeSvyatoslav Ryhel
Calculate packet parameters for video mode same way it is done for command mode, by halving timings plugged into equations. Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250909073335.91531-3-clamor95@gmail.com
2025-11-14drm/tegra: dsi: Make SOL delay calculation mode independentSvyatoslav Ryhel
Move SOL delay calculation outside of video mode conditions. Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250909073335.91531-2-clamor95@gmail.com
2025-11-14gpu: host1x: Syncpoint interrupt performance optimizationMikko Perttunen
Optimize performance of syncpoint interrupt handling by reading the status register in 64-bit chunks when possible, and skipping processing when the read value is zero. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250917-host1x-syncpt-irq-perf-v2-1-736ef69b1347@nvidia.com
2025-11-14Revert "drm/tegra: dsi: Clear enable register if powered by bootloader"Diogo Ivo
Commit b6bcbce33596 ("soc/tegra: pmc: Ensure power-domains are in a known state") was introduced so that all power domains get initialized to a known working state when booting and it does this by shutting them down (including asserting resets and disabling clocks) before registering each power domain with the genpd framework, leaving it to each driver to later on power its needed domains. This caused the Google Pixel C to hang when booting due to a workaround in the DSI driver introduced in commit b22fd0b9639e ("drm/tegra: dsi: Clear enable register if powered by bootloader") meant to handle the case where the bootloader enabled the DSI hardware module. The workaround relies on reading a hardware register to determine the current status and after b6bcbce33596 that now happens in a powered down state thus leading to the boot hang. Fix this by reverting b22fd0b9639e since currently we are guaranteed that the hardware will be fully reset by the time we start enabling the DSI module. Fixes: b6bcbce33596 ("soc/tegra: pmc: Ensure power-domains are in a known state") Cc: stable@vger.kernel.org Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20251103-diogo-smaug_ec_typec-v1-1-be656ccda391@tecnico.ulisboa.pt
2025-11-14drm/tegra: Add call to put_pid()Prateek Agarwal
Add a call to put_pid() corresponding to get_task_pid(). host1x_memory_context_alloc() does not take ownership of the PID so we need to free it here to avoid leaking. Signed-off-by: Prateek Agarwal <praagarwal@nvidia.com> Fixes: e09db97889ec ("drm/tegra: Support context isolation") [mperttunen@nvidia.com: reword commit message] Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250919-host1x-put-pid-v1-1-19c2163dfa87@nvidia.com
2025-11-14drm/xe: Remove duplicate DRM_EXEC selection from KconfigShuicheng Lin
There are 2 identical "select DRM_EXEC" lines for DRM_XE. Remove one to clean up the configuration. Fixes: d490ecf57790 ("drm/xe: Rework xe_exec and the VM rebind worker to use the drm_exec helper") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251110232657.1807998-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-14drm/tegra: dc: Fix reference leak in tegra_dc_couple()Ma Ke
driver_find_device() calls get_device() to increment the reference count once a matching device is found, but there is no put_device() to balance the reference count. To avoid reference count leakage, add put_device() to decrease the reference count. Found by code review. Cc: stable@vger.kernel.org Fixes: a31500fe7055 ("drm/tegra: dc: Restore coupling of display controllers") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Acked-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20251022114720.24937-1-make24@iscas.ac.cn
2025-11-14drm/xe/kunit: Fix forcewake assertion in mocs testMatt Roper
The MOCS kunit test calls KUNIT_ASSERT_TRUE_MSG() with a condition of 'true;' this prevents the assertion from ever failing. Replace KUNIT_ASSERT_TRUE_MSG with KUNIT_FAIL_AND_ABORT to get the intended failure behavior in cases where forcewake was not acquired successfully. Fixes: 51c0ee84e4dc ("drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs") Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patch.msgid.link/20251113234038.2256106-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-11-14drm/amdgpu: Use amdgpu by default on SI dedicated GPUs (v2)Timur Kristóf
Now that the DC analog connector support and VCE1 support landed, amdgpu is at feature parity with the old radeon driver on SI dGPUs. Enabling the amdgpu driver by default for SI dGPUs has the following benefits: - More stable OpenGL support through RadeonSI - Vulkan support through RADV - Improved performance - Better display features through DC Users who want to keep using the old driver can do so using: amdgpu.si_support=0 radeon.si_support=1 v2: - Update documentation in Kconfig file Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Use amdgpu by default on CIK dedicated GPUsTimur Kristóf
The amdgpu driver has been working well on CIK dGPUs for years. Now that the DC analog connector support landed, amdgpu is at feature parity with the old radeon driver on CIK dGPUs. Enabling the amdgpu driver by default for CIK dGPUs has the following benefits: - More stable OpenGL support through RadeonSI - Vulkan support through RADV - Improved performance - Better display features through DC Users who want to keep using the old driver can do so using: amdgpu.cik_support=0 radeon.cik_support=1 v2: - Update documentation in Kconfig file v3: - Rebase documentation updates (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Fix the issue of missing ras message on sriov hostYiPeng Chai
This code only applies to amdgpu processing poison consumption after uniras is enabled, but not to sriov. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Add lock to serialize sriov command executionYiPeng Chai
Add lock to serialize sriov command execution. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Synchronize sriov host to add block_mmsch bit fieldYiPeng Chai
Synchronize sriov host to add block_mmsch bit field. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: use GFP_ATOMIC instead of NOWAIT in the critical pathChristian König
Otherwise job submissions can fail with ENOMEM. We probably need to re-design the per VMID tracking at some point. Signed-off-by: Christian König <christian.koenig@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4258 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: avoid memory allocation in the critical code path v3Christian König
When we run out of VMIDs we need to wait for some to become available. Previously we were using a dma_fence_array for that, but this means that we have to allocate memory. Instead just wait for the first not signaled fence from the least recently used VMID to signal. That is not as efficient since we end up in this function multiple times again, but allocating memory can easily fail or deadlock if we have to wait for memory to become available. v2: remove now unused VM manager fields v3: fix dma_fence reference Signed-off-by: Christian König <christian.koenig@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4258 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Enable xgmi extended peer links for sriov guestWill Aitken
The amd-smi tool relies on extended peer link information to report xgmi link metrics. The necessary xgmi ta command, GET_EXTEND_PEER_LINKS, has been enabled in the host driver and this change is necessary for the guest to make use of it. To handle the case where the host driver does not have the latest xgmi ta, the guest driver checks for guest support through a pf2vf feature flag before invoking psp. Signed-off-by: Will Aitken <wiaitken@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Update headers for sriov xgmi ext peer link support feature flagWill Aitken
Adds new sriov msg flag to match host, feature flag in the amdgim enum, and a wrapper macro to check it. Signed-off-by: Will Aitken <wiaitken@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Refactor sriov xgmi topology filling to common codeWill Aitken
amdgpu_xgmi_fill_topology_info and psp_xgmi_reflect_topology_info perform the same logic of copying topology info of one node to every other node in the hive. Instead of having two functions that purport to do the same thing, this refactoring moves the logic of the fill function to the reflect function and adds reflecting port number info as well for complete functionality. Signed-off-by: Will Aitken <wiaitken@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Use amdgpu by default on CIK dedicated GPUsTimur Kristóf
The amdgpu driver has been working well on CIK dGPUs for years. Now that the DC analog connector support landed, these GPUs are at feature parity with the old radeon driver. Additionally, amdgpu yields extra performance, supports Vulkan and provides more display features through DC as well as more robust power management. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Refactor how SI and CIK support is determinedTimur Kristóf
Move the determination into a separate function. Change amdgpu.si_support and amdgpu.cik_support so that their default value is -1 (default). This prepares the code for changing the default driver based on the chip. Also adjust the module param documentation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/radeon: Refactor how SI and CIK support is determinedTimur Kristóf
Move the determination into a separate function. Change radeon.si_support and radeon.cik_support so that their default value is -1 (default). This prepares the code for changing the default driver based on the chip. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/amdgpu: Avoid xgmi register accessLijo Lazar
On single GPU systems, avoid accesses to XGMI link registers. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-14drm/xe/pf: Fix kernel-doc warning in migration_save_consumeMichał Winiarski
The kernel-doc for xe_sriov_pf_migration_save_consume() contained multiple "Return:" sections, causing a warning. Fix it by removing the extra line. Fixes: 67df4a5cbc583 ("drm/xe/pf: Add data structures and handlers for migration rings") Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251114134030.1795947-1-michal.winiarski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-14drm/msm/disp: fix kernel-doc warningsRandy Dunlap
Fix all kernel-doc warnings in msm_disp_snapshot.h: msm_disp_snapshot.h:53: warning: Function parameter or struct member 'blocks' not described in 'msm_disp_state' msm_disp_snapshot.h:69: warning: Function parameter or struct member 'node' not described in 'msm_disp_state_block' msm_disp_snapshot.h:69: warning: Excess struct member 'drm_dev' description in 'msm_disp_state_block' msm_disp_snapshot.h:95: warning: No description found for return value of 'msm_disp_snapshot_state_sync' msm_disp_snapshot.h:100: warning: bad line: msm_disp_snapshot.h:117: warning: bad line: msm_disp_snapshot.h:125: warning: bad line: msm_disp_snapshot.h:142: warning: Excess function parameter 'name' description in 'msm_disp_snapshot_add_block' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/687132/ Link: https://lore.kernel.org/r/20251111060353.1972869-1-rdunlap@infradead.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm: mdss: Add QCS8300 supportYongxing Mou
Add Mobile Display Subsystem (MDSS) support for the QCS8300 platform. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Yongxing Mou <yongxing.mou@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/684205/ Link: https://lore.kernel.org/r/20251029-qcs8300_mdss-v13-5-e8c8c4f82da2@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/dp: Add support for GlymurAbel Vesa
The Qualcomm Glymur platform comes with 4 DisplayPort controllers, which have a different core revision compared to all previous platforms. Describe them and add the compatible. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/683722/ Link: https://lore.kernel.org/r/20251027-glymur-display-v3-6-aa13055818ac@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/dpu: Add support for GlymurAbel Vesa
Add DPU version v12.2 support for the Glymur platform. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/683721/ Link: https://lore.kernel.org/r/20251027-glymur-display-v3-5-aa13055818ac@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/mdss: Add Glymur device configurationAbel Vesa
Add Mobile Display Subsystem (MDSS) support for the Glymur platform. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/683718/ Link: https://lore.kernel.org/r/20251027-glymur-display-v3-4-aa13055818ac@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/dpu: drop dpu_hw_dsc_destroy() prototypeDmitry Baryshkov
The commit a106ed98af68 ("drm/msm/dpu: use devres-managed allocation for HW blocks") dropped all dpu_hw_foo_destroy() functions, but the prototype for dpu_hw_dsc_destroy() was omitted. Drop it now to clean up the header. Fixes: a106ed98af68 ("drm/msm/dpu: use devres-managed allocation for HW blocks") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Jessica Zhang <jesszhan0024@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/683697/ Link: https://lore.kernel.org/r/20251027-dpu-drop-dsc-destroy-v1-1-968128de4bf6@oss.qualcomm.com
2025-11-14drm/msm/dp: Add support for lane mapping configurationXiangxu Yin
QCS615 platform requires non-default logical-to-physical lane mapping due to its unique hardware routing. Unlike the standard mapping sequence <0 1 2 3>, QCS615 uses <3 2 0 1>, which necessitates explicit configuration via the data-lanes property in the device tree. This ensures correct signal routing between the DP controller and PHY. For partial definitions, fill remaining lanes with unused physical lanes in ascending order. Signed-off-by: Xiangxu Yin <xiangxu.yin@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/675645/ Link: https://lore.kernel.org/r/20250919-add-displayport-support-for-qcs615-platform-v5-14-eae6681f4002@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/dp: move link-specific parsing from dp_panel to dp_linkXiangxu Yin
Since max_dp_lanes and max_dp_link_rate are link-specific parameters, move their parsing from dp_panel to dp_link for better separation of concerns. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Xiangxu Yin <xiangxu.yin@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/675643/ Link: https://lore.kernel.org/r/20250919-add-displayport-support-for-qcs615-platform-v5-13-eae6681f4002@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14drm/msm/dpu: Enable quad-pipe for DSC and dual-DSI caseJun Nie
To support high-resolution cases that exceed the width limitation of a pair of SSPPs, or scenarios that surpass the maximum MDP clock rate, additional pipes are necessary to enable parallel data processing within the SSPP width constraints and MDP clock rate. Request 4 mixers and 4 DSCs for high-resolution cases where both DSC and dual interfaces are enabled. More use cases can be incorporated later if quad-pipe capabilities are required. Signed-off-by: Jun Nie <jun.nie@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/675418/ Link: https://lore.kernel.org/r/20250918-v6-16-rc2-quad-pipe-upstream-4-v16-10-ff6232e3472f@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-11-14gpu: nova-core: implement Display for SpecJohn Hubbard
Implement Display for Spec. This simplifies the dev_info!() code for printing banners such as: NVIDIA (Chipset: GA104, Architecture: Ampere, Revision: a.1) Cc: Alexandre Courbot <acourbot@nvidia.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Timur Tabi <ttabi@nvidia.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251114024109.465136-2-jhubbard@nvidia.com>
2025-11-14drm/xe: Prevent BIT() overflow when handling invalid prefetch regionShuicheng Lin
If user provides a large value (such as 0x80) for parameter prefetch_mem_region_instance in vm_bind ioctl, it will cause BIT(prefetch_region) overflow as below: " ------------[ cut here ]------------ UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7 shift exponent 128 is too large for 64-bit type 'long unsigned int' CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G W 6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary) Tainted: [W]=WARN Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 Call Trace: <TASK> dump_stack_lvl+0xa0/0xc0 dump_stack+0x10/0x20 ubsan_epilogue+0x9/0x40 __ubsan_handle_shift_out_of_bounds+0x10e/0x170 ? mutex_unlock+0x12/0x20 xe_vm_bind_ioctl.cold+0x20/0x3c [xe] ... " Fix it by validating prefetch_region before the BIT() usage. v2: Add Closes and Cc stable kernels. (Matt) Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478 Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com
2025-11-14drm/xe/pat: Add helper to query compression enable statusXin Wang
Add xe_pat_index_get_comp_en() helper function to check whether compression is enabled for a given PAT index by extracting the XE2_COMP_EN bit from the PAT table entry. There are no current users, however there are multiple in-flight series which will all use this helper. CC: Nitin Gote <nitin.r.gote@intel.com> CC: Sanjay Yadav <sanjay.kumar.yadav@intel.com> CC: Matt Roper <matthew.d.roper@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Xin Wang <x.wang@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20251110221458.1864507-2-x.wang@intel.com
2025-11-14gpu: nova-core: gsp: Boot GSPAlistair Popple
Boot the GSP to the RISC-V active state. Completing the boot requires running the CPU sequencer which will be added in a future commit. Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-15-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: falcon: Add support to write firmware versionJoel Fernandes
This will be needed by both the GSP boot code as well as GSP resume code in the sequencer. Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-14-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: falcon: Add support to check if RISC-V is activeJoel Fernandes
Add definition for RISCV_CPUCTL register and use it in a new falcon API to check if the RISC-V core of a Falcon is active. It is required by the sequencer to know if the GSP's RISCV processor is active. Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-13-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: gsp: Add SetRegistry commandAlistair Popple
Add support for sending the SetRegistry command, which is critical to GSP initialization. The RM registry is serialized into a packed format and sent via the command queue. For now only three parameters which are required to boot GSP are hardcoded. In the future a kernel module parameter will be added to enable other parameters to be added. Signed-off-by: Alistair Popple <apopple@nvidia.com> [acourbot@nvidia.com: split into its own patch.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-12-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: gsp: Add SetSystemInfo commandAlistair Popple
Add support for sending the SetSystemInfo command, which provides required hardware information to the GSP and is critical to its initialization. Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-11-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: gsp: Create rmargsAlistair Popple
Initialise the GSP resource manager arguments (rmargs) which provides initialisation parameters to the GSP firmware during boot. The rmargs structure contains arguments to configure the GSP message/command queue location. These are mapped for coherent DMA and added to the libos data structure for access when booting GSP. Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-10-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: gsp: Add GSP command queue bindings and handlingAlistair Popple
This commit introduces core infrastructure for handling GSP command and message queues in the nova-core driver. The command queue system enables bidirectional communication between the host driver and GSP firmware through a remote message passing interface. The interface is based on passing serialised data structures over a ring buffer with separate transmit and receive queues. Commands are sent by writing to the CPU transmit queue and waiting for completion via the receive queue. To ensure safety mutable or immutable (depending on whether it is a send or receive operation) references are taken on the command queue when allocating the message to write/read to. This ensures message memory remains valid and the command queue can't be mutated whilst an operation is in progress. Currently this is only used by the probe() routine and therefore can only used by a single thread of execution. Locking to enable safe access from multiple threads will be introduced in a future series when that becomes necessary. Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-9-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: Add zeroable trait to bindingsAlistair Popple
Derive the Zeroable trait for existing bindgen generated bindings. This is safe because all bindgen generated types are simple integer types for which any bit pattern, including all zeros, is valid. Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-7-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: Add a slice-buffer (sbuffer) datastructureJoel Fernandes
A data structure that can be used to write across multiple slices which may be out of order in memory. This lets SBuffer user correctly and safely write out of memory order, without error-prone tracking of pointers/offsets. let mut buf1 = [0u8; 3]; let mut buf2 = [0u8; 5]; let mut sbuffer = SBuffer::new([&mut buf1[..], &mut buf2[..]]); let data = b"hello"; let result = sbuffer.write(data); Reviewed-by: Lyude Paul <lyude@redhat.com> Co-developed-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-6-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: gsp: Create wpr metadataAlistair Popple
The GSP requires some pieces of metadata to boot. These are passed in a struct which the GSP transfers via DMA. Create this struct and get a handle to it for future use when booting the GSP. Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-5-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: Create initial GspAlistair Popple
The GSP requires several areas of memory to operate. Each of these have their own simple embedded page tables. Set these up and map them for DMA to/from GSP using CoherentAllocation's. Return the DMA handle describing where each of these regions are for future use when booting GSP. Signed-off-by: Alistair Popple <apopple@nvidia.com> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-4-8ae4058e3c0e@nvidia.com>
2025-11-14gpu: nova-core: num: add functions to safely convert a const value to a ↵Alexandre Courbot
smaller type There are times where we need to store a constant value defined as a larger type (e.g. through a binding) into a smaller type, knowing that the value will fit. Rust, unfortunately, only provides us with the `as` operator for that purpose, the use of which is discouraged as it silently strips data. Extend the `num` module with functions allowing to perform the conversion infallibly, at compile time. Example: const FOO_VALUE: u32 = 1; // `FOO_VALUE` fits into a `u8`, so the conversion is valid. let foo = num::u32_to_u8::<{ FOO_VALUE }>(); We are going to use this feature extensively in Nova. Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> [acourbot@nvidia.com: fix mistake in example pointed out by Mikko.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Message-ID: <20251110-gsp_boot-v9-3-8ae4058e3c0e@nvidia.com>
2025-11-14Merge tag 'drm-xe-fixes-2025-11-13' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - New HW workarounds affecting PTL and WCL platforms (Nitin Gote, Tangudu Tilak Tirumalesh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/ay2qztgonodwson6tuzcv5napjmqbgwzv27so4ybfola34guux@xgufrrmbzyws
2025-11-14Merge tag 'drm-intel-fixes-2025-11-13' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix PSR's pipe to vblank conversion (Jani) - Disable Panel Replay on MST links (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aRXdQnitzyFcokhF@intel.com