| Age | Commit message (Collapse) | Author |
|
[Why & How]
Although it's dummy updates of surface update for committing stream
updates, we should not have dummy_updates[j].surface all indicating
to the same surface under multiple surfaces case. Otherwise,
copy_surface_update_to_plane() in update_planes_and_stream_state()
will update to the same surface only.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
System will try to apply idle power optimizations setting during
system resume. But system power state is still in D3 state, and
it will cause the idle power optimizations command not actually
to be sent to DMUB and cause some platforms to go into IPS.
[How]
Set power state to D0 first before calling the
dc_dmub_srv_apply_idle_power_optimizations(dm->dc, false)
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use U64 for accumulation counter in gpu metrics for smu_v13_0_6 and
smu_v13_0_12
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add accumulation counter and firmware timestamp to partition metrics for
smu_v13_0_6 & smu_v13_0_12
v2: Use U64 for accumulation counter (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull drm updates from Dave Airlie:
"Highlights:
- amdgpu support for lots of new IP blocks which means newer GPUs
- xe has a lot of SR-IOV and SVM improvements
- lots of intel display refactoring across i915/xe
- msm has more support for gen8 platforms
- Given up on kgdb/kms integration, it's too hard on modern hw
core:
- drop kgdb support
- replace system workqueue with percpu
- account for property blobs in memcg
- MAINTAINERS updates for xe + buddy
rust:
- Fix documentation for Registration constructors
- Use pin_init::zeroed() for fops initialization
- Annotate DRM helpers with __rust_helper
- Improve safety documentation for gem::Object::new()
- Update AlwaysRefCounted imports
- mm: Prevent integer overflow in page_align()
atomic:
- add drm_device pointer to drm_private_obj
- introduce gamma/degamma LUT size check
buddy:
- fix free_trees memory leak
- prevent BUG_ON
bridge:
- introduce drm_bridge_unplug/enter/exit
- add connector argument to .hpd_notify
- lots of recounting conversions
- convert rockchip inno hdmi to bridge
- lontium-lt9611uxc: switch to HDMI audio helpers
- dw-hdmi-qp: add support for HPD-less setups
- Algoltek AG6311 support
panels:
- edp: CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
- st75751: add SPI support
- Sitronix ST7920, Samsung LTL106HL02
- LG LH546WF1-ED01, HannStar HSD156J
- BOE NV130WUM-T08
- Innolux G150XGE-L05
- Anbernic RG-DS
dma-buf:
- improve sg_table debugging
- add tracepoints
- call clear_page instead of memset
- start to introduce cgroup memory accounting in heaps
- remove sysfs stats
dma-fence:
- add new helpers
dp:
- mst: avoid oob access with vcpi=0
hdmi:
- limit infoframes exposure to userspace
gem:
- reduce page table overhead with THP
- fix leak in drm_gem_get_unmapped_area
gpuvm:
- API sanitation for rust bindings
sched:
- introduce new helpers
panic:
- report invalid panic modes
- add kunit tests
i915/xe display:
- Expose sharpness only if num_scalers is >= 2
- Add initial Xe3P_LPD for NVL
- BMG FBC support
- Add MTL+ platforms to support dpll framework
_ fix DIMM_S DRM decoding on ICL
- Return to using AUX interrupts
- PSR/Panel replay refactoring
- use consolidation HDMI tables
- Xe3_LPD CD2X dividier changes
xe:
- vfio: add vfio_pci for intel GPU
- multi queue support
- dynamic pagemaps and multi-device SVM
- expose temp attribs in hwmon
- NO_COMPRESSION bo flag
- expose MERT OA unit
- sysfs survivability refactor
- SRIOV PF: add MERT support
- enable SR-IOV VF migration
- Enable I2C/NVM on Crescent Island
- Xe3p page reclaimation support
- introduce SRIOV scheduler groups
- add SoC remappt support in system controller
- insert compiler barriers in GuC code
- define NVL GuC firmware
- handle GT resume failure
- fix drm scheduler layering violations
- enable GSC loading and PXP for PTL
- disable GuC Power DCC strategy on PTL
- unregister drm device on probe error
i915:
- move to kernel standard fault injection
- bump recommended GuC version for DG2 and MTL
amdgpu:
- SMUIO 15.x, PSP 15.x support
- IH 6.1.1/7.1 support
- MMHUB 3.4/4.2 support
- GC 11.5.4/12.1 support
- SDMA 6.1.4/7.1/7.11.4 support
- JPEG 5.3 support
- UserQ updates
- GC 9 gfx queue reset support
- TTM memory ops parallelization
- convert legacy logging to new helpers
- DC analog fixes
amdkfd:
- GC 11.5.4/12.1 suppport
- SDMA 6.1.4/7.1 support
- per context support
- increase kfd process hash table
- Reserved SDMA rework
radeon:
- convert legacy logging to new helpers
- use devm for i2c adapters
msm:
- GPU
- Document a612/RGMU dt bindings
- UBWC 6.0 support (for A840 / Kaanapali)
- a225 support
- DPU:
- Switch to use virtual planes by default
- Fix DSI CMD panels on DPU 3.x
- Rewrite format handling to remove intermediate representation
- Fix watchdog on DPU 8.x+
- Fix TE / Vsync source setting on DPU 8.x+
- Add 3D_Mux on SC7280
- Kaanapali platform support
- Fix UBWC register programming
- Make RM reserve DSPP-enabled mixers for CRTCs with LMs
- Gamma correction support
- DP:
- Enable support for eDP 1.4+ link rate tables
- Fix MDSS1 DP indices on SA8775P, making them to work
- Fix msm_dp_ctrl_config_msa() to work with LLVM 20
- DSI:
- Document QCS8300 as compatible with SA8775P
- Kaanapali platform support
- DSI PHY:
- switch to divider_determine_rate()
- MDP5:
- Drop support for MSM8998, SDM660 and SDM630 (switch over to DPU)
- MDSS:
- Kaanapali platform support
- Fixed UBWC register programming
nova-core:
- Prepare for Turing support. This includes parsing and handling
Turing-specific firmware headers and sections as well as a Turing
Falcon HAL implementation
- Get rid of the Result<impl PinInit<T, E>> anti-pattern
- Relocate initializer-specific code into the appropriate initializer
- Use CStr::from_bytes_until_nul() to remove custom helpers
- Improve handling of unexpected firmware values
- Clean up redundant debug prints
- Replace c_str!() with native Rust C-string literals
- Update nova-core task list
nova:
- Align GEM object size to system page size
tyr:
- Use generated uAPI bindings for GpuInfo
- Replace manual sleeps with read_poll_timeout()
- Replace c_str!() with native Rust C-string literals
- Suppress warnings for unread fields
- Fix incorrect register name in print statement
nouveau:
- fix big page table support races in PTE management
- improve reclocking on tegra 186+
amdxdna:
- fix suspend race conditions
- improve handling of zero tail pointers
- fix cu_idx overwritten during command setup
- enable hardware context priority
- remove NPU2 support
- update message buffer allocation requirements
- update firmware version check
ast:
- support imported cursor buffers
- big endian fixes
etnaviv:
- add PPU flop reset support
imagination:
- add AM62P support
- introduce hw version checks
ivpu:
- implement warm boot flow
panfrost:
- add bo sync ioctl
- add GPU_PM_RT support for RZ/G3E SoC
panthor:
- add bo sync ioctl
- enable timestamp propagation
- scheduler robustness improvements
- VM termination fixes
- huge page support
rockchip:
- RK3368 HDMI Support
- get rid of atomic_check fixups
- RK3506 support
- RK3576/RK3588 improved HPD handling
rz-du:
- RZ/V2H(P) MIPI-DSI Support
v3d:
- fix DMA segment size
- convert to new logging helpers
mediatek:
- move DP training to hotplug thread
- convert logging to new helpers
- add support for HS speed DSI
- Genio 510/700/1200-EVK, Radxa NIO-12L HDMI support
atmel-hlcdc:
- switch to drmm resource
- support nomodeset
- use newer helpers
hisilicon:
- fix various DP bugs
renesas:
- fix kernel panic on reboot
exynos:
- fix vidi_connection_ioctl using wrong device
- fix vidi_connection deref user ptr
- fix concurrency regression with vidi_context
vkms:
- add configfs support for display configuration
* tag 'drm-next-2026-02-11' of https://gitlab.freedesktop.org/drm/kernel: (1610 commits)
drm/xe/pm: Disable D3Cold for BMG only on specific platforms
drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
drm/xe: Fix kerneldoc for xe_migrate_exec_queue
drm/xe/query: Fix topology query pointer advance
drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
drm/xe/guc: Fix CFI violation in debugfs access.
accel/amdxdna: Move RPM resume into job run function
accel/amdxdna: Fix incorrect DPM level after suspend/resume
nouveau/vmm: start tracking if the LPT PTE is valid. (v6)
nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)
nouveau/vmm: rewrite pte tracker using a struct and bitfields.
accel/amdxdna: Fix incorrect error code returned for failed chain command
accel/amdxdna: Remove hardware context status
drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()
drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc
drm/i915/display: fix the pixel normalization handling for xe3p_lpd
drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
drm/exynos: vidi: fix to avoid directly dereferencing user pointer
drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
...
|
|
Firmware and monitoring tools may not be ready to receive a CPER when we
read the bad pages, so send the CPERs at the end of RAS initialization
to ensure that the FW is ready to receive and process the CPER. This
removes the previous CPER submission that was added during bad page
load, and sends both in-band and out-of-band at the same time.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The commit 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise
disconnect") introduced early KFD cleanup when drm_dev_is_unplugged()
returns true. However, this causes hangs during normal module unload
(rmmod amdgpu).
The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove()
for all removal scenarios, not just surprise disconnects. This was done
intentionally in commit 39934d3ed572 ("Revert "drm/amdgpu: TA unload
messages are not actually sent to psp when amdgpu is uninstalled"") to
fix IGT PCI software unplug test failures. As a result,
drm_dev_is_unplugged() returns true even during normal module unload,
triggering the early KFD cleanup inappropriately.
The correct check should distinguish between:
- Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected()
returns true
- Normal module unload (rmmod): pci_dev_is_disconnected() returns false
Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure
the early cleanup only happens during true hardware disconnect events.
Cc: stable@vger.kernel.org
Reported-by: Cal Peake <cp@absolutedigital.net>
Closes: https://lore.kernel.org/all/b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net/
Fixes: 6a23e7b4332c ("drm/amd: Clean up kfd node on surprise disconnect")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Returning DRM_GPU_SCHED_STAT_NO_HANG causes the scheduler
to add the bad job back the pending list. We've already
set the errors on the fence and killed the bad job at this point
so it's the correct behavior.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sdma ring reset is not supported in SRIOV. kfd driver does not check
reset mask, and could queue sdma ring reset during unmap_queues_cpsch.
Avoid the ring reset for sriov.
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In low memory conditions, kmalloc can fail. In such conditions
unlock the mutex for a clean exit.
We do not need to amdgpu_bo_list_put as it's been handled in the
amdgpu_cs_parser_fini.
Fixes: 737da5363cc0 ("drm/amdgpu: update the functions to use amdgpu version of hmm")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202602030017.7E0xShmH-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Regardless if CPU enable 5-level paging, GPU vm use 5-level paging if
gmc init with 57-bit address space support, because
ARM64 4-level paging support 48-bit VA, x86 and GPU 4-level paging
support 47-bit VA, require 5-level paging on GPU to support ARM64.
NPA address space 52-bit mapping on NPA GPU VM require 5-level paging.
Debugger trap get device snapshot expect LDS and Scratch base, limit
above 57-bit, which is set only for 5-level paging.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.19.x
|
|
Since commit 39923050615cd ("drm/amd/display: Clear DPP 3DLUT Cap")
there is a flag in the mpc_color_caps that indicates the pre-blend usage
of MPC color caps. Do the same as commit 9e5d4a5e27c6 ("drm/amd/display:
Use mpc.preblend flag to indicate preblend") and use the mpc.preblend
flag to expose plane blend LUT/TF properties on AMD display driver.
CC: Matthew Schwartz <matthew.schwartz@linux.dev>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Tested-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Ungate GPU CG/PG in device_fini_hw and device_halt to protect GPU
register accesses, e.g. GC registers are accessed in amdgpu_irq_disable_all()
and amdgpu_fence_driver_hw_fini().
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The kfd_event_page_set() function writes KFD_SIGNAL_EVENT_LIMIT * 8
bytes via memset without checking the buffer size parameter. This allows
unprivileged userspace to trigger an out-of bounds kernel memory write
by passing a small buffer, leading to potential privilege
escalation.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
There is no firmware version dependency. This also
enables sdma queue resets on all SDMA 6.x based
chips.
Fixes: 59fd50b8663b ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is no firmware version dependency. This also
enables sdma queue resets on all SDMA 5.2.x based
chips.
Fixes: 59fd50b8663b ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is no firmware version dependency.
Fixes: 59fd50b8663b ("drm/amdgpu: Add sysfs interface for sdma reset mask")
Cc: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When amdgpu_nbio_ras_sw_init() fails in amdgpu_ras_init(), the function
returns directly without freeing the allocated con structure, leading
to a memory leak.
Fix this by jumping to the release_con label to properly clean up the
allocated memory before returning the error code.
Compile tested only. Issue found using a prototype static analysis tool
and code review.
Fixes: fdc94d3a8c88 ("drm/amdgpu: Rework pcie_bif ras sw_init")
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_discovery_get_nps_info() internally allocates memory for ranges
using kvcalloc(), which may use vmalloc() for large allocation. Using
kfree() to release vmalloc memory will lead to a memory corruption.
Use kvfree() to safely handle both kmalloc and vmalloc allocations.
Compile tested only. Issue found using a prototype static analysis tool
and code review.
Fixes: b194d21b9bcc ("drm/amdgpu: Use NPS ranges from discovery table")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In amdgpu_acpi_enumerate_xcc(), if amdgpu_acpi_dev_init() returns -ENOMEM,
the function returns directly without releasing the allocated xcc_info,
resulting in a memory leak.
Fix this by ensuring that xcc_info is properly freed in the error paths.
Compile tested only. Issue found using a prototype static analysis tool
and code review.
Fixes: 4d5275ab0b18 ("drm/amdgpu: Add parsing of acpi xcc objects")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Report xgmi training error uncorrectable error count.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is
removed from validate_list, the mem handle still lingers in the KFD idr.
This means when process is terminated,
kfd_process_free_outstanding_kfd_bos() will call
amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double
deletion.
To avoid this -
(a) Check if list is empty before deleting it
(b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can
be safely called again if it returns failure the first time.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6ba60345f45eaf7cb4f89105d26083a4b9fd1cba)
|
|
There is nothing wrong if in_shaper_func type is DISTRIBUTED POINTS.
Remove the assert placed for a TODO to avoid misinterpretations.
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1714dcc4c2c53e41190896eba263ed6328bcf415)
|
|
Commit 27fc10d1095f ("drm/amd/display: Fix the delta clamping for shaper
LUT") fixed banding when using plane shaper LUT in DCN10 CM helper. The
problem is also present in DCN30 CM helper, fix banding by extending the
same bug delta clamping fix to CM3.
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0274a54897f356f9c78767c4a2a5863f7dde90c6)
|
|
Some shimmer/colorful points appears when using the steamOS color
pipeline for HDR on gaming with DCN32. These points look like black
values being wrongly mapped to red/blue/green values. It was caused
because the number of hw points in regular LUTs and in a shaper LUT was
treated as the same.
DCN3+ regular LUTs have 257 bases and implicit deltas (i.e. HW
calculates them), but shaper LUT is a special case: it has 256 bases and
256 deltas, as in DCN1-2 regular LUTs, and outputs 14-bit values.
Fix that by setting by decreasing in 1 the number of HW points computed
in the LUT segmentation so that shaper LUT (i.e. fixpoint == true) keeps
the same DCN10 CM logic and regular LUTs go with `hw_points + 1`.
CC: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Fixes: 4d5fd3d08ea9 ("drm/amd/display: PQ tail accuracy")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5006505b19a2119e71c008044d59f6d753c858b9)
|
|
This reverts commit 7294863a6f01248d72b61d38478978d638641bee.
This commit was erroneously applied again after commit 0ab5d711ec74
("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
removed it, leading to very hard to debug crashes, when used with a system with two
AMD GPUs of which only one supports ASPM.
Link: https://lore.kernel.org/linux-acpi/20251006120944.7880-1-spasswolf@web.de/
Link: https://github.com/acpica/acpica/issues/1060
Fixes: 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 97a9689300eb2b393ba5efc17c8e5db835917080)
Cc: stable@vger.kernel.org
|
|
commit f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence") caused
a dependency on new enough MES firmware to use amdgpu. This was fixed
on most gfx11 and gfx12 hardware with commit 0180e0a5dd5c
("drm/amdgpu/mes: add compatibility checks for set_hw_resource_1"), but
this left out that GC 11.0.4 had breakage at MES 0x51.
Bump the requirement to 0x52 instead.
Reported-by: danijel@nausys.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4576
Fixes: f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c2d2ccc85faf8cc6934d50c18e43097eb453ade2)
Cc: stable@vger.kernel.org
|
|
No longer required to allocate temporary buffer while fetching metrcis,
instead, use metrics table cache data directly.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
VF doesn't enable VCN poison irq in VCNv2.5. Skip releasing it and avoid
call trace during deinitialization.
[ 71.913601] [drm] clean up the vf2pf work item
[ 71.915088] ------------[ cut here ]------------
[ 71.915092] WARNING: CPU: 3 PID: 1079 at /tmp/amd.aFkFvSQl/amd/amdgpu/amdgpu_irq.c:641 amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[ 71.915355] Modules linked in: amdgpu(OE-) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_display_helper cec rc_core i2c_algo_bit video wmi binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common input_leds joydev serio_raw mac_hid qemu_fw_cfg sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 hid_generic crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel usbhid 8139too sha256_ssse3 sha1_ssse3 hid psmouse bochs i2c_i801 ahci drm_vram_helper libahci i2c_smbus lpc_ich drm_ttm_helper 8139cp mii ttm aesni_intel crypto_simd cryptd
[ 71.915484] CPU: 3 PID: 1079 Comm: rmmod Tainted: G OE 6.8.0-87-generic #88~22.04.1-Ubuntu
[ 71.915489] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-2.el9_5.1 04/01/2014
[ 71.915492] RIP: 0010:amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[ 71.915768] Code: 75 84 b8 ea ff ff ff eb d4 44 89 ea 48 89 de 4c 89 e7 e8 fd fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 55 30 3b c7 <0f> 0b eb d4 b8 fe ff ff ff eb a8 e9 b7 3b 8a 00 66 2e 0f 1f 84 00
[ 71.915771] RSP: 0018:ffffcf0800eafa30 EFLAGS: 00010246
[ 71.915775] RAX: 0000000000000000 RBX: ffff891bda4b0668 RCX: 0000000000000000
[ 71.915777] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 71.915779] RBP: ffffcf0800eafa50 R08: 0000000000000000 R09: 0000000000000000
[ 71.915781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff891bda480000
[ 71.915782] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[ 71.915792] FS: 000070cff87c4c40(0000) GS:ffff893abfb80000(0000) knlGS:0000000000000000
[ 71.915795] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.915797] CR2: 00005fa13073e478 CR3: 000000010d634006 CR4: 0000000000770ef0
[ 71.915800] PKRU: 55555554
[ 71.915802] Call Trace:
[ 71.915805] <TASK>
[ 71.915809] vcn_v2_5_hw_fini+0x19e/0x1e0 [amdgpu]
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is nothing wrong if in_shaper_func type is DISTRIBUTED POINTS.
Remove the assert placed for a TODO to avoid misinterpretations.
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit 27fc10d1095f ("drm/amd/display: Fix the delta clamping for shaper
LUT") fixed banding when using plane shaper LUT in DCN10 CM helper. The
problem is also present in DCN30 CM helper, fix banding by extending the
same bug delta clamping fix to CM3.
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some shimmer/colorful points appears when using the steamOS color
pipeline for HDR on gaming with DCN32. These points look like black
values being wrongly mapped to red/blue/green values. It was caused
because the number of hw points in regular LUTs and in a shaper LUT was
treated as the same.
DCN3+ regular LUTs have 257 bases and implicit deltas (i.e. HW
calculates them), but shaper LUT is a special case: it has 256 bases and
256 deltas, as in DCN1-2 regular LUTs, and outputs 14-bit values.
Fix that by setting by decreasing in 1 the number of HW points computed
in the LUT segmentation so that shaper LUT (i.e. fixpoint == true) keeps
the same DCN10 CM logic and regular LUTs go with `hw_points + 1`.
CC: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Fixes: 4d5fd3d08ea9 ("drm/amd/display: PQ tail accuracy")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is
removed from validate_list, the mem handle still lingers in the KFD idr.
This means when process is terminated,
kfd_process_free_outstanding_kfd_bos() will call
amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double
deletion.
To avoid this -
(a) Check if list is empty before deleting it
(b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can
be safely called again if it returns failure the first time.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The return code of a non void function should not be ignored. In cases
where we do not care, the code needs to suppress it.
Signed-off-by: Andrew Martin <andrew.martin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add get ras capability for psp 15.0.8.
v2:Remove APU type check and IP version check.
Signed-off-by: Jinzhou Su <jinzhou.su@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The number of default features could be different from the actual width
of the bitmap. Use a different definition for it. Also increase the max
width of bitmap to 128.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use smu_feature_bits instead of uint64_t pointer and operate on
feature bits.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Feature bits are not necessarily restricted to 64-bits. Use
smu_feature_bits data structure to represent feature mask for checking
DPM status.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit 7294863a6f01248d72b61d38478978d638641bee.
This commit was erroneously applied again after commit 0ab5d711ec74
("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
removed it, leading to very hard to debug crashes, when used with a system with two
AMD GPUs of which only one supports ASPM.
Link: https://lore.kernel.org/linux-acpi/20251006120944.7880-1-spasswolf@web.de/
Link: https://github.com/acpica/acpica/issues/1060
Fixes: 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Report xgmi training error uncorrectable error count.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This version brings along following fixes:
- Migrate DCCG register access from hwseq to dccg component.
- Add lpddr5 handling to dml2.1
- Correct external pr fsm control
- Make DCN35 OTG disable w/a reusable
- Make DSC FGCG a DSC block level function
- Make some DCN35 DCCG symbols reusable
- Fix writeback on DCN 3.2+
- Fix IGT link training failure on Replay panel
- Fix system resume lag issue
- Add oem panel config for new features
- Fix IGT ILR link training failure on Replay panel
- Fix a NULL pointer dereference in dcn20_hwseq.c
- Add Gfx Base Case For Linear Tiling Handling
- Migrate DIO registers access from hwseq to dio component.
- Match expected data types
- Add CRC 32-bit mode support for DCN3.6+
- Init DMUB DPIA Only for APU
- DIO memory leak fix.
- Add Handling for gfxversion DcGfxBase
Acked-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add new enum definition for panel replay ml activity options.
Acked-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
There is no way to set tiling in dml in the case that gfxversion is
DcGfxBase.
[How]
Where tiling is updated based on Gfx, add case for DcGfxBase and set
tiling to dml2_sw_linear.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Carbones <ncarbone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Allocated memory for dcn10_dio but not freed in dcn401_resource.
[how]
Add kfree for it in dcn401_resource.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why & how]
1. There is no need to init dpia in dgpu
2. Add additional dpia flags
a. dpia hpd dynamic control
b. consolidated dpia link training to dp
c. dynamic bw allocation support
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
DCN 3.6+ hardware supports CRC-32 polynomial in addition to the
legacy CRC-16. Enable 32-bit CRC values per color component for
improvement of precision in display validation.
[How]
When userspace sets crc_poly_mode (0=CRC-16, 1=CRC-32) via the debugfs
interface, the value is stored in dm_irq_params.crc_poly_mode. When CRC
source configuration triggers amdgpu_dm_crtc_configure_crc_source(),
crc_poly_mode is retrieved from dm_irq_params and passed to
dc_stream_configure_crc().
In the DC layer, dc_stream_configure_crc() sets crc_poly_mode into the
crc_params structure and passes it to optc35_configure_crc(). If the
hardware supports the OTG_CRC_POLY_SEL register, the register is
programmed to select CRC-16 or CRC-32 polynomial.
When reading CRC values, optc35_get_crc() checks whether CRC32 register
masks are available. If present, it reads 32-bit CRC values from
OTG_CRC0/1_DATA_R32/G32/B32 registers; otherwise, it falls back
to reading 16-bit CRC values from legacy OTG_CRC0/1_DATA_RG/B
registers.
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why/How]
Data types should match what is expected.
Update/cast data accordingly.
Also change ASSERT to use DML_ASSERT instead
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Zheng Austin <Austin.Zheng@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Direct DIO registers access in hwseq layer was creating register conflicts.
[how]
Migrated DIO registers from hwseq to dio component.
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
commit f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence") caused
a dependency on new enough MES firmware to use amdgpu. This was fixed
on most gfx11 and gfx12 hardware with commit 0180e0a5dd5c
("drm/amdgpu/mes: add compatibility checks for set_hw_resource_1"), but
this left out that GC 11.0.4 had breakage at MES 0x51.
Bump the requirement to 0x52 instead.
Reported-by: danijel@nausys.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4576
Fixes: f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Post-driver cases always use linear tiling yet there is no dedicated
Gfx handling for this condition.
[How]
Add DcGfxBase/DalGfxBase to gfx version enums and set tiling to linear
when it is used. Also, enforce the use of proper tiling format as tiling
information is used.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Carbones <ncarbone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|