| Age | Commit message (Collapse) | Author |
|
Normalize registers address to local xcc address for sdma v7_1.
Merge normalize register address function to an common function
for soc v1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix it to apply for all models.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
PSP programs the NBIO partition status register. In the absence of PSP,
read the current compute partition from the GFX IMU register instead of
NBIO.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Send the Set_Shader_Debugger packet on the correct MES pipe when
partition mode is set to non-SPX mode.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On GFX 12.1, pass the xcc id of the master XCC to choose the correct
MES Pipe to send the add_queue/remove_queue requests to MES.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is no need to partition VMID space on GFX 12.1 when
operating in CPX mode as SDMA is not sharing MMHUB on GFX 12.1.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, only SPX mode works on GFX 12.1. This patch reworks
the MES initialization to get other non-SPX modes working. For example,
for CPX mode, coop_enable bit needs to be set to 0. The shared command
buffer initialization is also not needed in CPX mode.
The shared command buffer initialization needs further improvements which
will be handled in later patches.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On GFX 12.1, use the correct MES pipe instance for readiness before
sending MES commands on that pipe. Additionally, send the TLB requests
on the correct MES pipe in non-SPX modes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Adjust program logic for sdam v7_1, only use physical xcc_id
when program register to support compute partition.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Adjust xcc_id logic to only use physical xcc_id when program
register, (use logic xcc_id by default), to fit for compute
partition.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Adjust gfx_v12_1_xcc_cp_resume function to program
cp resume per xcc_id (logic xcc number) to fix for
xcp_resume.
V2: Allocate compute microcode bo when sw init
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
program SDMAx_QUEUEx_SCHEDULE_CNTL for context switch due to
quantum in KFD for GFX12.1
Signed-off-by: Gang Ba <Gang.Ba@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When submitting MQD to CP, set SDMA_QUEUEx_IB_CNTL/SWITCH_INSIDE_IB bit
so it'll allow SDMA preemption if there is a massive command buffer of
long-running SDMA commands.
Signed-off-by: Gang Ba <Gang.Ba@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable burst in GL1A and GLARBA for gfx v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable the new UTCL0 retry-based thrashing prevention on GFX 12.1.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For querying VMID <-> PASID mapping on GFX 12.1, we need to first
program the IH_VMID_LUT_INDEX before fetching the LUT mapping. Without
this TLB flush may not work.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Support physical address convert to current NPS
pages in uniras.
Signed-off-by: Jinzhou Su <jinzhou.su@amd.com>
Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
mqd_stride_size is used to calculate the next mqd offset
for cooperative dispatch.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There are a couple of spelling mistakes, one in a pr_warn message
and one in a seq_printf message. Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Sphinx reports kernel-doc warning:
WARNING: ./drivers/gpu/drm/amd/include/amd_shared.h:113 Enum value 'AMD_IP_BLOCK_TYPE_RAS' not described in enum 'amd_ip_block_type'
Describe the value to fix it.
Fixes: 7169e706c82d ("drm/amdgpu: Add ras module ip block to amdgpu discovery")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
struct
Sphinx reports kernel-doc warning:
WARNING: ./drivers/gpu/drm/amd/display/dc/dc.h:2796 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst
* Software state variables used to program register fields across the display pipeline
Don't use kernel-doc comment syntax to fix it.
Fixes: b0ff344fe70c ("drm/amd/display: Add interface to capture expected HW state from SW state")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
CalculateWatermarksAndDRAMSpeedChangeSupport()
CalculateWatermarksAndDRAMSpeedChangeSupport() has a large number of
parameters, which must be passed on the stack. Most of the parameters
between the two callsites are the same, so they can be accessed through
the existing mode_lib pointer, instead of being passed as explicit
arguments. Doing this reduces the stack size of
dml30_ModeSupportAndSystemConfigurationFull() from 1912 bytes to 1840
bytes building for x86_64 with clang-22, helping stay under the 2048
byte limit for display_mode_vba_30.c.
Additionally, now that there is a pointer to mode_lib->vba available,
use 'v' consistently throughout the entire function.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
CalculatePrefetchSchedule()
After an innocuous optimization change in clang-22,
dml30_ModeSupportAndSystemConfigurationFull() is over the 2048 byte
stack limit for display_mode_vba_30.c.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (2096) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
With clang-21, this function was already close to the limit:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (1912) exceeds limit (1586) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
CalculatePrefetchSchedule() has a large number of parameters, which must
be passed on the stack. Most of the parameters between the two callsites
are the same, so they can be accessed through the existing mode_lib
pointer, instead of being passed as explicit arguments. Doing this
reduces the stack size of dml30_ModeSupportAndSystemConfigurationFull()
from 2096 bytes to 1912 bytes with clang-22.
Closes: https://github.com/ClangBuiltLinux/linux/issues/2117
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
After an innocuous optimization change in clang-22, allmodconfig (which
enables CONFIG_KASAN and CONFIG_WERROR) breaks with:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (3144) exceeds limit (3072) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
With clang-21, this function was already pretty close to the existing
limit of 3072 bytes.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (2904) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
A similar situation occurred in dml2, which was resolved by
commit e4479aecf658 ("drm/amd/display: Increase sanitizer frame larger
than limit when compile testing with clang") by increasing the limit for
clang when compile testing with certain sanitizer enabled, so that
allmodconfig (an easy testing target) continues to work.
Apply that same change to the dml folder to clear up the warning for
allmodconfig, unbreaking the build.
Closes: https://github.com/ClangBuiltLinux/linux/issues/2135
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a query for sdma queues. Userspace can use this to
query the size of the CSA buffers for sdma user queues.
Proposed userspace:
https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a query for compute queues. Userspace can use this to
query the size of the EOP buffers for compute user queues.
Proposed userspace:
https://gitlab.freedesktop.org/yogeshmohan/mesa/-/commits/userq_query
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for 6.19:
UAPI Changes:
- panfrost: Add PANFROST_BO_SYNC ioctl
- panthor: Add PANTHOR_BO_SYNC ioctl
Core Changes:
- atomic: Add drm_device pointer to drm_private_obj
- bridge: Introduce drm_bridge_unplug, drm_bridge_enter, and
drm_bridge_exit
- dma-buf: Improve sg_table debugging
- dma-fence: Add new helpers, and use them when needed
- dp_mst: Avoid out-of-bounds access with VCPI==0
- gem: Reduce page table overhead with transparent huge pages
- panic: Report invalid panic modes
- sched: Add TODO entries
- ttm: Various cleanups
- vblank: Various refactoring and cleanups
- Kconfig cleanups
- Removed support for kdb
Driver Changes:
- amdxdna: Fix race conditions at suspend, Improve handling of zero
tail pointers, Fix cu_idx being overwritten during command setup
- ast: Support imported cursor buffers
-
- panthor: Enable timestamp propagation, Multiple improvements and
fixes to improve the overall robustness, notably of the scheduler.
- panels:
- panel-edp: Support for CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: fix mm conflict]
From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20251212-spectacular-agama-of-abracadabra-aaef32@penduick
|
|
functions
There is no real reason to include drm_colorop.h from drm_atomic.h, as
drm_atomic_get_{old,new}_colorop_state() have no real reason to be
static inline.
Convert the static inlines to proper functions, and drop the include to
reduce the include dependencies and improve data hiding.
v2: Fix vkms build failures (Alex)
Fixes: cfc27680ee20 ("drm/colorop: Introduce new drm_colorop mode object")
Cc: Simon Ser <contact@emersion.fr>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Melissa Wen <mwen@igalia.com>
Cc: Sebastian Wick <sebastian.wick@redhat.com>
Cc: Alex Hung <alex.hung@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://patch.msgid.link/20251219114939.1069851-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Pass character "0" rather than NULL terminator to properly format
queue restoration SMI events. Currently, the NULL terminator precedes
the newline character that is intended to delineate separate events
in the SMI event buffer, which can break userspace parsers.
Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6e7143e5e6e21f9d5572e0390f7089e6d53edf3c)
|
|
User-configured SCLK(GPU core clock)frequencies were not persisting
across S0ix suspend/resume cycles on smu v14 hardware.
The issue occurred because of the code resetting clock frequency
to zero during resume.
This patch addresses the problem by:
- Preserving user-configured values in driver and sets the
clock frequency across resume
- Preserved settings are sent to the hardware during resume
Signed-off-by: mythilam <mythilam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20ba98326f4c69e6bf8d1f42942ece485a675b27)
|
|
Avoid a possible UAF in GPU recovery due to a race between
the sched timeout callback and the tdr work queue.
The gpu recovery function calls drm_sched_stop() and
later drm_sched_start(). drm_sched_start() restarts
the tdr queue which will eventually free the job. If
the tdr queue frees the job before time out callback
completes, the job will be freed and we'll get a UAF
when accessing the pasid. Cache it early to avoid the
UAF.
Example KASAN trace:
[ 493.058141] BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[ 493.067530] Read of size 4 at addr ffff88b0ce3f794c by task kworker/u128:1/323
[ 493.074892]
[ 493.076485] CPU: 9 UID: 0 PID: 323 Comm: kworker/u128:1 Tainted: G E 6.16.0-1289896.2.zuul.bf4f11df81c1410bbe901c4373305a31 #1 PREEMPT(voluntary)
[ 493.076493] Tainted: [E]=UNSIGNED_MODULE
[ 493.076495] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019
[ 493.076500] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 493.076512] Call Trace:
[ 493.076515] <TASK>
[ 493.076518] dump_stack_lvl+0x64/0x80
[ 493.076529] print_report+0xce/0x630
[ 493.076536] ? _raw_spin_lock_irqsave+0x86/0xd0
[ 493.076541] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 493.076545] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[ 493.077253] kasan_report+0xb8/0xf0
[ 493.077258] ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[ 493.077965] amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[ 493.078672] ? __pfx_amdgpu_device_gpu_recover+0x10/0x10 [amdgpu]
[ 493.079378] ? amdgpu_coredump+0x1fd/0x4c0 [amdgpu]
[ 493.080111] amdgpu_job_timedout+0x642/0x1400 [amdgpu]
[ 493.080903] ? pick_task_fair+0x24e/0x330
[ 493.080910] ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu]
[ 493.081702] ? _raw_spin_lock+0x75/0xc0
[ 493.081708] ? __pfx__raw_spin_lock+0x10/0x10
[ 493.081712] drm_sched_job_timedout+0x1b0/0x4b0 [gpu_sched]
[ 493.081721] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 493.081725] process_one_work+0x679/0xff0
[ 493.081732] worker_thread+0x6ce/0xfd0
[ 493.081736] ? __pfx_worker_thread+0x10/0x10
[ 493.081739] kthread+0x376/0x730
[ 493.081744] ? __pfx_kthread+0x10/0x10
[ 493.081748] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 493.081751] ? __pfx_kthread+0x10/0x10
[ 493.081755] ret_from_fork+0x247/0x330
[ 493.081761] ? __pfx_kthread+0x10/0x10
[ 493.081764] ret_from_fork_asm+0x1a/0x30
[ 493.081771] </TASK>
Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info")
Link: https://github.com/HansKristian-Work/vkd3d-proton/pull/2670
Cc: SRINIVASAN.SHANMUGAM@amd.com
Cc: vitaly.prosyak@amd.com
Cc: christian.koenig@amd.com
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20880a3fd5dd7bca1a079534cf6596bda92e107d)
|
|
[why]
need to enable APG_CLOCK_ENABLE enable first
also need to wake up az from D3 before access az block
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bf5e396957acafd46003318965500914d5f4edfa)
|
|
[Why]
Different platforms use different NBIO header files,
causing display code to use differnt offset and read
wrong accelerated status.
[How]
- Unified NBIO offset header file across platform.
- Correct scratch registers offsets to proper locations.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 576e032e909c8a6bb3d907b4ef5f6abe0f644199)
Cc: stable@vger.kernel.org
|
|
[Why]
Different platforms use differnet NBIO header files,
causing display code to use differnt offset and read
wrong accelerated status.
[How]
- Unified NBIO offset header file across platform.
- Correct scratch registers offsets to proper locations.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49a63bc8eda0304ba307f5ba68305f936174f72d)
Cc: stable@vger.kernel.org
|
|
If console suspend has been disabled using `no_console_suspend` also
wake up during thaw() so that some messages can be seen for debugging.
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4191
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 63387cbbb714d9f0d179d9d4560de1408d0906de)
|
|
To acommandate specific interrupt source for gfx v12_1
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Override the local MTYPE mappings in KFD SVM code with mtype_local
modprobe param for GFX 12.1.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the number instances of firmware is RLC_NUM_INS_CODE0(Only 1 inst),
need to copy it directly for rlcautolad.
For the firmware which instances number bigger than 1, only copy for
enabled XCC to save copy time.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add gfx sysfs support for gfx_v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix to use local register offset inside die for mes fw accessing
local/remote xcd register.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Normalize registers address to local xcc address for gfx v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Support xcc harvest for ih translate to logic xcc.
V2: Only check available instances
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correct inst_id input from physical to logic for sdma v7_1.
V2: Show real instance number on logic xcc.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use physical xcc_id to get rrmt on misc_op for mes v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pass character "0" rather than NULL terminator to properly format
queue restoration SMI events. Currently, the NULL terminator precedes
the newline character that is intended to delineate separate events
in the SMI event buffer, which can break userspace parsers.
Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correct xcc_id input to GET_INST from physical to logic for
gfx_v12_1.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Need to allocate memory for MEC FW data and program
registers CP_MEC_MDBASE for each XCC respectively.
Signed-off-by: Michael Chen <michael.chen@amd.com>
Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The gmc fault virtual address is up to 57bit for 5 level page table,
this also works with 48bit virtual address for 4 level page table.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set pde3 invalidation request bit during tlb flush for up to 5 level
page table.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For 5-level page tables, update compute vmid sh_mem_base LDS aperture
and Scratch aperture base address to above 57-bit, use the same setting
from gfx vmid, we can remove the duplicate macro.
Update queue pdd lds_base and scratch_base to the same value as
sh_mem_base setting. Then application get process apertures return the
correct value to access LDS and Scratch memory for 57bit address 5-level
page tables. This may pass to MES in future when mapping queue.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|