<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c, branch linux-4.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-04-03T17:52:56Z</updated>
<entry>
<title>drm/amdgpu: drop compute ring timeout setting for non-sriov only (v2)</title>
<updated>2018-04-03T17:52:56Z</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2018-03-27T01:53:15Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=687c1c2eed0c0463b731ed462dd87de6ba7d4ac8'/>
<id>urn:sha1:687c1c2eed0c0463b731ed462dd87de6ba7d4ac8</id>
<content type='text'>
Sriov still wants these error messags on timeout. So, for sriov
use case, the timeout setting on compute rings is kept.

-v2: clean the code

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: no job timeout setting on compute queues</title>
<updated>2018-03-21T19:36:57Z</updated>
<author>
<name>Evan Quan</name>
<email>evan.quan@amd.com</email>
</author>
<published>2018-03-15T01:49:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f0c2b16ba84a0b8b960a6d442496ce2d2e6bfa99'/>
<id>urn:sha1:f0c2b16ba84a0b8b960a6d442496ce2d2e6bfa99</id>
<content type='text'>
Under some heavy computing environment(e.g. dgemm test), it
takes the asic over 10+ seconds to finish the dispatched job
which will trigger the timeout.

It's quite confusing although it does not seem to bring any
real problems. As a quick workround, we choose to not enfoce
the timeout setting on compute queues.

Signed-off-by: Evan Quan &lt;evan.quan@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: rename amdgpu_gpu_recover</title>
<updated>2017-12-18T15:59:58Z</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2017-12-15T21:40:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5f152b5e69a5392181b0a84bd55fe17a417364ac'/>
<id>urn:sha1:5f152b5e69a5392181b0a84bd55fe17a417364ac</id>
<content type='text'>
add device to the name for consistency.

Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Simplify amdgpu_lockup_timeout usage.</title>
<updated>2017-12-15T22:15:00Z</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2017-12-13T19:36:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8854695add1eaaeafae728850c905c4727e56f35'/>
<id>urn:sha1:8854695add1eaaeafae728850c905c4727e56f35</id>
<content type='text'>
With introduction of amdgpu_gpu_recovery we don't need any more
to rely on amdgpu_lockup_timeout == 0 for disabling GPU reset.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add gpu_recovery parameter</title>
<updated>2017-12-15T22:14:50Z</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2017-12-12T19:09:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=dcebf026e6f69fb79e7f88d10681faf4f8a985ba'/>
<id>urn:sha1:dcebf026e6f69fb79e7f88d10681faf4f8a985ba</id>
<content type='text'>
Add new parameter to control GPU recovery procedure.

v2:
Add auto logic where reset is disabled for bare metal and enabled
for SR-IOV.
Allow forced reset from debugfs.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: no need with INT for fence polling</title>
<updated>2017-12-12T19:50:00Z</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2017-12-04T12:46:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d118a62153000e6141872b99aea8498568b0d361'/>
<id>urn:sha1:d118a62153000e6141872b99aea8498568b0d361</id>
<content type='text'>
We are polling so no need for INT.

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm: move amd_gpu_scheduler into common location</title>
<updated>2017-12-07T16:51:56Z</updated>
<author>
<name>Lucas Stach</name>
<email>l.stach@pengutronix.de</email>
</author>
<published>2017-12-06T16:49:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1b1f42d8fde4fef1ed7873bf5aa91755f8c3de35'/>
<id>urn:sha1:1b1f42d8fde4fef1ed7873bf5aa91755f8c3de35</id>
<content type='text'>
This moves and renames the AMDGPU scheduler to a common location in DRM
in order to facilitate re-use by other drivers. This is mostly a straight
forward rename with no code changes.

One notable exception is the function to_drm_sched_fence(), which is no
longer a inline header function to avoid the need to export the
drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.

Reviewed-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Tested-by: Dieter Nützel &lt;Dieter@nuetzel-hh.de&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu:implement new GPU recover(v3)</title>
<updated>2017-12-04T21:41:30Z</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2017-10-25T08:37:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5740682e66cef57626a328d237698cad329c0449'/>
<id>urn:sha1:5740682e66cef57626a328d237698cad329c0449</id>
<content type='text'>
1,new imple names amdgpu_gpu_recover which gives more hint
on what it does compared with gpu_reset

2,gpu_recover unify bare-metal and SR-IOV, only the asic reset
part is implemented differently

3,gpu_recover will increase hang job karma and mark its entity/context
as guilty if exceeds limit

V2:

4,in scheduler main routine the job from guilty context  will be immedialy
fake signaled after it poped from queue and its fence be set with
"-ECANCELED" error

5,in scheduler recovery routine all jobs from the guilty entity would be
dropped

6,in run_job() routine the real IB submission would be skipped if @skip parameter
equales true or there was VRAM lost occured.

V3:

7,replace deprecated gpu reset, use new gpu recover

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: change redundant init logs to debug level</title>
<updated>2017-12-04T21:33:12Z</updated>
<author>
<name>pding</name>
<email>Pixel.Ding@amd.com</email>
</author>
<published>2017-10-26T01:30:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9953b72f9c9cb7733334753788faab33ccc4dc0a'/>
<id>urn:sha1:9953b72f9c9cb7733334753788faab33ccc4dc0a</id>
<content type='text'>
When this VF stays in exclusive mode for long, other VFs will be
impacted.

The redundant messages causes exclusive mode timeout when they're
redirected. That is a normal use case for cloud service to redirect
guest log to virtual serial port.

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: pding &lt;Pixel.Ding@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu:add hang_limit for sched(v2)</title>
<updated>2017-12-04T21:33:08Z</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2017-10-17T05:40:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=95aa9b1d9738faa80c66df41d59358d5ff4c288a'/>
<id>urn:sha1:95aa9b1d9738faa80c66df41d59358d5ff4c288a</id>
<content type='text'>
since gpu_scheduler source domain cannot access amdgpu variable
so need create the hang_limit membewr for sched, and it can
refer it for the upcoming GPU RESET patches

v2:
make hang_limit a parameter of sched_init()

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Chunming Zhou &lt;David1.Zhou@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
