<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/gpu/drm/i915/i915_gem_execbuffer.c, branch linux-4.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2015-09-21T17:05:27Z</updated>
<entry>
<title>drm/i915: Always mark the object as dirty when used by the GPU</title>
<updated>2015-09-21T17:05:27Z</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2015-08-31T14:10:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5b2c006168304879990b33f696219eff47157d8d'/>
<id>urn:sha1:5b2c006168304879990b33f696219eff47157d8d</id>
<content type='text'>
commit 51bc140431e233284660b1d22c47dec9ecdb521e upstream.

There have been many hard to track down bugs whereby userspace forgot to
flag a write buffer and then cause graphics corruption or a hung GPU
when that buffer was later purged under memory pressure (as the buffer
appeared clean, its pages would have been evicted rather than preserved
and any changes more recent than in the backing storage would be lost).
In retrospect this is a rare optimisation against memory pressure,
already the slow path. If we always mark the buffer as dirty when
accessed by the GPU, anything not used can still be evicted cheaply
(ideal behaviour for mark-and-sweep eviction) but we do not run the risk
of corruption. For correct read serialisation, userspace still has to
notify when the GPU writes to an object. However, there are certain
situations under which userspace may wish to tell white lies to the
kernel...

Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Kristian Høgsberg &lt;krh@bitplanet.net&gt;
Cc: Jesse Barnes &lt;jbarnes@virtuousgeek.org&gt;
Cc: "Goel, Akash" &lt;akash.goel@intel.co&gt;
Cc: Michał Winiarski &lt;michal.winiarski@intel.com&gt;
Reviewed-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Signed-off-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>drm/i915: Skip allocating shadow batch for 0-length batches</title>
<updated>2015-03-27T14:28:41Z</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2015-03-27T11:02:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=743e78c1d726d875b98ff9689cc77c4d3d5d9ae2'/>
<id>urn:sha1:743e78c1d726d875b98ff9689cc77c4d3d5d9ae2</id>
<content type='text'>
Since

commit 17cabf571e50677d980e9ab2a43c5f11213003ae
Author: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Date:   Wed Jan 14 11:20:57 2015 +0000

    drm/i915: Trim the command parser allocations

we may then try to allocate a zero-sized object and attempt to extract
its pages. Understandably this fails.

Testcase: igt/gem_exec_nop #ivb,byt,hsw
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Track page table reload need</title>
<updated>2015-03-20T10:48:18Z</updated>
<author>
<name>Ben Widawsky</name>
<email>benjamin.widawsky@intel.com</email>
</author>
<published>2015-03-19T12:53:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=563222a745012cc2bb20c5d5cbff1c1ec4832c05'/>
<id>urn:sha1:563222a745012cc2bb20c5d5cbff1c1ec4832c05</id>
<content type='text'>
This patch was formerly known as, "Force pd restore when PDEs change,
gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped
into the current address space. The GPU does not snoop the new mapping
so we must do the gen specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS.
GEN8 does so with the context restore, while GEN7 requires the proper
load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7
The docs say you cannot change the PDEs of a currently running context.
We never map new PDEs of a running context, and expect them to be
present - so I think this is okay. (We can unmap, but this should also
be okay since we only unmap unreferenced objects that the GPU shouldn't
be tryingto va-&gt;pa xlate.) The MI_SET_CONTEXT command does have a flag
to signal that even if the context is the same, force a reload. It's
unclear exactly what this does, but I have a hunch it's the right thing
to do.

The logic assumes that we always emit a context switch after mapping new
PDEs, and before we submit a batch. This is the case today, and has been
the case since the inception of hardware contexts. A note in the comment
let's the user know.

It's not just for gen8. If the current context has mappings change, we
need a context reload to switch

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
and true ppgtt options. And do not break aliasing ppgtt, where to-&gt;ppgtt
is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
neither ctx-&gt;ppgtt and aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

v6: Updated needs_pd_load_pre/post.

v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala &lt;mika.kuoppala@linux.intel.com&gt;
Signed-off-by: Ben Widawsky &lt;ben@bwidawsk.net&gt;
Signed-off-by: Michel Thierry &lt;michel.thierry@intel.com&gt; (v2+)
Reviewed-by: Mika Kuoppala &lt;mika.kuoppala@intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Fallback to using CPU relocations for large batch buffers</title>
<updated>2015-03-20T10:48:13Z</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2015-01-14T11:20:56Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=edf4427b8055dc93eb5222d8174b07a75ba24fb5'/>
<id>urn:sha1:edf4427b8055dc93eb5222d8174b07a75ba24fb5</id>
<content type='text'>
If the batch buffer is too large to fit into the aperture and we need a
GTT mapping for relocations, we currently fail. This only applies to a
subset of machines for a subset of environments, quite undesirable. We
can simply check after failing to insert the batch into the GTT as to
whether we only need a mappable binding for relocation and, if so, we can
revert to using a non-mappable binding and an alternate relocation
method. However, using relocate_entry_cpu() is excruciatingly slow for
large buffers on non-LLC as the entire buffer requires clflushing before
and after the relocation handling. Alternatively, we can implement a
third relocation method that only clflushes around the relocation entry.
This is still slower than updating through the GTT, so we prefer using
the GTT where possible, but is orders of magnitude faster as we
typically do not have to then clflush the entire buffer.

An alternative idea of using a temporary WC mapping of the backing store
is promising (it should be faster than using the GTT itself), but
requires fairly extensive arch/x86 support - along the lines of
kmap_atomic_prof_pfn() (which is not universally implemented even for
x86).

Testcase: igt/gem_exec_big #pnv,byt
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
[danvet: Add a WARN_ONCE for the impossible reloc case and explain in
a short comment why we want to avoid ping-pong.]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: pass which operation triggered the frontbuffer tracking</title>
<updated>2015-03-17T21:29:51Z</updated>
<author>
<name>Paulo Zanoni</name>
<email>paulo.r.zanoni@intel.com</email>
</author>
<published>2015-02-13T19:23:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a4001f1b75cdb06db0febd9e72d3631f73ba8171'/>
<id>urn:sha1:a4001f1b75cdb06db0febd9e72d3631f73ba8171</id>
<content type='text'>
We want to port FBC to the frontbuffer tracking infrastructure, but
for that we need to know what caused the object invalidation so
we can react accordingly: CPU mmaps need manual, GTT mmaps and
flips don't need handling and ring rendering needs nukes.

v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
    - Fix copy/pasted wrong documentation
    - Rebase
v3: - Rebase
v4: - Don't pass the operation to flushes (Daniel).

Signed-off-by: Paulo Zanoni &lt;paulo.r.zanoni@intel.com&gt;
Reviewed-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Fix trivial typos in comments and warning message</title>
<updated>2015-03-17T21:29:48Z</updated>
<author>
<name>Yannick Guerrini</name>
<email>yguerrini@tomshardware.fr</email>
</author>
<published>2015-02-28T16:20:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fd0753cf8052bc6f3f56364177a6de7847135a9b'/>
<id>urn:sha1:fd0753cf8052bc6f3f56364177a6de7847135a9b</id>
<content type='text'>
Change 'mutliple' to 'multiple'
Change 'mutlipler' to 'multiplier'
Change 'Haswel' to 'Haswell'

Signed-off-by: Yannick Guerrini &lt;yguerrini@tomshardware.fr&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading</title>
<updated>2015-02-25T21:43:29Z</updated>
<author>
<name>John Harrison</name>
<email>John.C.Harrison@Intel.com</email>
</author>
<published>2015-02-13T11:48:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8e004efc16541e7f6e35673449195db5d1f92f40'/>
<id>urn:sha1:8e004efc16541e7f6e35673449195db5d1f92f40</id>
<content type='text'>
There is a flags word that is passed through the execbuffer code path all the
way from initial decoding of the user parameters down to the very final dispatch
buffer call. It is simply called 'flags'. Unfortuantely, there are many other
flags words floating around in the same blocks of code. Even more once the GPU
scheduler arrives.

This patch makes it more obvious exactly which flags word is which by renaming
'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
already have an 'I915_DISPATCH_' prefix on them and so are not quite so
ambiguous.

OTC-Jira: VIZ-1587
Signed-off-by: John Harrison &lt;John.C.Harrison@Intel.com&gt;
[danvet: Resolve conflict with Chris' rework of the bb parsing.]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Trim the command parser allocations</title>
<updated>2015-02-23T16:07:40Z</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2015-01-14T11:20:57Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=17cabf571e50677d980e9ab2a43c5f11213003ae'/>
<id>urn:sha1:17cabf571e50677d980e9ab2a43c5f11213003ae</id>
<content type='text'>
Currently, the command parser tries to create a secondary batch exactly
as large as the original, and vmap both. This is open to abuse by
userspace using extremely large batch objects, but only executing very
short batches. For example, this would be if userspace were to implement
a command submission ringbuffer. However, we only need to allocate pages
for just the contents of the command sequence in the batch - all
relocations copied to the secondary batch will reference the original
batch and so there can be no access to the secondary batch outside of
the explicit execution region.

Testcase: igt/gem_exec_big #ivb,byt,hsw
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: John Harrison &lt;John.C.Harrison@Intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>drm/i915: Specify bsd rings through exec flag</title>
<updated>2015-01-27T08:51:05Z</updated>
<author>
<name>Zhipeng Gong</name>
<email>zhipeng.gong@intel.com</email>
</author>
<published>2015-01-13T00:48:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8d360dffd6d8634868e433128d5178bea14cc42c'/>
<id>urn:sha1:8d360dffd6d8634868e433128d5178bea14cc42c</id>
<content type='text'>
On Skylake GT3 we have 2 Video Command Streamers (VCS), which is asymmetrical.
For example, HEVC GPU commands can be only dispatched to VCS1 ring.
But userspace has no control when using VCS1 or VCS2. This patch introduces
a mechanism to avoid the default ping-pong mode and use one specific ring
through execution flag. This mechanism is usable for all the platforms
with 2 VCS rings.

The open source usage is from these two commits in vaapi/intel:
	commit 702050f04131a44ef8ac16651708ce8a8d98e4b8
	Author: Zhao, Yakui &lt;yakui.zhao@intel.com&gt;
	Date:   Mon Nov 17 12:44:19 2014 +0800

	    Allow the batchbuffer to be submitted with override flag

	commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8
	Author: Zhao Yakui &lt;yakui.zhao@intel.com&gt;
	Date:   Mon Nov 17 12:44:22 2014 +0800

	    Add the override flag to assure that HEVC video command
		always uses BSD ring0 for SKL GT3 machine

v2: fix whitespace (Rodrigo)
v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
v4: change the comment (Zhipeng)
v5: address Daniel's comment (Zhipeng)

Signed-off-by: Zhipeng Gong &lt;zhipeng.gong@intel.com&gt;
Reviewed-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
</content>
</entry>
<entry>
<title>Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queued</title>
<updated>2015-01-12T22:07:46Z</updated>
<author>
<name>Daniel Vetter</name>
<email>daniel.vetter@ffwll.ch</email>
</author>
<published>2015-01-12T22:07:46Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0a87a2db485a1456b7427914969c0e8195a1bbda'/>
<id>urn:sha1:0a87a2db485a1456b7427914969c0e8195a1bbda</id>
<content type='text'>
Conflicts:
	drivers/gpu/drm/i915/intel_runtime_pm.c

Separate branch so that Takashi can also pull just this refactoring
into sound-next.

Signed-off-by: Daniel Vetter &lt;daniel.vetter@intel.com&gt;
</content>
</entry>
</feed>
