<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/drivers/vfio, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-06-15T09:53:00Z</updated>
<entry>
<title>vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING"</title>
<updated>2019-06-15T09:53:00Z</updated>
<author>
<name>Farhan Ali</name>
<email>alifm@linux.ibm.com</email>
</author>
<published>2019-04-03T18:22:27Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6a3bde70488edd34a985e89043790bf20df96212'/>
<id>urn:sha1:6a3bde70488edd34a985e89043790bf20df96212</id>
<content type='text'>
[ Upstream commit 41be3e2618174fdf3361e49e64f2bf530f40c6b0 ]

vfio_dev_present(), which is the condition passed to
wait_event_interruptible_timeout(), calls vfio_group_get_device()
and tries to acquire the mutex group-&gt;device_lock.

wait_event_interruptible_timeout() sets the state of the current
task to TASK_INTERRUPTIBLE before checking the condition. This
means that we try to acquire the mutex while the task is already
marked as sleeping. The scheduler reports this with the following
warning:

[ 4050.264464] ------------[ cut here ]------------
[ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [&lt;00000000b33c00e2&gt;] prepare_to_wait_event+0x14a/0x188
[ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90
....

[ 4050.264756] Call Trace:
[ 4050.264765] ([&lt;000000000017bbaa&gt;] __might_sleep+0x72/0x90)
[ 4050.264774]  [&lt;0000000000b97edc&gt;] __mutex_lock+0x44/0x8c0
[ 4050.264782]  [&lt;0000000000b9878a&gt;] mutex_lock_nested+0x32/0x40
[ 4050.264793]  [&lt;000003ff800d7abe&gt;] vfio_group_get_device+0x36/0xa8 [vfio]
[ 4050.264803]  [&lt;000003ff800d87c0&gt;] vfio_del_group_dev+0x238/0x378 [vfio]
[ 4050.264813]  [&lt;000003ff8015f67c&gt;] mdev_remove+0x3c/0x68 [mdev]
[ 4050.264825]  [&lt;00000000008e01b0&gt;] device_release_driver_internal+0x168/0x268
[ 4050.264834]  [&lt;00000000008de692&gt;] bus_remove_device+0x162/0x190
[ 4050.264843]  [&lt;00000000008daf42&gt;] device_del+0x1e2/0x368
[ 4050.264851]  [&lt;00000000008db12c&gt;] device_unregister+0x64/0x88
[ 4050.264862]  [&lt;000003ff8015ed84&gt;] mdev_device_remove+0xec/0x130 [mdev]
[ 4050.264872]  [&lt;000003ff8015f074&gt;] remove_store+0x6c/0xa8 [mdev]
[ 4050.264881]  [&lt;000000000046f494&gt;] kernfs_fop_write+0x14c/0x1f8
[ 4050.264890]  [&lt;00000000003c1530&gt;] __vfs_write+0x38/0x1a8
[ 4050.264899]  [&lt;00000000003c187c&gt;] vfs_write+0xb4/0x198
[ 4050.264908]  [&lt;00000000003c1af2&gt;] ksys_write+0x5a/0xb0
[ 4050.264916]  [&lt;0000000000b9e270&gt;] system_call+0xdc/0x2d8
[ 4050.264925] 4 locks held by sh/35924:
[ 4050.264933]  #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198
[ 4050.264948]  #1: 000000005c1ab0b3 (&amp;of-&gt;mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8
[ 4050.264963]  #2: 0000000034831ab8 (kn-&gt;count#297){++++}, at: kernfs_remove_self+0x12e/0x150
[ 4050.264979]  #3: 00000000e152484f (&amp;dev-&gt;mutex){....}, at: device_release_driver_internal+0x5c/0x268
[ 4050.264993] Last Breaking-Event-Address:
[ 4050.265002]  [&lt;000000000017bbaa&gt;] __might_sleep+0x72/0x90
[ 4050.265010] irq event stamp: 7039
[ 4050.265020] hardirqs last  enabled at (7047): [&lt;00000000001cee7a&gt;] console_unlock+0x6d2/0x740
[ 4050.265029] hardirqs last disabled at (7054): [&lt;00000000001ce87e&gt;] console_unlock+0xd6/0x740
[ 4050.265040] softirqs last  enabled at (6416): [&lt;0000000000b8fe26&gt;] __udelay+0xb6/0x100
[ 4050.265049] softirqs last disabled at (6415): [&lt;0000000000b8fe06&gt;] __udelay+0x96/0x100
[ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]---

Let's fix this as described in the article
https://lwn.net/Articles/628628/.

Signed-off-by: Farhan Ali &lt;alifm@linux.ibm.com&gt;
[remove now redundant vfio_dev_present()]
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vfio-pci/nvlink2: Fix potential VMA leak</title>
<updated>2019-06-15T09:52:58Z</updated>
<author>
<name>Greg Kurz</name>
<email>groug@kaod.org</email>
</author>
<published>2019-04-19T15:37:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=dcaa5e20015d63a1dd3bf1c2dd1c56d83010cd10'/>
<id>urn:sha1:dcaa5e20015d63a1dd3bf1c2dd1c56d83010cd10</id>
<content type='text'>
[ Upstream commit 2c85f2bd519457073444ec28bbb4743a4e4237a7 ]

If vfio_pci_register_dev_region() fails then we should roll back
previous changes, i.e. unmap the ATSD registers.

Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
Signed-off-by: Greg Kurz &lt;groug@kaod.org&gt;
Reviewed-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vfio/type1: Limit DMA mappings per container</title>
<updated>2019-04-03T18:43:05Z</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2019-04-03T18:36:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=492855939bdb59c6f947b0b5b44af9ad82b7e38c'/>
<id>urn:sha1:492855939bdb59c6f947b0b5b44af9ad82b7e38c</id>
<content type='text'>
Memory backed DMA mappings are accounted against a user's locked
memory limit, including multiple mappings of the same memory.  This
accounting bounds the number of such mappings that a user can create.
However, DMA mappings that are not backed by memory, such as DMA
mappings of device MMIO via mmaps, do not make use of page pinning
and therefore do not count against the user's locked memory limit.
These mappings still consume memory, but the memory is not well
associated to the process for the purpose of oom killing a task.

To add bounding on this use case, we introduce a limit to the total
number of concurrent DMA mappings that a user is allowed to create.
This limit is exposed as a tunable module option where the default
value of 64K is expected to be well in excess of any reasonable use
case (a large virtual machine configuration would typically only make
use of tens of concurrent mappings).
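
The bounding described above can be modelled in userspace as a simple
per-container budget. This is only a sketch under stated assumptions:
the struct, the function names and the error value are hypothetical
and are not taken from the vfio type1 code; the limit of 65535 stands
in for the 64K default mentioned above.

```c
/* Userspace model only; all names here are hypothetical. */
#define DMA_ENTRY_LIMIT 65535u  /* assumed default, the "64K" in the text */
#define ERR_NO_SPACE (-28)      /* stand-in for an ENOSPC-style error code */

struct container {
    unsigned int dma_avail;     /* mappings the user may still create */
};

/* A new container starts with the full budget. */
void container_init(struct container *c)
{
    c->dma_avail = DMA_ENTRY_LIMIT;
}

/* Creating a mapping consumes one slot; fail once the budget is spent. */
int container_map(struct container *c)
{
    if (c->dma_avail == 0)
        return ERR_NO_SPACE;
    c->dma_avail--;
    return 0;
}

/* Removing a mapping returns its slot to the budget. */
void container_unmap(struct container *c)
{
    c->dma_avail++;
}
```

Counting mappings rather than bytes bounds the bookkeeping memory even
for mappings that never pin pages, which is the case the locked memory
limit does not cover.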

This fixes CVE-2019-3882.

Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Tested-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Reviewed-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: Cornelia Huck &lt;cohuck@redhat.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>vfio/spapr_tce: Make symbol 'tce_iommu_driver_ops' static</title>
<updated>2019-04-03T18:42:02Z</updated>
<author>
<name>Wang Hai</name>
<email>wanghai26@huawei.com</email>
</author>
<published>2019-04-03T18:36:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e39dd513d5f2ae2041c593d42fd0d8b24e7e950b'/>
<id>urn:sha1:e39dd513d5f2ae2041c593d42fd0d8b24e7e950b</id>
<content type='text'>
Fixes the following sparse warning:

drivers/vfio/vfio_iommu_spapr_tce.c:1401:36: warning:
 symbol 'tce_iommu_driver_ops' was not declared. Should it be static?

Fixes: 5ffd229c0273 ("powerpc/vfio: Implement IOMMU driver for VFIO")
Signed-off-by: Wang Hai &lt;wanghai26@huawei.com&gt;
Reviewed-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>vfio/pci: use correct format characters</title>
<updated>2019-04-03T18:36:20Z</updated>
<author>
<name>Louis Taylor</name>
<email>louis@kragniz.eu</email>
</author>
<published>2019-04-03T18:36:20Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=426b046b748d1f47e096e05bdcc6fb4172791307'/>
<id>urn:sha1:426b046b748d1f47e096e05bdcc6fb4172791307</id>
<content type='text'>
When compiling with -Wformat, clang emits the following warnings:

drivers/vfio/pci/vfio_pci.c:1601:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1601:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1601:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:5: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:13: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                        ^~~~~~

drivers/vfio/pci/vfio_pci.c:1605:21: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                ^~~~~~~~~

drivers/vfio/pci/vfio_pci.c:1605:32: warning: format specifies type
      'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                vendor, device, subvendor, subdevice,
                                                           ^~~~~~~~~

The types of these arguments are unconditionally defined, so this patch
updates the format characters to the correct ones for unsigned ints.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Louis Taylor &lt;louis@kragniz.eu&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux</title>
<updated>2019-03-07T20:56:26Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-07T20:56:26Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6c3ac1134371b51c9601171af2c32153ccb11100'/>
<id>urn:sha1:6c3ac1134371b51c9601171af2c32153ccb11100</id>
<content type='text'>
Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Enable THREAD_INFO_IN_TASK to move thread_info off the stack.

   - A big series from Christoph reworking our DMA code to use more of
     the generic infrastructure, as he said:
       "This series switches the powerpc port to use the generic swiotlb
        and noncoherent dma ops, and to use more generic code for the
        coherent direct mapping, as well as removing a lot of dead
        code."

   - Increase our vmalloc space to 512T with the Hash MMU on modern
     CPUs, allowing us to support machines with larger amounts of total
     RAM or distance between nodes.

   - Two series from Christophe, one to optimise TLB miss handlers on
     6xx, and another to optimise the way STRICT_KERNEL_RWX is
     implemented on some 32-bit CPUs.

   - Support for KCOV coverage instrumentation which means we can run
     syzkaller and discover even more bugs in our code.

  And as always many clean-ups, reworks and minor fixes etc.

  Thanks to: Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea
  Arcangeli, Andrew Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir
  Singh, Brajeswar Ghosh, Breno Leitao, Christian Lamparter, Christian
  Zigotzky, Christophe Leroy, Christoph Hellwig, Corentin Labbe, Daniel
  Axtens, David Gibson, Diana Craciun, Firoz Khan, Gustavo A. R. Silva,
  Igor Stoppa, Joe Lawrence, Joel Stanley, Jonathan Neuschäfer, Jordan
  Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark
  Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce, Meelis
  Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot,
  Nicholas Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran,
  Paul Mackerras, Peter Xu, PrasannaKumar Muralidharan, Qian Cai,
  Rashmica Gupta, Reza Arbab, Robert P. J. Day, Russell Currey,
  Sabyasachi Gupta, Sam Bobroff, Sandipan Das, Sergey Senozhatsky,
  Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav Jain,
  YueHaibing"

* tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
  powerpc/32: Clear on-stack exception marker upon exception return
  powerpc: Remove export of save_stack_trace_tsk_reliable()
  powerpc/mm: fix "section_base" set but not used
  powerpc/mm: Fix "sz" set but not used warning
  powerpc/mm: Check secondary hash page table
  powerpc: remove nargs from __SYSCALL
  powerpc/64s: Fix unrelocated interrupt trampoline address test
  powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables
  powerpc/fsl: Fix the flush of branch predictor.
  powerpc/powernv: Make opal log only readable by root
  powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc
  powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C
  powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
  powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
  powerpc/64s: system reset interrupt preserve HSRRs
  powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search
  powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback
  selftests/powerpc: Remove duplicate header
  powerpc sstep: Add support for modsd, modud instructions
  ...
</content>
</entry>
<entry>
<title>vfio_pci: Enable memory accesses before calling pci_map_rom</title>
<updated>2019-02-18T21:57:50Z</updated>
<author>
<name>Eric Auger</name>
<email>eric.auger@redhat.com</email>
</author>
<published>2019-02-15T16:16:06Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0cfd027be1d6def4a462cdc180c055143af24069'/>
<id>urn:sha1:0cfd027be1d6def4a462cdc180c055143af24069</id>
<content type='text'>
pci_map_rom()/pci_get_rom_size() perform memory accesses in the ROM.
If Memory Space accesses are disabled, readw() is likely to trigger
a synchronous external abort on some platforms.

If memory accesses were disabled, re-enable them before the call
and disable them again just afterwards.

Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Signed-off-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Suggested-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>vfio/pci: Restore device state on PM transition</title>
<updated>2019-02-18T21:55:53Z</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2019-02-09T20:43:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=51ef3a004b1eb6241e56b3aa8495769a092a4dc2'/>
<id>urn:sha1:51ef3a004b1eb6241e56b3aa8495769a092a4dc2</id>
<content type='text'>
PCI core handles save and restore of device state around reset, but
when using pci_set_power_state() we can unintentionally trigger a soft
reset of the device, where PCI core only restores the BAR state.  If
we're using vfio-pci's idle D3 support to try to put devices into low
power when unused, this might trigger a reset when the device is woken
for use.  Also power state management by the user, or within a guest,
can put the device into D3 power state with potentially limited
ability to restore the device if it should undergo a reset.  The PCI
spec does not define the extent of a soft reset and many devices
reporting soft reset on D3-&gt;D0 transition do not undergo a PCI config
space reset.  It's therefore assumed safe to unconditionally restore
the remainder of the state if the device indicates soft reset
support, even on a user initiated wakeup.

Implement a wrapper in vfio-pci to tag devices reporting PM reset
support, save their state on transitions into D3 and restore on
transitions back to D0.

Reported-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>vfio/spapr_tce: Skip unsetting already unset table</title>
<updated>2019-02-13T20:08:12Z</updated>
<author>
<name>Alexey Kardashevskiy</name>
<email>aik@ozlabs.ru</email>
</author>
<published>2019-02-11T07:49:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a3906855890d94736d240f0f637585c1470d8d02'/>
<id>urn:sha1:a3906855890d94736d240f0f637585c1470d8d02</id>
<content type='text'>
VFIO TCE IOMMU v2 owns IOMMU tables. When we detach an IOMMU group from
a container, we need to unset these tables from the group which we do by
calling unset_window(). We also unset tables when removing a DMA window
via the VFIO_IOMMU_SPAPR_TCE_REMOVE ioctl.

The window removal checks whether the table actually exists (hidden
inside tce_iommu_find_table()), but group detaching does not, so the
user may see duplicate messages:
pci 0009:03     : [PE# fd] Removing DMA window #0
pci 0009:03     : [PE# fd] Removing DMA window #1
pci 0009:03     : [PE# fd] Removing DMA window #0
pci 0009:03     : [PE# fd] Removing DMA window #1

At the moment this is not a problem as the second invocation
of unset_window() writes zeroes to the HW registers again and exits early
as there is no table.

Signed-off-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Reviewed-by: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
<entry>
<title>vfio: expand minor range when registering chrdev region</title>
<updated>2019-02-12T20:20:56Z</updated>
<author>
<name>Chengguang Xu</name>
<email>cgxu519@gmx.com</email>
</author>
<published>2019-02-12T05:59:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8bcb64a51065e957e170ada58cbbd766be6a9619'/>
<id>urn:sha1:8bcb64a51065e957e170ada58cbbd766be6a9619</id>
<content type='text'>
The total number of minor numbers available for a single major
is MINORMASK + 1, so expand the minor range when registering
the chrdev region.
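
A small C sketch of the arithmetic, assuming the kernel's 20-bit
minor number space (so MINORMASK is 0xFFFFF); minor_count() is a
hypothetical helper for illustration, not a kernel function:

```c
/* Illustrative constant mirroring a 20-bit minor number space. */
#define MINORMASK 0xFFFFFu   /* the 20 low bits set */

/* Minors run from 0 to MINORMASK inclusive, so the count is one more. */
unsigned int minor_count(void)
{
    return MINORMASK + 1u;
}
```

A region registered with only MINORMASK entries would leave the
highest minor number outside the range.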

Signed-off-by: Chengguang Xu &lt;cgxu519@gmx.com&gt;
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
</entry>
</feed>
