<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/unix, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-03-08T22:48:40Z</updated>
<entry>
<title>Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block</title>
<updated>2019-03-08T22:48:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-08T22:48:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=38e7571c07be01f9f19b355a9306a4e3d5cb0f5b'/>
<id>urn:sha1:38e7571c07be01f9f19b355a9306a4e3d5cb0f5b</id>
<content type='text'>
Pull io_uring IO interface from Jens Axboe:
 "Second attempt at adding the io_uring interface.

  Since the first one, we've added basic unit testing of the three
  system calls, that resides in liburing like the other unit tests that
  we have so far. It'll take a while to get full coverage of it, but
  we're working towards it. I've also added two basic test programs to
  tools/io_uring. One uses the raw interface and has support for all the
  various features that io_uring supports outside of standard IO, like
  fixed files, fixed IO buffers, and polled IO. The other uses the
  liburing API, and is a simplified version of cp(1).

  This adds support for a new IO interface, io_uring.

  io_uring allows an application to communicate with the kernel through
  two rings, the submission queue (SQ) and completion queue (CQ) ring.
  This allows for very efficient handling of IOs, see the v5 posting for
  some basic numbers:

    https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/

  Outside of just efficiency, the interface is also flexible and
  extendable, and allows for future use cases like the upcoming NVMe
  key-value store API, networked IO, and so on. It also supports async
  buffered IO, something that we've always failed to support in the
  kernel.

  Outside of basic IO features, it supports async polled IO as well.
  This particular feature has already been tested at Facebook months ago
  for flash storage boxes, with 25-33% improvements. It makes polled IO
  actually useful for real world use cases, where even basic flash sees
  a nice win in terms of efficiency, latency, and performance. These
  boxes were IOPS bound before, now they are not.

  This series adds three new system calls. One for setting up an
  io_uring instance (io_uring_setup(2)), one for submitting/completing
  IO (io_uring_enter(2)), and one for aux functions like registrating
  file sets, buffers, etc (io_uring_register(2)). Through the help of
  Arnd, I've coordinated the syscall numbers so merge on that front
  should be painless.

  Jon did a writeup of the interface a while back, which (except for
  minor details that have been tweaked) is still accurate. Find that
  here:

    https://lwn.net/Articles/776703/

  Huge thanks to Al Viro for helping getting the reference cycle code
  correct, and to Jann Horn for his extensive reviews focused on both
  security and bugs in general.

  There's a userspace library that provides basic functionality for
  applications that don't need or want to care about how to fiddle with
  the rings directly. It has helpers to allow applications to easily set
  up an io_uring instance, and submit/complete IO through it without
  knowing about the intricacies of the rings. It also includes man pages
  (thanks to Jeff Moyer), and will continue to grow support helper
  functions and features as time progresses. Find it here:

    git://git.kernel.dk/liburing

  Fio has full support for the raw interface, both in the form of an IO
  engine (io_uring), but also with a small test application (t/io_uring)
  that can exercise and benchmark the interface"

* tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block:
  io_uring: add a few test tools
  io_uring: allow workqueue item to handle multiple buffered requests
  io_uring: add support for IORING_OP_POLL
  io_uring: add io_kiocb ref count
  io_uring: add submission polling
  io_uring: add file set registration
  net: split out functions related to registering inflight socket files
  io_uring: add support for pre-mapped user IO buffers
  block: implement bio helper to add iter bvec pages to bio
  io_uring: batch io_kiocb allocation
  io_uring: use fget/fput_many() for file references
  fs: add fget_many() and fput_many()
  io_uring: support for IO polling
  io_uring: add fsync support
  Add io_uring IO interface
</content>
</entry>
<entry>
<title>net: split out functions related to registering inflight socket files</title>
<updated>2019-02-28T15:24:23Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-02-08T16:01:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f4e65870e5cede5ca1ec0006b6c9803994e5f7b8'/>
<id>urn:sha1:f4e65870e5cede5ca1ec0006b6c9803994e5f7b8</id>
<content type='text'>
We need this functionality for the io_uring file registration, but
we cannot rely on it since CONFIG_UNIX can be modular. Move the helpers
to a separate file, that's always builtin to the kernel if CONFIG_UNIX is
m/y.

No functional changes in this patch, just moving code around.

Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Add io_uring IO interface</title>
<updated>2019-02-28T15:24:23Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-01-07T17:46:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2b188cc1bb857a9d4701ae59aa7768b5124e262e'/>
<id>urn:sha1:2b188cc1bb857a9d4701ae59aa7768b5124e262e</id>
<content type='text'>
The submission queue (SQ) and completion queue (CQ) rings are shared
between the application and the kernel. This eliminates the need to
copy data back and forth to submit and complete IO.

IO submissions use the io_uring_sqe data structure, and completions
are generated in the form of io_uring_cqe data structures. The SQ
ring is an index into the io_uring_sqe array, which makes it possible
to submit a batch of IOs without them being contiguous in the ring.
The CQ ring is always contiguous, as completion events are inherently
unordered, and hence any io_uring_cqe entry can point back to an
arbitrary submission.

Two new system calls are added for this:

io_uring_setup(entries, params)
	Sets up an io_uring instance for doing async IO. On success,
	returns a file descriptor that the application can mmap to
	gain access to the SQ ring, CQ ring, and io_uring_sqes.

io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize)
	Initiates IO against the rings mapped to this fd, or waits for
	them to complete, or both. The behavior is controlled by the
	parameters passed in. If 'to_submit' is non-zero, then we'll
	try and submit new IO. If IORING_ENTER_GETEVENTS is set, the
	kernel will wait for 'min_complete' events, if they aren't
	already available. It's valid to set IORING_ENTER_GETEVENTS
	and 'min_complete' == 0 at the same time, this allows the
	kernel to return already completed events without waiting
	for them. This is useful only for polling, as for IRQ
	driven IO, the application can just check the CQ ring
	without entering the kernel.

With this setup, it's possible to do async IO with a single system
call. Future developments will enable polled IO with this interface,
and polled submission as well. The latter will enable an application
to do IO without doing ANY system calls at all.

For IRQ driven IO, an application only needs to enter the kernel for
completions if it wants to wait for them to occur.

Each io_uring is backed by a workqueue, to support buffered async IO
as well. We will only punt to an async context if the command would
need to wait for IO on the device side. Any data that can be accessed
directly in the page cache is done inline. This avoids the slowness
issue of usual threadpools, since cached data is accessed as quickly
as a sync interface.

Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c

Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>missing barriers in some of unix_sock -&gt;addr and -&gt;path accesses</title>
<updated>2019-02-21T04:06:28Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2019-02-15T20:09:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ae3b564179bfd06f32d051b9e5d72ce4b2a07c37'/>
<id>urn:sha1:ae3b564179bfd06f32d051b9e5d72ce4b2a07c37</id>
<content type='text'>
Several u-&gt;addr and u-&gt;path users are not holding any locks in
common with unix_bind().  unix_state_lock() is useless for those
purposes.

u-&gt;addr is assign-once and *(u-&gt;addr) is fully set up by the time
we set u-&gt;addr (all under unix_table_lock).  u-&gt;path is also
set in the same critical area, also before setting u-&gt;addr, and
any unix_sock with -&gt;path filled will have non-NULL -&gt;addr.

So setting -&gt;addr with smp_store_release() is all we need for those
"lockless" users - just have them fetch -&gt;addr with smp_load_acquire()
and don't even bother looking at -&gt;path if they see NULL -&gt;addr.

Users of -&gt;addr and -&gt;path fall into several classes now:
    1) ones that do smp_load_acquire(u-&gt;addr) and access *(u-&gt;addr)
and u-&gt;path only if smp_load_acquire() has returned non-NULL.
    2) places holding unix_table_lock.  These are guaranteed that
*(u-&gt;addr) is seen fully initialized.  If unix_sock is in one of the
"bound" chains, so's -&gt;path.
    3) unix_sock_destructor() using -&gt;addr is safe.  All places
that set u-&gt;addr are guaranteed to have seen all stores *(u-&gt;addr)
while holding a reference to u and unix_sock_destructor() is called
when (atomic) refcount hits zero.
    4) unix_release_sock() using -&gt;path is safe.  unix_bind()
is serialized wrt unix_release() (normally - by struct file
refcount), and for the instances that had -&gt;path set by unix_bind()
unix_release_sock() comes from unix_release(), so they are fine.
Instances that had it set in unix_stream_connect() either end up
attached to a socket (in unix_accept()), in which case the call
chain to unix_release_sock() and serialization are the same as in
the previous case, or they never get accept'ed and unix_release_sock()
is called when the listener is shut down and its queue gets purged.
In that case the listener's queue lock provides the barriers needed -
unix_stream_connect() shoves our unix_sock into listener's queue
under that lock right after having set -&gt;path and eventual
unix_release_sock() caller picks them from that queue under the
same lock right before calling unix_release_sock().
    5) unix_find_other() use of -&gt;path is pointless, but safe -
it happens with successful lookup by (abstract) name, so -&gt;path.dentry
is guaranteed to be NULL there.

earlier-variant-reviewed-by: "Paul E. McKenney" &lt;paulmck@linux.ibm.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Revert "net: simplify sock_poll_wait"</title>
<updated>2018-10-23T17:57:06Z</updated>
<author>
<name>Karsten Graul</name>
<email>kgraul@linux.ibm.com</email>
</author>
<published>2018-10-23T11:40:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=89ab066d4229acd32e323f1569833302544a4186'/>
<id>urn:sha1:89ab066d4229acd32e323f1569833302544a4186</id>
<content type='text'>
This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.

This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file-&gt;private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.

Signed-off-by: Karsten Graul &lt;kgraul@linux.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: fix warning in af_unix</title>
<updated>2018-10-18T04:57:28Z</updated>
<author>
<name>Kyeongdon Kim</name>
<email>kyeongdon.kim@lge.com</email>
</author>
<published>2018-10-16T05:57:26Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=33c4368ee2589c165aebd8d388cbd91e9adb9688'/>
<id>urn:sha1:33c4368ee2589c165aebd8d388cbd91e9adb9688</id>
<content type='text'>
This fixes the "'hash' may be used uninitialized in this function"

net/unix/af_unix.c:1041:20: warning: 'hash' may be used uninitialized in this function [-Wmaybe-uninitialized]
  addr-&gt;hash = hash ^ sk-&gt;sk_type;

Signed-off-by: Kyeongdon Kim &lt;kyeongdon.kim@lge.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>af_unix: ensure POLLOUT on remote close() for connected dgram socket</title>
<updated>2018-08-03T23:44:19Z</updated>
<author>
<name>Jason Baron</name>
<email>jbaron@akamai.com</email>
</author>
<published>2018-08-03T21:24:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=51f7e95187f127d5eadf50541943813ff57f12ba'/>
<id>urn:sha1:51f7e95187f127d5eadf50541943813ff57f12ba</id>
<content type='text'>
Applications use -ECONNREFUSED as returned from write() in order to
determine that a socket should be closed. However, when using connected
dgram unix sockets in a poll/write loop, a final POLLOUT event can be
missed when the remote end closes. Thus, the poll is stuck forever:

          thread 1 (client)                   thread 2 (server)

connect() to server
write() returns -EAGAIN
unix_dgram_poll()
 -&gt; unix_recvq_full() is true
                                       close()
                                        -&gt;unix_release_sock()
                                         -&gt;wake_up_interruptible_all()
unix_dgram_poll() (due to the
     wake_up_interruptible_all)
 -&gt; unix_recvq_full() still is true
                                         -&gt;free all skbs

Now thread 1 is stuck and will not receive anymore wakeups. In this
case, when thread 1 gets the -EAGAIN, it has not queued any skbs
otherwise the 'free all skbs' step would in fact cause a wakeup and
a POLLOUT return. So the race here is probably fairly rare because
it means there are no skbs that thread 1 queued and that thread 1
schedules before the 'free all skbs' step.

This issue was reported as a hang when /dev/log is closed.

The fix is to signal POLLOUT if the socket is marked as SOCK_DEAD, which
means a subsequent write() will get -ECONNREFUSED.

Reported-by: Ian Lance Taylor &lt;iant@golang.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Jason Baron &lt;jbaron@akamai.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: simplify sock_poll_wait</title>
<updated>2018-07-30T16:10:25Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-07-30T07:42:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=dd979b4df817e9976f18fb6f9d134d6bc4a3c317'/>
<id>urn:sha1:dd979b4df817e9976f18fb6f9d134d6bc4a3c317</id>
<content type='text'>
The wait_address argument is always directly derived from the filp
argument, so remove it.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Revert changes to convert to -&gt;poll_mask() and aio IOCB_CMD_POLL</title>
<updated>2018-06-28T17:40:47Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-28T16:43:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a11e1d432b51f63ba698d044441284a661f01144'/>
<id>urn:sha1:a11e1d432b51f63ba698d044441284a661f01144</id>
<content type='text'>
The poll() changes were not well thought out, and completely
unexplained.  They also caused a huge performance regression, because
"-&gt;poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.

Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"-&gt;get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead.  That gets rid of one of the new indirections.

But that doesn't fix the new complexity that is completely unwarranted
for the regular case.  The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.

[ This revert is a revert of about 30 different commits, not reverted
  individually because that would just be unnecessarily messy  - Linus ]

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2018-06-04T20:57:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-04T20:57:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=408afb8d7847faea115508ba154346e33edfc7d5'/>
<id>urn:sha1:408afb8d7847faea115508ba154346e33edfc7d5</id>
<content type='text'>
Pull aio updates from Al Viro:
 "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

  The only thing I'm holding back for a day or so is Adam's aio ioprio -
  his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
  but let it sit in -next for decency sake..."

* 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  aio: sanitize the limit checking in io_submit(2)
  aio: fold do_io_submit() into callers
  aio: shift copyin of iocb into io_submit_one()
  aio_read_events_ring(): make a bit more readable
  aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
  aio: take list removal to (some) callers of aio_complete()
  aio: add missing break for the IOCB_CMD_FDSYNC case
  random: convert to -&gt;poll_mask
  timerfd: convert to -&gt;poll_mask
  eventfd: switch to -&gt;poll_mask
  pipe: convert to -&gt;poll_mask
  crypto: af_alg: convert to -&gt;poll_mask
  net/rxrpc: convert to -&gt;poll_mask
  net/iucv: convert to -&gt;poll_mask
  net/phonet: convert to -&gt;poll_mask
  net/nfc: convert to -&gt;poll_mask
  net/caif: convert to -&gt;poll_mask
  net/bluetooth: convert to -&gt;poll_mask
  net/sctp: convert to -&gt;poll_mask
  net/tipc: convert to -&gt;poll_mask
  ...
</content>
</entry>
</feed>
