<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/net/af_unix.h, branch linux-4.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2016-03-04T15:25:48Z</updated>
<entry>
<title>unix: correctly track in-flight fds in sending process user_struct</title>
<updated>2016-03-04T15:25:48Z</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2016-02-03T01:11:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=797c009c98dbb21127a2549d1106ed19d18661cf'/>
<id>urn:sha1:797c009c98dbb21127a2549d1106ed19d18661cf</id>
<content type='text'>
[ Upstream commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 ]

The commit referenced in the Fixes tag incorrectly accounted the number
of in-flight fds over a unix domain socket to the original opener
of the file-descriptor. This allows another process to arbitrary
deplete the original file-openers resource limit for the maximum of
open files. Instead the sending processes and its struct cred should
be credited.

To do so, we add a reference counted struct user_struct pointer to the
scm_fp_list and use it to account for the number of inflight unix fds.

Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets")
Reported-by: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Cc: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>unix: avoid use-after-free in ep_remove_wait_queue</title>
<updated>2015-12-15T05:24:28Z</updated>
<author>
<name>Rainer Weikusat</name>
<email>rweikusat@mobileactivedefense.com</email>
</author>
<published>2015-11-20T22:07:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5c77e26862ce604edea05b3442ed765e9756fe0f'/>
<id>urn:sha1:5c77e26862ce604edea05b3442ed765e9756fe0f</id>
<content type='text'>
[ Upstream commit 7d267278a9ece963d77eefec61630223fce08c6c ]

Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt; writes:
An AF_UNIX datagram socket being the client in an n:1 association with
some server socket is only allowed to send messages to the server if the
receive queue of this socket contains at most sk_max_ack_backlog
datagrams. This implies that prospective writers might be forced to go
to sleep despite none of the message presently enqueued on the server
receive queue were sent by them. In order to ensure that these will be
woken up once space becomes again available, the present unix_dgram_poll
routine does a second sock_poll_wait call with the peer_wait wait queue
of the server socket as queue argument (unix_dgram_recvmsg does a wake
up on this queue after a datagram was received). This is inherently
problematic because the server socket is only guaranteed to remain alive
for as long as the client still holds a reference to it. In case the
connection is dissolved via connect or by the dead peer detection logic
in unix_dgram_sendmsg, the server socket may be freed despite "the
polling mechanism" (in particular, epoll) still has a pointer to the
corresponding peer_wait queue. There's no way to forcibly deregister a
wait queue with epoll.

Based on an idea by Jason Baron, the patch below changes the code such
that a wait_queue_t belonging to the client socket is enqueued on the
peer_wait queue of the server whenever the peer receive queue full
condition is detected by either a sendmsg or a poll. A wake up on the
peer queue is then relayed to the ordinary wait queue of the client
socket via wake function. The connection to the peer wait queue is again
dissolved if either a wake up is about to be relayed or the client
socket reconnects or a dead peer is detected or the client socket is
itself closed. This enables removing the second sock_poll_wait from
unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
that no blocked writer sleeps forever.

Signed-off-by: Rainer Weikusat &lt;rweikusat@mobileactivedefense.com&gt;
Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
Reviewed-by: Jason Baron &lt;jbaron@akamai.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Convert the unix_sk macro to an inline function for type safety</title>
<updated>2015-10-27T00:51:52Z</updated>
<author>
<name>Aaron Conole</name>
<email>aconole@bytheb.org</email>
</author>
<published>2015-09-26T22:50:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a96cb1c6030aa565bcf15e0b03dee3ca5bb1d05a'/>
<id>urn:sha1:a96cb1c6030aa565bcf15e0b03dee3ca5bb1d05a</id>
<content type='text'>
[ Upstream commit 4613012db1d911f80897f9446a49de817b2c4c47 ]

As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.

Signed-off-by: Aaron Conole &lt;aconole@bytheb.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>af_unix: improve STREAM behavior with fragmented memory</title>
<updated>2013-08-10T08:16:44Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-08-08T21:37:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e370a7236321773245c5522d8bb299380830d3b2'/>
<id>urn:sha1:e370a7236321773245c5522d8bb299380830d3b2</id>
<content type='text'>
unix_stream_sendmsg() currently uses order-2 allocations,
and we had numerous reports this can fail.

The __GFP_REPEAT flag present in sock_alloc_send_pskb() is
not helping.

This patch extends the work done in commit eb6a24816b247c
("af_unix: reduce high order page allocations) for
datagram sockets.

This opens the possibility of zero copy IO (splice() and
friends)

The trick is to not use skb_pull() anymore in recvmsg() path,
and instead add a @consumed field in UNIXCB() to track amount
of already read payload in the skb.

There is a performance regression for large sends
because of extra page allocations that will be addressed
in a follow-up patch, allowing sock_alloc_send_pskb()
to attempt high order page allocations.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>af_unix.h: Remove extern from function prototypes</title>
<updated>2013-08-01T00:50:01Z</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2013-08-01T00:31:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b60a8280ba2a34351dd8b98664df3785c27528dd'/>
<id>urn:sha1:b60a8280ba2a34351dd8b98664df3785c27528dd</id>
<content type='text'>
There are a mix of function prototypes with and without extern
in the kernel sources.  Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler.  Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Reflow modified prototypes to 80 columns.

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>af_unix: fix a fatal race with bit fields</title>
<updated>2013-05-01T19:13:49Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2013-05-01T05:24:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=60bc851ae59bfe99be6ee89d6bc50008c85ec75d'/>
<id>urn:sha1:60bc851ae59bfe99be6ee89d6bc50008c85ec75d</id>
<content type='text'>
Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
uses 64bit instructions to manipulate them.
If the 64bit word includes any atomic_t or spinlock_t, we can lose
critical concurrent changes.

This is happening in af_unix, where unix_sk(sk)-&gt;gc_candidate/
gc_maybe_cycle/lock share the same 64bit word.

This leads to fatal deadlock, as one/several cpus spin forever
on a spinlock that will never be available again.

A safer way would be to use a long to store flags.
This way we are sure compiler/arch wont do bad things.

As we own unix_gc_lock spinlock when clearing or setting bits,
we can use the non atomic __set_bit()/__clear_bit().

recursion_level can share the same 64bit location with the spinlock,
as it is set only with this spinlock held.

[1] bug fixed in gcc-4.8.0 :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080

Reported-by: Ambrose Feinstein &lt;ambrose@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>scm: Stop passing struct cred</title>
<updated>2013-04-07T22:58:55Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-04-03T17:28:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6b0ee8c036ecb3ac92e18e6ca0dca7bff88beaf0'/>
<id>urn:sha1:6b0ee8c036ecb3ac92e18e6ca0dca7bff88beaf0</id>
<content type='text'>
Now that uids and gids are completely encapsulated in kuid_t
and kgid_t we no longer need to pass struct cred which allowed
us to test both the uid and the user namespace for equality.

Passing struct cred potentially allows us to pass the entire group
list as BSD does but I don't believe the cost of cache line misses
justifies retaining code for a future potential application.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>unix: Remove unused field from unix_sock</title>
<updated>2012-10-22T00:37:06Z</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@parallels.com</email>
</author>
<published>2012-10-18T20:28:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0da5f7c6d2a2653e3c8867df5f226e62377a141f'/>
<id>urn:sha1:0da5f7c6d2a2653e3c8867df5f226e62377a141f</id>
<content type='text'>
The struct sock *other one seem to be unused. Grep and make do not object.

Signed-off-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>af_unix: speedup /proc/net/unix</title>
<updated>2012-06-08T21:27:23Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-08T05:03:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7123aaa3a1416529ce461e98108e6b343b294643'/>
<id>urn:sha1:7123aaa3a1416529ce461e98108e6b343b294643</id>
<content type='text'>
/proc/net/unix has quadratic behavior, and can hold unix_table_lock for
a while if high number of unix sockets are alive. (90 ms for 200k
sockets...)

We already have a hash table, so its quite easy to use it.

Problem is unbound sockets are still hashed in a single hash slot
(unix_socket_table[UNIX_HASH_TABLE])

This patch also spreads unbound sockets to 256 hash slots, to speedup
both /proc/net/unix and unix_diag.

Time to read /proc/net/unix with 200k unix sockets :
(time dd if=/proc/net/unix of=/dev/null bs=4k)

before : 520 secs
after : 2 secs

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Steven Whitehouse &lt;swhiteho@redhat.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: cleanup unsigned to unsigned int</title>
<updated>2012-04-15T16:44:40Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-04-15T05:58:06Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=95c961747284a6b83a5e2d81240e214b0fa3464d'/>
<id>urn:sha1:95c961747284a6b83a5e2d81240e214b0fa3464d</id>
<content type='text'>
Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
