<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/core/sock.c, branch linux-4.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-03-04T15:28:35Z</updated>
<entry>
<title>net: avoid signed overflows for SO_{SND|RCV}BUFFORCE</title>
<updated>2018-03-04T15:28:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-12-02T17:44:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f99fb439e6aff4e9f8b91a80d48b2a2d97aa2248'/>
<id>urn:sha1:f99fb439e6aff4e9f8b91a80d48b2a2d97aa2248</id>
<content type='text'>
[ Upstream commit b98b0bc8c431e3ceb4b26b0dfc8db509518fb290 ]

CAP_NET_ADMIN users should not be allowed to set negative
sk_sndbuf or sk_rcvbuf values, as it can lead to various memory
corruptions, crashes, OOM...

Note that before commit 82981930125a ("net: cleanups in
sock_setsockopt()"), the bug was even more serious, since SO_SNDBUF
and SO_RCVBUF were vulnerable.

This needs to be backported to all known linux kernels.

Again, many thanks to syzkaller team for discovering this gem.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
</entry>
<entry>
<title>net: properly release sk_frag.page</title>
<updated>2018-01-17T17:29:53Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-03-15T20:21:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2c472af8144771a3ab4bb0e846aad83bf1bc2d16'/>
<id>urn:sha1:2c472af8144771a3ab4bb0e846aad83bf1bc2d16</id>
<content type='text'>
[ Upstream commit 22a0e18eac7a9e986fec76c60fa4a2926d1291e2 ]

I mistakenly added the code to release sk-&gt;sk_frag in
sk_common_release() instead of sk_destruct()

TCP sockets using sk-&gt;sk_allocation == GFP_ATOMIC do no call
sk_common_release() at close time, thus leaking one (order-3) page.

iSCSI is using such sockets.

Fixes: 5640f7685831 ("net: use a per task frag allocator")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
</entry>
<entry>
<title>net: Set sk_prot_creator when cloning sockets to the right proto</title>
<updated>2017-11-06T04:54:30Z</updated>
<author>
<name>Christoph Paasch</name>
<email>cpaasch@apple.com</email>
</author>
<published>2017-09-27T00:38:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0a11ea32304f6ad6daf5589b6626775b7f00015c'/>
<id>urn:sha1:0a11ea32304f6ad6daf5589b6626775b7f00015c</id>
<content type='text'>
[ Upstream commit 9d538fa60bad4f7b23193c89e843797a1cf71ef3 ]

sk-&gt;sk_prot and sk-&gt;sk_prot_creator can differ when the app uses
IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
Which is why sk_prot_creator is there to make sure that sk_prot_free()
does the kmem_cache_free() on the right kmem_cache slab.

Now, if such a socket gets transformed back to a listening socket (using
connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
sk_clone_lock() when a new connection comes in. But sk_prot_creator will
still point to the IPv6 kmem_cache (as everything got copied in
sk_clone_lock()). When freeing, we will thus put this
memory back into the IPv6 kmem_cache although it was allocated in the
IPv4 cache. I have seen memory corruption happening because of this.

With slub-debugging and MEMCG_KMEM enabled this gives the warning
	"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"

A C-program to trigger this:

void main(void)
{
        int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
        int new_fd, newest_fd, client_fd;
        struct sockaddr_in6 bind_addr;
        struct sockaddr_in bind_addr4, client_addr1, client_addr2;
        struct sockaddr unsp;
        int val;

        memset(&amp;bind_addr, 0, sizeof(bind_addr));
        bind_addr.sin6_family = AF_INET6;
        bind_addr.sin6_port = ntohs(42424);

        memset(&amp;client_addr1, 0, sizeof(client_addr1));
        client_addr1.sin_family = AF_INET;
        client_addr1.sin_port = ntohs(42424);
        client_addr1.sin_addr.s_addr = inet_addr("127.0.0.1");

        memset(&amp;client_addr2, 0, sizeof(client_addr2));
        client_addr2.sin_family = AF_INET;
        client_addr2.sin_port = ntohs(42421);
        client_addr2.sin_addr.s_addr = inet_addr("127.0.0.1");

        memset(&amp;unsp, 0, sizeof(unsp));
        unsp.sa_family = AF_UNSPEC;

        bind(fd, (struct sockaddr *)&amp;bind_addr, sizeof(bind_addr));

        listen(fd, 5);

        client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        connect(client_fd, (struct sockaddr *)&amp;client_addr1, sizeof(client_addr1));
        new_fd = accept(fd, NULL, NULL);
        close(fd);

        val = AF_INET;
        setsockopt(new_fd, SOL_IPV6, IPV6_ADDRFORM, &amp;val, sizeof(val));

        connect(new_fd, &amp;unsp, sizeof(unsp));

        memset(&amp;bind_addr4, 0, sizeof(bind_addr4));
        bind_addr4.sin_family = AF_INET;
        bind_addr4.sin_port = ntohs(42421);
        bind(new_fd, (struct sockaddr *)&amp;bind_addr4, sizeof(bind_addr4));

        listen(new_fd, 5);

        client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        connect(client_fd, (struct sockaddr *)&amp;client_addr2, sizeof(client_addr2));

        newest_fd = accept(new_fd, NULL, NULL);
        close(new_fd);

        close(client_fd);
        close(new_fd);
}

As far as I can see, this bug has been there since the beginning of the
git-days.

Signed-off-by: Christoph Paasch &lt;cpaasch@apple.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>net: fix compile error in skb_orphan_partial()</title>
<updated>2017-06-25T03:05:43Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-05-16T20:27:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=93dcd4929d18e947dd5746b1e281c4a14c316a22'/>
<id>urn:sha1:93dcd4929d18e947dd5746b1e281c4a14c316a22</id>
<content type='text'>
[ Upstream commit 9142e9007f2d7ab58a587a1e1d921b0064a339aa ]

If CONFIG_INET is not set, net/core/sock.c can not compile :

net/core/sock.c: In function ‘skb_orphan_partial’:
net/core/sock.c:1810:2: error: implicit declaration of function
‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration]
  if (skb_is_tcp_pure_ack(skb))
  ^

Fix this by always including &lt;net/tcp.h&gt;

Fixes: f6ba8d33cfbb ("netem: fix skb_orphan_partial()")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>netem: fix skb_orphan_partial()</title>
<updated>2017-06-25T03:05:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-05-11T22:24:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7a230cfdf208b079dd9c5d71e5704811e2b267d9'/>
<id>urn:sha1:7a230cfdf208b079dd9c5d71e5704811e2b267d9</id>
<content type='text'>
[ Upstream commit f6ba8d33cfbb46df569972e64dbb5bb7e929bfd9 ]

I should have known that lowering skb-&gt;truesize was dangerous :/

In case packets are not leaving the host via a standard Ethernet device,
but looped back to local sockets, bad things can happen, as reported
by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )

So instead of tweaking skb-&gt;truesize, lets change skb-&gt;destructor
and keep a reference on the owner socket via its sk_refcnt.

Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Michael Madsen &lt;mkm@nabto.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>net: check both type and procotol for tcp sockets</title>
<updated>2016-01-23T04:54:15Z</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2015-12-17T07:39:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bd30c2ee7e5dc8258476801a2ec7039a47e2078e'/>
<id>urn:sha1:bd30c2ee7e5dc8258476801a2ec7039a47e2078e</id>
<content type='text'>
[ Upstream commit ac5cc977991d2dce85fc734a6c71ddb33f6fe3c1 ]

Dmitry reported the following out-of-bound access:

Call Trace:
 [&lt;ffffffff816cec2e&gt;] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
 [&lt;ffffffff84affb14&gt;] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
 [&lt;     inline     &gt;] SYSC_setsockopt net/socket.c:1746
 [&lt;ffffffff84aed7ee&gt;] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
 [&lt;ffffffff85c18c76&gt;] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

This is because we mistake a raw socket as a tcp socket.
We should check both sk-&gt;sk_type and sk-&gt;sk_protocol to ensure
it is a tcp socket.

Willem points out __skb_complete_tx_timestamp() needs to fix as well.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sctp: update the netstamp_needed counter when copying sockets</title>
<updated>2016-01-23T04:54:12Z</updated>
<author>
<name>Marcelo Ricardo Leitner</name>
<email>marcelo.leitner@gmail.com</email>
</author>
<published>2015-12-04T17:14:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2e2b937a300e338bbe6c131181a97af17a9eb15e'/>
<id>urn:sha1:2e2b937a300e338bbe6c131181a97af17a9eb15e</id>
<content type='text'>
[ Upstream commit 01ce63c90170283a9855d1db4fe81934dddce648 ]

Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.

When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.

The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Marcelo Ricardo Leitner &lt;marcelo.leitner@gmail.com&gt;
Acked-by: Vlad Yasevich &lt;vyasevich@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: don't wait for order-3 page allocation</title>
<updated>2015-06-12T00:33:44Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2015-06-11T23:50:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fb05e7a89f500cfc06ae277bdc911b281928995d'/>
<id>urn:sha1:fb05e7a89f500cfc06ae277bdc911b281928995d</id>
<content type='text'>
We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and add latency. Commit 5640f7685831e0
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but to improve performance. But direct memory
compaction has high overhead. The benefit of order-3 allocation can't
compensate the overhead of direct memory compaction.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the alloction will still success, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fallback to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is waken up and doing
compaction, so chances are allocation could success next time.

alloc_skb_with_frags is the same.

The mellanox driver does similar thing, if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Debabrata Banerjee &lt;dbavatar@gmail.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net, swap: Remove a warning and clarify why sk_mem_reclaim is required when deactivating swap</title>
<updated>2015-06-11T06:02:31Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-06-11T01:02:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5d753610277862fd8050901f38b6571b9500cdb6'/>
<id>urn:sha1:5d753610277862fd8050901f38b6571b9500cdb6</id>
<content type='text'>
Jeff Layton reported the following;

 [   74.232485] ------------[ cut here ]------------
 [   74.233354] WARNING: CPU: 2 PID: 754 at net/core/sock.c:364 sk_clear_memalloc+0x51/0x80()
 [   74.234790] Modules linked in: cts rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device nfsd snd_pcm snd_timer snd e1000 ppdev parport_pc joydev parport pvpanic soundcore floppy serio_raw i2c_piix4 pcspkr nfs_acl lockd virtio_balloon acpi_cpufreq auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci ata_generic virtio_ring pata_acpi virtio
 [   74.243599] CPU: 2 PID: 754 Comm: swapoff Not tainted 4.1.0-rc6+ #5
 [   74.244635] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 [   74.245546]  0000000000000000 0000000079e69e31 ffff8800d066bde8 ffffffff8179263d
 [   74.246786]  0000000000000000 0000000000000000 ffff8800d066be28 ffffffff8109e6fa
 [   74.248175]  0000000000000000 ffff880118d48000 ffff8800d58f5c08 ffff880036e380a8
 [   74.249483] Call Trace:
 [   74.249872]  [&lt;ffffffff8179263d&gt;] dump_stack+0x45/0x57
 [   74.250703]  [&lt;ffffffff8109e6fa&gt;] warn_slowpath_common+0x8a/0xc0
 [   74.251655]  [&lt;ffffffff8109e82a&gt;] warn_slowpath_null+0x1a/0x20
 [   74.252585]  [&lt;ffffffff81661241&gt;] sk_clear_memalloc+0x51/0x80
 [   74.253519]  [&lt;ffffffffa0116c72&gt;] xs_disable_swap+0x42/0x80 [sunrpc]
 [   74.254537]  [&lt;ffffffffa01109de&gt;] rpc_clnt_swap_deactivate+0x7e/0xc0 [sunrpc]
 [   74.255610]  [&lt;ffffffffa03e4fd7&gt;] nfs_swap_deactivate+0x27/0x30 [nfs]
 [   74.256582]  [&lt;ffffffff811e99d4&gt;] destroy_swap_extents+0x74/0x80
 [   74.257496]  [&lt;ffffffff811ecb52&gt;] SyS_swapoff+0x222/0x5c0
 [   74.258318]  [&lt;ffffffff81023f27&gt;] ? syscall_trace_leave+0xc7/0x140
 [   74.259253]  [&lt;ffffffff81798dae&gt;] system_call_fastpath+0x12/0x71
 [   74.260158] ---[ end trace 2530722966429f10 ]---

The warning in question was unnecessary but with Jeff's series the rules
are also clearer.  This patch removes the warning and updates the comment
to explain why sk_mem_reclaim() may still be called.

[jlayton: remove if (sk-&gt;sk_forward_alloc) conditional. As Leon
          points out that it's not needed.]

Cc: Leon Romanovsky &lt;leon@leon.nu&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Jeff Layton &lt;jeff.layton@primarydata.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Revert "net: kernel socket should be released in init_net namespace"</title>
<updated>2015-05-04T04:13:16Z</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2015-05-03T00:04:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2e70aedd3d522b018c01df172cd213a8a75e2d55'/>
<id>urn:sha1:2e70aedd3d522b018c01df172cd213a8a75e2d55</id>
<content type='text'>
This reverts commit c243d7e20996254f89c28d4838b5feca735c030d.

That patch is solving a non-existant problem while creating a
real problem.  Just because a socket is allocated in the init
name space doesn't mean that it gets hashed in the init name space.

When we unhash it the name space must be the same as the one
we had when we hashed it.  So this patch is completely bogus
and causes socket leaks.

Reported-by: Andrey Wagin &lt;avagin@gmail.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
