<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/core/skbuff.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-07-14T06:09:36Z</updated>
<entry>
<title>bpf: sockmap, fix use after free from sleep in psock backlog workqueue</title>
<updated>2019-07-14T06:09:36Z</updated>
<author>
<name>John Fastabend</name>
<email>john.fastabend@gmail.com</email>
</author>
<published>2019-05-24T15:01:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8badadedccebc8d67f2920c7adc153e344777d47'/>
<id>urn:sha1:8badadedccebc8d67f2920c7adc153e344777d47</id>
<content type='text'>
[ Upstream commit bd95e678e0f6e18351ecdc147ca819145db9ed7b ]

Backlog work for psock (sk_psock_backlog) might sleep while waiting
for memory to free up when sending packets. However, while sleeping
the socket may be closed and removed from the map by the user space
side.

This breaks an assumption in sk_stream_wait_memory, which expects the
wait queue to be still there when it wakes up resulting in a
use-after-free shown below. To fix his mark sendmsg as MSG_DONTWAIT
to avoid the sleep altogether. We already set the flag for the
sendpage case but we missed the case were sendmsg is used.
Sockmap is currently the only user of skb_send_sock_locked() so only
the sockmap paths should be impacted.

==================================================================
BUG: KASAN: use-after-free in remove_wait_queue+0x31/0x70
Write of size 8 at addr ffff888069a0c4e8 by task kworker/0:2/110

CPU: 0 PID: 110 Comm: kworker/0:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3d4fe-dirty #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
Workqueue: events sk_psock_backlog
Call Trace:
 print_address_description+0x6e/0x2b0
 ? remove_wait_queue+0x31/0x70
 kasan_report+0xfd/0x177
 ? remove_wait_queue+0x31/0x70
 ? remove_wait_queue+0x31/0x70
 remove_wait_queue+0x31/0x70
 sk_stream_wait_memory+0x4dd/0x5f0
 ? sk_stream_wait_close+0x1b0/0x1b0
 ? wait_woken+0xc0/0xc0
 ? tcp_current_mss+0xc5/0x110
 tcp_sendmsg_locked+0x634/0x15d0
 ? tcp_set_state+0x2e0/0x2e0
 ? __kasan_slab_free+0x1d1/0x230
 ? kmem_cache_free+0x70/0x140
 ? sk_psock_backlog+0x40c/0x4b0
 ? process_one_work+0x40b/0x660
 ? worker_thread+0x82/0x680
 ? kthread+0x1b9/0x1e0
 ? ret_from_fork+0x1f/0x30
 ? check_preempt_curr+0xaf/0x130
 ? iov_iter_kvec+0x5f/0x70
 ? kernel_sendmsg_locked+0xa0/0xe0
 skb_send_sock_locked+0x273/0x3c0
 ? skb_splice_bits+0x180/0x180
 ? start_thread+0xe0/0xe0
 ? update_min_vruntime.constprop.27+0x88/0xc0
 sk_psock_backlog+0xb3/0x4b0
 ? strscpy+0xbf/0x1e0
 process_one_work+0x40b/0x660
 worker_thread+0x82/0x680
 ? process_one_work+0x660/0x660
 kthread+0x1b9/0x1e0
 ? __kthread_create_on_node+0x250/0x250
 ret_from_fork+0x1f/0x30

Fixes: 20bf50de3028c ("skbuff: Function to send an skbuf on a socket")
Reported-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Tested-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: correct zerocopy refcnt with udp MSG_MORE</title>
<updated>2019-06-04T05:59:45Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2019-05-30T22:01:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8778dcee3f54acd4e082f36a4eafdcd39bce97ec'/>
<id>urn:sha1:8778dcee3f54acd4e082f36a4eafdcd39bce97ec</id>
<content type='text'>
[ Upstream commit 100f6d8e09905c59be45b6316f8f369c0be1b2d8 ]

TCP zerocopy takes a uarg reference for every skb, plus one for the
tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero
as it builds, sends and frees skbs inside its inner loop.

UDP and RAW zerocopy do not send inside the inner loop so do not need
the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit
52900d22288ed ("udp: elide zerocopy operation in hot path") introduced
extra_uref to pass the initial reference taken in sock_zerocopy_alloc
to the first generated skb.

But, sock_zerocopy_realloc takes this extra reference at the start of
every call. With MSG_MORE, no new skb may be generated to attach the
extra_uref to, so refcnt is incorrectly 2 with only one skb.

Do not take the extra ref if uarg &amp;&amp; !tcp, which implies MSG_MORE.
Update extra_uref accordingly.

This conditional assignment triggers a false positive may be used
uninitialized warning, so have to initialize extra_uref at define.

Changes v1-&gt;v2: fix typo in Fixes SHA1

Fixes: 52900d22288e7 ("udp: elide zerocopy operation in hot path")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Diagnosed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: Fix missing meta data in skb with vlan packet</title>
<updated>2019-04-17T04:29:38Z</updated>
<author>
<name>Yuya Kusakabe</name>
<email>yuya.kusakabe@gmail.com</email>
</author>
<published>2019-04-16T01:22:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d85e8be2a5a02869f815dd0ac2d743deb4cd7957'/>
<id>urn:sha1:d85e8be2a5a02869f815dd0ac2d743deb4cd7957</id>
<content type='text'>
skb_reorder_vlan_header() should move XDP meta data with ethernet header
if XDP meta data exists.

Fixes: de8f3a83b0a0 ("bpf: add meta pointer for direct access")
Signed-off-by: Yuya Kusakabe &lt;yuya.kusakabe@gmail.com&gt;
Signed-off-by: Takeru Hayasaka &lt;taketarou2@gmail.com&gt;
Co-developed-by: Takeru Hayasaka &lt;taketarou2@gmail.com&gt;
Reviewed-by: Toshiaki Makita &lt;makita.toshiaki@lab.ntt.co.jp&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net-gro: Fix GRO flush when receiving a GSO packet.</title>
<updated>2019-04-04T04:40:52Z</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2019-04-02T06:16:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0ab03f353d3613ea49d1f924faf98559003670a8'/>
<id>urn:sha1:0ab03f353d3613ea49d1f924faf98559003670a8</id>
<content type='text'>
Currently we may merge incorrectly a received GSO packet
or a packet with frag_list into a packet sitting in the
gro_hash list. skb_segment() may crash case because
the assumptions on the skb layout are not met.
The correct behaviour would be to flush the packet in the
gro_hash list and send the received GSO packet directly
afterwards. Commit d61d072e87c8e ("net-gro: avoid reorders")
sets NAPI_GRO_CB(skb)-&gt;flush in this case, but this is not
checked before merging. This patch makes sure to check this
flag and to not merge in that case.

Fixes: d61d072e87c8e ("net-gro: avoid reorders")
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Do not allocate page fragments that are not skb aligned</title>
<updated>2019-02-17T23:48:43Z</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@linux.intel.com</email>
</author>
<published>2019-02-15T22:44:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3bed3cc4156eedf652b4df72bdb35d4f1a2a739d'/>
<id>urn:sha1:3bed3cc4156eedf652b4df72bdb35d4f1a2a739d</id>
<content type='text'>
This patch addresses the fact that there are drivers, specifically tun,
that will call into the network page fragment allocators with buffer sizes
that are not cache aligned. Doing this could result in data alignment
and DMA performance issues as these fragment pools are also shared with the
skb allocator and any other devices that will use napi_alloc_frags or
netdev_alloc_frags.

Fixes: ffde7328a36d ("net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag")
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net, skbuff: do not prefer skb allocation fails early</title>
<updated>2019-01-04T20:53:16Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2019-01-02T21:01:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f8c468e8537925e0c4607263f498a1b7c0c8982e'/>
<id>urn:sha1:f8c468e8537925e0c4607263f498a1b7c0c8982e</id>
<content type='text'>
Commit dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by
__GFP_RETRY_MAYFAIL with more useful semantic") replaced __GFP_REPEAT in
alloc_skb_with_frags() with __GFP_RETRY_MAYFAIL when the allocation may
directly reclaim.

The previous behavior would require reclaim up to 1 &lt;&lt; order pages for
skb aligned header_len of order &gt; PAGE_ALLOC_COSTLY_ORDER before failing,
otherwise the allocations in alloc_skb() would loop in the page allocator
looking for memory.  __GFP_RETRY_MAYFAIL makes both allocations failable
under memory pressure, including for the HEAD allocation.

This can cause, among many other things, write() to fail with ENOTCONN
during RPC when under memory pressure.

These allocations should succeed as they did previous to dcda9b04713c
even if it requires calling the oom killer and additional looping in the
page allocator to find memory.  There is no way to specify the previous
behavior of __GFP_REPEAT, but it's unlikely to be necessary since the
previous behavior only guaranteed that 1 &lt;&lt; order pages would be reclaimed
before failing for order &gt; PAGE_ALLOC_COSTLY_ORDER.  That reclaim is not
guaranteed to be contiguous memory, so repeating for such large orders is
usually not beneficial.

Removing the setting of __GFP_RETRY_MAYFAIL to restore the previous
behavior, specifically not allowing alloc_skb() to fail for small orders
and oom kill if necessary rather than allowing RPCs to fail.

Fixes: dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic")
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: minor cleanup in skb_ext_add()</title>
<updated>2018-12-21T18:24:54Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2018-12-21T18:03:15Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=682ec859518d73435cc924d816da2953343241c1'/>
<id>urn:sha1:682ec859518d73435cc924d816da2953343241c1</id>
<content type='text'>
When the extension to be added is already present, the only
skb field we may need to update is 'extensions': we can reorder
the code and avoid a branch.

v1 -&gt; v2:
 - be sure to flag the newly added extension as active

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Acked-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: fix possible user-after-free in skb_ext_add()</title>
<updated>2018-12-21T18:24:54Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2018-12-21T18:03:13Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e94e50bd88f7ed2f2d40c32c06efd61c36c33ec8'/>
<id>urn:sha1:e94e50bd88f7ed2f2d40c32c06efd61c36c33ec8</id>
<content type='text'>
On cow we can free the old extension: we must avoid dereferencing
such extension after skb_ext_maybe_cow(). Since 'new' contents
are always equal to 'old' after the copy, we can fix the above
accessing the relevant data using 'new'.

Fixes: df5042f4c5b9 ("sk_buff: add skb extension infrastructure")
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Acked-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: switch secpath to use skb extension infrastructure</title>
<updated>2018-12-19T19:21:38Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2018-12-18T16:15:27Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4165079ba328dd47262a2183049d3591f0a750b1'/>
<id>urn:sha1:4165079ba328dd47262a2183049d3591f0a750b1</id>
<content type='text'>
Remove skb-&gt;sp and allocate secpath storage via extension
infrastructure.  This also reduces sk_buff by 8 bytes on x86_64.

Total size of allyesconfig kernel is reduced slightly, as there is
less inlined code (one conditional atomic op instead of two on
skb_clone).

No differences in throughput in following ipsec performance tests:
- transport mode with aes on 10GB link
- tunnel mode between two network namespaces with aes and null cipher

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: convert bridge_nf to use skb extension infrastructure</title>
<updated>2018-12-19T19:21:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2018-12-18T16:15:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=de8bda1d22d38b7d5cd08b33f86efd94d4c86630'/>
<id>urn:sha1:de8bda1d22d38b7d5cd08b33f86efd94d4c86630</id>
<content type='text'>
This converts the bridge netfilter (calling iptables hooks from bridge)
facility to use the extension infrastructure.

The bridge_nf specific hooks in skb clone and free paths are removed, they
have been replaced by the skb_ext hooks that do the same as the bridge nf
allocations hooks did.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
