<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/ipv6/ip6_output.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2019-06-22T06:09:07Z</updated>
<entry>
<title>net: correct udp zerocopy refcnt also when zerocopy only on append</title>
<updated>2019-06-22T06:09:07Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2019-06-07T21:57:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0ab34cf296c3c7d326e57a38f61a5a86686bdda1'/>
<id>urn:sha1:0ab34cf296c3c7d326e57a38f61a5a86686bdda1</id>
<content type='text'>
[ Upstream commit 522924b583082f51b8a2406624a2f27c22119b20 ]

The below patch fixes an incorrect zerocopy refcnt increment when
appending with MSG_MORE to an existing zerocopy udp skb.

  send(.., MSG_ZEROCOPY | MSG_MORE);	// refcnt 1
  send(.., MSG_ZEROCOPY | MSG_MORE);	// refcnt still 1 (bar frags)

But it missed that zerocopy need not be passed at the first send. The
right test whether the uarg is newly allocated and thus has extra
refcnt 1 is not !skb, but !skb_zcopy.

  send(.., MSG_MORE);			// &lt;no uarg&gt;
  send(.., MSG_ZEROCOPY);		// refcnt 1

Fixes: 100f6d8e09905 ("net: correct zerocopy refcnt with udp MSG_MORE")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: correct zerocopy refcnt with udp MSG_MORE</title>
<updated>2019-06-04T05:59:45Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2019-05-30T22:01:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8778dcee3f54acd4e082f36a4eafdcd39bce97ec'/>
<id>urn:sha1:8778dcee3f54acd4e082f36a4eafdcd39bce97ec</id>
<content type='text'>
[ Upstream commit 100f6d8e09905c59be45b6316f8f369c0be1b2d8 ]

TCP zerocopy takes a uarg reference for every skb, plus one for the
tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero
as it builds, sends and frees skbs inside its inner loop.

UDP and RAW zerocopy do not send inside the inner loop so do not need
the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit
52900d22288ed ("udp: elide zerocopy operation in hot path") introduced
extra_uref to pass the initial reference taken in sock_zerocopy_alloc
to the first generated skb.

But, sock_zerocopy_realloc takes this extra reference at the start of
every call. With MSG_MORE, no new skb may be generated to attach the
extra_uref to, so refcnt is incorrectly 2 with only one skb.

Do not take the extra ref if uarg &amp;&amp; !tcp, which implies MSG_MORE.
Update extra_uref accordingly.

This conditional assignment triggers a false positive may be used
uninitialized warning, so have to initialize extra_uref at define.

Changes v1-&gt;v2: fix typo in Fixes SHA1

Fixes: 52900d22288e7 ("udp: elide zerocopy operation in hot path")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Diagnosed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv6: Fix dangling pointer when ipv6 fragment</title>
<updated>2019-04-04T04:42:20Z</updated>
<author>
<name>Junwei Hu</name>
<email>hujunwei4@huawei.com</email>
</author>
<published>2019-04-02T11:38:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ef0efcd3bd3fd0589732b67fb586ffd3c8705806'/>
<id>urn:sha1:ef0efcd3bd3fd0589732b67fb586ffd3c8705806</id>
<content type='text'>
At the beginning of ip6_fragment func, the prevhdr pointer is
obtained in the ip6_find_1stfragopt func.
However, all the pointers pointing into skb header may change
when calling skb_checksum_help func with
skb-&gt;ip_summed = CHECKSUM_PARTIAL condition.
The prevhdr pointe will be dangling if it is not reloaded after
calling __skb_linearize func in skb_checksum_help func.

Here, I add a variable, nexthdr_offset, to evaluate the offset,
which does not changes even after calling __skb_linearize func.

Fixes: 405c92f7a541 ("ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment")
Signed-off-by: Junwei Hu &lt;hujunwei4@huawei.com&gt;
Reported-by: Wenhao Zhang &lt;zhangwenhao8@huawei.com&gt;
Reported-by: syzbot+e8ce541d095e486074fc@syzkaller.appspotmail.com
Reviewed-by: Zhiqiang Liu &lt;liuzhiqiang26@huawei.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv6: add socket option IPV6_ROUTER_ALERT_ISOLATE</title>
<updated>2019-03-04T05:05:10Z</updated>
<author>
<name>Francesco Ruggeri</name>
<email>fruggeri@arista.com</email>
</author>
<published>2019-03-01T23:31:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9036b2fe092a107856edd1a3bad48b83f2b45000'/>
<id>urn:sha1:9036b2fe092a107856edd1a3bad48b83f2b45000</id>
<content type='text'>
By default IPv6 socket with IPV6_ROUTER_ALERT socket option set will
receive all IPv6 RA packets from all namespaces.
IPV6_ROUTER_ALERT_ISOLATE socket option restricts packets received by
the socket to be only from the socket's namespace.

Signed-off-by: Maxim Martynov &lt;maxim@arista.com&gt;
Signed-off-by: Francesco Ruggeri &lt;fruggeri@arista.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2018-12-20T19:53:36Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-12-20T18:53:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2be09de7d6a06f58e768de1255a687c9aaa66606'/>
<id>urn:sha1:2be09de7d6a06f58e768de1255a687c9aaa66606</id>
<content type='text'>
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sk_buff: add skb extension infrastructure</title>
<updated>2018-12-19T19:21:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2018-12-18T16:15:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=df5042f4c5b9326c593bf2e31ed859ebc3b4130a'/>
<id>urn:sha1:df5042f4c5b9326c593bf2e31ed859ebc3b4130a</id>
<content type='text'>
This adds an optional extension infrastructure, with ispec (xfrm) and
bridge netfilter as first users.
objdiff shows no changes if kernel is built without xfrm and br_netfilter
support.

The third (planned future) user is Multipath TCP which is still
out-of-tree.
MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
numbers used by individual subflows.

This DSS mapping is read/written from tcp option space on receive and
written to tcp option space on transmitted tcp packets that are part of
and MPTCP connection.

Extending skb_shared_info or adding a private data field to skb fclones
doesn't work for incoming skb, so a different DSS propagation method would
be required for the receive side.

mptcp has same requirements as secpath/bridge netfilter:

1. extension memory is released when the sk_buff is free'd.
2. data is shared after cloning an skb (clone inherits extension)
3. adding extension to an skb will COW the extension buffer if needed.

The "MPTCP upstreaming" effort adds SKB_EXT_MPTCP extension to store the
mapping for tx and rx processing.

Two new members are added to sk_buff:
1. 'active_extensions' byte (filling a hole), telling which extensions
   are available for this skb.
   This has two purposes.
   a) avoids the need to initialize the pointer.
   b) allows to "delete" an extension by clearing its bit
   value in -&gt;active_extensions.

   While it would be possible to store the active_extensions byte
   in the extension struct instead of sk_buff, there is one problem
   with this:
    When an extension has to be disabled, we can always clear the
    bit in skb-&gt;active_extensions.  But in case it would be stored in the
    extension buffer itself, we might have to COW it first, if
    we are dealing with a cloned skb.  On kmalloc failure we would
    be unable to turn an extension off.

2. extension pointer, located at the end of the sk_buff.
   If the active_extensions byte is 0, the pointer is undefined,
   it is not initialized on skb allocation.

This adds extra code to skb clone and free paths (to deal with
refcount/free of extension area) but this replaces similar code that
manages skb-&gt;nf_bridge and skb-&gt;sp structs in the followup patches of
the series.

It is possible to add support for extensions that are not preseved on
clones/copies.

To do this, it would be needed to define a bitmask of all extensions that
need copy/cow semantics, and change __skb_ext_copy() to check
-&gt;active_extensions &amp; SKB_EXT_PRESERVE_ON_CLONE, then just set
-&gt;active_extensions to 0 on the new clone.

This isn't done here because all extensions that get added here
need the copy/cow semantics.

v2:
Allocate entire extension space using kmem_cache.
Upside is that this allows better tracking of used memory,
downside is that we will allocate more space than strictly needed in
most cases (its unlikely that all extensions are active/needed at same
time for same skb).
The allocated memory (except the small extension header) is not cleared,
so no additonal overhead aside from memory usage.

Avoid atomic_dec_and_test operation on skb_ext_put()
by using similar trick as kfree_skbmem() does with fclone_ref:
If recount is 1, there is no concurrent user and we can free right away.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: clear skb-&gt;tstamp in forwarding paths</title>
<updated>2018-12-15T21:24:21Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2018-12-14T14:46:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8203e2d844d34af247a151d8ebd68553a6e91785'/>
<id>urn:sha1:8203e2d844d34af247a151d8ebd68553a6e91785</id>
<content type='text'>
Sergey reported that forwarding was no longer working
if fq packet scheduler was used.

This is caused by the recent switch to EDT model, since incoming
packets might have been timestamped by __net_timestamp()

__net_timestamp() uses ktime_get_real(), while fq expects packets
using CLOCK_MONOTONIC base.

The fix is to clear skb-&gt;tstamp in forwarding paths.

Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Sergey Matyukevich &lt;geomatsi@gmail.com&gt;
Tested-by: Sergey Matyukevich &lt;geomatsi@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2018-12-10T05:43:31Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-12-10T05:27:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4cc1feeb6ffc2799f8badb4dea77c637d340cb0d'/>
<id>urn:sha1:4cc1feeb6ffc2799f8badb4dea77c637d340cb0d</id>
<content type='text'>
Several conflicts, seemingly all over the place.

I used Stephen Rothwell's sample resolutions for many of these, if not
just to double check my own work, so definitely the credit largely
goes to him.

The NFP conflict consisted of a bug fix (moving operations
past the rhashtable operation) while chaning the initial
argument in the function call in the moved code.

The net/dsa/master.c conflict had to do with a bug fix intermixing of
making dsa_master_set_mtu() static with the fixing of the tagging
attribute location.

cls_flower had a conflict because the dup reject fix from Or
overlapped with the addition of port range classifiction.

__set_phy_supported()'s conflict was relatively easy to resolve
because Andrew fixed it in both trees, so it was just a matter
of taking the net-next copy.  Or at least I think it was :-)

Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
intermixed with changes on how the sdif and caller_net are calculated
in these code paths in net-next.

The remaining BPF conflicts were largely about the addition of the
__bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
to the relevant data structure where the MD pointer macros are used.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ip: silence udp zerocopy smatch false positive</title>
<updated>2018-12-08T20:26:20Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2018-12-08T11:22:46Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=97ef7b4c5501b081c6144a08bba6d87baf69b6e5'/>
<id>urn:sha1:97ef7b4c5501b081c6144a08bba6d87baf69b6e5</id>
<content type='text'>
extra_uref is used in __ip(6)_append_data only if uarg is set.

Smatch sees that the variable is passed to sock_zerocopy_put_abort.
This function accesses it only when uarg is set, but smatch cannot
infer this.

Make this dependency explicit.

Fixes: 52900d22288e ("udp: elide zerocopy operation in hot path")
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: Check available headroom in ip6_xmit() even without options</title>
<updated>2018-12-08T00:24:40Z</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2018-12-06T18:30:36Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=66033f47ca60294a95fc85ec3a3cc909dab7b765'/>
<id>urn:sha1:66033f47ca60294a95fc85ec3a3cc909dab7b765</id>
<content type='text'>
Even if we send an IPv6 packet without options, MAX_HEADER might not be
enough to account for the additional headroom required by alignment of
hardware headers.

On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL,
sending short SCTP packets over IPv4 over L2TP over IPv6, we start with
100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54
bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2().

Those would be enough to append our 14 bytes header, but we're going to
align that to 16 bytes, and write 2 bytes out of the allocated slab in
neigh_hh_output().

KASan says:

[  264.967848] ==================================================================
[  264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70
[  264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201
[  264.967870]
[  264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1
[  264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
[  264.967887] Call Trace:
[  264.967896] ([&lt;00000000001347d6&gt;] show_stack+0x56/0xa0)
[  264.967903]  [&lt;00000000017e379c&gt;] dump_stack+0x23c/0x290
[  264.967912]  [&lt;00000000007bc594&gt;] print_address_description+0xf4/0x290
[  264.967919]  [&lt;00000000007bc8fc&gt;] kasan_report+0x13c/0x240
[  264.967927]  [&lt;000000000162f5e4&gt;] ip6_finish_output2+0x1aec/0x1c70
[  264.967935]  [&lt;000000000163f890&gt;] ip6_finish_output+0x430/0x7f0
[  264.967943]  [&lt;000000000163fe44&gt;] ip6_output+0x1f4/0x580
[  264.967953]  [&lt;000000000163882a&gt;] ip6_xmit+0xfea/0x1ce8
[  264.967963]  [&lt;00000000017396e2&gt;] inet6_csk_xmit+0x282/0x3f8
[  264.968033]  [&lt;000003ff805fb0ba&gt;] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core]
[  264.968037]  [&lt;000003ff80631192&gt;] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth]
[  264.968041]  [&lt;0000000001220020&gt;] dev_hard_start_xmit+0x268/0x928
[  264.968069]  [&lt;0000000001330e8e&gt;] sch_direct_xmit+0x7ae/0x1350
[  264.968071]  [&lt;000000000122359c&gt;] __dev_queue_xmit+0x2b7c/0x3478
[  264.968075]  [&lt;00000000013d2862&gt;] ip_finish_output2+0xce2/0x11a0
[  264.968078]  [&lt;00000000013d9b14&gt;] ip_finish_output+0x56c/0x8c8
[  264.968081]  [&lt;00000000013ddd1e&gt;] ip_output+0x226/0x4c0
[  264.968083]  [&lt;00000000013dbd6c&gt;] __ip_queue_xmit+0x894/0x1938
[  264.968100]  [&lt;000003ff80bc3a5c&gt;] sctp_packet_transmit+0x29d4/0x3648 [sctp]
[  264.968116]  [&lt;000003ff80b7bf68&gt;] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp]
[  264.968131]  [&lt;000003ff80b7c716&gt;] sctp_outq_flush+0x22e/0x7d8 [sctp]
[  264.968146]  [&lt;000003ff80b35c68&gt;] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp]
[  264.968161]  [&lt;000003ff80b3410a&gt;] sctp_do_sm+0x222/0x648 [sctp]
[  264.968177]  [&lt;000003ff80bbddac&gt;] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp]
[  264.968192]  [&lt;000003ff80b93328&gt;] __sctp_connect+0x830/0xc20 [sctp]
[  264.968208]  [&lt;000003ff80bb11ce&gt;] sctp_inet_connect+0x2e6/0x378 [sctp]
[  264.968212]  [&lt;0000000001197942&gt;] __sys_connect+0x21a/0x450
[  264.968215]  [&lt;000000000119aff8&gt;] sys_socketcall+0x3d0/0xb08
[  264.968218]  [&lt;000000000184ea7a&gt;] system_call+0x2a2/0x2c0

[...]

Just like ip_finish_output2() does for IPv4, check that we have enough
headroom in ip6_xmit(), and reallocate it if we don't.

This issue is older than git history.

Reported-by: Jianlin Shi &lt;jishi@redhat.com&gt;
Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
