<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/ipv6/exthdrs.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-07-05T11:15:26Z</updated>
<entry>
<title>ipv6: make ipv6_renew_options() interrupt/kernel safe</title>
<updated>2018-07-05T11:15:26Z</updated>
<author>
<name>Paul Moore</name>
<email>paul@paul-moore.com</email>
</author>
<published>2018-07-04T13:58:05Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a9ba23d48dbc6ffd08426bb10f05720e0b9f5c14'/>
<id>urn:sha1:a9ba23d48dbc6ffd08426bb10f05720e0b9f5c14</id>
<content type='text'>
At present the ipv6_renew_options_kern() function ends up calling into
access_ok() which is problematic if done from inside an interrupt as
access_ok() calls WARN_ON_IN_IRQ() on some (all?) architectures
(x86-64 is affected).  Example warning/backtrace is shown below:

 WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90
 ...
 Call Trace:
  &lt;IRQ&gt;
  ipv6_renew_option+0xb2/0xf0
  ipv6_renew_options+0x26a/0x340
  ipv6_renew_options_kern+0x2c/0x40
  calipso_req_setattr+0x72/0xe0
  netlbl_req_setattr+0x126/0x1b0
  selinux_netlbl_inet_conn_request+0x80/0x100
  selinux_inet_conn_request+0x6d/0xb0
  security_inet_conn_request+0x32/0x50
  tcp_conn_request+0x35f/0xe00
  ? __lock_acquire+0x250/0x16c0
  ? selinux_socket_sock_rcv_skb+0x1ae/0x210
  ? tcp_rcv_state_process+0x289/0x106b
  tcp_rcv_state_process+0x289/0x106b
  ? tcp_v6_do_rcv+0x1a7/0x3c0
  tcp_v6_do_rcv+0x1a7/0x3c0
  tcp_v6_rcv+0xc82/0xcf0
  ip6_input_finish+0x10d/0x690
  ip6_input+0x45/0x1e0
  ? ip6_rcv_finish+0x1d0/0x1d0
  ipv6_rcv+0x32b/0x880
  ? ip6_make_skb+0x1e0/0x1e0
  __netif_receive_skb_core+0x6f2/0xdf0
  ? process_backlog+0x85/0x250
  ? process_backlog+0x85/0x250
  ? process_backlog+0xec/0x250
  process_backlog+0xec/0x250
  net_rx_action+0x153/0x480
  __do_softirq+0xd9/0x4f7
  do_softirq_own_stack+0x2a/0x40
  &lt;/IRQ&gt;
  ...

While not present in the backtrace, ipv6_renew_option() ends up calling
access_ok() via the following chain:

  access_ok()
  _copy_from_user()
  copy_from_user()
  ipv6_renew_option()

The fix presented in this patch is to perform the userspace copy
earlier in the call chain such that it is only called when the option
data is actually coming from userspace; that place is
do_ipv6_setsockopt().  Not only does this solve the problem seen in
the backtrace above, it also allows us to simplify the code quite a
bit by removing ipv6_renew_options_kern() completely.  We also take
this opportunity to cleanup ipv6_renew_options()/ipv6_renew_option()
a small amount as well.

This patch is heavily based on a rough patch by Al Viro.  I've taken
his original patch, converted a kmemdup() call in do_ipv6_setsockopt()
to a memdup_user() call, made better use of the e_inval jump target in
the same function, and cleaned up the use ipv6_renew_option() by
ipv6_renew_options().

CC: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: Count interface receive statistics on the ingress netdev</title>
<updated>2018-04-17T17:39:51Z</updated>
<author>
<name>Stephen Suryaputra</name>
<email>ssuryaextr@gmail.com</email>
</author>
<published>2018-04-16T17:42:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bdb7cc643fc9db8d6ed9a2b9e524e27ac5882029'/>
<id>urn:sha1:bdb7cc643fc9db8d6ed9a2b9e524e27ac5882029</id>
<content type='text'>
The statistics such as InHdrErrors should be counted on the ingress
netdev rather than on the dev from the dst, which is the egress.

Signed-off-by: Stephen Suryaputra &lt;ssuryaextr@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: sr: fix TLVs not being copied using setsockopt</title>
<updated>2018-01-10T21:03:55Z</updated>
<author>
<name>Mathieu Xhonneux</name>
<email>m.xhonneux@gmail.com</email>
</author>
<published>2018-01-10T13:35:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ccc12b11c5332c84442ef120dcd631523be75089'/>
<id>urn:sha1:ccc12b11c5332c84442ef120dcd631523be75089</id>
<content type='text'>
Function ipv6_push_rthdr4 allows to add an IPv6 Segment Routing Header
to a socket through setsockopt, but the current implementation doesn't
copy possible TLVs at the end of the SRH received from userspace.

Therefore, the execution of the following branch if (sr_has_hmac(sr_phdr))
{ ... } will never complete since the len and type fields of a possible
HMAC TLV are not copied, hence seg6_get_tlv_hmac will return an error,
and the HMAC will not be computed.

This commit adds a memcpy in case TLVs have been appended to the SRH.

Fixes: a149e7c7ce81 ("ipv6: sr: add support for SRH injection through setsockopt")
Acked-by: David Lebrun &lt;dlebrun@google.com&gt;
Signed-off-by: Mathieu Xhonneux &lt;m.xhonneux@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: Implement limits on Hop-by-Hop and Destination options</title>
<updated>2017-11-03T00:50:22Z</updated>
<author>
<name>Tom Herbert</name>
<email>tom@quantonium.net</email>
</author>
<published>2017-10-30T21:16:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=47d3d7ac656a1ffb9d0f0d3c845663ed6fd7e78d'/>
<id>urn:sha1:47d3d7ac656a1ffb9d0f0d3c845663ed6fd7e78d</id>
<content type='text'>
RFC 8200 (IPv6) defines Hop-by-Hop options and Destination options
extension headers. Both of these carry a list of TLVs which is
only limited by the maximum length of the extension header (2048
bytes). By the spec a host must process all the TLVs in these
options, however these could be used as a fairly obvious
denial of service attack. I think this could in fact be
a significant DOS vector on the Internet, one mitigating
factor might be that many FWs drop all packets with EH (and
obviously this is only IPv6) so an Internet wide attack might not
be so effective (yet!).

By my calculation, the worse case packet with TLVs in a standard
1500 byte MTU packet that would be processed by the stack contains
1282 invidual TLVs (including pad TLVS) or 724 two byte TLVs. I
wrote a quick test program that floods a whole bunch of these
packets to a host and sure enough there is substantial time spent
in ip6_parse_tlv. These packets contain nothing but unknown TLVS
(that are ignored), TLV padding, and bogus UDP header with zero
payload length.

  25.38%  [kernel]                    [k] __fib6_clean_all
  21.63%  [kernel]                    [k] ip6_parse_tlv
   4.21%  [kernel]                    [k] __local_bh_enable_ip
   2.18%  [kernel]                    [k] ip6_pol_route.isra.39
   1.98%  [kernel]                    [k] fib6_walk_continue
   1.88%  [kernel]                    [k] _raw_write_lock_bh
   1.65%  [kernel]                    [k] dst_release

This patch adds configurable limits to Destination and Hop-by-Hop
options. There are three limits that may be set:
  - Limit the number of options in a Hop-by-Hop or Destination options
    extension header.
  - Limit the byte length of a Hop-by-Hop or Destination options
    extension header.
  - Disallow unrecognized options in a Hop-by-Hop or Destination
    options extension header.

The limits are set in corresponding sysctls:

  ipv6.sysctl.max_dst_opts_cnt
  ipv6.sysctl.max_hbh_opts_cnt
  ipv6.sysctl.max_dst_opts_len
  ipv6.sysctl.max_hbh_opts_len

If a max_*_opts_cnt is less than zero then unknown TLVs are disallowed.
The number of known TLVs that are allowed is the absolute value of
this number.

If a limit is exceeded when processing an extension header the packet is
dropped.

Default values are set to 8 for options counts, and set to INT_MAX
for maximum length. Note the choice to limit options to 8 is an
arbitrary guess (roughly based on the fact that the stack supports
three HBH options and just one destination option).

These limits have being proposed in draft-ietf-6man-rfc6434-bis.

Tested (by Martin Lau)

I tested out 1 thread (i.e. one raw_udp process).

I changed the net.ipv6.max_dst_(opts|hbh)_number between 8 to 2048.
With sysctls setting to 2048, the softirq% is packed to 100%.
With 8, the softirq% is almost unnoticable from mpstat.

v2;
  - Code and documention cleanup.
  - Change references of RFC2460 to be RFC8200.
  - Add reference to RFC6434-bis where the limits will be in standard.

Signed-off-by: Tom Herbert &lt;tom@quantonium.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: exthdrs: use swap macro in ipv6_dest_hao</title>
<updated>2017-10-28T10:06:59Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>garsilva@embeddedor.com</email>
</author>
<published>2017-10-27T04:10:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=74b6551b9f4106a3e69ab46e60b52a8947dcf60c'/>
<id>urn:sha1:74b6551b9f4106a3e69ab46e60b52a8947dcf60c</id>
<content type='text'>
make use of the swap macro and remove unnecessary variable tmp_addr.
This makes the code easier to read and maintain.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva &lt;garsilva@embeddedor.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: mark expected switch fall-throughs</title>
<updated>2017-10-18T13:13:08Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>garsilva@embeddedor.com</email>
</author>
<published>2017-10-16T21:36:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=275757e6bae15a8621130907a78096afd1e15d2c'/>
<id>urn:sha1:275757e6bae15a8621130907a78096afd1e15d2c</id>
<content type='text'>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Notice that in some cases I placed the "fall through" comment
on its own line, which is what GCC is expecting to find.

Signed-off-by: Gustavo A. R. Silva &lt;garsilva@embeddedor.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: sr: allow SRH insertion with arbitrary segments_left value</title>
<updated>2017-08-07T21:16:21Z</updated>
<author>
<name>David Lebrun</name>
<email>david.lebrun@uclouvain.be</email>
</author>
<published>2017-08-05T10:38:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=925615ceda94532ba612e641549c7b1e41ebff10'/>
<id>urn:sha1:925615ceda94532ba612e641549c7b1e41ebff10</id>
<content type='text'>
The seg6_validate_srh() function only allows SRHs whose active segment is
the first segment of the path. However, an application may insert an SRH
whose active segment is not the first one. Such an application might be
for example an SR-aware Virtual Network Function.

This patch enables to insert SRHs with an arbitrary active segment.

Signed-off-by: David Lebrun &lt;david.lebrun@uclouvain.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>udp6: fix jumbogram reception</title>
<updated>2017-08-01T05:01:21Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2017-07-31T14:52:36Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cb891fa6a1d5f52c5f5c07b6f7f4c6d65ea55fc0'/>
<id>urn:sha1:cb891fa6a1d5f52c5f5c07b6f7f4c6d65ea55fc0</id>
<content type='text'>
Since commit 67a51780aebb ("ipv6: udp: leverage scratch area
helpers") udp6_recvmsg() read the skb len from the scratch area,
to avoid a cache miss.
But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their
length exceeds the 16 bits available in the scratch area. As a side
effect the length returned by recvmsg() is:
&lt;ingress datagram len&gt; % (1&lt;&lt;16)

This commit addresses the issue allocating one more bit in the
IP6CB flags field and setting it for incoming jumbograms.
Such field is still in the first cacheline, so at recvmsg()
time we can check it and fallback to access skb-&gt;len if
required, without a measurable overhead.

Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers")
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net, ipv6: convert ipv6_txoptions.refcnt from atomic_t to refcount_t</title>
<updated>2017-07-04T08:29:03Z</updated>
<author>
<name>Reshetova, Elena</name>
<email>elena.reshetova@intel.com</email>
</author>
<published>2017-07-04T06:34:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0aeea21ada517b99a6e95298ffc105381bcd3a52'/>
<id>urn:sha1:0aeea21ada517b99a6e95298ffc105381bcd3a52</id>
<content type='text'>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova &lt;elena.reshetova@intel.com&gt;
Signed-off-by: Hans Liljestrand &lt;ishkamiel@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: David Windsor &lt;dwindsor@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>networking: make skb_push &amp; __skb_push return void pointers</title>
<updated>2017-06-16T15:48:40Z</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2017-06-16T12:29:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d58ff35122847a83ba55394e2ae3a1527b6febf5'/>
<id>urn:sha1:d58ff35122847a83ba55394e2ae3a1527b6febf5</id>
<content type='text'>
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
