<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/net/ipv4/inet_hashtables.c, branch linux-5.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-12-24T23:27:56Z</updated>
<entry>
<title>net: dccp: fix kernel crash on module load</title>
<updated>2018-12-24T23:27:56Z</updated>
<author>
<name>Peter Oskolkov</name>
<email>posk@google.com</email>
</author>
<published>2018-12-24T20:57:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c92c81df93df95dafbf6926613ce0b436227b007'/>
<id>urn:sha1:c92c81df93df95dafbf6926613ce0b436227b007</id>
<content type='text'>
Patch eedbbb0d98b2 "net: dccp: initialize (addr,port) ..."
added calling to inet_hashinfo2_init() from dccp_init().

However, inet_hashinfo2_init() is marked as __init(), and
thus the kernel panics when dccp is loaded as module. Removing
__init() tag from inet_hashinfo2_init() is not feasible because
it calls into __init functions in mm.

This patch adds inet_hashinfo2_init_mod() function that can
be called after the init phase is done; changes dccp_init() to
call the new function; un-marks inet_hashinfo2_init() as
exported.

Fixes: eedbbb0d98b2 ("net: dccp: initialize (addr,port) ...")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Peter Oskolkov &lt;posk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: dccp: initialize (addr,port) listening hashtable</title>
<updated>2018-12-18T07:11:48Z</updated>
<author>
<name>Peter Oskolkov</name>
<email>posk@google.com</email>
</author>
<published>2018-12-16T23:42:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=eedbbb0d98b2a89250a8bb83d9c71b77881e5247'/>
<id>urn:sha1:eedbbb0d98b2a89250a8bb83d9c71b77881e5247</id>
<content type='text'>
Commit d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
removes port-only listener lookups. This caused segfaults in DCCP
lookups because DCCP did not initialize the (addr,port) hashtable.

This patch adds said initialization.

The only non-trivial issue here is the size of the new hashtable.
It seemed reasonable to make it match the size of the port-only
hashtable (= INET_LHTABLE_SIZE) that was used previously. Other
parameters to inet_hashinfo2_init() match those used in TCP.

V2 changes: marked inet_hashinfo2_init as an exported symbol
so that DCCP compiles when configured as a module.

Tested: syzcaller issues fixed; the second patch in the patchset
        tests that DCCP lookups work correctly.

Fixes: d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
Reported-by: syzcaller &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Peter Oskolkov &lt;posk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: tcp: prefer listeners bound to an address</title>
<updated>2018-12-14T23:55:20Z</updated>
<author>
<name>Peter Oskolkov</name>
<email>posk@google.com</email>
</author>
<published>2018-12-12T21:15:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d9fbc7f6431fc0e5c0ddedf72206d7c5175c5c9a'/>
<id>urn:sha1:d9fbc7f6431fc0e5c0ddedf72206d7c5175c5c9a</id>
<content type='text'>
A relatively common use case is to have several IPs configured
on a host, and have different listeners for each of them. We would
like to add a "catch all" listener on addr_any, to match incoming
connections not served by any of the listeners bound to a specific
address.

However, port-only lookups can match addr_any sockets when sockets
listening on specific addresses are present if so_reuseport flag
is set. This patch eliminates lookups into port-only hashtable,
as lookups by (addr,port) tuple are easily available.

In addition, compute_score() is tweaked to _not_ match
addr_any sockets to specific addresses, as hash collisions
could result in the unwanted behavior described above.

Tested: the patch compiles; full test in the last patch in this
patchset. Existing reuseport_* selftests also pass.

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Peter Oskolkov &lt;posk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ensure unbound stream socket to be chosen when not in a VRF</title>
<updated>2018-11-08T00:12:38Z</updated>
<author>
<name>Mike Manning</name>
<email>mmanning@vyatta.att-mail.com</email>
</author>
<published>2018-11-07T15:36:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e78190581aff7c96fbd6324aa633170934650b65'/>
<id>urn:sha1:e78190581aff7c96fbd6324aa633170934650b65</id>
<content type='text'>
The commit a04a480d4392 ("net: Require exact match for TCP socket
lookups if dif is l3mdev") only ensures that the correct socket is
selected for packets in a VRF. However, there is no guarantee that
the unbound socket will be selected for packets when not in a VRF.
By checking for a device match in compute_score() also for the case
when there is no bound device and attaching a score to this, the
unbound socket is selected. And if a failure is returned when there
is no device match, this ensures that bound sockets are never selected,
even if there is no unbound socket.

Signed-off-by: Mike Manning &lt;mmanning@vyatta.att-mail.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Tested-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: allow binding socket in a VRF when there's an unbound socket</title>
<updated>2018-11-08T00:12:38Z</updated>
<author>
<name>Robert Shearman</name>
<email>rshearma@vyatta.att-mail.com</email>
</author>
<published>2018-11-07T15:36:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3c82a21f4320c8d54cf6456b27c8d49e5ffb722e'/>
<id>urn:sha1:3c82a21f4320c8d54cf6456b27c8d49e5ffb722e</id>
<content type='text'>
Change the inet socket lookup to avoid packets arriving on a device
enslaved to an l3mdev from matching unbound sockets by removing the
wildcard for non sk_bound_dev_if and instead relying on check against
the secondary device index, which will be 0 when the input device is
not enslaved to an l3mdev and so match against an unbound socket and
not match when the input device is enslaved.

Change the socket binding to take the l3mdev into account to allow an
unbound socket to not conflict sockets bound to an l3mdev given the
datapath isolation now guaranteed.

Signed-off-by: Robert Shearman &lt;rshearma@vyatta.att-mail.com&gt;
Signed-off-by: Mike Manning &lt;mmanning@vyatta.att-mail.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Tested-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mm: remove include/linux/bootmem.h</title>
<updated>2018-10-31T15:54:16Z</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.vnet.ibm.com</email>
</author>
<published>2018-10-30T22:09:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=57c8a661d95dff48dd9c2f2496139082bbaf241a'/>
<id>urn:sha1:57c8a661d95dff48dd9c2f2496139082bbaf241a</id>
<content type='text'>
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include &lt;linux/memblock.h&gt;

@@
@@
- #include &lt;linux/bootmem.h&gt;
+ #include &lt;linux/memblock.h&gt;

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: Jonas Bonn &lt;jonas@southpole.se&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Ley Foon Tan &lt;lftan@altera.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Palmer Dabbelt &lt;palmer@sifive.com&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Serge Semin &lt;fancer.lancer@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Enable BPF_PROG_TYPE_SK_REUSEPORT bpf prog in reuseport selection</title>
<updated>2018-08-10T23:58:46Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2018-08-08T08:01:26Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8217ca653ec601246832d562207bc24bdf652d2f'/>
<id>urn:sha1:8217ca653ec601246832d562207bc24bdf652d2f</id>
<content type='text'>
This patch allows a BPF_PROG_TYPE_SK_REUSEPORT bpf prog to select a
SO_REUSEPORT sk from a BPF_MAP_TYPE_REUSEPORT_ARRAY introduced in
the earlier patch.  "bpf_run_sk_reuseport()" will return -ECONNREFUSED
when the BPF_PROG_TYPE_SK_REUSEPORT prog returns SK_DROP.
The callers, in inet[6]_hashtable.c and ipv[46]/udp.c, are modified to
handle this case and return NULL immediately instead of continuing the
sk search from its hashtable.

It re-uses the existing SO_ATTACH_REUSEPORT_EBPF setsockopt to attach
BPF_PROG_TYPE_SK_REUSEPORT.  The "sk_reuseport_attach_bpf()" will check
if the attaching bpf prog is in the new SK_REUSEPORT or the existing
SOCKET_FILTER type and then check different things accordingly.

One level of "__reuseport_attach_prog()" call is removed.  The
"sk_unhashed() &amp;&amp; ..." and "sk-&gt;sk_reuseport_cb" tests are pushed
back to "reuseport_attach_prog()" in sock_reuseport.c.  sock_reuseport.c
seems to have more knowledge on those test requirements than filter.c.
In "reuseport_attach_prog()", after new_prog is attached to reuse-&gt;prog,
the old_prog (if any) is also directly freed instead of returning the
old_prog to the caller and asking the caller to free.

The sysctl_optmem_max check is moved back to the
"sk_reuseport_attach_filter()" and "sk_reuseport_attach_bpf()".
As of other bpf prog types, the new BPF_PROG_TYPE_SK_REUSEPORT is only
bounded by the usual "bpf_prog_charge_memlock()" during load time
instead of bounded by both bpf_prog_charge_memlock and sysctl_optmem_max.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT</title>
<updated>2018-08-10T23:58:46Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2018-08-08T08:01:25Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2dbb9b9e6df67d444fbe425c7f6014858d337adf'/>
<id>urn:sha1:2dbb9b9e6df67d444fbe425c7f6014858d337adf</id>
<content type='text'>
This patch adds a BPF_PROG_TYPE_SK_REUSEPORT which can select
a SO_REUSEPORT sk from a BPF_MAP_TYPE_REUSEPORT_ARRAY.  Like other
non SK_FILTER/CGROUP_SKB program, it requires CAP_SYS_ADMIN.

BPF_PROG_TYPE_SK_REUSEPORT introduces "struct sk_reuseport_kern"
to store the bpf context instead of using the skb-&gt;cb[48].

At the SO_REUSEPORT sk lookup time, it is in the middle of transiting
from a lower layer (ipv4/ipv6) to a upper layer (udp/tcp).  At this
point,  it is not always clear where the bpf context can be appended
in the skb-&gt;cb[48] to avoid saving-and-restoring cb[].  Even putting
aside the difference between ipv4-vs-ipv6 and udp-vs-tcp.  It is not
clear if the lower layer is only ipv4 and ipv6 in the future and
will it not touch the cb[] again before transiting to the upper
layer.

For example, in udp_gro_receive(), it uses the 48 byte NAPI_GRO_CB
instead of IP[6]CB and it may still modify the cb[] after calling
the udp[46]_lib_lookup_skb().  Because of the above reason, if
sk-&gt;cb is used for the bpf ctx, saving-and-restoring is needed
and likely the whole 48 bytes cb[] has to be saved and restored.

Instead of saving, setting and restoring the cb[], this patch opts
to create a new "struct sk_reuseport_kern" and setting the needed
values in there.

The new BPF_PROG_TYPE_SK_REUSEPORT and "struct sk_reuseport_(kern|md)"
will serve all ipv4/ipv6 + udp/tcp combinations.  There is no protocol
specific usage at this point and it is also inline with the current
sock_reuseport.c implementation (i.e. no protocol specific requirement).

In "struct sk_reuseport_md", this patch exposes data/data_end/len
with semantic similar to other existing usages.  Together
with "bpf_skb_load_bytes()" and "bpf_skb_load_bytes_relative()",
the bpf prog can peek anywhere in the skb.  The "bind_inany" tells
the bpf prog that the reuseport group is bind-ed to a local
INANY address which cannot be learned from skb.

The new "bind_inany" is added to "struct sock_reuseport" which will be
used when running the new "BPF_PROG_TYPE_SK_REUSEPORT" bpf prog in order
to avoid repeating the "bind INANY" test on
"sk_v6_rcv_saddr/sk-&gt;sk_rcv_saddr" every time a bpf prog is run.  It can
only be properly initialized when a "sk-&gt;sk_reuseport" enabled sk is
adding to a hashtable (i.e. during "reuseport_alloc()" and
"reuseport_add_sock()").

The new "sk_select_reuseport()" is the main helper that the
bpf prog will use to select a SO_REUSEPORT sk.  It is the only function
that can use the new BPF_MAP_TYPE_REUSEPORT_ARRAY.  As mentioned in
the earlier patch, the validity of a selected sk is checked in
run time in "sk_select_reuseport()".  Doing the check in
verification time is difficult and inflexible (consider the map-in-map
use case).  The runtime check is to compare the selected sk's reuseport_id
with the reuseport_id that we want.  This helper will return -EXXX if the
selected sk cannot serve the incoming request (e.g. reuseport_id
not match).  The bpf prog can decide if it wants to do SK_DROP as its
discretion.

When the bpf prog returns SK_PASS, the kernel will check if a
valid sk has been selected (i.e. "reuse_kern-&gt;selected_sk != NULL").
If it does , it will use the selected sk.  If not, the kernel
will select one from "reuse-&gt;socks[]" (as before this patch).

The SK_DROP and SK_PASS handling logic will be in the next patch.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>net/tcp: Fix socket lookups with SO_BINDTODEVICE</title>
<updated>2018-06-19T23:03:06Z</updated>
<author>
<name>David Ahern</name>
<email>dsahern@gmail.com</email>
</author>
<published>2018-06-18T19:30:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8c43bd1706885ba1acfa88da02bc60a2ec16f68c'/>
<id>urn:sha1:8c43bd1706885ba1acfa88da02bc60a2ec16f68c</id>
<content type='text'>
Similar to 69678bcd4d2d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups
need to fail if dev_match is not true. Currently, a packet to a given port
can match a socket bound to device when it should not. In the VRF case,
this causes the lookup to hit a VRF socket and not a global socket
resulting in a response trying to go through the VRF when it should not.

Fixes: 3fa6f616a7a4d ("net: ipv4: add second dif to inet socket lookups")
Fixes: 4297a0ef08572 ("net: ipv6: add second dif to inet6 socket lookups")
Reported-by: Lou Berger &lt;lberger@labn.net&gt;
Diagnosed-by: Renato Westphal &lt;renato@opensourcerouting.org&gt;
Tested-by: Renato Westphal &lt;renato@opensourcerouting.org&gt;
Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: Avoid unitialized variable warning in inet_unhash()</title>
<updated>2018-02-01T14:48:42Z</updated>
<author>
<name>Geert Uytterhoeven</name>
<email>geert@linux-m68k.org</email>
</author>
<published>2018-02-01T10:26:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0ba987181028ab41cdc68fa91b74c98d97b93ff3'/>
<id>urn:sha1:0ba987181028ab41cdc68fa91b74c98d97b93ff3</id>
<content type='text'>
With gcc-4.1.2:

    net/ipv4/inet_hashtables.c: In function ‘inet_unhash’:
    net/ipv4/inet_hashtables.c:628: warning: ‘ilb’ may be used uninitialized in this function

While this is a false positive, it can easily be avoided by using the
pointer itself as the canary variable.

Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
