<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/net/inet_hashtables.h, branch linux-5.11.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.11.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-5.11.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2020-11-24T00:32:33Z</updated>
<entry>
<title>tcp: fix race condition when creating child sockets from syncookies</title>
<updated>2020-11-24T00:32:33Z</updated>
<author>
<name>Ricardo Dias</name>
<email>rdias@singlestore.com</email>
</author>
<published>2020-11-20T11:11:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=01770a166165738a6e05c3d911fb4609cc4eb416'/>
<id>urn:sha1:01770a166165738a6e05c3d911fb4609cc4eb416</id>
<content type='text'>
When the TCP stack is in SYN flood mode, the server child socket is
created from the SYN cookie received in a TCP packet with the ACK flag
set.

The child socket is created when the server receives the first TCP
packet with a valid SYN cookie from the client. Usually, this packet
corresponds to the final step of the TCP 3-way handshake, the ACK
packet. But is also possible to receive a valid SYN cookie from the
first TCP data packet sent by the client, and thus create a child socket
from that SYN cookie.

Since a client socket is ready to send data as soon as it receives the
SYN+ACK packet from the server, the client can send the ACK packet (sent
by the TCP stack code), and the first data packet (sent by the userspace
program) almost at the same time, and thus the server will equally
receive the two TCP packets with valid SYN cookies almost at the same
instant.

When such event happens, the TCP stack code has a race condition that
occurs between the momement a lookup is done to the established
connections hashtable to check for the existence of a connection for the
same client, and the moment that the child socket is added to the
established connections hashtable. As a consequence, this race condition
can lead to a situation where we add two child sockets to the
established connections hashtable and deliver two sockets to the
userspace program to the same client.

This patch fixes the race condition by checking if an existing child
socket exists for the same client when we are adding the second child
socket to the established connections socket. If an existing child
socket exists, we drop the packet and discard the second child socket
to the same client.

Signed-off-by: Ricardo Dias &lt;rdias@singlestore.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20201120111133.GA67501@rdias-suse-pc.lan
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>dccp: Fix possible memleak in dccp_init and dccp_fini</title>
<updated>2020-06-09T20:26:23Z</updated>
<author>
<name>Wang Hai</name>
<email>wanghai38@huawei.com</email>
</author>
<published>2020-06-09T14:18:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c96b6acc8f89a4a7f6258dfe1d077654c11415be'/>
<id>urn:sha1:c96b6acc8f89a4a7f6258dfe1d077654c11415be</id>
<content type='text'>
There are some memory leaks in dccp_init() and dccp_fini().

In dccp_fini() and the error handling path in dccp_init(), free lhash2
is missing. Add inet_hashinfo2_free_mod() to do it.

If inet_hashinfo2_init_mod() failed in dccp_init(),
percpu_counter_destroy() should be called to destroy dccp_orphan_count.
It need to goto out_free_percpu when inet_hashinfo2_init_mod() failed.

Fixes: c92c81df93df ("net: dccp: fix kernel crash on module load")
Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Signed-off-by: Wang Hai &lt;wanghai38@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Track socket refcounts in skb_steal_sock()</title>
<updated>2020-03-30T20:45:04Z</updated>
<author>
<name>Joe Stringer</name>
<email>joe@wand.net.nz</email>
</author>
<published>2020-03-29T22:53:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=71489e21d720a09388b565d60ef87ae993c10528'/>
<id>urn:sha1:71489e21d720a09388b565d60ef87ae993c10528</id>
<content type='text'>
Refactor the UDP/TCP handlers slightly to allow skb_steal_sock() to make
the determination of whether the socket is reference counted in the case
where it is prefetched by earlier logic such as early_demux.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Link: https://lore.kernel.org/bpf/20200329225342.16317-3-joe@wand.net.nz
</content>
</entry>
<entry>
<title>tcp/dccp: fix possible race __inet_lookup_established()</title>
<updated>2019-12-14T05:40:49Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-12-14T02:20:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8dbd76e79a16b45b2ccb01d2f2e08dbf64e71e40'/>
<id>urn:sha1:8dbd76e79a16b45b2ccb01d2f2e08dbf64e71e40</id>
<content type='text'>
Michal Kubecek and Firo Yang did a very nice analysis of crashes
happening in __inet_lookup_established().

Since a TCP socket can go from TCP_ESTABLISH to TCP_LISTEN
(via a close()/socket()/listen() cycle) without a RCU grace period,
I should not have changed listeners linkage in their hash table.

They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt),
so that a lookup can detect a socket in a hash list was moved in
another one.

Since we added code in commit d296ba60d8e2 ("soreuseport: Resolve
merge conflict for v4/v6 ordering fix"), we have to add
hlist_nulls_add_tail_rcu() helper.

Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Reported-by: Firo Yang &lt;firo.yang@suse.com&gt;
Reviewed-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
</content>
</entry>
<entry>
<title>treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152</title>
<updated>2019-05-30T18:26:32Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-27T06:55:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2874c5fd284268364ece81a7bd936f3c8168e567'/>
<id>urn:sha1:2874c5fd284268364ece81a7bd936f3c8168e567</id>
<content type='text'>
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Allison Randal &lt;allison@lohutok.net&gt;
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: dccp: fix kernel crash on module load</title>
<updated>2018-12-24T23:27:56Z</updated>
<author>
<name>Peter Oskolkov</name>
<email>posk@google.com</email>
</author>
<published>2018-12-24T20:57:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c92c81df93df95dafbf6926613ce0b436227b007'/>
<id>urn:sha1:c92c81df93df95dafbf6926613ce0b436227b007</id>
<content type='text'>
Patch eedbbb0d98b2 "net: dccp: initialize (addr,port) ..."
added calling to inet_hashinfo2_init() from dccp_init().

However, inet_hashinfo2_init() is marked as __init(), and
thus the kernel panics when dccp is loaded as module. Removing
__init() tag from inet_hashinfo2_init() is not feasible because
it calls into __init functions in mm.

This patch adds inet_hashinfo2_init_mod() function that can
be called after the init phase is done; changes dccp_init() to
call the new function; un-marks inet_hashinfo2_init() as
exported.

Fixes: eedbbb0d98b2 ("net: dccp: initialize (addr,port) ...")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Peter Oskolkov &lt;posk@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ensure unbound stream socket to be chosen when not in a VRF</title>
<updated>2018-11-08T00:12:38Z</updated>
<author>
<name>Mike Manning</name>
<email>mmanning@vyatta.att-mail.com</email>
</author>
<published>2018-11-07T15:36:03Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e78190581aff7c96fbd6324aa633170934650b65'/>
<id>urn:sha1:e78190581aff7c96fbd6324aa633170934650b65</id>
<content type='text'>
The commit a04a480d4392 ("net: Require exact match for TCP socket
lookups if dif is l3mdev") only ensures that the correct socket is
selected for packets in a VRF. However, there is no guarantee that
the unbound socket will be selected for packets when not in a VRF.
By checking for a device match in compute_score() also for the case
when there is no bound device and attaching a score to this, the
unbound socket is selected. And if a failure is returned when there
is no device match, this ensures that bound sockets are never selected,
even if there is no unbound socket.

Signed-off-by: Mike Manning &lt;mmanning@vyatta.att-mail.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Tested-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: allow binding socket in a VRF when there's an unbound socket</title>
<updated>2018-11-08T00:12:38Z</updated>
<author>
<name>Robert Shearman</name>
<email>rshearma@vyatta.att-mail.com</email>
</author>
<published>2018-11-07T15:36:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=3c82a21f4320c8d54cf6456b27c8d49e5ffb722e'/>
<id>urn:sha1:3c82a21f4320c8d54cf6456b27c8d49e5ffb722e</id>
<content type='text'>
Change the inet socket lookup to avoid packets arriving on a device
enslaved to an l3mdev from matching unbound sockets by removing the
wildcard for non sk_bound_dev_if and instead relying on check against
the secondary device index, which will be 0 when the input device is
not enslaved to an l3mdev and so match against an unbound socket and
not match when the input device is enslaved.

Change the socket binding to take the l3mdev into account to allow an
unbound socket to not conflict sockets bound to an l3mdev given the
datapath isolation now guaranteed.

Signed-off-by: Robert Shearman &lt;rshearma@vyatta.att-mail.com&gt;
Signed-off-by: Mike Manning &lt;mmanning@vyatta.att-mail.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Tested-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: Add a 2nd listener hashtable (port+addr)</title>
<updated>2017-12-03T15:18:28Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2017-12-01T20:52:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=61b7c691c7317529375f90f0a81a331990b1ec1b'/>
<id>urn:sha1:61b7c691c7317529375f90f0a81a331990b1ec1b</id>
<content type='text'>
The current listener hashtable is hashed by port only.
When a process is listening at many IP addresses with the same port (e.g.
[IP1]:443, [IP2]:443... [IPN]:443), the inet[6]_lookup_listener()
performance is degraded to a link list.  It is prone to syn attack.

UDP had a similar issue and a second hashtable was added to resolve it.

This patch adds a second hashtable for the listener's sockets.
The second hashtable is hashed by port and address.

It cannot reuse the existing skc_portaddr_node which is shared
with skc_bind_node.  TCP listener needs to use skc_bind_node.
Instead, this patch adds a hlist_node 'icsk_listen_portaddr_node' to
the inet_connection_sock which the listener (like TCP) also belongs to.

The new portaddr hashtable may need two lookup (First by IP:PORT.
Second by INADDR_ANY:PORT if the IP:PORT is a not found).   Hence,
it implements a similar cut off as UDP such that it will only consult the
new portaddr hashtable if the current port-only hashtable has &gt;10
sk in the link-list.

lhash2 and lhash2_mask are added to 'struct inet_hashinfo'.  I take
this chance to plug a 4 bytes hole.  It is done by first moving
the existing bind_bucket_cachep up and then add the new
(int lhash2_mask, *lhash2) after the existing bhash_size.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: Add a count to struct inet_listen_hashbucket</title>
<updated>2017-12-03T15:18:27Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2017-12-01T20:52:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=76d013b20ba9a5f88eee7c90ac82cbc3ee64be18'/>
<id>urn:sha1:76d013b20ba9a5f88eee7c90ac82cbc3ee64be18</id>
<content type='text'>
This patch adds a count to the 'struct inet_listen_hashbucket'.
It counts how many sk is hashed to a bucket.  It will be
used to decide if the (to-be-added) portaddr listener's hashtable
should be used during inet[6]_lookup_listener().

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
