<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/net/tcp.h, branch linux-3.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2014-08-14T21:38:54Z</updated>
<entry>
<title>tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic</title>
<updated>2014-08-14T21:38:54Z</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2014-08-14T20:06:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a26552afe89438eefbe097512b3187f2c7e929fd'/>
<id>urn:sha1:a26552afe89438eefbe097512b3187f2c7e929fd</id>
<content type='text'>
tcp_tw_recycle heavily relies on tcp timestamps to build a per-host
ordering of incoming connections and teardowns without the need to
hold state on a specific quadruple for TCP_TIMEWAIT_LEN, but only for
the last measured RTO. To do so, we keep the last seen timestamp in a
per-host indexed data structure and verify if the incoming timestamp
in a connection request is strictly greater than the saved one during
last connection teardown. Thus we can verify later on that no old data
packets will be accepted by the new connection.

During moving a socket to time-wait state we already verify if timestamps
where seen on a connection. Only if that was the case we let the
time-wait socket expire after the RTO, otherwise normal TCP_TIMEWAIT_LEN
will be used. But we don't verify this on incoming SYN packets. If a
connection teardown was less than TCP_PAWS_MSL seconds in the past we
cannot guarantee to not accept data packets from an old connection if
no timestamps are present. We should drop this SYN packet. This patch
closes this loophole.

Please note, this patch does not make tcp_tw_recycle in any way more
usable but only adds another safety check:
Sporadic drops of SYN packets because of reordering in the network or
in the socket backlog queues can happen. Users behing NAT trying to
connect to a tcp_tw_recycle enabled server can get caught in blackholes
and their connection requests may regullary get dropped because hosts
behind an address translator don't have synchronized tcp timestamp clocks.
tcp_tw_recycle cannot work if peers don't have tcp timestamps enabled.

In general, use of tcp_tw_recycle is disadvised.

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()</title>
<updated>2014-08-14T21:38:54Z</updated>
<author>
<name>Neal Cardwell</name>
<email>ncardwell@google.com</email>
</author>
<published>2014-08-14T16:40:05Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4fab9071950c2021d846e18351e0f46a1cffd67b'/>
<id>urn:sha1:4fab9071950c2021d846e18351e0f46a1cffd67b</id>
<content type='text'>
Make sure we use the correct address-family-specific function for
handling MTU reductions from within tcp_release_cb().

Previously AF_INET6 sockets were incorrectly always using the IPv6
code path when sometimes they were handling IPv4 traffic and thus had
an IPv4 dst.

Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Diagnosed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Fixes: 563d34d057862 ("tcp: dont drop MTU reduction indications")
Reviewed-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: don't use timestamp from repaired skb-s to calculate RTT (v2)</title>
<updated>2014-08-14T21:38:54Z</updated>
<author>
<name>Andrey Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2014-08-13T12:03:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9d186cac7ffb1831e9f34cb4a3a8b22abb9dd9d4'/>
<id>urn:sha1:9d186cac7ffb1831e9f34cb4a3a8b22abb9dd9d4</id>
<content type='text'>
We don't know right timestamp for repaired skb-s. Wrong RTT estimations
isn't good, because some congestion modules heavily depends on it.

This patch adds the TCPCB_REPAIRED flag, which is included in
TCPCB_RETRANS.

Thanks to Eric for the advice how to fix this issue.

This patch fixes the warning:
[  879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
[  879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
[  879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  879.568177]  0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
[  879.568776]  0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
[  879.569386]  ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
[  879.569982] Call Trace:
[  879.570264]  [&lt;ffffffff817aa2d2&gt;] dump_stack+0x4d/0x66
[  879.570599]  [&lt;ffffffff8109afbd&gt;] warn_slowpath_common+0x7d/0xa0
[  879.570935]  [&lt;ffffffff8109b0ea&gt;] warn_slowpath_null+0x1a/0x20
[  879.571292]  [&lt;ffffffff816d0a05&gt;] tcp_ack+0x11f5/0x1380
[  879.571614]  [&lt;ffffffff816d10bd&gt;] tcp_rcv_established+0x1ed/0x710
[  879.571958]  [&lt;ffffffff816dc9da&gt;] tcp_v4_do_rcv+0x10a/0x370
[  879.572315]  [&lt;ffffffff81657459&gt;] release_sock+0x89/0x1d0
[  879.572642]  [&lt;ffffffff816c81a0&gt;] do_tcp_setsockopt.isra.36+0x120/0x860
[  879.573000]  [&lt;ffffffff8110a52e&gt;] ? rcu_read_lock_held+0x6e/0x80
[  879.573352]  [&lt;ffffffff816c8912&gt;] tcp_setsockopt+0x32/0x40
[  879.573678]  [&lt;ffffffff81654ac4&gt;] sock_common_setsockopt+0x14/0x20
[  879.574031]  [&lt;ffffffff816537b0&gt;] SyS_setsockopt+0x80/0xf0
[  879.574393]  [&lt;ffffffff817b40a9&gt;] system_call_fastpath+0x16/0x1b
[  879.574730] ---[ end trace a17cbc38eb8c5c00 ]---

v2: moving setting of skb-&gt;when for repaired skb-s in tcp_write_xmit,
    where it's set for other skb-s.

Fixes: 431a91242d8d ("tcp: timestamp SYN+DATA messages")
Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution")
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrey Vagin &lt;avagin@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: reduce spurious retransmits due to transient SACK reneging</title>
<updated>2014-08-05T23:29:33Z</updated>
<author>
<name>Neal Cardwell</name>
<email>ncardwell@google.com</email>
</author>
<published>2014-08-04T23:12:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5ae344c949e79b8545a11db149f0a85a6e59e1f3'/>
<id>urn:sha1:5ae344c949e79b8545a11db149f0a85a6e59e1f3</id>
<content type='text'>
This commit reduces spurious retransmits due to apparent SACK reneging
by only reacting to SACK reneging that persists for a short delay.

When a sequence space hole at snd_una is filled, some TCP receivers
send a series of ACKs as they apparently scan their out-of-order queue
and cumulatively ACK all the packets that have now been consecutiveyly
received. This is essentially misbehavior B in "Misbehaviors in TCP
SACK generation" ACM SIGCOMM Computer Communication Review, April
2011, so we suspect that this is from several common OSes (Windows
2000, Windows Server 2003, Windows XP). However, this issue has also
been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
into spurious retransmissions by lack of timestamps?" from March 2014,
where the receiver was thought to be a BSD box.

Since snd_una would temporarily be adjacent to a previously SACKed
range in these scenarios, this receiver behavior triggered the Linux
SACK reneging code path in the sender. This led the sender to clear
the SACK scoreboard, enter CA_Loss, and spuriously retransmit
(potentially) every packet from the entire write queue at line rate
just a few milliseconds before the ACK for each packet arrives at the
sender.

To avoid such situations, now when a sender sees apparent reneging it
does not yet retransmit, but rather adjusts the RTO timer to give the
receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
that will restore sanity to the SACK scoreboard. If the reneging
persists until this RTO then, as before, we clear the SACK scoreboard
and enter CA_Loss.

A 10ms delay tolerates a receiver sending such a stream of ACKs at
56Kbit/sec. And to allow for receivers with slower or more congested
paths, we wait for at least RTT/2.

We validated the resulting max(RTT/2, 10ms) delay formula with a mix
of North American and South American Google web server traffic, and
found that for ACKs displaying transient reneging:

 (1) 90% of inter-ACK delays were less than 10ms
 (2) 99% of inter-ACK delays were less than RTT/2

In tests on Google web servers this commit reduced reneging events by
75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
any measurable impact on latency for user HTTP and SPDY requests.

Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: Remove unnecessary arg from tcp_enter_cwr and tcp_init_cwnd_reduction</title>
<updated>2014-07-15T23:19:36Z</updated>
<author>
<name>Christoph Paasch</name>
<email>christoph.paasch@uclouvain.be</email>
</author>
<published>2014-07-14T14:58:32Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5ee2c941b5969eb1b5592f9731b3ee76a784641f'/>
<id>urn:sha1:5ee2c941b5969eb1b5592f9731b3ee76a784641f</id>
<content type='text'>
Since Yuchung's 9b44190dc11 (tcp: refactor F-RTO), tcp_enter_cwr is always
called with set_ssthresh = 1. Thus, we can remove this argument from
tcp_enter_cwr. Further, as we remove this one, tcp_init_cwnd_reduction
is then always called with set_ssthresh = true, and so we can get rid of
this argument as well.

Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Christoph Paasch &lt;christoph.paasch@uclouvain.be&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: switch snt_synack back to measuring transmit time of first SYNACK</title>
<updated>2014-07-08T02:26:37Z</updated>
<author>
<name>Neal Cardwell</name>
<email>ncardwell@google.com</email>
</author>
<published>2014-06-30T19:09:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=86c6a2c75ab97fe31844985169e26aea335432f9'/>
<id>urn:sha1:86c6a2c75ab97fe31844985169e26aea335432f9</id>
<content type='text'>
Always store in snt_synack the time at which the server received the
first client SYN and attempted to send the first SYNACK.

Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack
initialization") resolved an inconsistency between IPv4 and IPv6 in
the initialization of snt_synack. This commit brings back the idea
from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which
was going for the original behavior of snt_synack from the commit
where it was added in 9ad7c049f0f79 ("tcp: RFC2988bis + taking RTT
sample from 3WHS for the passive open side") in v3.1.

In addition to being simpler (and probably a tiny bit faster),
unconditionally storing the time of the first SYNACK attempt has been
useful because it allows calculating a performance metric quantifying
how long it took to establish a passive TCP connection.

Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Cc: Jerry Chu &lt;hkchu@google.com&gt;
Acked-by: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: add tcp_conn_request</title>
<updated>2014-06-27T22:53:37Z</updated>
<author>
<name>Octavian Purdila</name>
<email>octavian.purdila@intel.com</email>
</author>
<published>2014-06-25T14:10:02Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1fb6f159fd21c640a28eb65fbd62ce8c9f6a777e'/>
<id>urn:sha1:1fb6f159fd21c640a28eb65fbd62ce8c9f6a777e</id>
<content type='text'>
Create tcp_conn_request and remove most of the code from
tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: add queue_add_hash to tcp_request_sock_ops</title>
<updated>2014-06-27T22:53:36Z</updated>
<author>
<name>Octavian Purdila</name>
<email>octavian.purdila@intel.com</email>
</author>
<published>2014-06-25T14:10:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=695da14eb0af21129187ed3810e329b21262e45f'/>
<id>urn:sha1:695da14eb0af21129187ed3810e329b21262e45f</id>
<content type='text'>
Add queue_add_hash member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: add mss_clamp to tcp_request_sock_ops</title>
<updated>2014-06-27T22:53:36Z</updated>
<author>
<name>Octavian Purdila</name>
<email>octavian.purdila@intel.com</email>
</author>
<published>2014-06-25T14:10:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2aec4a297b21f3690486bbf8f7d5d29281ba6a48'/>
<id>urn:sha1:2aec4a297b21f3690486bbf8f7d5d29281ba6a48</id>
<content type='text'>
Add mss_clamp member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack</title>
<updated>2014-06-27T22:53:36Z</updated>
<author>
<name>Octavian Purdila</name>
<email>octavian.purdila@intel.com</email>
</author>
<published>2014-06-25T14:09:59Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5db92c994982ed826cf38f38d58bd09bc326aef6'/>
<id>urn:sha1:5db92c994982ed826cf38f38d58bd09bc326aef6</id>
<content type='text'>
Signed-off-by: Octavian Purdila &lt;octavian.purdila@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
