<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/net/netns/ipv4.h, branch linux-4.1.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.1.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2015-03-13T05:57:07Z</updated>
<entry>
<title>tcp_metrics: Use a single hash table for all network namespaces.</title>
<updated>2015-03-13T05:57:07Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-03-13T05:07:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=098a697b497e3154a1a583c1d34c67568acaadcc'/>
<id>urn:sha1:098a697b497e3154a1a583c1d34c67568acaadcc</id>
<content type='text'>
Now that all of the operations are safe on a single hash table
accross network namespaces, allocate a single global hash table
and update the code to use it.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Create probe timer for tcp PMTU as per RFC4821</title>
<updated>2015-03-06T19:57:42Z</updated>
<author>
<name>Fan Du</name>
<email>fan.du@intel.com</email>
</author>
<published>2015-03-06T03:18:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=05cbc0db03e82128f2e7e353d4194dd24a1627fe'/>
<id>urn:sha1:05cbc0db03e82128f2e7e353d4194dd24a1627fe</id>
<content type='text'>
As per RFC4821 7.3.  Selecting Probe Size, a probe timer should
be armed once probing has converged. Once this timer expired,
probing again to take advantage of any path PMTU change. The
recommended probing interval is 10 minutes per RFC1981. Probing
interval could be sysctled by sysctl_tcp_probe_interval.

Eric Dumazet suggested to implement pseudo timer based on 32bits
jiffies tcp_time_stamp instead of using classic timer for such
rare event.

Signed-off-by: Fan Du &lt;fan.du@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Use binary search to choose tcp PMTU probe_size</title>
<updated>2015-03-06T19:57:41Z</updated>
<author>
<name>Fan Du</name>
<email>fan.du@intel.com</email>
</author>
<published>2015-03-06T03:18:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6b58e0a5f32dedb609438bb9c9c82aa6e23381f2'/>
<id>urn:sha1:6b58e0a5f32dedb609438bb9c9c82aa6e23381f2</id>
<content type='text'>
Current probe_size is chosen by doubling mss_cache,
the probing process will end shortly with a sub-optimal
mss size, and the link mtu will not be taken full
advantage of, in return, this will make user to tweak
tcp_base_mss with care.

Use binary search to choose probe_size in a fine
granularity manner, an optimal mss will be found
to boost performance as its maxmium.

In addition, introduce a sysctl_tcp_probe_threshold
to control when probing will stop in respect to
the width of search range.

Test env:
Docker instance with vxlan encapuslation(82599EB)
iperf -c 10.0.0.24  -t 60

before this patch:
1.26 Gbits/sec

After this patch: increase 26%
1.59 Gbits/sec

Signed-off-by: Fan Du &lt;fan.du@intel.com&gt;
Acked-by: John Heffner &lt;johnwheffner@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: add net bool fib_offload_disabled</title>
<updated>2015-03-06T05:24:58Z</updated>
<author>
<name>Scott Feldman</name>
<email>sfeldma@gmail.com</email>
</author>
<published>2015-03-06T05:21:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=448b128a14501543748514a4f9adedd3c0da2e85'/>
<id>urn:sha1:448b128a14501543748514a4f9adedd3c0da2e85</id>
<content type='text'>
If something goes wrong with IPv4 FIB offload, mark entire net offload
disabled.  This is brute force policy to basically shut down IPv4 FIB offload
permanently if there is a problem offloading any route to an external device.
We can refine the policy in the future, to handle failures on a per-device or
per-route basis, but for now, this policy is per-net.

What we're trying to avoid is an inconsistent split between the kernel's FIB
and the offload device's FIB.  We don't want the device to fwd a pkt
inconsitent with what the kernel would do.  An example of a split is if device
has 10.0.0.0/16 and kernel has 10.0.0.0/16 and 10.0.0.0/24, the device wouldn't
see the longest prefix 10.0.0.0/24 and potentially forward pkts incorrectly.

Limited capacity or limited capability are two ways a route may fail to install
to the offload device.  We'll not differentiate between failures at this time,
and treat any failure as fatal and mark the net as fib_offload_disabled.

Signed-off-by: Scott Feldman &lt;sfeldma@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>fib_trie: Make fib_table rcu safe</title>
<updated>2015-03-05T04:35:18Z</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@redhat.com</email>
</author>
<published>2015-03-04T23:02:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a7e53531234dc206bb75abb5305a72665dd4d75d'/>
<id>urn:sha1:a7e53531234dc206bb75abb5305a72665dd4d75d</id>
<content type='text'>
The fib_table was wrapped in several places with an
rcu_read_lock/rcu_read_unlock however after looking over the code I found
several spots where the tables were being accessed as just standard
pointers without any protections.  This change fixes that so that all of
the proper protections are in place when accessing the table to take RCU
replacement or removal of the table into account.

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>multicast: Extend ip address command to enable multicast group join/leave on</title>
<updated>2015-02-27T21:25:25Z</updated>
<author>
<name>Madhu Challa</name>
<email>challa@noironetworks.com</email>
</author>
<published>2015-02-25T17:58:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=93a714d6b53d87872e552dbb273544bdeaaf6e12'/>
<id>urn:sha1:93a714d6b53d87872e552dbb273544bdeaaf6e12</id>
<content type='text'>
Joining multicast group on ethernet level via "ip maddr" command would
not work if we have an Ethernet switch that does igmp snooping since
the switch would not replicate multicast packets on ports that did not
have IGMP reports for the multicast addresses.

Linux vxlan interfaces created via "ip link add vxlan" have the group option
that enables then to do the required join.

By extending ip address command with option "autojoin" we can get similar
functionality for openvswitch vxlan interfaces as well as other tunneling
mechanisms that need to receive multicast traffic. The kernel code is
structured similar to how the vxlan driver does a group join / leave.

example:
ip address add 224.1.1.10/24 dev eth5 autojoin
ip address del 224.1.1.10/24 dev eth5

Signed-off-by: Madhu Challa &lt;challa@noironetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Namespecify TCP PMTU mechanism</title>
<updated>2015-02-10T02:45:00Z</updated>
<author>
<name>Fan Du</name>
<email>fan.du@intel.com</email>
</author>
<published>2015-02-10T01:53:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b0f9ca53cbb103e9240a29a974e0b6085e58f9f7'/>
<id>urn:sha1:b0f9ca53cbb103e9240a29a974e0b6085e58f9f7</id>
<content type='text'>
Packetization Layer Path MTU Discovery works separately beside
Path MTU Discovery at IP level, different net namespace has
various requirements on which one to chose, e.g., a virutalized
container instance would require TCP PMTU to probe an usable
effective mtu for underlying tunnel, while the host would
employ classical ICMP based PMTU to function.

Hence making TCP PMTU mechanism per net namespace to decouple
two functionality. Furthermore the probe base MSS should also
be configured separately for each namespace.

Signed-off-by: Fan Du &lt;fan.du@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-02-05T22:33:28Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-02-05T22:33:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6e03f896b52cd2ca88942170c5c9c407ec0ede69'/>
<id>urn:sha1:6e03f896b52cd2ca88942170c5c9c407ec0ede69</id>
<content type='text'>
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: tcp: get rid of ugly unicast_sock</title>
<updated>2015-02-02T07:06:19Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-01-30T05:35:05Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bdbbb8527b6f6a358dbcb70dac247034d665b8e4'/>
<id>urn:sha1:bdbbb8527b6f6a358dbcb70dac247034d665b8e4</id>
<content type='text'>
In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock")
I tried to address contention on a socket lock, but the solution
I chose was horrible :

commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside
of TCP stack") addressed a selinux regression.

commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1")
took care of another regression.

commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression.

commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate")
was another shot in the dark.

Really, just use a proper socket per cpu, and remove the skb_orphan()
call, to re-enable flow control.

This solves a serious problem with FQ packet scheduler when used in
hostile environments, as we do not want to allocate a flow structure
for every RST packet sent in response to a spoofed packet.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: icmp: use percpu allocation</title>
<updated>2015-02-01T01:48:18Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-01-29T23:58:09Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=349c9e3c7341bbab6efbea39acfadeba9ab19f61'/>
<id>urn:sha1:349c9e3c7341bbab6efbea39acfadeba9ab19f61</id>
<content type='text'>
Get rid of nr_cpu_ids and use modern percpu allocation.

Note that the sockets themselves are not yet allocated
using NUMA affinity.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
