<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/net/cls_cgroup.h, branch linux-6.18.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.18.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.18.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2025-09-14T18:55:04Z</updated>
<entry>
<title>net/cls_cgroup: Fix task_get_classid() during qdisc run</title>
<updated>2025-09-14T18:55:04Z</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2025-09-02T06:29:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=66048f8b3cc7e462953c04285183cdee43a1cb89'/>
<id>urn:sha1:66048f8b3cc7e462953c04285183cdee43a1cb89</id>
<content type='text'>
During recent testing with the netem qdisc to inject delays into TCP
traffic, we observed that our CLS BPF program failed to function correctly
due to incorrect classid retrieval from task_get_classid(). The issue
manifests in the following call stack:

        bpf_get_cgroup_classid+5
        cls_bpf_classify+507
        __tcf_classify+90
        tcf_classify+217
        __dev_queue_xmit+798
        bond_dev_queue_xmit+43
        __bond_start_xmit+211
        bond_start_xmit+70
        dev_hard_start_xmit+142
        sch_direct_xmit+161
        __qdisc_run+102             &lt;&lt;&lt;&lt;&lt; Issue location
        __dev_xmit_skb+1015
        __dev_queue_xmit+637
        neigh_hh_output+159
        ip_finish_output2+461
        __ip_finish_output+183
        ip_finish_output+41
        ip_output+120
        ip_local_out+94
        __ip_queue_xmit+394
        ip_queue_xmit+21
        __tcp_transmit_skb+2169
        tcp_write_xmit+959
        __tcp_push_pending_frames+55
        tcp_push+264
        tcp_sendmsg_locked+661
        tcp_sendmsg+45
        inet_sendmsg+67
        sock_sendmsg+98
        sock_write_iter+147
        vfs_write+786
        ksys_write+181
        __x64_sys_write+25
        do_syscall_64+56
        entry_SYSCALL_64_after_hwframe+100

The problem occurs when multiple tasks share a single qdisc. In such cases,
__qdisc_run() may transmit skbs created by different tasks. Consequently,
task_get_classid() retrieves an incorrect classid since it references the
current task's context rather than the skb's originating task.

Given that dev_queue_xmit() always executes with bh disabled, we can use
softirq_count() instead to obtain the correct classid.

The simple steps to reproduce this issue:
1. Add network delay to the network interface:
  such as: tc qdisc add dev bond0 root netem delay 1.5ms
2. Build two distinct net_cls cgroups, each with a network-intensive task
3. Initiate parallel TCP streams from both tasks to external servers.

Under this specific condition, the issue reliably occurs. The kernel
eventually dequeues an SKB that originated from Task-A while executing in
the context of Task-B.

It is worth noting that it will change the established behavior for a
slightly different scenario:

  &lt;sock S is created by task A&gt;
  &lt;class ID for task A is changed&gt;
  &lt;skb is created by sock S xmit and classified&gt;

prior to this patch the skb will be classified with the 'new' task A
classid, now with the old/original one. The bpf_get_cgroup_classid_curr()
function is a more appropriate choice for this case.

Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://patch.msgid.link/20250902062933.30087-1-laoar.shao@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Allow to retrieve cgroup v1 classid from v2 hooks</title>
<updated>2020-03-28T02:40:38Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2020-03-27T15:58:53Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5a52ae4e32a61ad06ef67f0b3123adbdbac4fb83'/>
<id>urn:sha1:5a52ae4e32a61ad06ef67f0b3123adbdbac4fb83</id>
<content type='text'>
Today, Kubernetes is still operating on cgroups v1, however, it is
possible to retrieve the task's classid based on 'current' out of
connect(), sendmsg(), recvmsg() and bind-related hooks for orchestrators
which attach to the root cgroup v2 hook in a mixed env like in case
of Cilium, for example, in order to then correlate certain pod traffic
and use it as part of the key for BPF map lookups.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/555e1c69db7376c0947007b4951c260e1074efc3.1585323121.git.daniel@iogearbox.net
</content>
</entry>
<entry>
<title>treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152</title>
<updated>2019-05-30T18:26:32Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-27T06:55:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2874c5fd284268364ece81a7bd936f3c8168e567'/>
<id>urn:sha1:2874c5fd284268364ece81a7bd936f3c8168e567</id>
<content type='text'>
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Allison Randal &lt;allison@lohutok.net&gt;
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cls_cgroup: get sk_classid only from full sockets</title>
<updated>2016-04-20T00:09:25Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2016-04-18T11:37:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2309236c13feae619572324efd3e910e66ef6bd0'/>
<id>urn:sha1:2309236c13feae619572324efd3e910e66ef6bd0</id>
<content type='text'>
skb-&gt;sk could point to timewait or request socket which has no sk_classid.
Detected as "BUG: KASAN: slab-out-of-bounds in cls_cgroup_classify".

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: wrap sock-&gt;sk_cgrp_prioidx and -&gt;sk_classid inside a struct</title>
<updated>2015-12-09T03:02:33Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-12-07T22:38:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2a56a1fec290bf0bc4676bbf4efdb3744953a3e7'/>
<id>urn:sha1:2a56a1fec290bf0bc4676bbf4efdb3744953a3e7</id>
<content type='text'>
Introduce sock-&gt;sk_cgrp_data which is a struct sock_cgroup_data.
-&gt;sk_cgroup_prioidx and -&gt;sk_classid are moved into it.  The struct
and its accessors are defined in cgroup-defs.h.  This is to prepare
for overloading the fields with a cgroup pointer.

This patch mostly performs equivalent conversions but the followings
are noteworthy.

* Equality test before updating classid is removed from
  sock_update_classid().  This shouldn't make any noticeable
  difference and a similar test will be implemented on the helper side
  later.

* sock_update_netprioidx() now takes struct sock_cgroup_data and can
  be moved to netprio_cgroup.h without causing include dependency
  loop.  Moved.

* The dummy version of sock_update_netprioidx() converted to a static
  inline function while at it.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>cls_cgroup: factor out classid retrieval</title>
<updated>2015-07-20T19:41:30Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-07-15T12:21:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b87a173e25d6bf5c26f13d329cdddf57dbd4061a'/>
<id>urn:sha1:b87a173e25d6bf5c26f13d329cdddf57dbd4061a</id>
<content type='text'>
Split out retrieving the cgroups net_cls classid retrieval into its
own function, so that it can be reused later on from other parts of
the traffic control subsystem. If there's no skb-&gt;sk, then the small
helper returns 0 as well, which in cls_cgroup terms means 'could not
classify'.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>cgroup: clean up cgroup_subsys names and initialization</title>
<updated>2014-02-08T15:36:58Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-02-08T15:36:58Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=073219e995b4a3f8cf1ce8228b7ef440b6994ac0'/>
<id>urn:sha1:073219e995b4a3f8cf1ce8228b7ef440b6994ac0</id>
<content type='text'>
cgroup_subsys is a bit messier than it needs to be.

* The name of a subsys can be different from its internal identifier
  defined in cgroup_subsys.h.  Most subsystems use the matching name
  but three - cpu, memory and perf_event - use different ones.

* cgroup_subsys_id enums are postfixed with _subsys_id and each
  cgroup_subsys is postfixed with _subsys.  cgroup.h is widely
  included throughout various subsystems, it doesn't and shouldn't
  have claim on such generic names which don't have any qualifier
  indicating that they belong to cgroup.

* cgroup_subsys-&gt;subsys_id should always equal the matching
  cgroup_subsys_id enum; however, we require each controller to
  initialize it and then BUG if they don't match, which is a bit
  silly.

This patch cleans up cgroup_subsys names and initialization by doing
the followings.

* cgroup_subsys_id enums are now postfixed with _cgrp_id, and each
  cgroup_subsys with _cgrp_subsys.

* With the above, renaming subsys identifiers to match the userland
  visible names doesn't cause any naming conflicts.  All non-matching
  identifiers are renamed to match the official names.

  cpu_cgroup -&gt; cpu
  mem_cgroup -&gt; memory
  perf -&gt; perf_event

* controllers no longer need to initialize -&gt;subsys_id and -&gt;name.
  They're generated in cgroup core and set automatically during boot.

* Redundant cgroup_subsys declarations removed.

* While updating BUG_ON()s in cgroup_init_early(), convert them to
  WARN()s.  BUGging that early during boot is stupid - the kernel
  can't print anything, even through serial console and the trap
  handler doesn't even link stack frame properly for back-tracing.

This patch doesn't introduce any behavior changes.

v2: Rebased on top of fe1217c4f3f7 ("net: net_cls: move cgroupfs
    classid handling into core").

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Acked-by: "David S. Miller" &lt;davem@davemloft.net&gt;
Acked-by: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Acked-by: Ingo Molnar &lt;mingo@redhat.com&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
</content>
</entry>
<entry>
<title>net: net_cls: move cgroupfs classid handling into core</title>
<updated>2014-01-03T22:41:41Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2013-12-29T17:27:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fe1217c4f3f7d7cbf8efdd8dd5fdc7204a1d65a8'/>
<id>urn:sha1:fe1217c4f3f7d7cbf8efdd8dd5fdc7204a1d65a8</id>
<content type='text'>
Zefan Li requested [1] to perform the following cleanup/refactoring:

- Split cgroupfs classid handling into net core to better express a
  possible more generic use.

- Disable module support for cgroupfs bits as the majority of other
  cgroupfs subsystems do not have that, and seems to be not wished
  from cgroup side. Zefan probably might want to follow-up for netprio
  later on.

- By this, code can be further reduced which previously took care of
  functionality built when compiled as module.

cgroupfs bits are being placed under net/core/netclassid_cgroup.c, so
that we are consistent with {netclassid,netprio}_cgroup naming that is
under net/core/ as suggested by Zefan.

No change in functionality, but only code refactoring that is being
done here.

 [1] http://patchwork.ozlabs.org/patch/304825/

Suggested-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Zefan Li &lt;lizefan@huawei.com&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: cgroups@vger.kernel.org
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2013-09-05T21:54:29Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-05T21:54:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cc998ff8811530be521f6b316f37ab7676a07938'/>
<id>urn:sha1:cc998ff8811530be521f6b316f37ab7676a07938</id>
<content type='text'>
Pull networking changes from David Miller:
 "Noteworthy changes this time around:

   1) Multicast rejoin support for team driver, from Jiri Pirko.

   2) Centralize and simplify TCP RTT measurement handling in order to
      reduce the impact of bad RTO seeding from SYN/ACKs.  Also, when
      both timestamps and local RTT measurements are available prefer
      the later because there are broken middleware devices which
      scramble the timestamp.

      From Yuchung Cheng.

   3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
      memory consumed to queue up unsend user data.  From Eric Dumazet.

   4) Add a "physical port ID" abstraction for network devices, from
      Jiri Pirko.

   5) Add a "suppress" operation to influence fib_rules lookups, from
      Stefan Tomanek.

   6) Add a networking development FAQ, from Paul Gortmaker.

   7) Extend the information provided by tcp_probe and add ipv6 support,
      from Daniel Borkmann.

   8) Use RCU locking more extensively in openvswitch data paths, from
      Pravin B Shelar.

   9) Add SCTP support to openvswitch, from Joe Stringer.

  10) Add EF10 chip support to SFC driver, from Ben Hutchings.

  11) Add new SYNPROXY netfilter target, from Patrick McHardy.

  12) Compute a rate approximation for sending in TCP sockets, and use
      this to more intelligently coalesce TSO frames.  Furthermore, add
      a new packet scheduler which takes advantage of this estimate when
      available.  From Eric Dumazet.

  13) Allow AF_PACKET fanouts with random selection, from Daniel
      Borkmann.

  14) Add ipv6 support to vxlan driver, from Cong Wang"

Resolved conflicts as per discussion.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
  openvswitch: Fix alignment of struct sw_flow_key.
  netfilter: Fix build errors with xt_socket.c
  tcp: Add missing braces to do_tcp_setsockopt
  caif: Add missing braces to multiline if in cfctrl_linkup_request
  bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
  vxlan: Fix kernel panic on device delete.
  net: mvneta: implement -&gt;ndo_do_ioctl() to support PHY ioctls
  net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
  icplus: Use netif_running to determine device state
  ethernet/arc/arc_emac: Fix huge delays in large file copies
  tuntap: orphan frags before trying to set tx timestamp
  tuntap: purge socket error queue on detach
  qlcnic: use standard NAPI weights
  ipv6:introduce function to find route for redirect
  bnx2x: VF RSS support - VF side
  bnx2x: VF RSS support - PF side
  vxlan: Notify drivers for listening UDP port changes
  net: usbnet: update addr_assign_type if appropriate
  driver/net: enic: update enic maintainers and driver
  driver/net: enic: Exposing symbols for Cisco's low latency driver
  ...
</content>
</entry>
<entry>
<title>cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/</title>
<updated>2013-08-09T00:11:22Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-08-09T00:11:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8af01f56a03e9cbd91a55d688fce1315021efba8'/>
<id>urn:sha1:8af01f56a03e9cbd91a55d688fce1315021efba8</id>
<content type='text'>
The names of the two struct cgroup_subsys_state accessors -
cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
The former clashes with the type name and the latter doesn't even
indicate it's somehow related to cgroup.

We're about to revamp large portion of cgroup API, so, let's rename
them so that they're less awkward.  Most per-controller usages of the
accessors are localized in accessor wrappers and given the amount of
scheduled changes, this isn't gonna add any noticeable headache.

Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
to task_css().  This patch is pure rename.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
</content>
</entry>
</feed>
