<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/linux/sched/topology.h, branch linux-6.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2025-10-15T10:03:17Z</updated>
<entry>
<title>sched/fair: Get rid of sched_domains_curr_level hack for tl-&gt;cpumask()</title>
<updated>2025-10-15T10:03:17Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-08-25T12:02:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=788bfdb55866beb0ddf0610f92ed1ed4928b8f05'/>
<id>urn:sha1:788bfdb55866beb0ddf0610f92ed1ed4928b8f05</id>
<content type='text'>
[ Upstream commit 661f951e371cc134ea31c84238dbdc9a898b8403 ]

Leon [1] and Vinicius [2] noted a topology_span_sane() warning during
their testing starting from v6.16-rc1. Debug that followed pointed to
the tl-&gt;mask() for the NODE domain being incorrectly resolved to that of
the highest NUMA domain.

tl-&gt;mask() for NODE is set to the sd_numa_mask() which depends on the
global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
set to the "tl-&gt;numa_level" during tl traversal in build_sched_domains()
calling sd_init() but was not reset before topology_span_sane().

Since "tl-&gt;numa_level" still reflected the old value from
build_sched_domains(), topology_span_sane() for the NODE domain trips
when the span of the last NUMA domain overlaps.

Instead of replicating the "sched_domains_curr_level" hack, get rid of
it entirely and instead, pass the entire "sched_domain_topology_level"
object to tl-&gt;cpumask() function to prevent such mishap in the future.

sd_numa_mask() now directly references "tl-&gt;numa_level" instead of
relying on the global "sched_domains_curr_level" hack to index into
sched_domains_numa_masks[].

The original warning was reproducible on the following NUMA topology
reported by Leon:

    $ sudo numactl -H
    available: 5 nodes (0-4)
    node 0 cpus: 0 1
    node 0 size: 2927 MB
    node 0 free: 1603 MB
    node 1 cpus: 2 3
    node 1 size: 3023 MB
    node 1 free: 3008 MB
    node 2 cpus: 4 5
    node 2 size: 3023 MB
    node 2 free: 3007 MB
    node 3 cpus: 6 7
    node 3 size: 3023 MB
    node 3 free: 3002 MB
    node 4 cpus: 8 9
    node 4 size: 3022 MB
    node 4 free: 2718 MB
    node distances:
    node   0   1   2   3   4
      0:  10  39  38  37  36
      1:  39  10  38  37  36
      2:  38  38  10  37  36
      3:  37  37  37  10  36
      4:  36  36  36  36  10

The above topology can be mimicked using the following QEMU cmd that was
used to reproduce the warning and test the fix:

     sudo qemu-system-x86_64 -enable-kvm -cpu host \
     -m 20G -smp cpus=10,sockets=10 -machine q35 \
     -object memory-backend-ram,size=4G,id=m0 \
     -object memory-backend-ram,size=4G,id=m1 \
     -object memory-backend-ram,size=4G,id=m2 \
     -object memory-backend-ram,size=4G,id=m3 \
     -object memory-backend-ram,size=4G,id=m4 \
     -numa node,cpus=0-1,memdev=m0,nodeid=0 \
     -numa node,cpus=2-3,memdev=m1,nodeid=1 \
     -numa node,cpus=4-5,memdev=m2,nodeid=2 \
     -numa node,cpus=6-7,memdev=m3,nodeid=3 \
     -numa node,cpus=8-9,memdev=m4,nodeid=4 \
     -numa dist,src=0,dst=1,val=39 \
     -numa dist,src=0,dst=2,val=38 \
     -numa dist,src=0,dst=3,val=37 \
     -numa dist,src=0,dst=4,val=36 \
     -numa dist,src=1,dst=0,val=39 \
     -numa dist,src=1,dst=2,val=38 \
     -numa dist,src=1,dst=3,val=37 \
     -numa dist,src=1,dst=4,val=36 \
     -numa dist,src=2,dst=0,val=38 \
     -numa dist,src=2,dst=1,val=38 \
     -numa dist,src=2,dst=3,val=37 \
     -numa dist,src=2,dst=4,val=36 \
     -numa dist,src=3,dst=0,val=37 \
     -numa dist,src=3,dst=1,val=37 \
     -numa dist,src=3,dst=2,val=37 \
     -numa dist,src=3,dst=4,val=36 \
     -numa dist,src=4,dst=0,val=36 \
     -numa dist,src=4,dst=1,val=36 \
     -numa dist,src=4,dst=2,val=36 \
     -numa dist,src=4,dst=3,val=36 \
     ...

  [ prateek: Moved common functions to include/linux/sched/topology.h,
    reuse the common bits for s390 and ppc, commit message ]

Closes: https://lore.kernel.org/lkml/20250610110701.GA256154@unreal/ [1]
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") # ce29a7da84cd, f55dac1dafb3
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reported-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Tested-by: Valentin Schneider &lt;vschneid@redhat.com&gt; # x86
Tested-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt; # powerpc
Link: https://lore.kernel.org/lkml/a3de98387abad28592e6ab591f3ff6107fe01dc1.1755893468.git.tim.c.chen@linux.intel.com/ [2]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/topology: Remove sched_domain_topology_level::flags</title>
<updated>2025-07-14T08:59:35Z</updated>
<author>
<name>K Prateek Nayak</name>
<email>kprateek.nayak@amd.com</email>
</author>
<published>2025-07-11T05:50:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1eec89a671413ce38df9fe9e70f5130a9eb79a59'/>
<id>urn:sha1:1eec89a671413ce38df9fe9e70f5130a9eb79a59</id>
<content type='text'>
Support for overlapping domains added in commit e3589f6c81e4 ("sched:
Allow for overlapping sched_domain spans") also allowed forcefully
setting SD_OVERLAP for !NUMA domains via FORCE_SD_OVERLAP sched_feat().

Since NUMA domains had to be presumed overlapping to ensure correct
behavior, "sched_domain_topology_level::flags" was introduced. NUMA
domains added the SDTL_OVERLAP flag would ensure SD_OVERLAP was always
added during build_sched_domains() for these domains, even when
FORCE_SD_OVERLAP was off.

Condition for adding the SD_OVERLAP flag at the aforementioned commit
was as follows:

    if (tl-&gt;flags &amp; SDTL_OVERLAP || sched_feat(FORCE_SD_OVERLAP))
            sd-&gt;flags |= SD_OVERLAP;

The FORCE_SD_OVERLAP debug feature was removed in commit af85596c74de
("sched/topology: Remove FORCE_SD_OVERLAP") which left the NUMA domains
as the exclusive users of SDTL_OVERLAP, SD_OVERLAP, and SD_NUMA flags.

Get rid of SDTL_OVERLAP and SD_OVERLAP as they have become redundant
and instead rely on SD_NUMA to detect the only overlapping domain
currently supported. Since SDTL_OVERLAP was the only user of
"tl-&gt;flags", get rid of "sched_domain_topology_level::flags" too.

Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/ba4dbdf8-bc37-493d-b2e0-2efb00ea3e19@amd.com
</content>
</entry>
<entry>
<title>smpboot: introduce SDTL_INIT() helper to tidy sched topology setup</title>
<updated>2025-07-14T08:59:34Z</updated>
<author>
<name>Li Chen</name>
<email>chenl311@chinatelecom.cn</email>
</author>
<published>2025-07-10T10:57:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e075f4360931263f5ec006ea5dadc065e5e98eb8'/>
<id>urn:sha1:e075f4360931263f5ec006ea5dadc065e5e98eb8</id>
<content type='text'>
Define a small SDTL_INIT(maskfn, flagsfn, name) macro and use it to build the
sched_domain_topology_level array. Purely a cleanup; behaviour is unchanged.

Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Li Chen &lt;chenl311@chinatelecom.cn&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Tested-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Link: https://lore.kernel.org/r/20250710105715.66594-2-me@linux.beauty
</content>
</entry>
<entry>
<title>sched/smp: Make SMP unconditional</title>
<updated>2025-06-13T06:47:18Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-05-28T08:09:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cac5cefbade90ff0bb0b393d301fa3b5234cf056'/>
<id>urn:sha1:cac5cefbade90ff0bb0b393d301fa3b5234cf056</id>
<content type='text'>
Simplify the scheduler by making CONFIG_SMP=y primitives and data
structures unconditional.

Introduce transitory wrappers for functionality not yet converted to SMP.

Note that this patch is pretty large, because there's no clear separation
between various aspects of the SMP scheduler, it's basically a huge block
of #ifdef CONFIG_SMP. A fair amount of it has to be switched on for it to
boot and work on UP systems.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Link: https://lore.kernel.org/r/20250528080924.2273858-21-mingo@kernel.org
</content>
</entry>
<entry>
<title>sched/topology: Introduce sched_update_asym_prefer_cpu()</title>
<updated>2025-04-16T19:09:11Z</updated>
<author>
<name>K Prateek Nayak</name>
<email>kprateek.nayak@amd.com</email>
</author>
<published>2025-04-09T05:34:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0e3f6c3696424fa90d6f512779d617a05a1cf031'/>
<id>urn:sha1:0e3f6c3696424fa90d6f512779d617a05a1cf031</id>
<content type='text'>
A subset of AMD Processors supporting Preferred Core Rankings also
feature the ability to dynamically switch these rankings at runtime to
bias load balancing towards or away from the LLC domain with larger
cache.

To support dynamically updating "sg-&gt;asym_prefer_cpu" without needing to
rebuild the sched domain, introduce sched_update_asym_prefer_cpu() which
recomutes the "asym_prefer_cpu" when the core-ranking of a CPU changes.

sched_update_asym_prefer_cpu() swaps the "sg-&gt;asym_prefer_cpu" with the
CPU whose ranking has changed if the new ranking is greater than that of
the "asym_prefer_cpu". If CPU whose ranking has changed is the current
"asym_prefer_cpu", it scans the CPUs of the sched groups to find the new
"asym_prefer_cpu" and sets it accordingly.

get_group() for non-overlapping sched domains returns the sched group
for the first CPU in the sched_group_span() which ensures all CPUs in
the group see the updated value of "asym_prefer_cpu".

Overlapping groups are allocated differently and will require moving the
"asym_prefer_cpu" to "sg-&gt;sgc" but since the current implementations do
not set "SD_ASYM_PACKING" at NUMA domains, skip additional
indirection and place a SCHED_WARN_ON() to alert any future users.

Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20250409053446.23367-3-kprateek.nayak@amd.com
</content>
</entry>
<entry>
<title>sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional</title>
<updated>2025-03-19T21:20:53Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2025-03-17T10:42:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=dd5bdaf2b72da81d57f4f99e518af80002b6562e'/>
<id>urn:sha1:dd5bdaf2b72da81d57f4f99e518af80002b6562e</id>
<content type='text'>
All the big Linux distros enable CONFIG_SCHED_DEBUG, because
the various features it provides help not just with kernel
development, but with system administration and user-space
software development as well.

Reflect this reality and enable this functionality
unconditionally.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: https://lore.kernel.org/r/20250317104257.3496611-4-mingo@kernel.org
</content>
</entry>
<entry>
<title>include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h</title>
<updated>2025-03-17T10:23:43Z</updated>
<author>
<name>Juri Lelli</name>
<email>juri.lelli@redhat.com</email>
</author>
<published>2025-03-13T17:13:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=34929a070b7fd06c386080c926b61ee844e6ad34'/>
<id>urn:sha1:34929a070b7fd06c386080c926b61ee844e6ad34</id>
<content type='text'>
dl_rebuild_rd_accounting() is defined in cpuset.c, so it makes more
sense to move related declarations to cpuset.h.

Implement the move.

Suggested-by: Waiman Long &lt;llong@redhat.com&gt;
Signed-off-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Waiman Long &lt;llong@redhat.com&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Tested-by: Waiman Long &lt;longman@redhat.com&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Tested-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Link: https://lore.kernel.org/r/Z9MSOVMpU7jpVrMU@jlelli-thinkpadt14gen4.remote.csb
</content>
</entry>
<entry>
<title>sched/topology: Stop exposing partition_sched_domains_locked</title>
<updated>2025-03-17T10:23:43Z</updated>
<author>
<name>Juri Lelli</name>
<email>juri.lelli@redhat.com</email>
</author>
<published>2025-03-13T17:12:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d128130f486b4aa86086655af0fbb943b26b0003'/>
<id>urn:sha1:d128130f486b4aa86086655af0fbb943b26b0003</id>
<content type='text'>
The are no callers of partition_sched_domains_locked() outside
topology.c.

Stop exposing such function.

Suggested-by: Waiman Long &lt;llong@redhat.com&gt;
Signed-off-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Tested-by: Waiman Long &lt;longman@redhat.com&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Tested-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Link: https://lore.kernel.org/r/Z9MSC96a8FcqWV3G@jlelli-thinkpadt14gen4.remote.csb
</content>
</entry>
<entry>
<title>sched/deadline: Rebuild root domain accounting after every update</title>
<updated>2025-03-17T10:23:42Z</updated>
<author>
<name>Juri Lelli</name>
<email>juri.lelli@redhat.com</email>
</author>
<published>2025-03-13T17:10:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=2ff899e3516437354204423ef0a94994717b8e6a'/>
<id>urn:sha1:2ff899e3516437354204423ef0a94994717b8e6a</id>
<content type='text'>
Rebuilding of root domains accounting information (total_bw) is
currently broken on some cases, e.g. suspend/resume on aarch64. Problem
is that the way we keep track of domain changes and try to add bandwidth
back is convoluted and fragile.

Fix it by simplify things by making sure bandwidth accounting is cleared
and completely restored after root domains changes (after root domains
are again stable).

To be sure we always call dl_rebuild_rd_accounting while holding
cpuset_mutex we also add cpuset_reset_sched_domains() wrapper.

Fixes: 53916d5fd3c0 ("sched/deadline: Check bandwidth overflow earlier for hotplug")
Reported-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Co-developed-by: Waiman Long &lt;llong@redhat.com&gt;
Signed-off-by: Waiman Long &lt;llong@redhat.com&gt;
Signed-off-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Tested-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Link: https://lore.kernel.org/r/Z9MRfeJKJUOyUSto@jlelli-thinkpadt14gen4.remote.csb
</content>
</entry>
<entry>
<title>sched: Move sched domain name out of CONFIG_SCHED_DEBUG</title>
<updated>2024-12-20T14:31:17Z</updated>
<author>
<name>Swapnil Sapkal</name>
<email>swapnil.sapkal@amd.com</email>
</author>
<published>2024-12-20T06:32:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1c055a0f5d3bafaca5d218bbb3e4e63d6307be45'/>
<id>urn:sha1:1c055a0f5d3bafaca5d218bbb3e4e63d6307be45</id>
<content type='text'>
/proc/schedstat file shows cpu and sched domain level scheduler
statistics. It does not show domain name instead shows domain level.
It will be very useful for tools like `perf sched stats`[1] to
aggragate domain level stats if domain names are shown in /proc/schedstat.
But sched domain name is guarded by CONFIG_SCHED_DEBUG. As per the
discussion[2], move sched domain name out of CONFIG_SCHED_DEBUG.

[1] https://lore.kernel.org/lkml/20241122084452.1064968-1-swapnil.sapkal@amd.com/
[2] https://lore.kernel.org/lkml/fcefeb4d-3acb-462d-9c9b-3df8d927e522@amd.com/

Suggested-by: "Gautham R. Shenoy" &lt;gautham.shenoy@amd.com&gt;
Signed-off-by: Swapnil Sapkal &lt;swapnil.sapkal@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241220063224.17767-5-swapnil.sapkal@amd.com
</content>
</entry>
</feed>
