<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/tools/perf/Documentation, branch linux-6.16.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.16.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.16.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2025-06-16T17:05:10Z</updated>
<entry>
<title>perf mem: Document new output fields (op, cache, mem, dtlb, snoop)</title>
<updated>2025-06-16T17:05:10Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-06-10T00:57:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bbeb1088a40b0da8b832d72e05e1c1cb71535712'/>
<id>urn:sha1:bbeb1088a40b0da8b832d72e05e1c1cb71535712</id>
<content type='text'>
Update the documentation of the new fields with examples and caveats.

Also update the related documentation for AMD IBS.

Reviewed-by: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20250610005742.2173050-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf lock contention: Reject more than 10ms delays for safety</title>
<updated>2025-05-31T11:45:24Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-05-15T18:10:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0df14c1f1ed530ba21b603a2e282589c034367da'/>
<id>urn:sha1:0df14c1f1ed530ba21b603a2e282589c034367da</id>
<content type='text'>
Delaying kernel operations can be dangerous and the kernel may kill
(non-sleepable) BPF programs running for long in the future.

Limit the max delay to 10ms and update the document about it.

  $ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true
  lock delay is too long: 100000us (&gt; 10ms)

   Usage: perf lock contention [&lt;options&gt;]

      -J, --inject-delay &lt;TIME@FUNC&gt;
                            Inject delays to specific locks

Suggested-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf mem: Describe overhead calculation in brief</title>
<updated>2025-05-28T17:41:10Z</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@amd.com</email>
</author>
<published>2025-05-23T22:21:55Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=00a23c000e13d4c66f38eefd530dc377cc90ec14'/>
<id>urn:sha1:00a23c000e13d4c66f38eefd530dc377cc90ec14</id>
<content type='text'>
Unlike perf-report which uses sample period for overhead calculation,
perf-mem overhead is calculated using sample weight. Describe perf-mem
overhead calculation method in it's man page.

Reviewed-by: Leo Yan &lt;leo.yan@arm.com&gt;
Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/r/20250523222157.1259998-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2</title>
<updated>2025-05-16T20:31:40Z</updated>
<author>
<name>Chun-Tse Shao</name>
<email>ctshao@google.com</email>
</author>
<published>2025-03-03T18:32:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=208c0e16834472bb3a4b456a69dbdd1ad05eab49'/>
<id>urn:sha1:208c0e16834472bb3a4b456a69dbdd1ad05eab49</id>
<content type='text'>
The original PERF_RECORD_COMPRESS is not 8-byte aligned, which can cause
asan runtime error:

  # Build with asan
  $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined"
  # Test success with many asan runtime errors:
  $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv
   83: Zstd perf.data compression/decompression:
  ...
  util/session.c:1959:13: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 13 byte alignment
  0x7f69e3f99653: note: pointer points here
   d0  3a 50 69 44 00 00 00 00  00 08 00 bb 07 00 00 00  00 00 00 44 00 00 00 00  00 00 00 ff 07 00 00
                ^
  util/session.c:2163:22: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 8 byte alignment
  0x7f69e3f99653: note: pointer points here
   d0  3a 50 69 44 00 00 00 00  00 08 00 bb 07 00 00 00  00 00 00 44 00 00 00 00  00 00 00 ff 07 00 00
                ^
  ...

Since there is no way to align compressed data in zstd compression, this
patch add a new event type `PERF_RECORD_COMPRESSED2`, which adds a field
`data_size` to specify the actual compressed data size.

The `header.size` contains the total record size, including the padding
at the end to make it 8-byte aligned.

Tested with `Zstd perf.data compression/decompression`

Signed-off-by: Chun-Tse Shao &lt;ctshao@google.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ben Gainey &lt;ben.gainey@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20250303183646.327510-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf trace: Support --summary-mode=cgroup</title>
<updated>2025-05-13T21:20:46Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-05-01T22:53:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ef60b8f5724da364500ddb7b8240c157ea65075e'/>
<id>urn:sha1:ef60b8f5724da364500ddb7b8240c157ea65075e</id>
<content type='text'>
Add a new summary mode to collect stats for each cgroup.

  $ sudo ./perf trace -as --bpf-summary --summary-mode=cgroup -- sleep 1

   Summary of events:

   cgroup /user.slice/user-657345.slice/user@657345.service/session.slice/org.gnome.Shell@x11.service, 535 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                 15      0   373.600     0.004    24.907   197.491     55.26%
     poll                  15      0     1.325     0.001     0.088     0.369     38.76%
     close                 66      0     0.567     0.007     0.009     0.026      3.55%
     write                150      0     0.471     0.001     0.003     0.010      3.29%
     recvmsg               94     83     0.290     0.000     0.003     0.037     16.39%
     ioctl                 26      0     0.237     0.001     0.009     0.096     50.13%
     timerfd_create        66      0     0.236     0.003     0.004     0.024      8.92%
     timerfd_settime       70      0     0.160     0.001     0.002     0.012      7.66%
     writev                10      0     0.118     0.001     0.012     0.019     18.17%
     read                   9      0     0.021     0.001     0.002     0.004     14.07%
     getpid                14      0     0.019     0.000     0.001     0.004     20.28%

   cgroup /system.slice/polkit.service, 94 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                 22      0    19.811     0.000     0.900     9.273     63.88%
     write                 30      0     0.040     0.001     0.001     0.003     12.09%
     recvmsg               12      0     0.018     0.001     0.002     0.006     28.15%
     read                  18      0     0.013     0.000     0.001     0.003     21.99%
     poll                  12      0     0.006     0.000     0.001     0.001      4.48%

   cgroup /user.slice/user-657345.slice/user@657345.service/app.slice/app-org.gnome.Terminal.slice/gnome-terminal-server.service, 21 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     ppoll                  4      0    17.476     0.003     4.369    13.298     69.65%
     recvmsg               15     12     0.068     0.002     0.005     0.014     26.53%
     writev                 1      0     0.033     0.033     0.033     0.033      0.00%
     poll                   1      0     0.005     0.005     0.005     0.005      0.00%

   ...

It works only for --bpf-summary for now.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Howard Chu &lt;howardchu95@gmail.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20250501225337.928470-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf report: Add 'tgid' sort key</title>
<updated>2025-05-13T20:51:32Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-05-09T21:04:21Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=39922dc53c26842939e5d4d6bfeff8da677a9b33'/>
<id>urn:sha1:39922dc53c26842939e5d4d6bfeff8da677a9b33</id>
<content type='text'>
Sometimes we need to analyze the data in process level but current sort
keys only work on thread level.  Let's add 'tgid' sort key for that as
'pid' is already taken for thread.

This will look mostly the same, but it only uses tgid instead of tid.
Here's an example of a process with two threads (thloop).

  $ perf record -- perf test -w thloop

  $ perf report --stdio -s tgid,pid -H
  ...
  #
  #    Overhead  Tgid:Command / Pid:Command
  # ...........  ..........................
  #
     100.00%     2018407:perf
         50.34%     2018407:perf
         49.66%     2018409:perf

Suggested-by: Stephane Eranian &lt;eranian@google.com&gt;
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20250509210421.197245-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf parse-events: Add "cpu" term to set the CPU an event is recorded on</title>
<updated>2025-05-12T17:23:19Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2025-04-03T19:43:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=255f5b6d060be5a43de6edf42149b1391f69e3fc'/>
<id>urn:sha1:255f5b6d060be5a43de6edf42149b1391f69e3fc</id>
<content type='text'>
The -C option allows the CPUs for a list of events to be specified but
its not possible to set the CPU for a single event. Add a term to
allow this. The term isn't a general CPU list due to ',' already being
a special character in event parsing instead multiple cpu= terms may
be provided and they will be merged/unioned together.

An example of mixing different types of events counted on different CPUs:
```
$ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1

 Performance counter stats for 'system wide':

CPU0            6,979,225      instructions/cpu=0/              #    0.89  insn per cycle
CPU4               75,138      cpu/l1d-misses/
CPU5            1,418,939      cpu/l1d-misses/
CPU8              797,553      cpu/inst_retired.any,cpu=8/
CPU0            7,845,302      cycles
CPU4            6,546,859      cycles
CPU5          185,915,438      cycles
CPU8            2,065,668      cycles

       0.112449242 seconds time elapsed
```

Committer testing:

  root@number:~# grep -m1 "model name" /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~# perf stat -A -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.1

   Performance counter stats for 'system wide':

  CPU0    2,398,351   instructions/cpu=0/    #  0.44  insn per cycle
  CPU0    2,398,152   instructions           #  0.44  insn per cycle
  CPU1    1,265,634   instructions           #  0.49  insn per cycle
  CPU2      606,087   instructions           #  0.50  insn per cycle
  CPU3    4,025,752   instructions           #  0.52  insn per cycle
  CPU4    4,236,810   instructions           #  0.53  insn per cycle
  CPU5    3,984,832   instructions           #  0.66  insn per cycle
  CPU6      434,132   instructions           #  0.44  insn per cycle
  CPU7       65,752   instructions           #  0.41  insn per cycle
  CPU8      459,083   instructions           #  0.48  insn per cycle
  CPU9    6,464,161   instructions           #  1.31  insn per cycle
  &lt;SNIP&gt;
  root@number:~# perf stat -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.

   Performance counter stats for 'system wide':

             144,822      instructions/cpu=0/              #    0.03  insn per cycle
           4,666,114      instructions                     #    0.93  insn per cycle
               2,583      l1d-misses
           4,993,633      cycles

         0.000868512 seconds time elapsed

  root@number:~#

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Weilin Wang &lt;weilin.wang@intel.com&gt;
Cc: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Link: https://lore.kernel.org/r/20250403194337.40202-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf intel-pt: Do not default to recording all switch events</title>
<updated>2025-05-12T17:18:16Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2025-05-12T09:39:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=352b088164b5cde15e4d8a66dc8ba36c06f4416b'/>
<id>urn:sha1:352b088164b5cde15e4d8a66dc8ba36c06f4416b</id>
<content type='text'>
On systems with many CPUs, recording extra context switch events can be
excessive and unnecessary. Add perf config intel-pt.all-switch-events=false
to control the behaviour.

Example:

 # perf config intel-pt.all-switch-events=false
 # perf record -eintel_pt//u uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.082 MB perf.data ]
 # perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
       5 PERF_RECORD_SWITCH
 # perf config intel-pt.all-switch-events=true
 # perf record -eintel_pt//u uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.102 MB perf.data ]
 # perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
     180 PERF_RECORD_SWITCH_CPU_WIDE

Committer testing:

While doing a make -j28 allmodconfig:

  root@five:~# grep "model name" -m1 /proc/cpuinfo
  model name	: Intel(R) Core(TM) i7-14700K
  root@five:~#
  root@five:~# perf config intel-pt.all-switch-events=false
  root@five:~# perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data ]
  root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
  root@five:~#
  root@five:~# perf config intel-pt.all-switch-events=true
  root@five:~# perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.047 MB perf.data ]
  root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
       SWITCH_CPU_WIDE events:        542  (96.4%)
  root@five:~#

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20250512093932.79854-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf lock contention: Add -J/--inject-delay option</title>
<updated>2025-05-09T17:32:15Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-05-09T17:19:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c42e219942cb1325fe38160a4fd2be73c74abc0e'/>
<id>urn:sha1:c42e219942cb1325fe38160a4fd2be73c74abc0e</id>
<content type='text'>
This is to slow down lock acquistion (on contention locks) deliberately.

A possible use case is to estimate impact on application performance by
optimization of kernel locking behavior.  By delaying the lock it can
simulate the worse condition as a control group, and then compare with
the current behavior as a optimized condition.

The syntax is 'time@function' and the time can have unit suffix like
"us" and "ms".  For example, I ran a simple test like below.

  $ sudo perf lock con -abl -L tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended   total wait     max wait     avg wait            address   symbol

          92      1.18 ms    199.54 us     12.79 us   ffffffff8a806080   tasklist_lock (rwlock)

The contention count was 92 and the average wait time was around 10 us.
But if I add 100 usec of delay to the tasklist_lock,

  $ sudo perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended   total wait     max wait     avg wait            address   symbol

         190     15.67 ms    230.10 us     82.46 us   ffffffff8a806080   tasklist_lock (rwlock)

The contention count increased and the average wait time was up closed
to 100 usec.  If I increase the delay even more,

  $ sudo perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- \
    sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended   total wait     max wait     avg wait            address   symbol

        1002      2.80 s       3.01 ms      2.80 ms   ffffffff8a806080   tasklist_lock (rwlock)

Now every sleep process had contention and the wait time was more than 1
msec.  This is on my 4 CPU laptop so I guess one CPU has the lock while
other 3 are waiting for it mostly.

For simplicity, it only supports global locks for now.

Committer testing:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name : AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~# perf lock con -abl -L tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended  total wait   max wait  avg wait           address  symbol

         142   453.85 us   25.39 us   3.20 us  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~# perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended  total wait   max wait  avg wait           address  symbol

        1040     2.39 s     3.11 ms   2.30 ms  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~# perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 &amp; done; wait'
   contended  total wait   max wait  avg wait           address  symbol

        1025    24.72 s    31.01 ms  24.12 ms  ffffffffae808080  tasklist_lock (rwlock)
  root@number:~#

Suggested-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20250509171950.183591-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf record --off-cpu: Add --off-cpu-thresh option</title>
<updated>2025-05-06T00:51:54Z</updated>
<author>
<name>Howard Chu</name>
<email>howardchu95@gmail.com</email>
</author>
<published>2025-05-01T02:28:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9557c000768741bb42a04a3c20c2bf2a5e60445a'/>
<id>urn:sha1:9557c000768741bb42a04a3c20c2bf2a5e60445a</id>
<content type='text'>
Specify the threshold for dumping offcpu samples with --off-cpu-thresh,
the unit is milliseconds. Default value is 500ms.

Example:

  perf record --off-cpu --off-cpu-thresh 824

The example above collects direct off-cpu samples where the off-cpu time
is longer than 824ms.

Committer testing:

After commenting out the end off-cpu dump to have just the ones that are
added right after the task is scheduled back, and using a threshould of
1000ms, we see some periods (the 5th column, just before "offcpu-time"
in the 'perf script' output) that are over 1000.000.000 nanoseconds:

  root@number:~# perf record --off-cpu --off-cpu-thresh 10000
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 3.902 MB perf.data (34335 samples) ]
  root@number:~# perf script
&lt;SNIP&gt;
  Isolated Web Co   59932 [028] 63839.594437: 1000049427 offcpu-time:
             7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fe63c78c04c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
             7fe63c78e928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
             5599974a9fe7 mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&amp;, mozilla::BaseTimeDuration&lt;mozilla::TimeDurationValueCalculator&gt; const&amp;)+0xe7 (/usr/lib64/fir&gt;
                100000000 [unknown] ([unknown])

          swapper       0 [025] 63839.594459:     195724    cycles:P:  ffffffffac328270 read_tsc+0x0 ([kernel.kallsyms])
  Isolated Web Co   59932 [010] 63839.594466: 1000055278 offcpu-time:
             7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fe63c78ba24 __syscall_cancel+0x14 (/usr/lib64/libc.so.6)
             7fe63c804c4e __poll+0x1e (/usr/lib64/libc.so.6)
             7fe633b0d1b8 PollWrapper(_GPollFD*, unsigned int, int) [clone .lto_priv.0]+0xf8 (/usr/lib64/firefox/libxul.so)
                10000002c [unknown] ([unknown])

          swapper       0 [027] 63839.594475:     134433    cycles:P:  ffffffffad4c45d9 irqentry_enter+0x19 ([kernel.kallsyms])
          swapper       0 [028] 63839.594499:     215838    cycles:P:  ffffffffac39199a switch_mm_irqs_off+0x10a ([kernel.kallsyms])
  MediaPD~oder #1 1407676 [027] 63839.594514:     134433    cycles:P:      7f982ef5e69f dct_IV(int*, int, int*)+0x24f (/usr/lib64/libfdk-aac.so.2.0.0)
          swapper       0 [024] 63839.594524:     267411    cycles:P:  ffffffffad4c6ee6 poll_idle+0x56 ([kernel.kallsyms])
  MediaSu~sor #75 1093827 [026] 63839.594555:     332652    cycles:P:      55be753ad030 moz_xmalloc+0x200 (/usr/lib64/firefox/firefox)
          swapper       0 [027] 63839.594616:     160548    cycles:P:  ffffffffad144840 menu_select+0x570 ([kernel.kallsyms])
  Isolated Web Co   14019 [027] 63839.595120: 1000050178 offcpu-time:
             7fc9537cc6c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
             7fc9537c104c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
             7fc9537c3928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
             7fc95372a3c8 pt_TimedWait+0xb8 (/usr/lib64/libnspr4.so)
             7fc95372a8d8 PR_WaitCondVar+0x68 (/usr/lib64/libnspr4.so)
             7fc94afb1f7c WatchdogMain(void*)+0xac (/usr/lib64/firefox/libxul.so)
             7fc947498660 [unknown] ([unknown])
             7fc9535fce88 [unknown] ([unknown])
             7fc94b620e60 WatchdogManager::~WatchdogManager()+0x0 (/usr/lib64/firefox/libxul.so)
          fff8548387f8b48 [unknown] ([unknown])

          swapper       0 [003] 63839.595712:     212948    cycles:P:  ffffffffacd5b865 acpi_os_read_port+0x55 ([kernel.kallsyms])
&lt;SNIP&gt;

Suggested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Suggested-by: Ian Rogers &lt;irogers@google.com&gt;
Suggested-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Howard Chu &lt;howardchu95@gmail.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Gautam Menghani &lt;gautam@linux.ibm.com&gt;
Tested-by: Ian Rogers &lt;irogers@google.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241108204137.2444151-2-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
