<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/tools/perf/util/hist.c, branch linux-rolling-stable</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-rolling-stable</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-rolling-stable'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2025-12-18T13:00:06Z</updated>
<entry>
<title>perf hist: In init, ensure mem_info is put on error paths</title>
<updated>2025-12-18T13:00:06Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2025-11-22T08:19:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d80ff470826965eb634dae8cd27e72452577dea0'/>
<id>urn:sha1:d80ff470826965eb634dae8cd27e72452577dea0</id>
<content type='text'>
[ Upstream commit f60efb4454b24cc944ff3eac164bb9dce9169f71 ]

Rather than exit the internal map_symbols directly, put the mem-info
that does this and also lowers the reference count on the mem-info
itself otherwise the mem-info is being leaked.

Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit")
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Reviewed-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf sample: Remove arch notion of sample parsing</title>
<updated>2025-07-25T17:37:58Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2025-07-24T16:33:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8882095b1d4d785524a7a4df8e04e35cfd039142'/>
<id>urn:sha1:8882095b1d4d785524a7a4df8e04e35cfd039142</id>
<content type='text'>
By definition arch sample parsing and synthesis will inhibit certain
kinds of cross-platform record then analysis (report, script,
etc.). Remove arch_perf_parse_sample_weight and
arch_perf_synthesize_sample_weight replacing with a common
implementation. Combine perf_sample p_stage_cyc and retire_lat as
weight3 to capture the differing uses regardless of compiled for
architecture.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250724163302.596743-21-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf hist: Hide unused mem stat columns</title>
<updated>2025-05-02T18:36:14Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-04-30T20:55:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=225772c17c9fb7679a4db45837fa9b429d97d6f4'/>
<id>urn:sha1:225772c17c9fb7679a4db45837fa9b429d97d6f4</id>
<content type='text'>
Some mem_stat types don't use all 8 columns.  And there are cases only
samples in certain kinds of mem_stat types are available only.  For that
case hide columns which has no samples.

The new output for the previous data would be:

  $ perf mem report -F overhead,op,comm --stdio
  ...
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      44.85%    21.1%  30.7%  48.3%  swapper
      26.82%    98.8%   0.3%   0.9%  netsli-prober
       7.19%    51.7%  13.7%  34.6%  perf
       5.81%    89.7%   2.2%   8.1%  qemu-system-ppc
       4.77%   100.0%   0.0%   0.0%  notifications_c
       1.77%    95.9%   1.2%   3.0%  MemoryReleaser
       0.77%    71.6%   4.1%  24.3%  DefaultEventMan
       0.19%    66.7%  22.2%  11.1%  gnome-shell
       ...

On Intel machines, the event is only for loads or stores so it'll have
only one column:

  #            Mem Op
  # Overhead     Load  Command
  # ........  .......  ...............
  #
      20.55%   100.0%  swapper
      17.13%   100.0%  chrome
       9.02%   100.0%  data-loop.0
       6.26%   100.0%  pipewire-pulse
       5.63%   100.0%  threaded-ml
       5.47%   100.0%  GraphRunner
       5.37%   100.0%  AudioIP~allback
       5.30%   100.0%  Chrome_ChildIOT
       3.17%   100.0%  Isolated Web Co
       ...

Committer testing:

  # grep "model name" -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processo
  # perf mem report -F overhead,op,comm --stdio
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      61.02%    14.4%  25.5%  60.1%  swapper
       5.61%    26.4%  13.5%  60.1%  Isolated Web Co
       5.50%    21.4%  29.7%  49.0%  perf
       4.74%    27.2%  15.2%  57.6%  gnome-shell
       4.63%    33.6%  11.5%  54.9%  mdns_service
       4.29%    28.3%  12.4%  59.3%  ptyxis
       2.16%    24.6%  19.3%  56.1%  DOM Worker
       0.99%    23.1%  34.6%  42.3%  firefox
       0.72%    26.3%  15.8%  57.9%  IPC I/O Parent
       0.61%    12.5%  12.5%  75.0%  kworker/u130:20
       0.61%    37.5%  18.8%  43.8%  podman
       0.57%    33.3%   6.7%  60.0%  Timer
       0.53%    14.3%   7.1%  78.6%  KMS thread
       0.49%    30.8%   7.7%  61.5%  kworker/u130:3-
       0.46%    41.7%  33.3%  25.0%  IPDL Background

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Link: https://lore.kernel.org/r/20250430205548.789750-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf hist: Basic support for mem_stat accounting</title>
<updated>2025-05-02T18:36:14Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-04-30T20:55:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9fcb43e27c0e2696f6315022972e75ec0da8eb86'/>
<id>urn:sha1:9fcb43e27c0e2696f6315022972e75ec0da8eb86</id>
<content type='text'>
Add a logic to account he-&gt;mem_stat based on mem_stat_type in hists.

Each mem_stat entry will have different meaning based on the type so the
index in the array is calculated at runtime using the corresponding
value in the sample.data_src.

Still hists has no mem_stat_types yet so this code won't work for now.

Later hists-&gt;mem_stat_types will be allocated based on what users want
in the output actually.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Link: https://lore.kernel.org/r/20250430205548.789750-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf hist: Add struct he_mem_stat</title>
<updated>2025-05-02T18:36:14Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-04-30T20:55:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=930d4c45c687246d02950d4aa2608641f26498ec'/>
<id>urn:sha1:930d4c45c687246d02950d4aa2608641f26498ec</id>
<content type='text'>
The 'struct he_mem_stat' is to save detailed information about memory
instruction.  It'll be used to show breakdown of various data from
PERF_SAMPLE_DATA_SRC.  Note that this structure is generic and the
contents will be different depending on actual data it'll use later.

The information about the actual data will be saved in 'struct hists'
and its length is in nr_mem_stats.  This commit just adds ground works
and does nothing since hists-&gt;nr_mem_stats is 0 for now.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Leo Yan &lt;leo.yan@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf report: Fix memory leaks in the hierarchy mode</title>
<updated>2025-03-07T22:07:07Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2025-03-07T06:12:50Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e1f5bb18a7b25cac6cbf219b5f28159656faa152'/>
<id>urn:sha1:e1f5bb18a7b25cac6cbf219b5f28159656faa152</id>
<content type='text'>
Ian told me that there are many memory leaks in the hierarchy mode.  I
can easily reproduce it with the follwing command.

  $ make DEBUG=1 EXTRA_CFLAGS=-fsanitize=leak

  $ perf record --latency -g -- ./perf test -w thloop

  $ perf report -H --stdio
  ...
  Indirect leak of 168 byte(s) in 21 object(s) allocated from:
      #0 0x7f3414c16c65 in malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
      #1 0x55ed3602346e in map__get util/map.h:189
      #2 0x55ed36024cc4 in hist_entry__init util/hist.c:476
      #3 0x55ed36025208 in hist_entry__new util/hist.c:588
      #4 0x55ed36027c05 in hierarchy_insert_entry util/hist.c:1587
      #5 0x55ed36027e2e in hists__hierarchy_insert_entry util/hist.c:1638
      #6 0x55ed36027fa4 in hists__collapse_insert_entry util/hist.c:1685
      #7 0x55ed360283e8 in hists__collapse_resort util/hist.c:1776
      #8 0x55ed35de0323 in report__collapse_hists /home/namhyung/project/linux/tools/perf/builtin-report.c:735
      #9 0x55ed35de15b4 in __cmd_report /home/namhyung/project/linux/tools/perf/builtin-report.c:1119
      #10 0x55ed35de43dc in cmd_report /home/namhyung/project/linux/tools/perf/builtin-report.c:1867
      #11 0x55ed35e66767 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:351
      #12 0x55ed35e66a0e in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:404
      #13 0x55ed35e66b67 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:448
      #14 0x55ed35e66eb0 in main /home/namhyung/project/linux/tools/perf/perf.c:556
      #15 0x7f340ac33d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
  ...

  $ perf report -H --stdio 2&gt;&amp;1 | grep -c '^Indirect leak'
  93

I found that hist_entry__delete() missed to release child entries in the
hierarchy tree (hroot_{in,out}).  It needs to iterate the child entries
and call hist_entry__delete() recursively.

After this change:

  $ perf report -H --stdio 2&gt;&amp;1 | grep -c '^Indirect leak'
  0

Reported-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by Thomas Falcon &lt;thomas.falcon@intel.com&gt;
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Link: https://lore.kernel.org/r/20250307061250.320849-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf report: Fix sample number stats for branch entry mode</title>
<updated>2025-02-25T00:02:28Z</updated>
<author>
<name>Thomas Falcon</name>
<email>thomas.falcon@intel.com</email>
</author>
<published>2025-02-20T04:59:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c40aa8d98db64ee2144bf6cc55eddb4f7625d728'/>
<id>urn:sha1:c40aa8d98db64ee2144bf6cc55eddb4f7625d728</id>
<content type='text'>
Currently, stats-&gt;nr_samples is incremented per entry in the branch stack
instead of per sample taken. As a result, statistics of samples taken
during perf record in --branch-filter or --branch-any mode does not
seem correct. Instead call hists__inc_nr_samples() for each sample taken
instead of for each entry in the branch stack.

Before:

$ ./perf record -e cycles:u -b -c 10000000000 ./tchain_edit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (2 samples) ]
$ perf report -D | tail -n 16
Aggregated stats:
               TOTAL events:         16
                COMM events:          2  (12.5%)
                EXIT events:          1  ( 6.2%)
              SAMPLE events:          2  (12.5%)
               MMAP2 events:          2  (12.5%)
             KSYMBOL events:          1  ( 6.2%)
      FINISHED_ROUND events:          1  ( 6.2%)
            ID_INDEX events:          1  ( 6.2%)
          THREAD_MAP events:          1  ( 6.2%)
             CPU_MAP events:          1  ( 6.2%)
        EVENT_UPDATE events:          2  (12.5%)
           TIME_CONV events:          1  ( 6.2%)
       FINISHED_INIT events:          1  ( 6.2%)
cpu_core/cycles/u stats:
              SAMPLE events:         64

After:

$ ./perf report -D | tail -n 16
Aggregated stats:
               TOTAL events:         16
                COMM events:          2  (12.5%)
                EXIT events:          1  ( 6.2%)
              SAMPLE events:          2  (12.5%)
               MMAP2 events:          2  (12.5%)
             KSYMBOL events:          1  ( 6.2%)
      FINISHED_ROUND events:          1  ( 6.2%)
            ID_INDEX events:          1  ( 6.2%)
          THREAD_MAP events:          1  ( 6.2%)
             CPU_MAP events:          1  ( 6.2%)
        EVENT_UPDATE events:          2  (12.5%)
           TIME_CONV events:          1  ( 6.2%)
       FINISHED_INIT events:          1  ( 6.2%)
cpu_core/cycles/u stats:
              SAMPLE events:          2

Signed-off-by: Thomas Falcon &lt;thomas.falcon@intel.com&gt;
Link: https://lore.kernel.org/r/20250220045942.114965-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf report: Add latency output field</title>
<updated>2025-02-18T22:04:32Z</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2025-02-13T09:08:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ee1cffbe24e7dc129c737a5f433b35e2ce0bdd78'/>
<id>urn:sha1:ee1cffbe24e7dc129c737a5f433b35e2ce0bdd78</id>
<content type='text'>
Latency output field is similar to overhead, but represents overhead for
latency rather than CPU consumption. It's re-scaled from overhead by dividing
weight by the current parallelism level at the time of the sample.
It effectively models profiling with 1 sample taken per unit of wall-clock
time rather than unit of CPU time.

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Link: https://lore.kernel.org/r/b6269518758c2166e6ffdc2f0e24cfdecc8ef9c1.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf report: Add parallelism filter</title>
<updated>2025-02-18T22:04:32Z</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2025-02-13T09:08:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=61b6b31c2f51f8757ecc65df8a4f5eeff029a804'/>
<id>urn:sha1:61b6b31c2f51f8757ecc65df8a4f5eeff029a804</id>
<content type='text'>
Add parallelism filter that can be used to look at specific parallelism
levels only. The format is the same as cpu lists. For example:

Only single-threaded samples: --parallelism=1
Low parallelism only: --parallelism=1-4
High parallelism only: --parallelism=64-128

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Link: https://lore.kernel.org/r/e61348985ff0a6a14b07c39e880edbd60a8f8635.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf report: Switch filtered from u8 to u16</title>
<updated>2025-02-18T22:04:31Z</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2025-02-13T09:08:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=216f8a970ca4a0369fd0ac707b1c405dec9512a8'/>
<id>urn:sha1:216f8a970ca4a0369fd0ac707b1c405dec9512a8</id>
<content type='text'>
We already have all u8 bits taken, adding one more filter leads to unpleasant
failure mode, where code compiles w/o warnings, but the last filters silently
don't work. Add a typedef and switch to u16.

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Link: https://lore.kernel.org/r/32b4ce1731126c88a2d9e191dc87e39ae4651cb7.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
</feed>
