<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/tools/perf/Documentation/perf-stat.txt, branch linux-4.17.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.17.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-04-12T12:29:31Z</updated>
<entry>
<title>perf stat: Enable 1ms interval for printing event counters values</title>
<updated>2018-04-12T12:29:31Z</updated>
<author>
<name>Alexey Budankov</name>
<email>alexey.budankov@linux.intel.com</email>
</author>
<published>2018-04-03T18:18:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9dc9a95f03a69ab926d9ff1986ab2087f34a5dce'/>
<id>urn:sha1:9dc9a95f03a69ab926d9ff1986ab2087f34a5dce</id>
<content type='text'>
Currently print count interval for performance counters values is
limited by 10ms so reading the values at frequencies higher than 100Hz
is restricted by the tool.

This change makes perf stat -I possible on frequencies up to 1KHz and,
to some extent, makes perf stat -I to be on-par with perf record
sampling profiling.

When running perf stat -I for monitoring e.g. PCIe uncore counters and
at the same time profiling some I/O workload by perf record e.g. for
cpu-cycles and context switches, it is then possible to observe
consolidated CPU/OS/IO(Uncore) performance picture for that workload.

Tool overhead warning printed when specifying -v option can be missed
due to screen scrolling in case you have output to the console
so message is moved into help available by running perf stat -h.

Signed-off-by: Alexey Budankov &lt;alexey.budankov@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/b842ad6a-d606-32e4-afe5-974071b5198e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf pmu: Auto-merge PMU events created by prefix or glob match</title>
<updated>2018-03-08T13:05:49Z</updated>
<author>
<name>Agustin Vega-Frias</name>
<email>agustinv@codeaurora.org</email>
</author>
<published>2018-03-06T14:04:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c199c11dce197b12ff884ac0cfcb527b1164788b'/>
<id>urn:sha1:c199c11dce197b12ff884ac0cfcb527b1164788b</id>
<content type='text'>
Auto-merge for these events was disabled when auto-merging of non-alias
events was disabled in commit 63ce844 (perf stat: Only auto-merge events
that are PMU aliases).

Non-merging of legacy events is preserved:

    $ perf stat -ag -e cache-misses,cache-misses sleep 1

     Performance counter stats for 'system wide':

                86,323      cache-misses
                86,323      cache-misses

           1.002623307 seconds time elapsed

But prefix or glob matching auto-merges the events created:

    $ perf stat -a -e l3cache/read-miss/ sleep 1

     Performance counter stats for 'system wide':

                   328      l3cache/read-miss/

           1.002627008 seconds time elapsed

    $ perf stat -a -e l3cache_0_[01]/read-miss/ sleep 1

     Performance counter stats for 'system wide':

                   172      l3cache/read-miss/

           1.002627008 seconds time elapsed

As with events created with aliases, auto-merging can be suppressed with
the --no-merge option:

    $ perf stat -a -e l3cache/read-miss/ --no-merge sleep 1

     Performance counter stats for 'system wide':

                    67      l3cache/read-miss/
                    67      l3cache/read-miss/
                    63      l3cache/read-miss/
                    60      l3cache/read-miss/

           1.002622192 seconds time elapsed

Signed-off-by: Agustin Vega-Frias &lt;agustinv@codeaurora.org&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Timur Tabi &lt;timur@codeaurora.org&gt;
Cc: linux-arm-kernel@lists.infradead.org
Change-Id: I0a47eed54c05e1982ca964d743b37f50f60c508c
Link: http://lkml.kernel.org/r/1520345084-42646-4-git-send-email-agustinv@codeaurora.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf pmu: Support wildcards on pmu name in dynamic pmu events</title>
<updated>2018-03-08T13:05:25Z</updated>
<author>
<name>Agustin Vega-Frias</name>
<email>agustinv@codeaurora.org</email>
</author>
<published>2018-03-06T14:04:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b2b9d3a3f0211c5d08c7befdf9d4adad48cda315'/>
<id>urn:sha1:b2b9d3a3f0211c5d08c7befdf9d4adad48cda315</id>
<content type='text'>
Starting on v4.12 event parsing code for dynamic pmu events already
supports prefix-based matching of multiple pmus when creating dynamic
events. E.g., in a system with the following dynamic pmus:

    mypmu_0
    mypmu_1
    mypmu_2
    mypmu_4

passing mypmu/&lt;config&gt;/ as an event spec will result in the creation of
the event in all of the pmus. This change expands this matching through
the use of fnmatch so glob-like expressions can be used to create events
in multiple pmus. E.g., in the system described above if a user only
wants to create the event in mypmu_0 and mypmu_1, mypmu_[01]/&lt;config&gt;/
can be passed.

Signed-off-by: Agustin Vega-Frias &lt;agustinv@codeaurora.org&gt;
Acked-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Timur Tabi &lt;timur@codeaurora.org&gt;
Change-Id: Icb25653fc5d5239c20f3bffdfdf4ab4c9c9bb20b
Link: http://lkml.kernel.org/r/1520454947-16977-1-git-send-email-agustinv@codeaurora.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf cgroup: Simplify arguments when tracking multiple events</title>
<updated>2018-02-22T13:02:27Z</updated>
<author>
<name>weiping zhang</name>
<email>zhangweiping@didichuxing.com</email>
</author>
<published>2018-01-29T15:48:09Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=25f72f9ed88d5be86c92432fc779e2725e3506cd'/>
<id>urn:sha1:25f72f9ed88d5be86c92432fc779e2725e3506cd</id>
<content type='text'>
When using -G with one cgroup and -e with multiple events, only the
first event gets the correct cgroup setting, all events from the second
onwards will track system-wide events.

If the user wants to track multiple events for a specific cgroup, the
user must give parameters like the following:

  $ perf stat -e e1 -e e2 -e e3 -G test,test,test

This patch simplify this case, just type one cgroup:

  $ perf stat -e e1 -e e2 -e e3 -G test

  $ mkdir -p /sys/fs/cgroup/perf_event/empty_cgroup
  $ perf stat -e cycles -e cache-misses -a -I 1000 -G empty_cgroup

Before:

     1.001007226   &lt;not counted&gt;      cycles	   empty_cgroup
     1.001007226           7,506      cache-misses

After:

     1.000834097   &lt;not counted&gt;      cycles	   empty_cgroup
     1.000834097   &lt;not counted&gt;      cache-misses empty_cgroup

Signed-off-by: weiping zhang &lt;zhangweiping@didichuxing.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20180129154805.GA6284@localhost.didichuxing.com
[ Improved the doc text a bit, providing an example for cgroup + system wide counting ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Add support to print counts after a period of time</title>
<updated>2018-02-16T13:18:06Z</updated>
<author>
<name>yuzhoujian</name>
<email>yuzhoujian@didichuxing.com</email>
</author>
<published>2018-01-29T09:25:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f1f8ad52f8bf1239282737a2a5c3bd450300cc78'/>
<id>urn:sha1:f1f8ad52f8bf1239282737a2a5c3bd450300cc78</id>
<content type='text'>
Introduce a new option to print counts after N milliseconds and update
'perf stat' documentation accordingly.

Show below is the output of the new option for perf stat.

  $ perf stat --time 2000 -e cycles -a
  Performance counter stats for 'system wide':

        157,260,423      cycles

        2.003060766 seconds time elapsed

We can print the count deltas after N milliseconds with this new
introduced option. This option is not supported with "-I" option.

In addition, according to Kangliang's patch(19afd10410957), the
monitoring overhead for system-wide core event could be very high if the
interval-print parameter was below 100ms, and the limitation value is
10ms.

So the same warning will be displayed when the time is set between 10ms
to 100ms, and the minimal time is limited to 10ms. Users can make a
decision according to their spcific cases.

Committer notes:

This actually stops the workload after the specified time, then prints
the counts.

So I renamed the option to --timeout and updated the documentation to
state that it will not just print the counts after the specified time,
but will really stop the 'perf stat' session and print the counts.

The rename from 'time' to 'timeout' also fixes the build in systems
where 'time' is used by glibc and can't be used as a name of a variable,
such as centos:5 and centos:6.

Changes since v3:
- none.

Changes since v2:
- modify the time check in __run_perf_stat func to keep some consistency
  with the workload case.
- add the warning when the time is set between 10ms to 100ms.
- add the pr_err when the time is set below 10ms.

Changes since v1:
- none.

Signed-off-by: yuzhoujian &lt;yuzhoujian@didichuxing.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Kan Liang &lt;kan.liang@intel.com&gt;
Cc: Milian Wolff &lt;milian.wolff@kdab.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Link: http://lkml.kernel.org/r/1517217923-8302-3-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Add support to print counts for fixed times</title>
<updated>2018-02-16T13:09:24Z</updated>
<author>
<name>yuzhoujian</name>
<email>yuzhoujian@didichuxing.com</email>
</author>
<published>2018-01-29T09:25:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=db06a269ecbb1d71d534fc5713624eeeee0b8f92'/>
<id>urn:sha1:db06a269ecbb1d71d534fc5713624eeeee0b8f92</id>
<content type='text'>
Introduce a new option to print counts for fixed number of times and
update 'perf stat' documentation accordingly.

Show below is the output of the new option for perf stat.

  $ perf stat -I 1000 --interval-count 2 -e cycles -a
  #           time             counts unit events
           1.002827089         93,884,870      cycles
           2.004231506         56,573,446      cycles

We can just print the counts for several times with this newly
introduced option. The usage of it is a little like 'vmstat', and it
should be used together with "-I" option.

  $ vmstat -n 1 2
  procs ---------memory-------------- --swap- ----io-- -system-- ------cpu---
   r  b swpd   free   buff   cache    si   so  bi   bo  in   cs us sy id wa st
   0  0    0 78270544 547484 51732076  0   0   0   20    1    1  1  0 99  0 0
   0  0    0 78270512 547484 51732080  0   0   0   16  477 1555  0  0 100 0 0

Changes since v3:
- merge interval_count check and times check to one line.
- fix the wrong indent in stat.h
- use stat_config.times instead of 'times' in cmd_stat function.

Changes since v2:
- none.

Changes since v1:
- change the name of the new option "times-print" to "interval-count".
- keep the new option interval specifically.

Signed-off-by: yuzhoujian &lt;yuzhoujian@didichuxing.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Kan Liang &lt;kan.liang@intel.com&gt;
Cc: Milian Wolff &lt;milian.wolff@kdab.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Support JSON metrics in perf stat</title>
<updated>2017-09-13T12:49:13Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2017-08-31T19:40:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b18f3e365019de1a5b26a851e123f0aedcce881f'/>
<id>urn:sha1:b18f3e365019de1a5b26a851e123f0aedcce881f</id>
<content type='text'>
Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).

Previously metrics were always tied to an event and automatically
enabled with that event. But now change it that we can have standalone
metrics. They are in the same JSON data structure as events, but don't
have an event name.

We also allow to organize the metrics in metric groups, which allows a
short cut to select several related metrics at once.

Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.

Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist.  When
computing shadow values look for metrics in that list.  Then they are
computed using the existing saved values infrastructure in stat-shadow.c

The actual JSON metrics are in a separate pull request.

  % perf stat -M Summary --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Instructions   CLKS          CPU_Utilization  GFLOPs   SMT_2T_Utilization   Kernel_Utilization
  317614222.0    1392930775.0  0.0              0.0      0.2                  0.1

       1.001497549 seconds time elapsed

  % perf stat -M GFLOPs flops

   Performance counter stats for 'flops':

     3,999,541,471  fp_comp_ops_exe.sse_scalar_single #  1.2 GFLOPs   (66.65%)
                14  fp_comp_ops_exe.sse_scalar_double                 (66.65%)
                 0  fp_comp_ops_exe.sse_packed_double                 (66.67%)
                 0  fp_comp_ops_exe.sse_packed_single                 (66.70%)
                 0  simd_fp_256.packed_double                         (66.70%)
                 0  simd_fp_256.packed_single                         (66.67%)
                 0  duration_time

       3.238372845 seconds time elapsed

v2: Add missing header file
v3: Move find_map to pmu.c

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Fix path to PMU formats in documentation</title>
<updated>2017-08-28T14:05:09Z</updated>
<author>
<name>Jack Henschel</name>
<email>jackdev@mailbox.org</email>
</author>
<published>2017-08-24T13:20:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=726647d0526c5c2f3472010677122b89d9e4ef88'/>
<id>urn:sha1:726647d0526c5c2f3472010677122b89d9e4ef88</id>
<content type='text'>
As defined in tools/perf/util/pmu.c, the EVENT_SOURCE_DEVICE_PATH is
/sys/bus/event_source/devices/ (no traling 's' in event_source)

This patch corrects the path in the perf stat documentation

Signed-off-by: Jack Henschel &lt;jackdev@mailbox.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jack Henschel &lt;jackdev@mailbox.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: trivial@kernel.org
Link: http://lkml.kernel.org/r/20170824132022.10934-1-jackdev@mailbox.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Add support to measure SMI cost</title>
<updated>2017-06-21T14:35:35Z</updated>
<author>
<name>Kan Liang</name>
<email>Kan.liang@intel.com</email>
</author>
<published>2017-05-26T19:05:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=daefd0bc0bd28cea2e6b2f3e1a9da005cd4f58fc'/>
<id>urn:sha1:daefd0bc0bd28cea2e6b2f3e1a9da005cd4f58fc</id>
<content type='text'>
Implementing a new --smi-cost mode in perf stat to measure SMI cost.

During the measurement, the /sys/device/cpu/freeze_on_smi will be set.

The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).

In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.

SMI cycles% = (aperf - unhalted core cycles) / aperf

Here is an example output.

 Performance counter stats for 'sudo echo ':

SMI cycles%          SMI#
    0.1%              1

       0.010858678 seconds time elapsed

Users who wants to get the actual value can apply additional
--no-metric-only.

Signed-off-by: Kan Liang &lt;Kan.liang@intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Kan Liang &lt;kan.liang@intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Robert Elliott &lt;elliott@hpe.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Collapse identically named events</title>
<updated>2017-03-21T19:04:11Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2017-03-20T20:17:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=430daf2dc7aff16096a137347e6fd03d4af609e9'/>
<id>urn:sha1:430daf2dc7aff16096a137347e6fd03d4af609e9</id>
<content type='text'>
The uncore PMU has a lot of duplicated PMUs for different subsystems.
When expanding an uncore alias we usually end up with a large
number of identically named aliases, which makes perf stat
output difficult to read.

Automatically sum them up in perf stat, unless --no-merge is specified.

This can be default because only the uncores generally have duplicated
aliases. Other PMUs have unique names.

Before:

  % perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1

  Performance counter stats for 'system wide':

           694,976 Bytes unc_c_llc_lookup.any
           706,304 Bytes unc_c_llc_lookup.any
           956,608 Bytes unc_c_llc_lookup.any
           782,720 Bytes unc_c_llc_lookup.any
           605,696 Bytes unc_c_llc_lookup.any
           442,816 Bytes unc_c_llc_lookup.any
           659,328 Bytes unc_c_llc_lookup.any
           509,312 Bytes unc_c_llc_lookup.any
           263,936 Bytes unc_c_llc_lookup.any
           592,448 Bytes unc_c_llc_lookup.any
           672,448 Bytes unc_c_llc_lookup.any
           608,640 Bytes unc_c_llc_lookup.any
           641,024 Bytes unc_c_llc_lookup.any
           856,896 Bytes unc_c_llc_lookup.any
           808,832 Bytes unc_c_llc_lookup.any
           684,864 Bytes unc_c_llc_lookup.any
           710,464 Bytes unc_c_llc_lookup.any
           538,304 Bytes unc_c_llc_lookup.any

       1.002577660 seconds time elapsed

After:

  % perf stat -a -e unc_c_llc_lookup.any sleep 1

  Performance counter stats for 'system wide':

         2,685,120 Bytes unc_c_llc_lookup.any

       1.002648032 seconds time elapsed

v2: Split collect_aliases. Rename alias flag.
v3: Make sure unsupported/not counted is always printed.
v4: Factor out callback change into separate patch.
v5: Move check for bad results here
    Move merged check into collect_data

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
