| Age | Commit message | Author |
|
Except in dwarf-regs the function is never called. The weak function
has no strong arch implementations.
Remove so that the fall-through case applies.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_regs.h has cross architecture functions for operating with the
differing perf register constants. dwarf-regs.h is similar but for
cross architecture dwarf notions of registers.
For consistency move the arch parts of dwarf-regs out of util and into
its own directory.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The arch string requires multiple strcmp to identify things like the
IP and SP.
Switch to passing in an e_machine that in the bulk of cases is computed
using a current thread load.
The e_machine also allows identification of 32-bit vs 64-bit processes.
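As a rough illustration of why an e_machine is easier to work with than an arch string, the sketch below switches on the ELF machine constant instead of chaining strcmp() calls. The helper names machine_is_64bit() and sp_register_name() are hypothetical and are not taken from perf; only the EM_* constants come from <elf.h>, and the list of machines is deliberately incomplete.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true for the 64-bit machines in this short, incomplete list. */
bool machine_is_64bit(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_X86_64:
	case EM_AARCH64:
	case EM_PPC64:
		return true;
	default:
		return false;
	}
}

/* Hypothetical helper: conventional stack pointer name, or NULL if not listed here. */
const char *sp_register_name(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_386:
		return "esp";
	case EM_X86_64:
		return "rsp";
	case EM_ARM:
	case EM_AARCH64:
		return "sp";
	default:
		return NULL;
	}
}
```

A single integer compare per case replaces several string compares, and the 32-bit and 64-bit variants of an architecture are distinct values rather than one shared arch name.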
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
[ Include dwarf-regs.h to get conditional defines for EM_CSKY and EM_LOONGARCH, not available in old distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than having two Dwfls, unify the Dwfl in skip-callchain-idx with the
one that is used by libdw__addr2line().
Rename that variable in 'struct dso' from 'a2l_libdw' to just 'libdw' as
it is now used in more than addr2line.
The Dwfl in skip-callchain-idx uses a map address when being read with
dwfl_report_elf (rather than dwfl_report_offline that addr2line
uses).
skip-callchain-idx is wrong as the map address can vary between
processes because of ASLR, i.e. it would need a different Dwfl per
process.
In the code after this patch the base address becomes 0 and the mapped
PC is used with the dwfl functions.
This should increase the accuracy of skip-callchain-idx, but the impact
has only been build tested.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Argument is a pointer but EM_NONE (0) was being passed. Correct by
passing NULL.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some irregular stack traces are causing double frees and memory
leaks. Make the code robust by proactively freeing and being more
careful with the memory management of the leaf_srcline.
Fixes: 88c51002d06f9a68 ("perf addr2line: Add a libdw implementation")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The set_initial_registers field of Dwfl_Thread_Callbacks needs to be set
according to the arch of the stack samples being analyzed, not the arch
that perf itself is built for.
Currently perf fails to unwind stack samples collected from archs
different from that of the host perf is running on.
This patch moves the arch-specific implementations of set_initial_registers
from tools/perf/arch to tools/perf/util/unwind-libdw-arch, similar to the
way the perf-regs-arch folder contains arch-specific functions related to
registers, and chooses the implementation based on the arch of the data
being processed.
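The selection described above can be pictured as a lookup keyed on the machine type of the data rather than a build-time choice. This is only a sketch: the callback type and the stub functions are hypothetical stand-ins, and the real libdw set_initial_registers callback has a different signature.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type, standing in for the real libdw callback. */
typedef int (*init_regs_fn)(void *arg);

/* Stub per-arch implementations; real ones would fill in sampled registers. */
int init_regs_x86_64(void *arg) { (void)arg; return EM_X86_64; }
int init_regs_arm64(void *arg)  { (void)arg; return EM_AARCH64; }

/* Pick the implementation from the e_machine of the samples, not of the host. */
init_regs_fn get_init_regs(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_X86_64:
		return init_regs_x86_64;
	case EM_AARCH64:
		return init_regs_arm64;
	default:
		return NULL; /* unwinding not supported for this machine in the sketch */
	}
}
```

With this shape, every per-arch implementation is compiled into the tool, so a perf built on one architecture can still unwind samples recorded on another.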
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Shimin Guo <shimin.guo@skydio.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Better ensure that an e_machine value read from a file is valid by
checking that the file looks like an ELF file and that the value is less
than EM_NUM.
This better avoids spurious e_machine values when looking for an
e_machine in, say, a thread.
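A minimal sketch of the two checks, assuming the first bytes of the file are already in memory. The function name checked_e_machine() is hypothetical and this is not the perf implementation; it only uses constants from <elf.h>.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return e_machine from the start of a file, or
 * EM_NONE when the bytes do not look like a plausible ELF header.
 */
uint16_t checked_e_machine(const unsigned char *buf, size_t len)
{
	/* e_machine follows e_ident (EI_NIDENT bytes) and the 2-byte e_type. */
	const size_t off = EI_NIDENT + sizeof(uint16_t);
	uint16_t e_machine;

	if (len < off + sizeof(uint16_t))
		return EM_NONE;
	if (memcmp(buf, ELFMAG, SELFMAG) != 0)
		return EM_NONE; /* missing "\177ELF" magic */

	switch (buf[EI_DATA]) {
	case ELFDATA2LSB:
		e_machine = (uint16_t)(buf[off] | (buf[off + 1] << 8));
		break;
	case ELFDATA2MSB:
		e_machine = (uint16_t)((buf[off] << 8) | buf[off + 1]);
		break;
	default:
		return EM_NONE; /* unknown byte order */
	}

	/* Reject values outside the range of known machine constants. */
	return e_machine < EM_NUM ? e_machine : EM_NONE;
}
```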
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The processing of DSO_BINARY_TYPE__GNU_DEBUGDATA in symsrc__init happens
with an open ELF file but the error path only closes the associated fd.
Fix the goto so that the ELF file is also ended and memory released.
Fixes: b10f74308e130527 ("perf symbol: Support .gnu_debugdata for symbols")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Requiring the 'pmu->perf_event_attr_init_default' callback to be set to
track user changes is a bit of a trap to fall in. It's hard to see that
this is required when depending on the user change tracking.
It's possible to want all-zero defaults, and so not set the callback, while
still doing some programmatic setting of configs with
evsel__set_config_if_unset(). Also if a PMU reverts to 0 defaults and
deletes its existing callback, it will silently break existing uses of
evsel__set_config_if_unset().
One way to fix this would be to assert in evsel__set_config_if_unset()
if the changes weren't tracked, but that would be a possibly untested
runtime failure. Instead, always track it as it's harmless and
simplifies testing too.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This will be used by aux PMUs to read an already written value for
configuring their events and for also testing.
Its helper perf_pmu__format_unpack() does the opposite of the existing
pmu_format_value() so rename that one to perf_pmu__format_pack() so it's
clear how they are related.
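The pack/unpack relationship can be sketched as follows. This is an assumption-laden simplification: the format is modelled as a plain 64-bit mask rather than the bitmap perf uses, and format_pack()/format_unpack() are hypothetical names, not the perf functions.

```c
#include <stdint.h>

/* Scatter the low bits of 'value' into the set bits of 'mask', lowest bit first. */
uint64_t format_pack(uint64_t mask, uint64_t value)
{
	uint64_t out = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		if (value & 1)
			out |= 1ULL << bit;
		value >>= 1;
	}
	return out;
}

/* Gather the bits of 'config' selected by 'mask' back into a contiguous value. */
uint64_t format_unpack(uint64_t mask, uint64_t config)
{
	uint64_t value = 0;
	int n = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		if (config & (1ULL << bit))
			value |= 1ULL << n;
		n++;
	}
	return value;
}
```

For a sparse mask such as 0xF0F, packing 0xAB gives 0xA0B and unpacking 0xA0B gives 0xAB back; bits of the config word outside the mask are ignored when unpacking.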
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Misleadingly, evsel__set_config_if_unset() only works with the config
field and not config1, config2, etc. This is fine at the moment because
all users of it happen to operate on bits that are in that config field.
Fix it before there are any new users of the function which operate on
bits in different config fields.
In theory it's also possible for a driver to move an existing bit to
another config field and this fixes that scenario too, although this
hasn't happened yet either.
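The intended behaviour can be sketched with a toy model in which each config field has its own mask of user-set bits. The struct, the enum and the function signature below are hypothetical and much simpler than the real evsel code; the value is assumed to be already shifted into position.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical indices for the config fields of an event attribute. */
enum { CFG_CONFIG, CFG_CONFIG1, CFG_CONFIG2, CFG_NR };

/* Toy event: config values plus the bits the user explicitly set in each field. */
struct sketch_event {
	uint64_t config[CFG_NR];
	uint64_t user_set[CFG_NR];
};

/*
 * Write 'val' into 'bits' of field 'idx' unless the user already set any of
 * those bits. Returns true when the value was written.
 */
bool set_config_if_unset(struct sketch_event *ev, int idx,
			 uint64_t bits, uint64_t val)
{
	if (ev->user_set[idx] & bits)
		return false;

	ev->config[idx] = (ev->config[idx] & ~bits) | (val & bits);
	return true;
}
```

The point of the sketch is that the check and the write both use the field the bits actually live in, rather than always consulting the first config field.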
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we only track which bits were set by the user in attr->config.
But all configN fields should be treated equally as they can all have
default and user overridden values.
Track them all by making get_config_chgs() generic and calling it once
for each config value.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sparse config fields are technically supported although currently
unused. field_prep() only works for contiguous bitfields so replace it
with pmu_format_value().
pmu_format_value() also takes a bitmap rather than a u64 so replace
'u64 bits' with format->bits.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At least one of these was put here to avoid a Python binding linking
issue which is no longer present. Put them back in their correct
location to avoid confusion about which file to add a new evsel__*
function to later.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/ZEbAS2yx2fguW60w@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the evsel argument first to match the other evsel__* functions
and remove the redundant pmu argument, which can be accessed via evsel.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The ADD_CONFIG_TERM() macros build the __type argument out of a partial
EVSEL__CONFIG_TERM_x enum name. This means that they can't be called
from a function where __type is a variable and it's also impossible to
grep the codebase to find usages of these enums as they're never typed
in full.
Fix this by removing the macros and replacing them with an
add_config_term() function. It seems the main reason these existed in
the first place was to avoid type punning and to write to a specific
field in the union, but the same thing can be achieved with a single
write to a u64 'val' field.
Running the Perf tests with "-fsanitize=undefined -fno-sanitize-recover"
results in no new issues as a result of this change.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
strerror() has thread safety issues, strerror_r() requires stack
allocated buffers.
Code in perf has already been using the "%m" formatting flag that is a
widely supported glibc extension to print the current errno's description.
Expand the usage of this formatting flag and remove usage of
strerror()/strerror_r().
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Yunseong Kim <ysk@kzalloc.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's a lot of infrastructure for generating a relatively simple
array used by one function.
Move the array into the function and remove the supporting build logic.
At the same time opportunistically const-ify the array.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The function addr_location__put() was renamed addr_location__exit() in
commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy
functions"). Make the comment preceding the function consistent with
the function itself.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kexin Sun <kexinsun@smail.nju.edu.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ratnadira Widyasari <ratnadiraw@smu.edu.sg>
Cc: Xutong Ma <xutong.ma@inria.fr>
Cc: Yumbo Lyu <yunbolyu@smu.edu.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These are hard to interpret in the raw output because they are printed
as hex but are defined in perf_event.h as decimal. Make it much easier
to read the raw callchains by just printing their names.
For example:
$ perf report -D
1798195372321 0x4638 [0xb0]: PERF_RECORD_SAMPLE(IP, 0x4002): 44922/44922: 0x7c8046dd3400 period: 120218 addr: 0
... FP chain: nr:12
..... 0: fffffffffffffe00 (PERF_CONTEXT_USER)
..... 1: 00007c8046dd3400
..... 2: 00007c8046db86d3
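The mapping from marker value to name might look like the sketch below. The function name and the SKETCH_* macros are hypothetical; the numeric values are meant to mirror the PERF_CONTEXT_* enum in the uapi perf_event.h header, and PERF_CONTEXT_USER_DEFERRED is left out of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the context markers, assumed to mirror uapi perf_event.h. */
#define SKETCH_CONTEXT_HV           ((uint64_t)-32)
#define SKETCH_CONTEXT_KERNEL       ((uint64_t)-128)
#define SKETCH_CONTEXT_USER         ((uint64_t)-512)
#define SKETCH_CONTEXT_GUEST        ((uint64_t)-2048)
#define SKETCH_CONTEXT_GUEST_KERNEL ((uint64_t)-2176)
#define SKETCH_CONTEXT_GUEST_USER   ((uint64_t)-2560)

/* Return the marker name for a callchain entry, or NULL for an ordinary address. */
const char *callchain_context_name(uint64_t ip)
{
	switch (ip) {
	case SKETCH_CONTEXT_HV:           return "PERF_CONTEXT_HV";
	case SKETCH_CONTEXT_KERNEL:       return "PERF_CONTEXT_KERNEL";
	case SKETCH_CONTEXT_USER:         return "PERF_CONTEXT_USER";
	case SKETCH_CONTEXT_GUEST:        return "PERF_CONTEXT_GUEST";
	case SKETCH_CONTEXT_GUEST_KERNEL: return "PERF_CONTEXT_GUEST_KERNEL";
	case SKETCH_CONTEXT_GUEST_USER:   return "PERF_CONTEXT_GUEST_USER";
	default:                          return NULL;
	}
}
```

The value fffffffffffffe00 in the output above is (u64)-512, which is why it is annotated as PERF_CONTEXT_USER.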
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ Add PERF_CONTEXT_USER_DEFERRED too, as per Namhyung's review comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that the SHA-1 code is no longer used, remove it.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recent patches [1] [2] added an implementation of SHA-1 to perf and made
it be used for build ID generation.
I had understood the choice of SHA-1, which is a legacy algorithm, to be
for backwards compatibility.
It turns out, though, that there's no backwards compatibility
requirement here other than the size of the build ID field, which is
fixed at 20 bytes. Not only did the hash algorithm already change (from
MD5 to SHA-1), but the inputs to the hash changed too: from 'load_addr
|| code' to just 'code', and now again to 'code || symtab || strsym'
[3]. Different linkers generate different build IDs, with the LLVM
linker using BLAKE3 hashes for example [4].
Therefore, we might as well switch to a more modern algorithm. Let's go
with BLAKE2s. It's faster than SHA-1, isn't cryptographically broken,
is easier to implement than BLAKE3, and the kernel's implementation in
lib/crypto/blake2s.c is easily borrowed. It also natively supports
variable-length hashes, so it can directly produce the needed 20 bytes.
Also make the following additional improvements:
- Hash the three inputs incrementally, so they don't all have to be
concatenated into one buffer.
- Add tag/length prefixes to each of the three inputs, so that distinct
input tuples reliably result in distinct hashes.
[1] https://lore.kernel.org/linux-perf-users/20250521225307.743726-1-yuzhuo@google.com/
[2] https://lore.kernel.org/linux-perf-users/20250625202311.23244-1-ebiggers@kernel.org/
[3] https://lore.kernel.org/linux-perf-users/20251125080748.461014-1-namhyung@kernel.org/
[4] https://github.com/llvm/llvm-project/commit/d3e5b6f7539b86995aef6e2075c1edb3059385ce
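To see why tag/length prefixes matter, consider the inputs ("ab", "c") and ("a", "bc"): plain concatenation gives "abc" for both, so any hash of the concatenation collides. The sketch below frames each input before it would be hashed. The frame layout (1-byte tag, 8-byte little-endian length) and the function name are assumptions made for illustration, not the layout perf uses, and the frames are written to a buffer here only so they can be compared; in an incremental scheme each piece would be fed to the hash update function instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Append one input as: 1-byte tag, 8-byte little-endian length, then the data.
 * Returns the new write offset, or 0 if the output buffer is too small.
 */
size_t frame_input(unsigned char *out, size_t cap, size_t off,
		   uint8_t tag, const void *data, uint64_t len)
{
	if (off > cap || cap - off < 9 || cap - off - 9 < len)
		return 0;

	out[off++] = tag;
	for (int i = 0; i < 8; i++)
		out[off++] = (unsigned char)(len >> (8 * i));
	memcpy(out + off, data, len);
	return off + len;
}
```

Framing ("ab", "c") and ("a", "bc") this way yields byte streams of the same length that already differ in the first length field, so the two tuples no longer feed identical data to the hash.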
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add BLAKE2s support to the perf utility library. The code is borrowed
from the kernel. This will replace the use of SHA-1 in genelf.c.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
sample__fprintf_callchain() was using map__fprintf_srcline() which won't
report inline line numbers.
Fix by using the srcline from the callchain and falling back to the map
variant.
Fixes: 25da4fab5f66e659 ("perf evsel: Move fprintf methods to separate source file")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow the addr2line style to be specified on the `perf report` command
line or in the .perfconfig file.
Committer testing:
The methods:
# perf probe -x ~/bin/perf -F *__addr2line
cmd__addr2line
libbfd__addr2line
libdw__addr2line
llvm__addr2line
#
So if we configure one of them, say 'addr2line':
# perf config addr2line.style=addr2line
# perf config addr2line.style
addr2line.style=addr2line
#
And have probes on all of them:
# perf probe -x ~/bin/perf *__addr2line
Added new events:
probe_perf:cmd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:llvm__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libbfd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libdw__addr2line (on *__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw__addr2line -aR sleep 1
#
Only the selected method should be used:
# perf stat -e probe_perf:*_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For hierarchical output, try: perf report --hierarchy)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:cmd__addr2line
0 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
0 probe_perf:libdw__addr2line
0.035915611 seconds time elapsed
0.028008000 seconds user
0.009051000 seconds sys
#
I checked and that is the case for the other methods.
Also when using:
# perf config addr2line.style=libdw,llvm
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
0 probe_perf:cmd__addr2line
23 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
44 probe_perf:libdw__addr2line
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The a2l_style is only relevant to the command line version, so rename
to make this clearer.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add an implementation of addr2line that uses libdw.
Other addr2line implementations are slow, particularly in the case of
forking addr2line.
Add an implementation that caches the libdw information in the dso and
uses it to find the file and line number information.
Inline information is supported, but because cu_walk_functions_at visits
the leaf function last, add an inline_list__append_tail to reverse the
list's order.
Committer testing:
# perf probe -x ~/bin/perf libdw__addr2line
Added new event:
probe_perf:libdw_addr2line (on libdw__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw_addr2line -aR sleep 1
#
# perf stat -e probe_perf:libdw_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For tracepoint events, try: perf report -s trace_fields)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
0.037260744 seconds time elapsed
0.025299000 seconds user
0.011918000 seconds sys
#
Adding probes to the other addr2line implementations (llvm__addr2line,
libbfd__addr2line and cmd__addr2line) I noticed some fallbacks to the
llvm one:
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
23 probe_perf:llvm_addr2line
0 probe_perf:libbfd_addr2line
0 probe_perf:cmd_addr2line
Something to investigate further, but at least we don't fall back to the
cmd based one :-)
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The addition of addr_location__exit() causes use-after-put on the maps
and map references in the unwind info. Add the gets and then add the
map_symbol__exit() calls.
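A minimal sketch of the get/put pairing described above, under stated assumptions: every `toy_*` name is hypothetical and merely stands in for perf's reference-counted maps and map objects. It is not the actual perf code. The holder takes its own reference when it stores the pointer, so the pointer stays valid after the original owner has dropped its reference.

```c
#include <stdlib.h>

/* Toy reference-counted object; a stand-in for perf's maps/map. */
struct toy_map {
	int refcnt;
};

struct toy_map *toy_map__new(void)
{
	struct toy_map *map = calloc(1, sizeof(*map));

	if (map)
		map->refcnt = 1;
	return map;
}

/* Take an additional reference and return the same pointer. */
struct toy_map *toy_map__get(struct toy_map *map)
{
	if (map)
		map->refcnt++;
	return map;
}

/* Drop a reference; returns the remaining count (0 means freed). */
int toy_map__put(struct toy_map *map)
{
	if (!map)
		return 0;
	if (--map->refcnt == 0) {
		free(map);
		return 0;
	}
	return map->refcnt;
}

/* Hypothetical holder that outlives the location it was filled from. */
struct toy_unwind_entry {
	struct toy_map *map;
};

void toy_unwind_entry__init(struct toy_unwind_entry *e, struct toy_map *borrowed)
{
	/* The "get": own a reference instead of borrowing the pointer. */
	e->map = toy_map__get(borrowed);
}

void toy_unwind_entry__exit(struct toy_unwind_entry *e)
{
	/* The matching "put", analogous in spirit to an exit helper. */
	toy_map__put(e->map);
	e->map = NULL;
}
```

Without the get in `toy_unwind_entry__init()`, the owner's put would free the object while the entry still points at it.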
Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit bc22de9bcdb22491 ("perf stat: Display time in precision based on
std deviation") added multirun workload elapsed time. There was an
effort to make the precision in the output most useful for the user,
however, when gathering over runs it means the formatting varies. This
change just makes the output format fixed.
Before:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101140 +- 0.000149 seconds time elapsed ( +- 0.15% )
0.1011396 +- 0.0000218 seconds time elapsed ( +- 0.02% )
0.101331 +- 0.000124 seconds time elapsed ( +- 0.12% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001317 +- 0.000146 seconds time elapsed ( +- 0.01% )
1.001377 +- 0.000172 seconds time elapsed ( +- 0.02% )
1.00253 +- 0.00131 seconds time elapsed ( +- 0.13% )
```
After:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101406408 +- 0.000064778 seconds time elapsed ( +- 0.06% )
0.101367315 +- 0.000027253 seconds time elapsed ( +- 0.03% )
0.101434164 +- 0.000084750 seconds time elapsed ( +- 0.08% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001525467 +- 0.000051703 seconds time elapsed ( +- 0.01% )
1.001375093 +- 0.000116200 seconds time elapsed ( +- 0.01% )
1.001141025 +- 0.000046361 seconds time elapsed ( +- 0.00% )
```
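A small sketch of what a fixed output format looks like, assuming nine decimal places for both the average and the deviation, as in the "After" output above. `toy_format_elapsed()` is a hypothetical helper and not the function perf stat uses.

```c
#include <stdio.h>

/*
 * Format the average elapsed time and its standard deviation with a
 * fixed number of decimals, so every run lines up the same way.
 * Returns the value returned by snprintf().
 */
int toy_format_elapsed(char *buf, size_t size, double avg, double stddev)
{
	/* Relative deviation in percent; guard against a zero average. */
	double pct = avg != 0.0 ? 100.0 * stddev / avg : 0.0;

	return snprintf(buf, size,
			"%.9f +- %.9f seconds time elapsed ( +- %.2f%% )",
			avg, stddev, pct);
}
```

Because the precision no longer depends on the measured deviation, output gathered over many runs keeps a constant column layout.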
Closes: https://lore.kernel.org/lkml/aTQRgAOpKyI53TEq@gmail.com/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If evsel__new_idx() returns NULL, the function currently jumps to label
'out_err'. There, references to `cpus` and `pmu_cpus` are dropped, and
the resources held by evsel->name and evsel->metric_id are freed.
But evsel is NULL on that path, so accessing evsel->name and
evsel->metric_id is a NULL pointer dereference.
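One possible shape of such a fix, sketched with hypothetical `toy_*` types rather than the real evsel code: the shared error label only touches the object's members when the object was actually allocated.

```c
#include <stdlib.h>
#include <string.h>

struct toy_evsel {
	char *name;
	char *metric_id;
};

/* Hypothetical constructor; a non-zero 'fail' simulates ENOMEM. */
struct toy_evsel *toy_evsel__new(int fail)
{
	return fail ? NULL : calloc(1, sizeof(struct toy_evsel));
}

struct toy_evsel *toy_add_event(const char *name, int fail)
{
	struct toy_evsel *evsel = toy_evsel__new(fail);
	size_t len = strlen(name) + 1;

	if (!evsel)
		goto out_err;

	evsel->name = malloc(len);
	if (!evsel->name)
		goto out_err;
	memcpy(evsel->name, name, len);
	return evsel;

out_err:
	/* Only dereference evsel if the allocation succeeded. */
	if (evsel) {
		free(evsel->name);
		free(evsel->metric_id);
		free(evsel);
	}
	return NULL;
}
```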
Fixes: cd63c22168257a0b ("perf parse-events: Minor __add_event refactoring")
Signed-off-by: Faisal Bukhari <faisalbukhari523@gmail.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Adjust some oddly indented fprintf() calls.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This adds a feature to allow restricting the range of converted samples
with a range string like perf-script and perf-report --time.
Committer testing:
Put a probe on the ICMP receive path handling broadcast packets:
# perf probe icmp_rcv:64
Added new event:
probe:icmp_rcv_L64 (on icmp_rcv:64)
You can now use it in all perf tools, such as:
perf record -e probe:icmp_rcv_L64 -aR sleep 1
# perf record -e probe:icmp_rcv_L64 ping -c 10 -b 127.255.255.255
WARNING: pinging broadcast address
PING 127.255.255.255 (127.255.255.255) 56(84) bytes of data.
^C
--- 127.255.255.255 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9217ms
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data (10 samples) ]
# perf script
ping 52785 [009] 5847.300394: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5848.325018: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5849.349007: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5850.372979: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5851.396988: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5852.420954: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5853.444934: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5854.468926: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5855.492914: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5856.516883: probe:icmp_rcv_L64: (ffffffffaadb337e)
#
Now get some slices using perf script:
# perf script --time 40%
ping 52785 [009] 5847.300394: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5848.325018: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5849.349007: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5850.372979: probe:icmp_rcv_L64: (ffffffffaadb337e)
# perf script --time 40%-60%
ping 52785 [009] 5851.396988: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5852.420954: probe:icmp_rcv_L64: (ffffffffaadb337e)
#
And finally use this new feature:
# perf data convert --to-json out.json --time 0%-10%
[ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
[ perf data convert: Converted and wrote 0.001 MB (1 samples) ]
[ perf data convert: Skipped 9 samples ]
# cat out.json
{
"linux-perf-json-version": 1,
"headers": {
"header-version": 1,
"captured-on": "2026-01-06T22:26:40Z",
"data-offset": 520,
"data-size": 34648,
"feat-offset": 35168,
"hostname": "number",
"os-release": "6.17.12-300.fc43.x86_64",
"arch": "x86_64",
"cpu-desc": "AMD Ryzen 9 9950X3D 16-Core Processor",
"cpuid": "AuthenticAMD,26,68,0",
"nrcpus-online": 32,
"nrcpus-avail": 32,
"perf-version": "6.19.rc4.gf4c270685d3d",
"cmdline": [
"/home/acme/bin/perf"
]
},
"samples": [
{
"timestamp": 5847300394661,
"pid": 52785,
"tid": 52785,
"cpu": 9,
"comm": "ping",
"callchain": [
{
"ip": "0xffffffffaadb337f",
"symbol": "icmp_rcv",
"dso": "[kernel.kallsyms]"
}
],
"__probe_ip": "ffffffffaadb337e"
}
]
}
#
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Diamond Rapids introduces two types of PCIe related uncore PMUs:
"uncore_pcie4_*" and "uncore_pcie6_*".
To ensure that generic PCIe events (e.g., UNC_PCIE_CLOCKTICKS) can match
and collect events from both PMU types, slightly relax the wildcard
matching logic in perf_pmu__match_wildcard().
This change allows a wildcard such as "pcie" to match PMU names that
include a numeric suffix, such as "pcie4_*" and "pcie6_*".
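The relaxed rule can be illustrated with a simplified matcher. This is only a sketch under assumptions: `toy_pmu_name_matches()` is a hypothetical function, it is not the body of perf_pmu__match_wildcard(), and it only handles decimal suffixes.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when 'tok' matches 'pmu_name', ignoring an optional
 * "uncore_" prefix, an optional number glued to the token ("pcie4")
 * and an optional "_<number>" instance suffix ("pcie4_0").
 */
bool toy_pmu_name_matches(const char *pmu_name, const char *tok)
{
	const char *p = pmu_name;
	size_t tok_len = strlen(tok);

	if (strncmp(p, "uncore_", 7) == 0)
		p += 7;
	if (strncmp(p, tok, tok_len) != 0)
		return false;
	p += tok_len;

	/* Relaxed part: accept digits directly after the token. */
	while (isdigit((unsigned char)*p))
		p++;
	if (*p == '\0')
		return true;

	/* Otherwise require "_<digits>" up to the end of the name. */
	if (*p != '_' || !isdigit((unsigned char)p[1]))
		return false;
	p++;
	while (isdigit((unsigned char)*p))
		p++;
	return *p == '\0';
}
```

With this rule, the token "pcie" accepts "uncore_pcie4_0" and "uncore_pcie6_12", while a name such as "uncore_cha_0" is still rejected.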
Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20251231224233.113839-12-zide.chen@intel.com
|
|
Due to an off-by-one error introduced in commit 73bbb94466fd3f8b
("kallsyms: support "big" kernel symbols"), long symbols (which are
currently only produced by Rust) can have their symbol type being
wrongly parsed by kernel/kallsyms.c.
This has been fixed in commit f3f9f42232dee596 ("kallsyms: Fix wrong
"big" kernel symbol type read from procfs"), and these symbols are now
reported correctly.
Drop the workaround in perf symbol that filters out these symbol types.
Specifically, '1' and 'l' can never be generated by nm -- 'u' does
indicate GNU unique, however such symbols are only generated by G++ for
C++ templates, and are never generated by LLVM (LLVM generates weak
symbols in such cases instead).
'N' can appear if symbols exist inside debug sections, and 'n' may
appear for symbols inside note sections, however these sections do not
typically have symbols (and they're explicitly filtered out by kallsyms).
Therefore, the previous occurrence of these symbol types must be due to
the off-by-one error and can be safely removed.
Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <lossin@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up perf-tools fixes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add NVIDIA Olympus MIDR to neoverse_spe range list.
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Some callers of filename__read_build_id assume the error value must be
-1. Fix this by making them handle all < 0 values.
If is_regular_file fails in filename__read_build_id then it could be
the file is missing (ENOENT) and it would be wrong to return
-EWOULDBLOCK in that case. Fix the logic so -EWOULDBLOCK is only
reported if other errors with stat haven't occurred.
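The ordering of the two checks can be sketched as follows. `toy_check_regular_file()` is a hypothetical helper, not the perf implementation: a failing stat() reports its own errno, and -EWOULDBLOCK is reserved for paths that exist but are not regular files.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Returns 0 for a regular file, -errno when stat() itself fails
 * (for example -ENOENT for a missing file), and -EWOULDBLOCK only
 * when the path exists but is not a regular file.
 */
int toy_check_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -errno;
	if (!S_ISREG(st.st_mode))
		return -EWOULDBLOCK;
	return 0;
}
```

Callers should test the result with `< 0` rather than `== -1`, since several distinct negative values can come back.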
Fixes: 834ebb5678d7 ("perf tools: Don't read build-ids from non-regular files")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Factor out __sort_dimension__update() so that it can be called from the -s
and -F option parsing logic. Otherwise the following command cannot go
into the annotation mode.
$ perf report -F overhead,type,sym
Warning: Annotation is only available for symbolic views, include "sym*" in --sort to use it.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Users can use -F/--fields option to set output fields and sort keys
together.
But it failed to set perf_hpp_list->need_collapse for sort entries that
have se_collapse callbacks.
So duplicated entries ended up being shown separately.
For example, let's run this command first.
$ perf mem record -t load -U -- perf test -w datasym
This will record samples for memory access (load) to struct 'buf' and a
loop condition ('sig_atomic_t') types.
So the following two commands should have identical output.
$ perf report -s type --stdio --percent-limit=1 -q
87.80% perf buf
12.17% perf sig_atomic_t
But using -F option didn't collapse the entries based on types so the
result looked like below:
$ perf report -F overhead,type --stdio --percent-limit=1 -q
23.31% perf buf
22.84% perf buf
21.26% perf buf
20.39% perf buf
12.17% perf sig_atomic_t
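The missing step can be pictured with a reduced model. All `toy_*` names below are hypothetical and the structures are heavily simplified compared to perf's perf_hpp_list and sort_entry; the point is only that registering a field which has a collapse callback must flag the list as needing a collapse pass.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified sort entry: 'collapse' is NULL when no merging is needed. */
struct toy_sort_entry {
	const char *name;
	int (*collapse)(const void *a, const void *b);
};

struct toy_hpp_list {
	bool need_collapse;
	size_t nr_fields;
};

/* Example callback; a real one would compare e.g. data types. */
int toy_type_collapse(const void *a, const void *b)
{
	(void)a;
	(void)b;
	return 0;
}

/* Shared by both option parsers so neither forgets the flag. */
void toy_hpp_list__add_field(struct toy_hpp_list *list,
			     const struct toy_sort_entry *se)
{
	list->nr_fields++;
	if (se->collapse)
		list->need_collapse = true;
}
```

Routing both the -s and the -F style parsing through one helper like this keeps the two paths from diverging again.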
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently -d/--data option controls both PERF_SAMPLE_ADDR bit and
perf_event_attr.mmap_data flag. Separate them using new --data-mmap
option to support recording only one of them.
For data-type profiling, data MMAP is unnecessary but it wastes a lot
of space in the ring buffer and data file.
Committer testing:
On an idle system:
root@x1:~# perf record -d -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 5.672 MB perf.data (1075 samples) ]
root@x1:~# ls -la perf.data
-rw-------. 1 root root 5982480 Dec 16 15:34 perf.data
root@x1:~# perf evlist -v
cpu_atom/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 144, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, build_id: 1
root@x1:~#
Now with just --data-mmap we will not save that much, as only DATA_SRC
will not be enabled in sample_type:
root@x1:~# perf record --data-mmap -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 5.576 MB perf.data (716 samples) ]
root@x1:~# ls -la perf.data
-rw-------. 1 root root 5880112 Dec 16 15:37 perf.data
root@x1:~# perf evlist -v
cpu_atom/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 144, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, build_id: 1
root@x1:~#
To complete, just with DATA_SRC, no mmap_data:
root@x1:~# perf record --sample-mem-info -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.407 MB perf.data (1311 samples) ]
root@x1:~# ls -la perf.data
-rw-------. 1 root root 1509224 Dec 16 15:40 perf.data
root@x1:~# perf evlist -v
cpu_atom/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 144, config: 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 144, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, build_id: 1
root@x1:~#
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Newer glibc versions return const char * for str{str,chr}() where the
haystack/s is const, so to avoid warnings like these on fedora 44 change
some variables to const:
36 8.17 fedora:44 : FAIL gcc version 15.2.1 20251111 (Red Hat 15.2.1-4) (GCC)
libbpf.c: In function 'kallsyms_cb':
libbpf.c:8489:13: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
8489 | res = strstr(sym_name, ".llvm.");
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20251211221756.96294-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"Perf event/metric description:
Unify all event and metric descriptions in JSON format. Now event
parsing and handling is greatly simplified by that.
From a user's point of view, perf list will provide richer information
about hardware events like the following.
$ perf list hw
List of pre-defined events (to be used in -e or -M):
legacy hardware:
branch-instructions
[Retired branch instructions [This event is an alias of branches]. Unit: cpu]
branch-misses
[Mispredicted branch instructions. Unit: cpu]
branches
[Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
bus-cycles
[Bus cycles,which can be different from total cycles. Unit: cpu]
cache-misses
[Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
cache-references
[Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
cpu-cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
cycles
[Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
instructions
[Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu]
ref-cycles
[Total cycles; not affected by CPU frequency scaling. Unit: cpu]
But most notable changes would be in the perf stat. On the right side,
the default metrics are better named and aligned. :)
$ perf stat -- perf test -w noploop
Performance counter stats for 'perf test -w noploop':
11 context-switches # 10.8 cs/sec cs_per_second
0 cpu-migrations # 0.0 migrations/sec migrations_per_second
3,612 page-faults # 3532.5 faults/sec page_faults_per_second
1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized
110,466 branch-misses # 0.0 % branch_miss_rate (88.66%)
6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%)
4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%)
27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%)
TopdownL1 # 0.3 % tma_backend_bound
# 9.3 % tma_bad_speculation (89.05%)
# 9.7 % tma_frontend_bound (77.86%)
# 80.7 % tma_retiring (88.81%)
1.025318171 seconds time elapsed
1.013248000 seconds user
0.012014000 seconds sys
Deferred unwinding support:
With the kernel support (commit c69993ecdd4d: "perf: Support deferred
user unwind"), perf can use deferred callchains for userspace stack
trace with frame pointers like below:
$ perf record --call-graph fp,defer ...
This will be transparent to users when it comes to other commands like
perf report and perf script. They will merge the deferred callchains
to the previous samples as if they were collected together.
ARM SPE updates
- Extensive enhancements to support various kinds of memory
operations including GCS, MTE allocation tags, memcpy/memset,
register access, and SIMD operations.
- Add inverted data source filter (inv_data_src_filter) support to
exclude certain data sources.
- Improve documentation.
Vendor event updates:
- Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
Lake, Lunar Lake, Granite Rapids, and others.
- Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
definitions.
- RISC-V: Added JSON support for T-HEAD C920V2.
Misc:
- Improve pointer tracking in data type profiling. It'd give better
output when the variable is using container_of() to convert type.
- Annotation support for perf c2c report in TUI. Press 'a' key to
enter annotation view from cacheline browser window. This will show
which instruction is causing the cacheline contention.
- Lots of fixes and test coverage improvements!"
* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
libperf: Use 'extern' in LIBPERF_API visibility macro
perf stat: Improve handling of termination by signal
perf tests stat: Add test for error for an offline CPU
perf stat: When no events, don't report an error if there is none
perf tests stat: Add "--null" coverage
perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
perf stat: Allow no events to open if this is a "--null" run
perf test kvm: Add some basic perf kvm test coverage
perf tests evlist: Add basic evlist test
perf tests script dlfilter: Add a dlfilter test
perf tests kallsyms: Add basic kallsyms test
perf tests timechart: Add a perf timechart test
perf tests top: Add basic perf top coverage test
perf tests buildid: Add purge and remove testing
perf tests c2c: Add a basic c2c
perf c2c: Clean up some defensive gets and make asan clean
perf jitdump: Fix missed dso__put
perf mem-events: Don't leak online CPU map
perf hist: In init, ensure mem_info is put on error paths
...
|
|
If the perf_cpu_map is empty or is just the any CPU value, then early
return. Don't process the "any" CPU when creating the bitmap.
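A sketch of the described behaviour, with assumptions stated up front: `toy_cpus_to_mask()` is a hypothetical function, the "any" CPU is modelled as -1, and the bitmap is limited to 64 CPUs for brevity. It does not reproduce cpu_map__snprint_mask().

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CPU_ANY (-1) /* assumed encoding of the "any" CPU */

/*
 * Build a bitmap from a list of CPU numbers. Returns the number of
 * bits set; the mask is left at zero when the list is empty or holds
 * only the "any" CPU.
 */
int toy_cpus_to_mask(const int *cpus, size_t nr, uint64_t *mask)
{
	int nr_set = 0;

	*mask = 0;
	/* Early return: nothing meaningful to put in the bitmap. */
	if (nr == 0 || (nr == 1 && cpus[0] == TOY_CPU_ANY))
		return 0;

	for (size_t i = 0; i < nr; i++) {
		int cpu = cpus[i];

		/* Skip "any" and anything outside the toy 64-bit mask. */
		if (cpu < 0 || cpu >= 64)
			continue;
		if (!(*mask & (UINT64_C(1) << cpu)))
			nr_set++;
		*mask |= UINT64_C(1) << cpu;
	}
	return nr_set;
}
```

Skipping the negative value matters because using it as a shift count or array index would be undefined behaviour.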
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Reference count checking caught a missing dso__put following a
machine__findnew_dso_id.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Reference count checking found the online CPU map was being gotten but
not put. Add in the missing put.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Rather than exiting the internal map_symbols directly, put the mem-info,
which does this and also lowers the reference count on the mem-info
itself; otherwise the mem-info is leaked.
Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Move nsinfo__zput from cleanup_perf_probe_events to
clear_perf_probe_event so it is always executed. Clean up
clear_perf_probe_events to not call nsinfo__zput and use the pev
variable to avoid repeated array accesses.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add missing dso__put for the dso created in maps__split_kallsyms.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In dso__process_kernel_symbol if inserting a map fails, probably
ENOMEM, then the reference count puts were missing on the dso and map.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|