kernel - Hosts the 0x221E linux distro kernel.

Age	Commit message	Author
2025-12-03	perf arm_spe: Add CPU variants supporting common data source packet	Leo Yan
	Add the following CPU variants to the list for data source decoding:
	  - Cortex-A715 [1]
	  - Cortex-A78C [2]
	  - Cortex-X1 [3]
	  - Cortex-X4 [4]
	  - Neoverse V3 [5]
	[1] https://developer.arm.com/documentation/101590/0103/Statistical-Profiling-Extension-Support/Statistical-Profiling-Extension-data-source-packet
	[2] https://developer.arm.com/documentation/102226/0002/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
	[3] https://developer.arm.com/documentation/101433/0102/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
	[4] https://developer.arm.com/documentation/102484/0003/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet
	[5] https://developer.arm.com/documentation/107734/0002/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet
	Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-03	perf auxtrace: Include sys/types.h for pid_t	Arnaldo Carvalho de Melo
	In 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") sys/types.h was removed, which broke the build in all Alpine Linux releases, as musl libc has pid_t defined via sys/types.h, add it back. Fixes: 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Use machine->root_dir to find /proc/kallsyms	Namhyung Kim
	This is for test functions to find the kallsyms correctly. It can find the machine from the kernel maps and use its root_dir. This is helpful to setup fake /proc directory for testing. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Fallback to initial kernel map properly	Namhyung Kim
	maps__split_kallsyms() assumes a new kernel map when it finds a symbol without a module after any module and the initial kernel map already has some symbols. It does so because it expects modules to be outside the kernel map, so modules should not have symbols in the kernel map.
	For example, the following memory map shows symbols and maps. Any symbols in the module 1 area will go to module 1. The main kernel map starts at 0xffffffffbc200000. But if any symbol has a module between the symbols in that area, the next symbols after 0xffffffffbd008000 will generate new kernel maps like [kernel].1.
	    kernel address          |                     |
	                            |                     |
	        0xffffffffc0000000  |---------------------|
	                            |      (symbols)      |
	                            |         ...         |   <--- [kernel].N
	        0xffffffffbc400000  |---------------------|
	                            |      (symbols)      |
	                            |      module 2       |   <--- bad?
	        0xffffffffbc380000  |---------------------|
	                            |         ...         |
	                            |      (symbols)      |
	                            |  [kernel.kallsyms]  |   <--- initial map
	        0xffffffffbc200000  |---------------------|
	                            |                     |
	                            |                     |
	        0xffffffffabcde000  |---------------------|
	                            |      (symbols)      |
	                            |      module 1       |
	        0xffffffffabcd0000  |---------------------|
	This is very fragile when the module has a symbol that falls into the main kernel map for some reason. My system has a livepatch module with such symbols, and it created a lot of new kernel maps after those symbols. But the symbol may have broken addresses and the later symbols can still be found in the initial kernel map. Let's check the symbol address in the initial map and use it if found.
	Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Fix split kallsyms DSO counting	Namhyung Kim
	It's counted twice as it's increased after calling maps__insert(). I guess we want to increase it only after it's added properly. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Mark split kallsyms DSOs as loaded	Namhyung Kim
	The maps__split_kallsyms() will split symbols to module DSOs if it comes from a module. It also handled some unusual kernel symbols after modules by creating new kernel maps like "[kernel].0". But they are pseudo DSOs to have those unexpected symbols. They should not be considered as unloaded kernel DSOs. Otherwise the dso__load() for them will end up calling dso__load_kallsyms() and then maps__split_kallsyms() again and again. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Flush remaining samples w/o deferred callchains	Namhyung Kim
	It's possible that some kernel samples don't have matching deferred callchain records when the profiling session was ended before the threads came back to userspace. Let's flush the samples before finishing the session. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Merge deferred user callchains	Namhyung Kim
	Save samples with deferred callchains in a separate list and deliver them after merging the user callchains. If users don't want to merge, they can set tool->merge_deferred_callchains to false to prevent the behavior.
	With the previous result, perf script now shows the merged callchains.
	  $ perf script
	  ...
	  pwd    2312   121.163435:     249113 cpu/cycles/P:
	          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
	          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
	          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
	          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
	          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
	          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
	          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
	              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	  ...
	The old output can be obtained using the --no-merge-callchain option. Also, perf report can get the user callchain entry at the end.
	  $ perf report --no-children --stdio -q -S __build_id_parse.isra.0
	  # symbol: __build_id_parse.isra.0
	       8.40%  pwd      [kernel.kallsyms]
	              |
	              ---__build_id_parse.isra.0
	                 perf_event_mmap
	                 mprotect_fixup
	                 do_mprotect_pkey
	                 __x64_sys_mprotect
	                 do_syscall_64
	                 entry_SYSCALL_64_after_hwframe
	                 mprotect
	                 _dl_sysdep_start
	                 _dl_start_user
	Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
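	The merge step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the perf implementation: `struct chain`, `MAX_FRAMES` and `merge_deferred()` are hypothetical names, and matching a sample to its deferred record by an equal cookie value is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 32	/* hypothetical fixed capacity for the sketch */

/* Hypothetical record: a callchain tagged with the cookie that links a
 * kernel sample to its later user-space (deferred) callchain. */
struct chain {
	uint64_t cookie;
	size_t nr;
	uint64_t ips[MAX_FRAMES];
};

/* Append the deferred user frames to the kernel frames of a sample.
 * Returns 0 on success, -1 if the cookies differ or the result would
 * not fit; the sample is left untouched on failure. */
static int merge_deferred(struct chain *sample, const struct chain *deferred)
{
	assert(sample != NULL && deferred != NULL);

	if (sample->cookie != deferred->cookie)
		return -1;
	if (sample->nr + deferred->nr > MAX_FRAMES)
		return -1;

	memcpy(&sample->ips[sample->nr], deferred->ips,
	       deferred->nr * sizeof(deferred->ips[0]));
	sample->nr += deferred->nr;
	return 0;
}
```

	In this sketch a sample whose deferred record never arrives simply keeps its kernel frames, which corresponds to the case where samples are delivered without a merged user callchain.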
2025-12-02	perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED	Namhyung Kim
	Handle the deferred callchains in the script output.
	  $ perf script
	  ...
	  pwd    2312   121.163435:     249113 cpu/cycles/P:
	          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
	          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
	          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
	          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
	          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
	          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
	          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
	                 b00000006 (cookie) ([unknown])
	  pwd    2312   121.163447: DEFERRED CALLCHAIN [cookie: b00000006]
	              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
	Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf record: Add --call-graph fp,defer option for deferred callchains	Namhyung Kim
	Add a new callchain record mode option for deferred callchains. For now it only works with FP (frame-pointer) mode. And add the missing feature detection logic to clear the flag on old kernels.
	  $ perf record --call-graph fp,defer -vv true
	  ...
	  ------------------------------------------------------------
	  perf_event_attr:
	    type                             0 (PERF_TYPE_HARDWARE)
	    size                             136
	    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
	    { sample_period, sample_freq }   4000
	    sample_type                      IP|TID|TIME|CALLCHAIN|PERIOD
	    read_format                      ID|LOST
	    disabled                         1
	    inherit                          1
	    mmap                             1
	    comm                             1
	    freq                             1
	    enable_on_exec                   1
	    task                             1
	    sample_id_all                    1
	    mmap2                            1
	    comm_exec                        1
	    ksymbol                          1
	    bpf_event                        1
	    defer_callchain                  1
	    defer_output                     1
	  ------------------------------------------------------------
	  sys_perf_event_open: pid 162755  cpu 0  group_fd -1  flags 0x8
	  sys_perf_event_open failed, error -22
	  switching off deferred callchain support
	Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Minimal DEFERRED_CALLCHAIN support	Namhyung Kim
	Add a new event type for deferred callchains and a new callback for the struct perf_tool. For now it doesn't actually handle the deferred callchains; it just marks the sample if it has PERF_CONTEXT_USER_DEFERRED in the callchain array. At least, perf report can dump the raw data with this change. Actually this requires the next commit to enable attr.defer_callchain, but if you already have a data file, it'll show the following result.
	  $ perf report -D
	  ...
	  0x2158@perf.data [0x40]: event: 22
	  .
	  . ... raw event: size 64 bytes
	  .  0000:  16 00 00 00 02 00 40 00 06 00 00 00 0b 00 00 00  ......@.........
	  .  0010:  03 00 00 00 00 00 00 00 a7 7f 33 fe 18 7f 00 00  ..........3.....
	  .  0020:  0f 0e 33 fe 18 7f 00 00 48 14 33 fe 18 7f 00 00  ..3.....H.3.....
	  .  0030:  08 09 00 00 08 09 00 00 e6 7a e7 35 1c 00 00 00  .........z.5....
	  121163447014 0x2158 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 2312/2312: 0xb00000006
	  ... FP chain: nr:3
	  .....  0: 00007f18fe337fa7
	  .....  1: 00007f18fe330e0f
	  .....  2: 00007f18fe331448
	  : unhandled!
	Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf python: Correct copying of metric_leader in an evsel	Ian Rogers
	Ensure the metric_leader is copied and set up correctly. In compute_metric determine the correct metric_leader event to match the requested CPU. Fixes the handling of metrics particularly on hybrid machines. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf jitdump: Add sym/str-tables to build-ID generation	Namhyung Kim
	It was reported that python backtrace with JIT dump was broken after the change to the built-in SHA-1 implementation. It seems python generates the same JIT code for each function. They will become separate DSOs but the contents are the same. The only difference is in the symbol name. But this caused a problem: every JIT'ed DSO will have the same build-ID, which makes perf confused. And it resulted in no python symbols (from JIT) in the output. Looking back at the original code before the conversion, it used the load_addr as well as the code section to distinguish each DSO. But it'd be better to use contents of symtab and strtab instead as it aligns with some linker behaviors. This patch adds a buffer to save all the contents in a single place for SHA-1 calculation. Probably we need to add sha1_update() or similar to update the existing hash value with different contents and use it here. But it's out of scope for this change and I'd like something that can be backported to the stable trees easily. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Pablo Galindo <pablogsal@gmail.com> Cc: Fangrui Song <maskray@sourceware.org> Link: https://github.com/python/cpython/issues/139544 Fixes: e3f612c1d8f3945b ("perf genelf: Remove libcrypto dependency and use built-in sha1()") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02	perf tools: Remove a trailing newline in the event terms	Namhyung Kim
	So that it can show the correct encoding info in the JSON output.
	  $ perf list -j hw
	  [
	  {
	          "Unit": "cpu",
	          "Topic": "legacy hardware",
	          "EventName": "branch-instructions",
	          "EventType": "Kernel PMU event",
	          "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
	          "Encoding": "cpu/event=0xc4/"
	  },
	  ...
	Reviewed-by: Ian Rogers <irogers@google.com> Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-26	perf tools: Don't read build-ids from non-regular files	James Clark
	Simplify the build ID reading code by removing the non-blocking option. Having to pass the correct option to this function was fragile and a mistake would result in a hang, see the linked fix. Furthermore, compressed files are always opened blocking anyway, ignoring the non-blocking option. We also don't expect to read build IDs from non-regular files. The only hits to this function that are non-regular are devices that won't be elf files with build IDs, for example "/dev/dri/renderD129". Now instead of opening these as non-blocking and failing to read, we skip them. Even if something like a pipe or character device did have a build ID, I don't think it would have worked because you need to call read() in a loop, check for -EAGAIN and handle timeouts to make non-blocking reads work. Link: https://lore.kernel.org/linux-perf-users/20251022-james-perf-fix-dso-block-v1-1-c4faab150546@linaro.org/ Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
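	The "skip non-regular files" idea can be illustrated with a small standalone check. This is a sketch only: `is_regular_file()` is a hypothetical helper, not the function changed by the commit, and it assumes a POSIX system where stat() and S_ISREG() are available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: report whether path names a regular file, so that
 * devices, pipes and sockets can be skipped before any attempt to open
 * and read them. */
static bool is_regular_file(const char *path)
{
	struct stat st;

	assert(path != NULL);

	if (stat(path, &st) != 0)
		return false;	/* missing or inaccessible: nothing to read */

	return S_ISREG(st.st_mode);
}
```

	A caller would test the path first and only then open it with ordinary blocking I/O, which is what makes a separate non-blocking option unnecessary in this sketch.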
2025-11-26	perf pmu: fix duplicate conditional statement	Anubhav Shelat
	Remove duplicate check for PERF_PMU_TYPE_DRM_END in perf_pmu__kind. Fixes: f0feb21e0a10 ("perf pmu: Add PMU kind to simplify differentiating") Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Closes: https://lore.kernel.org/linux-perf-users/CA+G8Dh+wLx+FvjjoEkypqvXhbzWEQVpykovzrsHi2_eQjHkzQA@mail.gmail.com/ Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-24	perf tools: Add support for perf_event_attr::config4	James Clark
	perf_event_attr has gained a new field, config4, so add support for it extending the existing configN support. Reviewed-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-20	perf: replace strcpy() with strncpy() in util/jitdump.c	Hrishikesh Suresh
	Usage of strcpy() can lead to buffer overflows. Therefore, it has been replaced with strncpy(). The output file path is provided as a parameter and might be restricted by command-line by default. But this defensive patch will prevent any potential overflow, making the code more robust against future changes in input handling. Testing: - ran perf test from tools/perf and did not observe any regression with the earlier code Signed-off-by: Hrishikesh Suresh <hrishikesh123s@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
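	One caveat with this kind of replacement is that strncpy() does not write a terminating NUL when the source is at least as long as the limit. A minimal sketch of a bounded copy that always terminates is shown here; `copy_path()` is a hypothetical helper and is not taken from util/jitdump.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes) and
 * always leave dst NUL-terminated. Returns 0 on success, -1 if src had
 * to be truncated. */
static int copy_path(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';	/* strncpy() does not terminate on truncation */

	return strlen(src) < dst_size ? 0 : -1;
}
```

	Returning an error on truncation lets the caller decide whether a shortened path is acceptable instead of silently using it.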
2025-11-19	perf evsel: Skip store_evsel_ids for non-perf-event PMUs	Ian Rogers
	The IDs are associated with perf events and not applicable to non-perf event PMUs. The failure to generate the ids was causing perf stat record to fail.
	```
	$ perf stat record -a sleep 1

	 Performance counter stats for 'system wide':

	            47,941      context-switches                  #  nan cs/sec  cs_per_second
	              0.00 msec cpu-clock                         #  0.0 CPUs  CPUs_utilized
	             3,261      cpu-migrations                    #  nan migrations/sec  migrations_per_second
	               516      page-faults                       #  nan faults/sec  page_faults_per_second
	         7,525,483      cpu_core/branch-misses/           #  2.3 %  branch_miss_rate
	       322,069,004      cpu_core/branches/                #  nan M/sec  branch_frequency
	     1,895,684,291      cpu_core/cpu-cycles/              #  nan GHz  cycles_frequency
	     2,789,777,426      cpu_core/instructions/            #  1.5 instructions  insn_per_cycle
	         7,074,765      cpu_atom/branch-misses/           #  3.2 %  branch_miss_rate  (49.89%)
	       224,225,412      cpu_atom/branches/                #  nan M/sec  branch_frequency  (50.29%)
	     2,061,679,981      cpu_atom/cpu-cycles/              #  nan GHz  cycles_frequency  (50.33%)
	     2,011,242,533      cpu_atom/instructions/            #  1.0 instructions  insn_per_cycle  (50.33%)
	                        TopdownL1 (cpu_core)              #  9.0 %  tma_bad_speculation
	                                                          # 28.3 %  tma_frontend_bound
	                                                          # 35.2 %  tma_backend_bound
	                                                          # 27.5 %  tma_retiring
	                        TopdownL1 (cpu_atom)              # 36.8 %  tma_backend_bound  (59.65%)
	                                                          # 22.8 %  tma_frontend_bound  (59.60%)
	                                                          # 11.6 %  tma_bad_speculation
	                                                          # 28.8 %  tma_retiring  (59.59%)

	       1.006777519 seconds time elapsed

	$ perf stat report

	 Performance counter stats for 'perf':

	     1,013,376,154      duration_time
	     <not counted>      duration_time
	     <not counted>      duration_time
	     <not counted>      duration_time
	     <not counted>      duration_time
	     <not counted>      duration_time
	            47,941      context-switches
	              0.00 msec cpu-clock
	             3,261      cpu-migrations
	               516      page-faults
	         7,525,483      cpu_core/branch-misses/
	       322,069,814      cpu_core/branches/
	       322,069,004      cpu_core/branches/
	     1,895,684,291      cpu_core/cpu-cycles/
	     1,895,679,209      cpu_core/cpu-cycles/
	     2,789,777,426      cpu_core/instructions/
	     <not counted>      cpu_core/cpu-cycles/
	     <not counted>      cpu_core/stalled-cycles-frontend/
	     <not counted>      cpu_core/cpu-cycles/
	     <not counted>      cpu_core/stalled-cycles-backend/
	     <not counted>      cpu_core/stalled-cycles-backend/
	     <not counted>      cpu_core/instructions/
	     <not counted>      cpu_core/stalled-cycles-frontend/
	         7,074,765      cpu_atom/branch-misses/           (49.89%)
	       221,679,088      cpu_atom/branches/                (49.89%)
	       224,225,412      cpu_atom/branches/                (50.29%)
	     2,061,679,981      cpu_atom/cpu-cycles/              (50.33%)
	     2,016,259,567      cpu_atom/cpu-cycles/              (50.33%)
	     2,011,242,533      cpu_atom/instructions/            (50.33%)
	     <not counted>      cpu_atom/cpu-cycles/
	     <not counted>      cpu_atom/stalled-cycles-frontend/
	     <not counted>      cpu_atom/cpu-cycles/
	     <not counted>      cpu_atom/stalled-cycles-backend/
	     <not counted>      cpu_atom/stalled-cycles-backend/
	     <not counted>      cpu_atom/instructions/
	     <not counted>      cpu_atom/stalled-cycles-frontend/
	        17,145,113      cpu_core/INT_MISC.UOP_DROPPING/
	    10,594,226,100      cpu_core/TOPDOWN.SLOTS/
	     2,919,021,401      cpu_core/topdown-retiring/
	       943,101,838      cpu_core/topdown-bad-spec/
	     3,031,152,533      cpu_core/topdown-fe-bound/
	     3,739,756,791      cpu_core/topdown-be-bound/
	     1,909,501,648      cpu_atom/CPU_CLK_UNHALTED.CORE/   (60.04%)
	     3,516,608,359      cpu_atom/TOPDOWN_BE_BOUND.ALL/    (59.65%)
	     2,179,403,876      cpu_atom/TOPDOWN_FE_BOUND.ALL/    (59.60%)
	     2,745,732,458      cpu_atom/TOPDOWN_RETIRING.ALL/    (59.59%)

	       1.006777519 seconds time elapsed

	Some events weren't counted. Try disabling the NMI watchdog:
	        echo 0 > /proc/sys/kernel/nmi_watchdog
	        perf stat ...
	        echo 1 > /proc/sys/kernel/nmi_watchdog
	```
	Reported-by: James Clark <james.clark@linaro.org> Closes: https://lore.kernel.org/lkml/ca0f0cd3-7335-48f9-8737-2f70a75b019a@linaro.org/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-19	perf pmu: Add PMU kind to simplify differentiating	Ian Rogers
	Rather than perf_pmu__is_xxx calls, add a notion of kind so that a single call can be used. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-19	perf header: Switch "cpu" for find_core_pmu in caps feature writing	Ian Rogers
	Writing currently fails on non-x86 and hybrid CPUs. Switch to the more regular find_core_pmu that is normally used in this case. Tested on hybrid alderlake system. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-19	perf maps: Avoid RC_CHK use after free	Ian Rogers
	The case of __maps__fixup_overlap_and_insert where the "new" maps covers existing mappings can create a use-after-free with reference count checking enabled. The issue is that "pos" holds a map pointer from maps_by_address that is put from maps_by_address but then used to look for a map in maps_by_name (the compared map is now a use-after-free). The issue stems from using maps__remove which redoes some of the searches already done by __maps__fixup_overlap_and_insert, so optimize the code (by avoiding repeated searches) and avoid the use-after-free by inlining the appropriate removal code. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202511141407.f9edcfa6-lkp@intel.com Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Synthesize memory samples for SIMD operations	Leo Yan
	Synthesize memory samples for SIMD operations (including Advanced SIMD, SVE, and SME). To provide complete information, also generate data source entries for SIMD operations. Since memory operations are not limited to load and store, set PERF_MEM_OP_STORE if the operation does not fall into these cases. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Expose SIMD information in other operations	Leo Yan
	The other operations contain SME data processing, ASE (Advanced SIMD) and floating-point operations. Expose this info in the records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report GCS in record	Leo Yan
	Report GCS related info in records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report memset and memcpy in records	Leo Yan
	Expose memset and memcpy related info in records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report associated info for SVE / SME operations	Leo Yan
	SVE / SME operations can be predicated, or can be gather loads / scatter stores. Save the relevant info into the record. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report extended memory operations in records	Leo Yan
	Extended memory operations include atomic (AT), acquire/release (AR), and exclusive (EXCL) operations. Save the relevant information in the records. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report MTE allocation tag in record	Leo Yan
	Save MTE tag info in memory record. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Report register access in record	Leo Yan
	Record register access info for load / store operations. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Introduce data processing macro for SVE operations	Leo Yan
	Introduce the ARM_SPE_OP_DP (data processing) macro as associated information for SVE operations. For SVE register access, only ARM_SPE_OP_SVE is set; for SVE data processing, both ARM_SPE_OP_SVE and ARM_SPE_OP_DP are set together. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
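	The flag combination described above can be sketched as follows. The bit positions are placeholders chosen for illustration and are not the values used in the perf sources; the two helper functions are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions: the real definitions live in the perf
 * sources and may differ. */
#define ARM_SPE_OP_SVE	(1u << 8)
#define ARM_SPE_OP_DP	(1u << 9)

static_assert((ARM_SPE_OP_SVE & ARM_SPE_OP_DP) == 0,
	      "the two flags must occupy distinct bits");

/* Hypothetical helper: SVE data processing has both flags set. */
static bool is_sve_data_processing(uint32_t op)
{
	const uint32_t both = ARM_SPE_OP_SVE | ARM_SPE_OP_DP;

	return (op & both) == both;
}

/* Hypothetical helper: SVE register access has only the SVE flag set. */
static bool is_sve_register_access(uint32_t op)
{
	return (op & ARM_SPE_OP_SVE) && !(op & ARM_SPE_OP_DP);
}
```

	Keeping data processing as an additional flag, rather than a separate operation type, lets existing checks for ARM_SPE_OP_SVE keep matching both cases.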
2025-11-18	perf arm_spe: Consolidate operation types	Leo Yan
	Consolidate operation types as follows:
	  (a) Extract the second-level types into separate enums.
	  (b) The second-level types for memory and SIMD operations are classified by modules. E.g., an operation may relate to general register, SIMD/FP, SVE, etc.
	  (c) The associated information tells details. E.g., an operation is load or store, whether it is atomic operation, etc.
	Start the enum items for the second-level types from 8 to accommodate more entries within a 32-bit integer.
	Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Remove unused operation types	Leo Yan
	Remove unused SVE operation types. These operations will be reintroduced in subsequent refactoring, but with a different format. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Decode SME data processing packet	Leo Yan
	For SME data processing, decode its Effective vector length or Tile Size (ETS), and print whether it is a floating-point operation.
	After:
	  .  00000000:  49 00                                           SME-OTHER ETS 1024 FP
	  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
	  .  0000000b:  9a 00 00                                        LAT 0 XLAT
	  .  0000000e:  43 00                                           DATA-SOURCE 0
	Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Decode ASE and FP fields in other operation	Leo Yan
	Add a check for other operation, which prevents incorrect classification. Parse the ASE and FP fields.
	After:
	  .  0000002f:  48 06                                           OTHER ASE FP INSN-OTHER
	  .  00000031:  b2 08 80 48 01 08 00 ff ff                      VA 0xffff000801488008
	  .  0000003a:  9a 00 00                                        LAT 0 XLAT
	  .  0000003d:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
	Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro	Leo Yan
	Rename the macro to SPE_OP_PKT_OTHER_SUBCLASS_SVE to unify naming. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Decode GCS operation	Leo Yan
	Decode a load or store from a GCS operation and the associated "common" field.
	After:
	  .  00000000:  49 44                                           LD GCS COMM
	  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
	  .  0000000b:  9a 00 00                                        LAT 0 XLAT
	  .  0000000e:  43 00                                           DATA-SOURCE 0
	Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Unify operation naming	Leo Yan
	Rename the extended subclass and the SVE/SME register access subclass so that the naming is consistent across all subclasses. Add a log string "SVE-SME-REG" for SVE/SME register access, which is easier to parse. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-18	perf arm_spe: Fix memset subclass in operation	Leo Yan
	The operation subclass is extracted from bits [7..1] of the payload. Since bit [0] is not parsed, there is no chance to match the memset type (0x25). As a result, the memset payload is never parsed successfully. Instead of extracting a unified bit field, change to extract the specific bits for each operation subclass. Fixes: 34fb60400e32 ("perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
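	The reason the old comparison could never succeed can be shown with a few lines of C. This is an illustrative sketch: the function names are hypothetical, and the full-byte mask used in the second function is an assumption about what a per-subclass extraction for memset might look like, not a quote from the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUBCLASS_MEMSET 0x25	/* value quoted in the commit message */

static_assert((SUBCLASS_MEMSET & 0x01) != 0,
	      "the memset encoding has bit 0 set");

/* Old approach (illustrative): a shared mask of bits [7:1] clears bit 0,
 * so the masked value is always even and can never equal 0x25. */
static bool is_memset_shared_mask(uint8_t payload)
{
	return (payload & 0xfe) == SUBCLASS_MEMSET;
}

/* Per-subclass approach (illustrative, mask is an assumption): keep
 * every bit the memset encoding depends on, including bit 0. */
static bool is_memset_full_mask(uint8_t payload)
{
	return (payload & 0xff) == SUBCLASS_MEMSET;
}
```

	With the shared mask, no payload value from 0x00 to 0xff satisfies the comparison, which matches the observation that the memset payload was never parsed successfully.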
2025-11-17	perf tool_pmu: More accurately set the cpus for tool events	Ian Rogers
	The user and system time events can record on different CPUs, but for all other events a single CPU map of just CPU 0 makes sense. In parse-events detect a tool PMU and then pass the perf_event_attr so that the tool_pmu can return CPUs specific for the event. This avoids a CPU map of all online CPUs being used for events like duration_time. Avoiding this avoids the evlist CPUs containing CPUs for which duration_time just gives 0. Minimizing the evlist CPUs can remove unnecessary sched_setaffinity syscalls that delay metric calculations. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf stat: Reduce scope of walltime_nsecs_stats	Ian Rogers
	walltime_nsecs_stats is no longer used for counter values, so move it into stat_config, where it controls certain things like noise measurement. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf stat: Reduce scope of ru_stats	Ian Rogers
	The ru_stats are used to capture user and system time stats when a process exits. These are then applied to user and system time tool events if their reads fail due to the process terminating. Reduce the scope now the metric code no longer reads these values. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf stat-shadow: Read tool events directly	Ian Rogers
	When reading time values for metrics don't use the globals updated in builtin-stat, just read the events as regular events. The only exception is for time events where nanoseconds need converting to seconds as metrics assume time metrics are in seconds. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf tool_pmu: Use old_count when computing count values for time events	Ian Rogers
	When running in interval mode every third count of a time event isn't showing properly:
	```
	$ perf stat -e duration_time -a -I 1000
	     1.001082862      1,002,290,425      duration_time
	     2.004264262      1,003,183,516      duration_time
	     3.007381401      <not counted>      duration_time
	     4.011160141      1,003,705,631      duration_time
	     5.014515385      1,003,290,110      duration_time
	     6.018539680      <not counted>      duration_time
	     7.022065321      1,003,591,720      duration_time
	```
	The regression came in with a different fix, found through bisection, commit 68cb1567439f ("perf tool_pmu: Fix aggregation on duration_time"). The issue is caused by the enabled and running time of the event matching the old_count's and creating a delta of 0, which is indicative of an error.
	Fixes: 68cb1567439f ("perf tool_pmu: Fix aggregation on duration_time") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf pmu: perf_cpu_map__new_int to avoid parsing a string	Ian Rogers
	Prefer perf_cpu_map__new_int(0) to perf_cpu_map__new("0") as it avoids string parsing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-17	perf stat: Display metric-only for 0 counters	Ian Rogers
	0 counters may occur in hypervisor settings but metric-only output is always expected. This resolves an issue in the "perf stat STD output linter" test. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-16	perf parse-events: Add debug logging to perf_event	Ian Rogers
	If verbose is enabled and parse_event is called, typically by tests, log failures. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-16	perf sample: Fix the wrong format specifier	liujing
	In the file tools/perf/util/cs-etm.c, queue_nr is of type unsigned int and should be printed with %u. Signed-off-by: liujing <liujing@cmss.chinamobile.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-13	perf auxtrace: Remove errno.h from auxtrace.h and fix transitive dependencies	Ian Rogers
	errno.h isn't used in auxtrace.h so remove it and fix build failures caused by transitive dependencies through auxtrace.h on errno.h. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-13	perf build: Remove NO_AUXTRACE build option	Ian Rogers
	The NO_AUXTRACE build option was used when the __get_cpuid feature test failed or if it was provided on the command line. The option no longer avoids a dependency on a library and so having the option is just adding complexity to the code base. Remove the option CONFIG_AUXTRACE from Build files and HAVE_AUXTRACE_SUPPORT by assuming it is always defined. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>