summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2026-01-13perf callchain: Fix srcline printing with inlinesIan Rogers
sample__fprintf_callchain() was using map__fprintf_srcline() which won't report inline line numbers. Fix by using the srcline from the callchain and falling back to the map variant. Fixes: 25da4fab5f66e659 ("perf evsel: Move fprintf methods to separate source file") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13perf srcline: Add configuration support for the addr2line styleIan Rogers
Allow the addr2line style to be specified on the `perf report` command line or in the .perfconfig file. Committer testing: The methods: # perf probe -x ~/bin/perf -F *__addr2line cmd__addr2line libbfd__addr2line libdw__addr2line llvm__addr2line # So if we configure one of them, say 'addr2line': # perf config addr2line.style=addr2line # perf config addr2line.style addr2line.style=addr2line # And have probes on all of them: # perf probe -x ~/bin/perf *__addr2line Added new events: probe_perf:cmd__addr2line (on *__addr2line in /home/acme/bin/perf) probe_perf:llvm__addr2line (on *__addr2line in /home/acme/bin/perf) probe_perf:libbfd__addr2line (on *__addr2line in /home/acme/bin/perf) probe_perf:libdw__addr2line (on *__addr2line in /home/acme/bin/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:libdw__addr2line -aR sleep 1 # Only the selected method should be used: # perf stat -e probe_perf:*_addr2line perf report -f --dso perf --stdio -s srcfile,srcline # Total Lost Samples: 0 # # Samples: 4K of event 'cpu/cycles/Pu' # Event count (approx.): 5535180842 # # Overhead Source File Source:Line # ........ ............ ............... # 99.04% inlineloop.c inlineloop.c:21 0.46% inlineloop.c inlineloop.c:20 # # (Tip: For hierarchical output, try: perf report --hierarchy) # Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline': 44 probe_perf:cmd__addr2line 0 probe_perf:llvm__addr2line 0 probe_perf:libbfd__addr2line 0 probe_perf:libdw__addr2line 0.035915611 seconds time elapsed 0.028008000 seconds user 0.009051000 seconds sys # I checked and that is the case for the other methods. Also when using: # perf config addr2line.style=libdw,llvm Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline': 0 probe_perf:cmd__addr2line 23 probe_perf:llvm__addr2line 0 probe_perf:libbfd__addr2line 44 probe_perf:libdw__addr2line Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull x86 kvm fixes from Paolo Bonzini: - Avoid freeing stack-allocated node in kvm_async_pf_queue_task - Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 selftests: kvm: try getting XFD and XSAVE state out of sync selftests: kvm: replace numbered sync points with actions x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1 x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task
2026-01-13selftests/bpf: Add tests for s>>=31 and s>>=63Alexei Starovoitov
Add tests for special arithmetic shift right. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13selftests/bpf: Fix verifier_arena_globals1 failure with 64K pageYonghong Song
With 64K page on arm64, verifier_arena_globals1 failed like below: ... libbpf: map 'arena': failed to create: -E2BIG ... #509/1 verifier_arena_globals1/check_reserve1:FAIL ... For 64K page, if the number of arena pages is (1UL << 20), the total memory will exceed 4G and this will cause map creation failure. Adjusting ARENA_PAGES based on the actual page size fixed the problem. Cc: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13selftests/bpf: Fix sk_bypass_prot_mem failure with 64K pageYonghong Song
The current selftest sk_bypass_prot_mem only supports 4K page. When running with 64K page on arm64, the following failure happens: ... check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32 ... #385/1 sk_bypass_prot_mem/TCP :FAIL ... check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32 ... #385/2 sk_bypass_prot_mem/UDP :FAIL ... Adding support to 64K page as well fixed the failure. Cc: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K pageYonghong Song
On arm64 with 64K page , I observed the following test failure: ... subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read: actual 4696 <= expected 65536 #97/3 dmabuf_iter/lots_of_buffers:FAIL With 4K page on x86, the total_bytes_read is 4593. With 64K page on arm64, the total_byte_read is 4696. In progs/dmabuf_iter.c, for each iteration, the output is BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter); The only difference between 4K and 64K page is 'size' in the above BPF_SEQ_PRINTF. The 4K page will output '4096' and the 64K page will output '65536'. So the total_bytes_read with 64K page is slighter greater than 4K page. Adjusting the total_bytes_read from 65536 to 4096 fixed the issue. Cc: T.J. Mercier <tjmercier@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13cxl: Check for invalid addresses returned from translation functions on errorsRobert Richter
Translation functions may return an invalid address in case of errors. If the address is not checked the further use of the invalid value will cause an address corruption. Consistently check for a valid address returned by translation functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX or ULLONG_MAX is used to indicate an address error. Propagating an invalid address from a failed translation may cause userspace to think it has received a valid SPA, when in fact it is wrong. The CXL userspace API, using trace events, expects ULLONG_MAX to indicate a translation failure. If ULLONG_MAX is not returned immediately, subsequent calculations can transform that bad address into a different value (!ULLONG_MAX), and an invalid SPA may be returned to userspace. This can lead to incorrect diagnostics and erroneous corrective actions. [ dj: Added user impact statement from Alison. ] [ dj: Fixed checkpatch tab alignment issue. ] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Fixes: b78b9e7b7979 ("cxl/region: Refactor address translation funcs for testing") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-13x86/pvlocks: Move paravirt spinlock functions into own headerJuergen Gross
Instead of having the pv spinlock function definitions in paravirt.h, move them into the new header paravirt-spinlock.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260105110520.21356-22-jgross@suse.com
2026-01-13selftests: vDSO: vdso_test_abi: Add test for clock_getres_time64()Thomas Weißschuh
Some architectures will start to implement this function. Make sure it works correctly. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-4-97ea7a06a543@linutronix.de
2026-01-13selftests: vDSO: vdso_test_abi: Use UAPI system call numbersThomas Weißschuh
SYS_clock_getres might have been redirected by libc to some other system call than the actual clock_getres. For testing it is required to use exactly this system call. Use the system call number exported by the UAPI headers which is always correct. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-3-97ea7a06a543@linutronix.de
2026-01-13selftests: vDSO: vdso_config: Add configurations for clock_getres_time64()Thomas Weißschuh
Some architectures will start to implement this function. Make sure that tests can be written for it. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-2-97ea7a06a543@linutronix.de
2026-01-13objtool: Allow multiple pv_ops arraysJuergen Gross
Having a single large pv_ops array has the main disadvantage of needing all prototypes of the single array members in one header file. This is adding up to the need to include lots of otherwise unrelated headers. In order to allow multiple smaller pv_ops arrays dedicated to one area of the kernel each, allow multiple arrays in objtool. For better performance limit the possible names of the arrays to start with "pv_ops". Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260105110520.21356-19-jgross@suse.com
2026-01-13selftests/tc-testing: add selftests for cake_mq qdiscJonas Köppeler
Test 684b: Create CAKE_MQ with default setting (4 queues) Test 7ee8: Create CAKE_MQ with bandwidth limit (4 queues) Test 1f87: Create CAKE_MQ with rtt time (4 queues) Test e9cf: Create CAKE_MQ with besteffort flag (4 queues) Test 7c05: Create CAKE_MQ with diffserv8 flag (4 queues) Test 5a77: Create CAKE_MQ with diffserv4 flag (4 queues) Test 8f7a: Create CAKE_MQ with flowblind flag (4 queues) Test 7ef7: Create CAKE_MQ with dsthost and nat flag (4 queues) Test 2e4d: Create CAKE_MQ with wash flag (4 queues) Test b3e6: Create CAKE_MQ with flowblind and no-split-gso flag (4 queues) Test 62cd: Create CAKE_MQ with dual-srchost and ack-filter flag (4 queues) Test 0df3: Create CAKE_MQ with dual-dsthost and ack-filter-aggressive flag (4 queues) Test 9a75: Create CAKE_MQ with memlimit and ptm flag (4 queues) Test cdef: Create CAKE_MQ with fwmark and atm flag (4 queues) Test 93dd: Create CAKE_MQ with overhead 0 and mpu (4 queues) Test 1475: Create CAKE_MQ with conservative and ingress flag (4 queues) Test 7bf1: Delete CAKE_MQ with conservative and ingress flag (4 queues) Test ee55: Replace CAKE_MQ with mpu (4 queues) Test 6df9: Change CAKE_MQ with mpu (4 queues) Test 67e2: Show CAKE_MQ class (4 queues) Test 2de4: Change bandwidth of CAKE_MQ (4 queues) Test 5f62: Fail to create CAKE_MQ with autorate-ingress flag (4 queues) Test 038e: Fail to change setting of sub-qdisc under CAKE_MQ Test 7bdc: Fail to replace sub-qdisc under CAKE_MQ Test 18e0: Fail to install CAKE_MQ on single queue device Reviewed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-6-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-13objtool: fix build failure due to missing libopcodes checkSasha Levin
Commit 59953303827e ("objtool: Disassemble code with libopcodes instead of running objdump") added support for using libopcodes for disassembly. However, the feature detection checks for libbfd availability but then unconditionally links against libopcodes: ifeq ($(feature-libbfd),1) OBJTOOL_LDFLAGS += -lopcodes endif This causes build failures in environments where libbfd is installed but libopcodes is not, since the test-libbfd.c feature test only links against -lbfd and -ldl, not -lopcodes: /usr/bin/ld: cannot find -lopcodes: No such file or directory collect2: error: ld returned 1 exit status make[4]: *** [Makefile:109: objtool] Error 1 Additionally, the shared feature framework uses $(CC) which is the cross-compiler in cross-compilation builds. Since objtool is a host tool that links with $(HOSTCC) against host libraries, the feature detection can falsely report libopcodes as available when the cross-compiler's sysroot has it but the host system doesn't. Fix this by replacing the feature framework check with a direct inline test that uses $(HOSTCC) to compile and link a test program against libopcodes, similar to how xxhash availability is detected. Fixes: 59953303827e ("objtool: Disassemble code with libopcodes instead of running objdump") Assisted-by: claude-opus-4-5-20251101 Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251223120357.2492008-1-sashal@kernel.org
2026-01-13objtool: fix compilation failure with the x32 toolchainMikulas Patocka
When using the x32 toolchain, compilation fails because the printf specifier "%lx" (long), doesn't match the type of the "checksum" variable (long long). Fix this by changing the printf specifier to "%llx" and casting "checksum" to unsigned long long. Fixes: a3493b33384a ("objtool/klp: Add --debug-checksum=<funcs> to show per-instruction checksums") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/a1158c99-fe0e-a218-4b5b-ffac212489f6@redhat.com
2026-01-13uapi: promote EFSCORRUPTED and EUCLEAN to errno.hDarrick J. Wong
Stop definining these privately and instead move them to the uapi errno.h so that they become canonical instead of copy pasta. Cc: linux-api@vger.kernel.org Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/176826402587.3490369.17659117524205214600.stgit@frogsfrogsfrogs Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-13rtla: Fix parse_cpu_set() bug introduced by strtoi()Costa Shulyupin
The patch 'Replace atoi() with a robust strtoi()' introduced a bug in parse_cpu_set(), which relies on partial parsing of the input string. The function parses CPU specifications like '0-3,5' by incrementing a pointer through the string. strtoi() rejects strings with trailing characters, causing parse_cpu_set() to fail on any CPU list with multiple entries. Restore the original use of atoi() in parse_cpu_set(). Fixes: 7e9dfccf8f11 ("rtla: Replace atoi() with a robust strtoi()") Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20260112192642.212848-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2026-01-12selftests/net/ipsec: Fix variable size type not at the end of structAnkit Khushwaha
The "struct alg" object contains a union of 3 xfrm structures: union { struct xfrm_algo; struct xfrm_algo_aead; struct xfrm_algo_auth; } All of them end with a flexible array member used to store key material, but the flexible array appears at *different offsets* in each struct. bcz of this, union itself is of variable-sized & Placing it above char buf[...] triggers: ipsec.c:835:5: warning: field 'u' with variable sized type 'union (unnamed union at ipsec.c:831:3)' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end] 835 | } u; | ^ one fix is to use "TRAILING_OVERLAP()" which works with one flexible array member only. But In "struct alg" flexible array member exists in all union members, but not at the same offset, so TRAILING_OVERLAP cannot be applied. so the fix is to explicitly overlay the key buffer at the correct offset for the largest union member (xfrm_algo_auth). This ensures that the flexible-array region and the fixed buffer line up. No functional change. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Link: https://patch.msgid.link/20260109152201.15668-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12selftests/bpf: Use the correct destructor kfunc typeSami Tolvanen
With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_testmod_ctx_release() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-9-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-12selftests: ublk: add stop command with --safe optionMing Lei
Add 'stop' subcommand to kublk utility that uses the new UBLK_CMD_TRY_STOP_DEV command when --safe option is specified. This allows stopping a device only if it has no active openers, returning -EBUSY otherwise. Also add test_generic_16.sh to test the new functionality. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12tools: ynl: cli: print reply in combined format if possibleJakub Kicinski
As pointed out during review of the --list-attrs support the GET ops very often return the same attrs from do and dump. Make the output more readable by combining the reply information, from: Do request attributes: - ifindex: u32 netdev ifindex Do reply attributes: - ifindex: u32 netdev ifindex [ .. other attrs .. ] Dump reply attributes: - ifindex: u32 netdev ifindex [ .. other attrs .. ] To, after: Do request attributes: - ifindex: u32 netdev ifindex Do and Dump reply attributes: - ifindex: u32 netdev ifindex [ .. other attrs .. ] Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: extract the event/notify handling in --list-attrsJakub Kicinski
Event and notify handling is quite different from do / dump handling. Forcing it into print_mode_attrs() doesn't really buy us anything as events and notifications do not have requests. Call print_attr_list() directly. Apart form subjective code clarity this also removes the word "reply" from the output: Before: Event reply attributes: Now: Event attributes: Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: factor out --list-attrs / --doc handlingJakub Kicinski
We'll soon add more code to the --doc handling. Factor it out to avoid making main() too long. Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: add --doc as alias to --list-attrsJakub Kicinski
--list-attrs also provides information about the operation itself. So --doc seems more appropriate. Add an alias. Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: improve --helpJakub Kicinski
Improve the clarity of --help. Reorder, provide some grouping and add help messages to most of the options. No functional changes intended. Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: wrap the doc text if it's longJakub Kicinski
We already use textwrap when printing "doc" section about an attribute, but only to indent the text. Switch to using fill() to split and indent all the lines. While at it indent the text by 2 more spaces, so that it doesn't align with the name of the attribute. Before (I'm drawing a "box" at ~60 cols here, in an attempt for clarity): | - irq-suspend-timeout: uint | | The timeout, in nanoseconds, of how long to suspend irq| |processing, if event polling finds events | After: | - irq-suspend-timeout: uint | | The timeout, in nanoseconds, of how long to suspend | | irq processing, if event polling finds events | Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12tools: ynl: cli: introduce formatting for attr names in --list-attrsJakub Kicinski
It's a little hard to make sense of the output of --list-attrs, it looks like a wall of text. Sprinkle a little bit of formatting - make op and attr names bold, and Enum: / Flags: keywords italics. Tested-by: Gal Pressman <gal@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260110233142.3921386-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12x86/xen: Drop xen_mmu_opsJuergen Gross
Instead of having a pre-filled array xen_mmu_ops for Xen PV paravirt functions, drop the array and assign each element individually. This is in preparation of reducing the paravirt include hell by splitting paravirt.h into multiple more fine grained header files, which will in turn require to split up the pv_ops vector as well. Dropping the pre-filled array makes life easier for objtool to detect missing initializers in multiple pv_ops_ arrays. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://patch.msgid.link/20260105110520.21356-18-jgross@suse.com
2026-01-12perf addr2line.c: Rename a2l_style to cmd_a2l_styleIan Rogers
The a2l_style is only relevant to the command line version, so rename to make this clearer. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12perf addr2line: Add a libdw implementationIan Rogers
Add an implementation of addr2line that uses libdw. Other addr2line implementations are slow, particularly in the case of forking addr2line. Add an implementation that caches the libdw information in the dso and uses it to find the file and line number information. Inline information is supported but because cu_walk_functions_at visits the leaf function last add a inline_list__append_tail to reverse the lists order. Committer testing: # perf probe -x ~/bin/perf libdw__addr2line Added new event: probe_perf:libdw_addr2line (on libdw__addr2line in /home/acme/bin/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:libdw_addr2line -aR sleep 1 # # perf stat -e probe_perf:libdw_addr2line perf report -f --dso perf --stdio -s srcfile,srcline # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 4K of event 'cpu/cycles/Pu' # Event count (approx.): 5535180842 # # Overhead Source File Source:Line # ........ ............ ............... # 99.04% inlineloop.c inlineloop.c:21 0.46% inlineloop.c inlineloop.c:20 # # (Tip: For tracepoint events, try: perf report -s trace_fields) # Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline': 44 probe_perf:libdw_addr2line 0.037260744 seconds time elapsed 0.025299000 seconds user 0.011918000 seconds sys # Adding probes to the other addr2line implementations (llvm__addr2line, libbfd__addr2line and cmd__addr2line) I noticed some fallbacks to the llvm one: Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline': 44 probe_perf:libdw_addr2line 23 probe_perf:llvm_addr2line 0 probe_perf:libbfd_addr2line 0 probe_perf:cmd_addr2line Something to investigate further, but at least we don't fallback to the cmd based one :-) Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12perf test workload: Add inlineloop test workloadIan Rogers
The purpose of this workload is to gather samples in an inlined function. This can be used to test whether inlined addr2line works correctly. Committer testing: $ perf record perf test -w inlineloop 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.161 MB perf.data (4005 samples) ] $ perf report --stdio --dso perf -s srcfile,srcline # # Total Lost Samples: 0 # # Samples: 4K of event 'cpu/cycles/Pu' # Event count (approx.): 5535180842 # # Overhead Source File Source:Line # ........ ............ ............... # 99.04% inlineloop.c inlineloop.c:21 0.46% inlineloop.c inlineloop.c:20 # $ Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12perf unwind-libdw: Fix invalid reference countsIan Rogers
The addition of addr_location__exit() causes use-after put on the maps and map references in the unwind info. Add the gets and then add the map_symbol__exit() calls. Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12cgroup/cpuset: Move the v1 empty cpus/mems check to cpuset1_validate_change()Waiman Long
As stated in commit 1c09b195d37f ("cpuset: fix a regression in validating config change"), it is not allowed to clear masks of a cpuset if there're tasks in it. This is specific to v1 since empty "cpuset.cpus" or "cpuset.mems" will cause the v2 cpuset to inherit the effective CPUs or memory nodes from its parent. So it is OK to have empty cpus or mems even if there are tasks in the cpuset. Move this empty cpus/mems check in validate_change() to cpuset1_validate_change() to allow more flexibility in setting cpus or mems in v2. cpuset_is_populated() needs to be moved into cpuset-internal.h as it is needed by the empty cpus/mems checking code. Also add a test case to test_cpuset_prs.sh to verify that. Reported-by: Chen Ridong <chenridong@huaweicloud.com> Closes: https://lore.kernel.org/lkml/7a3ec392-2e86-4693-aa9f-1e668a668b9c@huaweicloud.com/ Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-01-12cgroup/cpuset: Don't invalidate sibling partitions on cpuset.cpus conflictWaiman Long
Currently, when setting a cpuset's cpuset.cpus to a value that conflicts with the cpuset.cpus/cpuset.cpus.exclusive of a sibling partition, the sibling's partition state becomes invalid. This is overly harsh and is probably not necessary. The cpuset.cpus.exclusive control file, if set, will override the cpuset.cpus of the same cpuset when creating a cpuset partition. So cpuset.cpus has less priority than cpuset.cpus.exclusive in setting up a partition. However, it cannot override a conflicting cpuset.cpus file in a sibling cpuset and the partition creation process will fail. This is inconsistent. That will also make using cpuset.cpus.exclusive less valuable as a tool to set up cpuset partitions as the users have to check if such a cpuset.cpus conflict exists or not. Fix these problems by making sure that once a cpuset.cpus.exclusive is set without failure, it will always be allowed to form a valid partition as long as at least one CPU can be granted from its parent irrespective of the state of the siblings' cpuset.cpus values. Of course, setting cpuset.cpus.exclusive will fail if it conflicts with the cpuset.cpus.exclusive or the cpuset.cpus.exclusive.effective value of a sibling. Partition can still be created by setting only cpuset.cpus without setting cpuset.cpus.exclusive. However, any conflicting CPUs in sibling's cpuset.cpus.exclusive.effective and cpuset.cpus.exclusive values will be removed from its cpuset.cpus.exclusive.effective as long as there is still one or more CPUs left and can be granted from its parent. This CPU stripping is currently done in rm_siblings_excl_cpus(). The new code will now try its best to enable the creation of new partitions with only cpuset.cpus set without invalidating existing ones. However it is not guaranteed that all the CPUs requested in cpuset.cpus will be used in the new partition even when all these CPUs can be granted from the parent. This is similar to the fact that cpuset.cpus.effective may not be able to include all the CPUs requested in cpuset.cpus. In this case, the parent may not able to grant all the exclusive CPUs requested in cpuset.cpus to cpuset.cpus.exclusive.effective if some of them have already been granted to other partitions earlier. With the creation of multiple sibling partitions by setting only cpuset.cpus, this does have the side effect that their exact cpuset.cpus.exclusive.effective settings will depend on the order of partition creation if there are conflicts. Due to the exclusive nature of the CPUs in a partition, it is not easy to make it fair other than the old behavior of invalidating all the conflicting partitions. For example, # echo "0-2" > A1/cpuset.cpus # echo "root" > A1/cpuset.cpus.partition # cat A1/cpuset.cpus.partition root # cat A1/cpuset.cpus.exclusive.effective 0-2 # echo "2-4" > B1/cpuset.cpus # echo "root" > B1/cpuset.cpus.partition # cat B1/cpuset.cpus.partition root # cat B1/cpuset.cpus.exclusive.effective 3-4 # cat B1/cpuset.cpus.effective 3-4 For users who want to be sure that they can get most of the CPUs they want, cpuset.cpus.exclusive should be used instead if they can set it successfully without failure. Setting cpuset.cpus.exclusive will guarantee that sibling conflicts from then onward is no longer possible. To make this change, we have to separate out the is_cpu_exclusive() check in cpus_excl_conflict() into a cgroup v1 only cpuset1_cpus_excl_conflict() helper. The cpus_allowed_validate_change() helper is now no longer needed and can be removed. Some existing tests in test_cpuset_prs.sh are updated and new ones are added to reflect the new behavior. The cgroup-v2.rst doc file is also updated the clarify what exclusive CPUs will be used when a partition is created. Reported-by: Sun Shaojie <sunshaojie@kylinos.cn> Closes: https://lore.kernel.org/lkml/20251117015708.977585-1-sunshaojie@kylinos.cn/ Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-01-12perf test subcmd help: Add exclude disjoint subcmd namesIan Rogers
The test is based on an error/fix posted to linux-perf-users. Reported-by: Sri Jayaramappa <sjayaram@akamai.com> Reviewed-by: Sri Jayaramappa <sjayaram@akamai.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Closes: https://lore.kernel.org/linux-perf-users/20251202213632.2873731-1-sjayaram@akamai.com/ Closes: https://urldefense.com/v3/__https://lore.kernel.org/linux-perf-users/20251202213632.2873731-1-sjayaram@akamai.com/__;!!GjvTz_vk!XehekKNUE4Ib_tvqIH6PMIIhly4X3BZ-Y40RC1HKMQ-6OdYEFvUPQhyWv_gk9vsRRN4_RcOLS2Bh0CQ$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12perf stat display: Make %f precision consistentIan Rogers
Commit bc22de9bcdb22491 ("perf stat: Display time in precision based on std deviation") added multirun workload elapsed time. There was an effort to make the precision in the output most useful for the user, however, when gathering over runs it means the formatting varies. This change just makes the output format fixed. Before: ``` $ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done 0.101140 +- 0.000149 seconds time elapsed ( +- 0.15% ) 0.1011396 +- 0.0000218 seconds time elapsed ( +- 0.02% ) 0.101331 +- 0.000124 seconds time elapsed ( +- 0.12% ) ^C $ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done 1.001317 +- 0.000146 seconds time elapsed ( +- 0.01% ) 1.001377 +- 0.000172 seconds time elapsed ( +- 0.02% ) 1.00253 +- 0.00131 seconds time elapsed ( +- 0.13% ) ``` After: ``` $ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done 0.101406408 +- 0.000064778 seconds time elapsed ( +- 0.06% ) 0.101367315 +- 0.000027253 seconds time elapsed ( +- 0.03% ) 0.101434164 +- 0.000084750 seconds time elapsed ( +- 0.08% ) ^C $ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done 1.001525467 +- 0.000051703 seconds time elapsed ( +- 0.01% ) 1.001375093 +- 0.000116200 seconds time elapsed ( +- 0.01% ) 1.001141025 +- 0.000046361 seconds time elapsed ( +- 0.00% ) ``` Closes: https://lore.kernel.org/lkml/aTQRgAOpKyI53TEq@gmail.com/ Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12x86/xen: Drop xen_cpu_opsJuergen Gross
Instead of having a pre-filled array xen_cpu_ops for Xen PV paravirt functions, drop the array and assign each element individually. This is in preparation of reducing the paravirt include hell by splitting paravirt.h into multiple more fine grained header files, which will in turn require to split up the pv_ops vector as well. Dropping the pre-filled array makes life easier for objtool to detect missing initializers in multiple pv_ops_ arrays. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://patch.msgid.link/20260105110520.21356-17-jgross@suse.com
2026-01-12perf build: Raise minimum shellcheck version to 0.7.2Nicolas Schier
Raise the minimum shellcheck version for perf builds to 0.7.2, so that systems with shellcheck versions below 0.7.2 will automatically skip the shell script checking, even if NO_SHELLCHECK is unset. Since commit 241f21be7d0fdf3c ("perf test perftool_testsuite: Use absolute paths"), shellcheck versions before 0.7.2 break the perf build with several SC1090 [2] warnings due to its too strict dynamic source handling [1], e.g.: In tests/shell/base_probe/test_line_semantics.sh line 20: . "$DIR_PATH/../common/init.sh" ^---------------------------^ SC1090: Can't follow non-constant source. Use a directive to specify location. Fixes: 241f21be7d0fdf3c ("perf test perftool_testsuite: Use absolute paths") Signed-off-by: Nicolas Schier <n.schier@avm.de> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Brnak <jbrnak@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Nicolas Schier <nsc@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philipp Hahn <p.hahn@avm.de> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://github.com/koalaman/shellcheck/issues/1998 # [1] Link: https://www.shellcheck.net/wiki/SC1090 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12perf test stat tests: Fix for virtualized machinesThomas Richter
On s390 'perf test's 'perf stat tests', subtest test_hybrid fails for z/VM systems. The root cause is this statement: $(perf stat -a -- sleep 0.1 2>&1 |\ grep -E "/cpu-cycles/[uH]*| cpu-cycles[:uH]* -c) The 'perf stat' output on a s390 z/VM system is # perf stat -a -- sleep 0.1 2>&1 Performance counter stats for 'system wide': 56 context-switches # 46.3 cs/sec cs_per_second 1,210.41 msec cpu-clock # 11.9 CPUs CPUs_utilized 12 cpu-migrations # 9.9 migrations/sec ... 81 page-faults # 66.9 faults/sec ... 0.100891009 seconds time elapsed The grep command does not match any single line and exits with error code 1. As the bash script is executed with 'set -e', it aborts with the first error code being non-zero. Fix this and use 'wc -l' to count matching lines instead of 'grep ... -c'. Output before: # perf test 102 102: perf stat tests : FAILED! # Output after: # perf test 102 102: perf stat tests : Ok # Fixes: bb6e7cb11d97ce19 ("perf tools: Add fallback for exclude_guest") Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jan Polensky <japo@linux.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12sched_ext: Add error logging for dsq creation failuresGeorge Guo
Add scx_bpf_error() calls when scx_bpf_create_dsq() fails in multiple schedulers to improve debuggability: - scx_central.bpf.c: central_init() - scx_flatcg.bpf.c: fcg_cgroup_init() and fcg_init() - scx_qmap.bpf.c: qmap_init() Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-01-12x86/xen: Drop xen_irq_opsJuergen Gross
Instead of having a pre-filled array xen_irq_ops for Xen PV paravirt functions, drop the array and assign each element individually. This is in preparation of reducing the paravirt include hell by splitting paravirt.h into multiple more fine grained header files, which will in turn require to split up the pv_ops vector as well. Dropping the pre-filled array makes life easier for objtool to detect missing initializers in multiple pv_ops_ arrays. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://patch.msgid.link/20260105110520.21356-16-jgross@suse.com
2026-01-12selftests: ublk: add end-to-end integrity testCaleb Sander Mateos
Add test case loop_08 to verify the ublk integrity data flow. It uses the kublk loop target to create a ublk device with integrity on top of backing data and integrity files. It then writes to the whole device with fio configured to generate integrity data. Then it reads back the whole device with fio configured to verify the integrity data. It also verifies that injected guard, reftag, and apptag corruptions are correctly detected. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: add integrity params testCaleb Sander Mateos
Add test case null_04 to exercise all the different integrity params. It creates 4 different ublk devices with different combinations of integrity arguments and verifies their integrity limits via sysfs and the metadata_size utility. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: add integrity data support to loop targetCaleb Sander Mateos
To perform and end-to-end test of integrity information through a ublk device, we need to actually store it somewhere and retrieve it. Add this support to kublk's loop target. It uses a second backing file for the integrity data corresponding to the data stored in the first file. The integrity file is initialized with byte 0xFF, which ensures the app and reference tags are set to the "escape" pattern to disable the bio-integrity-auto guard and reftag checks until the blocks are written. The integrity file is opened without O_DIRECT since it will be accessed at sub-block granularity. Each incoming read/write results in a pair of reads/writes, one to the data file, and one to the integrity file. If either backing I/O fails, the error is propagated to the ublk request. If both backing I/Os read/write some bytes, the ublk request is completed with the smaller of the number of blocks accessed by each I/O. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: support non-O_DIRECT backing filesCaleb Sander Mateos
A subsequent commit will add support for using a backing file to store integrity data. Since integrity data is accessed in intervals of metadata_size, which may be much smaller than a logical block on the backing device, direct I/O cannot be used. Add an argument to backing_file_tgt_init() to specify the number of files to open for direct I/O. The remaining files will use buffered I/O. For now, continue to request direct I/O for all the files. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: implement integrity user copy in kublkCaleb Sander Mateos
If integrity data is enabled for kublk, allocate an integrity buffer for each I/O. Extend ublk_user_copy() to copy the integrity data between the ublk request and the integrity buffer if the ublksrv_io_desc indicates that the request has integrity data. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: add kublk support for integrity paramsCaleb Sander Mateos
Add integrity param command line arguments to kublk. Plumb these to struct ublk_params for the null and fault_inject targets, as they don't need to actually read or write the integrity data. Forbid the integrity params for loop or stripe until the integrity data copy is implemented. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: add utility to get block device metadata sizeCaleb Sander Mateos
Some block device integrity parameters are available in sysfs, but others are only accessible using the FS_IOC_GETLBMD_CAP ioctl. Add a metadata_size utility program to print out the logical block metadata size, PI offset, and PI size within the metadata. Example output: $ metadata_size /dev/ublkb0 metadata_size: 64 pi_offset: 56 pi_tuple_size: 8 Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-12selftests: ublk: display UBLK_F_INTEGRITY supportCaleb Sander Mateos
Add support for printing the UBLK_F_INTEGRITY feature flag in the human-readable kublk features output. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>