summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
AgeCommit message (Collapse)Author
2025-09-10Merge tag 'vmscape-for-linus-20250904' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vmescape mitigation fixes from Dave Hansen: "Mitigate vmscape issue with indirect branch predictor flushes. vmscape is a vulnerability that essentially takes Spectre-v2 and attacks host userspace from a guest. It particularly affects hypervisors like QEMU. Even if a hypervisor may not have any sensitive data like disk encryption keys, guest-userspace may be able to attack the guest-kernel using the hypervisor as a confused deputy. There are many ways to mitigate vmscape using the existing Spectre-v2 defenses like IBRS variants or the IBPB flushes. This series focuses solely on IBPB because it works universally across vendors and all vulnerable processors. Further work doing vendor and model-specific optimizations can build on top of this if needed / wanted. Do the normal issue mitigation dance: - Add the CPU bug boilerplate - Add a list of vulnerable CPUs - Use IBPB to flush the branch predictors after running guests" * tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vmscape: Add old Intel CPUs to affected list x86/vmscape: Warn when STIBP is disabled with SMT x86/bugs: Move cpu_bugs_smt_update() down x86/vmscape: Enable the mitigation x86/vmscape: Add conditional IBPB mitigation x86/vmscape: Enumerate VMSCAPE bug Documentation/hw-vuln: Add VMSCAPE documentation
2025-09-09media: i2c: ov6650: Drop unused driverLaurent Pinchart
The ov6650 driver was introduced in v2.6.37 to support the OMAP1-based Amstrad Delta video phone. The platform still has a board file in the kernel, but support for the camera was dropped in commit ce548396a433 ("media: mach-omap1: board-ams-delta.c: remove soc_camera dependencies") in v5.9. The driver has been unused since as it has received neither ACPI nor DT support. The ov6650 driver is one of the last sensor drivers calling clk_set_rate(). This is deprecated, and calls to the function are being removed to avoid cargo-cult. As the driver is unlikely to ever be used again, drop it instead of trying to avoid call clk_set_rate(). Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Mehdi Djait <mehdi.djait@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-09-09Documentation: update Hans Verkuil's email addressHans Verkuil
Replace hverkuil@xs4all.nl by hverkuil@kernel.org. Signed-off-by: Hans Verkuil <hverkuil@kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-09-06md/md-llbitmap: introduce new lockless bitmapYu Kuai
Redundant data is used to enhance data fault tolerance, and the storage method for redundant data vary depending on the RAID levels. And it's important to maintain the consistency of redundant data. Bitmap is used to record which data blocks have been synchronized and which ones need to be resynchronized or recovered. Each bit in the bitmap represents a segment of data in the array. When a bit is set, it indicates that the multiple redundant copies of that data segment may not be consistent. Data synchronization can be performed based on the bitmap after power failure or readding a disk. If there is no bitmap, a full disk synchronization is required. Due to known performance issues with md-bitmap and the unreasonable implementations: - self-managed IO submitting like filemap_write_page(); - global spin_lock I have decided not to continue optimizing based on the current bitmap implementation, this new bitmap is invented without locking from IO fast path and can be used with fast disks. For designs and details, see the comments in drivers/md-llbitmap.c. Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-12-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Li Nan <linan122@huawei.com>
2025-09-06md/md-bitmap: delay registration of bitmap_ops until creating bitmapYu Kuai
Currently bitmap_ops is registered while allocating mddev, this is fine when there is only one bitmap_ops. Delay setting bitmap_ops until creating bitmap, so that user can choose which bitmap to use before running the array. Link: https://lore.kernel.org/linux-raid/20250721171557.34587-7-yukuai@kernel.org Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com>
2025-09-06md/md-bitmap: add a new sysfs api bitmap_typeYu Kuai
The api will be used by mdadm to set bitmap_type while creating new array or assembling array, prepare to add a new bitmap. Currently available options are: cat /sys/block/md0/md/bitmap_type none [bitmap] Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-6-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Li Nan <linan122@huawei.com>
2025-09-05cgroup: Merge branch 'for-6.17-fixes' into for-6.18Tejun Heo
Pull for-6.17-fixes to receive 79f919a89c9d ("cgroup: split cgroup_destroy_wq into 3 workqueues") to resolve its conflict with 7fa33aa3b001 ("cgroup: WQ_PERCPU added to alloc_workqueue users"). The latter adds WQ_PERCPU when creating cgroup_destroy_wq and the former splits the workqueue into three. Resolve by applying WQ_PERCPU to the three split workqueues. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-09-05xfs: remove deprecated sysctl knobsDarrick J. Wong
These sysctl knobs were scheduled for removal in September 2025. That time has come, so remove them. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-05xfs: remove deprecated mount optionsDarrick J. Wong
These four mount options were scheduled for removal in September 2025, so remove them now. Cc: preichl@redhat.com Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-05xfs: disable deprecated features by default in KconfigDarrick J. Wong
We promised to turn off these old features by default in September 2025. Do so now. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc5). No conflicts. Adjacent changes: include/net/sock.h c51613fa276f ("net: add sk->sk_drop_counters") 5d6b58c932ec ("net: lockless sock_i_ino()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04x86/cfi: Add "debug" option to "cfi=" bootparamKees Cook
Add "debug" option for "cfi=" bootparam to get details on early CFI initialization steps so future Kees can find breakage easier. Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250904034656.3670313-5-kees@kernel.org
2025-09-04x86/cfi: Document the "cfi=" bootparam optionsKees Cook
The kernel-parameters.txt didn't have a section for the cfi= options. Add it. Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250904034656.3670313-3-kees@kernel.org
2025-09-04x86/microcode: Add microcode loader debugging functionalityBorislav Petkov (AMD)
Instead of adding ad-hoc debugging glue to the microcode loader each time I need it, add debugging functionality which is not built by default. Simulate all patch handling the loader does except the actual loading of the microcode patch into the hardware. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250820135043.19048-3-bp@kernel.org
2025-09-04x86/microcode: Add microcode= cmdline parsingBorislav Petkov (AMD)
Add a "microcode=" command line argument after which all options can be passed in a comma-separated list. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Chang S. Bae <chang.seok.bae@intel.com> Link: https://lore.kernel.org/20250820135043.19048-2-bp@kernel.org
2025-09-03docs: admin-guide: Fix typo in nfsroot.rstZenghui Yu
There is an obvious mistake in nfsroot.rst where pxelinux was wrongly written as pxeliunx. Fix it. Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250827144738.43361-1-zenghui.yu@linux.dev
2025-09-03genirq: Add support for warning on long-running interrupt handlersWladislav Wiebe
Introduce a mechanism to detect and warn about prolonged interrupt handlers. With a new command-line parameter (irqhandler.duration_warn_us=), users can configure the duration threshold in microseconds when a warning in such format should be emitted: "[CPU14] long duration of IRQ[159:bad_irq_handler [long_irq]], took: 1330 us" The implementation uses local_clock() to measure the execution duration of the generic IRQ per-CPU event handler. Signed-off-by: Wladislav Wiebe <wladislav.wiebe@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/all/20250804093525.851-1-wladislav.wiebe@nokia.com
2025-08-29Fix typo in RAID arrays documentationVivek Alurkar
Changed "write-throuth" to "write-through". Signed-off-by: Vivek Alurkar <primalkenja@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250821051622.8341-2-primalkenja@gmail.com
2025-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/intel/idpf/idpf_txrx.c 02614eee26fb ("idpf: do not linearize big TSO packets") 6c4e68480238 ("idpf: remove obsolete stashing code") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27x86/bugs: Add attack vector controls for SSBDavid Kaplan
Attack vector controls for SSB were missed in the initial attack vector series. The default mitigation for SSB requires user-space opt-in so it is only relevant for user->user attacks. Check with attack vector controls when the command is auto - i.e., no explicit user selection has been done. Fixes: 2d31d2874663 ("x86/bugs: Define attack vectors relevant for each bug") Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250819192200.2003074-5-david.kaplan@amd.com
2025-08-25dm-pcache: add persistent cache target in device-mapperDongsheng Yang
This patch introduces dm-pcache, a new DM target that places a DAX- capable persistent-memory device in front of any slower block device and uses it as a high-throughput, low-latency cache. Design highlights ----------------- - DAX data path – data is copied directly between DRAM and the pmem mapping, bypassing the block layer’s overhead. - Segmented, crash-consistent layout - all layout metadata are dual-replicated CRC-protected. - atomic kset flushes; key replay on mount guarantees cache integrity even after power loss. - Striped multi-tree index - Multi‑tree indexing for high parallelism. - overlap-resolution logic ensures non-intersecting cached extents. - Background services - write-back worker flushes dirty keys in order, preserving backing-device crash consistency. This is important for checkpoint in cloud storage. - garbage collector reclaims clean segments when utilisation exceeds a tunable threshold. - Data integrity – optional CRC32 on cached payload; metadata always protected. Comparison with existing block-level caches --------------------------------------------------------------------------------------------------------------------------------- | Feature | pcache (this patch) | bcache | dm-writecache | |----------------------------------|---------------------------------|------------------------------|---------------------------| | pmem access method | DAX | bio (block I/O) | DAX | | Write latency (4 K rand-write) | ~5 µs | ~20 µs | ~5 µs | | Concurrency | multi subtree index | global index tree | single tree + wc_lock | | IOPS (4K randwrite, 32 numjobs) | 2.1 M | 352 K | 283 K | | Read-cache support | YES | YES | NO | | Deployment | no re-format of backend | backend devices must be | no re-format of backend | | | | reformatted | | | Write-back ordering | log-structured; | no ordering guarantee | no ordering guarantee | | | preserves app-IO-order | | | | Data integrity checks | metadata + data CRC(optional) | metadata CRC only | none | --------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-22cgroup: cgroup.stat.local time accountingTiffany Yang
There isn't yet a clear way to identify a set of "lost" time that everyone (or at least a wider group of users) cares about. However, users can perform some delay accounting by iterating over components of interest. This patch allows cgroup v2 freezing time to be one of those components. Track the cumulative time that each v2 cgroup spends freezing and expose it to userland via a new local stat file in cgroupfs. Thank you to Michal, who provided the ASCII art in the updated documentation. To access this value: $ mkdir /sys/fs/cgroup/test $ cat /sys/fs/cgroup/test/cgroup.stat.local freeze_time_total 0 Ensure consistent freeze time reads with freeze_seq, a per-cgroup sequence counter. Writes are serialized using the css_set_lock. Signed-off-by: Tiffany Yang <ynaffit@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-21Merge tag 'cgroup-for-6.17-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix NULL de-ref in css_rstat_exit() which could happen after allocation failure - Fix a cpuset partition handling bug and a couple other misc issues - Doc spelling fix * tag 'cgroup-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup: fixed spelling mistakes in documentation cgroup: avoid null de-ref in css_rstat_exit() cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() cgroup/cpuset: Fix a partition error with CPU hotplug cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
2025-08-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21docs: sysctl: add a few more top-level /proc/sys entriesRandy Dunlap
Add a few missing directories under /proc/sys. Fix punctuation and doubled words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250819075456.113623-1-rdunlap@infradead.org
2025-08-21docs: kernel-parameters: typo fix and add missing SPDX-License tagBartlomiej Kubik
Fix documentation issues by removing a duplicated word and adding the missing SPDX-License identifier. Signed-off-by: Bartlomiej Kubik <kubik.bartlomiej@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250819113551.34356-1-kubik.bartlomiej@gmail.com
2025-08-21fs: Add 'initramfs_options' to set initramfs mount optionsLichen Liu
When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs. By default, a tmpfs mount is limited to using 50% of the available RAM for its content. This can be problematic in memory-constrained environments, particularly during a kdump capture. In a kdump scenario, the capture kernel boots with a limited amount of memory specified by the 'crashkernel' parameter. If the initramfs is large, it may fail to unpack into the tmpfs rootfs due to insufficient space. This is because to get X MB of usable space in tmpfs, 2*X MB of memory must be available for the mount. This leads to an OOM failure during the early boot process, preventing a successful crash dump. This patch introduces a new kernel command-line parameter, initramfs_options, which allows passing specific mount options directly to the rootfs when it is first mounted. This gives users control over the rootfs behavior. For example, a user can now specify initramfs_options=size=75% to allow the tmpfs to use up to 75% of the available memory. This can significantly reduce the memory pressure for kdump. Consider a practical example: To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With the default 50% limit, this requires a memory pool of 96MB to be available for the tmpfs mount. The total memory requirement is therefore approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB. By using initramfs_options=size=75%, the memory pool required for the 48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed with a smaller crashkernel size, such as 192MB. An alternative approach of reusing the existing rootflags parameter was considered. However, a new, dedicated initramfs_options parameter was chosen to avoid altering the current behavior of rootflags (which applies to the final root filesystem) and to prevent any potential regressions. Also add documentation for the new kernel parameter "initramfs_options" This approach is inspired by prior discussions and patches on the topic. Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128 Ref: https://landley.net/notes-2015.html#01-01-2015 Ref: https://lkml.org/lkml/2021/6/29/783 Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html#what-is-rootfs Signed-off-by: Lichen Liu <lichliu@redhat.com> Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com Tested-by: Rob Landley <rob@landley.net> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-20net: set net.core.rmem_max and net.core.wmem_max to 4 MBEric Dumazet
SO_RCVBUF and SO_SNDBUF have limited range today, unless distros or system admins change rmem_max and wmem_max. Even iproute2 uses 1 MB SO_RCVBUF which is capped by the kernel. Decouple [rw]mem_max and [rw]mem_default and increase [rw]mem_max to 4 MB. Before: $ sysctl net.core.rmem_default net.core.rmem_max net.core.wmem_default net.core.wmem_max net.core.rmem_default = 212992 net.core.rmem_max = 212992 net.core.wmem_default = 212992 net.core.wmem_max = 212992 After: $ sysctl net.core.rmem_default net.core.rmem_max net.core.wmem_default net.core.wmem_max net.core.rmem_default = 212992 net.core.rmem_max = 4194304 net.core.wmem_default = 212992 net.core.wmem_max = 4194304 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250819174030.1986278-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19dm-vdo: Promote dm-vdo title to title headingBagas Sanjaya
dm-vdo docs currently has no explicit title heading but instead there are multiple section headings as top-level heading. As such, these sections are rendered as titles and inflates number of entries in the toctree index. Promote the first section heading ("dm-vdo") to title heading. Fixes: 04bf7ac646ab ("dm: add documentation for dm-vdo target") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-19docs: device-mapper: fixed spelling mistakes in documentationSoham Metha
found/fixed the following typos - flushs -> flushes in `Documentation/admin-guide/device-mapper/delay.rst` Signed-off-by: Soham Metha <sohammetha01@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-19docs: device-mapper: fix typos in delay.rst and vdo-design.rstShubham Sharma
Fixed the following typos in device-mapper documentation: - explicitely -> explicitly - approriate -> appropriate Signed-off-by: Shubham Sharma <slopixelz@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-18docs: Remove remainders of reiserfsDavid Sterba
Reiserfs has been removed in 6.13, there are still some mentions in the documentation about it and the tools. Remove those that don't seem relevant anymore but keep references to reiserfs' r5 hash used by some code. There's one change in a script scripts/selinux/install_policy.sh but it does not seem to be relevant either. Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250813100053.1291961-1-dsterba@suse.com
2025-08-18Merge branch 'bjorn' into docs-mwJonathan Corbet
A big set of typo fixes from Bjorn Helgaas
2025-08-18Documentation: Fix admin-guide typosBjorn Helgaas
Fix typos. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250813200526.290420-4-helgaas@kernel.org
2025-08-18docs: remove a duplicated word from kernel-parameters.txtVivek Yadav
Fix kernel-doc warning in kernel-parameters.txt WARNING: Possible repeated word: 'is' Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250816082452.219009-1-vivekyadav1207731111@gmail.com
2025-08-18docs: Replace dead links to spectre side channel white papersNikola Z. Ivanov
The papers are published by Intel, AMD and MIPS. Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250816190028.55573-1-zlatistiv@gmail.com
2025-08-17Merge tag 'x86_urgent_for_v6.17_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Remove a transitional asm/cpuid.h header which was added only as a fallback during cpuid helpers reorg - Initialize reserved fields in the SVSM page validation calls structure to zero in order to allow for future structure extensions - Have the sev-guest driver's buffers used in encryption operations be in linear mapping space as the encryption operation can be offloaded to an accelerator - Have a read-only MSR write when in an AMD SNP guest trap to the hypervisor as it is usually done. This makes the guest user experience better by simply raising a #GP instead of terminating said guest - Do not output AVX512 elapsed time for kernel threads because the data is wrong and fix a NULL pointer dereferencing in the process - Adjust the SRSO mitigation selection to the new attack vectors * tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpuid: Remove transitional <asm/cpuid.h> header x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero virt: sev-guest: Satisfy linear mapping requirement in get_derived_key() x86/sev: Improve handling of writes to intercepted TSC MSRs x86/fpu: Fix NULL dereference in avx512_status() x86/bugs: Select best SRSO mitigation
2025-08-14x86/vmscape: Enable the mitigationPawan Gupta
Enable the previously added mitigation for VMscape. Add the cmdline vmscape={off|ibpb|force} and sysfs reporting. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14Documentation/hw-vuln: Add VMSCAPE documentationPawan Gupta
VMSCAPE is a vulnerability that may allow a guest to influence the branch prediction in host userspace, particularly affecting hypervisors like QEMU. Add the documentation. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-13Docs: admin-guide: Correct spelling mistakeErick Karanja
Fix spelling mistake directoy to directory Reported-by: codespell Signed-off-by: Erick Karanja <karanja99erick@gmail.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250813071837.668613-1-karanja99erick@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-12docs: cgroup: fixed spelling mistakes in documentationSoham Metha
found/fixed the following typo - Availablity -> Availability in `Documentation/admin-guide/cgroup-v2.rst` Signed-off-by: Soham Metha <sohammetha01@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11docs: admin-guide: update to current minimum pipe size defaultŠtěpán Němec
The pipe size limit used when the fs.pipe-user-pages-soft sysctl value is reached was increased from one to two pages in commit 46c4c9d1beb7; update the documentation to match the new reality. Fixes: 46c4c9d1beb7 ("pipe: increase minimum default pipe size to 2 pages") Signed-off-by: Štěpán Němec <stepnem@smrk.net> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250729-pipedoc-v2-1-18b8e735a9c6@smrk.net
2025-08-11docs: aoe: Remove trailing whitespaceOsama Albahrani
Fix `ERROR: trailing whitespace` errors from scripts/checkpatch.pl Signed-off-by: Osama Albahrani <osalbahr@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250804152516.16493-1-osalbahr@gmail.com
2025-08-11x86/bugs: Select best SRSO mitigationDavid Kaplan
The SRSO bug can theoretically be used to conduct user->user or guest->guest attacks and requires a mitigation (namely IBPB instead of SBPB on context switch) for these. So mark SRSO as being applicable to the user->user and guest->guest attack vectors. Additionally, SRSO supports multiple mitigations which mitigate different potential attack vectors. Some CPUs are also immune to SRSO from certain attack vectors (like user->kernel). Use the specific attack vectors requiring mitigation to select the best SRSO mitigation to avoid unnecessary performance hits. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250721160310.1804203-1-david.kaplan@amd.com
2025-08-04Merge tag 'printk-for-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Add new "hash_pointers=[auto|always|never]" boot parameter to force the hashing even with "slab_debug" enabled - Allow to stop CPU, after losing nbcon console ownership during panic(), even without proper NMI - Allow to use the printk kthread immediately even for the 1st registered nbcon - Compiler warning removal * tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: nbcon: Allow reacquire during panic printk: Allow to use the printk kthread immediately even for 1st nbcon slab: Decouple slab_debug and no_hash_pointers vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format' compiler-gcc.h: Introduce __diag_GCC_all
2025-08-04Merge tag 'for-6.17/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mikulas Patocka: - fix checking for request-based stackable devices (dm-table) - fix corrupt_bio_byte setup checks (dm-flakey) - add support for resync w/o metadata devices (dm raid) - small code simplification (dm, dm-mpath, vm-vdo, dm-raid) - remove support for asynchronous hashes (dm-verity) - close smatch warning (dm-zoned-target) - update the documentation and enable inline-crypto passthrough (dm-thin) * tag 'for-6.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: set DM_TARGET_PASSES_CRYPTO feature for dm-thin dm-thin: update the documentation dm-raid: do not include dm-core.h vdo: omit need_resched() before cond_resched() md: dm-zoned-target: Initialize return variable r to avoid uninitialized use dm-verity: remove support for asynchronous hashes dm-mpath: don't print the "loaded" message if registering fails dm-mpath: make dm_unregister_path_selector return void dm: ima: avoid extra calls to strlen() dm: Simplify dm_io_complete() dm: Remove unnecessary return in dm_zone_endio() dm raid: add support for resync w/o metadata devices dm-flakey: Fix corrupt_bio_byte setup checks dm-table: fix checking for rq stackable devices
2025-08-03Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Significant patch series in this pull request: - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets us closer to being able to remove page->mapping - "relayfs: misc changes" (Jason Xing) does some maintenance and minor feature addition work in relayfs - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches us from static preallocation of the kdump crashkernel's working memory over to dynamic allocation. So the difficulty of a-priori estimation of the second kernel's needs is removed and the first kernel obtains extra memory - "generalize panic_print's dump function to be used by other kernel parts" (Feng Tang) implements some consolidation and rationalization of the various ways in which a failing kernel splats information at the operator * tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits) tools/getdelays: add backward compatibility for taskstats version kho: add test for kexec handover delaytop: enhance error logging and add PSI feature description samples: Kconfig: fix spelling mistake "instancess" -> "instances" fat: fix too many log in fat_chain_add() scripts/spelling.txt: add notifer||notifier to spelling.txt xen/xenbus: fix typo "notifer" net: mvneta: fix typo "notifer" drm/xe: fix typo "notifer" cxl: mce: fix typo "notifer" KVM: x86: fix typo "notifer" MAINTAINERS: add maintainers for delaytop ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() ucount: fix atomic_long_inc_below() argument type kexec: enable CMA based contiguous allocation stackdepot: make max number of pools boot-time configurable lib/xxhash: remove unused functions init/Kconfig: restore CONFIG_BROKEN help text lib/raid6: update recov_rvv.c zero page usage docs: update docs after introducing delaytop ...
2025-08-02stackdepot: make max number of pools boot-time configurableMatt Fleming
We're hitting the WARN in depot_init_pool() about reaching the stack depot limit because we have long stacks that don't dedup very well. Introduce a new start-up parameter to allow users to set the number of maximum stack depot pools. Link: https://lkml.kernel.org/r/20250718153928.94229-1-matt@readmodwrite.com Signed-off-by: Matt Fleming <mfleming@cloudflare.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-31Merge tag 'cgroup-for-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Allow css_rstat_updated() in NMI context to enable memory accounting for allocations in NMI context. - /proc/cgroups doesn't contain useful information for cgroup2 and was updated to only show v1 controllers. This unfortunately broke something in the wild. Add an option to bring back the old behavior to ease transition. - selftest updates and other cleanups. * tag 'cgroup-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Add compatibility option for content of /proc/cgroups selftests/cgroup: fix cpu.max tests cgroup: llist: avoid memory tears for llist_node selftests: cgroup: Fix missing newline in test_zswap_writeback_one selftests: cgroup: Allow longer timeout for kmem_dead_cgroups cleanup memcg: cgroup: call css_rstat_updated irrespective of in_nmi() cgroup: remove per-cpu per-subsystem locks cgroup: make css_rstat_updated nmi safe cgroup: support to enable nmi-safe css_rstat_updated selftests: cgroup: Fix compilation on pre-cgroupns kernels selftests: cgroup: Optionally set up v1 environment selftests: cgroup: Add support for named v1 hierarchies in test_core selftests: cgroup_util: Add helpers for testing named v1 hierarchies Documentation: cgroup: add section explaining controller availability cgroup: Drop sock_cgroup_classid() dummy implementation
2025-07-31Merge tag 'mm-stable-2025-07-30-15-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "As usual, many cleanups. The below blurbiage describes 42 patchsets. 21 of those are partially or fully cleanup work. "cleans up", "cleanup", "maintainability", "rationalizes", etc. I never knew the MM code was so dirty. "mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes) addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly mapped VMAs were not eligible for merging with existing adjacent VMAs. "mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park) adds a new kernel module which simplifies the setup and usage of DAMON in production environments. "stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig) is a cleanup to the writeback code which removes a couple of pointers from struct writeback_control. "drivers/base/node.c: optimization and cleanups" (Donet Tom) contains largely uncorrelated cleanups to the NUMA node setup and management code. "mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman) does some maintenance work on the userfaultfd code. "Readahead tweaks for larger folios" (Ryan Roberts) implements some tuneups for pagecache readahead when it is reading into order>0 folios. "selftests/mm: Tweaks to the cow test" (Mark Brown) provides some cleanups and consistency improvements to the selftests code. "Optimize mremap() for large folios" (Dev Jain) does that. A 37% reduction in execution time was measured in a memset+mremap+munmap microbenchmark. "Remove zero_user()" (Matthew Wilcox) expunges zero_user() in favor of the more modern memzero_page(). "mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand) addresses some warts which David noticed in the huge page code. These were not known to be causing any issues at this time. "mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park) provides some cleanup and consolidation work in DAMON. "use vm_flags_t consistently" (Lorenzo Stoakes) uses vm_flags_t in places where we were inappropriately using other types. "mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy) increases the reliability of large page allocation in the memfd code. "mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple) removes several now-unneeded PFN_* flags. "mm/damon: decouple sysfs from core" (SeongJae Park) implememnts some cleanup and maintainability work in the DAMON sysfs layer. "madvise cleanup" (Lorenzo Stoakes) does quite a lot of cleanup/maintenance work in the madvise() code. "madvise anon_name cleanups" (Vlastimil Babka) provides additional cleanups on top or Lorenzo's effort. "Implement numa node notifier" (Oscar Salvador) creates a standalone notifier for NUMA node memory state changes. Previously these were lumped under the more general memory on/offline notifier. "Make MIGRATE_ISOLATE a standalone bit" (Zi Yan) cleans up the pageblock isolation code and fixes a potential issue which doesn't seem to cause any problems in practice. "selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park) adds additional drgn- and python-based DAMON selftests which are more comprehensive than the existing selftest suite. "Misc rework on hugetlb faulting path" (Oscar Salvador) fixes a rather obscure deadlock in the hugetlb fault code and follows that fix with a series of cleanups. "cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport) rationalizes and cleans up the highmem-specific code in the CMA allocator. "mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand) provides cleanups and future-preparedness to the migration code. "mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park) adds some tracepoints to some DAMON auto-tuning code. "mm/damon: fix misc bugs in DAMON modules" (SeongJae Park) does that. "mm/damon: misc cleanups" (SeongJae Park) also does what it claims. "mm: folio_pte_batch() improvements" (David Hildenbrand) cleans up the large folio PTE batching code. "mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park) facilitates dynamic alteration of DAMON's inter-node allocation policy. "Remove unmap_and_put_page()" (Vishal Moola) provides a couple of page->folio conversions. "mm: per-node proactive reclaim" (Davidlohr Bueso) implements a per-node control of proactive reclaim - beyond the current memcg-based implementation. "mm/damon: remove damon_callback" (SeongJae Park) replaces the damon_callback interface with a more general and powerful damon_call()+damos_walk() interface. "mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes) implements a number of mremap cleanups (of course) in preparation for adding new mremap() functionality: newly permit the remapping of multiple VMAs when the user is specifying MREMAP_FIXED. It still excludes some specialized situations where this cannot be performed reliably. "drop hugetlb_free_pgd_range()" (Anthony Yznaga) switches some sparc hugetlb code over to the generic version and removes the thus-unneeded hugetlb_free_pgd_range(). "mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park) augments the present userspace-requested update of DAMON sysfs monitoring files. Automatic update is now provided, along with a tunable to control the update interval. "Some randome fixes and cleanups to swapfile" (Kemeng Shi) does what is claims. "mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand) provides (and uses) a means by which debug-style functions can grab a copy of a pageframe and inspect it locklessly without tripping over the races inherent in operating on the live pageframe directly. "use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan) addresses the large contention issues which can be triggered by reads from that procfs file. Latencies are reduced by more than half in some situations. The series also introduces several new selftests for the /proc/pid/maps interface. "__folio_split() clean up" (Zi Yan) cleans up __folio_split()! "Optimize mprotect() for large folios" (Dev Jain) provides some quite large (>3x) speedups to mprotect() when dealing with large folios. "selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian) does some cleanup work in the selftests code. "tools/testing: expand mremap testing" (Lorenzo Stoakes) extends the mremap() selftest in several ways, including adding more checking of Lorenzo's recently added "permit mremap() move of multiple VMAs" feature. "selftests/damon/sysfs.py: test all parameters" (SeongJae Park) extends the DAMON sysfs interface selftest so that it tests all possible user-requested parameters. Rather than the present minimal subset" * tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits) MAINTAINERS: add missing headers to mempory policy & migration section MAINTAINERS: add missing file to cgroup section MAINTAINERS: add MM MISC section, add missing files to MISC and CORE MAINTAINERS: add missing zsmalloc file MAINTAINERS: add missing files to page alloc section MAINTAINERS: add missing shrinker files MAINTAINERS: move memremap.[ch] to hotplug section MAINTAINERS: add missing mm_slot.h file THP section MAINTAINERS: add missing interval_tree.c to memory mapping section MAINTAINERS: add missing percpu-internal.h file to per-cpu section mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info() selftests/damon: introduce _common.sh to host shared function selftests/damon/sysfs.py: test runtime reduction of DAMON parameters selftests/damon/sysfs.py: test non-default parameters runtime commit selftests/damon/sysfs.py: generalize DAMON context commit assertion selftests/damon/sysfs.py: generalize monitoring attributes commit assertion selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion selftests/damon/sysfs.py: test DAMOS filters commitment selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion selftests/damon/sysfs.py: test DAMOS destinations commitment ...