kernel/arch/x86/kvm/hyperv.c, branch master

KVM: x86: hyper-v: Validate all GVAs during PV TLB flush

2026-03-11T17:41:11Z

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28Z

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook

KVM: x86: Add a wrapper to handle common case of IRQ delivery without dest_map

2026-01-12T17:31:40Z

Turn kvm_irq_delivery_to_apic() into a wrapper that passes NULL for the @dest_map param, as only the ugly I/O APIC RTC hackery needs to know which vCPUs received the IRQ. No functional change intended. Link: https://patch.msgid.link/20251206004311.479939-7-seanjc@google.com Signed-off-by: Sean Christopherson

KVM: x86: Use "checked" versions of get_user() and put_user()

2025-11-17T15:50:20Z

Use the normal, checked versions for get_user() and put_user() instead of the double-underscore versions that omit range checks, as the checked versions are actually measurably faster on modern CPUs (12%+ on Intel, 25%+ on AMD). The performance hit on the unchecked versions is almost entirely due to the added LFENCE on CPUs where LFENCE is serializing (which is effectively all modern CPUs), which was added by commit 304ec1b05031 ("x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec"). The small optimizations done by commit b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") likely shave a few cycles off, but the bulk of the extra latency comes from the LFENCE. Don't bother trying to open-code an equivalent for performance reasons, as the loss of inlining (e.g. see commit ea6f043fc984 ("x86: Make __get_user() generate an out-of-line call") is largely a non-factor (ignoring setups where RET is something entirely different), As measured across tens of millions of calls of guest PTE reads in FNAME(walk_addr_generic): __get_user() get_user() open-coded open-coded, no LFENCE Intel (EMR) 75.1 67.6 75.3 65.5 AMD (Turin) 68.1 51.1 67.5 49.3 Note, Hyper-V MSR emulation is not a remotely hot path, but convert it anyways for consistency, and because there is a general desire to remove __{get,put}_user() entirely. Reported-by: Linus Torvalds Closes: https://lore.kernel.org/all/CAHk-=wimh_3jM9Xe8Zx0rpuf8CPDu6DkRCGb44azk0Sz5yqSnw@mail.gmail.com Cc: Borislav Petkov Link: https://patch.msgid.link/20251106210206.221558-1-seanjc@google.com Signed-off-by: Sean Christopherson

KVM: x86: Export KVM-internal symbols for sub-modules only

2025-09-30T17:40:02Z

Rework almost all of KVM x86's exports to expose symbols only to KVM's vendor modules, i.e. to kvm-{amd,intel}.ko. Keep the generic exports that are guarded by CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y, as they're explicitly designed/intended for external usage. Link: https://lore.kernel.org/r/20250919003303.1355064-6-seanjc@google.com Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini

KVM: x86: hyper-v: Use guard() instead of mutex_lock() to simplify code

2025-09-16T19:55:11Z

Use guard(mutex) instead of mutex_lock/mutex_unlock pair to simplify the error handling when setting up the TSC page for a Hyper-V guest. No functional change intended. Signed-off-by: Liao Yuanhong Link: https://lore.kernel.org/r/20250901131604.646415-1-liaoyuanhong@vivo.com [sean: tweak changelog] Signed-off-by: Sean Christopherson

Merge tag 'kvm-x86-irqs-6.17' of https://github.com/kvm-x86/linux into HEAD

2025-07-29T12:35:46Z

KVM IRQ changes for 6.17 - Rework irqbypass to track/match producers and consumers via an xarray instead of a linked list. Using a linked list leads to O(n^2) insertion times, which is hugely problematic for use cases that create large numbers of VMs. Such use cases typically don't actually use irqbypass, but eliminating the pointless registration is a future problem to solve as it likely requires new uAPI. - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *", to avoid making a simple concept unnecessarily difficult to understand. - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC, and PIT emulation at compile time. - Drop x86's irq_comm.c, and move a pile of IRQ related code into irq.c. - Fix a variety of flaws and bugs in the AVIC device posted IRQ code. - Inhibited AVIC if a vCPU's ID is too big (relative to what hardware supports) instead of rejecting vCPU creation. - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning clear in the vCPU's physical ID table entry. - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by erratum #1235, to allow (safely) enabling AVIC on such CPUs. - Dedup x86's device posted IRQ code, as the vast majority of functionality can be shared verbatime between SVM and VMX. - Harden the device posted IRQ code against bugs and runtime errors. - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1) instead of O(n). - Generate GA Log interrupts if and only if the target vCPU is blocking, i.e. only if KVM needs a notification in order to wake the vCPU. - Decouple device posted IRQs from VFIO device assignment, as binding a VM to a VFIO group is not a requirement for enabling device posted IRQs. - Clean up and document/comment the irqfd assignment code. - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e. ensure an eventfd is bound to at most one irqfd through the entire host, and add a selftest to verify eventfd:irqfd bindings are globally unique.

KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush

2025-06-25T16:15:24Z

In KVM guests with Hyper-V hypercalls enabled, the hypercalls HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST and HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX allow a guest to request invalidation of portions of a virtual TLB. For this, the hypercall parameter includes a list of GVAs that are supposed to be invalidated. However, when non-canonical GVAs are passed, there is currently no filtering in place and they are eventually passed to checked invocations of INVVPID on Intel / INVLPGA on AMD. While AMD's INVLPGA silently ignores non-canonical addresses (effectively a no-op), Intel's INVVPID explicitly signals VM-Fail and ultimately triggers the WARN_ONCE in invvpid_error(): invvpid failed: ext=0x0 vpid=1 gva=0xaaaaaaaaaaaaa000 WARNING: CPU: 6 PID: 326 at arch/x86/kvm/vmx/vmx.c:482 invvpid_error+0x91/0xa0 [kvm_intel] Modules linked in: kvm_intel kvm 9pnet_virtio irqbypass fuse CPU: 6 UID: 0 PID: 326 Comm: kvm-vm Not tainted 6.15.0 #14 PREEMPT(voluntary) RIP: 0010:invvpid_error+0x91/0xa0 [kvm_intel] Call Trace: vmx_flush_tlb_gva+0x320/0x490 [kvm_intel] kvm_hv_vcpu_flush_tlb+0x24f/0x4f0 [kvm] kvm_arch_vcpu_ioctl_run+0x3013/0x5810 [kvm] Hyper-V documents that invalid GVAs (those that are beyond a partition's GVA space) are to be ignored. While not completely clear whether this ruling also applies to non-canonical GVAs, it is likely fine to make that assumption, and manual testing on Azure confirms "real" Hyper-V interprets the specification in the same way. Skip non-canonical GVAs when processing the list of address to avoid tripping the INVVPID failure. Alternatively, KVM could filter out "bad" GVAs before inserting into the FIFO, but practically speaking the only downside of pushing validation to the final processing is that doing so is suboptimal for the guest, and no well-behaved guest will request TLB flushes for non-canonical addresses. Fixes: 260970862c88 ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently") Cc: stable@vger.kernel.org Signed-off-by: Manuel Andreas Suggested-by: Vitaly Kuznetsov Link: https://lore.kernel.org/r/c090efb3-ef82-499f-a5e0-360fc8420fb7@tum.de Signed-off-by: Sean Christopherson

KVM: x86/hyper-v: Use preallocated per-vCPU buffer for de-sparsified vCPU masks

2025-06-24T19:20:16Z

Use a preallocated per-vCPU bitmap for tracking the unpacked set of vCPUs being targeted for Hyper-V's paravirt TLB flushing. If KVM_MAX_NR_VCPUS is set to 4096 (which is allowed even for MAXSMP=n builds), putting the vCPU mask on-stack pushes kvm_hv_flush_tlb() past the default FRAME_WARN limit. arch/x86/kvm/hyperv.c:2001:12: error: stack frame size (1288) exceeds limit (1024) in 'kvm_hv_flush_tlb' [-Werror,-Wframe-larger-than] 2001 | static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc) | ^ 1 error generated. Note, sparse_banks was given the same treatment by commit 7d5e88d301f8 ("KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks'"), for the exact same reason. Reported-by: Abinash Lalotra Closes: https://lore.kernel.org/all/20250613111023.786265-1-abinashsinghlalotra@gmail.com Link: https://lore.kernel.org/all/aEylI-O8kFnFHrOH@google.com Signed-off-by: Sean Christopherson

KVM: x86: Drop superfluous kvm_hv_set_sint() => kvm_hv_synic_set_irq() wrapper

2025-06-20T20:52:43Z

Drop the superfluous kvm_hv_set_sint() and instead wire up ->set() directly to its final destination, kvm_hv_synic_set_irq(). Keep hv_synic_set_irq() instead of kvm_hv_set_sint() to provide some amount of consistency in the ->set() helpers, e.g. to match kvm_pic_set_irq() and kvm_ioapic_set_irq(). kvm_set_msi() is arguably the oddball, e.g. kvm_set_msi_irq() should be something like kvm_msi_to_lapic_irq() so that kvm_set_msi() can instead be kvm_set_msi_irq(), but that's a future problem to solve. No functional change intended. Cc: Vitaly Kuznetsov Cc: Kai Huang Reviewed-by: Vitaly Kuznetsov Acked-by: Kai Huang Link: https://lore.kernel.org/r/20250611213557.294358-5-seanjc@google.com Signed-off-by: Sean Christopherson