summaryrefslogtreecommitdiffstats
path: root/arch/arm64/kvm
AgeCommit message (Collapse)Author
2016-05-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull second batch of KVM updates from Radim Krčmář: "General: - move kvm_stat tool from QEMU repo into tools/kvm/kvm_stat (kvm_stat had nothing to do with QEMU in the first place -- the tool only interprets debugfs) - expose per-vm statistics in debugfs and support them in kvm_stat (KVM always collected per-vm statistics, but they were summarised into global statistics) x86: - fix dynamic APICv (VMX was improperly configured and a guest could access host's APIC MSRs, CVE-2016-4440) - minor fixes ARM changes from Christoffer Dall: - new vgic reimplementation of our horribly broken legacy vgic implementation. The two implementations will live side-by-side (with the new being the configured default) for one kernel release and then we'll remove the legacy one. - fix for a non-critical issue with virtual abort injection to guests" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits) tools: kvm_stat: Add comments tools: kvm_stat: Introduce pid monitoring KVM: Create debugfs dir and stat files for each VM MAINTAINERS: Add kvm tools tools: kvm_stat: Powerpc related fixes tools: Add kvm_stat man page tools: Add kvm_stat vm monitor script kvm:vmx: more complete state update on APICv on/off KVM: SVM: Add more SVM_EXIT_REASONS KVM: Unify traced vector format svm: bitwise vs logical op typo KVM: arm/arm64: vgic-new: Synchronize changes to active state KVM: arm/arm64: vgic-new: enable build KVM: arm/arm64: vgic-new: implement mapped IRQ handling KVM: arm/arm64: vgic-new: Wire up irqfd injection KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable KVM: arm/arm64: vgic-new: vgic_init: implement map_resources KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init ...
2016-05-20KVM: arm/arm64: vgic-new: enable buildAndre Przywara
Now that the new VGIC implementation has reached feature parity with the old one, add the new files to the build system and add a Kconfig option to switch between the two versions. We set the default to the new version to get maximum test coverage, in case people experience problems they can switch back to the old behaviour if needed. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-20kvm: arm64: Fix EC field in inject_abt64Matt Evans
The EC field of the constructed ESR is conditionally modified by ORing in ESR_ELx_EC_DABT_LOW for a data abort. However, ESR_ELx_EC_SHIFT is missing from this condition. Signed-off-by: Matt Evans <matt.evans@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-19Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Small release overall. x86: - miscellaneous fixes - AVIC support (local APIC virtualization, AMD version) s390: - polling for interrupts after a VCPU goes to halted state is now enabled for s390 - use hardware provided information about facility bits that do not need any hypervisor activity, and other fixes for cpu models and facilities - improve perf output - floating interrupt controller improvements. MIPS: - miscellaneous fixes PPC: - bugfixes only ARM: - 16K page size support - generic firmware probing layer for timer and GIC Christoffer Dall (KVM-ARM maintainer) says: "There are a few changes in this pull request touching things outside KVM, but they should all carry the necessary acks and it made the merge process much easier to do it this way." though actually the irqchip maintainers' acks didn't make it into the patches. Marc Zyngier, who is both irqchip and KVM-ARM maintainer, later acked at http://mid.gmane.org/573351D1.4060303@arm.com ('more formally and for documentation purposes')" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (82 commits) KVM: MTRR: remove MSR 0x2f8 KVM: x86: make hwapic_isr_update and hwapic_irr_update look the same svm: Manage vcpu load/unload when enable AVIC svm: Do not intercept CR8 when enable AVIC svm: Do not expose x2APIC when enable AVIC KVM: x86: Introducing kvm_x86_ops.apicv_post_state_restore svm: Add VMEXIT handlers for AVIC svm: Add interrupt injection via AVIC KVM: x86: Detect and Initialize AVIC support svm: Introduce new AVIC VMCB registers KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick KVM: x86: Introducing kvm_x86_ops VCPU blocking/unblocking hooks KVM: x86: Introducing kvm_x86_ops VM init/destroy hooks KVM: x86: Rename kvm_apic_get_reg to kvm_lapic_get_reg KVM: x86: Misc LAPIC changes to expose helper functions KVM: shrink halt polling even more for invalid wakeups KVM: s390: set halt polling to 80 microseconds KVM: halt_polling: provide a way to qualify wakeups during poll KVM: PPC: Book3S HV: Re-enable XICS fast path for irqfd-generated interrupts kvm: Conditionally register IRQ bypass consumer ...
2016-05-16Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - virt_to_page/page_address optimisations - support for NUMA systems described using device-tree - support for hibernate/suspend-to-disk - proper support for maxcpus= command line parameter - detection and graceful handling of AArch64-only CPUs - miscellaneous cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits) arm64: do not enforce strict 16 byte alignment to stack pointer arm64: kernel: Fix incorrect brk randomization arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str arm64: secondary_start_kernel: Remove unnecessary barrier arm64: Ensure pmd_present() returns false after pmd_mknotpresent() arm64: Replace hard-coded values in the pmd/pud_bad() macros arm64: Implement pmdp_set_access_flags() for hardware AF/DBM arm64: Fix typo in the pmdp_huge_get_and_clear() definition arm64: mm: remove unnecessary EXPORT_SYMBOL_GPL arm64: always use STRICT_MM_TYPECHECKS arm64: kvm: Fix kvm teardown for systems using the extended idmap arm64: kaslr: increase randomization granularity arm64: kconfig: drop CONFIG_RTC_LIB dependency arm64: make ARCH_SUPPORTS_DEBUG_PAGEALLOC depend on !HIBERNATION arm64: hibernate: Refuse to hibernate if the boot cpu is offline arm64: kernel: Add support for hibernate/suspend-to-disk PM / Hibernate: Call flush_icache_range() on pages restored in-place arm64: Add new asm macro copy_page arm64: Promote KERNEL_START/KERNEL_END definitions to a header file arm64: kernel: Include _AC definition in page.h ...
2016-05-09kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tablesCatalin Marinas
The ARMv8.1 architecture extensions introduce support for hardware updates of the access and dirty information in page table entries. With VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the PTE_AF bit cleared in the stage 2 page table, instead of raising an Access Flag fault to EL2 the CPU sets the actual page table entry bit (10). To ensure that kernel modifications to the page table do not inadvertently revert a bit set by hardware updates, certain Stage 2 software pte/pmd operations must be performed atomically. The main user of the AF bit is the kvm_age_hva() mechanism. The kvm_age_hva_handler() function performs a "test and clear young" action on the pte/pmd. This needs to be atomic in respect of automatic hardware updates of the AF bit. Since the AF bit is in the same position for both Stage 1 and Stage 2, the patch reuses the existing ptep_test_and_clear_young() functionality if __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the existing pte_young/pte_mkold mechanism is preserved. The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have to perform atomic modifications in order to avoid a race with updates of the AF bit. The arm64 implementation has been re-written using exclusives. Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer argument and modify the pte/pmd in place. However, these functions are only used on local variables rather than actual page table entries, so it makes more sense to follow the pte_mkwrite() approach for stage 1 attributes. The change to kvm_s2pte_mkwrite() makes it clear that these functions do not modify the actual page table entries. The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit explicitly) do not need to be modified since hardware updates of the dirty status are not supported by KVM, so there is no possibility of losing such information. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03arm64: kvm: Fix kvm teardown for systems using the extended idmapJames Morse
If memory is located above 1<<VA_BITS, kvm adds an extra level to its page tables, merging the runtime tables and boot tables that contain the idmap. This lets us avoid the trampoline dance during initialisation. This also means there is no trampoline page mapped, so __cpu_reset_hyp_mode() can't call __kvm_hyp_reset() in this page. The good news is the idmap is still mapped, so we don't need the trampoline page. The bad news is we can't call it directly as the idmap is above HYP_PAGE_OFFSET, so its address is masked by kvm_call_hyp. Add a function __extended_idmap_trampoline which will branch into __kvm_hyp_reset in the idmap, change kvm_hyp_reset_entry() to return this address if __kvm_cpu_uses_extended_idmap(). In this case __kvm_hyp_reset() will still switch to the boot tables (which are the merged tables that were already in use), and branch into the idmap (where it already was). This fixes boot failures on these systems, where we fail to execute the missing trampoline page when tearing down kvm in init_subsystems(): [ 2.508922] kvm [1]: 8-bit VMID [ 2.512057] kvm [1]: Hyp mode initialized successfully [ 2.517242] kvm [1]: interrupt-controller@e1140000 IRQ13 [ 2.522622] kvm [1]: timer IRQ3 [ 2.525783] Kernel panic - not syncing: HYP panic: [ 2.525783] PS:200003c9 PC:0000007ffffff820 ESR:86000005 [ 2.525783] FAR:0000007ffffff820 HPFAR:00000000003ffff0 PAR:0000000000000000 [ 2.525783] VCPU: (null) [ 2.525783] [ 2.547667] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.6.0-rc5+ #1 [ 2.555137] Hardware name: Default string Default string/Default string, BIOS ROD0084E 09/03/2015 [ 2.563994] Call trace: [ 2.566432] [<ffffff80080888d0>] dump_backtrace+0x0/0x240 [ 2.571818] [<ffffff8008088b24>] show_stack+0x14/0x20 [ 2.576858] [<ffffff80083423ac>] dump_stack+0x94/0xb8 [ 2.581899] [<ffffff8008152130>] panic+0x10c/0x250 [ 2.586677] [<ffffff8008152024>] panic+0x0/0x250 [ 2.591281] SMP: stopping secondary CPUs [ 3.649692] SMP: failed to stop secondary CPUs 0-2,4-7 [ 3.654818] Kernel Offset: disabled [ 3.658293] Memory Limit: none [ 3.661337] ---[ end Kernel panic - not syncing: HYP panic: [ 3.661337] PS:200003c9 PC:0000007ffffff820 ESR:86000005 [ 3.661337] FAR:0000007ffffff820 HPFAR:00000000003ffff0 PAR:0000000000000000 [ 3.661337] VCPU: (null) [ 3.661337] Reported-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: kvm: allows kvm cpu hotplugAKASHI Takahiro
The current kvm implementation on arm64 does cpu-specific initialization at system boot, and has no way to gracefully shutdown a core in terms of kvm. This prevents kexec from rebooting the system at EL2. This patch adds a cpu tear-down function and also puts an existing cpu-init code into a separate function, kvm_arch_hardware_disable() and kvm_arch_hardware_enable() respectively. We don't need the arm64 specific cpu hotplug hook any more. Since this patch modifies common code between arm and arm64, one stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid compilation errors. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> [Rebase, added separate VHE init/exit path, changed resets use of kvm_call_hyp() to the __version, en/disabled hardware in init_subsystems(), added icache maintenance to __kvm_hyp_reset() and removed lr restore, removed guest-enter after teardown handling] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: hyp/kvm: Make hyp-stub reject kvm_call_hyp()James Morse
A later patch implements kvm_arch_hardware_disable(), to remove kvm from el2, and re-instate the hyp-stub. This can happen while guests are running, particularly when kvm_reboot() calls kvm_arch_hardware_disable() on each cpu. This can interrupt a guest, remove kvm, then allow the guest to be scheduled again. This causes kvm_call_hyp() to be run against the hyp-stub. Change the hyp-stub to return a new exception type when this happens, and add code to kvm's handle_exit() to tell userspace we failed to enter the guest. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: hyp/kvm: Make hyp-stub extensibleGeoff Levand
The existing arm64 hcall implementations are limited in that they only allow for two distinct hcalls; with the x0 register either zero or not zero. Also, the API of the hyp-stub exception vector routines and the KVM exception vector routines differ; hyp-stub uses a non-zero value in x0 to implement __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp. To allow for additional hcalls to be defined and to make the arm64 hcall API more consistent across exception vector routines, change the hcall implementations to reserve all x0 values below 0xfff for hcalls such as {s,g}et_vectors(). Define two new preprocessor macros HVC_GET_VECTORS, and HVC_SET_VECTORS to be used as hcall type specifiers and convert the existing __hyp_get_vectors() and __hyp_set_vectors() routines to use these new macros when executing an HVC call. Also, change the corresponding hyp-stub and KVM el1_sync exception vector routines to use these new macros. Signed-off-by: Geoff Levand <geoff@infradead.org> [Merged two hcall patches, moved immediate value from esr to x0, use lr as a scratch register, changed limit to 0xfff] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: kvm: Move lr save/restore from do_el2_call into EL1James Morse
Today the 'hvc' calling KVM or the hyp-stub is expected to preserve all registers. KVM saves/restores the registers it needs on the EL2 stack using do_el2_call(). The hyp-stub has no stack, later patches need to be able to be able to clobber the link register. Move the link register save/restore to the the call sites. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-28arm64: Cleanup SCTLR flagsGeoff Levand
We currently have macros defining flags for the arm64 sctlr registers in both kvm_arm.h and sysreg.h. To clean things up and simplify move the definitions of the SCTLR_EL2 flags from kvm_arm.h to sysreg.h, rename any SCTLR_EL1 or SCTLR_EL2 flags that are common to both registers to be SCTLR_ELx, with 'x' indicating a common flag, and fixup all files to include the proper header or to use the new macro names. Signed-off-by: Geoff Levand <geoff@infradead.org> [Restored pgtable-hwdef.h include] Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-21arm64: kvm: Add support for 16K pagesSuzuki K Poulose
Now that we can handle stage-2 page tables independent of the host page table levels, wire up the 16K page support. Cc: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
2016-04-06arm64: KVM: Warn when PARange is less than 40 bitsMarc Zyngier
We always thought that 40bits of PA range would be the minimum people would actually build. Anything less is terrifyingly small. Turns out that we were both right and wrong. Nobody has ever built such a system, but the ARM Foundation Model has a PARange set to 36bits. Just because we can. Oh well. Now, the KVM API explicitely says that we offer a 40bit PA space to the VM, so we shouldn't run KVM on the Foundation Model at all. That being said, this patch offers a less agressive alternative, and loudly warns about the configuration being unsupported. You'll still be able to run VMs (at your own risks, though). This is just a workaround until we have a proper userspace API where we report the PARange to userspace. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-30arm64: kvm: 4.6-rc1: Fix VTCR_EL2 VS settingSuzuki K Poulose
When we detect support for 16bit VMID in ID_AA64MMFR1, we set the VTCR_EL2_VS field to 1 to make use of 16bit vmids. But, with commit 3a3604bc5eb4 ("arm64: KVM: Switch to C-based stage2 init") this is broken and we corrupt VTCR_EL2:T0SZ instead of updating the VS field. VTCR_EL2_VS was actually defined to the field shift (19) and not the real value for VS. This patch fixes the issue. Fixes: commit 3a3604bc5eb4 ("arm64: KVM: Switch to C-based stage2 init") Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-21kvm: arm64: Disable compiler instrumentation for hypervisor codeCatalin Marinas
With the recent rewrite of the arm64 KVM hypervisor code in C, enabling certain options like KASAN would allow the compiler to generate memory accesses or function calls to addresses not mapped at EL2. This patch disables the compiler instrumentation on the arm64 hypervisor code for gcov-based profiling (GCOV_KERNEL), undefined behaviour sanity checker (UBSAN) and kernel address sanitizer (KASAN). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-03-17Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Here are the main arm64 updates for 4.6. There are some relatively intrusive changes to support KASLR, the reworking of the kernel virtual memory layout and initial page table creation. Summary: - Initial page table creation reworked to avoid breaking large block mappings (huge pages) into smaller ones. The ARM architecture requires break-before-make in such cases to avoid TLB conflicts but that's not always possible on live page tables - Kernel virtual memory layout: the kernel image is no longer linked to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of the vmalloc space, allowing the kernel to be loaded (nearly) anywhere in physical RAM - Kernel ASLR: position independent kernel Image and modules being randomly mapped in the vmalloc space with the randomness is provided by UEFI (efi_get_random_bytes() patches merged via the arm64 tree, acked by Matt Fleming) - Implement relative exception tables for arm64, required by KASLR (initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but actual x86 conversion to deferred to 4.7 because of the merge dependencies) - Support for the User Access Override feature of ARMv8.2: this allows uaccess functions (get_user etc.) to be implemented using LDTR/STTR instructions. Such instructions, when run by the kernel, perform unprivileged accesses adding an extra level of protection. The set_fs() macro is used to "upgrade" such instruction to privileged accesses via the UAO bit - Half-precision floating point support (part of ARMv8.2) - Optimisations for CPUs with or without a hardware prefetcher (using run-time code patching) - copy_page performance improvement to deal with 128 bytes at a time - Sanity checks on the CPU capabilities (via CPUID) to prevent incompatible secondary CPUs from being brought up (e.g. weird big.LITTLE configurations) - valid_user_regs() reworked for better sanity check of the sigcontext information (restored pstate information) - ACPI parking protocol implementation - CONFIG_DEBUG_RODATA enabled by default - VDSO code marked as read-only - DEBUG_PAGEALLOC support - ARCH_HAS_UBSAN_SANITIZE_ALL enabled - Erratum workaround Cavium ThunderX SoC - set_pte_at() fix for PROT_NONE mappings - Code clean-ups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits) arm64: kasan: Fix zero shadow mapping overriding kernel image shadow arm64: kasan: Use actual memory node when populating the kernel image shadow arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission arm64: Fix misspellings in comments. arm64: efi: add missing frame pointer assignment arm64: make mrs_s prefixing implicit in read_cpuid arm64: enable CONFIG_DEBUG_RODATA by default arm64: Rework valid_user_regs arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly arm64: KVM: Move kvm_call_hyp back to its original localtion arm64: mm: treat memstart_addr as a signed quantity arm64: mm: list kernel sections in order arm64: lse: deal with clobbered IP registers after branch via PLT arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR arm64: kconfig: add submenu for 8.2 architectural features arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot arm64: Add support for Half precision floating point arm64: Remove fixmap include fragility arm64: Add workaround for Cavium erratum 27456 arm64: mm: Mark .rodata as RO ...
2016-03-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
2016-03-09arm64: KVM: vgic-v3: Only wipe LRs on vcpu exitMarc Zyngier
So far, we're always writing all possible LRs, setting the empty ones with a zero value. This is obvious doing a low of work for nothing, and we're better off clearing those we've actually dirtied on the exit path (it is very rare to inject more than one interrupt at a time anyway). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09arm64: KVM: vgic-v3: Reset LRs at boot timeMarc Zyngier
In order to let the GICv3 code be more lazy in the way it accesses the LRs, it is necessary to start with a clean slate. Let's reset the LRs on each CPU when the vgic is probed (which includes a round trip to EL2...). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09arm64: KVM: vgic-v3: Do not save an LR known to be emptyMarc Zyngier
On exit, any empty LR will be signaled in ICH_ELRSR_EL2. Which means that we do not have to save it, and we can just clear its state in the in-memory copy. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09arm64: KVM: vgic-v3: Save maintenance interrupt state only if requiredMarc Zyngier
Next on our list of useless accesses is the maintenance interrupt status registers (ICH_MISR_EL2, ICH_EISR_EL2). It is pointless to save them if we haven't asked for a maintenance interrupt the first place, which can only happen for two reasons: - Underflow: ICH_HCR_UIE will be set, - EOI: ICH_LR_EOI will be set. These conditions can be checked on the in-memory copies of the regs. Should any of these two condition be valid, we must read GICH_MISR. We can then check for ICH_MISR_EOI, and only when set read ICH_EISR_EL2. This means that in most case, we don't have to save them at all. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-03-09arm64: KVM: vgic-v3: Avoid accessing ICH registersMarc Zyngier
Just like on GICv2, we're a bit hammer-happy with GICv3, and access them more often than we should. Adopt a policy similar to what we do for GICv2, only save/restoring the minimal set of registers. As we don't access the registers linearly anymore (we may skip some), the convoluted accessors become slightly simpler, and we can drop the ugly indexing macro that tended to confuse the reviewers. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Switch the sys_reg search to be a binary searchMarc Zyngier
Our 64bit sys_reg table is about 90 entries long (so far, and the PMU support is likely to increase this). This means that on average, it takes 45 comparaisons to find the right entry (and actually the full 90 if we have to search the invariant table). Not the most efficient thing. Specially when you think that this table is already sorted. Switching to a binary search effectively reduces the search to about 7 comparaisons. Slightly better! As an added bonus, the comparison is done by comparing all the fields at once, instead of one at a time. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add a new vcpu device control group for PMUv3Shannon Zhao
To configure the virtual PMUv3 overflow interrupt number, we use the vcpu kvm_device ioctl, encapsulating the KVM_ARM_VCPU_PMU_V3_IRQ attribute within the KVM_ARM_VCPU_PMU_V3_CTRL group. After configuring the PMUv3, call the vcpu ioctl with attribute KVM_ARM_VCPU_PMU_V3_INIT to initialize the PMUv3. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Introduce per-vcpu kvm device controlsShannon Zhao
In some cases it needs to get/set attributes specific to a vcpu and so needs something else than ONE_REG. Let's copy the KVM_DEVICE approach, and define the respective ioctls for the vcpu file descriptor. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add a new feature bit for PMUv3Shannon Zhao
To support guest PMUv3, use one bit of the VCPU INIT feature array. Initialize the PMU when initialzing the vcpu with that bit and PMU overflow interrupt set. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Reset PMU state when resetting vcpuShannon Zhao
When resetting vcpu, it needs to reset the PMU state to initial status. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMUSERENR registerShannon Zhao
This register resets as unknown in 64bit mode while it resets as zero in 32bit mode. Here we choose to reset it as zero for consistency. PMUSERENR_EL0 holds some bits which decide whether PMU registers can be accessed from EL0. Add some check helpers to handle the access from EL0. When these bits are zero, only reading PMUSERENR will trap to EL2 and writing PMUSERENR or reading/writing other PMU registers will trap to EL1 other than EL2 when HCR.TGE==0. To current KVM configuration (HCR.TGE==0) there is no way to get these traps. Here we write 0xf to physical PMUSERENR register on VM entry, so that it will trap PMU access from EL0 to EL2. Within the register access handler we check the real value of guest PMUSERENR register to decide whether this access is allowed. If not allowed, return false to inject UND to guest. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add helper to handle PMCR register bitsShannon Zhao
According to ARMv8 spec, when writing 1 to PMCR.E, all counters are enabled by PMCNTENSET, while writing 0 to PMCR.E, all counters are disabled. When writing 1 to PMCR.P, reset all event counters, not including PMCCNTR, to zero. When writing 1 to PMCR.C, reset PMCCNTR to zero. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMSWINC registerShannon Zhao
Add access handler which emulates writing and reading PMSWINC register and add support for creating software increment event. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMOVSSET and PMOVSCLR registerShannon Zhao
Since the reset value of PMOVSSET and PMOVSCLR is UNKNOWN, use reset_unknown for its reset handler. Add a handler to emulate writing PMOVSSET or PMOVSCLR register. When writing non-zero value to PMOVSSET, the counter and its interrupt is enabled, kick this vcpu to sync PMU interrupt. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMINTENSET and PMINTENCLR registerShannon Zhao
Since the reset value of PMINTENSET and PMINTENCLR is UNKNOWN, use reset_unknown for its reset handler. Add a handler to emulate writing PMINTENSET or PMINTENCLR register. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for event type registerShannon Zhao
These kind of registers include PMEVTYPERn, PMCCFILTR and PMXEVTYPER which is mapped to PMEVTYPERn or PMCCFILTR. The access handler translates all aarch32 register offsets to aarch64 ones and uses vcpu_sys_reg() to access their values to avoid taking care of big endian. When writing to these registers, create a perf_event for the selected event type. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMCNTENSET and PMCNTENCLR registerShannon Zhao
Since the reset value of PMCNTENSET and PMCNTENCLR is UNKNOWN, use reset_unknown for its reset handler. Add a handler to emulate writing PMCNTENSET or PMCNTENCLR register. When writing to PMCNTENSET, call perf_event_enable to enable the perf event. When writing to PMCNTENCLR, call perf_event_disable to disable the perf event. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for event counter registerShannon Zhao
These kind of registers include PMEVCNTRn, PMCCNTR and PMXEVCNTR which is mapped to PMEVCNTRn. The access handler translates all aarch32 register offsets to aarch64 ones and uses vcpu_sys_reg() to access their values to avoid taking care of big endian. When reading these registers, return the sum of register value and the value perf event counts. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMCEID0 and PMCEID1 registerShannon Zhao
Add access handler which gets host value of PMCEID0 or PMCEID1 when guest access these registers. Writing action to PMCEID0 or PMCEID1 is UNDEFINED. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMSELR registerShannon Zhao
Since the reset value of PMSELR_EL0 is UNKNOWN, use reset_unknown for its reset handler. When reading PMSELR, return the PMSELR.SEL field to guest. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Add access handler for PMCR registerShannon Zhao
Add reset handler which gets host value of PMCR_EL0 and make writable bits architecturally UNKNOWN except PMCR.E which is zero. Add an access handler for PMCR. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Define PMU data structure for each vcpuShannon Zhao
Here we plan to support virtual PMU for guest by full software emulation, so define some basic structs and functions preparing for futher steps. Define struct kvm_pmc for performance monitor counter and struct kvm_pmu for performance monitor unit for each vcpu. According to ARMv8 spec, the PMU contains at most 32(ARMV8_PMU_MAX_COUNTERS) counters. Since this only supports ARM64 (or PMUv3), add a separate config symbol for it. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Move vgic-v2 and timer save/restore to virt/kvm/arm/hypMarc Zyngier
We already have virt/kvm/arm/ containing timer and vgic stuff. Add yet another subdirectory to contain the hyp-specific files (timer and vgic again). Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Move kvm/hyp/hyp.h to include/asm/kvm_hyp.hMarc Zyngier
In order to be able to move code outside of kvm/hyp, we need to make the global hyp.h file accessible from a standard location. include/asm/kvm_hyp.h seems good enough. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Move most of the fault decoding to CMarc Zyngier
The fault decoding process (including computing the IPA in the case of a permission fault) would be much better done in C code, as we have a reasonable infrastructure to deal with the VHE/non-VHE differences. Let's move the whole thing to C, including the workaround for erratum 834220, and just patch the odd ESR_EL2 access remaining in hyp-entry.S. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Add alternative panic handlingMarc Zyngier
As the kernel fully runs in HYP when VHE is enabled, we can directly branch to the kernel's panic() implementation, and not perform an exception return. Add the alternative code to deal with this. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Add fpsimd enabling on guest accessMarc Zyngier
Despite the fact that a VHE enabled kernel runs at EL2, it uses CPACR_EL1 to trap FPSIMD access. Add the required alternative code to re-enable guest FPSIMD access when it has trapped to EL2. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Use unified sysreg accessors for timerMarc Zyngier
Switch the timer code to the unified sysreg accessors. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Implement VHE activate/deactivate_trapsMarc Zyngier
Running the kernel in HYP mode requires the HCR_E2H bit to be set at all times, and the HCR_TGE bit to be set when running as a host (and cleared when running as a guest). At the same time, the vector must be set to the current role of the kernel (either host or hypervisor), and a couple of system registers differ between VHE and non-VHE. We implement these by using another set of alternate functions that get dynamically patched. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Make __fpsimd_enabled VHE awareMarc Zyngier
As non-VHE and VHE have different ways to express the trapping of FPSIMD registers to EL2, make __fpsimd_enabled a patchable predicate and provide a VHE implementation. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Enable minimal sysreg save/restoreMarc Zyngier
We're now in a position where we can introduce VHE's minimal save/restore, which is limited to the handful of shared sysregs. Add the required alternative function calls that result in a "do nothing" call on VHE, and the normal save/restore for non-VHE. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Use unified system register accessorsMarc Zyngier
Use the recently introduced unified system register accessors for those sysregs that behave differently depending on VHE being in use or not. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>