summaryrefslogtreecommitdiffstats
path: root/arch/x86/entry/entry_32.S
AgeCommit message (Collapse)Author
2019-07-31x86: Use CONFIG_PREEMPTIONThomas Gleixner
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Switch the entry code, preempt and kprobes conditionals over to CONFIG_PREEMPTION. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190726212124.608488448@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-24x86/entry/32: Pass cr2 to do_async_page_fault()Matt Mullins
Commit a0d14b8909de ("x86/mm, tracing: Fix CR2 corruption") added the address parameter to do_async_page_fault(), but does not pass it from the 32-bit entry point. To plumb it through, factor-out common_exception_read_cr2 in the same fashion as common_exception, and uses it from both page_fault and async_page_fault. For a 32-bit KVM guest, this fixes: Run /sbin/init as init process Starting init: /sbin/init exists but couldn't execute it (error -14) Fixes: a0d14b8909de ("x86/mm, tracing: Fix CR2 corruption") Signed-off-by: Matt Mullins <mmullins@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190724042058.24506-1-mmullins@fb.com
2019-07-17x86/mm, tracing: Fix CR2 corruptionPeter Zijlstra
Despite the current efforts to read CR2 before tracing happens there still exist a number of possible holes: idtentry page_fault do_page_fault has_error_code=1 call error_entry TRACE_IRQS_OFF call trace_hardirqs_off* #PF // modifies CR2 CALL_enter_from_user_mode __context_tracking_exit() trace_user_exit(0) #PF // modifies CR2 call do_page_fault address = read_cr2(); /* whoopsie */ And similar for i386. Fix it by pulling the CR2 read into the entry code, before any of that stuff gets a chance to run and ruin things. Reported-by: He Zhe <zhe.he@windriver.com> Reported-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: bp@alien8.de Cc: rostedt@goodmis.org Cc: torvalds@linux-foundation.org Cc: hpa@zytor.com Cc: dave.hansen@linux.intel.com Cc: jgross@suse.com Cc: joel@joelfernandes.org Link: https://lkml.kernel.org/r/20190711114336.116812491@infradead.org Debugged-by: Steven Rostedt <rostedt@goodmis.org>
2019-07-17x86/entry/32: Simplify common_exceptionPeter Zijlstra
Adding one more option to SAVE_ALL can be used in common_exception to simplify things. This also saves duplication later where page_fault will no longer use common_exception. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: bp@alien8.de Cc: torvalds@linux-foundation.org Cc: hpa@zytor.com Cc: dave.hansen@linux.intel.com Cc: jgross@suse.com Cc: zhe.he@windriver.com Cc: joel@joelfernandes.org Cc: devel@etsukata.com Link: https://lkml.kernel.org/r/20190711114335.945136187@infradead.org
2019-07-09x86/entry/32: Fix ENDPROC of common_spuriousJiri Slaby
common_spurious is currently ENDed erroneously. common_interrupt is used in its ENDPROC. So fix this mistake. Found by my asm macros rewrite patchset. Fixes: f8a8fe61fec8 ("x86/irq: Seperate unused system vectors from spurious entry again") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190709063402.19847-1-jslaby@suse.cz
2019-07-08Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "Most of the changes relate to Peter Zijlstra's cleanup of ptregs handling, in particular the i386 part is now much simplified and standardized - no more partial ptregs stack frames via the esp/ss oddity. This simplifies ftrace, kprobes, the unwinder, ptrace, kdump and kgdb. There's also a CR4 hardening enhancements by Kees Cook, to make the generic platform functions such as native_write_cr4() less useful as ROP gadgets that disable SMEP/SMAP. Also protect the WP bit of CR0 against similar attacks. The rest is smaller cleanups/fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Add int3_emulate_call() selftest x86/stackframe/32: Allow int3_emulate_push() x86/stackframe/32: Provide consistent pt_regs x86/stackframe, x86/ftrace: Add pt_regs frame annotations x86/stackframe, x86/kprobes: Fix frame pointer annotations x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.h x86/entry/32: Clean up return from interrupt preemption path x86/asm: Pin sensitive CR0 bits x86/asm: Pin sensitive CR4 bits Documentation/x86: Fix path to entry_32.S x86/asm: Remove unused TASK_TI_flags from asm-offsets.c
2019-07-03x86/irq: Seperate unused system vectors from spurious entry againThomas Gleixner
Quite some time ago the interrupt entry stubs for unused vectors in the system vector range got removed and directly mapped to the spurious interrupt vector entry point. Sounds reasonable, but it's subtly broken. The spurious interrupt vector entry point pushes vector number 0xFF on the stack which makes the whole logic in __smp_spurious_interrupt() pointless. As a consequence any spurious interrupt which comes from a vector != 0xFF is treated as a real spurious interrupt (vector 0xFF) and not acknowledged. That subsequently stalls all interrupt vectors of equal and lower priority, which brings the system to a grinding halt. This can happen because even on 64-bit the system vector space is not guaranteed to be fully populated. A full compile time handling of the unused vectors is not possible because quite some of them are conditonally populated at runtime. Bring the entry stubs back, which wastes 160 bytes if all stubs are unused, but gains the proper handling back. There is no point to selectively spare some of the stubs which are known at compile time as the required code in the IDT management would be way larger and convoluted. Do not route the spurious entries through common_interrupt and do_IRQ() as the original code did. Route it to smp_spurious_interrupt() which evaluates the vector number and acts accordingly now that the real vector numbers are handed in. Fixup the pr_warn so the actual spurious vector (0xff) is clearly distiguished from the other vectors and also note for the vectored case whether it was pending in the ISR or not. "Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen." "Spurious interrupt vector 0xed on CPU#1. Acked." "Spurious interrupt vector 0xee on CPU#1. Not pending!." Fixes: 2414e021ac8d ("x86: Avoid building unused IRQ entry stubs") Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Jan Beulich <jbeulich@suse.com> Link: https://lkml.kernel.org/r/20190628111440.550568228@linutronix.de
2019-06-25x86/stackframe/32: Provide consistent pt_regsPeter Zijlstra
Currently pt_regs on x86_32 has an oddity in that kernel regs (!user_mode(regs)) are short two entries (esp/ss). This means that any code trying to use them (typically: regs->sp) needs to jump through some unfortunate hoops. Change the entry code to fix this up and create a full pt_regs frame. This then simplifies various trampolines in ftrace and kprobes, the stack unwinder, ptrace, kdump and kgdb. Much thanks to Josh for help with the cleanups! Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.hPeter Zijlstra
In preparation for wider use, move the ENCODE_FRAME_POINTER macros to a common header and provide inline asm versions. These macros are used to encode a pt_regs frame for the unwinder; see unwind_frame.c:decode_frame_pointer(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25x86/entry/32: Clean up return from interrupt preemption pathPeter Zijlstra
The code flow around the return from interrupt preemption point seems needlessly complicated. There is only one site jumping to resume_kernel, and none (outside of resume_kernel) jumping to restore_all_kernel. Inline resume_kernel in restore_all_kernel and avoid the CONFIG_PREEMPT dependent label. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-06Merge branch 'x86-entry-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry cleanup from Ingo Molnar: "A single commit that removes a redundant complication from preempt-schedule handling in the x86 entry code" * 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Remove unneeded need_resched() loop
2019-04-05x86/entry: Remove unneeded need_resched() loopValentin Schneider
Since the enabling and disabling of IRQs within preempt_schedule_irq() is contained in a need_resched() loop, there is no need for the outer architecture specific loop. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20190311224752.8337-14-valentin.schneider@arm.com
2019-04-03sched/x86: Save [ER]FLAGS on context switchPeter Zijlstra
Effectively reverts commit: 2c7577a75837 ("sched/x86_64: Don't save flags on context switch") Specifically because SMAP uses FLAGS.AC which invalidates the claim that the kernel has clean flags. In particular; while preemption from interrupt return is fine (the IRET frame on the exception stack contains FLAGS) it breaks any code that does synchonous scheduling, including preempt_enable(). This has become a significant issue ever since commit: 5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses") provided for means of having 'normal' C code between STAC / CLAC, exposing the FLAGS.AC state. So far this hasn't led to trouble, however fix it before it comes apart. Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org Fixes: 5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-01Merge tag 'stackleak-v4.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull stackleak gcc plugin from Kees Cook: "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin was ported from grsecurity by Alexander Popov. It provides efficient stack content poisoning at syscall exit. This creates a defense against at least two classes of flaws: - Uninitialized stack usage. (We continue to work on improving the compiler to do this in other ways: e.g. unconditional zero init was proposed to GCC and Clang, and more plugin work has started too). - Stack content exposure. By greatly reducing the lifetime of valid stack contents, exposures via either direct read bugs or unknown cache side-channels become much more difficult to exploit. This complements the existing buddy and heap poisoning options, but provides the coverage for stacks. The x86 hooks are included in this series (which have been reviewed by Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already been merged through the arm64 tree (written by Laura Abbott and reviewed by Mark Rutland and Will Deacon). With VLAs having been removed this release, there is no need for alloca() protection, so it has been removed from the plugin" * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: Drop unneeded stackleak_check_alloca() stackleak: Allow runtime disabling of kernel stack erasing doc: self-protection: Add information about STACKLEAK feature fs/proc: Show STACKLEAK metrics in the /proc file system lkdtm: Add a test for STACKLEAK gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
2018-10-23Merge branch 'x86-paravirt-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Ingo Molnar: "Two main changes: - Remove no longer used parts of the paravirt infrastructure and put large quantities of paravirt ops under a new config option PARAVIRT_XXL=y, which is selected by XEN_PV only. (Joergen Gross) - Enable PV spinlocks on Hyperv (Yi Sun)" * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyperv: Enable PV qspinlock for Hyper-V x86/hyperv: Add GUEST_IDLE_MSR support x86/paravirt: Clean up native_patch() x86/paravirt: Prevent redefinition of SAVE_FLAGS macro x86/xen: Make xen_reservation_lock static x86/paravirt: Remove unneeded mmu related paravirt ops bits x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrella x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrella x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrella x86/paravirt: Introduce new config option PARAVIRT_XXL x86/paravirt: Remove unused paravirt bits x86/paravirt: Use a single ops structure x86/paravirt: Remove clobbers from struct paravirt_patch_site x86/paravirt: Remove clobbers parameter from paravirt patch functions x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() static x86/xen: Add SPDX identifier in arch/x86/xen files x86/xen: Link platform-pci-unplug.o only if CONFIG_XEN_PVHVM x86/xen: Move pv specific parts of arch/x86/xen/mmu.c to mmu_pv.c x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrella
2018-10-17x86/entry/32: Clear the CS high bitsJan Kiszka
Even if not on an entry stack, the CS's high bits must be initialized because they are unconditionally evaluated in PARANOID_EXIT_TO_KERNEL_MODE. Failing to do so broke the boot on Galileo Gen2 and IOT2000 boards. [ bp: Make the commit message tone passive and impartial. ] Fixes: b92a165df17e ("x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andrea Arcangeli <aarcange@redhat.com> CC: Andy Lutomirski <luto@kernel.org> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Brian Gerst <brgerst@gmail.com> CC: Dave Hansen <dave.hansen@intel.com> CC: David Laight <David.Laight@aculab.com> CC: Denys Vlasenko <dvlasenk@redhat.com> CC: Eduardo Valentin <eduval@amazon.com> CC: Greg KH <gregkh@linuxfoundation.org> CC: Ingo Molnar <mingo@kernel.org> CC: Jiri Kosina <jkosina@suse.cz> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Juergen Gross <jgross@suse.com> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Will Deacon <will.deacon@arm.com> CC: aliguori@amazon.com CC: daniel.gruss@iaik.tugraz.at CC: hughd@google.com CC: keescook@google.com CC: linux-mm <linux-mm@kvack.org> CC: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/f271c747-1714-5a5b-a71f-ae189a093b8d@siemens.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-09-04x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscallsAlexander Popov
The STACKLEAK feature (initially developed by PaX Team) has the following benefits: 1. Reduces the information that can be revealed through kernel stack leak bugs. The idea of erasing the thread stack at the end of syscalls is similar to CONFIG_PAGE_POISONING and memzero_explicit() in kernel crypto, which all comply with FDP_RIP.2 (Full Residual Information Protection) of the Common Criteria standard. 2. Blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712, CVE-2010-2963). That kind of bugs should be killed by improving C compilers in future, which might take a long time. This commit introduces the code filling the used part of the kernel stack with a poison value before returning to userspace. Full STACKLEAK feature also contains the gcc plugin which comes in a separate commit. The STACKLEAK feature is ported from grsecurity/PaX. More information at: https://grsecurity.net/ https://pax.grsecurity.net/ This code is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on our understanding of the code. Changes or omissions from the original code are ours and don't reflect the original grsecurity/PaX code. Performance impact: Hardware: Intel Core i7-4770, 16 GB RAM Test #1: building the Linux kernel on a single core 0.91% slowdown Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P 4.2% slowdown So the STACKLEAK description in Kconfig includes: "The tradeoff is the performance impact: on a single CPU system kernel compilation sees a 1% slowdown, other systems and workloads may vary and you are advised to test this feature on your expected workload before deploying it". Signed-off-by: Alexander Popov <alex.popov@linux.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-09-03x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrellaJuergen Gross
All functions in arch/x86/xen/irq.c and arch/x86/xen/xen-asm*.S are specific to PV guests. Include them in the kernel with CONFIG_XEN_PV only. Make the PV specific code in arch/x86/entry/entry_*.S dependent on CONFIG_XEN_PV instead of CONFIG_XEN. The HVM specific code should depend on CONFIG_XEN_PVHVM. While at it reformat the Makefile to make it more readable. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: virtualization@lists.linux-foundation.org Cc: akataria@vmware.com Cc: rusty@rustcorp.com.au Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180828074026.820-2-jgross@suse.com
2018-07-20x86/entry/32: Check for VM86 mode in slow-path checkJoerg Roedel
The SWITCH_TO_KERNEL_STACK macro only checks for CPL == 0 to go down the slow and paranoid entry path. The problem is that this check also returns true when coming from VM86 mode. This is not a problem by itself, as the paranoid path handles VM86 stack-frames just fine, but it is not necessary as the normal code path handles VM86 mode as well (and faster). Extend the check to include VM86 mode. This also makes an optimization of the paranoid path possible. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1532103744-31902-3-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Add debug code to check entry/exit CR3Joerg Roedel
Add code to check whether the kernel is entered and left with the correct CR3 and make it depend on CONFIG_DEBUG_ENTRY. This is needed because there is no NX protection of user-addresses in the kernel-CR3 on x86-32 and that type of bug would not be detected otherwise. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-40-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Add PTI CR3 switches to NMI handler codeJoerg Roedel
The NMI handler is special, as it needs to leave with the same CR3 as it was entered with. This is required because the NMI can happen within kernel context but with user CR3 already loaded, i.e. after switching to user CR3 but before returning to user space. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-14-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit pointsJoerg Roedel
Add unconditional cr3 switches between user and kernel cr3 to all non-NMI entry and exit points. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-13-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Simplify debug entry pointJoerg Roedel
The common exception entry code now handles the entry-from-sysenter stack situation and makes sure to leave with the same stack as it entered the kernel. So there is no need anymore for the special handling in the debug entry code. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-12-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Handle Entry from Kernel-Mode on Entry-StackJoerg Roedel
It is possible that the kernel is entered from kernel-mode and on the entry-stack. The most common way this happens is when an exception is triggered while loading the user-space segment registers on the kernel-to-userspace exit path. The segment loading needs to be done after the entry-stack switch, because the stack-switch needs kernel %fs for per_cpu access. When this happens, make sure to leave the kernel with the entry-stack again, so that the interrupted code-path runs on the right stack when switching to the user-cr3. Detect this condition on kernel-entry by checking CS.RPL and %esp, and if it happens, copy over the complete content of the entry stack to the task-stack. This needs to be done because once the exception handler is entereed, the task might be scheduled out or even migrated to a different CPU, so this cannot rely on the entry-stack contents. Leave a marker in the stack-frame to detect this condition on the exit path. On the exit path the copy is reversed, copy all of the remaining task-stack back to the entry-stack and switch to it. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-11-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Introduce SAVE_ALL_NMI and RESTORE_ALL_NMIJoerg Roedel
These macros will be used in the NMI handler code and replace plain SAVE_ALL and RESTORE_REGS there. The NMI-specific CR3-switch will be added to these macros later. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-10-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Leave the kernel via trampoline stackJoerg Roedel
Switch back to the trampoline stack before returning to userspace. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-9-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Enter the kernel via trampoline stackJoerg Roedel
Use the entry-stack as a trampoline to enter the kernel. The entry-stack is already in the cpu_entry_area and will be mapped to userspace when PTI is enabled. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-8-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Split off return-to-kernel pathJoerg Roedel
Use a separate return path when returning to the kernel. This allows to put the PTI cr3-switch and the switch to the entry-stack into the return-to-user path without further checking. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-7-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Unshare NMI return pathJoerg Roedel
NMI will no longer use most of the shared return path, because NMI needs special handling when the CR3 switches for PTI are added. Prepare for that change. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-6-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Put ESPFIX code into a macroJoerg Roedel
This makes it easier to split up the shared iret code path. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-5-git-send-email-joro@8bytes.org
2018-07-20x86/entry/32: Rename TSS_sysenter_sp0 to TSS_entry2task_stackJoerg Roedel
The stack address doesn't need to be stored in tss.sp0 if the stack is switched manually like on sysenter. Rename the offset so that it still makes sense when its location is changed in later patches. This stackk will also be used for all kernel-entry points, not just sysenter. Reflect that and the fact that it is the offset to the task-stack location in the name as well. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Pavel Machek <pavel@ucw.cz> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1531906876-13451-3-git-send-email-joro@8bytes.org
2018-06-26x86/entry/32: Add explicit 'l' instruction suffixJan Beulich
Omitting suffixes from instructions in AT&T mode is bad practice when operand size cannot be determined by the assembler from register operands, and is likely going to be warned about by upstream GAS in the future (mine does already). Add the single missing 'l' suffix here. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5B30C24702000078001CD6A6@prv1-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-14Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variablesLinus Torvalds
The changes to automatically test for working stack protector compiler support in the Kconfig files removed the special STACKPROTECTOR_AUTO option that picked the strongest stack protector that the compiler supported. That was all a nice cleanup - it makes no sense to have the AUTO case now that the Kconfig phase can just determine the compiler support directly. HOWEVER. It also meant that doing "make oldconfig" would now _disable_ the strong stackprotector if you had AUTO enabled, because in a legacy config file, the sane stack protector configuration would look like CONFIG_HAVE_CC_STACKPROTECTOR=y # CONFIG_CC_STACKPROTECTOR_NONE is not set # CONFIG_CC_STACKPROTECTOR_REGULAR is not set # CONFIG_CC_STACKPROTECTOR_STRONG is not set CONFIG_CC_STACKPROTECTOR_AUTO=y and when you ran this through "make oldconfig" with the Kbuild changes, it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version used to be disabled (because it was really enabled by AUTO), and would disable it in the new config, resulting in: CONFIG_HAVE_CC_STACKPROTECTOR=y CONFIG_CC_HAS_STACKPROTECTOR_NONE=y CONFIG_CC_STACKPROTECTOR=y # CONFIG_CC_STACKPROTECTOR_STRONG is not set CONFIG_CC_HAS_SANE_STACKPROTECTOR=y That's dangerously subtle - people could suddenly find themselves with the weaker stack protector setup without even realizing. The solution here is to just rename not just the old RECULAR stack protector option, but also the strong one. This does that by just removing the CC_ prefix entirely for the user choices, because it really is not about the compiler support (the compiler support now instead automatially impacts _visibility_ of the options to users). This results in "make oldconfig" actually asking the user for their choice, so that we don't have any silent subtle security model changes. The end result would generally look like this: CONFIG_HAVE_CC_STACKPROTECTOR=y CONFIG_CC_HAS_STACKPROTECTOR_NONE=y CONFIG_STACKPROTECTOR=y CONFIG_STACKPROTECTOR_STRONG=y CONFIG_CC_HAS_SANE_STACKPROTECTOR=y where the "CC_" versions really are about internal compiler infrastructure, not the user selections. Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-06Drivers: hv: vmbus: Implement Direct Mode for stimer0Michael Kelley
The 2016 version of Hyper-V offers the option to operate the guest VM per-vcpu stimer's in Direct Mode, which means the timer interupts on its own vector rather than queueing a VMbus message. Direct Mode reduces timer processing overhead in both the hypervisor and the guest, and avoids having timer interrupts pollute the VMbus interrupt stream for the synthetic NIC and storage. This patch enables Direct Mode by default on stimer0 when running on a version of Hyper-V that supports it. In prep for coming support of Hyper-V on ARM64, the arch independent portion of the code contains calls to routines that will be populated on ARM64 but are not needed and do nothing on x86. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-20Revert "x86/retpoline: Simplify vmexit_fill_RSB()"David Woodhouse
This reverts commit 1dde7415e99933bb7293d6b2843752cbdb43ec11. By putting the RSB filling out of line and calling it, we waste one RSB slot for returning from the function itself, which means one fewer actual function call we can make if we're doing the Skylake abomination of call-depth counting. It also changed the number of RSB stuffings we do on vmexit from 32, which was correct, to 16. Let's just stop with the bikeshedding; it didn't actually *fix* anything anyway. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-10Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - icache invalidation optimizations, improving VM startup time - support for forwarded level-triggered interrupts, improving performance for timers and passthrough platform devices - a small fix for power-management notifiers, and some cosmetic changes PPC: - add MMIO emulation for vector loads and stores - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without requiring the complex thread synchronization of older CPU versions - improve the handling of escalation interrupts with the XIVE interrupt controller - support decrement register migration - various cleanups and bugfixes. s390: - Cornelia Huck passed maintainership to Janosch Frank - exitless interrupts for emulated devices - cleanup of cpuflag handling - kvm_stat counter improvements - VSIE improvements - mm cleanup x86: - hypervisor part of SEV - UMIP, RDPID, and MSR_SMI_COUNT emulation - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512 features - show vcpu id in its anonymous inode name - many fixes and cleanups - per-VCPU MSR bitmaps (already merged through x86/pti branch) - stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)" * tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits) KVM: PPC: Book3S: Add MMIO emulation for VMX instructions KVM: PPC: Book3S HV: Branch inside feature section KVM: PPC: Book3S HV: Make HPT resizing work on POWER9 KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code KVM: PPC: Book3S PR: Fix broken select due to misspelling KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs() KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled KVM: PPC: Book3S HV: Drop locks before reading guest memory kvm: x86: remove efer_reload entry in kvm_vcpu_stat KVM: x86: AMD Processor Topology Information x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested kvm: embed vcpu id to dentry of vcpu anon inode kvm: Map PFN-type memory regions as writable (if possible) x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n KVM: arm/arm64: Fixup userspace irqchip static key optimization KVM: arm/arm64: Fix userspace_irqchip_in_use counting KVM: arm/arm64: Fix incorrect timer_is_pending logic MAINTAINERS: update KVM/s390 maintainers MAINTAINERS: add Halil as additional vfio-ccw maintainer MAINTAINERS: add David as a reviewer for KVM/s390 ...
2018-02-05membarrier/x86: Provide core serializing commandMathieu Desnoyers
There are two places where core serialization is needed by membarrier: 1) When returning from the membarrier IPI, 2) After scheduler updates curr to a thread with a different mm, before going back to user-space, since the curr->mm is used by membarrier to check whether it needs to send an IPI to that CPU. x86-32 uses IRET as return from interrupt, and both IRET and SYSEXIT to go back to user-space. The IRET instruction is core serializing, but not SYSEXIT. x86-64 uses IRET as return from interrupt, which takes care of the IPI. However, it can return to user-space through either SYSRETL (compat code), SYSRETQ, or IRET. Given that SYSRET{L,Q} is not core serializing, we rely instead on write_cr3() performed by switch_mm() to provide core serialization after changing the current mm, and deal with the special case of kthread