summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2016-10-19Merge branch 'gup_flag-cleanups'Linus Torvalds
Merge the gup_flags cleanups from Lorenzo Stoakes: "This patch series adjusts functions in the get_user_pages* family such that desired FOLL_* flags are passed as an argument rather than implied by flags. The purpose of this change is to make the use of FOLL_FORCE explicit so it is easier to grep for and clearer to callers that this flag is being used. The use of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the VMA whose pages we are reading from/writing to, which can result in surprising behaviour. The patch series came out of the discussion around commit 38e088546522 ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"), which addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE set but having been overridden by FOLL_FORCE. do_numa_page() was run on the assumption the page _must_ be one marked for NUMA node migration as an actual PROT_NONE page would have been dealt with prior to this code path, however FOLL_FORCE introduced a situation where this assumption did not hold. See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal" Additionally, there's a fix for an ancient bug related to FOLL_FORCE and FOLL_WRITE by me. [ This branch was rebased recently to add a few more acked-by's and reviewed-by's ] * gup_flag-cleanups: mm: replace access_process_vm() write parameter with gup_flags mm: replace access_remote_vm() write parameter with gup_flags mm: replace __access_remote_vm() write parameter with gup_flags mm: replace get_user_pages_remote() write/force parameters with gup_flags mm: replace get_user_pages() write/force parameters with gup_flags mm: replace get_vaddr_frames() write/force parameters with gup_flags mm: replace get_user_pages_locked() write/force parameters with gup_flags mm: replace get_user_pages_unlocked() write/force parameters with gup_flags mm: remove write/force parameters from __get_user_pages_unlocked() mm: remove write/force parameters from __get_user_pages_locked() mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
2016-10-19x86/cpufeature: Add AVX512_4VNNIW and AVX512_4FMAPS featuresPiotr Luc
AVX512_4VNNIW - Vector instructions for deep learning enhanced word variable precision. AVX512_4FMAPS - Vector instructions for deep learning floating-point single precision. These new instructions are to be used in future Intel Xeon & Xeon Phi processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new instructions are supported by a processor. The spec can be found in the Intel Software Developer Manual (SDM) or in the Instruction Set Extensions Programming Reference (ISE). Define new feature flags to enumerate the new instructions in /proc/cpuinfo accordingly to CPUID bits and add the required xsave extensions which are required for proper operation. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-19x86/vmware: Skip timer_irq_works() check on VMwareRenat Valiullin
The timer_irq_works() boot check may sometimes fail in a VM, when the Host is overcommitted or when the Guest is running nested. Since the intended check is unnecessary on VMware's virtual hardware, by-pass it. Signed-off-by: Renat Valiullin <rvaliullin@vmware.com> Acked-by: Alok N Kataria <akataria@vmware.com> Cc: virtualization@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20161013184539.GA11497@rvaliullin-vm Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-19mm: replace access_process_vm() write parameter with gup_flagsLorenzo Stoakes
This removes the 'write' argument from access_process_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm/numa: Remove duplicated include from mprotect.cWei Yongjun
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/1476719259-6214-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-19mm: replace access_remote_vm() write parameter with gup_flagsLorenzo Stoakes
This removes the 'write' argument from access_remote_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm: replace __access_remote_vm() write parameter with gup_flagsLorenzo Stoakes
This removes the 'write' argument from __access_remote_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm: replace get_user_pages_remote() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' from get_user_pages_remote() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm: replace get_user_pages() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' from get_user_pages() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm: replace get_vaddr_frames() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' from get_vaddr_frames() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19mm: replace get_user_pages_locked() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' use from get_user_pages_locked() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19arm64: percpu: rewrite ll/sc loops in assemblyWill Deacon
Writing the outer loop of an LL/SC sequence using do {...} while constructs potentially allows the compiler to hoist memory accesses between the STXR and the branch back to the LDXR. On CPUs that do not guarantee forward progress of LL/SC loops when faced with memory accesses to the same ERG (up to 2k) between the failed STXR and the branch back, we may end up livelocking. This patch avoids this issue in our percpu atomics by rewriting the outer loop as part of the LL/SC inline assembly block. Cc: <stable@vger.kernel.org> Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-19arm64: swp emulation: bound LL/SC retries before reschedulingWill Deacon
If a CPU does not implement a global monitor for certain memory types, then userspace can attempt a kernel DoS by issuing SWP instructions targetting the problematic memory (for example, a framebuffer mapped with non-cacheable attributes). The SWP emulation code protects against these sorts of attacks by checking for pending signals and potentially rescheduling when the STXR instruction fails during the emulation. Whilst this is good for avoiding livelock, it harms emulation of legitimate SWP instructions on CPUs where forward progress is not guaranteed if there are memory accesses to the same reservation granule (up to 2k) between the failing STXR and the retry of the LDXR. This patch solves the problem by retrying the STXR a bounded number of times (4) before breaking out of the LL/SC loop and looking for something else to do. Cc: <stable@vger.kernel.org> Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-19sched/fair: Fix incorrect task group ->load_avgVincent Guittot
A scheduler performance regression has been reported by Joseph Salisbury, which he bisected back to: 3d30544f0212 ("sched/fair: Apply more PELT fixes) The regression triggers when several levels of task groups are involved (read: SystemD) and cpu_possible_mask != cpu_present_mask. The root cause is that group entity's load (tg_child->se[i]->avg.load_avg) is initialized to scale_load_down(se->load.weight). During the creation of a child task group, its group entities on possible CPUs are attached to parent's cfs_rq (tg_parent) and their loads are added to the parent's load (tg_parent->load_avg) with update_tg_load_avg(). But only the load on online CPUs will then be updated to reflect real load, whereas load on other CPUs will stay at the initial value. The result is a tg_parent->load_avg that is higher than the real load, the weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is smaller than it should be, and the task group gets a less running time than what it could expect. ( This situation can be detected with /proc/sched_debug. The ".tg_load_avg" of the task group will be much higher than sum of ".tg_load_avg_contrib" of online cfs_rqs of the task group. ) The load of group entities don't have to be intialized to something else than 0 because their load will increase when an entity is attached. Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> # 4.8.x Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: joonwoop@codeaurora.org Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes) Link: http://lkml.kernel.org/r/1476881123-10159-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-19efi/arm: Fix absolute relocation detection for older toolchainsArd Biesheuvel
When building the ARM kernel with CONFIG_EFI=y, the following build error may occur when using a less recent version of binutils (2.23 or older): STUBCPY drivers/firmware/efi/libstub/lib-sort.stub.o 00000000 R_ARM_ABS32 sort 00000004 R_ARM_ABS32 __ksymtab_strings drivers/firmware/efi/libstub/lib-sort.stub.o: absolute symbol references not allowed in the EFI stub (and when building with debug symbols, the list above is much longer, and contains all the internal references between the .debug sections and the actual code) This issue is caused by the fact that objcopy v2.23 or earlier does not support wildcards in its -R and -j options, which means the following line from the Makefile: STUBCOPY_FLAGS-y := -R .debug* -R *ksymtab* -R *kcrctab* fails to take effect, leaving harmless absolute relocations in the binary that are indistinguishable from relocations that may cause crashes at runtime due to the fact that these relocations are resolved at link time using the virtual address of the kernel, which is always different from the address at which the EFI firmware loads and invokes the stub. So, as a workaround, disable debug symbols explicitly when building the stub for ARM, and strip the ksymtab and kcrctab symbols for the only exported symbol we currently reuse in the stub, which is 'sort'. Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1476805991-7160-2-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-19irqchip/eznps: Drop pointless static qualifier in nps400_of_init()Wei Yongjun
There is no need to have the 'struct irq_domain *nps400_root_domain' variable static since new value is always assigned before use. Fixes: 44df427c894a ("irqchip: add nps Internal and external irqchips") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1476714417-12095-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-19powerpc: Ignore the pkey system calls for nowStephen Rothwell
Eliminates warning messages: <stdin>:1316:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp] <stdin>:1319:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp] <stdin>:1322:2: warning: #warning syscall pkey_free not implemented [-Wcpp] Hopefully we will remember to revert this commit if we ever implement them. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19powerpc: Fix numa topology console printAneesh Kumar K.V
With recent update to printk, we get console output like below: [ 0.550639] Brought up 160 CPUs [ 0.550718] Node 0 CPUs: [ 0.550721] 0 [ 0.550754] -39 [ 0.550794] Node 1 CPUs: [ 0.550798] 40 [ 0.550817] -79 [ 0.550856] Node 16 CPUs: [ 0.550860] 80 [ 0.550880] -119 [ 0.550917] Node 17 CPUs: [ 0.550923] 120 [ 0.550942] -159 Fix this by properly using pr_cont(), ie. KERN_CONT. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19powerpc/mm: Drop dump_numa_memory_topology()Michael Ellerman
At boot we dump the NUMA memory topology in dump_numa_memory_topology(), at KERN_DEBUG level, resulting in output like: Node 0 Memory: 0x0-0x100000000 Node 1 Memory: 0x100000000-0x200000000 Which is nice enough, but immediately after that we iterate over each node and call setup_node_data(), which also prints out the node ranges, at KERN_INFO, giving eg: numa: Initmem setup node 0 [mem 0x00000000-0xffffffff] numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff] Additionally dump_numa_memory_topology() does not use KERN_CONT correctly, resulting in split output lines on recent kernels. So drop dump_numa_memory_topology() as superfluous chatter. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19cxl: Prevent adapter reset if an active context existsVaibhav Jain
This patch prevents resetting the cxl adapter via sysfs in presence of one or more active cxl_context on it. This protects against an unrecoverable error caused by PSL owning a dirty cache line even after reset and host tries to touch the same cache line. In case a force reset of the card is required irrespective of any active contexts, the int value -1 can be stored in the 'reset' sysfs attribute of the card. The patch introduces a new atomic_t member named contexts_num inside struct cxl that holds the number of active context attached to the card , which is checked against '0' before proceeding with the reset. To prevent against a race condition where a context is activated just after reset check is performed, the contexts_num is atomically set to '-1' after reset-check to indicate that no more contexts can be activated on the card anymore. Before activating a context we atomically test if contexts_num is non-negative and if so, increment its value by one. In case the value of contexts_num is negative then it indicates that the card is about to be reset and context activation is error-ed out at that point. Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card") Cc: stable@vger.kernel.org # v4.0+ Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19powerpc/boot: Fix boot on systems with uncompressed kernel imageHeiner Kallweit
This commit broke boot on systems with an uncompressed kernel image, namely systems using a cuImage. On such systems the compressed boot image (boot wrapper, uncompressed kernel image, ..) is decompressed by u-boot already, therefore the boot wrapper code sees an uncompressed kernel image. The old decompression code silently assumed an uncompressed kernel image if it found no valid gzip signature, whilst the new code bailed out in this case. Fix this by re-introducing such a fallback if no valid compressed image is found. Fixes: 1b7898ee276b ("Use the pre-boot decompression API") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19powerpc/mm: Prevent unlikely crash in copro_calculate_slb()Frederic Barrat
If a cxl adapter faults on an invalid address for a kernel context, we may enter copro_calculate_slb() with a NULL mm pointer (kernel context) and an effective address which looks like a user address. Which will cause a crash when dereferencing mm. It is clearly an AFU bug, but there's no reason to crash either. So return an error, so that cxl can ack the interrupt with an address error. Fixes: 73d16a6e0e51 ("powerpc/cell: Move data segment faulting code out of cell platform") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19KVM: MIPS: Add missing uaccess.h includeJames Hogan
MIPS KVM uses user memory accessors but mips.c doesn't directly include uaccess.h, so include it now. This wasn't too much of a problem before v4.9-rc1 as asm/module.h included asm/uaccess.h, however since commit 29abfbd9cbba ("mips: separate extable.h, switch module.h to it") this is no longer the case. This resulted in build failures when trace points were disabled, as trace/define_trace.h includes trace/trace_events.h only ifdef TRACEPOINTS_ENABLED, which goes on to include asm/uaccess.h via a couple of other headers. Fixes: 29abfbd9cbba ("mips: separate extable.h, switch module.h to it") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2016-10-18sh: add earlycon support to j2_defconfigRich Felker
Signed-off-by: Rich Felker <dalias@libc.org>
2016-10-18sh: add Kconfig option for J-Core SoC core driversRich Felker
Signed-off-by: Rich Felker <dalias@libc.org>
2016-10-18Merge tag 'for-f2fs-4.9-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs bugfix from Jaegeuk Kim: "This fixes a bug which referenced the wrong pointer, sum_page, in f2fs_gc. It was newly introduced in 4.9-rc1. * tag 'for-f2fs-4.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: fix wrong sum_page pointer in f2fs_gc
2016-10-18mm: replace get_user_pages_unlocked() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' use from get_user_pages_unlocked() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18mm: remove write/force parameters from __get_user_pages_unlocked()Lorenzo Stoakes
This removes the redundant 'write' and 'force' parameters from __get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18mm: remove write/force parameters from __get_user_pages_locked()Lorenzo Stoakes
This removes the redundant 'write' and 'force' parameters from __get_user_pages_locked() to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18mm: remove gup_flags FOLL_WRITE games from __get_user_pages()Linus Torvalds
This is an ancient bug that was actually attempted to be fixed once (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix get_user_pages() race for write access") but that was then undone due to problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug"). In the meantime, the s390 situation has long been fixed, and we can now fix it by checking the pte_dirty() bit properly (and do it better). The s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement software dirty bits") which made it into v3.9. Earlier kernels will have to look at the page state itself. Also, the VM has become more scalable, and what used a purely theoretical race back then has become easier to trigger. To fix it, we introduce a new internal FOLL_COW flag to mark the "yes, we already did a COW" rather than play racy games with FOLL_WRITE that is very fundamental, and then use the pte dirty flag to validate that the FOLL_COW flag is still valid. Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Nick Piggin <npiggin@gmail.com> Cc: Greg Thelen <gthelen@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes, plus hw-enablement changes: - fix persistent RAM handling - remove pkeys warning - remove duplicate macro - fix debug warning in irq handler - add new 'Knights Mill' CPU related constants and enable the perf bits" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add Knights Mill CPUID perf/x86/intel/rapl: Add Knights Mill CPUID perf/x86/intel: Add Knights Mill CPUID x86/cpu/intel: Add Knights Mill to Intel family x86/e820: Don't merge consecutive E820_PRAM ranges pkeys: Remove easily triggered WARN x86: Remove duplicate rtit status MSR macro x86/smp: Add irq_enter/exit() in smp_reschedule_interrupt()
2016-10-18Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixlet from Ingo Molnar: "Remove an unused variable" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: alarmtimer: Remove unused but set variable
2016-10-18Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a crash that can trigger when racing with CPU hotplug: we didn't use sched-domains data structures carefully enough in select_idle_cpu()" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix sched domains NULL dereference in select_idle_sibling()
2016-10-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Four tooling fixes, two kprobes KASAN related fixes and an x86 PMU driver fix/cleanup" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf jit: Fix build issue on Ubuntu perf jevents: Handle events including .c and .o perf/x86/intel: Remove an inconsistent NULL check kprobes: Unpoison stack in jprobe_return() for KASAN kprobes: Avoid false KASAN reports during stack copy perf header: Set nr_numa_nodes only when we parsed all the data perf top: Fix refreshing hierarchy entries on TUI
2016-10-18Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two fixes: - a file locks fix (missing critical section, bug introduced in this merge window) - an x86 down_write() stack frame annotation" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking, fs/locks: Add missing file_sem locks locking/rwsem/x86: Add stack frame dependency for ____down_write()
2016-10-18Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Three irqchip driver fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gicv3: Handle loop timeout proper irqchip/jcore: Fix lost per-cpu interrupts irqchip/eznps: Acknowledge NPS_IPI before calling the handler
2016-10-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc fixes from Ingo Molnar: "A CPU hotplug debuggability fix and three objtool false positive warnings fixes for new GCC6 code generation patterns" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Use distinct name for cpu_hotplug.dep_map objtool: Skip all "unreachable instruction" warnings for gcov kernels objtool: Improve rare switch jump table pattern detection objtool: Support '-mtune=atom' stack frame setup instruction
2016-10-18Merge tag 'firewire-update' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixlet from Stefan Richter: "IEEE 1394 subsystem patch: catch an initialization error in the packet sniffer nosy" * tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: nosy: do not ignore errors in ioremap_nocache()
2016-10-18MAINTAINERS: Add myself as EFI maintainerArd Biesheuvel
At the request of Matt, I am taking up co-maintainership of the EFI subsystem. So add my name to the EFI section in MAINTAINERS, and change the SCM tree reference to point to the new shared Git repo. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161018143318.15673-2-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-18Merge tag 'drm-fixes-for-v4.9-rc2' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Just had a couple of amdgpu fixes and one core fix I wanted to get out early to fix some regressions. I'm sure I'll have more stuff this week for -rc2" * tag 'drm-fixes-for-v4.9-rc2' of git://people.freedesktop.org/~airlied/linux: (22 commits) drm: Print device information again in debugfs drm/amd/powerplay: fix bug stop dpm can't work on Vi. drm/amd/powerplay: notify smu no display by default. drm/amdgpu/dpm: implement thermal sensor for CZ/ST drm/amdgpu/powerplay: implement thermal sensor for CZ/ST drm/amdgpu: disable smu hw first on tear down drm/amdgpu: fix amdgpu_need_full_reset (v2) drm/amdgpu/si_dpm: Limit clocks on HD86xx part drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c drm/amdgpu: potential NULL dereference in debugfs code drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c drm/amd/powerplay: fix static checker warnings in iceland_smc.c drm/radeon: change vblank_time's calculation method to reduce computational error. drm/amdgpu: change vblank_time's calculation method to reduce computational error. drm/amdgpu: clarify UVD/VCE special handling for CG drm/amd/amdgpu: enable clockgating only after late init drm/radeon: allow TA_CS_BC_BASE_ADDR on SI drm/amdgpu: initialize the context reset_counter in amdgpu_ctx_init drm/amdgpu/gfx8: fix CGCG_CGLS handling drm/radeon: fix modeset tear down code ...
2016-10-18pinctrl: intel: Only restore pins that are used by the driverMika Westerberg
Dell XPS 13 (and maybe some others) uses a GPIO (CPU_GP_1) during suspend to explicitly disable USB touchscreen interrupt. This is done to prevent situation where the lid is closed the touchscreen is left functional. The pinctrl driver (wrongly) assumes it owns all pins which are owned by host and not locked down. It is perfectly fine for BIOS to use those pins as it is also considered as host in this context. What happens is that when the lid of Dell XPS 13 is closed, the BIOS configures CPU_GP_1 low disabling the touchscreen interrupt. During resume we restore all host owned pins to the known state which includes CPU_GP_1 and this overwrites what the BIOS has programmed there causing the touchscreen to fail as no interrupts are reaching the CPU anymore. Fix this by restoring only those pins we know are explicitly requested by the kernel one way or other. Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=176361 Reported-by: AceLan Kao <acelan.kao@canonical.com> Tested-by: AceLan Kao <acelan.kao@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18pinctrl: baytrail: Fix lockdepVille Syrjälä
Initialize the spinlock before using it. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.0-dwc-bisect #4 Hardware name: Intel Corp. VALLEYVIEW C0 PLATFORM/BYT-T FFD8, BIOS BLAKFF81.X64.0088.R10.1403240443 FFD8_X64_R_2014_13_1_00 03/24/2014 0000000000000000 ffff8800788ff770 ffffffff8133d597 0000000000000000 0000000000000000 ffff8800788ff7e0 ffffffff810cfb9e 0000000000000002 ffff8800788ff7d0 ffffffff8205b600 0000000000000002 ffff8800788ff7f0 Call Trace: [<ffffffff8133d597>] dump_stack+0x67/0x90 [<ffffffff810cfb9e>] register_lock_class+0x52e/0x540 [<ffffffff810d2081>] __lock_acquire+0x81/0x16b0 [<ffffffff810cede1>] ? save_trace+0x41/0xd0 [<ffffffff810d33b2>] ? __lock_acquire+0x13b2/0x16b0 [<ffffffff810cf05a>] ? __lock_is_held+0x4a/0x70 [<ffffffff810d3b1a>] lock_acquire+0xba/0x220 [<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80 [<ffffffff81631567>] _raw_spin_lock_irqsave+0x47/0x60 [<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80 [<ffffffff8136f1fe>] byt_gpio_get_direction+0x3e/0x80 [<ffffffff813740a9>] gpiochip_add_data+0x319/0x7d0 [<ffffffff81631723>] ? _raw_spin_unlock_irqrestore+0x43/0x70 [<ffffffff8136fe3b>] byt_pinctrl_probe+0x2fb/0x620 [<ffffffff8142fb0c>] platform_drv_probe+0x3c/0xa0 ... Based on the diff it looks like the problem was introduced in commit 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling") but I wasn't able to verify that empirically as the parent commit just oopsed when I tried to boot it. Cc: Cristina Ciocan <cristina.ciocan@intel.com> Cc: stable@vger.kernel.org Fixes: 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18pinctrl: aspeed-g5: Fix pin association of SPI1 functionAndrew Jeffery
The SPI1 function was associated with the wrong pins: The functions that those pins provide is either an SPI debug or passthrough function coupled to SPI1. Make the SPI1 mux function configure the relevant pins and associate new SPI1DEBUG and SPI1PASSTHRU functions with the pins that were already defined. The notation used in the datasheet's multi-function pin table for the SoC is often creative: in this case the SYS* signals are enabled by a single bit, which is nothing unusual on its own, but in this case the bit was also participating in a multi-bit bitfield and therefore represented multiple functions. This fact was overlooked in the original patch. Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver) Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18pinctrl: aspeed-g5: Fix GPIOE1 typoAndrew Jeffery
This prevented C20 from successfully being muxed as GPIO. Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver) Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18pinctrl: aspeed-g5: Fix names of GPID2 pinsAndrew Jeffery
Fixes simple typos in the initial commit. There is no behavioural change. Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver) Reported-by: Xo Wang <xow@google.com> Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18pinctrl: aspeed: "Not enabled" is a significant mux stateAndrew Jeffery
Consider a scenario with one pin P that has two signals A and B, where A is defined to be higher priority than B: That is, if the mux IP is in a state that would consider both A and B to be active on P, then A will be the active signal. To instead configure B as the active signal we must configure the mux so that A is inactive. The mux state for signals can be described by logical operations on one or more bits from one or more registers (a "signal expression"), which in some cases leads to aliased mux states for a particular signal. Further, signals described by multi-bit bitfields often do not only need to record the states that would make them active (the "enable" expressions), but also the states that makes them inactive (the "disable" expressions). All of this combined leads to four possible states for a signal: 1. A signal is active with respect to an "enable" expression 2. A signal is not active with respect to an "enable" expression 3. A signal is inactive with respect to a "disable" expression 4. A signal is not inactive with respect to a "disable" expression In the case of P, if we are looking to activate B without explicitly having configured A it's enough to consider A inactive if all of A's "enable" signal expressions evaluate to "not active". If any evaluate to "active" then the corresponding "disable" states must be applied so it becomes inactive. For example, on the AST2400 the pins composing GPIO bank H provide signals ROMD8 through ROMD15 (high priority) and those for UART6 (low priority). The mux states for ROMD8 through ROMD15 are aliased, i.e. there are two mux states that result in the respective signals being configured: A. SCU90[6]=1 B. Strap[4,1:0]=100 Further, the second mux state is a 3-bit bitfield that explicitly defines the enabled state but the disabled state is implicit, i.e. if Strap[4,1:0] is not exactly "100" then ROMD8 through ROMD15 are not considered active. This requires the mux function evaluation logic to use approach 2. above, however the existing code was using approach 3. The problem was brought to light on the Palmetto machines where the strap register value is 0x120ce416, and prevented GPIO requests in bank H from succeeding despite the hardware being in a position to allow them. Fixes: 318398c09a8d ("pinctrl: Add core pinctrl support for Aspeed SoCs") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-10-18ceph: fix non static symbol warningWei Yongjun
Fixes the following sparse warning: fs/ceph/xattr.c:19:28: warning: symbol 'ceph_other_xattr_handler' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-10-18locking, fs/locks: Add missing file_sem locksPeter Zijlstra
I overlooked a few code-paths that can lead to locks_delete_global_locks(). Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jeff Layton <jlayton@poochiereds.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Bruce Fields <bfields@fieldses.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-fsdevel@vger.kernel.org Cc: syzkaller <syzkaller@googlegroups.com> Link: http://lkml.kernel.org/r/20161008081228.GF3142@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-18locking/rwsem/x86: Add stack frame dependency for ____down_write()Josh Poimboeuf
Arnd reported the following objtool warning: kernel/locking/rwsem.o: warning: objtool: down_write_killable()+0x16: call without frame pointer save/setup The warning means gcc placed the ____down_write() inline asm (and its call instruction) before the frame pointer setup in down_write_killable(), which breaks frame pointer convention and can result in incorrect stack traces. Force the stack frame to be created before the call instruction by listing the stack pointer as an output operand in the inline asm statement. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1188b7015f04baf361e59de499ee2d7272c59dce.1476393828.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-18ceph: fix uninitialized dentry pointer in ceph_real_mount()Geert Uytterhoeven
fs/ceph/super.c: In function ‘ceph_real_mount’: fs/ceph/super.c:818: warning: ‘root’ may be used uninitialized in this function If s_root is already valid, dentry pointer root is never initialized, and returned by ceph_real_mount(). This will cause a crash later when the caller dereferences the pointer. Fixes: ce2728aaa82bbeba ("ceph: avoid accessing / when mounting a subpath") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Yan, Zheng <zyan@redhat.com>