summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2019-02-08futex: Handle early deadlock return correctlyThomas Gleixner
commit 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex") changed the locking rules in the futex code so that the hash bucket lock is not longer held while the waiter is enqueued into the rtmutex wait list. This made the lock and the unlock path symmetric, but unfortunately the possible early exit from __rt_mutex_proxy_start() due to a detected deadlock was not updated accordingly. That allows a concurrent unlocker to observe inconsitent state which triggers the warning in the unlock path. futex_lock_pi() futex_unlock_pi() lock(hb->lock) queue(hb_waiter) lock(hb->lock) lock(rtmutex->wait_lock) unlock(hb->lock) // acquired hb->lock hb_waiter = futex_top_waiter() lock(rtmutex->wait_lock) __rt_mutex_proxy_start() ---> fail remove(rtmutex_waiter); ---> returns -EDEADLOCK unlock(rtmutex->wait_lock) // acquired wait_lock wake_futex_pi() rt_mutex_next_owner() --> returns NULL --> WARN lock(hb->lock) unqueue(hb_waiter) The problem is caused by the remove(rtmutex_waiter) in the failure case of __rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the hash bucket but no waiter on the rtmutex, i.e. inconsistent state. The original commit handles this correctly for the other early return cases (timeout, signal) by delaying the removal of the rtmutex waiter until the returning task reacquired the hash bucket lock. Treat the failure case of __rt_mutex_proxy_start() in the same way and let the existing cleanup code handle the eventual handover of the rtmutex gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter removal for the failure case, so that the other callsites are still operating correctly. Add proper comments to the code so all these details are fully documented. Thanks to Peter for helping with the analysis and writing the really valuable code comments. Fixes: 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex") Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Stefan Liebler <stli@linux.ibm.com> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
2019-02-08futex: Fix barrier commentDavidlohr Bueso
The current comment for the barrier that guarantees that waiter increment is always before taking the hb spinlock (barrier (A)) needs to be fixed as it is misplaced. This is obviously referring to hb_waiters_inc, which is a full barrier. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190206185602.949-1-dave@stgolabs.net
2019-02-07Merge tag 'platform-drivers-x86-v5.0-2' of ↵Linus Torvalds
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fixlet from Darren Hart: "Correct Documentation/ABI 4.21 KernelVersion to 5.0" * tag 'platform-drivers-x86-v5.0-2' of git://git.infradead.org/linux-platform-drivers-x86: Documentation/ABI: Correct mlxreg-io KernelVersion for 5.0
2019-02-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Three security fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221) KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222) kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
2019-02-07Merge tag 'nfsd-5.0-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "Two small nfsd bugfixes for 5.0, for an RDMA bug and a file clone bug" * tag 'nfsd-5.0-1' of git://linux-nfs.org/~bfields/linux: svcrdma: Remove max_sge check at connect time nfsd: Fix error return values for nfsd4_clone_file_range()
2019-02-07Merge tag 'for-5.0/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Both of these fixes address issues in changes merged for 5.0-rc4: - Fix DM core's missing memory barrier before waitqueue_active() calls. - Fix DM core's clone_bio() to work when cloning a subset of a bio with an integrity payload; bio_integrity_trim() wasn't getting called due to bio_trim()'s early return" * tag 'for-5.0/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: don't use bio_trim() afterall dm: add memory barrier before waitqueue_active
2019-02-07KVM: nVMX: unconditionally cancel preemption timer in free_nested ↵Peter Shier
(CVE-2019-7221) Bugzilla: 1671904 There are multiple code paths where an hrtimer may have been started to emulate an L1 VMX preemption timer that can result in a call to free_nested without an intervening L2 exit where the hrtimer is normally cancelled. Unconditionally cancel in free_nested to cover all cases. Embargoed until Feb 7th 2019. Signed-off-by: Peter Shier <pshier@google.com> Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Message-Id: <20181011184646.154065-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)Paolo Bonzini
Bugzilla: 1671930 Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with memory operand, INVEPT, INVVPID) can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Embargoed until Feb 7th 2019. Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)Jann Horn
kvm_ioctl_create_device() does the following: 1. creates a device that holds a reference to the VM object (with a borrowed reference, the VM's refcount has not been bumped yet) 2. initializes the device 3. transfers the reference to the device to the caller's file descriptor table 4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real reference The ownership transfer in step 3 must not happen before the reference to the VM becomes a proper, non-borrowed reference, which only happens in step 4. After step 3, an attacker can close the file descriptor and drop the borrowed reference, which can cause the refcount of the kvm object to drop to zero. This means that we need to grab a reference for the device before anon_inode_getfd(), otherwise the VM can disappear from under us. Fixes: 852b6d57dc7f ("kvm: add device control API") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fix from Jiri Kosina: "A fix for a bug in hid-debug that can lock up the kernel in infinite loop (CVE-2019-3819), from Vladis Dronov" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: debug: fix the ring buffer implementation
2019-02-07Merge tag 'sound-5.0-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of a few small fixes. The most significant one is the fix for the possible race at loading HD-audio drivers. This has been present for long time and surfaced only in a rare occasion, but finally spotted out" * tag 'sound-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/ca0132 - Fix build error without CONFIG_PCI ALSA: compress: Fix stop handling on compressed capture streams ALSA: usb-audio: Add support for new T+A USB DAC ALSA: hda - Serialize codec registrations ALSA: hda/realtek - Use a common helper for hp pin reference ALSA: hda/realtek - Fix lose hp_pins for disable auto mute ALSA: hda/realtek - Headset microphone support for System76 darp5
2019-02-07Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "A small fix for a uapi header, and a fix for VDPA for non-x86 guests" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: drop internal struct from UAPI virtio: support VIRTIO_F_ORDER_PLATFORM
2019-02-07Merge tag 'trace-v5.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This has two fixes for uprobe code. - Cut and paste fix to have uprobe printks say "uprobe" and not "kprobe" - Add terminating '\0' byte when copying function arguments" * tag 'trace-v5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/uprobes: Fix output for multiple string arguments tracing: uprobes: Fix typo in pr_fmt string
2019-02-07Merge tag 'fuse-fixes-5.0-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "A fix for a CUSE regression introduced in v4.20, as well as fixes for a couple of old bugs" * tag 'fuse-fixes-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: decrement NR_WRITEBACK_TEMP on the right page fuse: call pipe_buf_release() under pipe lock cuse: fix ioctl fuse: handle zero sized retrieve correctly
2019-02-07Merge tag 'pinctrl-v5.0-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Mediatek Kconfig fix - Sunxi regulator, IRQ banks and pin base fixup - Intel Cherryview Strago DMI workaround - Potential regmap problem on mcp23s08 * tag 'pinctrl-v5.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sunxi: Correct number of IRQ banks on H6 main pin controller pinctrl: mcp23s08: spi: Fix regmap allocation for mcp23s18 pinctrl: cherryview: fix Strago DMI workaround pinctrl: sunxi: Consider pin_base when calculating regulator array index pinctrl: sunxi: Fix and simplify pin bank regulator handling pinctrl: mediatek: fix Kconfig build errors for moore core
2019-02-06dm: don't use bio_trim() afterallMike Snitzer
bio_trim() has an early return, which makes it _not_ idempotent, if the offset is 0 and the bio's bi_size already matches the requested size. Prior to DM, all users of bio_trim() were fine with this. But DM has exposed the fact that bio_trim()'s early return is incompatible with a cloned bio whose integrity payload must be trimmed via bio_integrity_trim(). Fix this by reverting DM back to doing the equivalent of bio_trim() but in an idempotent manner (so bio_integrity_trim is always performed). Follow-on work is needed to assess what benefit bio_trim()'s early return is providing to its existing callers. Reported-by: Milan Broz <gmazyland@gmail.com> Fixes: 57c36519e4b94 ("dm: fix clone_bio() to trigger blk_recount_segments()") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-02-06dm: add memory barrier before waitqueue_activeMikulas Patocka
Block core changes to switch bio-based IO accounting to be percpu had a side-effect of altering DM core to now rely on calling waitqueue_active (in both bio-based and request-based) to check if another task is in dm_wait_for_completion(). A memory barrier is needed before calling waitqueue_active(). DM core doesn't piggyback on a preceding memory barrier so it must explicitly use its own. For more details on why using waitqueue_active() without a preceding barrier is unsafe, please see the comment before the waitqueue_active() definition in include/linux/wait.h. Add the missing memory barrier by switching to using wq_has_sleeper(). Fixes: 6f75723190d8 ("dm: remove the pending IO accounting") Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-02-06svcrdma: Remove max_sge check at connect timeChuck Lever
Two and a half years ago, the client was changed to use gathered Send for larger inline messages, in commit 655fec6987b ("xprtrdma: Use gathered Send for large inline messages"). Several fixes were required because there are a few in-kernel device drivers whose max_sge is 3, and these were broken by the change. Apparently my memory is going, because some time later, I submitted commit 25fd86eca11c ("svcrdma: Don't overrun the SGE array in svc_rdma_send_ctxt"), and after that, commit f3c1fd0ee294 ("svcrdma: Reduce max_send_sges"). These too incorrectly assumed in-kernel device drivers would have more than a few Send SGEs available. The fix for the server side is not the same. This is because the fundamental problem on the server is that, whether or not the client has provisioned a chunk for the RPC reply, the server must squeeze even the most complex RPC replies into a single RDMA Send. Failing in the send path because of Send SGE exhaustion should never be an option. Therefore, instead of failing when the send path runs out of SGEs, switch to using a bounce buffer mechanism to handle RPC replies that are too complex for the device to send directly. That allows us to remove the max_sge check to enable drivers with small max_sge to work again. Reported-by: Don Dutile <ddutile@redhat.com> Fixes: 25fd86eca11c ("svcrdma: Don't overrun the SGE array in ...") Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-02-06nfsd: Fix error return values for nfsd4_clone_file_range()Trond Myklebust
If the parameter 'count' is non-zero, nfsd4_clone_file_range() will currently clobber all errors returned by vfs_clone_file_range() and replace them with EINVAL. Fixes: 42ec3d4c0218 ("vfs: make remap_file_range functions take and...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-02-06ALSA: hda/ca0132 - Fix build error without CONFIG_PCITakashi Iwai
A call of pci_iounmap() call without CONFIG_PCI leads to a build error on some architectures. We tried to address this and add a check of IS_ENABLED(CONFIG_PCI), but this still doesn't seem enough for sh. Ideally we should fix it globally, it's really a corner case, so let's paper over it with a simpler ifdef. Fixes: 1e73359a24fa ("ALSA: hda/ca0132 - make pci_iounmap() call conditional") Reported-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-05ALSA: compress: Fix stop handling on compressed capture streamsCharles Keepax
It is normal user behaviour to start, stop, then start a stream again without closing it. Currently this works for compressed playback streams but not capture ones. The states on a compressed capture stream go directly from OPEN to PREPARED, unlike a playback stream which moves to SETUP and waits for a write of data before moving to PREPARED. Currently however, when a stop is sent the state is set to SETUP for both types of streams. This leaves a capture stream in the situation where a new start can't be sent as that requires the state to be PREPARED and a new set_params can't be sent as that requires the state to be OPEN. The only option being to close the stream, and then reopen. Correct this issues by allowing snd_compr_drain_notify to set the state depending on the stream direction, as we already do in set_params. Fixes: 49bb6402f1aa ("ALSA: compress_core: Add support for capture streams") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-05virtio: drop internal struct from UAPIMichael S. Tsirkin
There's no reason to expose struct vring_packed in UAPI - if we do we won't be able to change or drop it, and it's not part of any interface. Let's move it to virtio_ring.c Cc: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-05ALSA: usb-audio: Add support for new T+A USB DACUdo Eberhardt
This patch adds the T+A VID to the generic check in order to enable native DSD support for T+A devices. This works with the new T+A USB DAC model SD3100HV and will also work with future devices which support the XMOS/Thesycon style DSD format. Signed-off-by: Udo Eberhardt <udo.eberhardt@thesycon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-03Linux 5.0-rc5Linus Torvalds
2019-02-03Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A few updates for x86: - Fix an unintended sign extension issue in the fault handling code - Rename the new resource control config switch so it's less confusing - Avoid setting up EFI info in kexec when the EFI runtime is disabled. - Fix the microcode version check in the AMD microcode loader so it only loads higher version numbers and never downgrades - Set EFER.LME in the 32bit trampoline before returning to long mode to handle older AMD/KVM behaviour properly. - Add Darren and Andy as x86/platform reviewers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Avoid confusion over the new X86_RESCTRL config x86/kexec: Don't setup EFI info if EFI runtime is not enabled x86/microcode/amd: Don't falsely trick the late loading mechanism MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers x86/fault: Fix sign-extend unintended sign extension x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug fixes from Thomas Gleixner: "Two fixes for the cpu hotplug machinery: - Replace the overly clever 'SMT disabled by BIOS' detection logic as it breaks KVM scenarios and prevents speculation control updates when the Hyperthreads are brought online late after boot. - Remove a redundant invocation of the speculation control update function" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A pile of perf updates: - Fix broken sanity check in the /proc/sys/kernel/perf_cpu_time_max_percent write handler - Cure a perf script crash which caused by an unitinialized data structure - Highlight the hottest instruction in perf top and not a random one - Cure yet another clang issue when building perf python - Handle topology entries with no CPU correctly in the tools - Handle perf data which contains both tracepoints and performance counter entries correctly. - Add a missing NULL pointer check in perf ordered_events_free()" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf script: Fix crash when processing recorded stat data perf top: Fix wrong hottest instruction highlighted perf tools: Handle TOPOLOGY headers with no CPU perf python: Remove -fstack-clash-protection when building with some clang versions perf core: Fix perf_proc_update_handler() bug perf script: Fix crash with printing mixed trace point and other events perf ordered_events: Fix crash in ordered_events__free
2019-02-03Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Thomas Gleixner: "The dump info for the efi page table debugging lacks a terminator which causes the kernel to crash when the debugfile is read" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/arm64: Fix debugfs crash by adding a terminator for ptdump marker
2019-02-03Merge tag 'for-5.0-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression fix: transaction commit can run away due to delayed ref waiting heuristic, this is not necessary now because of the proper reservation mechanism introduced in 5.0 - regression fix: potential crash due to use-before-check of an ERR_PTR return value - fix for transaction abort during transaction commit that needs to properly clean up pending block groups - fix deadlock during b-tree node/leaf splitting, when this happens on some of the fundamental trees, we must prevent new tree block allocation to re-enter indirectly via the block group flushing path - potential memory leak after errors during mount * tag 'for-5.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: On error always free subvol_name in btrfs_mount btrfs: clean up pending block groups when transaction commit aborts btrfs: fix potential oops in device_list_add btrfs: don't end the transaction for delayed refs in throttle Btrfs: fix deadlock when allocating tree block during leaf/node split
2019-02-02Merge tag 'devicetree-fixes-for-5.0-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull Devicetree fix from Rob Herring: "A single fix for building DT bindings in-tree" * tag 'devicetree-fixes-for-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: Fix dt_binding_check target for in tree builds
2019-02-02Merge tag 'riscv-for-linus-5.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V fixes from Palmer Dabbelt: "This contains a handful of mostly-independent patches: - make our port respect TIF_NEED_RESCHED, which fixes CONFIG_PREEMPT=y kernels - fix double-put of OF nodes - fix a misspelling of target in our Kconfig - generic PCIe is enabled in our defconfig - fix our SBI early console to properly handle line endings - fix max_low_pfn being counted in PFNs - a change to TASK_UNMAPPED_BASE to match what other arches do This has passed my standard 'boot Fedora' flow" * tag 'riscv-for-linus-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: riscv: Adjust mmap base address at a third of task size riscv: fixup max_low_pfn with PFN_DOWN. tty/serial: use uart_console_write in the RISC-V SBL early console RISC-V: defconfig: Add CRYPTO_DEV_VIRTIO=y RISC-V: defconfig: Enable Generic PCIE by default RISC-V: defconfig: Move CONFIG_PCI{,E_XILINX} RISC-V: Kconfig: fix spelling mistake "traget" -> "target" RISC-V: asm/page.h: fix spelling mistake "CONFIG_64BITS" -> "CONFIG_64BIT" RISC-V: fix bad use of of_node_put RISC-V: Add _TIF_NEED_RESCHED check for kernel thread when CONFIG_PREEMPT=y
2019-02-02Merge tag 'for-linus-20190202' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A few fixes that should go into this release. This contains: - MD pull request from Song, fixing a recovery OOM issue (Alexei) - Fix for a sync related stall (Jianchao) - Dummy callback for timeouts (Tetsuo) - IDE atapi sense ordering fix (me)" * tag 'for-linus-20190202' of git://git.kernel.dk/linux-block: ide: ensure atapi sense request aren't preempted blk-mq: fix a hung issue when fsync block: pass no-op callback to INIT_WORK(). md/raid5: fix 'out of memory' during raid cache recovery
2019-02-02Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five minor bug fixes. The libfc one is a tiny memory leak, the zfcp one is an incorrect user visible parameter and the rest are on error legs or obscure features" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: 53c700: pass correct "dev" to dma_alloc_attrs() scsi: bnx2fc: Fix error handling in probe() scsi: scsi_debug: fix write_same with virtual_gb problem scsi: libfc: free skb when receiving invalid flogi resp scsi: zfcp: fix sysfs block queue limit output for max_segment_size
2019-02-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "24 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits) autofs: fix error return in autofs_fill_super() autofs: drop dentry reference only when it is never used fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() mm: migrate: don't rely on __PageMovable() of newpage after unlocking it psi: clarify the Kconfig text for the default-disable option mm, memory_hotplug: __offline_pages fix wrong locking mm: hwpoison: use do_send_sig_info() instead of force_sig() kasan: mark file common so ftrace doesn't trace it init/Kconfig: fix grammar by moving a closing parenthesis lib/test_kmod.c: potential double free in error handling mm, oom: fix use-after-free in oom_kill_process mm/hotplug: invalid PFNs from pfn_to_online_page() mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages psi: fix aggregation idle shut-off mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone oom, oom_reaper: do not enqueue same task twice mm: migrate: make buffer_migrate_page_norefs() actually succeed kernel/exit.c: release ptraced tasks before zap_pid_ns_processes x86_64: increase stack size for KASAN_EXTRA ...
2019-02-02efi/arm64: Fix debugfs crash by adding a terminator for ptdump markerQian Cai
When reading 'efi_page_tables' debugfs triggers an out-of-bounds access here: arch/arm64/mm/dump.c: 282 if (addr >= st->marker[1].start_address) { called from: arch/arm64/mm/dump.c: 331 note_page(st, addr, 2, pud_val(pud)); because st->marker++ is is called after "UEFI runtime end" which is the last element in addr_marker[]. Therefore, add a terminator like the one for kernel_page_tables, so it can be skipped to print out non-existent markers. Here's the KASAN bug report: # cat /sys/kernel/debug/efi_page_tables ---[ UEFI runtime start ]--- 0x0000000020000000-0x0000000020010000 64K PTE RW NX SHD AF ... 0x0000000020200000-0x0000000021340000 17664K PTE RW NX SHD AF ... ... 0x0000000021920000-0x0000000021950000 192K PTE RW x SHD AF ... 0x0000000021950000-0x00000000219a0000 320K PTE RW NX SHD AF ... ---[ UEFI runtime end ]--- ---[ (null) ]--- ---[ (null) ]--- BUG: KASAN: global-out-of-bounds in note_page+0x1f0/0xac0 Read of size 8 at addr ffff2000123f2ac0 by task read_all/42464 Call trace: dump_backtrace+0x0/0x298 show_stack+0x24/0x30 dump_stack+0xb0/0xdc print_address_description+0x64/0x2b0 kasan_report+0x150/0x1a4 __asan_report_load8_noabort+0x30/0x3c note_page+0x1f0/0xac0 walk_pgd+0xb4/0x244 ptdump_walk_pgd+0xec/0x140 ptdump_show+0x40/0x50 seq_read+0x3f8/0xad0 full_proxy_read+0x9c/0xc0 __vfs_read+0xfc/0x4c8 vfs_read+0xec/0x208 ksys_read+0xd0/0x15c __arm64_sys_read+0x84/0x94 el0_svc_handler+0x258/0x304 el0_svc+0x8/0xc The buggy address belongs to the variable: __compound_literal.0+0x20/0x800 Memory state around the buggy address: ffff2000123f2980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff2000123f2a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa >ffff2000123f2a80: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 00 ^ ffff2000123f2b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff2000123f2b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0 [ ardb: fix up whitespace ] [ mingo: fix up some moar ] Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: 9d80448ac92b ("efi/arm64: Add debugfs node to dump UEFI runtime page tables") Link: http://lkml.kernel.org/r/20190202095017.13799-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-02x86/resctrl: Avoid confusion over the new X86_RESCTRL configJohannes Weiner
"Resource Control" is a very broad term for this CPU feature, and a term that is also associated with containers, cgroups etc. This can easily cause confusion. Make the user prompt more specific. Match the config symbol name. [ bp: In the future, the corresponding ARM arch-specific code will be under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out under the CPU_RESCTRL umbrella symbol. ] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Babu Moger <Babu.Moger@amd.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: linux-doc@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
2019-02-01Merge tag 'xtensa-20190201' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds
Pull xtensa fixes from Max Filippov: - fix ccount_timer_shutdown for secondary CPUs - fix secondary CPU initialization - fix secondary CPU reset vector clash with double exception vector - fix present CPUs when booting with 'maxcpus' parameter - limit possible CPUs by configured NR_CPUS - issue a warning if xtensa PIC is asked to retrigger anything other than software IRQ - fix masking/unmasking of the first two IRQs on xtensa MX PIC - fix typo in Kconfig description for user space unaligned access feature - fix Kconfig warning for selecting BUILTIN_DTB * tag 'xtensa-20190201' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: SMP: limit number of possible CPUs by NR_CPUS xtensa: rename BUILTIN_DTB to BUILTIN_DTB_SOURCE xtensa: Fix typo use space=>user space drivers/irqchip: xtensa-mx: fix mask and unmask drivers/irqchip: xtensa: add warning to irq_retrigger xtensa: SMP: mark each possible CPU as present xtensa: smp_lx200_defconfig: fix vectors clash xtensa: SMP: fix secondary CPU initialization xtensa: SMP: fix ccount_timer_shutdown
2019-02-01Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Although we're still debugging a few minor arm64-specific issues in mainline, I didn't want to hold this lot up in the meantime. We've got an additional KASLR fix after the previous one wasn't quite complete, a fix for a performance regression when mapping executable pages into userspace and some fixes for kprobe blacklisting. All candidates for stable. Summary: - Fix module loading when KASLR is configured but disabled at runtime - Fix accidental IPI when mapping user executable pages - Ensure hyp-stub and KVM world switch code cannot be kprobed" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hibernate: Clean the __hyp_text to PoC after resume arm64: hyp-stub: Forbid kprobing of the hyp-stub arm64: kprobe: Always blacklist the KVM world-switch code arm64: kaslr: ensure randomized quantities are clean also when kaslr is off arm64: Do not issue IPIs for user executable ptes
2019-02-01Merge tag '5.0-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb3 fixes from Steve French: "SMB3 fixes, some from this week's SMB3 test evemt, 5 for stable and a particularly important one for queryxattr (see xfstests 70 and 117)" * tag '5.0-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number CIFS: fix use-after-free of the lease keys CIFS: Do not consider -ENODATA as stat failure for reads CIFS: Do not count -ENODATA as failure for query directory CIFS: Fix trace command logging for SMB2 reads and writes CIFS: Fix possible oops and memory leaks in async IO cifs: limit amount of data we request for xattrs to CIFSMaxBufSize cifs: fix computation for MAX_SMB2_HDR_SIZE
2019-02-01Merge tag 'apparmor-pr-2019-02-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor bug fixes from John Johansen: "Two bug fixes for apparmor: - Fix aa_label_build() error handling for failed merges - Fix warning about unused function apparmor_ipv6_postroute" * tag 'apparmor-pr-2019-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: Fix aa_label_build() error handling for failed merges apparmor: Fix warning about unused function apparmor_ipv6_postroute
2019-02-01autofs: fix error return in autofs_fill_super()Ian Kent
In autofs_fill_super() on error of get inode/make root dentry the return should be ENOMEM as this is the only failure case of the called functions. Link: http://lkml.kernel.org/r/154725123240.11260.796773942606871359.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01autofs: drop dentry reference only when it is never usedPan Bian
autofs_expire_run() calls dput(dentry) to drop the reference count of dentry. However, dentry is read via autofs_dentry_ino(dentry) after that. This may result in a use-free-bug. The patch drops the reference count of dentry only when it is never used. Link: http://lkml.kernel.org/r/154725122396.11260.16053424107144453867.stgit@pluto-themaw-net Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()Jan Kara
When superblock has lots of inodes without any pagecache (like is the case for /proc), drop_pagecache_sb() will iterate through all of them without dropping sb->s_inode_list_lock which can lead to softlockups (one of our customers hit this). Fix the problem by going to the slow path and doing cond_resched() in case the process needs rescheduling. Link: http://lkml.kernel.org/r/20190114085343.15011-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01mm: migrate: don't rely on __PageMovable() of newpage after unlocking itDavid Hildenbrand
We had a race in the old balloon compaction code before b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature") refactored it that became visible after backporting 195a8c43e93d ("virtio-balloon: deflate via a page list") without the refactoring. The bug existed from commit d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management") till b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature"). d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management") was backported to 3.12, so the broken kernels are stable kernels [3.12 - 4.7]. There was a subtle race between dropping the page lock of the newpage in __unmap_and_move() and checking for __is_movable_balloon_page(newpage). Just after dropping this page lock, virtio-balloon could go ahead and deflate the newpage, effectively dequeueing it and clearing PageBalloon, in turn making __is_movable_balloon_page(newpage) fail. This resulted in dropping the reference of the newpage via putback_lru_page(newpage) instead of put_page(newpage), leading to page->lru getting modified and a !LRU page ending up in the LRU lists. With 195a8c43e93d ("virtio-balloon: deflate via a page list") backported, one would suddenly get corrupted lists in release_pages_balloon(): - WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0 - list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100 Nowadays this race is no longer possible, but it is hidden behind very ugly handling of __ClearPageMovable() and __PageMovable(). __ClearPageMovable() will not make __PageMovable() fail, only PageMovable(). So the new check (__PageMovable(newpage)) will still hold even after newpage was dequeued by virtio-balloon. If anybody would ever change that special handling, the BUG would be introduced again. So instead, make it explicit and use the information of the original isolated page before migration. This patch can be backported fairly easy to stable kernels (in contrast to the refactoring). Link: http://lkml.kernel.org/r/20190129233217.10747-1-david@redhat.com Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Vratislav Bendel <vbendel@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vratislav Bendel <vbendel@redhat.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Konstantin Khlebnikov <k.khlebnikov@samsung.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> [3.12 - 4.7] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01psi: clarify the Kconfig text for the default-disable optionJohannes Weiner
The current help text caused some confusion in online forums about whether or not to default-enable or default-disable psi in vendor kernels. This is because it doesn't communicate the reason for why we made this setting configurable in the first place: that the overhead is non-zero in an artificial scheduler stress test. Since this isn't representative of real workloads, and the effect was not measurable in scheduler-heavy real world applications such as the webservers and memcache installations at Facebook, it's fair to point out that this is a pretty cautious option to select. Link: http://lkml.kernel.org/r/20190129233617.16767-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01mm, memory_hotplug: __offline_pages fix wrong lockingMichal Hocko
Jan has noticed that we do double unlock on some failure paths when offlining a page range. This is indeed the case when test_pages_in_a_zone respp. start_isolate_page_range fail. This was an omission when forward porting the debugging patch from an older kernel. Fix the issue by dropping mem_hotplug_done from the failure condition and keeping the single unlock in the catch all failure path. Link: http://lkml.kernel.org/r/20190115120307.22768-1-mhocko@kernel.org Fixes: 7960509329c2 ("mm, memory_hotplug: print reason for the offlining failure") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Jan Kara <jack@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01mm: hwpoison: use do_send_sig_info() instead of force_sig()Naoya Horiguchi
Currently memory_failure() is racy against process's exiting, which results in kernel crash by null pointer dereference. The root cause is that memory_failure() uses force_sig() to forcibly kill asynchronous (meaning not in the current context) processes. As discussed in thread https://lkml.org/lkml/2010/6/8/236 years ago for OOM fixes, this is not a right thing to do. OOM solves this issue by using do_send_sig_info() as done in commit d2d393099de2 ("signal: oom_kill_task: use SEND_SIG_FORCED instead of force_sig()"), so this patch is suggesting to do the same for hwpoison. do_send_sig_info() properly accesses to siglock with lock_task_sighand(), so is free from the reported race. I confirmed that the reported bug reproduces with inserting some delay in kill_procs(), and it never reproduces with this patch. Note that memory_failure() can send another type of signal using force_sig_mceerr(), and the reported race shouldn't happen on it because force_sig_mceerr() is called only for synchronous processes (i.e. BUS_MCEERR_AR happens only when some process accesses to the corrupted memory.) Link: http://lkml.kernel.org/r/20190116093046.GA29835@hori1.linux.bs1.fc.nec.co.jp Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01kasan: mark file common so ftrace doesn't trace itAnders Roxell
When option CONFIG_KASAN is enabled toghether with ftrace, function ftrace_graph_caller() gets in to a recursion, via functions kasan_check_read() and kasan_check_write(). Breakpoint 2, ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179 179 mcount_get_pc x0 // function's pc (gdb) bt #0 ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179 #1 0xffffff90101406c8 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151 #2 0xffffff90106fd084 in kasan_check_write (p=0xffffffc06c170878, size=4) at ../mm/kasan/common.c:105 #3 0xffffff90104a2464 in atomic_add_return (v=<optimized out>, i=<optimized out>) at ./include/generated/atomic-instrumented.h:71 #4 atomic_inc_return (v=<optimized out>) at ./include/generated/atomic-fallback.h:284 #5 trace_graph_entry (trace=0xffffffc03f5ff380) at ../kernel/trace/trace_functions_graph.c:441 #6 0xffffff9010481774 in trace_graph_entry_watchdog (trace=<optimized out>) at ../kernel/trace/trace_selftest.c:741 #7 0xffffff90104a185c in function_graph_enter (ret=<optimized out>, func=<optimized out>, frame_pointer=18446743799894897728, retp=<optimized out>) at ../kernel/trace/trace_functions_graph.c:196 #8 0xffffff9010140628 in prepare_ftrace_return (self_addr=18446743592948977792, parent=0xffffffc03f5ff418, frame_pointer=18446743799894897728) at ../arch/arm64/kernel/ftrace.c:231 #9 0xffffff90101406f4 in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182 Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb) Rework so that the kasan implementation isn't traced. Link: http://lkml.kernel.org/r/20181212183447.15890-1-anders.roxell@linaro.org Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01init/Kconfig: fix grammar by moving a closing parenthesisJonathan Neuschäfer
Link: http://lkml.kernel.org/r/20190129150813.15785-1-j.neuschaefer@gmx.net Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01lib/test_kmod.c: potential double free in error handlingDan Carpenter
There is a copy and paste bug so we set "config->test_driver" to NULL twice instead of setting "config->test_fs". Smatch complains that it leads to a double free: lib/test_kmod.c:840 __kmod_config_init() warn: 'config->test_fs' double freed Link: http://lkml.kernel.org/r/20190121140011.GA14283@kadam Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>