summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2019-08-27writeback, memcg: Implement foreign dirty flushingTejun Heo
There's an inherent mismatch between memcg and writeback. The former trackes ownership per-page while the latter per-inode. This was a deliberate design decision because honoring per-page ownership in the writeback path is complicated, may lead to higher CPU and IO overheads and deemed unnecessary given that write-sharing an inode across different cgroups isn't a common use-case. Combined with inode majority-writer ownership switching, this works well enough in most cases but there are some pathological cases. For example, let's say there are two cgroups A and B which keep writing to different but confined parts of the same inode. B owns the inode and A's memory is limited far below B's. A's dirty ratio can rise enough to trigger balance_dirty_pages() sleeps but B's can be low enough to avoid triggering background writeback. A will be slowed down without a way to make writeback of the dirty pages happen. This patch implements foreign dirty recording and foreign mechanism so that when a memcg encounters a condition as above it can trigger flushes on bdi_writebacks which can clean its pages. Please see the comment on top of mem_cgroup_track_foreign_dirty_slowpath() for details. A reproducer follows. write-range.c:: #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/types.h> static const char *usage = "write-range FILE START SIZE\n"; int main(int argc, char **argv) { int fd; unsigned long start, size, end, pos; char *endp; char buf[4096]; if (argc < 4) { fprintf(stderr, usage); return 1; } fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); return 1; } start = strtoul(argv[2], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } size = strtoul(argv[3], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } end = start + size; while (1) { for (pos = start; pos < end; ) { long bread, bwritten = 0; if (lseek(fd, pos, SEEK_SET) < 0) { perror("lseek"); return 1; } bread = read(0, buf, sizeof(buf) < end - pos ? sizeof(buf) : end - pos); if (bread < 0) { perror("read"); return 1; } if (bread == 0) return 0; while (bwritten < bread) { long this; this = write(fd, buf + bwritten, bread - bwritten); if (this < 0) { perror("write"); return 1; } bwritten += this; pos += bwritten; } } } } repro.sh:: #!/bin/bash set -e set -x sysctl -w vm.dirty_expire_centisecs=300000 sysctl -w vm.dirty_writeback_centisecs=300000 sysctl -w vm.dirtytime_expire_seconds=300000 echo 3 > /proc/sys/vm/drop_caches TEST=/sys/fs/cgroup/test A=$TEST/A B=$TEST/B mkdir -p $A $B echo "+memory +io" > $TEST/cgroup.subtree_control echo $((1<<30)) > $A/memory.high echo $((32<<30)) > $B/memory.high rm -f testfile touch testfile fallocate -l 4G testfile echo "Starting B" (echo $BASHPID > $B/cgroup.procs pv -q --rate-limit 70M < /dev/urandom | ./write-range testfile $((2<<30)) $((2<<30))) & echo "Waiting 10s to ensure B claims the testfile inode" sleep 5 sync sleep 5 sync echo "Starting A" (echo $BASHPID > $A/cgroup.procs pv < /dev/urandom | ./write-range testfile 0 $((2<<30))) v2: Added comments explaining why the specific intervals are being used. v3: Use 0 @nr when calling cgroup_writeback_by_id() to use best-effort flushing while avoding possible livelocks. v4: Use get_jiffies_64() and time_before/after64() instead of raw jiffies_64 and arthimetic comparisons as suggested by Jan. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27writeback, memcg: Implement cgroup_writeback_by_id()Tejun Heo
Implement cgroup_writeback_by_id() which initiates cgroup writeback from bdi and memcg IDs. This will be used by memcg foreign inode flushing. v2: Use wb_get_lookup() instead of wb_get_create() to avoid creating spurious wbs. v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort flushing while avoding possible livelocks. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27writeback: Separate out wb_get_lookup() from wb_get_create()Tejun Heo
Separate out wb_get_lookup() which doesn't try to create one if there isn't already one from wb_get_create(). This will be used by later patches. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27bdi: Add bdi->idTejun Heo
There currently is no way to universally identify and lookup a bdi without holding a reference and pointer to it. This patch adds an non-recycling bdi->id and implements bdi_get_by_id() which looks up bdis by their ids. This will be used by memcg foreign inode flushing. I left bdi_list alone for simplicity and because while rb_tree does support rcu assignment it doesn't seem to guarantee lossless walk when walk is racing aginst tree rebalance operations. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27writeback: Generalize and expose wb_completionTejun Heo
wb_completion is used to track writeback completions. We want to use it from memcg side for foreign inode flushes. This patch updates it to remember the target waitq instead of assuming bdi->wb_waitq and expose it outside of fs-writeback.c. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-19block: remove struct request_queue queue_headJunxiao Bi
The dispatch list is not used any more, as the legacy block IO stack has been removed. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-14block: annotate refault stalls from IO submissionJohannes Weiner
psi tracks the time tasks wait for refaulting pages to become uptodate, but it does not track the time spent submitting the IO. The submission part can be significant if backing storage is contended or when cgroup throttling (io.latency) is in effect - a lot of time is spent in submit_bio(). In that case, we underreport memory pressure. Annotate submit_bio() to account submission time as memory stall when the bio is reading userspace workingset pages. Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-06lightnvm: move metadata mapping to lower level driverHans Holmberg
Now that blk_rq_map_kern can map both kmem and vmem, move internal metadata mapping down to the lower level driver. Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hans Holmberg <hans@owltronix.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-06lightnvm: remove nvm_submit_io_sync_fnHans Holmberg
Move the redundant sync handling interface and wait for a completion in the lightnvm core instead. Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hans Holmberg <hans@owltronix.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04blk-mq: add callback of .cleanup_rqMing Lei
SCSI maintains its own driver private data hooked off of each SCSI request, and the pridate data won't be freed after scsi_queue_rq() returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver (e.g. dm-rq) may need to retry these SCSI requests, before SCSI has fully dispatched them, due to a lower level SCSI driver's resource limitation identified in scsi_queue_rq(). Currently SCSI's per-request private data is leaked when the upper layer driver (dm-rq) frees and then retries these requests in response to BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE returns from scsi_queue_rq(). This usecase is so specialized that it doesn't warrant training an existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to account for freeing its driver private data -- doing so would add an extra branch for handling a special case that all other consumers of SCSI (and blk-mq) won't ever need to worry about. So the most pragmatic way forward is to delegate freeing SCSI driver private data to the upper layer driver (dm-rq). Do so by adding new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method from dm-rq. A following commit will implement the .cleanup_rq() hook in scsi_mq_ops. Cc: Ewan D. Milne <emilne@redhat.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: <stable@vger.kernel.org> Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04block: add req op to reset all zones and flagChaitanya Kulkarni
This patch introduces a new request operation REQ_OP_ZONE_RESET_ALL. This is useful for the applications like mkfs where it needs to reset all the zones present on the underlying block device. As part for this patch we also introduce new QUEUE_FLAG_ZONE_RESETALL which indicates the queue zone reset all capability and corresponding helper macro. Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04block: Fix spelling in the header above blkg_lookup()Bart Van Assche
See also commit 8f4236d9008b ("block: remove QUEUE_FLAG_BYPASS and ->bypass") # v5.0. Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04block: Declare several function pointer arguments 'const'Bart Van Assche
Make it clear to the compiler and also to humans that the functions that query request queue properties do not modify any member of the request_queue data structure. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04blk-mq: remove blk_mq_complete_request_syncMing Lei
blk_mq_tagset_wait_completed_request() has been applied for waiting for completed request's fn, so not necessary to use blk_mq_complete_request_sync() any more. Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04blk-mq: introduce blk_mq_tagset_wait_completed_request()Ming Lei
blk-mq may schedule to call queue's complete function on remote CPU via IPI, but doesn't provide any way to synchronize the request's complete fn. The current queue freeze interface can't provide the synchonization because aborted requests stay at blk-mq queues during EH. In some driver's EH(such as NVMe), hardware queue's resource may be freed & re-allocated. If the completed request's complete fn is run finally after the hardware queue's resource is released, kernel crash will be triggered. Prepare for fixing this kind of issue by introducing blk_mq_tagset_wait_completed_request(). Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-04blk-mq: introduce blk_mq_request_completed()Ming Lei
NVMe needs this function to decide if one request to be aborted has been completed in normal IO path already. So introduce it. Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: drivers/acpi/scan.c: document why we don't need the device_hotplug_lock memremap: move from kernel/ to mm/ lib/test_meminit.c: use GFP_ATOMIC in RCU critical section asm-generic: fix -Wtype-limits compiler warnings cgroup: kselftest: relax fs_spec checks mm/memory_hotplug.c: remove unneeded return for void function mm/migrate.c: initialize pud_entry in migrate_vma() coredump: split pipe command whitespace before expanding template page flags: prioritize kasan bits over last-cpuid ubsan: build ubsan.c more conservatively kasan: remove clang version check for KASAN_STACK mm: compaction: avoid 100% CPU usage during compaction when a task is killed mm: migrate: fix reference check race between __find_get_block() and migration mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker ocfs2: remove set but not used variable 'last_hash' Revert "kmemleak: allow to coexist with fault injection" kernel/signal.c: fix a kernel-doc markup
2019-08-03asm-generic: fix -Wtype-limits compiler warningsQian Cai
Commit d66acc39c7ce ("bitops: Optimise get_order()") introduced a compilation warning because "rx_frag_size" is an "ushort" while PAGE_SHIFT here is 16. The commit changed the get_order() to be a multi-line macro where compilers insist to check all statements in the macro even when __builtin_constant_p(rx_frag_size) will return false as "rx_frag_size" is a module parameter. In file included from ./arch/powerpc/include/asm/page_64.h:107, from ./arch/powerpc/include/asm/page.h:242, from ./arch/powerpc/include/asm/mmu.h:132, from ./arch/powerpc/include/asm/lppaca.h:47, from ./arch/powerpc/include/asm/paca.h:17, from ./arch/powerpc/include/asm/current.h:13, from ./include/linux/thread_info.h:21, from ./arch/powerpc/include/asm/processor.h:39, from ./include/linux/prefetch.h:15, from drivers/net/ethernet/emulex/benet/be_main.c:14: drivers/net/ethernet/emulex/benet/be_main.c: In function 'be_rx_cqs_create': ./include/asm-generic/getorder.h:54:9: warning: comparison is always true due to limited range of data type [-Wtype-limits] (((n) < (1UL << PAGE_SHIFT)) ? 0 : \ ^ drivers/net/ethernet/emulex/benet/be_main.c:3138:33: note: in expansion of macro 'get_order' adapter->big_page_size = (1 << get_order(rx_frag_size)) * PAGE_SIZE; ^~~~~~~~~ Fix it by moving all of this multi-line macro into a proper function, and killing __get_order() off. [akpm@linux-foundation.org: remove __get_order() altogether] [cai@lca.pw: v2] Link: http://lkml.kernel.org/r/1564000166-31428-1-git-send-email-cai@lca.pw Link: http://lkml.kernel.org/r/1563914986-26502-1-git-send-email-cai@lca.pw Fixes: d66acc39c7ce ("bitops: Optimise get_order()") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Howells <dhowells@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Bill Wendling <morbo@google.com> Cc: James Y Knight <jyknight@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-03page flags: prioritize kasan bits over last-cpuidArnd Bergmann
ARM64 randdconfig builds regularly run into a build error, especially when NUMA_BALANCING and SPARSEMEM are enabled but not SPARSEMEM_VMEMMAP: #error "KASAN: not enough bits in page flags for tag" The last-cpuid bits are already contitional on the available space, so the result of the calculation is a bit random on whether they were already left out or not. Adding the kasan tag bits before last-cpuid makes it much more likely to end up with a successful build here, and should be reliable for randconfig at least, as long as that does not randomize NR_CPUS or NODES_SHIFT but uses the defaults. In order for the modified check to not trigger in the x86 vdso32 code where all constants are wrong (building with -m32), enclose all the definitions with an #ifdef. [arnd@arndb.de: build fix] Link: http://lkml.kernel.org/r/CAK8P3a3Mno1SWTcuAOT0Wa9VS15pdU6EfnkxLbDpyS55yO04+g@mail.gmail.com Link: http://lkml.kernel.org/r/20190722115520.3743282-1-arnd@arndb.de Link: https://lore.kernel.org/lkml/20190618095347.3850490-1-arnd@arndb.de/ Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-02Merge tag 'drm-fixes-2019-08-02-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull more drm fixes from Daniel Vetter: "Dave sends his pull, everyone realizes they've been asleep at the wheel and hits send on their own pulls :-/ Normally I'd just ignore these all because w/e for me and Dave. But this time around the latecomers also included drm-intel-fixes, which failed to send out a -fixes pull thus far for this release (screwed up vacation coverage, despite that 2/3 maintainers were around ... they all look appropriately guilty), and that really is overdue to get landed. And since I had to do a pull request anyway I pulled the other two late ones too. intel fixes (didn't have any ever since the main merge window pull): - gvt fixes (2 cc: stable) - fix gpu reset vs mm-shrinker vs wakeup fun (needed a few patches) - two gem locking fixes (one cc: stable) - pile of misc fixes all over with minor impact, 6 cc: stable, others from this window exynos: - misc minor fixes misc: - some build/Kconfig fixes - regression fix for vm scalability perf test which seems to mostly exercise dmesg/console logging ... - the vgem cache flush fix for arm64 broke the world on x86, so that's reverted again * tag 'drm-fixes-2019-08-02-1' of git://anongit.freedesktop.org/drm/drm: (42 commits) Revert "drm/vgem: fix cache synchronization on arm/arm64" drm/exynos: fix missing decrement of retry counter drm/exynos: add CONFIG_MMU dependency drm/exynos: remove redundant assignment to pointer 'node' drm/exynos: using dev_get_drvdata directly drm/bochs: Use shadow buffer for bochs framebuffer console drm/fb-helper: Instanciate shadow FB if configured in device's mode_config drm/fb-helper: Map DRM client buffer only when required drm/client: Support unmapping of DRM client buffers drm/i915: Only recover active engines drm/i915: Add a wakeref getter for iff the wakeref is already active drm/i915: Lift intel_engines_resume() to callers drm/vgem: fix cache synchronization on arm/arm64 drm/i810: Use CONFIG_PREEMPTION drm/bridge: tc358764: Fix build error drm/bridge: lvds-encoder: Fix build error while CONFIG_DRM_KMS_HELPER=m drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled. drm/i915/gvt: grab runtime pm first for forcewake use drm/i915/gvt: fix incorrect cache entry for guest page mapping drm/i915/gvt: Checking workload's gma earlier ...
2019-08-02Merge tag 'for-linus-5.3a-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a small cleanup - a fix for a build error on ARM with some configs - a fix of a patch for the Xen gntdev driver - three patches for fixing a potential problem in the swiotlb-xen driver which Konrad was fine with me carrying them through the Xen tree * tag 'for-linus-5.3a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/swiotlb: remember having called xen_create_contiguous_region() xen/swiotlb: simplify range_straddles_page_boundary() xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() xen: avoid link error on ARM xen/gntdev.c: Replace vm_map_pages() with vm_map_pages_zero() xen/pciback: remove set but not used variable 'old_state'
2019-08-02Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Seven fixes to four drivers with no core changes. The mpt3sas one is theoretical until we get a CPU that goes up to 64 bits physical, the qla2xxx one fixes an oops in a driver initialization error leg and the others are mostly cosmetic" [ The fcoe patches may be worth highlighting - they may be "just" cleanups, but they simplify and fix the odd fc_rport_priv structure handling rules so that the new gcc-9 warnings about memset crossing structure boundaries are gone. The old code was hard for humans to understand too, and really confused the compiler sanity checks - Linus ] * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix possible fcport null-pointer dereferences scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA scsi: hpsa: remove printing internal cdb on tag collision scsi: hpsa: correct scsi command status issue after reset scsi: fcoe: pass in fcoe_rport structure instead of fc_rport_priv scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure scsi: libfc: Whitespace cleanup in libfc.h
2019-08-02Merge tag 'for-linus-20190802' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Here's a small collection of fixes that should go into this series. This contains: - io_uring potential use-after-free fix (Jackie) - loop regression fix (Jan) - O_DIRECT fragmented bio regression fix (Damien) - Mark Denis as the new floppy maintainer (Denis) - ataflop switch fall-through annotation (Gustavo) - libata zpodd overflow fix (Kees) - libata ahci deferred probe fix (Miquel) - nbd invalidation BUG_ON() fix (Munehisa) - dasd endless loop fix (Stefan)" * tag 'for-linus-20190802' of git://git.kernel.dk/linux-block: s390/dasd: fix endless loop after read unit address configuration block: Fix __blkdev_direct_IO() for bio fragments MAINTAINERS: floppy: take over maintainership nbd: replace kill_bdev() with __invalidate_device() again ata: libahci: do not complain in case of deferred probe io_uring: fix KASAN use after free in io_sq_wq_submit_work loop: Fix mount(2) failure due to race with LOOP_SET_FD libata: zpodd: Fix small read overflow in zpodd_get_mech_type() ataflop: Mark expected switch fall-through
2019-08-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Doug Ledford: "Here's our second -rc pull request. Nothing particularly special in this one. The client removal deadlock fix is kindy tricky, but we had multiple eyes on it and no one could find a fault in it. A couple Spectre V1 fixes too. Otherwise, all just normal -rc fodder: - A couple Spectre V1 fixes (umad, hfi1) - Fix a tricky deadlock in the rdma core code with refcounting instead of locks (client removal patches) - Build errors (hns) - Fix a scheduling while atomic issue (mlx5) - Use after free fix (mad) - Fix error path return code (hns) - Null deref fix (siw_crypto_hash) - A few other misc. minor fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/hns: Fix error return code in hns_roce_v1_rsv_lp_qp() RDMA/mlx5: Release locks during notifier unregister IB/hfi1: Fix Spectre v1 vulnerability IB/mad: Fix use-after-free in ib mad completion handling RDMA/restrack: Track driver QP types in resource tracker IB/mlx5: Fix MR registration flow to use UMR properly RDMA/devices: Remove the lock around remove_client_context RDMA/devices: Do not deadlock during client removal IB/core: Add mitigation for Spectre V1 Do not dereference 'siw_crypto_shash' before checking RDMA/qedr: Fix the hca_type and hca_rev returned in device attributes RDMA/hns: Fix build error
2019-08-02Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A few fixes for code that came in during the merge window or that started getting exercised differently this time around: - Select regmap MMIO kconfig in spreadtrum driver to avoid compile errors - Complete kerneldoc on devm_clk_bulk_get_optional() - Register an essential clk earlier on mediatek mt8183 SoCs so the clocksource driver can use it - Fix divisor math in the at91 driver - Plug a race in Renesas reset control logic" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: renesas: cpg-mssr: Fix reset control race condition clk: sprd: Select REGMAP_MMIO to avoid compile errors clk: mediatek: mt8183: Register 13MHz clock earlier for clocksource clk: Add missing documentation of devm_clk_bulk_get_optional() argument clk: at91: generated: Truncate divisor to GENERATED_MAX_DIV + 1
2019-08-01RDMA/devices: Remove the lock around remove_client_contextJason Gunthorpe
Due to the complexity of client->remove() callbacks it is desirable to not hold any locks while calling them. Remove the last one by tracking only the highest client ID and running backwards from there over the xarray. Since the only purpose of that lock was to protect the linked list, we can drop the lock. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081841.32345-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-01RDMA/devices: Do not deadlock during client removalJason Gunthorpe
lockdep reports: WARNING: possible circular locking dependency detected modprobe/302 is trying to acquire lock: 0000000007c8919c ((wq_completion)ib_cm){+.+.}, at: flush_workqueue+0xdf/0x990 but task is already holding lock: 000000002d3d2ca9 (&device->client_data_rwsem){++++}, at: remove_client_context+0x79/0xd0 [ib_core] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&device->client_data_rwsem){++++}: down_read+0x3f/0x160 ib_get_net_dev_by_params+0xd5/0x200 [ib_core] cma_ib_req_handler+0x5f6/0x2090 [rdma_cm] cm_process_work+0x29/0x110 [ib_cm] cm_req_handler+0x10f5/0x1c00 [ib_cm] cm_work_handler+0x54c/0x311d [ib_cm] process_one_work+0x4aa/0xa30 worker_thread+0x62/0x5b0 kthread+0x1ca/0x1f0 ret_from_fork+0x24/0x30 -> #1 ((work_completion)(&(&work->work)->work)){+.+.}: process_one_work+0x45f/0xa30 worker_thread+0x62/0x5b0 kthread+0x1ca/0x1f0 ret_from_fork+0x24/0x30 -> #0 ((wq_completion)ib_cm){+.+.}: lock_acquire+0xc8/0x1d0 flush_workqueue+0x102/0x990 cm_remove_one+0x30e/0x3c0 [ib_cm] remove_client_context+0x94/0xd0 [ib_core] disable_device+0x10a/0x1f0 [ib_core] __ib_unregister_device+0x5a/0xe0 [ib_core] ib_unregister_device+0x21/0x30 [ib_core] mlx5_ib_stage_ib_reg_cleanup+0x9/0x10 [mlx5_ib] __mlx5_ib_remove+0x3d/0x70 [mlx5_ib] mlx5_ib_remove+0x12e/0x140 [mlx5_ib] mlx5_remove_device+0x144/0x150 [mlx5_core] mlx5_unregister_interface+0x3f/0xf0 [mlx5_core] mlx5_ib_cleanup+0x10/0x3a [mlx5_ib] __x64_sys_delete_module+0x227/0x350 do_syscall_64+0xc3/0x6a4 entry_SYSCALL_64_after_hwframe+0x49/0xbe Which is due to the read side of the client_data_rwsem being obtained recursively through a work queue flush during cm client removal. The lock is being held across the remove in remove_client_context() so that the function is a fence, once it returns the client is removed. This is required so that the two callers do not proceed with destruction until the client completes removal. Instead of using client_data_rwsem use the existing device unregistration refcount and add a similar client unregistration (client->uses) refcount. This will fence the two unregistration paths without holding any locks. Cc: <stable@vger.kernel.org> Fixes: 921eab1143aa ("RDMA/devices: Re-organize device.c locking") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081841.32345-2-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-01Merge tag 'gpio-v5.3-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Three GPIO fixes, all touching the core, so quite important: - Fix the request of active low GPIO line events. - Don't issue WARN() stuff on NULL descriptors if the GPIOLIB is disabled. - Preserve the descriptor flags when setting the initial direction on lines" * tag 'gpio-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpiolib: Preserve desc->flags when setting state gpio: don't WARN() on NULL descs if gpiolib is disabled gpiolib: fix incorrect IRQ requesting of an active-low lineevent
2019-08-01drm/fb-helper: Instanciate shadow FB if configured in device's mode_configThomas Zimmermann
Generic framebuffer emulation uses a shadow buffer for framebuffers with dirty() function. If drivers want to use the shadow FB without such a function, they can now set prefer_shadow or prefer_shadow_fbdev in their mode_config structures. The former flag is exported to userspace, the latter flag is fbdev-only. v3: * only schedule dirty worker if fbdev uses shadow fb * test shadow fb settings with boolean operators * use bool for struct drm_mode_config.prefer_shadow_fbdev * fix documentation comments Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Tested-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/315834/ Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-08-01drm/client: Support unmapping of DRM client buffersThomas Zimmermann
DRM clients, such as the fbdev emulation, have their buffer objects mapped by default. Mapping a buffer implicitly prevents its relocation. Hence, the buffer may permanently consume video memory while it's allocated. This is a problem for drivers of low-memory devices, such as ast, mgag200 or older framebuffer hardware, which will then not have enough memory to display other content (e.g., X11). This patch introduces drm_client_buffer_vmap() and _vunmap(). Internal DRM clients can use these functions to unmap and remap buffer objects as needed. There's no reference counting for vmap operations. Callers are expected to either keep buffers mapped (as it is now), or call vmap and vunmap in pairs around code that accesses the mapped memory. v2: * remove several duplicated NULL-pointer checks v3: * style and typo fixes Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/315831/ Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-08-01xen/swiotlb: remember having called xen_create_contiguous_region()Juergen Gross
Instead of always calling xen_destroy_contiguous_region() in case the memory is DMA-able for the used device, do so only in case it has been made DMA-able via xen_create_contiguous_region() before. This will avoid a lot of xen_destroy_contiguous_region() calls for 64-bit capable devices. As the memory in question is owned by swiotlb-xen the PG_owner_priv_1 flag of the first allocated page can be used for remembering. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-31Merge tag 'trace-v5.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two minor fixes: - Fix trace event header include guards, as several did not match the #define to the #ifdef - Remove a redundant test to ftrace_graph_notrace_addr() that was accidentally added" * tag 'trace-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: fgraph: Remove redundant ftrace_graph_notrace_addr() test tracing: Fix header include guards in trace event headers
2019-07-31xen: avoid link error on ARMArnd Bergmann
Building the privcmd code as a loadable module on ARM, we get a link error due to the private cache management functions: ERROR: "__sync_icache_dcache" [drivers/xen/xen-privcmd.ko] undefined! Move the code into a new that is always built in when Xen is enabled, as suggested by Juergen Gross and Boris Ostrovsky. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-30tracing: Fix header include guards in trace event headersMasahiro Yamada
These include guards are broken. Match the #if !define() and #define lines so that they work correctly. Link: http://lkml.kernel.org/r/20190720103943.16982-1-yamada.masahiro@socionext.com Fixes: f54d1867005c3 ("dma-buf: Rename struct fence to dma_fence") Fixes: 2e26ca7150a4f ("tracing: Fix tracepoint.h DECLARE_TRACE() to allow more than one header") Fixes: e543002f77f46 ("qdisc: add tracepoint qdisc:qdisc_dequeue for dequeued SKBs") Fixes: 95f295f9fe081 ("dmaengine: tegra: add tracepoints to driver") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-07-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "A few regression and bug fixes for the patches merged in the last cycle: - hns fixes a subtle crash from the ib core SGL rework - hfi1 fixes various error handling, oops and protocol errors - bnxt_re fixes a regression where nvmeof doesn't work on some configurations - mlx5 fixes a serious 'use after free' bug in how MR caching is handled - some edge case crashers in the new statistic core code - more siw static checker fixups" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification IB/counters: Always initialize the port counter object IB/core: Fix querying total rdma stats IB/mlx5: Prevent concurrent MR updates during invalidation IB/mlx5: Fix clean_mr() to work in the expected order IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache IB/mlx5: Use direct mkey destroy command upon UMR unreg failure IB/mlx5: Fix unreg_umr to ignore the mkey state RDMA/siw: Remove set but not used variables 'rv' IB/mlx5: Replace kfree with kvfree RDMA/bnxt_re: Honor vlan_id in GID entry comparison IB/hfi1: Drop all TID RDMA READ RESP packets after r_next_psn IB/hfi1: Field not zero-ed when allocating TID flow memory IB/hfi1: Unreserve a flushed OPFN request IB/hfi1: Check for error on call to alloc_rsm_map_table RDMA/hns: Fix sg offset non-zero issue RDMA/siw: Fix error return code in siw_init_module()
2019-07-30Merge tag 'for-linus-hmm' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull HMM fixes from Jason Gunthorpe: "Fix the locking around nouveau's use of the hmm_range_* APIs. It works correctly in the success case, but many of the the edge cases have missing unlocks or double unlocks. The diffstat is a bit big as Christoph did a comprehensive job to move the obsolete API from the core header and into the driver before fixing its flow, but the risk of regression from this code motion is low" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: nouveau: unlock mmap_sem on all errors from nouveau_range_fault nouveau: remove the block parameter to nouveau_range_fault mm/hmm: move hmm_vma_range_done and hmm_vma_fault to nouveau mm/hmm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}
2019-07-30loop: Fix mount(2) failure due to race with LOOP_SET_FDJan Kara
Commit 33ec3e53e7b1 ("loop: Don't change loop device under exclusive opener") made LOOP_SET_FD ioctl acquire exclusive block device reference while it updates loop device binding. However this can make perfectly valid mount(2) fail with EBUSY due to racing LOOP_SET_FD holding temporarily the exclusive bdev reference in cases like this: for i in {a..z}{a..z}; do dd if=/dev/zero of=$i.image bs=1k count=0 seek=1024 mkfs.ext2 $i.image mkdir mnt$i done echo "Run" for i in {a..z}{a..z}; do mount -o loop -t ext2 $i.image mnt$i & done Fix the problem by not getting full exclusive bdev reference in LOOP_SET_FD but instead just mark the bdev as being claimed while we update the binding information. This just blocks new exclusive openers instead of failing them with EBUSY thus fixing the problem. Fixes: 33ec3e53e7b1 ("loop: Don't change loop device under exclusive opener") Cc: stable@vger.kernel.org Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-29scsi: fcoe: Embed fc_rport_priv in fcoe_rport structureHannes Reinecke
Gcc-9 complains for a memset across pointer boundaries, which happens as the code tries to allocate a flexible array on the stack. Turns out we cannot do this without relying on gcc-isms, so with this patch we'll embed the fc_rport_priv structure into fcoe_rport, can use the normal 'container_of' outcast, and will only have to do a memset over one structure. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-29scsi: libfc: Whitespace cleanup in libfc.hHannes Reinecke
No functional change. [mkp: typo] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-29Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost fixes from Michael Tsirkin: - Fixes in the iommu and balloon devices. - Disable the meta-data optimization for now - I hope we can get it fixed shortly, but there's no point in making users suffer crashes while we are working on that. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: disable metadata prefetch optimization iommu/virtio: Update to most recent specification balloon: fix up comments mm/balloon_compaction: avoid duplicate page removal
2019-07-29Merge tag 'platform-drivers-x86-v5.3-3' of ↵Linus Torvalds
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fixes from Andy Shevchenko: "Business as usual, a few fixes and new IDs: - PC Engines APU got one fix for software dependencies to automatically load them and another fix for mapping of key button in the front to issue restart event. - OLPC driver is now probed automatically based on module device table. - Intel PMC core driver supports Intel Ice Lake NNPI processor. - WMI driver missed description of a new field in the structure that has been added" * tag 'platform-drivers-x86-v5.3-3' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: pcengines-apuv2: use KEY_RESTART for front button platform/x86: intel_pmc_core: Add ICL-NNPI support to PMC Core Platform: OLPC: add SPI MODULE_DEVICE_TABLE platform/x86: wmi: add missing struct parameter description platform/x86: pcengines-apuv2: Fix softdep statement
2019-07-28Merge tag 'tty-5.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty fixes from Greg KH: "Here are two tty/vt fixes: - delete the netx-serial driver as the arch has been removed, no need to keep the serial driver for it around either. - vt console_lock fix to resolve a reported noisy warning at runtime Both of these have been in linux-next with no reported issues" * tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: vt: Grab console_lock around con_is_bound in show_bind tty: serial: netx: Delete driver
2019-07-28Merge tag 'spdx-5.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX fixes from Greg KH: "Here are some small SPDX fixes for 5.3-rc2 for things that came in during the 5.3-rc1 merge window that we previously missed. Only three small patches here: - two uapi patches to resolve some SPDX tags that were not correct - fix an invalid SPDX tag in the iomap Makefile file All have been properly reviewed on the public mailing lists" * tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: iomap: fix Invalid License ID treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
2019-07-28gpio: don't WARN() on NULL descs if gpiolib is disabledBartosz Golaszewski
If gpiolib is disabled, we use the inline stubs from gpio/consumer.h instead of regular definitions of GPIO API. The stubs for 'optional' variants of gpiod_get routines return NULL in this case as if the relevant GPIO wasn't found. This is correct so far. Calling other (non-gpio_get) stubs from this header triggers a warning because the GPIO descriptor couldn't have been requested. The warning however is unconditional (WARN_ON(1)) and is emitted even if the passed descriptor pointer is NULL. We don't want to force the users of 'optional' gpio_get to check the returned pointer before calling e.g. gpiod_set_value() so let's only WARN on non-NULL descriptors. Cc: stable@vger.kernel.org Reported-by: Claus H. Stovgaard <cst@phaseone.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-07-27Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Two fixes for the fair scheduling class: - Prevent freeing memory which is accessible by concurrent readers - Make the RCU annotations for numa groups consistent" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Use RCU accessors consistently for ->numa_group sched/fair: Don't free p->numa_faults with concurrent readers
2019-07-27Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of locking fixes: - Address the fallout of the rwsem rework. Missing ACQUIREs and a sanity check to prevent a use-after-free - Add missing checks for unitialized mutexes when mutex debugging is enabled. - Remove the bogus code in the generic SMP variant of arch_futex_atomic_op_inuser() - Fixup the #ifdeffery in lockdep to prevent compile warnings" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/mutex: Test for initialized mutex locking/lockdep: Clean up #ifdef checks locking/lockdep: Hide unused 'class' variable locking/rwsem: Add ACQUIRE comments tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop lcoking/rwsem: Add missing ACQUIRE to read_slowpath sleep loop locking/rwsem: Add missing ACQUIRE to read_slowpath exit when queue is empty locking/rwsem: Don't call owner_on_cpu() on read-owner futex: Cleanup generic SMP variant of arch_futex_atomic_op_inuser()
2019-07-27Merge tag 'devicetree-fixes-for-5.3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull Devicetree fixes from Rob Herring: "The nvmem changes would typically go thru Greg's tree, but they were missed in the merge window. [ Acked by Greg ] Summary: - Fix mismatches in $id values and actual filenames. Now checked by tools. - Convert nvmem binding to DT schema - Fix a typo in of_property_read_bool() kerneldoc - Remove some redundant description in al-fic interrupt-controller" * tag 'devicetree-fixes-for-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: Fix more $id value mismatches filenames dt-bindings: nvmem: SID: Fix the examples node names dt-bindings: nvmem: Add YAML schemas for the generic NVMEM bindings of: Fix typo in kerneldoc dt-bindings: interrupt-controller: al-fic: remove redundant binding dt-bindings: clk: allwinner,sun4i-a10-ccu: Correct path in $id
2019-07-27Merge tag 'libnvdimm-fixes-5.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A collection of locking and async operations fixes for v5.3-rc2. These had been soaking in a branch targeting the merge window, but missed due to a regression hunt. This fixed up version has otherwise been in -next this past week with no reported issues. In order to gain confidence in the locking changes the pull also includes a debug / instrumentation patch to enable lockdep coverage for libnvdimm subsystem operations that depend on the device_lock for exclusion. As mentioned in the changelog it is a hack, but it works and documents the locking expectations of the sub-system in a way that others can use lockdep to verify. The driver core touches got an ack from Greg. Summary: - Fix duplicate device_unregister() calls (multiple threads competing to do unregister work when scheduling device removal from a sysfs attribute of the self-same device). - Fix badblocks registration order bug. Ensure region badblocks are initialized in advance of namespace registration. - Fix a deadlock between the bus lock and probe operations. - Export device-core infrastructure to coordinate async operations via the device ->dead state. - Add device-core infrastructure to validate device_lock() usage with lockdep" * tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: driver-core, libnvdimm: Let device subsystems add local lockdep coverage libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock libnvdimm/bus: Stop holding nvdi