summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2018-08-22proc/kcore: optimize multiple page readsOmar Sandoval
The current code does a full search of the segment list every time for every page. This is wasteful, since it's almost certain that the next page will be in the same segment. Instead, check if the previous segment covers the current page before doing the list search. Link: http://lkml.kernel.org/r/fd346c11090cf93d867e01b8d73a6567c5ac6361.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: clean up ELF header generationOmar Sandoval
Currently, the ELF file header, program headers, and note segment are allocated all at once, in some icky code dating back to 2.3. Programs tend to read the file header, then the program headers, then the note segment, all separately, so this is a waste of effort. It's cleaner and more efficient to handle the three separately. Link: http://lkml.kernel.org/r/19c92cbad0e11f6103ff3274b2e7a7e51a1eb74b.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: hold lock during readOmar Sandoval
Now that we're using an rwsem, we can hold it during the entirety of read_kcore() and have a common return path. This is preparation for the next change. [akpm@linux-foundation.org: fix locking bug reported by Tetsuo Handa] Link: http://lkml.kernel.org/r/d7cfbc1e8a76616f3b699eaff9df0a2730380534.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: fix memory hotplug vs multiple opens raceOmar Sandoval
There's a theoretical race condition that will cause /proc/kcore to miss a memory hotplug event: CPU0 CPU1 // hotplug event 1 kcore_need_update = 1 open_kcore() open_kcore() kcore_update_ram() kcore_update_ram() // Walk RAM // Walk RAM __kcore_update_ram() __kcore_update_ram() kcore_need_update = 0 // hotplug event 2 kcore_need_update = 1 kcore_need_update = 0 Note that CPU1 set up the RAM kcore entries with the state after hotplug event 1 but cleared the flag for hotplug event 2. The RAM entries will therefore be stale until there is another hotplug event. This is an extremely unlikely sequence of events, but the fix makes the synchronization saner, anyways: we serialize the entire update sequence, which means that whoever clears the flag will always succeed in replacing the kcore list. Link: http://lkml.kernel.org/r/6106c509998779730c12400c1b996425df7d7089.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: replace kclist_lock rwlock with rwsemOmar Sandoval
Now we only need kclist_lock from user context and at fs init time, and the following changes need to sleep while holding the kclist_lock. Link: http://lkml.kernel.org/r/521ba449ebe921d905177410fee9222d07882f0d.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: don't grab lock for memory hotplug notifierOmar Sandoval
The memory hotplug notifier kcore_callback() only needs kclist_lock to prevent races with __kcore_update_ram(), but we can easily eliminate that race by using an atomic xchg() in __kcore_update_ram(). This is preparation for converting kclist_lock to an rwsem. Link: http://lkml.kernel.org/r/0a4bc89f4dbde8b5b2ea309f7b4fb6a85fe29df2.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc/kcore: don't grab lock for kclist_add()Omar Sandoval
Patch series "/proc/kcore improvements", v4. This series makes a few improvements to /proc/kcore. It fixes a couple of small issues in v3 but is otherwise the same. Patches 1, 2, and 3 are prep patches. Patch 4 is a fix/cleanup. Patch 5 is another prep patch. Patches 6 and 7 are optimizations to ->read(). Patch 8 makes it possible to enable CRASH_CORE on any architecture, which is needed for patch 9. Patch 9 adds vmcoreinfo to /proc/kcore. This patch (of 9): kclist_add() is only called at init time, so there's no point in grabbing any locks. We're also going to replace the rwlock with a rwsem, which we don't want to try grabbing during early boot. While we're here, mark kclist_add() with __init so that we'll get a warning if it's called from non-init code. Link: http://lkml.kernel.org/r/98208db1faf167aa8b08eebfa968d95c70527739.1531953780.git.osandov@fb.com Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22fs/proc/kcore.c: use __pa_symbol() for KCORE_TEXT list entriesJames Morse
elf_kcore_store_hdr() uses __pa() to find the physical address of KCORE_RAM or KCORE_TEXT entries exported as program headers. This trips CONFIG_DEBUG_VIRTUAL's checks, as the KCORE_TEXT entries are not in the linear map. Handle these two cases separately, using __pa_symbol() for the KCORE_TEXT entries. Link: http://lkml.kernel.org/r/20180711131944.15252-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Omar Sandoval <osandov@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22fs/proc/vmcore.c: use new typedef vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. See 1c8f422059ae ("mm: change return type to vm_fault_t") for reference. Link: http://lkml.kernel.org/r/20180702153325.GA3875@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ganesh Goudar <ganeshgr@chelsio.com> Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: use "unsigned int" in /proc/stat hookAlexey Dobriyan
Number of CPUs is never high enough to force 64-bit arithmetic. Save couple of bytes on x86_64. Link: http://lkml.kernel.org/r/20180627200710.GC18434@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: spread "const" a bitAlexey Dobriyan
Link: http://lkml.kernel.org/r/20180627200614.GB18434@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: use macro in /proc/latency hookAlexey Dobriyan
->latency_record is defined as struct latency_record[LT_SAVECOUNT]; so use the same macro whie iterating. Link: http://lkml.kernel.org/r/20180627200534.GA18434@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: save 2 atomic ops on write to "/proc/*/attr/*"Alexey Dobriyan
Code checks if write is done by current to its own attributes. For that get/put pair is unnecessary as it can be done under RCU. Note: rcu_read_unlock() can be done even earlier since pointer to a task is not dereferenced. It depends if /proc code should look scary or not: rcu_read_lock(); task = pid_task(...); rcu_read_unlock(); if (!task) return -ESRCH; if (task != current) return -EACCESS: P.S.: rename "length" variable. Code like this length = -EINVAL; should not exist. Link: http://lkml.kernel.org/r/20180627200218.GF18113@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: put task earlier in /proc/*/fail-nthAlexey Dobriyan
Link: http://lkml.kernel.org/r/20180627195427.GE18113@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: smaller readlock section in readdir("/proc")Alexey Dobriyan
Readdir context is thread local, so ->pos is thread local, move it out of readlock. Link: http://lkml.kernel.org/r/20180627195339.GD18113@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22fs/proc/uptime.c: use ktime_get_boottime_ts64Arnd Bergmann
get_monotonic_boottime() is deprecated and uses the old timespec type. Let's convert /proc/uptime to use ktime_get_boottime_ts64(). Link: http://lkml.kernel.org/r/20180620081746.282742-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22proc: fixup PDE allocation bloatAlexey Dobriyan
24074a35c5c975 ("proc: Make inline name size calculation automatic") started to put PDE allocations into kmalloc-256 which is unnecessary as ~40 character names are very rare. Put allocation back into kmalloc-192 cache for 64-bit non-debug builds. Put BUILD_BUG_ON to know when PDE size has gotten out of control. [adobriyan@gmail.com: fix BUILD_BUG_ON breakage on powerpc64] Link: http://lkml.kernel.org/r/20180703191602.GA25521@avx2 Link: http://lkml.kernel.org/r/20180617215732.GA24688@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22/proc/meminfo: add percpu populated pages countDennis Zhou (Facebook)
Currently, percpu memory only exposes allocation and utilization information via debugfs. This more or less is only really useful for understanding the fragmentation and allocation information at a per-chunk level with a few global counters. This is also gated behind a config. BPF and cgroup, for example, have seen an increase in use causing increased use of percpu memory. Let's make it easier for someone to identify how much memory is being used. This patch adds the "Percpu" stat to meminfo to more easily look up how much percpu memory is in use. This number includes the cost for all allocated backing pages and not just insight at the per a unit, per chunk level. Metadata is excluded. I think excluding metadata is fair because the backing memory scales with the numbere of cpus and can quickly outweigh the metadata. It also makes this calculation light. Link: http://lkml.kernel.org/r/20180807184723.74919-1-dennisszhou@gmail.com Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22mm: zero out the vma in vma_init()Andrew Morton
Rather than in vm_area_alloc(). To ensure that the various oddball stack-based vmas are in a good state. Some of the callers were zeroing them out, others were not. Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22mm: /proc/pid/smaps_rollup: convert to single value seq_fileVlastimil Babka
The /proc/pid/smaps_rollup file is currently implemented via the m_start/m_next/m_stop seq_file iterators shared with the other maps files, that iterate over vma's. However, the rollup file doesn't print anything for each vma, only accumulate the stats. There are some issues with the current code as reported in [1] - the accumulated stats can get skewed if seq_file start()/stop() op is called multiple times, if show() is called multiple times, and after seeks to non-zero position. Patch [1] fixed those within existing design, but I believe it is fundamentally wrong to expose the vma iterators to the seq_file mechanism when smaps_rollup shows logically a single set of values for the whole address space. This patch thus refactors the code to provide a single "value" at offset 0, with vma iteration to gather the stats done internally. This fixes the situations where results are skewed, and simplifies the code, especially in show_smap(), at the expense of somewhat less code reuse. [1] https://marc.info/?l=linux-mm&m=151927723128134&w=2 [vbabka@suse.c: use seq_file infrastructure] Link: http://lkml.kernel.org/r/bf4525b0-fd5b-4c4c-2cb3-adee3dd95a48@suse.cz Link: http://lkml.kernel.org/r/20180723111933.15443-5-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Daniel Colascione <dancol@google.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22mm: /proc/pid/smaps: factor out common stats printingVlastimil Babka
To prepare for handling /proc/pid/smaps_rollup differently from /proc/pid/smaps factor out from show_smap() printing the parts of output that are common for both variants, which is the bulk of the gathered memory stats. [vbabka@suse.cz: add const, per Alexey] Link: http://lkml.kernel.org/r/b45f319f-cd04-337b-37f8-77f99786aa8a@suse.cz Link: http://lkml.kernel.org/r/20180723111933.15443-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22mm: /proc/pid/smaps: factor out mem stats gatheringVlastimil Babka
To prepare for handling /proc/pid/smaps_rollup differently from /proc/pid/smaps factor out vma mem stats gathering from show_smap() - it will be used by both. Link: http://lkml.kernel.org/r/20180723111933.15443-3-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22mm: /proc/pid/*maps remove is_pid and related wrappersVlastimil Babka
Patch series "cleanups and refactor of /proc/pid/smaps*". The recent regression in /proc/pid/smaps made me look more into the code. Especially the issues with smaps_rollup reported in [1] as explained in Patch 4, which fixes them by refactoring the code. Patches 2 and 3 are preparations for that. Patch 1 is me realizing that there's a lot of boilerplate left from times where we tried (unsuccessfuly) to mark thread stacks in the output. Originally I had also plans to rework the translation from /proc/pid/*maps* file offsets to the internal structures. Now the offset means "vma number", which is not really stable (vma's can come and go between read() calls) and there's an extra caching of last vma's address. My idea was that offsets would be interpreted directly as addresses, which would also allow meaningful seeks (see the ugly seek_to_smaps_entry() in tools/testing/selftests/vm/mlock2.h). However loff_t is (signed) long long so that might be insufficient somewhere for the unsigned long addresses. So the result is fixed issues with skewed /proc/pid/smaps_rollup results, simpler smaps code, and a lot of unused code removed. [1] https://marc.info/?l=linux-mm&m=151927723128134&w=2 This patch (of 4): Commit b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps") introduced differences between /proc/PID/maps and /proc/PID/task/TID/maps to mark thread stacks properly, and this was also done for smaps and numa_maps. However it didn't work properly and was ultimately removed by commit b18cb64ead40 ("fs/proc: Stop trying to report thread stacks"). Now the is_pid parameter for the related show_*() functions is unused and we can remove it together with wrapper functions and ops structures that differ for PID and TID cases only in this parameter. Link: http://lkml.kernel.org/r/20180723111933.15443-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22autofs: fix autofs_sbi() does not check super block typeIan Kent
autofs_sbi() does not check the superblock magic number to verify it has been given an autofs super block. Link: http://lkml.kernel.org/r/153475422934.17131.7563724552005298277.stgit@pluto.themaw.net Reported-by: <syzbot+87c3c541582e56943277@syzkaller.appspotmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-20Merge tag 'trace-v4.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockdep and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix SPDX format headers to use C++ style comments tracing: Add SPDX License format tags to tracing files tracing: Add SPDX License format to bpf_trace.c blktrace: Add SPDX License format header s390/ftrace: Add -mfentry and -mnop-mcount support tracing: Add -mcount-nop option support tracing: Avoid calling cc-option -mrecord-mcount for every Makefile tracing: Handle CC_FLAGS_FTRACE more accurately Uprobe: Additional argument arch_uprobe to uprobe_write_opcode() Uprobes: Simplify uprobe_register() body tracepoints: Free early tracepoints after RCU is initialized uprobes: Use synchronize_rcu() not synchronize_sched() tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister() ftrace: Remove unused pointer ftrace_swapper_pid tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing/irqsoff: Handle preempt_count for different configs tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing: irqsoff: Account for additional preempt_disable trace: Use rcu_dereference_raw for hooks from trace-event subsystem tracing/kprobes: Fix within_notrace_func() to check only notrace functions ...
2018-08-20Merge tag 'ceph-for-4.19-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "The main things are support for cephx v2 authentication protocol and basic support for rbd images within namespaces (myself). Also included are y2038 conversion patches from Arnd, a pile of miscellaneous fixes from Chengguang and Zheng's feature bit infrastructure for the filesystem" * tag 'ceph-for-4.19-rc1' of git://github.com/ceph/ceph-client: (40 commits) ceph: don't drop message if it contains more data than expected ceph: support cephfs' own feature bits crush: fix using plain integer as NULL warning libceph: remove unnecessary non NULL check for request_key ceph: refactor error handling code in ceph_reserve_caps() ceph: refactor ceph_unreserve_caps() ceph: change to void return type for __do_request() ceph: compare fsc->max_file_size and inode->i_size for max file size limit ceph: add additional size check in ceph_setattr() ceph: add additional offset check in ceph_write_iter() ceph: add additional range check in ceph_fallocate() ceph: add new field max_file_size in ceph_fs_client libceph: weaken sizeof check in ceph_x_verify_authorizer_reply() libceph: check authorizer reply/challenge length before reading libceph: implement CEPHX_V2 calculation mode libceph: add authorizer challenge libceph: factor out encrypt_authorizer() libceph: factor out __ceph_x_decrypt() libceph: factor out __prepare_write_connect() libceph: store ceph_auth_handshake pointer in ceph_connection ...
2018-08-18Merge tag 'driver-core-4.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here are all of the driver core and related patches for 4.19-rc1. Nothing huge here, just a number of small cleanups and the ability to now stop the deferred probing after init happens. All of these have been in linux-next for a while with only a merge issue reported" * tag 'driver-core-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (21 commits) base: core: Remove WARN_ON from link dependencies check drivers/base: stop new probing during shutdown drivers: core: Remove glue dirs from sysfs earlier driver core: remove unnecessary function extern declare sysfs.h: fix non-kernel-doc comment PM / Domains: Stop deferring probe at the end of initcall iommu: Remove IOMMU_OF_DECLARE iommu: Stop deferring probe at end of initcalls pinctrl: Support stopping deferred probe after initcalls dt-bindings: pinctrl: add a 'pinctrl-use-default' property driver core: allow stopping deferred probe after init driver core: add a debugfs entry to show deferred devices sysfs: Fix internal_create_group() for named group updates base: fix order of OF initialization linux/device.h: fix kernel-doc notation warning Documentation: update firmware loader fallback reference kobject: Replace strncpy with memcpy drivers: base: cacheinfo: use OF property_read_u32 instead of get_property,read_number kernfs: Replace strncpy with memcpy device: Add #define dev_fmt similar to #define pr_fmt ...
2018-08-17Merge tag '9p-for-4.19-2' of git://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: "This contains mostly fixes (6 to be backported to stable) and a few changes, here is the breakdown: - rework how fids are attributed by replacing some custom tracking in a list by an idr - for packet-based transports (virtio/rdma) validate that the packet length matches what the header says - a few race condition fixes found by syzkaller - missing argument check when NULL device is passed in sys_mount - a few virtio fixes - some spelling and style fixes" * tag '9p-for-4.19-2' of git://github.com/martinetd/linux: (21 commits) net/9p/trans_virtio.c: add null terminal for mount tag 9p/virtio: fix off-by-one error in sg list bounds check 9p: fix whitespace issues 9p: fix multiple NULL-pointer-dereferences fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed 9p: validate PDU length net/9p/trans_fd.c: fix race by holding the lock net/9p/trans_fd.c: fix race-condition by flushing workqueue before the kfree() net/9p/virtio: Fix hard lockup in req_done net/9p/trans_virtio.c: fix some spell mistakes in comments 9p/net: Fix zero-copy path in the 9p virtio transport 9p: Embed wait_queue_head into p9_req_t 9p: Replace the fidlist with an IDR 9p: Change p9_fid_create calling convention 9p: Fix comment on smp_wmb net/9p/client.c: version pointer uninitialized fs/9p/v9fs.c: fix spelling mistake "Uknown" -> "Unknown" net/9p: fix error path of p9_virtio_probe 9p/net/protocol.c: return -ENOMEM when kmalloc() failed net/9p/client.c: add missing '\n' at the end of p9_debug() ...
2018-08-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: - a few misc things - a few Y2038 fixes - ntfs fixes - arch/sh tweaks - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits) mm/hmm.c: remove unused variables align_start and align_end fs/userfaultfd.c: remove redundant pointer uwq mm, vmacache: hash addresses based on pmd mm/list_lru: introduce list_lru_shrink_walk_irq() mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one() mm/list_lru.c: move locking from __list_lru_walk_one() to its caller mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node() mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP mm/sparse: delete old sparse_init and enable new one mm/sparse: add new sparse_init_nid() and sparse_init() mm/sparse: move buffer init/fini to the common place mm/sparse: use the new sparse buffer functions in non-vmemmap mm/sparse: abstract sparse buffer allocations mm/hugetlb.c: don't zero 1GiB bootmem pages mm, page_alloc: double zone's batchsize mm/oom_kill.c: document oom_lock mm/hugetlb: remove gigantic page support for HIGHMEM mm, oom: remove sleep from under oom_lock kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous() mm/cma: remove unsupported gfp_mask parameter from cma_alloc() ...
2018-08-17fs/userfaultfd.c: remove redundant pointer uwqColin Ian King
Pointer uwq is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'uwq' set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/20180717090802.18357-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17mm: add SHRINK_EMPTY shrinker methods return valueKirill Tkhai
We need to distinguish the situations when shrinker has very small amount of objects (see vfs_pressure_ratio() called from super_cache_count()), and when it has no objects at all. Currently, in the both of these cases, shrinker::count_objects() returns 0. The patch introduces new SHRINK_EMPTY return value, which will be used for "no objects at all" case. It's is a refactoring mostly, as SHRINK_EMPTY is replaced by 0 by all callers of do_shrink_slab() in this patch, and all the magic will happen in further. Link: http://lkml.kernel.org/r/153063069574.1818.11037751256699341813.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Tested-by: Shakeel Butt <shakeelb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Li RongQing <lirongqing@baidu.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Roman Gushchin <guro@fb.com> Cc: Sahitya Tummala <stummala@codeaurora.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs: propagate shrinker::id to list_lruKirill Tkhai
Add list_lru::shrinker_id field and populate it by registered shrinker id. This will be used to set correct bit in memcg shrinkers map by lru code in next patches, after there appeared the first related to memcg element in list_lru. Link: http://lkml.kernel.org/r/153063059758.1818.14866596416857717800.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Tested-by: Shakeel Butt <shakeelb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Li RongQing <lirongqing@baidu.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Roman Gushchin <guro@fb.com> Cc: Sahitya Tummala <stummala@codeaurora.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs/super.c: refactor alloc_super()Kirill Tkhai
Do two list_lru_init_memcg() calls after prealloc_super(). destroy_unused_super() in fail path is OK with this. Next patch needs such the order. Link: http://lkml.kernel.org/r/153063058712.1818.3382490999719078571.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Tested-by: Shakeel Butt <shakeelb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <jbacik@fb.com> Cc: Li RongQing <lirongqing@baidu.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Roman Gushchin <guro@fb.com> Cc: Sahitya Tummala <stummala@codeaurora.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs, mm: account buffer_head to kmemcgShakeel Butt
The buffer_head can consume a significant amount of system memory and is directly related to the amount of page cache. In our production environment we have observed that a lot of machines are spending a significant amount of memory as buffer_head and can not be left as system memory overhead. Charging buffer_head is not as simple as adding __GFP_ACCOUNT to the allocation. The buffer_heads can be allocated in a memcg different from the memcg of the page for which buffer_heads are being allocated. One concrete example is memory reclaim. The reclaim can trigger I/O of pages of any memcg on the system. So, the right way to charge buffer_head is to extract the memcg from the page for which buffer_heads are being allocated and then use targeted memcg charging API. [shakeelb@google.com: use __GFP_ACCOUNT for directed memcg charging] Link: http://lkml.kernel.org/r/20180702220208.213380-1-shakeelb@google.com Link: http://lkml.kernel.org/r/20180627191250.209150-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Greg Thelen <gthelen@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs: fsnotify: account fsnotify metadata to kmemcgShakeel Butt
Patch series "Directed kmem charging", v8. The Linux kernel's memory cgroup allows limiting the memory usage of the jobs running on the system to provide isolation between the jobs. All the kernel memory allocated in the context of the job and marked with __GFP_ACCOUNT will also be included in the memory usage and be limited by the job's limit. The kernel memory can only be charged to the memcg of the process in whose context kernel memory was allocated. However there are cases where the allocated kernel memory should be charged to the memcg different from the current processes's memcg. This patch series contains two such concrete use-cases i.e. fsnotify and buffer_head. The fsnotify event objects can consume a lot of system memory for large or unlimited queues if there is either no or slow listener. The events are allocated in the context of the event producer. However they should be charged to the event consumer. Similarly the buffer_head objects can be allocated in a memcg different from the memcg of the page for which buffer_head objects are being allocated. To solve this issue, this patch series introduces mechanism to charge kernel memory to a given memcg. In case of fsnotify events, the memcg of the consumer can be used for charging and for buffer_head, the memcg of the page can be charged. For directed charging, the caller can use the scope API memalloc_[un]use_memcg() to specify the memcg to charge for all the __GFP_ACCOUNT allocations within the scope. This patch (of 2): A lot of memory can be consumed by the events generated for the huge or unlimited queues if there is either no or slow listener. This can cause system level memory pressure or OOMs. So, it's better to account the fsnotify kmem caches to the memcg of the listener. However the listener can be in a different memcg than the memcg of the producer and these allocations happen in the context of the event producer. This patch introduces remote memcg charging API which the producer can use to charge the allocations to the memcg of the listener. There are seven fsnotify kmem caches and among them allocations from dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and inotify_inode_mark_cachep happens in the context of syscall from the listener. So, SLAB_ACCOUNT is enough for these caches. The objects from fsnotify_mark_connector_cachep are not accounted as they are small compared to the notification mark or events and it is unclear whom to account connector to since it is shared by all events attached to the inode. The allocations from the event caches happen in the context of the event producer. For such caches we will need to remote charge the allocations to the listener's memcg. Thus we save the memcg reference in the fsnotify_group structure of the listener. This patch has also moved the members of fsnotify_group to keep the size same, at least for 64 bit build, even with additional member by filling the holes. [shakeelb@google.com: use GFP_KERNEL_ACCOUNT rather than open-coding it] Link: http://lkml.kernel.org/r/20180702215439.211597-1-shakeelb@google.com Link: http://lkml.kernel.org/r/20180627191250.209150-2-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Greg Thelen <gthelen@google.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17ext4: readpages() should submit IO as read-aheadJens Axboe
a_ops->readpages() is only ever used for read-ahead. Ensure that we pass this information down to the block layer. Link: http://lkml.kernel.org/r/20180621010725.17813-5-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17btrfs: readpages() should submit IO as read-aheadJens Axboe
a_ops->readpages() is only ever used for read-ahead. Ensure that we pass this information down to the block layer. Link: http://lkml.kernel.org/r/20180621010725.17813-4-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17mpage: mpage_readpages() should submit IO as read-aheadJens Axboe
a_ops->readpages() is only ever used for read-ahead, yet we don't flag the IO being submitted as such. Fix that up. Any file system that uses mpage_readpages() as its ->readpages() implementation will now get this right. Since we're passing in whether the IO is read-ahead or not, we don't need to pass in the 'gfp' separately, as it is dependent on the IO being read-ahead. Kill off that member. Add some documentation notes on ->readpages() being purely for read-ahead. Link: http://lkml.kernel.org/r/20180621010725.17813-3-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17mpage: add argument structure for do_mpage_readpage()Jens Axboe
Patch series "Submit ->readpages() IO as read-ahead", v4. The only caller of ->readpages() is from read-ahead, yet we don't submit IO flagged with REQ_RAHEAD. This means we don't see it in blktrace, for instance, which is a shame. Additionally, it's preventing further functional changes in the block layer for deadling with read-ahead more intelligently. We already make assumptions about ->readpages() just being for read-ahead in the mpage implementation, using readahead_gfp_mask(mapping) as out GFP mask of choice. This small series fixes up mpage_readpages() to submit with REQ_RAHEAD, which takes care of file systems using mpage_readpages(). The first patch is a prep patch, that makes do_mpage_readpage() take an argument structure. This patch (of 4): We're currently passing 8 arguments to this function, clean it up a bit by packing the arguments in an args structure we pass to it. No intentional functional changes in this patch. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20180621010725.17813-2-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Chris Mason <clm@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs/seq_file.c: simplify seq_file iteration code and interfaceNeilBrown
The documentation for seq_file suggests that it is necessary to be able to move the iterator to a given offset, however that is not the case. If the iterator is stored in the private data and is stable from one read() syscall to the next, it is only necessary to support first/next interactions. Implementing this in a client is a little clumsy. - if ->start() is given a pos of zero, it should go to start of sequence. - if ->start() is given the name pos that was given to the most recent next() or start(), it should restore the iterator to state just before that last call - if ->start is given another number, it should set the iterator one beyond the start just before the last ->start or ->next call. Also, the documentation says that the implementation can interpret the pos however it likes (other than zero meaning start), but seq_file increments the pos sometimes which does impose on the implementation. This patch simplifies the interface for first/next iteration and simplifies the code, while maintaining complete backward compatability. Now: - if ->start() is given a pos of zero, it should return an iterator placed at the start of the sequence - if ->start() is given a non-zero pos, it should return the iterator in the same state it was after the last ->start or ->next. This is particularly useful for interators which walk the multiple chains in a hash table, e.g. using rhashtable_walk*. See fs/gfs2/glock.c and drivers/staging/lustre/lustre/llite/vvp_dev.c A large part of achieving this is to *always* call ->next after ->show has successfully stored all of an entry in the buffer. Never just increment the index instead. Also: - always pass &m->index to ->start() and ->next(), never a temp variable - don't clear ->from when ->count is zero, as ->from is dead when ->count is zero. Some ->next functions do not increment *pos when they return NULL. To maintain compatability with this, we still need to increment m->index in one place, if ->next didn't increment it. Note that such ->next functions are buggy and should be fixed. A simple demonstration is dd if=/proc/swaps bs=1000 skip=1 Choose any block size larger than the size of /proc/swaps. This will always show the whole last line of /proc/swaps. This patch doesn't work around buggy next() functions for this case. [neilb@suse.com: ensure ->from is valid] Link: http://lkml.kernel.org/r/87601ryb8a.fsf@notabene.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Jonathan Corbet <corbet@lwn.net> [docs] Tested-by: Jann Horn <jannh@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17vfs: discard ATTR_ATTR_FLAGNeilBrown
This flag was introduce in 2.1.37pre1 and the only place it was tested was removed in 2.1.43pre1. The flag was never set. Let's discard it properly. Link: http://lkml.kernel.org/r/877en0hewz.fsf@notabene.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot()Tetsuo Handa
Since only dentry->d_name.len + 1 bytes out of DNAME_INLINE_LEN bytes are initialized at __d_alloc(), we can't copy the whole size unconditionally. WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff8fa27465ac50) 636f6e66696766732e746d70000000000010000000000000020000000188ffff i i i i i i i i i i i i i u u u u u u u u u u i i i i i u u u u ^ RIP: 0010:take_dentry_name_snapshot+0x28/0x50 RSP: 0018:ffffa83000f5bdf8 EFLAGS: 00010246 RAX: 0000000000000020 RBX: ffff8fa274b20550 RCX: 0000000000000002 RDX: ffffa83000f5be40 RSI: ffff8fa27465ac50 RDI: ffffa83000f5be60 RBP: ffffa83000f5bdf8 R08: ffffa83000f5be48 R09: 0000000000000001 R10: ffff8fa27465ac00 R11: ffff8fa27465acc0 R12: ffff8fa27465ac00 R13: ffff8fa27465acc0 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f79737ac8c0(0000) GS:ffffffff8fc30000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8fa274c0b000 CR3: 0000000134aa7002 CR4: 00000000000606f0 take_dentry_name_snapshot+0x28/0x50 vfs_rename+0x128/0x870 SyS_rename+0x3b2/0x3d0 entry_SYSCALL_64_fastpath+0x1a/0xa4 0xffffffffffffffff Link: http://lkml.kernel.org/r/201709131912.GBG39012.QMJLOVFSFFOOtH@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-l