summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
AgeCommit message (Collapse)Author
2020-12-24Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Various bug fixes and cleanups for ext4; no new features this cycle" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (29 commits) ext4: remove unnecessary wbc parameter from ext4_bio_write_page ext4: avoid s_mb_prefetch to be zero in individual scenarios ext4: defer saving error info from atomic context ext4: simplify ext4 error translation ext4: move functions in super.c ext4: make ext4_abort() use __ext4_error() ext4: standardize error message in ext4_protect_reserved_inode() ext4: remove redundant sb checksum recomputation ext4: don't remount read-only with errors=continue on reboot ext4: fix deadlock with fs freezing and EA inodes jbd2: add a helper to find out number of fast commit blocks ext4: make fast_commit.h byte identical with e2fsprogs/fast_commit.h ext4: fix fall-through warnings for Clang ext4: add docs about fast commit idempotence ext4: remove the unused EXT4_CURRENT_REV macro ext4: fix an IS_ERR() vs NULL check ext4: check for invalid block size early when mounting a file system ext4: fix a memory leak of ext4_free_data ext4: delete nonsensical (commented-out) code inside ext4_xattr_block_set() ext4: update ext4_data_block_valid related comments ...
2020-12-20Merge tag 'gfs2-for-5.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Don't wait for unfreeze of the wrong filesystems - Remove an obsolete delete_work_func hack and an incorrect sb_start_write - Minor documentation updates and cosmetic care * tag 'gfs2-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: in signal_our_withdraw wait for unfreeze of _this_ fs only gfs2: Remove sb_start_write from gfs2_statfs_sync gfs2: remove trailing semicolons from macro definitions Revert "GFS2: Prevent delete work from occurring on glocks used for create" gfs2: Make inode operations static MAINTAINERS: Add gfs2 bug tracker link Documentation: Update filesystems/gfs2.rst
2020-12-17Merge tag 'ovl-update-5.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: - Allow unprivileged mounting in a user namespace. For quite some time the security model of overlayfs has been that operations on underlying layers shall be performed with the privileges of the mounting task. This way an unprvileged user cannot gain privileges by the act of mounting an overlayfs instance. A full audit of all function calls made by the overlayfs code has been performed to see whether they conform to this model, and this branch contains some fixes in this regard. - Support running on copied filesystem images by optionally disabling UUID verification. - Bug fixes as well as documentation updates. * tag 'ovl-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: unprivieged mounts ovl: do not get metacopy for userxattr ovl: do not fail because of O_NOATIME ovl: do not fail when setting origin xattr ovl: user xattr ovl: simplify file splice ovl: make ioctl() safe ovl: check privs before decoding file handle vfs: verify source area in vfs_dedupe_file_range_one() vfs: move cap_convert_nscap() call into vfs_setxattr() ovl: fix incorrect extent info in metacopy case ovl: expand warning in ovl_d_real() ovl: document lower modification caveats ovl: warn about orphan metacopy ovl: doc clarification ovl: introduce new "uuid=off" option for inodes index feature ovl: propagate ovl_fs to ovl_decode_real_fh and ovl_encode_real_fh
2020-12-17Merge tag 'f2fs-for-5.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've made more work into per-file compression support. For example, F2FS_IOC_GET | SET_COMPRESS_OPTION provides a way to change the algorithm or cluster size per file. F2FS_IOC_COMPRESS | DECOMPRESS_FILE provides a way to compress and decompress the existing normal files manually. There is also a new mount option, compress_mode=fs|user, which can control who compresses the data. Chao also added a checksum feature with a mount option so that we are able to detect any corrupted cluster. In addition, Daniel contributed casefolding with encryption patch, which will be used for Android devices. Summary: Enhancements: - add ioctls and mount option to manage per-file compression feature - support casefolding with encryption - support checksum for compressed cluster - avoid IO starvation by replacing mutex with rwsem - add sysfs, max_io_bytes, to control max bio size Bug fixes: - fix use-after-free issue when compression and fsverity are enabled - fix consistency corruption during fault injection test - fix data offset for lseek - get rid of buffer_head which has 32bits limit in fiemap - fix some bugs in multi-partitions support - fix nat entry count calculation in shrinker - fix some stat information And, we've refactored some logics and fix minor bugs as well" * tag 'f2fs-for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits) f2fs: compress: fix compression chksum f2fs: fix shift-out-of-bounds in sanity_check_raw_super() f2fs: fix race of pending_pages in decompression f2fs: fix to account inline xattr correctly during recovery f2fs: inline: fix wrong inline inode stat f2fs: inline: correct comment in f2fs_recover_inline_data f2fs: don't check PAGE_SIZE again in sanity_check_raw_super() f2fs: convert to F2FS_*_INO macro f2fs: introduce max_io_bytes, a sysfs entry, to limit bio size f2fs: don't allow any writes on readonly mount f2fs: avoid race condition for shrinker count f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE f2fs: add compress_mode mount option f2fs: Remove unnecessary unlikely() f2fs: init dirty_secmap incorrectly f2fs: remove buffer_head which has 32bits limit f2fs: fix wrong block count instead of bytes f2fs: use new conversion functions between blks and bytes f2fs: rename logical_to_blk and blk_to_logical f2fs: fix kbytes written stat for multi-device case ...
2020-12-17Merge tag 'for_v5.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, reiserfs, quota and writeback updates from Jan Kara: - a couple of quota fixes (mostly for problems found by syzbot) - several ext2 cleanups - one fix for reiserfs crash on corrupted image - a fix for spurious warning in writeback code * tag 'for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: don't warn on an unregistered BDI in __mark_inode_dirty fs: quota: fix array-index-out-of-bounds bug by passing correct argument to vfs_cleanup_quota_inode() reiserfs: add check for an invalid ih_entry_count ext2: Fix fall-through warnings for Clang fs/ext2: Use ext2_put_page docs: filesystems: Reduce ext2.rst to one top-level heading quota: Sanity-check quota file headers on load quota: Don't overflow quota file offsets ext2: Remove unnecessary blank fs/quota: update quota state flags scheme with project quota flags
2020-12-17ext4: add docs about fast commit idempotenceHarshad Shirwadkar
Fast commit on-disk format is designed such that the replay of these tags can be idempotent. This patch adds documentation in the code in form of comments and in form kernel docs that describes these characteristics. This patch also adds a TODO item needed to ensure kernel fast commit replay idempotence. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201119232822.1860882-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-12-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge yet more updates from Andrew Morton: - lots of little subsystems - a few post-linux-next MM material. Most of the rest awaits more merging of other trees. Subsystems affected by this series: alpha, procfs, misc, core-kernel, bitmap, lib, lz4, checkpatch, nilfs, kdump, rapidio, gcov, bfs, relay, resource, ubsan, reboot, fault-injection, lzo, apparmor, and mm (swap, memory-hotplug, pagemap, cleanups, and gup). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (86 commits) mm: fix some spelling mistakes in comments mm: simplify follow_pte{,pmd} mm: unexport follow_pte_pmd apparmor: remove duplicate macro list_entry_is_head() lib/lzo/lzo1x_compress.c: make lzogeneric1x_1_compress() static fault-injection: handle EI_ETYPE_TRUE reboot: hide from sysfs not applicable settings reboot: allow to override reboot type if quirks are found reboot: remove cf9_safe from allowed types and rename cf9_force reboot: allow to specify reboot mode via sysfs reboot: refactor and comment the cpu selection code lib/ubsan.c: mark type_check_kinds with static keyword kcov: don't instrument with UBSAN ubsan: expand tests and reporting ubsan: remove UBSAN_MISC in favor of individual options ubsan: enable for all*config builds ubsan: disable UBSAN_TRAP for all*config ubsan: disable object-size sanitizer under GCC ubsan: move cc-option tests into Kconfig ubsan: remove redundant -Wno-maybe-uninitialized ...
2020-12-15proc: provide details on indirect branch speculationAnand K Mistry
Similar to speculation store bypass, show information about the indirect branch speculation mode of a task in /proc/$pid/status. For testing/benchmarking, I needed to see whether IB (Indirect Branch) speculation (see Spectre-v2) is enabled on a task, to see whether an IBPB instruction should be executed on an address space switch. Unfortunately, this information isn't available anywhere else and currently the only way to get it is to hack the kernel to expose it (like this change). It also helped expose a bug with conditional IB speculation on certain CPUs. Another place this could be useful is to audit the system when using sanboxing. With this change, I can confirm that seccomp-enabled process have IB speculation force disabled as expected when the kernel command line parameter `spectre_v2_user=seccomp`. Since there's already a 'Speculation_Store_Bypass' field, I used that as precedent for adding this one. [amistry@google.com: remove underscores from field name to workaround documentation issue] Link: https://lkml.kernel.org/r/20201106131015.v2.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Link: https://lkml.kernel.org/r/20201030172731.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Signed-off-by: Anand K Mistry <amistry@google.com> Cc: Anthony Steinhauser <asteinhauser@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Anand K Mistry <amistry@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: NeilBrown <neilb@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15Merge branch 'exec-for-v5.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve updates from Eric Biederman: "This set of changes ultimately fixes the interaction of posix file lock and exec. Fundamentally most of the change is just moving where unshare_files is called during exec, and tweaking the users of files_struct so that the count of files_struct is not unnecessarily played with. Along the way fcheck and related helpers were renamed to more accurately reflect what they do. There were also many other small changes that fell out, as this is the first time in a long time much of this code has been touched. Benchmarks haven't turned up any practical issues but Al Viro has observed a possibility for a lot of pounding on task_lock. So I have some changes in progress to convert put_files_struct to always rcu free files_struct. That wasn't ready for the merge window so that will have to wait until next time" * 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits) exec: Move io_uring_task_cancel after the point of no return coredump: Document coredump code exclusively used by cell spufs file: Remove get_files_struct file: Rename __close_fd_get_file close_fd_get_file file: Replace ksys_close with close_fd file: Rename __close_fd to close_fd and remove the files parameter file: Merge __alloc_fd into alloc_fd file: In f_dupfd read RLIMIT_NOFILE once. file: Merge __fd_install into fd_install proc/fd: In fdinfo seq_show don't use get_files_struct bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu file: Implement task_lookup_next_fd_rcu kcmp: In get_file_raw_ptr use task_lookup_fd_rcu proc/fd: In tid_fd_mode use task_lookup_fd_rcu file: Implement task_lookup_fd_rcu file: Rename fcheck lookup_fd_rcu file: Replace fcheck_files with files_lookup_fd_rcu file: Factor files_lookup_fd_locked out of fcheck_files file: Rename __fcheck_files to files_lookup_fd_raw ...
2020-12-15Merge tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds
Pull nfsd updates from Chuck Lever: "Several substantial changes this time around: - Previously, exporting an NFS mount via NFSD was considered to be an unsupported feature. With v5.11, the community has attempted to make re-exporting a first-class feature of NFSD. This would enable the Linux in-kernel NFS server to be used as an intermediate cache for a remotely-located primary NFS server, for example, even with other NFS server implementations, like a NetApp filer, as the primary. - A short series of patches brings support for multiple RPC/RDMA data chunks per RPC transaction to the Linux NFS server's RPC/RDMA transport implementation. This is a part of the RPC/RDMA spec that the other premiere NFS/RDMA implementation (Solaris) has had for a very long time, and completes the implementation of RPC/RDMA version 1 in the Linux kernel's NFS server. - Long ago, NFSv4 support was introduced to NFSD using a series of C macros that hid dprintk's and goto's. Over time, the kernel's XDR implementation has been greatly improved, but these C macros have remained and become fallow. A series of patches in this pull request completely replaces those macros with the use of current kernel XDR infrastructure. Benefits include: - More robust input sanitization in NFSD's NFSv4 XDR decoders. - Make it easier to use common kernel library functions that use XDR stream APIs (for example, GSS-API). - Align the structure of the source code with the RFCs so it is easier to learn, verify, and maintain our XDR implementation. - Removal of more than a hundred hidden dprintk() call sites. - Removal of some explicit manipulation of pages to help make the eventual transition to xdr->bvec smoother. - On top of several related fixes in 5.10-rc, there are a few more fixes to get the Linux NFSD implementation of NFSv4.2 inter-server copy up to speed. And as usual, there is a pinch of seasoning in the form of a collection of unrelated minor bug fixes and clean-ups. Many thanks to all who contributed this time around!" * tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6: (131 commits) nfsd: Record NFSv4 pre/post-op attributes as non-atomic nfsd: Set PF_LOCAL_THROTTLE on local filesystems only nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE exportfs: Add a function to return the raw output from fh_to_dentry() nfsd: close cached files prior to a REMOVE or RENAME that would replace target nfsd: allow filesystems to opt out of subtree checking nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations Revert "nfsd4: support change_attr_type attribute" nfsd4: don't query change attribute in v2/v3 case nfsd: minor nfsd4_change_attribute cleanup nfsd: simplify nfsd4_change_info nfsd: only call inode_query_iversion in the I_VERSION case nfs_common: need lock during iterate through the list NFSD: Fix 5 seconds delay when doing inter server copy NFSD: Fix sparse warning in nfs4proc.c SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall sunrpc: clean-up cache downcall nfsd: Fix message level for normal termination NFSD: Remove macros that are no longer used NFSD: Replace READ* macros in nfsd4_decode_compound() ...
2020-12-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits) mm: cleanup kstrto*() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject * uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ...
2020-12-15tmpfs: fix Documentation nitsRandy Dunlap
Fix a typo, punctuation, use uppercase for CPUs, and limit tmpfs to keeping only its files in virtual memory (phrasing). Link: https://lkml.kernel.org/r/20201202010934.18566-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: Chris Down <chris@chrisdown.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-14Merge tag 'docs-5.11' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "A much quieter cycle for documentation (happily), with, one hopes, the bulk of the churn behind us. Significant stuff in this pull includes: - A set of new Chinese translations - Italian translation updates - A mechanism from Mauro to automatically format Documentation/features for the built docs - Automatic cross references without explicit :ref: markup - A new reset-controller document - An extensive new document on reporting problems from Thorsten That last patch also adds the CC-BY-4.0 license to LICENSES/dual; there was some discussion on this, but we seem to have consensus and an ack from Greg for that addition" * tag 'docs-5.11' of git://git.lwn.net/linux: (50 commits) docs: fix broken cross reference in translations/zh_CN docs: Note that sphinx 1.7 will be required soon docs: update requirements to install six module docs: reporting-issues: move 'outdated, need help' note to proper place docs: Update documentation to reflect what TAINT_CPU_OUT_OF_SPEC means docs: add a reset controller chapter to the driver API docs docs: make reporting-bugs.rst obsolete docs: Add a new text describing how to report bugs LICENSES: Add the CC-BY-4.0 license Documentation: fix multiple typos found in the admin-guide subdirectory Documentation: fix typos found in admin-guide subdirectory kernel-doc: Fix example in Nested structs/unions docs: clean up sysctl/kernel: titles, version docs: trace: fix event state structure name docs: nios2: add missing ReST file scripts: get_feat.pl: reduce table width for all features output scripts: get_feat.pl: change the group by order scripts: get_feat.pl: make complete table more coincise scripts: kernel-doc: fix parsing function-like typedefs Documentation: fix typos found in process, dev-tools, and doc-guide subdirectories ...
2020-12-14ovl: user xattrMiklos Szeredi
Optionally allow using "user.overlay." namespace instead of "trusted.overlay." This is necessary for overlayfs to be able to be mounted in an unprivileged namepsace. Make the option explicit, since it makes the filesystem format be incompatible. Disable redirect_dir and metacopy options, because these would allow privilege escalation through direct manipulation of the "user.overlay.redirect" or "user.overlay.metacopy" xattrs. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
2020-12-10file: Rename fcheck lookup_fd_rcuEric W. Biederman
Also remove the confusing comment about checking if a fd exists. I could not find one instance in the entire kernel that still matches the description or the reason for the name fcheck. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-10file: Replace fcheck_files with files_lookup_fd_rcuEric W. Biederman
This change renames fcheck_files to files_lookup_fd_rcu. All of the remaining callers take the rcu_read_lock before calling this function so the _rcu suffix is appropriate. This change also tightens up the debug check to verify that all callers hold the rcu_read_lock. All callers that used to call files_check with the files->file_lock held have now been changed to call files_lookup_fd_locked. This change of name has helped remind me of which locks and which guarantees are in place helping me to catch bugs later in the patchset. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Link: https://lkml.kernel.org/r/20201120231441.29911-9-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-09nfsd: close cached files prior to a REMOVE or RENAME that would replace targetJeff Layton
It's not uncommon for some workloads to do a bunch of I/O to a file and delete it just afterward. If knfsd has a cached open file however, then the file may still be open when the dentry is unlinked. If the underlying filesystem is nfs, then that could trigger it to do a sillyrename. On a REMOVE or RENAME scan the nfsd_file cache for open files that correspond to the inode, and proactively unhash and put their references. This should prevent any delete-on-last-close activity from occurring, solely due to knfsd's open file cache. This must be done synchronously though so we use the variants that call flush_delayed_fput. There are deadlock possibilities if you call flush_delayed_fput while holding locks, however. In the case of nfsd_rename, we don't even do the lookups of the dentries to be renamed until we've locked for rename. Once we've figured out what the target dentry is for a rename, check to see whether there are cached open files associated with it. If there are, then unwind all of the locking, close them all, and then reattempt the rename. None of this is really necessary for "typical" filesystems though. It's mostly of use for NFS, so declare a new export op flag and use that to determine whether to close the files beforehand. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09nfsd: allow filesystems to opt out of subtree checkingJeff Layton
When we start allowing NFS to be reexported, then we have some problems when it comes to subtree checking. In principle, we could allow it, but it would mean encoding parent info in the filehandles and there may not be enough space for that in a NFSv3 filehandle. To enforce this at export upcall time, we add a new export_ops flag that declares the filesystem ineligible for subtree checking. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-09nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operationsJeff Layton
With NFSv3 nfsd will always attempt to send along WCC data to the client. This generally involves saving off the in-core inode information prior to doing the operation on the given filehandle, and then issuing a vfs_getattr to it after the op. Some filesystems (particularly clustered or networked ones) have an expensive ->getattr inode operation. Atomicity is also often difficult or impossible to guarantee on such filesystems. For those, we're best off not trying to provide WCC information to the client at all, and to simply allow it to poll for that information as needed with a GETATTR RPC. This patch adds a new flags field to struct export_operations, and defines a new EXPORT_OP_NOWCC flag that filesystems can use to indicate that nfsd should not attempt to provide WCC info in NFSv3 replies. It also adds a blurb about the new flags field and flag to the exporting documentation. The server will also now skip collecting this information for NFSv2 as well, since that info is never used there anyway. Note that this patch does not add this flag to any filesystem export_operations structures. This was originally developed to allow reexporting nfs via nfsd. Other filesystems may want to consider enabling this flag too. It's hard to tell however which ones have export operations to enable export via knfsd and which ones mostly rely on them for open-by-filehandle support, so I'm leaving that up to the individual maintainers to decide. I am cc'ing the relevant lists for those filesystems that I think may want to consider adding this though. Cc: HPDD-discuss@lists.01.org Cc: ceph-devel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: fuse-devel@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-03Documentation: mount_api: change kernel log wordingRandy Dunlap
Change wording to say that messages are logged to the kernel log buffer instead of to dmesg. dmesg is just one program that can print the kernel log buffer. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20201202012409.19194-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-12-03f2fs: add compress_mode mount optionDaeho Jeong
We will add a new "compress_mode" mount option to control file compression mode. This supports "fs" and "user". In "fs" mode (default), f2fs does automatic compression on the compression enabled files. In "user" mode, f2fs disables the automaic compression and gives the user discretion of choosing the target file and the timing. It means the user can do manual compression/decompression on the compression enabled files using ioctls. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-02f2fs: compress: support chksumChao Yu
This patch supports to store chksum value with compressed data, and verify the integrality of compressed data while reading the data. The feature can be enabled through specifying mount option 'compress_chksum'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-01Documentation: document /proc api for arm64 MTE vm flagsSzabolcs Nagy
Document that /proc/PID/smaps shows PROT_MTE settings in VmFlags. Support for this was introduced in commit 9f3419315f3cdc41a7318e4d50ba18a592b30c8c arm64: mte: Add PROT_MTE support to mmap() and mprotect() Signed-off-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/20201106101940.5777-1-szabolcs.nagy@arm.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-12-01Documentation: Update filesystems/gfs2.rstAndrew Price
Remove an obsolete URL and generally bring the doc up-to-date Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-23fs-verity: move structs needed for file signing to UAPI headerEric Biggers
Although it isn't used directly by the ioctls, "struct fsverity_descriptor" is required by userspace programs that need to compute fs-verity file digests in a standalone way. Therefore it's also needed to sign files in a standalone way. Similarly, "struct fsverity_formatted_digest" (previously called "struct fsverity_signed_digest" which was misleading) is also needed to sign files if the built-in signature verification is being used. Therefore, move these structs to the UAPI header. While doing this, try to make it clear that the signature-related fields in fsverity_descriptor aren't used in the file digest computation. Acked-by: Luca Boccassi <luca.boccassi@microsoft.com> Link: https://lore.kernel.org/r/20201113211918.71883-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-11-16fs-verity: rename "file measurement" to "file digest"Eric Biggers
I originally chose the name "file measurement" to refer to the fs-verity file digest to avoid confusion with traditional full-file digests or with the bare root hash of the Merkle tree. But the name "file measurement" hasn't caught on, and usually people are calling it something else, usually the "file digest". E.g. see "struct fsverity_digest" and "struct fsverity_formatted_digest", the libfsverity_compute_digest() and libfsverity_sign_digest() functions in libfsverity, and the "fsverity digest" command. Having multiple names for the same thing is always confusing. So to hopefully avoid confusion in the future, rename "fs-verity file measurement" to "fs-verity file digest". This leaves FS_IOC_MEASURE_VERITY as the only reference to "measure" in the kernel, which makes some amount of sense since the ioctl is actively "measuring" the file. I'll be renaming this in fsverity-utils too (though similarly the 'fsverity measure' command, which is a wrapper for FS_IOC_MEASURE_VERITY, will stay). Acked-by: Luca Boccassi <luca.boccassi@microsoft.com> Link: https://lore.kernel.org/r/20201113211918.71883-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-11-16fs-verity: rename fsverity_signed_digest to fsverity_formatted_digestEric Biggers
The name "struct fsverity_signed_digest" is causing confusion because it isn't actually a signed digest, but rather it's the way that the digest is formatted in order to be signed. Rename it to "struct fsverity_formatted_digest" to prevent this confusion. Also update the struct's comment to clarify that it's specific to the built-in signature verification support and isn't a requirement for all fs-verity users. I'll be renaming this struct in fsverity-utils too. Acked-by: Luca Boccassi <luca.boccassi@microsoft.com> Link: https://lore.kernel.org/r/20201113211918.71883-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-11-13docs: filesystems: link ubifs-authentication.rst without .rst extensionJonathan Neuschäfer
Specifying the .rst extension doesn't cause any problems AFAICT, but it's uncommon. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Link: https://lore.kernel.org/r/20201108132415.1789142-1-j.neuschaefer@gmx.net Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-11-12ovl: document lower modification caveatsKevin Locke
Some overlayfs optional features are incompatible with offline changes to the lower tree and may result in -EXDEV, -EIO, or other errors. Such modification is not supported and the error behavior is intentionally not specified. Update the "Changes to underlying filesystems" section to note this restriction. Move the paragraph describing the offline behavior below the online behavior so it is adjacent to the following 3 paragraphs describing the NFS export offline modification behavior. Link: https://lore.kernel.org/linux-unionfs/20200708142353.GA103536@redhat.com/ Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxi23Zsmfb4rCed1n=On0NNA5KZD74jjjeyz+et32sk-gg@mail.gmail.com/ Link: https://lore.kernel.org/linux-unionfs/20200817135651.GA637139@redhat.com/ Link: https://lore.kernel.org/linux-unionfs/20200709153616.GE150543@redhat.com/ Link: https://lore.kernel.org/linux-unionfs/20200812135529.GA122370@kevinolos/ Signed-off-by: Kevin Locke <kevin@kevinlocke.name> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-11-12ovl: doc clarificationMiklos Szeredi
Documentation says "The lower filesystem can be any filesystem supported by Linux". However, this is not the case, as Linux supports vfat and vfat doesn't work as a lower filesystem Reported-by: nerdopolis <bluescreen_avenger@verizon.net> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-11-12ovl: introduce new "uuid=off" option for inodes index featurePavel Tikhomirov
This replaces uuid with null in overlayfs file handles and thus relaxes uuid checks for overlay index feature. It is only possible in case there is only one filesystem for all the work/upper/lower directories and bare file handles from this backing filesystem are unique. In other case when we have multiple filesystems lets just fallback to "uuid=on" which is and equivalent of how it worked before with all uuid checks. This is needed when overlayfs is/was mounted in a container with index enabled (e.g.: to be able to resolve inotify watch file handles on it to paths in CRIU), and this container is copied and started alongside with the original one. This way the "copy" container can't have the same uuid on the superblock and mounting the overlayfs from it later would fail. That is an example of the problem on top of loop+ext4: dd if=/dev/zero of=loopbackfile.img bs=100M count=10 losetup -fP loopbackfile.img losetup -a #/dev/loop0: [64768]:35 (/loop-test/loopbackfile.img) mkfs.ext4 loopbackfile.img mkdir loop-mp mount -o loop /dev/loop0 loop-mp mkdir loop-mp/{lower,upper,work,merged} mount -t overlay overlay -oindex=on,lowerdir=loop-mp/lower,\ upperdir=loop-mp/upper,workdir=loop-mp/work loop-mp/merged umount loop-mp/merged umount loop-mp e2fsck -f /dev/loop0 tune2fs -U random /dev/loop0 mount -o loop /dev/loop0 loop-mp mount -t overlay overlay -oindex=on,lowerdir=loop-mp/lower,\ upperdir=loop-mp/upper,workdir=loop-mp/work loop-mp/merged #mount: /loop-test/loop-mp/merged: #mount(2) system call failed: Stale file handle. If you just change the uuid of the backing filesystem, overlay is not mounting any more. In Virtuozzo we copy container disks (ploops) when create the copy of container and we require fs uuid to be unique for a new container. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-11-09Merge tag 'ext4_for_linus_cleanups' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes and cleanups from Ted Ts'o: "More fixes and cleanups for the new fast_commit features, but also a few other miscellaneous bug fixes and a cleanup for the MAINTAINERS file" * tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits) jbd2: fix up sparse warnings in checkpoint code ext4: fix sparse warnings in fast_commit code ext4: cleanup fast commit mount options jbd2: don't start fast commit on aborted journal ext4: make s_mount_flags modifications atomic ext4: issue fsdev cache flush before starting fast commit ext4: disable fast commit with data journalling ext4: fix inode dirty check in case of fast commits ext4: remove unnecessary fast commit calls from ext4_file_mmap ext4: mark buf dirty before submitting fast commit buffer ext4: fix code documentatioon ext4: dedpulicate the code to wait on inode that's being committed jbd2: don't read journal->j_commit_sequence without taking a lock jbd2: don't touch buffer state until it is filled jbd2: add todo for a fast commit performance optimization jbd2: don't pass tid to jbd2_fc_end_commit_fallback() jbd2: don't use state lock during commit path jbd2: drop jbd2_fc_init documentation ext4: clean up the JBD2 API that initializes fast commits jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs ...
2020-11-09docs: filesystems: Reduce ext2.rst to one top-level headingJonathan Neuschäfer
This prevents the other headings like "Options" and "Specification" from leaking out and being listed separately in the table of contents. Link: https://lore.kernel.org/r/20201108004045.1378676-1-j.neuschaefer@gmx.net Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Jan Kara <jack@suse.cz>
2020-11-06jbd2: drop jbd2_fc_init documentationHarshad Shirwadkar
Now that jbd2_fc_init is dropped, drop its docs too. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-8-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: describe fast_commit feature flagsHarshad Shirwadkar
Fast commit feature has flags in the file system as well in JBD2. The meaning of fast commit feature flags can get confusing. Update docs and code to add more documentation about it. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-03Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation build warning fixes from Jonathan Corbet: "This contains a series of warning fixes from Mauro; once applied, the number of warnings from the once-noisy docs build process is nearly zero. Getting to this point has required a lot of work; once there, hopefully we can keep things that way. I have packaged this as a separate pull because it does a fair amount of reaching outside of Documentation/. The changes are all in comments and in code placement. It's all been in linux-next since last week" * tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits) docs: SafeSetID: fix a warning amdgpu: fix a few kernel-doc markup issues selftests: kselftest_harness.h: fix kernel-doc markups drm: amdgpu_dm: fix a typo gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups drm: amdgpu: kernel-doc: update some adev parameters docs: fs: api-summary.rst: get rid of kernel-doc include IB/srpt: docs: add a description for cq_size member locking/refcount: move kernel-doc markups to the proper place docs: lockdep-design: fix some warning issues MAINTAINERS: fix broken doc refs due to yaml conversion ice: docs fix a devlink info that broke a table crypto: sun8x-ce*: update entries to its documentation net: phy: remove kernel-doc duplication mm: pagemap.h: fix two kernel-doc markups blk-mq: docs: add kernel-doc description for a new struct member docs: userspace-api: add iommu.rst to the index file docs: hwmon: mp2975.rst: address some html build warnings docs: net: statistics.rst: remove a duplicated kernel-doc docs: kasan.rst: add two missing blank lines ...
2020-10-30debugfs: remove return value of debugfs_create_devm_seqfile()Greg Kroah-Hartman
No one checks the return value of debugfs_create_devm_seqfile(), as it's not needed, so make the return value void, so that no one tries to do so in the future. Link: https://lore.kernel.org/r/20201023131037.2500765-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-28docs: fs: api-summary.rst: get rid of kernel-doc includeMauro Carvalho Chehab
The direct-io.c file used to have just two exported symbols: - dio_end_io() - __blockdev_direct_IO() The first one was removed by changeset c33fe275b530 ("fs: remove no longer used dio_end_io()") And the last one is used on most places indirectly, via the inline macro blockdev_direct_IO() provided by fs.h. Yet, neither the macro or the function have kernel-doc markups. So, drop the inclusion of fs/direct-io.c at the docs. Fixes: c33fe275b530 ("fs: remove no longer used dio_end_io()") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/d0a9fffedca102633c168adaf157f34288a4ea67.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-22Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "The siginificant new ext4 feature this time around is Harshad's new fast_commit mode. In addition, thanks to Mauricio for fixing a race where mmap'ed pages that are being changed in parallel with a data=journal transaction commit could result in bad checksums in the failure that could cause journal replays to fail. Also notable is Ritesh's buffered write optimization which can result in significant improvements on parallel write workloads. (The kernel test robot reported a 330.6% improvement on fio.write_iops on a 96 core system using DAX) Besides that, we have the usual miscellaneous cleanups and bug fixes" Link: https://lore.kernel.org/r/20200925071217.GO28663@shao2-debian * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits) ext4: fix invalid inode checksum ext4: add fast commit stats in procfs ext4: add a mount opt to forcefully turn fast commits on ext4: fast commit recovery path jbd2: fast commit recovery path ext4: main fast-commit commit path jbd2: add fast commit machinery ext4 / jbd2: add fast commit initialization ext4: add fast_commit feature and handling for extended mount options doc: update ext4 and journalling docs to include fast commit feature ext4: Detect already used quota file early jbd2: avoid transaction reuse after reformatting ext4: use the normal helper to get the actual inode ext4: fix bs < ps issue reported with dioread_nolock mount opt ext4: data=journal: write-protect pages on j_submit_inode_data_buffers() ext4: data=journal: fixes for ext4_page_mkwrite() jbd2, ext4, ocfs2: introduce/use journal callbacks j_submit|finish_inode_data_buffers() jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers() ext4: introduce ext4_sb_bread_unmovable() to replace sb_bread_unmovable() ext4: use ext4_sb_bread() instead of sb_bread() ...
2020-10-22Merge tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "The one new feature this time, from Anna Schumaker, is READ_PLUS, which has the same arguments as READ but allows the server to return an array of data and hole extents. Otherwise it's a lot of cleanup and bugfixes" * tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits) NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy SUNRPC: fix copying of multiple pages in gss_read_proxy_verf() sunrpc: raise kernel RPC channel buffer size svcrdma: fix bounce buffers for unaligned offsets and multiple pages nfsd: remove unneeded break net/sunrpc: Fix return value for sysctl sunrpc.transports NFSD: Encode a full READ_PLUS reply NFSD: Return both a hole and a data segment NFSD: Add READ_PLUS hole segment encoding NFSD: Add READ_PLUS data support NFSD: Hoist status code encoding into XDR encoder functions NFSD: Map nfserr_wrongsec outside of nfsd_dispatch NFSD: Remove the RETURN_STATUS() macro NFSD: Call NFSv2 encoders on error returns NFSD: Fix .pc_release method for NFSv2 NFSD: Remove vestigial typedefs NFSD: Refactor nfsd_dispatch() error paths NFSD: Clean up nfsd_dispatch() variables NFSD: Clean up stale comments in nfsd_dispatch() NFSD: Clean up switch statement in nfsd_dispatch() ...
2020-10-21doc: update ext4 and journalling docs to include fast commit featureHarshad Shirwadkar
This patch adds necessary documentation for fast commits. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201015203802.3597742-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-21Merge tag 'ceph-for-5.10-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: - a patch that removes crush_workspace_mutex (myself). CRUSH computations are no longer serialized and can run in parallel. - a couple new filesystem client metrics for "ceph fs top" command (Xiubo Li) - a fix for a very old messenger bug that affected the filesystem, marked for stable (myself) - assorted fixups and cleanups throughout the codebase from Jeff and others. * tag 'ceph-for-5.10-rc1' of git://github.com/ceph/ceph-client: (27 commits) libceph: clear con->out_msg on Policy::stateful_server faults libceph: format ceph_entity_addr nonces as unsigned libceph: fix ENTITY_NAME format suggestion libceph: move a dout in queue_con_delay() ceph: comment cleanups and clarifications ceph: break up send_cap_msg ceph: drop separate mdsc argument from __send_cap ceph: promote to unsigned long long before shifting ceph: don't SetPageError on readpage errors ceph: mark ceph_fmt_xattr() as printf-like for better type checking ceph: fold ceph_update_writeable_page into ceph_write_begin ceph: fold ceph_sync_writepages into writepage_nounlock ceph: fold ceph_sync_readpages into ceph_readpage ceph: don't call ceph_update_writeable_page from page_mkwrite ceph: break out writeback of incompatible snap context to separate function ceph: add a note explaining session reject error string libceph: switch to the new "osd blocklist add" command libceph, rbd, ceph: "blacklist" -> "blocklist" ceph: have ceph_writepages_start call pagevec_lookup_range_tag ceph: use kill_anon_super helper ...
2020-10-19Merge tag 'fuse-update-5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Support directly accessing host page cache from virtiofs. This can improve I/O performance for various workloads, as well as reducing the memory requirement by eliminating double caching. Thanks to Vivek Goyal for doing most of the work on this. - Allow autom