summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2019-09-05erofs: kill __submit_bio()Gao Xiang
As Christoph pointed out [1], " Why is there __submit_bio which really just obsfucates what is going on? Also why is __submit_bio using bio_set_op_attrs instead of opencode it as the comment right next to it asks you to? " Let's use submit_bio directly instead. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-18-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill prio and nofail of erofs_get_meta_page()Gao Xiang
As Christoph pointed out [1], "Why is there __erofs_get_meta_page with the two weird booleans instead of a single erofs_get_meta_page that gets and gfp_t for additional flags and an unsigned int for additional bio op flags." And since all callers can handle errors, let's kill prio and nofail and erofs_get_inline_page() now. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-17-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: localize erofs_grab_bio()Gao Xiang
As Christoph pointed out [1], "erofs_grab_bio tries to handle a bio_alloc failure, except that the function will not actually fail due the mempool backing it." Sorry about useless code, fix it now and localize erofs_grab_bio [2]. [1] https://lore.kernel.org/r/20190830162812.GA10694@infradead.org/ [2] https://lore.kernel.org/r/20190902122016.GL15931@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-16-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill verbose debug info in erofs_fill_superGao Xiang
As Christoph said [1], "That is some very verbose debug info. We usually don't add that and let people trace the function instead. " [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-15-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use dsb instead of layout for ondisk super_blockGao Xiang
As Christoph pointed out [1], "Why is the variable name for the on-disk subperblock layout? We usually still calls this something with sb in the name, e.g. dsb. for disksuper block. " Let's fix it. [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-14-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: better erofs symlink stuffsGao Xiang
Fix as Christoph suggested [1] [2], "remove is_inode_fast_symlink and just opencode it in the few places using it" and "Please just set the ops directly instead of obsfucating that in a single caller, single line inline function. And please set it instead of the normal symlink iops in the same place where you also set those." [1] https://lore.kernel.org/r/20190830163910.GB29603@infradead.org/ [2] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-13-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update comments in inode.cGao Xiang
As Christoph suggested [1], update them all. [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-12-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update erofs_fs.h commentsGao Xiang
As Christoph said [1] [2], update it now. [1] https://lore.kernel.org/r/20190902124521.GA22153@infradead.org/ [2] https://lore.kernel.org/r/20190902120548.GB15931@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-11-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use erofs_inode namingGao Xiang
As Christoph suggested [1], "Why is this called vnode instead of inode? That seems like a rather odd naming for a Linux file system." [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-10-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill erofs_{init,exit}_inode_cacheGao Xiang
As Christoph said [1] "having this function seems entirely pointless", let's kill those. filesystem function name ext2,f2fs,ext4,isofs,squashfs,cifs,... init_inodecache In addition, add a necessary "rcu_barrier()" on exit_fs(); [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-9-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: better naming for erofs inode related stuffsGao Xiang
updates inode naming - kill is_inode_layout_compression [1] - kill magic underscores [2] [3] - better naming for datamode & data_mapping_mode [3] - better naming erofs_inode_{compact, extended} [4] [1] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [2] https://lore.kernel.org/r/20190829102426.GE20598@infradead.org/ [3] https://lore.kernel.org/r/20190902122627.GN15931@infradead.org/ [4] https://lore.kernel.org/r/20190902125438.GA17750@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-8-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: use feature_incompat rather than requirementsGao Xiang
As Christoph said [1], "This is only cosmetic, why not stick to feature_compat and feature_incompat?" In my thought, requirements means "incompatible" instead of "feature" though. [1] https://lore.kernel.org/r/20190902125109.GA9826@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-7-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: update erofs_inode_is_data_compressed helperGao Xiang
As Christoph said, "This looks like a really obsfucated way to write: return datamode == EROFS_INODE_FLAT_COMPRESSION || datamode == EROFS_INODE_FLAT_COMPRESSION_LEGACY; " Although I had my own consideration, it's the right way for now. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-6-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: kill __packed for on-disk structuresGao Xiang
As Christoph suggested "Please don't add __packed" [1], remove all __packed except struct erofs_dirent here. Note that all on-disk fields except struct erofs_dirent (12 bytes with a 8-byte nid) in EROFS are naturally aligned. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-5-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: some macros are much more readable as a functionGao Xiang
As Christoph suggested [1], these macros are much more readable as a function. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-4-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: on-disk format should have explicitly assigned numbersGao Xiang
As Christoph suggested [1], on-disk format should have explicitly assigned numbers. [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-3-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-05erofs: remove all the byte offset commentsGao Xiang
As Christoph suggested [1], "Please remove all the byte offset comments. that is something that can easily be checked with gdb or pahole." [1] https://lore.kernel.org/r/20190829095954.GB20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-2-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-04erofs: using switch-case while checking the inode type.Pratik Shinde
while filling the linux inode, using switch-case statement to check the type of inode. switch-case statement looks more clean here. Signed-off-by: Pratik Shinde <pratikshinde320@gmail.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190830095615.10995-1-pratikshinde320@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-30erofs: reduntant assignment in __erofs_get_meta_page()Gao Xiang
As Joe Perches suggested [1], err = bio_add_page(bio, page, PAGE_SIZE, 0); - if (unlikely(err != PAGE_SIZE)) { + if (err != PAGE_SIZE) { err = -EFAULT; goto err_out; } The initial assignment to err is odd as it's not actually an error value -E<FOO> but a int size from a unsigned int len. Here the return is either 0 or PAGE_SIZE. This would be more legible to me as: if (bio_add_page(bio, page, PAGE_SIZE, 0) != PAGE_SIZE) { err = -EFAULT; goto err_out; } [1] https://lore.kernel.org/r/74c4784319b40deabfbaea92468f7e3ef44f1c96.camel@perches.com/ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190829171741.225219-1-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-30erofs: remove all likely/unlikely annotationsGao Xiang
As Dan Carpenter suggested [1], I have to remove all erofs likely/unlikely annotations. [1] https://lore.kernel.org/linux-fsdevel/20190829154346.GK23584@kadam/ Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190829163827.203274-1-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-24erofs: move erofs out of stagingGao Xiang
EROFS filesystem has been merged into linux-staging for a year. EROFS is designed to be a better solution of saving extra storage space with guaranteed end-to-end performance for read-only files with the help of reduced metadata, fixed-sized output compression and decompression inplace technologies. In the past year, EROFS was greatly improved by many people as a staging driver, self-tested, betaed by a large number of our internal users, successfully applied to almost all in-service HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable enough to be moved out of staging. EROFS is a self-contained filesystem driver. Although there are still some TODOs to be more generic, we have a dedicated team actively keeping on working on EROFS in order to make it better with the evolution of Linux kernel as the other in-kernel filesystems. As Pavel suggested, it's better to do as one commit since git can do moves and all histories will be saved in this way. Let's promote it from staging and enhance it more actively as a "real" part of kernel for more wider scenarios! Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Pavel Machek <pavel@denx.de> Cc: David Sterba <dsterba@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J . Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Richard Weinberger <richard@nod.at> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Li Guifu <bluce.liguifu@huawei.com> Cc: Fang Wei <fangwei1@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190822213659.5501-1-hsiangkao@aol.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-18Merge tag 'for-5.3-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two fixes that popped up during testing: - fix for sysfs-related code that adds/removes block groups, warnings appear during several fstests in connection with sysfs updates in 5.3, the fix essentially replaces a workaround with scope NOFS and applies to 5.2-based branch too - add sanity check of trim range" * tag 'for-5.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: trim: Check the range passed into to prevent overflow Btrfs: fix sysfs warning and missing raid sysfs directories
2019-08-17Merge tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A collection of fixes that should go into this series. This contains: - Revert of the REQ_NOWAIT_INLINE and associated dio changes. There were still corner cases there, and even though I had a solution for it, it's too involved for this stage. (me) - Set of NVMe fixes (via Sagi) - io_uring fix for fixed buffers (Anthony) - io_uring defer issue fix (Jackie) - Regression fix for queue sync at exit time (zhengbin) - xen blk-back memory leak fix (Wenwen)" * tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block: io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list block: remove REQ_NOWAIT_INLINE io_uring: fix manual setup of iov_iter for fixed buffers xen/blkback: fix memory leaks blk-mq: move cancel of requeue_work to the front of blk_exit_queue nvme-pci: Fix async probe remove race nvme: fix controller removal race with scan work nvme-rdma: fix possible use-after-free in connect error flow nvme: fix a possible deadlock when passthru commands sent to a multipath device nvme-core: Fix extra device_put() call on error path nvmet-file: fix nvmet_file_flush() always returning an error nvmet-loop: Flush nvme_delete_wq when removing the port nvmet: Fix use-after-free bug when a port is removed nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
2019-08-15Merge tag 'xfs-5.3-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: - Fix crashes when the attr fork isn't present due to errors but inode inactivation tries to zap the attr data anyway. - Convert more directory corruption debugging asserts to actual EFSCORRUPTED returns instead of blowing up later on. - Don't fail writeback just because we ran out of memory allocating metadata log data. * tag 'xfs-5.3-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't crash on null attr fork xfs_bmapi_read xfs: remove more ondisk directory corruption asserts fs: xfs: xfs_log: Don't use KM_MAYFAIL at xfs_log_reserve().
2019-08-15io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer listJackie Liu
This patch may fix two issues: First, when IOSQE_IO_DRAIN set, the next IOs need to be inserted into defer list to delay execution, but link io will be actively scheduled to run by calling io_queue_sqe. Second, when multiple LINK_IOs are inserted together with defer_list, the LINK_IO is no longer keep order. |-------------| | LINK_IO | ----> insert to defer_list ----------- |-------------| | | LINK_IO | ----> insert to defer_list ----------| |-------------| | | LINK_IO | ----> insert to defer_list ----------| |-------------| | | NORMAL_IO | ----> insert to defer_list ----------| |-------------| | | queue_work at same time <-----| Fixes: 9e645e1105c ("io_uring: add support for sqe links") Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15block: remove REQ_NOWAIT_INLINEJens Axboe
We had a few issues with this code, and there's still a problem around how we deal with error handling for chained/split bios. For now, just revert the code and we'll try again with a thoroug solution. This reverts commits: e15c2ffa1091 ("block: fix O_DIRECT error handling for bio fragments") 0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments") 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO") 893a1c97205a ("blk-mq: allow REQ_NOWAIT to return an error inline") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15io_uring: fix manual setup of iov_iter for fixed buffersAleix Roca Nonell
Commit bd11b3a391e3 ("io_uring: don't use iov_iter_advance() for fixed buffers") introduced an optimization to avoid using the slow iov_iter_advance by manually populating the iov_iter iterator in some cases. However, the computation of the iterator count field was erroneous: The first bvec was always accounted for an extent of page size even if the bvec length was smaller. In consequence, some I/O operations on fixed buffers were unable to operate on the full extent of the buffer, consistently skipping some bytes at the end of it. Fixes: bd11b3a391e3 ("io_uring: don't use iov_iter_advance() for fixed buffers") Cc: stable@vger.kernel.org Signed-off-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-14Merge tag 'afs-fixes-20190814' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs fixes from David Howells: - Fix the CB.ProbeUuid handler to generate its reply correctly. - Fix a mix up in indices when parsing a Volume Location entry record. - Fix a potential NULL-pointer deref when cleaning up a read request. - Fix the expected data version of the destination directory in afs_rename(). - Fix afs_d_revalidate() to only update d_fsdata if it's not the same as the directory data version to reduce the likelihood of overwriting the result of a competing operation. (d_fsdata carries the directory DV or the least-significant word thereof). - Fix the tracking of the data-version on a directory and make sure that dentry objects get properly initialised, updated and revalidated. Also fix rename to update d_fsdata to match the new directory's DV if the dentry gets moved over and unhash the dentry to stop afs_d_revalidate() from interfering. * tag 'afs-fixes-20190814' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix missing dentry data version updating afs: Only update d_fsdata if different in afs_d_revalidate() afs: Fix off-by-one in afs_rename() expected data version calculation fs: afs: Fix a possible null-pointer dereference in afs_put_read() afs: Fix loop index mixup in afs_deliver_vl_get_entry_by_name_u() afs: Fix the CB.ProbeUuid service handler to reply correctly
2019-08-13seq_file: fix problem when seeking mid-recordNeilBrown
If you use lseek or similar (e.g. pread) to access a location in a seq_file file that is within a record, rather than at a record boundary, then the first read will return the remainder of the record, and the second read will return the whole of that same record (instead of the next record). When seeking to a record boundary, the next record is correctly returned. This bug was introduced by a recent patch (identified below). Before that patch, seq_read() would increment m->index when the last of the buffer was returned (m->count == 0). After that patch, we rely on ->next to increment m->index after filling the buffer - but there was one place where that didn't happen. Link: https://lkml.kernel.org/lkml/877e7xl029.fsf@notabene.neil.brown.name/ Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface") Signed-off-by: NeilBrown <neilb@suse.com> Reported-by: Sergei Turchanov <turchanov@farpost.com> Tested-by: Sergei Turchanov <turchanov@farpost.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: <stable@vger.kernel.org> [4.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-12xfs: don't crash on null attr fork xfs_bmapi_readDarrick J. Wong
Zorro Lang reported a crash in generic/475 if we try to inactivate a corrupt inode with a NULL attr fork (stack trace shortened somewhat): RIP: 0010:xfs_bmapi_read+0x311/0xb00 [xfs] RSP: 0018:ffff888047f9ed68 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888047f9f038 RCX: 1ffffffff5f99f51 RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000012 RBP: ffff888002a41f00 R08: ffffed10005483f0 R09: ffffed10005483ef R10: ffffed10005483ef R11: ffff888002a41f7f R12: 0000000000000004 R13: ffffe8fff53b5768 R14: 0000000000000005 R15: 0000000000000001 FS: 00007f11d44b5b80(0000) GS:ffff888114200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000ef6000 CR3: 000000002e176003 CR4: 00000000001606e0 Call Trace: xfs_dabuf_map.constprop.18+0x696/0xe50 [xfs] xfs_da_read_buf+0xf5/0x2c0 [xfs] xfs_da3_node_read+0x1d/0x230 [xfs] xfs_attr_inactive+0x3cc/0x5e0 [xfs] xfs_inactive+0x4c8/0x5b0 [xfs] xfs_fs_destroy_inode+0x31b/0x8e0 [xfs] destroy_inode+0xbc/0x190 xfs_bulkstat_one_int+0xa8c/0x1200 [xfs] xfs_bulkstat_one+0x16/0x20 [xfs] xfs_bulkstat+0x6fa/0xf20 [xfs] xfs_ioc_bulkstat+0x182/0x2b0 [xfs] xfs_file_ioctl+0xee0/0x12a0 [xfs] do_vfs_ioctl+0x193/0x1000 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x6f/0xb0 do_syscall_64+0x9f/0x4d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f11d39a3e5b The "obvious" cause is that the attr ifork is null despite the inode claiming an attr fork having at least one extent, but it's not so obvious why we ended up with an inode in that state. Reported-by: Zorro Lang <zlang@redhat.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204031 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-08-12xfs: remove more ondisk directory corruption assertsDarrick J. Wong
Continue our game of replacing ASSERTs for corrupt ondisk metadata with EFSCORRUPTED returns. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-08-11Merge tag 'dax-fixes-5.3-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax fixes from Dan Williams: "A filesystem-dax and device-dax fix for v5.3. The filesystem-dax fix is tagged for stable as the implementation has been mistakenly throwing away all cow pages on any truncate or hole punch operation as part of the solution to coordinate device-dma vs truncate to dax pages. The device-dax change fixes up a regression this cycle from the introduction of a common 'internal per-cpu-ref' implementation. Summary: - Fix dax_layout_busy_page() to not discard private cow pages of fs/dax private mappings. - Update the memremap_pages core to properly cleanup on behalf of internal reference-count users like device-dax" * tag 'dax-fixes-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: mm/memremap: Fix reuse of pgmap instances with internal references dax: dax_layout_busy_page() should not unmap cow pages
2019-08-10Merge tag 'gfs2-v5.3-rc3.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: "Fix incorrect lseek / fiemap results" * tag 'gfs2-v5.3-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: gfs2_walk_metadata fix
2019-08-09Merge tag 'for-linus-20190809' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Revert of a bcache patch that caused an oops for some (Coly) - ata rb532 unused warning fix (Gustavo) - AoE kernel crash fix (He) - Error handling fixup for blkdev_get() (Jan) - libata read/write translation and SFF PIO fix (me) - Use after free and error handling fix for O_DIRECT fragments. There's still a nowait + sync oddity in there, we'll nail that start next week. If all else fails, I'll queue a revert of the NOWAIT change. (me) - Loop GFP_KERNEL -> GFP_NOIO deadlock fix (Mikulas) - Two BFQ regression fixes that caused crashes (Paolo) * tag 'for-linus-20190809' of git://git.kernel.dk/linux-block: bcache: Revert "bcache: use sysfs_match_string() instead of __sysfs_match_string()" loop: set PF_MEMALLOC_NOIO for the worker thread bdev: Fixup error handling in blkdev_get() block, bfq: handle NULL return value by bfq_init_rq() block, bfq: move update of waker and woken list to queue freeing block, bfq: reset last_completed_rq_bfqq if the pointed queue is freed block: aoe: Fix kernel crash due to atomic sleep when exiting libata: add SG safety checks in SFF pio transfers libata: have ata_scsi_rw_xlat() fail invalid passthrough requests block: fix O_DIRECT error handling for bio fragments ata: rb532_cf: Fix unused variable warning in rb532_pata_driver_probe
2019-08-09gfs2: gfs2_walk_metadata fixAndreas Gruenbacher
It turns out that the current version of gfs2_metadata_walker suffers from multiple problems that can cause gfs2_hole_size to report an incorrect size. This will confuse fiemap as well as lseek with the SEEK_DATA flag. Fix that by changing gfs2_hole_walker to compute the metapath to the first data block after the hole (if any), and compute the hole size based on that. Fixes xfstest generic/490. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Cc: stable@vger.kernel.org # v4.18+
2019-08-08Merge tag 'nfs-for-5.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client fixes from Trond Myklebust: "Highlights include: Stable fixes: - NFSv4: Ensure we check the return value of update_open_stateid() so we correctly track active open state. - NFSv4: Fix for delegation state recovery to ensure we recover all open modes that are active. - NFSv4: Fix an Oops in nfs4_do_setattr Fixes: - NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts - NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() - NFSv4: Fix a credential refcount leak in nfs41_check_delegation_stateid - pNFS: Report errors from the call to nfs4_select_rw_stateid() - NFSv4: Various other delegation and open stateid recovery fixes - NFSv4: Fix state recovery behaviour when server connection times out" * tag 'nfs-for-5.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Ensure state recovery handles ETIMEDOUT correctly NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts NFSv4: Fix an Oops in nfs4_do_setattr NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() NFSv4: Check the return value of update_open_stateid() NFSv4.1: Only reap expired delegations NFSv4.1: Fix open stateid recovery NFSv4: Report the error from nfs4_select_rw_stateid() NFSv4: When recovering state fails with EAGAIN, retry the same recovery NFSv4: Print an error in the syslog when state is marked as irrecoverable NFSv4: Fix delegation state recovery NFSv4: Fix a credential refcount leak in nfs41_check_delegation_stateid
2019-08-08Merge tag '5.3-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Six small SMB3 fixes, two for stable" * tag '5.3-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL smb3: update TODO list of missing features smb3: send CAP_DFS capability during session setup SMB3: Fix potential memory leak when processing compound chain SMB3: Fix deadlock in validate negotiate hits reconnect cifs: fix rmmod regression in cifs.ko caused by force_sig changes
2019-08-08bdev: Fixup error handling in blkdev_get()Jan Kara
Commit 89e524c04fa9 ("loop: Fix mount(2) failure due to race with LOOP_SET_FD") converted blkdev_get() to use the new helpers for finishing claiming of a block device. However the conversion botched the error handling in blkdev_get() and thus the bdev has been marked as held even in case __blkdev_get() returned error. This led to occasional warnings with block/001 test from blktests like: kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0 Correct the error handling. CC: stable@vger.kernel.org Fixes: 89e524c04fa9 ("loop: Fix mount(2) failure due to race with LOOP_SET_FD") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-07block: fix O_DIRECT error handling for bio fragmentsJens Axboe
0eb6ddfb865c tried to fix this up, but introduced a use-after-free of dio. Additionally, we still had an issue with error handling, as reported by Darrick: "I noticed a regression in xfs/747 (an unreleased xfstest for the xfs_scrub media scanning feature) on 5.3-rc3. I'll condense that down to a simpler reproducer: error-test: 0 209 linear 8:48 0 error-test: 209 1 error error-test: 210 6446894 linear 8:48 210 Basically we have a ~3G /dev/sdd and we set up device mapper to fail IO for sector 209 and to pass the io to the scsi device everywhere else. On 5.3-rc3, performing a directio pread of this range with a < 1M buffer (in other words, a request for fewer than MAX_BIO_PAGES bytes) yields EIO like you'd expect: pread64(3, 0x7f880e1c7000, 1048576, 0) = -1 EIO (Input/output error) pread: Input/output error +++ exited with 0 +++ But doing it with a larger buffer succeeds(!): pread64(3, "XFSB\0\0\20\0\0\0\0\0\0\fL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1146880, 0) = 1146880 read 1146880/1146880 bytes at offset 0 1 MiB, 1 ops; 0.0009 sec (1.124 GiB/sec and 1052.6316 ops/sec) +++ exited with 0 +++ (Note that the part of the buffer corresponding to the dm-error area is uninitialized) On 5.3-rc2, both commands would fail with EIO like you'd expect. The only change between rc2 and rc3 is commit 0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments"). AFAICT we end up in __blkdev_direct_IO with a 1120K buffer, which gets split into two bios: one for the first BIO_MAX_PAGES worth of data (1MB) and a second one for the 96k after that." Fix this by noting that it's always safe to dereference dio if we get BLK_QC_T_EAGAIN returned, as end_io hasn't been run for that case. So we can safely increment the dio size before calling submit_bio(), and then decrement it on failure (not that it really matters, as the bio and dio are going away). For error handling, return to the original method of just using 'ret' for tracking the error, and the size tracking in dio->size. Fixes: 0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments") Fixes: 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO") Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-07NFSv4: Ensure state recovery handles ETIMEDOUT correctlyTrond Myklebust
Ensure that the state recovery code handles ETIMEDOUT correctly, and also that we set RPC_TASK_TIMEOUT when recovering open state. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-07btrfs: trim: Check the range passed into to prevent overflowQu Wenruo
Normally the range->len is set to default value (U64_MAX), but when it's not default value, we should check if the range overflows. And if it overflows, return -EINVAL before doing anything. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-08-07Btrfs: fix sysfs warning and missing raid sysfs directoriesFilipe Manana
In the 5.3 merge window, commit 7c7e301406d0a9 ("btrfs: sysfs: Replace default_attrs in ktypes with groups"), we started using the member "defaults_groups" for the kobject type "btrfs_raid_ktype". That leads to a series of warnings when running some test cases of fstests, such as btrfs/027, btrfs/124 and btrfs/176. The traces produced by those warnings are like the following: [116648.059212] kernfs: can not remove 'total_bytes', no directory [116648.060112] WARNING: CPU: 3 PID: 28500 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.066482] CPU: 3 PID: 28500 Comm: umount Tainted: G W 5.3.0-rc3-btrfs-next-54 #1 (...) [116648.069376] RIP: 0010:kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.072385] RSP: 0018:ffffabfd0090bd08 EFLAGS: 00010282 [116648.073437] RAX: 0000000000000000 RBX: ffffffffc0c11998 RCX: 0000000000000000 [116648.074201] RDX: ffff9fff603a7a00 RSI: ffff9fff603978a8 RDI: ffff9fff603978a8 [116648.074956] RBP: ffffffffc0b9ca2f R08: 0000000000000000 R09: 0000000000000001 [116648.075708] R10: ffff9ffe1f72e1c0 R11: 0000000000000000 R12: ffffffffc0b94120 [116648.076434] R13: ffffffffb3d9b4e0 R14: 0000000000000000 R15: dead000000000100 [116648.077143] FS: 00007f9cdc78a2c0(0000) GS:ffff9fff60380000(0000) knlGS:0000000000000000 [116648.077852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [116648.078546] CR2: 00007f9fc4747ab4 CR3: 00000005c7832003 CR4: 00000000003606e0 [116648.079235] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [116648.079907] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [116648.080585] Call Trace: [116648.081262] remove_files+0x31/0x70 [116648.081929] sysfs_remove_group+0x38/0x80 [116648.082596] sysfs_remove_groups+0x34/0x70 [116648.083258] kobject_del+0x20/0x60 [116648.083933] btrfs_free_block_groups+0x405/0x430 [btrfs] [116648.084608] close_ctree+0x19a/0x380 [btrfs] [116648.085278] generic_shutdown_super+0x6c/0x110 [116648.085951] kill_anon_super+0xe/0x30 [116648.086621] btrfs_kill_super+0x12/0xa0 [btrfs] [116648.087289] deactivate_locked_super+0x3a/0x70 [116648.087956] cleanup_mnt+0xb4/0x160 [116648.088620] task_work_run+0x7e/0xc0 [116648.089285] exit_to_usermode_loop+0xfa/0x100 [116648.089933] do_syscall_64+0x1cb/0x220 [116648.090567] entry_SYSCALL_64_after_hwframe+0x49/0xbe [116648.091197] RIP: 0033:0x7f9cdc073b37 (...) [116648.100046] ---[ end trace 22e24db328ccadf8 ]--- [116648.100618] ------------[ cut here ]------------ [116648.101175] kernfs: can not remove 'used_bytes', no directory [116648.101731] WARNING: CPU: 3 PID: 28500 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.105649] CPU: 3 PID: 28500 Comm: umount Tainted: G W 5.3.0-rc3-btrfs-next-54 #1 (...) [116648.107461] RIP: 0010:kernfs_remove_by_name_ns+0x75/0x80 (...) [116648.109336] RSP: 0018:ffffabfd0090bd08 EFLAGS: 00010282 [116648.109979] RAX: 0000000000000000 RBX: ffffffffc0c119a0 RCX: 0000000000000000 [116648.110625] RDX: ffff9fff603a7a00 RSI: ffff9fff603978a8 RDI: ffff9fff603978a8 [116648.111283] RBP: ffffffffc0b9ca41 R08: 0000000000000000 R09: 0000000000000001 [116648.111940] R10: ffff9ffe1f72e1c0 R11: 0000000000000000 R12: ffffffffc0b94120 [116648.112603] R13: ffffffffb3d9b4e0 R14: 0000000000000000 R15: dead000000000100 [116648.113268] FS: 00007f9cdc78a2c0(0000) GS:ffff9fff60380000(0000) knlGS:0000000000000000 [116648.113939] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [116648.114607] CR2: 00007f9fc4747ab4 CR3: 00000005c7832003 CR4: 00000000003606e0 [116648.115286] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [116648.115966] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [116648.116649] Call Trace: [116648.117326] remove_files+0x31/0x70 [116648.117997] sysfs_remove_group+0x38/0x80 [116648.118671] sysfs_remove_groups+0x34/0x70 [116648.119342] kobject_del+0x20/0x60 [116648.120022] btrfs_free_block_groups+0x405/0x430 [btrfs] [116648.120707] close_ctree+0x19a/0x380 [btrfs] [116648.121396] generic_shutdown_super+0x6c/0x110 [116648.122057] kill_anon_super+0xe/0x30 [116648.122702] btrfs_kill_super+0x12/0xa0 [btrfs] [116648.123335] deactivate_locked_super+0x3a/0x70 [116648.123961] cleanup_mnt+0xb4/0x160 [116648.124586] task_work_run+0x7e/0xc0 [116648.125210] exit_to_usermode_loop+0xfa/0x100 [116648.125830] do_syscall_64+0x1cb/0x220 [116648.126463] entry_SYSCALL_64_after_hwframe+0x49/0xbe [116648.127080] RIP: 0033:0x7f9cdc073b37 (...) [116648.135923] ---[ end trace 22e24db328ccadf9 ]--- These happen because, during the unmount path, we call kobject_del() for raid kobjects that are not fully initialized, meaning that we set their ktype (as btrfs_raid_ktype) through link_block_group() but we didn't set their parent kobject, which is done through btrfs_add_raid_kobjects(). We have this split raid kobject setup since commit 75cb379d263521 ("btrfs: defer adding raid type kobject until after chunk relocation") in order to avoid triggering reclaim during contextes where we can not (either we are holding a transaction handle or some lock required by the transaction commit path), so that we do the calls to kobject_add(), which triggers GFP_KERNEL allocations, through btrfs_add_raid_kobjects() in contextes where it is safe to trigger reclaim. That change expected that a new raid kobject can only be created either when mounting the filesystem or after raid profile conversion through the relocation path. However, we can have new raid kobject created in other two cases at least: 1) During device replace (or scrub) after adding a device a to the filesystem. The replace procedure (and scrub) do calls to btrfs_inc_block_group_ro() which can allocate a new block group with a new raid profile (because we now have more devices). This can be triggered by test cases btrfs/027 and btrfs/176. 2) During a degraded mount trough any write path. This can be triggered by test case btrfs/124. Fixing this by adding extra calls to btrfs_add_raid_kobjects(), not only makes things more complex and fragile, can also introduce deadlocks with reclaim the following way: 1) Calling btrfs_add_raid_kobjects() at btrfs_inc_block_group_ro() or anywhere in the replace/scrub path will cause a deadlock with reclaim because if reclaim happens and a transaction commit is triggered, the transaction commit path will block at btrfs_scrub_pause(). 2) During degraded mounts it is essentially impossible to figure out where to add extra calls to btrfs_add_raid_kobjects(), because allocation of a block group with a new raid profile can happen anywhere, which means we can't safely figure out which contextes are safe for reclaim, as we can either hold a transaction handle or some lock needed by the transaction commit path. So it is too complex and error prone to have this split setup of raid kobjects. So fix the issue by consolidating the setup of the kobjects in a single place, at link_block_group(), and setup a nofs context there in order to prevent reclaim being triggered by the memory allocations done through the call chain of kobject_add(). Besides fixing the sysfs warnings during kobject_del(), this also ensures the sysfs directories for the new raid profiles end up created and visible to users (a bug that existed before the 5.3 commit 7c7e301406d0a9 ("btrfs: sysfs: Replace default_attrs in ktypes with groups")). Fixes: 75cb379d263521 ("btrfs: defer adding raid type kobject until after chunk relocation") Fixes: 7c7e301406d0a9 ("btrfs: sysfs: Replace default_attrs in ktypes with groups") Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Yeah I should have sent a pull request last week, so there is a lot more here than usual: 1) Fix memory leak in ebtables compat code, from Wenwen Wang. 2) Several kTLS bug fixes from Jakub Kicinski (circular close on disconnect etc.) 3) Force slave speed check on link state recovery in bonding 802.3ad mode, from Thomas Falcon. 4) Clear RX descriptor bits before assigning buffers to them in stmmac, from Jose Abreu. 5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF loops, from Nishka Dasgupta. 6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean. 7) Need to hold sock across skb->destructor invocation, from Cong Wang. 8) IP header length needs to be validated in ipip tunnel xmit, from Haishuang Yan. 9) Use after free in ip6 tunnel driver, also from Haishuang Yan. 10) Do not use MSI interrupts on r8169 chips before RTL8168d, from Heiner Kallweit. 11) Upon bridge device init failure, we need to delete the local fdb. From Nikolay Aleksandrov. 12) Handle erros from of_get_mac_address() properly in stmmac, from Martin Blumenstingl. 13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef Kadlecsik. 14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with some devices, so revert. From Johannes Berg. 15) Fix deadlock in rxrpc, from David Howells. 16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue Haibing. 17) Fix mvpp2 crash on module removal, from Matteo Croce. 18) Fix race in genphy_update_link, from Heiner Kallweit. 19) bpf_xdp_adjust_head() stopped working with generic XDP when we fixes generic XDP to support stacked devices properly, fix from Jesper Dangaard Brouer. 20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from David Ahern. 21) Several memory leaks in new sja1105 driver, from Vladimir Oltean" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits) net: dsa: sja1105: Fix memory leak on meta state machine error path net: dsa: sja1105: Fix memory leak on meta state machine normal path net: dsa: sja1105: Really fix panic on unregistering PTP clock net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well net: dsa: sja1105: Fix broken learning with vlan_filtering disabled net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() net: sched: sample: allow accessing psample_group with rtnl net: sched: police: allow accessing police->params with rtnl net: hisilicon: Fix dma_map_single failed on arm64 net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: make hip04_tx_reclaim non-reentrant tc-testing: updated vlan action tests with batch create/delete net sched: update vlan action for batched events operations net: stmmac: tc: Do not return a fragment entry net: stmmac: Fix issues when number of Queues >= 4 net: stmmac: xgmac: Fix XGMAC selftests be2net: disable bh with spin_lock in be_process_mcc net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER ...
2019-08-05SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUALSebastien Tisserant
Fix kernel oops when mounting a encryptData CIFS share with CONFIG_DEBUG_VIRTUAL Signed-off-by: Sebastien Tisserant <stisserant@wallix.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-05smb3: send CAP_DFS capability during session setupSteve French
We had a report of a server which did not do a DFS referral because the session setup Capabilities field was set to 0 (unlike negotiate protocol where we set CAP_DFS). Better to send it session setup in the capabilities as well (this also more closely matches Windows client behavior). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-08-05SMB3: Fix potential memory leak when processing compound chainPavel Shilovsky
When a reconnect happens in the middle of processing a compound chain the code leaks a buffer from the memory pool. Fix this by properly checking for a return code and freeing buffers in case of error. Also maintain a buf variable to be equal to either smallbuf or bigbuf depending on a response buffer size while parsing a chain and when returning to the caller. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-05SMB3: Fix deadlock in validate negotiate hits reconnectPavel Shilovsky
Currently we skip SMB2_TREE_CONNECT command when checking during reconnect because Tree Connect happens when establishing an SMB session. For SMB 3.0 protocol version the code also calls validate negotiate which results in SMB2_IOCL command being sent over the wire. This may deadlock on trying to acquire a mutex when checking for reconnect. Fix this by skipping SMB2_IOCL command when doing the reconnect check. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
2019-08-05dax: dax_layout_busy_page() should not unmap cow pagesVivek Goyal
Vivek: "As of now dax_layout_busy_page() calls unmap_mapping_range() with last argument as 1, which says even unmap cow pages. I am wondering who needs to get rid of cow pages as well. I noticed one interesting side affect of this. I mount xfs with -o dax and mmaped a file with MAP_PRIVATE and wrote some data to a page which created cow page. Then I called fallocate() on that file to zero a page of file. fallocate() called dax_layout_busy_page() which unmapped cow pages as well and then I tried to read back the data I wrote and what I get is old data from persistent memory. I lost the data I had written. This read basically resulted in new fault and read back the data from persistent memory. This sounds wrong. Are there any users which need to unmap cow pages as well? If not, I am proposing changing it to not unmap cow pages. I noticed this while while writing virtio_fs code where when I tried to reclaim a memory range and that corrupted th