summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2018-01-22btrfs: remove unused setup_root_args()Misono, Tomohiro
Since setup_root_args() is not used anymore, just remove it. Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: split parse_early_options() in twoMisono, Tomohiro
Now parse_early_options() is used by both btrfs_mount() and btrfs_mount_root(). However, the former only needs subvol related part and the latter needs the others. Therefore extract the subvol related parts from parse_early_options() and move it to new parse function (parse_subvol_options()). Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup btrfs_mount() using btrfs_mount_root()Misono, Tomohiro
Cleanup btrfs_mount() by using btrfs_mount_root(). This avoids getting btrfs_mount() called twice in mount path. Old btrfs_mount() will do: 0. VFS layer calls vfs_kern_mount() with registered file_system_type (for btrfs, btrfs_fs_type). btrfs_mount() is called on the way. 1. btrfs_parse_early_options() parses "subvolid=" mount option and set the value to subvol_objectid. Otherwise, subvol_objectid has the initial value of 0 2. check subvol_objectid is 5 or not. Assume this time id is not 5, then btrfs_mount() returns by calling mount_subvol() 3. In mount_subvol(), original mount options are modified to contain "subvolid=0" in setup_root_args(). Then, vfs_kern_mount() is called with btrfs_fs_type and new options 4. btrfs_mount() is called again 5. btrfs_parse_early_options() parses "subvolid=0" and set 5 (instead of 0) to subvol_objectid 6. check subvol_objectid is 5 or not. This time id is 5 and mount_subvol() is not called. btrfs_mount() finishes mounting a root 7. (in mount_subvol()) with using a return vale of vfs_kern_mount(), it calls mount_subtree() 8. return subvolume's dentry Reusing the same file_system_type (and btrfs_mount()) for vfs_kern_mount() is the cause of complication. Instead, new btrfs_mount() will do: 1. parse subvol id related options for later use in mount_subvol() 2. mount device's root by calling vfs_kern_mount() with btrfs_root_fs_type, which is not registered to VFS by register_filesystem(). As a result, btrfs_mount_root() is called 3. return by calling mount_subvol() The code of 2. is moved from the first part of mount_subvol(). The semantics of device holder changes from btrfs_fs_type to btrfs_root_fs_type and has to be used in all contexts. Otherwise we'd get wrong results when mount and dev scan would not check the same thing. (this has been found indendently and the fix is folded into this patch) Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ fold the btrfs_control_ioctl fixup, extend the comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: add btrfs_mount_root() and new file_system_typeMisono, Tomohiro
Add btrfs_mount_root() and new file_system_type for preparation of cleanup of btrfs_mount(). Code path is not changed yet. btrfs_mount_root() is almost the same as current btrfs_mount(), but doesn't have subvolume related part. Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: unify extent_page_data type passed as voidDavid Sterba
Functions called from extent_write_cache_pages used void* as generic callback data, but all of them convert it to extent_page_data, or use it directly. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: sink writepage parameter to extent_write_cache_pagesDavid Sterba
The function extent_write_cache_pages is modelled after write_cache_pages which is a generic interface and the writepage parameter makes sense there. In btrfs we know exactly which callback we're going to use, so we can pass it directly. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: sink flush_fn to extent_write_cache_pagesDavid Sterba
All callers pass the same value flush_write_bio. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: merge two flush_write_bio helpersDavid Sterba
flush_epd_write_bio is same as flush_write_bio, no point having two such functions. Merge them to flush_write_bio. The 'noinline' attribute is removed as it does not have any meaning. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Rename bin_search -> btrfs_bin_searchNikolay Borisov
Currently there are 2 function doing binary search on btrfs nodes: bin_search and btrfs_bin_search. The latter being a simple wrapper for the former. So eliminate the wrapper and just rename bin_search to btrfs_bin_search. No functional changes Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: sink extent_write_full_page tree argumentNikolay Borisov
The tree argument passed to extent_write_full_page is referenced from the page being passed to the same function. Since we already have enough information to get the reference, remove the function parameter. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: sink extent_write_locked_range tree parameterNikolay Borisov
This function is called only from submit_compressed_extents and the io tree being passed is always that of the inode. But we are also passing the inode, so just move getting the io tree pointer in extent_write_locked_range to simplify the signature. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Remove pair of bio_get/put in btrfs_schedule_bioNikolay Borisov
This code was added in 492bb6deee34 ("Btrfs: Hold a reference on bios during submit_bio, add some extra bio checks"). However, holding a reference on a bio is necessary only if it's going to be referenced after the submit_bio returns and the bio is completed. In this particular instance this is not the case so there is no need to hold an extra reference since we directly return. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Fix out of bounds access in btrfs_search_slotNikolay Borisov
When modifying a tree where the root is at BTRFS_MAX_LEVEL - 1 then the level variable is going to be 7 (this is the max height of the tree). On the other hand btrfs_cow_block is always called with "level + 1" as an index into the nodes and slots arrays. This leads to an out of bounds access. Admittdely this will be benign since an OOB access of the nodes array will likely read the 0th element from the slots array, which in this case is going to be 0 (since we start CoW at the top of the tree). The OOB access into the slots array in turn will read the 0th and 1st values of the locks array, which would both be 0 at the time. However, this benign behavior relies on the fact that the path being passed hasn't been initialised, if it has already been used to query a btree then it could potentially have populated the nodes/slots arrays. Fix it by explicitly checking if we are at level 7 (the maximum allowed index in nodes/slots arrays) and explicitly call the CoW routine with NULL for parent's node/slot. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Fixes-coverity-id: 711515 Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: remove duplicate includesPravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Handle btrfs_set_extent_delalloc failure in fixup workerNikolay Borisov
This function was introduced by 247e743cbe6e ("Btrfs: Use async helpers to deal with pages that have been improperly dirtied") and it didn't do any error handling then. This function might very well fail in ENOMEM situation, yet it's not handled, this could lead to inconsistent state. So let's handle the failure by setting the mapping error bit. Cc: stable@vger.kernel.org Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: put btrfs_ioctl_vol_args_v2 related defines togetherAnand Jain
Just a code spatial rearrangement, no functional change. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: show options: use helper to convert compression type stringDavid Sterba
Use the helper, if the COMPRESS option is set, the result is always defined and not empty. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: prop: use common helper for type to string conversionDavid Sterba
Use the helper for conversion, keep the semantics. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: SETFLAGS ioctl: use helper for compression type conversionDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: compression: add helper for type to string conversionDavid Sterba
There are several places opencoding this conversion, add a helper now that we have 3 compression algorithms. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: remove redundant check in btrfs_get_extent_fiemapNikolay Borisov
Before returning hole_em in btrfs_get_fiemap_extent we check if it's different than null. However, by the time this null check is triggered we already know hole_em is not null because it means it points to the em we found and it has already been dereferenced. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Remove unused variable in btrfs_get_extentNikolay Borisov
trans was statically assigned to NULL and this never changed over the course of btrfs_get_extent. So remove any code which checks whether trans != NULL and just hardcode the fact trans is always NULL. Resolves-coverity-id: 112806 Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: tree-checker: use %zu format string for size_tArnd Bergmann
The return value of sizeof() is of type size_t, so we must print it using the %z format modifier rather than %l to avoid this warning on some architectures: fs/btrfs/tree-checker.c: In function 'check_dir_item': fs/btrfs/tree-checker.c:273:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Werror=format=] Fixes: 005887f2e3e0 ("btrfs: tree-checker: Add checker for dir item") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: use struct completion in scrub_submit_raid56_bio_waitLiu Bo
This changes to use struct completion directly and removes 'struct scrub_bio_ret' along with the code using it. This struct is used to get the return value from bio, but the caller can access bio to get the return value directly and is holding a reference on it so it won't go away underneath us and can be removed safely. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: remove unused variable wait in lock_stripe_addLiu Bo
The defined wait is not used anywhere. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: compress_file_range() change page dirty status onceTimofey Titovets
We need to call extent_range_clear_dirty_for_io() on compression range to prevent application from changing page content, while pages compressing. extent_range_clear_dirty_for_io() runs on each loop iteration, "(end - start)" can be much (up to 1024 times) bigger then compression range (BTRFS_MAX_UNCOMPRESSED). The start pointer is advanced each time we manage to compress part of the range. The end pointer does not change so we could redirty the remaining parts repeatedly. Fix that behaviour by call extent_range_clear_dirty_for_io() only once, the first time it happens. This is the safest but probably not the best behaviour. Previous iterations of the patch tried to redirty only the range that we were not able to compress. This has been refused by David for safety reasons, the writeout callchain is complex and there could be some path that relies on redirtying the entire unwritten range. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> [ enhance changelog, the history and safety concerns, add comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: compression heuristic: replace heap sort with radix sortTimofey Titovets
Slowest part of heuristic for now is kernel heap sort() It's can take up to 55% of runtime on sorting bucket items. As sorting will always call on most data sets to get correctly byte_core_set_size, the only way to speed up heuristic, is to speed up sort on bucket. Add a general radix_sort function. Radix sort require 2 buffers, one full size of input array and one for store counters (jump addresses). That increase usage per heuristic workspace +1KiB 8KiB + 1KiB -> 8KiB + 2KiB That is LSD Radix, i use 4 bit as a base for calculating, to make counters array acceptable small (16 elements * 8 byte). That Radix sort implementation have several points to adjust, I added him to make radix sort general usable in kernel, like heap sort, if needed. Performance tested in userspace copy of heuristic code, throughput: - average <-> random data: ~3500 MiB/s - heap sort - average <-> random data: ~6000 MiB/s - radix sort Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> [ coding style fixes ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_FLUSH_SENTAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::is_tgtdev_for_dev_replace. Instead of that declare btrfs_device::dev_state BTRFS_DEV_STATE_FLUSH_SENT and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_REPLACE_TGTAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::is_tgtdev_for_dev_replace. Instead of that declare btrfs_device::dev_state BTRFS_DEV_STATE_MISSING and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_MISSINGAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::missing. Instead of that declare btrfs_device::dev_state BTRFS_DEV_STATE_MISSING and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by : Nikolay Borisov <nborisov@suse.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_IN_FS_METADATAAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::in_fs_metadata. Instead of that declare device state BTRFS_DEV_STATE_IN_FS_METADATA and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_WRITEABLEAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::writeable. Instead of that declare device state BTRFS_DEV_STATE_WRITEABLE and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: add helper for device path or missingAnand Jain
This patch creates a helper function to get either the rcu device path or missing. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ rename to btrfs_dev_name, switch to if/else ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: drop btrfs_device::can_discard to query directlyAnand Jain
We can query the bdev directly when needed at btrfs_discard_extent() so drop btrfs_device::can_discard. Signed-off-by: Anand Jain <anand.jain@oracle.com> Suggested-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: make function update_share_count staticColin Ian King
The function update_share_count is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: fs/btrfs/backref.c:219:6: warning: symbol 'update_share_count' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Remove redundant FLAG_VACANCYNikolay Borisov
Commit 9036c10208e1 ("Btrfs: update hole handling v2") added the FLAG_VACANCY to denote holes, however there was already a consistent way of flagging extents which represent hole - ->block_start = EXTENT_MAP_HOLE. And also the only place where this flag is checked is in the fiemap code, but the block_start value is also checked and every other place in the filesystem detects holes by using block_start value's. So remove the extra flag. This survived a full xfstest run. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: extent-tree: Make btrfs_inode_rsv_refill function staticQu Wenruo
This function is no longer used outside of extent-tree.c. Make it static. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: move some zstd work data from stack to workspaceDavid Sterba
* ZSTD_inBuffer in_buf * ZSTD_outBuffer out_buf are used in all functions to pass the compression parameters and the local variables consume some space. We can move them to the workspace and reduce the stack consumption: zstd.c:zstd_decompress -24 (136 -> 112) zstd.c:zstd_decompress_bio -24 (144 -> 120) zstd.c:zstd_compress_pages -24 (264 -> 240) Signed-off-by: David Sterba <dsterba@suse.com> Reviewed-by: Nick Terrell <terrelln@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: reorder btrfs_transaction members for better packingDavid Sterba
There are now 20 bytes of holes, we can reduce that to 4 by minor changes. Moving 'aborted' to the status and flags is also more logical, similar for num_dirty_bgs. The size goes from 432 to 416. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: use narrower type for btrfs_transaction::num_dirty_bgsDavid Sterba
The u64 is an overkill here, we could not possibly create that many blockgroups in one transaction. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: reorder btrfs_trans_handle members for better packingDavid Sterba
Recent updates to the structure left some holes, reorder the types so the packing is tight. The size goes from 112 to 104 on 64bit. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: switch to refcount_t type for btrfs_trans_handle::use_countDavid Sterba
The use_count is a reference counter, we can use the refcount_t type, though we don't use the atomicity. This is not a performance critical code and we could catch the underflows. The type is changed from long, but the number of references will fit an int. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: remove unused member of btrfs_trans_handleDavid Sterba
Last user was removed in a monster commit a22285a6a32390195235171 ("Btrfs: Integrate metadata reservation with start_transaction") in 2010. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: switch btrfs_trans_handle::adding_csums to boolDavid Sterba
The semantics of adding_csums matches bool, 'short' was most likely used to save space in a698d0755adb6f2 ("Btrfs: add a type field for the transaction handle"). Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: remove dead code from btrfs_get_extentEdmund Nadolski
Due to new_inline logic, the create == 0 is always true at this point in the code, so the create != 0 branch can be removed. Signed-off-by: Edmund Nadolski <enadolski@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: btrfs_inode_log_parent should use defined inode_only values.Edmund Nadolski
Replace hardcoded numeric argument values for inode_only with the constants defined for that use. Signed-off-by: Edmund Nadolski <enadolski@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: switch to on-stack csum buffer in csum_tree_blockDavid Sterba
The maximum size of a checksum buffer is known, BTRFS_CSUM_SIZE, and we don't have to allocate it dynamically. This code path is not used at all as we have only the crc32c and use an on-stack buffer already. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: set plug for fsyncLiu Bo
Setting plug can merge adjacent IOs before dispatching IOs to the disk driver. Without plug, it'd not be a problem for single disk usecases, but for multiple disks using raid profile, a large IO can be split to several IOs of stripe length, and plug can be helpful to bring them together for each disk so that we can save several disk access. Moreover, fsync issues synchronous writes, so plug can really take effect. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: factor __btrfs_open_devices() to create btrfs_open_one_device()Anand Jain
No functional changes, create btrfs_open_one_device() from __btrfs_open_devices(). This is a preparatory work to add dynamic device scan. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ minor whitespace fixes ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: move check for device generation to the lastAnand Jain
No functional changes. This helps to move the entire section into a new function. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>