Age  Commit message  Author
2016-11-30  btrfs: calculate end of bio offset properly  Christoph Hellwig
Use the bvec offset and len members to prepare for multipage bvecs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: use bi_size  Christoph Hellwig
Instead of using bi_vcnt to calculate it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: don't access the bio directly in btrfs_csum_one_bio  Christoph Hellwig
Use bio_for_each_segment_all to iterate over the segments instead. This requires a bit of reshuffling so that we only look up the ordered item once inside the bio_for_each_segment_all loop. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: don't access the bio directly in the direct I/O code  Christoph Hellwig
Just use bio_for_each_segment_all to iterate over all segments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: don't access the bio directly in the raid5/6 code  Christoph Hellwig
Just use bio_for_each_segment_all to iterate over all segments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: use bio iterators for the decompression handlers  Christoph Hellwig
Pass the full bio to the decompression routines and use bio iterators to iterate over the data in the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space  Jeff Mahoney
This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount with qgroups enabled. I was able to reproduce this by setting up a small (~500kb) quota limit and writing a file one byte at a time until I hit the limit. The warnings would all hit on umount. The root cause is that we would reserve a block-sized range in both the reservation and the quota in btrfs_check_data_free_space, but if we encountered a problem (like e.g. EDQUOT), we would only release the single byte in the qgroup reservation. That caused an iotree state split, which increased the number of outstanding extents, in turn disallowing releasing the metadata reservation. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
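A minimal sketch of the alignment idea, assuming a hypothetical helper `align_to_sectors()` and a caller-supplied power-of-two sector size. It is not the kernel change itself; it only shows how a one-byte range can be widened so that a release covers the same block-sized range that was reserved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not the kernel code: a byte range that has
 * been widened outward to sector boundaries. */
struct byte_range {
    uint64_t start;
    uint64_t len;
};

/* Expand [start, start + len) so that both ends sit on a sector boundary.
 * sectorsize must be a non-zero power of two. */
static struct byte_range align_to_sectors(uint64_t start, uint64_t len,
                                          uint64_t sectorsize)
{
    struct byte_range r;
    uint64_t mask = sectorsize - 1;
    uint64_t end;

    assert(sectorsize != 0 && (sectorsize & mask) == 0);

    end = (start + len + mask) & ~mask;  /* round the end up */
    r.start = start & ~mask;             /* round the start down */
    r.len = end - r.start;
    return r;
}
```

With a 4096-byte sector, releasing the single byte at offset 5 would then release the whole range 0..4095, which matches what the reservation covered.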
2016-11-30  Btrfs: abort transaction if fill_holes() fails  Josef Bacik
At this point we will have dropped extent entries from the file, so if we fail to insert the new hole entries then we are leaving the fs in a corrupt state (albeit an easily fixed one). Abort the transaction if this happens so we can avoid corrupting the fs. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  Btrfs: fix file extent corruption  Josef Bacik
In order to do hole punching we have a block reserve to hold the reservation we need to drop the extents in our range. Since we could end up dropping a lot of extents we set rsv->failfast so we can just loop around again and drop the remainder of the range. Unfortunately we unconditionally fill the hole extents in and start from the last extent we encountered, which we may or may not have dropped. So this can result in overlapping file extent entries, which can be tripped over in a variety of ways, either by hitting BUG_ON(!ret) in fill_holes() after the search, or in btrfs_set_item_key_safe() in btrfs_drop_extent() at a later time by an unrelated task. Fix this by only setting drop_end to the last extent we did actually drop. This way our holes are filled in properly for the range that we did drop, and the rest of the range that remains to be dropped is actually dropped. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: increment ctx->pos for every emitted or skipped dirent in readdir  Jeff Mahoney
If we process the last item in the leaf and hit an I/O error while reading the next leaf, we return -EIO without having adjusted the position. Since we have emitted dirents, getdents() will return the byte count to the user instead of the error. Subsequent callers will emit the last successful dirent again, and return -EIO again, with the same result. Callers loop forever. Instead, if we always increment ctx->pos after emitting or skipping the dirent, we'll be sure that we won't hit the same one again. When we go to process the next leaf, we won't have emitted any dirents and the -EIO will be returned to the user properly. We also don't need to track if we've emitted a dirent already or if we've changed the position yet. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
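The position handling described above can be modelled in a few lines. This is a simplified sketch, not the btrfs code: the directory is a flat array, `fail_at` is a hypothetical index at which loading the next leaf fails, and `model_readdir()` stands in for one getdents() call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the readdir context; pos plays the role of
 * ctx->pos and names the next entry to process. */
struct dir_ctx {
    size_t pos;
};

/* Emit up to batch entries. Returns the number of entries emitted, or
 * -EIO when the simulated I/O error hits before anything was emitted
 * (a caller that already emitted entries reports those instead). */
static int model_readdir(struct dir_ctx *ctx, size_t nr_entries,
                         size_t fail_at, size_t batch)
{
    int emitted = 0;

    assert(ctx != NULL);

    while (ctx->pos < nr_entries && (size_t)emitted < batch) {
        if (ctx->pos == fail_at)
            return emitted ? emitted : -EIO;
        emitted++;
        ctx->pos++;  /* always advance past an emitted entry */
    }
    return emitted;
}
```

Because the position moves forward with every emitted entry, the call that follows a partial result starts at the failing entry, emits nothing, and returns the error instead of repeating the last entry.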
2016-11-30  btrfs: remove old tree_root dirent processing in btrfs_real_readdir()  Jeff Mahoney
Commit 3de4586c527 (Btrfs: Allow subvolumes and snapshots anywhere in the directory tree) introduced the current system of placing snapshots in the directory tree. It also introduced the behavior of creating the snapshot and then creating the directory entries for it. We've kept this code around for compatibility reasons, but it turns out that no file systems with the old tree_root based snapshots can be mounted on newer (>= 2009) kernels anyway. About a month after the above commit, commit 2a7108ad89e (Btrfs: rev the disk format for the inode compat and csum selection changes) landed, changing the superblock magic number. As a result, we know that we'll never encounter tree_root-based dirents or have to deal with skipping our own snapshot dirents. Since that also means that we're now only iterating over DIR_INDEX items, which only contain one directory entry per leaf item, we don't need to loop over the leaf item contents anymore either. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: Call kunmap if zlib_inflateInit2 fails  Nick Terrell
If zlib_inflateInit2 fails, the input page is never unmapped. Add a call to kunmap when it fails. Signed-off-by: Nick Terrell <nickrterrell@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: store and load values of stripes_min/stripes_max in balance status item  David Sterba
The balance status item contains the currently known filter values, but the stripes filter was unintentionally not among them. This means that an interrupted and automatically restarted balance does not apply the stripes filter. Fixes: dee32d0ac3719ef8d640efaf0884111df444730f CC: stable@vger.kernel.org # 4.4+ Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove redundant check of btrfs_iget return value  Christophe JAILLET
'btrfs_iget()' cannot return NULL, so this test can be removed. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: change btrfs_csum_final result param type to u8  Domagoj Tršan
csum member of struct btrfs_super_block has array type of u8. It makes sense that function btrfs_csum_final should be also declared to accept u8 *. I changed the declaration of method void btrfs_csum_final(u32 crc, char *result); to void btrfs_csum_final(u32 crc, u8 *result); Signed-off-by: Domagoj Tršan <domagoj.trsan@gmail.com> [ changed cast to u8 at several call sites ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  Btrfs: adjust len of writes if following a preallocated extent  Liu Bo
If we have |0--hole--4095||4096--preallocate--12287| then, instead of using the preallocated space, an 8K direct write will just create a new 8K extent, and we end up with |0--new extent--8191||8192--preallocate--12287|. This is because we find a hole em and then go on to create a new 8K extent directly without adjusting @len. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
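One way to picture the length adjustment is to clamp the write length to the end of the extent map that was found. The sketch below uses hypothetical names (`extent_span`, `clamp_len_to_extent()`) and is only an illustration of the arithmetic, not the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the start and length of an extent map. */
struct extent_span {
    uint64_t start;
    uint64_t len;
};

/* Limit a write of len bytes at offset start so that it does not run past
 * the end of em. start must lie inside em. */
static uint64_t clamp_len_to_extent(const struct extent_span *em,
                                    uint64_t start, uint64_t len)
{
    uint64_t em_end = em->start + em->len;
    uint64_t avail;

    assert(start >= em->start && start < em_end);

    avail = em_end - start;  /* bytes left in this extent map */
    return len < avail ? len : avail;
}
```

For the layout in the commit message, an 8K write at offset 0 into the 4K hole would be clamped to 4096 bytes, so the following 4K would be handled against the preallocated extent rather than folded into a new 8K extent.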
2016-11-30  btrfs: return early from failed memory allocations in ioctl handlers  Shailendra Verma
There is no need to call kfree() if memdup_user() fails, as no memory was allocated and the error in the error-valued pointer should be returned. Signed-off-by: Shailendra Verma <shailendra.v@samsung.com> [ edit subject ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: add optimized version of eb to eb copy  David Sterba
Using copy_extent_buffer is suitable for copying between buffers from an arbitrary offset and deals with page boundaries. This is not necessary when doing a full extent_buffer-to-extent_buffer copy. We can utilize the copy_page helper as well. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove constant parameter to memset_extent_buffer and rename it  David Sterba
The only memset we do is to 0, so sink the parameter to the function and simplify all calls. Rename the function to reflect the behaviour. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: use specialized page copying helpers in btrfs_clone_extent_buffer  David Sterba
The copy_page is usually optimized and can be faster than memcpy. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: use new helpers to set uuids in eb  David Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: introduce helpers for updating eb uuids  David Sterba
The fsid and chunk tree uuid are always located in the first page, so we don't need to use write_extent_buffer. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: delete unused member from superblock  David Sterba
'__bdev' has never been used since 0b86a832a1f38abec695864ec2eaedc9d2383f1b (2008). Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove trivial helper btrfs_find_tree_block  David Sterba
Over time, the function has shrunk to the point that it just calls find_extent_buffer and passes the parameters through. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: reada, remove pointless BUG_ON check for fs_info  David Sterba
We dereference fs_info several times, besides that post-mount functions should never see a NULL fs_info. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: reada, remove pointless BUG_ON in reada_find_extent  David Sterba
The lock is held, we make the same lookup that previously failed with EEXIST and we don't insert NULL pointers. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: reada, sink start parameter to btree_readahead_hook  David Sterba
Originally, the eb and start were passed separately in case eb is NULL. Since the readahead has been refactored in 4.6, this is not true anymore and we can get rid of the parameter. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: reada, remove unused parameter from __readahead_hook  David Sterba
'start' is not used since "btrfs: reada: Pass reada_extent into __readahead_hook directly" (6e39dbe8b9e55280c). Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: reada, cleanup remove unneeded variable in __readahead_hook  David Sterba
We can't touch the eb directly in case the function is called with a non-zero error, so we can read the eb level when needed. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: rename helper macros for qgroup and aux data casts  David Sterba
The helpers are not meant to be generic, the name is misleading. Convert them to static inlines for type checking. Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove stale comment from btrfs_statfs  David Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove unused headers, statfs.h  David Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: remove useless comments  Xiaoguang Wang
Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely") Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30  btrfs: make block group flags in balance printks human-readable  Adam Borowski
They're not even documented anywhere, leaving users with no recourse but to RTFS. It's no big burden to output the bitfield as words. Also, display unknown flags as hex. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: David Sterba <dsterba@suse.com>
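Printing a bitfield as words with a hex fallback can be sketched as follows. The table contents, the `|` separator, and the function name `describe_flags()` are assumptions made for this illustration; they are not taken from the patch and the table is deliberately incomplete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
    uint64_t bit;
    const char *name;
};

/* Illustrative table only: the names and bit positions are assumptions
 * for this sketch, not a copy of the kernel definitions. */
static const struct flag_name flag_names[] = {
    { 1ULL << 0, "data" },
    { 1ULL << 1, "system" },
    { 1ULL << 2, "metadata" },
};

/* Write the known flags as words joined by '|' into buf; any bits left
 * over without a name are appended as one hex value. */
static void describe_flags(uint64_t flags, char *buf, size_t size)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
        int n;

        if (!(flags & flag_names[i].bit))
            continue;
        n = snprintf(buf + used, size - used, "%s%s",
                     used ? "|" : "", flag_names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            return;  /* output was truncated, stop here */
        used += (size_t)n;
        flags &= ~flag_names[i].bit;
    }

    if (flags)  /* bits without a known name are shown as hex */
        snprintf(buf + used, size - used, "%s0x%llx",
                 used ? "|" : "", (unsigned long long)flags);
}
```

Clearing each recognised bit as it is printed leaves exactly the unknown bits behind, so the hex fallback needs no separate mask.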
2016-11-30  Btrfs: deal with existing encompassing extent map in btrfs_get_extent()  Omar Sandoval
My QEMU VM was seeing inexplicable I/O errors that I tracked down to errors coming from the qcow2 virtual drive in the host system. The qcow2 file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT. Every once in a while, pread() or pwrite() would return EEXIST, which makes no sense. This turned out to be a bug in btrfs_get_extent(). Commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where two threads race on adding the same extent map to an inode's extent map tree. However, if the added em is merged with an adjacent em in the extent tree, then we'll end up with an existing extent that is not identical to but instead encompasses the extent we tried to add. When we call merge_extent_mapping() to find the nonoverlapping part of the new em, the arithmetic overflows because there is no such thing. We then end up trying to add a bogus em to the em_tree, which results in an EEXIST that can bubble all the way up to userspace. Fix it by extending the identical extent map special case. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
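The difference between an identical and an encompassing extent map comes down to a range containment test. The sketch below uses hypothetical names (`em_range`, `em_encompasses()`) and shows only that test, under the assumption that the extended special case amounts to a check of this kind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the start and length of an extent map. */
struct em_range {
    uint64_t start;
    uint64_t len;
};

/* Return 1 if existing covers every byte of em (identical ranges included),
 * in which case no non-overlapping part of em is left to insert. */
static int em_encompasses(const struct em_range *existing,
                          const struct em_range *em)
{
    /* Guard the end computations against wrapping. */
    assert(existing->len <= UINT64_MAX - existing->start);
    assert(em->len <= UINT64_MAX - em->start);

    return existing->start <= em->start &&
           existing->start + existing->len >= em->start + em->len;
}
```

When this returns 1, subtracting the existing range from the new one would leave nothing, which is the situation in which the unsigned arithmetic described above wraps around.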
2016-11-30  btrfs: add necessary comments about tickets_id  Wang Xiaoguang
The name tickets_id may cause some misunderstanding: it just indicates the next ticket that will be handled and is not stored per ticket. Fixes: ce12965 ("btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress") Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-29  btrfs: cleanup: use already calculated value in btrfs_should_throttle_delayed_refs()  Wang Xiaoguang
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-29  btrfs: don't abuse REQ_OP_* flags for btrfs_map_block  Christoph Hellwig
btrfs_map_block supports different types of mappings, which to a large extent resemble block layer operations. But they don't always do, and currently btrfs dangerously overlays its own flag over the block layer flags. This is just asking for a conflict, so introduce a different map flags enum inside of btrfs instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-27  Linux 4.9-rc7  Linus Torvalds
2016-11-27  Merge git://git.infradead.org/intel-iommu  Linus Torvalds
Pull IOMMU fixes from David Woodhouse: "Two minor fixes. The first fixes the assignment of SR-IOV virtual functions to the correct IOMMU unit, and the second fixes the excessively large (and physically contiguous) PASID tables used with SVM" * git://git.infradead.org/intel-iommu: iommu/vt-d: Fix PASID table allocation iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
2016-11-27  Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus  Linus Torvalds
Pull MIPS fixes from Ralf Baechle: "Another round of MIPS fixes for 4.9: - Fix unreadable output in __do_page_fault due to the KERN_CONT patchset - Correctly handle MIPS R6 fixes to the c0_wired register" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: mm: Fix output of __do_page_fault MIPS: Mask out limit field when calculating wired entry count
2016-11-26  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  Linus Torvalds
Pull vfs splice fix from Al Viro. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix default_file_splice_read()
2016-11-26  fix default_file_splice_read()  Al Viro
Botched calculation of number of pages. As a result, we were dropping pieces when doing splice to pipe from e.g. 9p. Reported-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
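As a generic illustration of this kind of calculation (not the actual fix, whose details are not given here), counting the pages touched by a byte range has to account for the offset into the first page. The 4096-byte page size and the name `pages_spanned()` are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the illustration */

/* Number of pages touched by len bytes starting at byte offset offset.
 * Ignoring the offset into the first page undercounts whenever the range
 * straddles a page boundary. */
static size_t pages_spanned(size_t offset, size_t len)
{
    size_t in_page = offset % SKETCH_PAGE_SIZE;

    if (len == 0)
        return 0;

    /* Guard the rounding arithmetic against wrapping. */
    assert(len <= SIZE_MAX - in_page - (SKETCH_PAGE_SIZE - 1));

    return (in_page + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A two-byte range starting at the last byte of a page touches two pages, although dividing its length by the page size and rounding up gives only one.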
2016-11-26  Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux  Linus Torvalds
Pull i2c fixes from Wolfram Sang: "Here is a revert and two bugfixes for the I2C designware driver. Please note that we are still hunting down a regression for the i2c-octeon driver. While there is a fix pending, we have unclear feedback from the testers currently. An rc8 would be quite helpful for this case" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Revert "i2c: designware: do not disable adapter after transfer" i2c: designware: fix rx fifo depth tracking i2c: designware: report short transfers
2016-11-26  Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm  Linus Torvalds
Pull ARM fix from Russell King: "This resolves the ksyms issues by reverting the commit which introduced the breakage" There was what I consider to be a better fix, but it's late in the rc game, so I'll take the revert. * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: Revert "arm: move exports to definitions"
2016-11-26  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds
Pull networking fixes from David Miller: 1) Fix leak in fsl/fman driver, from Dan Carpenter. 2) Call flow dissector initcall earlier than any networking driver can register and start to use it, from Eric Dumazet. 3) Some dup header fixes from Geliang Tang. 4) TIPC link monitoring compat fix from Jon Paul Maloy. 5) Link changes require EEE re-negotiation in bcm_sf2 driver, from Florian Fainelli. 6) Fix bogus handle ID passed into tfilter_notify_chain(), from Roman Mashak. 7) Fix dump size calculation in rtnl_calcit(), from Zhang Shengju. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) tipc: resolve connection flow control compatibility problem mvpp2: use correct size for memset net/mlx5: drop duplicate header delay.h net: ieee802154: drop duplicate header delay.h ibmvnic: drop duplicate header seq_file.h fsl/fman: fix a leak in tgec_free() net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS tipc: improve sanity check for received domain records tipc: fix compatibility bug in link monitoring net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented dwc_eth_qos: drop duplicate headers net sched filters: fix filter handle ID in tfilter_notify_chain() net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change bnxt: do not busy-poll when link is down udplite: call proper backlog handlers ipv6: bump genid when the IFA_F_TENTATIVE flag is clear net/mlx4_en: Free netdev resources under state lock net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit" rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit() bnxt_en: Fix a VXLAN vs GENEVE issue ...
2016-11-26  Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm  Linus Torvalds
Pull libnvdimm fixes from Dan Williams: - Fix a crash that occurs at driver initialization if the memory region is already busy (request_mem_region() fails). - Fix a vma validation check that mistakenly allows a private device-dax mapping to be established. Device-dax explicitly forbids private mappings so it can guarantee a given fault granularity and backing memory type. Both of these fixes have soaked in -next and are tagged for -stable. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: fail all private mapping attempts device-dax: check devm_nsio_enable() return value
2016-11-26  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds
Pull KVM fixes from Radim Krčmář: "Four fixes for bugs found by syzkaller on x86, all for stable" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: check for pic and ioapic presence before use KVM: x86: fix out-of-bounds accesses of rtc_eoi map KVM: x86: drop error recovery in em_jmp_far and em_ret_far KVM: x86: fix out-of-bounds access in lapic
2016-11-26  Merge tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  Linus Torvalds
Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Set missing wakeup bit in LPCR on POWER9 - Fix the early OPAL console wrappers - Fixup kernel read only mapping Fixes for code merged this cycle: - Fix missing CRCs, add more asm-prototypes.h declarations" * tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fixup kernel read only mapping powerpc/boot: Fix the early OPAL console wrappers powerpc: Fix missing CRCs, add more asm-prototypes.h declarations powerpc: Set missing wakeup bit in LPCR on POWER9
2016-11-25  tipc: resolve connection flow control compatibility problem  Jon Paul Maloy
In commit 10724cc7bb78 ("tipc: redesign connection-level flow control") we replaced the previous message based flow control with one based on 1k blocks. In order to ensure backwards compatibility the mechanism falls back to using message as base unit when it senses that the peer doesn't support the new algorithm. The default flow control window, i.e., how many units can be sent before the sender blocks and waits for an acknowledge (aka advertisement) is 512. This was tested against the previous version, which uses an acknowledge frequency of one ack per 256 received messages, and found to work fine. However, we missed the fact that versions older than Linux 3.15 use an acknowledge frequency of 512, which is exactly the limit where a 4.6+ sender will stop and wait for acknowledge. This would also work fine if it weren't for the fact that if the first sent message on a 4.6+ server side is an empty SYNACK, this one is also counted as a sent message, while it is not counted as a received message on a legacy 3.15-receiver. This leads to the sender always being one step ahead of the receiver, a scenario causing the sender to block after 512 sent messages, while the receiver only has registered 511 read messages. Hence, the legacy receiver is not triggered to send an acknowledge, with a permanently blocked sender as a result. We solve this deadlock by simply allowing the sender to send one more message before it blocks, i.e., by making a minimal change to the condition used for determining connection congestion. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
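The off-by-one described above can be replayed with a small counting model. This is a sketch under simplifying assumptions: the function and variable names are invented for the illustration, the sender counts the empty SYNACK while the legacy receiver does not, and "one more message" is modelled as comparing with `>` instead of `>=`.

```c
#include <assert.h>
#include <stdbool.h>

/* Replay the counters until either the receiver acks or the sender blocks.
 * window:         messages the sender may have outstanding
 * ack_freq:       receiver acks once per this many received messages
 * allow_one_more: true models the relaxed congestion condition
 * Returns true if the sender blocks while no ack is on its way. */
static bool model_deadlocks(int window, int ack_freq, bool allow_one_more)
{
    int snt_unacked = 1;  /* the empty SYNACK counts on the sender side */
    int rcv_unacked = 0;  /* but not on the legacy receiver side */

    assert(window > 0 && ack_freq > 0);

    for (;;) {
        bool congested = allow_one_more ? snt_unacked > window
                                        : snt_unacked >= window;

        if (congested)
            return rcv_unacked < ack_freq;  /* nothing will unblock us */

        snt_unacked++;  /* one data message goes out */
        rcv_unacked++;  /* and is registered by the receiver */
        if (rcv_unacked >= ack_freq)
            return false;  /* receiver acks, sender can continue */
    }
}
```

With a window of 512 and an acknowledge frequency of 512 the strict condition blocks the sender at 512 sent against 511 received, while the relaxed condition lets the 512th data message through and the ack is sent. An acknowledge frequency of 256 never reaches the limit under either condition.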