summaryrefslogtreecommitdiffstats
path: root/drivers/md/raid5-log.h
AgeCommit message (Collapse)Author
2017-05-11md/r5cache: gracefully handle journal device errors for writeback modeSong Liu
For the raid456 with writeback cache, when journal device failed during normal operation, it is still possible to persist all data, as all pending data is still in stripe cache. However, it is necessary to handle journal failure gracefully. During journal failures, the following logic handles the graceful shutdown of journal: 1. raid5_error() marks the device as Faulty and schedules async work log->disable_writeback_work; 2. In disable_writeback_work (r5c_disable_writeback_async), the mddev is suspended, set to write through, and then resumed. mddev_suspend() flushes all cached stripes; 3. All cached stripes need to be flushed carefully to the RAID array. This patch fixes issues within the process above: 1. In r5c_update_on_rdev_error() schedule disable_writeback_work for journal failures; 2. In r5c_disable_writeback_async(), wait for MD_SB_CHANGE_PENDING, since raid5_error() updates superblock. 3. In handle_stripe(), allow stripes with data in journal (s.injournal > 0) to make progress during log_failed; 4. In delay_towrite(), if log failed only process data in the cache (skip new writes in dev->towrite); 5. In __get_priority_stripe(), process loprio_list during journal device failures. 6. In raid5_remove_disk(), wait for all cached stripes are flushed before calling log_exit(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-04-10raid5-ppl: use resize_stripes() when enabling or disabling pplArtur Paszkiewicz
Use resize_stripes() instead of raid5_reset_stripe_cache() to allocate or free sh->ppl_page at runtime for all stripes in the stripe cache. raid5_reset_stripe_cache() required suspending the mddev and could deadlock because of GFP_KERNEL allocations. Move the 'newsize' check to check_reshape() to allow reallocating the stripes with the same number of disks. Allocate sh->ppl_page in alloc_stripe() instead of grow_buffers(). Pass 'struct r5conf *conf' as a parameter to alloc_stripe() because it is needed to check whether to allocate ppl_page. Add free_stripe() and use it to free stripes rather than directly call kmem_cache_free(). Also free sh->ppl_page in free_stripe(). Set MD_HAS_PPL at the end of ppl_init_log() instead of explicitly setting it in advance and add another parameter to log_init() to allow calling ppl_init_log() without the bit set. Don't try to calculate partial parity or add a stripe to log if it does not have ppl_page set. Enabling ppl can now be performed without suspending the mddev, because the log won't be used until new stripes are allocated with ppl_page. Calling mddev_suspend/resume is still necessary when disabling ppl, because we want all stripes to finish before stopping the log, but resize_stripes() can be called after mddev_resume() when ppl is no longer active. Suggested-by: NeilBrown <neilb@suse.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-22md/raid5: call bio_endio() directly rather than queueing for later.NeilBrown
We currently gather bios that need to be returned into a bio_list and call bio_endio() on them all together. The original reason for this was to avoid making the calls while holding a spinlock. Locking has changed a lot since then, and that reason is no longer valid. So discard return_io() and various return_bi lists, and just call bio_endio() directly as needed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-16raid5-ppl: support disk hot add/remove with PPLArtur Paszkiewicz
Add a function to modify the log by removing an rdev when a drive fails or adding when a spare/replacement is activated as a raid member. Removing a disk just clears the child log rdev pointer. No new stripes will be accepted for this child log in ppl_write_stripe() and running io units will be processed without writing PPL to the device. Adding a disk sets the child log rdev pointer and writes an empty PPL header. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-16raid5-ppl: Partial Parity Log write logging implementationArtur Paszkiewicz
Implement the calculation of partial parity for a stripe and PPL write logging functionality. The description of PPL is added to the documentation. More details can be found in the comments in raid5-ppl.c. Attach a page for holding the partial parity data to stripe_head. Allocate it only if mddev has the MD_HAS_PPL flag set. Partial parity is the xor of not modified data chunks of a stripe and is calculated as follows: - reconstruct-write case: xor data from all not updated disks in a stripe - read-modify-write case: xor old data and parity from all updated disks in a stripe Implement it using the async_tx API and integrate into raid_run_ops(). It must be called when we still have access to old data, so do it when STRIPE_OP_BIODRAIN is set, but before ops_run_prexor5(). The result is stored into sh->ppl_page. Partial parity is not meaningful for full stripe write and is not stored in the log or used for recovery, so don't attempt to calculate it when stripe has STRIPE_FULL_WRITE. Put the PPL metadata structures to md_p.h because userspace tools (mdadm) will also need to read/write PPL. Warn about using PPL with enabled disk volatile write-back cache for now. It can be removed once disk cache flushing before writing PPL is implemented. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-16raid5: separate header for log functionsArtur Paszkiewicz
Move raid5-cache declarations from raid5.h to raid5-log.h, add inline wrappers for functions which will be shared with ppl and use them in raid5 core instead of direct calls to raid5-cache. Remove unused parameter from r5c_cache_data(), move two duplicated pr_debug() calls to r5l_init_log(). Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>