From e73f6e8a0de8ce28edef986d86720ef83dd2864f Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Fri, 27 Feb 2015 15:25:31 -0500 Subject: dm switch: fix Documentation to use plain text Signed-off-by: Mike Snitzer --- Documentation/device-mapper/switch.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/switch.txt b/Documentation/device-mapper/switch.txt index 8897d0494838..424835e57f27 100644 --- a/Documentation/device-mapper/switch.txt +++ b/Documentation/device-mapper/switch.txt @@ -47,8 +47,8 @@ consume far too much memory. Using this device-mapper switch target we can now build a two-layer device hierarchy: - Upper Tier – Determine which array member the I/O should be sent to. - Lower Tier – Load balance amongst paths to a particular member. + Upper Tier - Determine which array member the I/O should be sent to. + Lower Tier - Load balance amongst paths to a particular member. The lower tier consists of a single dm multipath device for each member. Each of these multipath devices contains the set of paths directly to -- cgit v1.2.3 From 0ce65797a77ee780f62909d3128bf08b9735718b Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Thu, 26 Feb 2015 00:50:28 -0500 Subject: dm: impose configurable deadline for dm_request_fn's merge heuristic Otherwise, for sequential workloads, the dm_request_fn can allow excessive request merging at the expense of increased service time. Add a per-device sysfs attribute to allow the user to control how long a request, that is a reasonable merge candidate, can be queued on the request queue. The resolution of this request dispatch deadline is in microseconds (ranging from 1 to 100000 usecs), to set a 20us deadline: echo 20 > /sys/block/dm-7/dm/rq_based_seq_io_merge_deadline The dm_request_fn's merge heuristic and associated extra accounting is disabled by default (rq_based_seq_io_merge_deadline is 0). This sysfs attribute is not applicable to bio-based DM devices so it will only ever report 0 for them. By allowing a request to remain on the queue it will block others requests on the queue. But introducing a short dequeue delay has proven very effective at enabling certain sequential IO workloads on really fast, yet IOPS constrained, devices to build up slightly larger IOs -- yielding 90+% throughput improvements. Having precise control over the time taken to wait for larger requests to build affords control beyond that of waiting for certain IO sizes to accumulate (which would require a deadline anyway). This knob will only ever make sense with sequential IO workloads and the particular value used is storage configuration specific. Given the expected niche use-case for when this knob is useful it has been deemed acceptable to expose this relatively crude method for crafting optimal IO on specific storage -- especially given the solution is simple yet effective. In the context of DM multipath, it is advisable to tune this sysfs attribute to a value that offers the best performance for the common case (e.g. if 4 paths are expected active, tune for that; if paths fail then performance may be slightly reduced). Alternatives were explored to have request-based DM autotune this value (e.g. if/when paths fail) but they were quickly deemed too fragile and complex to warrant further design and development time. If this problem proves more common as faster storage emerges we'll have to look at elevating a generic solution into the block core. Tested-by: Shiva Krishna Merla Signed-off-by: Mike Snitzer --- Documentation/ABI/testing/sysfs-block-dm | 14 ++++++++++++++ 1 file changed, 14 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-block-dm b/Documentation/ABI/testing/sysfs-block-dm index 87ca5691e29b..ac4b6fe245d9 100644 --- a/Documentation/ABI/testing/sysfs-block-dm +++ b/Documentation/ABI/testing/sysfs-block-dm @@ -23,3 +23,17 @@ Description: Device-mapper device suspend state. Contains the value 1 while the device is suspended. Otherwise it contains 0. Read-only attribute. Users: util-linux, device-mapper udev rules + +What: /sys/block/dm-/dm/rq_based_seq_io_merge_deadline +Date: March 2015 +KernelVersion: 4.1 +Contact: dm-devel@redhat.com +Description: Allow control over how long a request that is a + reasonable merge candidate can be queued on the request + queue. The resolution of this deadline is in + microseconds (ranging from 1 to 100000 usecs). + Setting this attribute to 0 (the default) will disable + request-based DM's merge heuristic and associated extra + accounting. This attribute is not applicable to + bio-based DM devices so it will only ever report 0 for + them. -- cgit v1.2.3 From 17e149b8f73ba116e71e25930dd6f2eb3828792d Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Wed, 11 Mar 2015 15:01:09 -0400 Subject: dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr Request-based DM's blk-mq support defaults to off; but a user can easily change the default using the dm_mod.use_blk_mq module/boot option. Also, you can check what mode a given request-based DM device is using with: cat /sys/block/dm-X/dm/use_blk_mq This change enabled further cleanup and reduced work (e.g. the md->io_pool and md->rq_pool isn't created if using blk-mq). Signed-off-by: Mike Snitzer --- Documentation/ABI/testing/sysfs-block-dm | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-block-dm b/Documentation/ABI/testing/sysfs-block-dm index ac4b6fe245d9..f9f2339b9a0a 100644 --- a/Documentation/ABI/testing/sysfs-block-dm +++ b/Documentation/ABI/testing/sysfs-block-dm @@ -37,3 +37,11 @@ Description: Allow control over how long a request that is a accounting. This attribute is not applicable to bio-based DM devices so it will only ever report 0 for them. + +What: /sys/block/dm-/dm/use_blk_mq +Date: March 2015 +KernelVersion: 4.1 +Contact: dm-devel@redhat.com +Description: Request-based Device-mapper blk-mq I/O path mode. + Contains the value 1 if the device is using blk-mq. + Otherwise it contains 0. Read-only attribute. -- cgit v1.2.3 From 0e0e32c16cfd2eeaf4fd4f16aa6cccd1333ce1e0 Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Wed, 18 Mar 2015 20:57:29 -0400 Subject: dm thin: remove stale 'trim' message documentation The 'trim' message wasn't ever implemented. Signed-off-by: Mike Snitzer --- Documentation/device-mapper/thin-provisioning.txt | 3 --- 1 file changed, 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index 2f5173500bd9..4f67578b2954 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -380,9 +380,6 @@ then you'll have no access to blocks mapped beyond the end. If you load a target that is bigger than before, then extra blocks will be provisioned as and when needed. -If you wish to reduce the size of your thin device and potentially -regain some space then send the 'trim' message to the pool. - ii) Status -- cgit v1.2.3 From 65ff5b7ddf0541f2b6e5cc59c47bfbf6cbcd91b8 Mon Sep 17 00:00:00 2001 From: Sami Tolvanen Date: Wed, 18 Mar 2015 15:52:14 +0000 Subject: dm verity: add error handling modes for corrupted blocks Add device specific modes to dm-verity to specify how corrupted blocks should be handled. The following modes are defined: - DM_VERITY_MODE_EIO is the default behavior, where reading a corrupted block results in -EIO. - DM_VERITY_MODE_LOGGING only logs corrupted blocks, but does not block the read. - DM_VERITY_MODE_RESTART calls kernel_restart when a corrupted block is discovered. In addition, each mode sends a uevent to notify userspace of corruption and to allow further recovery actions. The driver defaults to previous behavior (DM_VERITY_MODE_EIO) and other modes can be enabled with an additional parameter to the verity table. Signed-off-by: Sami Tolvanen Signed-off-by: Mike Snitzer --- Documentation/device-mapper/verity.txt | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'Documentation') diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt index 9884681535ee..64ccc5a079a5 100644 --- a/Documentation/device-mapper/verity.txt +++ b/Documentation/device-mapper/verity.txt @@ -11,6 +11,7 @@ Construction Parameters + [<#opt_params> ] This is the type of the on-disk hash format. @@ -62,6 +63,22 @@ Construction Parameters The hexadecimal encoding of the salt value. +<#opt_params> + Number of optional parameters. If there are no optional parameters, + the optional paramaters section can be skipped or #opt_params can be zero. + Otherwise #opt_params is the number of following arguments. + + Example of optional parameters section: + 1 ignore_corruption + +ignore_corruption + Log corrupted blocks, but allow read operations to proceed normally. + +restart_on_corruption + Restart the system when a corrupted block is discovered. This option is + not compatible with ignore_corruption and requires user space support to + avoid restart loops. + Theory of operation =================== -- cgit v1.2.3 From 0e9cebe724597a76ab1b0ebc0a21e16f7db11b47 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Fri, 20 Mar 2015 10:50:37 -0400 Subject: dm: add log writes target Introduce a new target that is meant for file system developers to test file system integrity at particular points in the life of a file system. We capture all write requests and associated data and log them to a separate device for later replay. There is a userspace utility to do this replay. The idea behind this is to give file system developers a tool to verify that the file system is always consistent. Signed-off-by: Josef Bacik Reviewed-by: Zach Brown Signed-off-by: Mike Snitzer --- Documentation/device-mapper/log-writes.txt | 140 +++++++++++++++++++++++++++++ 1 file changed, 140 insertions(+) create mode 100644 Documentation/device-mapper/log-writes.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/log-writes.txt b/Documentation/device-mapper/log-writes.txt new file mode 100644 index 000000000000..c10f30c9b534 --- /dev/null +++ b/Documentation/device-mapper/log-writes.txt @@ -0,0 +1,140 @@ +dm-log-writes +============= + +This target takes 2 devices, one to pass all IO to normally, and one to log all +of the write operations to. This is intended for file system developers wishing +to verify the integrity of metadata or data as the file system is written to. +There is a log_write_entry written for every WRITE request and the target is +able to take arbitrary data from userspace to insert into the log. The data +that is in the WRITE requests is copied into the log to make the replay happen +exactly as it happened originally. + +Log Ordering +============ + +We log things in order of completion once we are sure the write is no longer in +cache. This means that normal WRITE requests are not actually logged until the +next REQ_FLUSH request. This is to make it easier for userspace to replay the +log in a way that correlates to what is on disk and not what is in cache, to +make it easier to detect improper waiting/flushing. + +This works by attaching all WRITE requests to a list once the write completes. +Once we see a REQ_FLUSH request we splice this list onto the request and once +the FLUSH request completes we log all of the WRITEs and then the FLUSH. Only +completed WRITEs, at the time the REQ_FLUSH is issued, are added in order to +simulate the worst case scenario with regard to power failures. Consider the +following example (W means write, C means complete): + +W1,W2,W3,C3,C2,Wflush,C1,Cflush + +The log would show the following + +W3,W2,flush,W1.... + +Again this is to simulate what is actually on disk, this allows us to detect +cases where a power failure at a particular point in time would create an +inconsistent file system. + +Any REQ_FUA requests bypass this flushing mechanism and are logged as soon as +they complete as those requests will obviously bypass the device cache. + +Any REQ_DISCARD requests are treated like WRITE requests. Otherwise we would +have all the DISCARD requests, and then the WRITE requests and then the FLUSH +request. Consider the following example: + +WRITE block 1, DISCARD block 1, FLUSH + +If we logged DISCARD when it completed, the replay would look like this + +DISCARD 1, WRITE 1, FLUSH + +which isn't quite what happened and wouldn't be caught during the log replay. + +Target interface +================ + +i) Constructor + + log-writes + + dev_path : Device that all of the IO will go to normally. + log_dev_path : Device where the log entries are written to. + +ii) Status + + <#logged entries> + + #logged entries : Number of logged entries + highest allocated sector : Highest allocated sector + +iii) Messages + + mark + + You can use a dmsetup message to set an arbitrary mark in a log. + For example say you want to fsck a file system after every + write, but first you need to replay up to the mkfs to make sure + we're fsck'ing something reasonable, you would do something like + this: + + mkfs.btrfs -f /dev/mapper/log + dmsetup message log 0 mark mkfs + + + This would allow you to replay the log up to the mkfs mark and + then replay from that point on doing the fsck check in the + interval that you want. + + Every log has a mark at the end labeled "dm-log-writes-end". + +Userspace component +=================== + +There is a userspace tool that will replay the log for you in various ways. +It can be found here: https://github.com/josefbacik/log-writes + +Example usage +============= + +Say you want to test fsync on your file system. You would do something like +this: + +TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc" +dmsetup create log --table "$TABLE" +mkfs.btrfs -f /dev/mapper/log +dmsetup message log 0 mark mkfs + +mount /dev/mapper/log /mnt/btrfs-test + +dmsetup message log 0 mark fsync +md5sum /mnt/btrfs-test/foo +umount /mnt/btrfs-test + +dmsetup remove log +replay-log --log /dev/sdc --replay /dev/sdb --end-mark fsync +mount /dev/sdb /mnt/btrfs-test +md5sum /mnt/btrfs-test/foo + + +Another option is to do a complicated file system operation and verify the file +system is consistent during the entire operation. You could do this with: + +TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc" +dmsetup create log --table "$TABLE" +mkfs.btrfs -f /dev/mapper/log +dmsetup message log 0 mark mkfs + +mount /dev/mapper/log /mnt/btrfs-test + +btrfs filesystem balance /mnt/btrfs-test +umount /mnt/btrfs-test +dmsetup remove log + +replay-log --log /dev/sdc --replay /dev/sdb --end-mark mkfs +btrfsck /dev/sdb +replay-log --log /dev/sdc --replay /dev/sdb --start-mark mkfs \ + --fsck "btrfsck /dev/sdb" --check fua + +And that will replay the log until it sees a FUA request, run the fsck command +and if the fsck passes it will replay to the next FUA, until it is completed or +the fsck command exists abnormally. -- cgit v1.2.3 From e44f23b32dc7916b2bc12817e2f723fefa21ba41 Mon Sep 17 00:00:00 2001 From: Milan Broz Date: Sun, 5 Apr 2015 18:03:10 +0200 Subject: dm crypt: update URLs to new cryptsetup project page Cryptsetup home page moved to GitLab. Also remove link to abandonded Truecrypt page. Signed-off-by: Milan Broz Signed-off-by: Mike Snitzer --- Documentation/device-mapper/dm-crypt.txt | 4 ++-- Documentation/device-mapper/verity.txt | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-crypt.txt b/Documentation/device-mapper/dm-crypt.txt index ad697781f9ac..692171fe9da0 100644 --- a/Documentation/device-mapper/dm-crypt.txt +++ b/Documentation/device-mapper/dm-crypt.txt @@ -5,7 +5,7 @@ Device-Mapper's "crypt" target provides transparent encryption of block devices using the kernel crypto API. For a more detailed description of supported parameters see: -http://code.google.com/p/cryptsetup/wiki/DMCrypt +https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt Parameters: \ [<#opt_params> ] @@ -80,7 +80,7 @@ Example scripts =============== LUKS (Linux Unified Key Setup) is now the preferred way to set up disk encryption with dm-crypt using the 'cryptsetup' utility, see -http://code.google.com/p/cryptsetup/ +https://gitlab.com/cryptsetup/cryptsetup [[ #!/bin/sh diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt index 64ccc5a079a5..e15bc1a0fb98 100644 --- a/Documentation/device-mapper/verity.txt +++ b/Documentation/device-mapper/verity.txt @@ -142,7 +142,7 @@ block boundary) are the hash blocks which are stored a depth at a time The full specification of kernel parameters and on-disk metadata format is available at the cryptsetup project's wiki page - http://code.google.com/p/cryptsetup/wiki/DMVerity + https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity Status ====== @@ -159,7 +159,7 @@ Set up a device: A command line tool veritysetup is available to compute or verify the hash tree or activate the kernel device. This is available from -the cryptsetup upstream repository http://code.google.com/p/cryptsetup/ +the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/ (as a libcryptsetup extension). Create hash on the device: -- cgit v1.2.3