summaryrefslogtreecommitdiffstats
path: root/drivers/block/null_blk.h
AgeCommit message (Collapse)Author
2020-12-07null_blk: Move driver into its own directoryDamien Le Moal
Move null_blk driver code into the new sub-directory drivers/block/null_blk. Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-07null_blk: Allow controlling max_hw_sectors limitDamien Le Moal
Add the module option and configfs attribute max_sectors to allow configuring the maximum size of a command issued to a null_blk device. This allows exercising the block layer BIO splitting with different limits than the default BLK_SAFE_MAX_SECTORS. This is also useful for testing the zone append write path of file systems as the max_hw_sectors limit value is also used for the max_zone_append_sectors limit. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-07null_blk: discard zones on resetDamien Le Moal
When memory backing is enabled, use null_handle_discard() to free the backing memory used by a zone when the zone is being reset. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-07null_blk: Improve implicit zone closeDamien Le Moal
When open zone resource management is enabled, that is, when a null_blk zoned device is created with zone_max_open different than 0, implicitly or explicitly opening a zone may require implicitly closing a zone that is already implicitly open. This operation is done using the function null_close_first_imp_zone(), which search for an implicitly open zone to close starting from the first sequential zone. This implementation is simple but may result in the same being constantly implicitly closed and then implicitly reopened on write, namely, the lowest numbered zone that is being written. Avoid this by starting the search for an implicitly open zone starting from the zone following the last zone that was implicitly closed. The function null_close_first_imp_zone() is renamed null_close_imp_open_zone(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-07null_blk: improve zone lockingDamien Le Moal
With memory backing disabled, using a single spinlock for protecting zone information and zone resource management prevents the parallel execution on multiple queue of IO requests to different zones. Furthermore, regardless of the use of memory backing, if a null_blk device is created without limits on the number of open and active zones, accounting for zone resource management is not necessary. >From these observations, zone locking is changed as follows to improve performance: 1) the zone_lock spinlock is renamed zone_res_lock and used only if zone resource management is necessary, that is, if either zone_max_open or zone_max_active are not 0. This is indicated using the new boolean need_zone_res_mgmt in the nullb_device structure. null_zone_write() is modified to reduce the amount of code executed with the zone_res_lock spinlock held. 2) With memory backing disabled, per zone locking is changed to a spinlock per zone. 3) Introduce the structure nullb_zone to replace the use of struct blk_zone for zone information. This new structure includes a union of a spinlock and a mutex for zone locking. The spinlock is used when memory backing is disabled and the mutex is used with memory backing. With these changes, fio performance with zonemode=zbd for 4K random read and random write on a dual socket (24 cores per socket) machine using the none schedulder is as follows: before patch: write (psync x 96 jobs) = 465 KIOPS read (libaio@qd=8 x 96 jobs) = 1361 KIOPS after patch: write (psync x 96 jobs) = 456 KIOPS read (libaio@qd=8 x 96 jobs) = 4096 KIOPS Write performance remains mostly unchanged but read performance is three times higher. Performance when using the mq-deadline scheduler is not changed by this patch as mq-deadline becomes the bottleneck for a multi-queue device. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-06null_blk: Fix scheduling in atomic with zoned modeDamien Le Moal
Commit aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") changed zone locking to using the potentially sleeping wait_on_bit_io() function. This is acceptable when memory backing is enabled as the device queue is in that case marked as blocking, but this triggers a scheduling while in atomic context with memory backing disabled. Fix this by relying solely on the device zone spinlock for zone information protection without temporarily releasing this lock around null_process_cmd() execution in null_zone_write(). This is OK to do since when memory backing is disabled, command processing does not block and the memory backing lock nullb->lock is unused. This solution avoids the overhead of having to mark a zoned null_blk device queue as blocking when memory backing is unused. This patch also adds comments to the zone locking code to explain the unusual locking scheme. Fixes: aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-29null_blk: Fix locking in zoned modeDamien Le Moal
When the zoned mode is enabled in null_blk, Serializing read, write and zone management operations for each zone is necessary to protect device level information for managing zone resources (zone open and closed counters) as well as each zone condition and write pointer position. Commit 35bc10b2eafb ("null_blk: synchronization fix for zoned device") introduced a spinlock to implement this serialization. However, when memory backing is also enabled, GFP_NOIO memory allocations are executed under the spinlock, resulting in might_sleep() warnings. Furthermore, the zone_lock spinlock is locked/unlocked using spin_lock_irq/spin_unlock_irq, similarly to the memory backing code with the nullb->lock spinlock. This nested use of irq locks wrecks the irq enabled/disabled state. Fix all this by introducing a bitmap for per-zone lock, with locking implemented using wait_on_bit_lock_io() and clear_and_wake_up_bit(). This locking mechanism allows keeping a zone locked while executing null_process_cmd(), serializing all operations to the zone while allowing to sleep during memory backing allocation with GFP_NOIO. Device level zone resource management information is protected using a spinlock which is not held while executing null_process_cmd(); Fixes: 35bc10b2eafb ("null_blk: synchronization fix for zoned device") Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-27null_blk: synchronization fix for zoned deviceKanchan Joshi
Parallel write,read,zone-mgmt operations accessing/altering zone state and write-pointer may get into race. Avoid the situation by using a new spinlock for zoned device. Concurrent zone-appends (on a zone) returning same write-pointer issue is also avoided using this lock. Cc: stable@vger.kernel.org Fixes: e0489ed5daeb ("null_blk: Support REQ_OP_ZONE_APPEND") Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-29null_blk: add support for max open/active zone limit for zoned devicesNiklas Cassel
Add support for user space to set a max open zone and a max active zone limit via configfs. By default, the default values are 0 == no limit. Call the block layer API functions used for exposing the configured limits to sysfs. Add accounting in null_blk_zoned so that these new limits are respected. Performing an operation that would exceed these limits results in a standard I/O error. A max open zone limit exists in the ZBC standard. While null_blk_zoned is used to test the Zoned Block Device model in Linux, when it comes to differences between ZBC and ZNS, null_blk_zoned mostly follows ZBC. Therefore, implement the manage open zone resources function from ZBC, but additionally add support for max active zones. This enables user space not only to test against a device with an open zone limit, but also to test against a device with an active zone limit. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08null_blk: introduce zone capacity for zoned deviceAravind Ramesh
Allow emulation of a zoned device with a per zone capacity smaller than the zone size as as defined in the Zoned Namespace (ZNS) Command Set specification. The zone capacity defaults to the zone size if not specified and must be smaller than the zone size otherwise. Reviewed-by: Matias Bjørling <matias.bjorling@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-04-23null_blk: Cleanup zoned device initializationDamien Le Moal
Move all zoned mode related code from null_blk_main.c to null_blk_zoned.c, avoiding an ugly #ifdef in the process. Rename null_zone_init() into null_init_zoned_dev(), null_zone_exit() into null_free_zoned_dev() and add the new function null_register_zoned_dev() to finalize the zoned dev setup before add_disk(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-23null_blk: Fix zoned command handlingDamien Le Moal
For write operations issued to a null_blk device with zoned mode enabled, the state and write pointer position of the zone targeted by the command should be checked before badblocks and memory backing are handled as the write may be first failed due to, for instance, a sector position not aligned with the zone write pointer. This order of checking for errors reflects more accuratly the behavior of physical zoned devices. Furthermore, the write pointer position of the target zone should be incremented only and only if no errors are reported by badblocks and memory backing handling. To fix this, introduce the small helper function null_process_cmd() which execute null_handle_badblocks() and null_handle_memory_backed() and use this function in null_zone_write() to correctly handle write requests to zoned null devices depending on the type and state of the write target zone. Also call this function in null_handle_zoned() to process read requests to zoned null devices. null_process_cmd() is called directly from null_handle_cmd() for regular null devices, resulting in no functional change for these type of devices. To have symmetric names, the function null_handle_zoned() is renamed to null_process_zoned_cmd(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-25null_blk: remove unused fields in 'nullb_cmd'Dongli Zhang
'list', 'll_list' and 'csd' are no longer used. The 'list' is not used since it was introduced by commit f2298c0403b0 ("null_blk: multi queue aware block test driver"). The 'll_list' is no longer used since commit 3c395a969acc ("null_blk: set a separate timer for each command"). The 'csd' is no longer used since commit ce2c350b2cfe ("null_blk: use blk_complete_request and blk_mq_complete_request"). Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-12block: rework zone reportingChristoph Hellwig
Avoid the need to allocate a potentially large array of struct blk_zone in the block layer by switching the ->report_zones method interface to a callback model. Now the caller simply supplies a callback that is executed on each reported zone, and private data for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-12null_blk: clean up report zonesChristoph Hellwig
Make the instance name match the method name and define the name to NULL instead of providing an inline stub, which is rather pointless for a method call. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-17null_blk: return fixed zoned reads > write pointerAjay Joshi
A zoned block device maintains a write pointer within a zone, and reads beyond the write pointer are undefined. Fill data buffer returned above the write pointer with 0xFF. Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Matias Bjørling <matias.bjorling@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-16null_blk: format pr_* logs with pr_fmtAndré Almeida
Instead of writing "null_blk: " at the beginning of each pr_err/info/warn log message, format messages using pr_fmt() macro. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-23null_blk: fix inline misuseJens Axboe
You can't magically mark a function inline and expect that to work. Fixes: fceb5d1b19cb ("null_blk: create a helper for zoned devices") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-23null_blk: create a helper for zoned devicesChaitanya Kulkarni
This patch creates a helper function for handling zoned block device operations. This patch also restructured the code for null_blk_zoned.c and uses the pattern to return blk_status_t and catch the error in the function null_handle_cmd() into cmd->error variable instead of setting it up in the deeper layer just like the way it is done for flush, badblocks and memory backed case in the null_handle_cmd(). We also move null_handle_zoned() to the null_blk_zoned.c to keep the zoned code separate. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-12null_blk: fixup ->report_zones() for !CONFIG_BLK_DEV_ZONEDJens Axboe
A previous commit changed the prototype, but didn't adjust the function for when zoned device support is disabled. Fix it up. Fixes: bd976e527259 ("block: Kill gfp_t argument of blkdev_report_zones()") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-11block: Kill gfp_t argument of blkdev_report_zones()Damien Le Moal
Only GFP_KERNEL and GFP_NOIO are used with blkdev_report_zones(). In preparation of using vmalloc() for large report buffer and zone array allocations used by this function, remove its "gfp_t gfp_mask" argument and rely on the caller context to use memalloc_noio_save/restore() where necessary (block layer zone revalidation and dm-zoned I/O error path). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-06null_blk: add zoned config support informationJohn Pittman
If the kernel is built without CONFIG_BLK_DEV_ZONED, a modprobe of the null_blk driver with zoned=1 fails with 'Invalid argument'. This can be confusing to users, prompting a search as to why the parameter is invalid. To assist in that search, add a bit more information to the failure, additionally adding to the documentation that CONFIG_BLK_DEV_ZONED is needed for zoned=1. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: John Pittman <jpittman@redhat.com> Added null_blk prefix to error message. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07null_blk: Add conventional zone configuration for zoned supportMasato Suzuki
Allow the creation of conventional zones by adding the zone_nr_conv configuration attribute. This new attribute is used only for zoned devices and indicates the number of conventional zones to create. The default value is 0. Since host-managed zoned block devices must always have at least one sequential zone, if the value of zone_nr_conv is larger than or equal to the total number of zones of the device nr_zones, zone_nr_conv is automatically changed to nr_zones - 1. Reviewed-by: Matias Bjorling <matias.bjorling@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-25block: add a report_zones methodChristoph Hellwig
Dispatching a report zones command through the request queue is a major pain due to the command reply payload rewriting necessary. Given that blkdev_report_zones() is executing everything synchronously, implement report zones as a block device file operation instead, allowing major simplification of the code in many places. sd, null-blk, dm-linear and dm-flakey being the only block device drivers supporting exposing zoned block devices, these drivers are modified to provide the device side implementation of the report_zones() block device file operation. For device mappers, a new report_zones() target type operation is defined so that the upper block layer calls blkdev_report_zones() can be propagated down to the underlying devices of the dm targets. Implementation for this new operation is added to the dm-linear and dm-flakey targets. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [Damien] * Changed method block_device argument to gendisk * Various bug fixes and improvements * Added support for null_blk, dm-linear and dm-flakey. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-12null_blk: fix zoned support for non-rq based operationJens Axboe
The supported added for zones in null_blk seem to assume that only rq based operation is possible. But this depends on the queue_mode setting, if this is set to 0, then cmd->bio is what we need to be operating on. Right now any attempt to load null_blk with queue_mode=0 will insta-crash, since cmd->rq is NULL and null_handle_cmd() assumes it to always be set. Make the zoned code deal with bio's instead, or pass in the appropriate sector/nr_sectors instead. Fixes: ca4b2a011948 ("null_blk: add zone support") Tested-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-09null_blk: add zone supportMatias Bjørling
Adds support for exposing a null_blk device through the zone device interface. The interface is managed with the parameters zoned and zone_size. If zoned is set, the null_blk instance registers as a zoned block device. The zone_size parameter defines how big each zone will be. Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-09null_blk: move shared definitions to header fileMatias Bjørling
Split the null_blk device driver, such that it can prepare for zoned block interface support. Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>