- Apr 22, 2019
-
-
Yufen Yu authored
commit 2da78092 "block: Fix dev_t minor allocation lifetime" specifically moved the blk_free_devt(dev->devt) call to part_release() to avoid reallocating the device number before the device is fully shut down. However, it can cause a use-after-free on the gendisk in get_gendisk(). We use an md device as an example to show the race:

Process1                Worker                      Process2
md_free
                                                    blkdev_open
del_gendisk
  add delete_partition_work_fn() to wq
                                                    __blkdev_get
                                                    get_gendisk
put_disk
  disk_release
    kfree(disk)
                                                    find part from ext_devt_idr
                                                    get_disk_and_module(disk)
                                                    cause use after free
                        delete_partition_work_fn
                        put_device(part)
                          part_release
                            remove part from ext_devt_idr

Before <devt, hd_struct pointer> is removed from ext_devt_idr by delete_partition_work_fn(), we can find the devt and then access the gendisk through the hd_struct pointer. But if we access the gendisk after it has been freed, this causes a use-after-free on the gendisk in get_gendisk(). We fix this by adding a new helper blk_invalidate_devt() in delete_partition() and del_gendisk(). It replaces the hd_struct pointer in the idr with the value 'NULL', and the entry is deleted from the idr in part_release() as we do now. Thanks to Jan Kara for providing the solution and clearer comments for the code. Fixes: 2da78092 ("block: Fix dev_t minor allocation lifetime") Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Keith Busch <keith.busch@intel.com> Reviewed-by:
Jan Kara <jack@suse.cz> Suggested-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Yufen Yu <yuyufen@huawei.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
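The "invalidate first, free the number later" idea can be sketched in userspace C. This is only an illustration under stated assumptions: `devt_table`, `devt_alloc`, `devt_invalidate`, `devt_free`, and `devt_lookup` are hypothetical stand-ins for ext_devt_idr and its helpers, and a fixed array replaces the kernel idr.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 8

struct part { int partno; };

/* Hypothetical stand-in for ext_devt_idr: the slot index plays the role of the minor. */
struct devt_table {
    struct part *ptr[NR_SLOTS];
    int allocated[NR_SLOTS];
};

/* Reserve the lowest free slot and publish the partition pointer. Returns -1 if full. */
int devt_alloc(struct devt_table *t, struct part *p)
{
    for (int i = 0; i < NR_SLOTS; i++) {
        if (!t->allocated[i]) {
            t->allocated[i] = 1;
            t->ptr[i] = p;
            return i;
        }
    }
    return -1;
}

/* Analogue of blk_invalidate_devt(): keep the number reserved, hide the pointer. */
void devt_invalidate(struct devt_table *t, int devt)
{
    assert(devt >= 0 && devt < NR_SLOTS);
    t->ptr[devt] = NULL;
}

/* Analogue of freeing the number in part_release(): the slot may now be reused. */
void devt_free(struct devt_table *t, int devt)
{
    assert(devt >= 0 && devt < NR_SLOTS);
    t->ptr[devt] = NULL;
    t->allocated[devt] = 0;
}

/* Analogue of the lookup in get_gendisk(): NULL means "no usable device". */
struct part *devt_lookup(const struct devt_table *t, int devt)
{
    if (devt < 0 || devt >= NR_SLOTS || !t->allocated[devt])
        return NULL;
    return t->ptr[devt];
}
```

Between `devt_invalidate()` and `devt_free()` a lookup returns NULL, yet the number cannot be handed to a new device, which is the window the commit message describes.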
-
- Apr 14, 2019
-
-
Jens Axboe authored
A previous commit moved the shallow depth and BFQ depth map calculations to be done at init time, moving them outside of the hotter IO path. This potentially causes hangs if the user changes the depth of the scheduler map by writing to the 'nr_requests' sysfs file for that device. Add a blk-mq-sched hook that allows blk-mq to inform the scheduler if the depth changes, so that the scheduler can update its internal state. Tested-by:
Kai Krakow <kai@kaishome.de> Reported-by:
Paolo Valente <paolo.valente@linaro.org> Fixes: f0635b8a ("bfq: calculate shallow depths at init time") Signed-off-by:
Jens Axboe <axboe@kernel.dk>
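A rough sketch of such a notification hook, in plain C. The struct names, `update_nr_requests()`, and the 3/4 policy are assumptions made for illustration; they are not the blk-mq or BFQ definitions.

```c
#include <assert.h>
#include <stddef.h>

struct sched;

/* Hypothetical ops table; only the depth hook is modelled. */
struct sched_ops {
    void (*depth_updated)(struct sched *s, unsigned int nr_requests);
};

struct sched {
    const struct sched_ops *ops;
    unsigned int shallow_depth;   /* cached value, computed off the hot path */
};

/* Illustrative policy only: shallow allocations may use 3/4 of the depth, at least 1. */
void example_depth_updated(struct sched *s, unsigned int nr_requests)
{
    unsigned int d = nr_requests * 3 / 4;

    s->shallow_depth = d ? d : 1;
}

const struct sched_ops example_ops = { .depth_updated = example_depth_updated };

/* Stand-in for the nr_requests update path: notify the scheduler if it has a hook. */
void update_nr_requests(struct sched *s, unsigned int *nr_requests, unsigned int new_nr)
{
    assert(nr_requests != NULL);
    *nr_requests = new_nr;
    if (s && s->ops && s->ops->depth_updated)
        s->ops->depth_updated(s, new_nr);
}
```

Without the callback, `shallow_depth` would keep the value computed at init time even after the depth changed, which is the stale state the commit message blames for the hangs.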
-
- Apr 12, 2019
-
-
Martin Wilck authored
Drivers now report to the block layer if they support media change events. If this is not the case, there's no need to allocate the event structure, and all event handling code can effectively be skipped. This simplifies code flow in particular for non-removable sd devices. This effectively reverts commit 75e3f3ee ("block: always allocate genhd->ev if check_events is implemented"). The sysfs files for the events are kept in place even if no events are supported, as user space may rely on them being present. The only difference is that an error code is now returned if the user tries to set poll_msecs. Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Martin Wilck <mwilck@suse.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Martin Wilck authored
Currently, an empty disk->events field tells the block layer not to forward media change events to user space. This was done in commit 7c88a168 ("block: don't propagate unlisted DISK_EVENTs to userland") in order to prevent events from "fringe" drivers from being forwarded to user space. By doing so, the block layer lost the information about which events were supported by a particular block device, and most importantly, whether or not a given device supports media change events at all. Prepare for not interpreting the "events" field this way in the future any more. This is done by adding an additional field "event_flags" to struct gendisk, and two flag bits that can be set to have the device treated like one that had the "events" field set to a non-zero value before. This applies only to the sd and sr drivers, which are changed to set the new flags. The new flags are DISK_EVENT_FLAG_POLL to enforce polling of the device for synchronous events, and DISK_EVENT_FLAG_UEVENT to tell the block layer to generate udev events from kernel events. In order to add the event_flags field to struct gendisk, the events field is converted to an "unsigned short"; it doesn't need to hold values bigger than 2 anyway. This patch doesn't change behavior. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Martin Wilck <mwilck@suse.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
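How the two flag bits separate "which events exist" from "what to do with them" might look roughly like this. The flag and event names come from the commit message, but the bit values, `struct example_disk`, and both helper functions are assumptions for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Bit values are assumed for illustration. */
#define DISK_EVENT_MEDIA_CHANGE   (1 << 0)
#define DISK_EVENT_EJECT_REQUEST  (1 << 1)
#define DISK_EVENT_FLAG_POLL      (1 << 0)
#define DISK_EVENT_FLAG_UEVENT    (1 << 1)

/* Hypothetical cut-down disk: only the two fields discussed above. */
struct example_disk {
    unsigned short events;       /* events the device supports */
    unsigned short event_flags;  /* how the block layer should treat them */
};

/* Events that should be polled for: none unless the POLL flag is set. */
unsigned int events_to_poll(const struct example_disk *d)
{
    assert(d != NULL);
    return (d->event_flags & DISK_EVENT_FLAG_POLL) ? d->events : 0;
}

/* Events that should be forwarded to user space: none unless the UEVENT flag is set. */
unsigned int events_to_report(const struct example_disk *d, unsigned int pending)
{
    assert(d != NULL);
    if (!(d->event_flags & DISK_EVENT_FLAG_UEVENT))
        return 0;
    return pending & d->events;
}
```

A driver that sets `events` but neither flag still advertises what it supports, while nothing is polled or forwarded for it.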
-
Martin Wilck authored
The async_events field, intended to be used for drivers that support asynchronous notifications about disk events (aka media change events), isn't currently used by any driver, and apparently it has been that way for a long time (if not forever). Remove it. Reviewed-by:
Hannes Reinecke <hare@suse.de> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Martin Wilck <mwilck@suse.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
We currently have to call nth_page when iterating over pages inside a bio_vec. Jens complained a while ago that this is fairly expensive. To mitigate this we can check that the actual page structures are contiguous when adding them to the bio, and just do pointer arithmetic later on. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Instead of needing a special macro to iterate over all pages in a bvec, just do a second pass over the whole bio. This also matches what we do on the release side. The release side helper is moved up to where we need the get helper to clearly express the symmetry. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
No caller uses bio_iov_iter_get_pages multiple times on a given bio, and that functionality isn't all that useful. Removing it will make some future changes a little easier and also simplifies the function a bit. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Return early on error, and add an unlikely annotation for that case. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The offset in scatterlists is allowed to be larger than the page size, so don't go to great lengths to avoid that case and simplify the arithmetic. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 10, 2019
-
-
Jérôme Glisse authored
When bio_add_pc_page() fails in bio_copy_user_iov(), we should free the page we just allocated; otherwise we are leaking it. Cc: linux-block@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Jérôme Glisse <jglisse@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
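The shape of that error-path fix can be sketched in userspace C. Everything here is a hypothetical stand-in: `fake_bio`, `fake_add_page()`, and the `live_pages` counter only mimic the pattern of freeing an allocation that the container refused to take.

```c
#include <assert.h>
#include <stdlib.h>

int live_pages;   /* outstanding allocations, tracked so a leak is visible */

void *page_alloc(void)
{
    void *p = malloc(4096);

    if (p)
        live_pages++;
    return p;
}

void page_free(void *p)
{
    if (p) {
        free(p);
        live_pages--;
    }
}

/* Hypothetical container with a small, fixed capacity. */
struct fake_bio { void *pages[4]; int cnt; int cap; };

/* Returns 1 on success, 0 when the bio cannot take another page. */
int fake_add_page(struct fake_bio *bio, void *page)
{
    assert(bio->cap <= 4);
    if (bio->cnt >= bio->cap)
        return 0;
    bio->pages[bio->cnt++] = page;
    return 1;
}

/* Try to attach n fresh pages; returns how many were attached. */
int fill_bio(struct fake_bio *bio, int n)
{
    int added = 0;

    for (int i = 0; i < n; i++) {
        void *page = page_alloc();

        if (!page)
            break;
        if (!fake_add_page(bio, page)) {
            page_free(page);   /* the bio never took ownership, so free it here */
            break;
        }
        added++;
    }
    return added;
}

/* Release every page the bio owns. */
void drop_bio(struct fake_bio *bio)
{
    while (bio->cnt > 0)
        page_free(bio->pages[--bio->cnt]);
}
```

Dropping the `page_free(page)` call on the failure branch would leave `live_pages` at 1 after `drop_bio()`, since the rejected page is referenced by nothing.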
-
Ming Lei authored
NVMe's error handler follows the typical steps for tearing down the hardware to recover the controller:
1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via blk_mq_tagset_busy_iter(tags, cancel_request, ...), where cancel_request() marks the request as aborted and calls blk_mq_complete_request(req)
4) destroy real hw queues
However, there may be a race between #3 and #4, because blk_mq_complete_request() may run q->mq_ops->complete(rq) remotely and asynchronously, and ->complete(rq) may be run after #4. This patch introduces blk_mq_complete_request_sync() to fix the above race. Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Bart Van Assche <bvanassche@acm.org> Cc: James Smart <james.smart@broadcom.com> Cc: linux-nvme@lists.infradead.org Reviewed-by:
Keith Busch <keith.busch@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
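The difference between the two completion paths can be modelled with a deferred-work list. This is a single-threaded sketch under stated assumptions: `complete_request_async()`, `complete_request_sync()`, `run_deferred()`, and `destroy_hw_queues()` are hypothetical names, and the deferred array stands in for remote (IPI or softirq) completion.

```c
#include <assert.h>
#include <stddef.h>

struct req { int completed; };

#define MAX_DEFERRED 16
struct req *deferred[MAX_DEFERRED];
int nr_deferred;
int hw_queues_alive = 1;

/* Completion handler: it touches hardware queue state, so it must not run after teardown. */
void complete_fn(struct req *rq)
{
    assert(hw_queues_alive);
    rq->completed = 1;
}

/* Models the asynchronous path: the handler runs at some later point. */
void complete_request_async(struct req *rq)
{
    assert(nr_deferred < MAX_DEFERRED);
    deferred[nr_deferred++] = rq;
}

/* Models the synchronous variant: the handler has run by the time this returns. */
void complete_request_sync(struct req *rq)
{
    complete_fn(rq);
}

/* Runs whatever the asynchronous path left pending. */
void run_deferred(void)
{
    for (int i = 0; i < nr_deferred; i++)
        complete_fn(deferred[i]);
    nr_deferred = 0;
}

/* Step 4 of the teardown sequence. */
void destroy_hw_queues(void)
{
    hw_queues_alive = 0;
}
```

If `destroy_hw_queues()` ran while a request was still sitting in `deferred`, the later `run_deferred()` would trip the assertion in `complete_fn()`. Completing synchronously in step 3 leaves nothing pending when step 4 runs.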
-
Paolo Valente authored
The function bfq_bfqq_expire() invokes the function __bfq_bfqq_expire(), and the latter may free the in-service bfq-queue. If this happens, then no other instruction of bfq_bfqq_expire() must be executed, or a use-after-free will occur. Based on the assumption that __bfq_bfqq_expire() invokes bfq_put_queue() on the in-service bfq-queue exactly once, the queue is assumed to be freed if its refcounter is equal to one right before invoking __bfq_bfqq_expire(). But, since commit 9dee8b3b ("block, bfq: fix queue removal from weights tree") this assumption is false. __bfq_bfqq_expire() may also invoke bfq_weights_tree_remove() and, since commit 9dee8b3b ("block, bfq: fix queue removal from weights tree"), also the latter function may invoke bfq_put_queue(). So __bfq_bfqq_expire() may invoke bfq_put_queue() twice, and this is the actual case where the in-service queue may happen to be freed. To address this issue, this commit moves the check on the refcounter of the queue right around the last bfq_put_queue() that may be invoked on the queue. Fixes: 9dee8b3b ("block, bfq: fix queue removal from weights tree") Reported-by:
Dmitrii Tcvetkov <demfloro@demfloro.ru> Reported-by:
Douglas Anderson <dianders@chromium.org> Tested-by:
Dmitrii Tcvetkov <demfloro@demfloro.ru> Tested-by:
Douglas Anderson <dianders@chromium.org> Signed-off-by:
Paolo Valente <paolo.valente@linaro.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 08, 2019
-
-
Ming Lei authored
Commit f6970f83 ("block: don't check if adjacent bvecs in one bio can be mergeable") changes bvec merge by only considering two bvecs from different bios. However, if the former bio doesn't include any io bvec, then the following warning may be triggered: warning: ‘bvec.bv_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] In practice, it shouldn't be triggered. Fix it by adding a check on the former bio; the check shouldn't add any cost given 'bio->bi_iter' can be hit in cache. Reported-by:
Jens Axboe <axboe@kernel.dk> Fixes: f6970f83 ("block: don't check if adjacent bvecs in one bio can be mergeable") Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Angelo Ruocco authored
Some of the comments in the bfq files had typos. This patch fixes them. Signed-off-by:
Angelo Ruocco <angeloruocco90@gmail.com> Signed-off-by:
Paolo Valente <paolo.valente@linaro.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Hisao Tanabe authored
The 'def' local variable became unused after commit f382fb0b ("block: remove legacy IO schedulers"), let's remove it. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Hisao Tanabe <xtanabe@gmail.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 06, 2019
-
-
David Kozub authored
As the function is responsible for executing the individual steps supplied in the steps argument, execute_steps is a more descriptive name than the rather generic next. Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
Originally, each of the opal functions that call next included opal_discovery0 in the array of steps. This is superfluous and can always be done inside next. Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
The steps argument is only read by the next function, so it can be passed directly as an argument rather than via opal_dev. Normally, the steps is an array on the stack, so the pointer stops being valid when the function that set opal_dev.steps returns. If opal_dev.steps was not set to NULL before return it would become a dangling pointer. When the steps are passed as argument this becomes easier to see and more difficult to misuse. Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
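Passing the step array as an argument, with discovery always run first, could look roughly like the following. This is a sketch and not the sed-opal code: `struct dev`, the `step_*` functions, and the logging are hypothetical, and only the calling convention is the point.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_MAX 8

/* Hypothetical device: just records which steps ran, in order. */
struct dev { int log[LOG_MAX]; int n; };

typedef int (*step_fn)(struct dev *d);

void record(struct dev *d, int id)
{
    assert(d->n < LOG_MAX);
    d->log[d->n++] = id;
}

int step_discovery(struct dev *d) { record(d, 0); return 0; }
int step_ok(struct dev *d)        { record(d, 1); return 0; }
int step_fail(struct dev *d)      { record(d, 2); return -1; }

/*
 * The steps arrive as an argument, so no pointer to the caller's stack array
 * outlives the call. Discovery always runs first; execution stops at the
 * first step that reports an error.
 */
int execute_steps(struct dev *d, const step_fn *steps, size_t n_steps)
{
    int err = step_discovery(d);

    for (size_t i = 0; !err && i < n_steps; i++)
        err = steps[i](d);
    return err;
}
```

A caller can build `steps` as a local array and hand it over; once `execute_steps()` returns, the device holds no reference to it.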
-
David Kozub authored
Replace integer literals by Opal tokens defined in opal_proto.h where possible. Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
Instead of having multiple places defining the same argument list to get a specific column of a sed-opal table, provide a generic version and call it from those functions. Co-authored-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
Define OPAL_LIFECYCLE token and use it instead of literals in get_lsp_lifecycle. Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jonas Rabenstein authored
Split the header generation from the (normal) memcpy part if a bytestring is copied into the command buffer. This allows in-place generation of the bytestring content. For example, copy_from_user may be used without an intermediate buffer. Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jonas Rabenstein authored
Add function address (and if available its symbol) to the message if a step function fails. Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
response_get_token was already in place, but its functionality was duplicated within response_get_{u64,bytestring} with the same error handling. Unify the handling by reusing response_get_token within the other functions. Co-authored-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
response_get_{string,u64} include error handling for argument resp being NULL but response_get_token does not handle this. Make all three of response_get_{string,u64,token} handle NULL resp in the same way. Co-authored-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
Every step starts with resetting the cmd buffer as well as the comid and constructs the appropriate OPAL_CALL command. Consequently, those actions may be combined into one generic function. One should take care that the opening and closing tokens for the argument list are already emitted by cmd_start and cmd_finalize respectively and thus must not be additionally added. Co-authored-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
Every step ends by calling cmd_finalize (via finalize_and_send) yet every step adds the token OPAL_ENDLIST on its own. Moving this into cmd_finalize decreases code duplication. Co-authored-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
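The division of labour between a start helper, the per-step arguments, and a finalize helper can be sketched as a tiny command builder. This is a simplified illustration: the `TOK_*` values and the buffer layout are assumptions for this sketch, and the real command framing carries more fields than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Token values assumed for this sketch. */
enum { TOK_STARTLIST = 0xF0, TOK_ENDLIST = 0xF1, TOK_CALL = 0xF8, TOK_ENDOFDATA = 0xF9 };

struct cmd {
    unsigned char buf[32];
    size_t pos;
    int err;   /* sticky error, set once the buffer would overflow */
};

/* Single place for the checks every token append needs. */
void add_token(struct cmd *c, unsigned char tok)
{
    assert(c != NULL);
    if (c->err)
        return;
    if (c->pos >= sizeof(c->buf)) {
        c->err = -1;
        return;
    }
    c->buf[c->pos++] = tok;
}

/* Resets the buffer and emits the call header plus the opening list token. */
void cmd_start(struct cmd *c, unsigned char method)
{
    c->pos = 0;
    c->err = 0;
    add_token(c, TOK_CALL);
    add_token(c, method);
    add_token(c, TOK_STARTLIST);
}

/* Emits the closing list token, so individual steps must not add it themselves. */
int cmd_finalize(struct cmd *c)
{
    add_token(c, TOK_ENDLIST);
    add_token(c, TOK_ENDOFDATA);
    return c->err;
}
```

A step then only appends its own arguments between `cmd_start()` and `cmd_finalize()`; adding a list token by hand would produce an unbalanced command.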
-
Jonas Rabenstein authored
All add_token_* functions have a common set of conditions that have to be checked. Use a common function for those checks in order to avoid different behaviour as well as code duplication. Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Co-authored-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jonas Rabenstein authored
Although the values of OPAL_UID_LENGTH and OPAL_METHOD_LENGTH are the same, it is weird to use OPAL_UID_LENGTH for the definition of the methods. Signed-off-by:
Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
This should make no change in functionality. The formatting changes were triggered by checkpatch.pl. Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Reviewed-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
David Kozub authored
The implementation of IOC_OPAL_ENABLE_DISABLE_MBR handled the value opal_mbr_data.enable_disable incorrectly: enable_disable is expected to be one of OPAL_MBR_ENABLE(0) or OPAL_MBR_DISABLE(1). enable_disable was passed directly to set_mbr_done and set_mbr_enable_disable where it was interpreted as either OPAL_TRUE(1) or OPAL_FALSE(0). The end result was that calling IOC_OPAL_ENABLE_DISABLE_MBR with OPAL_MBR_ENABLE actually disabled the shadow MBR and vice versa. This patch adds correct conversion from OPAL_MBR_DISABLE/ENABLE to OPAL_FALSE/TRUE. The change affects existing programs using IOC_OPAL_ENABLE_DISABLE_MBR but this is typically used only once when setting up an Opal drive. Acked-by:
Jon Derrick <jonathan.derrick@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Scott Bauer <sbauer@plzdonthack.me> Signed-off-by:
David Kozub <zub@linux.fjfi.cvut.cz> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
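The inversion is easy to see once the two encodings sit side by side. The numeric values below are the ones quoted in the commit message; the helper name `mbr_arg_to_token()` and the self-check function are hypothetical.

```c
#include <assert.h>

enum { OPAL_MBR_ENABLE = 0, OPAL_MBR_DISABLE = 1 };   /* ioctl argument encoding */
enum { OPAL_FALSE = 0, OPAL_TRUE = 1 };               /* token encoding */

/* Hypothetical helper: translate the ioctl argument into the token; -1 if invalid. */
int mbr_arg_to_token(int enable_disable)
{
    switch (enable_disable) {
    case OPAL_MBR_ENABLE:
        return OPAL_TRUE;
    case OPAL_MBR_DISABLE:
        return OPAL_FALSE;
    default:
        return -1;
    }
}

/* Shows why passing the argument through unchanged flips the meaning. */
void mbr_selfcheck(void)
{
    assert(OPAL_MBR_ENABLE == OPAL_FALSE);   /* same number, opposite intent */
    assert(mbr_arg_to_token(OPAL_MBR_ENABLE) == OPAL_TRUE);
    assert(mbr_arg_to_token(OPAL_MBR_DISABLE) == OPAL_FALSE);
}
```

Because "enable" is 0 in one encoding and 1 in the other, an explicit mapping is needed rather than a cast or a pass-through.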
-
Christoph Hellwig authored
Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit architectures. These types are required to support block device and/or file sizes larger than 2 TiB, and have generally defaulted to on for a long time. Enabling the option only increases the i386 tinyconfig size by 145 bytes, and many data structures already always use 64-bit values for their in-core and on-disk data structures anyway, so there should not be a large change in dynamic memory usage either. Dropping this option removes a somewhat weird non-default config that has caused various bugs or compiler warnings when actually used. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 05, 2019
-
-
Bart Van Assche authored
blk_mq_try_issue_directly() can return BLK_STS*_RESOURCE for requests that have been queued. If that happens when blk_mq_try_issue_directly() is called by the dm-mpath driver then dm-mpath will try to resubmit a request that is already queued and a kernel crash follows. Since it is nontrivial to fix blk_mq_request_issue_directly(), revert the blk_mq_request_issue_directly() changes that went into kernel v5.0. This patch reverts the following commits: * d6a51a97 ("blk-mq: replace and kill blk_mq_request_issue_directly") # v5.0. * 5b7a6f12 ("blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests") # v5.0. * 7f556a44 ("blk-mq: refactor the code of issue request directly") # v5.0. Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: James Smart <james.smart@broadcom.com> Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: <stable@vger.kernel.org> Reported-by:
Laurence Oberman <loberman@redhat.com> Tested-by:
Laurence Oberman <loberman@redhat.com> Fixes: 7f556a44 ("blk-mq: refactor the code of issue request directly") # v5.0. Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 04, 2019
-
-
Johannes Thumshirn authored
With the introduction of BIO_NO_PAGE_REF we've used up all available bits in bio::bi_flags. Convert the defines of the flags to an enum and add a BUILD_BUG_ON() call to make sure no-one adds a new one and thus overrides the BVEC_POOL_IDX causing crashes. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
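A compile-time guard of this kind can be approximated in standard C11 with `static_assert`, used here as a counterpart to BUILD_BUG_ON(). The layout is an assumption for illustration: a 16-bit flags word whose top 3 bits hold a pool index, with made-up flag names.

```c
#include <assert.h>

/* Assumed layout for this sketch: 16-bit flags word, top 3 bits hold a pool index. */
#define FLAGS_BITS       16
#define POOL_IDX_BITS    3
#define POOL_IDX_OFFSET  (FLAGS_BITS - POOL_IDX_BITS)

/* Using an enum means each new flag automatically pushes EX_FLAG_LAST up by one. */
enum example_flag {
    EX_FLAG_A,
    EX_FLAG_B,
    EX_FLAG_C,
    EX_FLAG_LAST   /* keep last: number of flag bits in use */
};

/* Fails the build as soon as a new flag would spill into the pool index bits. */
static_assert(EX_FLAG_LAST <= POOL_IDX_OFFSET, "flag bits overlap the pool index");

unsigned short set_flag(unsigned short flags, enum example_flag f)
{
    return (unsigned short)(flags | (1u << f));
}

unsigned short set_pool_idx(unsigned short flags, unsigned int idx)
{
    unsigned int low = flags & ((1u << POOL_IDX_OFFSET) - 1);

    return (unsigned short)(low | (idx << POOL_IDX_OFFSET));
}

unsigned int get_pool_idx(unsigned short flags)
{
    return (unsigned int)flags >> POOL_IDX_OFFSET;
}
```

With separate `#define`s nothing ties the highest flag to the index offset; the enum plus the assertion turns an overlap into a build failure instead of a runtime crash.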
-
Dongli Zhang authored
We would never be able to sort the list if we first reset plug->rq_count, which is used in a conditional check later. Fixes: ce5b009c ("block: improve logic around when to sort a plug list") Reviewed-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
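The ordering problem reduces to "read the counter before clearing it". A minimal sketch, assuming a hypothetical `struct plug` holding integers and an illustrative sort threshold:

```c
#include <assert.h>
#include <stdlib.h>

#define PLUG_MAX 8

/* Hypothetical plug: a small batch of request ids plus bookkeeping. */
struct plug { int rqs[PLUG_MAX]; int rq_count; int multiple_queues; };

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Flushes the batch; returns 1 if it was sorted first. */
int flush_plug(struct plug *p)
{
    int sorted = 0;

    assert(p->rq_count >= 0 && p->rq_count <= PLUG_MAX);

    /* Decide on sorting while rq_count still reflects the batch size. */
    if (p->rq_count > 2 && p->multiple_queues) {
        qsort(p->rqs, (size_t)p->rq_count, sizeof(p->rqs[0]), cmp_int);
        sorted = 1;
    }

    /* Only now reset the counter; resetting it first would make the test above always false. */
    p->rq_count = 0;
    return sorted;
}
```

Swapping the two blocks reproduces the bug: the condition would compare against zero and the batch would never be sorted.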
-
- Apr 02, 2019
-
-
Yufen Yu authored
For now, we only trace plug events for single-queue devices or for drivers that provide .commit_rqs, and do not trace plug events for multi-queue devices. However, unplug events are recorded when blk_mq_flush_plug_list() is called. The trace events are therefore asymmetrical: there is an unplug without a matching plug. This patch adds plug and unplug tracing for multi-queue devices in blk_mq_make_request(). After that, we can accurately trace plug and unplug for multiple queues. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Yufen Yu <yuyufen@huawei.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Shenghui Wang authored
kfree() can leak the hctx->fq->flush_rq field. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Shenghui Wang <shhuiw@foxmail.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 01, 2019
-
-
Ming Lei authored
Now that both passthrough and FS IO support multi-page bvecs, and bvec merging is already handled when adding a page to a bio, adjacent bvecs won't be mergeable any more if they belong to the same bio. So only try to merge bvecs if they are from different bios. Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Inside __blk_segment_map_sg(), page sized bvec mapping is optimized a bit with one standalone branch. So reuse __blk_bvec_map_sg() to do that. Cc: Omar Sandoval <osandov@fb.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-