- May 31, 2019
-
-
John Pittman authored
While troubleshooting issues where cloned request limits have been exceeded, it is often beneficial to know the actual values that have been breached. Print these values to make it easier to identify the root cause of the breach. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
John Pittman <jpittman@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
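The change boils down to printing the compared values at each limit check instead of failing silently. A minimal sketch of the pattern, assuming a check function along the lines of the v5.2-era blk_cloned_rq_check_limits() (names illustrative, not the exact diff):
------
static blk_status_t check_cloned_rq_limits(struct request_queue *q,
					   struct request *rq)
{
	unsigned int max_sectors = blk_queue_get_max_sectors(q, req_op(rq));

	if (blk_rq_sectors(rq) > max_sectors) {
		/* report the breached value and the limit, not just an error */
		printk(KERN_ERR "%s: over max size limit. (%u > %u)\n",
		       __func__, blk_rq_sectors(rq), max_sectors);
		return BLK_STS_IOERR;
	}
	return BLK_STS_OK;
}
------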
-
Bart Van Assche authored
Document the meaning of the blk_mq_hw_queue_to_node() arguments. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
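For reference, kernel-doc argument documentation takes this form; a hedged sketch of what the added header could look like (the authoritative text lives in block/blk-mq-cpumap.c):
------
/**
 * blk_mq_hw_queue_to_node - look up the NUMA node for a hardware queue
 * @qmap:  CPU to hardware queue map
 * @index: hardware queue index
 */
------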
-
Bart Van Assche authored
Change one occurrence of 'performace' into 'performance'. Cc: Max Gurtovoy <maxg@mellanox.com> Fixes: fe631457 ("blk-mq: map all HWQ also in hyperthreaded system") # v4.13. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
Document all bsg_setup_queue() arguments as required. Fixes: aae3b069 ("bsg: pass in desired timeout handler") # v5.0. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
Add documentation for the @rqw argument and change " - " into ": ". Fixes: 84f60324 ("block: add rq_qos_wait to rq_qos") # v5.0-rc1~52^2~140. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
This patch prevents the kernel-doc script from complaining about these function headers when building with W=1. Cc: Hannes Reinecke <hare@suse.com> Cc: Keith Busch <keith.busch@intel.com> Fixes: ed76e329 ("blk-mq: abstract out queue map") # v5.0. Fixes: e42b3867 ("blk-mq-rdma: pass in queue map to blk_mq_rdma_map_queues") # v5.0. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
Commit e99e88a9 renamed a function argument without updating the corresponding kernel-doc header. Update the kernel-doc header. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by:
Kees Cook <keescook@chromium.org> Fixes: e99e88a9 ("treewide: setup_timer() -> timer_setup()") # v4.15. Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
This patch prevents the kernel-doc tool from warning about this function header when building with W=1. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
This patch prevents the kernel-doc tool from warning about this function header when building with W=1. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- May 29, 2019
-
-
Jes Sorensen authored
If blk_mq_init_allocated_queue() fails, make sure to free the allocated poll stat callback struct. Signed-off-by:
Jes Sorensen <jsorensen@fb.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Now that a063057d ("block: Fix a race between request queue removal and the block cgroup controller") has been reverted, blkcg_exit_queue() won't be called in blk_cleanup_queue() any more. So there is no need to protect generic_make_request_checks() with blk_queue_enter(), and the resulting mess can be cleaned up: 37f9579f ("blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash") is reverted. Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Commit 498f6650 ("block: Fix a race between the cgroup code and request queue initialization") moved what blk_exit_queue() does into blk_cleanup_queue() to fix an issue caused by switching back the queue lock. However, now that the legacy request IO path has been killed, the driver queue lock isn't used at all and is never switched back, so the issue addressed by commit 498f6650 no longer exists. So move blk_exit_queue() into __blk_release_queue(). This patch basically reverts the following two commits: 498f6650 ("block: Fix a race between the cgroup code and request queue initialization") and 24ecc358 ("block: Ensure that a request queue is dissociated from the cgroup controller"). Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- May 23, 2019
-
-
Bob Liu authored
The following is a description of a hang in blk_mq_freeze_queue_wait(). The hang happens on an attempt to freeze a queue while another task does a queue unfreeze. The root cause is an incorrect sequence of percpu_ref_resurrect() and percpu_ref_kill(); as a result those two can be swapped:

CPU#0                         CPU#1
----------------              -----------------
q1 = blk_mq_init_queue(shared_tags)

                              q2 = blk_mq_init_queue(shared_tags):
                                blk_mq_add_queue_tag_set(shared_tags):
                                  blk_mq_update_tag_set_depth(shared_tags):
                                    list_for_each_entry()
                                      blk_mq_freeze_queue(q1)
                                       > percpu_ref_kill()
                                       > blk_mq_freeze_queue_wait()

blk_cleanup_queue(q1)
 blk_mq_freeze_queue(q1)
  > percpu_ref_kill()
    ^^^^^^ freeze_depth can't guarantee the order

                                      blk_mq_unfreeze_queue()
                                       > percpu_ref_resurrect()

 > blk_mq_freeze_queue_wait()
   ^^^^^^ Hang here!!!!

This wrong sequence raises a kernel warning:

percpu_ref_kill_and_confirm called more than once on blk_queue_usage_counter_release!
WARNING: CPU: 0 PID: 11854 at lib/percpu-refcount.c:336 percpu_ref_kill_and_confirm+0x99/0xb0

But the most unpleasant effect is a hang of blk_mq_freeze_queue_wait(), which waits for q_usage_counter to reach zero, which never happens because the percpu-ref was re-initialized (instead of being killed) and stays in PERCPU state forever.

How to reproduce:
- "insmod null_blk.ko shared_tags=1 nr_devices=0 queue_mode=2"
- cpu0: python Script.py 0; taskset the corresponding process running on cpu0
- cpu1: python Script.py 1; taskset the corresponding process running on cpu1

Script.py:
------
#!/usr/bin/python3

import os
import sys

while True:
    on = "echo 1 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    off = "echo 0 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    os.system(on)
    os.system(off)
------

This bug was first reported and fixed by Roman; previous discussion:
[1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
[2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org
[3] https://patchwork.kernel.org/patch/9268199/ Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Roman Pen <roman.penyaev@profitbricks.com> Signed-off-by:
Bob Liu <bob.liu@oracle.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
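The fix serializes percpu_ref_kill() and percpu_ref_resurrect() behind a queue-local mutex, so freeze_depth and the percpu-ref state can no longer diverge. Roughly (a sketch with assumed field names, not the verbatim patch):
------
void blk_freeze_queue_start(struct request_queue *q)
{
	mutex_lock(&q->mq_freeze_lock);
	if (++q->mq_freeze_depth == 1) {
		percpu_ref_kill(&q->q_usage_counter);
		mutex_unlock(&q->mq_freeze_lock);
		if (queue_is_mq(q))
			blk_mq_run_hw_queues(q, false);
	} else {
		mutex_unlock(&q->mq_freeze_lock);
	}
}

void blk_mq_unfreeze_queue(struct request_queue *q)
{
	mutex_lock(&q->mq_freeze_lock);
	if (--q->mq_freeze_depth == 0) {
		percpu_ref_resurrect(&q->q_usage_counter);
		wake_up_all(&q->mq_freeze_wq);
	}
	mutex_unlock(&q->mq_freeze_lock);
}
------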
-
Christoph Hellwig authored
At this point these fields aren't used for anything, so we can remove them. Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
We fundamentally do not have a maximum segment size for devices with a virt boundary. So don't bother checking it, especially given that the existing checks didn't properly work to start with, as we never fully update the front/back segment size and miss the bi_seg_front_size that would have been required for some cases. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
We currently fail to update the front/back segment size in the bio when deciding to allow an otherwise gappy segment to a device with a virt boundary. The reason why this did not cause problems is that devices with a virt boundary fundamentally don't use segments as we know it and thus don't care. Make that assumption formal by forcing an unlimited segment size in this case. Fixes: f6970f83 ("block: don't check if adjacent bvecs in one bio can be mergeable") Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Currently ll_merge_requests_fn, unlike all other merge functions, reduces nr_phys_segments by one if the last segment of the previous request and the first segment of the next request are contiguous. While this seems like a nice solution to avoid building smaller than possible requests, it causes a mismatch between the segments actually present in the request and those iterated over by the bvec iterators, including __rq_for_each_bio. This can, for example, wrongly trigger the single segment optimization in the nvme-pci driver, and might lead to a mismatched nr_phys_segments count when recalculating the number of segments while inserting a cloned request. We could possibly work around this by making the bvec iterators take the front and back segment size into account, but that would require moving them from the bio to the bio_iter and spreading this mess over all users of bvecs. Or we could simply remove this optimization under the assumption that most users already build good enough bvecs, and that the bio merge path never cared about this optimization either. The latter is what this patch does. Fixes: dff824b2 ("nvme-pci: optimize mapping of small single segment requests"). Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- May 16, 2019
-
-
Jackie Liu authored
Use the new struct_size() helper to keep code simple. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Jackie Liu <liuyun01@kylinos.cn> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
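struct_size() folds the header size and the flexible-array size into one overflow-checked expression. A before/after sketch with a hypothetical structure (struct_size() itself lives in <linux/overflow.h>):
------
struct tag_map {
	unsigned int nr;
	unsigned long data[];	/* flexible array member */
};

/* before: open-coded arithmetic, easy to get wrong or overflow */
map = kzalloc(sizeof(*map) + nr * sizeof(map->data[0]), GFP_KERNEL);

/* after: one overflow-checked helper */
map = kzalloc(struct_size(map, data, nr), GFP_KERNEL);
------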
-
- May 04, 2019
-
-
Ming Lei authored
Now that freeing hw queue resources has been moved to hctx's release handler, we don't need to worry about the race between blk_cleanup_queue and run queue any more. So don't drain in-progress dispatch in blk_cleanup_queue(). This is basically a revert of c2856ae2 ("blk-mq: quiesce queue before freeing queue"). Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Hannes Reinecke <hare@suse.com> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
hctx is always released after requeue is freed. With the queue's kobject refcount held, it is safe for the driver to run the queue, so one run-queue activity might be scheduled after blk_sync_queue() is done. So move the cancel of hctx->run_work into blk_mq_hw_sysfs_release() to avoid running a released queue. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Hannes Reinecke <hare@suse.com> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
In the normal queue cleanup path, hctx is released after the request queue is freed, see blk_mq_release(). However, in __blk_mq_update_nr_hw_queues(), hctx may be freed because of hw queue shrinking. This can easily cause a use-after-free, because: one implicit rule is that it is safe to call almost all block layer APIs if the request queue is alive; one hctx may be retrieved by one API, then the hctx can be freed by blk_mq_update_nr_hw_queues(); finally a use-after-free is triggered. Fix this issue by always freeing hctx after releasing the request queue. If some hctxs are removed in blk_mq_update_nr_hw_queues(), introduce a per-queue list to hold them, then try to reuse these hctxs if the NUMA node matches. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Split blk_mq_alloc_and_init_hctx into two parts: blk_mq_alloc_hctx(), which allocates all hctx resources, and blk_mq_init_hctx(), which initializes the hctx and serves as the counterpart of blk_mq_exit_hctx(). Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d9 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d9, we were allowed to run the queue during queue cleanup if the queue's kobject refcount was held. After that commit, the queue can't be run during queue cleanup, otherwise an oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d4 ("SCSI: fix queue cleanup race before queue initialization is done") and c2856ae2 ("blk-mq: quiesce queue before freeing queue"). But these still can't cover all cases; recently James reported another such issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address the previous way, given scsi_run_queue() may run requeues for other LUNs. Fix the above issue by freeing hctx's resources in its release handler; this way is safe because the tags aren't needed for freeing such hctx resources. This approach follows the typical design pattern wrt. a kobject's release handler. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reported-by:
James Smart <james.smart@broadcom.com> Fixes: 45a9c9d9 ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
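The pattern being applied here is the standard kobject one: anything a still-referenced hctx may touch is freed only when the last reference is dropped, in the kobject release handler. A sketch under assumed field names:
------
static void blk_mq_hw_sysfs_release(struct kobject *kobj)
{
	struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx,
						  kobj);

	/* Safe here: no tags are needed to free these resources, and no
	 * new users can show up once the refcount has hit zero. */
	free_cpumask_var(hctx->cpumask);
	kfree(hctx->ctxs);
	kfree(hctx);
}
------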
-
Ming Lei authored
With the queue's kobject refcount held, it is safe for the driver to schedule a requeue. However, blk_mq_kick_requeue_list() may be called after blk_sync_queue() is done because of concurrent requeue activities, so requeue work may not be completed when the queue is freed, and a kernel oops is triggered. So move the cancel of requeue_work into blk_mq_release() to avoid a race between requeue and freeing the queue. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Just like aio/io_uring, we need to grab two refcounts for queuing one request: one for submission, another for completion. If the request isn't queued from the plug code path, the refcount grabbed in generic_make_request() serves for submission. In theory, this refcount should have been released after the submission (async run queue) is done. blk_freeze_queue() works together with blk_sync_queue() to avoid a race between queue cleanup and IO submission, given async run queue activities are canceled because hctx->run_work is scheduled with the refcount held, so it is fine to not hold the refcount when running the run queue work function for dispatching IO. However, if the request is staggered into the plug list and finally queued from the plug code path, the refcount on the submission side is actually missed. And we may start to run the queue after the queue is removed, because the queue's kobject refcount isn't guaranteed to be grabbed in the flushing-plug-list context, and then a kernel oops is triggered; see the following race:

blk_mq_flush_plug_list():
        blk_mq_sched_insert_requests()
                insert requests to sw queue or scheduler queue
                blk_mq_run_hw_queue

Because of concurrent run queue, all requests inserted above may be completed before calling the above blk_mq_run_hw_queue. Then the queue can be freed during the above blk_mq_run_hw_queue(). Fix the issue by grabbing .q_usage_counter before calling blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is safe because the queue is absolutely alive before inserting requests. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: linux-scsi@vger.kernel.org Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Tested-by:
James Smart <james.smart@broadcom.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
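Conceptually the fix brackets the insert-and-run step with a reference on q_usage_counter, mirroring what generic_make_request() does on the non-plug path. A sketch of the idea (placement and surrounding names are illustrative):
------
/* inside the plug-flush path, before inserting requests */
percpu_ref_get(&q->q_usage_counter);

blk_mq_sched_insert_requests(hctx, ctx, &rq_list, from_schedule);
blk_mq_run_hw_queue(hctx, from_schedule);

/* the queue was guaranteed alive across the insert + run window */
percpu_ref_put(&q->q_usage_counter);
------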
-
- May 02, 2019
-
-
Raul E Rangel authored
The comment was out of date. Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Raul E Rangel <rrangel@chromium.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 30, 2019
-
-
Christoph Hellwig authored
Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Various block layer files do not have any licensing information at all. Add SPDX tags for the default kernel GPLv2 license to those. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
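An SPDX tag is a single comment on the first line of each file, standing in for the license boilerplate, e.g.:
------
// SPDX-License-Identifier: GPL-2.0
------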
-
Christoph Hellwig authored
All these files have some form of the usual GPLv2 or later boilerplate. Switch them to use SPDX tags instead. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
All these files have some form of the usual GPLv2 boilerplate. Switch them to use SPDX tags instead. Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Share the bi_size update by moving the done label up, and duplicate the bv_len update in the two callers to get rid of the bvec_merge label. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
We are never called with file system pages by definition for the passthrough interface, and we also never undo any addition later these days. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The same page optimization is a rather odd corner case, which is not used outside bio.c and which really should not be used outside of bio.c either - we have better high-level helpers like the rq/bio mapping helpers. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
We only have two callers that need the integer loop iterator, and they can easily maintain it themselves. Suggested-by:
Matthew Wilcox <willy@infradead.org> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Acked-by:
David Sterba <dsterba@suse.com> Reviewed-by:
Hannes Reinecke <hare@suse.com> Acked-by:
Coly Li <colyli@suse.de> Reviewed-by:
Matthew Wilcox <willy@infradead.org> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
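After the change, bio_for_each_segment_all() no longer takes the integer iterator; the two callers that still want an index keep their own counter. A sketch against the post-change macro signature (process_segment() is a hypothetical consumer):
------
struct bio_vec *bvec;
struct bvec_iter_all iter_all;
unsigned int i = 0;

bio_for_each_segment_all(bvec, bio, iter_all) {
	/* callers needing an index now maintain it themselves */
	process_segment(bvec, i);
	i++;
}
------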
-
- Apr 25, 2019
-
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace all of the ktype default_attrs fields in the block subsystem with default_groups and use the ATTRIBUTE_GROUPS macro to create the default groups. Remove default_ctx_attrs[] because it doesn't contain any attributes. This patch was tested by verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by:
Kimberly Brown <kimbrownkd@gmail.com> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
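The conversion is mechanical: wrap the existing attribute array with ATTRIBUTE_GROUPS() and point the ktype at the generated _groups symbol. A sketch with assumed names:
------
static struct attribute *blk_mq_hw_sysfs_attrs[] = {
	&blk_mq_hw_sysfs_nr_tags.attr,
	&blk_mq_hw_sysfs_cpus.attr,
	NULL,
};
ATTRIBUTE_GROUPS(blk_mq_hw_sysfs);	/* generates blk_mq_hw_sysfs_groups */

static struct kobj_type blk_mq_hw_ktype = {
	.sysfs_ops	= &blk_mq_hw_sysfs_ops,
	.default_groups	= blk_mq_hw_sysfs_groups,	/* was .default_attrs */
	.release	= blk_mq_hw_sysfs_release,
};
------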
-
- Apr 24, 2019
-
-
Ming Lei authored
The refcount has already been increased for pages retrieved from a non-bvec iov iter via __bio_iov_iter_get_pages(), so there is no need to do that again. Otherwise, IO pages are leaked easily. Cc: Christoph Hellwig <hch@lst.de> Reviewed-by:
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Fixes: 7321ecbf ("block: change how we get page references in bio_iov_iter_get_pages") Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 23, 2019
-
-
Ming Lei authored
bio_add_page() and __bio_add_page() are capable of adding pages to a bio, and we already have at least two such usages: __bio_iov_bvec_add_pages() and nvmet_bdev_execute_rw(). So update the comments on these two helpers. The situation is a bit special for __bio_try_merge_page(): given the caller needs to know if the newly added page is the same as the last added page, it isn't safe to pass a multi-page bvec in case 'same_page' is true. So add a warning about potential misuse, and update the comment on __bio_try_merge_page(). Cc: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Reviewed-by:
Hannes Reinecke <hare@suse.com> Reviewed-by:
Christoph Hellwig <hch@infradead.org> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 22, 2019
-
-
Weiping Zhang authored
If the low-level driver has no timeout handler, /sys/block/<disk>/queue/io_timeout will not be displayed. Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Weiping Zhang <zhangweiping@didiglobal.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
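The natural mechanism for this is an is_visible callback on the queue attribute group; a hedged sketch (entry and field names assumed):
------
static umode_t queue_attr_visible(struct kobject *kobj, struct attribute *attr,
				  int n)
{
	struct request_queue *q =
		container_of(kobj, struct request_queue, kobj);

	/* hide io_timeout when the driver cannot honor it */
	if (attr == &queue_io_timeout_entry.attr &&
	    (!q->mq_ops || !q->mq_ops->timeout))
		return 0;

	return attr->mode;
}

static struct attribute_group queue_attr_group = {
	.attrs		= queue_attrs,
	.is_visible	= queue_attr_visible,
};
------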
-
Christoph Hellwig authored
While we generally allow scatterlists to have offsets larger than page size for an entry, and other subsystems like the crypto code make use of that, the block layer isn't quite ready for that. Flip the switch back to avoid them for now, and revisit that decision early in a merge window once the known offenders are fixed. Fixes: 8a96a0e4 ("block: rewrite blk_bvec_map_sg to avoid a nth_page call") Reviewed-by:
Ming Lei <ming.lei@redhat.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Reported-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Yufen Yu authored
Commit 2da78092 ("block: Fix dev_t minor allocation lifetime") specifically moved the blk_free_devt(dev->devt) call to part_release() to avoid reallocating a device number before the device is fully shut down. However, it can cause a use-after-free on gendisk in get_gendisk(). We use an md device as an example to show the race scenes:

Process1                Worker                  Process2
md_free
                                                blkdev_open
 del_gendisk
  add delete_partition_work_fn() to wq
                                                __blkdev_get
                                                 get_gendisk
 put_disk
  disk_release
   kfree(disk)
                                                 find part from ext_devt_idr
                                                 get_disk_and_module(disk)
                                                  cause use after free
                        delete_partition_work_fn
                         put_device(part)
                          part_release
                           remove part from ext_devt_idr

Before the <devt, hd_struct pointer> entry is removed from ext_devt_idr by delete_partition_work_fn(), we can find the devt and then access the gendisk through the hd_struct pointer. But if we access the gendisk after it has been freed, it causes a use-after-free on gendisk in get_gendisk(). We fix this by adding a new helper blk_invalidate_devt(), called in delete_partition() and del_gendisk(). It replaces the hd_struct pointer in the idr with the value NULL, and deletes the entry from the idr in part_release() as we do now. Thanks to Jan Kara for providing the solution and clearer comments for the code. Fixes: 2da78092 ("block: Fix dev_t minor allocation lifetime") Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Keith Busch <keith.busch@intel.com> Reviewed-by:
Jan Kara <jack@suse.cz> Suggested-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Yufen Yu <yuyufen@huawei.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
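The helper's job is only to break the devt-to-hd_struct lookup early while leaving the idr slot reserved until part_release(). A sketch of what it plausibly looks like (lock and idr names assumed):
------
void blk_invalidate_devt(dev_t devt)
{
	if (MAJOR(devt) == BLOCK_EXT_MAJOR) {
		spin_lock_bh(&ext_devt_lock);
		/* keep the minor reserved, but make lookups fail */
		idr_replace(&ext_devt_idr, NULL, blk_mangle_minor(MINOR(devt)));
		spin_unlock_bh(&ext_devt_lock);
	}
}
------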
-