Skip to content
Snippets Groups Projects
  1. Oct 13, 2014
  2. Oct 09, 2014
  3. Oct 07, 2014
  4. Oct 03, 2014
  5. Oct 01, 2014
  6. Sep 27, 2014
  7. Sep 25, 2014
  8. Sep 24, 2014
    • Tejun Heo's avatar
      blk-mq, percpu_ref: start q->mq_usage_counter in atomic mode · 17497acb
      Tejun Heo authored
      
      blk-mq uses percpu_ref for its usage counter which tracks the number
      of in-flight commands and used to synchronously drain the queue on
      freeze.  percpu_ref shutdown takes measureable wallclock time as it
      involves a sched RCU grace period.  This means that draining a blk-mq
      takes measureable wallclock time.  One would think that this shouldn't
      matter as queue shutdown should be a rare event which takes place
      asynchronously w.r.t. userland.
      
      Unfortunately, SCSI probing involves synchronously setting up and then
      tearing down a lot of request_queues back-to-back for non-existent
      LUNs.  This means that SCSI probing may take above ten seconds when
      scsi-mq is used.
      
        [    0.949892] scsi host0: Virtio SCSI HBA
        [    1.007864] scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    1.1. PQ: 0 ANSI: 5
        [    1.021299] scsi 0:0:1:0: Direct-Access     QEMU     QEMU HARDDISK    1.1. PQ: 0 ANSI: 5
        [    1.520356] tsc: Refined TSC clocksource calibration: 2491.910 MHz
      
        <stall>
      
        [   16.186549] sd 0:0:0:0: Attached scsi generic sg0 type 0
        [   16.190478] sd 0:0:1:0: Attached scsi generic sg1 type 0
        [   16.194099] osd: LOADED open-osd 0.2.1
        [   16.203202] sd 0:0:0:0: [sda] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
        [   16.208478] sd 0:0:0:0: [sda] Write Protect is off
        [   16.211439] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
        [   16.218771] sd 0:0:1:0: [sdb] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
        [   16.223264] sd 0:0:1:0: [sdb] Write Protect is off
        [   16.225682] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
      
      This is also the reason why request_queues start in bypass mode which
      is ended on blk_register_queue() as shutting down a fully functional
      queue also involves a RCU grace period and the queues for non-existent
      SCSI devices never reach registration.
      
      blk-mq basically needs to do the same thing - start the mq in a
      degraded mode which is faster to shut down and then make it fully
      functional only after the queue reaches registration.  percpu_ref
      recently grew facilities to force atomic operation until explicitly
      switched to percpu mode, which can be used for this purpose.  This
      patch makes blk-mq initialize q->mq_usage_counter in atomic mode and
      switch it to percpu mode only once blk_register_queue() is reached.
      
      Note that this issue was previously worked around by 0a30288d
      ("blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during
      probe") for v3.17.  The temp fix was reverted in preparation of adding
      persistent atomic mode to percpu_ref by 9eca8046 ("Revert "blk-mq,
      percpu_ref: implement a kludge for SCSI blk-mq stall during probe"").
      This patch and the prerequisite percpu_ref changes will be merged
      during v3.18 devel cycle.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarChristoph Hellwig <hch@infradead.org>
      Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
      
      
      Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
      Reviewed-by: default avatarKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      17497acb
    • Tejun Heo's avatar
      percpu_ref: add PERCPU_REF_INIT_* flags · 2aad2a86
      Tejun Heo authored
      
      With the recent addition of percpu_ref_reinit(), percpu_ref now can be
      used as a persistent switch which can be turned on and off repeatedly
      where turning off maps to killing the ref and waiting for it to drain;
      however, there currently isn't a way to initialize a percpu_ref in its
      off (killed and drained) state, which can be inconvenient for certain
      persistent switch use cases.
      
      Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
      selection of operation mode; however, currently a newly initialized
      percpu_ref is always in percpu mode making it impossible to avoid the
      latency overhead of switching to atomic mode.
      
      This patch adds @flags to percpu_ref_init() and implements the
      following flags.
      
      * PERCPU_REF_INIT_ATOMIC	: start ref in atomic mode
      * PERCPU_REF_INIT_DEAD		: start ref killed and drained
      
      These flags should be able to serve the above two use cases.
      
      v2: target_core_tpg.c conversion was missing.  Fixed.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      2aad2a86
    • Tejun Heo's avatar
      Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe" · 9eca8046
      Tejun Heo authored
      
      This reverts commit 0a30288d, which
      was a temporary fix for SCSI blk-mq stall issue.  The following
      patches will fix the issue properly by introducing atomic mode to
      percpu_ref.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      9eca8046
    • Tejun Heo's avatar
      blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe · 0a30288d
      Tejun Heo authored
      
      blk-mq uses percpu_ref for its usage counter which tracks the number
      of in-flight commands and used to synchronously drain the queue on
      freeze.  percpu_ref shutdown takes measureable wallclock time as it
      involves a sched RCU grace period.  This means that draining a blk-mq
      takes measureable wallclock time.  One would think that this shouldn't
      matter as queue shutdown should be a rare event which takes place
      asynchronously w.r.t. userland.
      
      Unfortunately, SCSI probing involves synchronously setting up and then
      tearing down a lot of request_queues back-to-back for non-existent
      LUNs.  This means that SCSI probing may take more than ten seconds
      when scsi-mq is used.
      
      This will be properly fixed by implementing a mechanism to keep
      q->mq_usage_counter in atomic mode till genhd registration; however,
      that involves rather big updates to percpu_ref which is difficult to
      apply late in the devel cycle (v3.17-rc6 at the moment).  As a
      stop-gap measure till the proper fix can be implemented in the next
      cycle, this patch introduces __percpu_ref_kill_expedited() and makes
      blk_mq_freeze_queue() use it.  This is heavy-handed but should work
      for testing the experimental SCSI blk-mq implementation.
      
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarChristoph Hellwig <hch@infradead.org>
      Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
      
      
      Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Tested-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      0a30288d
  9. Sep 22, 2014
Loading