1. 03 Dec, 2009 1 commit
    • Martin K. Petersen's avatar
      block: Allow devices to indicate whether discarded blocks are zeroed · 98262f27
      Martin K. Petersen authored
      The discard ioctl is used by mkfs utilities to clear a block device
      prior to putting metadata down.  However, not all devices return zeroed
      blocks after a discard.  Some drives return stale data, potentially
      containing old superblocks.  It is therefore important to know whether
      discarded blocks are properly zeroed.
      
      Both ATA and SCSI drives have configuration bits that indicate whether
      zeroes are returned after a discard operation.  Implement a block level
      interface that allows this information to be bubbled up the stack and
      queried via a new block device ioctl.
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      98262f27
  2. 30 Nov, 2009 1 commit
  3. 26 Nov, 2009 7 commits
    • Corrado Zoccolo's avatar
      cfq-iosched: fix corner cases in idling logic · 8e550632
      Corrado Zoccolo authored
      Idling logic was disabled in some corner cases, leading to unfair share
       for noidle queues.
       * the idle timer was not armed if there were other requests in the
         driver. unfortunately, those requests could come from other workloads,
         or queues for which we don't enable idling. So we will check only
         pending requests from the active queue
       * rq_noidle check on no-idle queue could disable the end of tree idle if
         the last completed request was rq_noidle. Now, we will disable that
         idle only if all the queues served in the no-idle tree had rq_noidle
         requests.
      Reported-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      8e550632
    • Corrado Zoccolo's avatar
      cfq-iosched: idling on deep seeky sync queues · 76280aff
      Corrado Zoccolo authored
      Seeky sync queues with large depth can gain unfairly big share of disk
       time, at the expense of other seeky queues. This patch ensures that
       idling will be enabled for queues with I/O depth at least 4, and small
       think time. The decision to enable idling is sticky, until an idle
       window times out without seeing a new request.
      
      The reasoning behind the decision is that, if an application is using
      large I/O depth, it is already optimized to make full utilization of
      the hardware, and therefore we reserve a slice of exclusive use for it.
      Reported-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      76280aff
    • Corrado Zoccolo's avatar
      cfq-iosched: fix no-idle preemption logic · e4a22919
      Corrado Zoccolo authored
      An incoming no-idle queue should preempt the active no-idle queue
       only if the active queue is idling due to service tree empty.
       Previous code was buggy in two ways:
       * it relied on service_tree field to be set on the active queue, while
         it is not set when the code is idling for a new request
       * it didn't check for the service tree empty condition, so could lead to
         LIFO behaviour if multiple queues with depth > 1 were preempting each
         other on an non-NCQ device.
      Reported-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      e4a22919
    • Corrado Zoccolo's avatar
      cfq-iosched: fix ncq detection code · e459dd08
      Corrado Zoccolo authored
      CFQ's detection of queueing devices initially assumes a queuing device
      and detects if the queue depth reaches a certain threshold.
      However, it will reconsider this choice periodically.
      
      Unfortunately, if device is considered not queuing, CFQ will force a
      unit queue depth for some workloads, thus defeating the detection logic.
      This leads to poor performance on queuing hardware,
      since the idle window remains enabled.
      
      Given this premise, switching to hw_tag = 0 after we have proved at
      least once that the device is NCQ capable is not a good choice.
      
      The new detection code starts in an indeterminate state, in which CFQ behaves
      as if hw_tag = 1, and then, if for a long observation period we never saw
      large depth, we switch to hw_tag = 0, otherwise we stick to hw_tag = 1,
      without reconsidering it again.
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      e459dd08
    • Corrado Zoccolo's avatar
      cfq-iosched: cleanup unreachable code · c16632ba
      Corrado Zoccolo authored
      cfq_should_idle returns false for no-idle queues that are not the last,
      so the control flow will never reach the removed code in a state that
      satisfies the if condition.
      The unreachable code was added to emulate previous cfq behaviour for
      non-NCQ rotational devices. My tests show that even without it, the
      performances and fairness are comparable with previous cfq, thanks to
      the fact that all seeky queues are grouped together, and that we idle at
      the end of the tree.
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      c16632ba
    • Ilya Loginov's avatar
      block: add helpers to run flush_dcache_page() against a bio and a request's pages · 2d4dc890
      Ilya Loginov authored
      Mtdblock driver doesn't call flush_dcache_page for pages in request.  So,
      this causes problems on architectures where the icache doesn't fill from
      the dcache or with dcache aliases.  The patch fixes this.
      
      The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid
      pointless empty cache-thrashing loops on architectures for which
      flush_dcache_page() is a no-op.  Every architecture was provided with this
      flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is
      equal 1 or do nothing otherwise.
      
      See "fix mtd_blkdevs problem with caches on some architectures" discussion
      on LKML for more information.
      Signed-off-by: default avatarIlya Loginov <isloginov@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Peter Horton <phorton@bitbox.co.uk>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      2d4dc890
    • Gui Jianfeng's avatar
      cfq: Make use of service count to estimate the rb_key offset · 3586e917
      Gui Jianfeng authored
      For the moment, different workload cfq queues are put into different
      service trees. But CFQ still uses "busy_queues" to estimate rb_key
      offset when inserting a cfq queue into a service tree. I think this
      isn't appropriate, and it should make use of service tree count to do
      this estimation. This patch is for for-2.6.33 branch.
      Signed-off-by: default avatarGui Jianfeng <guijianfeng@cn.fujitsu.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      3586e917
  4. 11 Nov, 2009 1 commit
  5. 10 Nov, 2009 1 commit
  6. 08 Nov, 2009 1 commit
    • Corrado Zoccolo's avatar
      cfq-iosched: fix next_rq computation · cf7c25cf
      Corrado Zoccolo authored
      Cfq has a bug in computation of next_rq, that affects transition
      between multiple sequential request streams in a single queue
      (e.g.: two sequential buffered writers of the same priority),
      causing the alternation between the two streams for a transient period.
      
        8,0    1    18737     0.260400660  5312  D   W 141653311 + 256
        8,0    1    20839     0.273239461  5400  D   W 141653567 + 256
        8,0    1    20841     0.276343885  5394  D   W 142803919 + 256
        8,0    1    20843     0.279490878  5394  D   W 141668927 + 256
        8,0    1    20845     0.292459993  5400  D   W 142804175 + 256
        8,0    1    20847     0.295537247  5400  D   W 141668671 + 256
        8,0    1    20849     0.298656337  5400  D   W 142804431 + 256
        8,0    1    20851     0.311481148  5394  D   W 141668415 + 256
        8,0    1    20853     0.314421305  5394  D   W 142804687 + 256
        8,0    1    20855     0.318960112  5400  D   W 142804943 + 256
      
      The fix makes sure that the next_rq is computed from the last
      dispatched request, and not affected by merging.
      
        8,0    1    37776     4.305161306     0  D   W 141738087 + 256
        8,0    1    37778     4.308298091     0  D   W 141738343 + 256
        8,0    1    37780     4.312885190     0  D   W 141738599 + 256
        8,0    1    37782     4.315933291     0  D   W 141738855 + 256
        8,0    1    37784     4.319064459     0  D   W 141739111 + 256
        8,0    1    37786     4.331918431  5672  D   W 142803007 + 256
        8,0    1    37788     4.334930332  5672  D   W 142803263 + 256
        8,0    1    37790     4.337902723  5672  D   W 142803519 + 256
        8,0    1    37792     4.342359774  5672  D   W 142803775 + 256
        8,0    1    37794     4.345318286     0  D   W 142804031 + 256
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      cf7c25cf
  7. 04 Nov, 2009 2 commits
  8. 03 Nov, 2009 3 commits
    • Jens Axboe's avatar
      cfq-iosched: fix merge error · 125c4f22
      Jens Axboe authored
      We ended up with testing the same condition twice, pretty
      pointless. Remove that first if.
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      125c4f22
    • Shaohua Li's avatar
      cfq-iosched: limit coop preemption · 4b27e1bb
      Shaohua Li authored
      CFQ has an optimization for cooperated applications. if several
      io-context have close requests, they will get boost. But the
      optimization get abused. Considering thread a, b, which work on one
      file. a reads sectors s, s+2, s+4, ...; b reads sectors s+1, s+3, s
      +5, ... Both a and b are sequential read, so they can open idle window.
      a reads a sector s and goes to idle window and wakeup b. b reads sector
      s+1, since in current implementation, cfq_should_preempt() thinks a and
      b are cooperators, b will preempt a. b then reads sector s+1 and goes to
      idle window and wakeup a. for the same reason, a will preempt b and
      reads s+2. a and b will continue the circle. The circle will be very
      long, and a and b will occupy whole disk queue. Other applications will
      nearly have no chance to run.
      
      Fix this limiting coop preempt until a queue is scheduled normally
      again.
      Signed-off-by: default avatarShaohua Li <shaohua.li@intel.com>
      Acked-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      4b27e1bb
    • Jens Axboe's avatar
      cfq-iosched: fix bad return value cfq_should_preempt() · e6ec4fe2
      Jens Axboe authored
      Commit a6151c3a inadvertently reversed
      a preempt condition check, potentially causing a performance regression.
      Make the meta check correct again.
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      e6ec4fe2
  9. 02 Nov, 2009 1 commit
  10. 28 Oct, 2009 6 commits
    • Jens Axboe's avatar
      cfq-iosched: fix style issue in cfq_get_avg_queues() · 5869619c
      Jens Axboe authored
      Line breaks and bad brace placement.
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      5869619c
    • Corrado Zoccolo's avatar
      cfq-iosched: fairness for sync no-idle queues · 718eee05
      Corrado Zoccolo authored
      Currently no-idle queues in cfq are not serviced fairly:
      even if they can only dispatch a small number of requests at a time,
      they have to compete with idling queues to be serviced, experiencing
      large latencies.
      
      We should notice, instead, that no-idle queues are the ones that would
      benefit most from having low latency, in fact they are any of:
      * processes with large think times (e.g. interactive ones like file
        managers)
      * seeky (e.g. programs faulting in their code at startup)
      * or marked as no-idle from upper levels, to improve latencies of those
        requests.
      
      This patch improves the fairness and latency for those queues, by:
      * separating sync idle, sync no-idle and async queues in separate
        service_trees, for each priority
      * service all no-idle queues together
      * and idling when the last no-idle queue has been serviced, to
        anticipate for more no-idle work
      * the timeslices allotted for idle and no-idle service_trees are
        computed proportionally to the number of processes in each set.
      
      Servicing all no-idle queues together should have a performance boost
      for NCQ-capable drives, without compromising fairness.
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      718eee05
    • Corrado Zoccolo's avatar
      cfq-iosched: enable idling for last queue on priority class · a6d44e98
      Corrado Zoccolo authored
      cfq can disable idling for queues in various circumstances.
      When workloads of different priorities are competing, if the higher
      priority queue has idling disabled, lower priority queues may steal
      its disk share. For example, in a scenario with an RT process
      performing seeky reads vs a BE process performing sequential reads,
      on an NCQ enabled hardware, with low_latency unset,
      the RT process will dispatch only the few pending requests every full
      slice of service for the BE process.
      
      The patch solves this issue by always performing idle on the last
      queue at a given priority class > idle. If the same process, or one
      that can pre-empt it (so at the same priority or higher), submits a
      new request within the idle window, the lower priority queue won't
      dispatch, saving the disk bandwidth for higher priority ones.
      
      Note: this doesn't touch the non_rotational + NCQ case (no hardware
      to test if this is a benefit in that case).
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      a6d44e98
    • Corrado Zoccolo's avatar
      cfq-iosched: reimplement priorities using different service trees · c0324a02
      Corrado Zoccolo authored
      We use different service trees for different priority classes.
      This allows a simplification in the service tree insertion code, that no
      longer has to consider priority while walking the tree.
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      c0324a02
    • Corrado Zoccolo's avatar
      cfq-iosched: preparation to handle multiple service trees · aa6f6a3d
      Corrado Zoccolo authored
      We embed a pointer to the service tree in each queue, to handle multiple
      service trees easily.
      Service trees are enriched with a counter.
      cfq_add_rq_rb is invoked after putting the rq in the fifo, to ensure
      that all fields in rq are properly initialized.
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      aa6f6a3d
    • Corrado Zoccolo's avatar
      cfq-iosched: adapt slice to number of processes doing I/O · 5db5d642
      Corrado Zoccolo authored
      When the number of processes performing I/O concurrently increases,
      a fixed time slice per process will cause large latencies.
      
      This patch, if low_latency mode is enabled,  will scale the time slice
      assigned to each process according to a 300ms target latency.
      
      In order to keep fairness among processes:
      * The number of active processes is computed using a special form of
      running average, that quickly follows sudden increases (to keep latency low),
      and decrease slowly (to have fairness in spite of rapid decreases of this
      value).
      
      To safeguard sequential bandwidth, we impose a minimum time slice
      (computed using 2*cfq_slice_idle as base, adjusted according to priority
      and async-ness).
      Signed-off-by: default avatarCorrado Zoccolo <czoccolo@gmail.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      5db5d642
  11. 27 Oct, 2009 1 commit
  12. 26 Oct, 2009 4 commits
  13. 24 Oct, 2009 1 commit
  14. 12 Oct, 2009 1 commit
  15. 09 Oct, 2009 1 commit
    • KOSAKI Motohiro's avatar
      elv_iosched_store(): fix strstrip() misuse · 8c279598
      KOSAKI Motohiro authored
      elv_iosched_store() ignore the return value of strstrip().  It makes small
      inconsistent behavior.
      
      This patch fixes it.
      
       <before>
       ====================================
       # cd /sys/block/{blockdev}/queue
      
       case1:
       # echo "anticipatory" > scheduler
       # cat scheduler
       noop [anticipatory] deadline cfq
      
       case2:
       # echo "anticipatory " > scheduler
       # cat scheduler
       noop [anticipatory] deadline cfq
      
       case3:
       # echo " anticipatory" > scheduler
       bash: echo: write error: Invalid argument
      
       <after>
       ====================================
       # cd /sys/block/{blockdev}/queue
      
       case1:
       # echo "anticipatory" > scheduler
       # cat scheduler
       noop [anticipatory] deadline cfq
      
       case2:
       # echo "anticipatory " > scheduler
       # cat scheduler
       noop [anticipatory] deadline cfq
      
       case3:
       # echo " anticipatory" > scheduler
       noop [anticipatory] deadline cfq
      
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      8c279598
  16. 08 Oct, 2009 1 commit
  17. 07 Oct, 2009 2 commits
  18. 06 Oct, 2009 4 commits
  19. 05 Oct, 2009 1 commit