1. 17 Jan, 2017 2 commits
  2. 21 Jul, 2016 1 commit
    • block: do not merge requests without consulting with io scheduler · 72ef799b
      Tahsin Erdogan authored
      Before merging a bio into an existing request, the io scheduler is
      called to get its approval first. However, requests that come from a
      plug flush may get merged by the block layer without consulting the io
      scheduler.
      
      In the case of CFQ, this can cause fairness problems. For instance, if a
      request gets merged into a low weight cgroup's request, the high weight
      cgroup now depends on the low weight cgroup to get scheduled. If the high
      weight cgroup needs that io request to complete before submitting more
      requests, then it will also lose its timeslice.
      
      The following script demonstrates the problem. Group g1 has a low
      weight; g2 and g3 have equal high weights, but g2's requests are adjacent
      to g1's requests, so they are subject to merging. Due to these merges, g2
      gets a poor disk time allocation.
      
      cat > cfq-merge-repro.sh << "EOF"
      #!/bin/bash
      set -e
      
      IO_ROOT=/mnt-cgroup/io
      
      mkdir -p $IO_ROOT
      
      if ! mount | grep -qw $IO_ROOT; then
        mount -t cgroup none -oblkio $IO_ROOT
      fi
      
      cd $IO_ROOT
      
      for i in g1 g2 g3; do
        if [ -d $i ]; then
          rmdir $i
        fi
      done
      
      mkdir g1 && echo 10 > g1/blkio.weight
      mkdir g2 && echo 495 > g2/blkio.weight
      mkdir g3 && echo 495 > g3/blkio.weight
      
      RUNTIME=10
      
      (echo $BASHPID > g1/cgroup.procs &&
       fio --readonly --name name1 --filename /dev/sdb \
           --rw read --size 64k --bs 64k --time_based \
           --runtime=$RUNTIME --offset=0k &> /dev/null)&
      
      (echo $BASHPID > g2/cgroup.procs &&
       fio --readonly --name name1 --filename /dev/sdb \
           --rw read --size 64k --bs 64k --time_based \
           --runtime=$RUNTIME --offset=64k &> /dev/null)&
      
      (echo $BASHPID > g3/cgroup.procs &&
       fio --readonly --name name1 --filename /dev/sdb \
           --rw read --size 64k --bs 64k --time_based \
           --runtime=$RUNTIME --offset=256k &> /dev/null)&
      
      sleep $((RUNTIME+1))
      
      for i in g1 g2 g3; do
        echo ---- $i ----
        cat $i/blkio.time
      done
      
      EOF
      # ./cfq-merge-repro.sh
      ---- g1 ----
      8:16 162
      ---- g2 ----
      8:16 165
      ---- g3 ----
      8:16 686
      
      After applying the patch:
      
      # ./cfq-merge-repro.sh
      ---- g1 ----
      8:16 90
      ---- g2 ----
      8:16 445
      ---- g3 ----
      8:16 471
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
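To put the blkio.time figures above in perspective, the share each group is entitled to can be compared with the share it actually received. The following is a minimal sketch in user-space C; the helper name `share_percent` is hypothetical and not part of the patch. It assumes disk time should be proportional to `blkio.weight`, which is how the commit message frames the fairness problem.

```c
#include <stddef.h>

/*
 * Percentage of the total that values[idx] represents.
 * Works for cgroup weights as well as measured blkio.time values.
 * (Hypothetical helper for illustration only.)
 */
static double share_percent(const unsigned long *values, size_t n, size_t idx)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += values[i];

    if (total == 0)
        return 0.0;

    return 100.0 * (double)values[idx] / (double)total;
}
```

With weights 10/495/495, g2 is entitled to 49.5% of the disk time. Feeding in the numbers from the two runs, g2 received roughly 16.3% (165 of 1013) before the patch and roughly 44.2% (445 of 1006) after it.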
  3. 28 Jun, 2016 1 commit
    • block: Convert fifo_time from ulong to u64 · 9828c2c6
      Jan Kara authored
      Currently rq->fifo_time is an unsigned long, but CFQ stores a nanosecond
      timestamp in it, which would overflow on 32-bit archs. Convert it to u64
      to avoid the overflow. Since rq->fifo_time is unioned with struct
      call_single_data, this does not change the size of struct request in
      any way.
      
      We have to slightly fixup block/deadline-iosched.c so that comparison
      happens in the right types.
      
      Fixes: 9a7f38c4
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
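The overflow described above is easy to see in isolation. The sketch below is plain user-space C rather than kernel code; `uint32_t` stands in for `unsigned long` on a 32-bit architecture, and the helper names are hypothetical. A 32-bit nanosecond counter wraps after 2^32 ns, which is about 4.29 seconds.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Store a nanosecond timestamp in a 32-bit field, as an unsigned long
 * would on a 32-bit architecture. The high bits are silently dropped.
 */
static uint32_t store_ns_32(uint64_t ns)
{
    return (uint32_t)ns;
}

/* Store the same timestamp in a 64-bit field; nothing is lost. */
static uint64_t store_ns_64(uint64_t ns)
{
    return ns;
}

/* Largest whole number of seconds a 32-bit nanosecond counter can hold. */
static unsigned int max_whole_seconds_32(void)
{
    return (unsigned int)(UINT32_MAX / NSEC_PER_SEC);
}
```

A timestamp of 5 seconds (5,000,000,000 ns) already exceeds the 32-bit range and is stored as 705,032,704, while the 64-bit field keeps the full value.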
  4. 01 Feb, 2016 1 commit
  5. 24 Feb, 2014 1 commit
  6. 11 Sep, 2013 1 commit
  7. 03 Jul, 2013 1 commit
    • elevator: Fix a race in elevator switching · d50235b7
      Jianpeng Ma authored
      There's a race between elevator switching and normal io operation,
      because struct elevator_queue and its ->elevator_data are not allocated
      in one atomic operation. This leaves a window in which a NULL
      ->elevator_data can be used.
      For example:
          Thread A:                               Thread B
          blk_queue_bio                           elevator_switch
          spin_lock_irq(q->queue_lock)            elevator_alloc
          elv_merge                               elevator_init_fn

      elevator_alloc cannot be called with queue_lock held, and after it
      returns, ->elevator_data is still NULL. If thread A calls elv_merge at
      that moment and needs some info from elevator_data, it crashes.

      Move elevator_alloc into elevator_init_fn so that both steps happen as
      one atomic operation.

      The bug is easy to reproduce with the following method:
      1: dd if=/dev/sdb of=/dev/null
      2: while true; do echo noop > scheduler; echo deadline > scheduler; done

      The fix was tested with the same method.
      Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 23 Mar, 2013 1 commit
    • block: Add bio_end_sector() · f73a1c7d
      Kent Overstreet authored
      Just a little convenience macro - main reason to add it now is preparing
      for immutable bio vecs, it'll reduce the size of the patch that puts
      bi_sector/bi_size/bi_idx into a struct bvec_iter.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: Lars Ellenberg <drbd-dev@lists.linbit.com>
      CC: Jiri Kosina <jkosina@suse.cz>
      CC: Alasdair Kergon <agk@redhat.com>
      CC: dm-devel@redhat.com
      CC: Neil Brown <neilb@suse.de>
      CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: Heiko Carstens <heiko.carstens@de.ibm.com>
      CC: linux-s390@vger.kernel.org
      CC: Chris Mason <chris.mason@fusionio.com>
      CC: Steven Whitehouse <swhiteho@redhat.com>
      Acked-by: Steven Whitehouse <swhiteho@redhat.com>
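As a rough illustration of what such a convenience macro computes, here is a sketch in user-space C. The `struct bio` below is a hypothetical two-field stand-in for the real structure, using the `bi_sector` and `bi_size` field names mentioned in the commit message; it assumes `bi_size` is in bytes and sectors are 512 bytes.

```c
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical stand-in holding only the fields this sketch needs. */
struct bio {
    sector_t bi_sector;    /* first sector of the I/O, in 512-byte units */
    unsigned int bi_size;  /* size of the I/O in bytes */
};

/* Number of 512-byte sectors covered by the bio. */
#define bio_sectors(bio)    ((bio)->bi_size >> 9)

/* First sector past the end of the bio. */
#define bio_end_sector(bio) ((bio)->bi_sector + bio_sectors(bio))
```

Callers that previously open-coded `bi_sector + (bi_size >> 9)` can use the macro instead, which means fewer call sites need to change when the fields later move into a different structure.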
  9. 09 Dec, 2012 1 commit
  10. 06 Mar, 2012 1 commit
    • elevator: make elevator_init_fn() return 0/-errno · b2fab5ac
      Tejun Heo authored
      elevator_ops->elevator_init_fn() has a weird return value.  It returns
      a void * which the caller should assign to q->elevator->elevator_data
      and %NULL return denotes init failure.
      
      Update such that it returns integer 0/-errno and sets elevator_data
      directly as necessary.
      
      This makes the interface more conventional and eases further cleanup.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
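The two calling conventions described above can be contrasted in a small user-space sketch. The structures are hypothetical stand-ins that keep only the fields named in the commit message, the function names `old_style_init` and `new_style_init` are made up for illustration, and `calloc` stands in for the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct elevator_queue {
    void *elevator_data;
};

struct request_queue {
    struct elevator_queue *elevator;
};

/*
 * Old convention (sketch): return the private data pointer, with NULL
 * meaning failure. The caller must store the result into
 * q->elevator->elevator_data itself.
 */
static void *old_style_init(struct request_queue *q)
{
    (void)q;
    return calloc(1, 64);
}

/*
 * New convention (sketch): set elevator_data directly and return
 * 0 on success or a negative errno value on failure.
 */
static int new_style_init(struct request_queue *q)
{
    void *data = calloc(1, 64);

    if (!data)
        return -ENOMEM;

    q->elevator->elevator_data = data;
    return 0;
}
```

With the integer return, the caller can propagate a specific error code instead of collapsing every failure into a NULL check.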
  11. 13 Dec, 2011 1 commit
    • block, cfq: move icq cache management to block core · 3d3c2379
      Tejun Heo authored
      Let elevators set ->icq_size and ->icq_align in elevator_type, and have
      elv_register() and elv_unregister() respectively create and destroy the
      kmem_cache for icq.
      
      * elv_register() now can return failure.  All callers updated.
      
      * icq caches are automatically named "ELVNAME_io_cq".
      
      * cfq_slab_setup/kill() are collapsed into cfq_init/exit().
      
      * While at it, minor indentation change for iosched_cfq.elevator_name
        for consistency.
      
      This will help moving icq management to block core.  This doesn't
      introduce any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 02 Jun, 2011 1 commit
    • iosched: prevent aliased requests from starving other I/O · 796d5116
      Jeff Moyer authored
      Hi, Jens,
      
      If you recall, I posted an RFC patch for this back in July of last year:
      http://lkml.org/lkml/2010/7/13/279
      
      The basic problem is that a process can issue a never-ending stream of
      async direct I/Os to the same sector on a device, thus starving out
      other I/O in the system (due to the way the alias handling works in both
      cfq and deadline).  The solution I proposed back then was to start
      dispatching from the fifo after a certain number of aliases had been
      dispatched.  Vivek asked why we had to treat aliases differently at all,
      and I never had a good answer.  So, I put together a simple patch which
      allows aliases to be added to the rb tree (it adds them to the right,
      though that doesn't matter as the order isn't guaranteed anyway).  I
      think this is the preferred solution, as it doesn't break up time slices
      in CFQ or batches in deadline.  I've tested it, and it does solve the
      starvation issue.  Let me know what you think.
      
      Cheers,
      Jeff
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
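The idea of letting aliases into the sorted tree, rather than treating them specially, can be sketched with an ordinary binary search tree. This is not the kernel's rb tree code: there is no rebalancing, and the names `rq_node`, `sketch_add` and `sketch_in_order` are hypothetical. It only shows that sending equal keys to the right keeps every request in the tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request node keyed by starting sector. */
struct rq_node {
    uint64_t sector;
    int id;
    struct rq_node *left, *right;
};

/*
 * Insert n into the tree rooted at root and return the (possibly new) root.
 * A request whose sector equals an existing one (an alias) is not rejected;
 * it simply descends to the right like any larger key.
 */
static struct rq_node *sketch_add(struct rq_node *root, struct rq_node *n)
{
    struct rq_node **p = &root;

    while (*p) {
        if (n->sector < (*p)->sector)
            p = &(*p)->left;
        else
            p = &(*p)->right;   /* equal keys (aliases) go right */
    }
    n->left = n->right = NULL;
    *p = n;
    return root;
}

/* In-order walk: writes node ids into out and returns the new position. */
static size_t sketch_in_order(const struct rq_node *n, int *out, size_t pos)
{
    if (!n)
        return pos;
    pos = sketch_in_order(n->left, out, pos);
    out[pos++] = n->id;
    return sketch_in_order(n->right, out, pos);
}
```

Because aliases sit in the tree like any other request, they are dispatched in the normal sorted sweep instead of jumping the queue, which is what removes the starvation described in the message.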
  13. 10 Mar, 2011 1 commit
  14. 11 May, 2009 1 commit
    • block: convert to pos and nr_sectors accessors · 83096ebf
      Tejun Heo authored
      With recent cleanups, there is no place where low level driver
      directly manipulates request fields.  This means that the 'hard'
      request fields always equal the !hard fields.  Convert all
      rq->sectors, nr_sectors and current_nr_sectors references to
      accessors.
      
      While at it, drop superfluous blk_rq_pos() < 0 test in swim.c.
      
      [ Impact: use pos and nr_sectors accessors ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Tested-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Mike Miller <mike.miller@hp.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Dario Ballabio <ballabio_dario@emc.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
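The accessor pattern this commit converts to can be sketched as follows. The `struct request` here is a hypothetical stand-in, and its field names `pos` and `data_len` are assumptions chosen for the sketch; only the accessor name `blk_rq_pos()` appears in the commit message, and the other two accessors are modelled on the same naming scheme. Sectors are assumed to be 512 bytes.

```c
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical stand-in; the field names are assumptions for this sketch. */
struct request {
    sector_t pos;           /* current position, in 512-byte sectors */
    unsigned int data_len;  /* remaining data length, in bytes */
};

/* Current sector position of the request. */
static inline sector_t blk_rq_pos(const struct request *rq)
{
    return rq->pos;
}

/* Bytes left in the request. */
static inline unsigned int blk_rq_bytes(const struct request *rq)
{
    return rq->data_len;
}

/* Sectors left in the request, derived from the byte count. */
static inline unsigned int blk_rq_sectors(const struct request *rq)
{
    return blk_rq_bytes(rq) >> 9;
}
```

Routing every read through accessors means the underlying fields can later be renamed or merged without touching each driver again.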
  15. 29 Dec, 2008 1 commit
  16. 09 Oct, 2008 2 commits
  17. 18 Dec, 2007 1 commit
  18. 02 Nov, 2007 3 commits
  19. 24 Jul, 2007 1 commit
  20. 17 Jul, 2007 1 commit
  21. 01 Dec, 2006 1 commit
  22. 30 Sep, 2006 4 commits
  23. 30 Jun, 2006 1 commit
  24. 23 Jun, 2006 2 commits
  25. 08 Jun, 2006 1 commit
    • [PATCH] elevator switching race · bc1c1169
      Jens Axboe authored
      There's a race between shutting down one io scheduler and firing up the
      next, in which a new io could enter and cause the io scheduler to be
      invoked with bad or NULL data.
      
      To fix this, we need to maintain the queue lock for a bit longer.
      Unfortunately we cannot do that, since the elevator init requires to be
      run without the lock held.  This isn't easily fixable, without also
      changing the mempool API.  So split the initialization into two parts,
      an alloc-init operation and an attach operation.  Then we can
      preallocate the io scheduler and related structures, and run the attach
      inside the lock after we detach the old one.
      
      This patch has survived 30 minutes of 1 second io scheduler switching
      with a very busy io load.
      Signed-off-by: Jens Axboe <axboe@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
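The two-phase approach described above, preallocating outside the lock and then attaching inside it, might look roughly like this in a user-space sketch. All names (`sketch_alloc_init`, `sketch_attach`, `sketch_switch`) are hypothetical, `calloc` stands in for the kernel allocators, and the `lock_held` flag merely marks where the queue lock would be taken; it provides no real mutual exclusion.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct elevator_queue {
    void *elevator_data;
};

struct request_queue {
    int lock_held;                    /* marks where the queue lock is held */
    struct elevator_queue *elevator;
};

/* Phase 1: allocate and fully initialize. Must run without the lock. */
static struct elevator_queue *sketch_alloc_init(void)
{
    struct elevator_queue *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->elevator_data = calloc(1, 64);
    if (!e->elevator_data) {
        free(e);
        return NULL;
    }
    return e;
}

/* Phase 2: a plain pointer swap, cheap enough to do under the lock. */
static struct elevator_queue *sketch_attach(struct request_queue *q,
                                            struct elevator_queue *new_e)
{
    struct elevator_queue *old = q->elevator;

    q->elevator = new_e;
    return old;
}

/* Switch schedulers: prepare first, then detach old and attach new. */
static int sketch_switch(struct request_queue *q)
{
    struct elevator_queue *new_e = sketch_alloc_init();
    struct elevator_queue *old;

    if (!new_e)
        return -1;

    q->lock_held = 1;                 /* stands in for taking the queue lock */
    old = sketch_attach(q, new_e);
    q->lock_held = 0;                 /* stands in for releasing it */

    if (old) {
        free(old->elevator_data);
        free(old);
    }
    return 0;
}
```

The point of the split is that anything observing `q->elevator` under the lock sees either the old scheduler or a fully initialized new one, never a half-built structure.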
  26. 21 Apr, 2006 1 commit
  27. 19 Mar, 2006 1 commit
  28. 18 Mar, 2006 1 commit
  29. 06 Jan, 2006 1 commit
  30. 18 Nov, 2005 1 commit
  31. 04 Nov, 2005 1 commit
  32. 28 Oct, 2005 1 commit