1. 13 Jan, 2019 2 commits
  2. 17 Dec, 2018 2 commits
  3. 13 Dec, 2018 2 commits
  4. 08 Dec, 2018 2 commits
  5. 05 Dec, 2018 1 commit
  6. 27 Nov, 2018 1 commit
  7. 21 Nov, 2018 1 commit
  8. 13 Nov, 2018 1 commit
    • Waiman Long's avatar
      locking/lockdep: Fix debug_locks off performance problem · 5cf2ab06
      Waiman Long authored
      [ Upstream commit 9506a742 ]
      
      It was found that when debug_locks was turned off because of a problem
      found by the lockdep code, the system performance could drop quite
      significantly when the lock_stat code was also configured into the
      kernel. For instance, parallel kernel build time on a 4-socket x86-64
      server nearly doubled.
      
      Further analysis into the cause of the slowdown traced back to the
      frequent call to debug_locks_off() from the __lock_acquired() function
      probably due to some inconsistent lockdep states with debug_locks
      off. The debug_locks_off() function did an unconditional atomic xchg
      to write a 0 value into debug_locks which had already been set to 0.
      This led to severe cacheline contention in the cacheline that held
      debug_locks.  As debug_locks is being referenced in quite a few different
      places in the kernel, this greatly slow down the system performance.
      
      To prevent that trashing of debug_locks cacheline, lock_acquired()
      and lock_contended() now checks the state of debug_locks before
      proceeding. The debug_locks_off() function is also modified to check
      debug_locks before calling __debug_locks_off().
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5cf2ab06
  9. 04 Nov, 2018 1 commit
  10. 04 Oct, 2018 1 commit
    • Bart Van Assche's avatar
      scsi: klist: Make it safe to use klists in atomic context · 1390c37d
      Bart Van Assche authored
      [ Upstream commit 624fa779 ]
      
      In the scsi_transport_srp implementation it cannot be avoided to
      iterate over a klist from atomic context when using the legacy block
      layer instead of blk-mq. Hence this patch that makes it safe to use
      klists in atomic context. This patch avoids that lockdep reports the
      following:
      
      WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
       Possible interrupt unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&(&k->k_lock)->rlock);
                                     local_irq_disable();
                                     lock(&(&q->__queue_lock)->rlock);
                                     lock(&(&k->k_lock)->rlock);
        <Interrupt>
          lock(&(&q->__queue_lock)->rlock);
      
      stack backtrace:
      Workqueue: kblockd blk_timeout_work
      Call Trace:
       dump_stack+0xa4/0xf5
       check_usage+0x6e6/0x700
       __lock_acquire+0x185d/0x1b50
       lock_acquire+0xd2/0x260
       _raw_spin_lock+0x32/0x50
       klist_next+0x47/0x190
       device_for_each_child+0x8e/0x100
       srp_timed_out+0xaf/0x1d0 [scsi_transport_srp]
       scsi_times_out+0xd4/0x410 [scsi_mod]
       blk_rq_timed_out+0x36/0x70
       blk_timeout_work+0x1b5/0x220
       process_one_work+0x4fe/0xad0
       worker_thread+0x63/0x5a0
       kthread+0x1c1/0x1e0
       ret_from_fork+0x24/0x30
      
      See also commit c9ddf734 ("scsi: scsi_transport_srp: Fix shost to
      rport translation").
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: James Bottomley <jejb@linux.vnet.ibm.com>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1390c37d
  11. 19 Sep, 2018 1 commit
  12. 15 Sep, 2018 1 commit
  13. 05 Sep, 2018 1 commit
    • Petr Mladek's avatar
      printk/nmi: Prevent deadlock when accessing the main log buffer in NMI · cd71265a
      Petr Mladek authored
      commit 03fc7f9c upstream.
      
      The commit 719f6a70 ("printk: Use the main logbuf in NMI
      when logbuf_lock is available") brought back the possible deadlocks
      in printk() and NMI.
      
      The check of logbuf_lock is done only in printk_nmi_enter() to prevent
      mixed output. But another CPU might take the lock later, enter NMI, and:
      
            + Both NMIs might be serialized by yet another lock, for example,
      	the one in nmi_cpu_backtrace().
      
            + The other CPU might get stopped in NMI, see smp_send_stop()
      	in panic().
      
      The only safe solution is to use trylock when storing the message
      into the main log-buffer. It might cause reordering when some lines
      go to the main lock buffer directly and others are delayed via
      the per-CPU buffer. It means that it is not useful in general.
      
      This patch replaces the problematic NMI deferred context with NMI
      direct context. It can be used to mark a code that might produce
      many messages in NMI and the risk of losing them is more critical
      than problems with eventual reordering.
      
      The context is then used when dumping trace buffers on oops. It was
      the primary motivation for the original fix. Also the reordering is
      even smaller issue there because some traces have their own time stamps.
      
      Finally, nmi_cpu_backtrace() need not longer be serialized because
      it will always us the per-CPU buffers again.
      
      Fixes: 719f6a70 ("printk: Use the main logbuf in NMI when logbuf_lock is available")
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com
      To: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Acked-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd71265a
  14. 17 Aug, 2018 1 commit
    • Chintan Pandya's avatar
      ioremap: Update pgtable free interfaces with addr · a3480696
      Chintan Pandya authored
      commit 785a19f9 upstream.
      
      The following kernel panic was observed on ARM64 platform due to a stale
      TLB entry.
      
       1. ioremap with 4K size, a valid pte page table is set.
       2. iounmap it, its pte entry is set to 0.
       3. ioremap the same address with 2M size, update its pmd entry with
          a new value.
       4. CPU may hit an exception because the old pmd entry is still in TLB,
          which leads to a kernel panic.
      
      Commit b6bdb751 ("mm/vmalloc: add interfaces to free unmapped page
      table") has addressed this panic by falling to pte mappings in the above
      case on ARM64.
      
      To support pmd mappings in all cases, TLB purge needs to be performed
      in this case on ARM64.
      
      Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
      so that TLB purge can be added later in seprate patches.
      
      [toshi.kani@hpe.com: merge changes, rewrite patch description]
      Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces")
      Signed-off-by: default avatarChintan Pandya <cpandya@codeaurora.org>
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: mhocko@suse.com
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3480696
  15. 25 Jul, 2018 1 commit
  16. 03 Jul, 2018 1 commit
    • Geert Uytterhoeven's avatar
      lib/vsprintf: Remove atomic-unsafe support for %pCr · ea0ac01f
      Geert Uytterhoeven authored
      commit 666902e4 upstream.
      
      "%pCr" formats the current rate of a clock, and calls clk_get_rate().
      The latter obtains a mutex, hence it must not be called from atomic
      context.
      
      Remove support for this rarely-used format, as vsprintf() (and e.g.
      printk()) must be callable from any context.
      
      Any remaining out-of-tree users will start seeing the clock's name
      printed instead of its rate.
      Reported-by: default avatarJia-Ju Bai <baijiaju1990@gmail.com>
      Fixes: 900cca29 ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks")
      Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be
      To: Jia-Ju Bai <baijiaju1990@gmail.com>
      To: Jonathan Corbet <corbet@lwn.net>
      To: Michael Turquette <mturquette@baylibre.com>
      To: Stephen Boyd <sboyd@kernel.org>
      To: Zhang Rui <rui.zhang@intel.com>
      To: Eduardo Valentin <edubezval@gmail.com>
      To: Eric Anholt <eric@anholt.net>
      To: Stefan Wahren <stefan.wahren@i2se.com>
      To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-doc@vger.kernel.org
      Cc: linux-clk@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: linux-serial@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-renesas-soc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: stable@vger.kernel.org # 4.1+
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea0ac01f
  17. 30 May, 2018 2 commits
  18. 22 May, 2018 2 commits
    • Ross Zwisler's avatar
      radix tree: fix multi-order iteration race · 572e2385
      Ross Zwisler authored
      commit 9f418224 upstream.
      
      Fix a race in the multi-order iteration code which causes the kernel to
      hit a GP fault.  This was first seen with a production v4.15 based
      kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
      order 9 PMD DAX entries.
      
      The race has to do with how we tear down multi-order sibling entries
      when we are removing an item from the tree.  Remember for example that
      an order 2 entry looks like this:
      
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
      
      where 'entry' is in some slot in the struct radix_tree_node, and the
      three slots following 'entry' contain sibling pointers which point back
      to 'entry.'
      
      When we delete 'entry' from the tree, we call :
      
        radix_tree_delete()
          radix_tree_delete_item()
            __radix_tree_delete()
              replace_slot()
      
      replace_slot() first removes the siblings in order from the first to the
      last, then at then replaces 'entry' with NULL.  This means that for a
      brief period of time we end up with one or more of the siblings removed,
      so:
      
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
      
      This causes an issue if you have a reader iterating over the slots in
      the tree via radix_tree_for_each_slot() while only under
      rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
      mm/filemap.c.
      
      The issue is that when __radix_tree_next_slot() => skip_siblings() tries
      to skip over the sibling entries in the slots, it currently does so with
      an exact match on the slot directly preceding our current slot.
      Normally this works:
      
                                            V preceding slot
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                                    ^ current slot
      
      This lets you find the first sibling, and you skip them all in order.
      
      But in the case where one of the siblings is NULL, that slot is skipped
      and then our sibling detection is interrupted:
      
                                                   V preceding slot
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                          ^ current slot
      
      This means that the sibling pointers aren't recognized since they point
      all the way back to 'entry', so we think that they are normal internal
      radix tree pointers.  This causes us to think we need to walk down to a
      struct radix_tree_node starting at the address of 'entry'.
      
      In a real running kernel this will crash the thread with a GP fault when
      you try and dereference the slots in your broken node starting at
      'entry'.
      
      We fix this race by fixing the way that skip_siblings() detects sibling
      nodes.  Instead of testing against the preceding slot we instead look
      for siblings via is_sibling_entry() which compares against the position
      of the struct radix_tree_node.slots[] array.  This ensures that sibling
      entries are properly identified, even if they are no longer contiguous
      with the 'entry' they point to.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
      Fixes: 148deab2 ("radix-tree: improve multiorder iterators")
      Signed-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Reported-by: default avatarCR, Sapthagirish <sapthagirish.cr@intel.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      572e2385
    • Matthew Wilcox's avatar
      lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly · f6c0f020
      Matthew Wilcox authored
      commit 1e3054b9 upstream.
      
      I had neglected to increment the error counter when the tests failed,
      which made the tests noisy when they fail, but not actually return an
      error code.
      
      Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au
      Fixes: 3cc78125 ("lib/test_bitmap.c: add optimisation tests")
      Signed-off-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reported-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Tested-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Yury Norov <ynorov@caviumnetworks.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6c0f020
  19. 09 May, 2018 1 commit
    • Matthew Wilcox's avatar
      errseq: Always report a writeback error once · 0799a0ea
      Matthew Wilcox authored
      commit b4678df1 upstream.
      
      The errseq_t infrastructure assumes that errors which occurred before
      the file descriptor was opened are of no interest to the application.
      This turns out to be a regression for some applications, notably Postgres.
      
      Before errseq_t, a writeback error would be reported exactly once (as
      long as the inode remained in memory), so Postgres could open a file,
      call fsync() and find out whether there had been a writeback error on
      that file from another process.
      
      This patch changes the errseq infrastructure to report errors to all
      file descriptors which are opened after the error occurred, but before
      it was reported to any file descriptor.  This restores the user-visible
      behaviour.
      
      Cc: stable@vger.kernel.org
      Fixes: 5660e13d ("fs: new infrastructure for writeback error handling and reporting")
      Signed-off-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0799a0ea
  20. 01 May, 2018 1 commit
    • Dmitry Vyukov's avatar
      kobject: don't use WARN for registration failures · a5f42767
      Dmitry Vyukov authored
      commit 3e14c6ab upstream.
      
      This WARNING proved to be noisy. The function still returns an error
      and callers should handle it. That's how most of kernel code works.
      Downgrade the WARNING to pr_err() and leave WARNINGs for kernel bugs.
      Signed-off-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reported-by: syzbot+209c0f67f99fec8eb14b@syzkaller.appspotmail.com
      Reported-by: syzbot+7fb6d9525a4528104e05@syzkaller.appspotmail.com
      Reported-by: syzbot+2e63711063e2d8f9ea27@syzkaller.appspotmail.com
      Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a5f42767
  21. 26 Apr, 2018 1 commit
    • Yonghong Song's avatar
      bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y · 3e01c16d
      Yonghong Song authored
      
      [ Upstream commit 09584b40 ]
      
      With CONFIG_BPF_JIT_ALWAYS_ON is defined in the config file,
      tools/testing/selftests/bpf/test_kmod.sh failed like below:
        [root@localhost bpf]# ./test_kmod.sh
        sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
        [ JIT enabled:0 hardened:0 ]
        [  132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:0 ]
        [  133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:1 ]
        [  134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:2 ]
        [  136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [root@localhost bpf]#
      
      The test_kmod.sh load/remove test_bpf.ko multiple times with different
      settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297
      of test_bpf.ko is designed such that JIT always fails.
      
      Commit 290af866 (bpf: introduce BPF_JIT_ALWAYS_ON config)
      introduced the following tightening logic:
          ...
              if (!bpf_prog_is_dev_bound(fp->aux)) {
                      fp = bpf_int_jit_compile(fp);
          #ifdef CONFIG_BPF_JIT_ALWAYS_ON
                      if (!fp->jited) {
                              *err = -ENOTSUPP;
                              return fp;
                      }
          #endif
          ...
      With this logic, Test #297 always gets return value -ENOTSUPP
      when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.
      
      This patch fixed the failure by marking Test #297 as expected failure
      when CONFIG_BPF_JIT_ALWAYS_ON is defined.
      
      Fixes: 290af866 (bpf: introduce BPF_JIT_ALWAYS_ON config)
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e01c16d
  22. 19 Apr, 2018 1 commit
  23. 31 Mar, 2018 1 commit
  24. 28 Mar, 2018 1 commit
    • Toshi Kani's avatar
      mm/vmalloc: add interfaces to free unmapped page table · acdb4981
      Toshi Kani authored
      commit b6bdb751 upstream.
      
      On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
      create pud/pmd mappings.  A kernel panic was observed on arm64 systems
      with Cortex-A75 in the following steps as described by Hanjun Guo.
      
       1. ioremap a 4K size, valid page table will build,
       2. iounmap it, pte0 will set to 0;
       3. ioremap the same address with 2M size, pgd/pmd is unchanged,
          then set the a new value for pmd;
       4. pte0 is leaked;
       5. CPU may meet exception because the old pmd is still in TLB,
          which will lead to kernel panic.
      
      This panic is not reproducible on x86.  INVLPG, called from iounmap,
      purges all levels of entries associated with purged address on x86.  x86
      still has memory leak.
      
      The patch changes the ioremap path to free unmapped page table(s) since
      doing so in the unmap path has the following issues:
      
       - The iounmap() path is shared with vunmap(). Since vmap() only
         supports pte mappings, making vunmap() to free a pte page is an
         overhead for regular vmap users as they do not need a pte page freed
         up.
      
       - Checking if all entries in a pte page are cleared in the unmap path
         is racy, and serializing this check is expensive.
      
       - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
         Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
         purge.
      
      Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
      clear a given pud/pmd entry and free up a page for the lower level
      entries.
      
      This patch implements their stub functions on x86 and arm64, which work
      as workaround.
      
      [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
      Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
      Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
      Reported-by: default avatarLei Li <lious.lilei@hisilicon.com>
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chintan Pandya <cpandya@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      acdb4981
  25. 19 Mar, 2018 1 commit
  26. 15 Mar, 2018 1 commit
    • Kees Cook's avatar
      lib/bug.c: exclude non-BUG/WARN exceptions from report_bug() · d50cb5ce
      Kees Cook authored
      commit 1b4cfe3c upstream.
      
      Commit b8347c21 ("x86/debug: Handle warnings before the notifier
      chain, to fix KGDB crash") changed the ordering of fixups, and did not
      take into account the case of x86 processing non-WARN() and non-BUG()
      exceptions.  This would lead to output of a false BUG line with no other
      information.
      
      In the case of a refcount exception, it would be immediately followed by
      the refcount WARN(), producing very strange double-"cut here":
      
        lkdtm: attempting bad refcount_inc() overflow
        ------------[ cut here ]------------
        Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
        ------------[ cut here ]------------
        refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
        WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
        ...
      
      In the prior ordering, exceptions were searched first:
      
         do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
         ...
                      if (fixup_exception(regs, trapnr))
                              return 0;
      
        -               if (fixup_bug(regs, trapnr))
        -                       return 0;
        -
      
      As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account
      needing to search the exception list first, since that had already
      happened.
      
      So, instead of searching the exception list twice (once in
      is_valid_bugaddr() and then again in fixup_exception()), just add a
      simple sanity check to report_bug() that will immediately bail out if a
      BUG() (or WARN()) entry is not found.
      
      Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
      Fixes: b8347c21 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d50cb5ce
  27. 03 Mar, 2018 1 commit
    • James Hogan's avatar
      lib/mpi: Fix umul_ppmm() for MIPS64r6 · 22d5e20c
      James Hogan authored
      
      [ Upstream commit bbc25bee ]
      
      Current MIPS64r6 toolchains aren't able to generate efficient
      DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
      performs an unsigned 64 x 64 bit multiply and returns the upper and
      lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
      inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
      x 128 multiply. This is both inefficient, and it results in a link error
      since we don't include __multi3 in MIPS linux.
      
      For example commit 90a53e44 ("cfg80211: implement regdb signature
      checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
      64r6el_defconfig builds by indirectly selecting MPILIB. The same build
      errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:
      
      lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
      lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
      lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
      lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
      lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
      lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
      lib/mpi/mpih-div.o In function `mpihelp_divrem':
      lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
      lib/mpi/mpih-div.c:142: undefined reference to `__multi3'
      
      Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
      inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
      calls being emitted.
      
      Fixes: 7fd08ca5 ("MIPS: Add build support for the MIPS R6 ISA")
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-mips@linux-mips.org
      Cc: linux-crypto@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22d5e20c
  28. 25 Feb, 2018 1 commit
  29. 22 Feb, 2018 2 commits
  30. 16 Feb, 2018 3 commits
    • Andrey Ryabinin's avatar
      lib/ubsan: add type mismatch handler for new GCC/Clang · 2617e62c
      Andrey Ryabinin authored
      commit 42440c1f upstream.
      
      UBSAN=y fails to build with new GCC/clang:
      
          arch/x86/kernel/head64.o: In function `sanitize_boot_params':
          arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1'
      
      because Clang and GCC 8 slightly changed ABI for 'type mismatch' errors.
      Compiler now uses new __ubsan_handle_type_mismatch_v1() function with
      slightly modified 'struct type_mismatch_data'.
      
      Let's add new 'struct type_mismatch_data_common' which is independent from
      compiler's layout of 'struct type_mismatch_data'.  And make
      __ubsan_handle_type_mismatch[_v1]() functions transform compiler-dependent
      type mismatch data to our internal representation.  This way, we can
      support both old and new compilers with minimal amount of change.
      
      Link: http://lkml.kernel.org/r/20180119152853.16806-1-aryabinin@virtuozzo.comSigned-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: default avatarSodagudi Prasad <psodagud@codeaurora.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2617e62c
    • Andrew Morton's avatar
      lib/ubsan.c: s/missaligned/misaligned/ · 5a5df777
      Andrew Morton authored
      commit b8fe1120 upstream.
      
      A vist from the spelling fairy.
      
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a5df777
    • Arnd Bergmann's avatar
      kasan: rework Kconfig settings · 062cd346
      Arnd Bergmann authored
      commit e7c52b84 upstream.
      
      We get a lot of very large stack frames using gcc-7.0.1 with the default
      -fsanitize-address-use-after-scope --param asan-stack=1 options, which can
      easily cause an overflow of the kernel stack, e.g.
      
        drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
        drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
        lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
        drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
        drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
        fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes
      
      To reduce this risk, -fsanitize-address-use-after-scope is now split out
      into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
      frames that are smaller than 2 kilobytes most of the time on x86_64.  An
      earlier version of this patch also prevented combining KASAN_EXTRA with
      KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.
      
      All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
      and CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can
      bring back that default now.  KASAN_EXTRA=y still causes lots of
      warnings but now defaults to !COMPILE_TEST to disable it in
      allmodconfig, and it remains disabled in all other defconfigs since it
      is a new option.  I arbitrarily raise the warning limit for KASAN_EXTRA
      to 3072 to reduce the noise, but an allmodconfig kernel still has around
      50 warnings on gcc-7.
      
      I experimented a bit more with smaller stack frames and have another
      follow-up series that reduces the warning limit for 64-bit architectures
      to 1280 bytes (without CONFIG_KASAN).
      
      With earlier versions of this patch series, I also had patches to address
      the warnings we get with KASAN and/or KASAN_EXTRA, using a
      "noinline_if_stackbloat" annotation.
      
      That annotation now got replaced with a gcc-8 bugfix (see
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
      older compilers, which means that KASAN_EXTRA is now just as bad as
      before and will lead to an instant stack overflow in a few extreme
      cases.
      
      This reverts parts of commit 3f181b4d ("lib/Kconfig.debug: disable
      -Wframe-larger-than warnings with KASAN=y").  Two patches in linux-next
      should be merged first to avoid introducing warnings in an allmodconfig
      build:
        3cd890db ("media: dvb-frontends: fix i2c access helpers for KASAN")
        16c3ada8 ("media: r820t: fix r820t_write_reg for KASAN")
      
      Do we really need to backport this?
      
      I think we do: without this patch, enabling KASAN will lead to
      unavoidable kernel stack overflow in certain device drivers when built
      with gcc-7 or higher on linux-4.10+ or any version that contains a
      backport of commit c5caf21a.  Most people are probably still on
      older compilers, but it will get worse over time as they upgrade their
      distros.
      
      The warnings we get on kernels older than this should all be for code
      that uses dangerously large stack frames, though most of them do not
      cause an actual stack overflow by themselves.The asan-stack option was
      added in linux-4.0, and commit 3f181b4d ("lib/Kconfig.debug:
      disable -Wframe-larger-than warnings with KASAN=y") effectively turned
      off the warning for allmodconfig kernels, so I would like to see this
      fix backported to any kernels later than 4.0.
      
      I have done dozens of fixes for individual functions with stack frames
      larger than 2048 bytes with asan-stack, and I plan to make sure that
      all those fixes make it into the stable kernels as well (most are
      already there).
      
      Part of the complication here is that asan-stack (from 4.0) was
      originally assumed to always require much larger stacks, but that
      turned out to be a combination of multiple gcc bugs that we have now
      worked around and fixed, but sanitize-address-use-after-scope (from
      v4.10) has a much higher inherent stack usage and also suffers from at
      least three other problems that we have analyzed but not yet fixed
      upstream, each of them makes the stack usage more severe than it should
      be.
      
      Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.deSigned-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      062cd346
  31. 03 Feb, 2018 1 commit