1. 02 Sep, 2019 4 commits
  2. 12 Aug, 2019 1 commit
  3. 12 Feb, 2019 2 commits
  4. 06 Feb, 2019 6 commits
  5. 31 Jan, 2019 7 commits
  6. 26 Jan, 2019 7 commits
    • Steve French's avatar
      cifs: allow disabling insecure dialects in the config · bfbd7e95
      Steve French authored
      commit 7420451f upstream.
      
      allow disabling cifs (SMB1 ie vers=1.0) and vers=2.0 in the
      config for the build of cifs.ko if want to always prevent mounting
      with these less secure dialects.
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Reviewed-by: default avatarAurelien Aptel <aaptel@suse.com>
      Reviewed-by: default avatarJeremy Allison <jra@samba.org>
      Cc: Alakesh Haloi <alakeshh@amazon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bfbd7e95
    • Scott Mayhew's avatar
      nfs: fix a deadlock in nfs client initialization · 53818c76
      Scott Mayhew authored
      commit c156618e upstream.
      
      The following deadlock can occur between a process waiting for a client
      to initialize in while walking the client list during nfsv4 server trunking
      detection and another process waiting for the nfs_clid_init_mutex so it
      can initialize that client:
      
      Process 1                               Process 2
      ---------                               ---------
      spin_lock(&nn->nfs_client_lock);
      list_add_tail(&CLIENTA->cl_share_link,
              &nn->nfs_client_list);
      spin_unlock(&nn->nfs_client_lock);
                                              spin_lock(&nn->nfs_client_lock);
                                              list_add_tail(&CLIENTB->cl_share_link,
                                                      &nn->nfs_client_list);
                                              spin_unlock(&nn->nfs_client_lock);
                                              mutex_lock(&nfs_clid_init_mutex);
                                              nfs41_walk_client_list(clp, result, cred);
                                              nfs_wait_client_init_complete(CLIENTA);
      (waiting for nfs_clid_init_mutex)
      
      Make sure nfs_match_client() only evaluates clients that have completed
      initialization in order to prevent that deadlock.
      
      This patch also fixes v4.0 trunking behavior by not marking the client
      NFS_CS_READY until the clientid has been confirmed.
      Signed-off-by: default avatarScott Mayhew <smayhew@redhat.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarQian Lu <luqia@amazon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53818c76
    • Junxiao Bi's avatar
      ocfs2: fix panic due to unrecovered local alloc · f976e59e
      Junxiao Bi authored
      [ Upstream commit 532e1e54 ]
      
      mount.ocfs2 ignore the inconsistent error that journal is clean but
      local alloc is unrecovered.  After mount, local alloc not empty, then
      reserver cluster didn't alloc a new local alloc window, reserveration
      map is empty(ocfs2_reservation_map.m_bitmap_len = 0), that triggered the
      following panic.
      
      This issue was reported at
      
        https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html
      
      and was advised to fixed during mount.  But this is a very unusual
      inconsistent state, usually journal dirty flag should be cleared at the
      last stage of umount until every other things go right.  We may need do
      further debug to check that.  Any way to avoid possible futher
      corruption, mount should be abort and fsck should be run.
      
        (mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered!
        found = 6518, set = 6518, taken = 8192, off = 15912372
        ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode.
        o2dlm: Joining domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes
        ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode.
        o2hb: Region 89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device
        o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777
        o2dlm: Node 7 joins domain 64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
        o2dlm: Node 7 joins domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
        ------------[ cut here ]------------
        kernel BUG at fs/ocfs2/reservations.c:507!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
        CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2
        Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018
        task: ffff8803fb04e200 ti: ffff8800ea4d8000 task.ti: ffff8800ea4d8000
        RIP: 0010:[<ffffffffa05e96a8>]  [<ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
        Call Trace:
          ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2]
          ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2]
          __ocfs2_claim_clusters+0x178/0x360 [ocfs2]
          ocfs2_claim_clusters+0x1f/0x30 [ocfs2]
          ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2]
          ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2]
          ocfs2_write_begin+0x13e/0x230 [ocfs2]
          generic_perform_write+0xbf/0x1c0
          __generic_file_write_iter+0x19c/0x1d0
          ocfs2_file_write_iter+0x589/0x1360 [ocfs2]
          __vfs_write+0xb8/0x110
          vfs_write+0xa9/0x1b0
          SyS_write+0x46/0xb0
          system_call_fastpath+0x18/0xd7
        Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85
        RIP   __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
         RSP <ffff8800ea4db668>
        ---[ end trace 566f07529f2edf3c ]---
        Kernel panic - not syncing: Fatal exception
        Kernel Offset: disabled
      
      Link: http://lkml.kernel.org/r/20181121020023.3034-2-junxiao.bi@oracle.comSigned-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarYiwen Jiang <jiangyiwen@huawei.com>
      Acked-by: default avatarJoseph Qi <jiangqi903@gmail.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f976e59e
    • Javier Barrio's avatar
      quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls. · 8c3e1bcc
      Javier Barrio authored
      [ Upstream commit 41c4f85c ]
      
      Commit 1fa5efe3 (ext4: Use generic helpers for quotaon
      and quotaoff) made possible to call quotactl(Q_XQUOTAON/OFF) on ext4 filesystems
      with sysfile quota support. This leads to calling dquot_enable/disable without s_umount
      held in excl. mode, because quotactl_cmd_onoff checks only for Q_QUOTAON/OFF.
      
      The following WARN_ON_ONCE triggers (in this case for dquot_enable, ext4, latest Linus' tree):
      
      [  117.807056] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: quota,prjquota
      
      [...]
      
      [  155.036847] WARNING: CPU: 0 PID: 2343 at fs/quota/dquot.c:2469 dquot_enable+0x34/0xb9
      [  155.036851] Modules linked in: quota_v2 quota_tree ipv6 af_packet joydev mousedev psmouse serio_raw pcspkr i2c_piix4 intel_agp intel_gtt e1000 ttm drm_kms_helper drm agpgart fb_sys_fops syscopyarea sysfillrect sysimgblt i2c_core input_leds kvm_intel kvm irqbypass qemu_fw_cfg floppy evdev parport_pc parport button crc32c_generic dm_mod ata_generic pata_acpi ata_piix libata loop ext4 crc16 mbcache jbd2 usb_storage usbcore sd_mod scsi_mod
      [  155.036901] CPU: 0 PID: 2343 Comm: qctl Not tainted 4.20.0-rc6-00025-gf5d58277 #9
      [  155.036903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [  155.036911] RIP: 0010:dquot_enable+0x34/0xb9
      [  155.036915] Code: 41 56 41 55 41 54 55 53 4c 8b 6f 28 74 02 0f 0b 4d 8d 7d 70 49 89 fc 89 cb 41 89 d6 89 f5 4c 89 ff e8 23 09 ea ff 85 c0 74 0a <0f> 0b 4c 89 ff e8 8b 09 ea ff 85 db 74 6a 41 8b b5 f8 00 00 00 0f
      [  155.036918] RSP: 0018:ffffb09b00493e08 EFLAGS: 00010202
      [  155.036922] RAX: 0000000000000001 RBX: 0000000000000008 RCX: 0000000000000008
      [  155.036924] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff9781b67cd870
      [  155.036926] RBP: 0000000000000002 R08: 0000000000000000 R09: 61c8864680b583eb
      [  155.036929] R10: ffffb09b00493e48 R11: ffffffffff7ce7d4 R12: ffff9781b7ee8d78
      [  155.036932] R13: ffff9781b67cd800 R14: 0000000000000004 R15: ffff9781b67cd870
      [  155.036936] FS:  00007fd813250b88(0000) GS:ffff9781ba000000(0000) knlGS:0000000000000000
      [  155.036939] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  155.036942] CR2: 00007fd812ff61d6 CR3: 000000007c882000 CR4: 00000000000006b0
      [  155.036951] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  155.036953] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  155.036955] Call Trace:
      [  155.037004]  dquot_quota_enable+0x8b/0xd0
      [  155.037011]  kernel_quotactl+0x628/0x74e
      [  155.037027]  ? do_mprotect_pkey+0x2a6/0x2cd
      [  155.037034]  __x64_sys_quotactl+0x1a/0x1d
      [  155.037041]  do_syscall_64+0x55/0xe4
      [  155.037078]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  155.037105] RIP: 0033:0x7fd812fe1198
      [  155.037109] Code: 02 77 0d 48 89 c1 48 c1 e9 3f 75 04 48 8b 04 24 48 83 c4 50 5b c3 48 83 ec 08 49 89 ca 48 63 d2 48 63 ff b8 b3 00 00 00 0f 05 <48> 89 c7 e8 c1 eb ff ff 5a c3 48 63 ff b8 bb 00 00 00 0f 05 48 89
      [  155.037112] RSP: 002b:00007ffe8cd7b050 EFLAGS: 00000206 ORIG_RAX: 00000000000000b3
      [  155.037116] RAX: ffffffffffffffda RBX: 00007ffe8cd7b148 RCX: 00007fd812fe1198
      [  155.037119] RDX: 0000000000000000 RSI: 00007ffe8cd7cea9 RDI: 0000000000580102
      [  155.037121] RBP: 00007ffe8cd7b0f0 R08: 000055fc8eba8a9d R09: 0000000000000000
      [  155.037124] R10: 00007ffe8cd7b074 R11: 0000000000000206 R12: 00007ffe8cd7b168
      [  155.037126] R13: 000055fc8eba8897 R14: 0000000000000000 R15: 0000000000000000
      [  155.037131] ---[ end trace 210f864257175c51 ]---
      
      and then the syscall proceeds without s_umount locking.
      
      This patch locks the superblock ->s_umount sem. in exclusive mode for all Q_XQUOTAON/OFF
      quotactls too in addition to Q_QUOTAON/OFF.
      
      AFAICT, other than ext4, only xfs and ocfs2 are affected by this change.
      The VFS will now call in xfs_quota_* functions with s_umount held, which wasn't the case
      before. This looks good to me but I can not say for sure. Ext4 and ocfs2 where already
      beeing called with s_umount exclusive via quota_quotaon/off which is basically the same.
      Signed-off-by: default avatarJavier Barrio <javier.barrio.mart@gmail.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8c3e1bcc
    • Johannes Thumshirn's avatar
      btrfs: improve error handling of btrfs_add_link · 6b337c59
      Johannes Thumshirn authored
      [ Upstream commit 1690dd41 ]
      
      In the error handling block, err holds the return value of either
      btrfs_del_root_ref() or btrfs_del_inode_ref() but it hasn't been checked
      since it's introduction with commit fe66a05a (Btrfs: improve error
      handling for btrfs_insert_dir_item callers) in 2012.
      
      If the error handling in the error handling fails, there's not much left
      to do and the abort either happened earlier in the callees or is
      necessary here.
      
      So if one of btrfs_del_root_ref() or btrfs_del_inode_ref() failed, abort
      the transaction, but still return the original code of the failure
      stored in 'ret' as this will be reported to the user.
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6b337c59
    • Joel Fernandes (Google)'s avatar
      pstore/ram: Do not treat empty buffers as valid · e1e3a467
      Joel Fernandes (Google) authored
      [ Upstream commit 30696378 ]
      
      The ramoops backend currently calls persistent_ram_save_old() even
      if a buffer is empty. While this appears to work, it is does not seem
      like the right thing to do and could lead to future bugs so lets avoid
      that. It also prevents misleading prints in the logs which claim the
      buffer is valid.
      
      I got something like:
      
      	found existing buffer, size 0, start 0
      
      When I was expecting:
      
      	no valid data in buffer (sig = ...)
      
      This bails out early (and reports with pr_debug()), since it's an
      acceptable state.
      Signed-off-by: default avatarJoel Fernandes (Google) <joel@joelfernandes.org>
      Co-developed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e1e3a467
    • Daniel Santos's avatar
      jffs2: Fix use of uninitialized delayed_work, lockdep breakage · f63e5b19
      Daniel Santos authored
      [ Upstream commit a788c527 ]
      
      jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER
      is defined then a write buffer is available and has been initialized.
      However, this does is not the case when the mtd device has no
      out-of-band buffer:
      
      int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
      {
              if (!c->mtd->oobsize)
                      return 0;
      ...
      
      The resulting call to cancel_delayed_work_sync passing a uninitialized
      (but zeroed) delayed_work struct forces lockdep to become disabled.
      
      [   90.050639] overlayfs: upper fs does not support tmpfile.
      [   90.652264] INFO: trying to register non-static key.
      [   90.662171] the code is fine but needs lockdep annotation.
      [   90.673090] turning off the locking correctness validator.
      [   90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0
      [   90.696672] Stack : 00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7
      [   90.713349]         80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff
      [   90.730020]         00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420
      [   90.746690]         6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90
      [   90.763362]         80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000
      [   90.780033]         ...
      [   90.784902] Call Trace:
      [   90.789793] [<8000f0d8>] show_stack+0xb8/0x148
      [   90.798659] [<8005a000>] register_lock_class+0x270/0x55c
      [   90.809247] [<8005cb64>] __lock_acquire+0x13c/0xf7c
      [   90.818964] [<8005e314>] lock_acquire+0x194/0x1dc
      [   90.828345] [<8003f27c>] flush_work+0x200/0x24c
      [   90.837374] [<80041dfc>] __cancel_work_timer+0x158/0x210
      [   90.847958] [<801a8770>] jffs2_sync_fs+0x20/0x54
      [   90.857173] [<80125cf4>] iterate_supers+0xf4/0x120
      [   90.866729] [<80158fc4>] sys_sync+0x44/0x9c
      [   90.875067] [<80014424>] syscall_common+0x34/0x58
      Signed-off-by: default avatarDaniel Santos <daniel.santos@pobox.com>
      Reviewed-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f63e5b19
  7. 23 Jan, 2019 4 commits
    • Jan Kara's avatar
      blockdev: Fix livelocks on loop device · 0fb89795
      Jan Kara authored
      commit 04906b2f upstream.
      
      bd_set_size() updates also block device's block size. This is somewhat
      unexpected from its name and at this point, only blkdev_open() uses this
      functionality. Furthermore, this can result in changing block size under
      a filesystem mounted on a loop device which leads to livelocks inside
      __getblk_gfp() like:
      
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
      01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106
      ...
      Call Trace:
       init_page_buffers+0x3e2/0x530 fs/buffer.c:904
       grow_dev_page fs/buffer.c:947 [inline]
       grow_buffers fs/buffer.c:1009 [inline]
       __getblk_slow fs/buffer.c:1036 [inline]
       __getblk_gfp+0x906/0xb10 fs/buffer.c:1313
       __bread_gfp+0x2d/0x310 fs/buffer.c:1347
       sb_bread include/linux/buffer_head.h:307 [inline]
       fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75
       fat_ent_read_block fs/fat/fatent.c:441 [inline]
       fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489
       fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101
       __fat_get_block fs/fat/inode.c:148 [inline]
      ...
      
      Trivial reproducer for the problem looks like:
      
      truncate -s 1G /tmp/image
      losetup /dev/loop0 /tmp/image
      mkfs.ext4 -b 1024 /dev/loop0
      mount -t ext4 /dev/loop0 /mnt
      losetup -c /dev/loop0
      l /mnt
      
      Fix the problem by moving initialization of a block device block size
      into a separate function and call it when needed.
      
      Thanks to Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> for help with
      debugging the problem.
      
      Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fb89795
    • Kees Cook's avatar
      pstore/ram: Avoid allocation and leak of platform data · 96188b18
      Kees Cook authored
      commit 5631e857 upstream.
      
      Yue Hu noticed that when parsing device tree the allocated platform data
      was never freed. Since it's not used beyond the function scope, this
      switches to using a stack variable instead.
      Reported-by: default avatarYue Hu <huyue2@yulong.com>
      Fixes: 35da6094 ("pstore/ram: add Device Tree bindings")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96188b18
    • Josef Bacik's avatar
      btrfs: wait on ordered extents on abort cleanup · f97fd292
      Josef Bacik authored
      commit 74d5d229 upstream.
      
      If we flip read-only before we initiate writeback on all dirty pages for
      ordered extents we've created then we'll have ordered extents left over
      on umount, which results in all sorts of bad things happening.  Fix this
      by making sure we wait on ordered extents if we have to do the aborted
      transaction cleanup stuff.
      
      generic/475 can produce this warning:
      
       [ 8531.177332] WARNING: CPU: 2 PID: 11997 at fs/btrfs/disk-io.c:3856 btrfs_free_fs_root+0x95/0xa0 [btrfs]
       [ 8531.183282] CPU: 2 PID: 11997 Comm: umount Tainted: G        W 5.0.0-rc1-default+ #394
       [ 8531.185164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
       [ 8531.187851] RIP: 0010:btrfs_free_fs_root+0x95/0xa0 [btrfs]
       [ 8531.193082] RSP: 0018:ffffb1ab86163d98 EFLAGS: 00010286
       [ 8531.194198] RAX: ffff9f3449494d18 RBX: ffff9f34a2695000 RCX:0000000000000000
       [ 8531.195629] RDX: 0000000000000002 RSI: 0000000000000001 RDI:0000000000000000
       [ 8531.197315] RBP: ffff9f344e930000 R08: 0000000000000001 R09:0000000000000000
       [ 8531.199095] R10: 0000000000000000 R11: ffff9f34494d4ff8 R12:ffffb1ab86163dc0
       [ 8531.200870] R13: ffff9f344e9300b0 R14: ffffb1ab86163db8 R15:0000000000000000
       [ 8531.202707] FS:  00007fc68e949fc0(0000) GS:ffff9f34bd800000(0000)knlGS:0000000000000000
       [ 8531.204851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [ 8531.205942] CR2: 00007ffde8114dd8 CR3: 000000002dfbd000 CR4:00000000000006e0
       [ 8531.207516] Call Trace:
       [ 8531.208175]  btrfs_free_fs_roots+0xdb/0x170 [btrfs]
       [ 8531.210209]  ? wait_for_completion+0x5b/0x190
       [ 8531.211303]  close_ctree+0x157/0x350 [btrfs]
       [ 8531.212412]  generic_shutdown_super+0x64/0x100
       [ 8531.213485]  kill_anon_super+0x14/0x30
       [ 8531.214430]  btrfs_kill_super+0x12/0xa0 [btrfs]
       [ 8531.215539]  deactivate_locked_super+0x29/0x60
       [ 8531.216633]  cleanup_mnt+0x3b/0x70
       [ 8531.217497]  task_work_run+0x98/0xc0
       [ 8531.218397]  exit_to_usermode_loop+0x83/0x90
       [ 8531.219324]  do_syscall_64+0x15b/0x180
       [ 8531.220192]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
       [ 8531.221286] RIP: 0033:0x7fc68e5e4d07
       [ 8531.225621] RSP: 002b:00007ffde8116608 EFLAGS: 00000246 ORIG_RAX:00000000000000a6
       [ 8531.227512] RAX: 0000000000000000 RBX: 00005580c2175970 RCX:00007fc68e5e4d07
       [ 8531.229098] RDX: 0000000000000001 RSI: 0000000000000000 RDI:00005580c2175b80
       [ 8531.230730] RBP: 0000000000000000 R08: 00005580c2175ba0 R09:00007ffde8114e80
       [ 8531.232269] R10: 0000000000000000 R11: 0000000000000246 R12:00005580c2175b80
       [ 8531.233839] R13: 00007fc68eac61c4 R14: 00005580c2175a68 R15:0000000000000000
      
      Leaving a tree in the rb-tree:
      
      3853 void btrfs_free_fs_root(struct btrfs_root *root)
      3854 {
      3855         iput(root->ino_cache_inode);
      3856         WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
      
      CC: stable@vger.kernel.org
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      [ add stacktrace ]
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f97fd292
    • David Sterba's avatar
      Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io" · 0400be16
      David Sterba authored
      commit 77b7aad1 upstream.
      
      This reverts commit e73e81b6.
      
      This patch causes a few problems:
      
      - adds latency to btrfs_finish_ordered_io
      - as btrfs_finish_ordered_io is used for free space cache, generating
        more work from btrfs_btree_balance_dirty_nodelay could end up in the
        same workque, effectively deadlocking
      
      12260 kworker/u96:16+btrfs-freespace-write D
      [<0>] balance_dirty_pages+0x6e6/0x7ad
      [<0>] balance_dirty_pages_ratelimited+0x6bb/0xa90
      [<0>] btrfs_finish_ordered_io+0x3da/0x770
      [<0>] normal_work_helper+0x1c5/0x5a0
      [<0>] process_one_work+0x1ee/0x5a0
      [<0>] worker_thread+0x46/0x3d0
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      Transaction commit will wait on the freespace cache:
      
      838 btrfs-transacti D
      [<0>] btrfs_start_ordered_extent+0x154/0x1e0
      [<0>] btrfs_wait_ordered_range+0xbd/0x110
      [<0>] __btrfs_wait_cache_io+0x49/0x1a0
      [<0>] btrfs_write_dirty_block_groups+0x10b/0x3b0
      [<0>] commit_cowonly_roots+0x215/0x2b0
      [<0>] btrfs_commit_transaction+0x37e/0x910
      [<0>] transaction_kthread+0x14d/0x180
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      And then writepages ends up waiting on transaction commit:
      
      9520 kworker/u96:13+flush-btrfs-1 D
      [<0>] wait_current_trans+0xac/0xe0
      [<0>] start_transaction+0x21b/0x4b0
      [<0>] cow_file_range_inline+0x10b/0x6b0
      [<0>] cow_file_range.isra.69+0x329/0x4a0
      [<0>] run_delalloc_range+0x105/0x3c0
      [<0>] writepage_delalloc+0x119/0x180
      [<0>] __extent_writepage+0x10c/0x390
      [<0>] extent_write_cache_pages+0x26f/0x3d0
      [<0>] extent_writepages+0x4f/0x80
      [<0>] do_writepages+0x17/0x60
      [<0>] __writeback_single_inode+0x59/0x690
      [<0>] writeback_sb_inodes+0x291/0x4e0
      [<0>] __writeback_inodes_wb+0x87/0xb0
      [<0>] wb_writeback+0x3bb/0x500
      [<0>] wb_workfn+0x40d/0x610
      [<0>] process_one_work+0x1ee/0x5a0
      [<0>] worker_thread+0x1e0/0x3d0
      [<0>] kthread+0xf5/0x130
      [<0>] ret_from_fork+0x24/0x30
      [<0>] 0xffffffffffffffff
      
      Eventually, we have every process in the system waiting on
      balance_dirty_pages(), and nobody is able to make progress on page
      writeback.
      
      The original patch tried to fix an OOM condition, that happened on 4.4 but no
      success reproducing that on later kernels (4.19 and 4.20). This is more likely
      a problem in OOM itself.
      
      Link: https://lore.kernel.org/linux-btrfs/20180528054821.9092-1-ethanlien@synology.com/Reported-by: default avatarChris Mason <clm@fb.com>
      CC: stable@vger.kernel.org # 4.18+
      CC: ethanlien <ethanlien@synology.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0400be16
  8. 16 Jan, 2019 8 commits
  9. 13 Jan, 2019 1 commit