Age | Commit message | Author | Lines
2017-06-23Merge tag 'random_for_linus_stable' of ↵Linus Torvalds-6/+6
git:// Pull random fixes from Ted Ts'o: "Fix some locking and gcc optimization issues from the most recent random_for_linus_stable pull request" * tag 'random_for_linus_stable' of git:// random: silence compiler warnings and fix race
2017-06-23Merge tag 'for-4.12/dm-fixes-4' of ↵Linus Torvalds-6/+31
git:// Pull device mapper fixes from Mike Snitzer: - a revert of a DM mirror commit that has proven to make the code prone to crash - a DM io reference count fix that resolves a NULL pointer seen when issuing discards to a DM mirror target's device whose mirror legs do not all support discards - a couple DM integrity fixes * tag 'for-4.12/dm-fixes-4' of git:// dm io: fix duplicate bio completion due to missing ref count dm integrity: fix to not disable/enable interrupts from interrupt context Revert "dm mirror: use all available legs on multiple failures" dm integrity: reject mappings too large for device
2017-06-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds-35/+86
Merge misc fixes from Andrew Morton: "8 fixes" * emailed patches from Andrew Morton <>: fs/exec.c: account for argv/envp pointers ocfs2: fix deadlock caused by recursive locking in xattr slub: make sysfs file removal asynchronous lib/cmdline.c: fix get_options() overflow while parsing ranges fs/dax.c: fix inefficiency in dax_writeback_mapping_range() autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings mm, thp: remove cond_resched from __collapse_huge_page_copy
2017-06-23fs/exec.c: account for argv/envp pointersKees Cook-4/+24
When limiting the argv/envp strings during exec to 1/4 of the stack limit, the storage of the pointers to the strings was not included. This means that an exec with huge numbers of tiny strings could eat 1/4 of the stack limit in strings and then additional space would be later used by the pointers to the strings. For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721 single-byte strings would consume less than 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the strings would consume the remaining additional stack space (1677721 * 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust stack space entirely. Controlling this stack exhaustion could result in pathological behavior in setuid binaries (CVE-2017-1000365). [ additional commenting from Kees] Fixes: b6a2fea39318 ("mm: variable length argument support") Link: Signed-off-by: Kees Cook <> Acked-by: Rik van Riel <> Acked-by: Michal Hocko <> Cc: Alexander Viro <> Cc: Qualys Security Advisory <> Cc: <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
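The arithmetic quoted in the commit message above can be checked with a short Python model. It is only an illustration of the numbers in the message, not kernel code; the function name `stack_usage` and its parameters are hypothetical.

```python
STACK_RLIMIT = 8 * 1024 * 1024      # 8MB stack rlimit from the example
STRING_LIMIT = STACK_RLIMIT // 4    # strings may use at most 1/4 of the limit


def stack_usage(n_strings, string_bytes=1, pointer_bytes=4):
    """Return (bytes used by strings, bytes used by pointers, total)."""
    strings = n_strings * string_bytes
    pointers = n_strings * pointer_bytes
    return strings, pointers, strings + pointers


strings, pointers, total = stack_usage(1677721)
print(strings <= STRING_LIMIT)   # the strings alone pass the 1/4 check
print(pointers)                  # space taken by the 32-bit pointers
print(STACK_RLIMIT - total)      # bytes of stack left over
```

With the pointers counted, the strings and pointers together leave only a few bytes of the 8MB limit, which is the exhaustion the message describes.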
2017-06-23ocfs2: fix deadlock caused by recursive locking in xattrEric Ren-10/+17
Another deadlock path caused by recursive locking is reported. This kind of issue was introduced since commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been fixed by commit b891fa5024a9 ("ocfs2: fix deadlock issue when taking inode lock at vfs entry points"). Yes, we intend to fix this kind of case in incremental way, because it's hard to find out all possible paths at once. This one can be reproduced like this. On node1, cp a large file from home directory to ocfs2 mountpoint. While on node2, run setfacl/getfacl. Both nodes will hang up there. The backtraces: On node1: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_write_begin+0x43/0x1a0 [ocfs2] generic_perform_write+0xa9/0x180 __generic_file_write_iter+0x1aa/0x1d0 ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2] __vfs_write+0xc3/0x130 vfs_write+0xb1/0x1a0 SyS_write+0x46/0xa0 On node2: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_xattr_set+0x12e/0xe80 [ocfs2] ocfs2_set_acl+0x22d/0x260 [ocfs2] ocfs2_iop_set_acl+0x65/0xb0 [ocfs2] set_posix_acl+0x75/0xb0 posix_acl_xattr_set+0x49/0xa0 __vfs_setxattr+0x69/0x80 __vfs_setxattr_noperm+0x72/0x1a0 vfs_setxattr+0xa7/0xb0 setxattr+0x12d/0x190 path_setxattr+0x9f/0xb0 SyS_setxattr+0x14/0x20 Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is exported by commit 439a36b8ef38 ("ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock"). Link: Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <> Reported-by: Thomas Voegtle <> Tested-by: Thomas Voegtle <> Reviewed-by: Joseph Qi <> Cc: Mark Fasheh <> Cc: Joel Becker <> Cc: Junxiao Bi <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
2017-06-23slub: make sysfs file removal asynchronousTejun Heo-14/+27
Commit bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") made slub sysfs file removals synchronous to kmem_cache shutdown. Unfortunately, this created a possible ABBA deadlock between slab_mutex and sysfs draining mechanism triggering the following lockdep warning. ====================================================== [ INFO: possible circular locking dependency detected ] 4.10.0-test+ #48 Not tainted ------------------------------------------------------- rmmod/1211 is trying to acquire lock: (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40 but task is already holding lock: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slab_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x75/0x950 mutex_lock_nested+0x1b/0x20 slab_attr_store+0x75/0xd0 sysfs_kf_write+0x45/0x60 kernfs_fop_write+0x13c/0x1c0 __vfs_write+0x28/0x120 vfs_write+0xc8/0x1e0 SyS_write+0x49/0xa0 entry_SYSCALL_64_fastpath+0x1f/0xc2 -> #0 (s_active#120){++++.+}: __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slab_mutex); lock(s_active#120); lock(slab_mutex); lock(s_active#120); *** DEADLOCK *** 2 locks held by rmmod/1211: #0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80 #1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 stack backtrace: CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 
v02.05 05/07/2012 Call Trace: print_circular_bug+0x1be/0x210 __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 ? SyS_delete_module+0x5/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 It'd be the cleanest to deal with the issue by removing sysfs files without holding slab_mutex before the rest of shutdown; however, given the current code structure, it is pretty difficult to do so. This patch punts sysfs file removal to a work item. Before commit bf5eb3de3847, the removal was punted to a RCU delayed work item which is executed after release. Now, we're punting to a different work item on shutdown which still maintains the goal removing the sysfs files earlier when destroying kmem_caches. Link: Fixes: bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") Signed-off-by: Tejun Heo <> Reported-by: Steven Rostedt (VMware) <> Tested-by: Steven Rostedt (VMware) <> Cc: Christoph Lameter <> Cc: Pekka Enberg <> Cc: David Rientjes <> Cc: Joonsoo Kim <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
2017-06-23lib/cmdline.c: fix get_options() overflow while parsing rangesIlya Matveychikov-3/+3
When using get_options() it's possible to specify a range of numbers, like 1-100500. The problem is that it doesn't track array size while calling internally to get_range() which iterates over the range and fills the memory with numbers. Link: Signed-off-by: Ilya V. Matveychikov <> Cc: Jonathan Corbet <> Cc: <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
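As a rough illustration of the missing bound, the sketch below parses integers and ranges into a fixed-size array and stops once the array is full. It is a simplified Python model, not the code in lib/cmdline.c: the name `parse_options`, the restriction to non-negative values, and the convention that slot 0 stores the count are assumptions made for the example.

```python
def parse_options(text, nints):
    """Parse comma-separated integers and ranges such as "1-4".

    The result is a fixed-size list of nints slots. Slot 0 holds the number
    of parsed values, so at most nints - 1 values fit.
    """
    ints = [0] * nints
    count = 0
    for token in text.split(","):
        # Only non-negative "a-b" ranges are handled in this sketch.
        first, _, last = token.partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        for value in range(lo, hi + 1):
            if count >= nints - 1:      # the bound the range path must respect
                ints[0] = count
                return ints
            count += 1
            ints[count] = value
    ints[0] = count
    return ints
```

Without the size check inside the inner loop, a range such as 1-100500 would keep writing past the end of a five-slot array; with it, parsing stops after four values.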
2017-06-23fs/dax.c: fix inefficiency in dax_writeback_mapping_range()Jan Kara-0/+1
dax_writeback_mapping_range() fails to update iteration index when searching radix tree for entries needing cache flushing. Thus each pagevec worth of entries is searched starting from the start which is inefficient and prone to livelocks. Update index properly. Link: Fixes: 9973c98ecfda3 ("dax: add support for fsync/sync") Signed-off-by: Jan Kara <> Reviewed-by: Ross Zwisler <> Cc: Dan Williams <> Cc: <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
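The effect of not advancing the iteration index can be sketched with a batched lookup over a sorted list. This is a toy Python model, not the radix tree code in fs/dax.c; `lookup_batch`, `writeback_range`, and the tiny `PAGEVEC_SIZE` are hypothetical names and values chosen for the illustration.

```python
import bisect

PAGEVEC_SIZE = 3  # deliberately tiny batch size for the illustration


def lookup_batch(entries, start, end):
    """Return up to PAGEVEC_SIZE entries of the sorted list within [start, end]."""
    lo = bisect.bisect_left(entries, start)
    hi = bisect.bisect_right(entries, end)
    return entries[lo:hi][:PAGEVEC_SIZE]


def writeback_range(entries, start, end):
    """Visit every entry in [start, end] exactly once, batch by batch."""
    visited = []
    index = start
    while index <= end:
        batch = lookup_batch(entries, index, end)
        if not batch:
            break
        visited.extend(batch)
        index = batch[-1] + 1   # advance past the last entry of this batch
    return visited
```

If the `index = batch[-1] + 1` line were dropped, every lookup would start from `start` again and return the same first batch, which corresponds to the inefficiency and livelock risk described above.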
2017-06-23autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAILNeilBrown-1/+1
If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl, autofs4_d_automount() will return ERR_PTR(status) with that status to follow_automount(), which will then dereference an invalid pointer. So treat a positive status the same as zero, and map to ENOENT. See comment in systemd src/core/automount.c::automount_send_ready(). Link: Signed-off-by: NeilBrown <> Cc: Ian Kent <> Cc: <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
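The sanity check amounts to a one-line mapping, sketched here in Python under the assumption that a valid failure status is always a negative errno value. The function name `sanitize_fail_status` is hypothetical and the snippet is not the autofs code itself.

```python
import errno


def sanitize_fail_status(status):
    """Return a status that is always a negative errno value."""
    if status >= 0:
        # Zero and positive values are not valid errors: map them to -ENOENT.
        return -errno.ENOENT
    return status
```

A caller that later wraps the status in an error pointer then never sees a positive value.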
2017-06-23mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappingsArd Biesheuvel-2/+13
Existing code that uses vmalloc_to_page() may assume that any address for which is_vmalloc_addr() returns true may be passed into vmalloc_to_page() to retrieve the associated struct page. This is not an unreasonable assumption to make, but on architectures that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need to ensure that vmalloc_to_page() does not go off into the weeds trying to dereference huge PUDs or PMDs as table entries. Given that vmalloc() and vmap() themselves never create huge mappings or deal with compound pages at all, there is no correct answer in this case, so return NULL instead, and issue a warning. When reading /proc/kcore on arm64, you will hit an oops as soon as you hit the huge mappings used for the various segments that make up the mapping of vmlinux. With this patch applied, you will no longer hit the oops, but the kcore contents will be incorrect (these regions will be zeroed out). We are fixing this for kcore specifically, so it avoids vread() for those regions. At least one other problematic user exists, i.e., /dev/kmem, but that is currently broken on arm64 for other reasons. Link: Signed-off-by: Ard Biesheuvel <> Acked-by: Mark Rutland <> Reviewed-by: Laura Abbott <> Cc: Michal Hocko <> Cc: zhong jiang <> Cc: Dave Hansen <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
2017-06-23mm, thp: remove cond_resched from __collapse_huge_page_copyDavid Rientjes-1/+0
This is a partial revert of commit 338a16ba1549 ("mm, thp: copying user pages must schedule on collapse") which added a cond_resched() to __collapse_huge_page_copy(). On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in atomic context and thus scheduling is not possible. This is only a possible config on arm and i386. Although need_resched has been shown to be set for over 100 jiffies while doing the iteration in __collapse_huge_page_copy, this is better than doing if (in_atomic()) cond_resched() to cover only non-CONFIG_HIGHPTE configs. Link: Signed-off-by: David Rientjes <> Reported-by: Larry Finger <> Tested-by: Larry Finger <> Acked-by: Michal Hocko <> Cc: Vlastimil Babka <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
2017-06-23Merge tag 'scsi-fixes' of ↵Linus Torvalds-4/+1
git:// Pull SCSI fixes from James Bottomley: "Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver. The driver already prints a warning message, there's no need to panic users by printing something that looks like an oops as well" * tag 'scsi-fixes' of git:// scsi: qedi: Remove WARN_ON from clear task context. scsi: qedi: Remove WARN_ON for untracked cleanup.
2017-06-23Merge tag 'xfs-4.12-fixes-5' of git:// Torvalds-2/+5
Pull xfs fixes from Darrick Wong: "I have one more bugfix for you for 4.12-rc7 to fix a disk corruption problem: - don't allow swapon on files on the realtime device, because the swap code will swap pages out to blocks on the data device, thereby corrupting the filesystem" * tag 'xfs-4.12-fixes-5' of git:// xfs: don't allow bmap on rt files
2017-06-22Merge branch 'for-next' of git:// Torvalds-9/+14
Pull cifs fixes from Steve French: "Various small fixes for stable" * 'for-next' of git:// CIFS: Fix some return values in case of error in 'crypt_message' cifs: remove redundant return in cifs_creation_time_get CIFS: Improve readdir verbosity CIFS: check if pages is null rather than bv for a failed allocation CIFS: Set ->should_dirty in cifs_user_readv()
2017-06-22Merge tag 'for-linus' of git:// Torvalds-60/+163
Pull KVM fixes from Radim Krčmář: "MIPS: - Fix build with KVM, DYNAMIC_DEBUG and JUMP_LABEL. PPC: - Fix host crashes/hangs on POWER9. - Properly restore userspace state after KVM_RUN ioctl. s390: - Fix address translation in odd-ball cases (real-space designation ASCEs). x86: - Fix privilege escalation in 64-bit Windows guests All patches are for stable and the x86 also has a CVE" * tag 'for-linus' of git:// KVM: x86: fix singlestepping over syscall KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows KVM: MIPS: Fix maybe-uninitialized build failure KVM: PPC: Book3S HV: Ignore timebase offset on POWER9 DD1 KVM: PPC: Book3S HV: Save/restore host values of debug registers KVM: PPC: Book3S HV: Preserve userspace HTM state properly KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit KVM: PPC: Book3S HV: Context-switch EBB registers properly KVM: PPC: Book3S HV: Cope with host using large decrementer mode
2017-06-22Merge tag 'mfd-fixes-4.12' of ↵Linus Torvalds-3/+2
git:// Pull MFD fixes from Lee Jones: - arizona: use address passed in, rather than hard coded value - correct STM32 clock-names value in DT binding documentation * tag 'mfd-fixes-4.12' of git:// dt-bindings: mfd: Update STM32 timers clock names mfd: arizona: Fix typo using hard-coded register
2017-06-22KVM: x86: fix singlestepping over syscallPaolo Bonzini-30/+34
TF is handled a bit differently for syscall and sysret, compared to the other instructions: TF is checked after the instruction completes, so that the OS can disable #DB at a syscall by adding TF to FMASK. When the sysret is executed the #DB is taken "as if" the syscall insn just completed. KVM emulates syscall so that it can trap 32-bit syscall on Intel processors. Fix the behavior, otherwise you could get #DB on a user stack which is not nice. This does not affect Linux guests, as they use an IST or task gate for #DB. This fixes CVE-2017-7518. Cc: Reported-by: Andy Lutomirski <> Signed-off-by: Paolo Bonzini <> Signed-off-by: Radim Krčmář <>
2017-06-22Merge tag 'kvm-s390-master-4.12-2' of ↵Radim Krčmář-9/+6
git:// KVM: s390: fix shadow table handling for nested guests Some odd-ball cases (real-space designation ASCEs) are handled wrong for the shadow page tables. Fix it.
2017-06-22KVM: s390: gaccess: fix real-space designation asce handling for gmap shadowsHeiko Carstens-9/+6
For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <> Cc: Signed-off-by: Heiko Carstens <> Reviewed-by: Martin Schwidefsky <> Signed-off-by: Christian Borntraeger <>
2017-06-21Merge branch 'for-linus' of git:// Torvalds-65/+87
Pull block fixes from Jens Axboe: "This contains a set of fixes for xen-blkback by way of Konrad, and a performance regression fix for blk-mq for shared tags. The latter could account for as much as a 50x reduction in performance, with the test case from the user with 500 name spaces. A more realistic setup on my end with 32 drives showed a 3.5x drop. The fix has been thoroughly tested before being committed" * 'for-linus' of git:// blk-mq: fix performance regression with shared tags xen-blkback: don't leak stack data via response ring xen/blkback: don't use xen_blkif_get() in xen-blkback kthread xen/blkback: don't free be structure too early xen/blkback: fix disconnect while I/Os in flight
2017-06-21xfs: don't allow bmap on rt filesDarrick J. Wong-2/+5
bmap returns a dumb LBA address but not the block device that goes with that LBA. Swapfiles don't care about this and will blindly assume that the data volume is the correct blockdev, which is totally bogus for files on the rt subvolume. This results in the swap code doing IOs to arbitrary locations on the data device(!) if the passed in mapping is a realtime file, so just turn off bmap for rt files. Signed-off-by: Darrick J. Wong <> Reviewed-by: Christoph Hellwig <>
2017-06-21Merge git:// Torvalds-191/+256
Pull networking fixes from David Miller: 1) Fix refcounting wrt timers which hold onto inet6 address objects, from Xin Long. 2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg. 3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel. 4) Several mlx5 driver fixes (firmware readiness, timestamp cap reporting, devlink command validity checking, tc offloading, etc.) From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz. 5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan. 6) Fix dst refcount bug in decnet, from Wei Wang. 7) Netdev can be double freed in register_vlan_device(). Fix from Gao Feng. 8) Don't allow object to be destroyed while it is being dumped in SCTP, from Xin Long. 9) Fix dpaa_eth build when modular, from Madalin Bucur. 10) Fix throw route leaks, from Serhey Popovych. 11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table, also from Serhey Popovych. 12) Fix premature TX SKB free in stmmac, from Niklas Cassel. * git:// (36 commits) igmp: add a missing spin_lock_init() net: stmmac: free an skb first when there are no longer any descriptors using it sfc: remove duplicate up_write on VF filter_sem rtnetlink: add IFLA_GROUP to ifla_policy ipv6: Do not leak throw route references dt-bindings: net: sms911x: Add missing optional VDD regulators dpaa_eth: reuse the dma_ops provided by the FMan MAC device fsl/fman: propagate dma_ops net/core: remove explicit do_softirq() from busy_poll_stop() fib_rules: Resolve goto rules target on delete sctp: ensure ep is not destroyed before doing the dump net/hns:bugfix of ethtool -t phy self_test net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev cxgb4: notify uP to route ctrlq compl to rdma rspq ip6_tunnel: Correct tos value in collect_md mode decnet: always not take dst->__refcnt when inserting dst into hash table ip6_tunnel: fix potential issue in __ip6_tnl_rcv ip_tunnel: fix potential issue in ip_tunnel_rcv brcmfmac: fix uninitialized warning in 
brcmf_usb_probe_phase2() net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it ...
2017-06-21Merge tag 'pinctrl-v4.12-3' of ↵Linus Torvalds-51/+42
git:// Pull more pin control fixes from Linus Walleij: "Some late arriving fixes. I should have sent earlier, just swamped with work as usual. Thomas patch makes AMD systems usable despite firmware bugs so it is fairly important. - Make the AMD driver use a regular interrupt rather than a chained one, so the system does not lock up. - Fix a function call error deep inside the STM32 driver" * tag 'pinctrl-v4.12-3' of git:// pinctrl: stm32: Fix bad function call pinctrl/amd: Use regular interrupt instead of chained
2017-06-21Merge branch 'for-linus' of ↵Linus Torvalds-8/+11
git:// Pull HID fixes from Jiri Kosina: - revert of a commit to magicmouse driver that regresses certain devices, from Daniel Stone - quirk for a specific Dell mouse, from Sebastian Parschauer * 'for-linus' of git:// Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse" HID: Add quirk for Dell PIXART OEM mouse
2017-06-21Merge branch 'for-linus' of ↵Linus Torvalds-7/+37
git:// Pull livepatching fix from Jiri Kosina: "Fix the way how livepatches are being stacked with respect to RCU, from Petr Mladek" * 'for-linus' of git:// livepatch: Fix stacking of patches with respect to RCU
2017-06-21Merge branch 'ufs-fixes' of ↵Linus Torvalds-32/+28
git:// Pull more ufs fixes from Al Viro: "More UFS fixes, unfortunately including a build regression fix for the 64-bit s_dsize commit. Fixed in this pile: - trivial bug in signedness of 32bit timestamps on ufs1 - ESTALE instead of ufs_error() when doing open-by-fhandle on something deleted - build regression on 32bit in ufs_new_fragments() - calculating that many percents of u64 pulls libgcc stuff on some of those. Mea culpa. - fix hysteresis loop broken by typo in (right next to the location of previous bug). - fix the insane limits of said hysteresis loop on filesystems with very low percentage of reserved blocks. If it's 5% or less, just use the OPTSPACE policy. - calculate those limits once, at mount time. This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_ survive cross-builds. Again, my apologies for missing that, especially since I have noticed a related percentage-of-64bit issue in earlier patches (when dealing with amount of reserved blocks). Self-LART applied..." * 'ufs-fixes' of git:// ufs: fix the logics for tail relocation ufs_iget(): fail with -ESTALE on deleted inode fix signedness of timestamps on ufs1
2017-06-21Allow stack to grow up to address space limitHelge Deller-5/+8
Fix expand_upwards() on architectures with an upward-growing stack (parisc, metag and partly IA-64) to allow the stack to reliably grow exactly up to the address space limit given by TASK_SIZE. Signed-off-by: Helge Deller <> Acked-by: Hugh Dickins <> Signed-off-by: Linus Torvalds <>
2017-06-21mm: fix new crash in unmapped_area_topdown()Hugh Dickins-2/+4
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Cc: Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones <> Debugged-by: Linus Torvalds <> Signed-off-by: Hugh Dickins <> Acked-by: Michal Hocko <> Signed-off-by: Linus Torvalds <>
2017-06-21blk-mq: fix performance regression with shared tagsJens Axboe-24/+61
If we have shared tags enabled, then every IO completion will trigger a full loop of every queue belonging to a tag set, and every hardware queue for each of those queues, even if nothing needs to be done. This causes a massive performance regression if you have a lot of shared devices. Instead of doing this huge full scan on every IO, add an atomic counter to the main queue that tracks how many hardware queues have been marked as needing a restart. With that, we can avoid looking for restartable queues, if we don't have to. Max reports that this restores performance. Before this patch, 4K IOPS was limited to 22-23K IOPS. With the patch, we are running at 950-970K IOPS. Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared") Reported-by: Max Gurtovoy <> Tested-by: Max Gurtovoy <> Reviewed-by: Bart Van Assche <> Tested-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
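The idea of guarding the scan with a counter can be sketched as follows. This is a single-threaded Python toy, not blk-mq code: the class `TagSet`, its fields, and the choice to restart every marked queue in one pass are assumptions made to keep the example small, and a plain integer stands in for the atomic counter.

```python
class TagSet:
    """Toy model: hardware queues that share one tag set."""

    def __init__(self, n_queues):
        self.needs_restart = [False] * n_queues
        self.restart_count = 0      # stands in for the atomic counter
        self.scans = 0              # how many full scans were performed

    def mark_restart(self, queue):
        if not self.needs_restart[queue]:
            self.needs_restart[queue] = True
            self.restart_count += 1

    def on_completion(self):
        """Called for every IO completion; returns the queues restarted."""
        if self.restart_count == 0:
            return []               # fast path: nothing to look for
        self.scans += 1
        restarted = [q for q, flag in enumerate(self.needs_restart) if flag]
        for q in restarted:
            self.needs_restart[q] = False
        self.restart_count -= len(restarted)
        return restarted
```

In the common case where no queue is waiting for a restart, a completion costs one counter check instead of a walk over every queue in the set.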
2017-06-21dm io: fix duplicate bio completion due to missing ref countMike Snitzer-2/+2
If only a subset of the devices associated with multiple regions support a given special operation (eg. DISCARD) then the dec_count() that is used to set error for the region must increment the io->count. Otherwise, when the dec_count() is called it can cause the dm-io caller's bio to be completed multiple times. As was reported against the dm-mirror target that had mirror legs with a mix of discard capabilities. Bug: Reported-by: Zhang Yi <> Signed-off-by: Mike Snitzer <>
2017-06-21dm integrity: fix to not disable/enable interrupts from interrupt contextMike Snitzer-2/+5
Use spin_lock_irqsave and spin_unlock_irqrestore rather than spin_{lock,unlock}_irq in submit_flush_bio(). Otherwise lockdep issues the following warning: DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:2748 trace_hardirqs_on_caller+0x107/0x180 Reported-by: Ondrej Kozina <> Tested-by: Ondrej Kozina <> Signed-off-by: Mike Snitzer <> Acked-by: Mikulas Patocka <>
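The difference between the two lock variants can be modelled with a single "interrupts enabled" flag. The Python class below is a toy, not the kernel primitives; `Cpu` and its method names are hypothetical stand-ins for the spin_lock_irq() and spin_lock_irqsave() families.

```python
class Cpu:
    """Toy CPU with a single 'interrupts enabled' flag."""

    def __init__(self, irqs_enabled):
        self.irqs_enabled = irqs_enabled

    # Model of the spin_lock_irq()/spin_unlock_irq() pair.
    def lock_irq(self):
        self.irqs_enabled = False

    def unlock_irq(self):
        self.irqs_enabled = True        # unconditionally re-enables

    # Model of the spin_lock_irqsave()/spin_unlock_irqrestore() pair.
    def lock_irqsave(self):
        flags = self.irqs_enabled       # remember the previous state
        self.irqs_enabled = False
        return flags

    def unlock_irqrestore(self, flags):
        self.irqs_enabled = flags       # put back whatever was there
```

Starting from a context where interrupts are already disabled, the first pair leaves them enabled on unlock, while the second pair restores the disabled state.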
2017-06-21CIFS: Fix some return values in case of error in 'crypt_message'Christophe Jaillet-1/+3
'rc' is known to be 0 at this point. So if 'init_sg' or 'kzalloc' fails, we should return -ENOMEM instead. Also remove a useless 'rc' in a debug message as it is meaningless here. Fixes: 026e93dc0a3ee ("CIFS: Encrypt SMB3 requests before sending") Signed-off-by: Christophe JAILLET <> Reviewed-by: Pavel Shilovsky <> Reviewed-by: Aurelien Aptel <> Signed-off-by: Steve French <> CC: Stable <>
2017-06-20cifs: remove redundant return in cifs_creation_time_getColin Ian King-2/+0
There is a redundant return in function cifs_creation_time_get that appears to be old vestigial code that can be removed. So remove it. Detected by CoverityScan, CID#1361924 ("Structurally dead code") Signed-off-by: Colin Ian King <> Signed-off-by: Steve French <>
2017-06-20CIFS: Improve readdir verbosityPavel Shilovsky-4/+9
Downgrade the loglevel for SMB2 to prevent filling the log with messages if e.g. readdir was interrupted. Also make SMB2 and SMB1 codepaths do the same logging during readdir. Signed-off-by: Pavel Shilovsky <> Signed-off-by: Steve French <> CC: Stable <>
2017-06-20CIFS: check if pages is null rather than bv for a failed allocationColin Ian King-1/+1
pages is being allocated however a null check on bv is being used to see if the allocation failed. Fix this by checking if pages is null. Detected by CoverityScan, CID#1432974 ("Logically dead code") Fixes: ccf7f4088af2dd ("CIFS: Add asynchronous context to support kernel AIO") Signed-off-by: Colin Ian King <> Reviewed-by: Pavel Shilovsky <> Signed-off-by: Steve French <>
2017-06-20CIFS: Set ->should_dirty in cifs_user_readv()Dan Carpenter-1/+1
The current code causes a static checker warning because ITER_IOVEC is zero so the condition is never true. Fixes: 6685c5e2d1ac ("CIFS: Add asynchronous read support through kernel AIO") Signed-off-by: Dan Carpenter <> Signed-off-by: Steve French <>
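Why a mask test against a zero-valued constant can never succeed is easy to show. In the Python sketch below only `ITER_IOVEC == 0` comes from the commit message; the value of `ITER_KVEC`, the function names, and the equality test offered as an alternative are assumptions for illustration and are not taken from the patch.

```python
ITER_IOVEC = 0   # zero, as stated in the commit message
ITER_KVEC = 2    # assumed value, used only to have a second type


def is_iovec_by_mask(iter_type):
    """Buggy pattern: masking with zero can never be true."""
    return (iter_type & ITER_IOVEC) != 0


def is_iovec_by_equality(iter_type):
    """One way to express the intended test in this simplified model."""
    return iter_type == ITER_IOVEC
```

The mask version returns False for every input, including `ITER_IOVEC` itself, which is what the static checker flagged.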
2017-06-20igmp: add a missing spin_lock_init()WANG Cong-0/+1
Andrey reported a lockdep warning on non-initialized spinlock: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x292/0x395 lib/dump_stack.c:52 register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755 ? 0xffffffffa0000000 __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255 lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135 _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175 spin_lock_bh ./include/linux/spinlock.h:304 ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076 igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194 ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736 We miss a spin_lock_init() in igmpv3_add_delrec(), probably because previously we never use it on this code path. Since we already unlink it from the global mc_tomb list, it is probably safe not to acquire this spinlock here. It does not harm to have it although, to avoid conditional locking. Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()") Reported-by: Andrey Konovalov <> Signed-off-by: Cong Wang <> Signed-off-by: David S. Miller <>
2017-06-20Merge tag 'wireless-drivers-for-davem-2017-06-20' of ↵David S. Miller-36/+49
git:// Kalle Valo says: ==================== wireless-drivers fixes for 4.12 Two important fixes for brcmfmac. The rest of the brcmfmac patches are either code preparation or a fix for a new build warning. brcmfmac * fix a NULL pointer dereference during resume * fix a NULL pointer dereference with USB devices, a regression from v4.12-rc1 ==================== Signed-off-by: David S. Miller <>
2017-06-20net: stmmac: free an skb first when there are no longer any descriptors using itNiklas Cassel-4/+16
When having the skb pointer in the first descriptor, stmmac_tx_clean can get called at a moment where the IP has only cleared the own bit of the first descriptor, thus freeing the skb, even though there can be several descriptors whose buffers point into the same skb. By simply moving the skb pointer from the first descriptor to the last descriptor, a skb will get freed only when the IP has cleared the own bit of all the descriptors that are using that skb. Signed-off-by: Niklas Cassel <> Signed-off-by: David S. Miller <>
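The ordering argument can be illustrated with a toy descriptor ring. This Python sketch is not the stmmac driver: `tx_clean`, `make_ring`, and the dictionary layout are hypothetical, and the cleanup loop is reduced to "stop at the first descriptor the hardware still owns".

```python
def tx_clean(descriptors):
    """Walk the ring in order and return the skbs freed.

    Each descriptor is a dict with 'own' (True while the hardware still
    owns it) and 'skb' (the skb recorded on that descriptor, or None).
    """
    freed = []
    for desc in descriptors:
        if desc["own"]:
            break                   # hardware is not done with this one
        if desc["skb"] is not None:
            freed.append(desc["skb"])
            desc["skb"] = None
    return freed


def make_ring(n_desc, skb, skb_slot, completed):
    """Build n_desc descriptors for one skb; the first `completed` are done."""
    return [
        {"own": i >= completed, "skb": skb if i == skb_slot else None}
        for i in range(n_desc)
    ]
```

With the skb recorded on slot 0 and only one of three descriptors completed, the cleanup frees the skb too early; recorded on the last slot, it is freed only once all three descriptors have completed.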
2017-06-20sfc: remove duplicate up_write on VF filter_semEdward Cree-2/+0
Somehow two copies of the line 'up_write(&vf->efx->filter_sem);' got into efx_ef10_sriov_set_vf_vlan(). This would put the mutex in a bad state and cause all subsequent down attempts to hang. Fixes: 671b53eec2ed ("sfc: Ensure down_write(&filter_sem) and up_write() are matched before calling efx_net_open()") Signed-off-by: Edward Cree <> Signed-off-by: David S. Miller <>
2017-06-20rtnetlink: add IFLA_GROUP to ifla_policySerhey Popovych-0/+2
Support for network interface groups was added a while ago; however, until now there has been no IFLA_GROUP attribute description in the policy or in the netlink message size calculations. Add the IFLA_GROUP attribute to the policy. Fixes: cbda10fa97d7 ("net_device: add support for network device groups") Signed-off-by: Serhey Popovych <> Signed-off-by: David S. Miller <>
2017-06-20ipv6: Do not leak throw route referencesSerhey Popovych-18/+7
While commit 73ba57bfae4a ("ipv6: fix backtracking for throw routes") does a good job of propagating errors to fib_rules_lookup() in the fib rules core framework, which also corrects throw route handling, it does not solve the route reference leak that occurs when we return -EAGAIN to fib_rules_lookup() and leave the routing table entry referenced in arg->result. If the rule with the matched throw route isn't the last one matched in the list, we overwrite arg->result and permanently lose the reference to the throw route stored previously. We also partially revert commit ab997ad40839 ("ipv6: fix the incorrect return value of throw route") since we never return a routing table entry with dst.error == -EAGAIN when CONFIG_IPV6_MULTIPLE_TABLES is on. Also, there is no point in checking for the RTF_REJECT flag since it is always set for throw routes. Fixes: 73ba57bfae4a ("ipv6: fix backtracking for throw routes") Signed-off-by: Serhey Popovych <> Signed-off-by: David S. Miller <>
2017-06-20dt-bindings: net: sms911x: Add missing optional VDD regulatorsKrzysztof Kozlowski-0/+1
The lan911x family of devices requires 3.3 V power supplies (connected to the VDD_IO, VDD_A and VREG_3.3 pins). The existing driver, however, obtains only the VDD_IO and VDD_A regulators, and treats them as optional, so document this in the bindings. Signed-off-by: Krzysztof Kozlowski <> Reviewed-by: Linus Walleij <> Signed-off-by: David S. Miller <>
2017-06-20Merge branch 'net-fix-loadable-module-for-DPAA-Ethernet'David S. Miller-1/+3
Madalin Bucur says: ==================== net: fix loadable module for DPAA Ethernet The DPAA Ethernet makes use of a symbol that is not exported. Address the issue by propagating the dma_ops rather than calling arch_setup_dma_ops(). ==================== Signed-off-by: David S. Miller <>
2017-06-20dpaa_eth: reuse the dma_ops provided by the FMan MAC deviceMadalin Bucur-1/+1
Remove the use of arch_setup_dma_ops() that was not exported and was breaking loadable module compilation. Signed-off-by: Madalin Bucur <> Signed-off-by: David S. Miller <>
2017-06-20fsl/fman: propagate dma_opsMadalin Bucur-0/+2
Make sure dma_ops are set, to be later used by the Ethernet driver. Signed-off-by: Madalin Bucur <> Signed-off-by: David S. Miller <>
2017-06-20net/core: remove explicit do_softirq() from busy_poll_stop()Sebastian Siewior-2/+0
Since commit 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()") there is an explicit do_softirq() invocation after local_bh_enable() has been invoked. I don't understand why we need this, because local_bh_enable() will invoke do_softirq() once the softirq counter reaches zero and softirq-related work is pending. Signed-off-by: Sebastian Andrzej Siewior <> Signed-off-by: David S. Miller <>
2017-06-20fib_rules: Resolve goto rules target on deleteSerhey Popovych-7/+14
We should avoid marking goto rules unresolved when their target is actually reachable after rule deletion. Consider the following sample scenario:

  # ip -4 ru sh
  0: from all lookup local
  32000: from all goto 32100
  32100: from all lookup main
  32100: from all lookup default
  32766: from all lookup main
  32767: from all lookup default

  # ip -4 ru del pref 32100 table main
  # ip -4 ru sh
  0: from all lookup local
  32000: from all goto 32100 [unresolved]
  32100: from all lookup default
  32766: from all lookup main
  32767: from all lookup default

After removal of the first rule with preference 32100 we mark all goto rules as unreachable, even when a rule with the same preference as the removed one is still present. Check whether a next rule with the same preference is available and, if so, make all rules with a goto action point to it.

Signed-off-by: Serhey Popovych <> Signed-off-by: David S. Miller <>
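The fix-up described in this commit can be sketched as a small list operation. This is a simplified illustration, not the kernel's fib_rules code: `struct rule`, `enum rule_action` and `rule_delete()` are hypothetical names, and the model assumes a singly linked list kept sorted by preference, with each goto rule caching a pointer to its resolved target.

```c
#include <stddef.h>

enum rule_action { ACT_LOOKUP, ACT_GOTO };

/* Hypothetical, simplified rule; the list is kept sorted by pref. */
struct rule {
    unsigned int pref;
    enum rule_action action;
    unsigned int target;    /* pref that a goto rule jumps to */
    struct rule *ctarget;   /* resolved goto target, NULL if unresolved */
    struct rule *next;
};

/* Unlink 'victim' and fix up the goto rules that pointed at it. */
static void rule_delete(struct rule **head, struct rule *victim)
{
    struct rule *repl = NULL;
    struct rule **pp, *r;

    /* A following rule with the same pref can take over as goto target. */
    if (victim->next && victim->next->pref == victim->pref)
        repl = victim->next;

    for (pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            break;
        }
    }
    victim->next = NULL;

    /* Repoint goto rules to the replacement, or mark them unresolved. */
    for (r = *head; r; r = r->next)
        if (r->action == ACT_GOTO && r->ctarget == victim)
            r->ctarget = repl;
}
```

Applied to the scenario in the commit message, deleting the first rule with preference 32100 leaves the goto rule at 32000 pointing at the remaining 32100 rule. Only when the last rule with that preference is removed does the goto rule become unresolved.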
2017-06-20dt-bindings: mfd: Update STM32 timers clock namesFabrice Gasnier-1/+1
Clock name has been updated during driver/DT binding review: Update DT binding doc to reflect this. Fixes: 8f9359c6c6a0 (dt-bindings: mfd: Add bindings for STM32 Timers driver) Signed-off-by: Fabrice Gasnier <> Acked-by: Benjamin Gaignard <> Signed-off-by: Lee Jones <>
2017-06-20KVM: MIPS: Fix maybe-uninitialized build failureJames Cowgill-1/+5
This commit fixes a "maybe-uninitialized" build failure in arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all enabled. The failure is:

In file included from ./include/linux/printk.h:329:0,
                 from ./include/linux/kernel.h:13,
                 from ./include/asm-generic/bug.h:15,
                 from ./arch/mips/include/asm/bug.h:41,
                 from ./include/linux/bug.h:4,
                 from ./include/linux/thread_info.h:11,
                 from ./include/asm-generic/current.h:4,
                 from ./arch/mips/include/generated/asm/current.h:1,
                 from ./include/linux/sched.h:11,
                 from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   __dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
   ^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
  int idx_user, idx_kernel;
                ^~~~~~~~~~

There is a similar error relating to "idx_user". Both errors were observed with GCC 6. As far as I can tell, it is impossible for either idx_user or idx_kernel to be uninitialized when they are later read in the calls to kvm_debug, but to satisfy the compiler, add zero initializers to both variables.

Signed-off-by: James Cowgill <> Fixes: 57e3869cfaae ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID") Cc: <> # 4.11+ Acked-by: James Hogan <> Signed-off-by: Radim Krčmář <>
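The shape of the code that tends to trigger this class of warning, and the zero-initializer workaround, can be sketched as follows. This is an illustration only: `probe()` and `invalidate()` are hypothetical stand-ins, not the functions in arch/mips/kvm/tlb.c, and whether a given compiler version actually warns on such code depends on its optimization level and flow analysis.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a TLB probe: returns an index, or -1 on a miss. */
static int probe(int key)
{
    return key < 0 ? -1 : key % 16;
}

/*
 * Each index is written and later read under the same flag, so neither can
 * be read uninitialized. A compiler may still fail to prove that across the
 * two separate conditionals, hence the explicit zero initializers.
 */
static int invalidate(int key, bool user, bool kernel)
{
    int idx_user = 0, idx_kernel = 0;   /* zero initializers quiet the warning */
    int hits = 0;

    if (user)
        idx_user = probe(key);
    if (kernel)
        idx_kernel = probe(key + 1);

    if (user && idx_user >= 0)
        hits++;
    if (kernel && idx_kernel >= 0)
        hits++;

    return hits;
}
```

The initializers do not change behavior on any path where the variables are read; they only give the compiler a definite value on the paths it cannot rule out.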