<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/mm, branch linux-4.15.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.15.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.15.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2018-04-19T06:55:12Z</updated>
<entry>
<title>mm/gup_benchmark: handle gup failures</title>
<updated>2018-04-19T06:55:12Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2018-04-13T22:35:16Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=84f086ee4e6af45256e10a03afdd19039e7981f4'/>
<id>urn:sha1:84f086ee4e6af45256e10a03afdd19039e7981f4</id>
<content type='text'>
commit 09e35a4a1ca8b9988ca9b8557d17948cd6c0808b upstream.

Patch series "mm/get_user_pages_fast fixes, cleanups", v2.

Turns out get_user_pages_fast and __get_user_pages_fast return different
values on error when given a single page: __get_user_pages_fast returns
0.  get_user_pages_fast returns either 0 or an error.

Callers of get_user_pages_fast expect an error so fix it up to return an
error consistently.

Stress the difference between get_user_pages_fast and
__get_user_pages_fast to make sure callers aren't confused.

This patch (of 3):

__gup_benchmark_ioctl does not handle the case where get_user_pages_fast
fails:

 - a negative return code will cause a buffer overrun

 - returning with partial success will cause use of uninitialized
   memory.

[akpm@linux-foundation.org: simplification]
Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thorsten Leemhuis &lt;regressions@leemhuis.info&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>get_user_pages_fast(): return -EFAULT on access_ok failure</title>
<updated>2018-04-19T06:55:12Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2018-04-13T22:35:20Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7e255357ef66c435c57d4e6cf304540324e39678'/>
<id>urn:sha1:7e255357ef66c435c57d4e6cf304540324e39678</id>
<content type='text'>
commit c61611f70958d86f659bca25c02ae69413747a8d upstream.

get_user_pages_fast is supposed to be a faster drop-in equivalent of
get_user_pages.  As such, callers expect it to return a negative return
code when passed an invalid address, and never expect it to return 0
when passed a positive number of pages, since its documentation says:

 * Returns number of pages pinned. This may be fewer than the number
 * requested. If nr_pages is 0 or negative, returns 0. If no pages
 * were pinned, returns -errno.

When get_user_pages_fast fall back on get_user_pages this is exactly
what happens.  Unfortunately the implementation is inconsistent: it
returns 0 if passed a kernel address, confusing callers: for example,
the following is pretty common but does not appear to do the right thing
with a kernel address:

        ret = get_user_pages_fast(addr, 1, writeable, &amp;page);
        if (ret &lt; 0)
                return ret;

Change get_user_pages_fast to return -EFAULT when supplied a kernel
address to make it match expectations.

All callers have been audited for consistency with the documented
semantics.

Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com
Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()")
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thorsten Leemhuis &lt;regressions@leemhuis.info&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>percpu: add __GFP_NORETRY semantics to the percpu balancing path</title>
<updated>2018-04-08T12:27:35Z</updated>
<author>
<name>Dennis Zhou</name>
<email>dennisszhou@gmail.com</email>
</author>
<published>2018-02-16T18:07:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5bb3f4acc8aa6c479f041c3449a03d38c349273a'/>
<id>urn:sha1:5bb3f4acc8aa6c479f041c3449a03d38c349273a</id>
<content type='text'>
commit 47504ee04b9241548ae2c28be7d0b01cff3b7aa6 upstream.

Percpu memory using the vmalloc area based chunk allocator lazily
populates chunks by first requesting the full virtual address space
required for the chunk and subsequently adding pages as allocations come
through. To ensure atomic allocations can succeed, a workqueue item is
used to maintain a minimum number of empty pages. In certain scenarios,
such as reported in [1], it is possible that physical memory becomes
quite scarce which can result in either a rather long time spent trying
to find free pages or worse, a kernel panic.

This patch adds support for __GFP_NORETRY and __GFP_NOWARN passing them
through to the underlying allocators. This should prevent any
unnecessary panics potentially caused by the workqueue item. The passing
of gfp around is as additional flags rather than a full set of flags.
The next patch will change these to caller passed semantics.

V2:
Added const modifier to gfp flags in the balance path.
Removed an extra whitespace.

[1] https://lkml.org/lkml/2018/2/12/551

Signed-off-by: Dennis Zhou &lt;dennisszhou@gmail.com&gt;
Suggested-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/vmscan: wake up flushers for legacy cgroups too</title>
<updated>2018-03-28T16:23:00Z</updated>
<author>
<name>Andrey Ryabinin</name>
<email>aryabinin@virtuozzo.com</email>
</author>
<published>2018-03-22T23:17:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c8f7955b5493a6176360c7acab3dbb4ed77cb703'/>
<id>urn:sha1:c8f7955b5493a6176360c7acab3dbb4ed77cb703</id>
<content type='text'>
commit 1c610d5f93c709df56787f50b3576704ac271826 upstream.

Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter dirty
pages on the LRU") added flusher invocation to shrink_inactive_list()
when many dirty pages on the LRU are encountered.

However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429f5 ("mm: vmscan: remove old
flusher wakeup from direct reclaim path") removed the only source of
flusher's wake up in legacy mem cgroup reclaim path.

This leads to premature OOM if there is too many dirty pages in cgroup:
    # mkdir /sys/fs/cgroup/memory/test
    # echo $$ &gt; /sys/fs/cgroup/memory/test/tasks
    # echo 50M &gt; /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    # dd if=/dev/zero of=tmp_file bs=1M count=100
    Killed

    dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0

    Call Trace:
     dump_stack+0x46/0x65
     dump_header+0x6b/0x2ac
     oom_kill_process+0x21c/0x4a0
     out_of_memory+0x2a5/0x4b0
     mem_cgroup_out_of_memory+0x3b/0x60
     mem_cgroup_oom_synchronize+0x2ed/0x330
     pagefault_out_of_memory+0x24/0x54
     __do_page_fault+0x521/0x540
     page_fault+0x45/0x50

    Task in /test killed as a result of limit of /test
    memory: usage 51200kB, limit 51200kB, failcnt 73
    memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
    kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
    Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
            mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
	    active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
    Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
    Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
    oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Wake up flushers in legacy cgroup reclaim too.

Link: http://lkml.kernel.org/r/20180315164553.17856-1-aryabinin@virtuozzo.com
Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Tested-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Revert "mm: page_alloc: skip over regions of invalid pfns where possible"</title>
<updated>2018-03-28T16:22:59Z</updated>
<author>
<name>Daniel Vacek</name>
<email>neelx@redhat.com</email>
</author>
<published>2018-03-22T23:17:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f981611c4ae34db8595b1bea8cbdbb2b70ed8227'/>
<id>urn:sha1:f981611c4ae34db8595b1bea8cbdbb2b70ed8227</id>
<content type='text'>
commit f59f1caf72ba00d519c793c3deb32cd3be32edc2 upstream.

This reverts commit b92df1de5d28 ("mm: page_alloc: skip over regions of
invalid pfns where possible").  The commit is meant to be a boot init
speed up skipping the loop in memmap_init_zone() for invalid pfns.

But given some specific memory mapping on x86_64 (or more generally
theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID) the
implementation also skips valid pfns which is plain wrong and causes
'kernel BUG at mm/page_alloc.c:1389!'

  crash&gt; log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
  kernel BUG at mm/page_alloc.c:1389!
  invalid opcode: 0000 [#1] SMP
  --
  RIP: 0010: move_freepages+0x15e/0x160
  --
  Call Trace:
    move_freepages_block+0x73/0x80
    __rmqueue+0x263/0x460
    get_page_from_freelist+0x7e1/0x9e0
    __alloc_pages_nodemask+0x176/0x420
  --

  crash&gt; page_init_bug -v | grep RAM
  &lt;struct resource 0xffff88067fffd2f8&gt;          1000 -        9bfff       System RAM (620.00 KiB)
  &lt;struct resource 0xffff88067fffd3a0&gt;        100000 -     430bffff       System RAM (  1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
  &lt;struct resource 0xffff88067fffd410&gt;      4b0c8000 -     4bf9cfff       System RAM ( 14.83 MiB = 15188.00 KiB)
  &lt;struct resource 0xffff88067fffd480&gt;      4bfac000 -     646b1fff       System RAM (391.02 MiB = 400408.00 KiB)
  &lt;struct resource 0xffff88067fffd560&gt;      7b788000 -     7b7fffff       System RAM (480.00 KiB)
  &lt;struct resource 0xffff88067fffd640&gt;     100000000 -    67fffffff       System RAM ( 22.00 GiB)

  crash&gt; page_init_bug | head -6
  &lt;struct resource 0xffff88067fffd560&gt;      7b788000 -     7b7fffff       System RAM (480.00 KiB)
  &lt;struct page 0xffffea0001ede200&gt;   1fffff00000000  0 &lt;struct pglist_data 0xffff88047ffd9000&gt; 1 &lt;struct zone 0xffff88047ffd9800&gt; DMA32          4096    1048575
  &lt;struct page 0xffffea0001ede200&gt;       505736 505344 &lt;struct page 0xffffea0001ed8000&gt; 505855 &lt;struct page 0xffffea0001edffc0&gt;
  &lt;struct page 0xffffea0001ed8000&gt;                0  0 &lt;struct pglist_data 0xffff88047ffd9000&gt; 0 &lt;struct zone 0xffff88047ffd9000&gt; DMA               1       4095
  &lt;struct page 0xffffea0001edffc0&gt;   1fffff00000400  0 &lt;struct pglist_data 0xffff88047ffd9000&gt; 1 &lt;struct zone 0xffff88047ffd9800&gt; DMA32          4096    1048575
  BUG, zones differ!

  crash&gt; kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0001e00000  78000000                0        0  0 0
  ffffea0001ed7fc0  7b5ff000                0        0  0 0
  ffffea0001ed8000  7b600000                0        0  0 0       &lt;&lt;&lt;&lt;
  ffffea0001ede1c0  7b787000                0        0  0 0
  ffffea0001ede200  7b788000                0        0  1 1fffff00000000

Link: http://lkml.kernel.org/r/20180316143855.29838-1-neelx@redhat.com
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek &lt;neelx@redhat.com&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Cc: Paul Burton &lt;paul.burton@imgtec.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink()</title>
<updated>2018-03-28T16:22:59Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2018-03-22T23:17:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d3d155da63b99cd6dcb8bf0a46d6eae3a456b546'/>
<id>urn:sha1:d3d155da63b99cd6dcb8bf0a46d6eae3a456b546</id>
<content type='text'>
commit b3cd54b257ad95d344d121dc563d943ca39b0921 upstream.

shmem_unused_huge_shrink() gets called from reclaim path.  Waiting for
page lock may lead to deadlock there.

There was a bug report that may be attributed to this:

  http://lkml.kernel.org/r/alpine.LRH.2.11.1801242349220.30642@mail.ewheeler.net

Replace lock_page() with trylock_page() and skip the page if we failed
to lock it.  We will get to the page on the next scan.

We can test for the PageTransHuge() outside the page lock as we only
need protection against splitting the page under us.  Holding pin oni
the page is enough for this.

Link: http://lkml.kernel.org/r/20180316210830.43738-1-kirill.shutemov@linux.intel.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Reported-by: Eric Wheeler &lt;linux-mm@lists.ewheeler.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.8+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/thp: do not wait for lock_page() in deferred_split_scan()</title>
<updated>2018-03-28T16:22:58Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2018-03-22T23:17:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=29c11d86b74ff6b4795ed715d23211e89c209425'/>
<id>urn:sha1:29c11d86b74ff6b4795ed715d23211e89c209425</id>
<content type='text'>
commit fa41b900c30b45fab03783724932dc30cd46a6be upstream.

deferred_split_scan() gets called from reclaim path.  Waiting for page
lock may lead to deadlock there.

Replace lock_page() with trylock_page() and skip the page if we failed
to lock it.  We will get to the page on the next scan.

Link: http://lkml.kernel.org/r/20180315150747.31945-1-kirill.shutemov@linux.intel.com
Fixes: 9a982250f773 ("thp: introduce deferred_split_huge_page()")
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/khugepaged.c: convert VM_BUG_ON() to collapse fail</title>
<updated>2018-03-28T16:22:58Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2018-03-22T23:17:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=babe10f62b6bf7cc30ae85cb2f1e382c210219fd'/>
<id>urn:sha1:babe10f62b6bf7cc30ae85cb2f1e382c210219fd</id>
<content type='text'>
commit fece2029a9e65b9a990831afe2a2b83290cbbe26 upstream.

khugepaged is not yet able to convert PTE-mapped huge pages back to PMD
mapped.  We do not collapse such pages.  See check
khugepaged_scan_pmd().

But if between khugepaged_scan_pmd() and __collapse_huge_page_isolate()
somebody managed to instantiate THP in the range and then split the PMD
back to PTEs we would have a problem --
VM_BUG_ON_PAGE(PageCompound(page)) will get triggered.

It's possible since we drop mmap_sem during collapse to re-take for
write.

Replace the VM_BUG_ON() with graceful collapse fail.

Link: http://lkml.kernel.org/r/20180315152353.27989-1-kirill.shutemov@linux.intel.com
Fixes: b1caa957ae6d ("khugepaged: ignore pmd tables with THP mapped with ptes")
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>hugetlbfs: check for pgoff value overflow</title>
<updated>2018-03-28T16:22:58Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2018-03-22T23:17:13Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e0fdb5385c4bf26b4be60c0042344c315c039aeb'/>
<id>urn:sha1:e0fdb5385c4bf26b4be60c0042344c315c039aeb</id>
<content type='text'>
commit 63489f8e821144000e0bdca7e65a8d1cc23a7ee7 upstream.

A vma with vm_pgoff large enough to overflow a loff_t type when
converted to a byte offset can be passed via the remap_file_pages system
call.  The hugetlbfs mmap routine uses the byte offset to calculate
reservations and file size.

A sequence such as:

  mmap(0x20a00000, 0x600000, 0, 0x66033, -1, 0);
  remap_file_pages(0x20a00000, 0x600000, 0, 0x20000000000000, 0);

will result in the following when task exits/file closed,

  kernel BUG at mm/hugetlb.c:749!
  Call Trace:
    hugetlbfs_evict_inode+0x2f/0x40
    evict+0xcb/0x190
    __dentry_kill+0xcb/0x150
    __fput+0x164/0x1e0
    task_work_run+0x84/0xa0
    exit_to_usermode_loop+0x7d/0x80
    do_syscall_64+0x18b/0x190
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The overflowed pgoff value causes hugetlbfs to try to set up a mapping
with a negative range (end &lt; start) that leaves invalid state which
causes the BUG.

The previous overflow fix to this code was incomplete and did not take
the remap_file_pages system call into account.

[mike.kravetz@oracle.com: v3]
  Link: http://lkml.kernel.org/r/20180309002726.7248-1-mike.kravetz@oracle.com
[akpm@linux-foundation.org: include mmdebug.h]
[akpm@linux-foundation.org: fix -ve left shift count on sh]
Link: http://lkml.kernel.org/r/20180308210502.15952-1-mike.kravetz@oracle.com
Fixes: 045c7a3f53d9 ("hugetlbfs: fix offset overflow in hugetlbfs mmap")
Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reported-by: Nic Losby &lt;blurbdust@gmail.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Yisheng Xie &lt;xieyisheng1@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>lockdep: fix fs_reclaim warning</title>
<updated>2018-03-28T16:22:55Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2018-03-22T23:17:10Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cb5cfed66ebcd8a1ad004e86a61e22c231aa00e4'/>
<id>urn:sha1:cb5cfed66ebcd8a1ad004e86a61e22c231aa00e4</id>
<content type='text'>
commit 2e517d681632326ed98399cb4dd99519efe3e32c upstream.

Dave Jones reported fs_reclaim lockdep warnings.

  ============================================
  WARNING: possible recursive locking detected
  4.15.0-rc9-backup-debug+ #1 Not tainted
  --------------------------------------------
  sshd/24800 is trying to acquire lock:
   (fs_reclaim){+.+.}, at: [&lt;0000000084f438c2&gt;] fs_reclaim_acquire.part.102+0x5/0x30

  but task is already holding lock:
   (fs_reclaim){+.+.}, at: [&lt;0000000084f438c2&gt;] fs_reclaim_acquire.part.102+0x5/0x30

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(fs_reclaim);
    lock(fs_reclaim);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  2 locks held by sshd/24800:
   #0:  (sk_lock-AF_INET6){+.+.}, at: [&lt;000000001a069652&gt;] tcp_sendmsg+0x19/0x40
   #1:  (fs_reclaim){+.+.}, at: [&lt;0000000084f438c2&gt;] fs_reclaim_acquire.part.102+0x5/0x30

  stack backtrace:
  CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
  Call Trace:
   dump_stack+0xbc/0x13f
   __lock_acquire+0xa09/0x2040
   lock_acquire+0x12e/0x350
   fs_reclaim_acquire.part.102+0x29/0x30
   kmem_cache_alloc+0x3d/0x2c0
   alloc_extent_state+0xa7/0x410
   __clear_extent_bit+0x3ea/0x570
   try_release_extent_mapping+0x21a/0x260
   __btrfs_releasepage+0xb0/0x1c0
   btrfs_releasepage+0x161/0x170
   try_to_release_page+0x162/0x1c0
   shrink_page_list+0x1d5a/0x2fb0
   shrink_inactive_list+0x451/0x940
   shrink_node_memcg.constprop.88+0x4c9/0x5e0
   shrink_node+0x12d/0x260
   try_to_free_pages+0x418/0xaf0
   __alloc_pages_slowpath+0x976/0x1790
   __alloc_pages_nodemask+0x52c/0x5c0
   new_slab+0x374/0x3f0
   ___slab_alloc.constprop.81+0x47e/0x5a0
   __slab_alloc.constprop.80+0x32/0x60
   __kmalloc_track_caller+0x267/0x310
   __kmalloc_reserve.isra.40+0x29/0x80
   __alloc_skb+0xee/0x390
   sk_stream_alloc_skb+0xb8/0x340
   tcp_sendmsg_locked+0x8e6/0x1d30
   tcp_sendmsg+0x27/0x40
   inet_sendmsg+0xd0/0x310
   sock_write_iter+0x17a/0x240
   __vfs_write+0x2ab/0x380
   vfs_write+0xfb/0x260
   SyS_write+0xb6/0x140
   do_syscall_64+0x1e5/0xc05
   entry_SYSCALL64_slow_path+0x25/0x25

This warning is caused by commit d92a8cfcb37e ("locking/lockdep:
Rework FS_RECLAIM annotation") which replaced the use of
lockdep_{set,clear}_current_reclaim_state() in __perform_reclaim()
and lockdep_trace_alloc() in slab_pre_alloc_hook() with
fs_reclaim_acquire()/ fs_reclaim_release().

Since __kmalloc_reserve() from __alloc_skb() adds __GFP_NOMEMALLOC |
__GFP_NOWARN to gfp_mask, and all reclaim path simply propagates
__GFP_NOMEMALLOC, fs_reclaim_acquire() in slab_pre_alloc_hook() is
trying to grab the 'fake' lock again when __perform_reclaim() already
grabbed the 'fake' lock.

The

  /* this guy won't enter reclaim */
  if ((current-&gt;flags &amp; PF_MEMALLOC) &amp;&amp; !(gfp_mask &amp; __GFP_NOMEMALLOC))
          return false;

test which causes slab_pre_alloc_hook() to try to grab the 'fake' lock
was added by commit cf40bd16fdad ("lockdep: annotate reclaim context
(__GFP_NOFS)").  But that test is outdated because PF_MEMALLOC thread
won't enter reclaim regardless of __GFP_NOMEMALLOC after commit
341ce06f69ab ("page allocator: calculate the alloc_flags for allocation
only once") added the PF_MEMALLOC safeguard (

  /* Avoid recursion of direct reclaim */
  if (p-&gt;flags &amp; PF_MEMALLOC)
          goto nopage;

in __alloc_pages_slowpath()).

Thus, let's fix outdated test by removing __GFP_NOMEMALLOC test and
allow __need_fs_reclaim() to return false.

Link: http://lkml.kernel.org/r/201802280650.FJC73911.FOSOMLJVFFQtHO@I-love.SAKURA.ne.jp
Fixes: d92a8cfcb37ecd13 ("locking/lockdep: Rework FS_RECLAIM annotation")
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Tested-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Nick Piggin &lt;npiggin@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Nikolay Borisov &lt;nborisov@suse.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.14+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
