<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/mm, branch linux-6.8.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.8.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-6.8.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2024-05-30T07:49:00Z</updated>
<entry>
<title>mm/userfaultfd: Do not place zeropages when zeropages are disallowed</title>
<updated>2024-05-30T07:49:00Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2024-04-11T16:14:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5c8323a24d6e3c42b99518b84ae6fcc5341f8fbe'/>
<id>urn:sha1:5c8323a24d6e3c42b99518b84ae6fcc5341f8fbe</id>
<content type='text'>
[ Upstream commit 90a7592da14951bd21f74a53246ba30955a648aa ]

s390x must disable shared zeropages for processes running VMs, because
the VMs could end up making use of "storage keys" or protected
virtualization, which are incompatible with shared zeropages.

Yet, with userfaultfd it is possible to insert shared zeropages into
such processes. Let's fallback to simply allocating a fresh zeroed
anonymous folio and insert that instead.

mm_forbids_zeropage() was introduced in commit 593befa6ab74 ("mm: introduce
mm_forbids_zeropage function"), briefly before userfaultfd went
upstream.

Note that we don't want to fail the UFFDIO_ZEROPAGE request like we do
for hugetlb, it would be rather unexpected. Further, we also
cannot really indicated "not supported" to user space ahead of time: it
could be that the MM disallows zeropages after userfaultfd was already
registered.

[ agordeev: Fixed checkpatch complaints ]

Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation")
Reviewed-by: Peter Xu &lt;peterx@redhat.com&gt;
Link: https://lore.kernel.org/r/20240411161441.910170-2-david@redhat.com
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>libfs: Add simple_offset_rename() API</title>
<updated>2024-05-30T07:48:59Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-04-15T15:20:55Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8596e390ded758e530fd240d81400d1797be928b'/>
<id>urn:sha1:8596e390ded758e530fd240d81400d1797be928b</id>
<content type='text'>
[ Upstream commit 5a1a25be995e1014abd01600479915683e356f5c ]

I'm about to fix a tmpfs rename bug that requires the use of
internal simple_offset helpers that are not available in mm/shmem.c

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Link: https://lore.kernel.org/r/20240415152057.4605-3-cel@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Stable-dep-of: ad191eb6d694 ("shmem: Fix shmem_rename2()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>libfs: Add simple_offset_empty()</title>
<updated>2024-05-30T07:48:59Z</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2024-02-17T20:23:54Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=abac2274f85a457d7f4c1649e3b7d458c77d7899'/>
<id>urn:sha1:abac2274f85a457d7f4c1649e3b7d458c77d7899</id>
<content type='text'>
[ Upstream commit ecba88a3b32d733d41e27973e25b2bc580f64281 ]

For simple filesystems that use directory offset mapping, rely
strictly on the directory offset map to tell when a directory has
no children.

After this patch is applied, the emptiness test holds only the RCU
read lock when the directory being tested has no children.

In addition, this adds another layer of confirmation that
simple_offset_add/remove() are working as expected.

Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Link: https://lore.kernel.org/r/170820143463.6328.7872919188371286951.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Stable-dep-of: 23cdd0eed3f1 ("libfs: Fix simple_offset_rename_exchange()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: use memalloc_nofs_save() in page_cache_ra_order()</title>
<updated>2024-05-17T10:15:08Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2024-04-26T11:29:38Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=cf6a1d16c6df3c30b03f0c6a92a2ba7f86dffb45'/>
<id>urn:sha1:cf6a1d16c6df3c30b03f0c6a92a2ba7f86dffb45</id>
<content type='text'>
commit 30153e4466647a17eebfced13eede5cbe4290e69 upstream.

See commit f2c817bed58d ("mm: use memalloc_nofs_save in readahead path"),
ensure that page_cache_ra_order() do not attempt to reclaim file-backed
pages too, or it leads to a deadlock, found issue when test ext4 large
folio.

 INFO: task DataXceiver for:7494 blocked for more than 120 seconds.
 "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:DataXceiver for state:D stack:0     pid:7494  ppid:1      flags:0x00000200
 Call trace:
  __switch_to+0x14c/0x240
  __schedule+0x82c/0xdd0
  schedule+0x58/0xf0
  io_schedule+0x24/0xa0
  __folio_lock+0x130/0x300
  migrate_pages_batch+0x378/0x918
  migrate_pages+0x350/0x700
  compact_zone+0x63c/0xb38
  compact_zone_order+0xc0/0x118
  try_to_compact_pages+0xb0/0x280
  __alloc_pages_direct_compact+0x98/0x248
  __alloc_pages+0x510/0x1110
  alloc_pages+0x9c/0x130
  folio_alloc+0x20/0x78
  filemap_alloc_folio+0x8c/0x1b0
  page_cache_ra_order+0x174/0x308
  ondemand_readahead+0x1c8/0x2b8
  page_cache_async_ra+0x68/0xb8
  filemap_readahead.isra.0+0x64/0xa8
  filemap_get_pages+0x3fc/0x5b0
  filemap_splice_read+0xf4/0x280
  ext4_file_splice_read+0x2c/0x48 [ext4]
  vfs_splice_read.part.0+0xa8/0x118
  splice_direct_to_actor+0xbc/0x288
  do_splice_direct+0x9c/0x108
  do_sendfile+0x328/0x468
  __arm64_sys_sendfile64+0x8c/0x148
  invoke_syscall+0x4c/0x118
  el0_svc_common.constprop.0+0xc8/0xf0
  do_el0_svc+0x24/0x38
  el0_svc+0x4c/0x1f8
  el0t_64_sync_handler+0xc0/0xc8
  el0t_64_sync+0x188/0x190

Link: https://lkml.kernel.org/r/20240426112938.124740-1-wangkefeng.wang@huawei.com
Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/slub: avoid zeroing outside-object freepointer for single free</title>
<updated>2024-05-17T10:15:00Z</updated>
<author>
<name>Nicolas Bouchinet</name>
<email>nicolas.bouchinet@ssi.gouv.fr</email>
</author>
<published>2024-04-30T11:34:59Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=56900355485f6e82114b18c812edd57fd7970dcb'/>
<id>urn:sha1:56900355485f6e82114b18c812edd57fd7970dcb</id>
<content type='text'>
commit 8f828aa48812ced28aa39cb3cfe55ef2444d03dd upstream.

Commit 284f17ac13fe ("mm/slub: handle bulk and single object freeing
separately") splits single and bulk object freeing in two functions
slab_free() and slab_free_bulk() which leads slab_free() to call
slab_free_hook() directly instead of slab_free_freelist_hook().

If `init_on_free` is set, slab_free_hook() zeroes the object.
Afterward, if `slub_debug=F` and `CONFIG_SLAB_FREELIST_HARDENED` are
set, the do_slab_free() slowpath executes freelist consistency
checks and try to decode a zeroed freepointer which leads to a
"Freepointer corrupt" detection in check_object().

During bulk free, slab_free_freelist_hook() isn't affected as it always
sets it objects freepointer using set_freepointer() to maintain its
reconstructed freelist after `init_on_free`.

For single free, object's freepointer thus needs to be avoided when
stored outside the object if `init_on_free` is set. The freepointer left
as is, check_object() may later detect an invalid pointer value due to
objects overflow.

To reproduce, set `slub_debug=FU init_on_free=1 log_level=7` on the
command line of a kernel build with `CONFIG_SLAB_FREELIST_HARDENED=y`.

dmesg sample log:
[   10.708715] =============================================================================
[   10.710323] BUG kmalloc-rnd-05-32 (Tainted: G    B           T ): Freepointer corrupt
[   10.712695] -----------------------------------------------------------------------------
[   10.712695]
[   10.712695] Slab 0xffffd8bdc400d580 objects=32 used=4 fp=0xffff9d9a80356f80 flags=0x200000000000a00(workingset|slab|node=0|zone=2)
[   10.716698] Object 0xffff9d9a80356600 @offset=1536 fp=0x7ee4f480ce0ecd7c
[   10.716698]
[   10.716698] Bytes b4 ffff9d9a803565f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   10.720703] Object   ffff9d9a80356600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   10.720703] Object   ffff9d9a80356610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   10.724696] Padding  ffff9d9a8035666c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   10.724696] Padding  ffff9d9a8035667c: 00 00 00 00                                      ....
[   10.724696] FIX kmalloc-rnd-05-32: Object at 0xffff9d9a80356600 not freed

Fixes: 284f17ac13fe ("mm/slub: handle bulk and single object freeing separately")
Cc: &lt;stable@vger.kernel.org&gt;
Co-developed-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Signed-off-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Signed-off-by: Nicolas Bouchinet &lt;nicolas.bouchinet@ssi.gouv.fr&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: zswap: fix shrinker NULL crash with cgroup_disable=memory</title>
<updated>2024-05-02T14:35:32Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2024-04-18T12:26:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b0fdabc908a7f81d12382c87ca9e46a9c2e14042'/>
<id>urn:sha1:b0fdabc908a7f81d12382c87ca9e46a9c2e14042</id>
<content type='text'>
commit 682886ec69d22363819a83ddddd5d66cb5c791e1 upstream.

Christian reports a NULL deref in zswap that he bisected down to the zswap
shrinker.  The issue also cropped up in the bug trackers of libguestfs [1]
and the Red Hat bugzilla [2].

The problem is that when memcg is disabled with the boot time flag, the
zswap shrinker might get called with sc-&gt;memcg == NULL.  This is okay in
many places, like the lruvec operations.  But it crashes in
memcg_page_state() - which is only used due to the non-node accounting of
cgroup's the zswap memory to begin with.

Nhat spotted that the memcg can be NULL in the memcg-disabled case, and I
was then able to reproduce the crash locally as well.

[1] https://github.com/libguestfs/libguestfs/issues/139
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2275252

Link: https://lkml.kernel.org/r/20240418124043.GC1055428@cmpxchg.org
Link: https://lkml.kernel.org/r/20240417143324.GA1055428@cmpxchg.org
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Christian Heusel &lt;christian@heusel.eu&gt;
Debugged-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Suggested-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Tested-by: Christian Heusel &lt;christian@heusel.eu&gt;
Acked-by: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Richard W.M. Jones &lt;rjones@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[v6.8]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: turn folio_test_hugetlb into a PageType</title>
<updated>2024-05-02T14:35:32Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2024-03-21T14:24:43Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9fdcc5b6359dfdaa52a55033bf50e2cedd66eb32'/>
<id>urn:sha1:9fdcc5b6359dfdaa52a55033bf50e2cedd66eb32</id>
<content type='text'>
commit d99e3140a4d33e26066183ff727d8f02f56bec64 upstream.

The current folio_test_hugetlb() can be fooled by a concurrent folio split
into returning true for a folio which has never belonged to hugetlbfs.
This can't happen if the caller holds a refcount on it, but we have a few
places (memory-failure, compaction, procfs) which do not and should not
take a speculative reference.

Since hugetlb pages do not use individual page mapcounts (they are always
fully mapped and use the entire_mapcount field to record the number of
mappings), the PageType field is available now that page_mapcount()
ignores the value in this field.

In compaction and with CONFIG_DEBUG_VM enabled, the current implementation
can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b
("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks
in the PageHuge() testing path.

[willy@infradead.org: update vmcoreinfo]
  Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org
Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org
Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reported-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()</title>
<updated>2024-05-02T14:35:26Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2024-04-19T08:58:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9c9b32d46afab2d911897914181c488954012300'/>
<id>urn:sha1:9c9b32d46afab2d911897914181c488954012300</id>
<content type='text'>
commit 52ccdde16b6540abe43b6f8d8e1e1ec90b0983af upstream.

When I did memory failure tests recently, below warning occurs:

DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS:  00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
 &lt;TASK&gt;
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 &lt;/TASK&gt;
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 &lt;TASK&gt;
 panic+0x326/0x350
 check_panic_on_warn+0x4f/0x50
 __warn+0x98/0x190
 report_bug+0x18e/0x1a0
 handle_bug+0x3d/0x70
 exc_invalid_op+0x18/0x70
 asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 &lt;/TASK&gt;

After git bisecting and digging into the code, I believe the root cause is
that _deferred_list field of folio is unioned with _hugetlb_subpool field.
In __update_and_free_hugetlb_folio(), folio-&gt;_deferred_list is
initialized leading to corrupted folio-&gt;_hugetlb_subpool when folio is
hugetlb.  Later free_huge_folio() will use _hugetlb_subpool and above
warning happens.

But it is assumed hugetlb flag must have been cleared when calling
folio_put() in update_and_free_hugetlb_folio().  This assumption is broken
due to below race:

CPU1					CPU2
dissolve_free_huge_page			update_and_free_pages_bulk
 update_and_free_hugetlb_folio		 hugetlb_vmemmap_restore_folios
					  folio_clear_hugetlb_vmemmap_optimized
  clear_flag = folio_test_hugetlb_vmemmap_optimized
  if (clear_flag) &lt;-- False, it's already cleared.
   __folio_clear_hugetlb(folio) &lt;-- Hugetlb is not cleared.
  folio_put
   free_huge_folio &lt;-- free_the_page is expected.
					 list_for_each_entry()
					  __folio_clear_hugetlb &lt;-- Too late.

Fix this issue by checking whether folio is hugetlb directly instead of
checking clear_flag to close the race window.

Link: https://lkml.kernel.org/r/20240419085819.1901645-1-linmiaohe@huawei.com
Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hugetlb: fix missing hugetlb_lock for resv uncharge</title>
<updated>2024-05-02T14:35:26Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2024-04-17T21:18:35Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=538faabf31e9c53d8c870d114846fda958a0de10'/>
<id>urn:sha1:538faabf31e9c53d8c870d114846fda958a0de10</id>
<content type='text'>
commit b76b46902c2d0395488c8412e1116c2486cdfcb2 upstream.

There is a recent report on UFFDIO_COPY over hugetlb:

https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/

350:	lockdep_assert_held(&amp;hugetlb_lock);

Should be an issue in hugetlb but triggered in an userfault context, where
it goes into the unlikely path where two threads modifying the resv map
together.  Mike has a fix in that path for resv uncharge but it looks like
the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd()
will update the cgroup pointer, so it requires to be called with the lock
held.

Link: https://lkml.kernel.org/r/20240417211836.2742593-3-peterx@redhat.com
Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reported-by: syzbot+4b8077a5fccc61c385a1@syzkaller.appspotmail.com
Reviewed-by: Mina Almasry &lt;almasrymina@google.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/shmem: inline shmem_is_huge() for disabled transparent hugepages</title>
<updated>2024-04-27T15:13:02Z</updated>
<author>
<name>Sumanth Korikkar</name>
<email>sumanthk@linux.ibm.com</email>
</author>
<published>2024-04-09T15:54:07Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1e05f0897425b51f1b8b3d5c62e89a9898aa80d4'/>
<id>urn:sha1:1e05f0897425b51f1b8b3d5c62e89a9898aa80d4</id>
<content type='text'>
commit 1f737846aa3c45f07a06fa0d018b39e1afb8084a upstream.

In order to  minimize code size (CONFIG_CC_OPTIMIZE_FOR_SIZE=y),
compiler might choose to make a regular function call (out-of-line) for
shmem_is_huge() instead of inlining it. When transparent hugepages are
disabled (CONFIG_TRANSPARENT_HUGEPAGE=n), it can cause compilation
error.

mm/shmem.c: In function `shmem_getattr':
./include/linux/huge_mm.h:383:27: note: in expansion of macro `BUILD_BUG'
  383 | #define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; })
      |                           ^~~~~~~~~
mm/shmem.c:1148:33: note: in expansion of macro `HPAGE_PMD_SIZE'
 1148 |                 stat-&gt;blksize = HPAGE_PMD_SIZE;

To prevent the possible error, always inline shmem_is_huge() when
transparent hugepages are disabled.

Link: https://lkml.kernel.org/r/20240409155407.2322714-1-sumanthk@linux.ibm.com
Signed-off-by: Sumanth Korikkar &lt;sumanthk@linux.ibm.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
