<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/mm/memfd_luo.c, branch 0x221E-v0.0.1-v6.19</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=0x221E-v0.0.1-v6.19</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=0x221E-v0.0.1-v6.19'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2026-03-19T15:15:16Z</updated>
<entry>
<title>mm: memfd_luo: always dirty all folios</title>
<updated>2026-03-19T15:15:16Z</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-02-23T17:39:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e901c871d4b592f0042e30f3a0f031eae79744ec'/>
<id>urn:sha1:e901c871d4b592f0042e30f3a0f031eae79744ec</id>
<content type='text'>
commit 7e04bf1f33151a30e06a65b74b5f2c19fc2be128 upstream.

A dirty folio is one which has been written to.  A clean folio is its
opposite.  Since a clean folio has no user data, it can be freed under
memory pressure.

memfd preservation with LUO saves the dirty flag at preserve().  This is
problematic.  The folio might get dirtied later.  Saving it at freeze()
also doesn't work, since the dirty bit from PTE is normally synced at
unmap and there might still be mappings of the file at freeze().

To see why this is a problem, say a folio is clean at preserve, but gets
dirtied later.  The serialized state of the folio will mark it as clean.
After retrieve, the next kernel will see the folio as clean and might try
to reclaim it under memory pressure.  This will result in losing user
data.

Mark all folios of the file as dirty, and always set the
MEMFD_LUO_FOLIO_DIRTY flag.  This comes with the side effect of making all
clean folios un-reclaimable.  This is a cost that has to be paid for
participants of live update.  It is not expected to be a common use case
to preserve a lot of clean folios anyway.

Since the value of pfolio-&gt;flags is a constant now, drop the flags
variable and set it directly.
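
The failure mode can be illustrated with a small userspace model.  This is
a sketch in Python, not the kernel code: the numeric value of DIRTY, both
function names, and the always_dirty switch are assumptions made for the
example.

```python
DIRTY = 1  # stand-in for MEMFD_LUO_FOLIO_DIRTY; the numeric value is assumed


def serialize_flags(dirty_at_preserve, always_dirty):
    """Return the flag word stored for one folio at preserve time."""
    if always_dirty:
        # Behaviour after this patch: the saved state never says "clean".
        return DIRTY
    # Behaviour before this patch: snapshot whatever the folio looked like.
    return DIRTY if dirty_at_preserve else 0


def reclaimable_after_retrieve(flags):
    """The next kernel may drop a folio whose saved state says clean."""
    return flags != DIRTY
```

With always_dirty set to False, a folio that was clean at preserve is
saved as clean, so the model reports it as reclaimable even if it was
written to afterwards.  With always_dirty set to True it never is.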

Link: https://lkml.kernel.org/r/20260223173931.2221759-3-pratyush@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memfd_luo: always make all folios uptodate</title>
<updated>2026-03-19T15:15:16Z</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-02-23T17:39:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=39abeb73859fe8d276ebc5ef31d78ff394e3f7ee'/>
<id>urn:sha1:39abeb73859fe8d276ebc5ef31d78ff394e3f7ee</id>
<content type='text'>
commit 50d7b4332f27762d24641970fc34bb68a2621926 upstream.

Patch series "mm: memfd_luo: fixes for folio flag preservation".

This series contains a couple fixes for flag preservation for memfd live
update.

The first patch fixes memfd preservation when fallocate() was used to
pre-allocate some pages.  For these memfds, all the writes to fallocated
pages touched after preserve were lost.

The second patch fixes dirty flag tracking.  If the dirty flag is not
tracked correctly, the next kernel might incorrectly reclaim some folios
under memory pressure, losing user data.  This is a theoretical bug that I
observed when reading the code, and I haven't been able to reproduce it.


This patch (of 2):

When a folio is added to a shmem file via fallocate, it is not zeroed on
allocation.  This is done as a performance optimization since it is
possible the folio will never end up being used at all.  When the folio is
used, shmem checks for the uptodate flag, and if absent, zeroes the folio
(and sets the flag) before returning to user.

With LUO, the flags of each folio are saved at preserve time.  It is
possible to have a memfd with some folios fallocated but not uptodate.
For those, the uptodate flag doesn't get saved.  The folios might later
end up being used and become uptodate.  They would get passed to the next
kernel via KHO correctly since they did get preserved.  But they won't
have the MEMFD_LUO_FOLIO_UPTODATE flag.

This means that when the memfd is retrieved, the folios will be added to
the shmem file without the uptodate flag.  They will be zeroed before
first use, losing the data in those folios.

Since we take a big performance hit in allocating, zeroing, and pinning
all folios at prepare time anyway, take some more and zero all
non-uptodate ones too.

Later when there is a stronger need to make prepare faster, this can be
optimized.

To avoid racing with another uptodate operation, take the folio lock.
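
The data loss and the fix can be modeled in userspace.  The following is a
Python sketch, not the kernel code: the Folio class, the function names,
and the "fixed" switch are assumptions made for the example, and the folio
lock is only mentioned in a comment.

```python
class Folio:
    def __init__(self, size=8):
        self.data = bytearray(b"\xaa" * size)  # fallocated, not zeroed yet
        self.uptodate = False


def first_use(folio):
    """Model shmem: zero a folio lacking the uptodate flag before use."""
    if not folio.uptodate:
        folio.data[:] = bytes(len(folio.data))
        folio.uptodate = True
    return folio


def preserve(folio, fixed):
    """Return the uptodate flag that gets saved at preserve time."""
    if fixed:
        # After this patch: zero and mark uptodate now (the kernel does
        # this while holding the folio lock).
        first_use(folio)
    return folio.uptodate


def retrieve(folio, saved_uptodate):
    """Next kernel: rebuild the folio with the saved flag, then touch it."""
    folio.uptodate = saved_uptodate
    return first_use(folio)


def run(fixed):
    """Preserve, write user data afterwards, retrieve, and read it back."""
    folio = Folio()
    saved = preserve(folio, fixed)
    first_use(folio)  # the application touches the folio after preserve
    folio.data[:4] = b"user"
    return bytes(retrieve(folio, saved).data[:4])
```

In this model run(False) reads back zeroes because the stale flag makes
the next kernel zero the folio again, while run(True) reads back the data
written after preserve.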

Link: https://lkml.kernel.org/r/20260223173931.2221759-2-pratyush@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>liveupdate: luo_file: remember retrieve() status</title>
<updated>2026-03-19T15:15:12Z</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-02-16T13:22:19Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=1d3ad69484dc1cc53be62d2554e7ef038a627af9'/>
<id>urn:sha1:1d3ad69484dc1cc53be62d2554e7ef038a627af9</id>
<content type='text'>
commit f85b1c6af5bc3872f994df0a5688c1162de07a62 upstream.

LUO keeps track of successful retrieve attempts on a LUO file.  It does so
to avoid multiple retrievals of the same file.  Multiple retrievals cause
problems because once the file is retrieved, the serialized data
structures are likely freed and the file is likely in a very different
state from what the code expects.

The retrieve boolean in struct luo_file keeps track of this, and is passed
to the finish callback so it knows what work was already done and what it
has left to do.

All this works well when retrieve succeeds.  When it fails,
luo_retrieve_file() returns the error immediately, without ever storing
anywhere that a retrieve was attempted or what its error code was.  This
results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
but nothing prevents it from trying this again.

The retry is problematic for much the same reasons as listed above.  The
file is likely in a very different state than what the retrieve logic
normally expects, and it might even have freed some serialization data
structures.  Attempting to access them or free them again is going to
break things.

For example, if memfd manages to restore 8 of its 10 folios, but fails on
the 9th, a subsequent retrieve attempt will try to call
kho_restore_folio() on the first folio again, and that will fail with a
warning since it is an invalid operation.

Apart from the retry, finish() also breaks.  Since on failure the
retrieved bool in luo_file is never touched, the finish() call on session
close will tell the file handler that retrieve was never attempted, and it
will try to access or free the data structures that might not exist, much
in the same way as the retry attempt.

There is no sane way of attempting the retrieve again.  Remember the error
retrieve returned and directly return it on a retry.  Also pass this
status code to finish() so it can make the right decision on the work it
needs to do.

This is done by changing the bool to an integer.  A value of 0 means
retrieve was never attempted, a positive value means it succeeded, and a
negative value means it failed and the error code is the value.
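
The three-way status can be modeled in a few lines.  This is a userspace
Python sketch, not the kernel code: the class name LuoFileModel, the
handler callback, and the call counter are assumptions made for the
example, and returning 0 on a repeat after success is a simplification of
the model.

```python
import errno


class LuoFileModel:
    """Toy model of a LUO file that remembers its retrieve status."""

    def __init__(self, handler):
        # 0: never attempted, positive: succeeded, negative: -errno.
        self.retrieve_status = 0
        # Hypothetical callback returning 0 on success or a negative errno.
        self.handler = handler
        self.calls = 0

    def retrieve(self):
        # A previous attempt, successful or not, is never repeated.
        if self.retrieve_status != 0:
            return min(self.retrieve_status, 0)
        self.calls += 1
        err = self.handler()
        self.retrieve_status = err if err != 0 else 1
        return err
```

A handler that fails with -errno.EIO is called once.  Every later
retrieve() returns the remembered -errno.EIO without calling the handler
again, and the same stored value is what finish() would be given.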

Link: https://lkml.kernel.org/r/20260216132221.987987-1-pratyush@kernel.org
Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memfd_luo: restore and free memfd_luo_ser on failure</title>
<updated>2026-01-27T03:03:47Z</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-01-22T15:18:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c657c5dc1360fa1ed1b090aedb5883d9cf9f0a0f'/>
<id>urn:sha1:c657c5dc1360fa1ed1b090aedb5883d9cf9f0a0f</id>
<content type='text'>
memfd_luo_ser has the serialization metadata.  It is of no use once
restoration fails.  Free it on failure.

Link: https://lkml.kernel.org/r/20260122151842.4069702-4-pratyush@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memfd_luo: use memfd_alloc_file() instead of shmem_file_setup()</title>
<updated>2026-01-27T03:03:47Z</updated>
<author>
<name>Pratyush Yadav (Google)</name>
<email>pratyush@kernel.org</email>
</author>
<published>2026-01-22T15:18:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=02e117b8ca58928f193e5b4df48d6763232f5e91'/>
<id>urn:sha1:02e117b8ca58928f193e5b4df48d6763232f5e91</id>
<content type='text'>
When restoring a memfd, the file is created using shmem_file_setup(). 
While memfd creation also calls this function to get the file, it also
does other things:

  1. The O_LARGEFILE flag is set on the file. If this is not done,
  writes on the memfd exceeding 2 GiB fail.

  2. FMODE_LSEEK, FMODE_PREAD, and FMODE_PWRITE are set on the file.
  This makes sure the file is seekable and can be used with pread() and
  pwrite().

  3. Initializes the security field for the inode and makes sure that
  inode creation is permitted by the security module.

Currently, none of those things are done.  This means writes above 2 GiB
fail, pread() and pwrite() fail, and so on.  lseek() happens to work
because file_init_path() sets FMODE_LSEEK, since shmem defines fop-&gt;llseek.

Fix this by using memfd_alloc_file() to get the file to make sure the
initialization sequence for normal and preserved memfd is the same.

Link: https://lkml.kernel.org/r/20260122151842.4069702-3-pratyush@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Pratyush Yadav (Google) &lt;pratyush@kernel.org&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memfd_luo: allow preserving memfd</title>
<updated>2025-11-27T22:24:41Z</updated>
<author>
<name>Pratyush Yadav</name>
<email>ptyadav@amazon.de</email>
</author>
<published>2025-11-25T16:58:44Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b3749f174d686627f702234e64bad976dc432dbc'/>
<id>urn:sha1:b3749f174d686627f702234e64bad976dc432dbc</id>
<content type='text'>
The ability to preserve a memfd allows userspace to use KHO and LUO to
transfer its memory contents to the next kernel.  This is useful in many
ways.  For one, it can be used with IOMMUFD as the backing store for IOMMU
page tables.  Preserving IOMMUFD is essential for performing a hypervisor
live update with passthrough devices.  memfd support provides the first
building block for making that possible.

For another, for applications with a large amount of memory that takes time
to reconstruct, reboots to consume kernel upgrades can be very expensive.
memfd with LUO gives those applications reboot-persistent memory that they
can use to quickly save and reconstruct that state.

While memfd is backed by either hugetlbfs or shmem, currently only support
on shmem is added.  To be more precise, support for anonymous shmem files
is added.

The handover to the next kernel is not transparent.  Not all properties of
the file are preserved; only its memory contents, position, and size are.
The recreated file gets the UID and GID of the task doing the restore, and
the task's cgroup gets charged with the memory.

Once preserved, the file cannot grow or shrink, and all its pages are
pinned to avoid migrations and swapping.  The file can still be read from
or written to.

Use vmalloc to get the buffer to hold the folios, and preserve it using
kho_preserve_vmalloc().  This doesn't have the size limit.

Link: https://lkml.kernel.org/r/20251125165850.3389713-15-pasha.tatashin@soleen.com
Signed-off-by: Pratyush Yadav &lt;ptyadav@amazon.de&gt;
Co-developed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Tested-by: David Matlack &lt;dmatlack@google.com&gt;
Cc: Aleksander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Alexander Graf &lt;graf@amazon.com&gt;
Cc: Alice Ryhl &lt;aliceryhl@google.com&gt;
Cc: Andriy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: anish kumar &lt;yesanishhere@gmail.com&gt;
Cc: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Cc: Bartosz Golaszewski &lt;bartosz.golaszewski@linaro.org&gt;
Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Chanwoo Choi &lt;cw00.choi@samsung.com&gt;
Cc: Chen Ridong &lt;chenridong@huawei.com&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Daniel Wagner &lt;wagi@kernel.org&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Jeffery &lt;djeffery@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guixin Liu &lt;kanie@linux.alibaba.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Lennart Poettering &lt;lennart@poettering.net&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Cc: Lukas Wunner &lt;lukas@wunner.de&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Matthew Maurer &lt;mmaurer@google.com&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: MyungJoo Ham &lt;myungjoo.ham@samsung.com&gt;
Cc: Parav Pandit &lt;parav@nvidia.com&gt;
Cc: Pratyush Yadav &lt;pratyush@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
Cc: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: William Tu &lt;witu@nvidia.com&gt;
Cc: Yoann Congal &lt;yoann.congal@smile.fr&gt;
Cc: Zhu Yanjun &lt;yanjun.zhu@linux.dev&gt;
Cc: Zijun Hu &lt;quic_zijuhu@quicinc.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
