<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/mm/memory.c, branch linux-2.6.36.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-2.6.36.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-2.6.36.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2010-09-20T17:44:37Z</updated>
<entry>
<title>mm: further fix swapin race condition</title>
<updated>2010-09-20T17:44:37Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2010-09-20T02:40:22Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=31c4a3d3a0f84a5847665f8aa0552d188389f791'/>
<id>urn:sha1:31c4a3d3a0f84a5847665f8aa0552d188389f791</id>
<content type='text'>
Commit 4969c1192d15 ("mm: fix swapin race condition") is now agreed to
be incomplete.  There's a race, hardly less likely than the race
originally envisaged, in which it is further necessary to check that
the swapcache page's swap has not changed.

Here's the reasoning: cast in terms of reuse_swap_page(), but probably
could be reformulated to rely on try_to_free_swap() instead, or on
swapoff+swapon.

A, faults into do_swap_page(): does page1 = lookup_swap_cache(swap1) and
comes through the lock_page(page1).

B, a racing thread of the same process, faults on the same address: does
page1 = lookup_swap_cache(swap1) and now waits in lock_page(page1), but
for whatever reason is unlucky not to get the lock any time soon.

A carries on through do_swap_page(), a write fault, but cannot reuse the
swap page1 (another reference to swap1).  Unlocks the page1 (but B
doesn't get it yet), does COW in do_wp_page(), page2 now in that pte.

C, perhaps the parent of A+B, comes in and write faults the same swap
page1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed.

kswapd comes in after some time (B still unlucky) and swaps out some
pages from A+B and C: it allocates the original swap1 to page2 in A+B,
and some other swap2 to the original page1 now in C.  But does not
immediately free page1 (actually it couldn't: B holds a reference),
leaving it in swap cache for now.

B at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes.
Is pte_same(*page_table, orig_pte)? Yes, because page2 has now been
given the swap1 which page1 used to have.  So B proceeds to insert page1
into A+B's page_table, though its content now belongs to C, quite
different from what A wrote there.

B ought to have checked that page1's swap was still swap1.
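
As a toy model of the missing check (Python, not kernel code): the Page
class, its swap_entry field and the two check functions below are
hypothetical names made up for illustration, and swap slots are plain
strings.  It only replays the state B sees once it finally gets the lock.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for a page that may sit in swap cache."""
    in_swap_cache: bool
    swap_entry: str  # swap slot currently backing this page (hypothetical)


def pte_same(current_pte, orig_pte):
    # The pte still holds the swap entry we originally faulted on.
    return current_pte == orig_pte


def old_check(page, current_pte, orig_pte):
    # Checks described above: page in swap cache, pte unchanged.
    return page.in_swap_cache and pte_same(current_pte, orig_pte)


def fixed_check(page, current_pte, orig_pte, faulting_entry):
    # Extra condition: the page must still be backed by the swap
    # entry that the fault looked up.
    return (old_check(page, current_pte, orig_pte)
            and page.swap_entry == faulting_entry)


# State when B gets the lock: page1 is still in swap cache but now owns
# swap2 (its content belongs to C), while the pte holds swap1 again
# because kswapd gave swap1 to page2.
page1 = Page(in_swap_cache=True, swap_entry="swap2")
print(old_check(page1, "swap1", "swap1"))             # prints True
print(fixed_check(page1, "swap1", "swap1", "swap1"))  # prints False
```

With only the two older conditions B goes ahead and maps page1; adding
the comparison against the faulting swap entry makes B back off.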

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix swapin race condition</title>
<updated>2010-09-10T01:57:24Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2010-09-09T23:37:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4969c1192d15afa3389e7ae3302096ff684ba655'/>
<id>urn:sha1:4969c1192d15afa3389e7ae3302096ff684ba655</id>
<content type='text'>
The pte_same check is reliable only if the swap entry remains pinned (by
the page lock on swapcache).  We also have to ensure the swapcache wasn't
removed before we took the lock, as try_to_free_swap() won't care about
the page pin.

One possible impact of the race fixed by this patch is that a KSM-shared
page can point to the anon_vma of another process, which could exit
before the page is freed.

This can leave a page with a pointer to a recycled anon_vma object, or
worse, a pointer to something that is no longer an anon_vma.

[riel@redhat.com: changelog help]
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>guard page for stacks that grow upwards</title>
<updated>2010-08-24T19:13:20Z</updated>
<author>
<name>Luck, Tony</name>
<email>tony.luck@intel.com</email>
</author>
<published>2010-08-24T18:44:18Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8ca3eb08097f6839b2206e2242db4179aee3cfb3'/>
<id>urn:sha1:8ca3eb08097f6839b2206e2242db4179aee3cfb3</id>
<content type='text'>
PA-RISC and ia64 have stacks that grow upwards. Check that
they do not run into other mappings. By making VM_GROWSUP
0x0 on architectures that do not ever use it, we can avoid
some unpleasant #ifdefs in check_stack_guard_page().

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: make stack guard page logic use vm_prev pointer</title>
<updated>2010-08-21T15:50:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-08-20T23:49:40Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0e8e50e20c837eeec8323bba7dcd25fe5479194c'/>
<id>urn:sha1:0e8e50e20c837eeec8323bba7dcd25fe5479194c</id>
<content type='text'>
Like the mlock() change previously, this makes the stack guard check
code use vma-&gt;vm_prev to see what the mapping below the current stack
is, rather than have to look it up with find_vma().

Also, accept an abutting stack segment, since that happens naturally if
you split the stack with mlock or mprotect.
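
A rough sketch of the resulting decision, written as a Python model
rather than kernel code.  The Vma class, the 4096-byte page size and the
string results ("ok", "expand", "reject") are assumptions made for this
illustration; only the use of the previous mapping and the acceptance of
an abutting stack segment come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size for the sketch


@dataclass
class Vma:
    start: int
    end: int
    grows_down: bool
    prev: Optional["Vma"] = None  # mapping just below, like vm_prev


def check_stack_guard_page(vma, address):
    """Return 'ok', 'expand' or 'reject' for a fault at address."""
    address -= address % PAGE_SIZE  # round down to a page boundary
    if not (vma.grows_down and address == vma.start):
        return "ok"  # not the lowest page of a grow-down stack
    prev = vma.prev
    if prev is not None and prev.end == address:
        # Something abuts the stack from below: acceptable only if it
        # is another piece of the same (split) stack.
        return "ok" if prev.grows_down else "reject"
    return "expand"  # room below: grow the segment down by one page


stack = Vma(0x7000, 0x9000, True)
print(check_stack_guard_page(stack, 0x7010))  # prints expand

stack.prev = Vma(0x6000, 0x7000, True)        # split-off stack piece
print(check_stack_guard_page(stack, 0x7010))  # prints ok

stack.prev = Vma(0x6000, 0x7000, False)       # unrelated mapping
print(check_stack_guard_page(stack, 0x7010))  # prints reject
```

Reading the neighbour straight from the prev link is what removes the
need for a separate lookup in this model.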

Tested-by: Ian Campbell &lt;ijc@hellion.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix page table unmap for stack guard page properly</title>
<updated>2010-08-14T18:44:56Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-08-14T18:44:56Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=11ac552477e32835cb6970bf0a70c210807f5673'/>
<id>urn:sha1:11ac552477e32835cb6970bf0a70c210807f5673</id>
<content type='text'>
We do in fact need to unmap the page table _before_ doing the whole
stack guard page logic, because where mapping the page table is a real
operation (mainly 32-bit x86 with PAE and CONFIG_HIGHPTE, but other
architectures may use it too) it is done with kmap_atomic/kunmap_atomic.

And those kmaps will create an atomic region that we cannot do
allocations in.  However, the whole stack expand code will need to do
anon_vma_prepare() and vma_lock_anon_vma() and they cannot do that in an
atomic region.

Now, a better model might actually be to do the anon_vma_prepare() when
_creating_ a VM_GROWSDOWN segment, and not have to worry about any of
this at page fault time.  But in the meantime, this is the
straightforward fix for the issue.

See https://bugzilla.kernel.org/show_bug.cgi?id=16588 for details.

Reported-by: Wylda &lt;wylda@volny.cz&gt;
Reported-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Reported-by: Mike Pagano &lt;mpagano@gentoo.org&gt;
Reported-by: François Valenduc &lt;francois.valenduc@tvcablenet.be&gt;
Tested-by: Ed Tomlinson &lt;edt@aei.ca&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Greg KH &lt;gregkh@suse.de&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix missing page table unmap for stack guard page failure case</title>
<updated>2010-08-13T16:24:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-08-13T16:24:04Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=5528f9132cf65d4d892bcbc5684c61e7822b21e9'/>
<id>urn:sha1:5528f9132cf65d4d892bcbc5684c61e7822b21e9</id>
<content type='text'>
.. which didn't show up in my tests because it's a no-op on x86-64 and
most other architectures.  But we enter the function with the last-level
page table mapped, and should unmap it at exit.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: keep a guard page below a grow-down stack segment</title>
<updated>2010-08-13T00:54:33Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-08-13T00:54:33Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=320b2b8de12698082609ebbc1a17165727f4c893'/>
<id>urn:sha1:320b2b8de12698082609ebbc1a17165727f4c893</id>
<content type='text'>
This is a rather minimally invasive patch to solve the problem of the
user stack growing into a memory mapped area below it.  Whenever we fill
the first page of the stack segment, expand the segment down by one
page.

Now, admittedly some odd application might _want_ the stack to grow down
into the preceding memory mapping, and so we may at some point need to
make this a process tunable (some people might also want to have more
than a single page of guarding), but let's try the minimal approach
first.

Tested with trivial application that maps a single page just below the
stack, and then starts recursing.  Without this, we will get a SIGSEGV
_after_ the stack has smashed the mapping.  With this patch, we'll get a
nice SIGBUS just as the stack touches the page just above the mapping.

Requested-by: Keith Packard &lt;keithp@keithp.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mmu-notifiers: remove mmu notifier calls in apply_to_page_range()</title>
<updated>2010-08-10T03:45:03Z</updated>
<author>
<name>Jeremy Fitzhardinge</name>
<email>jeremy@goop.org</email>
</author>
<published>2010-08-10T00:19:52Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=57250a5bf0f6ff68dc339572adbd881a11f366fa'/>
<id>urn:sha1:57250a5bf0f6ff68dc339572adbd881a11f366fa</id>
<content type='text'>
It is not appropriate for apply_to_page_range() to directly call any mmu
notifiers, because it is a general purpose function whose effect depends
on what context it is called in and what the callback function does.

In particular, if it is being used as part of an mmu notifier
implementation, the recursive calls can be particularly problematic.

It is up to apply_to_page_range's caller to do any notifier calls if
necessary.  It does not affect any in-tree users because they all operate
on init_mm, and mmu notifiers only pertain to usermode mappings.

[stefano.stabellini@eu.citrix.com: remove unused local `start']
Signed-off-by: Jeremy Fitzhardinge &lt;jeremy.fitzhardinge@citrix.com&gt;
Signed-off-by: Stefano Stabellini &lt;stefano.stabellini@eu.citrix.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Stefano Stabellini &lt;stefano.stabellini@eu.citrix.com&gt;
Cc: Avi Kivity &lt;avi@qumranet.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: set VM_FAULT_WRITE in do_swap_page()</title>
<updated>2010-08-10T03:45:02Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2010-08-10T00:19:49Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9a5b489b870def9a93f5e89dac03ebe136f901db'/>
<id>urn:sha1:9a5b489b870def9a93f5e89dac03ebe136f901db</id>
<content type='text'>
Set VM_FAULT_WRITE when do_swap_page() breaks COW on the page, the same
way do_wp_page() would.

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>rmap: add exclusive page to private anon_vma on swapin</title>
<updated>2010-08-10T03:45:02Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@redhat.com</email>
</author>
<published>2010-08-10T00:19:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=ad8c2ee801ad7a52d919b478d9b2c7b39a72d295'/>
<id>urn:sha1:ad8c2ee801ad7a52d919b478d9b2c7b39a72d295</id>
<content type='text'>
On swapin it is fairly common for a page to be owned exclusively by one
process.  In that case we want to add the page to the anon_vma of that
process's VMA, instead of to the root anon_vma.

This will reduce the amount of rmap searching that the swapout code needs
to do.
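
A toy Python model of the choice, to show why it cuts the search.  The
AnonVma class, its vmas list and anon_vma_for_swapin() are hypothetical
names for this sketch; real anon_vma objects are linked quite
differently.

```python
from dataclasses import dataclass, field


@dataclass
class AnonVma:
    name: str
    root: "AnonVma" = None  # None means this anon_vma is the root
    vmas: list = field(default_factory=list)  # VMAs an rmap walk visits


def anon_vma_for_swapin(anon_vma, exclusive):
    # Exclusive pages go on the VMA's own anon_vma; shared pages go on
    # the root, which every related process can reach.
    if exclusive or anon_vma.root is None:
        return anon_vma
    return anon_vma.root


root = AnonVma("root", vmas=["parent", "child1", "child2"])
child = AnonVma("child1", root=root, vmas=["child1"])

print(len(anon_vma_for_swapin(child, True).vmas))   # prints 1
print(len(anon_vma_for_swapin(child, False).vmas))  # prints 3
```

In this model a later walk for the exclusive page visits one VMA
instead of three.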

Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
