<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/mm/page_alloc.c, branch linux-2.6.37.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-2.6.37.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-2.6.37.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2011-03-07T23:05:11Z</updated>
<entry>
<title>mm: fix dubious code in __count_immobile_pages()</title>
<updated>2011-03-07T23:05:11Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2011-02-25T22:44:25Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=7c457763a53170ed4815f861b7c6a758f9cb0ed3'/>
<id>urn:sha1:7c457763a53170ed4815f861b7c6a758f9cb0ed3</id>
<content type='text'>
commit 29723fccc837d20039078f7a571e8d457eb0d6c6 upstream.

When pfn_valid_within() failed, 'iter' was incremented twice.
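
<![CDATA[
As a rough illustration of this class of bug (not the actual kernel code),
the sketch below scans an array in which a hypothetical slot_valid() helper
stands in for pfn_valid_within().  The names scan_buggy(), scan_fixed() and
valid[] are assumptions made for the example.  In the buggy variant the index
is incremented in the loop body and again by the loop header, so the entry
following each hole is silently skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pfn_valid_within(): non-zero means valid. */
static int slot_valid(const int *valid, size_t i)
{
    return valid[i] != 0;
}

/* Buggy scan: an invalid slot bumps the index in the body and then again
 * in the loop header, so the slot after each hole is never examined. */
static size_t scan_buggy(const int *valid, size_t n)
{
    size_t examined = 0;
    size_t iter;

    assert(valid != NULL);
    for (iter = 0; iter < n; iter++) {
        if (!slot_valid(valid, iter)) {
            iter++;         /* extra increment: this is the bug */
            continue;
        }
        examined++;
    }
    return examined;
}

/* Fixed scan: only the loop header advances the index. */
static size_t scan_fixed(const int *valid, size_t n)
{
    size_t examined = 0;
    size_t iter;

    assert(valid != NULL);
    for (iter = 0; iter < n; iter++) {
        if (!slot_valid(valid, iter))
            continue;
        examined++;
    }
    return examined;
}
```

With one hole followed by valid entries, the buggy scan examines one entry
fewer than the fixed scan.
]]>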

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Reviewed-by: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>mm: page allocator: adjust the per-cpu counter threshold when memory is low</title>
<updated>2011-02-17T23:14:38Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2011-01-13T23:45:41Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=39a06a2ac3bcb57fc04ffb53af8e402e96c515f4'/>
<id>urn:sha1:39a06a2ac3bcb57fc04ffb53af8e402e96c515f4</id>
<content type='text'>
commit 88f5acf88ae6a9778f6d25d0d5d7ec2d57764a97 upstream.

Commit aa45484 ("calculate a better estimate of NR_FREE_PAGES when memory
is low") noted that watermarks were based on the vmstat NR_FREE_PAGES.  To
avoid synchronization overhead, these counters are maintained on a per-cpu
basis and drained both periodically and when a per-cpu delta is above a
threshold.  On large CPU systems, the difference between the estimate and
real value of NR_FREE_PAGES can be very high.  The system can get into a
case where pages are allocated far below the min watermark potentially
causing livelock issues.  The commit solved the problem by taking a better
reading of NR_FREE_PAGES when memory was low.

Unfortunately, as reported by Shaohua Li, this accurate reading can consume a
large amount of CPU time on systems with many sockets due to cache line
bouncing.  This patch takes a different approach.  For large machines
where counter drift might be unsafe and while kswapd is awake, the per-cpu
thresholds for the target pgdat are reduced to limit the level of drift to
what should be a safe level.  This incurs a performance penalty in heavy
memory pressure by a factor that depends on the workload and the machine
but the machine should function correctly without accidentally exhausting
all memory on a node.  There is an additional cost when kswapd wakes and
sleeps but the event is not expected to be frequent - in Shaohua's test
case, there was at least one recorded sleep and wake event.
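
<![CDATA[
The relationship between the per-cpu threshold and counter drift can be
sketched in userspace C as follows.  This is a minimal model under stated
assumptions, not the vmstat implementation: struct counter_sketch,
NR_CPUS_SKETCH and the helper names are all hypothetical.  It shows that the
cheap global value can lag the true value by up to threshold times the number
of CPUs, and that lowering the threshold tightens that bound.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4    /* hypothetical CPU count for the example */

struct counter_sketch {
    long global;                    /* cheap to read, possibly stale */
    long delta[NR_CPUS_SKETCH];     /* per-cpu pending updates */
    long threshold;                 /* fold a delta once it exceeds this */
};

/* Apply a change on one CPU; fold the local delta into the global value
 * only when its magnitude grows beyond the threshold. */
static void counter_mod(struct counter_sketch *c, int cpu, long change)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    c->delta[cpu] += change;
    if (c->delta[cpu] > c->threshold || c->delta[cpu] < -c->threshold) {
        c->global += c->delta[cpu];
        c->delta[cpu] = 0;
    }
}

/* Accurate but expensive read: touches every CPU's delta. */
static long counter_snapshot(const struct counter_sketch *c)
{
    long sum = c->global;
    int cpu;

    for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += c->delta[cpu];
    return sum;
}

/* Bound on |global - snapshot| once all deltas obey the current threshold. */
static long counter_max_drift(const struct counter_sketch *c)
{
    return c->threshold * NR_CPUS_SKETCH;
}
```

In this model the snapshot corresponds to the "safe" reading and the global
field to the ordinary one; reducing the threshold trades more frequent folds
for a smaller worst-case drift.
]]>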

To ensure that kswapd wakes up, a safe version of zone_watermark_ok() is
introduced that takes a more accurate reading of NR_FREE_PAGES when called
from wakeup_kswapd, when deciding whether it is really safe to go back to
sleep in sleeping_prematurely() and when deciding if a zone is really
balanced or not in balance_pgdat().  We are still using an expensive
function but limiting how often it is called.

When the test case is reproduced, the time spent in the watermark
functions is reduced.  The following report is on the percentage of time
cumulatively spent in the functions zone_nr_free_pages(),
zone_watermark_ok(), __zone_watermark_ok(), zone_watermark_ok_safe(),
zone_page_state_snapshot(), zone_page_state().

vanilla                      11.6615%
disable-threshold            0.2584%

David said:

: We had to pull aa454840 "mm: page allocator: calculate a better estimate
: of NR_FREE_PAGES when memory is low and kswapd is awake" from 2.6.36
: internally because tests showed that it would cause the machine to stall
: as the result of heavy kswapd activity.  I merged it back with this fix as
: it is pending in the -mm tree and it solves the issue we were seeing, so I
: definitely think this should be pushed to -stable (and I would seriously
: consider it for 2.6.37 inclusion even at this late date).

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Reported-by: Shaohua Li &lt;shaohua.li@intel.com&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux.com&gt;
Tested-by: Nicolas Bareil &lt;nico@chdir.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>PM / Hibernate: Fix memory corruption related to swap</title>
<updated>2010-12-06T22:52:08Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rjw@sisk.pl</email>
</author>
<published>2010-12-03T21:57:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c9e664f1fdf34aa8cede047b206deaa8f1945af0'/>
<id>urn:sha1:c9e664f1fdf34aa8cede047b206deaa8f1945af0</id>
<content type='text'>
There is a problem that swap pages allocated before the creation of
a hibernation image can be released and used for storing the contents
of different memory pages while the image is being saved.  Since the
kernel stored in the image doesn't know of that, it causes memory
corruption to occur after resume from hibernation, especially on
systems with relatively small RAM that need to swap often.

This issue can be addressed by keeping the GFP_IOFS bits clear
in gfp_allowed_mask during the entire hibernation, including the
saving of the image, until the system is finally turned off or
the hibernation is aborted.  Unfortunately, for this purpose
it's necessary to rework the way in which the hibernate and
suspend code manipulates gfp_allowed_mask.
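
<![CDATA[
The idea of keeping the I/O and filesystem bits clear in an allowed mask can
be sketched as below.  This is an illustrative userspace model only: the type
gfp_sketch_t, the bit values and the function names are assumptions chosen
for the example and are not taken from the patch.

```c
#include <assert.h>

typedef unsigned int gfp_sketch_t;

/* Hypothetical flag bits for the example. */
#define GFP_IO_SKETCH    0x40u
#define GFP_FS_SKETCH    0x80u
#define GFP_IOFS_SKETCH  (GFP_IO_SKETCH | GFP_FS_SKETCH)

static gfp_sketch_t allowed_mask_sketch = ~0u;  /* everything allowed */
static gfp_sketch_t saved_mask_sketch;
static int mask_restricted_sketch;

/* Clear the IO/FS bits for the whole hibernation sequence. */
static void restrict_gfp_mask_sketch(void)
{
    assert(!mask_restricted_sketch);
    saved_mask_sketch = allowed_mask_sketch;
    allowed_mask_sketch &= ~GFP_IOFS_SKETCH;
    mask_restricted_sketch = 1;
}

/* Put the previous mask back once hibernation finishes or is aborted. */
static void restore_gfp_mask_sketch(void)
{
    assert(mask_restricted_sketch);
    allowed_mask_sketch = saved_mask_sketch;
    mask_restricted_sketch = 0;
}

/* Every allocation request is filtered through the allowed mask. */
static gfp_sketch_t effective_gfp_sketch(gfp_sketch_t requested)
{
    return requested & allowed_mask_sketch;
}
```

While the mask is restricted, a request carrying the IO or FS bits loses
them, so an allocation in this model cannot start swap or filesystem I/O.
]]>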

This change is based on an earlier patch from Hugh Dickins.

Signed-off-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Reported-by: Ondrej Zary &lt;linux@rainbow-software.org&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: stable@kernel.org
</content>
</entry>
<entry>
<title>mm/page_alloc.c: fix build_all_zonelist() where percpu_alloc() is wrongly called under stop_machine_run()</title>
<updated>2010-11-24T21:50:45Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2010-11-24T20:57:09Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e9959f0f37160e1f5351af828cc981712b5066c1'/>
<id>urn:sha1:e9959f0f37160e1f5351af828cc981712b5066c1</id>
<content type='text'>
During memory hotplug, build_all_zonelists() may be called under
stop_machine_run().  In this function, setup_zone_pageset() is called.
But this is a bug because it will do page allocation under stop_machine_run().

Here is a report from Alok Kataria.

  BUG: sleeping function called from invalid context at kernel/mutex.c:94
  in_atomic(): 0, irqs_disabled(): 1, pid: 4, name: migration/0
  Pid: 4, comm: migration/0 Not tainted 2.6.35.6-45.fc14.x86_64 #1
  Call Trace:
   [&lt;ffffffff8103d12b&gt;] __might_sleep+0xeb/0xf0
   [&lt;ffffffff81468245&gt;] mutex_lock+0x24/0x50
   [&lt;ffffffff8110eaa6&gt;] pcpu_alloc+0x6d/0x7ee
   [&lt;ffffffff81048888&gt;] ? load_balance+0xbe/0x60e
   [&lt;ffffffff8103a1b3&gt;] ? rt_se_boosted+0x21/0x2f
   [&lt;ffffffff8103e1cf&gt;] ? dequeue_rt_stack+0x18b/0x1ed
   [&lt;ffffffff8110f237&gt;] __alloc_percpu+0x10/0x12
   [&lt;ffffffff81465e22&gt;] setup_zone_pageset+0x38/0xbe
   [&lt;ffffffff810d6d81&gt;] ? build_zonelists_node.clone.58+0x79/0x8c
   [&lt;ffffffff81452539&gt;] __build_all_zonelists+0x419/0x46c
   [&lt;ffffffff8108ef01&gt;] ? cpu_stopper_thread+0xb2/0x198
   [&lt;ffffffff8108f075&gt;] stop_machine_cpu_stop+0x8e/0xc5
   [&lt;ffffffff8108efe7&gt;] ? stop_machine_cpu_stop+0x0/0xc5
   [&lt;ffffffff8108ef57&gt;] cpu_stopper_thread+0x108/0x198
   [&lt;ffffffff81467a37&gt;] ? schedule+0x5b2/0x5cc
   [&lt;ffffffff8108ee4f&gt;] ? cpu_stopper_thread+0x0/0x198
   [&lt;ffffffff81065f29&gt;] kthread+0x7f/0x87
   [&lt;ffffffff8100aae4&gt;] kernel_thread_helper+0x4/0x10
   [&lt;ffffffff81065eaa&gt;] ? kthread+0x0/0x87
   [&lt;ffffffff8100aae0&gt;] ? kernel_thread_helper+0x0/0x10
  Built 5 zonelists in Node order, mobility grouping on.  Total pages: 289456
  Policy zone: Normal

This patch tries to fix the issue by moving setup_zone_pageset() out of
stop_machine_run(). It obviously does not need to be called under
stop_machine_run().

[akpm@linux-foundation.org: remove unneeded local]
Reported-by: Alok Kataria &lt;akataria@vmware.com&gt;
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Petr Vandrovec &lt;petr@vmware.com&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: add casts to/from gfp_t in gfp_to_alloc_flags()</title>
<updated>2010-10-26T23:52:09Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2010-10-26T21:21:59Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e6223a3b19421e3a8df1352d21fd0d71093f44ae'/>
<id>urn:sha1:e6223a3b19421e3a8df1352d21fd0d71093f44ae</id>
<content type='text'>
This removes the following warning from sparse:

 mm/page_alloc.c:1934:9: warning: restricted gfp_t degrades to integer

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone</title>
<updated>2010-10-26T23:52:07Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2010-10-26T21:21:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0e093d99763eb4cea09f8ca4f1d01f34e121d10b'/>
<id>urn:sha1:0e093d99763eb4cea09f8ca4f1d01f34e121d10b</id>
<content type='text'>
If congestion_wait() is called with no BDI congested, the caller will
sleep for the full timeout and this may be an unnecessary sleep.  This
patch adds a wait_iff_congested() that checks congestion and only sleeps
if a BDI is congested.  Otherwise, it calls cond_resched() to ensure the
caller is not hogging the CPU longer than its quota, but it will not sleep.

This is aimed at reducing some of the major desktop stalls reported during
IO.  For example, while kswapd is operating, it calls congestion_wait()
but it could just have been reclaiming clean page cache pages with no
congestion.  Without this patch, it would sleep for a full timeout but
after this patch, it'll just call schedule() if it has been on the CPU too
long.  Similar logic applies to direct reclaimers that are not making
enough progress.
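
<![CDATA[
The decision described above can be sketched as a small pure function.  This
is a simplified model, not the kernel's wait_iff_congested(): the enum, the
function name pick_wait_action() and its two parameters are hypothetical and
only capture the "sleep only if congested, otherwise just yield" rule.

```c
#include <assert.h>

enum wait_action_sketch {
    ACTION_YIELD,   /* give up the CPU if needed, but do not sleep */
    ACTION_SLEEP    /* sleep on the congestion queue up to the timeout */
};

/* nr_congested_bdis: number of backing devices currently congested.
 * zone_congested: non-zero if the current zone is seeing significant
 * congestion.  Both are inputs assumed for this sketch. */
static enum wait_action_sketch pick_wait_action(int nr_congested_bdis,
                                                int zone_congested)
{
    assert(nr_congested_bdis >= 0);

    /* No congested BDI, or no congestion in this zone: sleeping for the
     * full timeout would be wasted time, so only yield. */
    if (nr_congested_bdis == 0 || !zone_congested)
        return ACTION_YIELD;

    return ACTION_SLEEP;
}
```

A caller that was only reclaiming clean page cache pages with nothing
congested would get ACTION_YIELD here rather than a full-timeout sleep.
]]>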

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory hotplug: unify is_removable and offline detection code</title>
<updated>2010-10-26T23:52:06Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2010-10-26T21:21:30Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=49ac825587f33afec8841b7fab2eb4db775014e6'/>
<id>urn:sha1:49ac825587f33afec8841b7fab2eb4db775014e6</id>
<content type='text'>
Currently, the sysfs interface of memory hotplug shows whether a section is
removable or not.  But it checks only the migratetype of pages and doesn't
check the details of the cluster of pages.

Next, memory hotplug's set_migratetype_isolate() has the same kind of
check, too.

This patch adds the function __count_unmovable_pages() and makes the above
two checks use the same logic.  Then, is_removable and the hotremove code use
the same logic.  No changes in the hotremove logic itself.

TODO: need to find a way to check RECLAIMABLE. But, considering it a bit,
      calling shrink_slab() against a range before starting memory hotremove
      sounds better. If so, this patch's logic doesn't need to be changed.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Reported-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory hotplug: fix notifier's return value check</title>
<updated>2010-10-26T23:52:06Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2010-10-26T21:21:29Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4b20477f588055fbe87e69435d3c2344d250f0d7'/>
<id>urn:sha1:4b20477f588055fbe87e69435d3c2344d250f0d7</id>
<content type='text'>
Even if the notifier cannot find any pages, it doesn't mean no pages are
available.  And if there are no notifiers registered, this condition will
always be true and memory hotplug will show -EBUSY.

This is a bug but not critical.

In most cases, a pageblock which will be offlined is MIGRATE_MOVABLE.  This
"notifier" is called only when the pageblock is _not_ MIGRATE_MOVABLE.
But if it is not MIGRATE_MOVABLE, it's the common case that memory hotplug
will fail.  So, no one noticed this bug.

Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, page-allocator: do not check the state of a non-existant buddy during free</title>
<updated>2010-10-26T23:52:03Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2010-10-26T21:21:11Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=b7f50cfa3630b6e079929ffccfd442d65064ee1f'/>
<id>urn:sha1:b7f50cfa3630b6e079929ffccfd442d65064ee1f</id>
<content type='text'>
There is a bug in commit 6dda9d55 ("page allocator: reduce fragmentation
in buddy allocator by adding buddies that are merging to the tail of the
free lists") that means a buddy at order MAX_ORDER is checked for merging.
A page of this order never exists so at times, an effectively random
piece of memory is being checked.

Alan Curry has reported that this is causing memory corruption in
userspace data on a PPC32 platform (http://lkml.org/lkml/2010/10/9/32).
It is not clear why this is happening.  It could be a cache coherency
problem where pages mapped in both user and kernel space are getting
different cache lines due to the bad read from kernel space
(http://lkml.org/lkml/2010/10/13/179).  It could also be that there are
some special registers being io-remapped at the end of the memmap array
and that a read has special meaning on them.  Compiler bugs have been
ruled out because the assembly before and after the patch looks relatively
harmless.

This patch fixes the problem by ensuring we are not reading a possibly
invalid location of memory.  It's not clear why the read causes corruption
but one way or the other it is a buggy read.
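
<![CDATA[
One way to picture the kind of guard involved is the sketch below.  It is a
simplified model under stated assumptions, not the patched kernel code:
MAX_ORDER_SKETCH, buddy_index() and buddy_worth_checking() are hypothetical
names, and valid orders are assumed to run from 0 to MAX_ORDER_SKETCH-1.  A
buddy is only consulted when merging with it would yield an order that
exists.

```c
#include <assert.h>

#define MAX_ORDER_SKETCH 11U    /* assumed; valid orders are 0..10 */

/* Index of the buddy of a block: flip the bit for the block size. */
static unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
    assert(order < MAX_ORDER_SKETCH);
    return page_idx ^ (1UL << order);
}

/* A buddy at 'order' is only worth reading if merging with it would
 * produce a block whose order (order + 1) still exists. */
static int buddy_worth_checking(unsigned int order)
{
    return order + 1U < MAX_ORDER_SKETCH;
}
```

In this model, a caller peeking at the buddy one order above the block being
freed would apply the guard to order + 1 rather than to order, which avoids
reading memory for a block that can never be merged.
]]>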

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Corrado Zoccolo &lt;czoccolo@gmail.com&gt;
Reported-by: Alan Curry &lt;pacman@kosh.dhis.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'x86/urgent' into core/memblock</title>
<updated>2010-10-12T00:05:11Z</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@linux.intel.com</email>
</author>
<published>2010-10-12T00:05:11Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=8e4029ee3517084ae00fbfbcb51cc365d8857061'/>
<id>urn:sha1:8e4029ee3517084ae00fbfbcb51cc365d8857061</id>
<content type='text'>
Reason for merge:

Forward-port urgent change to arch/x86/mm/srat_64.c to the memblock tree.

Resolved Conflicts:
	arch/x86/mm/srat_64.c

Originally-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
</entry>
</feed>
