<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/include/asm-generic/tlb.h, branch linux-3.0.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.0.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-3.0.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2013-01-11T17:03:49Z</updated>
<entry>
<title>mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT</title>
<updated>2013-01-11T17:03:49Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2013-01-04T23:35:12Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=c39096f1ae6551a0d43b84cb5dad309f0d392a38'/>
<id>urn:sha1:c39096f1ae6551a0d43b84cb5dad309f0d392a38</id>
<content type='text'>
commit 53a59fc67f97374758e63a9c785891ec62324c81 upstream.

Since commit e303297e6c3a ("mm: extended batches for generic
mmu_gather") we are batching pages to be freed until either
tlb_next_batch cannot allocate a new batch or we are done.

This works just fine most of the time, but we can get into trouble with
a non-preemptible kernel (CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY)
on large machines, where overly aggressive batching might lead to soft
lockups during the process exit path (exit_mmap): there are no scheduling
points down the free_pages_and_swap_cache path, so the freeing can take
long enough to trigger the soft lockup.

The lockup is harmless except when the system is set up to panic on
soft lockup, which is not that unusual.

The simplest way to work around this issue is to limit the maximum
number of batches in a single mmu_gather.  10k collected pages should
be safe enough to prevent soft lockups (that leaves roughly 2ms per
page before the watchdog fires) even if they are all freed without an
explicit scheduling point.

This patch doesn't add any new explicit scheduling points; instead it
relies on zap_pmd_range, which already calls cond_resched once per PMD
while zapping page tables.
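
A minimal sketch of the idea (the field and constant names here are
illustrative; see the actual patch for the exact code): tlb_next_batch
simply refuses to chain another batch once the gather already holds
roughly 10k pages, so the caller has to flush and the per-PMD
cond_resched gets a chance to run.

  /* Cap a single mmu_gather at roughly 10k collected pages. */
  #define MAX_GATHER_BATCH_COUNT  (10000UL / MAX_GATHER_BATCH)

  static int tlb_next_batch(struct mmu_gather *tlb)
  {
          struct mmu_gather_batch *batch = tlb-&gt;active;

          if (batch-&gt;next) {
                  tlb-&gt;active = batch-&gt;next;
                  return 1;
          }

          /* Too many batches already: tell the caller to flush now
           * instead of holding the CPU until the watchdog fires. */
          if (tlb-&gt;batch_count == MAX_GATHER_BATCH_COUNT)
                  return 0;

          batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
          if (!batch)
                  return 0;

          tlb-&gt;batch_count++;
          batch-&gt;next = NULL;
          batch-&gt;nr   = 0;
          batch-&gt;max  = MAX_GATHER_BATCH;

          tlb-&gt;active-&gt;next = batch;
          tlb-&gt;active       = batch;
          return 1;
  }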

The following lockup has been reported for a 3.0 kernel with a huge
process (on the order of hundreds of gigs, but I do not know any more
details).

  BUG: soft lockup - CPU#56 stuck for 22s! [kernel:31053]
  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc mptctl mptbase autofs4 binfmt_misc dm_round_robin dm_multipath bonding cpufreq_conservative cpufreq_userspace cpufreq_powersave pcc_cpufreq mperf microcode fuse loop osst sg sd_mod crc_t10dif st qla2xxx scsi_transport_fc scsi_tgt netxen_nic i7core_edac iTCO_wdt joydev e1000e serio_raw pcspkr edac_core iTCO_vendor_support acpi_power_meter rtc_cmos hpwdt hpilo button container usbhid hid dm_mirror dm_region_hash dm_log linear uhci_hcd ehci_hcd usbcore usb_common scsi_dh_emc scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh dm_snapshot pcnet32 mii edd dm_mod raid1 ext3 mbcache jbd fan thermal processor thermal_sys hwmon cciss scsi_mod
  Supported: Yes
  CPU 56
  Pid: 31053, comm: kernel Not tainted 3.0.31-0.9-default #1 HP ProLiant DL580 G7
  RIP: 0010:  _raw_spin_unlock_irqrestore+0x8/0x10
  RSP: 0018:ffff883ec1037af0  EFLAGS: 00000206
  RAX: 0000000000000e00 RBX: ffffea01a0817e28 RCX: ffff88803ffd9e80
  RDX: 0000000000000200 RSI: 0000000000000206 RDI: 0000000000000206
  RBP: 0000000000000002 R08: 0000000000000001 R09: ffff887ec724a400
  R10: 0000000000000000 R11: dead000000200200 R12: ffffffff8144c26e
  R13: 0000000000000030 R14: 0000000000000297 R15: 000000000000000e
  FS:  00007ed834282700(0000) GS:ffff88c03f200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 000000000068b240 CR3: 0000003ec13c5000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kernel (pid: 31053, threadinfo ffff883ec1036000, task ffff883ebd5d4100)
  Call Trace:
    release_pages+0xc5/0x260
    free_pages_and_swap_cache+0x9d/0xc0
    tlb_flush_mmu+0x5c/0x80
    tlb_finish_mmu+0xe/0x50
    exit_mmap+0xbd/0x120
    mmput+0x49/0x120
    exit_mm+0x122/0x160
    do_exit+0x17a/0x430
    do_group_exit+0x3d/0xb0
    get_signal_to_deliver+0x247/0x480
    do_signal+0x71/0x1b0
    do_notify_resume+0x98/0xb0
    int_signal+0x12/0x17
  DWARF2 unwinder stuck at int_signal+0x12/0x17

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: uninline large generic tlb.h functions</title>
<updated>2011-05-25T15:39:20Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:14Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9547d01bfb9c351dc19067f8a4cea9d3955f4125'/>
<id>urn:sha1:9547d01bfb9c351dc19067f8a4cea9d3955f4125</id>
<content type='text'>
Some of these functions have grown beyond inline sanity; move them
out of line.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Requested-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: extended batches for generic mmu_gather</title>
<updated>2011-05-25T15:39:16Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e303297e6c3a7b847c4731eb14006ca6b435ecca'/>
<id>urn:sha1:e303297e6c3a7b847c4731eb14006ca6b435ecca</id>
<content type='text'>
Instead of using a single batch (the small on-stack one, or an
allocated page), try to extend the batch every time it runs out, and
only flush once either the extension fails or we're done.
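
A rough sketch of how the extended batching behaves when a page is
queued (simplified; the names follow the generic mmu_gather of this
series but the details are illustrative):

  /* Queue a page for deferred freeing; a zero return means the
   * batch chain could not be extended and the caller should
   * flush now. */
  static inline int __tlb_remove_page(struct mmu_gather *tlb,
                                      struct page *page)
  {
          struct mmu_gather_batch *batch = tlb-&gt;active;

          batch-&gt;pages[batch-&gt;nr++] = page;
          if (batch-&gt;nr == batch-&gt;max)
                  return tlb_next_batch(tlb);     /* 0 means: flush now */
          return 1;                               /* keep batching */
  }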

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, powerpc: move the RCU page-table freeing into generic code</title>
<updated>2011-05-25T15:39:16Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:12:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=267239116987d64850ad2037d8e0f3071dc3b5ce'/>
<id>urn:sha1:267239116987d64850ad2037d8e0f3071dc3b5ce</id>
<content type='text'>
In case other architectures require RCU-freed page tables to implement
gup_fast(), software-filled hashes, and similar things, provide the
means to do so by moving the logic into generic code.
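
A hedged sketch of the shape of the generic helper this provides
(structure and function names are indicative only): page-table pages
are collected into a batch and the actual freeing is deferred to an
RCU callback, so lockless walkers such as gup_fast() can still
dereference them until a grace period has passed.

  /* Illustrative outline of RCU-deferred page-table freeing. */
  struct mmu_table_batch {
          struct rcu_head rcu;
          unsigned int    nr;
          void            *tables[0];     /* fills the rest of the page */
  };

  static void tlb_remove_table_rcu(struct rcu_head *head)
  {
          struct mmu_table_batch *batch =
                  container_of(head, struct mmu_table_batch, rcu);
          unsigned int i;

          for (i = 0; i &lt; batch-&gt;nr; i++)
                  __tlb_remove_table(batch-&gt;tables[i]);   /* arch hook */

          free_page((unsigned long)batch);
  }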

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Requested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: mmu_gather rework</title>
<updated>2011-05-25T15:39:12Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-05-25T00:11:45Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=d16dfc550f5326a4000f3322582a7c05dec91d7a'/>
<id>urn:sha1:d16dfc550f5326a4000f3322582a7c05dec91d7a</id>
<content type='text'>
Rework the existing mmu_gather infrastructure.

The direct purpose of these patches was to allow preemptible mmu_gather,
but even without that I think these patches provide an improvement to the
status quo.

The first 9 patches rework the mmu_gather infrastructure.  For review
purposes I've split them into generic and per-arch patches, with the
last of those being a generic cleanup.

The next patch provides generic RCU page-table freeing, and the
follow-up is a patch converting s390 to use this.  I've also got 4
patches from DaveM lined up (not included in this series) that use
this to implement gup_fast() for sparc64.

Then there is one patch that extends the generic mmu_gather batching.

After that follow the mm preemptibility patches; these make part of
the mm a lot more preemptible.  They convert i_mmap_lock and
anon_vma-&gt;lock to mutexes, which, together with the mmu_gather
rework, makes mmu_gather preemptible as well.

Making i_mmap_lock a mutex also enables a clean-up of the truncate code.

This also allows for preemptible mmu_notifiers, something that I
think XPMEM wants.

Furthermore, it removes the new and universally detested unmap_mutex.

This patch:

Remove the first obstacle towards a fully preemptible mmu_gather.

The current scheme assumes mmu_gather is always done with preemption
disabled and uses per-cpu storage for the page batches.  Change this
to try to allocate a page for batching and, in case of failure, use a
small on-stack array to make some progress.
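
Roughly, and with illustrative field names, the gather moves from
per-cpu storage to the caller's stack, carrying a tiny embedded
batch that is used when allocating a full batch page fails:

  #define MMU_GATHER_BUNDLE 8     /* fallback capacity, in pages */

  struct mmu_gather_batch {
          struct mmu_gather_batch *next;
          unsigned int            nr;
          unsigned int            max;
          struct page             *pages[0];
  };

  struct mmu_gather {
          struct mm_struct        *mm;
          struct mmu_gather_batch *active;
          struct mmu_gather_batch local;  /* on-stack fallback batch */
          struct page             *__pages[MMU_GATHER_BUNDLE];
  };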

Preemptible mmu_gather is desired in general and usable once i_mmap_lock
becomes a mutex.  Doing it before the mutex conversion saves us from
having to rework the code by moving the mmu_gather bits inside the
pte_lock.

Also avoid flushing the tlb batches from under the pte lock; this is
useful even without the i_mmap_lock conversion, as it significantly
reduces pte lock hold times.

[akpm@linux-foundation.org: fix comment typo]
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()</title>
<updated>2009-07-27T19:10:38Z</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2009-07-22T05:44:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=9e1b32caa525cb236e80e9c671e179bcecccc657'/>
<id>urn:sha1:9e1b32caa525cb236e80e9c671e179bcecccc657</id>
<content type='text'>
Upcoming patches to support the new 64-bit "BookE" powerpc architecture
will need to have the virtual address corresponding to a PTE page when
freeing it, due to the way the HW table walker works.

Basically, the TLB can be loaded with "large" pages that cover the whole
virtual space (well, sort of; half of it, actually) represented by a PTE
page, and which contain an "indirect" bit indicating that this TLB entry
RPN points to an array of PTEs from which the TLB can then create direct
entries.  Thus, in order to invalidate those when PTE pages are deleted,
we need the virtual address to pass to tlbilx or tlbivax instructions.

The old trick of sticking it somewhere in the PTE page's struct page
sucks too much; the address is readily available at almost all call
sites, and almost everybody implements these as macros, so we may as
well add the argument everywhere.  I added it to the pmd and pud
variants for consistency.
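
For illustration, the interface change amounts to threading one more
argument through (the argument name here is just for clarity):

  /* Before: the freeing hook only saw the page-table page. */
  pte_free_tlb(tlb, ptep);

  /* After: the virtual address covered by that PTE page comes along,
   * so a BookE-style MMU can invalidate the indirect entry with
   * tlbilx/tlbivax; pmd/pud variants gain the argument as well. */
  pte_free_tlb(tlb, ptep, address);
  pmd_free_tlb(tlb, pmdp, address);
  pud_free_tlb(tlb, pudp, address);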

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt; [MN10300 &amp; FRV]
Acked-by: Nick Piggin &lt;npiggin@suse.de&gt;
Acked-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt; [s390]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>asm-generic/tlb.h: remove &lt;linux/quicklist.h&gt;</title>
<updated>2008-02-04T15:48:00Z</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@zytor.com</email>
</author>
<published>2008-02-04T15:48:00Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=e1adbcf10608c83de6a81a02ebce859611433b52'/>
<id>urn:sha1:e1adbcf10608c83de6a81a02ebce859611433b52</id>
<content type='text'>
Remove unused &lt;linux/quicklist.h&gt; from &lt;asm-generic/tlb.h&gt;; per
Christoph Lameter this should have been part of a previous patch
reversal but apparently didn't get removed.

Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>asm-generic/tlb.h: build fix</title>
<updated>2008-01-31T21:05:48Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-01-31T21:05:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=62152d0ea7012382cd814c7b361b4ef2029f26e6'/>
<id>urn:sha1:62152d0ea7012382cd814c7b361b4ef2029f26e6</id>
<content type='text'>
Bring the avr32, blackfin, sh and sparc architectures back into working
order by reverting the effects of this change that came in via the x86
tree:

   commit a5a19c63f4e55e32dc0bc3d936d7f94793d8b380
   Author: Jeremy Fitzhardinge &lt;jeremy@goop.org&gt;
   Date:   Wed Jan 30 13:33:39 2008 +0100

       x86: demacro asm-x86/pgalloc_32.h

Sorry about that!

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>x86: demacro asm-x86/pgalloc_32.h</title>
<updated>2008-01-30T12:33:39Z</updated>
<author>
<name>Jeremy Fitzhardinge</name>
<email>jeremy@goop.org</email>
</author>
<published>2008-01-30T12:33:39Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a5a19c63f4e55e32dc0bc3d936d7f94793d8b380'/>
<id>urn:sha1:a5a19c63f4e55e32dc0bc3d936d7f94793d8b380</id>
<content type='text'>
Convert macros into inline functions, for better type-checking.

This patch required a little bit of fiddling with headers in order to
make __(pte|pmd)_free_tlb inline functions rather than macros.
asm-generic/tlb.h includes asm/pgalloc.h, though it doesn't directly
use any pgalloc definitions.  I removed this include to avoid an
include cycle, but it may cause secondary compile failures in things
that depend on the indirect inclusion; arch/x86/mm/hugetlbpage.c was
one such place; there may be others.
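
As a hedged illustration of the macro-to-inline conversion (the
concrete functions touched by the patch differ; this only shows the
shape of the change):

  /* Before: a macro, with no type checking of its arguments. */
  #define pmd_populate_kernel(mm, pmd, pte) \
          set_pmd(pmd, __pmd(_PAGE_TABLE + __pa(pte)))

  /* After: an inline function, so the compiler checks the types. */
  static inline void pmd_populate_kernel(struct mm_struct *mm,
                                         pmd_t *pmd, pte_t *pte)
  {
          set_pmd(pmd, __pmd(_PAGE_TABLE + __pa(pte)));
  }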

Signed-off-by: Jeremy Fitzhardinge &lt;jeremy@xensource.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>Revert quicklist need-&gt;flush fix</title>
<updated>2007-12-27T06:04:09Z</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2007-12-26T20:43:01Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=49eaaa1a6c950e7a92c4386c199b8ec950f840b9'/>
<id>urn:sha1:49eaaa1a6c950e7a92c4386c199b8ec950f840b9</id>
<content type='text'>
This did not fix the reported issue.  Apart from other weirdness, it
creates a bad link between the TLB flushing logic and the quicklists.
If an arch really does need a TLB flush before a page-table page is
freed, then the arch code needs to set tlb-&gt;need_flush before
calling quicklist_free.
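
A hedged sketch of where that responsibility would then live (a
hypothetical arch-side helper, not code from this revert):

  /* Hypothetical arch hook: force the TLB flush before the
   * page-table page is recycled through the quicklist. */
  static inline void __pte_free_tlb(struct mmu_gather *tlb, pte_t *pte)
  {
          tlb-&gt;need_flush = 1;
          quicklist_free(0, NULL, pte);
  }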

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
