<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/arch/powerpc/platforms, branch linux-4.3.y</title>
<subtitle>Hosts the 0x221E linux distro kernel.</subtitle>
<id>https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.3.y</id>
<link rel='self' href='https://universe.0xinfinity.dev/distro/kernel/atom?h=linux-4.3.y'/>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/'/>
<updated>2016-01-31T19:25:53Z</updated>
<entry>
<title>powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type</title>
<updated>2016-01-31T19:25:53Z</updated>
<author>
<name>Stewart Smith</name>
<email>stewart@linux.vnet.ibm.com</email>
</author>
<published>2015-12-11T01:08:23Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=17aa8ac53418d99ac6394ae47a59fc573bb1972e'/>
<id>urn:sha1:17aa8ac53418d99ac6394ae47a59fc573bb1972e</id>
<content type='text'>
commit 98da62b716a3b24ab8e77453c9a8a954124c18cd upstream.

When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.
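
As a rough illustration of the once-only behaviour described above, here is
a small Python model. It is not the kernel code: warn_once() is a
hypothetical stand-in for a single pr_warn_once() call site, and the
message type numbers are made up for the example.

```python
# Toy model of a once-only warning; all names and type values are hypothetical.
_already_warned = False
log = []


def warn_once(message):
    """Record the message only the first time this call site is reached."""
    global _already_warned
    if not _already_warned:
        _already_warned = True
        log.append(message)


KNOWN_TYPES = {0, 1, 2}  # assumed set of message types this kernel understands


def handle_message(msg_type):
    """Return True if the message type is understood, else warn (once)."""
    if msg_type not in KNOWN_TYPES:
        warn_once("unsupported message type %d" % msg_type)
        return False
    return True


# Simulate firmware sending the same unknown type many times.
for _ in range(10000):
    handle_message(7)
```

In this model the flag belongs to the call site rather than to the message
type, so a different unknown type arriving later is suppressed as well.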

Signed-off-by: Stewart Smith &lt;stewart@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"</title>
<updated>2016-01-31T19:25:53Z</updated>
<author>
<name>Alistair Popple</name>
<email>alistair@popple.id.au</email>
</author>
<published>2015-12-18T06:16:17Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=a6699429a8d3829f112ce9a228f47a75227bf422'/>
<id>urn:sha1:a6699429a8d3829f112ce9a228f47a75227bf422</id>
<content type='text'>
commit 036592fbbe753d236402a0ae68148e7c143a0f0e upstream.

Commit 25642e1459ac ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -&gt; opal_handle_events()
     -&gt; generic_handle_irq()
        -&gt; handle_level_irq()
           -&gt; raw_spin_lock(&amp;desc-&gt;lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -&gt; opal_event_unmask()
                 -&gt; opal_handle_events()
                    -&gt; generic_handle_irq()
                       -&gt; handle_level_irq()
                          -&gt; raw_spin_lock(&amp;desc-&gt;lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.
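
The idea of deferring instead of re-entering can be sketched with a small
Python model. This is only an illustration under stated assumptions:
desc_lock, work_queue, the event names and the function names are all
hypothetical stand-ins, not the kernel implementation.

```python
import threading
from collections import deque

desc_lock = threading.Lock()             # stands in for the non-recursive desc lock
work_queue = deque()                     # stands in for deferred (queued) work
handled = []
pending = deque(["event-a", "event-b"])  # hypothetical pending events


def handle_events():
    """Dispatch one pending event; must not be entered with desc_lock held."""
    if not pending:
        return
    event = pending.popleft()
    # A blocking acquire would hang forever on re-entry (the BOOM above),
    # so the model uses a non-blocking acquire to make that failure visible.
    if not desc_lock.acquire(blocking=False):
        raise RuntimeError("re-entered handle_events with desc_lock held")
    try:
        handled.append(event)
        unmask()
    finally:
        desc_lock.release()


def unmask():
    """Runs with desc_lock held: queue the work rather than handling it now."""
    if pending:
        work_queue.append(handle_events)


def run_deferred_work():
    """Runs later, outside the lock, like a worker draining queued work."""
    while work_queue:
        work_queue.popleft()()


handle_events()
run_deferred_work()
```

If unmask() called handle_events() directly, the non-blocking acquire in
this model would fail and raise, which corresponds to the second
raw_spin_lock() in the call sequence above.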

Fixes: 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple &lt;alistair@popple.id.au&gt;
Reported-by: Andrew Donnellan &lt;andrew.donnellan@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>powerpc/opal-irqchip: Fix double endian conversion</title>
<updated>2016-01-31T19:25:52Z</updated>
<author>
<name>Alistair Popple</name>
<email>alistair@popple.id.au</email>
</author>
<published>2015-12-07T00:28:28Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=0f8e324f8bb96da97ca98e02c7bfdefcc02a2149'/>
<id>urn:sha1:0f8e324f8bb96da97ca98e02c7bfdefcc02a2149</id>
<content type='text'>
commit 25642e1459ace29f6ce5a171efc8b7b59a52a2d4 upstream.

The OPAL event calls return a mask of events that are active in big
endian format. This is checked when unmasking the events in the
irqchip by comparison with a cached value. The cached value was stored
in big endian format but should've been converted to CPU endian
first.
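
A minimal Python sketch of why the unconverted cached value misbehaves,
assuming a little-endian CPU and a made-up 64-bit mask value (neither the
value nor the variable names come from the kernel source):

```python
# Toy model: firmware hands back an event mask as 8 big-endian bytes.
raw_from_firmware = (0x20).to_bytes(8, "big")  # hypothetical mask value

# Correct: convert from big endian to a native integer before caching it.
cached_converted = int.from_bytes(raw_from_firmware, "big")

# Buggy on a little-endian CPU: the raw bytes are read as if they were native.
cached_unconverted = int.from_bytes(raw_from_firmware, "little")

# The same set of events, already in CPU byte order, for comparison.
currently_active = 0x20

matches_when_converted = cached_converted == currently_active
matches_when_unconverted = cached_unconverted == currently_active
```

On a big-endian CPU both interpretations agree, which is consistent with
the bug only showing up on some systems.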

This bug leads to OPAL event delivery being delayed or dropped on some
systems. Symptoms may include a non-functional console.

The bug is fixed by calling opal_handle_events(...) instead of
duplicating code in opal_event_unmask(...).

Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events")
Reported-by: Douglas L Lehr &lt;dllehr@us.ibm.com&gt;
Signed-off-by: Alistair Popple &lt;alistair@popple.id.au&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>powerpc/powernv: Handle irq_happened flag correctly in off-line loop</title>
<updated>2015-10-21T09:52:49Z</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2015-10-21T05:06:24Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=53c656c4138511c2ba54df413dc29976cfa9f084'/>
<id>urn:sha1:53c656c4138511c2ba54df413dc29976cfa9f084</id>
<content type='text'>
This fixes a bug where it is possible for an off-line CPU to fail to go
into a low-power state (nap/sleep/winkle), and to become unresponsive to
requests from the KVM subsystem to wake up and run a VCPU. What can
happen is that a maskable interrupt of some kind (external, decrementer,
hypervisor doorbell, or HMI) arrives after we have called
local_irq_disable() at the beginning of pnv_smp_cpu_kill_self() and before
interrupts are hard-disabled inside power7_nap/sleep/winkle(). In this
situation, the
pending event is marked in the irq_happened flag in the PACA. This
pending event prevents power7_nap/sleep/winkle from going to the
requested low-power state; instead they return immediately. We don't
deal with any of these pending event flags in the off-line loop in
pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
so we will have srr1 == 0, and none of the processing to clear
interrupts or doorbells will be done.

Usually, the most obvious symptom of this is that a KVM guest will fail
with a console message saying "KVM: couldn't grab cpu N".

This fixes the problem by making sure we handle the irq_happened flags
properly. First, we hard-disable before the off-line loop. Once we have
hard-disabled, the irq_happened flags can't change underneath us. We
unconditionally clear the DEC and HMI flags: there is no processing of
timer interrupts while off-line, and the necessary HMI processing is all
done in lower-level code. We leave the EE and DBELL flags alone for the
first iteration of the loop, so that we won't fail to respond to a
split-core request that came in just before hard-disabling. Within the
loop, we handle external interrupts if the EE bit is set in irq_happened
as well as if the low-power state was interrupted by an external
interrupt. (We don't need to do the msgclr for a pending doorbell in
irq_happened, because doorbells are edge-triggered and don't remain
pending in hardware.) Then we clear both the EE and DBELL flags, and
once clear, they cannot be set again (until this CPU comes online again,
that is).

This also fixes the debug check to not be done when we just ran a KVM
guest or when the sleep didn't happen because of a pending event in
irq_happened.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/powernv: Panic on unhandled Machine Check</title>
<updated>2015-10-08T21:07:19Z</updated>
<author>
<name>Daniel Axtens</name>
<email>dja@axtens.net</email>
</author>
<published>2015-09-23T06:41:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f2dd80ecca5f06b46134f2bd811f046c503c756c'/>
<id>urn:sha1:f2dd80ecca5f06b46134f2bd811f046c503c756c</id>
<content type='text'>
All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.

Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.

Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective to Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised

In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.

We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)

This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.

There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.

Currently, these unrecovered UEs fall through to
machine_check_exception() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.

Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.

Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.

The second reason is that since e784b6499d9c ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.

The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().

Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.

This fixes some hangs following EEH events on cxlflash setups.

Signed-off-by: Daniel Axtens &lt;dja@axtens.net&gt;
Reviewed-by: Andrew Donnellan &lt;andrew.donnellan@au1.ibm.com&gt;
Reviewed-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/ps3: Remove unused os_area_db_id_video_mode</title>
<updated>2015-09-30T00:33:02Z</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2015-09-25T02:14:48Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=4fa9a3f6b616fd9f2555d9d0c04513a942750986'/>
<id>urn:sha1:4fa9a3f6b616fd9f2555d9d0c04513a942750986</id>
<content type='text'>
This struct is unused, which is now a build error with gcc 6:

  error: 'os_area_db_id_video_mode' defined but not used

There doesn't seem to be any good reason to keep it around, so remove it;
it's in the history if anyone needs it.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-09-18T15:11:42Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-18T15:11:42Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=fadb97b089563da69ba326f9fea6399d071462b2'/>
<id>urn:sha1:fadb97b089563da69ba326f9fea6399d071462b2</id>
<content type='text'>
Pull irq updates from Thomas Gleixner:
 "This is a rather large update post rc1 due to the final steps of
  cleanups and API changes which had to wait for the preparatory patches
  to hit your tree.

   - Regression fixes for ARM GIC irqchips

   - Regression fixes and lockdep annotations for renesas irq chips

   - The leftovers of the cleanup and preparatory patches which have
     been ignored by maintainers

   - Final conversions of the newly merged users of obsolete APIs

   - Final removal of obsolete APIs

   - Final removal of ARM artifacts which had been introduced during the
     conversion of ARM to the generic interrupt code.

   - Final split of the irq_data into chip specific and common data to
     reflect the needs of hierarchical irq domains.

   - Treewide removal of the first argument of interrupt flow handlers,
     i.e. the irq number, which is not used by the majority of handlers
     and is simple to retrieve from the other argument, the irq descriptor.

   - A few comment updates and build warning fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  arm64: Remove ununsed set_irq_flags
  ARM: Remove ununsed set_irq_flags
  sh: Kill off set_irq_flags usage
  irqchip: Kill off set_irq_flags usage
  gpu/drm: Kill off set_irq_flags usage
  genirq: Remove irq argument from irq flow handlers
  genirq: Move field 'msi_desc' from irq_data into irq_common_data
  genirq: Move field 'affinity' from irq_data into irq_common_data
  genirq: Move field 'handler_data' from irq_data into irq_common_data
  genirq: Move field 'node' from irq_data into irq_common_data
  irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
  irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
  genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
  genirq: Simplify irq_data_to_desc()
  genirq: Remove __irq_set_handler_locked()
  pinctrl/pistachio: Use irq_set_handler_locked
  gpio: vf610: Use irq_set_handler_locked
  powerpc/mpc8xx: Use irq_set_handler_locked()
  powerpc/ipic: Use irq_set_handler_locked()
  powerpc/cpm2: Use irq_set_handler_locked()
  ...
</content>
</entry>
<entry>
<title>Merge tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux</title>
<updated>2015-09-18T15:01:06Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-18T15:01:06Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=f240bdd2a5b7d523ecced64c855ee2c1499f2854'/>
<id>urn:sha1:f240bdd2a5b7d523ecced64c855ee2c1499f2854</id>
<content type='text'>
Pull powerpc fixes from Michael Ellerman:

 - Fix 32-bit TCE table init in kdump kernel from Nish

 - Fix kdump with non-power-of-2 crashkernel= from Nish

 - Abort cxl_pci_enable_device_hook() if PCI channel is offline from
   Andrew

 - Fix to release DRC when configure_connector() fails from Bharata

 - Wire up sys_userfaultfd()

 - Fix race condition in tearing down MSI interrupts from Paul

 - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel

 - Fix cxl build failure due to -Wunused-variable gcc behaviour change
   from Ian

 - Tell the toolchain to use ABI v2 when building an LE boot wrapper
   from Benh

 - Fix THP to recompute hash value after a failed update from Aneesh

 - 32-bit memcpy/memset: only use dcbz once cache is enabled from
   Christophe

* tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc32: memset: only use dcbz once cache is enabled
  powerpc32: memcpy: only use dcbz once cache is enabled
  powerpc/mm: Recompute hash value after a failed update
  powerpc/boot: Specify ABI v2 when building an LE boot wrapper
  cxl: Fix build failure due to -Wunused-variable behaviour change
  cxl: Fix unbalanced pci_dev_get in cxl_probe
  powerpc/MSI: Fix race condition in tearing down MSI interrupts
  powerpc: Wire up sys_userfaultfd()
  powerpc/pseries: Release DRC when configure_connector fails
  cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline
  powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=
  powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel
</content>
</entry>
<entry>
<title>genirq: Remove irq argument from irq flow handlers</title>
<updated>2015-09-16T13:47:51Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-09-14T08:42:37Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=bd0b9ac405e1794d72533c3d487aa65b6b955a0c'/>
<id>urn:sha1:bd0b9ac405e1794d72533c3d487aa65b6b955a0c</id>
<content type='text'>
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.
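
The shape of the change can be modelled in a few lines of Python. This is
a sketch only: IrqDesc and both handler names are hypothetical, and the
real handlers are C functions taking a struct irq_desc pointer.

```python
class IrqDesc:
    """Hypothetical stand-in for an irq descriptor that knows its own number."""

    def __init__(self, irq):
        self.irq = irq


def old_flow_handler(irq, desc):
    """Before: the irq number is passed in, although desc already carries it."""
    return irq


def new_flow_handler(desc):
    """After: the few handlers that need the number read it from the descriptor."""
    return desc.irq


desc = IrqDesc(42)
old_result = old_flow_handler(desc.irq, desc)
new_result = new_flow_handler(desc)
```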

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Julia Lawall &lt;Julia.Lawall@lip6.fr&gt;
Cc: Jiang Liu &lt;jiang.liu@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>powerpc/mpc52xx: Use irq_set_handler_locked()</title>
<updated>2015-09-16T13:43:10Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-06-23T13:52:31Z</published>
<link rel='alternate' type='text/html' href='https://universe.0xinfinity.dev/distro/kernel/commit/?id=6b83bd941479e99f6890ed4b5c54cb1f43c7146f'/>
<id>urn:sha1:6b83bd941479e99f6890ed4b5c54cb1f43c7146f</id>
<content type='text'>
Use irq_set_handler_locked() as it avoids a redundant lookup of the
irq descriptor.

Search and replacement was done with coccinelle:

@@
struct irq_data *d;
expression E1;
@@

-__irq_set_handler_locked(d-&gt;irq, E1);
+irq_set_handler_locked(d, E1);

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Jiang Liu &lt;jiang.liu@linux.intel.com&gt;
Cc: Julia Lawall &lt;julia.lawall@lip6.fr&gt;
Cc: Anatolij Gustschin &lt;agust@denx.de&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: linuxppc-dev@lists.ozlabs.org
</content>
</entry>
</feed>
