summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2016-01-31powerpc/module: Handle R_PPC64_ENTRY relocationsUlrich Weigand
commit a61674bdfc7c2bf909c4010699607b62b69b7bec upstream. GCC 6 will include changes to generated code with -mcmodel=large, which is used to build kernel modules on powerpc64le. This was necessary because the large model is supposed to allow arbitrary sizes and locations of the code and data sections, but the ELFv2 global entry point prolog still made the unconditional assumption that the TOC associated with any particular function can be found within 2 GB of the function entry point: func: addis r2,r12,(.TOC.-func)@ha addi r2,r2,(.TOC.-func)@l .localentry func, .-func To remove this assumption, GCC will now generate instead this global entry point prolog sequence when using -mcmodel=large: .quad .TOC.-func func: .reloc ., R_PPC64_ENTRY ld r2, -8(r12) add r2, r2, r12 .localentry func, .-func The new .reloc triggers an optimization in the linker that will replace this new prolog with the original code (see above) if the linker determines that the distance between .TOC. and func is in range after all. Since this new relocation is now present in module object files, the kernel module loader is required to handle them too. This patch adds support for the new relocation and implements the same optimization done by the GNU linker. Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc: Make {cmp}xchg* and their atomic_ versions fully orderedBoqun Feng
commit 81d7a3294de7e9828310bbf986a67246b13fa01e upstream. According to memory-barriers.txt, xchg*, cmpxchg* and their atomic_ versions all need to be fully ordered, however they are now just RELEASE+ACQUIRE, which are not fully ordered. So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") This patch depends on patch "powerpc: Make value-returning atomics fully ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc: Make value-returning atomics fully orderedBoqun Feng
commit 49e9cf3f0c04bf76ffa59242254110309554861d upstream. According to memory-barriers.txt: > Any atomic operation that modifies some state in memory and returns > information about the state (old or new) implies an SMP-conditional > general memory barrier (smp_mb()) on each side of the actual > operation ... Which mean these operations should be fully ordered. However on PPC, PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation, which is currently "lwsync" if SMP=y. The leading "lwsync" can not guarantee fully ordered atomics, according to Paul Mckenney: https://lkml.org/lkml/2015/10/14/970 To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee the fully-ordered semantics. This also makes futex atomics fully ordered, which can avoid possible memory ordering problems if userspace code relies on futex system call for fully ordered semantics. Fixes: b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics") Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/powernv: pr_warn_once on unsupported OPAL_MSG typeStewart Smith
commit 98da62b716a3b24ab8e77453c9a8a954124c18cd upstream. When running on newer OPAL firmware that supports sending extra OPAL_MSG types, we would print a warning on *every* message received. This could be a problem for kernels that don't support OPAL_MSG_OCC on machines that are running real close to thermal limits and the OCC is throttling the chip. For a kernel that is paying attention to the message queue, we could get these notifications quite often. Conceivably, future message types could also come fairly often, and printing that we didn't understand them 10,000 times provides no further information than printing them once. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"Alistair Popple
commit 036592fbbe753d236402a0ae68148e7c143a0f0e upstream. Commit 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion") fixed an endian bug by calling opal_handle_events() in opal_event_unmask(). However this introduced a deadlock if we find an event is active during unmasking and call opal_handle_events() again. The bad call sequence is: opal_interrupt() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) handle_irq_event(desc) unmask_irq(desc) -> opal_event_unmask() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) (BOOM) When generating multiple opal events in quick succession this would lead to the following stall warnings: EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32 INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065 (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602) NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696] INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371 (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290) This patch corrects the problem by queuing the work if an event is active during unmasking, which is similar to the pre-endian fix behaviour. Fixes: 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion") Signed-off-by: Alistair Popple <alistair@popple.id.au> Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/opal-irqchip: Fix double endian conversionAlistair Popple
commit 25642e1459ace29f6ce5a171efc8b7b59a52a2d4 upstream. The OPAL event calls return a mask of events that are active in big endian format. This is checked when unmasking the events in the irqchip by comparison with a cached value. The cached value was stored in big endian format but should've been converted to CPU endian first. This bug leads to OPAL event delivery being delayed or dropped on some systems. Symptoms may include a non-functional console. The bug is fixed by calling opal_handle_events(...) instead of duplicating code in opal_event_unmask(...). Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events") Reported-by: Douglas L Lehr <dllehr@us.ibm.com> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/tm: Check for already reclaimed tasksMichael Neuling
commit 7f821fc9c77a9b01fe7b1d6e72717b33d8d64142 upstream. Currently we can hit a scenario where we'll tm_reclaim() twice. This results in a TM bad thing exception because the second reclaim occurs when not in suspend mode. The scenario in which this can happen is the following. We attempt to deliver a signal to userspace. To do this we need obtain the stack pointer to write the signal context. To get this stack pointer we must tm_reclaim() in case we need to use the checkpointed stack pointer (see get_tm_stackpointer()). Normally we'd then return directly to userspace to deliver the signal without going through __switch_to(). Unfortunatley, if at this point we get an error (such as a bad userspace stack pointer), we need to exit the process. The exit will result in a __switch_to(). __switch_to() will attempt to save the process state which results in another tm_reclaim(). This tm_reclaim() now causes a TM Bad Thing exception as this state has already been saved and the processor is no longer in TM suspend mode. Whee! This patch checks the state of the MSR to ensure we are TM suspended before we attempt the tm_reclaim(). If we've already saved the state away, we should no longer be in TM suspend mode. This has the additional advantage of checking for a potential TM Bad Thing exception. Found using syscall fuzzer. Fixes: fb09692e71f1 ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31powerpc/tm: Block signal return setting invalid MSR stateMichael Neuling
commit d2b9d2a5ad5ef04ff978c9923d19730cb05efd55 upstream. Currently we allow both the MSR T and S bits to be set by userspace on a signal return. Unfortunately this is a reserved configuration and will cause a TM Bad Thing exception if attempted (via rfid). This patch checks for this case in both the 32 and 64 bit signals code. If both T and S are set, we mark the context as invalid. Found using a syscall fuzzer. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31net: filter: make JITs zero A for SKF_AD_ALU_XOR_XRabin Vincent
[ Upstream commit 55795ef5469290f89f04e12e662ded604909e462 ] The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data instructions since it XORs A with X while all the others replace A with some loaded value. All the BPF JITs fail to clear A if this is used as the first instruction in a filter. This was found using american fuzzy lop. Add a helper to determine if A needs to be cleared given the first instruction in a filter, and use this in the JITs. Except for ARM, the rest have only been compile-tested. Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum") Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSRPaul Mackerras
commit c20875a3e638e4a03e099b343ec798edd1af5cc6 upstream. Currently it is possible for userspace (e.g. QEMU) to set a value for the MSR for a guest VCPU which has both of the TS bits set, which is an illegal combination. The result of this is that when we execute a hrfid (hypervisor return from interrupt doubleword) instruction to enter the guest, the CPU will take a TM Bad Thing type of program interrupt (vector 0x700). Now, if PR KVM is configured in the kernel along with HV KVM, we actually handle this without crashing the host or giving hypervisor privilege to the guest; instead what happens is that we deliver a program interrupt to the guest, with SRR0 reflecting the address of the hrfid instruction and SRR1 containing the MSR value at that point. If PR KVM is not configured in the kernel, then we try to run the host's program interrupt handler with the MMU set to the guest context, which almost certainly causes a host crash. This closes the hole by making kvmppc_set_msr_hv() check for the illegal combination and force the TS field to a safe value (00, meaning non-transactional). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-01-31KVM: PPC: Book3S HV: Don't dynamically split core when already splitPaul Mackerras
commit f74f2e2e26199f695ca3df94f29e9ab7cb707ea4 upstream. In static micro-threading modes, the dynamic micro-threading code is supposed to be disabled, because subcores can't make independent decisions about what micro-threading mode to put the core in - there is only one micro-threading mode for the whole core. The code that implements dynamic micro-threading checks for this, except that the check was missed in one case. This means that it is possible for a subcore in static 2-way micro-threading mode to try to put the core into 4-way micro-threading mode, which usually leads to stuck CPUs, spinlock lockups, and other stalls in the host. The problem was in the can_split_piggybacked_subcores() function, which should always return false if the system is in a static micro-threading mode. This fixes the problem by making can_split_piggybacked_subcores() use subcore_config_ok() for its checks, as subcore_config_ok() includes the necessary check for the static micro-threading modes. Credit to Gautham Shenoy for working out that the reason for the hangs and stalls we were seeing was that we were trying to do dynamic 4-way micro-threading while we were in static 2-way mode. Fixes: b4deba5c41e9 Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-28Merge tag 'powerpc-4.3-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben * tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/dma: dma_set_coherent_mask() should not be GPL only
2015-10-28powerpc/dma: dma_set_coherent_mask() should not be GPL onlyBenjamin Herrenschmidt
When turning this from inline to an exported function I was a bit over-eager and made it GPL only. This prevents the use of pretty much all non-GPL PCI driver which is a bit over the top. Let's bring it back in line with other architecture. Fixes: 817820b0226a ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-23Merge tag 'powerpc-4.3-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" from Paul - Handle irq_happened flag correctly in off-line loop from Paul - Validate rtas.entry before calling enter_rtas() from Vasant * tag 'powerpc-4.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/rtas: Validate rtas.entry before calling enter_rtas() powerpc/powernv: Handle irq_happened flag correctly in off-line loop powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8"
2015-10-22powerpc/rtas: Validate rtas.entry before calling enter_rtas()Vasant Hegde
Currently we do not validate rtas.entry before calling enter_rtas(). This leads to a kernel oops when user space calls rtas system call on a powernv platform (see below). This patch adds code to validate rtas.entry before making enter_rtas() call. Oops: Exception in kernel mode, sig: 4 [#1] SMP NR_CPUS=1024 NUMA PowerNV task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000 NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140 REGS: c0000007e1a7b920 TRAP: 0e40 Not tainted (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le) MSR: 1000000000081000 <HV,ME> CR: 00000000 XER: 00000000 CFAR: c000000000009c0c SOFTE: 0 NIP [0000000000000000] (null) LR [0000000000009c14] 0x9c14 Call Trace: [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable) [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0 [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org # v3.2+ Fixes: 55190f88789a ("powerpc: Add skeleton PowerNV platform") Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [mpe: Reword change log, trim oops, and add stable + fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21powerpc/powernv: Handle irq_happened flag correctly in off-line loopPaul Mackerras
This fixes a bug where it is possible for an off-line CPU to fail to go into a low-power state (nap/sleep/winkle), and to become unresponsive to requests from the KVM subsystem to wake up and run a VCPU. What can happen is that a maskable interrupt of some kind (external, decrementer, hypervisor doorbell, or HMI) after we have called local_irq_disable() at the beginning of pnv_smp_cpu_kill_self() and before interrupts are hard-disabled inside power7_nap/sleep/winkle(). In this situation, the pending event is marked in the irq_happened flag in the PACA. This pending event prevents power7_nap/sleep/winkle from going to the requested low-power state; instead they return immediately. We don't deal with any of these pending event flags in the off-line loop in pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case, so we will have srr1 == 0, and none of the processing to clear interrupts or doorbells will be done. Usually, the most obvious symptom of this is that a KVM guest will fail with a console message saying "KVM: couldn't grab cpu N". This fixes the problem by making sure we handle the irq_happened flags properly. First, we hard-disable before the off-line loop. Once we have hard-disabled, the irq_happened flags can't change underneath us. We unconditionally clear the DEC and HMI flags: there is no processing of timer interrupts while off-line, and the necessary HMI processing is all done in lower-level code. We leave the EE and DBELL flags alone for the first iteration of the loop, so that we won't fail to respond to a split-core request that came in just before hard-disabling. Within the loop, we handle external interrupts if the EE bit is set in irq_happened as well as if the low-power state was interrupted by an external interrupt. (We don't need to do the msgclr for a pending doorbell in irq_happened, because doorbells are edge-triggered and don't remain pending in hardware.) Then we clear both the EE and DBELL flags, and once clear, they cannot be set again (until this CPU comes online again, that is). This also fixes the debug check to not be done when we just ran a KVM guest or when the sleep didn't happen because of a pending event in irq_happened. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-21powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on ↵Paul Mackerras
POWER8" This reverts commit 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") because the original commit had multiple, partly self-cancelling bugs, that could cause occasional memory corruption. In fact the logmpp instruction was incorrectly using register r0 as the source of the buffer address and operation code, and depending on what was in r0, it would either do nothing or corrupt the 64k page pointed to by r0. The logmpp instruction encoding and the operation code definitions could be corrected, but then there is the problem that there is no clearly defined way to know when the hardware has finished writing to the buffer. The original commit attempted to work around this by aborting the write-out before starting the prefetch, but this is ineffective in the case where the virtual core is now executing on a different physical core from the one where the write-out was initiated. These problems plus advice from the hardware designers not to use the function (since the measured performance improvement from using the feature was actually mostly negative), mean that reverting the code is the best option. Fixes: 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-16Merge tag 'powerpc-4.3-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Re-enable CONFIG_SCSI_DH in our defconfigs - Remove unused os_area_db_id_video_mode - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew - cxl: Workaround malformed pcie packets on some cards from Philippe - cxl: Fix number of allocated pages in SPA from Christophe Lombard - Fix checkstop in native_hpte_clear() with lockdep from Cyril - Panic on unhandled Machine Check on powernv from Daniel - selftests/powerpc: Fix build failure of load_unaligned_zeropad test * tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix build failure of load_unaligned_zeropad test powerpc/powernv: Panic on unhandled Machine Check powerpc: Fix checkstop in native_hpte_clear() with lockdep cxl: Fix number of allocated pages in SPA cxl: Workaround malformed pcie packets on some cards cxl: fix leak of ctx->mapping when releasing kernel API contexts cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API cxl: fix leak of IRQ names in cxl_free_afu_irqs() powerpc/ps3: Remove unused os_area_db_id_video_mode powerpc/configs: Re-enable CONFIG_SCSI_DH
2015-10-09powerpc/powernv: Panic on unhandled Machine CheckDaniel Axtens
All unrecovered machine check errors on PowerNV should cause an immediate panic. There are 2 reasons that this is the right policy: it's not safe to continue, and we're already trying to reboot. Firstly, if we go through the recovery process and do not successfully recover, we can't be sure about the state of the machine, and it is not safe to recover and proceed. Linux knows about the following sources of Machine Check Errors: - Uncorrectable Errors (UE) - Effective - Real Address Translation (ERAT) - Segment Lookaside Buffer (SLB) - Translation Lookaside Buffer (TLB) - Unknown/Unrecognised In the SLB, TLB and ERAT cases, we can further categorise these as parity errors, multihit errors or unknown/unrecognised. We can handle SLB errors by flushing and reloading the SLB. We can handle TLB and ERAT multihit errors by flushing the TLB. (It appears we may not handle TLB and ERAT parity errors: I will investigate further and send a followup patch if appropriate.) This leaves us with uncorrectable errors. Uncorrectable errors are usually the result of ECC memory detecting an error that it cannot correct, but they also crop up in the context of PCI cards failing during DMA writes, and during CAPI error events. There are several types of UE, and there are 3 places a UE can occur: Skiboot, the kernel, and userspace. For Skiboot errors, we have the facility to make some recoverable. For userspace, we can simply kill (SIGBUS) the affected process. We have no meaningful way to deal with UEs in kernel space or in unrecoverable sections of Skiboot. Currently, these unrecovered UEs fall through to machine_check_expection() in traps.c, which calls die(), which OOPSes and sends SIGBUS to the process. This sometimes allows us to stumble onwards. For example we've seen UEs kill the kernel eehd and khugepaged. However, the process killed could have held a lock, or it could have been a more important process, etc: we can no longer make any assertions about the state of the machine. Similarly if we see a UE in skiboot (and again we've seen this happen), we're not in a position where we can make any assertions about the state of the machine. Likewise, for unknown or unrecognised errors, we're not able to say anything about the state of the machine. Therefore, if we have an unrecovered MCE, the most appropriate thing to do is to panic. The second reason is that since e784b6499d9c ("powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable machine check errors."), we attempt a special OPAL reboot on an unhandled MCE. This is so the hardware can record error data for later debugging. The comments in that commit assert that we are heading down the panic path anyway. At the moment this is not always true. With UEs in kernel space, for instance, they are marked as recoverable by the hardware, so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't panic() but would simply die() and OOPS. It doesn't make sense to be staggering on if we've just tried to reboot: we should panic(). Explicitly panic() on unrecovered MCEs on PowerNV. Update the comments appropriately. This fixes some hangs following EEH events on cxlflash setups. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09powerpc: Fix checkstop in native_hpte_clear() with lockdepCyril Bur
native_hpte_clear() is called in real mode from two places: - Early in boot during htab initialisation if firmware assisted dump is active. - Late in the kexec path. In both contexts there is no need to disable interrupts are they are already disabled. Furthermore, locking around the tlbie() is only required for pre POWER5 hardware. On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre POWER5 hardware concurrent tlbie()s could result in deadlock. This code would only be executed at crashdump time, during which all bets are off, concurrent tlbie()s are unlikely and taking locks is unsafe therefore the best course of action is to simply do nothing. Concurrent tlbie()s are not possible in the first case as secondary CPUs have not come up yet. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-08arch/powerpc: provide zero_bytemask() for big-endianChris Metcalf
For some reason, only the little-endian flavor of powerpc provided the zero_bytemask() implementation. Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-10-06word-at-a-time.h: fix some Kbuild filesChris Metcalf
arch/tile added word-at-a-time.h after the patch that added generic-y entries; the generic-y entry is now stale. arch/h8300 is newer than the generic-y patch for word-at-a-time.h, and needs a generic-y entry. arch/powerpc seems to have gotten a generic-y entry by mistake in the first patch; this change removes it. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-10-04Merge branch 'strscpy' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull strscpy string copy function implementation from Chris Metcalf. Chris sent this during the merge window, but I waffled back and forth on the pull request, which is why it's going in only now. The new "strscpy()" function is definitely easier to use and more secure than either strncpy() or strlcpy(), both of which are horrible nasty interfaces that have serious and irredeemable problems. strncpy() has a useless return value, and doesn't NUL-terminate an overlong result. To make matters worse, it pads a short result with zeroes, which is a performance disaster if you have big buffers. strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking the insane NUL padding, but having a differently broken return value which returns the original length of the source string. Which means that it will read characters past the count from the source buffer, and you have to trust the source to be properly terminated. It also makes error handling fragile, since the test for overflow is unnecessarily subtle. strscpy() avoids both these problems, guaranteeing the NUL termination (but not excessive padding) if the destination size wasn't zero, and making the overflow condition very obvious by returning -E2BIG. It also doesn't read past the size of the source, and can thus be used for untrusted source data too. So why did I waffle about this for so long? Every time we introduce a new-and-improved interface, people start doing these interminable series of trivial conversion patches. And every time that happens, somebody does some silly mistake, and the conversion patch to the improved interface actually makes things worse. Because the patch is mindnumbing and trivial, nobody has the attention span to look at it carefully, and it's usually done over large swatches of source code which means that not every conversion gets tested. So I'm pulling the strscpy() support because it *is* a better interface. But I will refuse to pull mindless conversion patches. Use this in places where it makes sense, but don't do trivial patches to fix things that aren't actually known to be broken. * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: use global strscpy() rather than private copy string: provide strscpy() Make asm/word-at-a-time.h available on all architectures
2015-09-30powerpc/ps3: Remove unused os_area_db_id_video_modeMichael Ellerman
This struct is unused, which is now a build error with gcc 6: error: 'os_area_db_id_video_mode' defined but not used There doesn't seem to be any good reason to keep it around so remove it, it's in the history if anyone needs it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-29powerpc/configs: Re-enable CONFIG_SCSI_DHMichael Ellerman
Commit 086b91d052eb ("scsi_dh: integrate into the core SCSI code") changed CONFIG_SCSI_DH from tristate to bool. Our defconfigs have CONFIG_SCSI_DH=m, which the kconfig machinery warns us is invalid, but instead of converting it to =y it leaves it unset. This means we loose the CONFIG_SCSI_DH code and everything that depends on it. So convert the values in the defconfigs to =y. Fixes: 086b91d052eb ("scsi_dh: integrate into the core SCSI code") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-25Merge tag 'pci-v4.3-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge window management), and a new Renesas R8A7794 SoC device ID. Details: Resource management: - Revert pci_read_bridge_bases() unification (Bjorn Helgaas) - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas) MSI: - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson) Renesas R-Car host bridge driver: - Add R8A7794 support (Sergei Shtylyov) Miscellaneous: - Fix devfn for VPD access through function 0 (Alex Williamson) - Use function 0 VPD only for identical functions (Alex Williamson)" * tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Add R8A7794 support PCI: Use function 0 VPD for identical functions, regular VPD for others PCI: Fix devfn for VPD access through function 0 PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses PCI: Clear IORESOURCE_UNSET when clipping a bridge window PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
2015-09-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: disable halt_poll_ns as default for s390x KVM: x86: fix off-by-one in reserved bits check KVM: x86: use correct page table format to check nested page table reserved bits KVM: svm: do not call kvm_set_cr0 from init_vmcb KVM: x86: trap AMD MSRs for the TSeg base and mask KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs kvm: svm: reset mmu on VCPU reset
2015-09-25KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand
We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-21powerpc: Wire up sys_membarrier()Michael Ellerman
The selftest passes on 64-bit LE & BE, and 32-bit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-21KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()Thomas Huth
Access to the kvm->buses (like with the kvm_io_bus_read() and -write() functions) has to be protected via the kvm->srcu lock. The kvmppc_h_logical_ci_load() and -store() functions are missing this lock so far, so let's add it there, too. This fixes the problem that the kernel reports "suspicious RCU usage" when lock debugging is enabled. Cc: stable@vger.kernel.org # v4.1+ Fixes: 99342cf8044420eebdf9297ca03a14cb6a7085a1 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2015-09-21KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exitGautham R. Shenoy
In guest_exit_cont we call kvmhv_commence_exit which expects the trap number as the argument. However r3 doesn't contain the trap number at this point and as a result we would be calling the function with a spurious trap number. Fix this by copying r12 into r3 before calling kvmhv_commence_exit as r12 contains the trap number. Cc: stable@vger.kernel.org # v4.1+ Fixes: eddb60fb1443 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2015-09-21KVM: PPC: Book3S HV: Fix handling of interrupted VCPUsPaul Mackerras
This fixes a bug which results in stale vcore pointers being left in the per-cpu preempted vcore lists when a VM is destroyed. The result of the stale vcore pointers is usually either a crash or a lockup inside collect_piggybacks() when another VM is run. A typical lockup message looks like: [ 472.161074] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [qemu-system-ppc:7039] [ 472.161204] Modules linked in: kvm_hv kvm_pr kvm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ses enclosure shpchp rtc_opal i2c_opal powernv_rng binfmt_misc dm_service_time scsi_dh_alua radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp pps_core cxgb3 ipr i2c_core mdio dm_multipath [last unloaded: kvm_hv] [ 472.162111] CPU: 24 PID: 7039 Comm: qemu-system-ppc Not tainted 4.2.0-kvm+ #49 [ 472.162187] task: c000001e38512750 ti: c000001e41bfc000 task.ti: c000001e41bfc000 [ 472.162262] NIP: c00000000096b094 LR: c00000000096b08c CTR: c000000000111130 [ 472.162337] REGS: c000001e41bff520 TRAP: 0901 Not tainted (4.2.0-kvm+) [ 472.162399] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24848844 XER: 00000000 [ 472.162588] CFAR: c00000000096b0ac SOFTE: 1 GPR00: c000000000111170 c000001e41bff7a0 c00000000127df00 0000000000000001 GPR04: 0000000000000003 0000000000000001 0000000000000000 0000000000874821 GPR08: c000001e41bff8e0 0000000000000001 0000000000000000 d00000000efde740 GPR12: c000000000111130 c00000000fdae400 [ 472.163053] NIP [c00000000096b094] _raw_spin_lock_irqsave+0xa4/0x130 [ 472.163117] LR [c00000000096b08c] _raw_spin_lock_irqsave+0x9c/0x130 [ 472.163179] Call Trace: [ 472.163206] [c000001e41bff7a0] [c000001e41bff7f0] 0xc000001e41bff7f0 (unreliable) [ 472.163295] [c000001e41bff7e0] [c000000000111170] __wake_up+0x40/0x90 [ 472.163375] [c000001e41bff830] [d00000000efd6fc0] kvmppc_run_core+0x1240/0x1950 [kvm_hv] [ 472.163465] [c000001e41bffa30] [d00000000efd8510] kvmppc_vcpu_run_hv+0x5a0/0xd90 [kvm_hv] [ 472.163559] [c000001e41bffb70] [d00000000e9318a4] kvmppc_vcpu_run+0x44/0x60 [kvm] [ 472.163653] [c000001e41bffba0] [d00000000e92e674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm] [ 472.163745] [c000001e41bffbe0] [d00000000e9263a8] kvm_vcpu_ioctl+0x538/0x7b0 [kvm] [ 472.163834] [c000001e41bffd40] [c0000000002d0f50] do_vfs_ioctl+0x480/0x7c0 [ 472.163910] [c000001e41bffde0] [c0000000002d1364] SyS_ioctl+0xd4/0xf0 [ 472.163986] [c000001e41bffe30] [c000000000009260] system_call+0x38/0xd0 [ 472.164060] Instruction dump: [ 472.164098] ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 8bad02e2 [ 472.164224] 7fc3f378 4b6a57c1 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820070 The bug is that kvmppc_run_vcpu does not correctly handle the case where a vcpu task receives a signal while its guest vcpu is executing in the guest as a result of being piggy-backed onto the execution of another vcore. In that case we need to wait for the vcpu to finish executing inside the guest, and then remove this vcore from the preempted vcores list. That way, we avoid leaving this vcpu's vcore on the preempted vcores list when the vcpu gets interrupted. Fixes: ec2571650826 Reported-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2015-09-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Mostly stable material, a lot of ARM fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) sched: access local runqueue directly in single_task_running arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' arm64: KVM: Remove all traces of the ThumbEE registers arm: KVM: Disable virtual timer even if the guest is not using it arm64: KVM: Disable virtual timer even if the guest is not using it arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources KVM: s390: Replace incorrect atomic_or with atomic_andnot arm: KVM: Fix incorrect device to IPA mapping arm64: KVM: Fix user access for debug registers KVM: vmx: fix VPID is 0000H in non-root operation KVM: add halt_attempted_poll to VCPU stats kvm: fix zero length mmio searching kvm: fix double free for fast mmio eventfd kvm: factor out core eventfd assign/deassign logic kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd KVM: make the declaration of functions within 80 characters KVM: arm64: add workaround for Cortex-A57 erratum #852523 KVM: fix polling for guest halt continued even if disable it arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores arm64: KVM: set {v,}TCR_EL2 RES1 bits ...
2015-09-18Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This is a rather large update post rc1 due to the final steps of cleanups and API changes which had to wait for the preparatory patches to hit your tree. - Regression fixes for ARM GIC irqchips - Regression fixes and lockdep anotations for renesas irq chips - The leftovers of the cleanup and preparatory patches which have been ignored by maintainers - Final conversions of the newly merged users of obsolete APIs - Final removal of obsolete APIs - Final removal of ARM artifacts which had been introduced during the conversion of ARM to the generic interrupt code. - Final split of the irq_data into chip specific and common data to reflect the needs of hierarchical irq domains. - Treewide removal of the first argument of interrupt flow handlers, i.e. the irq number, which is not used by the majority of handlers and simple to retrieve from the other argument the irq descriptor. - A few comment updates and build warning fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) arm64: Remove ununsed set_irq_flags ARM: Remove ununsed set_irq_flags sh: Kill off set_irq_flags usage irqchip: Kill off set_irq_flags usage gpu/drm: Kill off set_irq_flags usage genirq: Remove irq argument from irq flow handlers genirq: Move field 'msi_desc' from irq_data into irq_common_data genirq: Move field 'affinity' from irq_data into irq_common_data genirq: Move field 'handler_data' from irq_data into irq_common_data genirq: Move field 'node' from irq_data into irq_common_data irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag genirq: Provide IRQD_FORWARDED_TO_VCPU status flag genirq: Simplify irq_data_to_desc() genirq: Remove __irq_set_handler_locked() pinctrl/pistachio: Use irq_set_handler_locked gpio: vf610: Use irq_set_handler_locked powerpc/mpc8xx: Use irq_set_handler_locked() powerpc/ipic: Use irq_set_handler_locked() powerpc/cpm2: Use irq_set_handler_locked() ...
2015-09-18Merge tag 'powerpc-4.3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix 32-bit TCE table init in kdump kernel from Nish - Fix kdump with non-power-of-2 crashkernel= from Nish - Abort cxl_pci_enable_device_hook() if PCI channel is offline from Andrew - Fix to release DRC when configure_connector() fails from Bharata - Wire up sys_userfaultfd() - Fix race condition in tearing down MSI interrupts from Paul - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel - Fix cxl build failure due to -Wunused-variable gcc behaviour change from Ian - Tell the toolchain to use ABI v2 when building an LE boot wrapper from Benh - Fix THP to recompute hash value after a failed update from Aneesh - 32-bit memcpy/memset: only use dcbz once cache is enabled from Christophe * tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc32: memset: only use dcbz once cache is enabled powerpc32: memcpy: only use dcbz once cache is enabled powerpc/mm: Recompute hash value after a failed update powerpc/boot: Specify ABI v2 when building an LE boot wrapper cxl: Fix build failure due to -Wunused-variable behaviour change cxl: Fix unbalanced pci_dev_get in cxl_probe powerpc/MSI: Fix race condition in tearing down MSI interrupts powerpc: Wire up sys_userfaultfd() powerpc/pseries: Release DRC when configure_connector fails cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel= powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel
2015-09-17powerpc32: memset: only use dcbz once cache is enabledLEROY Christophe
memset() uses instruction dcbz to speed up clearing by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memset(). Allthough no part of the code explicitly uses memset(), GCC may make calls to it. This patch modifies memset() such that at startup, memset() unconditionally skip the optimised bloc that uses dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memset() by replacing this inconditional jump by a NOP Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17powerpc32: memcpy: only use dcbz once cache is enabledLEROY Christophe
memcpy() uses instruction dcbz to speed up copy by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memcpy(). Allthough no part of the code explicitly uses memcpy(), GCC makes calls to it. This patch modifies memcpy() such that at startup, memcpy() unconditionally jumps to generic_memcpy() which doesn't use the dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memcpy() by replacing this inconditional jump by a NOP Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-16genirq: Remove irq argument from irq flow handlersThomas Gleixner
Most interrupt flow handlers do not use the irq argument. Those few which use it can retrieve the irq number from the irq descriptor. Remove the argument. Search and replace was done with coccinelle and some extra helper scripts around it. Thanks to Julia for her help! Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com>
2015-09-16powerpc/mpc8xx: Use irq_set_handler_locked()Thomas Gleixner
Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-16powerpc/ipic: Use irq_set_handler_locked()Thomas Gleixner
Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anton Blanchard <anton@samba.org> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-16powerpc/cpm2: Use irq_set_handler_locked()Thomas Gleixner
Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-16powerpc/mpc52xx: Use irq_set_handler_locked()Thomas Gleixner
Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Anatolij Gustschin <agust@denx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-16powerpc/mm: Recompute hash value after a failed updateAneesh Kumar K.V
If we had secondary hash flag set, we ended up modifying hash value in the updatepp code path. Hence with a failed updatepp we will be using a wrong hash value for the following hash insert. Fix this by recomputing hash before insert. Without this patch we can end up with using wrong slot number in linux pte. That can result in us missing an hash pte update or invalidate which can cause memory corruption or even machine check. Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-16powerpc/boot: Specify ABI v2 when building an LE boot wrapperBenjamin Herrenschmidt
The kernel does it, not the boot wrapper, which breaks with some cross compilers that still default to ABI v1. Fixes: 147c05168fc8 ("powerpc/boot: Add support for 64bit little endian wrapper") Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-16KVM: add halt_attempted_poll to VCPU statsPaolo Bonzini
This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-15PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"Bjorn Helgaas
Revert dff22d2054b5 ("PCI: Call pci_read_bridge_bases() from core instead of arch code"). Reading PCI bridge windows is not arch-specific in itself, but there is PCI core code that doesn't work correctly if we read them too early. For example, Hannes found this case on an ARM Freescale i.mx6 board: pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff] pci 0000:00:00.0: PCI bridge to [bus 01-ff] pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window) pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100] The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs 0x204100 of space, and mem windows are megabyte-aligned. Bus sizing can increase a bridge window size, but never *decrease* it (see d65245c3297a ("PCI: don't shrink bridge resources")). Prior to dff22d2054b5, ARM didn't read bridge windows at all, so the "original size" was zero, and we assigned a 3MB window. After dff22d2054b5, we read the bridge windows before sizing the bus. The firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since we never decrease the size, we kept 16MB even though we only needed 3MB. But 16MB doesn't fit in the host bridge aperture, so we failed to assign space for the window and the downstream devices. I think this is a defect in the PCI core: we shouldn't rely on the firmware to assign sensible windows. Ray reported a similar problem, also on ARM, with Broadcom iProc. Issues like this are too hard to fix right now, so revert dff22d2054b5. Reported-by: Hannes <oe5hpm@gmail.com> Reported-by: Ray Jui <rjui@broadcom.com> Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-09-15powerpc, irq: Use access helper irq_data_get_affinity_mask()Jiang Liu
Use access helper irq_data_get_affinity_mask() so we can move the affinity mask to irq_common_data. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-14powerpc/cell: Prepare irq handler for irq argument removalThomas Gleixner
The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-14powerpc/85xx: Prepare irq handlers for irq argument removalThomas Gleixner
The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Scott Wood <scottwood@freescale.com> Cc: linuxppc-dev@lists.ozlabs.org
2015-09-14powerpc/mpc5121_ads_cpld: Prepare irq handler for irq argument removalThomas Gleixner
The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Anatolij Gustschin <agust@denx.de> Cc: linuxppc-dev@lists.ozlabs.org