summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2016-02-29powerpc/mm/book3s-64: Use physical addresses in upper page table tree levelsPaul Mackerras
This changes the Linux page tables to store physical addresses rather than kernel virtual addresses in the upper levels of the tree (pgd, pud and pmd) for 64-bit Book 3S machines. This also changes the hugepd pointers used to implement hugepages when the base page size is 4k to store physical addresses rather than virtual addresses (again just for 64-bit Book3S machines). This frees up some high order bits, and will be needed with PowerISA v3.0 machines which read the page table tree in hardware in radix mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-27powerpc/mm/book3s-64: Free up 7 high-order bits in the Linux PTEPaul Mackerras
This frees up bits 57-63 in the Linux PTE on 64-bit Book 3S machines. In the 4k page case, this is done just by reducing the size of the RPN field to 39 bits, giving 51-bit real addresses. In the 64k page case, we had 10 unused bits in the middle of the PTE, so this moves the RPN field down 10 bits to make use of those unused bits. This means the RPN field is now 3 bits larger at 37 bits, giving 53-bit real addresses in the normal case, or 49-bit real addresses for the special 4k PFN case. We are doing this in order to be able to move some other PTE bits into the positions where PowerISA V3.0 processors will expect to find them in radix-tree mode. Ultimately we will be able to move the RPN field to lower bit positions and make it larger. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-27powerpc/mm/book3s-64: Clean up some obsolete or misleading commentsPaul Mackerras
No code changes. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-25Merge tag 'powerpc-4.5-4' into nextMichael Ellerman
Pull in our current fixes from 4.5, in particular the "Fix Multi hit ERAT" bug is causing folks some grief when testing next.
2016-02-24powerpc: Fix BUG_ON() reporting in real modeBalbir Singh
I ran into this issue while debugging an early boot problem. The system hit a BUG_ON() but report bug failed to print the line number and file name. The reason being that the system was running in real mode and report_bug() searches for addresses in the PAGE_OFFSET+ region. Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24powerpc: Use BUILD_BUG_ON_MSG() for unsupported {cmp}xchg sizespan xinhui
__xchg_called_with_bad_pointer() can't tell us which code uses {cmp}xchg with an unsupported size, and no error is reported until the link stage. To make such problems easier to debug, use BUILD_BUG_ON_MSG() instead. Signed-off-by: pan xinhui <xinhui.pan@linux.vnet.ibm.com> [mpe: Tweak change log wording & add relaxed/acquire] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup
2016-02-24powerpc/powernv: Add AST graphics driver to powernv_defconfigJeremy Kerr
Most current OpenPOWER platforms have an AST BMC, so add graphics support via the AST DRM driver. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24powerpc/powernv: Add powernv firmware interface drivers to powernv_defconfigJeremy Kerr
There are a few firmware-provided interfaces for OpenPOWER platforms: the PRD infrastructure, IPMI support, and MTD access to the PNOR flash. This change adds these to powernv_defconfig Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24powerpc/powernv: Add powernv_defconfigJeremy Kerr
This change adds a defconfig for the non-virtualised power platforms, based on pseries_defconfig, but without pseries, and little-endian, and no OF trampoline. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc: Add POWER9 cputable entryMichael Neuling
Add a cputable entry for POWER9. More code is required to actually boot and run on a POWER9 but this gets the base piece in which we can start building on. Copies over from POWER8 except for: - Adds a new CPU_FTR_ARCH_300 bit to start hanging new architecture features from (in subsequent patches). - Advertises new user features bits PPC_FEATURE2_ARCH_3_00 & HAS_IEEE128 when on POWER9. - Drops CPU_FTR_SUBCORE. - Drops PMU code and machine check. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc: Use defines for __init_tlb_power[78]Michael Neuling
Use defines for literals __init_tlb_power[78] rather than hand coding them. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc/powernv: Create separate subcores CPU feature bitMichael Neuling
Subcores isn't really part of the 2.07 architecture but currently we turn it on using the 2.07 feature bit. Subcores is really a POWER8 specific feature. This adds a new CPU_FTR bit just for subcores and moves the subcore init code over to use this. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc/powernv: don't create OPAL msglog sysfs entry if memcons init failsAndrew Donnellan
When initialising OPAL interfaces, there is a possibility that opal_msglog_init() may fail to initialise the msglog/memory console. Fix opal_msglog_sysfs_init() so it doesn't try to create sysfs entry for the msglog if this occurs. Suggested-by: Joel Stanley <joel@jms.id.au> Fixes: 9b4fffa14906 ("powerpc/powernv: new function to access OPAL msglog") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc/mm/hash: Clear the invalid slot information correctlyAneesh Kumar K.V
We can get a hash pte fault with 4k base page size and find the pte already inserted with 64K base page size. In that case we need to clear the existing slot information from the old pte. Fix this correctly With THP, we also clear the slot information with respect to all the 64K hash pte mapping that 16MB page. They are all invalid now. This make sure we don't find the slot valid when we fault with 4k base page size. Finding the slot valid should not result in any wrong behavior because we do check again in hash page table for the validity. But we can avoid that check completely. Fixes: a43c0eb8364c022 ("powerpc/mm: Convert 4k hash insert to C") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-22powerpc/eeh: Fix partial hotplug criterionGavin Shan
During error recovery, the device could be removed as part of the partial hotplug. The criterion used to come with partial hotplug is: if the device driver provides error_detected(), slot_reset() and resume() callbacks, it's immune from hotplug. Otherwise, it's going to experience partial hotplug during EEH recovery. But the criterion isn't correct enough: mlx4_core driver for Mellanox adapters provides error_detected(), slot_reset() callbacks, but resume() isn't there. Those Mellanox adapters won't be to involved in the partial hotplug. This fixes the criterion to a practical one: adpater with driver that provides error_detected(), slot_reset() will be immune from partial hotplug. resume() isn't mandatory. Fixes: f2da4ccf ("powerpc/eeh: More relaxed hotplug criterion") Cc: stable@vger.kernel.org #v4.4+ Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-18powerpc: atomic: Implement acquire/release/relaxed variants for cmpxchgBoqun Feng
Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on which _release variants can be built. To avoid superfluous barriers in _acquire variants, we implement these operations with assembly code rather use __atomic_op_acquire() to build them automatically. For the same reason, we keep the assembly implementation of fully ordered cmpxchg operations. However, we don't do the similar for _release, because that will require putting barriers in the middle of ll/sc loops, which is probably a bad idea. Note cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not compiler barriers. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-18powerpc: atomic: Implement acquire/release/relaxed variants for xchgBoqun Feng
Implement xchg{,64}_relaxed and atomic{,64}_xchg_relaxed, based on these _relaxed variants, release/acquire variants and fully ordered versions can be built. Note that xchg{,64}_relaxed and atomic_{,64}_xchg_relaxed are not compiler barriers. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-18powerpc: atomic: Implement atomic{, 64}_*_return_* variantsBoqun Feng
On powerpc, acquire and release semantics can be achieved with lightweight barriers("lwsync" and "ctrl+isync"), which can be used to implement __atomic_op_{acquire,release}. For release semantics, since we only need to ensure all memory accesses that issue before must take effects before the -store- part of the atomics, "lwsync" is what we only need. On the platform without "lwsync", "sync" should be used. Therefore in __atomic_op_release() we use PPC_RELEASE_BARRIER. For acquire semantics, "lwsync" is what we only need for the similar reason. However on the platform without "lwsync", we can use "isync" rather than "sync" as an acquire barrier. Therefore in __atomic_op_acquire() we use PPC_ACQUIRE_BARRIER, which is barrier() on UP, "lwsync" if available and "isync" otherwise. Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other variants with these helpers. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-18powerpc: Fix kgdb on little endian ppc64leBalbir Singh
I spent some time trying to use kgdb and debugged my inability to resume from kgdb_handle_breakpoint(). NIP is not incremented and that leads to a loop in the debugger. I've tested this lightly on a virtual instance with KDB enabled. After the patch, I am able to get the "go" command to work as expected. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-17powerpc/ioda: Set "read" permission when "write" is setAlexey Kardashevskiy
Quite often drivers set only "write" permission assuming that this includes "read" permission as well and this works on plenty of platforms. However IODA2 is strict about this and produces an EEH when "read" permission is not set and reading happens. This adds a workaround in the IODA code to always add the "read" bit when the "write" bit is set. Fixes: 10b35b2b7485 ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE") Cc: stable@vger.kernel.org # 4.2+ Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15powerpc/mm: Fix Multi hit ERAT cause by recent THP updateAneesh Kumar K.V
With ppc64 we use the deposited pgtable_t to store the hash pte slot information. We should not withdraw the deposited pgtable_t without marking the pmd none. This ensure that low level hash fault handling will skip this huge pte and we will handle them at upper levels. Recent change to pmd splitting changed the above in order to handle the race between pmd split and exit_mmap. The race is explained below. Consider following race: CPU0 CPU1 shrink_page_list() add_to_swap() split_huge_page_to_list() __split_huge_pmd_locked() pmdp_huge_clear_flush_notify() // pmd_none() == true exit_mmap() unmap_vmas() zap_pmd_range() // no action on pmd since pmd_none() == true pmd_populate() As result the THP will not be freed. The leak is detected by check_mm(): BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512 The above required us to not mark pmd none during a pmd split. The fix for ppc is to clear the huge pte of _PAGE_USER, so that low level fault handling code skip this pte. At higher level we do take ptl lock. That should serialze us against the pmd split. Once the lock is acquired we do check the pmd again using pmd_same. That should always return false for us and hence we should retry the access. We do the pmd_same check in all case after taking plt with THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and huge_pmd_set_accessed) Also make sure we wait for irq disable section in other cpus to finish before flipping a huge pte entry with a regular pmd entry. Code paths like find_linux_pte_or_hugepte depend on irq disable to get a stable pte_t pointer. A parallel thp split need to make sure we don't convert a pmd pte to a regular pmd entry without waiting for the irq disable section to finish. Fixes: eef1b3ba053a ("thp: implement split_huge_pmd()") Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15powerpc/powernv: Fix stale PE primary busGavin Shan
When PCI bus is unplugged during full hotplug for EEH recovery, the platform PE instance (struct pnv_ioda_pe) isn't released and it dereferences the stale PCI bus that has been released. It leads to kernel crash when referring to the stale PCI bus. This fixes the issue by correcting the PE's primary bus when it's oneline at plugging time, in pnv_pci_dma_bus_setup() which is to be called by pcibios_fixup_bus(). Cc: stable@vger.kernel.org # v4.1+ Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15powerpc/eeh: Fix stale cached primary busGavin Shan
When PE is created, its primary bus is cached to pe->bus. At later point, the cached primary bus is returned from eeh_pe_bus_get(). However, we could get stale cached primary bus and run into kernel crash in one case: full hotplug as part of fenced PHB error recovery releases all PCI busses under the PHB at unplugging time and recreate them at plugging time. pe->bus is still dereferencing the PCI bus that was released. This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity of pe->bus. pe->bus is updated when its first child EEH device is online and the flag is set. Before unplugging in full hotplug for error recovery, the flag is cleared. Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE") Cc: stable@vger.kernel.org #v3.11+ Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-15powerpc/pseries: Don't trace hcalls on offline CPUsDenis Kirjanov
If a cpu is hotplugged while the hcall trace points are active, it's possible to hit a warning from RCU due to the trace points calling into RCU from an offline cpu, eg: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 Make the hypervisor tracepoints conditional by using TRACE_EVENT_FN_COND. Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/perf/hv-gpci: Increase request buffer sizeSukadev Bhattiprolu
The GPCI hcall allows for a 4K buffer but we limit the buffer to 1K. The problem with a 1K buffer is if a request results in returning more values than can be accomodated in the 1K buffer the request will fail. The buffer we are using is currently allocated on the stack and hence limited in size. Instead use a per-CPU 4K buffer like we do with 24x7 counters (hv-24x7.c). While here, rename the macro GPCI_MAX_DATA_BYTES to HGPCI_MAX_DATA_BYTES for consistency with 24x7 counters. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE modeWei Yang
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be sparse. This patch restructures the code to allocate sparse PE# for VFs when M64 BAR is set to Single PE mode. Also it rename the offset to pe_num_map to reflect the content is the PE number. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: boundary the total VF BAR size instead of the individual oneWei Yang
Each VF could have 6 BARs at most. When the total BAR size exceeds the gate, after expanding it will also exhaust the M64 Window. This patch limits the boundary by checking the total VF BAR size instead of the individual BAR. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: replace the hard coded boundary with gateWei Yang
At the moment 64bit-prefetchable window can be maximum 64GB, which is currently got from device tree. This means that in shared mode the maximum supported VF BAR size is 64GB/256=256MB. While this size could exhaust the whole 64bit-prefetchable window. This is a design decision to set a boundary to 64MB of the VF BAR size. Since VF BAR size with 64MB would occupy a quarter of the 64bit-prefetchable window, this is affordable. This patch replaces magic limit of 64MB with "gate", which is 1/4 of the M64 Segment Size(m64_segsize >> 2) and adds comment to explain the reason for it. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vent.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: use one M64 BAR in Single PE mode for one VF BARWei Yang
In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64 BARs in Single PE mode to cover the number of VFs required to be enabled. By doing so, several VFs would be in one VF Group and leads to interference between VFs in the same group. And in this patch, m64_wins is renamed to m64_map, which means index number of the M64 BAR used to map the VF BAR. Based on Gavin's comments. Also makes sure the VF BAR size is bigger than 32MB when M64 BAR is used in Single PE mode. This patch changes the design by using one M64 BAR in Single PE mode for one VF BAR. This gives absolute isolation for VFs. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: simplify the calculation of iov resource alignmentWei Yang
The alignment of IOV BAR on PowerNV platform is the total size of the IOV BAR. No matter whether the IOV BAR is extended with number of roundup_pow_of_two(total_vfs) or number of max PE number (256), the total size could be calculated by (vfs_expanded * VF_BAR_size). This patch simplifies the pnv_pci_iov_resource_alignment() by removing the first case. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BARWei Yang
On PHB3, we enable SRIOV devices by mapping IOV BAR with M64 BARs. If a SRIOV device's IOV BAR is not 64bit-prefetchable, this is not assigned from 64bit prefetchable window, which means M64 BAR can't work on it. The reason is PCI bridges support only 2 memory windows and the kernel code programs bridges in the way that one window is 32bit-nonprefetchable and the other one is 64bit-prefetchable. So if devices' IOV BAR is 64bit and non-prefetchable, it will be mapped into 32bit space and therefore M64 cannot be used for it. This patch makes this explicit and truncate IOV resource in this case to save MMIO space. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/powernv: Simplify definitions of EEH debugfs handlersGavin Shan
The EEH debugfs handlers have same prototype. This introduces a macro to define them, then to simplify the code. No logical changes. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10powerpc/xmon: Add xmon command to dump process/task similar to ps(1)Douglas Miller
Add 'P' command with optional task_struct address to dump all/one task's information: task pointer, kernel stack pointer, PID, PPID, state (interpreted), CPU where (last) running, and command. Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-09powerpc/xmon: add command to dump OPAL msglogAndrew Donnellan
Add the 'do' command to dump the OPAL msglog in xmon. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> [mpe: Reduce the amount of ifdefery required] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-09powerpc/powernv: new function to access OPAL msglogAndrew Donnellan
Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with the sysfs read handler responsible for retrieving the log from the OPAL buffer. We'd like to be able to use it in xmon as well. Refactor the OPAL msglog code to create a new function, opal_msglog_copy(), that copies to an arbitrary buffer. Separate the initialisation code into generic memcons init and sysfs file creation. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08powerpc/eeh: fix incorrect function name in commentAndrew Donnellan
The comment block above pcibios_set_pcie_reset_state() incorrectly refers to pcibios_set_pcie_slot_reset(). Fix the comment accordingly. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08powerpc/xmon: fix typo in usage messageAndrew Donnellan
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08powerpc/powernv: Remove support for p5ioc2Russell Currey
"p5ioc2 is used by approximately 2 machines in the world, and has never ever been a supported configuration." The code for p5ioc2 is essentially unused and complicates what is already a very complicated codebase. Its removal is essentially a "free win" in the effort to simplify the powernv PCI code. In addition, support for p5ioc2 has been dropped from skiboot. There's no reason to keep it around in the kernel. Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08powerpc: Fix dedotify for binutils >= 2.26Andreas Schwab
Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. But dedotify modifies the symbol names in place, which can also modify unrelated symbols with a name that matches a suffix of a dotted name. To remove the leading dot of a symbol name we can just increment the pointer into the STRTAB section instead. Backport to all stables to avoid breakage when people update their binutils - mpe. Cc: stable@vger.kernel.org Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-31powerpc/book3s_32: Fix build error with checkpoint restartAneesh Kumar K.V
In file included from mm/vmscan.c:54:0: include/linux/swapops.h: In function ‘pte_to_swp_entry’: include/linux/swapops.h:69:2: error: implicit declaration of function ‘pte_swp_soft_dirty’ [-Werror=implicit-function-declaration] if (pte_swp_soft_dirty(pte)) ^ include/linux/swapops.h:70:3: error: implicit declaration of function ‘pte_swp_clear_soft_dirty’ [-Werror=implicit-function-declaration] pte = pte_swp_clear_soft_dirty(pte); We support soft dirty tracking only with book3s 64 for now. So change the Kconfig dependency accordingly. Also CHECKPOINT_RESTORE feature is not really dependent on SOFT_DIRTY. We track the dependency between MEM_SOFT_DIRTY and ARCH_SOFT_DIRTY through headers Fixes: 7207f43665b8 ("powerpc/mm: Add page soft dirty tracking") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-29Merge tag 'powerpc-4.5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Wire up copy_file_range() syscall from Chandan Rajendra - Simplify module TOC handling from Alan Modra - Remove newly added extra definition of pmd_dirty from Stephen Rothwell - Allow user space to map rtas_rmo_buf from Vasant Hegde - Fix PE location code from Gavin Shan - Remove PPMU_HAS_SSLOT flag for Power8 from Madhavan Srinivasan - Fixup _HPAGE_CHG_MASK from Aneesh Kumar K.V * tag 'powerpc-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fixup _HPAGE_CHG_MASK powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8 powerpc/eeh: Fix PE location code powerpc/mm: Allow user space to map rtas_rmo_buf powerpc: Remove newly added extra definition of pmd_dirty powerpc: Simplify module TOC handling powerpc: Wire up copy_file_range() syscall
2016-01-28powerpc/mm: Fixup _HPAGE_CHG_MASKAneesh Kumar K.V
This was wrongly updated by commit 7aa9a23c69ea ("powerpc, thp: remove infrastructure for handling splitting PMDs") during the last merge window. Fix it up. This could lead to incorrect behaviour in THP and/or mprotect(), at a minimum. Fixes: 7aa9a23c69ea ("powerpc, thp: remove infrastructure for handling splitting PMDs") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-28powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8Madhavan Srinivasan
Commit 7a7868326d77 ("powerpc/perf: Add an explict flag indicating presence of SLOT field") introduced the PPMU_HAS_SSLOT flag to remove the assumption that MMCRA[SLOT] was present when PPMU_ALT_SIPR was not set. That commit's changelog also mentions that Power8 does not support MMCRA[SLOT]. However when the Power8 PMU support was merged, it errnoeously included the PPMU_HAS_SSLOT flag. So remove PPMU_HAS_SSLOT from the Power8 flags. mpe: On systems where MMCRA[SLOT] exists, the field occupies bits 37:39 (IBM numbering). On Power8 bit 37 is reserved, and 38:39 overlap with the high bits of the Threshold Event Counter Mantissa. I am not aware of any published events which use the threshold counting mechanism, which would cause the mantissa bits to be set. So in practice this bug is unlikely to trigger. Fixes: e05b9b9e5c10 ("powerpc/perf: Power8 PMU support") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "s390 and POWER bug fixes, plus enabling the KVM-VFIO interface on s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM doc: Fix KVM_SMI chapter number KVM: s390: fix memory overwrites when vx is disabled KVM: s390: Enable the KVM-VFIO device KVM: s390: fix guest fprs memory leak KVM: PPC: Fix ONE_REG AltiVec support KVM: PPC: Increase memslots to 512 KVM: PPC: Book3S PR: Remove unused variable 'vcpu_book3s' KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8 KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better
2016-01-27powerpc/eeh: Fix PE location codeGavin Shan
In eeh_pe_loc_get(), the PE location code is retrieved from the "ibm,loc-code" property of the device node for the bridge of the PE's primary bus. It's not correct because the property indicates the parent PE's location code. This reads the correct PE location code from "ibm,io-base-loc-code" or "ibm,slot-location-code" property of PE parent bus's device node. Cc: stable@vger.kernel.org # v3.16+ Fixes: 357b2f3dd9b7 ("powerpc/eeh: Dump PE location code") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-26Merge tag 'kvm-s390-master-4.5-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes for kvm/master (targeting 4.5) 1. Fallout of some bigger floating point/vector rework in s390 - memory leak -> stable 4.3+ - memory overwrite -> stable 4.4+ 2. enable KVM-VFIO for s390
2016-01-25powerpc/mm: Allow user space to map rtas_rmo_bufVasant Hegde
With commit 90a545e9 (restrict /dev/mem to idle io memory ranges) mapping rtas_rmo_buf from user space is failing. Hence we are not able to make RTAS syscall. This patch calls page_is_rtas_user_buf before calling iomem_is_exclusive in devmem_is_allowed(). This will allow user space to map rtas_rmo_buf and we are able to make RTAS syscall. Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-22wrappers for ->i_mutex accessAl Viro
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-21Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge third patch-bomb from Andrew Morton: "I'm pretty much done for -rc1 now: - the rest of MM, basically - lib/ updates - checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit - cpu_mask simplifications - kexec, rapidio, MAINTAINERS etc, etc. - more dma-mapping cleanups/simplifications from hch" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits) MAINTAINERS: add/fix git URLs for various subsystems mm: memcontrol: add "sock" to cgroup2 memory.stat mm: memcontrol: basic memory statistics in cgroup2 memory controller mm: memcontrol: do not uncharge old page in page cache replacement Documentation: cgroup: add memory.swap.{current,max} description mm: free swap cache aggressively if memcg swap is full mm: vmscan: do not scan anon pages if memcg swap limit is hit swap.h: move memcg related stuff to the end of the file mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online mm: vmscan: pass memcg to get_scan_count() mm: memcontrol: charge swap to cgroup2 mm: memcontrol: clean up alloc, online, offline, free functions mm: memcontrol: flatten struct cg_proto mm: memcontrol: rein in the CONFIG space madness net: drop tcp_memcontrol.c mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM mm: memcontrol: allow to disable kmem accounting for cgroup2 mm: memcontrol: account "kmem" consumers in cgroup2 memory controller mm: memcontrol: move kmem accounting code to CONFIG_MEMCG mm: memcontrol: separate kmem code from legacy tcp accounting code ...
2016-01-21Merge tag 'pci-v4.5-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v4.5 merge window: Enumeration: - Simplify config space size computation (Bjorn Helgaas) - Avoid iterating through ROM outside the resource window (Edward O'Callaghan) - Support PCIe devices with short cfg_size (Jason S. McMullan) - Add Netronome vendor and device IDs (Jason S. McMullan) - Limit config space size for Netronome NFP6000 family (Jason S. McMullan) - Add Netronome NFP4000 PF device ID (Simon Horman) - Limit config space size for Netronome NFP4000 (Simon Horman) - Print warnings for all invalid expansion ROM headers (Vladis Dronov) Resource management: - Fix minimum allocation address overwrite (Christoph Biedl) PCI device hotplug: - acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King) - pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck) - shpchp: Constify hpc_ops structure (Julia Lawall) - ibmphp: Remove unneeded NULL test (Julia Lawall) Power management: - Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski) Virtualization - Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander) MSI: - Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas) - Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko) - Initialize MSI capability for all architectures (Guilherme G. Piccoli) - Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang) ARM Versatile host bridge driver: - Remove unused pci_sys_data structures (Lorenzo Pieralisi) Broadcom iProc host bridge driver: - Hide CONFIG_PCIE_IPROC (Arnd Bergmann) - Do not use 0x in front of %pap (Dmitry V. Krivenok) - Update iProc PCIe device tree binding (Ray Jui) - Add PAXC interface support (Ray Jui) - Add iProc PCIe MSI device tree binding (Ray Jui) - Add iProc PCIe MSI support (Ray Jui) Freescale i.MX6 host bridge driver: - Use gpio_set_value_cansleep() (Fabio Estevam) - Add support for active-low reset GPIO (Petr Štetiar) HiSilicon host bridge driver: - Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni) Intel VMD host bridge driver: - Export irq_domain_set_info() for module use (Keith Busch) - x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch) - Use 32 bit PCI domain numbers (Keith Busch) - Add driver for Intel Volume Management Device (VMD) (Keith Busch) Qualcomm host bridge driver: - Document PCIe devicetree bindings (Stanimir Varbanov) - Add Qualcomm PCIe controller driver (Stanimir Varbanov) - dts: apq8064: add PCIe devicetree node (Stanimir Varbanov) - dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov) Renesas R-Car host bridge driver: - Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa) - Allow DT to override default window settings (Phil Edworthy) - Convert to DT resource parsing API (Phil Edworthy) - Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy) - Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy) - Add runtime PM support to pcie-rcar (Phil Edworthy) - Add Gen2 PHY setup to pcie-rcar (Phil Edworthy) - Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman) - Add gen2 fallback compatibility string for pcie-rcar (Simon Horman) Synopsys DesignWare host bridge driver: - Simplify control flow (Bjorn Helgaas) - Make config accessor override checking symmetric (Bjorn Helgaas) - Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov) Miscellaneous: - Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann) - Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas) - Fix all whitespace issues (Bogicevic Sasa) - x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang) - Use to_pci_dev() instead of open-coding it (Geliang Tang) - Use kobj_to_dev() instead of open-coding it (Geliang Tang) - Use list_for_each_entry() to simplify code (Geliang Tang) - Fix typos in <linux/msi.h> (Thomas Petazzoni) - x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)" * tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits) PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 PCI: Limit config space size for Netronome NFP4000 PCI: Add Netronome NFP4000 PF device ID x86/PCI: Add driver for Intel Volume Management Device (VMD) PCI/AER: Use 32 bit PCI domain numbers x86/PCI: Allow DMA ops specific to a PCI domain irqdomain: Export irq_domain_set_info() for module use PCI: host: Add of_pci_get_host_bridge_resources() stub genirq/MSI: Relax msi_domain_alloc() to support parentless MSI irqdomains PCI: rcar: Add Gen2 PHY setup to pcie-rcar PCI: rcar: Add runtime PM support to pcie-rcar PCI: designware: Make config accessor override checking symmetric PCI: ibmphp: Remove unneeded NULL test ARM: dts: ifc6410: enable PCIe DT node for this board ARM: dts: apq8064: add PCIe devicetree node PCI: hotplug: Use list_for_each_entry() to simplify code PCI: rcar: Remove unused pci_sys_data struct from pcie-rcar PCI: hisi: Add support for HiSilicon Hip06 PCIe host controllers PCI: Avoid iterating through memory outside the resource window PCI: acpiphp_ibm: Fix null dereferences on null ibm_slot ...