summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2016-03-09powerpc/eeh: eeh_pci_enable(): fix checking of post-request stateAndrew Donnellan
In eeh_pci_enable(), after making the request to set the new options, we call eeh_ops->wait_state() to check that the request finished successfully. At the moment, if eeh_ops->wait_state() returns 0, we return 0 without checking that it reflects the expected outcome. This can lead to callers further up the chain incorrectly assuming the slot has been successfully unfrozen and continuing to attempt recovery. On powernv, this will occur if pnv_eeh_get_pe_state() or pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or OPAL_EEH_PHB_ERROR respectively. On pseries, this will occur if pseries_eeh_get_state() returns 0, which in turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA Stopped states. Obviously, none of these cases represent a successful completion of a request to thaw MMIO or DMA. Fix the check so that a wait_state() return value of 0 won't be considered successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()Gavin Shan
When eeh_dump_pe_log() is only called by eeh_slot_error_detail(), we already have the check that the PE isn't in PCI config blocked state in eeh_slot_error_detail(). So we needn't the duplicated check in eeh_dump_pe_log(). This removes the duplicated check in eeh_dump_pe_log(). No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Synchronize recovery in host/guestGavin Shan
When passing through SRIOV VFs to guest, we possibly encounter EEH error on PF. In this case, the VF PEs are put into frozen state. The error could be reported to guest before it's captured by the host. That means the guest could attempt to recover errors on VFs before host gets chance to recover errors on PFs. The VFs won't be recovered successfully. This enforces the recovery order for above case: the recovery on child PE in guest is hold until the recovery on parent PE in host is completed. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Don't remove passed VFsGavin Shan
When we have partial hotplug as part of the error recovery on PF, the VFs that are bound with vfio-pci driver will experience hotplug. That's not allowed. This checks if the VF PE is passed or not. If it does, we leave the VF without removing it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Don't propagate error to guestGavin Shan
When EEH error happened to the parent PE of those PEs that have been passed through to guest, the error is propagated to guest domain and the VFIO driver's error handlers are called. It's not correct as the error in the host domain shouldn't be propagated to guests and affect them. This adds one more limitation when calling EEH error handlers. If the PE has been passed through to guest, the error handlers won't be called. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: powerpc/eeh: Support error recovery for VF PEWei Yang
PFs are enumerated on PCI bus, while VFs are created by PF's driver. In EEH recovery, it has two cases: 1. Device and driver is EEH aware, error handlers are called. 2. Device and driver is not EEH aware, un-plug the device and plug it again by enumerating it. The special thing happens on the second case. For a PF, we could use the original pci core to enumerate the bus, while for VF we need to record the VFs which aer un-plugged then plug it again. Also The patch caches the VF index in pci_dn, which can be used to calculate VF's bus, device and function number. Those information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/powernv: Support PCI config restore for VFsWei Yang
After PE reset, OPAL API opal_pci_reinit() is called on all devices contained in the PE to reinitialize them. While skiboot is not aware of VFs, we have to implement the function in kernel to reinitialize VFs after reset on PE for VFs. In this patch, two functions pnv_pci_fixup_vf_mps() and pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a VF it has three cases. 1. Normal creation for a VF In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper value compared with its parent. 2. EEH recovery without VF removed In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is called to restore it and reinitialize other part. 3. EEH recovery with VF removed In this case, VF will be removed then re-created. Both functions are called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper thing. This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's MPS to make sure it is equal to parent's and store this value in pci_dn for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by restoring MPS, disabling completion timeout, enabling SERR, etc. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/powernv: Support EEH reset for VF PEWei Yang
PEs for VFs don't have primary bus. So they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained in the PE. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Create PE for VFsWei Yang
This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: EEH device for VFWei Yang
VFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Cache normal BARs, not windows or IOV BARsWei Yang
This restricts the EEH address cache to use only the first 7 BARs. This makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs. As the result of this change, eeh_addr_cache_get_dev() will return VFs from VF's resource addresses instead of parent PFs. This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev() to 7 BARs and this effectively excludes PCI bridges from being cached. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/pci: Remove VFs prior to PFWei Yang
As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove") indicates, VFs which is on the same PCI bus as their PF, should be removed before the PF. Otherwise, we might run into kernel crash at PCI unplugging time. This applies the above pattern to powerpc PCI hotplug path. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09powerpc/eeh: Reworked eeh_pe_bus_get()Gavin Shan
The original implementation is ugly: unnecessary if statements and "out" tag. This reworks the function to avoid above weaknesses. No functional changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: Move hash64 tlbflush code into a new headerAneesh Kumar K.V
No code changes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: Move hash related mmu-*.h headers to book3s/Aneesh Kumar K.V
No code changes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: add _PAGE_HASHPTE similar to 4K hashAneesh Kumar K.V
We don't need to update linux page table entry with _PAGE_HASHPTE early in hash pte fault. A parallel pte update will loop via _PAGE_BUSY and look at _PAGE_HASHPTE for a required hpte flush only if _PAGE_BUSY is cleared. That ensures a pte update will wait for a parallel hpte insert to finish before looking at _PAGE_HASHPTE bit. To avoid further confusion drop setting _PAGE_HASHPTE in cmpxchg in __hash_page_4K. commit 41743a4e34f0 ("powerpc: Free a PTE bit on ppc64 with 64K pages") did similar change for 64K config Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerp/mm: Update code commentsAneesh Kumar K.V
We are updating pte in those functions. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03mm: Some arch may want to use HPAGE_PMD related values as variablesKirill A. Shutemov
With next generation power processor, we are having a new mmu model [1] that require us to maintain a different linux page table format. Inorder to support both current and future ppc64 systems with a single kernel we need to make sure kernel can select between different page table format at runtime. With the new MMU (radix MMU) added, we will have two different pmd hugepage size 16MB for hash model and 2MB for Radix model. Hence make HPAGE_PMD related values as a variable. Actual conversion of HPAGE_PMD to a variable for ppc64 happens in a followup patch. [1] http://ibm.biz/power-isa3 (Needs registration). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: Switch book3s 64 with 64K page size to 4 level page tableAneesh Kumar K.V
This is needed so that we can support both hash and radix page table using single kernel. Radix kernel uses a 4 level table. We now use physical address in upper page table tree levels. Even though they are aligned to their size, for the masked bits we use the bit positions as per PowerISA 3.0. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: Don't have conditional defines for real_pte_tAneesh Kumar K.V
We remove real_pte_t out of STRICT_MM_TYPESCHECK. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-03powerpc/mm: Split pgtable types to separate headerAneesh Kumar K.V
We move the page table accessors into a separate header. We will later add a big endian variant of the table which is needed for radix. No functionality change only code movement. Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Add the ability to save VSX without giving it upCyril Bur
This patch adds the ability to be able to save the VSX registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU and VEC registers in the thread copy path to avoid a possibly pointless reload of VSX state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Add the ability to save Altivec without giving it upCyril Bur
This patch adds the ability to be able to save the VEC registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU registers in the thread copy path to avoid a possibly pointless reload of VEC state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Add the ability to save FPU without giving it upCyril Bur
This patch adds the ability to be able to save the FPU registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch optimises the thread copy path (as a result of a fork() or clone()) so that the parent thread can return to userspace with hot registers avoiding a possibly pointless reload of FPU register state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in twoCyril Bur
This prepares for the decoupling of saving {fpu,altivec,vsx} registers and marking {fpu,altivec,vsx} as being unused by a thread. Currently giveup_{fpu,altivec,vsx}() does both however optimisations to task switching can be made if these two operations are decoupled. save_all() will permit the saving of registers to thread structs and leave threads MSR with bits enabled. This patch introduces no functional change. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Restore FPU/VEC/VSX if previously usedCyril Bur
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not a problem unless a process is using these facilities. Modern versions of GCC are very good at automatically vectorising code, new and modernised workloads make use of floating point and vector facilities, even the kernel makes use of vectorised memcpy. All this combined greatly increases the cost of a syscall since the kernel uses the facilities sometimes even in syscall fast-path making it increasingly common for a thread to take an *_unavailable exception soon after a syscall, not to mention potentially taking all three. The obvious overcompensation to this problem is to simply always load all the facilities on every exit to userspace. Loading up all FPU, VEC and VSX registers every time can be expensive and if a workload does avoid using them, it should not be forced to incur this penalty. An 8bit counter is used to detect if the registers have been used in the past and the registers are always loaded until the value wraps to back to zero. Several versions of the assembly in entry_64.S were tested: 1. Always calling C. 2. Performing a common case check and then calling C. 3. A complex check in asm. After some benchmarking it was determined that avoiding C in the common case is a performance benefit (option 2). The full check in asm (option 3) greatly complicated that codepath for a negligible performance gain and the trade-off was deemed not worth it. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Move load_vec in the struct to fill an existing hole, reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup
2016-03-02powerpc: Explicitly disable math features when copying threadCyril Bur
Currently when threads get scheduled off they always giveup the FPU, Altivec (VMX) and Vector (VSX) units if they were using them. When they are scheduled back on a fault is then taken to enable each facility and load registers. As a result explicitly disabling FPU/VMX/VSX has not been necessary. Future changes and optimisations remove this mandatory giveup and fault which could cause calls such as clone() and fork() to copy threads and run them later with FPU/VMX/VSX enabled but no registers loaded. This patch starts the process of having MSR_{FP,VEC,VSX} mean that a threads registers are hot while not having MSR_{FP,VEC,VSX} means that the registers must be loaded. This allows for a smarter return to userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc/mm: Split hash page table sizing heuristic into a helperDavid Gibson
htab_get_table_size() either retrieve the size of the hash page table (HPT) from the device tree - if the HPT size is determined by firmware - or uses a heuristic to determine a good size based on RAM size if the kernel is responsible for allocating the HPT. To support a PAPR extension allowing resizing of the HPT, we're going to want the memory size -> HPT size logic elsewhere, so split it out into a helper function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/mm: Clean up memory hotplug failure pathsDavid Gibson
This makes a number of cleanups to handling of mapping failures during memory hotplug on Power: For errors creating the linear mapping for the hot-added region: * This is now reported with EFAULT which is more appropriate than the previous EINVAL (the failure is unlikely to be related to the function's parameters) * An error in this path now prints a warning message, rather than just silently failing to add the extra memory. * Previously a failure here could result in the region being partially mapped. We now clean up any partial mapping before failing. For errors creating the vmemmap for the hot-added region: * This is now reported with EFAULT instead of causing a BUG() - this could happen for external reason (e.g. full hash table) so it's better to handle this non-fatally * An error message is also printed, so the failure won't be silent * As above a failure could cause a partially mapped region, we now clean this up. [mpe: move htab_remove_mapping() out of #ifdef CONFIG_MEMORY_HOTPLUG to enable this] Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/mm: Handle removing maybe-present bolted HPTEsDavid Gibson
At the moment the hpte_removebolted callback in ppc_md returns void and will BUG_ON() if the hpte it's asked to remove doesn't exist in the first place. This is awkward for the case of cleaning up a mapping which was partially made before failing. So, we add a return value to hpte_removebolted, and have it return ENOENT in the case that the HPTE to remove didn't exist in the first place. In the (sole) caller, we propagate errors in hpte_removebolted to its caller to handle. However, we handle ENOENT specially, continuing to complete the unmapping over the specified range before returning the error to the caller. This means that htab_remove_mapping() will work sanely on a partially present mapping, removing any HPTEs which are present, while also returning ENOENT to its caller in case it's important there. There are two callers of htab_remove_mapping(): - In remove_section_mapping() we already WARN_ON() any error return, which is reasonable - in this case the mapping should be fully present - In vmemmap_remove_mapping() we BUG_ON() any error. We change that to just a WARN_ON() in the case of ENOENT, since failing to remove a mapping that wasn't there in the first place probably shouldn't be fatal. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/mm: Clean up error handling for htab_remove_mappingDavid Gibson
Currently, the only error that htab_remove_mapping() can report is -EINVAL, if removal of bolted HPTEs isn't implemeted for this platform. We make a few clean ups to the handling of this: * EINVAL isn't really the right code - there's nothing wrong with the function's arguments - use ENODEV instead * We were also printing a warning message, but that's a decision better left up to the callers, so remove it * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on error, making the warning message redundant, so no change is needed there. * The other caller is remove_section_mapping(). This is called in the memory hot remove path at a point after vmemmap_remove_mapping() so if hpte_removebolted isn't implemented, we'd expect to have already BUG()ed anyway. Put a WARN_ON() here, in lieu of a printk() since this really shouldn't be happening. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc: Fix misspellings in comments.Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/ps3: gelic_udbg: use struct udphdr from <linux/udp.h>Luis Henriques
Instead of defining a local version of struct udphdr use the standard definition from <linux/udp.h>. The 'src' field is named 'source' in the <linux/udp.h> definition. Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/ps3: gelic_udbg: use struct iphdr from <linux/ip.h>Luis Henriques
Instead of defining a local version of struct iphdr use the standard definition from <linux/ip.h>. Several fields in the <linux/ip.h> definition have different names: - proto -> protocol - src -> saddr - dest -> daddr - total_length -> tot_len - checksum -> check Also, 'ver_len' is composed by 'version' and 'ihl' in <linux/ip.h>. Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/ps3: gelic_udbg: use struct vlan_hdr from <linux/if_vlan.h>Luis Henriques
Instead of defining the local struct vlantag use the standard definition of vlan_hdr from <linux/if_vlan.h>. The fields in the <linux/if_vlan.h> definition have different names: - vlan -> h_vlan_TCI - subtype -> h_vlan_encapsulated_proto While there, use also the ETH_P_IP macro instead of an hard-coded 0x0800 value. Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-01powerpc/ps3: gelic_udbg: use struct ethhdr from <linux/if_ether.h>Luis Henriques
Instead of defining a local version of struct ethhdr use the standard definition from <linux/if_ether.h>. The fields in the <linux/if_ether.h> definition have different names: - dest -> h_dest - src -> h_source - type -> h_proto While there, use a few other standard functions/macros: - eth_broadcast_addr (instead of a memset) - ETH_ALEN - ETH_P_8021Q Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Expand the real page number field of the Linux PTEPaul Mackerras
Now that other PTE fields have been moved out of the way, we can expand the RPN field of the PTE on 64-bit Book 3S systems and align it with the RPN field in the radix PTE format used by PowerISA v3.0 CPUs in radix mode. For 64k page size, this means we need to move the _PAGE_COMBO and _PAGE_4K_PFN bits. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Move software-used bits in PTEPaul Mackerras
This moves the _PAGE_SPECIAL and _PAGE_SOFT_DIRTY bits in the Linux PTE on 64-bit Book 3S systems to bit positions which are designated for software use in the radix PTE format used by PowerISA v3.0 CPUs in radix mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Shuffle read, write, execute and user bits in PTEPaul Mackerras
This moves the _PAGE_EXEC, _PAGE_RW and _PAGE_USER bits around in the Linux PTE on 64-bit Book 3S systems to correspond with the bit positions used in radix mode by PowerISA v3.0 CPUs. This also adds a _PAGE_READ bit corresponding to the read permission bit in the radix PTE. _PAGE_READ is currently unused but could possibly be used in future to improve pte_protnone(). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Move HPTE-related bits in PTE to upper endPaul Mackerras
This moves the _PAGE_HASHPTE, _PAGE_F_GIX and _PAGE_F_SECOND fields in the Linux PTE on 64-bit Book 3S systems to the most significant byte. Of the 5 bits, one is a software-use bit and the other four are reserved bit positions in the PowerISA v3.0 radix PTE format. Using these bits is OK because these bits are all to do with tracking the HPTE(s) associated with the Linux PTE, and therefore won't be needed in radix mode. This frees up bit positions in the lower two bytes. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Move _PAGE_PTE to 2nd most significant bitPaul Mackerras
This changes _PAGE_PTE for 64-bit Book 3S processors from 0x1 to 0x4000_0000_0000_0000, because that bit is used as the L (leaf) bit by PowerISA v3.0 CPUs in radix mode. The "leaf" bit indicates that the PTE points to a page directly rather than another radix level, which is what the _PAGE_PTE bit means. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Move _PAGE_PRESENT to the most significant bitPaul Mackerras
This changes _PAGE_PRESENT for 64-bit Book 3S processors from 0x2 to 0x8000_0000_0000_0000, because that is where PowerISA v3.0 CPUs in radix mode will expect to find it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29powerpc/mm/book3s-64: Use physical addresses in upper page table tree levelsPaul Mackerras
This changes the Linux page tables to store physical addresses rather than kernel virtual addresses in the upper levels of the tree (pgd, pud and pmd) for 64-bit Book 3S machines. This also changes the hugepd pointers used to implement hugepages when the base page size is 4k to store physical addresses rather than virtual addresses (again just for 64-bit Book3S machines). This frees up some high order bits, and will be needed with PowerISA v3.0 machines which read the page table tree in hardware in radix mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-27powerpc/mm/book3s-64: Free up 7 high-order bits in the Linux PTEPaul Mackerras
This frees up bits 57-63 in the Linux PTE on 64-bit Book 3S machines. In the 4k page case, this is done just by reducing the size of the RPN field to 39 bits, giving 51-bit real addresses. In the 64k page case, we had 10 unused bits in the middle of the PTE, so this moves the RPN field down 10 bits to make use of those unused bits. This means the RPN field is now 3 bits larger at 37 bits, giving 53-bit real addresses in the normal case, or 49-bit real addresses for the special 4k PFN case. We are doing this in order to be able to move some other PTE bits into the positions where PowerISA V3.0 processors will expect to find them in radix-tree mode. Ultimately we will be able to move the RPN field to lower bit positions and make it larger. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-27powerpc/mm/book3s-64: Clean up some obsolete or misleading commentsPaul Mackerras
No code changes. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-25Merge tag 'powerpc-4.5-4' into nextMichael Ellerman
Pull in our current fixes from 4.5, in particular the "Fix Multi hit ERAT" bug is causing folks some grief when testing next.
2016-02-24powerpc: Fix BUG_ON() reporting in real modeBalbir Singh
I ran into this issue while debugging an early boot problem. The system hit a BUG_ON() but report bug failed to print the line number and file name. The reason being that the system was running in real mode and report_bug() searches for addresses in the PAGE_OFFSET+ region. Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24powerpc: Use BUILD_BUG_ON_MSG() for unsupported {cmp}xchg sizespan xinhui
__xchg_called_with_bad_pointer() can't tell us which code uses {cmp}xchg with an unsupported size, and no error is reported until the link stage. To make such problems easier to debug, use BUILD_BUG_ON_MSG() instead. Signed-off-by: pan xinhui <xinhui.pan@linux.vnet.ibm.com> [mpe: Tweak change log wording & add relaxed/acquire] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup
2016-02-24powerpc/powernv: Add AST graphics driver to powernv_defconfigJeremy Kerr
Most current OpenPOWER platforms have an AST BMC, so add graphics support via the AST DRM driver. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-24powerpc/powernv: Add powernv firmware interface drivers to powernv_defconfigJeremy Kerr
There are a few firmware-provided interfaces for OpenPOWER platforms: the PRD infrastructure, IPMI support, and MTD access to the PNOR flash. This change adds these to powernv_defconfig Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>