summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2016-09-25arch/powerpc: Add CONFIG_FSL_DPAA to corenetXX_smp_defconfigClaudiu Manoil
Enable the drivers on the powerpc arch. Signed-off-by: Roy Pledge <roy.pledge@nxp.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/8xx: make user addr DTLB miss the short pathChristophe Leroy
User space DTLB miss represent approximatly 90% of TLB misses so make it the shortest path. Also remove an unneccessary double jump in FixupDAR Before this patch, we spend 3.3 TB ticks in the handler for each user address miss and 3.4 TB ticks for each kernel address miss After this patch, we send 3.0 TB ticks in the handler for each user address miss and 3.9 TB ticks for each kernel address miss Taking into account that user misses represent 90% of the total, this patch provides an improvement of approx. 9% Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/8xx: Move additional DTLBMiss handlers out of exception areaChristophe Leroy
When all options are activated, there is not enough space for the DTLBMiss handlers that handles IMMR area and linear RAM pages in the exception area once we have added hugepage handling. So lets move them after .0x2000 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/8xx: use r3 to scratch CR in ITLBmissChristophe Leroy
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/8xx: add dedicated machine check handlerChristophe Leroy
During a machine check, the 8xx provides indication of whether the check is due to data or instruction access, so let's display it. Lets also move 8xx specific handling into the new handler. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/8xx: add system_reset_exceptionChristophe Leroy
When the watchdog is in NMI mode, the system reset interrupt is generated when the watchdog counter expires. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/fsl_pci: Size upper inbound window based on RAM sizeScott Wood
This allows PCI devices that can only address (e.g.) 36 or 40 bit DMA to use direct DMA, at the cost of not being able to DMA to non-RAM addresses (this doesn't affect MSIs as there is a separate dedicated window for that) which we wouldn't have been able to do anyway if the RAM size didn't trigger the creation of the second inbound window. It also fixes an off-by-one error that set dma_direct_ops on PCI devices whose dma mask could address all the space below the DMA offset (previously 40 bits), but not the window that starts at the DMA offset. Signed-off-by: Scott Wood <oss@buserror.net> Cc: Tillmann Heidsieck <theidsieck@leenox.de> Tested-by: Tillmann Heidsieck <theidsieck@leenox.de>
2016-09-25powerpc/8xx: use SPRN_EIE and SPRN_EID to enable/disable interruptsChristophe Leroy
The 8xx has two special registers called EID (External Interrupt Disable) and EIE (External Interrupt Enable) for clearing/setting EE in MSR. It avoids the three instructions set mfmsr/ori/mtmsr or mfmsr/rlwinm/mtmsr and it avoids using a general register. We just have to write something in the special register to change MSR EE bit. So we write r0 into the register, regardless of r0 value. Writing to one of those two special registers also set the MSR RI bit, but this bit is only unset during beginning of exception prolog and end of exception epilog. When executing C-functions MSR RI is always set. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/83xx: factor out the common codes of setup arch functionsKevin Hao
Factor out the common codes of setup arch functions to a separate function. It does make no sense to print a board specific info in setup arch functions, so use a more general one. For ASP8347E board, there is no pci device node. So it is safe to invoke mpc83xx_setup_pci() in its setup arch function even there is no such invocation in its original setup arch function. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25soc/fsl/qe: fix Oops on CPM1 (and likely CPM2)Christophe Leroy
Commit 0e6e01ff694ee ("CPM/QE: use genalloc to manage CPM/QE muram") has changed the way muram is managed. genalloc uses kmalloc(), hence requires the SLAB to be up and running. On powerpc 8xx, cpm_reset() is called early during startup. cpm_reset() then calls cpm_muram_init() before SLAB is available, hence the following Oops. cpm_reset() cannot be called during initcalls because the CPM is needed for console. This patch removes the call to cpm_muram_init() from cpm_reset(). cpm_muram_init() will be called from a new function called cpm_init() which is declared as subsys_initcall, unless cpm_muram_alloc() is called earlier for the serial console in which case cpm_muram_init() will be called from there. The reason for calling it from two places is that some drivers (e.g. i2c-cpm) need some of the initialisations done by cpm_muram_init() but don't call cpm_muram_alloc(). The console driver calls cpm_muram_alloc() but some platforms might not use the CPM serial ports for console. [ 0.000000] Unable to handle kernel paging request for data at address 0x00000008 [ 0.000000] Faulting instruction address: 0xc01acce0 [ 0.000000] Oops: Kernel access of bad area, sig: 11 [#1] [ 0.000000] PREEMPT CMPC885 [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14-g0886ed8 #5 [ 0.000000] task: c05183e0 ti: c0536000 task.ti: c0536000 [ 0.000000] NIP: c01acce0 LR: c0011068 CTR: 00000000 [ 0.000000] REGS: c0537e50 TRAP: 0300 Not tainted (4.4.14-s3k-dev-g0886ed8-svn) [ 0.000000] MSR: 00001032 <ME,IR,DR,RI> CR: 28044428 XER: 00000000 [ 0.000000] DAR: 00000008 DSISR: c0000000 GPR00: c0011068 c0537f00 c05183e0 00000000 00009000 ffffffff 00000bc0 ffffffff GPR08: ff003000 ff00b000 ff003bbf 00000000 22044422 100d43a8 00000000 07ff94e8 GPR16: 00000000 07bb5d70 00000000 07ff81f4 07ff81f4 07ff81f4 00000000 00000000 GPR24: 07ffb3a0 07fe7628 c0550000 c7ffa190 c0540000 ff003bbf 00000000 00000001 [ 0.000000] NIP [c01acce0] gen_pool_add_virt+0x14/0xdc [ 0.000000] LR [c0011068] cpm_muram_init+0xd4/0x18c [ 0.000000] Call Trace: [ 0.000000] [c0537f00] [00000200] 0x200 (unreliable) [ 0.000000] [c0537f20] [c0011068] cpm_muram_init+0xd4/0x18c [ 0.000000] [c0537f70] [c0494684] cpm_reset+0xb4/0xc8 [ 0.000000] [c0537f90] [c0494c64] cmpc885_setup_arch+0x10/0x30 [ 0.000000] [c0537fa0] [c0493cd4] setup_arch+0x130/0x168 [ 0.000000] [c0537fb0] [c04906bc] start_kernel+0x88/0x380 [ 0.000000] [c0537ff0] [c0002224] start_here+0x38/0x98 [ 0.000000] Instruction dump: [ 0.000000] 91430010 91430014 80010014 83e1000c 7c0803a6 38210010 4e800020 7c0802a6 [ 0.000000] 9421ffe0 bf61000c 90010024 7c7e1b78 <80630008> 7c9c2378 7cc31c30 3863001f [ 0.000000] ---[ end trace dc8fa200cb88537f ]--- fixes: 0e6e01ff694ee ("CPM/QE: use genalloc to manage CPM/QE muram") Cc: stable@vger.linux.org Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [scottwood: Removed some string changes unrelated to bugfix] Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc/mpic: use of_property_read_boolJulia Lawall
Use of_property_read_bool to check for the existence of a property. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e1,e2; statement S2,S1; @@ - if (of_get_property(e1,e2,NULL)) + if (of_property_read_bool(e1,e2)) S1 else S2 // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc: Convert fsl_rstcr_restart to a reset handlerAndrey Smirnov
Convert fsl_rstcr_restart into a function to be registered with register_reset_handler(). Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> [scottwood: Converted mvme7100 as well] Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc: Call chained reset handlers during resetAndrey Smirnov
Call out to all restart handlers that were added via register_restart_handler() API when restarting the machine. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-25powerpc: Factor out common code in setup-common.cAndrey Smirnov
Factor out a small bit of common code in machine_restart(), machine_power_off() and machine_halt(). Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-24powerpc/sgy_cts1000: Fix gpio_halt_cb()'s signatureAndrey Smirnov
Halt callback in struct machdep_calls is declared with __noreturn attribute, so omitting that attribute in gpio_halt_cb()'s signatrue results in compilation error. Change the signature to address the problem as well as change the code of the function to avoid ever returning from the function. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-24powerpc/e8248e: Select PHYLIB only if NETDEVICES is enabledAndrey Smirnov
Select PHYLIB only if NETDEVICES is enabled and MDIO_BITBANG only if PHYLIB is present to avoid warnings from Kconfig. To prevent undefined references during linking register MDIO driver only if CONFIG_MDIO_BITBANG is enabled. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-24powerpc/mpc85xx_mds: Select PHYLIB only if NETDEVICES is enabledAndrey Smirnov
PHYLIB depends on NETDEVICES, so to avoid unmet dependencies warning from Kconfig it needs to be selected conditionally. Also add checks if PHYLIB is built-in to avoid undefined references to PHYLIB's symbols. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-24powerpc32: Use instruction symbolic names in check_io_access()Christophe Leroy
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-09-23powerpc: Clean up tm_abort duplication in hash_utils_64.cRui Teng
The same logic appears twice and should probably be pulled out into a function. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com> [mpe: Rename to tm_flush_hash_page() and move comment into the function] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/powernv: Fix comment style and spellingAndrew Donnellan
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/32: Remove CLR_TOP32Christophe Leroy
CLR_TOP32() is defined as blank. Last useful instance of CLR_TOP32() was removed by commit 40ef8cbc6d360 ("powerpc: Get 64-bit configs to compile with ARCH=powerpc") in 2005. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc: Fix usage of _PAGE_RO in hugepageChristophe Leroy
On some CPUs like the 8xx, _PAGE_RW hence _PAGE_WRITE is defined as 0 and _PAGE_RO has to be set when a page is not writable _PAGE_RO is defined by default in pte-common.h, however BOOK3S/64 doesn't include that file so _PAGE_RO has to be defined explicitly in book3s/64/pgtable.h Fixes: a7b9f671f2d14 ("powerpc32: adds handling of _PAGE_RO") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/eeh: Skip finding bus until after failure reportingRussell Currey
In eeh_handle_special_event(), eeh_pe_bus_get() is called before calling eeh_report_failure() on every device under a PE. If a PE was missing a bus for some reason, the error would occur before reporting failure, even though eeh_report_failure() doesn't require a bus. Fix this by moving the bus retrieval and error check after the eeh_report_failure() calls. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/powernv/eeh: Skip finding bus for VF resetsRussell Currey
When the PE used in pnv_eeh_reset() is that of a VF, pnv_eeh_reset_vf_pe() is used. Unlike the other reset functions called in pnv_eeh_reset(), the VF reset doesn't require a bus, and if a bus was missing the function would error out before resetting the VF PE. To avoid this, reorder the VF reset function to occur before finding and checking the bus. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/eeh: Null check uses of eeh_pe_bus_getRussell Currey
eeh_pe_bus_get() can return NULL if a PCI bus isn't found for a given PE. Some callers don't check this, and can cause a null pointer dereference under certain circumstances. Fix this by checking NULL everywhere eeh_pe_bus_get() is called. Fixes: 8a6b1bc70dbb ("powerpc/eeh: EEH core to handle special event") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/pseries: Remove unnecessary syscall trampolineNicholas Piggin
When we originally added the ability to split the exception vectors from the kernel (commit 1f6a93e4c35e ("powerpc: Make it possible to move the interrupt handlers away from the kernel" 2008-09-15)), the LOAD_HANDLER() macro used an addi instruction to compute the offset of the common handler from the kernel base address. Using addi meant the handler had to be within 32K of the kernel base address, due to the addi instruction taking a signed immediate value. That necessitated creating a trampoline for the system call handler, because system_call_common (in entry64.S) is not linked within 32K of the kernel base address. Later in commit 61e2390ede3c ("powerpc: Make load_hander handle upto 64k offset" 2012-11-15) we changed LOAD_HANDLER to take a 64K offset, by changing it to use ori. Although system_call_common is not in head_64.S or exceptions-64s.S, it is included in head-y, which causes it to be linked early in the kernel text, so in practice it ends up below 64K. Additionally if it can't be placed below 64K the linker will fail to build with a "relocation truncated to fit" error. So remove the trampoline. Newer toolchains are able to work out that the ori in LOAD_HANDLER only takes a 16 bit offset, and so they generate a 16 bit relocation. Older toolchains (binutils 2.22 at least) are not so smart, so we have to add the @l annotation to tell the assembler to generate a 16 bit relocation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/pseries: Fix HV facility unavailable to use correct handlerNicholas Piggin
The 0xf80 hv_facility_unavailable trampoline branches to the 0xf60 handler. This works because they both do the same thing, but it should be fixed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/powernv/pci: Add PHB register dump debugfs handleRussell Currey
On EEH events the kernel will print a dump of relevant registers. If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform doesn't have EEH support, etc) this information isn't readily available. Add a new debugfs handler to trigger a PHB register dump, so that this information can be made available on demand. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/64/kexec: Remove BookE special default_machine_kexec_prepare()Benjamin Herrenschmidt
The only difference is now the TCE table check which doesn't need to be ifdef'ed out, it will basically do nothing on BookE (it is only useful for ancient IBM machines). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/64/kexec: Copy image with MMU off when possibleBenjamin Herrenschmidt
Currently we turn the MMU off after copying the image, and we make sure there is no overlap between the hash table and the target pages in that case. That doesn't work for Radix however. In that case, the page tables are scattered and we can't really enforce that the target of the image isn't overlapping one of them. So instead, let's turn the MMU off before copying the image in radix mode. Thankfully, in radix mode, even under a hypervisor, we know we don't have the same kind of RMA limitations that hash mode has. While at it, also turn the MMU off early when using hash in non-LPAR mode, that way we can get rid of the collision check completely. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/mm: Add radix flush all with IS=3Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/64/kexec: Fix MMU cleanup on radixBenjamin Herrenschmidt
Just using the hash ops won't work anymore since radix will have NULL in there. Instead create an mmu_cleanup_all() function which will do the right thing based on the MMU mode. For Radix, for now I clear UPRT and the PTCR, effectively switching back to Radix with no partition table setup. Currently set it to NULL on BookE thought it might be a good idea to wipe the TLB there (Scott ?) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23powerpc/64/kexec: NULL check "clear_all" in kexec_sequenceBenjamin Herrenschmidt
With Radix, it can be NULL even on !BOOKE these days so replace the ifdef with a NULL check which is cleaner anyway. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc: Remove all usages of NO_IRQMichael Ellerman
NO_IRQ has been == 0 on powerpc for just over ten years (since commit 0ebfff1491ef ("[POWERPC] Add new interrupt mapping core and change platforms to use it")). It's also 0 on most other arches. Although it's fairly harmless, every now and then it causes confusion when a driver is built on powerpc and another arch which doesn't define NO_IRQ. There's at least 6 definitions of NO_IRQ in drivers/, at least some of which are to work around that problem. So we'd like to remove it. This is fairly trivial in the arch code, we just convert: if (irq == NO_IRQ) to if (!irq) if (irq != NO_IRQ) to if (irq) irq = NO_IRQ; to irq = 0; return NO_IRQ; to return 0; And a few other odd cases as well. At least for now we keep the #define NO_IRQ, because there is driver code that uses NO_IRQ and the fixes to remove those will go via other trees. Note we also change some occurrences in PPC sound drivers, drivers/ps3, and drivers/macintosh. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/pseries: fix memory leak in queue_hotplug_event() error pathAndrew Donnellan
If we fail to allocate work, we don't end up using hp_errlog_copy. Free it in the error path. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/nvram: Fix an incorrect partition mergePan Xinhui
When we merge two contiguous partitions whose signatures are marked NVRAM_SIG_FREE, We need update prev's length and checksum, then write it to nvram, not cur's. So lets fix this mistake now. Also use memset instead of strncpy to set the partition's name. It's more readable if we want to fill up with duplicate chars . Fixes: fa2b4e54d41f ("powerpc/nvram: Improve partition removal") Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/nvram: Fix a memory leak in err pathPan Xinhui
If kmemdup fails, We need kfree *buff* first then return -ENOMEM. Otherwise there is a memory leak. Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/64s: Optimise MSR handling in exception handlingNicholas Piggin
mtmsrd with L=1 only affects MSR_EE and MSR_RI bits, and we always know what state those bits are, so the kernel MSR does not need to be loaded when modifying them. mtmsrd is often in the critical execution path, so avoiding dependency on even L1 load is noticable. On a POWER8 this saves about 3 cycles from the syscall path, and possibly a few from other exception returns (not measured). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/64: Optimise syscall entry for virtual, relocatable caseNicholas Piggin
The mflr r10 instruction was left over from when the code used LR to branch to system_call_entry from the exception handler. That was changed by commit 6a404806dfce ("powerpc: Avoid link stack corruption in MMU on syscall entry path") to use the count register. The value is never used now, so mflr can be removed, and r10 can be used for storage rather than spilling to the SPR scratch register. The scratch register spill causes a long pipeline stall due to the SPR read after write. This change brings getppid syscall cost from 406 to 376 cycles on POWER8. getppid for non-relocatable case is 371 cycles. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4KAneesh Kumar K.V
For hugetlb to work with 4K page size, we need MAX_ORDER to be 13 or more. When switching from a 64K page size to 4K linux page size using make oldconfig, we end up with a CONFIG_FORCE_MAX_ZONEORDER value of 9. This results in a 16M hugepage beiing considered as a gigantic huge page which in turn results in failure to setup hugepages if gigantic hugepage support is not enabled. This also results in kernel crash with 4K radix configuration. We hit the below BUG_ON on radix: kernel BUG at mm/huge_memory.c:364! Oops: Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=2048 NUMA PowerNV CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1-00006-gbae9cc6 #1 task: c0000000f1af8000 task.stack: c0000000f1aec000 NIP: c000000000c5fa0c LR: c000000000c5f9d8 CTR: c000000000c5f9a4 REGS: c0000000f1aef920 TRAP: 0700 Not tainted (4.8.0-rc1-00006-gbae9cc6) MSR: 9000000102029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE,TM[E]> CR: 24000844 XER: 00000000 CFAR: c000000000c5f9e0 SOFTE: 1 .... NIP [c000000000c5fa0c] hugepage_init+0x68/0x238 LR [c000000000c5f9d8] hugepage_init+0x34/0x238 Fixes: a7ee539584acf ("powerpc/Kconfig: Update config option based on page size") Cc: stable@vger.kernel.org # v4.7+ Reported-by: Santhosh <santhog4@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20powerpc/64: Replay hypervisor maintenance interrupt firstNicholas Piggin
The HMI (Hypervisor Maintenance Interrupt) is defined by the architecture to be higher priority than other maskable interrupts, so replay it first, as a best-effort to replay according to hardware priorities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19powerpc: Ensure .mem(init|exit).text are within _stext/_etextMichael Ellerman
In our linker script we open code the list of text sections, because we need to include the __ftr_alt sections, which are arch-specific. This means we can't use TEXT_TEXT as defined in vmlinux.lds.h, and so we don't have the MEM_KEEP() logic for memory hotplug sections. If we build the kernel with the gold linker, and with CONFIG_MEMORY_HOTPLUG=y, we see that functions marked __meminit can end up outside of the _stext/_etext range, and also outside of _sinittext/_einittext, eg: c000000000000000 T _stext c0000000009e0000 A _etext c0000000009e3f18 T hash__vmemmap_create_mapping c000000000ca0000 T _sinittext c000000000d00844 T _einittext This causes them to not be recognised as text by is_kernel_text(), and prevents them being patched by jump_label (and presumably ftrace/kprobes etc.). Fix it by adding MEM_KEEP() directives, mirroring what TEXT_TEXT does. This isn't a problem when CONFIG_MEMORY_HOTPLUG=n, because we use the standard INIT_TEXT_SECTION() and EXIT_TEXT macros from vmlinux.lds.h. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19powerpc: Don't change the section in _GLOBAL()Michael Ellerman
Currently the _GLOBAL() macro unilaterally sets the assembler section to ".text" at the start of the macro. This is rude as the caller may be using a different section. So let the caller decide which section to emit the code into. On big endian we do need to switch to the ".opd" section to emit the OPD, but do that with pushsection/popsection, thereby leaving the original section intact. I verified that the order of all entries in System.map is unchanged after this patch. The actual addresses shift around slightly so you can't just diff the System.map. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19powerpc/kernel: Use kprobe blacklist for asm functionsNicholas Piggin
Rather than forcing the whole function into the ".kprobes.text" section, just add the symbol's address to the kprobe blacklist. This also lets us drop the three versions of the_KPROBE macro, in exchange for just one version of _ASM_NOKPROBE_SYMBOL - which is a good cleanup. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19powerpc: Use kprobe blacklist for exception handlersNicholas Piggin
Currently we mark the C implementations of some exception handlers as __kprobes. This has the effect of putting them in the ".kprobes.text" section, which separates them from the rest of the text. Instead we can use the blacklist macros to add the symbols to a blacklist which kprobes will check. This allows the linker to move exception handler functions close to callers and avoids trampolines in larger kernels. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Reword change log a bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc: Set used_(vsr|vr|spe) in sigreturn path when MSR bits are activeSimon Guo
Normally, when MSR[VSX/VR/SPE] bits == 1, the used_vsr/used_vr/used_spe bit have already been set. However when loading a signal frame from user space we need to explicitly set used_vsr/used_vr/used_spe to make them consistent with the MSR bits from the signal frame. For example, CRIU application, who utilizes sigreturn to restore checkpointed process, will lead to the case where MSR[VSX] bit is active in signal frame, but used_vsr bit is not set in the kernel. (the same applies to VR/SPE). This patch fixes this by always setting used_* bit when MSR related bits are active in signal frame and we are doing sigreturn. Based on a proposal by Benh. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> [mpe: Massage change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc/ptrace: Fix cppcheck issue in gpr32_set_common/gpr32_get_common()Simon Guo
The ckpt_regs usage in gpr32_set_common/gpr32_get_common() will lead to following cppcheck error at ifndef CONFIG_PPC_TRANSACTIONAL_MEM case: [arch/powerpc/kernel/ptrace.c:2062]: (error) Uninitialized variable: ckpt_regs [arch/powerpc/kernel/ptrace.c:2130]: (error) Uninitialized variable: ckpt_regs The problem is due to gpr32_set_common() used ckpt_regs variable which only makes sense at #ifdef CONFIG_PPC_TRANSACTIONAL_MEM. This patch fix this issue by passing in "regs" parameter instead. Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc/32: Add missing \n and switch to pr_warn()Colin Ian King
The message is missing a \n, add it. Switch to pr_warn(), it's shorter and less ugly. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc/mm: Update the HID bit when switching from radix to hashAneesh Kumar K.V
Power9 DD1 requires to update the hid0 register when switching from hash to radix. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc/mm/radix: Use different pte update sequence for different POWER9 revsAneesh Kumar K.V
POWER9 DD1 requires pte to be marked invalid (V=0) before updating it with the new value. This makes this distinction for the different revisions. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>