summaryrefslogtreecommitdiffstats
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2016-01-19Merge tag 'iommu-updates-v4.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "The updates include: - Small code cleanups in the AMD IOMMUv2 driver - Scalability improvements for the DMA-API implementation of the AMD IOMMU driver. This is just a starting point, but already showed some good improvements in my tests. - Removal of the unused Renesas IPMMU/IPMMUI driver - Updates for ARM-SMMU include: * Some fixes to get the driver working nicely on Broadcom hardware * A change to the io-pgtable API to indicate the unit in which to flush (all callers converted, with Ack from Laurent) * Use of devm_* for allocating/freeing the SMMUv3 buffers - Some other small fixes and improvements for other drivers" * tag 'iommu-updates-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (46 commits) iommu/vt-d: Fix up error handling in alloc_iommu iommu/vt-d: Check the return value of iommu_device_create() iommu/amd: Remove an unneeded condition iommu/amd: Preallocate dma_ops apertures based on dma_mask iommu/amd: Use trylock to aquire bitmap_lock iommu/amd: Make dma_ops_domain->next_index percpu iommu/amd: Relax locking in dma_ops path iommu/amd: Initialize new aperture range before making it visible iommu/amd: Build io page-tables with cmpxchg64 iommu/amd: Allocate new aperture ranges in dma_ops_alloc_addresses iommu/amd: Optimize dma_ops_free_addresses iommu/amd: Remove need_flush from struct dma_ops_domain iommu/amd: Iterate over all aperture ranges in dma_ops_area_alloc iommu/amd: Flush iommu tlb in dma_ops_free_addresses iommu/amd: Rename dma_ops_domain->next_address to next_index iommu/amd: Remove 'start' parameter from dma_ops_area_alloc iommu/amd: Flush iommu tlb in dma_ops_aperture_alloc() iommu/amd: Retry address allocation within one aperture iommu/amd: Move aperture_range.offset to another cache-line iommu/amd: Add dma_ops_aperture_alloc() function ...
2016-01-19Merge branches 's390', 'arm/renesas', 'arm/msm', 'arm/shmobile', 'arm/smmu', ↵Joerg Roedel
'x86/amd' and 'x86/vt-d' into next
2016-01-11Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "The main changes in this cycle were: - code patching and cpu_has cleanups (Borislav Petkov) - paravirt cleanups (Juergen Gross) - TSC cleanup (Thomas Gleixner) - ptrace cleanup (Chen Gang)" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arch/x86/kernel/ptrace.c: Remove unused arg_offs_table x86/mm: Align macro defines x86/cpu: Provide a config option to disable static_cpu_has x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros x86/cpufeature: Cleanup get_cpu_cap() x86/cpufeature: Move some of the scattered feature bits to x86_capability x86/paravirt: Remove paravirt ops pmd_update[_defer] and pte_update_defer x86/paravirt: Remove unused pv_apic_ops structure x86/tsc: Remove unused tsc_pre_init() hook x86: Remove unused function cpu_has_ht_siblings() x86/paravirt: Kill some unused patching functions
2016-01-07iommu/vt-d: Fix up error handling in alloc_iommuJoerg Roedel
Only check for error when iommu->iommu_dev has been assigned and only assign drhd->iommu when the function can't fail anymore. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-01-07iommu/vt-d: Check the return value of iommu_device_create()Nicholas Krause
This adds the proper check to alloc_iommu to make sure that the call to iommu_device_create has completed successfully and if not return the error code to the caller after freeing up resources allocated previously. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-01-07iommu/dma: Use correct offset in map_sgRobin Murphy
When mapping a non-page-aligned scatterlist entry, we copy the original offset to the output DMA address before aligning it to hand off to iommu_map_sg(), then later adding the IOVA page address portion to get the final mapped address. However, when the IOVA page size is smaller than the CPU page size, it is the offset within the IOVA page we want, not that within the CPU page, which can easily be larger than an IOVA page and thus result in an incorrect final address. Fix the bug by taking only the IOVA-aligned part of the offset as the basis of the DMA address, not the whole thing. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-01-07iommu/amd: Remove an unneeded conditionDan Carpenter
get_device_id() returns an unsigned short device id. It never fails and it never returns a negative so we can remove this condition. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Preallocate dma_ops apertures based on dma_maskJoerg Roedel
Preallocate between 4 and 8 apertures when a device gets it dma_mask. With more apertures we reduce the lock contention of the domain lock significantly. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Use trylock to aquire bitmap_lockJoerg Roedel
First search for a non-contended aperture with trylock before spinning. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Make dma_ops_domain->next_index percpuJoerg Roedel
Make this pointer percpu so that we start searching for new addresses in the range we last stopped and which is has a higher probability of being still in the cache. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Relax locking in dma_ops pathJoerg Roedel
Remove the long holding times of the domain->lock and rely on the bitmap_lock instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Initialize new aperture range before making it visibleJoerg Roedel
Make sure the aperture range is fully initialized before it is visible to the address allocator. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Build io page-tables with cmpxchg64Joerg Roedel
This allows to build up the page-tables without holding any locks. As a consequence it removes the need to pre-populate dma_ops page-tables. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Allocate new aperture ranges in dma_ops_alloc_addressesJoerg Roedel
It really belongs there and not in __map_single. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Optimize dma_ops_free_addressesJoerg Roedel
Don't flush the iommu tlb when we free something behind the current next_bit pointer. Update the next_bit pointer instead and let the flush happen on the next wraparound in the allocation path. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Remove need_flush from struct dma_ops_domainJoerg Roedel
The flushing of iommu tlbs is now done on a per-range basis. So there is no need anymore for domain-wide flush tracking. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Iterate over all aperture ranges in dma_ops_area_allocJoerg Roedel
This way we don't need to care about the next_index wrapping around in dma_ops_alloc_addresses. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Flush iommu tlb in dma_ops_free_addressesJoerg Roedel
Instead of setting need_flush, do the flush directly in dma_ops_free_addresses. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Rename dma_ops_domain->next_address to next_indexJoerg Roedel
It points to the next aperture index to allocate from. We don't need the full address anymore because this is now tracked in struct aperture_range. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Remove 'start' parameter from dma_ops_area_allocJoerg Roedel
Parameter is not needed because the value is part of the already passed in struct dma_ops_domain. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Flush iommu tlb in dma_ops_aperture_alloc()Joerg Roedel
Since the allocator wraparound happens in this function now, flush the iommu tlb there too. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Retry address allocation within one apertureJoerg Roedel
Instead of skipping to the next aperture, first try again in the current one. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Move aperture_range.offset to another cache-lineJoerg Roedel
Moving it before the pte_pages array puts in into the same cache-line as the spin-lock and the bitmap array pointer. This should safe a cache-miss. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Add dma_ops_aperture_alloc() functionJoerg Roedel
Make this a wrapper around iommu_ops_area_alloc() for now and add more logic to this function later on. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Pass correct shift to iommu_area_alloc()Joerg Roedel
The page-offset of the aperture must be passed instead of 0. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Flush the IOMMU TLB before the addresses are freedJoerg Roedel
This allows to keep the bitmap_lock only for a very short period of time. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Flush IOMMU TLB on __map_single error pathJoerg Roedel
There have been present PTEs which in theory could have made it to the IOMMU TLB. Flush the addresses out on the error path to make sure no stale entries remain. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Introduce bitmap_lock in struct aperture_rangeJoerg Roedel
This lock only protects the address allocation bitmap in one aperture. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Move 'struct dma_ops_domain' definition to amd_iommu.cJoerg Roedel
It is only used in this file anyway, so keep it there. Same with 'struct aperture_range'. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/amd: Warn only once on unexpected pte valueJoerg Roedel
This prevents possible flooding of the kernel log. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/ipmmu-vmsa: Don't truncate ttbr if LPAE is not enabledGeert Uytterhoeven
If CONFIG_PHYS_ADDR_T_64BIT=n: drivers/iommu/ipmmu-vmsa.c: In function 'ipmmu_domain_init_context': drivers/iommu/ipmmu-vmsa.c:434:2: warning: right shift count >= width of type ipmmu_ctx_write(domain, IMTTUBR0, ttbr >> 32); ^ As io_pgtable_cfg.arm_lpae_s1_cfg.ttbr[] is an array of u64s, assigning it to a phys_addr_t may truncates it. Make ttbr u64 to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/dma: Avoid unlikely high-order allocationsRobin Murphy
Doug reports that the equivalent page allocator on 32-bit ARM exhibits particularly pathalogical behaviour under memory pressure when fragmentation is high, where allocating a 4MB buffer takes tens of seconds and the number of calls to alloc_pages() is over 9000![1] We can drastically improve that situation without losing the other benefits of high-order allocations when they would succeed, by assuming memory pressure is relatively constant over the course of an allocation, and not retrying allocations at orders we know to have failed before. This way, the best-case behaviour remains unchanged, and in the worst case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts before falling back to single pages for the remainder of the buffer. [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html Reported-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-28iommu/dma: Add some missing #includesRobin Murphy
dma-iommu.c was naughtily relying on an implicit transitive #include of linux/vmalloc.h, which is apparently not present on some architectures. Add that, plus a couple more headers for other functions which are used similarly. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-22Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2015-12-19x86/cpufeature: Remove unused and seldomly used cpu_has_xx macrosBorislav Petkov
Those are stupid and code should use static_cpu_has_safe() or boot_cpu_has() instead. Kill the least used and unused ones. The remaining ones need more careful inspection before a conversion can happen. On the TODO. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de Cc: David Sterba <dsterba@suse.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Matt Mackall <mpm@selenic.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-18Merge tag 'iommu-fixes-v4.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Two similar fixes for the Intel and AMD IOMMU drivers to add proper access checks before calling handle_mm_fault" * tag 'iommu-fixes-v4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Do access checks before calling handle_mm_fault() iommu/amd: Do proper access checking before calling handle_mm_fault()
2015-12-17iommu/io-pgtable-arm: Ensure we free the final level on teardownWill Deacon
When tearing down page tables, we return early for the final level since we know that we won't have any table pointers to follow. Unfortunately, this also means that we forget to free the final level, so we end up leaking memory. Fix the issue by always freeing the current level, but just don't bother to iterate over the ptes if we're at the final level. Cc: <stable@vger.kernel.org> Reported-by: Zhang Bo <zhangbo_a@xiaomi.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Use STE.S1STALLD only when supportedPrem Mallappa
It is ILLEGAL to set STE.S1STALLD to 1 if stage 1 is enabled and either the stall or terminate models are not supported. This patch fixes the STALLD check and ensures that we don't set STALLD in the STE when it is not supported. Signed-off-by: Prem Mallappa <pmallapp@broadcom.com> [will: consistently use IDR0_STALL_MODEL_* prefix] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Fix write to GERRORN registerPrem Mallappa
When acknowledging global errors, the GERRORN register should be written with the original GERROR value so that active errors are toggled. This patch fixed the driver to write the original GERROR value to GERRORN, instead of an active error mask. Signed-off-by: Prem Mallappa <pmallapp@broadcom.com> [will: reworked use of active bits and fixed commit log] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/io-pgtable: Make io_pgtable_ops_to_pgtable() macro commonRobin Murphy
There is no need to keep a useful accessor for a public structure hidden away in a private implementation. Move it out alongside the structure definition so that other implementations may reuse it. Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Invalidate TLBs properlyRobin Murphy
When invalidating an IOVA range potentially spanning multiple pages, such as when removing an entire intermediate-level table, we currently only issue an invalidation for the first IOVA of that range. Since the architecture specifies that address-based TLB maintenance operations target a single entry, an SMMU could feasibly retain live entries for subsequent pages within that unmapped range, which is not good. Make sure we hit every possible entry by iterating over the whole range at the granularity provided by the pagetable implementation. Signed-off-by: Robin Murphy <robin.murphy@arm.com> [will: added missing semicolons...] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/io-pgtable: Indicate granule for TLB maintenanceRobin Murphy
IOMMU hardware with range-based TLB maintenance commands can work happily with the iova and size arguments passed via the tlb_add_flush callback, but for IOMMUs which require separate commands per entry in the range, it is not straightforward to infer the necessary granularity when it comes to issuing the actual commands. Add an additional argument indicating the granularity for the benefit of drivers needing to know, and update the ARM LPAE code appropriately (for non-leaf invalidations we currently just assume the worst-case page granularity rather than walking the table to check). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/io-pgtable-arm: Avoid dereferencing bogus PTEsRobin Murphy
In the case of corrupted page tables, or when an invalid size is given, __arm_lpae_unmap() may recurse beyond the maximum number of levels. Unfortunately the detection of this error condition only happens *after* calculating a nonsense offset from something which might not be a valid table pointer and dereferencing that to see if it is a valid PTE. Make things a little more robust by checking the level is valid before doing anything which depends on it being so. Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Handle unknown CERROR values gracefullyWill Deacon
Whilst the architecture only defines a few of the possible CERROR values, we should handle unknown values gracefully rather than go out of bounds trying to print out an error description. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Correct group reference countPeng Fan
The basic flow for add a device: arm_smmu_add_device |->iommu_group_get_for_dev |->iommu_group_get return group; (1) |->ops->device_group : Init/increase reference count to/by 1. |->iommu_group_add_device : Increase reference count by 1. return group (2) |->return 0; Since we are adding one device, the flow is (2) and the group reference count will be increased by 2. So, we need to add iommu_group_put at the end of arm_smmu_add_device to decrease the count by 1. Also take the failure path into consideration when fail to add a device. Signed-off-by: Peng Fan <van.freenix@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Use incoming shareability attributes in bypass modeWill Deacon
When we initialise a bypass STE, we memset the structure to zero and set the Valid and Config fields to indicate that the stream should bypass the SMMU. Unfortunately, this results in an SHCFG field of 0 which means that the shareability of any incoming transactions is overridden with non-shareable, leading to potential coherence problems down the line. This patch fixes the issue by initialising bypass STEs to use the incoming shareability attributes. When translation is in effect at either stage 1 or stage 2, the shareability is determined by the page tables. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Delete an unnecessary check before free_io_pgtable_ops()Markus Elfring
The free_io_pgtable_ops() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Convert DMA buffer allocations to the managed APIWill Deacon
The ARM SMMUv3 driver uses dma_{alloc,free}_coherent to manage its queues and configuration data structures. This patch converts the driver to the managed (dmam_*) API, so that resources are freed automatically on device teardown. This greatly simplifies the failure paths and allows us to remove a bunch of handcrafted freeing code. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-17iommu/arm-smmu: Remove #define for non-existent PRIQ_0_OF fieldWill Deacon
PRIQ_0_OF has been removed from the SMMUv3 architecture, so remove its corresponding (and unused) #define from the driver. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-15Revert "scatterlist: use sg_phys()"Dan Williams
commit db0fa0cb0157 "scatterlist: use sg_phys()" did replacements of the form: phys_addr_t phys = page_to_phys(sg_page(s)); phys_addr_t phys = sg_phys(s) & PAGE_MASK; However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: db0fa0cb0157 ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <joro@8bytes.org> Reported-by: Vitaly Lavrov <vel21ripn@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>