summaryrefslogtreecommitdiffstats
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2020-09-21iommu/arm-smmu: Add support for split pagetablesJordan Crouse
Enable TTBR1 for a context bank if IO_PGTABLE_QUIRK_ARM_TTBR1 is selected by the io-pgtable configuration. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-21iommu/arm-smmu: Pass io-pgtable config to implementation specific functionJordan Crouse
Construct the io-pgtable config before calling the implementation specific init_context function and pass it so the implementation specific function can get a chance to change it before the io-pgtable is created. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-21iommu/arm-smmu-v3: Fix endianness annotationsJean-Philippe Brucker
When building with C=1, sparse reports some issues regarding endianness annotations: arm-smmu-v3.c:221:26: warning: cast to restricted __le64 arm-smmu-v3.c:221:24: warning: incorrect type in assignment (different base types) arm-smmu-v3.c:221:24: expected restricted __le64 [usertype] arm-smmu-v3.c:221:24: got unsigned long long [usertype] arm-smmu-v3.c:229:20: warning: incorrect type in argument 1 (different base types) arm-smmu-v3.c:229:20: expected restricted __le64 [usertype] *[assigned] dst arm-smmu-v3.c:229:20: got unsigned long long [usertype] *ent arm-smmu-v3.c:229:25: warning: incorrect type in argument 2 (different base types) arm-smmu-v3.c:229:25: expected unsigned long long [usertype] *[assigned] src arm-smmu-v3.c:229:25: got restricted __le64 [usertype] * arm-smmu-v3.c:396:20: warning: incorrect type in argument 1 (different base types) arm-smmu-v3.c:396:20: expected restricted __le64 [usertype] *[assigned] dst arm-smmu-v3.c:396:20: got unsigned long long * arm-smmu-v3.c:396:25: warning: incorrect type in argument 2 (different base types) arm-smmu-v3.c:396:25: expected unsigned long long [usertype] *[assigned] src arm-smmu-v3.c:396:25: got restricted __le64 [usertype] * arm-smmu-v3.c:1349:32: warning: invalid assignment: |= arm-smmu-v3.c:1349:32: left side has type restricted __le64 arm-smmu-v3.c:1349:32: right side has type unsigned long arm-smmu-v3.c:1396:53: warning: incorrect type in argument 3 (different base types) arm-smmu-v3.c:1396:53: expected restricted __le64 [usertype] *dst arm-smmu-v3.c:1396:53: got unsigned long long [usertype] *strtab arm-smmu-v3.c:1424:39: warning: incorrect type in argument 1 (different base types) arm-smmu-v3.c:1424:39: expected unsigned long long [usertype] *[assigned] strtab arm-smmu-v3.c:1424:39: got restricted __le64 [usertype] *l2ptr While harmless, they are incorrect and could hide actual errors during development. Fix them. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200918141856.629722-1-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-21iommu/io-pgtable-arm: Clean up faulty sanity checkRobin Murphy
Checking for a nonzero dma_pfn_offset was a quick shortcut to validate whether the DMA == phys assumption could hold at all. Checking for a non-NULL dma_range_map is not quite equivalent, since a map may be present to describe a limited DMA window even without an offset, and thus this check can now yield false positives. However, it only ever served to short-circuit going all the way through to __arm_lpae_alloc_pages(), failing the canonical test there, and having a bit more to clean up. As such, we can simply remove it without loss of correctness. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-18iommu/dma: Handle init_iova_flush_queue() failure in dma-iommu pathTom Murphy
The init_iova_flush_queue() function can fail if we run out of memory. Fall back to noflush queue if it fails. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200910122539.3662-1-murphyt7@tcd.ie Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/amd: Restore IRTE.RemapEn bit for amd_iommu_activate_guest_modeSuravee Suthikulpanit
Commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE") removed an assumption that modify_irte_ga always set the valid bit, which requires the callers to set the appropriate value for the struct irte_ga.valid bit before calling the function. Similar to the commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE"), which is for the function amd_iommu_deactivate_guest_mode(). The same change is also needed for the amd_iommu_activate_guest_mode(). Otherwise, this could trigger IO_PAGE_FAULT for the VFIO based VMs with AVIC enabled. Fixes: e52d58d54a321 ("iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE") Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20200916111720.43913-1-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/tegra-smmu: Fix tlb_maskNicolin Chen
The "num_tlb_lines" might not be a power-of-2 value, being 48 on Tegra210 for example. So the current way of calculating tlb_mask using the num_tlb_lines is not correct: tlb_mask=0x5f in case of num_tlb_lines=48, which will trim a setting of 0x30 (48) to 0x10. Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200917113155.13438-2-nicoleotsuka@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/pamu: Replace use of kzfree with kfree_sensitiveAlex Dewar
kzfree() is effectively deprecated as of commit 453431a54934 ("mm, treewide: rename kzfree() to kfree_sensitive()") and is now simply an alias for kfree_sensitive(). So just replace it with kfree_sensitive(). Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Link: https://lore.kernel.org/r/20200911135325.66156-1-alex.dewar90@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/renesas: Update help description for IPMMU_VMSA configLad Prabhakar
ipmmu-vmsa driver is also used on Renesas RZ/G{1,2} Soc's, update the same to reflect the help description for IPMMU_VMSA config. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Chris Paterson <Chris.Paterson2@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20200911101912.20701-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/amd: Fix potential @entry null derefJoao Martins
After commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE"), smatch warns: drivers/iommu/amd/iommu.c:3870 amd_iommu_deactivate_guest_mode() warn: variable dereferenced before check 'entry' (see line 3867) Fix this by moving the @valid assignment to after @entry has been checked for NULL. Fixes: 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20200910171621.12879-1-joao.m.martins@oracle.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/mediatek: Add support for MT8167Fabien Parent
Add support for the IOMMU on MT8167 Signed-off-by: Fabien Parent <fparent@baylibre.com> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Link: https://lore.kernel.org/r/20200907101649.1573134-3-fparent@baylibre.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-18iommu/mediatek: Add flag for legacy ivrp paddrFabien Parent
Add a new flag in order to select which IVRP_PADDR format is used by an SoC. Signed-off-by: Fabien Parent <fparent@baylibre.com> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20200907101649.1573134-2-fparent@baylibre.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-17x86/mmu: Allocate/free a PASIDFenghua Yu
A PASID is allocated for an "mm" the first time any thread binds to an SVA-capable device and is freed from the "mm" when the SVA is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVA cycles. The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is propagated to a per-thread PASID MSR for all threads within the mm through IPI, context switch, or inherited. This is done to ensure that a running thread has the right PASID in the MSR matching the mm's PASID. [ bp: s/SVM/SVA/g; massage. ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com
2020-09-17iommu/vt-d: Change flags type to unsigned int in binding mmFenghua Yu
"flags" passed to intel_svm_bind_mm() is a bit mask and should be defined as "unsigned int" instead of "int". Change its type to "unsigned int". Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-3-git-send-email-fenghua.yu@intel.com
2020-09-17drm, iommu: Change type of pasid to u32Fenghua Yu
PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
2020-09-17dma-mapping: introduce DMA range map, supplanting dma_pfn_offsetJim Quinlan
The new field 'dma_range_map' in struct device is used to facilitate the use of single or multiple offsets between mapping regions of cpu addrs and dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only capable of holding a single uniform offset and had no region bounds checking. The function of_dma_get_range() has been modified so that it takes a single argument -- the device node -- and returns a map, NULL, or an error code. The map is an array that holds the information regarding the DMA regions. Each range entry contains the address offset, the cpu_start address, the dma_start address, and the size of the region. of_dma_configure() is the typical manner to set range offsets but there are a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel driver code. These cases now invoke the function dma_direct_set_offset(dev, cpu_addr, dma_addr, size). Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [hch: various interface cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
2020-09-16iommu/amd: Remove domain search for PCI/MSIThomas Gleixner
Now that the domain can be retrieved through device::msi_domain the domain search for PCI_MSI[X] is not longer required. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112334.400700807@linutronix.de
2020-09-16iommu/vt-d: Remove domain search for PCI/MSI[X]Thomas Gleixner
Now that the domain can be retrieved through device::msi_domain the domain search for PCI_MSI[X] is not longer required. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de
2020-09-16iommm/amd: Store irq domain in struct deviceThomas Gleixner
As the next step to make X86 utilize the direct MSI irq domain operations store the irq domain pointer in the device struct when a device is probed. It only overrides the irqdomain of devices which are handled by a regular PCI/MSI irq domain which protects PCI devices behind special busses like VMD which have their own irq domain. No functional change. It just avoids the redirection through arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq domain alloc/free functions instead of having to look up the irq domain for every single MSI interupt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112333.806328762@linutronix.de
2020-09-16iommm/vt-d: Store irq domain in struct deviceThomas Gleixner
As a first step to make X86 utilize the direct MSI irq domain operations store the irq domain pointer in the device struct when a device is probed. This is done from dmar_pci_bus_add_dev() because it has to work even when DMA remapping is disabled. It only overrides the irqdomain of devices which are handled by a regular PCI/MSI irq domain which protects PCI devices behind special busses like VMD which have their own irq domain. No functional change. It just avoids the redirection through arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq domain alloc/free functions instead of having to look up the irq domain for every single MSI interupt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de
2020-09-16x86/msi: Consolidate MSI allocationThomas Gleixner
Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
2020-09-16x86_ioapic_Consolidate_IOAPIC_allocationThomas Gleixner
Move the IOAPIC specific fields into their own struct and reuse the common devid. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de
2020-09-16x86/msi: Consolidate HPET allocationThomas Gleixner
None of the magic HPET fields are required in any way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de
2020-09-16iommu/irq_remapping: Consolidate irq domain lookupThomas Gleixner
Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN types, consolidate the two getter functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de
2020-09-16iommu/amd: Consolidate irq domain getterThomas Gleixner
The irq domain request mode is now indicated in irq_alloc_info::type. Consolidate the two getter functions into one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.634777249@linutronix.de
2020-09-16iommu/vt-d: Consolidate irq domain getterThomas Gleixner
The irq domain request mode is now indicated in irq_alloc_info::type. Consolidate the two getter functions into one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.530546013@linutronix.de
2020-09-16x86/irq: Add allocation type for parent domain retrievalThomas Gleixner
irq_remapping_ir_irq_domain() is used to retrieve the remapping parent domain for an allocation type. irq_remapping_irq_domain() is for retrieving the actual device domain for allocating interrupts for a device. The two functions are similar and can be unified by using explicit modes for parent irq domain retrieval. Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu implementations. Drop the parent domain retrieval for PCI_MSI/X as that is unused. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.436350257@linutronix.de
2020-09-16x86_irq_Rename_X86_IRQ_ALLOC_TYPE_MSI_to_reflect_PCI_dependencyThomas Gleixner
No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.343103175@linutronix.de
2020-09-16iommu/amd: Prevent NULL pointer dereferenceThomas Gleixner
Dereferencing irq_data before checking it for NULL is suboptimal. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Joerg Roedel <jroedel@suse.de>
2020-09-11dma-direct: rename and cleanup __phys_to_dmaChristoph Hellwig
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious. Try to improve the situation by renaming __phys_to_dma to phys_to_dma_unencryped, and not forcing architectures that want to override phys_to_dma to actually provide __phys_to_dma. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2020-09-07iommu/arm-smmu-v3: permit users to disable msi pollingBarry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for single thread. The reason for the throughput improvement is that the latency to poll the completion of CMD_SYNC becomes smaller. After sending a CMD_SYNC in an empty cmd queue, typically we need to wait for 280ns using MSI polling. But we only need around 190ns after disabling MSI polling. This patch provides a command line option so that users can decide to use MSI polling or not based on their tests. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200827092957.22500-4-song.bao.hua@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07iommu/arm-smmu-v3: replace module_param_named by module_param for disable_bypassBarry Song
Just use module_param() - going out of the way to specify a "different" name that's identical to the variable name is silly. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200827092957.22500-3-song.bao.hua@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07iommu/arm-smmu-v3: replace symbolic permissions by octal permissions for ↵Barry Song
module parameter This fixed the below checkpatch issue: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. 417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417: module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20200827092957.22500-2-song.bao.hua@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07iommu/arm-smmu-v3: Fix l1 stream table size in the error messageZenghui Yu
The actual size of level-1 stream table is l1size. This looks like an oversight on commit d2e88e7c081ef ("iommu/arm-smmu: Fix LOG2SIZE setting for 2-level stream tables") which forgot to update the @size in error message as well. As memory allocation failure is already bad enough, nothing worse would happen. But let's be careful. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200826141758.341-1-yuzenghui@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-04iommu/tegra-smmu: Add locking around mapping operationsDmitry Osipenko
The mapping operations of the Tegra SMMU driver are subjected to a race condition issues because SMMU Address Space isn't allocated and freed atomically, while it should be. This patch makes the mapping operations atomic, it fixes an accidentally released Host1x Address Space problem which happens while running multiple graphics tests in parallel on Tegra30, i.e. by having multiple threads racing with each other in the Host1x's submission and completion code paths, performing IOVA mappings and unmappings in parallel. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Thierry Reding <treding@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200901203730.27865-1-digetx@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/sun50i: Fix set-but-not-used variable warningJoerg Roedel
Fix the following warning the the SUN50I driver: drivers/iommu/sun50i-iommu.c: In function 'sun50i_iommu_irq': drivers/iommu/sun50i-iommu.c:890:14: warning: variable 'iova' set but not used [-Wunused-but-set-variable] 890 | phys_addr_t iova; | ^~~~ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200904113906.3906-1-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/mediatek: Check 4GB mode by reading infracfgMiles Chen
In previous discussion [1] and [2], we found that it is risky to use max_pfn or totalram_pages to tell if 4GB mode is enabled. Check 4GB mode by reading infracfg register, remove the usage of the un-exported symbol max_pfn. This is a step towards building mtk_iommu as a kernel module. [1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.chen@mediatek.com/ [2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.chen@mediatek.com/ [3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/ Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Rob Herring <robh@kernel.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20200904104038.4979-1-miles.chen@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/vt-d: Handle 36bit addressing for x86-32Chris Wilson
Beware that the address size for x86-32 may exceed unsigned long. [ 0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14 [ 0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int' If we don't handle the wide addresses, the pages are mismapped and the device read/writes go astray, detected as DMAR faults and leading to device failure. The behaviour changed (from working to broken) in commit fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer"), but the error looks older. Fixes: fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: James Sewart <jamessewart@arista.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: <stable@vger.kernel.org> # v5.3+ Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/iova: Replace cmpxchg with xchg in queue_iovaYuqi Jin
The performance of the atomic_xchg is better than atomic_cmpxchg because no comparison is required. While the value of @fq_timer_on can only be 0 or 1. Let's use atomic_xchg instead of atomic_cmpxchg here because we only need to check that the value changes from 0 to 1 or from 1 to 1. Signed-off-by: Yuqi Jin <jinyuqi@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Link: https://lore.kernel.org/r/1598517834-30275-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Do not use IOMMUv2 functionality when SME is activeJoerg Roedel
When memory encryption is active the device is likely not in a direct mapped domain. Forbid using IOMMUv2 functionality for now until finer grained checks for this have been implemented. Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200824105415.21000-3-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Do not force direct mapping when SME is activeJoerg Roedel
Do not force devices supporting IOMMUv2 to be direct mapped when memory encryption is active. This might cause them to be unusable because their DMA mask does not include the encryption bit. Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200824105415.21000-2-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/dma: Remove broken huge page handlingRobin Murphy
The attempt to handle huge page allocations was originally added since the comments around stripping __GFP_COMP in other implementations were nonsensical, and we naively assumed that split_huge_page() could simply be called equivalently to split_page(). It turns out that this doesn't actually work correctly, so just get rid of it - there's little point going to the effort of allocating huge pages if we're only going to split them anyway. Reported-by: Roman Gushchin <guro@fb.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/e287dbe69aa0933abafd97c80631940fd188ddd1.1599132844.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Use cmpxchg_double() when updating 128-bit IRTESuravee Suthikulpanit
When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), current driver disables interrupt remapping when it updates the IRTE so that the upper and lower 64-bit values can be updated safely. However, this creates a small window, where the interrupt could arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. IOMMU Driver Device IRQ ============ =========== irte.RemapEn=0 ... change IRTE IRQ from device ==> IO_PAGE_FAULT !! ... irte.RemapEn=1 This scenario has been observed when changing irq affinity on a system running I/O-intensive workload, in which the destination APIC ID in the IRTE is updated. Instead, use cmpxchg_double() to update the 128-bit IRTE at once without disabling the interrupt remapping. However, this means several features, which require GA (128-bit IRTE) support will also be affected if cmpxchg16b is not supported (which is unprecedented for AMD processors w/ IOMMU). Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") Reported-by: Sean Osborne <sean.m.osborne@oracle.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20200903093822.52012-3-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Restore IRTE.RemapEn bit after programming IRTESuravee Suthikulpanit
Currently, the RemapEn (valid) bit is accidentally cleared when programming IRTE w/ guestMode=0. It should be restored to the prior state. Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20200903093822.52012-2-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()Lu Baolu
The dev_iommu_priv_set() must be called after probe_device(). This fixes a NULL pointer deference bug when booting a system with kernel cmdline "intel_iommu=on,igfx_off", where the dev_iommu_priv_set() is abused. The following stacktrace was produced: Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off ... DMAR: Host address width 39 DMAR: DRHD base: 0x000000fed90000 flags: 0x0 DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e DMAR: DRHD base: 0x000000fed91000 flags: 0x1 DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff DMAR: No ATSR found BUG: kernel NULL pointer dereference, address: 0000000000000038 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2 Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018 RIP: 0010:intel_iommu_init+0xed0/0x1136 Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0 RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282 RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0 R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0 Call Trace: ? _raw_spin_unlock_irqrestore+0x1f/0x30 ? call_rcu+0x10e/0x320 ? trace_hardirqs_on+0x2c/0xd0 ? rdinit_setup+0x2c/0x2c ? e820__memblock_setup+0x8b/0x8b pci_iommu_init+0x16/0x3f do_one_initcall+0x46/0x1e4 kernel_init_freeable+0x169/0x1b2 ? rest_init+0x9f/0x9f kernel_init+0xa/0x101 ret_from_fork+0x22/0x30 Modules linked in: CR2: 0000000000000038 ---[ end trace 3653722a6f936f18 ]--- Fixes: 01b9d4e21148c ("iommu/vt-d: Use dev_iommu_priv_get/set()") Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/ Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/vt-d: Serialize IOMMU GCMD register modificationsLu Baolu
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General Description) that: If multiple control fields in this register need to be modified, software must serialize the modifications through multiple writes to this register. However, in irq_remapping.c, modifications of IRE and CFI are done in one write. We need to do two separate writes with STS checking after each. It also checks the status register before writing command register to avoid unnecessary register write. Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu: Rename iommu_tlb_* functions to iommu_iotlb_*Tom Murphy
To keep naming consistent we should stick with *iotlb*. This patch renames a few remaining functions. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Link: https://lore.kernel.org/r/20200817210051.13546-1-murphyt7@tcd.ie Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/tegra-smmu: Prune IOMMU group when it is releasedThierry Reding
In order to share groups between multiple devices we keep track of them in a per-SMMU list. When an IOMMU group is released, a dangling pointer to it stays around in that list. Fix this by implementing an IOMMU data release callback for groups where the dangling pointer can be removed. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200806155404.3936074-4-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/tegra-smmu: Balance IOMMU group reference countThierry Reding
For groups that are shared between multiple devices, care must be taken to acquire a reference for each device, otherwise the IOMMU core ends up dropping the last reference too early, which will cause the group to be released while consumers may still be thinking that they're holding a reference to it. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200806155404.3936074-3-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/tegra-smmu: Set IOMMU group nameThierry Reding
Set the name of static IOMMU groups to help with debugging. Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20200806155404.3936074-2-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>