summaryrefslogtreecommitdiffstats
path: root/drivers/iommu/intel-iommu.c
AgeCommit message (Collapse)Author
2020-01-07iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attrLu Baolu
This adds the Intel VT-d specific callback of setting DOMAIN_ATTR_NESTING domain attribution. It is necessary to let the VT-d driver know that the domain represents a virtual machine which requires the IOMMU hardware to support nested translation mode. Return success if the IOMMU hardware suports nested mode, otherwise failure. Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Identify domains using first level page tableLu Baolu
This checks whether a domain should use the first level page table for map/unmap and marks it in the domain structure. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Loose requirement for flush queue initializatonLu Baolu
Currently if flush queue initialization fails, we return error or enforce the system-wide strict mode. These are unnecessary because we always check the existence of a flush queue before queuing any iova's for lazy flushing. Printing a informational message is enough. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Avoid iova flush queue in strict modeLu Baolu
If Intel IOMMU strict mode is enabled by users, it's unnecessary to create the iova flush queue. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: trace: Extend map_sg trace eventLu Baolu
Current map_sg stores trace message in a coarse manner. This extends it so that more detailed messages could be traced. The map_sg trace message looks like: map_sg: dev=0000:00:17.0 [1/9] dev_addr=0xf8f90000 phys_addr=0x158051000 size=4096 map_sg: dev=0000:00:17.0 [2/9] dev_addr=0xf8f91000 phys_addr=0x15a858000 size=4096 map_sg: dev=0000:00:17.0 [3/9] dev_addr=0xf8f92000 phys_addr=0x15aa13000 size=4096 map_sg: dev=0000:00:17.0 [4/9] dev_addr=0xf8f93000 phys_addr=0x1570f1000 size=8192 map_sg: dev=0000:00:17.0 [5/9] dev_addr=0xf8f95000 phys_addr=0x15c6d0000 size=4096 map_sg: dev=0000:00:17.0 [6/9] dev_addr=0xf8f96000 phys_addr=0x157194000 size=4096 map_sg: dev=0000:00:17.0 [7/9] dev_addr=0xf8f97000 phys_addr=0x169552000 size=4096 map_sg: dev=0000:00:17.0 [8/9] dev_addr=0xf8f98000 phys_addr=0x169dde000 size=4096 map_sg: dev=0000:00:17.0 [9/9] dev_addr=0xf8f99000 phys_addr=0x148351000 size=4096 Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Replace Intel specific PASID allocator with IOASIDJacob Pan
Make use of generic IOASID code to manage PASID allocation, free, and lookup. Replace Intel specific code. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Fix CPU and IOMMU SVM feature matching checksJacob Pan
Shared Virtual Memory(SVM) is based on a collective set of hardware features detected at runtime. There are requirements for matching CPU and IOMMU capabilities. The current code checks CPU and IOMMU feature set for SVM support but the result is never stored nor used. Therefore, SVM can still be used even when these checks failed. The consequences can be: 1. CPU uses 5-level paging mode for virtual address of 57 bits, but IOMMU can only support 4-level paging mode with 48 bits address for DMA. 2. 1GB page size is used by CPU but IOMMU does not support it. VT-d unrecoverable faults may be generated. The best solution to fix these problems is to prevent them in the first place. This patch consolidates code for checking PASID, CPU vs. IOMMU paging mode compatibility, as well as provides specific error messages for each failed checks. On sane hardware configurations, these error message shall never appear in kernel log. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-01-07iommu/vt-d: Add Kconfig option to enable/disable scalable modeLu Baolu
This adds Kconfig option INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON to make it easier for distributions to enable or disable the Intel IOMMU scalable mode by default during kernel build. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-17iommu/vt-d: Allocate reserved region for ISA with correct permissionJerry Snitselaar
Currently the reserved region for ISA is allocated with no permissions. If a dma domain is being used, mapping this region will fail. Set the permissions to DMA_PTE_READ|DMA_PTE_WRITE. Cc: Joerg Roedel <jroedel@suse.de> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: stable@vger.kernel.org # v5.3+ Fixes: d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-17iommu/vt-d: Fix dmar pte read access not set errorLu Baolu
If the default DMA domain of a group doesn't fit a device, it will still sit in the group but use a private identity domain. When map/unmap/iova_to_phys come through iommu API, the driver should still serve them, otherwise, other devices in the same group will be impacted. Since identity domain has been mapped with the whole available memory space and RMRRs, we don't need to worry about the impact on it. Link: https://www.spinics.net/lists/iommu/msg40416.html Cc: Jerry Snitselaar <jsnitsel@redhat.com> Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-12-17iommu/vt-d: Set ISA bridge reserved region as relaxableAlex Williamson
Commit d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions") created a direct-mapped reserved memory region in order to replace the static identity mapping of the ISA address space, where the latter was then removed in commit df4f3c603aeb ("iommu/vt-d: Remove static identity map code"). According to the history of this code and the Kconfig option surrounding it, this direct mapping exists for the benefit of legacy ISA drivers that are not compatible with the DMA API. In conjuntion with commit 9b77e5c79840 ("vfio/type1: check dma map request is within a valid iova range") this change introduced a regression where the vfio IOMMU backend enforces reserved memory regions per IOMMU group, preventing userspace from creating IOMMU mappings conflicting with prescribed reserved regions. A necessary prerequisite for the vfio change was the introduction of "relaxable" direct mappings introduced by commit adfd37382090 ("iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions"). These relaxable direct mappings provide the same identity mapping support in the default domain, but also indicate that the reservation is software imposed and may be relaxed under some conditions, such as device assignment. Convert the ISA bridge direct-mapped reserved region to relaxable to reflect that the restriction is self imposed and need not be enforced by drivers such as vfio. Fixes: 1c5c59fbad20 ("iommu/vt-d: Differentiate relaxable and non relaxable RMRRs") Cc: stable@vger.kernel.org # v5.3+ Link: https://lore.kernel.org/linux-iommu/20191211082304.2d4fab45@x1.home Reported-by: cprt <cprt@protonmail.com> Tested-by: cprt <cprt@protonmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-11-12Merge branches 'iommu/fixes', 'arm/qcom', 'arm/renesas', 'arm/rockchip', ↵Joerg Roedel
'arm/mediatek', 'arm/tegra', 'arm/smmu', 'x86/amd', 'x86/vt-d', 'virtio' and 'core' into next
2019-11-11iommu/vt-d: Turn off translations at shutdownDeepa Dinamani
The intel-iommu driver assumes that the iommu state is cleaned up at the start of the new kernel. But, when we try to kexec boot something other than the Linux kernel, the cleanup cannot be relied upon. Hence, cleanup before we go down for reboot. Keeping the cleanup at initialization also, in case BIOS leaves the IOMMU enabled. I considered turning off iommu only during kexec reboot, but a clean shutdown seems always a good idea. But if someone wants to make it conditional, such as VMM live update, we can do that. There doesn't seem to be such a condition at this time. Tested that before, the info message 'DMAR: Translation was enabled for <iommu> but we are not in kdump mode' would be reported for each iommu. The message will not appear when the DMA-remapping is not enabled on entry to the kernel. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-11-11iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reservedYian Chen
VT-d RMRR (Reserved Memory Region Reporting) regions are reserved for device use only and should not be part of allocable memory pool of OS. BIOS e820_table reports complete memory map to OS, including OS usable memory ranges and BIOS reserved memory ranges etc. x86 BIOS may not be trusted to include RMRR regions as reserved type of memory in its e820 memory map, hence validate every RMRR entry with the e820 memory map to make sure the RMRR regions will not be used by OS for any other purposes. ia64 EFI is working fine so implement RMRR validation as a dummy function Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Yian Chen <yian.chen@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-30iommu/vt-d: Fix panic after kexec -p for kdumpJohn Donnelly
This cures a panic on restart after a kexec operation on 5.3 and 5.4 kernels. The underlying state of the iommu registers (iommu->flags & VTD_FLAG_TRANS_PRE_ENABLED) on a restart results in a domain being marked as "DEFER_DEVICE_DOMAIN_INFO" that produces an Oops in identity_mapping(). [ 43.654737] BUG: kernel NULL pointer dereference, address: 0000000000000056 [ 43.655720] #PF: supervisor read access in kernel mode [ 43.655720] #PF: error_code(0x0000) - not-present page [ 43.655720] PGD 0 P4D 0 [ 43.655720] Oops: 0000 [#1] SMP PTI [ 43.655720] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.2-1940.el8uek.x86_64 #1 [ 43.655720] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30140300 09/20/2018 [ 43.655720] RIP: 0010:iommu_need_mapping+0x29/0xd0 [ 43.655720] Code: 00 0f 1f 44 00 00 48 8b 97 70 02 00 00 48 83 fa ff 74 53 48 8d 4a ff b8 01 00 00 00 48 83 f9 fd 76 01 c3 48 8b 35 7f 58 e0 01 <48> 39 72 58 75 f2 55 48 89 e5 41 54 53 48 8b 87 28 02 00 00 4c 8b [ 43.655720] RSP: 0018:ffffc9000001b9b0 EFLAGS: 00010246 [ 43.655720] RAX: 0000000000000001 RBX: 0000000000001000 RCX: fffffffffffffffd [ 43.655720] RDX: fffffffffffffffe RSI: ffff8880719b8000 RDI: ffff8880477460b0 [ 43.655720] RBP: ffffc9000001b9e8 R08: 0000000000000000 R09: ffff888047c01700 [ 43.655720] R10: 00002194036fc692 R11: 0000000000000000 R12: 0000000000000000 [ 43.655720] R13: ffff8880477460b0 R14: 0000000000000cc0 R15: ffff888072d2b558 [ 43.655720] FS: 0000000000000000(0000) GS:ffff888071c00000(0000) knlGS:0000000000000000 [ 43.655720] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.655720] CR2: 0000000000000056 CR3: 000000007440a002 CR4: 00000000001606b0 [ 43.655720] Call Trace: [ 43.655720] ? intel_alloc_coherent+0x2a/0x180 [ 43.655720] ? __schedule+0x2c2/0x650 [ 43.655720] dma_alloc_attrs+0x8c/0xd0 [ 43.655720] dma_pool_alloc+0xdf/0x200 [ 43.655720] ehci_qh_alloc+0x58/0x130 [ 43.655720] ehci_setup+0x287/0x7ba [ 43.655720] ? _dev_info+0x6c/0x83 [ 43.655720] ehci_pci_setup+0x91/0x436 [ 43.655720] usb_add_hcd.cold.48+0x1d4/0x754 [ 43.655720] usb_hcd_pci_probe+0x2bc/0x3f0 [ 43.655720] ehci_pci_probe+0x39/0x40 [ 43.655720] local_pci_probe+0x47/0x80 [ 43.655720] pci_device_probe+0xff/0x1b0 [ 43.655720] really_probe+0xf5/0x3a0 [ 43.655720] driver_probe_device+0xbb/0x100 [ 43.655720] device_driver_attach+0x58/0x60 [ 43.655720] __driver_attach+0x8f/0x150 [ 43.655720] ? device_driver_attach+0x60/0x60 [ 43.655720] bus_for_each_dev+0x74/0xb0 [ 43.655720] driver_attach+0x1e/0x20 [ 43.655720] bus_add_driver+0x151/0x1f0 [ 43.655720] ? ehci_hcd_init+0xb2/0xb2 [ 43.655720] ? do_early_param+0x95/0x95 [ 43.655720] driver_register+0x70/0xc0 [ 43.655720] ? ehci_hcd_init+0xb2/0xb2 [ 43.655720] __pci_register_driver+0x57/0x60 [ 43.655720] ehci_pci_init+0x6a/0x6c [ 43.655720] do_one_initcall+0x4a/0x1fa [ 43.655720] ? do_early_param+0x95/0x95 [ 43.655720] kernel_init_freeable+0x1bd/0x262 [ 43.655720] ? rest_init+0xb0/0xb0 [ 43.655720] kernel_init+0xe/0x110 [ 43.655720] ret_from_fork+0x24/0x50 Fixes: 8af46c784ecfe ("iommu/vt-d: Implement is_attach_deferred iommu ops entry") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: John Donnelly <john.p.donnelly@oracle.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-18iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMUArvind Sankar
We must return a mask covering the full physical RAM when bypassing the IOMMU mapping. Also, in iommu_need_mapping, we need to check using dma_direct_get_required_mask to ensure that the device's dma_mask can cover physical RAM before deciding to bypass IOMMU mapping. Based on an earlier patch from Christoph Hellwig. Fixes: 249baa547901 ("dma-mapping: provide a better default ->get_required_mask") Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-10-15iommu/vt-d: Refactor find_domain() helperLu Baolu
Current find_domain() helper checks and does the deferred domain attachment and return the domain in use. This isn't always the use case for the callers. Some callers only want to retrieve the current domain in use. This refactors find_domain() into two helpers: 1) find_domain() only returns the domain in use; 2) deferred_attach_domain() does the deferred domain attachment if required and return the domain in use. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-15iommu: Add gfp parameter to iommu_ops::mapTom Murphy
Add a gfp_t parameter to the iommu_ops::map function. Remove the needless locking in the AMD iommu driver. The iommu_ops::map function (or the iommu_map function which calls it) was always supposed to be sleepable (according to Joerg's comment in this thread: https://lore.kernel.org/patchwork/patch/977520/ ) and so should probably have had a "might_sleep()" since it was written. However currently the dma-iommu api can call iommu_map in an atomic context, which it shouldn't do. This doesn't cause any problems because any iommu driver which uses the dma-iommu api uses gfp_atomic in it's iommu_ops::map function. But doing this wastes the memory allocators atomic pools. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-19Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - add dma-mapping and block layer helpers to take care of IOMMU merging for mmc plus subsequent fixups (Yoshihiro Shimoda) - rework handling of the pgprot bits for remapping (me) - take care of the dma direct infrastructure for swiotlb-xen (me) - improve the dma noncoherent remapping infrastructure (me) - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me) - cleanup mmaping of coherent DMA allocations (me) - various misc cleanups (Andy Shevchenko, me) * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits) mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE mmc: queue: Fix bigger segments usage arm64: use asm-generic/dma-mapping.h swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page swiotlb-xen: simplify cache maintainance swiotlb-xen: use the same foreign page check everywhere swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable xen: remove the exports for xen_{create,destroy}_contiguous_region xen/arm: remove xen_dma_ops xen/arm: simplify dma_cache_maint xen/arm: use dev_is_dma_coherent xen/arm: consolidate page-coherent.h xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance arm: remove wrappers for the generic dma remap helpers dma-mapping: introduce a dma_common_find_pages helper dma-mapping: always use VM_DMA_COHERENT for generic DMA remap vmalloc: lift the arm flag for coherent mappings to common code dma-mapping: provide a better default ->get_required_mask dma-mapping: remove the dma_declare_coherent_memory export remoteproc: don't allow modular build ...
2019-09-11Merge branches 'arm/omap', 'arm/exynos', 'arm/smmu', 'arm/mediatek', ↵Joerg Roedel
'arm/qcom', 'arm/renesas', 'x86/amd', 'x86/vt-d' and 'core' into next
2019-09-11iommu/vt-d: Declare Broadwell igfx dmar support snafuChris Wilson
Despite the widespread and complete failure of Broadwell integrated graphics when DMAR is enabled, known over the years, we have never been able to root cause the issue. Instead, we let the failure undermine our confidence in the iommu system itself when we should be pushing for it to be always enabled. Quirk away Broadwell and remove the rotten apple. References: https://bugs.freedesktop.org/show_bug.cgi?id=89360 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Martin Peres <martin.peres@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-11iommu/vt-d: Use bounce buffer for untrusted devicesLu Baolu
The Intel VT-d hardware uses paging for DMA remapping. The minimum mapped window is a page size. The device drivers may map buffers not filling the whole IOMMU window. This allows the device to access to possibly unrelated memory and a malicious device could exploit this to perform DMA attacks. To address this, the Intel IOMMU driver will use bounce pages for those buffers which don't fill whole IOMMU pages. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Xu Pengfei <pengfei.xu@intel.com> Tested-by: Mika Westerberg <mika.westerberg@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-11iommu/vt-d: Add trace events for device dma map/unmapLu Baolu
This adds trace support for the Intel IOMMU driver. It also declares some events which could be used to trace the events when an IOVA is being mapped or unmapped in a domain. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-11iommu/vt-d: Don't switch off swiotlb if bounce page is usedLu Baolu
The bounce page implementation depends on swiotlb. Hence, don't switch off swiotlb if the system has untrusted devices or could potentially be hot-added with any untrusted devices. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-11iommu/vt-d: Check whether device requires bounce bufferLu Baolu
This adds a helper to check whether a device needs to use bounce buffer. It also provides a boot time option to disable the bounce buffer. Users can use this to prevent the iommu driver from using the bounce buffer for performance gain. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Xu Pengfei <pengfei.xu@intel.com> Tested-by: Mika Westerberg <mika.westerberg@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-04dma-mapping: explicitly wire up ->mmap and ->get_sgtableChristoph Hellwig
While the default ->mmap and ->get_sgtable implementations work for the majority of our dma_map_ops impementations they are inherently safe for others that don't use the page allocator or CMA and/or use their own way of remapping not covered by the common code. So remove the defaults if these methods are not wired up, but instead wire up the default implementations for all safe instances. Fixes: e1c7e324539a ("dma-mapping: always provide the dma_map_ops based implementation") Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-08-30Revert "iommu/vt-d: Avoid duplicated pci dma alias consideration"Lu Baolu
This reverts commit 557529494d79f3f1fadd486dd18d2de0b19be4da. Commit 557529494d79f ("iommu/vt-d: Avoid duplicated pci dma alias consideration") aimed to address a NULL pointer deference issue happened when a thunderbolt device driver returned unexpectedly. Unfortunately, this change breaks a previous pci quirk added by commit cc346a4714a59 ("PCI: Add function 1 DMA alias quirk for Marvell devices"), as the result, devices like Marvell 88SE9128 SATA controller doesn't work anymore. We will continue to try to find the real culprit mentioned in 557529494d79f, but for now we should revert it to fix current breakage. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204627 Cc: Stijn Tintel <stijn@linux-ipv6.be> Cc: Petr Vandrovec <petr@vandrovec.name> Reported-by: Stijn Tintel <stijn@linux-ipv6.be> Reported-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-23Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2019-08-23iommu/vt-d: Request passthrough mode from IOMMU coreJoerg Roedel
Get rid of the iommu_pass_through variable and request passthrough mode via the new iommu core function. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-20Merge branch 'for-joerg/batched-unmap' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into core
2019-08-09iommu/vt-d: Fix possible use-after-free of private domainLu Baolu
Multiple devices might share a private domain. One real example is a pci bridge and all devices behind it. When remove a private domain, make sure that it has been detached from all devices to avoid use-after-free case. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-09iommu/vt-d: Detach domain before using a private oneLu Baolu
When the default domain of a group doesn't work for a device, the iommu driver will try to use a private domain. The domain which was previously attached to the device must be detached. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Reported-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lkml.org/lkml/2019/8/2/1379 Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-06iommu/vt-d: Detach domain when move device out of groupLu Baolu
When removing a device from an iommu group, the domain should be detached from the device. Otherwise, the stale domain info will still be cached by the driver and the driver will refuse to attach any domain to the device again. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Fixes: b7297783c2bb6 ("iommu/vt-d: Remove duplicated code for device hotplug") Reported-and-tested-by: Vlad Buslov <vladbu@mellanox.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Link: https://lkml.org/lkml/2019/7/26/1133 Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-29iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync()Will Deacon
To allow IOMMU drivers to batch up TLB flushing operations and postpone them until ->iotlb_sync() is called, extend the prototypes for the ->unmap() and ->iotlb_sync() IOMMU ops callbacks to take a pointer to the current iommu_iotlb_gather structure. All affected IOMMU drivers are updated, but there should be no functional change since the extra parameter is ignored for now. Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22iommu/vt-d: Check if domain->pgd was allocatedDmitry Safonov
There is a couple of places where on domain_init() failure domain_exit() is called. While currently domain_init() can fail only if alloc_pgtable_page() has failed. Make domain_exit() check if domain->pgd present, before calling domain_unmap(), as it theoretically should crash on clearing pte entries in dma_pte_clear_level(). Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22iommu/vt-d: Don't queue_iova() if there is no flush queueDmitry Safonov
Intel VT-d driver was reworked to use common deferred flushing implementation. Previously there was one global per-cpu flush queue, afterwards - one per domain. Before deferring a flush, the queue should be allocated and initialized. Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush queue. It's probably worth to init it for static or unmanaged domains too, but it may be arguable - I'm leaving it to iommu folks. Prevent queuing an iova flush if the domain doesn't have a queue. The defensive check seems to be worth to keep even if queue would be initialized for all kinds of domains. And is easy backportable. On 4.19.43 stable kernel it has a user-visible effect: previously for devices in si domain there were crashes, on sata devices: BUG: spinlock bad magic on CPU#6, swapper/0/1 lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x45/0x115 intel_unmap+0x107/0x113 intel_unmap_sg+0x6b/0x76 __ata_qc_complete+0x7f/0x103 ata_qc_complete+0x9b/0x26a ata_qc_complete_multiple+0xd0/0xe3 ahci_handle_port_interrupt+0x3ee/0x48a ahci_handle_port_intr+0x73/0xa9 ahci_single_level_irq_intr+0x40/0x60 __handle_irq_event_percpu+0x7f/0x19a handle_irq_event_percpu+0x32/0x72 handle_irq_event+0x38/0x56 handle_edge_irq+0x102/0x121 handle_irq+0x147/0x15c do_IRQ+0x66/0xf2 common_interrupt+0xf/0xf RIP: 0010:__do_softirq+0x8c/0x2df The same for usb devices that use ehci-pci: BUG: spinlock bad magic on CPU#0, swapper/0/1 lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x77/0x145 intel_unmap+0x107/0x113 intel_unmap_page+0xe/0x10 usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d usb_hcd_unmap_urb_for_dma+0x17/0x100 unmap_urb_for_dma+0x22/0x24 __usb_hcd_giveback_urb+0x51/0xc3 usb_giveback_urb_bh+0x97/0xde tasklet_action_common.isra.4+0x5f/0xa1 tasklet_action+0x2d/0x30 __do_softirq+0x138/0x2df irq_exit+0x7d/0x8b smp_apic_timer_interrupt+0x10f/0x151 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39 Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: <stable@vger.kernel.org> # 4.14+ Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22iommu/vt-d: Avoid duplicated pci dma alias considerationLu Baolu
As we have abandoned the home-made lazy domain allocation and delegated the DMA domain life cycle up to the default domain mechanism defined in the generic iommu layer, we needn't consider pci alias anymore when mapping/unmapping the context entries. Without this fix, we see kernel NULL pointer dereference during pci device hot-plug test. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Fixes: fa954e6831789 ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22Revert "iommu/vt-d: Consolidate domain_init() to avoid duplication"Joerg Roedel
This reverts commit 123b2ffc376e1b3e9e015c75175b61e88a8b8518. This commit reportedly caused boot failures on some systems and needs to be reverted for now. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-04Merge branches 'x86/vt-d', 'x86/amd', 'arm/smmu', 'arm/omap', ↵Joerg Roedel
'generic-dma-ops' and 'core' into next
2019-06-22Revert "iommu/vt-d: Fix lock inversion between iommu->lock and ↵Peter Xu
device_domain_lock" This reverts commit 7560cc3ca7d9d11555f80c830544e463fcdb28b8. With 5.2.0-rc5 I can easily trigger this with lockdep and iommu=pt: ====================================================== WARNING: possible circular locking dependency detected 5.2.0-rc5 #78 Not tainted ------------------------------------------------------ swapper/0/1 is trying to acquire lock: 00000000ea2b3beb (&(&iommu->lock)->rlock){+.+.}, at: domain_context_mapping_one+0xa5/0x4e0 but task is already holding lock: 00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (device_domain_lock){....}: _raw_spin_lock_irqsave+0x3c/0x50 dmar_insert_one_dev_info+0xbb/0x510 domain_add_dev_info+0x50/0x90 dev_prepare_static_identity_mapping+0x30/0x68 intel_iommu_init+0xddd/0x1422 pci_iommu_init+0x16/0x3f do_one_initcall+0x5d/0x2b4 kernel_init_freeable+0x218/0x2c1 kernel_init+0xa/0x100 ret_from_fork+0x3a/0x50 -> #0 (&(&iommu->lock)->rlock){+.+.}: lock_acquire+0x9e/0x170 _raw_spin_lock+0x25/0x30 domain_context_mapping_one+0xa5/0x4e0 pci_for_each_dma_alias+0x30/0x140 dmar_insert_one_dev_info+0x3b2/0x510 domain_add_dev_info+0x50/0x90 dev_prepare_static_identity_mapping+0x30/0x68 intel_iommu_init+0xddd/0x1422 pci_iommu_init+0x16/0x3f do_one_initcall+0x5d/0x2b4 kernel_init_freeable+0x218/0x2c1 kernel_init+0xa/0x100 ret_from_fork+0x3a/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(device_domain_lock); lock(&(&iommu->lock)->rlock); lock(device_domain_lock); lock(&(&iommu->lock)->rlock); *** DEADLOCK *** 2 locks held by swapper/0/1: #0: 00000000033eb13d (dmar_global_lock){++++}, at: intel_iommu_init+0x1e0/0x1422 #1: 00000000a681907b (device_domain_lock){....}, at: domain_context_mapping_one+0x8d/0x4e0 stack backtrace: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5 #78 Hardware name: LENOVO 20KGS35G01/20KGS35G01, BIOS N23ET50W (1.25 ) 06/25/2018 Call Trace: dump_stack+0x85/0xc0 print_circular_bug.cold.57+0x15c/0x195 __lock_acquire+0x152a/0x1710 lock_acquire+0x9e/0x170 ? domain_context_mapping_one+0xa5/0x4e0 _raw_spin_lock+0x25/0x30 ? domain_context_mapping_one+0xa5/0x4e0 domain_context_mapping_one+0xa5/0x4e0 ? domain_context_mapping_one+0x4e0/0x4e0 pci_for_each_dma_alias+0x30/0x140 dmar_insert_one_dev_info+0x3b2/0x510 domain_add_dev_info+0x50/0x90 dev_prepare_static_identity_mapping+0x30/0x68 intel_iommu_init+0xddd/0x1422 ? printk+0x58/0x6f ? lockdep_hardirqs_on+0xf0/0x180 ? do_early_param+0x8e/0x8e ? e820__memblock_setup+0x63/0x63 pci_iommu_init+0x16/0x3f do_one_initcall+0x5d/0x2b4 ? do_early_param+0x8e/0x8e ? rcu_read_lock_sched_held+0x55/0x60 ? do_early_param+0x8e/0x8e kernel_init_freeable+0x218/0x2c1 ? rest_init+0x230/0x230 kernel_init+0xa/0x100 ret_from_fork+0x3a/0x50 domain_context_mapping_one() is taking device_domain_lock first then iommu lock, while dmar_insert_one_dev_info() is doing the reverse. That should be introduced by commit: 7560cc3ca7d9 ("iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock", 2019-05-27) So far I still cannot figure out how the previous deadlock was triggered (I cannot find iommu lock taken before calling of iommu_flush_dev_iotlb()), however I'm pretty sure that that change should be incomplete at least because it does not fix all the places so we're still taking the locks in different orders, while reverting that commit is very clean to me so far that we should always take device_domain_lock first then the iommu lock. We can continue to try to find the real culprit mentioned in 7560cc3ca7d9, but for now I think we should revert it to fix current breakage. CC: Joerg Roedel <joro@8bytes.org> CC: Lu Baolu <baolu.lu@linux.intel.com> CC: dave.jiang@intel.com Signed-off-by: Peter Xu <peterx@redhat.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-18iommu/vt-d: Silence a variable set but not usedQian Cai
The commit "iommu/vt-d: Probe DMA-capable ACPI name space devices" introduced a compilation warning due to the "iommu" variable in for_each_active_iommu() but never used the for each element, i.e, "drhd->iommu". drivers/iommu/intel-iommu.c: In function 'probe_acpi_namespace_devices': drivers/iommu/intel-iommu.c:4639:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable] struct intel_iommu *iommu; Silence the warning the same way as in the commit d3ed71e5cc50 ("drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-18iommu/vt-d: Remove an unused variable "length"Qian Cai
The linux-next commit "iommu/vt-d: Duplicate iommu_resv_region objects per device list" [1] left out an unused variable, drivers/iommu/intel-iommu.c: In function 'dmar_parse_one_rmrr': drivers/iommu/intel-iommu.c:4014:9: warning: variable 'length' set but not used [-Wunused-but-set-variable] [1] https://lore.kernel.org/patchwork/patch/1083073/ Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-14Merge tag 'iommu-fixes-v5.2-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - three fixes for Intel VT-d to fix a potential dead-lock, a formatting fix and a bit setting fix - one fix for the ARM-SMMU to make it work on some platforms with sub-optimal SMMU emulation * tag 'iommu-fixes-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Avoid constant zero in TLBI writes iommu/vt-d: Set the right field for Page Walk Snoop iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock iommu: Add missing new line for dma type
2019-06-12iommu/vt-d: Consolidate domain_init() to avoid duplicationLu Baolu
The domain_init() and md_domain_init() do almost the same job. Consolidate them to avoid duplication. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Cleanup after delegating DMA domain to generic iommuSai Praneeth Prakhya
[No functional changes] 1. Starting with commit df4f3c603aeb ("iommu/vt-d: Remove static identity map code") there are no callers for iommu_prepare_rmrr_dev() but the implementation of the function still exists, so remove it. Also, as a ripple effect remove get_domain_for_dev() and iommu_prepare_identity_map() because they aren't being used either. 2. Remove extra new line in couple of places. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()Lu Baolu
The drhd and device scope list should be iterated with the iommu global lock held. Otherwise, a suspicious RCU usage message will be displayed. [ 3.695886] ============================= [ 3.695917] WARNING: suspicious RCU usage [ 3.695950] 5.2.0-rc2+ #2467 Not tainted [ 3.695981] ----------------------------- [ 3.696014] drivers/iommu/intel-iommu.c:4569 suspicious rcu_dereference_check() usage! [ 3.696069] other info that might help us debug this: [ 3.696126] rcu_scheduler_active = 2, debug_locks = 1 [ 3.696173] no locks held by swapper/0/1. [ 3.696204] stack backtrace: [ 3.696241] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2467 [ 3.696370] Call Trace: [ 3.696404] dump_stack+0x85/0xcb [ 3.696441] intel_iommu_init+0x128c/0x13ce [ 3.696478] ? kmem_cache_free+0x16b/0x2c0 [ 3.696516] ? __fput+0x14b/0x270 [ 3.696550] ? __call_rcu+0xb7/0x300 [ 3.696583] ? get_max_files+0x10/0x10 [ 3.696631] ? set_debug_rodata+0x11/0x11 [ 3.696668] ? e820__memblock_setup+0x60/0x60 [ 3.696704] ? pci_iommu_init+0x16/0x3f [ 3.696737] ? set_debug_rodata+0x11/0x11 [ 3.696770] pci_iommu_init+0x16/0x3f [ 3.696805] do_one_initcall+0x5d/0x2e4 [ 3.696844] ? set_debug_rodata+0x11/0x11 [ 3.696880] ? rcu_read_lock_sched_held+0x6b/0x80 [ 3.696924] kernel_init_freeable+0x1f0/0x27c [ 3.696961] ? rest_init+0x260/0x260 [ 3.696997] kernel_init+0xa/0x110 [ 3.697028] ret_from_fork+0x3a/0x50 Fixes: fa212a97f3a36 ("iommu/vt-d: Probe DMA-capable ACPI name space devices") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Allow DMA domain attaching to rmrr locked deviceLu Baolu
We don't allow a device to be assigned to user level when it is locked by any RMRR's. Hence, intel_iommu_attach_device() will return error if a domain of type IOMMU_DOMAIN_UNMANAGED is about to attach to a device locked by rmrr. But this doesn't apply to a domain of type other than IOMMU_DOMAIN_UNMANAGED. This adds a check to fix this. Fixes: fa954e6831789 ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reported-and-tested-by: Qian Cai <cai@lca.pw> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Don't enable iommu's which have been ignoredLu Baolu
The iommu driver will ignore some iommu units if there's no device under its scope or those devices have been explicitly set to bypass the DMA translation. Don't enable those iommu units, otherwise the devices under its scope won't work. Fixes: d8190dc638866 ("iommu/vt-d: Enable DMA remapping after rmrr mapped") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Set domain type for a private domainLu Baolu
Otherwise, domain_get_iommu() will be broken. Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-06-12iommu/vt-d: Don't return error when device gets right domainLu Baolu
If a device gets a right domain in add_device ops, it shouldn't return error. Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>