summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2017-06-27Backmerge tag 'v4.12-rc7' into drm-nextDave Airlie
Linux 4.12-rc7 Needed at least rc6 for drm-misc-next-fixes, may as well go to rc7
2017-06-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few fixes for timekeeping and timers: - Plug a subtle race due to a missing READ_ONCE() in the timekeeping code where reloading of a pointer results in an inconsistent callback argument being supplied to the clocksource->read function. - Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the time keeping core code, to prevent a possible discontuity. - Apply a similar fix to the arm64 vdso clock_gettime() implementation - Add missing includes to clocksource drivers, which relied on indirect includes which fails in certain configs. - Use the proper iomem pointer for read/iounmap in a probe function" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Fix clock->read(clock) race around clocksource changes clocksource: Explicitly include linux/clocksource.h when needed clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
2017-06-23Merge tag 'acpi-4.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "This fixes the ACPI-based enumeration of some I2C and SPI devices broken in 4.11. Specifics: - I2C and SPI devices are expected to be enumerated by the I2C and SPI subsystems, respectively, but due to a change made during the 4.11 cycle, in some cases the ACPI core marks them as already enumerated which causes the I2C and SPI subsystems to overlook them, so fix that (Jarkko Nikula)" * tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Fix enumeration for special SPI and I2C devices
2017-06-23slub: make sysfs file removal asynchronousTejun Heo
Commit bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") made slub sysfs file removals synchronous to kmem_cache shutdown. Unfortunately, this created a possible ABBA deadlock between slab_mutex and sysfs draining mechanism triggering the following lockdep warning. ====================================================== [ INFO: possible circular locking dependency detected ] 4.10.0-test+ #48 Not tainted ------------------------------------------------------- rmmod/1211 is trying to acquire lock: (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40 but task is already holding lock: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slab_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x75/0x950 mutex_lock_nested+0x1b/0x20 slab_attr_store+0x75/0xd0 sysfs_kf_write+0x45/0x60 kernfs_fop_write+0x13c/0x1c0 __vfs_write+0x28/0x120 vfs_write+0xc8/0x1e0 SyS_write+0x49/0xa0 entry_SYSCALL_64_fastpath+0x1f/0xc2 -> #0 (s_active#120){++++.+}: __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slab_mutex); lock(s_active#120); lock(slab_mutex); lock(s_active#120); *** DEADLOCK *** 2 locks held by rmmod/1211: #0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80 #1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 stack backtrace: CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 Call Trace: print_circular_bug+0x1be/0x210 __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 ? SyS_delete_module+0x5/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 It'd be the cleanest to deal with the issue by removing sysfs files without holding slab_mutex before the rest of shutdown; however, given the current code structure, it is pretty difficult to do so. This patch punts sysfs file removal to a work item. Before commit bf5eb3de3847, the removal was punted to a RCU delayed work item which is executed after release. Now, we're punting to a different work item on shutdown which still maintains the goal removing the sysfs files earlier when destroying kmem_caches. Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org Fixes: bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-21Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "This contains a set of fixes for xen-blkback by way of Konrad, and a performance regression fix for blk-mq for shared tags. The latter could account for as much as a 50x reduction in performance, with the test case from the user with 500 name spaces. A more realistic setup on my end with 32 drives showed a 3.5x drop. The fix has been thoroughly tested before being committed" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: fix performance regression with shared tags xen-blkback: don't leak stack data via response ring xen/blkback: don't use xen_blkif_get() in xen-blkback kthread xen/blkback: don't free be structure too early xen/blkback: fix disconnect while I/Os in flight
2017-06-21ACPI / scan: Fix enumeration for special SPI and I2C devicesJarkko Nikula
Commit f406270bf73d ("ACPI / scan: Set the visited flag for all enumerated devices") caused that two group of special SPI or I2C devices do not enumerate. SPI and I2C devices are expected to be enumerated by the SPI and I2C subsystems but change caused that acpi_bus_attach() marks those devices with acpi_device_set_enumerated(). First group of devices are matched using Device Tree compatible property with special _HID "PRP0001". Those devices have matched scan handler, acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them with acpi_device_set_enumerated(). Second group of devices without valid _HID such as "LNXVIDEO" have device->pnp.type.platform_id set to zero and change again marks them with acpi_device_set_enumerated(). Fix this by flagging the SPI and I2C devices during struct acpi_device object initialization time and let the code in acpi_bus_attach() to go through the device_attach() and acpi_default_enumeration() path for all SPI and I2C devices. Fixes: f406270bf73d (ACPI / scan: Set the visited flag for all enumerated devices) Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: 4.11+ <stable@vger.kernel.org> # 4.11+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix refcounting wrt timers which hold onto inet6 address objects, from Xin Long. 2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg. 3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel. 4) Several mlx5 driver fixes (firmware readiness, timestamp cap reporting, devlink command validity checking, tc offloading, etc.) From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz. 5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan. 6) Fix dst refcount bug in decnet, from Wei Wang. 7) Netdev can be double freed in register_vlan_device(). Fix from Gao Feng. 8) Don't allow object to be destroyed while it is being dumped in SCTP, from Xin Long. 9) Fix dpaa_eth build when modular, from Madalin Bucur. 10) Fix throw route leaks, from Serhey Popovych. 11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table, also from Serhey Popovych. 12) Fix premature TX SKB free in stmmac, from Niklas Cassel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) igmp: add a missing spin_lock_init() net: stmmac: free an skb first when there are no longer any descriptors using it sfc: remove duplicate up_write on VF filter_sem rtnetlink: add IFLA_GROUP to ifla_policy ipv6: Do not leak throw route references dt-bindings: net: sms911x: Add missing optional VDD regulators dpaa_eth: reuse the dma_ops provided by the FMan MAC device fsl/fman: propagate dma_ops net/core: remove explicit do_softirq() from busy_poll_stop() fib_rules: Resolve goto rules target on delete sctp: ensure ep is not destroyed before doing the dump net/hns:bugfix of ethtool -t phy self_test net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev cxgb4: notify uP to route ctrlq compl to rdma rspq ip6_tunnel: Correct tos value in collect_md mode decnet: always not take dst->__refcnt when inserting dst into hash table ip6_tunnel: fix potential issue in __ip6_tnl_rcv ip_tunnel: fix potential issue in ip_tunnel_rcv brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it ...
2017-06-21blk-mq: fix performance regression with shared tagsJens Axboe
If we have shared tags enabled, then every IO completion will trigger a full loop of every queue belonging to a tag set, and every hardware queue for each of those queues, even if nothing needs to be done. This causes a massive performance regression if you have a lot of shared devices. Instead of doing this huge full scan on every IO, add an atomic counter to the main queue that tracks how many hardware queues have been marked as needing a restart. With that, we can avoid looking for restartable queues, if we don't have to. Max reports that this restores performance. Before this patch, 4K IOPS was limited to 22-23K IOPS. With the patch, we are running at 950-970K IOPS. Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared") Reported-by: Max Gurtovoy <maxg@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-21Merge tag 'drm-misc-next-2017-06-19_0' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-misc into drm-next UAPI Changes: - vc4: Add get/set tiling format ioctls (Eric) Driver Changes: - vc4: Add tiling T-format support for scanout (Eric) - vc4: Use atomic helpers in commit (Boris) Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Eric Anholt <eric@anholt.net> * tag 'drm-misc-next-2017-06-19_0' of git://anongit.freedesktop.org/git/drm-misc: drm/vc4: Mimic drm_atomic_helper_commit() behavior drm/vc4: Add get/set tiling ioctls. drm/vc4: Add T-format scanout support.
2017-06-21Merge tag 'drm-intel-next-2017-06-19' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-intel into drm-next Final pile of features for 4.13 New uabi: - batch bo in first slot, for faster execbuf assembly in userspace (Chris Wilson) - (sub)slice getparam, needed for mesa perf support (Robert Bragg) First pile of patches for cnl/cfl support, maintained by Rodrigo but with lots of contributions from others. Still incomplete since public review still ongoing. Features/refactoring: - Make execbuf faster (Chris Wilson), a pile of series to make execbuf buffer handling have fewer passes, use less list walking, postpone more work to async workers and shuffle buffers less, all to make the common case much faster (in some cases at least). - cold boot support for glk dsi (Madhav Chauhan) - Clean up pipe A quirk and related old platform hacks (Ville) - perf sampling support for kbl/glk (Lionel) - perf cleanups (Robert Bragg) - wire atomic state to backlight code, to avoid pipe lookup hacks (Maarten) - reduce request waiting latency/overhead to remove the spinning and associated cpu cycle wasting (Chris) - fix 90/270 rotation wm computation (Ville) - new ddb allocation algo for skl (Kumar Mahesh) - fix regression due to system suspend optimiazatino (Imre) - the usual pile of small cleanups and refactors all over GVT updates contained in this tag: - optimization for per-VM mmio save/restore (Changbin) - optimization for mmio hash table (Changbin) - scheduler optimization with event (Ping) - vGPU reset refinement (Fred) - other misc refactor and cleanups, etc. * tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel: (170 commits) drm/i915: Update DRIVER_DATE to 20170619 drm/i915/cfl: Introduce Coffee Lake workarounds. drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH drm/i915: Stash a pointer to the obj's resv in the vma drm/i915: Async GPU relocation processing drm/i915: Allow execbuffer to use the first object as the batch drm/i915: Wait upon userptr get-user-pages within execbuffer drm/i915: First try the previous execbuffer location drm/i915: Store a persistent reference for an object in the execbuffer cache drm/i915: Eliminate lots of iterations over the execobjects array drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations drm/i915: Pass vma to relocate entry drm/i915: Store a direct lookup from object handle to vma drm/i915: Fix retrieval of hangcheck stats drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty drm/i915: Mark CPU cache as dirty on every transition for CPU writes drm/i915: Make i915_vma_destroy() static drm/i915: Actually attach the tv_format property to the SDVO connector Revert "drm/i915/skl: New ddb allocation algorithm" drm/i915/glk: Add cold boot sequence for GLK DSI ...
2017-06-21Merge tag 'drm-msm-next-2017-06-20' of ↵Dave Airlie
git://people.freedesktop.org/~robclark/linux into drm-next This time around, the biggest thing is a bunch of GEM rework for more fine grained locking and prep work to handle multiple address spaces (ie. per-process pagetables). Also some HDMI fixes for 8x96 (snapdragon 820). One unrelated bus patch, for something that seems to get merged through whatever random tree (and has all the right ack's). * tag 'drm-msm-next-2017-06-20' of git://people.freedesktop.org/~robclark/linux: drm/msm: Fix potential buffer overflow issue bus: SIMPLE_PM_BUS does not depend on ARCH_RENESAS drm/msm: Separate locking of buffer resources from struct_mutex drm/msm/hdmi: Fix HDMI pink strip issue seen on 8x96 drm/msm/hdmi: 8996 PLL: Populate unprepare drm/msm/hdmi: Use bitwise operators when building register values drm/msm: update generated headers drm/msm: remove address-space id drm/msm: support for an arbitrary number of address spaces drm/msm: refactor how we handle vram carveout buffers drm/msm: pass address-space to _get_iova() and friends drm/msm/mdp4+5: move aspace/id to base class drm/msm/mdp5: kill pipe_lock drm/msm: fix locking inconsistency for gpu->hw_init() drm/msm: Remove memptrs->wptr drm/msm: Add a struct to pass configuration to msm_gpu_init() drm/msm: Add hint to DRM_IOCTL_MSM_GEM_INFO to return an object IOVA drm/msm: Remove idle function hook drm/msm: Remove DRM_MSM_NUM_IOCTLS drm/msm: gpu: Enable zap shader for A5XX
2017-06-20time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accountingJohn Stultz
Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: "stable #4 . 8+" <stable@vger.kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-20time: Fix clock->read(clock) race around clocksource changesJohn Stultz
In tests, which excercise switching of clocksources, a NULL pointer dereference can be observed on AMR64 platforms in the clocksource read() function: u64 clocksource_mmio_readl_down(struct clocksource *c) { return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; } This is called from the core timekeeping code via: cycle_now = tkr->read(tkr->clock); tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed then tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere including NULL. This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence: now = tk->clock->read(tk->clock); Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issue the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in. Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather then just a dummy read function. Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: stable <stable@vger.kernel.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Daniel Mentz <danielmentz@google.com> Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-20Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "One build fix for an Amlogic clk driver and a handful of Allwinner clk driver fixes for some DT bindings and a randconfig build error that all came in this merge window" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM clk: meson: gxbb: fix build error without RESET_CONTROLLER clk: sunxi-ng: v3s: Fix usb otg device reset bit clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
2017-06-20Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next A few more things for 4.13: - Semaphore support using sync objects - Drop fb location programming - Optimize bo list ioctl * 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: Optimize mutex usage (v4) drm/amdgpu: Optimization of AMDGPU_BO_LIST_OP_CREATE (v2) amdgpu: use drm sync objects for shared semaphores (v6) amdgpu/cs: split out fence dependency checking (v2) drm/amdgpu: don't check the default value for vm size
2017-06-20Merge tag 'drm/tegra/for-4.13-rc1' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v4.13-rc1 This starts off with the addition of more documentation for the host1x and DRM drivers and finishes with a slew of fixes and enhancements for the staging IOCTLs as a result of the awesome work done by Dmitry and Erik on the grate reverse-engineering effort. * tag 'drm/tegra/for-4.13-rc1' of git://anongit.freedesktop.org/tegra/linux: gpu: host1x: At first try a non-blocking allocation for the gather copy gpu: host1x: Refactor channel allocation code gpu: host1x: Remove unused host1x_cdma_stop() definition gpu: host1x: Remove unused 'struct host1x_cmdbuf' gpu: host1x: Check waits in the firewall gpu: host1x: Correct swapped arguments in the is_addr_reg() definition gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall gpu: host1x: Forbid RESTART opcode in the firewall gpu: host1x: Forbid relocation address shifting in the firewall gpu: host1x: Do not leak BO's phys address to userspace gpu: host1x: Correct host1x_job_pin() error handling gpu: host1x: Initialize firewall class to the job's one drm/tegra: dc: Disable plane if it is invisible drm/tegra: dc: Apply clipping to the plane drm/tegra: dc: Avoid reset asserts on Tegra20 drm/tegra: Check syncpoint ID in the 'submit' IOCTL drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL drm/tegra: Add driver documentation gpu: host1x: Flesh out kerneldoc
2017-06-19mm: larger stack guard gap, between vmasHugh Dickins
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19Merge tag 'mac80211-for-davem-2017-06-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Here's just the fix for that ancient bug: * remove wext calling ndo_do_ioctl, since nobody needs that now and it makes the type change easier * use struct iwreq instead of struct ifreq almost everywhere in wireless extensions code * copy only struct iwreq from userspace in dev_ioctl for the wireless extensions, since it's smaller than struct ifreq ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16amdgpu: use drm sync objects for shared semaphores (v6)Dave Airlie
This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls. The command submission interface is enhanced with two new chunks, one for syncobj pre submission dependencies, and one for post submission sync obj signalling, and just takes a list of handles for each. This is based on work originally done by David Zhou at AMD, with input from Christian Konig on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well. NOTE: this interface addition needs a version bump to expose it to userspace. TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian) v1.1: keep file reference on import. v2: move to using syncobjs v2.1: change some APIs to just use p pointer. v3: make more robust against CS failures, we now add the wait sems but only remove them once the CS job has been submitted. v4: rewrite names of API and base on new syncobj code. v5: move post deps earlier, rename some apis v6: lookup post deps earlier, and just replace fences in post deps stage (Christian) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16drm/i915: Allow execbuffer to use the first object as the batchChris Wilson
Currently, the last object in the execlist is the always the batch. However, when building the batch buffer we often know the batch object first and if we can use the first slot in the execlist we can emit relocation instructions relative to it immediately and avoid a separate pass to adjust the relocations to point to the last execlist slot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/msm: Add hint to DRM_IOCTL_MSM_GEM_INFO to return an object IOVAJordan Crouse
Modify the 'pad' member of struct drm_msm_gem_info to 'flags'. If the user sets 'flags' to non-zero it means that they want a IOVA for the GEM object instead of a mmap() offset. Return the iova in the 'offset' member. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> [robclark: s/hint/flags in commit msg] Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16drm/msm: Remove DRM_MSM_NUM_IOCTLSJordan Crouse
The ioctl array is sparsely populated but the compiler will make sure that it is sufficiently sized for all the values that we have so we can safely use ARRAY_SIZE() instead of having a constantly changing #define in the uapi header. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16Merge tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfsLinus Torvalds
Pull configfs updates from Christoph Hellwig: "A fix from Nic for a race seen in production (including a stable tag). And while I'm sending you this I'm also sneaking in a trivial new helper from Bart so that we don't need inter-tree dependencies for the next merge window" * tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs: configfs: Introduce config_item_get_unless_zero() configfs: Fix race between create_link and configfs_rmdir
2017-06-16Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer fix from Jens Axboe: "Just a single fix this week, fixing a regression introduced in this release. When we put the final reference to the queue, we may need to block. Ensure that we can safely do so. From Bart" * 'for-linus' of git://git.kernel.dk/linux-block: block: Fix a blk_exit_rl() regression
2017-06-16Merge branch 'dmi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging Pull dmi fixes from Jean Delvare. * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: firmware: dmi_scan: Check DMI structure length firmware: dmi: Fix permissions of product_family firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes firmware: dmi_scan: Look for SMBIOS 3 entry point first
2017-06-16BackMerge tag 'v4.12-rc5' into drm-nextDave Airlie
Linux 4.12-rc5 for nouveau fixes
2017-06-16Merge tag 'imx-drm-next-2017-06-08' of ↵Dave Airlie
git://git.pengutronix.de/git/pza/linux into drm-next imx-drm: cleanups and YUV 4:2:0 memory read/write reduction support - Remove counter load enable form PRE, which has no effect. - Add support for setting the double read/write reduction flag in channel parameter memory. This can be used to save some memory bandwidth when capturing in YUV 4:2:0 chroma subsampled formats. - Allocate DMA channel structures as needed, most of the 64 channels are unused or even reserved. - Remove unused interrupt busy waiting routine. - Set VDIC field order for both AUTO and MAN inputs simultaneously as both can't be active at the same time. * tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux: gpu: ipu-v3: vdic: include AUTO field order bit in ipu_vdi_set_field_order gpu: ipu-v3: remove interrupt busy waiting routine gpu: ipu-v3: allocate ipuv3_channels as needed gpu: ipu-v3: Add support for double read/write reduction gpu: ipu-v3: prg: remove counter load enable
2017-06-16Merge branch 'drm/next/du' of git://linuxtv.org/pinchartl/media into drm-nextDave Airlie
The series interleaves DRM and V4L2 patches due to dependencies between the R- Car DU and VSP drivers. Mauro has acked all the V4L2 patches to go through your tree, and they don't conflict with anything queued for v4.13 in his tree. If I need to send any conflicting patches through Mauro's tree for v4.13, I'll make sure to base them on this branch. * 'drm/next/du' of git://linuxtv.org/pinchartl/media: drm: rcar-du: Map memory through the VSP device v4l: vsp1: Add API to map and unmap DRM buffers through the VSP v4l: vsp1: Map the DL and video buffers through the proper bus master v4l: rcar-fcp: Add an API to retrieve the FCP device v4l: rcar-fcp: Don't get/put module reference drm: rcar-du: Register a completion callback with VSP1 v4l: vsp1: Extend VSP1 module API to allow DRM callbacks v4l: vsp1: Postpone frame end handling in event of display list race drm: rcar-du: Arm the page flip event after queuing the page flip
2017-06-16Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next New radeon and amdgpu features for 4.13: - Lots of Vega10 bug fixes - Preliminary Raven support - KIQ support for compute rings - MEC queue management rework from Andres - Audio support for DCE6 - SR-IOV improvements - Improved module parameters for controlling radeon vs amdgpu support for SI and CIK - Bug fixes - General code cleanups [airlied: dropped drmP.h header from one file was needed and build broke] * 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits) drm/amdgpu: Fix compiler warnings drm/amdgpu: vm_update_ptes remove code duplication drm/amd/amdgpu: Port VCN over to new SOC15 macros drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros drm/amd/amdgpu: Port MMHUB over to new SOC15 macros drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros drm/amd/amdgpu: Add offset variant to SOC15 macros drm/amd/powerplay: add avfs control for Vega10 drm/amdgpu: add virtual display support for raven drm/amdgpu/gfx9: fix compute ring doorbell index drm/amd/amdgpu: Rename KIQ ring to avoid spaces drm/amd/amdgpu: gfx9 tidy ups (v2) drm/amdgpu: add contiguous flag in ucode bo create drm/amdgpu: fix missed gpu info firmware when cache firmware during S3 drm/amdgpu: export test ib debugfs interface ...
2017-06-16Merge tag 'drm-misc-next-2017-06-15' of ↵Dave Airlie
git://anongit.freedesktop.org/git/drm-misc into drm-next Cross-subsystem Changes: - dt-bindings: add vendor prefix for NLT Technologies, Ltd. (Lucas) - dt-bindings: Add support for samsung s6e3hf2 panel (Hoegeun) Core Changes: - Add drm_panel_bridge to avoid connector boilerplate in drivers (Eric) - Trival fixes for dupe forward decl and reduce scope of variable (Dawid) Driver Changes: - dw-hdmi: Use mode_valid hook on bridge instead of connector (Jose) - vc4,atmel-hlcdc: Use drm_panel_bridge where appropriate (Eric) - panel: Add Innolux P079ZCA panel driver (Chris) - panel-simple: Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels (Lucas) - panel-samsung-s6e3ha2: Add s6e3hf2 panel support (Hoegeun) - zte,vc4,pl111,panel,mxsfb: Miscellaneous fixes Cc: Jose Abreu <Jose.Abreu@synopsys.com> Cc: Eric Anholt <eric@anholt.net> Cc: Chris Zhong <zyw@rock-chips.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Hoegeun Kwon <hoegeun.kwon@samsung.com> Cc: Dawid Kurek <dawikur@gmail.com> * tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc: (26 commits) drm: Reduce scope of 'state' variable drm: mxsfb_crtc: Reset the eLCDIF controller drm: Remove duplicate forward declaration drm/panel: s6e3ha2: Add support for s6e3hf2 panel on TM2e board dt-bindings: Add support for samsung s6e3hf2 panel drm/panel: add backlight dependency for sitronix-st7789v drm/panel: S6E3HA2 needs backlight code drm/panel: simple: add support for AUO P320HVN03 drm/panel: simple: add support for NLT NL192108AC18-02D dt-bindings: add vendor prefix for NLT Technologies, Ltd. drm/panel: simple: add support for NEC NL12880B20-05 drm/panel: add Innolux P079ZCA panel driver dt-bindings: Add INNOLUX P079ZCA panel bindings drm/vc4: Fix resource leak in 'vc4_get_hang_state_ioctl()' in error handling path drm/vc4/vc4_bo.c: always set bo->resv drm: Add const to name field declaration in struct drm_prop_enum_list drm/pl111: Fix offset calculation for the primary plane. drm/atmel-hlcdc: Fix panel registration drm/bridge: Build the panel wrapper in drm_kms_helper drm/atmel-hlcdc: Replace the panel usage with drm_panel_bridge. ...
2017-06-15drm/vc4: Add get/set tiling ioctls.Eric Anholt
This allows mesa to set the tiling format for a BO and have that tiling format be respected by mesa on the other side of an import/export (and by vc4 scanout in the kernel), without defining a protocol to pass the tiling through userspace. Signed-off-by: Eric Anholt <eric@anholt.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net Acked-by: Dave Airlie <airlied@redhat.com>
2017-06-15drm/vc4: Add T-format scanout support.Eric Anholt
The T tiling format is what V3D uses for textures, with no raster support at all until later revisions of the hardware (and always at a large 3D performance penalty). If we can't scan out V3D's format, then we often need to do a relayout at some stage of the pipeline, either right before texturing from the scanout buffer (common in X11 without a compositor) or between a tiled screen buffer right before scanout (an option I've considered in trying to resolve this inconsistency, but which means needing to use the dirty fb ioctl and having some update policy). T-format scanout lets us avoid either of those shadow copies, for a massive, obvious performance improvement to X11 window dragging without a compositor. Unfortunately, enabling a compositor to work around the discrepancy has turned out to be too costly in memory consumption for the Raspbian distribution. Because the HVS operates a scanline at a time, compositing from T does increase the memory bandwidth cost of scanout. On my 1920x1080@32bpp display on a RPi3, we go from about 15% of system memory bandwidth with linear to about 20% with tiled. However, for X11 this still ends up being a huge performance win in active usage. This patch doesn't yet handle src_x/src_y offsetting within the tiled buffer. However, we fail to do so for untiled buffers already. Signed-off-by: Eric Anholt <eric@anholt.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-06-15gpu: host1x: Refactor channel allocation codeMikko Perttunen
This is largely a rewrite of the Host1x channel allocation code, bringing several changes: - The previous code could deadlock due to an interaction between the 'reflock' mutex and CDMA timeout handling. This gets rid of the mutex. - Support for more than 32 channels, required for Tegra186 - General refactoring, including better encapsulation of channel ownership handling into channel.c Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15gpu: host1x: Correct swapped arguments in the is_addr_reg() definitionDmitry Osipenko
Arguments of the .is_addr_reg() are swapped in the definition of the function, that is quite confusing. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15gpu: host1x: Forbid unrelated SETCLASS opcode in the firewallDmitry Osipenko
Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTLDmitry Osipenko
The waitchecks along with multiple syncpoints per submit are not ready for use yet, let's forbid them for now. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15gpu: host1x: Flesh out kerneldocThierry Reding
Improve kerneldoc for the public parts of the host1x infrastructure in preparation for adding driver-specific part to the GPU documentation. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codesAndy Lutomirski
Currently they return -1 on error, which will confuse callers if they try to interpret it as a normal negative error code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2017-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) The netlink attribute passed in to dev_set_alias() is not necessarily NULL terminated, don't use strlcpy() on it. From Alexander Potapenko. 2) Fix implementation of atomics in arm64 bpf JIT, from Daniel Borkmann. 3) Correct the release of netdevs and driver private data in certain circumstances. 4) Sanitize netlink message length properly in decnet, from Mateusz Jurczyk. 5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From Yuval Mintz. 6) Hash secret is never initialized in ipv6 ILA translation code, from Arnd Bergmann. I guess those clang warnings about unused inline functions are useful for something! 7) Fix endian selection in bpf_endian.h, from Daniel Borkmann. 8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX and CAIF. From Mateusz Jurczyk. 9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario Molitor. 10) Do not leak netdev on dev_alloc_name() errors in mac80211, from Johannes Berg. 11) Fix locking in sctp_for_each_endpoint(), from Xin Long. 12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle. 13) Fix use after free in ip_mc_clear_src(), from WANG Cong. 14) Fix regressions caused by ICMP rate limiting changes in 4.11, from Jesper Dangaard Brouer. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits) i40e: Fix a sleep-in-atomic bug net: don't global ICMP rate limit packets originating from loopback net/act_pedit: fix an error code net: update undefined ->ndo_change_mtu() comment net_sched: move tcf_lock down after gen_replace_estimator() caif: Add sockaddr length check before accessing sa_family in connect handler qed: fix dump of context data qmi_wwan: new Telewell and Sierra device IDs net: phy: Fix MDIO_THUNDER dependencies netconsole: Remove duplicate "netconsole: " logging prefix igmp: acquire pmc lock for ip_mc_clear_src() r8152: give the device version net: rps: fix uninitialized symbol warning mac80211: don't send SMPS action frame in AP mode when not needed mac80211/wpa: use constant time memory comparison for MACs mac80211: set bss_info data before configuring the channel mac80211: remove 5/10 MHz rate code from station MLME mac80211: Fix incorrect condition when checking rx timestamp mac80211: don't look at the PM bit of BAR frames i40e: fix handling of HW ATR eviction ...
2017-06-15Merge tag 'acpi-4.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These revert an ACPICA commit from the 4.11 cycle that causes problems to happen on some systems and add a protection against possible kernel crashes due to table reference counter imbalance. Specifics: - Revert a 4.11 ACPICA change that made assumptions which are not satisfied on some systems and caused the enumeration of resources to fail on them (Rafael Wysocki). - Add a mechanism to prevent tables from being unmapped prematurely due to reference counter overflows (Lv Zheng)" * tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
2017-06-15Merge tag 'media/v4.12-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - some build dependency issues at CEC core with randconfigs - fix an off by one error at vb2 - a race fix at cec core - driver fixes at tc358743, sir_ir and rainshadow-cec * tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] media/cec.h: use IS_REACHABLE instead of IS_ENABLED [media] cec: race fix: don't return -ENONET in cec_receive() [media] sir_ir: infinite loop in interrupt handler [media] cec-notifier.h: handle unreachable CONFIG_CEC_CORE [media] cec: improve MEDIA_CEC_RC dependencies [media] vb2: Fix an off by one error in 'vb2_plane_vaddr' [media] rainshadow-cec: Fix missing spin_lock_init() [media] tc358743: fix register i2c_rd/wr function fix
2017-06-14drm: Remove duplicate forward declarationDawid Kurek
Forward declarations in C are great but I'm pretty sure one is enough. Signed-off-by: Dawid Kurek <dawikur@gmail.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170614213518.GA3554@gmail.com
2017-06-15Merge branch 'acpica-fixes'Rafael J. Wysocki
* acpica-fixes: ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
2017-06-14drm/i915/perf: Add OA unit support for Gen 8+Robert Bragg
Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all share (more-or-less) the same OA unit design. Of particular note in comparison to Haswell: some OA unit HW config state has become per-context state and as a consequence it is somewhat more complicated to manage synchronous state changes from the cpu while there's no guarantee of what context (if any) is currently actively running on the gpu. The periodic sampling frequency which can be particularly useful for system-wide analysis (as opposed to command stream synchronised MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to have become per-context save and restored (while the OABUFFER destination is still a shared, system-wide resource). This support for gen8+ takes care to consider a number of timing challenges involved in synchronously updating per-context state primarily by programming all config state from the cpu and updating all current and saved contexts synchronously while the OA unit is still disabled. The driver intentionally avoids depending on command streamer programming to update OA state considering the lack of synchronization between the automatic loading of OACTXCONTROL state (that includes the periodic sampling state and enable state) on context restore and the parsing of any general purpose BB the driver can control. I.e. this implementation is careful to avoid the possibility of a context restore temporarily enabling any out-of-date periodic sampling state. In addition to the risk of transiently-out-of-date state being loaded automatically; there are also internal HW latencies involved in the loading of MUX configurations which would be difficult to account for from the command streamer (and we only want to enable the unit when once the MUX configuration is complete). Since the Gen8+ OA unit design no longer supports clock gating the unit off for a single given context (which effectively stopped any progress of counters while any other context was running) and instead supports tagging OA reports with a context ID for filtering on the CPU, it means we can no longer hide the system-wide progress of counters from a non-privileged application only interested in metrics for its own context. Although we could theoretically try and subtract the progress of other contexts before forwarding reports via read() we aren't in a position to filter reports captured via MI_REPORT_PERF_COUNT commands. As a result, for Gen8+, we always require the dev.i915.perf_stream_paranoid to be unset for any access to OA metrics if not root. v5: Drain submitted requests when enabling metric set to ensure no lite-restore erases the context image we just updated (Lionel) v6: In addition to drain, sw