summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2020-11-10um: Call pgtable_pmd_page_dtor() in __pmd_free_tlb()Richard Weinberger
Commit b2b29d6d0119 ("mm: account PMD tables like PTE tables") uncovered a bug in uml, we forgot to call the destructor. While we are here, give x a sane name. Reported-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Christopher Obbard <chris.obbard@collabora.com>
2020-11-10Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull core dump fix from Al Viro: "Fix for multithreaded coredump playing fast and loose with getting registers of secondary threads; if a secondary gets caught in the middle of exit(2), the conditition it will be stopped in for dumper to examine might be unusual enough for things to go wrong. Quite a few architectures are fine with that, but some are not." * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: don't dump the threads that had been already exiting when zapped.
2020-11-10Merge tag 'for-5.10-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A handful of minor fixes and updates: - handle missing device replace item on mount (syzbot report) - fix space reservation calculation when finishing relocation - fix memory leak on error path in ref-verify (debugging feature) - fix potential overflow during defrag on 32bit arches - minor code update to silence smatch warning - minor error message updates" * tag 'for-5.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod btrfs: dev-replace: fail mount if we don't have replace item with target device btrfs: scrub: update message regarding read-only status btrfs: clean up NULL checks in qgroup_unreserve_range() btrfs: fix min reserved size calculation in merge_reloc_root btrfs: print the block rsv type when we fail our reservation btrfs: fix potential overflow in cluster_pages_for_defrag on 32bit arch
2020-11-10Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds
Pull fscrypt fix from Eric Biggers: "Fix a regression where a new WARN_ON() was reachable when using FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 on ext4, causing xfstest generic/602 to sometimes fail on ext4" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: remove reachable WARN in fscrypt_setup_iv_ino_lblk_32_key()
2020-11-10Merge branch 'turbostat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: "Update update to version 20.09.30, one kernel side fix" * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: update version number powercap: restrict energy meter to root access tools/power turbostat: harden against cpu hotplug tools/power turbostat: adjust for temperature offset tools/power turbostat: Build with _FILE_OFFSET_BITS=64 tools/power turbostat: Support AMD Family 19h tools/power turbostat: Remove empty columns for Jacobsville tools/power turbostat: Add a new GFXAMHz column that exposes gt_act_freq_mhz. tools/power x86_energy_perf_policy: Input/output error in a VM tools/power turbostat: Skip pc8, pc9, pc10 columns, if they are disabled tools/power turbostat: Support additional CPU model numbers tools/power turbostat: Fix output formatting for ACPI CST enumeration tools/power turbostat: Replace HTTP links with HTTPS ones: TURBOSTAT UTILITY tools/power turbostat: Use sched_getcpu() instead of hardcoded cpu 0 tools/power turbostat: Enable accumulate RAPL display tools/power turbostat: Introduce functions to accumulate RAPL consumption tools/power turbostat: Make the energy variable to be 64 bit tools/power turbostat: Always print idle in the system configuration header tools/power turbostat: Print /dev/cpu_dma_latency
2020-11-10tools/power turbostat: update version numberLen Brown
goodbye summer... Signed-off-by: Len Brown <len.brown@intel.com>
2020-11-10powercap: restrict energy meter to root accessLen Brown
Remove non-privileged user access to power data contained in /sys/class/powercap/intel-rapl*/*/energy_uj Non-privileged users currently have read access to power data and can use this data to form a security attack. Some privileged drivers/applications need read access to this data, but don't expose it to non-privileged users. For example, thermald uses this data to ensure that power management works correctly. Thus removing non-privileged access is preferred over completely disabling this power reporting capability with CONFIG_INTEL_RAPL=n. Fixes: 95677a9a3847 ("PowerCap: Fix mode for energy counter") Signed-off-by: Len Brown <len.brown@intel.com> Cc: stable@vger.kernel.org
2020-11-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - fix compilation error when PMD and PUD are folded - fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1 - add aarch64 get-reg-list test x86: - fix semantic conflict between two series merged for 5.10 - fix (and test) enforcement of paravirtual cpuid features selftests: - various cleanups to memory management selftests - new selftests testcase for performance of dirty logging" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits) KVM: selftests: allow two iterations of dirty_log_perf_test KVM: selftests: Introduce the dirty log perf test KVM: selftests: Make the number of vcpus global KVM: selftests: Make the per vcpu memory size global KVM: selftests: Drop pointless vm_create wrapper KVM: selftests: Add wrfract to common guest code KVM: selftests: Simplify demand_paging_test with timespec_diff_now KVM: selftests: Remove address rounding in guest code KVM: selftests: Factor code out of demand_paging_test KVM: selftests: Use a single binary for dirty/clear log test KVM: selftests: Always clear dirty bitmap after iteration KVM: selftests: Add blessed SVE registers to get-reg-list KVM: selftests: Add aarch64 get-reg-list test selftests: kvm: test enforcement of paravirtual cpuid features selftests: kvm: Add exception handling to selftests selftests: kvm: Clear uc so UCALL_NONE is being properly reported selftests: kvm: Fix the segment descriptor layout to match the actual layout KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrs kvm: x86: request masterclock update any time guest uses different msr kvm: x86: ensure pv_cpuid.features is initialized when enabling cap ...
2020-11-09Merge tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc refactoring" * tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux: net/sunrpc: fix useless comparison in proc_do_xprt() net/sunrpc: return 0 on attempt to write to "transports" NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy NFSD: Fix use-after-free warning when doing inter-server copy NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow() NFSD: NFSv3 PATHCONF Reply is improperly formed
2020-11-09Merge tag 'ext4_for_linus_cleanups' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes and cleanups from Ted Ts'o: "More fixes and cleanups for the new fast_commit features, but also a few other miscellaneous bug fixes and a cleanup for the MAINTAINERS file" * tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits) jbd2: fix up sparse warnings in checkpoint code ext4: fix sparse warnings in fast_commit code ext4: cleanup fast commit mount options jbd2: don't start fast commit on aborted journal ext4: make s_mount_flags modifications atomic ext4: issue fsdev cache flush before starting fast commit ext4: disable fast commit with data journalling ext4: fix inode dirty check in case of fast commits ext4: remove unnecessary fast commit calls from ext4_file_mmap ext4: mark buf dirty before submitting fast commit buffer ext4: fix code documentatioon ext4: dedpulicate the code to wait on inode that's being committed jbd2: don't read journal->j_commit_sequence without taking a lock jbd2: don't touch buffer state until it is filled jbd2: add todo for a fast commit performance optimization jbd2: don't pass tid to jbd2_fc_end_commit_fallback() jbd2: don't use state lock during commit path jbd2: drop jbd2_fc_init documentation ext4: clean up the JBD2 API that initializes fast commits jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs ...
2020-11-09Merge tag 'erofs-for-5.10-rc4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "A week ago, Vladimir reported an issue that the kernel log would become polluted if the page allocation debug option is enabled. I also found this when I cleaned up magical page->mapping and originally planned to submit these all for 5.11 but it seems the impact can be noticed so submit the fix in advance. In addition, nl6720 also reported that atime is empty although EROFS has the only one on-disk timestamp as a practical consideration for now but it's better to derive it as what we did for the other timestamps. Summary: - fix setting up pcluster improperly for temporary pages - derive atime instead of leaving it empty" * tag 'erofs-for-5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix setting up pcluster for temporary pages erofs: derive atime instead of leaving it empty
2020-11-09KVM: selftests: allow two iterations of dirty_log_perf_testPaolo Bonzini
Even though one iteration is not enough for the dirty log performance test (due to the cost of building page tables, zeroing memory etc.) two is okay and it is the default. Without this patch, "./dirty_log_perf_test" without any further arguments fails. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08Linux 5.10-rc3Linus Torvalds
2020-11-08net/sunrpc: fix useless comparison in proc_do_xprt()Dan Carpenter
In the original code, the "if (*lenp < 0)" check didn't work because "*lenp" is unsigned. Fortunately, the memory_read_from_buffer() call will never fail in this context so it doesn't affect runtime. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-08Merge tag 'driver-core-5.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core documentation fixes from Greg KH: "Some small Documentation fixes that were fallout from the larger documentation update we did in 5.10-rc2. Nothing major here at all, but all of these have been in linux-next and resolve build warnings when building the documentation files" * tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: remove mic/index from misc-devices/index.rst scripts: get_api.pl: Add sub-titles to ABI output scripts: get_abi.pl: Don't let ABI files to create subtitles docs: leds: index.rst: add a missing file docs: ABI: sysfs-class-net: fix a typo docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys
2020-11-08Merge tag 'tty-5.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are a small number of small tty and serial fixes for some reported problems for the tty core, vt code, and some serial drivers. They include fixes for: - a buggy and obsolete vt font ioctl removal - 8250_mtk serial baudrate runtime warnings - imx serial earlycon build configuration fix - txx9 serial driver error path cleanup issues - tty core fix in release_tty that can be triggered by trying to bind an invalid serial port name to a speakup console device Almost all of these have been in linux-next without any problems, the only one that hasn't, just deletes code :)" * tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: vt: Disable KD_FONT_OP_COPY tty: fix crash in release_tty if tty->port is not set serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled serial: 8250_mtk: Fix uart_get_baud_rate warning
2020-11-08Merge tag 'usb-5.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes and new device ids: - USB gadget fixes for some reported issues - Fixes for the ever-troublesome apple fastcharge driver, hopefully we finally have it right. - More USB core quirks for odd devices - USB serial driver fixes for some long-standing issues that were recently found - some new USB serial driver device ids All have been in linux-next with no reported issues" * tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property usb: mtu3: fix panic in mtu3_gadget_stop() USB: serial: option: add Telit FN980 composition 0x1055 USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 USB: serial: cyberjack: fix write-URB completion race USB: Add NO_LPM quirk for Kingston flash drive USB: serial: option: add Quectel EC200T module support usb: raw-gadget: fix memory leak in gadget_setup usb: dwc2: Avoid leaving the error_debugfs label unused usb: dwc3: ep0: Fix delay status handling usb: gadget: fsl: fix null pointer checking usb: gadget: goku_udc: fix potential crashes in probe usb: dwc3: pci: add support for the Intel Alder Lake-S
2020-11-08fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parentEddy Wu
current->group_leader->exit_signal may change during copy_process() if current->real_parent exits. Move the assignment inside tasklist_lock to avoid the race. Signed-off-by: Eddy Wu <eddy_wu@trendmicro.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-08vt: Disable KD_FONT_OP_COPYDaniel Vetter
It's buggy: On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote: > We recently discovered a slab-out-of-bounds read in fbcon in the latest > kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that > "fbcon_do_set_font" did not handle "vc->vc_font.data" and > "vc->vc_font.height" correctly, and the patch > <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this > issue. > > Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and > use KD_FONT_OP_SET again to set a large font.height for tty1. After that, > we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data > in "fbcon_do_set_font", while tty1 retains the original larger > height. Obviously, this will cause an out-of-bounds read, because we can > access a smaller vc_font.data with a larger vc_font.height. Further there was only one user ever. - Android's loadfont, busybox and console-tools only ever use OP_GET and OP_SET - fbset documentation only mentions the kernel cmdline font: option, not anything else. - systemd used OP_COPY before release 232 published in Nov 2016 Now unfortunately the crucial report seems to have gone down with gmane, and the commit message doesn't say much. But the pull request hints at OP_COPY being broken https://github.com/systemd/systemd/pull/3651 So in other words, this never worked, and the only project which foolishly every tried to use it, realized that rather quickly too. Instead of trying to fix security issues here on dead code by adding missing checks, fix the entire thing by removing the functionality. Note that systemd code using the OP_COPY function ignored the return value, so it doesn't matter what we're doing here really - just in case a lone server somewhere happens to be extremely unlucky and running an affected old version of systemd. The relevant code from font_copy_to_all_vcs() in systemd was: /* copy font from active VT, where the font was uploaded to */ cfo.op = KD_FONT_OP_COPY; cfo.height = vcs.v_active-1; /* tty1 == index 0 */ (void) ioctl(vcfd, KDFONTOP, &cfo); Note this just disables the ioctl, garbage collecting the now unused callbacks is left for -next. v2: Tetsuo found the old mail, which allowed me to find it on another archive. Add the link too. Acked-by: Peilin Ye <yepeilin.cs@gmail.com> Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html References: https://github.com/systemd/systemd/pull/3651 Cc: Greg KH <greg@kroah.com> Cc: Peilin Ye <yepeilin.cs@gmail.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-08Merge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: - Fix an uninitialized struct problem - Fix an iomap problem zeroing unwritten EOF blocks - Fix some clumsy error handling when writeback fails on filesystems with blocksize < pagesize - Fix a retry loop not resetting loop variables properly - Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel actually does permit that combination - Fix excessive page cache flushing when unsharing part of a file * tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only flush the unshared range in xfs_reflink_unshare xfs: fix scrub flagging rtinherit even if there is no rt device xfs: fix missing CoW blocks writeback conversion retry iomap: clean up writeback state logic on writepage error iomap: support partial page discard on writeback block mapping failure xfs: flush new eof page on truncate to avoid post-eof corruption xfs: set xefi_discard when creating a deferred agfl free log intent item
2020-11-08Merge branch 'hch' (patches from Christoph)Linus Torvalds
Merge procfs splice read fixes from Christoph Hellwig: "Greg reported a problem due to the fact that Android tests use procfs files to test splice, which stopped working with the changes for set_fs() removal. This series adds read_iter support for seq_file, and uses those for various proc files using seq_file to restore splice read support" [ Side note: Christoph initially had a scripted "move everything over" patch, which looks fine, but I personally would prefer us to actively discourage splice() on random files. So this does just the minimal basic core set of proc file op conversions. For completeness, and in case people care, that script was sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g' but I'll wait and see if somebody has a strong argument for using splice on random small /proc files before I'd run it on the whole kernel. - Linus ] * emailed patches from Christoph Hellwig <hch@lst.de>: proc "seq files": switch to ->read_iter proc "single files": switch to ->read_iter proc/stat: switch to ->read_iter proc/cpuinfo: switch to ->read_iter proc: wire up generic_file_splice_read for iter ops seq_file: add seq_read_iter
2020-11-08Merge tag 'x86-urgent-2020-11-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of x86 fixes: - Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs integrated assembler upset - Correct the mitigation selection logic which prevented the related prctl to work correctly - Make the UV5 hubless system work correctly by fixing up the malformed table entries and adding the missing ones" * tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Recognize UV5 hubless system identifier x86/platform/uv: Remove spaces from OEM IDs x86/platform/uv: Fix missing OEM_TABLE_ID x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S
2020-11-08Merge tag 'perf-urgent-2020-11-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single fix for the perf core plugging a memory leak in the address filter parser" * tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix a memory leak in perf_event_parse_addr_filter()
2020-11-08Merge tag 'locking-urgent-2020-11-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex fix from Thomas Gleixner: "A single fix for the futex code where an intermediate state in the underlying RT mutex was not handled correctly and triggering a BUG() instead of treating it as another variant of retry condition" * tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Handle transient "ownerless" rtmutex state correctly
2020-11-08Merge tag 'irq-urgent-2020-11-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Fix the fallout of the IPI as interrupt conversion in Kconfig and the BCM2836 interrupt chip driver - Fixes for interrupt affinity setting and the handling of hierarchical irq domains in the SiFive PLIC driver - Make the unmapped event handling in the TI SCI driver work correctly - A few minor fixes and cleanups in various chip drivers and Kconfig" * tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dt-bindings: irqchip: ti, sci-inta: Fix diagram indentation for unmapped events irqchip/ti-sci-inta: Add support for unmapped event handling dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm irqchip/sifive-plic: Fix chip_data access within a hierarchy irqchip/sifive-plic: Fix broken irq_set_affinity() callback irqchip/stm32-exti: Add all LP timer exti direct events support irqchip/bcm2836: Fix missing __init annotation irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY irqchip/mst: Make mst_intc_of_init static irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7 genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
2020-11-08Merge tag 'core-urgent-2020-11-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull entry code fix from Thomas Gleixner: "A single fix for the generic entry code to correct the wrong assumption that the lockdep interrupt state needs not to be established before calling the RCU check" * tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Fix the incorrect ordering of lockdep and RCU check
2020-11-08Merge tag 'powerpc-5.10-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user() - fix for an RCU splat at boot caused by a recent lockdep change - fix for a possible deadlock in our EEH debugfs code - several fixes for handling of _PAGE_ACCESSED on 32-bit platforms - build fix when CONFIG_NUMA=n Thanks to Andreas Schwab, Christophe Leroy, Oliver O'Halloran, Qian Cai, and Scott Cheloha. * tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/numa: Fix build when CONFIG_NUMA=n powerpc/8xx: Manage _PAGE_ACCESSED through APG bits in L1 entry powerpc/8xx: Always fault when _PAGE_ACCESSED is not set powerpc/40x: Always fault when _PAGE_ACCESSED is not set powerpc/603: Always fault when _PAGE_ACCESSED is not set powerpc: Use asm_goto_volatile for put_user() powerpc/smp: Call rcu_cpu_starting() earlier powerpc/eeh_cache: Fix a possible debugfs deadlock
2020-11-08KVM: selftests: Introduce the dirty log perf testBen Gardon
The dirty log perf test will time verious dirty logging operations (enabling dirty logging, dirtying memory, getting the dirty log, clearing the dirty log, and disabling dirty logging) in order to quantify dirty logging performance. This test can be used to inform future performance improvements to KVM's dirty logging infrastructure. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-6-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Make the number of vcpus globalAndrew Jones
We also check the input number of vcpus against the maximum supported. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-8-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Make the per vcpu memory size globalAndrew Jones
Rename vcpu_memory_bytes to something with "percpu" in it in order to be less ambiguous. Also make it global to simplify things. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-7-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Drop pointless vm_create wrapperAndrew Jones
Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201104212357.171559-3-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Add wrfract to common guest codeBen Gardon
Wrfract will be used by the dirty logging perf test introduced later in this series to dirty memory sparsely. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-5-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Simplify demand_paging_test with timespec_diff_nowBen Gardon
Add a helper function to get the current time and return the time since a given start time. Use that function to simplify the timekeeping in the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-4-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Remove address rounding in guest codeBen Gardon
Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-3-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Factor code out of demand_paging_testBen Gardon
Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code out for reuse. No functional change expected. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Use a single binary for dirty/clear log testPeter Xu
Remove the clear_dirty_log test, instead merge it into the existing dirty_log_test. It should be cleaner to use this single binary to do both tests, also it's a preparation for the upcoming dirty ring test. The default behavior will run all the modes in sequence. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012233.6013-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Always clear dirty bitmap after iterationPeter Xu
We used not to clear the dirty bitmap before because KVM_GET_DIRTY_LOG would overwrite it the next time it copies the dirty log onto it. In the upcoming dirty ring tests we'll start to fetch dirty pages from a ring buffer, so no one is going to clear the dirty bitmap for us. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012228.5916-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Add blessed SVE registers to get-reg-listAndrew Jones
Add support for the SVE registers to get-reg-list and create a new test, get-reg-list-sve, which tests them when running on a machine with SVE support. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201029201703.102716-5-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Add aarch64 get-reg-list testAndrew Jones
Check for KVM_GET_REG_LIST regressions. The blessed list was created by running on v4.15 with the --core-reg-fixup option. The following script was also used in order to annotate system registers with their names when possible. When new system registers are added the names can just be added manually using the same grep. while read reg; do if [[ ! $reg =~ ARM64_SYS_REG ]]; then printf "\t$reg\n" continue fi encoding=$(echo "$reg" | sed "s/ARM64_SYS_REG(//;s/),//") if ! name=$(grep "$encoding" ../../../../arch/arm64/include/asm/sysreg.h); then printf "\t$reg\n" continue fi name=$(echo "$name" | sed "s/.*SYS_//;s/[\t ]*sys_reg($encoding)$//") printf "\t$reg\t/* $name */\n" done < <(aarch64/get-reg-list --core-reg-fixup --list) Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201029201703.102716-3-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: test enforcement of paravirtual cpuid featuresOliver Upton
Add a set of tests that ensure the guest cannot access paravirtual msrs and hypercalls that have been disabled in the KVM_CPUID_FEATURES leaf. Expect a #GP in the case of msr accesses and -KVM_ENOSYS from hypercalls. Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Reviewed-by: Aaron Lewis <aaronlewis@google.com> Message-Id: <20201027231044.655110-7-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Add exception handling to selftestsAaron Lewis
Add the infrastructure needed to enable exception handling in selftests. This allows any of the exception and interrupt vectors to be overridden in the guest. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-4-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Clear uc so UCALL_NONE is being properly reportedAaron Lewis
Ensure the out value 'uc' in get_ucall() is properly reporting UCALL_NONE if the call fails. The return value will be correctly reported, however, the out parameter 'uc' will not be. Clear the struct to ensure the correct value is being reported in the out parameter. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-3-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Fix the segment descriptor layout to match the actual layoutAaron Lewis
Fix the layout of 'struct desc64' to match the layout described in the SDM Vol 3, Chapter 3 "Protected-Mode Memory Management", section 3.4.5 "Segment Descriptors", Figure 3-8 "Segment Descriptor". The test added later in this series relies on this and crashes if this layout is not correct. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-2-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrsPankaj Gupta
Windows2016 guest tries to enable LBR by setting the corresponding bits in MSR_IA32_DEBUGCTLMSR. KVM does not emulate MSR_IA32_DEBUGCTLMSR and spams the host kernel logs with error messages like: kvm [...]: vcpu1, guest rIP: 0xfffff800a8b687d3 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop" This patch fixes this by enabling error logging only with 'report_ignored_msrs=1'. Signed-off-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Message-Id: <20201105153932.24316-1-pankaj.gupta.linux@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: request masterclock update any time guest uses different msrOliver Upton
Commit 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) subtly changed the behavior of guest writes to MSR_KVM_SYSTEM_TIME(_NEW). Restore the previous behavior; update the masterclock any time the guest uses a different msr than before. Fixes: 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-6-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: ensure pv_cpuid.features is initialized when enabling capOliver Upton
Make the paravirtual cpuid enforcement mechanism idempotent to ioctl() ordering by updating pv_cpuid.features whenever userspace requests the capability. Extract this update out of kvm_update_cpuid_runtime() into a new helper function and move its other call site into kvm_vcpu_after_set_cpuid() where it more likely belongs. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-5-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: reads of restricted pv msrs should also result in #GPOliver Upton
commit 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") only protects against disallowed guest writes to KVM paravirtual msrs, leaving msr reads unchecked. Fix this by enforcing KVM_CPUID_FEATURES for msr reads as well. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-4-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: x86: use positive error values for msr emulation that causes #GPMaxim Levitsky
Recent introduction of the userspace msr filtering added code that uses negative error codes for cases that result in either #GP delivery to the guest, or handled by the userspace msr filtering. This breaks an assumption that a negative error code returned from the msr emulation code is a semi-fatal error which should be returned to userspace via KVM_RUN ioctl and usually kill the guest. Fix this by reusing the already existing KVM_MSR_RET_INVALID error code, and by adding a new KVM_MSR_RET_FILTERED error code for the userspace filtered msrs. Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace") Reported-by: Qian Cai <cai@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20201101115523.115780-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: Documentation: Update entry for KVM_CAP_ENFORCE_PV_CPUIDPeter Xu
Should be squashed into 66570e966dd9cb4f. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201023183358.50607-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: Documentation: Update entry for KVM_X86_SET_MSR_FILTERPeter Xu
It should be an accident when rebase, since we've already have section 8.25 (which is KVM_CAP_S390_DIAG318). Fix the number. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012044.5151-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>