summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2017-09-05 12:19:08 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2017-09-05 12:19:08 -0700
commit439644096c1a6afb9bd9953130f4444a856f76c5 (patch)
treecdf21533aa0ec95d084988f234186f8a5071d5df
parentb42a362e6d10c342004b183defcb9940331b6737 (diff)
parentd97561f461e4cb5b2e170bc33b466cffbddc8719 (diff)
Merge tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
-rw-r--r--Documentation/ABI/testing/sysfs-power12
-rw-r--r--Documentation/admin-guide/pm/cpufreq.rst8
-rw-r--r--Documentation/admin-guide/pm/index.rst12
-rw-r--r--Documentation/admin-guide/pm/intel_pstate.rst61
-rw-r--r--Documentation/admin-guide/pm/sleep-states.rst245
-rw-r--r--Documentation/admin-guide/pm/strategies.rst52
-rw-r--r--Documentation/admin-guide/pm/system-wide.rst8
-rw-r--r--Documentation/admin-guide/pm/working-state.rst9
-rw-r--r--Documentation/devicetree/bindings/clock/mt8173-cpu-dvfs.txt83
-rw-r--r--Documentation/devicetree/bindings/cpufreq/cpufreq-mediatek.txt247
-rw-r--r--Documentation/devicetree/bindings/power/rockchip-io-domain.txt2
-rw-r--r--Documentation/power/states.txt125
-rw-r--r--arch/arm/boot/dts/tango4-smp8758.dtsi1
-rw-r--r--arch/arm/mach-tegra/cpuidle-tegra114.c4
-rw-r--r--drivers/acpi/processor_idle.c23
-rw-r--r--drivers/acpi/sleep.c202
-rw-r--r--drivers/base/power/domain.c251
-rw-r--r--drivers/base/power/main.c103
-rw-r--r--drivers/base/power/opp/of.c37
-rw-r--r--drivers/base/power/wakeup.c10
-rw-r--r--drivers/cpufreq/Kconfig.arm21
-rw-r--r--drivers/cpufreq/Makefile4
-rw-r--r--drivers/cpufreq/arm_big_little.c10
-rw-r--r--drivers/cpufreq/cppc_cpufreq.c1
-rw-r--r--drivers/cpufreq/cpufreq-dt-platdev.c65
-rw-r--r--drivers/cpufreq/cpufreq-dt.c1
-rw-r--r--drivers/cpufreq/cpufreq-nforce2.c2
-rw-r--r--drivers/cpufreq/cpufreq.c41
-rw-r--r--drivers/cpufreq/cpufreq_conservative.c6
-rw-r--r--drivers/cpufreq/cpufreq_governor.c20
-rw-r--r--drivers/cpufreq/cpufreq_governor.h3
-rw-r--r--drivers/cpufreq/cpufreq_ondemand.c12
-rw-r--r--drivers/cpufreq/dbx500-cpufreq.c103
-rw-r--r--drivers/cpufreq/elanfreq.c4
-rw-r--r--drivers/cpufreq/gx-suspmod.c2
-rw-r--r--drivers/cpufreq/imx6q-cpufreq.c9
-rw-r--r--drivers/cpufreq/intel_pstate.c322
-rw-r--r--drivers/cpufreq/longrun.c1
-rw-r--r--drivers/cpufreq/loongson2_cpufreq.c2
-rw-r--r--drivers/cpufreq/mediatek-cpufreq.c (renamed from drivers/cpufreq/mt8173-cpufreq.c)29
-rw-r--r--drivers/cpufreq/pmac32-cpufreq.c7
-rw-r--r--drivers/cpufreq/pmac64-cpufreq.c2
-rw-r--r--drivers/cpufreq/s5pv210-cpufreq.c3
-rw-r--r--drivers/cpufreq/sa1100-cpufreq.c5
-rw-r--r--drivers/cpufreq/sa1110-cpufreq.c5
-rw-r--r--drivers/cpufreq/sh-cpufreq.c3
-rw-r--r--drivers/cpufreq/speedstep-ich.c2
-rw-r--r--drivers/cpufreq/speedstep-lib.c4
-rw-r--r--drivers/cpufreq/speedstep-smi.c2
-rw-r--r--drivers/cpufreq/sti-cpufreq.c8
-rw-r--r--drivers/cpufreq/tango-cpufreq.c38
-rw-r--r--drivers/cpufreq/ti-cpufreq.c4
-rw-r--r--drivers/cpufreq/unicore2-cpufreq.c3
-rw-r--r--drivers/cpuidle/Makefile1
-rw-r--r--drivers/cpuidle/cpuidle.c18
-rw-r--r--drivers/cpuidle/driver.c32
-rw-r--r--drivers/cpuidle/dt_idle_states.c24
-rw-r--r--drivers/cpuidle/governors/ladder.c14
-rw-r--r--drivers/cpuidle/governors/menu.c13
-rw-r--r--drivers/cpuidle/poll_state.c37
-rw-r--r--drivers/devfreq/Kconfig1
-rw-r--r--drivers/devfreq/devfreq-event.c4
-rw-r--r--drivers/devfreq/devfreq.c5
-rw-r--r--drivers/devfreq/governor.h4
-rw-r--r--drivers/idle/intel_idle.c181
-rw-r--r--drivers/mfd/db8500-prcmu.c62
-rw-r--r--drivers/platform/x86/intel-hid.c17
-rw-r--r--drivers/power/avs/rockchip-io-domain.c38
-rw-r--r--drivers/regulator/of_regulator.c2
-rw-r--r--include/linux/cpufreq.h38
-rw-r--r--include/linux/cpuidle.h21
-rw-r--r--include/linux/devfreq.h13
-rw-r--r--include/linux/pm.h4
-rw-r--r--include/linux/pm_domain.h3
-rw-r--r--include/linux/suspend.h48
-rw-r--r--kernel/cpu_pm.c50
-rw-r--r--kernel/power/hibernate.c29
-rw-r--r--kernel/power/main.c64
-rw-r--r--kernel/power/power.h5
-rw-r--r--kernel/power/suspend.c184
-rw-r--r--kernel/power/suspend_test.c4
-rw-r--r--kernel/sched/cpufreq_schedutil.c98
-rw-r--r--kernel/sched/deadline.c2
-rw-r--r--kernel/sched/fair.c8
-rw-r--r--kernel/sched/idle.c8
-rw-r--r--kernel/sched/rt.c2
-rw-r--r--kernel/sched/sched.h10
-rw-r--r--kernel/time/timekeeping_debug.c5
-rw-r--r--tools/power/cpupower/utils/cpupower.c15
-rw-r--r--tools/power/cpupower/utils/helpers/cpuid.c4
-rw-r--r--tools/power/cpupower/utils/helpers/helpers.h5
-rw-r--r--tools/power/cpupower/utils/helpers/misc.c2
-rw-r--r--tools/power/cpupower/utils/idle_monitor/hsw_ext_idle.c4
-rw-r--r--tools/power/cpupower/utils/idle_monitor/mperf_monitor.c3
-rw-r--r--tools/power/cpupower/utils/idle_monitor/nhm_idle.c8
-rw-r--r--tools/power/cpupower/utils/idle_monitor/snb_idle.c4
-rw-r--r--tools/power/pm-graph/Makefile19
-rwxr-xr-xtools/power/pm-graph/analyze_boot.py586
-rwxr-xr-xtools/power/pm-graph/analyze_suspend.py534
-rw-r--r--tools/power/pm-graph/bootgraph.861
-rw-r--r--tools/power/pm-graph/sleepgraph.848
101 files changed, 2835 insertions, 1746 deletions
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
index f523e5a3ac33..713cab1d5f12 100644
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -273,3 +273,15 @@ Description:
This output is useful for system wakeup diagnostics of spurious
wakeup interrupts.
+
+What: /sys/power/pm_debug_messages
+Date: July 2017
+Contact: Rafael J. Wysocki <rjw@rjwysocki.net>
+Description:
+ The /sys/power/pm_debug_messages file controls the printing
+ of debug messages from the system suspend/hiberbation
+ infrastructure to the kernel log.
+
+ Writing a "1" to this file enables the debug messages and
+ writing a "0" (default) to it disables them. Reads from
+ this file return the current value.
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 7af83a92d2d6..47153e64dfb5 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -479,14 +479,6 @@ This governor exposes the following tunables:
# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
-
-``min_sampling_rate``
- The minimum value of ``sampling_rate``.
-
- Equal to 10000 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and
- :c:data:`tick_nohz_active` are both set or to 20 times the value of
- :c:data:`jiffies` in microseconds otherwise.
-
``up_threshold``
If the estimated CPU load is above this value (in percent), the governor
will set the frequency to the maximum value allowed for the policy.
diff --git a/Documentation/admin-guide/pm/index.rst b/Documentation/admin-guide/pm/index.rst
index 7f148f76f432..49237ac73442 100644
--- a/Documentation/admin-guide/pm/index.rst
+++ b/Documentation/admin-guide/pm/index.rst
@@ -5,12 +5,6 @@ Power Management
.. toctree::
:maxdepth: 2
- cpufreq
- intel_pstate
-
-.. only:: subproject and html
-
- Indices
- =======
-
- * :ref:`genindex`
+ strategies
+ system-wide
+ working-state
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 1d6249825efc..d2b6fda3d67b 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -167,35 +167,17 @@ is set.
``powersave``
.............
-Without HWP, this P-state selection algorithm generally depends on the
-processor model and/or the system profile setting in the ACPI tables and there
-are two variants of it.
-
-One of them is used with processors from the Atom line and (regardless of the
-processor model) on platforms with the system profile in the ACPI tables set to
-"mobile" (laptops mostly), "tablet", "appliance PC", "desktop", or
-"workstation". It is also used with processors supporting the HWP feature if
-that feature has not been enabled (that is, with the ``intel_pstate=no_hwp``
-argument in the kernel command line). It is similar to the algorithm
+Without HWP, this P-state selection algorithm is similar to the algorithm
implemented by the generic ``schedutil`` scaling governor except that the
utilization metric used by it is based on numbers coming from feedback
registers of the CPU. It generally selects P-states proportional to the
-current CPU utilization, so it is referred to as the "proportional" algorithm.
-
-The second variant of the ``powersave`` P-state selection algorithm, used in all
-of the other cases (generally, on processors from the Core line, so it is
-referred to as the "Core" algorithm), is based on the values read from the APERF
-and MPERF feedback registers and the previously requested target P-state.
-It does not really take CPU utilization into account explicitly, but as a rule
-it causes the CPU P-state to ramp up very quickly in response to increased
-utilization which is generally desirable in server environments.
-
-Regardless of the variant, this algorithm is run by the driver's utilization
-update callback for the given CPU when it is invoked by the CPU scheduler, but
-not more often than every 10 ms (that can be tweaked via ``debugfs`` in `this
-particular case <Tuning Interface in debugfs_>`_). Like in the ``performance``
-case, the hardware configuration is not touched if the new P-state turns out to
-be the same as the current one.
+current CPU utilization.
+
+This algorithm is run by the driver's utilization update callback for the
+given CPU when it is invoked by the CPU scheduler, but not more often than
+every 10 ms. Like in the ``performance`` case, the hardware configuration
+is not touched if the new P-state turns out to be the same as the current
+one.
This is the default P-state selection algorithm if the
:c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
@@ -720,34 +702,7 @@ P-state is called, the ``ftrace`` filter can be set to to
gnome-shell-3409 [001] ..s. 2537.650850: intel_pstate_set_pstate <-intel_pstate_timer_func
<idle>-0 [000] ..s. 2537.654843: intel_pstate_set_pstate <-intel_pstate_timer_func
-Tuning Interface in ``debugfs``
--------------------------------
-
-The ``powersave`` algorithm provided by ``intel_pstate`` for `the Core line of
-processors in the active mode <powersave_>`_ is based on a `PID controller`_
-whose parameters were chosen to address a number of different use cases at the
-same time. However, it still is possible to fine-tune it to a specific workload
-and the ``debugfs`` interface under ``/sys/kernel/debug/pstate_snb/`` is
-provided for this purpose. [Note that the ``pstate_snb`` directory will be
-present only if the specific P-state selection algorithm matching the interface
-in it actually is in use.]
-
-The following files present in that directory can be used to modify the PID
-controller parameters at run time:
-
-| ``deadband``
-| ``d_gain_pct``
-| ``i_gain_pct``
-| ``p_gain_pct``
-| ``sample_rate_ms``
-| ``setpoint``
-
-Note, however, that achieving desirable results this way generally requires
-expert-level understanding of the power vs performance tradeoff, so extra care
-is recommended when attempting to do that.
-
.. _LCEU2015: http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
.. _SDM: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
.. _ACPI specification: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
-.. _PID controller: https://en.wikipedia.org/wiki/PID_controller
diff --git a/Documentation/admin-guide/pm/sleep-states.rst b/Documentation/admin-guide/pm/sleep-states.rst
new file mode 100644
index 000000000000..1e5c0f00cb2f
--- /dev/null
+++ b/Documentation/admin-guide/pm/sleep-states.rst
@@ -0,0 +1,245 @@
+===================
+System Sleep States
+===================
+
+::
+
+ Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+
+Sleep states are global low-power states of the entire system in which user
+space code cannot be executed and the overall system activity is significantly
+reduced.
+
+
+Sleep States That Can Be Supported
+============