summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--Documentation/ABI/testing/sysfs-power45
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt22
-rw-r--r--Documentation/cpu-freq/cpufreq-stats.txt6
-rw-r--r--Documentation/cpu-freq/intel-pstate.txt54
-rw-r--r--Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt78
-rw-r--r--Documentation/devicetree/bindings/opp/opp.txt27
-rw-r--r--Documentation/devicetree/bindings/power/domain-idle-state.txt33
-rw-r--r--Documentation/devicetree/bindings/power/power_domain.txt43
-rw-r--r--Documentation/power/devices.txt14
-rw-r--r--Documentation/power/states.txt62
-rw-r--r--MAINTAINERS12
-rw-r--r--arch/arm/mach-imx/gpc.c17
-rw-r--r--arch/x86/kernel/acpi/wakeup_64.S9
-rw-r--r--arch/x86/power/hibernate_64.c94
-rw-r--r--drivers/acpi/processor_perflib.c55
-rw-r--r--drivers/acpi/sleep.c8
-rw-r--r--drivers/base/power/domain.c363
-rw-r--r--drivers/base/power/main.c2
-rw-r--r--drivers/base/power/opp/core.c521
-rw-r--r--drivers/base/power/opp/debugfs.c52
-rw-r--r--drivers/base/power/opp/of.c111
-rw-r--r--drivers/base/power/opp/opp.h23
-rw-r--r--drivers/base/power/power.h19
-rw-r--r--drivers/base/power/qos.c6
-rw-r--r--drivers/base/power/runtime.c62
-rw-r--r--drivers/base/power/sysfs.c6
-rw-r--r--drivers/base/power/wakeirq.c76
-rw-r--r--drivers/base/power/wakeup.c6
-rw-r--r--drivers/cpufreq/Kconfig.arm29
-rw-r--r--drivers/cpufreq/Makefile2
-rw-r--r--drivers/cpufreq/acpi-cpufreq.c117
-rw-r--r--drivers/cpufreq/brcmstb-avs-cpufreq.c1057
-rw-r--r--drivers/cpufreq/cppc_cpufreq.c7
-rw-r--r--drivers/cpufreq/cpufreq-dt-platdev.c15
-rw-r--r--drivers/cpufreq/cpufreq-dt.c12
-rw-r--r--drivers/cpufreq/cpufreq.c25
-rw-r--r--drivers/cpufreq/cpufreq_conservative.c46
-rw-r--r--drivers/cpufreq/cpufreq_governor.c30
-rw-r--r--drivers/cpufreq/cpufreq_governor.h5
-rw-r--r--drivers/cpufreq/cpufreq_ondemand.c17
-rw-r--r--drivers/cpufreq/cpufreq_stats.c22
-rw-r--r--drivers/cpufreq/integrator-cpufreq.c239
-rw-r--r--drivers/cpufreq/intel_pstate.c826
-rw-r--r--drivers/cpufreq/powernv-cpufreq.c65
-rw-r--r--drivers/cpuidle/cpuidle-powernv.c2
-rw-r--r--drivers/cpuidle/cpuidle.c19
-rw-r--r--drivers/cpuidle/dt_idle_states.c6
-rw-r--r--drivers/cpuidle/governor.c4
-rw-r--r--drivers/cpuidle/governors/ladder.c2
-rw-r--r--drivers/cpuidle/governors/menu.c2
-rw-r--r--drivers/cpuidle/sysfs.c4
-rw-r--r--drivers/devfreq/devfreq.c2
-rw-r--r--drivers/devfreq/event/exynos-nocp.c1
-rw-r--r--drivers/devfreq/event/exynos-ppmu.c6
-rw-r--r--drivers/devfreq/event/rockchip-dfi.c1
-rw-r--r--drivers/devfreq/exynos-bus.c29
-rw-r--r--drivers/devfreq/rk3399_dmc.c15
-rw-r--r--drivers/idle/intel_idle.c154
-rw-r--r--drivers/net/ethernet/smsc/smsc911x.c6
-rw-r--r--drivers/power/avs/rockchip-io-domain.c2
-rw-r--r--drivers/powercap/intel_rapl.c389
-rw-r--r--drivers/thermal/intel_powerclamp.c359
-rw-r--r--include/acpi/processor.h3
-rw-r--r--include/linux/cpu.h2
-rw-r--r--include/linux/cpufreq.h6
-rw-r--r--include/linux/cpuidle.h9
-rw-r--r--include/linux/pm_domain.h28
-rw-r--r--include/linux/pm_opp.h72
-rw-r--r--include/linux/pm_runtime.h11
-rw-r--r--include/linux/sched.h3
-rw-r--r--include/linux/suspend.h2
-rw-r--r--kernel/fork.c2
-rw-r--r--kernel/power/main.c88
-rw-r--r--kernel/power/power.h6
-rw-r--r--kernel/power/suspend.c69
-rw-r--r--kernel/sched/core.c1
-rw-r--r--kernel/sched/cpufreq_schedutil.c119
-rw-r--r--kernel/sched/idle.c175
-rw-r--r--mm/kasan/kasan.c9
79 files changed, 4399 insertions, 1549 deletions
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
index 50b368d490b5..f523e5a3ac33 100644
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -7,30 +7,35 @@ Description:
subsystem.
What: /sys/power/state
-Date: May 2014
+Date: November 2016
Contact: Rafael J. Wysocki <rjw@rjwysocki.net>
Description:
The /sys/power/state file controls system sleep states.
Reading from this file returns the available sleep state
- labels, which may be "mem", "standby", "freeze" and "disk"
- (hibernation). The meanings of the first three labels depend on
- the relative_sleep_states command line argument as follows:
- 1) relative_sleep_states = 1
- "mem", "standby", "freeze" represent non-hibernation sleep
- states from the deepest ("mem", always present) to the
- shallowest ("freeze"). "standby" and "freeze" may or may
- not be present depending on the capabilities of the
- platform. "freeze" can only be present if "standby" is
- present.
- 2) relative_sleep_states = 0 (default)
- "mem" - "suspend-to-RAM", present if supported.
- "standby" - "power-on suspend", present if supported.
- "freeze" - "suspend-to-idle", always present.
-
- Writing to this file one of these strings causes the system to
- transition into the corresponding state, if available. See
- Documentation/power/states.txt for a description of what
- "suspend-to-RAM", "power-on suspend" and "suspend-to-idle" mean.
+ labels, which may be "mem" (suspend), "standby" (power-on
+ suspend), "freeze" (suspend-to-idle) and "disk" (hibernation).
+
+ Writing one of the above strings to this file causes the system
+ to transition into the corresponding state, if available.
+
+ See Documentation/power/states.txt for more information.
+
+What: /sys/power/mem_sleep
+Date: November 2016
+Contact: Rafael J. Wysocki <rjw@rjwysocki.net>
+Description:
+ The /sys/power/mem_sleep file controls the operating mode of
+ system suspend. Reading from it returns the available modes
+ as "s2idle" (always present), "shallow" and "deep" (present if
+ supported). The mode that will be used on subsequent attempts
+ to suspend the system (by writing "mem" to the /sys/power/state
+ file described above) is enclosed in square brackets.
+
+ Writing one of the above strings to this file causes the mode
+ represented by it to be used on subsequent attempts to suspend
+ the system.
+
+ See Documentation/power/states.txt for more information.
What: /sys/power/disk
Date: September 2006
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 62d68b2056de..be2d6d0a03a4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1560,6 +1560,12 @@
disable
Do not enable intel_pstate as the default
scaling driver for the supported processors
+ passive
+ Use intel_pstate as a scaling driver, but configure it
+ to work with generic cpufreq governors (instead of
+ enabling its internal governor). This mode cannot be
+ used along with the hardware-managed P-states (HWP)
+ feature.
force
Enable intel_pstate on systems that prohibit it by default
in favor of acpi-cpufreq. Forcing the intel_pstate driver
@@ -1580,6 +1586,9 @@
Description Table, specifies preferred power management
profile as "Enterprise Server" or "Performance Server",
then this feature is turned on by default.
+ per_cpu_perf_limits
+ Allow per-logical-CPU P-State performance control limits using
+ cpufreq sysfs interface
intremap= [X86-64, Intel-IOMMU]
on enable Interrupt Remapping (default)
@@ -2122,6 +2131,12 @@
memory contents and reserves bad memory
regions that are detected.
+ mem_sleep_default= [SUSPEND] Default system suspend mode:
+ s2idle - Suspend-To-Idle
+ shallow - Power-On Suspend or equivalent (if supported)
+ deep - Suspend-To-RAM or equivalent (if supported)
+ See Documentation/power/states.txt.
+
meye.*= [HW] Set MotionEye Camera parameters
See Documentation/video4linux/meye.txt.
@@ -3475,13 +3490,6 @@
[KNL, SMP] Set scheduler's default relax_domain_level.
See Documentation/cgroup-v1/cpusets.txt.
- relative_sleep_states=
- [SUSPEND] Use sleep state labeling where the deepest
- state available other than hibernation is always "mem".
- Format: { "0" | "1" }
- 0 -- Traditional sleep state labels.
- 1 -- Relative sleep state labels.
-
reserve= [KNL,BUGS] Force the kernel to ignore some iomem area
reservetop= [X86-32]
diff --git a/Documentation/cpu-freq/cpufreq-stats.txt b/Documentation/cpu-freq/cpufreq-stats.txt
index 8d9773f23550..3c355f6ad834 100644
--- a/Documentation/cpu-freq/cpufreq-stats.txt
+++ b/Documentation/cpu-freq/cpufreq-stats.txt
@@ -44,11 +44,17 @@ the stats driver insertion.
total 0
drwxr-xr-x 2 root root 0 May 14 16:06 .
drwxr-xr-x 3 root root 0 May 14 15:58 ..
+--w------- 1 root root 4096 May 14 16:06 reset
-r--r--r-- 1 root root 4096 May 14 16:06 time_in_state
-r--r--r-- 1 root root 4096 May 14 16:06 total_trans
-r--r--r-- 1 root root 4096 May 14 16:06 trans_table
--------------------------------------------------------------------------------
+- reset
+Write-only attribute that can be used to reset the stat counters. This can be
+useful for evaluating system behaviour under different governors without the
+need for a reboot.
+
- time_in_state
This gives the amount of time spent in each of the frequencies supported by
this CPU. The cat output will have "<frequency> <time>" pair in each line, which
diff --git a/Documentation/cpu-freq/intel-pstate.txt b/Documentation/cpu-freq/intel-pstate.txt
index e6bd1e6512a5..1953994ef5e6 100644
--- a/Documentation/cpu-freq/intel-pstate.txt
+++ b/Documentation/cpu-freq/intel-pstate.txt
@@ -48,7 +48,7 @@ In addition to the frequency-controlling interfaces provided by the cpufreq
core, the driver provides its own sysfs files to control the P-State selection.
These files have been added to /sys/devices/system/cpu/intel_pstate/.
Any changes made to these files are applicable to all CPUs (even in a
-multi-package system).
+multi-package system, Refer to later section on placing "Per-CPU limits").
max_perf_pct: Limits the maximum P-State that will be requested by
the driver. It states it as a percentage of the available performance. The
@@ -120,13 +120,57 @@ frequency is fictional for Intel Core processors. Even if the scaling
driver selects a single P-State, the actual frequency the processor
will run at is selected by the processor itself.
+Per-CPU limits
+
+The kernel command line option "intel_pstate=per_cpu_perf_limits" forces
+the intel_pstate driver to use per-CPU performance limits. When it is set,
+the sysfs control interface described above is subject to limitations.
+- The following controls are not available for both read and write
+ /sys/devices/system/cpu/intel_pstate/max_perf_pct
+ /sys/devices/system/cpu/intel_pstate/min_perf_pct
+- The following controls can be used to set performance limits, as far as the
+architecture of the processor permits:
+ /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
+ /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
+ /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+- User can still observe turbo percent and number of P-States from
+ /sys/devices/system/cpu/intel_pstate/turbo_pct
+ /sys/devices/system/cpu/intel_pstate/num_pstates
+- User can read write system wide turbo status
+ /sys/devices/system/cpu/no_turbo
+
+Support of energy performance hints
+It is possible to provide hints to the HWP algorithms in the processor
+to be more performance centric to more energy centric. When the driver
+is using HWP, two additional cpufreq sysfs attributes are presented for
+each logical CPU.
+These attributes are:
+ - energy_performance_available_preferences
+ - energy_performance_preference
+
+To get list of supported hints:
+$ cat energy_performance_available_preferences
+ default performance balance_performance balance_power power
+
+The current preference can be read or changed via cpufreq sysfs
+attribute "energy_performance_preference". Reading from this attribute
+will display current effective setting. User can write any of the valid
+preference string to this attribute. User can always restore to power-on
+default by writing "default".
+
+Since threads can migrate to different CPUs, this is possible that the
+new CPU may have different energy performance preference than the previous
+one. To avoid such issues, either threads can be pinned to specific CPUs
+or set the same energy performance preference value to all CPUs.
+
Tuning Intel P-State driver
-When HWP mode is not used, debugfs files have also been added to allow the
-tuning of the internal governor algorithm. These files are located at
-/sys/kernel/debug/pstate_snb/. The algorithm uses a PID (Proportional
-Integral Derivative) controller. The PID tunable parameters are:
+When the performance can be tuned using PID (Proportional Integral
+Derivative) controller, debugfs files are provided for adjusting performance.
+They are presented under:
+/sys/kernel/debug/pstate_snb/
+The PID tunable parameters are:
deadband
d_gain_pct
i_gain_pct
diff --git a/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt b/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt
new file mode 100644
index 000000000000..af2385795d78
--- /dev/null
+++ b/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt
@@ -0,0 +1,78 @@
+Broadcom AVS mail box and interrupt register bindings
+=====================================================
+
+A total of three DT nodes are required. One node (brcm,avs-cpu-data-mem)
+references the mailbox register used to communicate with the AVS CPU[1]. The
+second node (brcm,avs-cpu-l2-intr) is required to trigger an interrupt on
+the AVS CPU. The interrupt tells the AVS CPU that it needs to process a
+command sent to it by a driver. Interrupting the AVS CPU is mandatory for
+commands to be processed.
+
+The interface also requires a reference to the AVS host interrupt controller,
+so a driver can react to interrupts generated by the AVS CPU whenever a command
+has been processed. See [2] for more information on the brcm,l2-intc node.
+
+[1] The AVS CPU is an independent co-processor that runs proprietary
+firmware. On some SoCs, this firmware supports DFS and DVFS in addition to
+Adaptive Voltage Scaling.
+
+[2] Documentation/devicetree/bindings/interrupt-controller/brcm,l2-intc.txt
+
+
+Node brcm,avs-cpu-data-mem
+--------------------------
+
+Required properties:
+- compatible: must include: brcm,avs-cpu-data-mem and
+ should include: one of brcm,bcm7271-avs-cpu-data-mem or
+ brcm,bcm7268-avs-cpu-data-mem
+- reg: Specifies base physical address and size of the registers.
+- interrupts: The interrupt that the AVS CPU will use to interrupt the host
+ when a command completed.
+- interrupt-parent: The interrupt controller the above interrupt is routed
+ through.
+- interrupt-names: The name of the interrupt used to interrupt the host.
+
+Optional properties:
+- None
+
+Node brcm,avs-cpu-l2-intr
+-------------------------
+
+Required properties:
+- compatible: must include: brcm,avs-cpu-l2-intr and
+ should include: one of brcm,bcm7271-avs-cpu-l2-intr or
+ brcm,bcm7268-avs-cpu-l2-intr
+- reg: Specifies base physical address and size of the registers.
+
+Optional properties:
+- None
+
+
+Example
+=======
+
+ avs_host_l2_intc: interrupt-controller@f04d1200 {
+ #interrupt-cells = <1>;
+ compatible = "brcm,l2-intc";
+ interrupt-parent = <&intc>;
+ reg = <0xf04d1200 0x48>;
+ interrupt-controller;
+ interrupts = <0x0 0x19 0x0>;
+ interrupt-names = "avs";
+ };
+
+ avs-cpu-data-mem@f04c4000 {
+ compatible = "brcm,bcm7271-avs-cpu-data-mem",
+ "brcm,avs-cpu-data-mem";
+ reg = <0xf04c4000 0x60>;
+ interrupts = <0x1a>;
+ interrupt-parent = <&avs_host_l2_intc>;
+ interrupt-names = "sw_intr";
+ };
+
+ avs-cpu-l2-intr@f04d1100 {
+ compatible = "brcm,bcm7271-avs-cpu-l2-intr",
+ "brcm,avs-cpu-l2-intr";
+ reg = <0xf04d1100 0x10>;
+ };
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index ee91cbdd95ee..9f5ca4457b5f 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -86,8 +86,14 @@ Optional properties:
Single entry is for target voltage and three entries are for <target min max>
voltages.
- Entries for multiple regulators must be present in the same order as
- regulators are specified in device's DT node.
+ Entries for multiple regulators shall be provided in the same field separated
+ by angular brackets <>. The OPP binding doesn't provide any provisions to
+ relate the values to their power supplies or the order in which the supplies
+ need to be configured and that is left for the implementation specific
+ binding.
+
+ Entries for all regulators shall be of the same size, i.e. either all use a
+ single value or triplets.
- opp-microvolt-<name>: Named opp-microvolt property. This is exactly similar to
the above opp-microvolt property, but allows multiple voltage ranges to be
@@ -104,10 +110,13 @@ Optional properties:
Should only be set if opp-microvolt is set for the OPP.
- Entries for multiple regulators must be present in the same order as
- regulators are specified in device's DT node. If this property isn't required
- for few regulators, then this should be marked as zero for them. If it isn't
- required for any regulator, then this property need not be present.
+ Entries for multiple regulators shall be provided in the same field separated
+ by angular brackets <>. If current values aren't required for a regulator,
+ then it shall be filled with 0. If current values aren't required for any of
+ the regulators, then this field is not required. The OPP binding doesn't
+ provide any provisions to relate the values to their power supplies or the
+ order in which the supplies need to be configured and that is left for the
+ implementation specific binding.
- opp-microamp-<name>: Named opp-microamp property. Similar to
opp-microvolt-<name> property, but for microamp instead.
@@ -386,10 +395,12 @@ Example 4: Handling multiple regulators
/ {
cpus {
cpu@0 {
- compatible = "arm,cortex-a7";
+ compatible = "vendor,cpu-type";
...
- cpu-supply = <&cpu_supply0>, <&cpu_supply1>, <&cpu_supply2>;
+ vcc0-supply = <&cpu_supply0>;
+ vcc1-supply = <&cpu_supply1>;
+ vcc2-supply = <&cpu_supply2>;
operating-points-v2 = <&cpu0_opp_table>;
};
};
diff --git a/Documentation/devicetree/bindings/power/domain-idle-state.txt b/Documentation/devicetree/bindings/power/domain-idle-state.txt
new file mode 100644
index 000000000000..eefc7ed22ca2
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/domain-idle-state.txt
@@ -0,0 +1,33 @@
+PM Domain Idle State Node:
+
+A domain idle state node represents the state parameters that will be used to
+select the state when there are no active components in the domain.
+
+The state node has the following parameters -
+
+- compatible:
+ Usage: Required
+ Value type: <string>
+ Definition: Must be "domain-idle-state".
+
+- entry-latency-us
+ Usage: Required
+ Value type: <prop-encoded-array>
+ Definition: u32 value representing worst case latency in
+ microseconds required to enter the idle state.
+ The exit-latency-us duration may be guaranteed
+ only after entry-latency-us has passed.
+
+- exit-latency-us
+ Usage: Required
+ Value type: <prop-encoded-array>
+ Definition: u32 value representing worst case latency
+ in microseconds required to exit the idle state.
+
+- min-residency-us
+ Usage: Required
+ Value type: <prop-encoded-array>
+ Definition: u32 value representing minimum residency duration
+ in microseconds after which the idle state will yield
+ power benefits after overcoming the overhead in entering
+i the idle state.
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt
index 025b5e7df61c..723e1ad937da 100644
--- a/Documentation/devicetree/bindings/power/power_domain.txt
+++ b/Documentation/devicetree/bindings/power/power_domain.txt
@@ -29,6 +29,15 @@ Optional properties:
specified by this binding. More details about power domain specifier are
available in the next section.
+- domain-idle-states : A phandle of an idle-state that shall be soaked into a
+ generic domain power state. The idle state definitions are
+ compatible with domain-idle-state specified in [1].
+ The domain-idle-state property reflects the idle state of this PM domain and
+ not the idle states of the devices or sub-domains in the PM domain. Devices
+ and sub-domains have their own idle-states independent of the parent
+ domain's idle states. In the absence of this property, the domain would be
+ considered as capable of being powered-on or powered-off.
+
Example:
power: power-controller@12340000 {
@@ -59,6 +68,38 @@ The nodes above define two power controllers: 'parent' and 'child'.
Domains created by the 'child' power controller are subdomains of '0' power
domain provided by the 'parent' power controller.
+Example 3:
+ parent: power-c