From 6ccd2ecd422644277b7d8b37222e3af3f43ea9ae Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 11 Dec 2014 10:20:59 -0800 Subject: rcu: Improve diagnostics for spurious RCU CPU stall warnings The current RCU CPU stall warning code will print "Stall ended before state dump start" any time that the stall-warning code is triggered on a CPU that has already reported a quiescent state for the current grace period and if all quiescent states have been reported for the current grace period. However, a true stall can result in these symptoms, for example, by preventing RCU's grace-period kthreads from ever running This commit therefore checks for this condition, reporting the end of the stall only if one of the grace-period counters has actually advanced. Otherwise, it reports the last time that the grace-period kthread made meaningful progress. (In normal situations, the grace-period kthread should make meaningful progress at least every jiffies_till_next_fqs jiffies.) Reported-by: Miroslav Benes Signed-off-by: Paul E. McKenney Tested-by: Miroslav Benes --- Documentation/RCU/stallwarn.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation/RCU/stallwarn.txt') diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index ed186a902d31..55f9707fe60a 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -187,6 +187,11 @@ o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the behavior, you might need to replace some of the cond_resched() calls with calls to cond_resched_rcu_qs(). +o Anything that prevents RCU's grace-period kthreads from running. + This can result in the "All QSes seen" console-log message. + This message will include information on when the kthread last + ran and how often it should be expected to run. + o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might happen to preempt a low-priority task in the middle of an RCU read-side critical section. This is especially damaging if -- cgit v1.2.3 From fb81a44b88e6173ed0f6e9d6a1afa5305fb63f6e Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 17 Dec 2014 08:35:02 -0800 Subject: rcu: Add GP-kthread-starvation checks to CPU stall warnings This commit adds a message that is printed if the relevant grace-period kthread has not been able to run for the two seconds preceding the stall warning. (The two seconds is double the maximum interval between successive bouts of quiescent-state forcing.) Signed-off-by: Paul E. McKenney --- Documentation/RCU/stallwarn.txt | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'Documentation/RCU/stallwarn.txt') diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 55f9707fe60a..53e7d2856db6 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -152,6 +152,15 @@ no non-lazy callbacks ("." is printed otherwise, as shown above) and "D" indicates that dyntick-idle processing is enabled ("." is printed otherwise, for example, if disabled via the "nohz=" kernel boot parameter). +If the relevant grace-period kthread has been unable to run prior to +the stall warning, the following additional line is printed: + + rcu_preempt kthread starved for 2023 jiffies! + +Starving the grace-period kthreads of CPU time can of course result in +RCU CPU stall warnings even when all CPUs and tasks have passed through +the required quiescent states. + Multiple Warnings From One Stall -- cgit v1.2.3