From 6b05dfacd761c6ace11def4b3b42fc6a7583fec3 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Tue, 21 Apr 2020 19:04:02 +0200 Subject: docs: RCU: Convert checklist.txt to ReST - Add a SPDX header; - Adjust document title; - Some whitespace fixes and new line breaks; - Use the right list markups; - Add it to RCU/index.rst. Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Paul E. McKenney --- Documentation/RCU/checklist.rst | 465 ++++++++++++++++++++++++++++++++++++++++ Documentation/RCU/checklist.txt | 458 --------------------------------------- Documentation/RCU/index.rst | 3 + 3 files changed, 468 insertions(+), 458 deletions(-) create mode 100644 Documentation/RCU/checklist.rst delete mode 100644 Documentation/RCU/checklist.txt (limited to 'Documentation/RCU') diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst new file mode 100644 index 000000000000..2efed9926c3f --- /dev/null +++ b/Documentation/RCU/checklist.rst @@ -0,0 +1,465 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================================ +Review Checklist for RCU Patches +================================ + + +This document contains a checklist for producing and reviewing patches +that make use of RCU. Violating any of the rules listed below will +result in the same sorts of problems that leaving out a locking primitive +would cause. This list is based on experiences reviewing such patches +over a rather long period of time, but improvements are always welcome! + +0. Is RCU being applied to a read-mostly situation? If the data + structure is updated more than about 10% of the time, then you + should strongly consider some other approach, unless detailed + performance measurements show that RCU is nonetheless the right + tool for the job. Yes, RCU does reduce read-side overhead by + increasing write-side overhead, which is exactly why normal uses + of RCU will do much more reading than updating. + + Another exception is where performance is not an issue, and RCU + provides a simpler implementation. An example of this situation + is the dynamic NMI code in the Linux 2.6 kernel, at least on + architectures where NMIs are rare. + + Yet another exception is where the low real-time latency of RCU's + read-side primitives is critically important. + + One final exception is where RCU readers are used to prevent + the ABA problem (https://en.wikipedia.org/wiki/ABA_problem) + for lockless updates. This does result in the mildly + counter-intuitive situation where rcu_read_lock() and + rcu_read_unlock() are used to protect updates, however, this + approach provides the same potential simplifications that garbage + collectors do. + +1. Does the update code have proper mutual exclusion? + + RCU does allow -readers- to run (almost) naked, but -writers- must + still use some sort of mutual exclusion, such as: + + a. locking, + b. atomic operations, or + c. restricting updates to a single task. + + If you choose #b, be prepared to describe how you have handled + memory barriers on weakly ordered machines (pretty much all of + them -- even x86 allows later loads to be reordered to precede + earlier stores), and be prepared to explain why this added + complexity is worthwhile. If you choose #c, be prepared to + explain how this single task does not become a major bottleneck on + big multiprocessor machines (for example, if the task is updating + information relating to itself that other tasks can read, there + by definition can be no bottleneck). 
Note that the definition + of "large" has changed significantly: Eight CPUs was "large" + in the year 2000, but a hundred CPUs was unremarkable in 2017. + +2. Do the RCU read-side critical sections make proper use of + rcu_read_lock() and friends? These primitives are needed + to prevent grace periods from ending prematurely, which + could result in data being unceremoniously freed out from + under your read-side code, which can greatly increase the + actuarial risk of your kernel. + + As a rough rule of thumb, any dereference of an RCU-protected + pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), + rcu_read_lock_sched(), or by the appropriate update-side lock. + Disabling of preemption can serve as rcu_read_lock_sched(), but + is less readable and prevents lockdep from detecting locking issues. + + Letting RCU-protected pointers "leak" out of an RCU read-side + critical section is every bit as bad as letting them leak out + from under a lock. Unless, of course, you have arranged some + other means of protection, such as a lock or a reference count + -before- letting them out of the RCU read-side critical section. + +3. Does the update code tolerate concurrent accesses? + + The whole point of RCU is to permit readers to run without + any locks or atomic operations. This means that readers will + be running while updates are in progress. There are a number + of ways to handle this concurrency, depending on the situation: + + a. Use the RCU variants of the list and hlist update + primitives to add, remove, and replace elements on + an RCU-protected list. Alternatively, use the other + RCU-protected data structures that have been added to + the Linux kernel. + + This is almost always the best approach. + + b. Proceed as in (a) above, but also maintain per-element + locks (that are acquired by both readers and writers) + that guard per-element state. Of course, fields that + the readers refrain from accessing can be guarded by + some other lock acquired only by updaters, if desired. + + This works quite well, also. + + c. Make updates appear atomic to readers. For example, + pointer updates to properly aligned fields will + appear atomic, as will individual atomic primitives. + Sequences of operations performed under a lock will -not- + appear to be atomic to RCU readers, nor will sequences + of multiple atomic primitives. + + This can work, but is starting to get a bit tricky. + + d. Carefully order the updates and the reads so that + readers see valid data at all phases of the update. + This is often more difficult than it sounds, especially + given modern CPUs' tendency to reorder memory references. + One must usually liberally sprinkle memory barriers + (smp_wmb(), smp_rmb(), smp_mb()) through the code, + making it difficult to understand and to test. + + It is usually better to group the changing data into + a separate structure, so that the change may be made + to appear atomic by updating a pointer to reference + a new structure containing updated values. + +4. Weakly ordered CPUs pose special challenges. Almost all CPUs + are weakly ordered -- even x86 CPUs allow later loads to be + reordered to precede earlier stores. RCU code must take all of + the following measures to prevent memory-corruption problems: + + a. Readers must maintain proper ordering of their memory + accesses. The rcu_dereference() primitive ensures that + the CPU picks up the pointer before it picks up the data + that the pointer points to. This really is necessary + on Alpha CPUs.
If you don't believe me, see: + + http://www.openvms.compaq.com/wizard/wiz_2637.html + + The rcu_dereference() primitive is also an excellent + documentation aid, letting the person reading the + code know exactly which pointers are protected by RCU. + Please note that compilers can also reorder code, and + they are becoming increasingly aggressive about doing + just that. The rcu_dereference() primitive therefore also + prevents destructive compiler optimizations. However, + with a bit of devious creativity, it is possible to + mishandle the return value from rcu_dereference(). + Please see rcu_dereference.txt in this directory for + more information. + + The rcu_dereference() primitive is used by the + various "_rcu()" list-traversal primitives, such + as the list_for_each_entry_rcu(). Note that it is + perfectly legal (if redundant) for update-side code to + use rcu_dereference() and the "_rcu()" list-traversal + primitives. This is particularly useful in code that + is common to readers and updaters. However, lockdep + will complain if you access rcu_dereference() outside + of an RCU read-side critical section. See lockdep.txt + to learn what to do about this. + + Of course, neither rcu_dereference() nor the "_rcu()" + list-traversal primitives can substitute for a good + concurrency design coordinating among multiple updaters. + + b. If the list macros are being used, the list_add_tail_rcu() + and list_add_rcu() primitives must be used in order + to prevent weakly ordered machines from misordering + structure initialization and pointer planting. + Similarly, if the hlist macros are being used, the + hlist_add_head_rcu() primitive is required. + + c. If the list macros are being used, the list_del_rcu() + primitive must be used to keep list_del()'s pointer + poisoning from inflicting toxic effects on concurrent + readers. Similarly, if the hlist macros are being used, + the hlist_del_rcu() primitive is required. + + The list_replace_rcu() and hlist_replace_rcu() primitives + may be used to replace an old structure with a new one + in their respective types of RCU-protected lists. + + d. Rules similar to (4b) and (4c) apply to the "hlist_nulls" + type of RCU-protected linked lists. + + e. Updates must ensure that initialization of a given + structure happens before pointers to that structure are + publicized. Use the rcu_assign_pointer() primitive + when publicizing a pointer to a structure that can + be traversed by an RCU read-side critical section. + +5. If call_rcu() or call_srcu() is used, the callback function will + be called from softirq context. In particular, it cannot block. + +6. Since synchronize_rcu() can block, it cannot be called + from any sort of irq context. The same rule applies + for synchronize_srcu(), synchronize_rcu_expedited(), and + synchronize_srcu_expedited(). + + The expedited forms of these primitives have the same semantics + as the non-expedited forms, but expediting is both expensive and + (with the exception of synchronize_srcu_expedited()) unfriendly + to real-time workloads. Use of the expedited primitives should + be restricted to rare configuration-change operations that would + not normally be undertaken while a real-time workload is running. + However, real-time workloads can use rcupdate.rcu_normal kernel + boot parameter to completely disable expedited grace periods, + though this might have performance implications. 
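To make items 5 and 6 above concrete, here is a minimal, purely illustrative sketch; the struct foo, foo_list, and foo_lock names and layout are hypothetical rather than taken from any real subsystem. The first removal path defers the free with call_rcu() and therefore works where blocking is forbidden, while the second blocks in synchronize_rcu() and so is limited to process context::

    struct foo {
            struct list_head list;
            struct rcu_head rcu;
            /* ... payload ... */
    };

    static LIST_HEAD(foo_list);
    static DEFINE_SPINLOCK(foo_lock);

    /* Item 5: runs from softirq context, so it must not block. */
    static void foo_free_rcu(struct rcu_head *rhp)
    {
            kfree(container_of(rhp, struct foo, rcu));
    }

    /* Usable even where blocking is forbidden: defer the free. */
    static void foo_del_deferred(struct foo *fp)
    {
            spin_lock(&foo_lock);
            list_del_rcu(&fp->list);
            spin_unlock(&foo_lock);
            call_rcu(&fp->rcu, foo_free_rcu);
    }

    /* Item 6: synchronize_rcu() may block, so process context only. */
    static void foo_del_synchronous(struct foo *fp)
    {
            spin_lock(&foo_lock);
            list_del_rcu(&fp->list);
            spin_unlock(&foo_lock);
            synchronize_rcu();      /* wait for pre-existing readers */
            kfree(fp);
    }

Either way the element is unlinked before the grace-period machinery is invoked, so no new reader can find it.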
+ + In particular, if you find yourself invoking one of the expedited + primitives repeatedly in a loop, please do everyone a favor: + Restructure your code so that it batches the updates, allowing + a single non-expedited primitive to cover the entire batch. + This will very likely be faster than the loop containing the + expedited primitive, and will be much much easier on the rest + of the system, especially to real-time workloads running on + the rest of the system. + +7. As of v4.20, a given kernel implements only one RCU flavor, + which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y. + If the updater uses call_rcu() or synchronize_rcu(), + then the corresponding readers may use rcu_read_lock() and + rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(), + or any pair of primitives that disables and re-enables preemption, + for example, rcu_read_lock_sched() and rcu_read_unlock_sched(). + If the updater uses synchronize_srcu() or call_srcu(), + then the corresponding readers must use srcu_read_lock() and + srcu_read_unlock(), and with the same srcu_struct. The rules for + the expedited primitives are the same as for their non-expedited + counterparts. Mixing things up will result in confusion and + broken kernels, and has even resulted in an exploitable security + issue. + + One exception to this rule: rcu_read_lock() and rcu_read_unlock() + may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() + in cases where local bottom halves are already known to be + disabled, for example, in irq or softirq context. Commenting + such cases is a must, of course! And the jury is still out on + whether the increased speed is worth it. + +8. Although synchronize_rcu() is slower than is call_rcu(), it + usually results in simpler code. So, unless update performance is + critically important, the updaters cannot block, or the latency of + synchronize_rcu() is visible from userspace, synchronize_rcu() + should be used in preference to call_rcu(). Furthermore, + kfree_rcu() usually results in even simpler code than does + synchronize_rcu() without synchronize_rcu()'s multi-millisecond + latency. So please take advantage of kfree_rcu()'s "fire and + forget" memory-freeing capabilities where it applies. + + An especially important property of the synchronize_rcu() + primitive is that it automatically self-limits: if grace periods + are delayed for whatever reason, then the synchronize_rcu() + primitive will correspondingly delay updates. In contrast, + code using call_rcu() should explicitly limit update rate in + cases where grace periods are delayed, as failing to do so can + result in excessive realtime latencies or even OOM conditions. + + Ways of gaining this self-limiting property when using call_rcu() + include: + + a. Keeping a count of the number of data-structure elements + used by the RCU-protected data structure, including + those waiting for a grace period to elapse. Enforce a + limit on this number, stalling updates as needed to allow + previously deferred frees to complete. Alternatively, + limit only the number awaiting deferred free rather than + the total number of elements. + + One way to stall the updates is to acquire the update-side + mutex. (Don't try this with a spinlock -- other CPUs + spinning on the lock could prevent the grace period + from ever ending.)
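A minimal sketch of this counting-and-stalling approach might look as follows; the foo_pending counter, the FOO_PENDING_MAX limit, and the struct foo layout (a list_head plus an rcu_head, as in the earlier sketch) are hypothetical, chosen only to illustrate the shape of the code::

    #define FOO_PENDING_MAX 1000            /* arbitrary illustrative limit */

    static atomic_t foo_pending;            /* elements awaiting deferred free */
    static DEFINE_MUTEX(foo_update_mutex);  /* update-side mutex */

    static void foo_reclaim(struct rcu_head *rhp)
    {
            kfree(container_of(rhp, struct foo, rcu));
            atomic_dec(&foo_pending);
    }

    static void foo_remove(struct foo *fp)
    {
            mutex_lock(&foo_update_mutex);
            list_del_rcu(&fp->list);
            /*
             * If too many elements are already waiting, stall this and
             * all subsequent updates (which serialize on the mutex) for
             * one grace period so that the backlog can drain.
             */
            if (atomic_inc_return(&foo_pending) > FOO_PENDING_MAX)
                    synchronize_rcu();
            call_rcu(&fp->rcu, foo_reclaim);
            mutex_unlock(&foo_update_mutex);
    }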
Another way to stall the updates + is for the updates to use a wrapper function around + the memory allocator, so that this wrapper function + simulates OOM when there is too much memory awaiting an + RCU grace period. There are of course many other + variations on this theme. + + b. Limiting update rate. For example, if updates occur only + once per hour, then no explicit rate limiting is + required, unless your system is already badly broken. + Older versions of the dcache subsystem take this approach, + guarding updates with a global lock, limiting their rate. + + c. Trusted update -- if updates can only be done manually by + superuser or some other trusted user, then it might not + be necessary to automatically limit them. The theory + here is that superuser already has lots of ways to crash + the machine. + + d. Periodically invoke synchronize_rcu(), permitting a limited + number of updates per grace period. + + The same cautions apply to call_srcu() and kfree_rcu(). + + Note that although these primitives do take action to avoid memory + exhaustion when any given CPU has too many callbacks, a determined + user could still exhaust memory. This is especially the case + if a system with a large number of CPUs has been configured to + offload all of its RCU callbacks onto a single CPU, or if the + system has relatively little free memory. + +9. All RCU list-traversal primitives, which include + rcu_dereference(), list_for_each_entry_rcu(), and + list_for_each_safe_rcu(), must be either within an RCU read-side + critical section or must be protected by appropriate update-side + locks. RCU read-side critical sections are delimited by + rcu_read_lock() and rcu_read_unlock(), or by similar primitives + such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which + case the matching rcu_dereference() primitive must be used in + order to keep lockdep happy, in this case, rcu_dereference_bh(). + + The reason that it is permissible to use RCU list-traversal + primitives when the update-side lock is held is that doing so + can be quite helpful in reducing code bloat when common code is + shared between readers and updaters. Additional primitives + are provided for this case, as discussed in lockdep.txt. + +10. Conversely, if you are in an RCU read-side critical section, + and you don't hold the appropriate update-side lock, you -must- + use the "_rcu()" variants of the list macros. Failing to do so + will break Alpha, cause aggressive compilers to generate bad code, + and confuse people trying to read your code. + +11. Any lock acquired by an RCU callback must be acquired elsewhere + with softirq disabled, e.g., via spin_lock_irqsave(), + spin_lock_bh(), etc. Failing to disable softirq on a given + acquisition of that lock will result in deadlock as soon as + the RCU softirq handler happens to run your RCU callback while + interrupting that acquisition's critical section. + +12. RCU callbacks can be and are executed in parallel. In many cases, + the callback code simply wrappers around kfree(), so that this + is not an issue (or, more accurately, to the extent that it is + an issue, the memory-allocator locking handles it). However, + if the callbacks do manipulate a shared data structure, they + must use whatever locking or other synchronization is required + to safely access and/or modify that data structure. + + Do not assume that RCU callbacks will be executed on the same + CPU that executed the corresponding call_rcu() or call_srcu(). 
+ For example, if a given CPU goes offline while having an RCU + callback pending, then that RCU callback will execute on some + surviving CPU. (If this was not the case, a self-spawning RCU + callback would prevent the victim CPU from ever going offline.) + Furthermore, CPUs designated by rcu_nocbs= might well -always- + have their RCU callbacks executed on some other CPUs, in fact, + for some real-time workloads, this is the whole point of using + the rcu_nocbs= kernel boot parameter. + +13. Unlike other forms of RCU, it -is- permissible to block in an + SRCU read-side critical section (demarked by srcu_read_lock() + and srcu_read_unlock()), hence the "SRCU": "sleepable RCU". + Please note that if you don't need to sleep in read-side critical + sections, you should be using RCU rather than SRCU, because RCU + is almost always faster and easier to use than is SRCU. + + Also unlike other forms of RCU, explicit initialization and + cleanup is required either at build time via DEFINE_SRCU() + or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct() + and cleanup_srcu_struct(). These last two are passed a + "struct srcu_struct" that defines the scope of a given + SRCU domain. Once initialized, the srcu_struct is passed + to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(), + synchronize_srcu_expedited(), and call_srcu(). A given + synchronize_srcu() waits only for SRCU read-side critical + sections governed by srcu_read_lock() and srcu_read_unlock() + calls that have been passed the same srcu_struct. This property + is what makes sleeping read-side critical sections tolerable -- + a given subsystem delays only its own updates, not those of other + subsystems using SRCU. Therefore, SRCU is less prone to OOM the + system than RCU would be if RCU's read-side critical sections + were permitted to sleep. + + The ability to sleep in read-side critical sections does not + come for free. First, corresponding srcu_read_lock() and + srcu_read_unlock() calls must be passed the same srcu_struct. + Second, grace-period-detection overhead is amortized only + over those updates sharing a given srcu_struct, rather than + being globally amortized as they are for other forms of RCU. + Therefore, SRCU should be used in preference to rw_semaphore + only in extremely read-intensive situations, or in situations + requiring SRCU's read-side deadlock immunity or low read-side + realtime latency. You should also consider percpu_rw_semaphore + when you need lightweight readers. + + SRCU's expedited primitive (synchronize_srcu_expedited()) + never sends IPIs to other CPUs, so it is easier on + real-time workloads than is synchronize_rcu_expedited(). + + Note that rcu_assign_pointer() relates to SRCU just as it does to + other forms of RCU, but instead of rcu_dereference() you should + use srcu_dereference() in order to avoid lockdep splats. + +14. The whole point of call_rcu(), synchronize_rcu(), and friends + is to wait until all pre-existing readers have finished before + carrying out some otherwise-destructive operation. It is + therefore critically important to -first- remove any path + that readers can follow that could be affected by the + destructive operation, and -only- -then- invoke call_rcu(), + synchronize_rcu(), or friends. + + Because these primitives only wait for pre-existing readers, it + is the caller's responsibility to guarantee that any subsequent + readers will execute safely. + +15. The various RCU read-side primitives do -not- necessarily contain + memory barriers. 
You should therefore plan for the CPU + and the compiler to freely reorder code into and out of RCU + read-side critical sections. It is the responsibility of the + RCU update-side primitives to deal with this. + + For SRCU readers, you can use smp_mb__after_srcu_read_unlock() + immediately after an srcu_read_unlock() to get a full barrier. + +16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the + __rcu sparse checks to validate your RCU code. These can help + find problems as follows: + + CONFIG_PROVE_LOCKING: + check that accesses to RCU-protected data + structures are carried out under the proper RCU + read-side critical section, while holding the right + combination of locks, or whatever other conditions + are appropriate. + + CONFIG_DEBUG_OBJECTS_RCU_HEAD: + check that you don't pass the + same object to call_rcu() (or friends) before an RCU + grace period has elapsed since the last time that you + passed that same object to call_rcu() (or friends). + + __rcu sparse checks: + tag the pointer to the RCU-protected data + structure with __rcu, and sparse will warn you if you + access that pointer without the services of one of the + variants of rcu_dereference(). + + These debugging aids can help you find problems that are + otherwise extremely difficult to spot. + +17. If you register a callback using call_rcu() or call_srcu(), and + pass in a function defined within a loadable module, then it is + necessary to wait for all pending callbacks to be invoked after + the last invocation and before unloading that module. Note that + it is absolutely -not- sufficient to wait for a grace period! + The current (say) synchronize_rcu() implementation is -not- + guaranteed to wait for callbacks registered on other CPUs. + Or even on the current CPU if that CPU recently went offline + and came back online. + + You instead need to use one of the barrier functions: + + - call_rcu() -> rcu_barrier() + - call_srcu() -> srcu_barrier() + + However, these barrier functions are absolutely -not- guaranteed + to wait for a grace period. In fact, if there are no call_rcu() + callbacks waiting anywhere in the system, rcu_barrier() is within + its rights to return immediately. + + So if you need to wait for both an RCU grace period and for + all pre-existing call_rcu() callbacks, you will need to execute + both rcu_barrier() and synchronize_rcu(), if necessary, using + something like workqueues to execute them concurrently. + + See rcubarrier.txt for more information. diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt deleted file mode 100644 index e98ff261a438..000000000000 --- a/Documentation/RCU/checklist.txt +++ /dev/null @@ -1,458 +0,0 @@ -Review Checklist for RCU Patches - - -This document contains a checklist for producing and reviewing patches -that make use of RCU. Violating any of the rules listed below will -result in the same sorts of problems that leaving out a locking primitive -would cause. This list is based on experiences reviewing such patches -over a rather long period of time, but improvements are always welcome! - -0. Is RCU being applied to a read-mostly situation? If the data - structure is updated more than about 10% of the time, then you - should strongly consider some other approach, unless detailed - performance measurements show that RCU is nonetheless the right - tool for the job.
Yes, RCU does reduce read-side overhead by - increasing write-side overhead, which is exactly why normal uses - of RCU will do much more reading than updating. - - Another exception is where performance is not an issue, and RCU - provides a simpler implementation. An example of this situation - is the dynamic NMI code in the Linux 2.6 kernel, at least on - architectures where NMIs are rare. - - Yet another exception is where the low real-time latency of RCU's - read-side primitives is critically important. - - One final exception is where RCU readers are used to prevent - the ABA problem (https://en.wikipedia.org/wiki/ABA_problem) - for lockless updates. This does result in the mildly - counter-intuitive situation where rcu_read_lock() and - rcu_read_unlock() are used to protect updates, however, this - approach provides the same potential simplifications that garbage - collectors do. - -1. Does the update code have proper mutual exclusion? - - RCU does allow -readers- to run (almost) naked, but -writers- must - still use some sort of mutual exclusion, such as: - - a. locking, - b. atomic operations, or - c. restricting updates to a single task. - - If you choose #b, be prepared to describe how you have handled - memory barriers on weakly ordered machines (pretty much all of - them -- even x86 allows later loads to be reordered to precede - earlier stores), and be prepared to explain why this added - complexity is worthwhile. If you choose #c, be prepared to - explain how this single task does not become a major bottleneck on - big multiprocessor machines (for example, if the task is updating - information relating to itself that other tasks can read, there - by definition can be no bottleneck). Note that the definition - of "large" has changed significantly: Eight CPUs was "large" - in the year 2000, but a hundred CPUs was unremarkable in 2017. - -2. Do the RCU read-side critical sections make proper use of - rcu_read_lock() and friends? These primitives are needed - to prevent grace periods from ending prematurely, which - could result in data being unceremoniously freed out from - under your read-side code, which can greatly increase the - actuarial risk of your kernel. - - As a rough rule of thumb, any dereference of an RCU-protected - pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), - rcu_read_lock_sched(), or by the appropriate update-side lock. - Disabling of preemption can serve as rcu_read_lock_sched(), but - is less readable and prevents lockdep from detecting locking issues. - - Letting RCU-protected pointers "leak" out of an RCU read-side - critical section is every bid as bad as letting them leak out - from under a lock. Unless, of course, you have arranged some - other means of protection, such as a lock or a reference count - -before- letting them out of the RCU read-side critical section. - -3. Does the update code tolerate concurrent accesses? - - The whole point of RCU is to permit readers to run without - any locks or atomic operations. This means that readers will - be running while updates are in progress. There are a number - of ways to handle this concurrency, depending on the situation: - - a. Use the RCU variants of the list and hlist update - primitives to add, remove, and replace elements on - an RCU-protected list. Alternatively, use the other - RCU-protected data structures that have been added to - the Linux kernel. - - This is almost always the best approach. - - b. 
Proceed as in (a) above, but also maintain per-element - locks (that are acquired by both readers and writers) - that guard per-element state. Of course, fields that - the readers refrain from accessing can be guarded by - some other lock acquired only by updaters, if desired. - - This works quite well, also. - - c. Make updates appear atomic to readers. For example, - pointer updates to properly aligned fields will - appear atomic, as will individual atomic primitives. - Sequences of operations performed under a lock will -not- - appear to be atomic to RCU readers, nor will sequences - of multiple atomic primitives. - - This can work, but is starting to get a bit tricky. - - d. Carefully order the updates and the reads so that - readers see valid data at all phases of the update. - This is often more difficult than it sounds, especially - given modern CPUs' tendency to reorder memory references. - One must usually liberally sprinkle memory barriers - (smp_wmb(), smp_rmb(), smp_mb()) through the code, - making it difficult to understand and to test. - - It is usually better to group the changing data into - a separate structure, so that the change may be made - to appear atomic by updating a pointer to reference - a new structure containing updated values. - -4. Weakly ordered CPUs pose special challenges. Almost all CPUs - are weakly ordered -- even x86 CPUs allow later loads to be - reordered to precede earlier stores. RCU code must take all of - the following measures to prevent memory-corruption problems: - - a. Readers must maintain proper ordering of their memory - accesses. The rcu_dereference() primitive ensures that - the CPU picks up the pointer before it picks up the data - that the pointer points to. This really is necessary - on Alpha CPUs. If you don't believe me, see: - - http://www.openvms.compaq.com/wizard/wiz_2637.html - - The rcu_dereference() primitive is also an excellent - documentation aid, letting the person reading the - code know exactly which pointers are protected by RCU. - Please note that compilers can also reorder code, and - they are becoming increasingly aggressive about doing - just that. The rcu_dereference() primitive therefore also - prevents destructive compiler optimizations. However, - with a bit of devious creativity, it is possible to - mishandle the return value from rcu_dereference(). - Please see rcu_dereference.txt in this directory for - more information. - - The rcu_dereference() primitive is used by the - various "_rcu()" list-traversal primitives, such - as the list_for_each_entry_rcu(). Note that it is - perfectly legal (if redundant) for update-side code to - use rcu_dereference() and the "_rcu()" list-traversal - primitives. This is particularly useful in code that - is common to readers and updaters. However, lockdep - will complain if you access rcu_dereference() outside - of an RCU read-side critical section. See lockdep.txt - to learn what to do about this. - - Of course, neither rcu_dereference() nor the "_rcu()" - list-traversal primitives can substitute for a good - concurrency design coordinating among multiple updaters. - - b. If the list macros are being used, the list_add_tail_rcu() - and list_add_rcu() primitives must be used in order - to prevent weakly ordered machines from misordering - structure initialization and pointer planting. - Similarly, if the hlist macros are being used, the - hlist_add_head_rcu() primitive is required. - - c. 
If the list macros are being used, the list_del_rcu() - primitive must be used to keep list_del()'s pointer - poisoning from inflicting toxic effects on concurrent - readers. Similarly, if the hlist macros are being used, - the hlist_del_rcu() primitive is required. - - The list_replace_rcu() and hlist_replace_rcu() primitives - may be used to replace an old structure with a new one - in their respective types of RCU-protected lists. - - d. Rules similar to (4b) and (4c) apply to the "hlist_nulls" - type of RCU-protected linked lists. - - e. Updates must ensure that initialization of a given - structure happens before pointers to that structure are - publicized. Use the rcu_assign_pointer() primitive - when publicizing a pointer to a structure that can - be traversed by an RCU read-side critical section. - -5. If call_rcu() or call_srcu() is used, the callback function will - be called from softirq context. In particular, it cannot block. - -6. Since synchronize_rcu() can block, it cannot be called - from any sort of irq context. The same rule applies - for synchronize_srcu(), synchronize_rcu_expedited(), and - synchronize_srcu_expedited(). - - The expedited forms of these primitives have the same semantics - as the non-expedited forms, but expediting is both expensive and - (with the exception of synchronize_srcu_expedited()) unfriendly - to real-time workloads. Use of the expedited primitives should - be restricted to rare configuration-change operations that would - not normally be undertaken while a real-time workload is running. - However, real-time workloads can use rcupdate.rcu_normal kernel - boot parameter to completely disable expedited grace periods, - though this might have performance implications. - - In particular, if you find yourself invoking one of the expedited - primitives repeatedly in a loop, please do everyone a favor: - Restructure your code so that it batches the updates, allowing - a single non-expedited primitive to cover the entire batch. - This will very likely be faster than the loop containing the - expedited primitive, and will be much much easier on the rest - of the system, especially to real-time workloads running on - the rest of the system. - -7. As of v4.20, a given kernel implements only one RCU flavor, - which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y. - If the updater uses call_rcu() or synchronize_rcu(), - then the corresponding readers my use rcu_read_lock() and - rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(), - or any pair of primitives that disables and re-enables preemption, - for example, rcu_read_lock_sched() and rcu_read_unlock_sched(). - If the updater uses synchronize_srcu() or call_srcu(), - then the corresponding readers must use srcu_read_lock() and - srcu_read_unlock(), and with the same srcu_struct. The rules for - the expedited primitives are the same as for their non-expedited - counterparts. Mixing things up will result in confusion and - broken kernels, and has even resulted in an exploitable security - issue. - - One exception to this rule: rcu_read_lock() and rcu_read_unlock() - may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() - in cases where local bottom halves are already known to be - disabled, for example, in irq or softirq context. Commenting - such cases is a must, of course! And the jury is still out on - whether the increased speed is worth it. - -8. Although synchronize_rcu() is slower than is call_rcu(), it - usually results in simpler code. 
So, unless update performance is - critically important, the updaters cannot block, or the latency of - synchronize_rcu() is visible from userspace, synchronize_rcu() - should be used in preference to call_rcu(). Furthermore, - kfree_rcu() usually results in even simpler code than does - synchronize_rcu() without synchronize_rcu()'s multi-millisecond - latency. So please take advantage of kfree_rcu()'s "fire and - forget" memory-freeing capabilities where it applies. - - An especially important property of the synchronize_rcu() - primitive is that it automatically self-limits: if grace periods - are delayed for whatever reason, then the synchronize_rcu() - primitive will correspondingly delay updates. In contrast, - code using call_rcu() should explicitly limit update rate in - cases where grace periods are delayed, as failing to do so can - result in excessive realtime latencies or even OOM conditions. - - Ways of gaining this self-limiting property when using call_rcu() - include: - - a. Keeping a count of the number of data-structure elements - used by the RCU-protected data structure, including - those waiting for a grace period to elapse. Enforce a - limit on this number, stalling updates as needed to allow - previously deferred frees to complete. Alternatively, - limit only the number awaiting deferred free rather than - the total number of elements. - - One way to stall the updates is to acquire the update-side - mutex. (Don't try this with a spinlock -- other CPUs - spinning on the lock could prevent the grace period - from ever ending.) Another way to stall the updates - is for the updates to use a wrapper function around - the memory allocator, so that this wrapper function - simulates OOM when there is too much memory awaiting an - RCU grace period. There are of course many other - variations on this theme. - - b. Limiting update rate. For example, if updates occur only - once per hour, then no explicit rate limiting is - required, unless your system is already badly broken. - Older versions of the dcache subsystem take this approach, - guarding updates with a global lock, limiting their rate. - - c. Trusted update -- if updates can only be done manually by - superuser or some other trusted user, then it might not - be necessary to automatically limit them. The theory - here is that superuser already has lots of ways to crash - the machine. - - d. Periodically invoke synchronize_rcu(), permitting a limited - number of updates per grace period. - - The same cautions apply to call_srcu() and kfree_rcu(). - - Note that although these primitives do take action to avoid memory - exhaustion when any given CPU has too many callbacks, a determined - user could still exhaust memory. This is especially the case - if a system with a large number of CPUs has been configured to - offload all of its RCU callbacks onto a single CPU, or if the - system has relatively little free memory. - -9. All RCU list-traversal primitives, which include - rcu_dereference(), list_for_each_entry_rcu(), and - list_for_each_safe_rcu(), must be either within an RCU read-side - critical section or must be protected by appropriate update-side - locks. RCU read-side critical sections are delimited by - rcu_read_lock() and rcu_read_unlock(), or by similar primitives - such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which - case the matching rcu_dereference() primitive must be used in - order to keep lockdep happy, in this case, rcu_dereference_bh(). 
- - The reason that it is permissible to use RCU list-traversal - primitives when the update-side lock is held is that doing so - can be quite helpful in reducing code bloat when common code is - shared between readers and updaters. Additional primitives - are provided for this case, as discussed in lockdep.txt. - -10. Conversely, if you are in an RCU read-side critical section, - and you don't hold the appropriate update-side lock, you -must- - use the "_rcu()" variants of the list macros. Failing to do so - will break Alpha, cause aggressive compilers to generate bad code, - and confuse people trying to read your code. - -11. Any lock acquired by an RCU callback must be acquired elsewhere - with softirq disabled, e.g., via spin_lock_irqsave(), - spin_lock_bh(), etc. Failing to disable softirq on a given - acquisition of that lock will result in deadlock as soon as - the RCU softirq handler happens to run your RCU callback while - interrupting that acquisition's critical section. - -12. RCU callbacks can be and are executed in parallel. In many cases, - the callback code simply wrappers around kfree(), so that this - is not an issue (or, more accurately, to the extent that it is - an issue, the memory-allocator locking handles it). However, - if the callbacks do manipulate a shared data structure, they - must use whatever locking or other synchronization is required - to safely access and/or modify that data structure. - - Do not assume that RCU callbacks will be executed on the same - CPU that executed the corresponding call_rcu() or call_srcu(). - For example, if a given CPU goes offline while having an RCU - callback pending, then that RCU callback will execute on some - surviving CPU. (If this was not the case, a self-spawning RCU - callback would prevent the victim CPU from ever going offline.) - Furthermore, CPUs designated by rcu_nocbs= might well -always- - have their RCU callbacks executed on some other CPUs, in fact, - for some real-time workloads, this is the whole point of using - the rcu_nocbs= kernel boot parameter. - -13. Unlike other forms of RCU, it -is- permissible to block in an - SRCU read-side critical section (demarked by srcu_read_lock() - and srcu_read_unlock()), hence the "SRCU": "sleepable RCU". - Please note that if you don't need to sleep in read-side critical - sections, you should be using RCU rather than SRCU, because RCU - is almost always faster and easier to use than is SRCU. - - Also unlike other forms of RCU, explicit initialization and - cleanup is required either at build time via DEFINE_SRCU() - or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct() - and cleanup_srcu_struct(). These last two are passed a - "struct srcu_struct" that defines the scope of a given - SRCU domain. Once initialized, the srcu_struct is passed - to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(), - synchronize_srcu_expedited(), and call_srcu(). A given - synchronize_srcu() waits only for SRCU read-side critical - sections governed by srcu_read_lock() and srcu_read_unlock() - calls that have been passed the same srcu_struct. This property - is what makes sleeping read-side critical sections tolerable -- - a given subsystem delays only its own updates, not those of other - subsystems using SRCU. Therefore, SRCU is less prone to OOM the - system than RCU would be if RCU's read-side critical sections - were permitted to sleep. - - The ability to sleep in read-side critical sections does not - come for free. 
First, corresponding srcu_read_lock() and - srcu_read_unlock() calls must be passed the same srcu_struct. - Second, grace-period-detection overhead is amortized only - over those updates sharing a given srcu_struct, rather than - being globally amortized as they are for other forms of RCU. - Therefore, SRCU should be used in preference to rw_semaphore - only in extremely read-intensive situations, or in situations - requiring SRCU's read-side deadlock immunity or low read-side - realtime latency. You should also consider percpu_rw_semaphore - when you need lightweight readers. - - SRCU's expedited primitive (synchronize_srcu_expedited()) - never sends IPIs to other CPUs, so it is easier on - real-time workloads than is synchronize_rcu_expedited(). - - Note that rcu_assign_pointer() relates to SRCU just as it does to - other forms of RCU, but instead of rcu_dereference() you should - use srcu_dereference() in order to avoid lockdep splats. - -14. The whole point of call_rcu(), synchronize_rcu(), and friends - is to wait until all pre-existing readers have finished before - carrying out some otherwise-destructive operation. It is - therefore critically important to -first- remove any path - that readers can follow that could be affected by the - destructive operation, and -only- -then- invoke call_rcu(), - synchronize_rcu(), or friends. - - Because these primitives only wait for pre-existing readers, it - is the caller's responsibility to guarantee that any subsequent - readers will execute safely. - -15. The various RCU read-side primitives do -not- necessarily contain - memory barriers. You should therefore plan for the CPU - and the compiler to freely reorder code into and out of RCU - read-side critical sections. It is the responsibility of the - RCU update-side primitives to deal with this. - - For SRCU readers, you can use smp_mb__after_srcu_read_unlock() - immediately after an srcu_read_unlock() to get a full barrier. - -16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the - __rcu sparse checks to validate your RCU code. These can help - find problems as follows: - - CONFIG_PROVE_LOCKING: check that accesses to RCU-protected data - structures are carried out under the proper RCU - read-side critical section, while holding the right - combination of locks, or whatever other conditions - are appropriate. - - CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the - same object to call_rcu() (or friends) before an RCU - grace period has elapsed since the last time that you - passed that same object to call_rcu() (or friends). - - __rcu sparse checks: tag the pointer to the RCU-protected data - structure with __rcu, and sparse will warn you if you - access that pointer without the services of one of the - variants of rcu_dereference(). - - These debugging aids can help you find problems that are - otherwise extremely difficult to spot. - -17. If you register a callback using call_rcu() or call_srcu(), and - pass in a function defined within a loadable module, then it in - necessary to wait for all pending callbacks to be invoked after - the last invocation and before unloading that module. Note that - it is absolutely -not- sufficient to wait for a grace period! - The current (say) synchronize_rcu() implementation is -not- - guaranteed to wait for callbacks registered on other CPUs. - Or even on the current CPU if that CPU recently went offline - and came back online. 
- - You instead need to use one of the barrier functions: - - o call_rcu() -> rcu_barrier() - o call_srcu() -> srcu_barrier() - - However, these barrier functions are absolutely -not- guaranteed - to wait for a grace period. In fact, if there are no call_rcu() - callbacks waiting anywhere in the system, rcu_barrier() is within - its rights to return immediately. - - So if you need to wait for both an RCU grace period and for - all pre-existing call_rcu() callbacks, you will need to execute - both rcu_barrier() and synchronize_rcu(), if necessary, using - something like workqueues to to execute them concurrently. - - See rcubarrier.txt for more information. diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst index 81a0a1e5f767..c1ba4d130bb0 100644 --- a/Documentation/RCU/index.rst +++ b/Documentation/RCU/index.rst @@ -1,3 +1,5 @@ +.. SPDX-License-Identifier: GPL-2.0 + .. _rcu_concepts: ============ @@ -8,6 +10,7 @@ RCU concepts :maxdepth: 3 arrayRCU + checklist rcubarrier rcu_dereference whatisRCU -- cgit v1.2.3 From a3b0a79f8903f955250505f99d1e37b6c7d7b060 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Tue, 21 Apr 2020 19:04:03 +0200 Subject: docs: RCU: Convert lockdep-splat.txt to ReST - Add a SPDX header; - Add a document title; - Some whitespace fixes and new line breaks; - Mark literal blocks as such; - Add it to RCU/index.rst. Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Paul E. McKenney --- Documentation/RCU/index.rst | 1 + Documentation/RCU/lockdep-splat.rst | 115 ++++++++++++++++++++++++++++++++++++ Documentation/RCU/lockdep-splat.txt | 110 ---------------------------------- 3 files changed, 116 insertions(+), 110 deletions(-) create mode 100644 Documentation/RCU/lockdep-splat.rst delete mode 100644 Documentation/RCU/lockdep-splat.txt (limited to 'Documentation/RCU') diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst index c1ba4d130bb0..430a37132b2c 100644 --- a/Documentation/RCU/index.rst +++ b/Documentation/RCU/index.rst @@ -11,6 +11,7 @@ RCU concepts arrayRCU checklist + lockdep-splat rcubarrier rcu_dereference whatisRCU diff --git a/Documentation/RCU/lockdep-splat.rst b/Documentation/RCU/lockdep-splat.rst new file mode 100644 index 000000000000..2a5c79db57dc --- /dev/null +++ b/Documentation/RCU/lockdep-splat.rst @@ -0,0 +1,115 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================= +Lockdep-RCU Splat +================= + +Lockdep-RCU was added to the Linux kernel in early 2010 +(http://lwn.net/Articles/371986/). This facility checks for some common +misuses of the RCU API, most notably using one of the rcu_dereference() +family to access an RCU-protected pointer without the proper protection. +When such misuse is detected, a lockdep-RCU splat is emitted. + +The usual cause of a lockdep-RCU splat is someone accessing an +RCU-protected data structure without either (1) being in the right kind of +RCU read-side critical section or (2) holding the right update-side lock. +This problem can therefore be serious: it might result in random memory +overwriting or worse. There can of course be false positives, this +being the real world and all that. + +So let's look at an example RCU lockdep splat from 3.0-rc5, one that +has long since been fixed:: + + ============================= + WARNING: suspicious RCU usage + ----------------------------- + block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage!
+ +other info that might help us debug this:: + + rcu_scheduler_active = 1, debug_locks = 0 + 3 locks held by scsi_scan_6/1552: + #0: (&shost->scan_mutex){+.+.}, at: [] + scsi_scan_host_selected+0x5a/0x150 + #1: (&eq->sysfs_lock){+.+.}, at: [] + elevator_exit+0x22/0x60 + #2: (&(&q->__queue_lock)->rlock){-.-.}, at: [] + cfq_exit_queue+0x43/0x190 + + stack backtrace: + Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 + Call Trace: + [] lockdep_rcu_dereference+0xbb/0xc0 + [] __cfq_exit_single_io_context+0xe9/0x120 + [] cfq_exit_queue+0x7c/0x190 + [] elevator_exit+0x36/0x60 + [] blk_cleanup_queue+0x4a/0x60 + [] scsi_free_queue+0x9/0x10 + [] __scsi_remove_device+0x84/0xd0 + [] scsi_probe_and_add_lun+0x353/0xb10 + [] ? error_exit+0x29/0xb0 + [] ? _raw_spin_unlock_irqrestore+0x3d/0x80 + [] __scsi_scan_target+0x112/0x680 + [] ? trace_hardirqs_off_thunk+0x3a/0x3c + [] ? error_exit+0x29/0xb0 + [] ? kobject_del+0x40/0x40 + [] scsi_scan_channel+0x86/0xb0 + [] scsi_scan_host_selected+0x140/0x150 + [] do_scsi_scan_host+0x89/0x90 + [] do_scan_async+0x20/0x160 + [] ? do_scsi_scan_host+0x90/0x90 + [] kthread+0xa6/0xb0 + [] kernel_thread_helper+0x4/0x10 + [] ? finish_task_switch+0x80/0x110 + [] ? retint_restore_args+0xe/0xe + [] ? __kthread_init_worker+0x70/0x70 + [] ? gs_change+0xb/0xb + +Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows:: + + if (rcu_dereference(ioc->ioc_data) == cic) { + +This form says that it must be in a plain vanilla RCU read-side critical +section, but the "other info" list above shows that this is not the +case. Instead, we hold three locks, one of which might be RCU related. +And maybe that lock really does protect this reference. If so, the fix +is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to +take the struct request_queue "q" from cfq_exit_queue() as an argument, +which would permit us to invoke rcu_dereference_protected as follows:: + + if (rcu_dereference_protected(ioc->ioc_data, + lockdep_is_held(&q->queue_lock)) == cic) { + +With this change, there would be no lockdep-RCU splat emitted if this +code was invoked either from within an RCU read-side critical section +or with the ->queue_lock held. In particular, this would have suppressed +the above lockdep-RCU splat because ->queue_lock is held (see #2 in the +list above). + +On the other hand, perhaps we really do need an RCU read-side critical +section. In this case, the critical section must span the use of the +return value from rcu_dereference(), or at least until there is some +reference count incremented or some such. One way to handle this is to +add rcu_read_lock() and rcu_read_unlock() as follows:: + + rcu_read_lock(); + if (rcu_dereference(ioc->ioc_data) == cic) { + spin_lock(&ioc->lock); + rcu_assign_pointer(ioc->ioc_data, NULL); + spin_unlock(&ioc->lock); + } + rcu_read_unlock(); + +With this change, the rcu_dereference() is always within an RCU +read-side critical section, which again would have suppressed the +above lockdep-RCU splat. + +But in this particular case, we don't actually dereference the pointer +returned from rcu_dereference(). Instead, that pointer is just compared +to the cic pointer, which means that the rcu_dereference() can be replaced +by rcu_access_pointer() as follows:: + + if (rcu_access_pointer(ioc->ioc_data) == cic) { + +Because it is legal to invoke rcu_access_pointer() without protection, +this change would also suppress the above lockdep-RCU splat. 
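More generally, the distinction can be pictured with a small hypothetical example; the struct foo and gp pointer below are invented for illustration and are not taken from the cfq code above. Use rcu_access_pointer() when only the pointer value itself is examined, and rcu_dereference() inside an RCU read-side critical section when the pointed-to data is actually accessed::

    struct foo {
            int a;
    };
    static struct foo __rcu *gp;

    /* Only the pointer value is tested, so no RCU protection is needed. */
    static bool foo_is_published(void)
    {
            return rcu_access_pointer(gp) != NULL;
    }

    /* The structure is dereferenced, so a read-side critical section is required. */
    static int foo_read_a(void)
    {
            struct foo *p;
            int a = -1;

            rcu_read_lock();
            p = rcu_dereference(gp);
            if (p)
                    a = p->a;
            rcu_read_unlock();
            return a;
    }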
diff --git a/Documentation/RCU/lockdep-splat.txt b/Documentation/RCU/lockdep-splat.txt deleted file mode 100644 index b8096316fd11..000000000000 --- a/Documentation/RCU/lockdep-splat.txt +++ /dev/null @@ -1,110 +0,0 @@ -Lockdep-RCU was added to the Linux kernel in early 2010 -(http://lwn.net/Articles/371986/). This facility checks for some common -misuses of the RCU API, most notably using one of the rcu_dereference() -family to access an RCU-protected pointer without the proper protection. -When such misuse is detected, an lockdep-RCU splat is emitted. - -The usual cause of a lockdep-RCU slat is someone accessing an -RCU-protected data structure without either (1) being in the right kind of -RCU read-side critical section or (2) holding the right update-side lock. -This problem can therefore be serious: it might result in random memory -overwriting or worse. There can of course be false positives, this -being the real world and all that. - -So let's look at an example RCU lockdep splat from 3.0-rc5, one that -has long since been fixed: - -============================= -WARNING: suspicious RCU usage ------------------------------ -block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage! - -other info that might help us debug this: - - -rcu_scheduler_active = 1, debug_locks = 0 -3 locks held by scsi_scan_6/1552: - #0: (&shost->scan_mutex){+.+.}, at: [] -scsi_scan_host_selected+0x5a/0x150 - #1: (&eq->sysfs_lock){+.+.}, at: [] -elevator_exit+0x22/0x60 - #2: (&(&q->__queue_lock)->rlock){-.-.}, at: [] -cfq_exit_queue+0x43/0x190 - -stack backtrace: -Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 -Call Trace: - [] lockdep_rcu_dereference+0xbb/0xc0 - [] __cfq_exit_single_io_context+0xe9/0x120 - [] cfq_exit_queue+0x7c/0x190 - [] elevator_exit+0x36/0x60 - [] blk_cleanup_queue+0x4a/0x60 - [] scsi_free_queue+0x9/0x10 - [] __scsi_remove_device+0x84/0xd0 - [] scsi_probe_and_add_lun+0x353/0xb10 - [] ? error_exit+0x29/0xb0 - [] ? _raw_spin_unlock_irqrestore+0x3d/0x80 - [] __scsi_scan_target+0x112/0x680 - [] ? trace_hardirqs_off_thunk+0x3a/0x3c - [] ? error_exit+0x29/0xb0 - [] ? kobject_del+0x40/0x40 - [] scsi_scan_channel+0x86/0xb0 - [] scsi_scan_host_selected+0x140/0x150 - [] do_scsi_scan_host+0x89/0x90 - [] do_scan_async+0x20/0x160 - [] ? do_scsi_scan_host+0x90/0x90 - [] kthread+0xa6/0xb0 - [] kernel_thread_helper+0x4/0x10 - [] ? finish_task_switch+0x80/0x110 - [] ? retint_restore_args+0xe/0xe - [] ? __kthread_init_worker+0x70/0x70 - [] ? gs_change+0xb/0xb - -Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows: - - if (rcu_dereference(ioc->ioc_data) == cic) { - -This form says that it must be in a plain vanilla RCU read-side critical -section, but the "other info" list above shows that this is not the -case. Instead, we hold three locks, one of which might be RCU related. -And maybe that lock really does protect this reference. If so, the fix -is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to -take the struct request_queue "q" from cfq_exit_queue() as an argument, -which would permit us to invoke rcu_dereference_protected as follows: - - if (rcu_dereference_protected(ioc->ioc_data, - lockdep_is_held(&q->queue_lock)) == cic) { - -With this change, there would be no lockdep-RCU splat emitted if this -code was invoked either from within an RCU read-side critical section -or with the ->queue_lock held. In particular, this would have suppressed -the above lockdep-RCU splat because ->queue_lock is held (see #2 in the -list above). 
- -On the other hand, perhaps we really do need an RCU read-side critical -section. In this case, the critical section must span the use of the -return value from rcu_dereference(), or at least until there is some -reference count incremented or some such. One way to handle this is to -add rcu_read_lock() and rcu_read_unlock() as follows: - - rcu_read_lock(); - if (rcu_dereference(ioc->ioc_data) == cic) { - spin_lock(&ioc->lock); - rcu_assign_pointer(ioc->ioc_data, NULL); - spin_unlock(&ioc->lock); - } - rcu_read_unlock(); - -With this change, the rcu_dereference() is always within an RCU -read-side critical section, which again would have suppressed the -above lockdep-RCU splat. - -But in this particular case, we don't actually dereference the pointer -returned from rcu_dereference(). Instead, that pointer is just compared -to the cic pointer, which means that the rcu_dereference() can be replaced -by rcu_access_pointer() as follows: - - if (rcu_access_pointer(ioc->ioc_data) == cic) { - -Because it is legal to invoke rcu_access_pointer() without protection, -this change would also suppress the above lockdep-RCU splat. -- cgit v1.2.3 From 058cc23bcad08aca62987cc795fe406ac39146d0 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Tue, 21 Apr 2020 19:04:04 +0200 Subject: docs: RCU: Convert lockdep.txt to ReST - Add a SPDX header; - Adjust document title; - Mark literal blocks as such; - Add it to RCU/index.rst. Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Paul E. McKenney --- Documentation/RCU/index.rst | 1 + Documentation/RCU/lockdep.rst | 116 ++++++++++++++++++++++++++++++++++++++++++ Documentation/RCU/lockdep.txt | 112 ---------------------------------------- 3 files changed, 117 insertions(+), 112 deletions(-) create mode 100644 Documentation/RCU/lockdep.rst delete mode 100644 Documentation/RCU/lockdep.txt (limited to 'Documentation/RCU') diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst index 430a37132b2c..fa7a2a8949b7 100644 --- a/Documentation/RCU/index.rst +++ b/Documentation/RCU/index.rst @@ -11,6 +11,7 @@ RCU concepts arrayRCU checklist + lockdep lockdep-splat rcubarrier rcu_dereference diff --git a/Documentation/RCU/lockdep.rst b/Documentation/RCU/lockdep.rst new file mode 100644 index 000000000000..f1fc8ae3846a --- /dev/null +++ b/Documentation/RCU/lockdep.rst @@ -0,0 +1,116 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================== +RCU and lockdep checking +======================== + +All flavors of RCU have lockdep checking available, so that lockdep is +aware of when each task enters and leaves any flavor of RCU read-side +critical section. Each flavor of RCU is tracked separately (but note +that this is not the case in 2.6.32 and earlier). This allows lockdep's +tracking to include RCU state, which can sometimes help when debugging +deadlocks and the like. + +In addition, RCU provides the following primitives that check lockdep's +state:: + + rcu_read_lock_held() for normal RCU. + rcu_read_lock_bh_held() for RCU-bh. + rcu_read_lock_sched_held() for RCU-sched. + srcu_read_lock_held() for SRCU. + +These functions are conservative, and will therefore return 1 if they +aren't certain (for example, if CONFIG_DEBUG_LOCK_ALLOC is not set). +This prevents things like WARN_ON(!rcu_read_lock_held()) from giving false +positives when lockdep is disabled. + +In addition, a separate kernel config parameter CONFIG_PROVE_RCU enables +checking of rcu_dereference() primitives: + + rcu_dereference(p): + Check for RCU read-side critical section. 
+ rcu_dereference_bh(p): + Check for RCU-bh read-side critical section. + rcu_dereference_sched(p): + Check for RCU-sched read-side critical section. + srcu_dereference(p, sp): + Check for SRCU read-side critical section. + rcu_dereference_check(p, c): + Use explicit check expression "c" along with + rcu_read_lock_held(). This is useful in code that is + invoked by both RCU readers and updaters. + rcu_dereference_bh_check(p, c): + Use explicit check expression "c" along with + rcu_read_lock_bh_held(). This is useful in code that + is invoked by both RCU-bh readers and updaters. + rcu_dereference_sched_check(p, c): + Use explicit check expression "c" along with + rcu_read_lock_sched_held(). This is useful in code that + is invoked by both RCU-sched readers and updaters. + srcu_dereference_check(p, c): + Use explicit check expression "c" along with + srcu_read_lock_held(). This is useful in code that + is invoked by both SRCU readers and updaters. + rcu_dereference_raw(p): + Don't check. (Use sparingly, if at all.) + rcu_dereference_protected(p, c): + Use explicit check expression "c", and omit all barriers + and compiler constraints. This is useful when the data + structure cannot change, for example, in code that is + invoked only by updaters. + rcu_access_pointer(p): + Return the value of the pointer and omit all barriers, + but retain the compiler constraints that prevent duplicating + or coalescing. This is useful when testing the + value of the pointer itself, for example, against NULL. + +The rcu_dereference_check() check expression can be any boolean +expression, but would normally include a lockdep expression. However, +any boolean expression can be used. For a moderately ornate example, +consider the following:: + + file = rcu_dereference_check(fdt->fd[fd], + lockdep_is_held(&files->file_lock) || + atomic_read(&files->count) == 1); + +This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner, +and, if CONFIG_PROVE_RCU is configured, verifies that this expression +is used in: + +1. An RCU read-side critical section (implicit), or +2. with files->file_lock held, or +3. on an unshared files_struct. + +In case (1), the pointer is picked up in an RCU-safe manner for vanilla +RCU read-side critical sections, in case (2) the ->file_lock prevents +any change from taking place, and finally, in case (3) the current task +is the only task accessing the file_struct, again preventing any change +from taking place. If the above statement was invoked only from updater +code, it could instead be written as follows:: + + file = rcu_dereference_protected(fdt->fd[fd], + lockdep_is_held(&files->file_lock) || + atomic_read(&files->count) == 1); + +This would verify cases #2 and #3 above, and furthermore lockdep would +complain if this was used in an RCU read-side critical section unless one +of these two cases held. Because rcu_dereference_protected() omits all +barriers and compiler constraints, it generates better code than do the +other flavors of rcu_dereference(). On the other hand, it is illegal +to use rcu_dereference_protected() if either the RCU-protected pointer +or the RCU-protected data that it points to can change concurrently. + +Like rcu_dereference(), when lockdep is enabled, RCU list and hlist +traversal primitives check for being called from within an RCU read-side +critical section. However, a lockdep expression can be passed to them +as an additional optional argument.
With this lockdep expression, these +traversal primitives will complain only if the lockdep expression is +false and they are called from outside any RCU read-side critical section. + +For example, the workqueue for_each_pwq() macro is intended to be used +either within an RCU read-side critical section or with wq->mutex held. +It is thus implemented as follows:: + + #define for_each_pwq(pwq, wq) + list_for_each_entry_rcu((pwq), &(wq)->pwqs, pwqs_node, + lock_is_held(&(wq->mutex).dep_map)) diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt deleted file mode 100644 index 89db949eeca0..000000000000 --- a/Documentation/RCU/lockdep.txt +++ /d