-rw-r--r--  Documentation/ww-mutex-design.txt  |  344
-rw-r--r--  arch/ia64/include/asm/mutex.h      |   10
-rw-r--r--  arch/powerpc/include/asm/mutex.h   |   10
-rw-r--r--  arch/sh/include/asm/mutex-llsc.h   |    4
-rw-r--r--  arch/x86/include/asm/mutex_32.h    |   11
-rw-r--r--  arch/x86/include/asm/mutex_64.h    |   11
-rw-r--r--  include/asm-generic/mutex-dec.h    |   10
-rw-r--r--  include/asm-generic/mutex-null.h   |    2
-rw-r--r--  include/asm-generic/mutex-xchg.h   |   10
-rw-r--r--  include/linux/mutex-debug.h        |    1
-rw-r--r--  include/linux/mutex.h              |  363
-rw-r--r--  kernel/mutex.c                     |  384
-rw-r--r--  lib/Kconfig.debug                  |   13
-rw-r--r--  lib/debug_locks.c                  |    2
-rw-r--r--  lib/locking-selftest.c             |  720
15 files changed, 1803 insertions, 92 deletions
diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000000000000..8a112dc304c3
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,344 @@
+Wait/Wound Deadlock-Proof Mutex Design
+======================================
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-------------------------
+
+GPUs do operations that commonly involve many buffers. Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on. And with
+PRIME / dmabuf, they can even be shared across devices. So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready. If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in an execbuf/batch in
+the same order in all contexts. That is directly under the control of
+userspace, and a result of the sequence of GL calls that an application
+makes, which results in the potential for deadlock. The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that the TTM graphics subsystem came up with for dealing with
+this problem is quite simple. For each group of buffers (execbuf) that needs
+to be locked, the caller is assigned a unique reservation id/ticket from a
+global counter. In case of deadlock while locking all the buffers associated
+with an execbuf, the one with the lowest reservation ticket (i.e. the oldest
+task) wins, and the one with the higher reservation id (i.e. the younger
+task) unlocks all of the buffers that it has already locked, and then tries
+again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older task waits until it can acquire the contended lock. The younger
+task needs to back off and drop all the locks it is currently holding, i.e.
+the younger task is wounded.
+
+Concepts
+--------
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important that a
+task trying to acquire locks doesn't grab a new reservation id, but keeps the
+one it acquired when starting the lock acquisition. This ticket is stored in
+the acquire context. Furthermore the acquire context keeps track of debugging
+state to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit
+for w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different classes of w/w lock acquire functions:
+
+* Normal lock acquisition with a context, using ww_mutex_lock.
+
+* Slowpath lock acquisition on the contending lock, used by the wounded task
+  after having dropped all already acquired locks. These functions have the
+  _slow postfix.
+
+  From a simple semantics point-of-view the _slow functions are not strictly
+  required, since simply calling the normal ww_mutex_lock functions on the
+  contending lock (after having dropped all other already acquired locks)
+  will work correctly. After all, if no other ww mutex has been acquired yet
+  there's no deadlock potential and hence the ww_mutex_lock call will block
+  and not prematurely return -EDEADLK. The advantage of the _slow functions
+  is in interface safety:
+  - ww_mutex_lock has a __must_check int return type, whereas
+    ww_mutex_lock_slow has a void return type. Note that since ww mutex code
+    needs loops/retries anyway the __must_check doesn't result in spurious
+    warnings, even though the very first lock operation can never fail.
+  - When full debugging is enabled ww_mutex_lock_slow checks that all
+    acquired ww mutexes have been released (preventing deadlocks) and makes
+    sure that we block on the contending lock (preventing spinning through
+    the -EDEADLK slowpath until the contended lock can be acquired).
+
+* Functions to only acquire a single w/w mutex, which results in the exact
+  same semantics as a normal mutex. This is done by calling ww_mutex_lock
+  with a NULL context.
+
+  Again this is not strictly required. But often you only want to acquire a
+  single lock, in which case it's pointless to set up an acquire context (and
+  so better to avoid grabbing a deadlock avoidance ticket).
+
+Of course, all the usual variants for handling wake-ups due to signals are
+also provided.
+
+Usage
+-----
+
+Three different ways to acquire locks within the same w/w class. Common
+definitions for methods #1 and #2:
+
+static DEFINE_WW_CLASS(ww_class);
+
+struct obj {
+	struct ww_mutex lock;
+	/* obj data */
+};
+
+struct obj_entry {
+	struct list_head head;
+	struct obj *obj;
+};
+
+Method 1, using a list in execbuf->buffers that's not allowed to be reordered.
+This is useful if a list of required objects is already tracked somewhere.
+Furthermore the lock helper can propagate the -EALREADY return code back to
+the caller as a signal that an object is on the list twice. This is useful if
+the list is constructed from userspace input and the ABI requires userspace
+to not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl).
+
+int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	struct obj *res_obj = NULL;
+	struct obj_entry *contended_entry = NULL;
+	struct obj_entry *entry;
+	int ret;
+
+	ww_acquire_init(ctx, &ww_class);
+
+retry:
+	list_for_each_entry (entry, list, head) {
+		if (entry->obj == res_obj) {
+			res_obj = NULL;
+			continue;
+		}
+		ret = ww_mutex_lock(&entry->obj->lock, ctx);
+		if (ret < 0) {
+			contended_entry = entry;
+			goto err;
+		}
+	}
+
+	ww_acquire_done(ctx);
+	return 0;
+
+err:
+	list_for_each_entry_continue_reverse (entry, list, head)
+		ww_mutex_unlock(&entry->obj->lock);
+
+	if (res_obj)
+		ww_mutex_unlock(&res_obj->lock);
+
+	if (ret == -EDEADLK) {
+		/* we lost out in a seqno race, lock and retry.. */
+		ww_mutex_lock_slow(&contended_entry->obj->lock, ctx);
+		res_obj = contended_entry->obj;
+		goto retry;
+	}
+	ww_acquire_fini(ctx);
+
+	return ret;
+}
+
+Method 2, using a list in execbuf->buffers that can be reordered. Same
+semantics of duplicate entry detection using -EALREADY as method 1 above. But
+the list-reordering allows for a bit more idiomatic code.
+
+int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	struct obj_entry *entry, *entry2;
+	int ret;
+
+	ww_acquire_init(ctx, &ww_class);
+
+	list_for_each_entry (entry, list, head) {
+		ret = ww_mutex_lock(&entry->obj->lock, ctx);
+		if (ret < 0) {
+			entry2 = entry;
+
+			list_for_each_entry_continue_reverse (entry2, list, head)
+				ww_mutex_unlock(&entry2->obj->lock);
+
+			if (ret != -EDEADLK) {
+				ww_acquire_fini(ctx);
+				return ret;
+			}
+
+			/* we lost out in a seqno race, lock and retry.. */
+			ww_mutex_lock_slow(&entry->obj->lock, ctx);
+
+			/*
+			 * Move entry to the head of the list. This will
+			 * point entry->next to the first unlocked entry,
+			 * restarting the for loop.
+			 */
+			list_del(&entry->head);
+			list_add(&entry->head, list);
+		}
+	}
+
+	ww_acquire_done(ctx);
+	return 0;
+}
+
+Unlocking works the same way for both methods #1 and #2:
+
+void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	struct obj_entry *entry;
+
+	list_for_each_entry (entry, list, head)
+		ww_mutex_unlock(&entry->obj->lock);
+
+	ww_acquire_fini(ctx);
+}
+
+Method 3 is useful if the list of objects is constructed ad-hoc and not
+upfront, e.g. when adjusting edges in a graph where each node has its own
+ww_mutex lock, and edges can only be changed when holding the locks of all
+involved nodes. w/w mutexes are a natural fit for such a case for two
+reasons:
+- They can handle lock-acquisition in any order, which allows us to start
+  walking a graph from a starting point and then iteratively discover new
+  edges and lock down the nodes those edges connect to.
+- Due to the -EALREADY return code signalling that a given object is already
+  held there's no need for additional book-keeping to break cycles in the
+  graph or keep track of which locks are already held (when using more than
+  one node as a starting point).
+
+Note that this approach differs in two important ways from the above methods:
+- Since the list of objects is dynamically constructed (and might very well
+  be different when retrying due to hitting the -EDEADLK wound condition)
+  there's no need to keep any object on a persistent list when it's not
+  locked. We can therefore move the list_head into the object itself.
+- On the other hand the dynamic object list construction also means that the
+  -EALREADY return code can't be propagated.
+
+Note also that methods #1 and #2 can be combined with method #3, e.g. to
+first lock a list of starting nodes (passed in from userspace) using one of
+the above methods, and to then lock any additional objects affected by the
+operations using method #3 below. The backoff/retry procedure will be a bit
+more involved, since when the dynamic locking step hits -EDEADLK we also need
+to unlock all the objects acquired with the fixed list. But the w/w mutex
+debug checks will catch any interface misuse for these cases.
+
+Also, method 3 can't fail the lock acquisition step since it doesn't return
+-EALREADY. Of course this would be different when using the _interruptible
+variants, but that's outside of the scope of these examples here.
+
+struct obj {
+	struct ww_mutex ww_mutex;
+	struct list_head locked_list;
+};
+
+static DEFINE_WW_CLASS(ww_class);
+
+void __unlock_objs(struct list_head *list)
+{
+	struct obj *entry, *temp;
+
+	list_for_each_entry_safe (entry, temp, list, locked_list) {
+		/* need to do that before unlocking, since only the current
+		 * lock holder is allowed to use the object */
+		list_del(&entry->locked_list);
+		ww_mutex_unlock(&entry->ww_mutex);
+	}
+}
+
+void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	struct obj *obj;
+	int ret;
+
+	ww_acquire_init(ctx, &ww_class);
+
+retry:
+	/* re-init loop start state */
+	loop {
+		/* magic code which walks over a graph and decides which
+		 * objects to lock */
+
+		ret = ww_mutex_lock(&obj->ww_mutex, ctx);
+		if (ret == -EALREADY) {
+			/* we have that one already, get to the next object */
+			continue;
+		}
+		if (ret == -EDEADLK) {
+			__unlock_objs(list);
+
+			ww_mutex_lock_slow(&obj->ww_mutex, ctx);
+			list_add(&obj->locked_list, list);
+			goto retry;
+		}
+
+		/* locked a new object, add it to the list */
+		list_add_tail(&obj->locked_list, list);
+	}
+
+	ww_acquire_done(ctx);
+}
+
+void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	__unlock_objs(list);
+	ww_acquire_fini(ctx);
+}
+
+Method 4: Only lock one single object. In that case deadlock detection and
+prevention is obviously overkill, since with grabbing just one lock you can't
+produce a deadlock within just one class. To simplify this case the w/w mutex
+api can be used with a NULL context (a minimal sketch follows below).
+
+Implementation Details
+----------------------
+
+Design:
+  ww_mutex currently encapsulates a struct mutex; this means no extra
+  overhead for normal mutex locks, which are far more common. As such there
+  is only a small increase in code size if wait/wound mutexes are not used.
+
+  In general, not much contention is expected. The locks are typically used
+  to serialize access to resources for devices. The only way to make wakeups
+  smarter would be at the cost of adding a field to struct mutex_waiter. This
+  would add overhead to all cases where normal mutexes are used, and
+  ww_mutexes are generally less performance sensitive.
+
+Lockdep:
+  Special care has been taken to warn for as many cases of API abuse
+  as possible. Some common API abuses will be caught with
+  CONFIG_DEBUG_MUTEXES, but CONFIG_PROVE_LOCKING is recommended.
+
+  Some of the errors which will be warned about:
+  - Forgetting to call ww_acquire_fini or ww_acquire_init.
+  - Attempting to lock more mutexes after ww_acquire_done.
+  - Attempting to lock the wrong mutex after -EDEADLK and
+    unlocking all mutexes.
+  - Attempting to lock the right mutex after -EDEADLK,
+    before unlocking all mutexes.
+
+  - Calling ww_mutex_lock_slow before -EDEADLK was returned.
+
+  - Unlocking mutexes with the wrong unlock function.
+  - Calling one of the ww_acquire_* functions twice on the same context.
+  - Using a different ww_class for the mutex than for the ww_acquire_ctx.
+  - Normal lockdep errors that can result in deadlocks.
+
+  Some of the lockdep errors that can result in deadlocks:
+  - Calling ww_acquire_init to initialize a second ww_acquire_ctx before
+    having called ww_acquire_fini on the first.
+  - 'normal' deadlocks that can occur.
+
+FIXME: Update this section once we have the TASK_DEADLOCK task state flag magic
+implemented.
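A minimal sketch of method 4, reusing struct obj from method 3 above (the
helper function frob_obj is hypothetical, not part of this patch): with a
NULL context, ww_mutex_lock cannot return -EDEADLK or -EALREADY, so no
acquire context and no retry loop are needed and plain mutex semantics apply.

/* hypothetical example: single-lock case with a NULL w/w context */
void frob_obj(struct obj *obj)
{
	ww_mutex_lock(&obj->ww_mutex, NULL);	/* cannot fail, plain mutex_lock semantics */

	/* use obj's data while holding the lock */

	ww_mutex_unlock(&obj->ww_mutex);
}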
diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a643a56..f41e66d65e31 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e18102..127ab23e1f6c 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(__mutex_dec_return_lock(count) < 0))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a7e1bb..dad29b687bd3 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 }
 
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	int __done, __res;
 
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 	: "t");
 
 	if (unlikely(!__done || __res != 0))
-		__res = fail_fn(count);
+		__res = -1;
 
 	return __res;
 }
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8a5a7c..0208c3c2cbc6 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {								\
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
-static inline int __mutex_fastpath_lock_retval(atomic_t *count,
-					       int (*fail_fn)(atomic_t *))
+static inline int __mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(atomic_dec_return(count) < 0))
-		return fail_fn(count);
+		return -1;
 	else
 		return 0;
 }
diff --git a/arch/x86/include/asm/mutex_64.h b/arch/x86/include/asm/mutex_64.h
index 68a87b0f8e29..2c543fff241b 100644
--- a/arch/x86/include/asm/mutex_64.h
+++ b/arch/x86/include/asm/mutex_64.h
@@ -37,17 +37,14 @@ do {								\
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
-static inline int __mutex_fastpath_lock_retval(atomic_t *count,
-					       int (*fail_fn)(atomic_t *))
+static inline int __mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(atomic_dec_return(count) < 0))
-		return fail_fn(count);
+		return -1;
 	else
 		return 0;
 }
diff --git a/include/asm-generic/mutex-dec.h b/include/asm-generic/mutex-dec.h
index f104af7cf437..d4f9fb4e53df 100644
--- a/include/asm-generic/mutex-dec.h
+++ b/include/asm-generic/mutex-dec.h
@@ -28,17 +28,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(atomic_dec_return(count) < 0))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }
diff --git a/include/asm-generic/mutex-null.h b/include/asm-generic/mutex-null.h
index e1bbbc72b6a2..61069ed334e2 100644
--- a/include/asm-generic/mutex-null.h
+++ b/include/asm-generic/mutex-null.h
@@ -11,7 +11,7 @@
 #define _ASM_GENERIC_MUTEX_NULL_H
 
 #define __mutex_fastpath_lock(count, fail_fn)		fail_fn(count)
-#define __mutex_fastpath_lock_retval(count, fail_fn)	fail_fn(count)
+#define __mutex_fastpath_lock_retval(count)		(-1)
 #define __mutex_fastpath_unlock(count, fail_fn)	fail_fn(count)
 #define __mutex_fastpath_trylock(count, fail_fn)	fail_fn(count)
 #define __mutex_slowpath_needs_to_unlock()		1
diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index c04e0db8a2d6..f169ec064785 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -39,18 +39,16 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(atomic_xchg(count, 0) != 1))
 		if (likely(atomic_xchg(count, -1) != 1))
-			return fail_fn(count);
+			return -1;
 	return 0;
 }
diff --git a/include/linux/mutex-debug.h b/include/linux/mutex-debug.h
index 731d77d6e155..4ac8b1977b73 100644
--- a/include/linux/mutex-debug.h
+++ b/include/linux/mutex-debug.h
@@ -3,6 +3,7 @@
 
 #include <linux/linkage.h>
 #include <linux/lockdep.h>
+#include <linux/debug_locks.h>
 
 /*
  * Mutexes - debugging helpers:
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 433da8a1a426..3793ed7feeeb 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -10,6 +10,7 @@
 #ifndef __LINUX_MUTEX_H
 #define __LINUX_MUTEX_H
 
+#include <asm/current.h>
 #include <linux/list.h>
 #include <linux/spinlock_types.h>
 #include <linux/linkage.h>
@@ -77,6 +78,40 @@ struct mutex_waiter {
 #endif
 };
 
+struct ww_class {
+	atomic_long_t stamp;
+	struct lock_class_key acquire_key;
+	struct lock_class_key mutex_key;
+	const char *acquire_name;
+	const char *mutex_name;
+};
+
+struct ww_acquire_ctx {
+	struct task_struct *task;
+	unsigned long stamp;
+	unsigned acquired;
+#ifdef CONFIG_DEBUG_MUTEXES
+	unsigned done_acquire;
+	struct ww_class *ww_class;
+	struct ww_mutex *contending_lock;
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map dep_map;
+#endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	unsigned deadlock_inject_interval;
+	unsigned deadlock_inject_countdown;
+#endif
+};
+
+struct ww_mutex {
+	struct mutex base;
+	struct ww_acquire_ctx *ctx;
+#ifdef CONFIG_DEBUG_MUTEXES
+	struct ww_class *ww_class;
+#endif
+};
+
 #ifdef CONFIG_DEBUG_MUTEXES
 # include <linux/mutex-debug.h>
 #else
@@ -101,8 +136,11 @@ static inline void mutex_destroy(struct mutex *lock) {}
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
 		, .dep_map = { .name = #lockname }
+# define __WW_CLASS_MUTEX_INITIALIZER(lockname, ww_class) \
+		, .ww_class = &ww_class
 #else
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
+# define __WW_CLASS_MUTEX_INITIALIZER(lockname, ww_class)
 #endif
 
 #define __MUTEX_INITIALIZER(lockname) \
@@ -112,13 +150,49 @@ static inline void mutex_destroy(struct mutex *lock) {}
 		__DEBUG_MUTEX_INITIALIZER(lockname) \
 		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
 
+#define __WW_CLASS_INITIALIZER(ww_class) \
+		{ .stamp = ATOMIC_LONG_INIT(0) \
+		, .acquire_name = #ww_class "_acquire" \
+		, .mutex_name = #ww_class "_mutex" }
+
+#define __WW_MUTEX_INITIALIZER(lockname, class) \
+		{ .base = { __MUTEX_INITIALIZER(lockname) } \
+		__WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
+
 #define DEFINE_MUTEX(mutexname) \
 	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
 
+#define DEFINE_WW_CLASS(classname) \
+	struct ww_class classname = __WW_CLASS_INITIALIZER(classname)
+
+#define DEFINE_WW_MUTEX(mutexname, ww_class) \
+	struct ww_mutex mutexname = __WW_MUTEX_INITIALIZER(mutexname, ww_class)
+
+
 extern void __mutex_init(struct mutex *lock, const char *name,
 			 struct lock_class_key *key);
 
 /**
+ * ww_mutex_init - initialize the w/w mutex
+ * @lock: the mutex to be initialized
+ * @ww_class: the w/w class the mutex should belong to
+ *
+ * Initialize the w/w mutex to unlocked state and associate it with the given
+ * class.
+ *
+ * It is not allowed to initialize an already locked mutex.
+ */
+static inline void ww_mutex_init(struct ww_mutex *lock,
+				 struct ww_class *ww_class)
+{
+	__mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
+	lock->ctx = NULL;
+#ifdef CONFIG_DEBUG_MUTEXES
+	lock->ww_class = ww_class;
+#endif
+}
+
+/**
  * mutex_is_locked - is the mutex locked
  * @lock: the mutex to be queried
  *
@@ -136,6 +210,7 @@ static inline int mutex_is_locked(struct mutex *lock)
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
 extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
+
 extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
 					unsigned int subclass);
 extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
@@ -147,7 +222,7 @@ extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
 
 #define mutex_lock_nest_lock(lock, nest_lock)				\
 do {									\
-	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);		\
 	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);		\
 } while (0)
 
@@ -170,6 +245,292 @@ extern int __must_check mutex_lock_killable(struct mutex *lock);
  */
 extern int mutex_trylock(struct mutex *lock);
 extern void mutex_unlock(struct mutex *lock);
+
+/**
+ * ww_acquire_init - initialize a w/w acquire context
+ * @ctx: w/w acquire context to initialize
+ * @ww_class: w/w class of the context
+ *
+ * Initializes a context to acquire multiple mutexes of the given w/w class.
+ *
+ * Context-based w/w mutex acquiring can be done in any order whatsoever
+ * within a given lock class. Deadlocks will be detected and handled with the
+ * wait/wound logic.
+ *
+ * Mixing of context-based w/w mutex acquiring and single w/w mutex locking
+ * can result in undetected deadlocks and is hence forbidden. Mixing different
+ * contexts for the same w/w class when acquiring mutexes can also result in
+ * undetected deadlocks, and is hence also forbidden. Both types of abuse will
+ * be caught by enabling CONFIG_PROVE_LOCKING.
+ *
+ * Nesting of acquire contexts for _different_ w/w classes is possible,
+ * subject to the usual locking rules between different lock classes.
+ *
+ * An acquire context must be released with ww_acquire_fini by the same task
+ * before the memory is freed. It is recommended to allocate the context
+ * itself on the stack.
+ */
+static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
+				   struct ww_class *ww_class)
+{
+	ctx->task = current;
+	ctx->stamp = atomic_long_inc_return(&ww_class->stamp);
+	ctx->acquired = 0;
+#ifdef CONFIG_DEBUG_MUTEXES
+	ctx->ww_class = ww_class;
+	ctx->done_acquire = 0;
+	ctx->contending_lock = NULL;
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	debug_check_no_locks_freed((void *)ctx, sizeof(*ctx));
+	lockdep_init_map(&ctx->dep_map, ww_class->acquire_name,
+			 &ww_class->acquire_key, 0);
+	mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
+#endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	ctx->deadlock_inject_interval = 1;
+	ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
+}
+
+/**
+ * ww_acquire_done - marks the end of the acquire phase
+ * @ctx: the acquire context
+ *
+ * Marks the end of the acquire phase, any further w/w mutex lock calls using
+ * this context are forbidden.
+ *
+ * Calling this function is optional, it is just useful to document w/w mutex
+ * code and clearly separates the acquire phase from actually using the locked
+ * data structures.
+ */
+static inline void ww_acquire_done(struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_MUTEXES
+	lockdep_assert_held(ctx);
+
+	DEBUG_LOCKS_WARN_ON(ctx->done_acquire);
+	ctx->done_acquire = 1;
+#endif
+}
+
+/**
+ * ww_acquire_fini - releases a w/w acquire context
+ * @ctx: the acquire context to free
+ *
+ * Releases a w/w acquire context. This must be called _after_ all acquired
+ * w/w mutexes have been released with ww_mutex_unlock.
+ */
+static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_MUTEXES
+	mutex_release(&ctx->dep_map, 0, _THIS_IP_);
+
+	DEBUG_LOCKS_WARN_ON(ctx->acquired);
+	if (!config_enabled(CONFIG_PROVE_LOCKING))
+		/*
+		 * lockdep will normally handle this,
+		 * but fail without anyway
+		 */
+		ctx->done_acquire = 1;
+
+	if (!config_enabled(CONFIG_DEBUG_LOCK_ALLOC))
+		/* ensure ww_acquire_fini will still fail if called twice */
+		ctx->acquired = ~0U;
+#endif
+}
+
+extern int __must_check __ww_mutex_lock(struct ww_mutex *lock,
+					struct ww_acquire_ctx *ctx);
+extern int __must_check __ww_mutex_lock_interruptible(struct ww_mutex *lock,
+						      struct ww_acquire_ctx *ctx);
+
+/**
+ * ww_mutex_lock - acquire the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context, or NULL to acquire only a single lock.
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with
+ * the wait/wound algorithm. If the lock isn't immediately available this
+ * function will either sleep until it is (wait case), or it will select the
+ * current context for backing off by returning -EDEADLK (wound case). Trying
+ * to acquire the same lock with the same context twice is also detected and
+ * signalled by returning -EALREADY. Returns 0 if the mutex was successfully
+ * acquired.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes
+ * for the given context and then wait for this contending lock to be
+ * available by calling ww_mutex_lock_slow. Alternatively callers can opt to
+ * not acquire this lock and proceed with trying to acquire further w/w
+ * mutexes (e.g. when scanning through lru lists trying to free resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex
+ * still locked. The mutex must first be initialized (or statically defined)
+ * before it can be locked. memset()-ing the mutex to 0 is not allowed. The
+ * mutex must be of the same w/w lock class as was used to initialize the
+ * acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+static inline int ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+	if (ctx)
+		return __ww_mutex_lock(lock, ctx);
+	else {
+		mutex_lock(&lock->base);
+		return 0;
+	}
+}
+
+/**
+ * ww_mutex_lock_interruptible - acquire the w/w mutex, interruptible
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with
+ * the wait/wound algorithm. If the lock isn't immediately available this
+ * function will either sleep until it is (wait case), or it will select the
+ * current context for backing off by returning -EDEADLK (wound case). Trying
+ * to acquire the same lock with the same context twice is also detected and
+ * signalled by returning -EALREADY. Returns 0 if the mutex was successfully
+ * acquired. If a signal arrives while waiting for the lock then this function
+ * returns -EINTR.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes
+ * for the given context and then wait for this contending lock to be
+ * available by calling ww_mutex_lock_slow_interruptible. Alternatively
+ * callers can opt to not acquire this lock and proceed with trying to acquire
+ * further w/w mutexes (e.g. when scanning through lru lists trying to free
+ * resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex
+ * still locked. The mutex must first be initialized (or statically defined)
+ * before it can be locked. memset()-ing the mutex to 0 is not allowed. The
+ * mutex must be of the same w/w lock class as was used to initialize the
+ * acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+static inline int __must_check ww_mutex_lock_interruptible(struct ww_mutex *lock,
+							   struct ww_acquire_ctx *ctx)
+{
+	if (ctx)
+		return __ww_mutex_lock_interruptible(lock, ctx);
+	else
+		return mutex_lock_interruptible(&lock->base);
+}
+
+/**
+ * ww_mutex_lock_slow - slowpath acquiring of the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Acquires a w/w mutex with the given context after a wound case. This
+ * function will sleep until the lock becomes available.
+ *
+ * The caller must have released all w/w mutexes already acquired with the
+ * context and then call this function on the contended lock.
+ *
+ * Afterwards the caller may continue to (re)acquire the other w/w mutexes it
+ * needs with ww_mutex_lock. Note that the -EALREADY return code from
+ * ww_mutex_lock can be used to avoid locking this contended mutex twice.
+ *
+ * It is forbidden to call this function with any other w/w mutexes associated
+ * with the context held. It is forbidden to call this on anything else than
+ * the contending mutex.
+ *
+ * Note that the slowpath