Operating System: Three Easy Pieces --- Locks (Note)

From the introduction to concurrency, we saw one of the fundamental problems in concurrent

programming: we would like to execute a series of instructions atomically, but due to the presence

of interrupts on a single processor (multiple threads executing on multiple processors concurrently),

we could't. In this chapter, we thus attack this problem directly, with the introduction of something

referred to as a lock. Programmers annotate source code with locks, putting them around critical

sections, and thus ensure that any such critical section executes as if it were a single atomic instruction.

Locks: The basic idea

As an example, assume our critical section looks like this, the canonical update of a shared variable:

balance = balance + 1;

of course, other critical sections are possible, such as adding an element to a linked list or other more

complex updates to shared structures, but we will just keep to this example for now. To use a lock, we

add some code around the critical section like this:

lock_t mutex;

.....

lock(&mutex);

balance = balance + 1;

unlock(&mutex);

A lock is just a variable, and thus to use one, you must declare a lock variable of some kind (such as mutex

above). This lock variable (or just "lock" for short) holds the state of the lock at any instant in time. It is

either available (or unlocked or free) and thus no thread holds the lock, or acquired (or locked or held), and

thus exactly one thread holds the lock and presumably is in a critical section. We could store other information

in the data type as well, such as which thread holds the lock, or a queue for ordering lock acquisition, but

information like that is hidden from the user of lock.

The semantics of the lock() and unlock() routines are simple. Calling the routine lock() tries to acquire the lock;

if no other thread holds the lock (i.e. it is free), the thread will acquire the lock and enter the critical section; this

thread is sometimes said to be the owner of the lock. If another thread then calls lock() on that same lock variable

(mutex in this example), it will not return while the lock is held by another thread; in this way, other threads are

prevented from entering the critical section while the first thread that holds the lock is in there.

Once the owner of the lock calls unlock(), the lock is now available (free) again. If no other threads are waiting for

the lock (i.e. no other thread has called lock() and is stuck therein), the state of the lock is simply changed to free.

If there are waiting threads (stuck in lock()), one of them will (eventually) notice (or be informed of) this change

of the lock's state, acquire the lock, and enter the critical section.

Locks provide some minimal amount of control over scheduling to programmers. In general, we view threads as

entities created by the programmer but scheduled by the OS, in any fashion that the OS chooses. Locks yield some

of that control back to the programmer; by putting a lock around a section of code, the programmer can guarantee

the no more than a single thread can ever be active within that code. Thus locks help transform the chaos that is

traditional OS scheduling into a more controlled activity.