Operating Systems: Three Easy Pieces --- Locks (Note)

From the introduction to concurrency, we saw one of the fundamental problems in concurrent

programming: we would like to execute a series of instructions atomically, but due to the

presence of interrupts on a single processor or multiple threads executing on multiple processors

concurrently, we couldn't. In this chapter, we thus attack this problem directly, with the

introduction of something referred to as a lock. Programmers annotate source code with locks,

putting them around critical sections, and thus ensure that any such critical section executes as if

it were a single atomic instruction.

Locks: The Basic Idea

As an example, assume our critical section looks like this, the canonical update of a shared

variable:

balance = balance + 1;

Of course, other critical sections are possible, such as adding an element to a linked list or other

more complex updates to shared structures, but we will just keep to this simple example for now.

To use a lock, we add some code around the critical section like this:

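lock_t mutex;             // some globally-allocated lock 'mutex'
...
lock(&mutex);             // returns only once this thread holds the lock
balance = balance + 1;    // critical section
unlock(&mutex);           // the lock is free again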

A lock is just a variable, and thus to use one, you must declare a lock variable of some kind (such

as mutex above). This lock variable (or just "lock" for short) holds the state of the lock at any instant

in time. It is either available (unlocked, or free), so that no thread holds it, or acquired (locked, or held), so that exactly one thread holds the lock and

presumably is in a critical section. We could store other information in the data type as well, such

as which thread holds the lock, or a queue for ordering lock acquisition, but information like that

is hidden from the user of the lock.
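
To make the two states concrete, here is a minimal sketch that assumes a POSIX threads mutex (pthread_mutex_t) standing in for the generic lock_t; the text itself does not name a library. The helper lock_was_free() is a hypothetical name introduced only for illustration. It probes the lock with pthread_mutex_trylock(), which returns immediately instead of waiting.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if the lock was free at the moment of the probe,
 * 0 if some thread held it, and -1 on any other error.
 * The answer can be stale as soon as it is returned. */
int lock_was_free(pthread_mutex_t *m) {
    int rc = pthread_mutex_trylock(m);  /* never blocks */
    if (rc == 0) {
        /* We acquired the lock, so it was free; release it again
         * to leave the lock in the state we found it. */
        pthread_mutex_unlock(m);
        return 1;
    }
    if (rc == EBUSY)
        return 0;                       /* the lock is currently held */
    return -1;                          /* e.g. an invalid mutex */
}
```

Probing a free lock yields 1, and probing it while a thread holds it yields 0. Ordinary code rarely asks a lock about its state like this, since the state can change right after the probe; the sketch only illustrates that the lock variable records "free" or "held".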

The semantics of the lock() and unlock() routines are simple. Calling the routine lock() tries to

acquire the lock; if no other thread holds the lock (i.e., it is free), the thread will acquire the lock

and enter the critical section; this thread is sometimes said to be the owner of the lock. If another

thread then calls lock() on that same lock variable (mutex in this example), it will not return while

the lock is held by another thread; in this way, other threads are prevented from entering the

critical section while the first thread that holds the lock is in there.
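
The lock()/unlock() pattern can be tried out with POSIX threads, assuming pthread_mutex_lock() and pthread_mutex_unlock() play the roles of lock() and unlock(). The sketch below is one possible arrangement, not the only one; deposit(), run_deposits(), and MAX_THREADS are hypothetical names that do not appear in the text.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16  /* arbitrary cap chosen for this sketch */

static long balance = 0;                                   /* shared variable */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;  /* protects balance */

/* Thread body: add 1 to balance `iters` times, one lock acquisition per update. */
static void *deposit(void *arg) {
    long iters = *(const long *)arg;
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&mutex);    /* does not return while another thread holds the lock */
        balance = balance + 1;         /* critical section */
        pthread_mutex_unlock(&mutex);  /* lock becomes free again */
    }
    return NULL;
}

/* Run `nthreads` depositors to completion.
 * Returns the final balance, or -1 if a thread could not be created. */
long run_deposits(int nthreads, long iters) {
    pthread_t tids[MAX_THREADS];
    int created = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    balance = 0;  /* no other thread exists yet, so no lock is needed here */

    while (created < nthreads &&
           pthread_create(&tids[created], NULL, deposit, &iters) == 0)
        created++;

    /* Wait for every thread that was started before reading the result. */
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return created == nthreads ? balance : -1;
}
```

With the lock in place, run_deposits(4, 100000) should return 400000 on every run. If the lock and unlock calls are removed, increments from different threads can overwrite each other and the total may come up short.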

Once the owner of the lock calls unlock(), the lock is available (free) again. If no other

threads are waiting for the lock (i.e., no other thread has called lock() and is stuck therein), the

state of the lock is simply changed to free. If there are waiting threads (stuck in lock()), one of

them will eventually notice or be informed of this change of the lock's state, acquire the lock, and

enter the critical section.
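
The hand-off from an owner to a waiting thread can be observed with a small experiment. This is a sketch under the assumption that a POSIX threads mutex behaves like the lock described here; waiter(), demo_handoff(), and the flag entered are hypothetical names used only for this illustration.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int entered = 0;  /* set by the waiter inside its critical section */

/* The second thread: it cannot get past lock() while the owner holds the lock. */
static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);    /* stuck here until the owner calls unlock */
    entered = 1;                   /* critical section */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns 1 if the waiter stayed out of the critical section while we held
 * the lock and got in after we released it; -1 if no thread could be started. */
int demo_handoff(void) {
    pthread_t t;
    int before, after;

    entered = 0;
    pthread_mutex_lock(&mutex);            /* this thread is now the owner */
    if (pthread_create(&t, NULL, waiter, NULL) != 0) {
        pthread_mutex_unlock(&mutex);
        return -1;
    }

    /* The waiter may or may not have reached lock() yet; either way it
     * cannot pass it, so the flag must still be 0. We read it while
     * holding the lock, so the read does not race with the waiter's write. */
    before = entered;

    pthread_mutex_unlock(&mutex);          /* lock is free; the waiter can acquire it */
    pthread_join(t, NULL);                 /* wait until the waiter has finished */
    after = entered;

    return before == 0 && after == 1;
}
```

How a stuck thread learns that the lock is free (by repeatedly checking, or by being woken) is an implementation detail of the lock and is not visible in this code.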

Locks provide some minimal amount of control over scheduling to programmers. In general, we

view threads as entities created by the programmer but scheduled by the OS, in any fashion

that the OS chooses. Locks yield some of that control back to the programmer; by putting a lock

around a section of code, the programmer can guarantee that no more than a single thread can

ever be active within that code. Thus locks help transform the chaos that is traditional OS

scheduling into a more controlled activity.
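
The guarantee that no more than one thread is ever active inside the locked code can be checked by instrumenting the critical section. The sketch below assumes POSIX threads; worker(), max_threads_inside(), and the two counters are hypothetical names, and the check is an illustration rather than a proof.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16  /* arbitrary cap chosen for this sketch */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int inside = 0;      /* threads currently in the critical section */
static int max_inside = 0;  /* largest value of `inside` ever observed */

/* Each pass enters the critical section, records how many threads are
 * in it at that moment, and leaves. Both counters are touched only
 * while the lock is held. */
static void *worker(void *arg) {
    long iters = *(const long *)arg;
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&mutex);
        inside++;
        if (inside > max_inside)
            max_inside = inside;
        inside--;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

/* Returns the largest number of threads ever seen inside the critical
 * section at once, or -1 if a thread could not be created. */
int max_threads_inside(int nthreads, long iters) {
    pthread_t tids[MAX_THREADS];
    int created = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    inside = 0;
    max_inside = 0;

    while (created < nthreads &&
           pthread_create(&tids[created], NULL, worker, &iters) == 0)
        created++;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return created == nthreads ? max_inside : -1;
}
```

However the OS happens to interleave the threads, max_threads_inside(4, 50000) should return 1: the scheduler still decides which thread runs next, but the lock decides who may be inside the critical section.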