Operating System: Three Easy Pieces --- Paging: Small Tables (Note)

We now tackle the second problem that paging introduces: page tables are too big and thus consume

too much memory. Let us start out with a linear page table. As you might recall, linear page tables get

pretty big. Assume again a 32-bit address space, with 4KB pages and 4 bytes page table entry. An address

space that has roughly one million virtual pages in it. Multiply by the page table entry size and you see

that our page table is 4MB in size. Recall also: we usually have one page table for every process in the

system. With one hundred active process (not uncommon on a modern system), we will be allocating

hundreds of megabytes of memory just for page tables. As a result, we are insearch of some techniques

to reduce this heavy burden. There are a lot of them, so let us get going. But not before our crux: How

to make page tables smaller?

Simple array-based page tables (usually called linear page tables) are too big, taking up for too much

memory on typical systems. How can we make page tables smaller? What are the key issues? What

inefficiencies arise as a result of these new data structures?

1. Hybrid Approach: Paging and Segments

Whenever you have reasonable but different approaches to something in life. You should always examine

the combination of the two to see if you can obtain the best of both worlds. We call such a combination

a hybrid.

2. Multi-Level Page Tables

A different approach does not rely on segmentation, but attcks the same problem: how to get rid of all

these invalid regions in the page table instead of keeping them all in memory? We call this approach a

multi-level page table, as it turns the linear page table into something like a tree. This approach is so

effective that many modern operating systems employ it.

The basic idea behind a multi-level page table is simple. First, chop up the page table into page-sized units;

then, if an entire page of page-table entries (PTEs) is invalid, do not allocate that page of the page table

at all. To track whether a page of the page table is valid (and if valid, where it is in memory), use a new

structure, called the page directory. The page directory thus either can be used to tell you where a page

of the page table is, or that the entire page of page table contains no valid pages. The page directory, in

a simple two-level page table, contains one entry per page of the page table. It consists of a number of

page directory entries (PDE). A PDE minimally has a valid bit and a page frame number (PFN), similar to a

PTE. However, as hinted at above, the meaning of this valid bit is slightly different: If the PDE entry is valid,

it means that at least one of the pages of the page table that the entry points to (via the PFN) is valid. e.g.

in at least one PTE on that page pointed to by this PDE; the valid bit in that PTE is set to one. If the PDE entry

is not valid, the rest of the PDE is not defined.

Multi-level page tables have some obvious advantages over approaches we have seen thus far. First, and

perhaps most obviously, the multi-level page table only allocates page-table space in proportion to the amount

of address space you are using. thus it is generally compact and supports sparse address space. Second, if

carefully constructed, each portion of the page table fits neatly within a page, making it easier to manage memory;

the OS can simply grab the next free page when it needs to allocate or grow a page table. Contrast this to a

simple linear page table, whic is just an array of PTEs indexed by VPN; with such a structure, the entire linear

page table must reside contiguously on physical memory. For a large page table (say 4MB), finding such a large

chunk of unused contiguous free physical memory can be quite a challenge.

With a multi-level structure, we add a level of indirection, through use of the page directory, which points to pieces

of the page table. That indirection allows us to place page table pages wherever we would like in physical memory.

It shoud be noted that there is a cost to multi-level tables; on a TLB miss, two loads from memory will be required

to get the right translation information from the page table (one for the page directory, and one for the PTE itself),

in contrast to just one load with linear page table. Thus, the multi-level page table is a small example of a time-space

trade-off. We wanted smaller tables and get them, but not for free; although in the common case (TLB hit), performance

is obviously identical, a TLB miss suffers from a higher cost with this smaller table. Another obvious negative is

complexity, whether it is the hardware of OS handling the page-table lookup on a TLB miss, doing so is undoubtedly

more involved than a simpler linear page table lookup. Often we are willing to increase complexity in order to improve

performance or reduce overheads; in the case of multi-level table, we make page table lookups more complicated

in order to save valuable memory.