Virtual memory(5)

The execve Function Revisited

Suppose that the program running in the current process makes the following call:

Execve("a.out", NULL, NULL);

Loading and running a.out requires the following steps:

. Delete existing user areas. Delete the existing area structs in the user portion of the current process’s virtual address space.
. Map private areas. Create new area structs for the text, data, bss, and stack areas of the new program. All of these new areas are private copy-on-write. The text and data areas are mapped to the text and data sections of the a.out file. The bss area is demand-zero, mapped to an anonymous file whose size is contained in a.out. The stack and heap areas are also demand-zero, initially of zero length. Figure 9.31 summarizes the different mappings of the private areas.
. Map shared areas. If the a.out program was linked with shared objects, such as the standard C library libc.so, then these objects are dynamically linked into the program, and then mapped into the shared region of the user’s virtual address space.
. Set the program counter (PC). The last thing that execve does is to set the program counter in the current process’s context to point to the entry point in the text area.

The next time this process is scheduled, it will begin execution from the entry point. Linux will swap in code and data pages as needed.
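As a rough illustration of how a program triggers these steps, the sketch below forks a child and has the child call execve. The helper name run_and_wait is hypothetical and not part of the text above. It passes a real argv and an empty envp rather than NULL, because many programs expect argv[0] to exist. Since execve replaces the user areas of the calling process, the statement after a successful execve never runs.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, replace the child's image with the program at `path`, and
   return the child's exit status. Returns -1 if fork or waitpid fails or the
   child did not exit normally, and 127 if execve itself failed. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const envp[] = { NULL };   /* empty environment for the new program */
        execve(path, argv, envp);
        _exit(127);                      /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_and_wait with "/bin/sh" and the argument vector { "sh", "-c", "exit 7", NULL } should return 7 on a system that has /bin/sh.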

User-level Memory Mapping with the mmap Function

Unix processes can use the mmap function to create new areas of virtual memory and to map objects into these areas. Its arguments are, in order, start, length, prot, flags, fd, and offset.

The mmap function asks the kernel to create a new virtual memory area, preferably one that starts at address start, and to map a contiguous chunk of the object specified by file descriptor fd to the new area.

The contiguous object chunk has a size of length bytes and starts at an offset of offset bytes from the beginning of the file.

The start address is merely a hint, and is usually specified as NULL. For our purposes, we will always assume a NULL start address.

Figure 9.32 depicts the meaning of these arguments.

The prot argument contains bits that describe the access permissions of the newly mapped virtual memory area (i.e., the vm_prot bits in the corresponding area struct):

. PROT_EXEC: Pages in the area consist of instructions that may be executed by the CPU.
. PROT_READ: Pages in the area may be read.
. PROT_WRITE: Pages in the area may be written.
. PROT_NONE: Pages in the area cannot be accessed.

The flags argument consists of bits that describe the type of the mapped object.

If the MAP_ANON flag bit is set, then the backing store is an anonymous object and the corresponding virtual pages are demand-zero.

MAP_PRIVATE indicates a private copy-on-write object, and MAP_SHARED indicates a shared object.

For example,

bufp = Mmap(NULL, size, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0);

asks the kernel to create a new read-only, private, demand-zero area of virtual memory containing size bytes.

If the call is successful, then bufp contains the address of the new area.
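The same interface can also map a regular file. The sketch below is one possible use, with a hypothetical helper count_newlines: it opens a file, maps all of it read-only and private with a NULL start address and offset 0, and scans the bytes through the returned pointer instead of calling read.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count the newline characters in a file by mapping it into memory.
   Returns the count, or -1 on error. */
long count_newlines(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return -1;
    }
    if (sb.st_size == 0) {      /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    /* start = NULL (no hint), length = whole file, offset = 0 */
    char *p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (off_t i = 0; i < sb.st_size; i++)
        if (p[i] == '\n')
            count++;

    munmap(p, (size_t)sb.st_size);
    return count;
}
```

Note that this sketch calls mmap directly and checks for MAP_FAILED, rather than relying on an error-handling wrapper.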

The munmap function deletes regions of virtual memory. A call munmap(start, length) deletes the area starting at virtual address start and consisting of the next length bytes.

Subsequent references to the deleted region result in segmentation faults.
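A minimal sketch of the full life cycle of an anonymous area follows, assuming a POSIX system that provides MAP_ANON. The helper name anon_region_roundtrip is hypothetical. It passes -1 as the file descriptor, which some implementations require for anonymous mappings.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANON on glibc in strict ISO C mode */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map `size` bytes of private demand-zero memory, check that it reads as
   zero, write to it, and unmap it. Returns 0 on success, -1 on failure. */
int anon_region_roundtrip(size_t size)
{
    unsigned char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < size; i++) {   /* demand-zero: every byte starts as 0 */
        if (p[i] != 0) {
            munmap(p, size);
            return -1;
        }
    }

    p[0] = 0xAB;            /* writes are private to this process */
    p[size - 1] = 0xCD;

    /* Once munmap succeeds, p must not be dereferenced again. */
    return munmap(p, size);
}
```

The sketch deliberately avoids touching p after munmap, since that access is what would fault.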

Dynamic Memory Allocation

While it is certainly possible to use the low-level mmap and munmap functions to create and delete areas of virtual memory, C programmers typically find it more convenient and more portable to use a dynamic memory allocator when they need to acquire additional virtual memory at run time.

A dynamic memory allocator maintains an area of a process’s virtual memory known as the heap (Figure 9.33).

We will assume that the heap is an area of demand-zero memory.

For each process, the kernel maintains a variable brk (pronounced “break”) that points to the top of the heap.
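On systems that still provide the traditional sbrk call, a program can observe and move the break itself. The sketch below is only an illustration with a hypothetical helper grow_heap. Some C libraries reject nonzero increments, in which case it returns -1, and real programs should leave the break to the allocator.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes sbrk on glibc in strict ISO C mode */
#endif
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `incr` bytes and report how far the break moved.
   Returns the distance in bytes, or -1 if sbrk failed. */
ptrdiff_t grow_heap(intptr_t incr)
{
    char *old_brk = sbrk(0);          /* sbrk(0) reports the current break */
    if (old_brk == (void *)-1)
        return -1;
    if (sbrk(incr) == (void *)-1)     /* on success sbrk returns the old break */
        return -1;
    char *new_brk = sbrk(0);
    return new_brk - old_brk;
}
```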

An allocator maintains the heap as a collection of various-sized blocks. Each block is a contiguous chunk of virtual memory that is either allocated or free.

Allocators come in two basic styles. Both styles require the application to explicitly allocate blocks. They differ about which entity is responsible for freeing allocated blocks.

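. Explicit allocators require the application to explicitly free any allocated blocks. For example, C programs allocate a block with malloc and free it with free, and C++ programs use new and delete in the same way.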

. Implicit allocators, on the other hand, require the allocator to detect when an allocated block is no longer being used by the program and then free the block.

Implicit allocators are also known as garbage collectors, and the process of automatically freeing unused allocated blocks is known as garbage collection.

For example, higher-level languages such as Lisp, ML, and Java rely on garbage collection to free allocated blocks.

......For example, applications that do intensive manipulation of graphs will often use the standard allocator to acquire a large block of virtual memory, and then use an application-specific allocator to manage the memory within that block as the nodes of the graph are created and destroyed.
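One way such an application-specific allocator might look is a fixed-size node pool. The sketch below is an assumption for illustration, not a method described in the text: the node_t layout and the pool_* names are hypothetical. It takes one large block from malloc and then hands out and reclaims nodes through a free list threaded through the block itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical graph node; a free node reuses `next` as its free-list link. */
typedef struct node {
    int value;
    struct node *next;
} node_t;

typedef struct {
    node_t *block;       /* one large block from the standard allocator */
    node_t *free_list;   /* nodes inside the block that are available */
} node_pool_t;

/* Acquire room for `capacity` nodes with a single malloc call. */
int pool_init(node_pool_t *pool, size_t capacity)
{
    pool->block = malloc(capacity * sizeof(node_t));
    pool->free_list = NULL;
    if (pool->block == NULL)
        return -1;
    for (size_t i = 0; i < capacity; i++) {   /* thread every node onto the list */
        pool->block[i].next = pool->free_list;
        pool->free_list = &pool->block[i];
    }
    return 0;
}

/* Pop a node off the free list; returns NULL when the pool is exhausted. */
node_t *pool_alloc(node_pool_t *pool)
{
    node_t *n = pool->free_list;
    if (n != NULL)
        pool->free_list = n->next;
    return n;
}

/* Push a node back; it must have come from pool_alloc on the same pool. */
void pool_free(node_pool_t *pool, node_t *n)
{
    n->next = pool->free_list;
    pool->free_list = n;
}

/* Return the whole block to the standard allocator. */
void pool_destroy(node_pool_t *pool)
{
    free(pool->block);
    pool->block = NULL;
    pool->free_list = NULL;
}
```

Because every node has the same size, allocation and freeing are constant-time pointer updates, and the standard allocator is called only twice over the pool's lifetime.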

The malloc and free Functions

Malloc does not initialize the memory it returns.

Applications that want initialized dynamic memory can use calloc, a thin wrapper around the malloc function that initializes the allocated memory to zero.

Applications that want to change the size of a previously allocated block can use the realloc function.

Programs free allocated heap blocks by calling the free function.

The ptr argument to free must point to the beginning of an allocated block that was obtained from malloc, calloc, or realloc.

If not, then the behavior of free is undefined. Even worse, since it returns nothing, free gives no indication to the application that something is wrong.

As we shall see in Section 9.11, this can produce some baffling run-time errors.
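The following sketch shows how these functions are commonly combined. The helper make_squares is hypothetical: it starts with a small zeroed array from calloc, doubles it with realloc as needed, and leaves the final free to the caller.

```c
#include <stdlib.h>

/* Build an array holding 0*0, 1*1, ..., (n-1)*(n-1), growing it with realloc.
   Returns NULL if n is 0 or memory runs out. The caller must free the result. */
int *make_squares(size_t n)
{
    if (n == 0)
        return NULL;

    size_t cap = 4;
    int *a = calloc(cap, sizeof(int));   /* zero-initialized, unlike malloc */
    if (a == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (i == cap) {
            cap *= 2;
            int *bigger = realloc(a, cap * sizeof(int));
            if (bigger == NULL) {        /* on failure the old block is still valid */
                free(a);
                return NULL;
            }
            a = bigger;                  /* the block may have moved */
        }
        a[i] = (int)(i * i);
    }
    return a;
}
```

Assigning the result of realloc to a temporary first avoids losing the original pointer when the request fails.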

Allocation procedure:

Figure 9.34 shows how an implementation of malloc and free might manage a (very) small heap of 16 words for a C program.

Each box represents a 4-byte word. Initially, the heap consists of a single 16-word, double-word-aligned free block.

. Figure 9.34(a): The program asks for a four-word block. Malloc responds by carving out a four-word block from the front of the free block and returning a pointer to the first word of the block.
. Figure 9.34(b): The program requests a five-word block. Malloc responds by allocating a six-word block from the front of the free block. In this example, malloc pads the block with an extra word in order to keep the free block aligned on a double-word boundary.
. Figure 9.34(c): The program requests a six-word block and malloc responds by carving out a six-word block from the free block.
. Figure 9.34(d): The program frees the six-word block that was allocated in Figure 9.34(b). Notice that after the call to free returns, the pointer p2 (the pointer returned by the request in Figure 9.34(b)) still points to the freed block. It is the responsibility of the application not to use p2 again until it is reinitialized by a new call to malloc.
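The padding in these steps is simple rounding arithmetic. The sketch below assumes 4-byte words and 8-byte alignment as in the figure, and ignores any per-block header a real allocator would add. The name padded_words is hypothetical.

```c
#include <stddef.h>

#define WORD_SIZE 4   /* bytes per word, as in the figure */
#define ALIGNMENT 8   /* double-word alignment in bytes */

/* Round a request of `words` 4-byte words up to the next multiple of the
   double-word alignment and return the padded size in words. */
size_t padded_words(size_t words)
{
    size_t bytes = words * WORD_SIZE;
    size_t padded = (bytes + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    return padded / WORD_SIZE;
}
```

With these assumptions, a four-word request stays at four words, a five-word request is padded to six, and a six-word request stays at six, matching the three allocations described above.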

Why Dynamic Memory Allocation?

The most important reason that programs use dynamic memory allocation is that often they do not know the sizes of certain data structures until the program actually runs.

While not a problem for this simple example, the presence of hard-coded array bounds can become a maintenance nightmare for large software products with millions of lines of code and numerous users.
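To make the alternative to a hard-coded bound concrete, here is a hedged sketch with a hypothetical helper read_ints. It reads a count n at run time and allocates an array of exactly that size.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a count n followed by n integers from `in` into a heap array sized at
   run time. Stores n in *n_out and returns the array (caller frees), or
   returns NULL on malformed input or allocation failure. */
int *read_ints(FILE *in, size_t *n_out)
{
    size_t n;
    if (fscanf(in, "%zu", &n) != 1 || n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;

    int *a = malloc(n * sizeof(int));   /* sized by the input, not a fixed bound */
    if (a == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        if (fscanf(in, "%d", &a[i]) != 1) {
            free(a);
            return NULL;
        }
    }
    *n_out = n;
    return a;
}
```

The only limit on n here is the amount of virtual memory available, so no source change is needed when inputs grow.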

Allocator Requirements and Goals

Explicit allocators must operate within some rather stringent constraints:

. Handling arbitrary request sequences. An application can make an arbitrary sequence of allocate and free requests, subject to the constraint that each free request must correspond to a currently allocated block obtained from a previous allocate request.

Thus, the allocator cannot make any assumptions about the ordering of allocate and free requests.

For example, the allocator cannot assume that all allocate requests are accompanied by a matching free request, or that matching allocate and free requests are nested.

. Making immediate responses to requests. The allocator must respond immediately to allocate requests. Thus, the allocator is not allowed to reorder or buffer requests in order to improve performance.

. Using only the heap. In order for the allocator to be scalable, any non-scalar data structures used by the allocator must be stored in the heap itself.

. Aligning blocks (alignment requirement). The allocator must align blocks in such a way that they can hold any type of data object. On most systems, this means that the block returned by the allocator is aligned on an 8-byte (double-word) boundary.

. Not modifying allocated blocks. Allocators can only manipulate or change free blocks. In particular, they are not allowed to modify or move blocks once they are allocated. Thus, techniques such as compaction of allocated blocks are not permitted.
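The alignment requirement in the list above can be checked from user code. The sketch below uses two hypothetical helpers, is_aligned and malloc_blocks_aligned. It assumes the 8-byte figure quoted in the text, although many 64-bit systems align to 16 bytes, which also passes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if p is a multiple of `alignment` (a power of two), else 0. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocate `count` blocks of sizes 1..count bytes and report whether every
   returned pointer is 8-byte aligned. Returns 1 if so, 0 otherwise. */
int malloc_blocks_aligned(size_t count)
{
    int ok = 1;
    for (size_t i = 0; i < count; i++) {
        void *p = malloc(i + 1);
        if (p == NULL)
            return 0;
        if (!is_aligned(p, 8))
            ok = 0;
        free(p);
    }
    return ok;
}
```

Even a 1-byte request comes back on an aligned boundary, because the allocator cannot know what type of object the block will hold.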