Electronic – What happens if a process needs more pages than number of entries in page table

computer-architecture

I am having a little trouble understanding the concept of paging. Below is a simple example to illustrate my question.

Suppose main memory has 128 bytes, organized into 32 pages of size 4 bytes each, and the page table has 8 entries.

A process needs 10 pages (= 40 bytes). As a result, the page table will contain mapping to real address of 8 pages. For access to these pages, the page number portion of the logical address will be used as index into the page table to obtain the physical address. However, with the typical scheme of using the page number as an index, the process won't be able to access the 9th and 10th pages.

Does it mean that the page table should always contain as many entries as the maximum possible number of pages (in this case, 32)? Or does the page table have tuples (page number, frame number), with the page number mentioned explicitly (as in TLB)? How is this supposed to be done?

Best Answer

In general a page table must be able to cover the entire virtual address space. However, this does not mean that every page needs to have a page table entry.

For 32-bit systems the most popular means of achieving this is using a hierarchical (also called multilevel) page table. With 4-byte PTEs and 4 KiB pages, a hierarchical page table would use a page table base pointer that points to a page of Page Directory Entries. The most significant 10 bits of the address are used to index a 4-byte PDE. If the valid bit of the PDE is set, then the address field of the PDE points to a page of PTEs which is indexed by the remaining 10 translated address bits (the least significant 12 bits are not used in translation but are the offset within the page).

(Linear page tables use a similar arrangement where the page table is mapped in the virtual address space, typically starting at virtual address 0. The virtual address of the desired PTE is translated by the TLB. If this generates a TLB miss, then the virtual address of the PTE [equivalent to a PDE] that translates the page table virtual address is translated by the TLB. If that misses, then the virtual address of the PTE that translates that page directory page virtual address is used--this is equivalent to the page table base pointer in a hierarchical page table and is often stored in a special purpose register. Linear page tables have the disadvantages of excluding a portion of the virtual address space for ordinary use, potentially 'wasting' TLB entries, and having more variable TLB miss overhead, but the advantage of exploiting the TLB to cache PDE-equivalents and potentially more natural support of variable sized translation pages, which can reduce the overhead of caching PDE-equivalents and in larger address space potentially remove a level of look-up. The variability of miss overhead can be managed by locking TLB entries for critical parts of the page table.)

With such tables, as long as the valid portion of the virtual address space is not sparse, many PDEs can be marked as invalid; this effectively compresses a 4 KiB page table holding invalid PTEs into the 4 bytes of the PDE.

If a PDE that has its valid bit cleared (and therefore the page of PTEs is not present) is referenced, then this would be treated as a form of page fault and the necessary populating of the subtable (e.g., for a large [4 MiB] chunk of allocated memory that has never been referenced, the subtable might be filled with translations pointing to a read-only page of zeros). Since the subtable would ideally only be swapped out (which may not be supported by the OS anyway) if all PTEs are invalid (unallocated or swapped out), having a virtual address with its contents in memory but not having in memory the subtable that provides the translation is very unlikely. Invalid PDEs are typically used for chunks of the virtual address space that are not allocated (i.e., an access would generate a permission violation exception).

In a typical 32-bit system, if every 4 MiB chunk of the virtual address space had at least one valid page, then the entire page table would have to be present (barring swapping out of subtables). Common systems do not allocate memory sparsely, so many subtables would have no valid PTEs and can be represented by marking the PDE as invalid.

Another, somewhat less popular page table organization is the inverted (or hashed) page table. This uses a hash table to store the PTEs. This requires each entry (which may include multiple PTEs, though no ISA defines such use) to be tagged with enough of the virtual address to uniquely identify the entry. This has the disadvantage that hash conflicts must be managed. (Providing more than one entry per index/hash bucket can reduce the frequency of conflicts.) For non-sparse use of the address space, inverted page tables generally have more storage and bandwidth overhead than hierarchical page tables and add complexity for handling hash conflicts. For (unusual) access patterns which do not have spatial locality--especially for large address spaces--or compared to hierarchical page tables that do not cache PDEs, inverted page tables allow faster TLB fill since only one look-up is required (assuming no conflict).

Inverted page tables are sometimes called and may be used as software TLBs. They act like TLBs but stored in memory and managed/sized by software. Like a TLB, there must usually be a mechanism to handle misses. Such mechanisms are OS-specific and can include using a linked list (so in addition to tag overhead, there would be overhead for a next pointer) and using a secondary (victim) hash table.