The funny page table terminology on AMD64

What’s the next word in this sequence: PT, PD, PDP, …?

As you probably know, “AX” means “A extended”, and therefore “EAX” means “extended A extended”. With the 64 bit extensions of the x86 architecture, AMD chose “RAX”, not adding another “extended”…

Something similar happened with page tables since the i386. The i386 (1985) could (theoretically) map 4 GB of virtual address space to 4 GB of physical memory, so it needed two levels of page tables. One single 4 KB “page directory” (PD) had 1024 32 bit page directory entries (PDE), pointing to up to 1024 4 KB “page tables” (PT), each of which had 1024 page table entries (PTE).
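To make the two-level split concrete, here is a minimal Python sketch of how a 32 bit virtual address breaks down into a page directory index, a page table index and a page offset (10 + 10 + 12 bits). The function name `split_i386` and the sample address are made up for illustration; the code only does the bit arithmetic and does not walk any real page tables.

```python
def split_i386(vaddr):
    """Split a 32 bit virtual address into (PD index, PT index, offset)."""
    assert 0 <= vaddr < 2**32
    pd_index = (vaddr >> 22) & 0x3FF  # top 10 bits select one of 1024 PDEs
    pt_index = (vaddr >> 12) & 0x3FF  # next 10 bits select one of 1024 PTEs
    offset = vaddr & 0xFFF            # low 12 bits: byte within the 4 KB page
    return pd_index, pt_index, offset

# 1024 PDEs * 1024 PTEs * 4 KB pages cover exactly 4 GB
assert 1024 * 1024 * 4096 == 2**32

print(split_i386(0xC0101234))  # prints (768, 257, 564)
```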

The Pentium Pro (1995) implemented a hack called “PAE” (Physical Address Extension) that allowed a total of up to 64 GB of RAM, without changing the 4 GB limit per address space. For this, page table entries now had to be 64 bits wide, so only 512 entries fit into a 4 KB page table. The same was true for the page directory: it could now point to page tables above 4 GB, so entries there had to be 64 bits wide as well, and again only 512 entries fit. Two levels of 512 entries each only cover 1 GB, so a third level of page tables had to be introduced: Intel called it the “page directory pointer table” (PDP), and it only contains 4 (64 bit) entries pointing to the four page directories, so that every virtual address space could still be 4 GB. (The register CR3, which now points to the PDP, also got the alternate name PDPTR: “page directory pointer table register”.)
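Under PAE the same 32 bit virtual address is cut differently: 2 + 9 + 9 + 12 bits. The following Python sketch shows that arithmetic; `split_pae` and `ENTRIES_PER_TABLE` are names invented for this example, and again only the index calculation is modelled, not an actual table walk.

```python
PAGE_SIZE = 4096
ENTRY_SIZE = 8                              # 64 bit entries
ENTRIES_PER_TABLE = PAGE_SIZE // ENTRY_SIZE  # 512

def split_pae(vaddr):
    """Split a 32 bit virtual address into (PDP, PD, PT index, offset)."""
    assert 0 <= vaddr < 2**32
    pdp_index = (vaddr >> 30) & 0x3   # 2 bits select one of the 4 PDP entries
    pd_index = (vaddr >> 21) & 0x1FF  # 9 bits select one of 512 PDEs
    pt_index = (vaddr >> 12) & 0x1FF  # 9 bits select one of 512 PTEs
    offset = vaddr & 0xFFF            # low 12 bits: byte within the 4 KB page
    return pdp_index, pd_index, pt_index, offset

# 4 page directories * 512 * 512 * 4 KB still add up to 4 GB
assert 4 * ENTRIES_PER_TABLE**2 * PAGE_SIZE == 2**32

print(split_pae(0xC0101234))  # prints (3, 0, 257, 564)
```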

When in 2003 AMD introduced the AMD64 64 bit extensions to the i386 architecture, page tables had to be extended once more: In the implementation currently on the market (and copied by Intel), the CPUs can map 48 bit virtual addresses to 40 bit physical addresses (the architecture itself allows up to 52 bits). Using all 512 entries of the page directory pointer table (instead of just 4) only allows 39 bits of virtual addresses (512 GB), so another level of page tables was introduced. (They could have introduced more extra levels, but 256 TB of address space seemed to be enough for now; another level can be introduced at any time with new CPUs by just changing the OS, and without having to change application programs.)

The interesting fact is now what AMD called it… “page directory pointer-pointer” (PDPP)? “page directory pointer directory” (PDPD)? No, they understood that numbering the page table levels was a better idea, as they all have the same format anyway. The (single) 4th level page table is called “page map level 4” (PML4). The other levels are still named PDP, PD and PT in the documentation, though (also in Intel’s), probably to make it easier for developers familiar with i386/PAE.
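The address arithmetic behind the four levels can be sketched in a few lines of Python. The helper `split_amd64` is hypothetical, and it makes one simplifying assumption: it only accepts the low 48 bits of an address and ignores the fact that on real hardware the upper 16 bits must be a sign extension of bit 47.

```python
LEVELS = ("PML4", "PDP", "PD", "PT")  # top level first

def split_amd64(vaddr):
    """Split a 48 bit virtual address into per-level indices and an offset."""
    assert 0 <= vaddr < 2**48  # assumption: sign-extension bits already removed
    indices = {}
    shift = 39                 # the PML4 index lives in bits 39..47
    for name in LEVELS:
        indices[name] = (vaddr >> shift) & 0x1FF  # 9 bits = 512 entries
        shift -= 9
    return indices, vaddr & 0xFFF                 # low 12 bits: page offset

# Three levels of 512 entries reach 39 bits (512 GB) ...
assert 512**3 * 4096 == 2**39 == 512 * 2**30
# ... and the fourth level brings it to 48 bits (256 TB).
assert 512**4 * 4096 == 2**48 == 256 * 2**40

print(split_amd64(0x00007FFFFFFFF000))
```

Because every level uses the same 9 bit index, a single loop covers all four tables, which fits the observation that the levels share one format.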

7 thoughts on “The funny page table terminology on AMD64”

  1. Something I don’t like about “rax”, etc. is that x86-64 assemblers don’t allow you to use “r0” “r1” “r2” … instead of “rax”, “rcx”, “rdx” … . They should be consistent…

    myria

  2. Even uglier is that there is no easy way to get the corresponding byte/word/dword part of a register name. Especially for macros this is very annoying (think of crypto-code). The usual work-around is to alias (r0b, r0w, r0d, r0) to (al, ax, eax, rax), possibly making the code less readable for others.

  3. Sandpile (http://sandpile.org/aa64/paging.htm) actually refers to the four tables as PML1, PML2, PML3, and PML4. They call the entries PML1E, PML2E, PML3E, and PML4E. I haven’t seen those terms used anywhere else, but it was pretty confusing at first, and I was scared that there was something huge added that I didn’t already know about.

  4. My understanding is that:

    The X in AX doesn’t stand for eXtended but stands for ANY/BOTH.
    AL = A: low part
    AH = A: high part
    AX = A: any/both part

    That’s why the 4 registers with low/high parts are called AX, CX, DX and BX, while the 4 GP registers which can only be accessed whole don’t have this X suffix: BP, SP, SI, DI.
