Monthly Archives: July 2010

Why is there no CR1 – and why are control registers such a mess anyway?

If you want to enable protected mode or paging on the i386/x86_64 architecture, you use CR0, which is short for control register 0. Makes sense. These are important system settings. But if you want to switch the pagetable format, you have to change a bit in CR4 (CR1 does not exist and CR2 and CR3 don’t hold control bits), if you want to switch to 64 bit mode, you have to change a bit in an MSR, oh, and if you want to turn on single stepping, that’s actually in your FLAGS. Also, have I mentioned that CR5 through CR15 don’t exist – except for CR8, of course?

Like many (but unfortunately not all) quirks of the i386/x86_64 architecture, this mess can be explained with history.

8086 – FLAGS

x86 history typically starts with the 16 bit 8086, but although it was not binary compatible with its predecessor, it was nevertheless a rather straightforward assembly-level compatible 16 bit extension of the 8 bit Intel 8080 with some ideas of the Zilog Z80. The 8086 is still a classic “home computer class” CPU, which was not meant for modern operating systems: It had no MMU of any kind, and no concept of privileged and unpriviliged modes. Therefore, control bits that we see as system state today were encoded into the 16 bit FLAGS register: The interrupt enable bit and the trap flag (which will cause a software interrupt after the next instruction and thus lets you single-step) are encoded into FLAGS right next to the ALU’s flags like Zero and Carry.

80286 – Machine Status Word

The 80286 then came with a simple form of memory management that allowed more sophisticated (but not yet “modern”) operating systems to run – like the original versions of OS/2. The 16 bit “Machine Status Word” was created to host the big switch between legacy mode (real mode) and the new memory-managed mode (protected mode) and a program could access it using the new instructions “lmsw” and “smsw”. The 80286 had more system state than just this bit: The GDT, the IDT and the TSS had its own registers and dedicated instructions to access them (“lgdt”/”sgdt”, “lidt”/”sidt”, “ltr”/”str”)

i386 – Control Registers

The i386 finally had a real MMU that allowed paging and thus modern operating systems. The MMU required two more registers in the system state, one for the base address of the pagetables, and one to read a fault address from. Intel decided against adding more special purpose registers with dedicated accessor instructions, but instead introduced eight indexed 32 bit wide “control registers” CR0 to CR7. The new accessors “mov crn, r32“/”mov r32, crn” allowed copying between registers and control registers and had the 3 bit CR index encoded in the opcode.

The old MSW was also wired into the lower 16 bits of CR0; but CR0 was also extended with new bits like the switch to turn on paging. CR1 was kept reserved, presumably as a second control register for miscellaneous control bits, and CR2 and CR3 were used for the aforementioned fault address and pagetable base pointer. The opcodes to access reserved control registers generated an “invalid opcode” fault, making it possible for Intel to reuse the opcodes later if they don’t use the control registers.

i486 – CR4

The i486 added a few more control bits, and some of them went into CR0. But instead of overflowing the new bits into CR1, Intel decided to skip it and open up CR4 instead – for unknown reasons.

Pentium – MSRs

On the Pentium, Intel added for the first time control bits that were a property of the implementation as opposed to the architecture, i.e. bits that are microarchitecture-specific and will therefore only work on certain CPUs and not necessarily be supported on later CPUs – like caching details and debug settings. In order not to waste the valuable CR space with throw-away control bits, Intel introduced the Model-specific Registers (MSRs). The MSR address space is 32 bits, and every MSR is 64 bits wide. The two new instructions “rdmsr” and “wrmsr” copy between an ECX-indexed MSR and the EDX:EAX registers.

Pentium II – SYSENTER MSR

The SYSENTER instruction that got introduced on the on the Pentium II is a fast way to switch between unprivileged and privileged mode. Instead of looking up the destination segment, instruction pointer and stack pointer in memory, the CPU holds this information in three special-purpose system registers. CR space is valuable, so Intel decided against filling up CR5, CR6 and CR7, so they put it into the MSR address space instead – at 0×174 through 0×176. This was practically an abuse of the MSR concept.

AMD K6 – EFER MSR

Who can blame AMD for doing similar things then? With the K6, which was introduced at the same time as the Pentium II, AMD diverged from just copying Intel for the first time and actually added features of their own: They added the SYSCALL instruction, and with it, a control bit that turns it on and off, and an extra control register with the target location. Being afraid to collide with Intel extensions they they didn’t know about, they put the extra system registers into the MSR space: the control register “EFER” (Extended Feature Enable Register) at 0xC000_0080 and the Syscall Target Register (STAR) at 0xC000_0081. Intel had been nicely lining up MSRs counting up from 0, so AMD decided to start counting at 0xC000_0080. Understandable as this is, it is basically the same abuse of the MSR concept as Intel’s with SYSENTER.

A very similar thing happened in the CPUID space, by the way: While Intel encoded all its feature bits in leaf 0x0000_0001, AMD defined leaf 0x8000_0001 for its features.

x86_64 – Chaos!

So far everything looked like it was getting a little more controlled. Both Intel and AMD are only adding new control registers in the MSR space, and since this is a big address space and AMD and Intel extend it on rather opposite locations, it all looks nicer. But then came x86_64: For the first time, Intel was copying a feature that AMD introduced, and it needed to be compatible with all its details. AMD had encoded the availibility of x86_64 in its own CPUID leaf in 0x8000_0001, so Intel had to support this leaf as well. And since Long Mode was turned on in the EFER MSR, Intel had to support an MSR in the AMD space of 0xC000_0000. Long mode also required supporting SYSCALL, so Intel also supported the STAR MSR.

Since x86_64 introduced the REX prefix to double the number of available general-purpose registers, AMD decided to allow this prefix also for “mov cr”, doubling the number of control registers and therefore introducing CR8 through CR15 – also doubling their width. And since AMD introduced them, they owned them, and decided to use CR8 for the “Task Priority Register” feature.

VMX and SVM

The architecture is messy, sure, but does it matter? Maybe not… as long as CPUs didn’t have virtualization extensions! Both Intel VMX and AMD SVM are designed so that they can automatically switch the complete privileged machine state including control registers and certain MSRs. Intel for example special cases CR0, CR3, CR4 and CR8, leaves CR2 to the user. AMD on the other hand has 16 fields for all CRs in its switcher. And because of the two different starting points of the MSR space, Intel VMX required a whitelist bitmap for 8192 MSRs starting at 0x0000_0000 and for another 8192 MSRs starting at 0xC000_0000 – and of course SYSENTER_CS, EFER, STAR and friends are special-cased. If you want to have a lot of fun, read the VMCS layout reference of Intel’s manual 3B!

Future?

  • CR1 and CR5 to CR7 are still “owned” by Intel. AMD has shown that they don’t want to use them – and even Intel has not added a control register since 1989.
  • CR9 through CR15 are technically owned by AMD, since they introduced them with x86_64 and decided to use CR8. Intel adopted the reserved ones when adopting x86_64, but it is unlikely that Intel will ever adopt smaller changes to the architecture from AMD, and AMD is unlikely to use them if they won’t be part of the architecture, so these will probably never be used either. On the other hand, AMD added these to the auto-switcher list of their SVM Virtual Machine Control Block (VMCB), showing that they haven’t given up on them yet.
  • The MSR space is properly de-facto partitioned. Intel continues adding MSRs at 0 and AMD at 0xC000_0000 – but MSR have already lost their model-specificness in 1997. MSRs are the new CRs.

Dear Intel, dear AMD: I like the control registers, and I hate to see them wasted. Why don’t you finally define CR1 and give it a few control bits in the future? If you’re scared about collisions, I will be happy to be the arbiter. Ah, whatever: Intel, you get to define all even bits in CR1, and AMD, you get to define all odd bits. Okay? Cool.