Redundant SSE instructions

As we all know the x86-ISA has a lot of redundant instructions (ie. instructions with the same semantic but different opcodes). Sometimes this is unavoidable, sometimes it looks like bad design. But with SSE it gets really weird. Let’s say we want to perform xmm0 <- xmm0 & xmm1 (ie. bitwise and). Not an uncommon operation; but we have 3 different ways do archive this:

  • andps xmm0, xmm1 (0f 54 c1)
  • andpd xmm0, xmm1 (66 0f 54 c1)
  • pand xmm0, xmm1 (66 0f db c1)

(Note that andpd/pand are SSE2 instructions)
Regarding the result in xmm0 these are really the same instructions. Now, why did Intel do this? First we’re going to inspect andps/andpd. Looking at the optimization manuals we get a hint: The ps/pd mark the target register to contain singles or doubles, so they should match the actual data you are operating on. read more

Virtualization: The elegant way and the x86 way

Virtualization means running one or more complete operating systems (at the same time) on one machine, possibly on top of another operating system. VMware, VirtualPC, Parallels etc. support, for example, running a complete GNU/Linux OS on top of Windows. For virtualization, the Virtual Machine Monitor (VMM) must be more powerful than kernel mode code of the guest: The guest’s kernel mode code must not be allowed to change the global state of the machine, but may not notice that its attempts fail, as it was designed for kernel mode. The VMM as the arbiter must be able to control the guest completely. read more

The real reason for driver signing in Vista x64

In Windows Vista x64, drivers are required to be signed by someone holding a VeriSign code certificate or they won’t load. There is no way to (permanently) disable this signing even if you are Administrator. The F8 startup menu has an option to disable it, but you must select it every time you boot up. Microsoft’s claimed reason for this is that it prevents Trojans from installing kernel-mode rootkits. That is a load of crap. read more

Asking the kernel how to make a syscall

Imagine you’re an i386 user mode application on a modern operating system, and you want to make a syscall, for example to request some memory or create a new thread. But syscalls can be made in various ways on the i386 family of CPUs (int, call gates, sysenter, syscall), and CPUs tend to support only a subset of them. But hardcoding “int” into the kernel is a waste of resources on modern CPUs, because sysenter is a lot faster. read more