Intel used some strange opcodes for the SSE3 instructions. All MMX/SSE opcodes use the 0x0f prefix (formerly “pop cs”). They soon noticed that the 0x0f area was getting full, so they used the 0x66, 0xf2 and 0xf3 prefixes as modifiers. The basic rule is:
- 0x66: makes MMX instructions use the xmm registers; for SSE instructions this switches from packed single to packed double
- 0xf3: for SSE instructions: use the scalar single form
- 0xf2: for SSE instructions: use the scalar double form
This isn’t always 100% correct, so when writing a disassembler you need separate tables for the 0xf2 and 0xf3 forms anyway. Example:
- 0fdb: pand mmx, mmx
- 660fdb: pand xmm, xmm
- 0f58: addps
- 660f58: addpd
- f30f58: addss
- f20f58: addsd
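To make the "separate tables" idea concrete, here is a minimal Python sketch of a lookup keyed by the modifier prefix and the opcode byte after 0x0f. The names `OPCODES`, `MODIFIER_PREFIXES` and `decode` are made up for illustration, the table contains only the six encodings listed above, and operand (ModRM) decoding is left out entirely.

```python
# Hypothetical lookup table: (modifier prefix, opcode byte after 0x0f) -> mnemonic.
# A prefix of None means "no 0x66/0xf2/0xf3 prefix present".
OPCODES = {
    (None, 0xDB): "pand mmx, mmx",
    (0x66, 0xDB): "pand xmm, xmm",
    (None, 0x58): "addps",
    (0x66, 0x58): "addpd",
    (0xF3, 0x58): "addss",
    (0xF2, 0x58): "addsd",
}

MODIFIER_PREFIXES = (0x66, 0xF2, 0xF3)


def decode(code: bytes) -> str:
    """Look up [prefix] 0x0f <opcode>; return "(unknown)" if not in this sketch."""
    pos = 0
    prefix = None
    # An optional modifier prefix may precede the 0x0f escape byte.
    if code and code[0] in MODIFIER_PREFIXES:
        prefix = code[0]
        pos = 1
    # The 0x0f escape byte and one opcode byte must follow.
    if len(code) < pos + 2 or code[pos] != 0x0F:
        return "(unknown)"
    return OPCODES.get((prefix, code[pos + 1]), "(unknown)")
```

For example, `decode(bytes.fromhex("660f58"))` looks up the key `(0x66, 0x58)` and returns `"addpd"`. Because each prefix has its own key, the table can hold any assignment, whether or not it follows the basic rule.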
But some of the SSE3 instructions (addsubps, addsubpd, haddps, haddpd, hsubps, hsubpd) just seem to be placed at random. By the scheme above, one would expect addsubps/d to share the same opcode, with the 0x66 prefix switching between them. But they used:
- f20fd0.. addsubps
- 660fd0.. addsubpd
- 0fd0 and f30fd0 are both invalid
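The mismatch can be checked mechanically. The following is only a sketch under my own assumptions: `RULE_SUFFIX` encodes the basic rule from above as "prefix implies mnemonic suffix", and `rule_violations` is a hypothetical helper that reports the prefixes whose assigned mnemonic does not carry that suffix.

```python
# Suffix the basic rule predicts for each prefix (None = no modifier prefix).
RULE_SUFFIX = {None: "ps", 0x66: "pd", 0xF3: "ss", 0xF2: "sd"}

# Assignments for opcode 0x0f 0x58 and 0x0f 0xd0 as listed in this post.
# A mnemonic of None marks an invalid encoding.
ROW_58 = {None: "addps", 0x66: "addpd", 0xF3: "addss", 0xF2: "addsd"}
ROW_D0 = {None: None, 0x66: "addsubpd", 0xF3: None, 0xF2: "addsubps"}


def rule_violations(row):
    """Return the prefixes whose mnemonic lacks the suffix the rule predicts."""
    return [
        prefix
        for prefix, mnemonic in row.items()
        if mnemonic is not None and not mnemonic.endswith(RULE_SUFFIX[prefix])
    ]
```

Here `rule_violations(ROW_58)` is empty, while `rule_violations(ROW_D0)` contains only 0xf2: the rule would predict a scalar double form there, but the slot holds the packed single addsubps.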
When I first saw this, I thought it was just an error in the tables. Do you have an idea why they didn’t use 0fd0 for addsubps and 660fd0 for addsubpd?