FFREEP – the assembly instruction that never existed

Because the Intel 80287 used simplified instruction decoding, it had opcode aliases for instructions such as FXCH and FSTP, i.e. additional encodings that did the same as the originals defined by the 8087. As a side effect, a new instruction, FFREEP, appeared, although Intel had not intended it.

The “Intel 80287 Programmer’s Reference Manual” (at least the revised 1987 version) listed FFREEP in the opcode list and explained what it does:

DF 1101 1111 1100 0REG (6)
The marked encodings are not generated by the language translators. If, however, the 80287 encounters one of these encodings in the instruction stream, it will execute it as follows:
FFREE ST(i) and pop stack
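
The quoted behaviour ("FFREE ST(i) and pop stack") can be sketched as a toy Python model. This is not real x87 code: the class and method names (`X87Stack`, `fld`, `ffree`, `fincstp`, `ffreep`) are illustrative assumptions, only the tag and TOP bookkeeping is modelled, and a stack overflow raises a Python exception here instead of doing whatever the hardware would do.

```python
EMPTY, VALID = "empty", "valid"


class X87Stack:
    """Toy model: 8 physical registers, a TOP pointer, one tag per register."""

    def __init__(self):
        self.top = 0
        self.tags = [EMPTY] * 8
        self.regs = [None] * 8

    def _phys(self, i):
        # ST(i) is relative to TOP and wraps around modulo 8.
        return (self.top + i) % 8

    def fld(self, value):
        # A load decrements TOP and needs the new top register to be empty.
        new_top = (self.top - 1) % 8
        if self.tags[new_top] != EMPTY:
            raise OverflowError("stack overflow: target register is not empty")
        self.top = new_top
        self.tags[new_top] = VALID
        self.regs[new_top] = value

    def ffree(self, i):
        # FFREE ST(i): only the tag changes; TOP stays where it is.
        self.tags[self._phys(i)] = EMPTY

    def fincstp(self):
        # FINCSTP: only TOP changes; no tag is touched.
        self.top = (self.top + 1) % 8

    def ffreep(self, i):
        # FFREEP ST(i) as described in the manual excerpt: FFREE ST(i), then pop.
        self.ffree(i)
        self.fincstp()
```

In this model, `ffreep(0)` on a full stack leaves room for one more `fld`, because the register that the next load lands in has been tagged empty.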

FFREEP was not documented in the instruction reference, but since the instruction was now somewhat official, Intel had to keep it in all future CPUs, and AMD, Cyrix and most other clone makers implemented it as well, at least since the 387-class FPUs. But nobody documented it.

FFREEP briefly reappeared in the 1997 “Intel Architecture Optimization Manual” for the Pentium Pro. This CPU was the first x86 processor that translated x86 instructions into RISC-like micro-ops, so the optimization manual listed the number of micro-ops necessary for each instruction, including the otherwise undocumented FFREEP.

In 2002, AMD finally documented FFREEP – not. They dedicated a whole page in the “AMD Athlon Processor x86 Code Optimization Guide” to this instruction, describing what it does, and how it can be used. They even state:

Note that the FFREEP instruction, although insufficiently documented in the past, is supported by all 32-bit x86 processors.

This is not entirely correct, as the NexGen 586PF did not support FFREEP – AMD obviously interprets “all 32-bit x86 processors” as “all Intel and AMD (and possibly Cyrix) 32-bit x86 processors”. Oh, and please note that even after this, AMD does not list FFREEP in its x86/AMD64 instruction reference.

Although FFREEP has now been retroactively documented, has existed in all P6-class and later CPUs, and actually serves a purpose, it is still hardly used, even though most disassemblers (objdump, HT) and i386 emulators (Bochs, QEMU) support it. The GCC toolchain seems to be the only one that ever emits code using FFREEP, and it only does so when tuning for AMD K8 CPUs.


10 thoughts on “FFREEP – the assembly instruction that never existed”

  1. It is used in mplayer (SVN version, not in the official release yet).
    The new SSE-optimized MP3 decoding library (mp3lib) includes one hand-coded ffreep instruction.

  2. The AMD Athlon Processor x86 Code Optimization Guide states that FFREEP is converted to an internal NOP inside the AMD Athlon processor.

  3. You cannot use fincstp alone since, according to the Intel manuals, this instruction isn’t equivalent to popping the stack: the tag of the old top of stack isn’t marked empty. The problem manifests itself when you try to load onto the stack after using “fincstp”: the processor will then detect a false stack overflow and leave a NaN in the TOP, unless exceptions are enabled. I guess the alternative is to use “fstp st(0)”, that is, store and pop, or “ffree st(0)” followed by “fincstp”.

  4. “ffreep” is to this day (2023) the fastest way to clean up the FPU stack: it “cleans” one register and pops the top. Two at once, you’ll need max 4 instructions. Don’t use “fcompp”, it takes at least 3 clocks. Don’t use “fstp ST”, it takes at least one clock. Since Haswell, “ffreep” takes one clock, but the FPU can execute two instructions per clock, which means only 2 clocks to clean up the FPU stack (on Ivy Bridge it is also one clock, but there it takes 4 clocks to clean up!).

  5. Use the “db” directive, for example:
    db 0DFh,0C4h { ffreep ST(4) }
    db 0DFh,0C2h { ffreep ST(2) }
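
The two byte pairs above follow the encoding quoted from the 80287 manual (DF, then 1100 0REG), so the second byte is C0h plus the register index. A small Python sketch can generate such lines; the helper names `ffreep_bytes` and `db_line` are made up for illustration, and the output format simply copies the style of the comment above.

```python
def ffreep_bytes(i: int) -> bytes:
    """Return the two opcode bytes for FFREEP ST(i): DF, then 1100 0REG."""
    if not 0 <= i <= 7:
        raise ValueError("ST(i) index must be in 0..7")
    return bytes([0xDF, 0xC0 | i])


def db_line(i: int) -> str:
    """Format the bytes as a db line in the style used in the comment above."""
    b0, b1 = ffreep_bytes(i)
    return f"db 0{b0:02X}h,0{b1:02X}h {{ ffreep ST({i}) }}"


if __name__ == "__main__":
    # Print one db line per stack register, ST(0) through ST(7).
    for reg in range(8):
        print(db_line(reg))
```

Whether a given assembler accepts this exact `db` and comment syntax is an assumption; adjust the formatting to the tool in use.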

