Category Archives: Uncategorized

A Visual Defragmenter for the Commodore 64

For decades, PC users have been able to relax by watching the computer defragment a disk. Now C64 users can do the same! Introducing “defrag1541”, a disk defragmentation tool for C64 and 1541.

defrag1541 consists of 49 lines of BASIC, takes about 15-30 minutes to defragment a typical disk, and visualizes its work using colored blocks on the screen. The different colors symbolize the different files, and the character shapes show whether the block is at its original (round) or final (square) position. (Note that this tool is meant for visualization only, do not use it on real data!)

In the remainder of this article, I will describe how the program works.

The 1541 Disk Format

Commodore 1541 floppy disks use the CBM filesystem. A disk consists of 683 sectors of 256 bytes, which are addressed as a track/sector tuple. There are

  • 35 tracks, numbered 1-35
  • with 21 to 17 sectors each, numbered from 0

The number of sectors depends on the track, higher tracks (closer to the center) have fewer sectors:

Tracks Sectors
1-17 21
18-24 19
25-30 18
31-35 17

The following table visualizes this: The columns are the tracks, the lines are the sectors. The higher numbered tracks don’t have as many sectors.


Track 18 is special, it holds all the metadata. 18/0 contains the disk name and the Block Availability Map (BAM), which uses one bit per block to keep track of which blocks are allocated, and the remaining sectors contain 32 byte directory entries, so 8 directory entries per block. Every directory entry consists of a file type, a 16 byte file name, and a track/sector tuple for the first block of the file.

Every file is stored as a linked list. Every block holds the track/sector tuple of the next block in its first two bytes, followed by 254 data bytes. A track value of 0 indicated the last block of a file.

The Data Model

The easiest way to look at the problem of defragmenting a disk is to imagine all files on the disk as concatenated into one long file that needs to be defragmented. Now every used block on disk has a logical block index. Logical block 0 is the first block of the first file, block 1 is the second block of the first file – if the first file was two blocks or more. Otherwise it’s the first block of the second file, and so on.

So for every physical block, we need to know what logical block is stored there. Then we build the desired mapping: For every physical block, what logical block should be stored there. Our core defragmenter code then moves sectores around and updates links until the two mappings are identical.

The Data Structures

At the very beginning, we are defining the arrays. We do this early, so we can’t run into OUT OF MEMORY errors at runtime. In fact, this program occupies about 99% of the 38911 bytes available to BASIC on the C64.

10 dimds(17):dimn(143):dimcc(143):dimci(682):dimic(682):dimcj(682):dimjc(682)
15 dimtr(682):dimsc(682):dimaa(682):dimdd(682):dimoo(682):dimss(682):n$=chr$(0)

The following table contains a small overview of the arrays. We will talk about them more as they are used in the code.

ds(17) array of sector numbers of dentry chain
n(143) map dentry index -> number of blocks
cc(143) map dentry index -> color
ci(682) map physical block index -> logical block index – current
ic(682) inverse of ci()
cj(682) map physical block index -> logical block index – desired
jc(682) inverse of cj()
tr(682) map block index -> track
sc(682) map block index -> sector
aa(682) map block index -> screen ram location
dd(682) map logical block index -> dentry
oo(682) map logical block index -> offset
ss(682) stack

The UI

We create a border around a 35×21 character area on the screen, so that every block can be represented by a character.

20 a=1024:b=1060:poke646,15:v=53280:pokev,0:pokev+1,15:printchr$(147):pokev+1,0
25 pokea,112:pokea+22*40,109:fori=1to35:pokea+i,67:pokea+22*40+i,67:next
30 pokeb,110:fori=1to21:pokea+40*i,66:pokeb+40*i,66:next:pokeb+22*40,125

The array cc() will map a file index to a color. We cycle through the C64 palette skipping black, which is the background color.

35 j=1:fori=0to143:cc(i)=j:j=j+1:ifj=16thenj=1
40 next

Reading the Current State

As a first step, we need to find out where data is stored on disk, i.e. for every physical block, we need to know what logical block (or none) is currently stored there.

The following code iterates over all directory entries and follows the linked lists of all files, recording in the array ci() the logical block index for each physical block. The reverse mapping (at what physical location is the logical block stored?) is built in the array ic(). The array nb() collects the number of blocks each file occupies.

In addition, we are keeping track of the sectors on track 18 that are used for directory entries. We will need this information later to update the directory entries when blocks have moved.

Since we are using linear block indexes for everything, we need to convert the track/sector tuples that the filesystem works with. This is done by the t2 comparisons in the second half of the code block.

45 open1,8,15,"i":fori=0to682:ci(i)=-1:ic(i)=-1:next:s=1:open2,8,2,"#"
50 ds(di)=s:ci(357+s)=-2:di=di+1:print#1,"u1 2 0 18"s:get#2,t$,s$:t1=asc(t$+n$)
55 s1=asc(s$+n$):fori=0to255step32:print#1,"u1 2 0 18"s:poke1082+s*40,4
60 print#1,"b-p 2"i+2:get#2,f$:f=asc(f$+n$):iff<>129andf<>130goto105
65 get#2,t$,s$:t2=asc(t$+n$):s2=asc(s$+n$):ift2=0goto105
70 print#1,"u1 2 0"t2;s2:poke1064+t2+s2*40,81
75 poke55336+t2+s2*40,cc(ni):ift2<18thena=s2+(t2-1)*21:goto95
80 ift2<25thena=s2+(t2-18)*19+357:goto95
85 ift2<31thena=s2+(t2-25)*18+490:goto95
90 a=s2+(t2-31)*17+598
95 ci(a)=cd:ic(cd)=a:n(ni)=n(ni)+1
100 cd=cd+1:goto65
105 ni=ni+1:next:s=s1:on-(t1<>0)goto50:ci(357)=-2

The ci(357) and ci(357+s) assignments mark the directory sectors as occupied, so that the defrag algorithm can’t use them as temporary storage.

As this code is walking the linked lists, it shows them on the screen in the form of a bullet, in a color that represents the file index.

Helper Data Structures

We have to do some programmatic work, which will take about a minute, so we draw progress bars. For this, we first draw “[” and “]” characters in an empty line at the bottom of the screen.

110 poke1944,27:poke1980,29

When updating links during defragmentation, we will need to convert linear block indexes back into track/sector tuples, so we create the two arrays tr() and sc() for a quick lookup. In addition, in aa() we calculate the screen position for the character that corresponds to the block.

115 i=0:fort=1to35:readns:fors=0tons:tr(i)=t:sc(i)=s:aa(i)=1064+t+s*40:i=i+1
120 next:poke1944+t,46:next:data20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20
125 data20,18,18,18,18,18,18,18,17,17,17,17,17,17,16,16,16,16,16

Later, we will also need to know for every logical block index what file it belongs to and which index it has within the file. This will be stored in dd() and oo().

130 ford=0to143:ifn(d)<>0theno=0:forj=1ton(d):dd(l)=d:oo(l)=o:l=l+1:o=o+1:next
135 poke1945+int(d/4.2),81:next

Desired Allocation

We have to build a desired allocation array, which has the same format as the current allocation array, just with the data about where we would like the blocks to be. The array cj maps physical blocks to logical blocks, and jc() is the reverse of this.

For simplicity, we are just filling the disk linearly, starting with track 1, and not using an interleave. This doesn’t make the defragmenter very useful, because the disk might actually read more slowly after this, but it makes for a very clean visualization!

140 fori=0to682:cj(i)=-1:jc(i)=-1:next:c=0:b=0:fori=0to143:ifn(i)=0goto155
145 forj=1ton(i):ifc>=357andc<376thenc=376
150 cj(c)=b:jc(b)=c:c=c+1:b=b+1:next
155 poke1945+int(i/4.2),160:next:fori=357to375:cj(i)=ci(i):next

The code could be easily extended to use a more reasonable strategy. The 1541 operating system for example picks the empty block closest to track 18 as the first block for a new file, then stays on the same track as long as there are free blocks, preferably with an interleave of 10. Then it goes on to the closest track with an empty sector again.

We're done thinking, so let's clear the progress bar:

160 fori=0to34:poke1945+i,32:next:poke1944,32:poke1980,32:sl=0:i=0

The Main Algorithm

Now we iterate over all physical blocks and make sure the current allocation matches the desired allocation.

  • If the current allocation matches the desired allocation, do nothing.
  • If the block is currently empty, just move the correct block there.
  • Otherwise, we need to move the logical block that is there to its new location first – using recursion!

For recursion, we use the array ss() as the stack and sl as the stack pointer. Recursion can cause a cycle though, e.g. when logical block A is stored where logical block B should be stored and vice versa. So if the block we are looking at is the same as the bottom of the stack, we have to break the cycle.

The following code checks for block identity, empty block as well as a cycle:

165 on-(ci(i)=cj(i))goto190:on-(ci(i)=-1)goto185:on-(sl=0orss(0)<>i)goto180

To break the cycle, we need the help of a temporary block. We find a free block and move the current block there (subroutine 225, we'll talk about it later), then continue.

170 forj=0to682:ifci(j)<>-1thennext
175 i1=i:i2=j:gosub205:goto165

For recursion, we push the current index on the stack and continue with the physical block that should contain the data that the current block currently contains.

180 ss(sl)=i:sl=sl+1:i=jc(ci(i)):goto165

In case the current block is empty, we move it using subroutine 225.

185 i1=ic(cj(i)):i2=i:gosub205

At the end of the loop, continue with the value from the top of the stack, if there is any, otherwise we take the next block in sequence.

190 ifci(i)>=0thenpokeaa(i),160
195 ifsl<>0thensl=sl-1:i=ss(sl):goto165
200 i=i+1:on-(i<683)goto165:close2:close1:open1,8,15,"v":end

When we are done, we send a "V" (validate) command to the drive, which will follow the links of all files and recreate the block availability map. For simplicity, our code never updates this table, so we have the operating system do it for us at the end.

Moving a Block

In order to move a block, in addition to copying the data between the two physical blocks and updating our data structures, we also have to update the track/sector link tuple from the previous block or the directory entry.

This reads the block into a buffer and writes it to a different location without transfering it to the computer:

205 pokeaa(i1),18:print#1,"u1 2 0"tr(i1)sc(i1):pokeaa(i2),23
210 pokeaa(i2)+54272,cc(dd(i2)):print#1,"u2 2 0"tr(i2)sc(i2)
215 pokeaa(i1),32:pokeaa(i2),160

The POKE instructions update the screen.

To update the parent link, we have to find out whether this is the first block in a file, in which the parent link is located in the directory entry.

220 c=ci(i1):o=oo(c):t=tr(i2):s=sc(i2):ifo<>0goto240

If that's the case, we find the correct directory entry, read the block and write the new link.

225 d=dd(c):s1=ds(int(d/8)):pp=peek(aa(357+s1)):pokeaa(357+s1),0
230 print#1,"u1 2 0"18;s1:print#1,"b-p 2"(dand7)*32+3
235 print#2,chr$(t)chr$(s);:print#1,"u2 2 0"18;s1:pokeaa(357+s1),pp:goto250

Otherwise, we read the preceding block in the file and update the link there.

240 a=ic(c-1):tt=tr(a):ss=sc(a):print#1,"u1 2 0"tt;ss:pp=peek(aa(a))
245 pokeaa(a),0:print#2,chr$(t)chr$(s);:print#1,"u2 2 0"tt;ss:pokeaa(a),pp

Finally, we update the data structures to reflect the fact that this physical block now contains the new data, and the block we moved it from is now empty.

250 v=ci(i1):ic(v)=i2:ci(i2)=v:ci(i1)=-1:return


defrag1541 is not meant as a serious tool. Nevertheless, here are some limitations and ideas for improvement:

  • The algorithm for the desired allocation was chosen for its looks, not to make disks actually faster.
  • Optionally, the algorithm could also place data blocks on track 18. That would give the user 17 more blocks on the disk.
  • That said, disks that currently have data blocks on track 18 can't get defragmented with the current algorithm. It avoids track 18 and will run out of space.
  • The current code does not test for read errors at all.
  • We also do not update BAM ourselves and rely on the operating system doing it at the end. While the code is written so that interrupting the process at any point will not lead to data loss, it is important to run the VALIDATE command at the end if data should ever be written to the disk again.
  • Of course it could be much faster. A lot of this is CPU bound, because BASIC is so slow. Rewriting everything in assembly would help. Some parts of the code could also directly run inside the disk drive. I doubt the complete data structures would ever fit into the 1541's 2 KB of RAM, so defragmentation can't completely run on the disk drive, but major parts of the logic could still live there, to minimize the lag introduced by communication between the computer and the drive.

C64-Artikel in der c’t Retro 2018

This post is about articles I wrote for the c’t magazine in German language. The post is therefore in German.

Im heute erschienenen “c’t Retro 2018” Sonderheft befinden sich drei Artikel von mir.

Brotkasten für die Welt” beschreibt die Hardware, Peripherie und die grundlegende Bedienung des C64.

Emulieren mit VICE” beschäftigt sich mit der optimalen Konfiguration des VICE-Emulators.

Und “Der eigene Katakis-Klon” beschreibt auf nur 6 Seiten ein vollständiges kleines Shooter-Spiel im Quelltext und vermittelt dabei die Grundladen von 6502-Maschinensprache sowie von Interrupt-basierter Programmierung des VIC.

(Bei den hier abgebildeten Seiten handelt es sich um die Vorschau in der c’t-App.)

Repairing a Commodore 1084S Power Switch

The power switch of the Commodore 1084S tends to break easily. To replace it, you will need a standard “Preh TV3” switch (which can be found online), a screwdriver, a soldering iron, desoldering braid, and some basic soldering skills.

At the back, there are four screws in the corners:

Three more screws at the back panel:

And four at the bottom:

When lifting the back piece, be careful not to rip off the speaker wires. You only need to lift it a bit, and you can have it rest on the back panel. Do not kill yourself touching any of the high voltage parts!

The switch is located at the top left of the board and has six solder points.

This is what your new switch should look like:

Take some desoldering braid and loosen it a little…

…so you can put a pin of the switch through it. Then heat it with the soldering iron.

After desoldering all six solder points, it should look like this:

You should now be able to remove the old switch.

Add the plastic button to the new switch, and if it has two extra legs in the middle, bend them down a little.

Put the new switch in place and make sure all six pins go through the holes.

Solder all the pins!

When putting the monitor back together, you have to make sure to fit the two plastic “rails” so that they hold the board and their pins go through the holes in the bottom part.

When sliding the back part back on, you might need a second person holding the rails.

Now put the screws back in, and you’re done!

Why does PETSCII have upper case and lower case reversed?

The PETSCII character encoding that is used on the Commodore 64 (and all other Commodore 8 bit computers) is similar to ASCII, but different: Uppercase and lowercase are swapped! Why is this?

The "PET 2001" from 1977 had a built-in character set of 128 characters. This would have been enough for the 96 printable ASCII characters, and an extra 32 graphical characters. But Commodore decided to replace the 26 lowercase characters with even more graphical characters. After all, the PET did not support a bitmapped display, so the only way to display graphics was using the graphical characters built into the character set. These consisted of symbols useful for box drawing as well as miscellaneous symbols (including the french deck suits).

So the first PET was basically using ASCII, but with missing lower case. This had an influence on the character codes produced by the keyboard:

Key ASCII keyboard PET keyboard
A a ($61) A ($41)
Shift + A A ($41) ♠ ($C1)

On a standard ASCII system, pressing "A" unshifted produces a lower case "a", and together with shift, it produces an uppercase "A". On a PET, because there is no lower case, pressing "A" unshifted produces an uppercase "A", and together with shift produces the "spade" symbol ("♠") from the PET-specific graphical characters.

Later Commodore 8 bit computers added a second character set that supported uppercase and lowercase. Commodore decided to allow the user to switch between the two character sets at any time (by pressing the "Commodore" and "Shift" keys together), so applications generally didn't know which character set was active. Therefore, the keyboard had to produce the same character code independent of the current character set:

key ASCII keyboard PET keyboard (upper/graph) PET keyboard (upper/lower)
A a ($61) A ($41) a ($41)
Shift + A A ($41) ♠ ($C1) A ($C1)

An unshifted "A" still produces a code of $41, but it has to be displayed as a lower case "a" if the upper/lower character set is enabled, so Commodore had to put lower case characters at the $40-$5F area – which in ASCII are occupied by the uppercase characters. A shifted "A" still produces a code of $C1, so Commodore put the uppercase characters into the $C0-$DF area.

Now that both upper case and lower case were supported, Commodore decided to map the previously undefined $60-$7F area to upper case as well:

range ASCII upper case PETSCII lower case PETSCII lower case PETSCII with fallback
$40-$5F upper case upper case lower case lower case
$60-$7F lower case undefined undefined upper case
$C0-$DF undefined graphical upper case upper case

If you only look at the area between $00 and $7F, you can see that PETSCII reverses upper and lower case compared to ASCII, but the $60-$7F area is only a compatibility fallback; upper case is actually in the $C0-$DF area – it's what the keyboard driver produces.

xhyve – Lightweight Virtualization on OS X Based on bhyve

The Hypervisor.framework user mode virtualization API introduced in Mac OS X 10.10 (Yosemite) cannot only be used for toy projects like the hvdos DOS Emulator, but is full-featured enough to support a full virtualization solution that can for example run Linux.

xhyve is a lightweight virtualization solution for OS X that is capable of running Linux. It is a port of FreeBSD’s bhyve, a KVM+QEMU alternative written by Peter Grehan and Neel Natu.

  • super lightweight, only 230 KB in size
  • completely standalone, no dependencies
  • the only BSD-licensed virtualizer on OS X
  • does not require a kernel extension (bhyve’s kernel code was ported to user mode code calling into Hypervisor.framework)
  • multi-CPU support
  • networking support
  • can run off-the-shelf Linux distributions (and could be extended to run other operating systems)

xhyve may make a good solution for running Docker on your Mac, for instance.

Running Tiny Core Linux on xhyve

The xhyve repository already contains a small Linux system for testing, so you can try out xhyve by just typing these few lines:

$ git clone
$ cd xhyve
$ make
$ ./

And you will see Tiny Core Linux booting in your terminal window:

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Initializing cgroup subsys cpuacct
Linux version 3.16.6-tinycore64 (tc@box) (gcc version 4.9.1 (GCC) ) #777 SMP Thu Oct 16 10:21:00 UTC 2014
Command line: earlyprintk=serial console=ttyS0 acpi=off
e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
NX (Execute Disable) protection: active
SMBIOS 2.6 present.
AGP: No AGP bridge found
e820: last_pfn = 0x40000 max_arch_pfn = 0x400000000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
CPU MTRRs all blank - virtualized system.


 //\   Core is distributed with ABSOLUTELY NO WARRANTY.


To shut down the VM and exit to the Mac’s command line, enter:

$ sudo halt

Running Ubuntu on xhyve

You can also install a more complete Linux distribution on xhyve. The tricky bit is that xhyve doesn’t come with a BIOS or EFI booter, so it is necessary to extract the kernel and initrd from the Linux image and pass them to xhyve manually.

First download Ubuntu Server (the desktop version doesn’t support the text mode installer) into the directory “ubuntu” inside the “xhyve” directory:

$ ls -l
total 1218560
-rw-r--r--@ 1 mist  staff  623902720  6 Jun 22:14 ubuntu-14.04.2-server-amd64.iso

We need to extract the kernel and initrd, which is a little tricky, because OS X doesn’t recognize the hybrid file system on the image without a little hack:

$ dd if=/dev/zero bs=2k count=1 of=/tmp/tmp.iso
$ dd if=ubuntu-14.04.2-server-amd64.iso bs=2k skip=1 >> /tmp/tmp.iso
$ hdiutil attach /tmp/tmp.iso
$ cp /Volumes/Ubuntu-Server\ 14/install/vmlinuz .
$ cp /Volumes/Ubuntu-Server\ 14/install/initrd.gz .

Create a virtual hard disk image (8 GB in the example):

$ dd if=/dev/zero of=hdd.img bs=1g count=8

Then create a script to run xhyve with the correct arguments for the installer:


CMDLINE="earlyprintk=serial console=ttyS0 acpi=off"

MEM="-m 1G"
#SMP="-c 2"
NET="-s 2:0,virtio-net"
IMG_CD="-s 3,ahci-cd,ubuntu/ubuntu-14.04.2-server-amd64.iso"
IMG_HDD="-s 4,virtio-blk,ubuntu/hdd.img"
PCI_DEV="-s 0:0,hostbridge -s 31,lpc"
LPC_DEV="-l com1,stdio"


You will want networking enabled, so it’s easiest to run the script as root (this requirement is lifted if you codesign the binary):

$ sudo ./

You will see the Ubuntu text mode installer:

  ┌───────────────────────┤ [!!] Select a language ├────────────────────────┐
  │                                                                         │
  │ Choose the language to be used for the installation process. The        │
  │ selected language will also be the default language for the installed   │
  │ system.                                                                 │
  │                                                                         │
  │ Language:                                                               │
  │                                                                         │
  │                               C                                         │
  │                               English                                   │
  │                                                                         │
  │     <Go Back>                                                           │
  │                                                                         │

<Tab> moves; <Space> selects; <Enter> activates buttons

All answers should be straightforward, and the defaults are usually fine. Make sure to select “Yes” when asked “Install the GRUB boot loader to the master boot record”.

At the very end, on the “Installation complete” screen, select “Go back” and “Execute a shell”, so you can copy the installed kernel and initrd to the Mac side. In the VM, type this:

# cd /target
# sbin/ifconfig
# tar c boot | nc -l -p 1234

On the Mac, type this, replacing the IP with the output from ifconfig before:

$ cd ubuntu
$ nc 1234 | tar x

In the VM, exit the shell:

# exit

Then select “Finish the installation”.

To run the Ubuntu installation from the virtual hard disk, create the following script, fixing up the kernel and initrd version numbers:


CMDLINE="earlyprintk=serial console=ttyS0 acpi=off root=/dev/vda1 ro"

MEM="-m 1G"
#SMP="-c 2"
NET="-s 2:0,virtio-net"
IMG_HDD="-s 4,virtio-blk,ubuntu/hdd.img"
PCI_DEV="-s 0:0,hostbridge -s 31,lpc"
LPC_DEV="-l com1,stdio"


Then run the script:

$ sudo ./

To make your Linux installation useful, you may want to install an SSH server:

$ sudo apt-get install openssh-server

Or install a full UI that you can access using VNC:

$ sudo apt-get install xubuntu-desktop vnc4server

Then run the VNC server:

$ vnc4server :0 -geometry 1024x768

And conntect to it by pasting this into Finder’s Cmd+K “Connect to Server” dialog:


If you also follow skerit’s Ubuntu VNC tutorial, you’ll get to a desktop like this.

Next Steps

xhyve is very basic and lightweight, but it has a lot of potential. If you are a developer, you are welcome to contribute to it. A list of current TODOs and ideas is part of the README file in the repository.

Using the OS X 10.10 Hypervisor Framework: A Simple DOS Emulator

Since Version 10.10 (Yosemite), OS X contains Hypervisor.framework, which provides a thin user mode abstraction of the Intel VT features. It enables apps to use virtualization without the need of a kernel extension (KEXT) – which makes them compatible with the OS X App Store guidelines.

The idea is that the OS takes care of memory management (including nested paging) as well as scheduling virtual CPUs like normal threads. All we have to do is create a virtual CPU (or more!), set up all its state, assign it some memory, and run it… and then handle all “VM exits” – Intel lingo for hypervisor traps.

There is no real documentation, but the headers contain a decent amount of information. Here are some declarations from Hypervisor/hv.h:


 * @function   hv_vm_create

 * @abstract   Creates a VM instance for the current task

 * @param      flags  RESERVED

 * @result     0 on success or error code


extern hv_return_t hv_vm_create(hv_vm_options_t flags) __HV_10_10;


 * @function   hv_vm_map

 * @abstract   Maps a region in the virtual address space of the current task

 *             into the guest physical address space of the VM

 * @param      uva    Page aligned virtual address in the current task

 * @param      gpa    Page aligned address in the guest physical address space

 * @param      size   Size in bytes of the region to be mapped

 * @param      flags  READ, WRITE and EXECUTE permissions of the region

 * @result     0 on success or error code


extern hv_return_t hv_vm_map(hv_uvaddr_t uva, hv_gpaddr_t gpa, size_t size,

hv_memory_flags_t flags) __HV_10_10;


 * @function   hv_vcpu_create

 * @abstract   Creates a vCPU instance for the current thread

 * @param      vcpu   Pointer to the vCPU ID (written on success)

 * @param      flags  RESERVED

 * @result     0 on success or error code


extern hv_return_t hv_vcpu_create(hv_vcpuid_t *vcpu,

hv_vcpu_options_t flags) __HV_10_10;


 * @function   hv_vcpu_run

 * @abstract   Executes a vCPU

 * @param      vcpu  vCPU ID

 * @result     0 on success or error code

 * @discussion

 *             Call blocks until the next VMEXIT of the vCPU


 *             Must be called by the owning thread


extern hv_return_t hv_vcpu_run(hv_vcpuid_t vcpu) __HV_10_10;

So let’s create a virtual machine that runs simple DOS applications in 16 bit real mode, and trap all “int” DOS system calls – similar to DOSBox.

First, we need to create a VM:


This creates a VM for the current Mach task (i.e. UNIX process). It’s implicit, so it doesn’t return anything. Then we allocate some memory and assign it to the VM:

#define VM_MEM_SIZE (1 * 1024 * 1024)
void *vm_mem = valloc(VM_MEM_SIZE);
hv_vm_map(vm_mem, 0, VM_MEM_SIZE, HV_MEMORY_READ | 
                                  HV_MEMORY_WRITE | 

And we need to create a virtual CPU:

hv_vcpuid_t vcpu;
hv_vcpu_create(&vcpu, HV_VCPU_DEFAULT);

Now comes the annoying part: Set up the CPU state. If the state is illegal or inconsistent, the CPU will refuse to run. You will need to refer to the Intel Manual 3C for all the context. Luckily, most virtual machines start from 16 bit real mode, and mode changes will be done by the boot loader or operating system inside the VM, so you won’t have to worry about setting up any other state than real mode state. Real mode state setup looks something like this:

hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CS_SELECTOR, 0);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CS_LIMIT, 0xffff);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CS_ACCESS_RIGHTS, 0x9b);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CS_BASE, 0);

hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_DS_SELECTOR, 0);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_DS_LIMIT, 0xffff);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_DS_ACCESS_RIGHTS, 0x93);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_DS_BASE, 0);


hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CR0, 0x20);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CR3, 0x0);
hv_vmx_vcpu_write_vmcs(vcpu, VMCS_GUEST_CR4, 0x2000);

After that, we should populate RAM with the code we want to execute:

FILE *f = fopen(argv[1], "r");
fread((char *)vm_mem + 0x100, 1, 64 * 1024, f);

…and assign the GPRs the proper initial state – including the instruction pointer, which will point to the code:

hv_vcpu_write_register(vcpu, HV_X86_RIP, 0x100);
hv_vcpu_write_register(vcpu, HV_X86_RFLAGS, 0x2);
hv_vcpu_write_register(vcpu, HV_X86_RSP, 0x0);

The virtual CPU is fully set up, we can now run it!


This call runs the virtual CPU (while blocking the calling thread) until its time slice expires or a “VM exit” happens. A VM exit is a hypervisor-class exception, i.e. an event in the VM that the hypervisor wants to trap. We can trap events like exceptions, certain privileged instructions (CPUID, HLT, RDTSC, RDMSR, …) and control register (CR0, CR2, CR3, CR4, …) accesses.

After hv_vcpu_run() returns, we need to read the exit reason and act upon it, and run the virtual CPU again. Here is a minimal loop to handle VM exits:

for (;;) {

	uint64_t exit_reason = hv_vmx_vcpu_read_vmcs(vcpu, VMCS_EXIT_REASON);

	switch (exit_reason) {

EXIT_REASON_EXT_INTR is caused by host interrupts (usually it means that the time slice is up), so we will just ignore it. EXIT_REASON_EPT_FAULT happens every time the guest accesses a page for the first time, or when the guest accesses an unmapped page – this way we can emulate MMIO. In our case, we can also ignore those.

For emulating DOS, we are catching EXIT_REASON_EXCEPTION, which is caused by the int instruction (if caught). We can get the number of the interrupt from the virtual CPU state without decoding instructions:

uint8_t interrupt_number = hv_vmx_vcpu_read_vmcs(vcpu, VMCS_IDT_VECTORING_INFO) & 0xFF;

…and emulate the system call. We can read and write GPRs using the hv_vcpu_read_register() and hv_vcpu_write_register() calls.

hvdos – a simple DOS Emulator for OS X

The full source of hvdos, a simple DOS emulator using the OS X Hypervisor framework, is available at

It contains an adapted version of the libcpu DOS system call library and manages to run (parts of) some .COM files. A good demo is the ZIP decompression tool.

Creating your own Hypervisor

hvdos can serve as a template for your own Hypervisor.framework experiments. It contains wrapper functions for error handling, a header that defines all Intel VT constants (taken from FreeBSD), complete 16 bit real mode initialization, as well as a few helper functions to set up the fields VMCS_PIN_BASED_CTLS, VMCS_PRI_PROC_BASED_CTLS, VMCS_SEC_PROC_BASED_CTLS and VMCS_ENTRY_CTLS properly. These are needed to define, among other things, which events cause VM exits.

You can easily add more CPUs by creating one POSIX thread per virtual CPU. For every thread, you create a virtual CPU and run a VM exit main loop.

You can for example start writing an IBM PC emulator by running Bochs BIOS and trapping I/O accesses, or running MS-DOS without BIOS by trapping BIOS int calls.

Or you could bridge an existing open source solution (QEMU, QEMU+KVM, VirtualBox, DOSBox, …) to use Hypervisor.framework…