Author Archives: Michael Steil

Commodore Peripheral Bus: Overview

The well-known Serial Bus (aka Serial “IEC” Bus) of the Commodore 64 that connects to disk drives such as the 1541 is just one variant of a whole family of busses and protocols used by the line of 8 bit Commodore machines from the PET to the C65. This is the first article of a multi-part series on the Commodore Peripheral Bus family.

The following figure compares the different protocol stacks. There are three different connectors with their unique byte transfer protocols: IEEE-488, Serial, and TCBM. Fast Serial and JiffyDOS are optimized, but backwards-compatible protocols for existing cables, and CBDOS integrates the drive directly into the computer. The higher level protocols are the same across all Commodore 8 bit machines, including the “KERNAL” operating system APIs.

These are the properties and tradeoffs of the different variants of the protocol stack1:

IEEE-488 Serial Fast Serial JiffyDOS TCBM CBDOS
Data Wires 13 3 4 3 12
Speed (KB/sec) 2.1 0.4 2.1 2.1 2.4
Controller Code (bytes) 334 434 708 739 262 0
Comments compatible with industry standard very slow requires decidated bit shifting hardware point-to-point drive integrated into computer

All variants are based on the IEEE-488 standard and therefore share (mostly) the same basic architecture:
* All participants are daisy-chained.
* One dedicated controller (the computer) does bus arbitration of up to 31 devices.
* One-to-many: Any participant can send data to any set of participants.
* A device has multiple channels for different functions.
* Data transmission is byte stream based.

The different variants and layers will be described in multiple articles.

NOTE: I am going to release one part every week from now, at which time links will be added to the bullet points below. The articles will also be announced on my Twitter account @pagetable and my Mastodon account

  • Part 0: Overview and Introduction
    That’s this part.
  • Part 1: IEEE-488 [PET/CBM Series; 1977]
    This part covers layers 1 (electrical) and 2 (byte transfer) of IEEE-488, an 8-bit parallel bus with three handshake lines, an ATN line for bus arbitration and very relaxed timing requirements.
  • Part 2: The TALK/LISTEN Layer
    This part talks about layer 3 (TALK/LISTEN), which is shared between all bus variants.
  • Part 3: The Commodore DOS Layer
    This part describes layer 4 (Commodore DOS), which is shared between all bus variants.
  • Part 4: Standard Serial (IEC) [VIC-20, C64; 1981]
    The VIC-20 introduced a serial version of layers 1 and 2 with one clock and one data line for serial data transmission, and an ATN line for bus arbitration. It has some strict timing requirements. This bus is supported by all members of the home computer line: VIC-20, C64, Plus/4 Series, C128 and C65.
  • Part 5: TCBM [C16, C116, Plus/4; 1984]
    The Plus/4 Series introduced a 1-to-1 bus between the computer and one drive, with 8 bit parallel data, two handshake lines, and two status lines from the drive to the computer. It was the short-lived planned successor of the Standard Serial bus, but was then replaced by Fast Serial.
  • Part 6: JiffyDOS [1985]
    JiffyDOS, a 3rd party ROM patch for computers and drives, replaces layer 2 byte transmission of Standard Serial by using the clock and data lines in a more efficient way. Bus arbitration is unchanged. The controller detects a device’s JiffyDOS support and can fall back to the Standard Serial protocol.
  • Part 7: Fast Serial [C128; 1986]
    The C128 introduced Fast Serial, which replaces layer 2 byte transmission of Standard Serial by using a previously unused wire in the Serial connector as a third line for data transmission. Bus arbitration is unchanged. The controller detects a device’s Fast Serial support and can fall back to the Standard Serial protocol.
  • Part 8: CBDOS [C65; 1991]
    The unreleased C65 added CBDOS (“computer-based DOS”) by integrating one or more drive controllers into the computer. There are no layers 1 and 2, and layer 3 sits directly on top of function calls that call into the DOS code running on the same CPU.

  1. The speeds have been measured by repeatedly reading the status channel of a disk drive. IEEE-488, Serial and JiffyDOS were measured on a 1 MHz C64 and Fast Serial on a C128, which executes all (Fast) Serial code in 1 MHz mode. TCBM was measured on a 1.77 MHz Plus/4 with the screen on, which makes the effective CPU speed similar to the C64. This is the code: a9 00 20 bd ff a9 01 a2 08 a0 0f 20 ba ff 20 c0 ff a9 08 20 b4 ff a9 6f 20 96 ff a2 00 20 a5 ff 9d 00 04 e8 d0 f7 60. Both Fast Serial and JiffyDOS can reach higher speeds in the special case of loading files using custom protocols. Controller code size was measured on CBM2 for IEEE-488, on C64 for Serial and JiffyDOS, and on C128 for Fast Serial. Code sizes are approximate and do not include the LOAD and SAVE code.

Archiving C64 Tapes Correctly

It’s pretty simple to archive Commodore 64 tapes, but it’s hard if you want to do it right. Creating the complete archive of the German “INPUT 64” magazine was not as easy as getting one copy of each of the 32 tapes and reading them. The tapes are over 30 years old by now, and many of them are hardly readable any more.

Good Tapes, Bad Tapes

Here is the overview of the tapes I have of each issue, how many I could read correctly, and what percentage that is:

Issue # Copies # Read OK % Read OK
8501 8 2 25%
8502 6 1 16.7%
8503 8 1 12.5%
8504 6 0 0%
8505 6 1 16.7%
8506 6 1 16.7%
8507 6 0 0%
8508 6 1 16.7%
8509 6 5 83.3%
8510 6 0 0%
8511 6 3 50%
8512 10 3 33.3%
8601 4 3 75%
8602 4 2 50%
8603 4 3 75%
8604 4 4 100%
8605 6 3 50%
8606 4 2 50%
8607 4 4 100%
8608 4 3 75%
8609 6 4 66.7%
8610 6 5 83.3%
8611 2 1 50%
8612 2 1 50%
8701 2 1 50%
8702 2 2 100%
8703 2 2 100%
8704 2 2 100%
8705 2 1 50%
8706 4 1 25%
8707 6 0 0%
8708 4 2 50%

As you can see, some tapes are pretty much unreadable these days, others are okay. For issues that were problematic, I kept buying more copies on eBay, hoping to eventually find one that reads correctly. (Interestingly, the distribution suggests that correlates more with the brand of tape (8707 seems to be a very bad one) than with age.)

By the way, all numbers of copies are divisible by two, because all INPUT 64 tapes have an identical copy of the data on the reverse side. So for issue 8702, for example, I only had a single tape, and both sides read correctly.

How to Dump Tapes

There are many ways to dump C64 tapes, and I’ll describe two.

If you have a C64 and an actual Datasette reader, you can use the 1541 Ultimate-II+ cartridge. The included dongle allows you to connect the Datasette to the cartridge, so you can directly record a tape onto a .TAP file on a USB storage device also connected to the cartridge.

If you are doing this, make sure that your Datasette has a clean head and is correctly aligned. HeadAlign by enthusi is very useful for this. If you have multiple Datasette recorders, try them all and start out with the best one before changing the alignment.

Another way is to record the tape into a WAV file using a regular tape recorder. Online shops are full of very cheap (but good) Walkman-like devices with a built-in analog-digital-converter that connect directly to USB and show up as an audio-in device. Using a tool like Audacity and one of the many WAV-to-TAP tools (my favorite is the ancient “tape64”, whose Windows binary works nicely with Wine), you can then convert it into a TAP file.

The advantage of this method is that you can still massage dumps of tapes that you have trouble with. In Audacity, you can use filters or split the two channels, and in conversion tools, you can adjust the threshold or the speed.

Interestingly, some tapes read well on a tape player, others read well on a Datasette. If you have trouble getting a correct dump and few copies, you might want to try different methods of reading the tapes.


How do I know whether a tape has in fact been read correctly? Practically all recording formats use checksums, so a tool like the excellent tapclean can help:

$ tapclean -t 8702a.tap 

TAPClean 0.34 - (C) 2006-2017 TC Team [Built Jun 12 2017 by ldf]
Based on Final TAP 2.76 Console - (C) 2001-2006 Subchrist Software

Read tolerance = 10

Computer type: C64 PAL (985248 Hz)

Loaded: 8702a.tap

Scanning...  Pauses  C64 ROM tape

TAPClean version: 0.34


TAP Name    : 8702a.tap
TAP Size    : 1221451 bytes (1192 kB)
TAP Version : 1
Recognized  : 94%
Data Files  : 28
Pauses      : 186
Gaps        : 204
Magic CRC32 : A9F18B1C
TAP Time    : 8:44.16
Bootable    : YES (1 part, name: INPUT 64)
Loader ID   : n/a

Overall Result    : FAIL

Header test       : PASS [Sig: OK] [Ver: OK] [Siz: OK]
Recognition test  : FAIL [1153201 of 1221431 bytes accounted for] [94%]
Checksum test     : PASS [28 of 28 checksummed files OK]
Read test         : PASS [0 Errors]
Optimization test : FAIL [0 of 28 files OK]

Saved: tcreport.txt
Operation completed in 00:00:09.

The tool recognized 28 pieces of data on the tape, and all of them had correct checksums. That’s a good sign, but not good enough. And that’s not just because single-byte checksums might be too weak to rely on.


It is possible that some pieces of data did not get recognized. If a header is unreadable, tapclean will treat it as an unrecognized area and warn about it. In the printout above, the tape failed the “recognition test”, because it only understood 94% of the tape (which includes silence). What are the other 6%?

Tapes often contain some garbage, and many INPUT 64 tapes have a minute of beeps at the end that don’t seem to encode any data. So these 6% may not be a problem.

Another data point on whether all data objects got recognized is by extracting everything and having a look at the listing:

$ tapclean -t 8702a.tap -doprg
$ ls prg/
007 (033C-03FB) [INPUT_64].prg
008 (033C-03FB) [INPUT_64].prg
010 (0318-09FF).prg
011 (0318-09FF).prg
013 (033C-0354) [CTEXTE].prg
017 (3000-3FBA).prg
021 (033C-0354) [RAHMEN_______COM].prg
025 (C440-CFF4).prg
029 (033C-0354) [TITELBILD].prg
033 (0801-1891).prg
037 (033C-0354) [l_O_H_N_S_T_E_U].prg
041 (0801-6BCE).prg
045 (033C-0354) [j_u_l_i_a].prg
049 (0801-4B99).prg
052 (033C-0354) [i_d___w_E_R_K_S].prg
056 (0801-3F16).prg
060 (033C-0354) [6_4_E_R__t_I_P_S].prg
064 (0801-59CE).prg
068 (033C-0354) [i_n_p_u_t___c_a].prg
072 (0801-2E7C).prg
076 (033C-0354) [l_A_B_E_L___t_O].prg
080 (0801-2AEC).prg
084 (033C-0354) [d_R_E_I___M_A_L].prg
088 (0801-5538).prg
091 (033C-0354) [eNGLISCHE_gramMA].prg
095 (0801-5B22).prg
098 (033C-0354) [v_O_R_S_C_H_A_U].prg
102 (0801-2A94).prg

Commercial Commodore 64 tapes usually don’t use the original encoding as supported by the C64’s operating system. More optimized schemes are often 5x-10x more efficient. Nevertheless, the first program on tape has to use the original encoding, so that the tape is bootable.

This tape contains a bootable program named “INPUT 64” at the beginning. The Commodore encoding saves the 192 bytes header (file name, type etc.) twice (007, 008), followed by the data, which is also saved twice (010, 011). On this tape, everything after this is in “SUPERTAPE” format. Every SUPERTAPE file consists of a single header (e.g. 013 for “CTEXTE”) and a single copy of the data (e.g. 017).

The printout above shows that after the boot program (2x header, 2x payload), there are 12 additional files. For each file, there is a header and there is payload. So this looks okay. Here is an example of a missing object:

$ tapclean -t 8707a.tap -doprg
Checksum test     : PASS [29 of 29 checksummed files OK]
$ ls prg/
066 (033C-03FB) [INPUT_64].prg
067 (033C-03FB) [INPUT_64].prg
069 (0318-09FF).prg
070 (0318-09FF).prg
072 (033C-0354) [CTEXTE].prg
076 (3000-4093).prg
080 (033C-0354) [RAHMEN_______COM].prg
084 (C440-CFF2).prg
088 (033C-0354) [TITELBILD].prg
092 (0801-1891).prg
096 (033C-0354) [i_n_p_u_t___w_I].prg
100 (0801-4A1D).prg
103 (033C-0354) [eNGLISCHE_gramMA].prg
107 (0801-5E09).prg
111 (033C-0354) [s_P_I_D_E_R].prg
115 (0801-471B).prg
118 (033C-0354) [a_S_S_E_M_B_L_E].prg
128 (033C-0354) [i_d___w_E_R_K_S].prg
132 (0801-3E4D).prg
135 (033C-0354) [i_c_i].prg
139 (0801-2DB0).prg
143 (033C-0354) [6_4_E_R__t_I_P_S].prg
147 (0801-4911).prg
151 (033C-0354) [p_I_N_G___p_O_N].prg
155 (0801-247F).prg
159 (033C-0354) [r___T_S_E_L_E_C].prg
163 (0801-22A5).prg
167 (033C-0354) [v_O_R_S_C_H_A_U].prg
171 (0801-0E89).prg

The checksum test passed, but the payload for the file with header 118 (“a_S_S_E_M_B_L_E”) is missing. In this case, it’s clear, but if two entries in sequence had been missing, it would have been tricky to detect this.

Comparing Multiple Copies

With just a single copy, we cannot really know whether the data is correct. Checksums are not reliable enough, and it’s hard to detect if a file was just not recognized. If we have two copies of the tape and they read the same, we can be very confident that the dumps are correct. tapclean’s “magic CRC32” that it prints after analyzing a tape is a strong checksum of all recognized data concatenated. So if two dumped copies have the same CRC32, we can assume that the dumps were correct.

It is quite unlikely that two dumps have the same file(s) missing, and it’s extremely unlikely that they had the same bit flips. Well, that is, if we assume that the tapes did not have mastering errors (or weaknesses), but also verifying the checksums and visual inspection of the contents should rule this out.

The following bash script shows the CRC32 values (e.g. “3D3C8936”), the number of correctly checksummed files and the number of total files (e.g. 30-31) for each tape:

for i in `find . -name \*.tap`; do
  echo $i
  (tapclean -t $i > $i.log)&
for i in `find . -name \*.log`; do
  issue=$(basename $i | cut -c 1-4)
  numfiles=$(cat $i | grep "^Checksum test" | cut -d "[" -f 2 | cut -d " " -f 1-3 | sed -e "s/ of /-/")
  crc32=$(cat $i | grep "^Magic CRC32" | cut -d ":" -f 2)
  echo $issue $crc32 $numfiles $(echo $i | sed -e "s/.log$//")

Here’s example output:

8708 00000000 0-0   ./8708a2.tap
8708 16979EE4 32-32 ./8708b2.tap
8708 16979EE4 32-32 ./8708b.tap
8708 3D3C8936 30-31 ./8708a.tap

8708b.tap and 8708b2.tap (the back sides of two different copies) are correct dumps: They have the same CRC32, they have all correct checksums, and the number of recognized items is even (header, data, header, data, …).

More Copies

Using this strategy, I was able to confirm correct dumps of 18 of the 32 tapes. 14 more to go.

The obvious way is to buy more copies on eBay until I have two that produce the same data. But luckily, there was another source: The C64 TOSEC Collection contains dumps of 20 of the tapes. Their time stamp says 1996, which is when then tapes were 10 years old instead of 30, so they should have been in much better shape. By looking at the checkums and the contents, we can get an idea of whether they are likely correct dumps. This is what the script from before says:

8508 8F9A4357 34-34 ./tosec/8508.tap
8511 F75CC32E 34-34 ./tosec/8511.tap
8605 200E286D 30-30 ./tosec/8605.tap
8510 D0534C58 34-34 ./tosec/8510.tap
8509 74B093F1 32-32 ./tosec/8509.tap
8606 DA4AF054 29-29 ./tosec/8606.tap
8502 8BDAF783 40-40 ./tosec/8502.tap
8512 764F2F59 31-31 ./tosec/8512.tap
8705 A103816A 30-30 ./tosec/8705.tap
8704 E84CC753 30-30 ./tosec/8704.tap
8602 B4F9D173 32-32 ./tosec/8602.tap
8612 547A7371 32-32 ./tosec/8612.tap
8506 78BC0D59 35-35 ./tosec/8506.tap
8603 E70B3E72 30-30 ./tosec/8603.tap
8507 22A807D1 32-32 ./tosec/8507.tap
8608 F9BD6BC9 34-34 ./tosec/8608.tap
8505 9C7FC3F2 35-35 ./tosec/8505.tap
8601 3F749A69 30-30 ./tosec/8601.tap
8611 48C4C35F 28-28 ./tosec/8611.tap
8703 C6757D2F 29-30 ./tosec/8703.tap

At least 8703 has an incorrect checksum. Several tapes have an odd number of recognized items, but sometimes, tapclean doesn’t seem to count correctly, so extracting all items (“-doprg”) and counting them makes sure what the right number is.

If we add these 20 copies to our collection and run the script again, we will be able to verify several more correct dumps.

8502 B76D97FE 35-39 ./1/8502a.tap
8502 8BDAF783 40-40 ./1/8502b.tap
8502 00000000 0-0   ./3/8502a.tap
8502 11D3CFC5 10-20 ./3/8502b.tap
8502 A7A10C6E 0-3   ./4/8502a.tap
8502 FC899939 0-0   ./4/8502b.tap
8502 8BDAF783 40-40 ./tosec/8502.tap

In this example, 1/8502b.tap and tosec/8502.tap contain the same data, so this verifies that these dumps are correct. Overall, this allowed me to verify the correctness of dumps of 8502, 8505, 8506, 8508, 8611, 8612 and 8705. Seven down, seven more to go!

Splice Verify

There are many issues where one dump looks okay, but we do not have an identical one to verify it. But it is not strictly necessary to have another identical dump. If we are certain that no files are missing, and that for every file, there is another dump with the identical file, we can be very certain that the dump is correct. I call this “splice verify”.

Let’s look at 8706:

8706 E6E7FCDA 28-28 ./1/8706a.tap
8706 8F259FA2 9-18  ./1/8706b.tap
8706 3BDC28F3 23-28 ./2/8706a.tap
8706 1B471853 8-17  ./2/8706b.tap

1/8706a.tap is a candidate for a correct dump. We can test whether there are identical copies of each item on the tape on other copies by hashing all items on the candidate and looking for them on the other copies. The following script will extract all tapes into subdirectories of “dir”:

rm -rf dir
mkdir dir
for i in `find . -name \*.tap | cut -c 3-`; do
    cd $(dirname $i)
    rm -rf $(basename $i)-$(dirname $i).dir
    mkdir $(basename $i)-$(dirname $i).dir
    cd $(basename $i)-$(dirname $i).dir
    tapclean -t ../$(basename $i) -doprg
    mv prg/* .
    rmdir prg
    cd ..
    mv $(basename $i)-$(dirname $i).dir ../dir/
  ) &

Then the following script will search for extra copies of every item on the candidate tape:

cd dir
for md5 in $(md5 -q 8706a.tap-1.dir/???\ *); do
  echo $md5
  md5 -r */* | grep $md5

And this script makes sure that every correctly read file on the other tapes also exists on the candidate:

for md5 in $(md5 -r */???\ * | grep -v 8706a.tap-1.dir | grep -v BAD.prg$ | cut -d " " -f 1 | sort | uniq); do
  echo $md5
  md5 -r 8706a.tap-1.dir/* | grep $md5

Using this method, it is possible to verify 8503, 8507, 8510, 8701 and 8706. Only two more tapes need to be verified!


We could keep buying more copies of 8504 and 8707 on eBay, but they don’t actually appear all that often. So let’s look at how bad our dumps are. Let’s take 8504:

12C4A3EA 30-30 ./8504a-1.tap
7C6D8541 34-35 ./8504b-1.tap
1337D0C8 20-25 ./8504a-2.tap
868DA7B6 38-38 ./8504b-2.tap
B80FD685 29-30 ./8504a-3.tap
8948C0B7 33-34 ./8504b-3.tap
C909C30F 26-27 ./8504a-4.tap
E3F7B698 8-24  ./8504b-4.tap

None of the dumps seems correct. 8504b-2.tap is the best, but the file count as printed by tapclean is wrong, and there is actually an item missing: The payload of “h_i_r_e_s_s_p_e”.

012 (033C-03FB) [INPUT_64].prg
013 (033C-03FB) [INPUT_64].prg
019 (0300-0303).prg
020 (0300-0303).prg
025 (033C-03FB) [GUTEN_TAG].prg
026 (033C-03FB) [GUTEN_TAG].prg
031 (0801-0C96).prg
032 (0801-0C96).prg
039 (033C-0354) [TEXTE].prg
046 (3000-3B76).prg
053 (033C-0354) [RAHMEN_______COM].prg
059 (C440-CF78).prg
066 (033C-0354) [TITELBILD].prg
073 (0A00-0AFF).prg
080 (033C-0354) [h_i_r_e_s_s_p_e].prg
093 (033C-0354) [k_a_l_e_n_d_e_r].prg
100 (0801-38A2).prg
106 (033C-0354) [r_e_v_e_r_s_i].prg
112 (0801-267D).prg
119 (033C-0354) [s_h_o_r_t_____s].prg
125 (0801-2075).prg
131 (033C-0354) [k_o_n_t_a_k_t_e].prg
137 (0801-57D4).prg
144 (033C-0354) [bits___bytes_im].prg
152 (0801-6A70).prg
159 (033C-0354) [einkommensteuert].prg
166 (0801-28CB).prg
172 (033C-0354) [h_i_l_f_s_p_r_o].prg
179 (0801-1BEC).prg
185 (033C-0354) [n_e_w_s].prg
191 (0801-2873).prg
198 (033C-0354) [6_4_E_R_____t_i].prg
205 (0801-4D89).prg
211 (033C-0354) [a_r_t_e_m___s].prg
217 (0801-2E30).prg
223 (033C-0354) [s_u_p_e_r_t_a_p].prg
230 (0801-1A4A).prg
236 (033C-0354) [l_a_s_t__n_o_t].prg
243 (0801-2106).prg

If it weren’t for the missing data, we could splice-verify this dump. Using the scripts from earlier, we can show that all files can be found on other copies, and all correct files from other copies are on this dump – except for the “h_i_r_e_s_s_p_e” payload.

Then why not splice in the correct object from a correct dump? 8504b-1.tap is one of the dumps that contains a correct copy. tapclean’s tcreport.txt file for the tape with the correct data shows us some information about the “h_i_r_e_s_s_p_e” payload:

Seq. no.: 113
Location: $3DFA2 -> $3E13B -> $4E25E -> $4E269
LA: $0801  EA: $3000  SZ: 10239
Pilot/Trailer Size: 62/0
Checkbyte Actual/Expected: $7DA9/$7DA9, PASS
Read Errors: 0
Unoptimized Pulses: 12338
CRC32: CBF1C874

It is located between offsets $3DFA2 and $4E269 inside the file. Here’s the part from the hexdump:

$ hexdump 8504a-1.tap
0003df70  21 43 21 21 21 21 21 21  1e 21 21 21 43 fb 00 6c
0003df80  09 00 00 0b 19 00 00 ff  47 00 00 e5 2b 00 00 d9
0003df90  12 00 00 fe 3d 04 32 32  32 21 1e 21 37 32 2f 21
0003dfa0  21 21 21 43 1e 32 21 21  1e 32 32 2f 21 21 21 21
0003dfb0  40 21 32 1e 21 21 32 32  32 21 1e 21 21 43 21 32

Every byte in a TAP file (after the 20 byte header) represents the length of a pulse in units of about 8 microseconds. Zero is an escape code, after which a three byte little endian value follows which specifies pulse lengths above 255. So the zeroes here just before what tapclean identified as the beginning of the payload are areas of silence on the tape. Grouped correctly, these are the 6 very long pulses, which represent a pause of about half a second:

00 6c 09 00
00 0b 19 00
00 ff 47 00
00 e5 2b 00
00 d9 12 00
00 fe 3d 04

Just after this is where we want to cut.

The cutting position of the end can be found similarly, and by looking at the tcreport.txt of the tape with the broken file, we can find out what part to cut and replace with the good version.

At the end, we need to make sure the number of data bytes in the TAP file’s header at offset 16 (32 bit little endian) is correct.

By doing this, we can create correct versions of the two remaining issues, so now we have a verified correct dump of each of the 32 issues!

The Moral of the Story

Read your tapes early. If you have any tapes, read them now! It doesn’t matter if they have been dumped before: These dumps might be incorrect, because they might not have not been verified with a second copy.

A Visual Defragmenter for the Commodore 64

For decades, PC users have been able to relax by watching the computer defragment a disk. Now C64 users can do the same! Introducing “defrag1541”, a disk defragmentation tool for C64 and 1541.

defrag1541 consists of 49 lines of BASIC, takes about 15-30 minutes to defragment a typical disk, and visualizes its work using colored blocks on the screen. The different colors symbolize the different files, and the character shapes show whether the block is at its original (round) or final (square) position. (Note that this tool is meant for visualization only, do not use it on real data!)

In the remainder of this article, I will describe how the program works.

The 1541 Disk Format

Commodore 1541 floppy disks use the CBM filesystem. A disk consists of 683 sectors of 256 bytes, which are addressed as a track/sector tuple. There are

  • 35 tracks, numbered 1-35
  • with 21 to 17 sectors each, numbered from 0

The number of sectors depends on the track, higher tracks (closer to the center) have fewer sectors:

Tracks Sectors
1-17 21
18-24 19
25-30 18
31-35 17

The following table visualizes this: The columns are the tracks, the lines are the sectors. The higher numbered tracks don’t have as many sectors.


Track 18 is special, it holds all the metadata. 18/0 contains the disk name and the Block Availability Map (BAM), which uses one bit per block to keep track of which blocks are allocated, and the remaining sectors contain 32 byte directory entries, so 8 directory entries per block. Every directory entry consists of a file type, a 16 byte file name, and a track/sector tuple for the first block of the file.

Every file is stored as a linked list. Every block holds the track/sector tuple of the next block in its first two bytes, followed by 254 data bytes. A track value of 0 indicated the last block of a file.

The Data Model

The easiest way to look at the problem of defragmenting a disk is to imagine all files on the disk as concatenated into one long file that needs to be defragmented. Now every used block on disk has a logical block index. Logical block 0 is the first block of the first file, block 1 is the second block of the first file – if the first file was two blocks or more. Otherwise it’s the first block of the second file, and so on.

So for every physical block, we need to know what logical block is stored there. Then we build the desired mapping: For every physical block, what logical block should be stored there. Our core defragmenter code then moves sectores around and updates links until the two mappings are identical.

The Data Structures

At the very beginning, we are defining the arrays. We do this early, so we can’t run into OUT OF MEMORY errors at runtime. In fact, this program occupies about 99% of the 38911 bytes available to BASIC on the C64.

10 dimds(17):dimn(143):dimcc(143):dimci(682):dimic(682):dimcj(682):dimjc(682)
15 dimtr(682):dimsc(682):dimaa(682):dimdd(682):dimoo(682):dimss(682):n$=chr$(0)

The following table contains a small overview of the arrays. We will talk about them more as they are used in the code.

ds(17) array of sector numbers of dentry chain
n(143) map dentry index -> number of blocks
cc(143) map dentry index -> color
ci(682) map physical block index -> logical block index – current
ic(682) inverse of ci()
cj(682) map physical block index -> logical block index – desired
jc(682) inverse of cj()
tr(682) map block index -> track
sc(682) map block index -> sector
aa(682) map block index -> screen ram location
dd(682) map logical block index -> dentry
oo(682) map logical block index -> offset
ss(682) stack

The UI

We create a border around a 35×21 character area on the screen, so that every block can be represented by a character.

20 a=1024:b=1060:poke646,15:v=53280:pokev,0:pokev+1,15:printchr$(147):pokev+1,0
25 pokea,112:pokea+22*40,109:fori=1to35:pokea+i,67:pokea+22*40+i,67:next
30 pokeb,110:fori=1to21:pokea+40*i,66:pokeb+40*i,66:next:pokeb+22*40,125

The array cc() will map a file index to a color. We cycle through the C64 palette skipping black, which is the background color.

35 j=1:fori=0to143:cc(i)=j:j=j+1:ifj=16thenj=1
40 next

Reading the Current State

As a first step, we need to find out where data is stored on disk, i.e. for every physical block, we need to know what logical block (or none) is currently stored there.

The following code iterates over all directory entries and follows the linked lists of all files, recording in the array ci() the logical block index for each physical block. The reverse mapping (at what physical location is the logical block stored?) is built in the array ic(). The array nb() collects the number of blocks each file occupies.

In addition, we are keeping track of the sectors on track 18 that are used for directory entries. We will need this information later to update the directory entries when blocks have moved.

Since we are using linear block indexes for everything, we need to convert the track/sector tuples that the filesystem works with. This is done by the t2 comparisons in the second half of the code block.

45 open1,8,15,"i":fori=0to682:ci(i)=-1:ic(i)=-1:next:s=1:open2,8,2,"#"
50 ds(di)=s:ci(357+s)=-2:di=di+1:print#1,"u1 2 0 18"s:get#2,t$,s$:t1=asc(t$+n$)
55 s1=asc(s$+n$):fori=0to255step32:print#1,"u1 2 0 18"s:poke1082+s*40,4
60 print#1,"b-p 2"i+2:get#2,f$:f=asc(f$+n$):iff<>129andf<>130goto105
65 get#2,t$,s$:t2=asc(t$+n$):s2=asc(s$+n$):ift2=0goto105
70 print#1,"u1 2 0"t2;s2:poke1064+t2+s2*40,81
75 poke55336+t2+s2*40,cc(ni):ift2<18thena=s2+(t2-1)*21:goto95
80 ift2<25thena=s2+(t2-18)*19+357:goto95
85 ift2<31thena=s2+(t2-25)*18+490:goto95
90 a=s2+(t2-31)*17+598
95 ci(a)=cd:ic(cd)=a:n(ni)=n(ni)+1
100 cd=cd+1:goto65
105 ni=ni+1:next:s=s1:on-(t1<>0)goto50:ci(357)=-2

The ci(357) and ci(357+s) assignments mark the directory sectors as occupied, so that the defrag algorithm can’t use them as temporary storage.

As this code is walking the linked lists, it shows them on the screen in the form of a bullet, in a color that represents the file index.

Helper Data Structures

We have to do some programmatic work, which will take about a minute, so we draw progress bars. For this, we first draw “[” and “]” characters in an empty line at the bottom of the screen.

110 poke1944,27:poke1980,29

When updating links during defragmentation, we will need to convert linear block indexes back into track/sector tuples, so we create the two arrays tr() and sc() for a quick lookup. In addition, in aa() we calculate the screen position for the character that corresponds to the block.

115 i=0:fort=1to35:readns:fors=0tons:tr(i)=t:sc(i)=s:aa(i)=1064+t+s*40:i=i+1
120 next:poke1944+t,46:next:data20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20
125 data20,18,18,18,18,18,18,18,17,17,17,17,17,17,16,16,16,16,16

Later, we will also need to know for every logical block index what file it belongs to and which index it has within the file. This will be stored in dd() and oo().

130 ford=0to143:ifn(d)<>0theno=0:forj=1ton(d):dd(l)=d:oo(l)=o:l=l+1:o=o+1:next
135 poke1945+int(d/4.2),81:next

Desired Allocation

We have to build a desired allocation array, which has the same format as the current allocation array, just with the data about where we would like the blocks to be. The array cj maps physical blocks to logical blocks, and jc() is the reverse of this.

For simplicity, we are just filling the disk linearly, starting with track 1, and not using an interleave. This doesn’t make the defragmenter very useful, because the disk might actually read more slowly after this, but it makes for a very clean visualization!

140 fori=0to682:cj(i)=-1:jc(i)=-1:next:c=0:b=0:fori=0to143:ifn(i)=0goto155
145 forj=1ton(i):ifc>=357andc<376thenc=376
150 cj(c)=b:jc(b)=c:c=c+1:b=b+1:next
155 poke1945+int(i/4.2),160:next:fori=357to375:cj(i)=ci(i):next

The code could be easily extended to use a more reasonable strategy. The 1541 operating system for example picks the empty block closest to track 18 as the first block for a new file, then stays on the same track as long as there are free blocks, preferably with an interleave of 10. Then it goes on to the closest track with an empty sector again.

We're done thinking, so let's clear the progress bar:

160 fori=0to34:poke1945+i,32:next:poke1944,32:poke1980,32:sl=0:i=0

The Main Algorithm

Now we iterate over all physical blocks and make sure the current allocation matches the desired allocation.

  • If the current allocation matches the desired allocation, do nothing.
  • If the block is currently empty, just move the correct block there.
  • Otherwise, we need to move the logical block that is there to its new location first – using recursion!

For recursion, we use the array ss() as the stack and sl as the stack pointer. Recursion can cause a cycle though, e.g. when logical block A is stored where logical block B should be stored and vice versa. So if the block we are looking at is the same as the bottom of the stack, we have to break the cycle.

The following code checks for block identity, empty block as well as a cycle:

165 on-(ci(i)=cj(i))goto190:on-(ci(i)=-1)goto185:on-(sl=0orss(0)<>i)goto180

To break the cycle, we need the help of a temporary block. We find a free block and move the current block there (subroutine 225, we'll talk about it later), then continue.

170 forj=0to682:ifci(j)<>-1thennext
175 i1=i:i2=j:gosub205:goto165

For recursion, we push the current index on the stack and continue with the physical block that should contain the data that the current block currently contains.

180 ss(sl)=i:sl=sl+1:i=jc(ci(i)):goto165

In case the current block is empty, we move it using subroutine 225.

185 i1=ic(cj(i)):i2=i:gosub205

At the end of the loop, continue with the value from the top of the stack, if there is any, otherwise we take the next block in sequence.

190 ifci(i)>=0thenpokeaa(i),160
195 ifsl<>0thensl=sl-1:i=ss(sl):goto165
200 i=i+1:on-(i<683)goto165:close2:close1:open1,8,15,"v":end

When we are done, we send a "V" (validate) command to the drive, which will follow the links of all files and recreate the block availability map. For simplicity, our code never updates this table, so we have the operating system do it for us at the end.

Moving a Block

In order to move a block, in addition to copying the data between the two physical blocks and updating our data structures, we also have to update the track/sector link tuple from the previous block or the directory entry.

This reads the block into a buffer and writes it to a different location without transfering it to the computer:

205 pokeaa(i1),18:print#1,"u1 2 0"tr(i1)sc(i1):pokeaa(i2),23
210 pokeaa(i2)+54272,cc(dd(i2)):print#1,"u2 2 0"tr(i2)sc(i2)
215 pokeaa(i1),32:pokeaa(i2),160

The POKE instructions update the screen.

To update the parent link, we have to find out whether this is the first block in a file, in which the parent link is located in the directory entry.

220 c=ci(i1):o=oo(c):t=tr(i2):s=sc(i2):ifo<>0goto240

If that's the case, we find the correct directory entry, read the block and write the new link.

225 d=dd(c):s1=ds(int(d/8)):pp=peek(aa(357+s1)):pokeaa(357+s1),0
230 print#1,"u1 2 0"18;s1:print#1,"b-p 2"(dand7)*32+3
235 print#2,chr$(t)chr$(s);:print#1,"u2 2 0"18;s1:pokeaa(357+s1),pp:goto250

Otherwise, we read the preceding block in the file and update the link there.

240 a=ic(c-1):tt=tr(a):ss=sc(a):print#1,"u1 2 0"tt;ss:pp=peek(aa(a))
245 pokeaa(a),0:print#2,chr$(t)chr$(s);:print#1,"u2 2 0"tt;ss:pokeaa(a),pp

Finally, we update the data structures to reflect the fact that this physical block now contains the new data, and the block we moved it from is now empty.

250 v=ci(i1):ic(v)=i2:ci(i2)=v:ci(i1)=-1:return


defrag1541 is not meant as a serious tool. Nevertheless, here are some limitations and ideas for improvement:

  • The algorithm for the desired allocation was chosen for its looks, not to make disks actually faster.
  • Optionally, the algorithm could also place data blocks on track 18. That would give the user 17 more blocks on the disk.
  • That said, disks that currently have data blocks on track 18 can't get defragmented with the current algorithm. It avoids track 18 and will run out of space.
  • The current code does not test for read errors at all.
  • We also do not update BAM ourselves and rely on the operating system doing it at the end. While the code is written so that interrupting the process at any point will not lead to data loss, it is important to run the VALIDATE command at the end if data should ever be written to the disk again.
  • Of course it could be much faster. A lot of this is CPU bound, because BASIC is so slow. Rewriting everything in assembly would help. Some parts of the code could also directly run inside the disk drive. I doubt the complete data structures would ever fit into the 1541's 2 KB of RAM, so defragmentation can't completely run on the disk drive, but major parts of the logic could still live there, to minimize the lag introduced by communication between the computer and the drive.

C64-Artikel in der c’t Retro 2018

This post is about articles I wrote for the c’t magazine in German language. The post is therefore in German.

Im heute erschienenen “c’t Retro 2018” Sonderheft befinden sich drei Artikel von mir.

Brotkasten für die Welt” beschreibt die Hardware, Peripherie und die grundlegende Bedienung des C64.

Emulieren mit VICE” beschäftigt sich mit der optimalen Konfiguration des VICE-Emulators.

Und “Der eigene Katakis-Klon” beschreibt auf nur 6 Seiten ein vollständiges kleines Shooter-Spiel im Quelltext und vermittelt dabei die Grundladen von 6502-Maschinensprache sowie von Interrupt-basierter Programmierung des VIC.

(Bei den hier abgebildeten Seiten handelt es sich um die Vorschau in der c’t-App.)

Repairing a Commodore 1084S Power Switch

The power switch of the Commodore 1084S tends to break easily. To replace it, you will need a standard “Preh TV3” switch (which can be found online), a screwdriver, a soldering iron, desoldering braid, and some basic soldering skills.

At the back, there are four screws in the corners:

Three more screws at the back panel:

And four at the bottom:

When lifting the back piece, be careful not to rip off the speaker wires. You only need to lift it a bit, and you can have it rest on the back panel. Do not kill yourself touching any of the high voltage parts!

The switch is located at the top left of the board and has six solder points.

This is what your new switch should look like:

Take some desoldering braid and loosen it a little…

…so you can put a pin of the switch through it. Then heat it with the soldering iron.

After desoldering all six solder points, it should look like this:

You should now be able to remove the old switch.

Add the plastic button to the new switch, and if it has two extra legs in the middle, bend them down a little.

Put the new switch in place and make sure all six pins go through the holes.

Solder all the pins!

When putting the monitor back together, you have to make sure to fit the two plastic “rails” so that they hold the board and their pins go through the holes in the bottom part.

When sliding the back part back on, you might need a second person holding the rails.

Now put the screws back in, and you’re done!

A Minimal C64 Datasette Program Loader

The Commodore Datasette recording format is heavily optimized for data safety and can compensate for many typical issues of cassette tape, like incorrect speed, inconsistent speed (wow/flutter), and small as well as longer dropouts. This makes the format more complex and way less efficient than, for example, “Turbo Tape” or all other custom formats used by commercial games. Let’s explore the format by writing a minimal tape loader for the C64, optimized for size, which can decode correct tapes, but does not support error correction.


All information is encoded in the duration of pulses. There are three types of pulses: short (352 µs), medium (512 µs) and long (672 µs). To measure the lengths of the pulses, we can use one of the timers in the CIA chips: These count cycles of the C64 base clock (roughly microseconds), counting down from a a 16 bit start value. Since the pulse lengths are above 255, this would require us to compare 16 bit measurement results, but since microseconds are too high a resolution for tapes anyway, we can just as well only count units of 8 clock cycles – this matches the resolution of the “TAP” archival format.

This can be done by having timer A continuously count down from 7 to 0 (at the speed of the system clock) and having timer B count underruns of timer A:

    lda #7          ;divide timer b by 8
    ldx #0
    sta $dd04
    stx $dd05
    lda #%00010001
    sta $dd0e       ;start timer a
    dex ; $ff
    stx $dd06       ;always start timer b
    stx $dd07       ;from $ffff

The lengths of the pulses in units of 8 PAL clock cycles are:

length_short  = $30
length_medium = $42
length_long   = $56

To differentiate the pulses, let’s take the arithmetic means as thresholds:

threshold_short_medium = (length_short + length_medium) / 2
threshold_medium_long = (length_medium + length_long) / 2

The subroutine get_pulse measures a pulse by reading the low byte of the value of timer B (which counts down from $ffff) and restarts the timer. The (negative) length is returned in A. It will also set the C flag if it was a short pulse. If C is clear, it was a medium or a long pulse.

    lda #$10
p1: bit $dc0d       ;wait for start
    bne p1          ;of pulse

    ldx #%01011001  ;value to restart timer b
p2: bit $dc0d       ;wait for end
    beq p2          ;of pulse

    lda $dd06       ;read timer b
    stx $dd0f       ;restart timer b
    cmp #$ff-threshold_short_medium
    rts             ;c=1: short

Bits and Bytes

Every bit is encoded using two pulses: short/medium is 0, and medium/short is 1:

S/M 0
M/S 1

The byte marker (long/medium) allows detecting byte boundaries:

L/M byte marker

So a byte is encoded with a byte marker, followed by the 8 bits (LSB first) and an (inverted) parity bit.

 L/M    S/M  S/M  S/M  S/M  S/M  S/M  S/M  S/M  M/S
 marker bit0 bit1 bit2 bit3 bit4 bit5 bit6 bit7 parity

There is one more pulse combination, the end of data marker (long/short), which indicates the end of a file:

L/S end of data

The code to read a byte first reads pulses until it finds either the byte marker of the end of data marker. In the case of end of data, it returns with the C flag set. If it found a byte marker, it will reads 8 bits, ignoring the length of the first pulse and only deciding whether it was a 0 or a 1 based on the length of the second pulse. The check bit is ignored: When reading the next byte, the code to search for the marker will just skip it.

; wait for byte marker
    jsr get_pulse
    cmp #$ff-threshold_medium_long
    bcs get_byte    ;not long
    jsr get_pulse
    bcs b2          ;short = end of data
; get 8 bits
    lda #%01111111
b1: pha
    jsr get_pulse   ;ignore first
    jsr get_pulse
    ror             ;shift in bit
    bcs b1          ;until canary bit
b2: rts

For code compactness, this loop of 8 iterations stores a canary bit with a value of 0 at position 7 of the result during initialization. With every read bit that is rotated into the result, the canary bit is shifted to the right, and after 8 iterations, it is shifted into the carry flag. This way, we can detect we’re done with all 8 bits without using a counter.

Leader and Countdown

The encoding described so far allows a cassette tape to store an arbitrary stream of bytes, with a marker to indicate the end, but no way to know where a file starts. Therefore, the data of every file is prefixed by the leader and the countdown.

The leader is a sequence of less than a second to up to ten seconds consisting entirely of short pulses. It allows the reader to adjust the expected pulse lengths to compensate for different motor speeds. In our minimal reader, we can just ignore it.

The countdown is a sequence of the hex bytes $89, $88, $87, … down to $81. This is the code to detect it:

c0: jsr get_byte
c1: ldy #$89
    sty tmp_cntdwn  ;start with $89
    ldy #9
    bne c2
cx: jsr get_byte
c2: cmp tmp_cntdwn
    bne c4
    dec tmp_cntdwn
    bne cx
c4: cpy #9           ;first byte wrong?
    beq c0           ;then read new byte
    bne c1           ;compare against $89

This code may look more complicated than it needs to be, but the extra complexity comes from the edge case of an incomplete countdown immediately followed by a complete countdown (e.g. $89, $88, $89, $88, $87, …, $81).


After the countdown, the subroutine get_block will read the number of bytes passed in A into memory pointed to by ptr.

    sta count
    ldy #0
g1: jsr get_byte
    bcs g2
    sta (ptr),y
    dec count
    bne g1
g2: rts

If it encounters an end of data marker, the subroutine will return prematurely with the C flag set.


The encoding described so far allows storing individual files, but there is no metadata yet. So for every file on tape, there is a 192 byte header which precedes it.

The header encodes the filename, its type, and in case of programs, the start and end addresses.

Offset Length Description
0      1      File Type ($01 or $03 for PRG)
1      2      Start Address
3      2      End Address
5      16     Filename
21     171    unused

There are other file types to support SEQ files, which are stored a sequences of individual 192 byte files.

The header itself encoded exactly like a 192 byte file and recorded just before the file data: with a leader and a countdown.

The KERNAL’s save routines will actually store a backup copy of both the header and the full file contents immediately after the original to mitigate tape dropouts, with the countdowns using the sequence $09, $08, $07, … down to $01. Our code assumes an error-less tape, so we ignore the copies.

Connecting everything together

Our minimal loader makes several assumptions: The tape must be error-less and rewound, and the first file must be a program, correctly preceded by a header.

The user has to press the PLAY button, so we have to wait until this has happened:

    lda #$10
m1: bit $01
    bne m1

Timing is critical when reading from tape, so we have to turn off interrupts and disable the screen to make sure the VIC doesn’t steal cycles:

    lda $d011
    and #$ff-$10    ;disable screen
    sta $d011
m2: lda $d012
    bne m2          ;wait for new screen

Now we turn on the Datasette motor:

    lda $01
    and #$ff-$20    ;motor on
    sta $01

Now we can read the header by calling get_block with a byte count of 192:

    ldx #buffer
    stx ptr
    sty ptr + 1
    jsr get_countdown
    lda #192
    jsr get_block

We ignore most fields of the header and only read the start address to know where to load the file:

    ldx buffer + 1
    ldy buffer + 2
    stx ptr
    sty ptr + 1

Now we can load the file:

m3: lda #0
    jsr get_block
    inc ptr + 1
    bcc m3

Once we’re done, we turn the motor off and the screen back on:

    lda $d011
    ora #$10
    sta $d011       ;screen on
    lda 1
    ora #$20        ;motor off
    sta 1


This minimal tape loader takes up less than 200 bytes. It might seem useless for any other purpose than learning about the Datasette recoding format, since the C64 KERNAL already contains code like this. But it is common to create modified versions of the KERNAL that add features, removing the original tape code to create some space, so this minimal loader could be added to such a modified KERNAL to save on space but still be able to load games from tape.

The complete source is available at bugfixes and size optimizations are very much appreciated.