Fitting 44% More Data on a C64/1541 Floppy Disk

The physical data format on a Commodore 1541 5¼-inch floppy disk as used by the C64 is completely defined in software. The drive’s operating system fits 170 KB on a disk. This article explores different strategies, each with its pros and cons, to fit up to 246 KB.

Regular 1541 Disk Format

These are the specifications of the original 5¼-inch “minifloppy” from 1976:

  • 35 tracks
  • radius of outermost track: 57.150 mm
  • track spacing: 1/48 inch (i.e. 48 tpi, tracks-per-inch)
  • track capacity: 50 Kbit/track [1] (= 25.4 bytes per millimeter on track 35)

The Shugart SA-400 drive, which defined the minifloppy, included a controller that formatted the disk with 90 KB. Like Apple, Commodore bought just the mechanism without the controller (Shugart called this the SA-390), and used its own 6502-based system as the controller (which was cheaper) and defined their own format (which made better use of the disk).

The Shugart format used the same number of sectors on each track. But tracks on the outside are longer: track 1 is 46% longer than track 35. And if the material on the disk must be able to hold 50 Kbit on track 35, it must be safe to store 70 Kbit on track 1. The combined length of all tracks is 10590 mm, so based on the 50 Kbit for track 35, the whole disk can hold 269 KB [3], if every track is read and written at its individual speed.
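These figures can be checked with a few lines of Python. This is only a sketch: it assumes track centerlines spaced exactly 1/48 inch apart starting at the 57.150 mm radius, and takes 50 Kbit as 50000 bits. The names are made up for this example.

```python
import math

OUTER_RADIUS_MM = 57.150         # radius of track 1, from the minifloppy spec
TRACK_PITCH_MM = 25.4 / 48       # 48 tpi
SPEC_BYTES_TRACK_35 = 50000 / 8  # 50 Kbit, assuming K = 1000

def track_length_mm(track):
    """Circumference of a track, numbered from 1 on the outside."""
    radius = OUTER_RADIUS_MM - (track - 1) * TRACK_PITCH_MM
    return 2 * math.pi * radius

ratio = track_length_mm(1) / track_length_mm(35)
total_mm = sum(track_length_mm(t) for t in range(1, 36))
density = SPEC_BYTES_TRACK_35 / track_length_mm(35)  # bytes per mm
ideal_bytes = total_mm * density

print('track 1 is {:.0%} longer than track 35'.format(ratio - 1))
print('combined length: {:.0f} mm'.format(total_mm))
print('density on track 35: {:.1f} bytes/mm'.format(density))
print('ideal unformatted capacity: {:.0f} KB'.format(ideal_bytes / 1000))
```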

Commodore chose to support four speed zones. The fastest one on the outside tracks reads and writes a byte every 26 µs (at 300 rpm), the slowest one on the inside does this every 32 µs.

Track #   Sectors  Speed Zone  µs/Byte  Raw Kbit/Track
 1 – 17   21       3           26       60.0
18 – 24   19       2           28       55.8
25 – 30   18       1           30       52.1
31 – 35   17       0           32       48.8

The following graph shows the data densities per millimeter for tracks 1 to 35. The 25.4 bytes per millimeter of the original specification are never exceeded [4]. (An optimum scheme would max out the 25.4 bytes/mm on all tracks.)
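The per-track densities can also be recomputed from the zone table. The sketch below assumes the same idealized geometry as the specification (48 tpi, outermost radius 57.150 mm) and a drive running at exactly 300 rpm. The function names are illustrative.

```python
import math

OUTER_RADIUS_MM = 57.150
TRACK_PITCH_MM = 25.4 / 48
ROTATION_US = 60 * 1000000 / 300          # one revolution at 300 rpm, in µs
US_PER_BYTE = {3: 26, 2: 28, 1: 30, 0: 32}

def speed_zone(track):
    """Speed zone of the regular 1541 format."""
    if track < 18:
        return 3
    if track < 25:
        return 2
    if track < 31:
        return 1
    return 0

def track_length_mm(track):
    radius = OUTER_RADIUS_MM - (track - 1) * TRACK_PITCH_MM
    return 2 * math.pi * radius

def density(track):
    """Raw bytes per millimeter on a regularly formatted track."""
    raw_bytes = ROTATION_US / US_PER_BYTE[speed_zone(track)]
    return raw_bytes / track_length_mm(track)

for track in range(1, 36):
    print('track {:2d}  zone {}  {:.2f} bytes/mm'.format(
        track, speed_zone(track), density(track)))
```

In this model the density rises within each zone and drops back at every zone boundary, and the highest value is the one on track 35.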

It is interesting to note that track 18, which holds the filesystem metadata, and is therefore the most important track of the disk, has a comparatively low density and should therefore have higher reliability.

With the 17 to 21 sectors per track, this makes a total of 683 sectors of 256 bytes each, or 170.75 KB.

Tracks 36-41

While the original “minifloppy” disk format only specified 35 tracks, later disks (with larger cutouts) and drive mechanics supported 40 tracks.

While the firmware of all 48 tpi Commodore 5¼-inch drives stayed with the original 35-track format, the mechanics of the 1541 and later drives are designed for 40 tracks. In practice, all drives can even reach track 41, and the magnetic layer of disks is still reliable there [5].

There have always been a number of tools to format, read and write data in tracks 36-40 (or 41). They usually use speed zone 0, the slowest one, so each track fits another 17 sectors, for a total of 85 (or 102) extra sectors, or 21 (25) KB. This is a disk capacity increase of 12% (15%).
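The sector counts are easy to verify. This sketch assumes, as described above, that every track past 35 is written in speed zone 0. The helper names are made up for the example.

```python
SECTORS_PER_ZONE = {3: 21, 2: 19, 1: 18, 0: 17}

def speed_zone(track):
    """Regular zones for tracks 1-35; extra tracks fall into zone 0."""
    if track < 18:
        return 3
    if track < 25:
        return 2
    if track < 31:
        return 1
    return 0

def total_sectors(last_track):
    return sum(SECTORS_PER_ZONE[speed_zone(t)]
               for t in range(1, last_track + 1))

base = total_sectors(35)
for last in (35, 40, 41):
    n = total_sectors(last)
    print('{} tracks: {} sectors, {:.2f} KB, +{:.1f}%'.format(
        last, n, n * 256 / 1024, (n - base) * 100 / base))
```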

It seems straightforward and safe to use the extra tracks (at least until track 40) like this, but the density used on these tracks actually exceeds the specified 25.4 bytes per millimeter by up to 8.7%.

Max Speed Zones

A more radical approach to fitting more data is to use the fastest speed zone for all tracks. This way, every track will hold 21 sectors, for a total disk capacity of 861 sectors (215 KB) when using 41 tracks.

(In practice, track 18 of a disk in a custom format has to be formatted in the original format, otherwise the original DOS won’t be able to e.g. load the boot program of a game or an application. This limitation does of course not apply to extra data disks of a game, or pure user data disks.)

But this will exceed the specified capacity of the medium for tracks 18 and above: in the case of track 41 even by 34%!
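How far each track overshoots can be estimated as follows. The sketch assumes the idealized 48 tpi geometry and takes the density of a 50 Kbit track 35 as the specified limit. The names are illustrative.

```python
import math

OUTER_RADIUS_MM = 57.150
TRACK_PITCH_MM = 25.4 / 48

def track_length_mm(track):
    radius = OUTER_RADIUS_MM - (track - 1) * TRACK_PITCH_MM
    return 2 * math.pi * radius

ZONE3_BYTES = 200000 / 26                   # raw bytes per revolution at 26 µs/byte
SPEC_DENSITY = 6250 / track_length_mm(35)   # 50 Kbit on track 35

def overshoot(track):
    """Fraction by which a zone 3 track exceeds the specified density."""
    return ZONE3_BYTES / track_length_mm(track) / SPEC_DENSITY - 1

first_over = min(t for t in range(1, 42) if overshoot(t) > 0)
print('first track above spec:', first_over)
print('track 41 overshoot: {:.0%}'.format(overshoot(41)))
```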

How reliable is a disk written with too high a density? Let’s do an experiment.

The following Python script will create a description of an all-speed-zone-3 test disk. Each of the 41 tracks contains 21 sectors filled with 0..255. Otherwise, the disk is completely consistent with the Commodore “2A” on-disk format.

# Python 3: prints a g64conv text description of the test disk
print('no-tracks 84')
print('track-size 7928')
for track in range(1, 42):
    print('track {}'.format(track))
    print('   speed 3')
    print('   begin-at 0')
    for sector in range(0, 21):
        print('   sync 40')
        print('   gcr 08')  # header
        print('   begin-checksum')
        print('      checksum {:02x}'.format(track ^ sector))
        print('      gcr {:02x}'.format(sector))
        print('      gcr {:02x}'.format(track))
        print('      gcr 30 30')  # ID
        print('   end-checksum')
        print('   gcr 0f 0f')  # OFF bytes
        print('   ; Trk {} Sec {}'.format(track, sector))
        print('   bytes 55 55 55 55 55 55 55 55 55')  # gap
        print('   sync 40')
        print('   gcr 07')  # data
        print('   begin-checksum')
        # 256 data bytes with the values 00..ff
        print('      gcr ' + ' '.join('{:02x}'.format(i) for i in range(256)))
        print('      checksum 00')
        print('   end-checksum')
        print('   gcr 00 00')  # OFF bytes
        print('   bytes 55 55 55 55 55 55 55')  # gap
    print('end-track')

The .TXT output of the script can be converted into a binary .G64 image using the excellent g64conv utility. Using a ZoomFloppy, nibtools and either a 1541 with a parallel cable mod or a 1571, you can write this image to disk (nibwrite), and then keep reading it in a loop (nibread). I was able to read such a disk 1000 times on a 1571 without any errors.

Minimizing Gaps

But there is a way to fit 41 more blocks (10.25 KB, 6% extra) without exceeding the specs of 48 tpi disks – and still being read compatible with the original firmware.

Let’s look at track 1 with its 21 sectors. For every sector, there is a SYNC mark, a header, another SYNC mark, and the sector data.

This structure gets written when a disk is first formatted. When reading a sector, the drive looks for a SYNC pattern, reads the header, which contains among other data the number of the sector that follows, and if it is the correct sector number, it waits for another SYNC mark and reads the sector data.

When writing a sector, the drive also looks for the correct header, and then overwrites the section afterwards, including the SYNC mark.

Since it is impossible to hit one particular byte location when switching to write mode, there is a 9-byte gap after the header, and another gap (usually 8 bytes) after the sector contents [6]. It won’t matter if the newly written sector is a few bytes too early or too late, because it will only spill into one of the two gaps.

In addition, there are “tail gaps” at the end of each track.

The lengths of the sectors in each speed zone don’t add up to the complete length of the track. This is because the tracks are not evenly divisible by their number of sectors, and also to account for drives with slightly incorrect motor speeds.
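To get a feeling for the sizes involved, the raw length of a sector and the resulting tail gaps can be estimated. This sketch takes the sector layout from the test disk script earlier in this article (8 header bytes and 260 data block bytes, both GCR-encoded) and assumes the usual 8-byte data gap and exactly 300 rpm. Real disks will differ by a few bytes depending on the formatter.

```python
SYNC = 40 // 8            # 40-bit SYNC mark
HEADER = 8 * 5 // 4       # 08, checksum, sector, track, 2 ID bytes, 2 OFF bytes
HEADER_GAP = 9
DATA = 260 * 5 // 4       # 07, 256 data bytes, checksum, 2 OFF bytes
DATA_GAP = 8              # "usually 8"; varies between formatters

SECTOR_BYTES = SYNC + HEADER + HEADER_GAP + SYNC + DATA + DATA_GAP

# speed zone -> (µs per byte, sectors per track)
ZONES = {3: (26, 21), 2: (28, 19), 1: (30, 18), 0: (32, 17)}

tail_gap = {}
for zone, (us_per_byte, sectors) in ZONES.items():
    track_bytes = 200000 // us_per_byte   # raw bytes per revolution at 300 rpm
    tail_gap[zone] = track_bytes - sectors * SECTOR_BYTES
    print('zone {}: {} raw bytes, {} sectors of {} bytes, tail gap {}'.format(
        zone, track_bytes, sectors, SECTOR_BYTES, tail_gap[zone]))
```

Under these assumptions a sector occupies 362 raw bytes, and zone 2 ends up with by far the longest tail gap.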

By minimizing some of these accounting structures and using more of the tail gaps for data, it is possible to fit one or two additional sectors onto each track.

Data Gaps

There are usually around 8 gap bytes after the sector data. If we remove them, reading the disk will not be impacted in any way: After reading a sector, a drive never immediately reads the next one, because it will have to decode and transmit the data first. And even if it did, in the worst case, it might miss the next sector, because the next header has arrived too quickly, but it will catch it again after one rotation.

Removing the gap after the sector data will make correctly writing to the disk impossible though. There is very little wiggle room for hitting the correct range of bytes when overwriting a sector. Overshooting the area will easily destroy the next SYNC mark or the next header, so the following sector won’t be readable any more.

Header Gaps

A standard disk has 9 gap bytes after the header. It is not possible to completely remove it and still be compatible with the sector read routines in the 1541 firmware: After the drive has read the header, it needs to decode it and decide whether this is the sector it is looking for. If it is, it will scan for the next SYNC mark and then read the data. But decoding and deciding takes time, and every 26 to 32 CPU clock cycles, the next byte passes the read head. So if there is no gap, the SYNC mark will have passed the read head before the drive has made the decision to read this sector.

With 2 gap bytes instead of 9, the original read code still works, and so does most other third party read code.

Like removing the data gap, minimizing the gap after the header will also break writing, and this time horribly. When overwriting the sector, the original firmware code will skip 9 bytes before beginning to write the SYNC mark and the sector data. If there is no gap at the end of the data, it is guaranteed to overwrite the next header. But even if there is, when skipping the supposed header gap (“wait 9 bytes”), it is actually skipping the SYNC mark and the beginning of the old sector data. So when the sector gets read again, the first SYNC mark gets detected, but the data is overwritten by the misaligned new sector data.

SYNC Marks

The SYNC marks consist of sequences of consecutive 1-bits, which cannot otherwise be part of the data stream. For a SYNC to be detected, it needs to be at least 10 bits long. The original firmware always writes 40 bits, but we can trim this down to 16 bits, saving 2×3 bytes for every sector.
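Why 10 consecutive 1-bits can never occur in data follows from the GCR code itself. The sketch below uses the 4-to-5 bit table as it is commonly documented for Commodore drives and checks every pair of adjacent codes. The function names are made up for this example.

```python
# Commodore GCR: index = 4-bit nibble, value = 5-bit code on disk
GCR_CODES = [0b01010, 0b01011, 0b10010, 0b10011,
             0b01110, 0b01111, 0b10110, 0b10111,
             0b01001, 0b11001, 0b11010, 0b11011,
             0b01101, 0b11101, 0b11110, 0b10101]

def gcr_bits(data):
    """GCR-encode a bytes object and return the result as a bit string."""
    return ''.join('{:05b}{:05b}'.format(GCR_CODES[b >> 4], GCR_CODES[b & 15])
                   for b in data)

def longest_run(bits, symbol):
    """Length of the longest run of '0' or '1' in a bit string."""
    other = '0' if symbol == '1' else '1'
    return max(len(run) for run in bits.split(other))

# No code is all ones or all zeros, so a run can span at most two codes.
# The 256 byte values cover every ordered pair of codes.
worst_ones = max(longest_run(gcr_bits(bytes([b])), '1') for b in range(256))
worst_zeros = max(longest_run(gcr_bits(bytes([b])), '0') for b in range(256))
print('longest run of 1-bits in data:', worst_ones)
print('longest run of 0-bits in data:', worst_zeros)
```

Since encoded data never contains more than 8 consecutive 1-bits, a run of 10 or more can only be a SYNC mark.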

The Format

With a 2 byte header gap, no sector gap and 16 bit SYNC marks, we have saved about 21 bytes per sector (7 from the header gap, 8 from the data gap and 6 from the two SYNC marks). Now there is space for one more sector on tracks 1-17 and 25-35, and even two more sectors on tracks 18-24 [7].
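Whether the extra sectors really fit can be checked by counting bits. This sketch follows the layout produced by the generator script in this section, which also leaves out the two OFF bytes after the data checksum and pads sectors on tracks 18 and up with 4 bits. It assumes exactly 300 rpm, so it says nothing about the margin for drives that run fast.

```python
US_PER_BYTE = {3: 26, 2: 28, 1: 30, 0: 32}

def sector_bits(pad_bits):
    """Raw bits of one sector in the minimized layout."""
    sync = 16
    header = 8 * 10        # 8 header bytes, GCR-encoded to 10 bits each
    header_gap = 2 * 8     # 2 gap bytes, written without GCR
    data = 258 * 10        # 07 marker, 256 data bytes, checksum
    return sync + header + header_gap + sync + data + pad_bits

def sectors_that_fit(zone):
    track_bits = 200000 * 8 // US_PER_BYTE[zone]  # one revolution at 300 rpm
    pad = 0 if zone == 3 else 4                   # padding on tracks 18-35
    return track_bits // sector_bits(pad)

for zone in (3, 2, 1, 0):
    print('zone {}: {} sectors'.format(zone, sectors_that_fit(zone)))
```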

The following Python code will generate a .TXT test file that can be converted to a .G64 using g64conv.

# Python 3: prints a g64conv text description of the minimized-gap disk
def speed_for_track(track):
    if track < 18:
        return 3
    if track < 25:
        return 2
    if track < 31:
        return 1
    return 0

def sectors_for_track(track):
    # indexed by speed zone 0..3
    return [18, 19, 21, 22][speed_for_track(track)]

print('no-tracks 84')
print('track-size 7928')
for track in range(1, 36):
    print('track {}'.format(track))
    print('   speed {}'.format(speed_for_track(track)))
    print('   begin-at 0')
    for sector in range(0, sectors_for_track(track)):
        print('   sync 16')
        print('   gcr 08')  # header
        print('   begin-checksum')
        print('      checksum {:02x}'.format(track ^ sector))
        print('      gcr {:02x}'.format(sector))
        print('      gcr {:02x}'.format(track))
        print('      gcr 30 30')  # ID
        print('   end-checksum')
        print('   gcr 0f 0f')  # OFF bytes
        print('   ; Trk {} Sec {}'.format(track, sector))
        print('   bytes 55 55')  # gap
        print('   sync 16')
        print('   gcr 07')  # data
        print('   begin-checksum')
        # 256 data bytes: the pair (track, sector) repeated 128 times
        print('      gcr ' + ' '.join(
            '{:02x} {:02x}'.format(track, sector) for _ in range(128)))
        print('      checksum 00')
        print('   end-checksum')
        if track >= 18:
            print('   bits 1111')
    print('end-track')

Reading/Writing

The original 1541 firmware can still read all regular sectors, but since it checks track/sector numbers for validity, the new sectors won’t be legal for regular files, and they can’t be read using the U1 block read command either. The internal “job queue” API won’t do this check though. The following BASIC program will allocate buffer #2 (line 30), send a read command for track T, sector S into buffer 2 (lines 40-50), wait for its completion (lines 60-70), and receive and print the buffer (lines 90-100).

10 t=18:s=19  
20 open1,8,15  
30 open2,8,2,"#2"  
40 print#1,"m-w"+chr$(10)+chr$(0)+chr$(2)+chr$(t)+chr$(s);  
50 print#1,"m-w"+chr$(2)+chr$(0)+chr$(1)+chr$(128);  
60 print#1,"m-r"+chr$(2)+chr$(0)+chr$(1);  
70 get#1,a$:a=asc(a$+chr$(0)):ifaand128thenprint".";:goto70  
80 ifa<>1thenprint"error ";a:end  
90 print#1,"b-p 2 0"  
100 fori=0to255:get#2,a$:printasc(a$+chr$(0));:next  
110 close2:close1
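The magic numbers in lines 40-60 are easier to follow when the command strings are assembled by a small helper. The sketch below is in Python rather than BASIC and only builds the byte strings; it does not talk to a drive. The addresses for buffers other than #2 are an assumption, generalized from the values the program uses for buffer 2 (job code at $0002, track and sector at $000A).

```python
READ_JOB = 0x80   # job code written in line 50 of the BASIC program

def memory_write(address, data):
    """Build an "M-W" command: address low/high, byte count, payload."""
    return b'M-W' + bytes([address & 0xFF, address >> 8, len(data)]) + bytes(data)

def memory_read(address, count=1):
    """Build an "M-R" command: address low/high, byte count."""
    return b'M-R' + bytes([address & 0xFF, address >> 8, count])

def read_job_commands(buf, track, sector):
    header_addr = 0x06 + 2 * buf   # track/sector pair for this buffer
    job_addr = buf                 # job code slot for this buffer
    return [memory_write(header_addr, [track, sector]),  # line 40
            memory_write(job_addr, [READ_JOB]),          # line 50
            memory_read(job_addr)]                       # line 60

for command in read_job_commands(2, 18, 19):
    print(command)
```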

Since writing any sector will destroy it as well as the following sectors, care must be taken that the user does not attempt to write to it. Commodore DOS stores a filesystem version identifier at track 18, sector 0, offset 2. This is “A” (0x41) on the 1541. Overwriting this byte with any other value will soft write protect the disk: All write accesses through the filesystem API will fail with code 73, but it cannot prevent direct block writes (command U2).

So with the caveat of not being able to write to the disk any more, we can create a disk with extra sectors that can still be read with the original software – a perfect use for game disks, for example. A game could use a disk that uses the extra sectors on all tracks including the directory track (track 18), and have the user load the initial boot program using the original 1541 DOS functionality. (The boot program cannot use any of the extra sectors.) Then, a fast loader would take over to load the actual game program and data. This can be any existing fast loader without any adaptations, as long as it doesn’t check for the validity of sector numbers.

Max Speed Zones + Minimizing Gaps

We can also use the fastest speed zone together with minimal gaps. This way, all tracks will have 22 sectors. With 41 tracks, that’s a total of 902 sectors (225 KB), which is 219 extra sectors, or 32% more capacity.

(Again, if the disk is supposed to be bootable, track 18 will have to have the regular format.)

Custom Format

The Commodore DOS format uses a SYNC-prefixed header and a separate SYNC-prefixed data section, which is required so that sectors can be rewritten reliably. If a disk is not meant to be modified, we can reduce a sector to pretty much only a SYNC mark, its sector number, the 256 data bytes and a checksum. The next sector can just follow without any gap.

This format is of course completely incompatible with Commodore DOS and requires uploading custom read code into the 1541.

But it allows us to fit two extra sectors per track compared to the original format.
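The same kind of bit counting works for this format. The sketch assumes the layout written by the generator in this section (16-bit SYNC, one GCR-encoded sector number byte, 256 data bytes plus checksum, 4 padding bits) and a drive at exactly 300 rpm.

```python
US_PER_BYTE = {3: 26, 2: 28, 1: 30, 0: 32}

# SYNC + sector number + (256 data bytes + checksum) + padding, in raw bits
SECTOR_BITS = 16 + 1 * 10 + 257 * 10 + 4

def speed_zone(track):
    if track < 18:
        return 3
    if track < 25:
        return 2
    if track < 31:
        return 1
    return 0

def sectors_that_fit(zone):
    track_bits = 200000 * 8 // US_PER_BYTE[zone]  # one revolution at 300 rpm
    return track_bits // SECTOR_BITS

total = sum(sectors_that_fit(speed_zone(t)) for t in range(1, 36))
for zone in (3, 2, 1, 0):
    print('zone {}: {} sectors'.format(zone, sectors_that_fit(zone)))
print('35 tracks: {} sectors'.format(total))
```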

# Python 3: prints a g64conv text description of the custom-format disk
def speed_for_track(track):
    if track < 18:
        return 3
    if track < 25:
        return 2
    if track < 31:
        return 1
    return 0

def sectors_for_track(track):
    # indexed by speed zone 0..3
    return [19, 20, 21, 23][speed_for_track(track)]

print('no-tracks 84')
print('track-size 7928')
for track in range(1, 36):
    print('track {}'.format(track))
    print('   speed {}'.format(speed_for_track(track)))
    print('   begin-at 0')
    for sector in range(0, sectors_for_track(track)):
        print('   sync 16')
        print('   gcr {:02x}'.format(sector))  # sector number
        print('   begin-checksum')
        # 256 data bytes: the pair (track, sector) repeated 128 times
        print('      gcr ' + ' '.join(
            '{:02x} {:02x}'.format(track, sector) for _ in range(128)))
        print('      checksum 00')
        print('   end-checksum')
        print('   bits 1111')
    print('end-track')

Max Speed Zones + Custom Format

Again, we can use speed zone 3 for all tracks, so they all have 23 sectors each, for a total of 943 sectors, or 38% more than a regular disk.

Theoretical Maximum

The custom format was still using 256 byte sectors, each with a SYNC mark, a sector number and a checksum. What if we just stored everything in one giant sector? We still need one SYNC mark on the track, so we know where it starts, and in order to reset the hardware’s bit alignment logic. Speed zone 3 fits 7692 raw bytes onto a track. At least two bytes will be needed for the SYNC mark, and with the data GCR-encoded (and the size increasing by 5/4), we will have 6152 bytes for data, which is 24 sectors, with only 8 bytes to spare. Writing such a disk is only possible if the drive is no faster than 0.1% above the specified 300 rpm.

Writing this format onto all 41 tracks will give us 984 blocks (246 KB), or 44% more.
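The arithmetic behind these numbers, as a short sketch. It assumes exactly 300 rpm and a 2-byte SYNC mark; the variable names are illustrative.

```python
RAW_BYTES = 200000 // 26        # raw bytes per revolution in speed zone 3
SYNC_BYTES = 2                  # one 16-bit SYNC mark per track

data_bytes = (RAW_BYTES - SYNC_BYTES) * 4 // 5   # GCR stores 4 data bits in 5
sectors, spare = divmod(data_bytes, 256)
blocks = sectors * 41
gain = blocks / 683 - 1                          # relative to a regular disk
speed_margin = (spare * 5 / 4) / RAW_BYTES       # spare raw bytes per track

print('data bytes per track:', data_bytes)
print('sectors per track: {} ({} bytes to spare)'.format(sectors, spare))
print('blocks on 41 tracks: {} ({} KB)'.format(blocks, blocks // 4))
print('gain over a regular disk: {:.0%}'.format(gain))
print('tolerable overspeed: {:.2%}'.format(speed_margin))
```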

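# Python 3: prints a g64conv text description of the single-sector disk
print('no-tracks 84')
print('track-size 7928')
for track in range(1, 42):
    print('track {}'.format(track))
    print('   speed 3')
    print('   begin-at 0')
    print('   sync 16')
    # 24 * 256 data bytes with the repeating values 00..ff
    print('      gcr ' + ' '.join(
        '{:02x}'.format(i & 0xff) for i in range(24 * 256)))
    print('end-track')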

But reading will be tricky and slow. There is only one SYNC mark, so reading any portion of the track will first require waiting for the SYNC mark (1/10 sec on average). The single sector holds 6 KB of data, but the drive’s buffer RAM is only 2 KB total, so it cannot store the whole sector. It therefore needs to skip bytes until the requested section of the data stream is reached.

Using 512 or 1024 byte sectors that can still fit the drive’s RAM wouldn’t work though: The single giant 6 KB sector just barely fits with only 8 bytes to spare, so breaking it up into smaller sectors, each with its own SYNC mark, would overflow the available space.

Comparison

The following table is sorted by capacity:

Method                             Capacity (blocks)  Capacity %  Compatible with DOS  Density Spec Compliant
Regular Format                     683 (170 KB)       100%        yes                  yes
Minimizing Gaps                    724 (181 KB)       106%        read-only            yes
Custom Format                      753 (188 KB)       110%        no                   yes
Tracks 36-41                       785 (196 KB)       115%        yes                  kind of
Minimizing Gaps + Tracks 36-41     832 (208 KB)       122%        read-only            kind of
Max Speed Zones [2]                861 (215 KB)       126%        no                   no
Custom Format + Tracks 36-41       867 (216 KB)       127%        no                   kind of
Max Speed Zones + Minimizing Gaps  902 (225 KB)       132%        no                   no
Custom Format + Max Speed Zones    943 (235 KB)       138%        no                   no
Theoretical Maximum                984 (246 KB)       144%        no                   no

The highest gains can be achieved by breaking the density spec, which will also make the disk incompatible with DOS. Using tracks 36-41 and just breaking the density spec a little will also help a lot. Optimizing the on-disk data structures only helps a bit, but in the case of “Minimizing Gaps” retains read-only DOS compatibility.


  1. Note that this refers to the “unformatted” capacity, which does not take the limitation into account that a certain number of consecutive 0- or 1-bits cannot be stored, so data must be converted into a code that satisfies these requirements, like Commodore’s GCR (“Group Code Recording”), which stores 5 bits for every 4 bits of data. So the maximum usable capacity of a 1541 disk is 4/5 of the unformatted capacity.

  2. “Max Speed Zones” always implies “Tracks 36-41”.

  3. Again, this is the unformatted capacity. Using GCR, this would be reduced to 215 KB.

  4. A speed zone 4 with a rate of one byte every 24 µs would have worked for tracks 1-10. This would have added another 2 sectors per track, for a total of 20 extra sectors. It is unknown why Commodore chose to use only 4 zones – I am assuming to keep the electronics simpler.

  5. Doing the theoretical capacity calculation again, 41 tracks have a total length of about 12000 mm (12 m!), which corresponds to an unformatted capacity of 298 KB, formatted 238 KB.

  6. The length of the gap after the data varies between the formatter in the original firmware and third party formatting tools.

  7. The original “DOS 1” firmware of the Commodore 2040 drive fit one more sector into zone 2, but this was too tight, easily overwriting sector 0 when writing sector 19 on drives with a slightly too fast motor (unfortunately with the BAM sector on track 18 as a likely victim), so this was changed in DOS 2. This explains why the tail gap of zone 2 is significantly longer.

3 thoughts on “Fitting 44% More Data on a C64/1541 Floppy Disk”

  1. This sheds some good light on the engineering decisions made by Commodore when they built their disk drives, and what you could do differently if you, for example, think about read-only disks. Thank you for sharing this!
    As an add-on, the followup-disk drives for the PET used a 50% higher clock frequency, and managed to put up to about 500k on a single side disk, or over 1M on a 8250 double sided disk. However, they did it with about 7700bpi, which is about 30% above the actual media specification! Did they cheat?
    https://extrapages.de/archives/20190102-Floppy-notes.html

  2. This is great writing/exploration Michael. And now I want to dig in to experiment with my Atari equivalents.

  3. A 2031, 1540 and early 1541 DISK (until ROM -02) have 8 gap bytes after the header. The 9 gap bytes were introduced with the 1541-03 ROM.

    Note that the 4040 also writes 8 gap bytes. But, other than the 2031 and the later disks, these 8 byte were GCR-converted, resulting in 10 byte on the medium. This GCR conversion does not take place with the 2031 and later disk drives (154x, 157x).

    Most probably, Commodore widened the GAP from 8 to 9 byte in order to become write compatible with the 4040 disk drive. It seems they did not use 10 byte, because then, the 1541 would not be compatible with 2031, 1540 and early 1541.
