1200 Baud Archeology: Reconstructing Apple I BASIC from a Cassette Tape

The audio file that was posted two weeks ago is indeed a very important artifact of computer history: It is a recording of the “Apple I BASIC” cassette tape that came with the Apple I. It is the first piece of software ever sold by Apple (not counting computer firmware).

Here is the first confirmed perfect dump of the 4096 bytes: apple1basic.bin

The Apple I is extremely rare. Only 200 were built, and fewer than 100 are believed to be in existence. Neither Steve nor Woz owns an Apple I any more, and neither does Apple Inc. (Update: Woz still owns some.) The cassettes are even rarer, as not every Apple I came with one. There was no dump of the tape until 2002, when Achim Breidenbach of Boinx Software got an MP3 recording of an original Apple I BASIC tape by Larry Nelson, an Apple I owner, and, with a little help from his father (who worked with an Apple I back in 1976), managed to decode it by writing a program in GFA BASIC on an Atari ST. Here is a screenshot of the visualization this program could provide:

Achim wrote an Apple I emulator, included a commented disassembly of his Apple I BASIC dump, and published it on the internet. Other people continued working on the disassembly and changed instructions that they thought were mistakes in the dump. The only dump that can be easily found today includes these changes. It was time to analyze the tape again and get an authoritative dump of the 4096 bytes.

So here is how to decode the signal. Let us first open the audio file in Audacity and look at the waveform. The signal is mono, but the file has two channels; as the quality of the left channel turns out to be better, let us delete the other one. This is what the whole ~35 second recording looks like:

There is silence at the beginning and at the end, but we can just as well hear that. We need to zoom in to see the actual waveform. From about 2 seconds to 12.4 seconds, when we hear a continuous beep, the signal consists of uniform waves:

Afterwards, during what sounds like noise, the waveform looks very different: There are shorter waves and longer waves.

The original output of the Apple I when writing the tape was a square wave, but the signal was filtered by the properties of the magnetic tape, so high frequencies were removed, and the signal became rounder. On the way back in from tape, the op-amp of the Cassette Interface converted the signal back into a square wave by converting all samples above a certain value into 1, and all samples below into 0. While the threshold in the op-amp was fixed, it could be effectively manipulated by changing the output volume of the cassette player. Effect->Amplify in Audacity gives us the following picture:
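The comparator step can be sketched in a few lines of C. This is only an illustration under stated assumptions: "square_up", "gain" and "threshold" are made-up names, and the real input stage is analog rather than a per-sample test. It shows why turning up the volume has the same effect as lowering the threshold.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original decoder: turns rounded
 * samples back into a 0/1 square wave the way a comparator would.
 * "gain" plays the role of the cassette player's volume knob and
 * "threshold" the fixed trip point of the op-amp. */
void square_up(const short *in, unsigned char *out, size_t n,
               double gain, double threshold)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (in[i] * gain > threshold) ? 1 : 0;
}
```

With the samples 0, 500, 2000, 500, -2000 and a threshold of 1000, a gain of 1 marks only the third sample as 1, while a gain of 4 also marks the second and the fourth.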

It is now time to write a small program to measure and dump the width of the pulses. We need to be able to parametrize the threshold (the volume knob) to be able to try different settings. In Audacity, we have to save the file as “WAV” first, because it is easy to decode this file format: We just skip the 44 byte header, and every following pair of bytes is a signed 16 bit value (low byte first) representing one sample.
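A minimal sketch of such a measuring routine, assuming the whole file is already in memory and has the plain 44 byte header mentioned above. The names "measure_cycles" and "sample_at" are illustrative, and measuring from one upward threshold crossing to the next is only one possible definition of a pulse width.

```c
#include <stddef.h>

#define WAV_HEADER_BYTES 44  /* assumes a plain PCM header with no extra chunks */

/* Decode the little-endian signed 16 bit sample starting at p. */
static int sample_at(const unsigned char *p)
{
    int v = p[0] | (p[1] << 8);
    return v >= 32768 ? v - 65536 : v;
}

/* Hypothetical helper: stores the distance, in samples, between successive
 * upward crossings of +threshold. A crossing only counts after the signal
 * has dipped below -threshold in between. The stretch before the first
 * crossing is not a full cycle and is not stored.
 * Returns the number of widths written to "widths". */
size_t measure_cycles(const unsigned char *wav, size_t len, int threshold,
                      int *widths, size_t max_widths)
{
    size_t count = 0;
    long index = 0, last = -1;
    int armed = 0;  /* set once the signal has dipped below -threshold */

    for (size_t pos = WAV_HEADER_BYTES; pos + 1 < len; pos += 2, index++) {
        int s = sample_at(wav + pos);
        if (!armed) {
            if (s < -threshold)
                armed = 1;
        } else if (s > threshold) {
            if (last >= 0 && count < max_widths)
                widths[count++] = (int)(index - last);
            last = index;
            armed = 0;
        }
    }
    return count;
}
```

Printing the returned widths one per line for a given threshold is enough to see which lengths occur in the recording.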

First runs show us that all pulses in the first part of the file are around 56 samples long, while the rest of the file contains pulses which have a length of either around 22 or around 45. It seems the first part is a sync signal, so that the reader knows when the data starts. The two pulse lengths afterwards represent ones and zeros. It turns out that the shorter pulses are zeros; this way we get readable 6502 assembly code and some ASCII data (though with bit #7 set).
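Turning pulse lengths into bytes can be sketched as follows. The cut-offs 32, 50 and 70 and the most-significant-bit-first packing are taken from the decoder listing in this article; the function name "widths_to_bytes" and the enum names are made up for this illustration.

```c
#include <stddef.h>

/* Cut-offs as in the article's listing: cycles under 32 samples count as
 * short, 32..49 as long, and 50..69 as sync tone. */
enum { SHORT_MAX = 32, DATA_MAX = 50, SYNC_MAX = 70 };

/* Hypothetical helper: turns measured cycle widths into bytes, most
 * significant bit first. Returns the number of complete bytes written. */
size_t widths_to_bytes(const int *widths, size_t n,
                       unsigned char *out, size_t max_out)
{
    int state = 0;            /* 0: before sync, 1: in sync tone, 2: in data */
    unsigned bits = 0, acc = 0;
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        int w = widths[i];
        if (w < DATA_MAX) {
            if (state == 2) {
                /* short cycle is a 0, long cycle is a 1 */
                acc = (acc << 1) | (w < SHORT_MAX ? 0u : 1u);
                if (++bits % 8 == 0 && written < max_out)
                    out[written++] = (unsigned char)(acc & 0xFF);
            } else if (state == 1) {
                state = 2;    /* first short cycle after the sync tone is skipped */
            }
        } else if (w < SYNC_MAX && state == 0) {
            state = 1;        /* first sync-tone cycle seen */
        }
    }
    return written;
}
```

For example, three sync cycles of 56, one skipped cycle of 22, and then the widths 22, 45, 22, 22, 45, 45, 22, 22 give the bits 01001100, that is the byte 0x4c.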

Here is a hexdump of the first few bytes:

0000000 4c b0 e2 ad 11 d0 10 fb ad 10 d0 60 8a 29 20 f0
0000010 23 a9 a0 85 e4 4c c9 e3 a9 20 c5 24 b0 0c a9 8d
0000020 a0 07 20 c9 e3 a9 a0 88 d0 f8 a0 00 b1 e2 e6 e2
0000030 d0 02 e6 e3 60 20 15 e7 20 76 e5 a5 e2 c5 e6 a5
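As a quick plausibility check on these bytes, the sketch below reads the first instruction and undoes the high bit on a text byte. It relies on two facts about the 6502: opcode 0x4c is "JMP absolute", and its 16 bit operand is stored low byte first. The function names are made up for this illustration.

```c
/* Returns the 16 bit target if the three bytes at p are a 6502
 * "JMP absolute" (opcode 0x4c, operand low byte first), or -1 otherwise. */
long jmp_absolute_target(const unsigned char *p)
{
    if (p[0] != 0x4C)
        return -1;
    return p[1] | (p[2] << 8);
}

/* Clears bit 7, which is set on the ASCII data in the dump. */
unsigned char strip_high_bit(unsigned char c)
{
    return c & 0x7F;
}
```

Applied to the first three bytes, 4c b0 e2, this yields 0xe2b0, an address inside the 0xE000 to 0xEFFF range the cassette label gives for BASIC. The byte a0, which appears several times in the dump, becomes 0x20 (a space) once bit 7 is cleared, and 8d becomes 0x0d (carriage return).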

In order to verify that our decoded data is correct, we must make sure that after the sync, there are only pulses that fall into the 22 or 45 category. If we play with the volume and find a volume region (i.e. different but similar volume settings) that has no pulse length errors and decodes into the same data, we can be confident that the data is correct. According to what is printed on the cassette, Apple 1 BASIC is supposed to reside in RAM at 0xE000 to 0xEFFF, so the length of the decoded data should be 4096 bytes.
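These checks can be written down as a few small helpers. This is a sketch only: the tolerance of 6 samples around 22 and 45 is an assumption made for this example, not a figure from the recording, and all names are illustrative.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define EXPECTED_BYTES 4096  /* 0xE000 to 0xEFFF inclusive */
#define TOLERANCE 6          /* assumed slack around the two nominal widths */

/* Nonzero if a data cycle is close to one of the two expected widths. */
int width_is_clean(int w)
{
    return abs(w - 22) <= TOLERANCE || abs(w - 45) <= TOLERANCE;
}

/* Counts data cycles that fit neither category. */
size_t count_bad_widths(const int *widths, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (!width_is_clean(widths[i]))
            bad++;
    return bad;
}

/* Nonzero if two decodes, made at different thresholds, both have the
 * expected length and contain identical bytes. */
int dumps_agree(const unsigned char *a, size_t alen,
                const unsigned char *b, size_t blen)
{
    return alen == EXPECTED_BYTES && blen == EXPECTED_BYTES &&
           memcmp(a, b, EXPECTED_BYTES) == 0;
}
```

A threshold setting would pass if "count_bad_widths" returns 0 for its data cycles and "dumps_agree" holds against the decodes made at neighbouring settings.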

The following program is a possibly minimal implementation of the required algorithm:

apple1basic-decode.c

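#include <stdio.h>

/* The threshold is 32768/DIVISOR; a larger DIVISOR acts like a higher playback volume. */
#define DIVISOR 30

int main(void) {
    int index = 0, last = 0, direction = 1, syncstate = 0, bitindex = 0;
    int distance, lo, hi;
    unsigned char outbyte = 0;
    signed short sample;

    for (;;) {
        /* Read one 16 bit sample, low byte first. Two separate calls are
           needed: the order of two getchar() calls in one expression is
           unspecified. */
        lo = getchar();
        hi = getchar();
        if (lo == EOF || hi == EOF) break;
        sample = (signed short)(lo | hi<<8);
        if (!direction) {
            /* waiting for the signal to rise above the positive threshold */
            if (sample>(32768/DIVISOR)) {
                distance = index-last; /* samples since the previous rising edge */
                if (distance<50) {
                    if (syncstate == 2) {
                        /* data cycle: short (~22) is a 0, long (~45) is a 1 */
                        outbyte = outbyte << 1 | (distance<32? 0:1);
                        if (!((++bitindex)&7)) putchar(outbyte);
                    }
                    /* the first short cycle after the sync tone is skipped */
                    if (syncstate == 1) syncstate++;
                } else if ((distance<70) && !syncstate)
                    syncstate++; /* first sync cycle (~56 samples) seen */
                last = index;
                direction++;
            }
        } else
            /* waiting for the signal to fall below the negative threshold */
            if (sample<-(32768/DIVISOR))
                direction--;
        index++;
    }

    /* Number of decoded bytes; POSIX keeps only the low 8 bits of an exit
       status, so 4096 is reported as 0. */
    return bitindex/8;
}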

Run it like this:

cat apple1basic-mono.wav | ./apple1basic-decode > apple1basic.bin

The threshold is 32768 divided by "DIVISOR", which the listing above fixes at 30; to try other settings, change that value and recompile. Values between ~25 and ~95 will reconstruct the data correctly.

apple1basic.bin

If you can read 6502 assembly, you should definitely have a look at the disassembly, as this is very high quality (in the 1970s, this meant: very tightly compressed) code.

Update: You can run Apple I BASIC as a scripting language on Mac OS X now!

121 thoughts on “1200 Baud Archeology: Reconstructing Apple I BASIC from a Cassette Tape”

  1. Nice Find!

    For those new to 6502 assembly language, from AppleWin
    F7
    bload “apple1basic.bin”,2000
    2000L

    Michael, AppleWin Debugger Developer

  2. Apple Inc does own an Apple I

    It is actually owned by Apple Computer Australia, and on loan and display at the Powerhouse Museum in Sydney.

    For many years, it was under a glass box in the foyer of the Apple Australia offices.

  3. Pingback: pligg.com
  4. We all have to thank you.
    Those kind of things should deserve to be saved for the future exactly like we do with paintings, architecture and other forms of art that show some geniality or beauty.
    Those are not just 4096 bytes of code (you could spell by heart). They are a genial solution to a problem heavier than an elephant at that time.

  5. Pingback: meneame.net
  6. Nice save
    This technique could be used to save many classic computer cassette based programs from extinction.
    In the UK there were many such computers especially the Sinclair spectrum, acorn atom and bbc microcomputer.

  7. Nice work, but the run instructions are wrong. One might as well say,
    cat apple1basic-mono.wav | ./apple1basic-decode | cat > apple1basic.bin
    It’s a common, and gratuitous mistake. The correct instructions are,
    ./apple1basic-decode apple1basic.bin

  8. “./apple1basic-decode apple1basic.bin” should probably be “./apple1basic-decode < apple1basic.bin”. The program doesn’t parse command line arguments, and C doesn’t do it automatically like Perl.

  9. David and tux_rocker, what are you on about?
    “cat apple1basic-mono.wav | ./apple1basic-decode” pipes the wave to the decoder, the ” > apple1basic.bin” redirects stdout from the decoder to a file called applebasic.bin

    What’s wrong with that?

  10. I own an Apple 1.

    …And a copy of Apple 1 BASIC on cassette, and Woz’s Mini-Assembler that is “origin-ed” for the Apple 1. (This is the same Mini-Assembler that was in the Apple ][ ROMs, at $F666). And a few other Apple 1 goodies.

    Do you realize that the cassette interface for the Apple 1 and the Apple ][ are identical?

    Yep, you can read an Apple 1 audio cassette with any old, easy-to-find Apple ][. And from there, you can use any one of a million methods to get the data out of memory and onto another medium.

    Also, you can simply use the Apple ][ to create a NEW cassette for your Apple 1 (if you happen to be lucky enough to have one).

    BTW, I think mine is “serial number” 0064. At least that’s what I think the “0064” that is written in Sharpie on the PC board means…

  11. The instructions first given are fine.

    cat file.wav | ./decode > file.bin

    or

    ./decode file.bin

    ack is correct that

    ./decode file.wav > file.bin

    won’t work, because the program is only reading from stdin and isn’t opening any files explicitly.

  12. Hmm… something ate part of my post.

    the input redirect is missing.

    Perhaps that’s why things look goofy; I’ll try again.

    cat file.wav | ./decode > file.bin

    or

    ./decode (less than sign) file.wav > file.bin

    will work.

  13. It’s really funny how reading about this recaptures all the excitement that I felt about computing in this era. Thanks for waking up those memories!

  14. Hint for future generations: please don’t encode to mp3 :) A suitably cut down wav file (8-bit, one-channel, 8 or 16K sample rate) would work better and probably be about the same size or smaller.

    MP3s are like JPGs, they compress information by taking advantage of known limitations of human sensory organs. That means information is lost.

    This seems to work, though, so the mp3 encoding must not have hurt it too badly.

  15. Just an FYI not related to this article: Clicking on your anti-spam word (to hear it) results in a nasty PHP error. FYI

  16. Confused by your code. The direction variable is never less than 1 so why check it? What is the redundant else if at the bottom supposed to be doing?

  17. Ah, this may be useful to me. I’m in the middle of restoring an old vintage Processor Tech SOL-20. Ironically, when I purchased this machine, I could have bought an Apple 1, but it didn’t come with a power supply, keyboard, or display interface, so I bought the SOL kit. Oops.
    Anyway, I have 30 year old cassettes with data that is currently of unknown quality, this sort of audio restoration was exactly what I was thinking of. I figured I’d dump the audio data to the computer, clean it up, and then convert it to binary data. I know some SOL hobbyists have stored good binary copies of loaded data for the major apps, but they were loaded into RAM and then dumped via a serial port, that required a good workable copy of the cassette. I’m not sure this is possible in my case, I have dozens of self-written applications that don’t exist elsewhere. So I thank you for your efforts and showing some of the techniques that might aid me in the next phase of my project.

  18. Fritz – Also nice to know, it was put together on my Macbook Pro… ;-)

    Much respect to Apple and those people who found the artifact.

  19. Hmm, We’ve got a plastic flexible 45rpm vinyl record of a program for the ZX80 (or was it for the ZX81) I’ll have to dig it out and try the same thing!

  20. This reminds me of my Commodore 64 days. I started off with the Commodore Datasette drive, but eventually got a disk drive. When it went on the blink, I was faced with several months of saving for a new one, and I discovered that most commercial software for the 64 was sold in Great Britain on tape. I sprung for one program, the most important…Silent Service. It worked great from tape. In addition to the non-commercial software I already had on tape, it got me through those several disk-free months.

  21. Nice work. Anyone remember the Marantz Pianocorder, a player piano from the late 1970s that played from cassette tapes? Back in the mid-1990s, I used a similar technique to capture and archive the digital data for all 500+ known cassettes for that system. I started off capturing the tapes in real time and doing this sort of offline processing, but it was ultimately faster to do it all in one step by capturing the squared-up datastream byte-by-byte into the bidirectional parallel port of a PC running DOS. After I had transferred all the tapes, the project kind of grew in scope and I later wrote a plug-in for WINAMP to transcode MIDI files on-the-fly to Pianocorder format. The funny thing is, I learned recently that those Chuck E. Cheese animatronic shows from the 1980s were based on modified Pianocorder hardware. There is a surprisingly active collector’s market around that stuff… quite a few guys setting up Chuck E. Cheese and Showbiz animatronics in their homes and programming their own shows.

    In going through all the Pianocorder tapes, I often needed 2 or 3 copies of each tape to get a clean transfer as the result of the oxide binder breaking down. Baking the tapes was necessary in some cases.

    The time to do these sorts of projects is NOW. The original materials may not be readable for much longer.

  22. Pingback: Web 2.0 Announcer
  23. Apple fans weren’t the only ones trading data via cassette tape. My IMSAI 8080 had one for a while until we got those wonderful 8 inch floppy drives – imagine a whole 1.2 MB for your programs and another 1.2 MB for data for our CP/M systems. As I remember it there was a hobbyist convention in Kansas City in 1976 to set the cassette exchange specs – the Kansas City Standard. And later on all those Vic 20s and Commodore 64s had cassette interfaces, too.

    And now we routinely fill up terabyte drives…

    I think I still have some old Vic 20 cassette programs somewhere. I used to use one to decode RTTY on ham radio.

  24. Steil:

    I did a comparison of the bytes given in Mahon’s disassembly to your dump. Well, there are a few differences, but not in the places where he made his corrections (which means the corrections were on the mark). Here are the places in your dump which differ from Mahon’s:

    e88a: 18 — clc
    e88b: a0 00 — ldy #$00
    e88d: a5 dc — lda $dc
    e88f: 71 dc — adc ($dc),y

    and

    e89e: a0 34 — ldy #$34
    e8a0: 46 d9 — lsr $d9
    e8a2: 4c e0 e3 — jmp $e3e0
    e8a5: a0 4a — ldy #$4a

    and other differences in the data areas: ($ea45) = $a3, ($ec53) = $40, ($ec80) = $07, ($eca2) = $0e, ($ed4d) = $ae.

    Now that we have the correct code, it’s time to sit down and figure out what the code really means. :)

  25. “Now that we have the correct code, it’s time to sit down and figure out what the code really means. :)”

    I suggest you read a very useful book called ‘What’s Really Inside the Commodore 64’

    It’s a complete disassembly of all 16k of kernal and commodore (microsoft) basic V2, with the most detailed comments.

  26. Sorry to disappoint you but apple 1 BASIC was recovered from tape over 5 years ago, disassembled, and is currently running on replica 1 computers. We have also recovered star trek for the apple 1, lunar lander, blackjack and other programs.

  27. BOB SMITH: “I used to use one to decode RTTY on ham radio.”

    That is the COOLEST thing I ever heard of anyone doing with a computer EVER. Did you ever make a program to encode text or even DATA into RTTY for transmission? You could have interfaced two computers over HAM for that, you still could!

    I am getting all kinds of cool ideas with this, I need to read more about ham radio, what range could you get with them? Could you arrange to have a friend retransmit your transmission for even longer range?

    Oh man that is soo cool!!!

    Email me -edward@nardella.ca

  28. Too bad I trashed my old 1976 Technical Design Labs z80 software tapes a few years ago :-( Now, how about resurrecting the PDP-8!

  29. Pingback: Web 2.0 Announcer
  30. The Apple I BASIC cassette wasn’t even included with all of the 200 Apple Is produced eons ago, but a few engineering souls have managed to extract the data and create an MP3 of the wave structure. Not surprisingly, the tone resembles that of a 1200 Baud connection, and if we should say so ourselves, would make for a wicked ringtone.

  31. Since these old computers had audio in and audio out, does that mean that they could send data over the phone lines without a modem?

  32. Pingback: Apple
  33. I think most people mean 4096 bytes (4kb) of memory.

    Still, incredible. If we still did things in assembly, things wouldn’t be soo darn bloated and higher quality code would exist.

  34. “Still, incredible. If we still did things in assembly, things wouldn’t be soo darn bloated and higher quality code would exist.”

    You mean unmaintainable and unportable code.. GIGO applies to assembly code just as surely as it does to higher-level languages, and if it was so fantastic the ‘market’ would not have needed HLLs..

    That said, it’s still pretty amazing how much folks could squeeze into 4KB back in the day, just like it’s amazing how Greek stonemasons could build seamless columns, and how many artisan skills from antiquity have been lost, with folks today trying to relearn them..

  35. Pingback: mac apple
