Category Archives: hacks

cbmbasic 1.0 with Plugins

I moved cbmbasic development to SourceForge and released version 1.0, which has the following added features:

  • RDTIM/SETTIM support (George Talbot)
  • LOAD"$" on Win32 (Lorenzo)
  • RND() is now random (Wolfram Sang)
  • The C code now hooks into the cbmbasic plugin infrastructure. This lets developers add statements, functions etc. Right now, you can turn this on with “SYS 1” (turn off with “SYS 0”), and use the new statements LOCATE y,x (set cursor position), SYSTEM string (run a command line command) and the extended WAIT port,mask, which implements the Bill Gates easter egg.

Amiga/Lorraine Mugs

Every touristy place has them: Souvenirs with given names on them. If you have an uncommon name, or a friend with an uncommon name, you might look through the whole collection – and notice that they have generic ones like “#1 FRIEND” (in case you really don’t find your friend’s name), and, sometimes, generic ones in Spanish.

Who can resist a Las Vegas souvenir mug with “AMIGA” on it? Especially if you can get “AMIGA” and “LORRAINE” together at twice the price?

Note to self: I travel too much.

Commodore BASIC as a Scripting Language for UNIX and Windows – now Open Source

Update: The source is available at

Attention Slashdot crowd, here is a little background:

This application is a recompiled version of the original Commodore 64 binary – it is not a reimplementation, so while it runs at pretty much the maximum possible speed, it is still 100% compatible. The huge C file in the archive has been produced by feeding the original 6502 code into my static recompiler and optimizing it with LLVM. The original operating system interface (character I/O, LOAD, SAVE etc.) has been reimplemented in native C, so Commodore BASIC interfaces nicely with OS X/Windows/Unix – you can use pipe I/O, and you can pass the filename of a BASIC program on the command line.

Yes, you could also just run a standard C64 emulator, but it wouldn’t be nearly this speed, and everything would run inside a sandbox; and there would be no way to interface this to your OS.

A while back, I released Apple I BASIC and Commodore BASIC as a scripting language for Mac OS X 10.5 on Intel. It did not work on any other OS or on a different CPU type.

Today, we are releasing Commodore BASIC as a Scripting Language – it works on Linux, Windows, Mac OS X 10.4/10.5 (Intel and PowerPC), and you even get the source, so you can adapt it to other operating systems and CPUs.

Download it here:

The archive comes with binaries for Mac OS X and Windows. The source compiles on Linux, Windows and Mac OS X. All code is BSD-licensed. Main work by Michael Steil; speed optimizations, Linux and Windows fixes by James Abbatiello.

The core of the BASIC interpreter is in the file cbmbasic.c, which is platform, endianness and bitness independent. For all I/O, it calls out into runtime.c, so it should be possible to adapt this project for any OS by just changing runtime.c.

All function calls that the core interpreter can’t handle end up in kernal_dispatch() in runtime.c, where a switch statement dispatches these to C functions. For Commodore BASIC, runtime.c has to support several Commodore KERNAL library routines. Some of them are very important (CHRIN, CHROUT) and some are only used for certain BASIC statements (LOAD, SAVE, OPEN, SETTIM). runtime.c does not implement all functions yet.
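To illustrate the dispatch idea in a few lines, here is a minimal sketch in Python rather than C. The two addresses are the standard C64 KERNAL jump table entries for CHRIN and CHROUT; the handler names, the `cpu` dictionary and the EOF handling are assumptions made for this illustration and are not taken from runtime.c.

```python
import io

# Standard C64 KERNAL jump table addresses.
CHRIN = 0xFFCF
CHROUT = 0xFFD2

def chrout(cpu, stdin, stdout):
    """Write the character in the accumulator to the host's output."""
    stdout.write(chr(cpu["a"]))

def chrin(cpu, stdin, stdout):
    """Read one character from the host's input into the accumulator."""
    ch = stdin.read(1)
    cpu["a"] = ord(ch) if ch else 0x0D  # assumption: treat EOF as RETURN

# Hypothetical dispatch table standing in for the C switch statement.
HANDLERS = {CHRIN: chrin, CHROUT: chrout}

def kernal_dispatch(pc, cpu, stdin, stdout):
    """Route a call the core interpreter cannot handle to a native routine."""
    handler = HANDLERS.get(pc)
    if handler is None:
        raise NotImplementedError(f"unsupported KERNAL call at ${pc:04X}")
    handler(cpu, stdin, stdout)

cpu = {"a": ord("A")}
out = io.StringIO()
kernal_dispatch(CHROUT, cpu, io.StringIO(), out)
dispatch_output = out.getvalue()
```

Supporting another KERNAL routine then means adding one handler and one table entry, which mirrors adding one case to the switch statement.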

Feel free to port cbmbasic to your system and architecture of choice, and extend runtime.c to support more functions. If you like, send your changes back to us (mist64<at>mac<dot>com, abbeyj<at>gmail<dot>com), so we can update the main project and give your changes to everyone. We’re also interested in how fast you can get it with different compilers and settings.

See my old article for more info as well as some insights on how it is done.

Create your own Version of Microsoft BASIC for 6502

Update: The source is available at

If you disassemble a single binary, you can never tell why something was done in a certain way. If you have eight different versions, you can tell a lot. This episode of “Computer Archeology” is about reverse engineering eight different versions of Microsoft BASIC 6502 (Commodore, AppleSoft etc.), reconstructing the family tree, and understanding when bugs were fixed and when new bugs, features and easter eggs were introduced.

This article also presents a set of assembly source files that can be made to compile into a byte exact copy of seven different versions of Microsoft BASIC, and lets you even create your own version.

Microsoft BASIC for MOS 6502

First written in 1976, Microsoft BASIC for the 8 bit MOS 6502 has been available for virtually every 6502-based computer including the Commodore series (PET, C64), the Apple II series, Atari 8 bit machines, and many more.

These are the first eight versions of Microsoft BASIC:

Name               Release  VER        ROM  FP  ROR  Buffer  Extensions  Version
Commodore BASIC 1  1977                Y    9   Y    ZP      CBM         1.0
OSI BASIC          1977     1.0rev3.2  Y    6   Y            -           1.0a
AppleSoft I        1977     1.1        N    9   Y    0200    Apple       1.1
KIM BASIC          1977     1.1        N    9   N    ZP      -           1.1a
AppleSoft II       1978                Y    9   Y    0200    Apple       2
Commodore BASIC 2  1979                Y    9   Y    0200    CBM         2a
KBD BASIC          1982                Y    6   Y    0700    KBD         2b
MicroTAN           1980                Y    9   N    ZP      -           2c

Name: Name of the computer system or BASIC interpreter

Release: Release date of this version – not necessarily the date when the source code was forked from Microsoft’s

VER: Version string inside the interpreter itself

ROM: Whether the software shipped in ROM, or was a program supposed to be loaded into RAM

FP: Whether the 6 digit or 9 digit floating point library was included. 9 digit also means that long error messages were included instead of two character codes, and the GET statement was supported.

ROR: Whether the ROR assembly instruction was used or whether the code worked around it

Buffer: Location of the direct mode input buffer; either zero page or above

Extensions: What BASIC extensions were added by the OEM, if any.

Version: My private version number used in this article and in my combined source

The Microsoft BASIC 6502 Combined Source Code

Download the assembly source code here:

In order to assemble it, you will need the CC65 compiler/assembler/linker package.

The source can be assembled into byte-exact versions of the following seven BASICs:

  • Commodore BASIC 1
  • OSI BASIC
  • AppleSoft I
  • KIM-1 BASIC
  • Commodore BASIC 2 (PET)
  • Intellivision Keyboard Component BASIC
  • MicroTAN BASIC

You can build the source by running the included shell script. This will create the seven files cbmbasic1.bin, osi.bin, applesoft.bin, kb9.bin, cbmbasic2.bin, kbd.bin and microtan.bin in the “tmp” directory, which are identical to the original ROMs.

You are welcome to help clean up the source more, to make it more readable and to break features out into CONFIG_* defines, so that the source base can be made more customizable.

Make sure to read on to the end of the article, as it explains more about the source and what you can do with it.

Microsoft BASIC 1

Ric Weiland, Bill Gates and Monte Davidoff at Microsoft wrote MOS 6502 BASIC in the summer of 1976 by converting the Intel 8080 version. While the 8080 version fit well into 8 KB, so that a computer manufacturer could add some machine-specific I/O code and ship a single 8 KB ROM, code density was lower on the 6502, and they could not fit it significantly below 8 KB – it was around 7900 bytes – so that computers with BASIC in ROM would require more than a single 8 KB ROM chip.

Spilling over 8 KB anyway, they decided to also offer an improved version with extra features in a little under 9 KB: This version had a 40 bit floating point library (“9 digits”) instead of the 32 bit one (“6 digits”), and the two-character error codes were replaced with actual error messages:

6 digit BASIC 9 digit BASIC

9 digit BASIC also added support for the GET statement to read single keystrokes from the keyboard.

On startup, Microsoft BASIC 6502 asks for the size of memory:


If the user just presses return, BASIC detects the size of memory itself. If, on the other hand, the user enters “A”, it prints:


Versions since 1.1 print:


Then it asks:


Microsoft’s codebase could also be assembled either for use in ROM or in RAM: The RAM version additionally asks:


These four functions are located at the very end of the interpreter image (actually, the init code is at the very end, but that gets overwritten anyway). Answering “N” sets the start of BASIC RAM to the beginning of the SIN/COS/TAN/ATN code, which makes up to 250 more bytes available for the BASIC program; answering “A” overwrites ATN only, in which case the user gains about 100 extra bytes.

All these questions were very similar to the ones presented on an Intel 8080 BASIC system – after all, BASIC 6502 was a direct port.

The start message looks something like this:


Microsoft’s codebase was very generic and didn’t make any assumptions about the machine it was running on. A single binary image could run on any 6502 system, if the start of RAM was set correctly, the calls to “MONRDKEY”, “MONCOUT”, “LOAD” and “SAVE” were filled with pointers to the machine-specific I/O code, and the “ISCNTC” function was filled with code to test for Ctrl+C.

Microsoft maintained this source tree internally and, at different points in time, handed their current version of the source to OEMs, which adapted and/or extended it for their machines. While most OEM versions were heavily modified in their user interaction (startup screen, line editing…), most of the code was very similar; some functions were never changed in any version of BASIC. No OEM ever came back to Microsoft for updates, except for Apple and Commodore, which both synced once each, up to the bugfixed version 2.

Commodore BASIC 1 (1.0)

The BASIC that shipped with the first Commodore PET in 1977 is the oldest known version of Microsoft BASIC for 6502. It does not say “Microsoft” anywhere, and memory size detection and screen width were hardcoded, so on startup, it just prints *** COMMODORE BASIC ***, followed by the number of bytes available for BASIC.

Commodore added the “OPEN”, “CLOSE”, “PRINT#”, “INPUT#” and “CMD” statements for file I/O and added VERIFY to compare a program in memory to a file on a storage device. They also added “SYS” to call into assembly code – Microsoft’s code had only provided the “USR” function with a similar purpose. It seems Commodore didn’t like the “OK” prompt, so they renamed it to “READY.”.

All machine-specifics were properly abstracted by calls into the KERNAL jump table, the upper 7 KB of the 16 KB ROM – except for one call out into the screen editor part of the PET ROM:

        lda     (INDEX),y
        jsr     LE7F3	; patch
        ldy     #$00
        asl     a
        adc     #$05
        adc     INDEX
        sta     INDEX
        bcc     L33A7
        inc     INDEX+1

This code fixes the garbage collector by doing the missing ldy/asl/adc in the patch code.

Speaking of patches: Commodore BASIC 1 has been binary patched a lot. There are six patch functions appended to the very end of the interpreter image that fix miscellaneous bugs. This is what one of the calls into a patch function looks like:

        jmp     PATCH1
        jmp     CONTROL_C_TYPED

Here is the patch function – someone indeed forgot to clear the carry flag:

        jmp     CONTROL_C_TYPED

Some of these patches are in generic code, and some in Commodore-specific code. Later fixes in generic code are not necessarily identical to these patches, which indicates that Commodore wrote the fixes. But it is unknown why these additions were done in the binary as opposed to the source: Commodore had the source and made lots of additions to it. Maybe it was just more convenient to patch the binary for debugging at some point.

Ohio Scientific (1.0a)

Ohio Scientific sold a wide series of 6502-based machines for several years, but they all shipped with the same version of 6 digit BASIC bought from Microsoft in 1977.

6 digit vs. 9 digit was probably a compile time option, because the differences are pretty straightforward, as can be seen in this example:

; ----------------------------------------------------------------------------
; ----------------------------------------------------------------------------
        adc     ARGEXTENSION
        sta     FACEXTENSION
        lda     FAC+4
        adc     ARG+4
        sta     FAC+4
        lda     FAC+3
        adc     ARG+3
        sta     FAC+3
        lda     FAC+2
        adc     ARG+2
        sta     FAC+2
        lda     FAC+1
        adc     ARG+1
        sta     FAC+1
        jmp     NORMALIZE_FAC5

Ohio Scientific only made minimal adaptations for their computers, and added no extensions. Their BASIC asks for memory size and terminal width, and then prints “OSI 6502 BASIC VERSION 1.0 REV 3.2”.

One quirk on the Ohio Scientific is the inclusion of the WANT SIN-COS-TAN-ATN string, although BASIC ran in ROM. The code to print this string and adjust memory layout accordingly is not included. OSI BASIC is 7906 bytes in size. Without the extra string, they could have saved 21 bytes.

The string Garbage Collector was horribly broken in OSI BASIC, effectively destroying all string data – in Commodore BASIC 1, it had been binary patched to fix the problem.

AppleSoft I (1.1)

Apple shipped the first Apple II systems with Integer BASIC in ROM; Microsoft BASIC was only available as an option loaded from disk or tape. AppleSoft BASIC, as it was named, had only minor adaptations and extensions. On startup, it printed:


In order to make AppleSoft feel more like Integer BASIC, it showed a ‘]’ character instead of “OK” and said “ERR” instead of “ERROR”.

The memory size easter egg was modified in this version: it printed COPYRIGHT 1977 BY MICROSOFT CO instead of Weiland’s and Gates’ names. Since the Apple II character output code ignored the uppermost bit, this text could be hidden in ROM by setting the MSB of every character:

.;287F C3 CF D0 D9 D2 C9 C7 C8 "COPYRIGH"
.;2887 D4 A0 B1 B9 B7 B7 A0 C2 "T 1977 B"
.;288F D9 A0 CD C9 C3 D2 CF D3 "Y MICROS"
.;2897 CF C6 D4 A0 C3 CF 0D 00 "OFT CO."
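As a quick check of the hex dump, here is a small sketch that clears bit 7 of every byte and stops at the terminating zero. The bytes are copied from the dump; the function name is made up for this example.

```python
# Bytes copied from the hex dump above.
DUMP = bytes.fromhex(
    "C3 CF D0 D9 D2 C9 C7 C8 "
    "D4 A0 B1 B9 B7 B7 A0 C2 "
    "D9 A0 CD C9 C3 D2 CF D3 "
    "CF C6 D4 A0 C3 CF 0D 00"
)

def strip_high_bit(data):
    """Clear bit 7 of every byte and stop at the terminating zero byte."""
    chars = []
    for b in data:
        if b == 0x00:
            break
        chars.append(chr(b & 0x7F))
    return "".join(chars)

hidden = strip_high_bit(DUMP)
```

The result is the copyright text followed by a carriage return ($0D), which is the only non-zero byte in the dump without its high bit set.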

This version introduced another easter egg present in all later versions: BASIC 1.1 was the first version to include the “MICROSOFT!” easter egg text, as described in a previous article. The encoded (XOR 0x87) text was hidden in some floating point constants and never addressed.
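The XOR obfuscation itself is easy to reproduce. The following sketch only demonstrates the encoding with the key 0x87; the order in which the bytes are laid out inside the floating point constants is not modeled here, and the helper name is an assumption.

```python
KEY = 0x87  # XOR key mentioned in the text

def xor_bytes(data, key=KEY):
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

plain = b"MICROSOFT!"
encoded = xor_bytes(plain)
decoded = xor_bytes(encoded)
```

Because every ASCII character has bit 7 clear and the key has bit 7 set, all encoded bytes end up at $80 or above, so they do not look like text in a plain ASCII dump.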

AppleSoft I is the oldest known BASIC 1.1. Compared to 1.0, version 1.1 included minor bugfixes in GET/INPUT/READ, TAB() and LIST, as well as the fix in the Garbage Collector present in the Ohio Scientific machines and binary patched in Commodore BASIC 1.

BASIC 1.0 also had a bug where lines in direct mode that started with a colon were ignored:

        jsr     CHRGET
.ifdef CONFIG_11
        beq     L2351

CHRGET is supposed to set the zero flag on the end of an instruction, which can be end of line (0 character) or a colon. The original code wanted to check for an empty line and got the first character, and went on reading another line if it was empty – but a colon as the first character had the same effect. 1.1 fixed this by setting the flags on the value again.

Version 1.1 also contained various tiny speed optimizations: BEQs and BNEs were changed so that a cycle could be saved on the more likely case.

Here is another optimization in LEFT$/RIGHT$/MID$:

.ifndef CONFIG_11
        sta     JMPADRS+1
        sta     JMPADRS+2
        sta     Z52
.ifdef CONFIG_11
        lda     Z52
        ldy     #$00
.ifndef CONFIG_11
        inc     JMPADRS+1
        jmp     (JMPADRS+1)

The original code isn’t only suboptimal, it’s even dangerous, because it only increments the low byte of the address it wants to jump to and assumes it doesn’t roll over to $00.
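To make the rollover concrete, here is a small sketch that compares an increment of the low byte only with a full 16 bit increment. The example addresses are arbitrary.

```python
def inc_low_byte_only(addr):
    """Mimic `inc JMPADRS+1`: bump the low byte, never touch the high byte."""
    low = (addr + 1) & 0xFF
    return (addr & 0xFF00) | low

def inc_16bit(addr):
    """A correct 16 bit increment for comparison."""
    return (addr + 1) & 0xFFFF

safe = inc_low_byte_only(0x1234)     # low byte does not wrap
wrapped = inc_low_byte_only(0x12FF)  # low byte wraps to $00
correct = inc_16bit(0x12FF)
```

With a low byte of $FF, the short version jumps to the start of the same page instead of the start of the next one.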

For some reason, the random number seed was changed slightly:

    .ifdef CONFIG_11
        .byte   $80,$4F,$C7,$52,$58
        .byte   $80,$4F,$C7,$52,$59

But this doesn’t make a difference, due to a bug present in all 9 digit versions of BASIC: The value is copied into the zero page together with the CHRGET routine:

        lda     GENERIC_CHRGET-1,x
        sta     CHRGET-1,x
        bne     L4098

On 9 digit BASIC, one extra byte had to be copied, but the start index was not changed, so the last byte of the seed was omitted. This bug exists in every known version of Microsoft BASIC.
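Here is a sketch of why the seed change has no effect. The two seeds are the byte sequences from the listing; the routine length and the placeholder routine bytes are assumptions, since only the byte count relative to the seed matters.

```python
CHRGET_LEN = 24  # assumption: illustrative length, not the real routine size

SEED_A = bytes([0x80, 0x4F, 0xC7, 0x52, 0x58])
SEED_B = bytes([0x80, 0x4F, 0xC7, 0x52, 0x59])

def copy_to_zero_page(routine, seed, seed_bytes_copied=4):
    """Copy routine plus seed using a count sized for a 4 byte seed."""
    source = routine + seed
    count = len(routine) + seed_bytes_copied
    return source[:count]

routine = bytes(CHRGET_LEN)  # placeholder bytes standing in for CHRGET
zp_a = copy_to_zero_page(routine, SEED_A)
zp_b = copy_to_zero_page(routine, SEED_B)
```

Both copies are identical, because the two seeds differ only in the fifth byte, which never reaches the zero page.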

Another bug was introduced on the Apple II: All previous versions of BASIC had the input buffer for instructions in direct mode in the zero page. On the Apple II, it was at $0200 in RAM, which broke some code that made assumptions about the address:

        jsr     ISCNTC   ; check for Ctrl + C
        lda     TXTPTR
        ldy     TXTPTR+1 ; high-byte of instruction pointer
        beq     L2683    ; 0 -> direct mode
        sta     OLDTEXT
        sty     OLDTEXT+1

Subsequent versions of BASIC compared the high byte of the text pointer:

        cpy     #>INPUTBUFFER
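The difference between the two checks can be sketched as follows. The buffer address $0200 is from the text; the zero page buffer address and the program address are arbitrary values chosen for illustration.

```python
def is_direct_mode_old(txtptr):
    """Original test: the high byte of the text pointer is zero."""
    return (txtptr >> 8) == 0

def is_direct_mode_new(txtptr, inputbuffer):
    """Later test: the high byte matches the high byte of INPUTBUFFER."""
    return (txtptr >> 8) == (inputbuffer >> 8)

ZP_BUFFER = 0x0013     # assumption: some zero page location
APPLE_BUFFER = 0x0200  # buffer location on the Apple II
PROGRAM = 0x0801       # assumption: some address inside a stored program

zp_old = is_direct_mode_old(ZP_BUFFER + 2)
apple_old = is_direct_mode_old(APPLE_BUFFER + 2)
apple_new = is_direct_mode_new(APPLE_BUFFER + 2, APPLE_BUFFER)
program_new = is_direct_mode_new(PROGRAM, APPLE_BUFFER)
```

With the buffer at $0200, the old test never recognizes direct mode, while the comparison against the buffer's high byte works for both layouts.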

KIM-1 (1.1a)

The KIM-1 is a computer kit based around the MOS 6502, which was sold by the makers of the 6502 to show off the capabilities of this CPU. A 6 digit and a 9 digit version of Microsoft BASIC were available on tape, but the 6 digit version seems to be very rare. BASIC for the KIM-1 is the most authentic version of Microsoft BASIC, because it has only been minimally modified: it contains all questions about memory size, screen width, and the trigonometric functions, as well as the memory size easter egg. The encoded “MICROSOFT!” string can be found at the same spot as on the Apple II.

Although this is based on BASIC 1.1, just like AppleSoft I, there are a few fixes in array handling and the PRINT statement.

But they also introduced another bug: In input handling, again concerning the location of the input buffer, there is the following code:

        ldx     #<(INPUTBUFFER-1)
        ldy     #>(INPUTBUFFER-1)
        bne     L2AF8	; always

This code has been in place since 1.0 and assumes that INPUTBUFFER is above $0100. On the CBM1, which had the input buffer in the zero page, this had been hotfixed by Commodore by swapping the ldx and the ldy. On the OSI, this code didn’t exist, as it is only included in versions that have the GET statement, i.e. 9 digit versions. AppleSoft I was not affected either, because it had the input buffer at $0200. And versions after the KIM fixed this by replacing the BNE with a BEQ in case the input buffer is in the zero page. It is obviously hard to maintain a single codebase with many compile time options that still does optimizations like these.
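The "branch always" trick depends on the zero flag left behind by the last load. A minimal sketch of that dependency, with a hypothetical zero page buffer address:

```python
def bne_taken(inputbuffer, swap_loads=False):
    """Return True if the BNE after the two loads is taken.

    The zero flag reflects the last value loaded: the high byte in the
    original order (ldx, ldy), the low byte if the loads are swapped.
    """
    addr = (inputbuffer - 1) & 0xFFFF
    low, high = addr & 0xFF, addr >> 8
    last_loaded = low if swap_loads else high
    return last_loaded != 0

above_page_one = bne_taken(0x0200)
zero_page = bne_taken(0x0013)                        # assumption: example address
zero_page_swapped = bne_taken(0x0013, swap_loads=True)
```

With the buffer in the zero page, the high byte of INPUTBUFFER-1 is zero and the branch falls through; swapping the loads makes the non-zero low byte set the flags instead.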

Since the first KIM-1 systems shipped in late 1975, their CPUs had the 6502 ROR bug, so KIM-1 BASIC had to work around this: Every ROR instruction is replaced by a corresponding sequence using LSR instead.
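The article does not show the replacement sequence, so the following is only a sketch of the idea: a rotate right through carry gives the same result as a logical shift right followed by putting the old carry into bit 7. The function names are made up for this example.

```python
def lsr(value):
    """Logical shift right: returns (result, carry_out)."""
    return value >> 1, value & 1

def ror(value, carry_in):
    """Reference semantics of a working ROR: rotate right through carry."""
    return (value >> 1) | (carry_in << 7), value & 1

def ror_via_lsr(value, carry_in):
    """Emulate ROR with LSR, then set bit 7 if the carry was set before."""
    result, carry_out = lsr(value)
    if carry_in:
        result |= 0x80
    return result, carry_out

matches = all(
    ror(v, c) == ror_via_lsr(v, c) for v in range(256) for c in (0, 1)
)
```

The emulation needs an extra test of the incoming carry, which is why the workaround costs both bytes and cycles compared to a single ROR.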

AppleSoft II (2.0)

AppleSoft II is the oldest version of Microsoft BASIC 2. It was available on tape or disk, and also in ROM in later Apple II models. It is the first BASIC from an OEM that had already extended BASIC and then re-synced with Microsoft’s codebase. In other words: Apple licensed an improved and bugfixed version of BASIC, and merged their old changes into it.

BASIC 2 contains mostly bugfixes (all input buffer location bugs have finally been eliminated), small optimizations (reuse two adjacent zeros inside the floating point constant of 1/2 as the 16 bit constant of zero instead of laying it down separately), better error handling for DEF FN, and support for “GO TO” with a space in between as a synonym for GOTO. Also, the memory test pattern has been changed from $92/$24 to the more standard $55/$AA.

In AppleSoft II, Apple also eliminated the “memory size” and “terminal width” questions.

Commodore BASIC 2 (2.0a)

Just like Apple, Commodore went back to Microsoft for an updated version of BASIC, and integrated its changes into the new version. The version they got was slightly newer than Apple’s, but the major difference was that Microsoft added the “WAIT 6502” easter egg. For this, they changed the encoding of the string “MICROSOFT!” that was hidden in every BASIC since 1.1 from XORed ASCII into PETSCII with the upper two bits randomly set – this way, the text would be just as obfuscated, but the decoder would be shorter on PET systems. So Commodore BASIC 2 is the only version of Microsoft BASIC that ever accesses this hidden text.

Every version since 2.0a had the PETSCII version of the “MICROSOFT!” text in it – and so did every version of BASIC for 6809.

Intellivision Keyboard Component BASIC (2.0b)

The Mattel Intellivision is a game console released in 1980 that contained a very nonstandard 16 bit “CP1610” CPU. After a series of delays, the “Keyboard Component”, an extension with its own 6502 CPU and Microsoft BASIC, was released in 1982, but canceled very soon. Units are very rare today.

The BASIC in the Keyboard Component is the most custom of all known versions. It is based on a 6 digit version of BASIC 2 and younger than Commodore BASIC 2: It contains two bugfixes: One piece of code that pulled its caller’s address from the stack and normalized it by adding one, had forgotten to respect the carry, so this could fail if the caller sits just on a page boundary. The other fix changed the number of steps needed for normalizing a floating point number.

Intellivision BASIC replaced LOAD and SAVE by PLOD, PSAV, VLOD, VSAV and SLOD; PRT, GETC and VER were added; and PEEK, POKE and WAIT were removed. But the customizations were even more extensive: Instead of keeping the interface to library code, a lot of code was replaced inline, and the whole init code was rewritten. While most of the generic code, for example memory handling, was unchanged across Commodore, Ohio, AppleSoft and KIM, making it easier to later integrate Microsoft’s fixes, some of even this code was altered on the Keyboard Component.

What is interesting about the strings in Intellivision BASIC is that they use both upper- and lower case. The start message is this:

Copyright Microsoft, Mattel  1980

But upper-/lowercase support doesn’t stop here: The complete code has been extended to be case insensitive, but case preserving. The CHRGET code, a super-optimized function living in the zero page, has been patched with a call to this function:

        cmp     #'a'
        bcc     LF43A
        cmp     #'z'+1
        bcs     LF43A
        sbc     #$1F

This very unoptimized piece of code adds at least 17 cycles to every CHRGET, and will slow down execution measurably.
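One detail of this routine is easy to miss: the SBC subtracts $1F, not $20. Here is a sketch of the flag logic, modeling only the instructions shown above. It relies on standard 6502 behavior: CMP sets the carry if the accumulator is greater than or equal to the operand, and SBC subtracts one more if the carry is clear.

```python
def cmp_carry(a, operand):
    """CMP sets the carry flag if A >= operand."""
    return 1 if a >= operand else 0

def sbc(a, operand, carry):
    """SBC subtracts the operand plus the inverted carry (the borrow)."""
    return (a - operand - (1 - carry)) & 0xFF

def fold_case(a):
    """Model of the patch: map 'a'..'z' to 'A'..'Z', leave the rest alone."""
    if not cmp_carry(a, ord("a")):       # bcc: below 'a'
        return a
    carry = cmp_carry(a, ord("z") + 1)
    if carry:                            # bcs: above 'z'
        return a
    return sbc(a, 0x1F, carry)           # carry is clear here, so A - $20

folded = "".join(chr(fold_case(ord(c))) for c in "Print a{1")
```

Since the carry is known to be clear when the SBC is reached, subtracting $1F with a borrow gives the intended difference of $20 without spending a SEC instruction.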

Microtan BASIC (2.0c)

The version of BASIC that shipped on the Tangerine MICROTAN 65 is, like the Ohio and KIM versions, again a very authentic version with few changes. The updated BASIC 2 contained a single bug fix: the floating point constant -32768 hadn’t been updated from 6 to 9 digits correctly and was missing a byte. The startup message looks like this:


Microtan BASIC contains the complete “memory size” and “terminal width” procedures and the “Weiland/Gates” easter egg.

Although the Microtan was introduced in 1980, its version of BASIC was, like the KIM version, assembled with code that worked around the ROR bug present in 6502 chips made until mid-1976. The I/O library on the other hand made use of ROR, suggesting that this compile time option was set in error.

Bugs never fixed

As you can see, the first versions had many bugs that were quickly fixed, but fixes became fewer and fewer – simply because there were only very few bugs left. But still there are some bugs that never got fixed. The short copy of the random number seed, for example, exists in all versions.

Similarly, the two extra constants used for generating random numbers (CONRND1, CONRND2) are 4 bytes in all versions, which is one byte short for 9 digit BASIC. But this is another bug that doesn’t really matter, since the numbers will still be random enough.

The buggy check on large line numbers has also never been fixed. Typing 35072121 into any version of Microsoft BASIC will have the interpreter jump to a pseudo random memory address. The buggy code resides in “LINGET”.

Something similar happens in the case of PRINT 5+"A"+-5: The interpreter will build up the formula on the CPU stack, but miss the string/float type mismatch because of the “+-”, and messes up its stack when removing items. This bug is in “FRMEVL”.

But the fact that Microsoft never fixed these bugs in their codebase doesn’t mean none of the OEMs fixed them. While the LINGET and FRMEVL bugs seem to have gone unnoticed everywhere, the CONRND1/CONRND2 bug has been fixed by Commodore, at least as early as for the VIC-20 in 1980.

How to build your own BASIC

Now that you have the source that can build seven different OEM versions of Microsoft BASIC, and that you know about the differences between those, you might be interested in building your own version of BASIC 6502 for some 6502-based machine or customizing BASIC to build a bugfixed or extended version for some platform.

First duplicate one of the cfg files, and add it to the build script. cbmbasic2 is a good start, as you can quite easily test the resulting images in the VICE emulator – CC65 can even provide symbol information for the VICE debugger. Add a case in defines.s to define one of CBM1, CBM2, APPLE etc., because you need one flavour of platform specific code, and include your own defines_*.s. For Commodore BASIC, you also need to define CONFIG_CBM_ALL.

If you are targeting a new type of computer, make sure to adjust the zero page locations in your defines_*.s file (ZP_STARTn) so that they don’t clash with your I/O library. Also make sure that, in case you are compiling for RAM, the init code does not try to detect the memory size and overwrite itself.

The CONFIG_n defines specify what Microsoft-version the OEM version is based on. If CONFIG_2B is defined, for example, CONFIG_2A, CONFIG_2, CONFIG_11A, CONFIG_11 and CONFIG_10A will be defined as well, and all bugfixes up to version 2B will be enabled. The following symbols can be defined in addition:


jump out into CBM1’s binary patches instead of doing the right thing inline

add all Commodore-specific additions except file I/O


include the CBM2 “WAIT 6502” easter egg

support Commodore PRINT#, INPUT#, GET#, CMD

all I/O has bit #7 set

Y needs to be preserved when calling MONCOUT

terminal doesn’t need explicit CRs on line ends

disable support for Microsoft-style “@”, “_”, BEL etc.

don’t support PEEK, POKE and WAIT

don’t do a very volatile trick that saves one byte

support for the NULL statement (send sync 0s for serial terminals)

preserve LINNUM on a PEEK

whether PRINTNULLS does anything

print CR when line end reached

optimizations for RAM version of BASIC, only use on 1.x

use workaround for buggy 6502s from 1975/1976; not safe for CONFIG_SMALL!

check both bytes of the caller’s address in NAMENOTFOUND

where in the init code to call SCRTCH

use 6 digit FP instead of 9 digit, use 2 character error messages, don’t have GET

Changing symbol definitions can alter an existing base configuration, but it is not guaranteed to assemble or work correctly.
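The way the CONFIG_n version symbols imply each other can be sketched like this. The order up to CONFIG_2B follows the example in the text; CONFIG_2C is an assumption based on version 2c in the table, and the function is purely illustrative, since the real source does this with assembler conditionals.

```python
# Oldest to newest. CONFIG_2C is an assumption derived from version "2c".
CHAIN = [
    "CONFIG_10A",
    "CONFIG_11",
    "CONFIG_11A",
    "CONFIG_2",
    "CONFIG_2A",
    "CONFIG_2B",
    "CONFIG_2C",
]

def expand(symbol):
    """Return the symbol together with every older version symbol."""
    index = CHAIN.index(symbol)
    return set(CHAIN[: index + 1])

enabled = expand("CONFIG_2B")
```

A bugfix introduced in, say, 1.1a can then be guarded by a single check for CONFIG_11A and is automatically included in every later version.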

I am very interested in your creations. Please add a comment to this article if you have made something new out of this source base!

Using the Floating Point Library Standalone

The complete project has been split into many components, each in their own assembly source file. The core floating point library is in float.s, extra trigonometric functions are in trig.s. It should not be too hard to use this broken-out part (in a 6 digit or 9 digit version) standalone in your own creations. The 9 digit version is a little over 2 KB in size, the 6 digit version is a little smaller.

Adding More Versions

If you want to add another version of BASIC into the source base, you can do it like this: Use “da65” from the CC65 package to disassemble your version of BASIC and all existing .bin files (with the correct base addresses), and run a “diff” command on the new disassembly and each of the disassemblies of the existing versions. The diff that contains the fewest changes (just look at the file size) is probably a good candidate to base your new version on. Or look at the release date or the family tree to find a version which is similar.
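If you would rather script the comparison than eyeball file sizes, something like the following sketch would do. It uses Python's difflib instead of the diff command, and the tiny listings and version names are made-up placeholders, not real disassemblies.

```python
import difflib

def diff_size(a_lines, b_lines):
    """Count the lines that differ between two disassembly listings."""
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)
    return changed

def closest_base(new_listing, existing):
    """Return the name of the existing version with the smallest diff."""
    return min(existing, key=lambda name: diff_size(existing[name], new_listing))

# Hypothetical miniature listings for illustration only.
existing = {
    "cbmbasic2": ["lda #$00", "sta $01", "rts"],
    "kb9": ["lda #$00", "ldx #$10", "stx $02", "rts"],
}
new_listing = ["lda #$00", "sta $01", "nop", "rts"]

best = closest_base(new_listing, existing)
```

For real listings you would read the da65 output files into lists of lines first; the ranking logic stays the same.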

Now create a new version in the source base, as described earlier. Make sure the new version assembles; then compare the disassembly of your version with the disassembly of the original binary in a diff program, like the excellent Mac OS X FileMerge, to find the differences. In most cases, you will only have to adjust a few defines (CONFIG_* and zero page locations) in your defines_*.s file to get matching output. Otherwise, add ifdefs to the respective source files. Rebuild all versions to verify that you didn’t break the other ones.

Repeat the last step until the assembly process outputs the same file. Send your changes to me. :-)

Note that the idea of all versions of BASIC in the current source code is that they are all direct forks from Microsoft’s codebase. I chose not to include versions like Commodore BASIC 4, Commodore BASIC 2 for the VIC-20/C64 etc., and I wouldn’t add very late AppleSoft versions, because these are only extended versions of earlier forks and contain no extra code from the original Microsoft source base. Versions that would be very interesting to integrate would be AppleSoft II and Atari Microsoft BASIC, preferably the very first revisions of these.


  • Function names and all uppercase comments taken from Bob Sander-Cederlof’s excellent AppleSoft II disassembly
  • AppleSoft lite by Tom Greene helped a lot, too.
  • Thanks to Joe Zbicak for his help with Intellivision Keyboard BASIC
  • This work is dedicated to the memory of my dear hacking pal Michael “acidity” Kollmann.

Building the Solaris Kernel in 73 Easy Steps

Everyone and their grandmother builds Linux kernels. Many people build BSD, and some brave men even compile the OS X kernel every now and then. Why not compile your own Solaris kernel for a change?

There is lots of documentation scattered out there, in many pieces: incomplete, outdated and over-generalized tutorials. This post will walk you through installing Solaris, adding all components required for building, and actually compiling a kernel in 73 easy steps.

I won’t give you any options, because options make everything more complicated. We’re installing Solaris on a dedicated machine, the versions of the build system and the target kernel/system will match and we’re targeting x86/x64 only.

  1. Get a physical computer with maybe a GB of RAM and significantly more than 10 GB of disk space. VMware and VirtualBox seem to have issues with current builds (b97-b99).
  2. Navigate to
  3. Download the latest build of OpenSolaris Express Community Edition (Nevada). You can’t compile a kernel on any other Solaris distribution. Get Nevada. If you want to run a specific build, hack the URL and replace the build number; older builds are available, but not linked to.
  4. burn DVD
  5. boot DVD
  6. GRUB: “Solaris Express”
  7. select “Solaris Interactive”
  8. choose your language preference
  9. choose Networked
  10. DHCP yes
  11. IPv6 no
  12. Kerberos no
  13. Name Service: none
  14. NFS domain derived by the system
  15. choose your time zone
  16. enter your root password
  17. reboot yes, eject yes
  18. media CD/DVD
  19. accept the license
  20. custom install
  21. no localizations
  22. no additional products
  23. Entire Group, Default (~7 GB)
  24. select your install HD – remember that this is a dedicated disk that will be wiped!
  25. accept a single Solaris MBR partition with all the space
  26. modify the layout: remove /export/home and allocate all to “/”, keep swap
  27. wait
  28. the system will reboot. remove the DVD
  29. wait a long time again – the GUI login screen will show up eventually, don’t log in on the console
  30. log in as root
  31. (b99 will hang after a short “About Gnome” screen, hit Ctrl+Alt+Backspace to kill the X-Server and log in again)
  32. Administration -> Users and Groups, Add User
  33. log out
  34. log in as user
  35. find out what compiler you need: says you need Sun Studio 11 up to b99, and Sun Studio 12 starting with b100.
  36. click on the linked Sun Studio
  37. it might say “Chinese-Simplified” as language, but the file is correct.
  38. don’t use the download manager, just click on the file and choose “Save File”
  39. Navigate to
  40. choose the build that matches your installed system (don’t pick “current”, that’s newer!)
  41. download “ON Source”: on-src.tar.bz2
  42. download “ON Specific Build Tools (i386)”: SUNWonbld.i386.tar.bz2
  43. download “ON Binary-Only Components (debug, i386)”: on-closed-bins.i386.tar.bz2
  44. All Applications -> Accessories -> Terminal
  45. fix the prompt:
    echo "export PS1='\h:\W \u\$ '" >> .bashrc
  46. add onbld to the PATH:
    echo "PATH=/opt/onbld/bin:$PATH" >> .bashrc
  47. close the window and start a new terminal, or ssh into the machine; you can get your IP with
    /usr/sbin/ifconfig -a
  48. su
  49. bash
  50. cd /opt
  51. bzip2 -cd /export/home/username/Desktop/sunstudio*.tar.bz2 | tar xf -
  52. cd /export/home/username/Desktop
  53. bzip2 -cd SUNWonbld*.tar.bz2 | tar xf -
  54. pkgadd -d onbld SUNWonbld
  55. y
  56. close Terminal window, open a new one
  57. mkdir work
  58. cd work
  59. bzip2 -cd ~/Desktop/on-closed-bins*.tar.bz2 | tar xf -
  60. bzip2 -cd ~/Desktop/on-src*.tar.bz2 | tar xf -
  61. cd usr/src/tools
  62. mkdir proto
  63. ln -s /opt proto
  64. cd ../../..
  65. cp usr/src/tools/env/ .
  66. vi
  67. set GATE to “work” (base directory name)
  68. set CODEMGR_WS to “/export/home/username/$GATE” (full path)
  69. set STAFFER to your username
  70. set VERSION to “$STAFFER” to have your name in the kernel version
  71. bldenv ./
  72. cd usr/src/uts
  73. dmake all

(If it complains with “illegal option -m64”, your compiler is too old. Get Sun Studio 12. If anything fails, read the README from the download page for updates on the build you’re using.)

Now if we only knew what to do with that kernel!

The Xbox 360 Security System and its Weaknesses

After the disaster of the original Xbox, Microsoft put a lot of effort into designing what is probably the most sophisticated consumer hardware security system to date. We present its design, its implementation, its weaknesses, how it was hacked, and how to do it better next time.

Michael Steil has been involved with various embedded systems hacking projects, like the Xbox, the Xbox 360 and the GameCube. In 2006, he spoke at Google about the flaws in the security system of the original Xbox.

Felix Domke is the principal author of the Xbox 360 hack and the Linux port. He has also significantly contributed to hacking the dbox2, the GameCube and the Wii, and porting Linux to the respective platforms.

Apple I BASIC as a Mac OS X Scripting Language

Update: Commodore BASIC as a Scripting Language for UNIX and Windows – now Open Source

Recently, we reconstructed a perfect copy of Apple I BASIC, the first piece of software Apple ever sold – in 1976. Now it is time to make something useful out of it. Wouldn’t it be nice if you could use Apple I BASIC to replace, say, Perl? Wouldn’t it be nice if you could do this:

$ cat reverse.bas
#!/usr/bin/apple1basic
10 DIM A$(100)
20 INPUT A$
30 FOR I = LEN(A$) TO 1 STEP -1
40 PRINT A$(I,I);
50 NEXT I
70 END
$ chmod a+x reverse.bas
$ echo MICHAEL STEIL | ./reverse.bas

Here is Apple I BASIC as a scripting language for Mac OS X Intel:

Just yet another Apple I emulator for Mac? No, it is not. There are some very important differences:

  • The “apple1basic” executable is a statically recompiled version of the original binary. All code is running natively.
  • “apple1basic” plugs right into UNIX stdin and stdout.
  • You can pass “apple1basic” the filename of a BASIC program to run.
  • You can run BASIC programs like shell scripts.

Let’s play with it for a bit. First, copy “apple1basic” to /usr/bin:

$ sudo cp apple1basic /usr/bin

Let’s try direct mode first:

$ apple1basic

Now let’s write a small program:

$ apple1basic
>10 FOR I = 1 TO 10
>30 NEXT I
>40 END
        HELLO WORLD!
         HELLO WORLD!

We can tell apple1basic to run a BASIC program from a file, too:

$ cat hello.bas
10 FOR I = 1 TO 10
40 END
$ apple1basic hello.bas
        HELLO WORLD!
         HELLO WORLD!

apple1basic can be interactive:

$ cat name.bas
10 DIM N$(20)
30 INPUT N$
40 PRINT "HELLO, "; N$; "!"
50 END
$ apple1basic name.bas

apple1basic supports redirection of stdin and stdout. Note that if stdin is a pipe, “INPUT” doesn’t print the “?”:

$ cat reverse.bas
10 DIM A$(100)
20 INPUT A$
30 FOR I = LEN(A$) TO 1 STEP -1
40 PRINT A$(I,I);
50 NEXT I
70 END
$ echo MICHAEL STEIL | apple1basic reverse.bas

Which brings us back to our first example: You can even use apple1basic as a UNIX script interpreter by adding the hashbang as the first line:

$ cat reverse.bas
#!/usr/bin/apple1basic
10 DIM A$(100)
20 INPUT A$
30 FOR I = LEN(A$) TO 1 STEP -1
40 PRINT A$(I,I);
50 NEXT I
60 END
$ chmod a+x reverse.bas
$ echo MICHAEL STEIL | ./reverse.bas

Some more programs

These games, written for the Apple I in the 1970s, have been taken from this and other sites and converted from tokenized hexcode into ASCII.

You can also download hanoi-apple1.bas, a fixed version of Amit Singh’s BASIC program from his Hanoimania project.

How it is done

My static recompiler takes the Apple I BASIC binary as an input and produces a native executable. This file does not contain the original 6502 code and runs 100% natively. On a 2 GHz machine, it runs about 2000 times faster than a 1 MHz Apple I, so it can run one 6502 clock cycle per native cycle. (In the best case, an interpreter can only get up to 1/10 of an emulated cycle per native cycle.)
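To make the difference concrete, here is a minimal sketch in C of what statically recompiled code might look like. It is not the output of the recompiler described here; it is a hand translation of the nine bytes at offset 3 of the Apple I BASIC dump (ad 11 d0 10 fb ad 10 d0 60, i.e. LDA $D011 / BPL / LDA $D010 / RTS, the loop that waits for a key). The names io_read, read_key and poll_count, and the fake keyboard that becomes ready on the third poll, are assumptions made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the Apple I keyboard hardware:
   $D011 has bit 7 set once a key is waiting, $D010 holds the key. */
int poll_count = 0;

uint8_t io_read(uint16_t addr) {
    if (addr == 0xD011)
        return ++poll_count >= 3 ? 0x80 : 0x00;  /* ready on the third poll */
    if (addr == 0xD010)
        return 0xC1;                             /* 'A' with bit 7 set */
    return 0;
}

/* 6502 original:
     loop: LDA $D011
           BPL loop     ; branch while bit 7 (the N flag) is clear
           LDA $D010
           RTS
   Hand-written native equivalent (illustration only): */
uint8_t read_key(void) {
    uint8_t a;
    do {
        a = io_read(0xD011);
    } while (!(a & 0x80));
    a = io_read(0xD010);
    return a;
}
```

An interpreting emulator would instead fetch and decode each of these four opcodes every time around the loop, which is where its per-instruction overhead comes from.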

Accesses to the Apple I terminal (keyboard and screen) are handled by functions that implement a few hacks. Depending on the program counter and the stack, I can discard some output (like terminal echo, 80 column forced line wrap) and redirect files and stdin to keyboard input, depending on context. For example, if there is a filename on the command line, keyboard emulation passes these characters into BASIC, and adds the “RUN” command. All following input will be read from stdin, and as soon as the code to print the “>” prompt is detected, the output handler quits the application.
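The chaining of input sources could be sketched like this in C. This is only an illustration of the idea, not the actual handler: the struct, the stage names and the functions keyboard_init and keyboard_next are hypothetical, the program text is assumed to be already loaded into memory, and the conversion of newlines to carriage returns assumes that BASIC expects CR as the Enter key.

```c
#include <stdio.h>

/* Hypothetical input stages, in the order the paragraph describes. */
enum stage { FROM_PROGRAM, FROM_RUN, FROM_STDIN };

struct keyboard {
    enum stage stage;
    const char *program;  /* remaining text of the BASIC file, or NULL */
    const char *run;      /* remaining characters of the injected command */
};

void keyboard_init(struct keyboard *kb, const char *program) {
    kb->program = program;
    kb->run = "RUN\n";
    kb->stage = program ? FROM_PROGRAM : FROM_STDIN;
}

/* Return the next emulated key press, or EOF once stdin is exhausted.
   Newlines become carriage returns (assumption: CR is the Enter key). */
int keyboard_next(struct keyboard *kb, FILE *in) {
    int c;
    if (kb->stage == FROM_PROGRAM) {
        if (*kb->program) {
            c = (unsigned char)*kb->program++;
            return c == '\n' ? '\r' : c;
        }
        kb->stage = FROM_RUN;      /* file is done, type "RUN" next */
    }
    if (kb->stage == FROM_RUN) {
        if (*kb->run) {
            c = (unsigned char)*kb->run++;
            return c == '\n' ? '\r' : c;
        }
        kb->stage = FROM_STDIN;    /* everything after "RUN" comes from stdin */
    }
    c = fgetc(in);
    return c == '\n' ? '\r' : c;
}
```

Without a filename, keyboard_init(&kb, NULL) starts directly in the stdin stage, which corresponds to interactive use.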

Commodore BASIC / Microsoft BASIC 6502

I have done the same thing to the 9 KB Microsoft BASIC for 6502 taken from the Commodore 64 ROM.

$ cbmbasic

    **** COMMODORE 64 BASIC V2 ****



It has many of the same features as Apple I BASIC: It discards the banner and the “READY.” prompt output when running in non-interactive mode, accepts a BASIC file as a parameter, and can be used as a script interpreter.

But… why??

I love the idea of reusing old code. Emulation is nice, but it rarely integrates well into modern systems. Most code out there has either a very lightweight (or well-understood) interface to the hardware (like the Apple I) or the operating system (like Commodore BASIC, as well as all programs from the GUI era), so hooking it up with modern operating system services can work out very nicely.

Also, emulation of vintage systems rarely cares too much about performance any more, since it is usually fast enough, i.e. as fast as or faster than the original system. But computers have always been slow, and running very old code can have the advantage of working with really fast software.

And yes, I think there is a lot of useful vintage code out there.


1200 Baud Archeology: Reconstructing Apple I BASIC from a Cassette Tape

The audio file that was posted two weeks ago is indeed a very important artifact of computer history: It is a recording of the “Apple I BASIC” cassette tape that came with the Apple I. It is the first piece of software ever sold by Apple (not counting computer firmware).

Here is the first confirmed perfect dump of the 4096 bytes: apple1basic.bin

The Apple I is extremely rare. Only 200 were built, and fewer than 100 are believed to still exist. Neither Steve nor Woz own an Apple I any more, and neither does Apple Inc. (Update: Woz still owns some.) The cassettes are even rarer, as not every Apple I came with one. There was no dump of the tape until 2002, when Achim Breidenbach of Boinx Software got an MP3 recording of an original Apple 1 BASIC tape by Larry Nelson, an Apple I owner, and, with a little help from his father (who worked with an Apple I back in 1976), managed to decode it by writing a program – in GFA BASIC on an Atari ST. Here is a screenshot of the visualization this program could provide:

Achim wrote an Apple I emulator, included a commented disassembly of his Apple I BASIC dump, and published it on the internet. Other people continued working on the disassembly and changed instructions that they thought were mistakes in the dump. The only dump that can be easily found today includes these changes. It was time to analyze the tape again and get an authoritative dump of the 4096 bytes.

So here is how to decode the signal. Let us first open the audio file in Audacity and look at the waveform. The signal is mono, but the file has two channels; since the quality of the left channel turns out to be better, let us delete the other channel. This is what the whole ~35 second recording looks like:

There is silence at the beginning and at the end – but we can just as well hear that. We need to zoom in to see the actual waveform. From about 2 seconds to 12.4 seconds, when we hear a continuous beep, the signal consists of uniform waves:

Afterwards, during what sounds like noise, the waveform looks very different: There are shorter waves and longer waves.

The original output of the Apple I when writing the tape was a square wave, but the signal was filtered by the properties of the magnetic tape, so high frequencies were removed, and the signal became rounder. On the way back in from tape, the op-amp of the Cassette Interface converted the signal back into a square wave by converting all samples above a certain value into 1, and all samples below into 0. While the threshold in the op-amp was fixed, it could be effectively manipulated by changing the output volume of the cassette player. Effect->Amplify in Audacity gives us the following picture:
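The interplay of volume and threshold can be sketched in a few lines of C. This is only an illustration: the function name comparator and the THRESHOLD value are assumptions, not properties of the real Cassette Interface.

```c
#include <stdint.h>

/* Fixed comparator threshold; the value is an arbitrary assumption. */
#define THRESHOLD 8000

/* Turn one sample into a square-wave level after applying a volume gain. */
int comparator(int16_t sample, double gain) {
    return sample * gain > THRESHOLD ? 1 : 0;
}
```

Multiplying every sample by a gain g has the same effect as dividing the threshold by g, which is why a software decoder can keep the samples untouched and make the threshold the adjustable parameter instead.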

It is now time to write a small program to measure and dump the width of the pulses. We need to be able to parametrize the threshold (the volume knob) to be able to try different settings. In Audacity, we have to save the file as “WAV” first, because this file format is easy to decode: We just skip the 44 byte header, and every following pair of bytes is a signed 16 bit little-endian value representing one sample.
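Reading the samples could look like the following. This is a minimal sketch, assuming a canonical 44 byte header followed by 16 bit little-endian mono PCM data; the names sample_from_bytes and read_samples are made up here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define WAV_HEADER_SIZE 44  /* header size assumed in the text */

/* Combine two little-endian bytes into one signed 16 bit sample. */
int16_t sample_from_bytes(unsigned char lo, unsigned char hi) {
    int v = lo | (hi << 8);
    if (v >= 32768)
        v -= 65536;         /* values with the top bit set are negative */
    return (int16_t)v;
}

/* Skip the header, then read up to max samples; returns the number read. */
size_t read_samples(FILE *f, int16_t *out, size_t max) {
    size_t n = 0;
    int lo, hi;
    for (int i = 0; i < WAV_HEADER_SIZE; i++)
        if (fgetc(f) == EOF)
            return 0;       /* file is shorter than the header */
    while (n < max && (lo = fgetc(f)) != EOF && (hi = fgetc(f)) != EOF)
        out[n++] = sample_from_bytes((unsigned char)lo, (unsigned char)hi);
    return n;
}
```

A WAV file with extra chunks has a longer header, so this shortcut only works for simple files like the one Audacity writes here.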

First runs show us that all pulses in the first part of the file are around 56 samples long, while the rest of the file contains pulses which have a length of either around 22 or around 45. It seems the first part is a sync signal, so that the reader knows when the data starts. The two pulse lengths afterwards represent ones and zeros. It turns out that the shorter pulses are zeros; this way we get readable 6502 assembly code and some ASCII data (though with bit #7 set).
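The classification of pulses and the assembly of bits into bytes can be sketched as follows. The cut-offs 32, 50 and 70 are the ones used in the decoder listing further down, and so is the bit order (first pulse is the most significant bit); the names classify_pulse and pack_byte are hypothetical.

```c
/* Pulse categories, by width in samples. */
enum pulse { PULSE_ZERO, PULSE_ONE, PULSE_SYNC, PULSE_INVALID };

enum pulse classify_pulse(int width) {
    if (width < 32) return PULSE_ZERO;  /* around 22 samples */
    if (width < 50) return PULSE_ONE;   /* around 45 samples */
    if (width < 70) return PULSE_SYNC;  /* around 56 samples */
    return PULSE_INVALID;
}

/* Pack eight data pulses into one byte, most significant bit first.
   Returns -1 if any of the pulses is not a data pulse. */
int pack_byte(const int widths[8]) {
    int byte = 0;
    for (int i = 0; i < 8; i++) {
        enum pulse p = classify_pulse(widths[i]);
        if (p != PULSE_ZERO && p != PULSE_ONE)
            return -1;
        byte = byte << 1 | (p == PULSE_ONE);
    }
    return byte;
}
```

For example, the widths 22, 45, 22, 22, 45, 45, 22, 22 give the bits 01001100, i.e. 0x4c, the 6502 JMP opcode that starts the dump.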

Here is a hexdump of the first few bytes:

0000000 4c b0 e2 ad 11 d0 10 fb ad 10 d0 60 8a 29 20 f0
0000010 23 a9 a0 85 e4 4c c9 e3 a9 20 c5 24 b0 0c a9 8d
0000020 a0 07 20 c9 e3 a9 a0 88 d0 f8 a0 00 b1 e2 e6 e2
0000030 d0 02 e6 e3 60 20 15 e7 20 76 e5 a5 e2 c5 e6 a5

In order to verify that our decoded data is correct, we must make sure that after the sync, there are only pulses that fall into the 22 or 45 category. If we play with the volume and find a volume region (i.e. different but similar volume settings) that has no pulse length errors and decodes into the same data, we can be confident that the data is correct. According to what is printed on the cassette, Apple 1 BASIC is supposed to reside in RAM at 0xE000 to 0xEFFF, so the length of the decoded data should be 4096 bytes.
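These checks can be sketched in C as well. The function names are made up for illustration, and the tolerance parameter is an assumption, since the measurements above only say that the pulses are “around” 22 or 45 samples long.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Count data pulses that are not within `tolerance` samples of 22 or 45. */
size_t count_pulse_errors(const int *widths, size_t n, int tolerance) {
    size_t errors = 0;
    for (size_t i = 0; i < n; i++) {
        int near_short = abs(widths[i] - 22) <= tolerance;
        int near_long  = abs(widths[i] - 45) <= tolerance;
        if (!near_short && !near_long)
            errors++;
    }
    return errors;
}

/* Two decodes agree if both are 4096 bytes long and byte-identical. */
int dumps_agree(const unsigned char *a, size_t alen,
                const unsigned char *b, size_t blen) {
    return alen == 4096 && blen == 4096 && memcmp(a, b, 4096) == 0;
}
```

A dump is trustworthy when count_pulse_errors returns 0 for it and dumps_agree holds for the decodes made at neighbouring volume settings.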

The following program is a possibly minimal implementation of the required algorithm:


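#include <stdio.h>
#include <stdlib.h>

#define DIVISOR 30

int main(int argc, char **argv) {
    int index = 0, last = 0, direction = 1, syncstate = 0, bitindex = 0;
    int distance, lo, hi, i;
    /* optional first argument overrides the threshold divisor */
    int divisor = argc > 1 ? atoi(argv[1]) : DIVISOR;
    unsigned char outbyte = 0;
    signed short sample;

    if (divisor <= 0) divisor = DIVISOR;

    /* skip the 44 byte WAV header */
    for (i = 0; i < 44; i++) getchar();

    /* every sample is two bytes, low byte first */
    while ((lo = getchar()) != EOF && (hi = getchar()) != EOF) {
        sample = (signed short)(lo | hi << 8);
        if (!direction) {
            /* waiting for the signal to rise above the threshold */
            if (sample > (32768 / divisor)) {
                distance = index - last;  /* pulse width in samples */
                if (distance < 50) {
                    if (syncstate == 2) {
                        /* data bit: short pulse = 0, long pulse = 1 */
                        outbyte = outbyte << 1 | (distance < 32 ? 0 : 1);
                        if (!((++bitindex) & 7)) putchar(outbyte);
                    }
                    /* the first short pulse after the sync carries no data */
                    if (syncstate == 1) syncstate++;
                } else if ((distance < 70) && !syncstate)
                    syncstate = 1;  /* sync tone detected */
                last = index;
                direction = 1;
            }
        } else {
            /* waiting for the signal to fall below the negative threshold */
            if (sample < -(32768 / divisor))
                direction = 0;
        }
        index++;
    }
    return bitindex / 8;
}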

Run it like this:

cat apple1basic-mono.wav | ./apple1basic-decode > apple1basic.bin

You can optionally specify a parameter to the program which overrides the default value of 30 for "DIVISOR". Values between ~25 and ~95 will reconstruct the data correctly.


If you can read 6502 assembly, you should definitely have a look at the disassembly, as this is very high quality (in the 1970s, this meant: very tightly compressed) code.

Update: You can run Apple I BASIC as a scripting language on Mac OS X now!