
Comparative C64 ROM Disassembly Study Guide

The Commodore 64 ROM has been subject to immense reverse engineering. Many commented disassemblies were published over the decades, scattered over different media such as books, magazines, disks, and later, the internet – and there are even some commentaries that apply to the C64 ROM, but were written with other systems in mind that shared Microsoft’s BASIC interpreter.

In the past weeks, I have collected and published several of these commentaries in a unified format:

Wouldn’t it be nice to see the comments of all these sources at the same time when looking up code in the C64 ROM?

At pagetable.com/c64rom, you can now see a cross-referenced HTML of the disassembled C64 ROM, with four commentaries side-by-side – the Comparative C64 ROM Disassembly Study Guide:

If you can’t fit all columns on your screen, try reducing the text size in your browser.

The raw txt files with the commentaries as well as the script to combine them are maintained at github.com/mist64/c64rom. Improvements welcome.

And, as mentioned previously, there are many more commentaries in existence. If you want to help me convert them into the canonical format, send me an email.

Fully Commented Commodore 64 BASIC ROM Disassembly – based on Microsoft’s Source

In my quest to collect as many commentaries on the Commodore 64 ROM as possible at pagetable.com/c64rom, I have so far gathered Lee Davison’s excellent commentary, the German de facto standard by Data Becker, and an adaptation of Bob Sander-Cederlof’s Apple II ROM commentary, all in the same cross-referenced HTML format.

Now that Microsoft’s original source of MOS 6502 BASIC is available, I’ve added it as the fourth commented disassembly, with the standard disassembly on the left, and the original source, both assembly and comments, lined up correctly on the right:

As always, the HTML version is available at pagetable.com/c64rom, while the raw txt files are maintained at github.com/mist64/c64rom.

While this may be the best set of comments for the BASIC part, we’re not done yet! There are many more (either direct or indirect) commentaries on the C64 ROM in existence:

Please contribute to our collection by helping convert one or more of these sources into the common format! Send me an email if you are interested!

Microsoft BASIC for 6502 Original Source Code [1978]

This is the original 1978 source code of Microsoft BASIC for 6502 with all original comments, documentation and easter eggs:

M6502.MAC (1978-07-27, 6955 lines, 161,685 bytes)

This is currently the oldest publicly available piece of source written by Bill Gates.

Language

Like the 8080 version, the 6502 version was developed on a PDP-10, using the MACRO-10 assembler. A set of macros developed by Paul Allen allowed MACRO-10 to understand and translate 6502 assembly, albeit in a modified format to fit the syntax of macros, for example:

MOS 6502        MACRO-10
LDA #0          LDAI 0
LDA (ADDR),Y    LDADY ADDR

MACRO-10 did not support hex numbers, which is why most numbers are in decimal format. In the floating point code, all numbers are octal. The RADIX statement switches between the two. Octal can also be forced with a ^O prefix.

Conditional assembly is done using the IFE and IFN statements, which include their body only if the argument is zero (IFE) or non-zero (IFN). The following only adds the string to the binary if REALIO is equal to 4:

IFE     REALIO-4,<DT"APPLE BASIC V1.1">

Macros

The source defines many macros that make development easier. Here are some examples:

Macro        Definition          Comment
SYNCHK (Q)   LDAI <Q>            Get the next character and make sure it’s Q, otherwise SYNTAX ERROR. This pattern is used a lot.
             JSR SYNCHR
LDWD (WD)    LDA WD              Most 16 bit constants are loaded into A/Y with this macro, but macros for A/X and X/Y also exist.
             LDY <WD>+1
LDWDI (WD)   LDAI <<WD>&^O377>   This loads an immediate constant into A/Y.
             LDYI <<WD>/^O400>
PSHWD (WD)   LDA <WD>+1          This pushes a 16 bit value from memory (absolute or zero page) onto the stack.
             PHA
             LDA WD
             PHA
JEQ (WD)     BNE .+5             A compact way to express out-of-bounds branches. Macros exist for all branches.
             JMP WD
SKIP2        XWD ^O1000,^O054    This emits a byte value of 0x2C (BIT absolute), which skips the next instruction. (The ^O1000 part wraps the byte in a PDP-10 instruction – see below.)
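
To make the LDWDI definition concrete, here is a small Python sketch (not part of the original source) of the arithmetic the macro asks the assembler to do: masking with ^O377 keeps the low byte, and dividing by ^O400 yields the high byte. The function name ldwdi is made up for illustration.

```python
def ldwdi(value):
    """Split a 16-bit constant the way the LDWDI macro does.

    Returns (low, high): the operands of the LDAI and LDYI
    instructions that the macro expands to.
    """
    low = value & 0o377    # <WD>&^O377: keep the low 8 bits for A
    high = value // 0o400  # <WD>/^O400: integer divide by 256 for Y
    return low, high

# An arbitrary 16-bit value: 60000 decimal is $EA60.
low, high = ldwdi(60000)
print(f"A=${low:02X} Y=${high:02X}")  # prints "A=$60 Y=$EA"
```

Because MACRO-10 has no hex notation, the mask and divisor have to be written in octal: ^O377 is 255 and ^O400 is 256.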

Configurations

The BASIC source supports several compile-time configuration options:

Name    Comment                           Description
INTPRC  INTEGER ARRAYS
ADDPRC  FOR ADDITIONAL PRECISION          40 bit (9 digit) vs 32 bit (7 digit) float
LNGERR  LONG ERROR MESSAGES               Error message strings instead of two-character codes
TIME    CAPABILITY TO SET AND READ A CLK  TI and TI$ support
EXTIO   EXTERNAL I/O                      PRINT#, INPUT#, CMD, SYS (!), OPEN and CLOSE support
DISKO   SAVE AND LOAD COMMANDS            LOAD, SAVE (and on Commodore: VERIFY) support
NULCMD  FOR THE "NULL" COMMAND            NULL support, a command to configure the number of NUL characters to print to the terminal after each line break
GETCMD                                    GET support
RORSW                                     If 1, the ROR instruction is not used
ROMSW   TELLS IF THIS IS ON ROM           The RAM version can optionally jettison the SIN, COS, TAN and ATN commands at startup
CLMWID                                    Column width for TAB()
LONGI   LONG INITIALIZATION SWITCH
STKEND                                    The top of stack at startup
BUFPAG                                    Page of the input buffer; if 0, the buffer uses parts of the zero page
BUFLEN  INPUT BUFFER SIZE
LINLEN  TERMINAL LINE LENGTH
ROMLOC  ADDRESS OF START OF PURE SEGMENT
KIMROM                                    KIM-specific smaller config

Targets

The constant REALIO is used to configure what computer system to generate the binary for. It has one of the following values:

Value  Comment                 Banner                             Machine
0      PDP-10 SIMULATING 6502  SIMULATED BASIC FOR THE 6502 V1.1  Paul Allen’s Simulator on PDP-10
1      MOS TECH,KIM            KIM BASIC V1.1                     MOS KIM-1
2      OSI                     OSI 6502 BASIC VERSION 1.1         OSI Model 500
3      COMMODORE               ### COMMODORE BASIC ###            Commodore PET 2001
4      APPLE                   APPLE BASIC V1.1                   Apple II
5      STM                     STM BASIC V1.1                     (unreleased)

All versions except Commodore also print “COPYRIGHT 1978 MICROSOFT” on a new line.
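
As a rough illustration of how REALIO selects the startup text, here is a Python sketch. The banner strings are taken from the table above; the function name and the way the lines are returned are assumptions made for this example, not something lifted from the source.

```python
# Banner per REALIO value, as listed in the table above.
BANNERS = {
    0: "SIMULATED BASIC FOR THE 6502 V1.1",
    1: "KIM BASIC V1.1",
    2: "OSI 6502 BASIC VERSION 1.1",
    3: "### COMMODORE BASIC ###",
    4: "APPLE BASIC V1.1",
    5: "STM BASIC V1.1",
}

def startup_message(realio):
    """Return the lines printed at startup for a given target (hypothetical helper)."""
    lines = [BANNERS[realio]]
    if realio != 3:  # every target except Commodore adds the copyright line
        lines.append("COPYRIGHT 1978 MICROSOFT")
    return lines

for target in sorted(BANNERS):
    print(target, startup_message(target))
```

In the real source the same selection happens at assembly time through IFE and IFN blocks, so only one banner ends up in any given binary.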

The target defines the setting of the configuration constants, but some code is also conditionally compiled depending on a specific target.

What is interesting is that initially it was Microsoft adapting their source for the different computers, instead of giving source to the different vendors and having them adapt it. Features like file I/O and time support seem to have been specifically developed for Commodore, for example. Later, the computer companies would get the source from Microsoft and continue development themselves – source code of the Apple and Commodore derivatives is available; they both contain Microsoft comments.

By the way, the numbering of these targets probably indicates the order in which Microsoft signed contracts with computer manufacturers. MOS was first (for the KIM), then OSI, then Commodore/MOS again (this time for the PET), then Apple.

The PDP-10 Target

Paul Allen’s additional macros for 6502 development made the MACRO-10 assembler output one 36 bit PDP-10 instruction word for every 6502 byte. When targeting a real 6502 machine, the 6502 binary could be created by simply extracting one byte from every PDP-10 word.

In the case of targeting the simulator, the code created by the assembler could just be run without modification, since every emitted PDP-10 instruction was constructed so that it would trap – the linked-in simulator would then extract the 6502 opcode from the instruction and emulate the 6502 behavior.

While this trick was mostly abstracted by the (unreleased) macro package, its workings can be seen in a few cases in the BASIC source. Here, it defines SKIP1 and SKIP2. Instead of just emitting 0x24 or 0x2C, respectively, it combines it with the octal value of 01000 to make it a PDP-10 instruction that traps:

DEFINE  SKIP1,  <XWD ^O1000,^O044>      ;BIT ZERO PAGE TRICK.
DEFINE  SKIP2,  <XWD ^O1000,^O054>      ;BIT ABS TRICK.

In the initialization code, it writes a JMP instruction into RAM. On the simulator, it has to patch up the opcode of JMP (0x4C, decimal 76) to be the correct PDP-10 instruction:

        LDAI    76              ;JMP INSTRUCTION.
IFE     REALIO,<HRLI 1,^O1000>  ;MAKE AN INST.

With this information, we can reconstruct what the set of 6502 macros, which is not part of this source, probably looked like. Here is LDAI (LDA immediate):

DEFINE  LDAI    (Q),<
        XWD ^O1000,^O251        ;EMIT OPCODE
        XWD ^O1000,<Q>          ;EMIT OPERAND
>
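
The word layout described above can be modelled in a few lines of Python. This is only a sketch: it assumes that XWD places its first argument in the left 18 bits of the 36 bit word and its second argument in the right 18 bits, and that the 6502 byte sits in the low 8 bits of the word. The helper names xwd, emit and extract_6502 are invented for this example.

```python
TRAP = 0o1000  # left half-word value used by SKIP1/SKIP2 above

def xwd(left, right):
    """Build a 36-bit word from two 18-bit halves, like MACRO-10's XWD."""
    return ((left & 0o777777) << 18) | (right & 0o777777)

def emit(*bytes_6502):
    """Emit one PDP-10 word per 6502 byte."""
    return [xwd(TRAP, b) for b in bytes_6502]

def extract_6502(words):
    """Recover the 6502 binary by taking the low byte of every word."""
    return bytes(w & 0xFF for w in words)

# LDAI 76 (decimal): opcode ^O251 (0xA9, LDA immediate), operand 76 (0x4C)
words = emit(0o251, 76)
print([oct(w) for w in words])
print(extract_6502(words).hex())  # prints "a94c"
```

The same word list serves both purposes: fed to the simulator it would trap on every word, and passed through extract_6502 it becomes the plain byte stream for a real machine.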

You can also see native TJSR PDP-10 assembly instructions for character I/O:

IFE     REALIO,<
        TJSR    INSIM##>        ;GET A CHARACTER FROM SIMULATOR
IFE     REALIO,<
        TJSR    OUTSIM##>       ;CALL SIMULATOR OUTPUT ROUTINE

The DDT command, which breaks into the PDP-10’s DDT debugger, only exists in this config:

IFE     REALIO,<
DDT:    PLA                     ;GET RID OF NEWSTT RETURN.
        PLA
        HRRZ    14,.JBDDT##
        JRST    0(14)>

The KIM and OSI Targets

The KIM and OSI targets are meant for the MOS KIM-1 and the Ohio Scientific OSI Model 500 single-board computers, respectively. These are the first ports to specific computers, and also the cleanest: except for the character I/O interface and the very simple LOAD/SAVE implementation for the KIM, there is nothing specific about these targets.

The Commodore Target

The Commodore target is meant for the Commodore PET 2001. It includes LOAD/SAVE/VERIFY (the commands jump directly to outside “KERNAL” ROM code), the I/O commands (SYS, PRINT#, OPEN etc.), the GET command and the π, ST, TI and TI$ symbols. CLEAR is renamed to CLR, "OK" is renamed to "READY.", the BEL character is not printed, and character I/O code behaves differently to account for the more featureful screen editor of the PET.

Oh, and the Commodore version of course includes the Bill Gates WAIT 6502,1 easter egg! This is the WAIT instruction:

; THE WAIT LOCATION,MASK1,MASK2 STATEMENT WAITS UNTIL THE CONTENTS
; OF LOCATION IS NONZERO WHEN XORED WITH MASK2
; AND THEN ANDED WITH MASK1. IF MASK2 IS NOT PRESENT, IT
; IS ASSUMED TO BE ZERO.

FNWAIT: JSR     GETNUM
        STX     ANDMSK
        LDXI    0
        JSR     CHRGOT
        BEQ     ZSTORDO
        JSR     COMBYT          ;GET MASK2.
STORDO: STX     EORMSK
        LDYI    0
WAITER: LDADY   POKER
        EOR     EORMSK
        AND     ANDMSK
        BEQ     WAITER
ZERRTS: RTS                     ;GOT A NONZERO.

Note how the BEQ instruction references ZSTORDO, not STORDO – execution sneaks out of this function here.

Well, on non-Commodore machines, ZSTORDO is assigned to be the same as STORDO, so everything is fine:

IFN     REALIO-3,<ZSTORDO=STORDO>

But on Commodore, we have this code hidden near the top of the floating point math package – close enough so the BEQ can reach it, but inside code that is least likely to get touched:

IFE     REALIO-3,<
ZSTORD:!        LDA     POKER
        CMPI    146
        BNE     STORDO
        LDA     POKER+1
        SBCI    31
        BNE     STORDO
        STA     POKER
        TAY
        LDAI    200
        STA     POKER+1
MRCHKR: LDXI    12
IF1,<
MRCHR:  LDA     60000,X,>
IF2,<
MRCHR:  LDA     SINCON+36,X,>
        ANDI    77
        STADY   POKER
        INY
        BNE     PKINC
        INC     POKER+1
PKINC:  DEX
        BNE     MRCHR
        DEC     ANDMSK
        BNE     MRCHKR
        RTS
IF2,<PURGE ZSTORD>>

(IF1 and IF2 are true on the first and the second assembler pass, respectively, so the conditional there is to hint to the assembler in the first pass that SINCON+36 is not a zero page address. Also note that all numbers here are octal, since this code is in the floating point package.)
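
Since the octal operands make this listing hard to read at a glance, here is a short Python sketch that converts them. It only restates values visible in the listing; the variable names are made up.

```python
# Octal operands from the listing above, converted for readability.
magic_low = 0o146    # CMPI 146: compared against the low byte of POKER
magic_high = 0o31    # SBCI 31: compared against the high byte of POKER
magic = (magic_high << 8) | magic_low

screen_high = 0o200  # LDAI 200: new high byte of POKER (the low byte is set to 0)
screen_base = screen_high << 8

chars_per_pass = 0o12  # LDXI 12: X counts down from this value to 1
table_offset = 0o36    # SINCON+36: with X = 10..1 this reaches offsets 40..31

print(magic, hex(screen_base), chars_per_pass, table_offset)
# prints "6502 0x8000 10 30"
```

So the two comparisons together test for an address argument of 6502, and the pointer is then redirected to $8000, where ten bytes are stored per pass. The outer loop decrements ANDMSK, so the number of passes is the second argument to WAIT.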

First of all, the final line here removes ZSTORD from the list of symbols after the second pass, so that Commodore would not notice it in a printout of all symbols – very smart!

As has been discussed before, this code writes the string “MICROSOFT!” into the PET’s screen RAM if the argument to WAIT is “6502”. The encoded string is hidden as two extra 40 bit floating point numbers appended to the coefficients used by the SIN function:

IFN     ADDPRC,<
SINCON: 5               ;DEGREE-1.
        204     ; -14.381383816
        346
        032
        055
        033
        206     ; 42.07777095
        050
        007
        373
        370
        207     ; -76.704133676
        231
        150
        211
        001
        207     ; 81.605223690
        043
        065
        337
        341
        206     ; -41.34170209
        245
        135
        347
        050
        203     ; 6.2831853070
        111
        017
        332
        242
        241     ; 7.2362932E7
        124
        106
        217
        23
        217     ; 73276.2515
        122
        103
        211
        315>

These last ten bytes, nicely disguised as octal values of floating point constants, spell out “MICROSOFT!” backwards after clearing the upper two bits. What’s interesting is that the floating point values next to them are actually incorrect: They should be 7.12278788E9 and 26913.7691 instead.
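
The decoding can be reproduced with a few lines of Python. This sketch copies the ten octal values from the table, masks them with octal 77 as the ANDI 77 instruction does, and maps the resulting PET screen codes to ASCII. The mapping assumes the PET's standard character set, where screen codes 1 to 26 are the letters A to Z and codes 32 to 63 match ASCII; the function names are invented.

```python
# Octal byte values copied from the end of the SINCON table above.
HIDDEN = [0o241, 0o124, 0o106, 0o217, 0o23, 0o217, 0o122, 0o103, 0o211, 0o315]

def screen_code_to_ascii(code):
    """Map a PET screen code in the range 0-63 to an ASCII character."""
    return chr(code + 64) if code < 32 else chr(code)

def decode(hidden):
    """Mask off the upper two bits and read the table back to front.

    The 6502 loop counts X down from 10 to 1, so the last byte of the
    table is the first one written to the screen.
    """
    return "".join(screen_code_to_ascii(b & 0o77) for b in reversed(hidden))

print(decode(HIDDEN))  # prints "MICROSOFT!"
```

Setting the upper two bits to arbitrary values is what lets the bytes pass as plausible floating point data without changing what ends up on screen.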

Also note that these constants are not conditionally assembled! All versions built since the Commodore easter egg was introduced also contained these 10 bytes – including BASIC for the Motorola 6800!

The Apple Target

The Apple target is meant for the Apple II, and contains no customizations other than some changes around I/O handling (which calls into the monitor ROM). Note that this is not yet the “AppleSoft” version of BASIC, which was a more customized version modified by Apple later.

The STM Target

“STM” most likely stands for “Semi-Tech Microelectronics” – a company that never shipped a 6502-based computer. Their first machine was the “Pied Piper”, a Z80-based system, and they later made a PC clone. It seems they had a 6502-based computer in development that never shipped – or at least they were considering making one, and Microsoft added the target; this target doesn’t actually change any of the defaults.

Organization of the Source

The source uses the PAGE and SUBTTL keywords for organization. Here are the headings:

SUBTTL  SWITCHES,MACROS.
SUBTTL  INTRODUCTION AND COMPILATION PARAMETERS.
SUBTTL  SOME EXPLANATION.
SUBTTL  PAGE ZERO.
SUBTTL  RAM CODE.
SUBTTL  DISPATCH TABLES, RESERVED WORDS, AND ERROR TEXTS.
SUBTTL  GENERAL STORAGE MANAGEMENT ROUTINES.
SUBTTL  ERROR HANDLER, READY, TERMINAL INPUT, COMPACTIFY, NEW, REINIT.
SUBTTL  THE "LIST" COMMAND.
SUBTTL  THE "FOR" STATEMENT.
SUBTTL  NEW STATEMENT FETCHER.
SUBTTL  RESTORE,STOP,END,CONTINUE,NULL,CLEAR.
SUBTTL  LOAD AND SAVE SUBROUTINES.
SUBTTL  RUN,GOTO,GOSUB,RETURN.
SUBTTL  "IF ... THEN" CODE.
SUBTTL  "ON ... GO TO ..." CODE.
SUBTTL  LINGET -- READ A LINE NUMBER INTO LINNUM
SUBTTL  "LET" CODE.
SUBTTL  PRINT CODE.
SUBTTL  INPUT AND READ CODE.
SUBTTL  THE NEXT CODE IS THE "NEXT CODE"
SUBTTL  DIMENSION AND VARIABLE SEARCHING.
SUBTTL  MULTIPLE DIMENSION CODE.
SUBTTL  INTEGER ARITHMETIC ROUTINES.
SUBTTL  FRE FUNCTION AND INTEGER TO FLOATING ROUTINES.
SUBTTL  SIMPLE-USER-DEFINED-FUNCTION CODE.
SUBTTL  STRING FUNCTIONS.
SUBTTL  PEEK, POKE, AND FNWAIT.
SUBTTL  FLOATING POINT ADDITION AND SUBTRACTION.
SUBTTL  NATURAL LOG FUNCTION.
SUBTTL  FLOATING MULTIPLICATION AND DIVISION.
SUBTTL  FLOATING POINT MOVEMENT ROUTINES.
SUBTTL  SIGN, SGN, FLOAT, NEG, ABS.
SUBTTL  COMPARE TWO NUMBERS.
SUBTTL  GREATEST INTEGER FUNCTION.
SUBTTL  FLOATING POINT INPUT ROUTINE.
SUBTTL  FLOATING POINT OUTPUT ROUTINE.
SUBTTL  EXPONENTIATION AND SQUARE ROOT FUNCTION.
SUBTTL  EXPONENTIATION FUNCTION.
SUBTTL  POLYNOMIAL EVALUATOR AND THE RANDOM NUMBER GENERATOR.
SUBTTL  SINE, COSINE AND TANGENT FUNCTIONS.
SUBTTL  ARCTANGENT FUNCTION.
SUBTTL  SYSTEM INITIALIZATION CODE.

Paul Allen vs. Bill Gates

The source of the 8080 version states:

PAUL ALLEN WROTE THE NON-RUNTIME STUFF.
BILL GATES WROTE THE RUNTIME STUFF.
MONTE DAVIDOFF WROTE THE MATH PACKAGE.

People have since wondered what runtime vs. non-runtime meant, especially since Paul Allen’s recent debate on whether the company’s ownership was fairly split.

The BASIC for 6502 source sheds some light on this:

NON-RUNTIME STUFF
        THE CODE TO INPUT A LINE, CRUNCH IT, GIVE ERRORS,
        FIND A SPECIFIC LINE IN THE PROGRAM,
        PERFORM A "NEW", "CLEAR", AND "LIST" ARE
        ALL IN THIS AREA. [...]

So by “runtime” they just literally mean “at run time”: all code that is active when the program runs, as opposed to non-runtime, which is all code that assists editing the program.

By this understanding, we can assume this:

  • Paul Allen wrote the macro package for the MACRO-10 assembler, the 6502 simulator, the tokenizer, the detokenizer, as well as finding, inserting and deleting BASIC lines.
  • Bill Gates implemented all BASIC statements, functions, operators, expression evaluation, stack management for FOR and GOSUB, the memory manager, as well as the array and string library.
  • Monte Davidoff wrote the floating point math package.

Version and Date

The last entry in the change log has a date of 1978-07-27. Both the comment in the first line of the file and the message printed at startup call it version 1.1.

What does this say about the version of the source? Is it the last version? Let’s look at the last bug fix and compare which BASIC binaries contain this fix, and let’s see whether there are fixes in BASIC binaries that are not in the source.

I have previously compared binaries of derivatives of BASIC for 6502 and compiled the information at github.com/mist64/msbasic. The last entry in the log of this source is about a bug that failed to correctly invalidate a pointer in the RETURN statement. According to my analysis of BASIC 6502 versions, this is fixed in the BASIC binaries for AIM-65, SYM-1, Commodore v2, KBD BASIC and MicroTAN, i.e. on everything my previous analysis calls CONFIG_2A and higher.

The same analysis also came to the conclusion that there were two successors, CONFIG_2B and CONFIG_2C. At least the two CONFIG_2B fixes exist in two BASIC binaries: KBD BASIC and MicroTAN, but they don’t exist in this source. It’s very unlikely that both these bugs (and only these!) got fixed by the two computer manufacturers independently, so it’s safe to assume that this source is not the final version – but pretty close to it!

Interesting Finds

  • This code is comparing a keyboard input character to the BEL code. Bob Albrecht is a computer educator who “was instrumental in helping bring about a public-domain version of Basic (called Tiny Basic) for early microcomputers.”
    CMPI    7               ;IS IT BOB ALBRECHT RINGING THE BELL
                            ;FOR SCHOOL KIDS?
    
  • External documentation usually calls the conversion of ASCII BASIC text into the compressed format “tokenizing”. The source calls this “crunching”.
  • Microsoft is still spelled “Micro-Soft”.
  • Apparently the multiplication function could use some performance improvements:
    		
    BNE     MLTPL2          ;SLOW AS A TURTLE !
    
  • The NEW command is actually called SCRATCH in labels and comments – maybe other BASIC dialects called it that, and they decided to rename it to NEW later?
  • The math package documentation says:
    MATH PACKAGE
            THE MATH PACKAGE CONTAINS FLOATING INPUT (FIN),
            FLOATING OUTPUT (FOUT), FLOATING COMPARE (FCOMP)
            ... AND ALL THE NUMERIC OPERATORS AND FUNCTIONS.
            THE FORMATS, CONVENTIONS AND ENTRY POINTS ARE ALL
            DESCRIBED IN THE MATH PACKAGE ITSELF.
    

    Commodore’s derived source changes this to:

    ; MATH PACKAGE
    ;       THE MATH PACKAGE CONTAINS FLOATING INPUT FIN, OUTPUT
    ;       FOUT, COMPARE FCOMP...AND ALL THE NUMERIC OPERATORS
    ;       AND FUNCTIONS.  THE FORMATS, CONVENTIONS AND ENTRY
    ;       POINTS ARE ALL DESCRIBED IN THE MATH PACKAGE ITSELF.
    ;       (HA,HA...)
    
  • CHRGET is a central piece of BASIC for 6502. Here it is in its entirety:
    ; THIS CODE GETS CHANGED THROUGHOUT EXECUTION.
    ; IT IS MADE TO BE FAST THIS WAY.
    ; ALSO, [X] AND [Y] ARE NOT DISTURBED
    ;
    ; "CHRGET" USING [TXTPTR] AS THE CURRENT TEXT PNTR
    ; FETCHES A NEW CHARACTER INTO ACCA AFTER INCREMENTING [TXTPTR]
    ; AND SETS CONDITION CODES ACCORDING TO WHAT'S IN ACCA.
    ;       NOT C=  NUMERIC   ("0" THRU "9")
    ;       Z=      ":" OR END-OF-LINE (A NULL)
    ;
    ; [ACCA] = NEW CHAR.
    ; [TXTPTR]=[TXTPTR]+1
    ;
    ; THE FOLLOWING EXISTS IN ROM IF ROM EXISTS AND IS LOADED
    ; DOWN HERE BY INIT. OTHERWISE IT IS JUST LOADED INTO THIS
    ; RAM LIKE ALL THE REST OF RAM IS LOADED.
    ;
    CHRGET: INC     CHRGET+7        ;INCREMENT THE WHOLE TXTPTR.
            BNE     CHRGOT
            INC     CHRGET+8
    CHRGOT: LDA     60000           ;A LOAD WITH AN EXT ADDR.
    TXTPTR= CHRGOT+1
            CMPI    " "             ;SKIP SPACES.
            BEQ     CHRGET
    QNUM:   CMPI    ":"             ;IS IT A ":"?
            BCS     CHRRTS          ;IT IS .GE. ":"
            SEC
            SBCI    "0"             ;ALL CHARS .GT. "9" HAVE RET'D SO
            SEC
            SBCI    256-"0"         ;SEE IF NUMERIC.
                                    ;TURN CARRY ON IF NUMERIC.
                                    ;ALSO, SETZ IF NULL.
    CHRRTS: RTS                     ;RETURN TO CALLER.
    

    Did you ever wonder why all versions have $EA60 encoded into the LDA instruction that later gets overwritten? Because it’s 60000 decimal. That’s why! The source actually uses 60000 as a placeholder for 16 bit values in several places.

  • The handling of π, ST, TI and TI$ (all Commodore-specific) looks wonky: Instead of making them tokens, they are special cased in several places. I always assumed it was Commodore adding this without understanding (or wanting to disrupt) the existing code, but it was Microsoft adding these features. Maybe they were added by someone other than the original developers?
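
The flag trick at the end of the CHRGET listing above (the two SBCI instructions) is easier to follow when simulated. The following Python sketch models only the accumulator, the carry flag and the zero flag for the part starting at QNUM; the helper names are invented, and the space-skipping loop and the self-modifying pointer are left out.

```python
def sbc(a, operand, carry=1):
    """8-bit subtract with borrow, as the 6502 SBC does in binary mode.

    Returns (result, carry_out); carry_out is 1 when no borrow occurred.
    """
    diff = a - operand - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0

def qnum(ch):
    """Return (A, C, Z) as the QNUM code would leave them for character ch."""
    a = ord(ch)
    if a >= ord(":"):                  # CMPI ":" / BCS CHRRTS
        return a, 1, int(a == ord(":"))
    a, _ = sbc(a, ord("0"))            # SEC / SBCI "0"
    a, c = sbc(a, 256 - ord("0"))      # SEC / SBCI 256-"0"
    return a, c, int(a == 0)

for ch in "0", "9", ":", "A", "$", "\0":
    a, c, z = qnum(ch)
    print(f"{ch!r:7} A=${a:02X} C={c} Z={z}")
```

Digits come back with carry clear and everything else with carry set, matching the "NOT C= NUMERIC" line in the header comment. The accumulator is unchanged in every case, because subtracting "0" and then 256-"0" subtracts exactly 256.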

Origin of the File

The source was posted on the Korean-language blog 6502.tistory.com without further comment, in a marked-up format:

================================================================================================
FILE: "david mac g5 b:m6502.asm"
================================================================================================

000001  TITLE   BASIC M6502 8K VER 1.1 BY MICRO-SOFT
[...]
006955          END     $Z+START

End of File -- Lines: 6955 Characters: 154740

SUMMARY:

  Total number of files : 1
  Total file lines      : 6955
  Total file characters : 154740

This formatting was created by an unpublished tool by David T. Craig, who published a lot of Apple-related source code (Apple II, Apple III, Lisa) in this format as early as 1993, first anonymously, later under his name.

The filename “david mac g5 b:m6502.asm” (disk name “david mac g5 b”, file name “m6502.asm”, since it was a classic Mac OS tool) confirms David Craig’s involvement, and it means the line numbers were added no earlier than 2003.

Given all this, it is safe to assume the file with the Microsoft BASIC for 6502 source originated at Apple, and was given to David Craig together with the other source to be published.

The version I posted is a reconstruction of the original file, with the header, the footer and the line numbers removed, and the spaces converted back into tabs. I chose the name “M6502.MAC” to be consistent with the MACRO-10 file extension used by the Microsoft BASIC for 8080 sources.

Fully Commented Commodore 64 BASIC ROM Disassembly – based on Applesoft!

In our series about C64 ROM commentaries (English version by Lee Davison, German version by Data Becker), I’m now presenting a most unusual C64 ROM commentary – based on a commented disassembly of the Apple II ROM.

“S-C DocuMentor for Applesoft” is a commented disassembly of the BASIC ROM of the Apple II computer. Like Commodore BASIC, “Applesoft” BASIC is based on Microsoft BASIC for 6502, but on an older revision. Since the two BASIC interpreters are almost the same instruction for instruction (modulo some command extensions on both sides), the commentary translated over very nicely.

The cross-referenced HTML version of the “S-C C64 BASIC Disassembly” is available here at pagetable.com/c64rom.

The raw txt files of all commentaries are maintained at github.com/mist64/c64rom. Fixes and additions happily accepted!

Fully Commented Commodore 64 ROM Disassembly (English)

After last week’s German C64 ROM disassembly from the “64 intern” book, I have now also converted Lee Davison’s commented disassembly into the same format.

The cross-referenced HTML version is available here at pagetable.com/c64rom.

The raw txt files of both the German and the English commented disassemblies are maintained at github.com/mist64/c64rom. The two files seem to have been independently developed, which gives us the opportunity to compare, find mistakes, and merge missing information.

I will happily accept additions and corrections to either file – let’s create the one true source of C64 ROM information!

Fully Commented Commodore 64 ROM Disassembly (German)

Whenever I need to look up some code in the ROM of the Commodore 64, I have the choice of the commented disassembly by Marko Mäkelä, the one by Ninja/The Dreams, or the one by Lee Davison – or I can just use my paper copy of “Das neue Commodore-64-intern-Buch“, an excellent line-by-line commentary in German.

That’s why I scanned, OCRed, cleaned up and cross-referenced it.

The raw txt file is maintained at github.com/mist64/c64rom. Corrections, additions and translations welcome.

The cross-referenced HTML version is available here at pagetable.com/c64rom.

Rhapsody Developer’s Guide [PDF, 1997]


Feiler, Jesse.
Rhapsody Developer’s Guide.
Boston: AP Professional, 1997.
ISBN 0-12-251334-7

(528 pages, 13.3 MB PDF)

Rhapsody Developer’s Guide provides a road map to Rhapsody technology and the ways it can be used. Based on a modern microkernel, Rhapsody runs on PowerPC and Intel processors, and supports traditional Mac OS applications (in the Blue Box) as well as modern applications in the Yellow Box. Totally object-oriented, the Yellow Box platform offers an unparalleled development environment that permits rapid implementation of functionality ranging from traditional personal computer applications to media-rich, Internet-enabled, and database-driven applications for the next century.

This book describes the architecture of Rhapsody, including its cross-platform implementation on PowerPC and Intel. It details the Yellow Box platform (based on OpenStep) and provides a complete description of the core API, as well as a description of the architecture that will be enriched in the future with additional functionality from Apple. The languages of Rhapsody are discussed, and the API is presented in a language-neutral way that will be convenient for C++ developers, classic and modern Objective-C users, and Java programmers. Throughout, there is an emphasis on how Rhapsody relates to existing investments in code and programming expertise. Screen shots and code samples from products shipping today using Rhapsody technology provide opportunities and challenges to new Rhapsody developers.

About the Author

Jesse Feiler is software director of the Philmont Software Mill. He is also the author of Cyberdog and Real World Apple Guide. He has served as a consultant, author, and speaker for many prestigious businesses, including the Federal Reserve Bank of New York, Prodigy, Kodak, Young & Rubicam, and Apple Computer, Inc.

Why is my TI-99/4A in Black and White?

by James Abbatiello

My first computer was a Texas Instruments TI-99/4A. Longtime readers may remember a previous article where we implemented TI-99/4A BASIC as a Scripting Language for modern computers. Recently I got nostalgic for the actual hardware so I got my 99 out of the closet where it had been for a decade or more. I hooked it up to the TV and turned it on. I was expecting to see something like this (the TI-99/4A title screen):

Instead I was greeted by this:

Well, that’s not right! It is in black and white. And what’s with all these vertical black lines? Clearly something’s wrong but what could it be?

New Meets Old

At first I suspected that it was my TV, which is a fairly new LCD. Old computers or game consoles sometimes played a bit fast and loose with the NTSC standard and it seemed unlikely that a new TV would ever have been tested with something as old as a TI-99/4A. Perhaps the TV just couldn’t interpret output from the 99. So I tried with a CRT TV:

Well the black bars are gone (or at least not as apparent) but it is still in black and white. Something must be wrong with the computer itself.

All About Video Signals

The output from the back of the computer is a composite video signal but using a 5-pin DIN connector (that also carries audio and power) instead of the usual RCA jack. Back when this computer was new that signal would usually go to an RF modulator which was then connected to a TV via a 300-ohm connector. Nowadays you can still do the same thing but since most TVs don’t have screw terminals on the back anymore it can be more convenient to take the composite video signal and hook it directly into the composite input on the TV. All that is required is a simple adapter cable that you can make yourself or purchase online.

I thought that perhaps the video circuitry was generating separate Luminance (Y) and Chrominance (C) signals and then combining them into the final composite output. If this were the case then it would suggest something was wrong in the C amplifier or the final combining stage. It turns out that this is not the case. The video chip in the TI-99/4A is referred to as the Video Display Processor (VDP) and is a TMS9918A, TMS9928A or TMS9929A depending on the region the computer was originally intended for and the television standard in use there (e.g. NTSC or PAL). My computer was made for the US market and outputs NTSC signals using the TMS9918A. This chip has a single video output pin that supplies composite video directly with the Y and C already mixed. So if something was wrong with just the C generation circuitry then it was something broken inside the VDP and my only recourse would be to try to find a replacement chip.

Mad Scientist Equipment

The VDP still seemed to work correctly in all other respects so I was hopeful that the true problem lay elsewhere. I thought I’d take a look at the signal on an oscilloscope. We’d expect to see the NTSC colorburst and if it was missing that would explain why no color was showing up on the TVs. Here’s what it looked like:

And here’s a closeup of the interesting portion:

I’m no expert but that looks like a horizontal sync pulse followed by a colorburst to me. But there was still no color on the TV.

The composite video signal that the VDP generates is sent to a simple 2-transistor amplifier and then to the output jack. I didn’t think it was likely but perhaps something in the amplifier had given out and Y was still strong enough to get picked up by the TV but C wasn’t. To test this I took the computer apart and tapped the signal right as it came out of the chip and before it went through the amplifier. It was still black-and-white. This suggested that the problem was not in the amplifier.

The Healing Power of Crystals

At this point I knew that the VDP was mostly working correctly. It generated the right pattern on the screen so it must be able to communicate with both the video RAM and the CPU. That accounts for most of the pins on the VDP, the ones handling digital signals. The remaining pins are mostly for power and the connection to the quartz crystal that provides the timing. I checked the power and that seemed fine. So let’s take a closer look at the crystal:

The crystal is the gray-colored component in the middle. To the right is the VDP, covered in thermal paste. Just behind the crystal is a variable inductor.

A variable inductor: now that’s interesting! It is connected to the crystal and apparently used for fine tuning the frequency. Could the fix be as simple as turning an adjustment screw?

Alas, no. I turned it as far as it would go in both directions with no improvement to the video output. If the frequency was off it was beyond the ability of this adjustment to correct. I don’t have any equipment to allow precision measuring of the actual frequency this crystal was producing, but I do have the internet. A little Googling brought me to this post on the TI-99/4A mailing list. Yes, there’s still an active mailing list for a computer that hasn’t been manufactured in almost 30 years!

That post describes the symptoms that I was experiencing and indicated that the solution was to replace the crystal. This was somewhat surprising to me. I’d heard of electrolytic capacitors going bad in old equipment but a quartz crystal? They’re usually quite reliable. But you can’t argue with real-world experience.

The VDP takes the frequency of this crystal (10.738635 MHz) and divides it by 3 to produce the NTSC colorburst frequency (3.579545 MHz). If the frequency of the crystal was off then the generated colorburst would also be off and the TV wouldn’t be able to sync to it. Without seeing a valid colorburst the TV isn’t going to produce any color. That would certainly explain our symptoms!
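The arithmetic is easy to check. The following Python sketch only illustrates the divide-by-3 relationship described above; the function names and the 100 ppm offset are assumptions made up for the example, not measurements from my crystal.

```python
# Sketch: how the VDP crystal frequency relates to the NTSC colorburst.
# The divide-by-3 relationship comes from the text; the offset used in
# the second example is a made-up value for illustration.

CRYSTAL_HZ = 10_738_635  # nominal crystal frequency
DIVIDER = 3              # the VDP divides the crystal clock by 3


def colorburst_hz(crystal_hz):
    """Return the colorburst frequency derived from a crystal frequency."""
    return crystal_hz / DIVIDER


def error_ppm(actual_hz, nominal_hz):
    """Return the relative frequency error in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1_000_000


nominal = colorburst_hz(CRYSTAL_HZ)
# Hypothetical crystal running 100 ppm fast.
drifted = colorburst_hz(CRYSTAL_HZ * (1 + 100e-6))

print(nominal)
print(round(error_ppm(drifted, nominal), 3))
```

Because the divider is a constant, any relative error in the crystal shows up as the same relative error in the colorburst, which is why a drifted crystal can push the colorburst out of the range a TV will lock onto.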

So after deciding to replace this crystal we have to actually find a replacement part. We want a crystal that runs at exactly 10.738635 MHz. We also need it rated for the proper “load capacitance”. Running a crystal with the wrong capacitance will shift the frequency from the rated frequency. That would be bad since our entire goal is to get the frequency back to the ideal. The original crystal was rated for a load capacitance of 32pF (you can just make out the 32 in the above picture although it is partially obscured by the blue wire). So we want a replacement crystal that’s also rated for 32pF.

Let’s go internet shopping for 10.738635 MHz crystals. Jameco doesn’t carry any. Digikey has some but didn’t have any in stock with a 32pF load capacitance. Luckily Mouser came through for me! A few days later and I had a replacement crystal:

And after a little surgery on the motherboard:

Now for the moment of truth:

Success! Now to play some Parsec!

Bonus Oscilloscope Image

If you’re wondering what the colorburst looks like with the new crystal then wonder no longer.

Looks pretty similar to my eyes but apparently it makes a world of difference to a TV.

Assembly Evolution Part 1: Accessing Memory and the strange case of the Intel 4004

by Julien Oster, reprinted with permission.

While it has become far less relevant for non-system developers to write assembly than it was a few decades ago, by now CPUs have nevertheless made it much more comfortable to do so. Today we are used to a lot of things: fancy indirect addressing modes with scale, a galore of general purpose registers instead of an accumulator and maybe one or two crippled index registers, condition codes for nearly every instruction (on ARM)…

But also the basics themselves have evolved. Let’s take a look at what past programmers had to put up with in entirely simple, everyday things. We’ll start with the most trivial: writing to memory.

Our goal is to write a single immediate value of 3 into the memory location 5. In light of paging, segmenting and bank switching, we’ll use whatever is convenient as a definition for “memory location”. Also, we’ll let the CPU decide what the word size should be. Since you only need 2 bits to represent a 3, it should fit with every CPU’s word size (except for 1 bit CPUs, which actually existed, but that’s a story for another posting). If we have the choice, we’ll just take the smallest.

We’ll work backwards, from the present to the past, to explore the wonders of direct addressing in Intel CPUs. (One precautionary warning though: I only really tested the 4004 code in an emulator, and my habits are highly tainted by current Intel CPUs. So if I made some mistake somewhere, kindly point it out and I’ll fix it!)

x86

On a modern x86 CPU, it is of course fairly easy to write the value 3 to memory cell 5. You just do it:

mov byte [5], 3

A single instruction, simple and obvious. I cheated a bit by not using a segment prefix, nor did I set up any segment registers/selectors beforehand. But assuming a nowadays common OS environment in protected mode, you probably don’t want to fiddle with those selectors anyway.

8085

The Intel 8085 is somewhat of a direct predecessor to the 8086, the first in the line of the excessively successful x86 processors. While the 8086 has a 16 bit data bus, the 8085 only has 8 bit. The address bus is already a full 16 bit wide, but the CPU’s 16 bit capabilities are limited. Specifically, there is no instruction that stores an immediate value to a directly specified 16 bit address (direct addresses only show up in branches and in a few accumulator and HL transfers such as STA and SHLD), leaving us no way to specify both the value and our memory location in the instruction that actually performs the move.

Memory is instead addressed with a pseudo register called M. This pseudo register is in reality just backed by the registers H and L paired together, each 8 bit wide, and accessing it accesses the memory location they point at (you may take a guess which register receives the High byte, and which the Low byte of the address).

Luckily, there are a few simple 16bit instructions for moving immediate values, so all in all we can write our byte with:

LXI H, 0005h ; unlucky syntax, as this actually means HL instead of just H

MVI M, 3 ; MVI moves an immediate value, MOV only copies between registers and M

By the way, bonus points if you are somehow able to find out just when the address in HL is available on the address bus. The same applies to the 8080 and 8008. Does the CPU copy the register pair’s content to the address bus pins only when actual memory operations take place, or are the address bus pins somehow directly connected to H and L itself? Is that even feasible? I’d really like to find out…

8008

We continue going further back, skipping the 8080 because it was identical in that regard, and arrive at its direct predecessor instead, the Intel 8008. The 8080 and 8085 were source compatible with the 8008 (which, mind you, is not the same as binary compatible… also it may or may not have required some light automated translation), but in the downward direction we have something vital taken from us: While already using 16bit addresses (with only a 14bit address bus, resulting in 16k memory, though), the only instructions that were allowed to contain 16bit immediate values at all are jumps and branches. Consequently, we are left with no way to completely specify our destination address in one instruction!

Instead, we have to access H and L, together forming pseudo register M’s address, one at a time:

LHI 00h

LLI 05h

LMI 3
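To make the role of the M pseudo register concrete, here is a small Python sketch. It is a toy model of the addressing idea only, not an emulator; the class and method names are invented for this illustration, and the mapping to mnemonics in the comments is my own shorthand.

```python
# Toy model of the "M" pseudo register: M is the memory cell that the
# register pair H:L points at. Not an emulator, only the addressing idea.

class HLMachine:
    def __init__(self, address_bits):
        self.h = 0
        self.l = 0
        self.address_mask = (1 << address_bits) - 1
        self.memory = {}  # sparse memory: address -> byte

    def address(self):
        """Combine H (high byte) and L (low byte) into one address."""
        return ((self.h << 8) | self.l) & self.address_mask

    def load_hl(self, value16):
        """8085 style: set both halves with one 16 bit immediate (LXI H)."""
        self.h = (value16 >> 8) & 0xFF
        self.l = value16 & 0xFF

    def load_h(self, value8):
        """8008 style: set only the high half (LHI)."""
        self.h = value8 & 0xFF

    def load_l(self, value8):
        """8008 style: set only the low half (LLI)."""
        self.l = value8 & 0xFF

    def store_m(self, value8):
        """Write an immediate byte to the cell H:L points at."""
        self.memory[self.address()] = value8 & 0xFF


# 8085: one instruction sets the whole address.
i8085 = HLMachine(address_bits=16)
i8085.load_hl(0x0005)
i8085.store_m(3)

# 8008: the address has to be assembled one half at a time.
i8008 = HLMachine(address_bits=14)
i8008.load_h(0x00)
i8008.load_l(0x05)
i8008.store_m(3)

print(i8085.memory, i8008.memory)
```

Both sequences end up writing 3 to location 5; the only difference is how many instructions it takes to get the address into H and L.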

4004

It’s hardly possible to go back further than the Intel 4004, at least if you are only considering single chip CPUs (at the time of its conception in the early 70s, there were already famous multi-chip CPUs with comfortable orthogonal instruction sets, notably the PDPs). Indeed, it was the first widely available single chip CPU. This little thing was a 4-bit CPU with some strange quirks, which we will explore further. Overall, it bears little to no resemblance to its successor in name, the Intel 8008 (except for the internal stack, which both had–I will cover that in another posting).

But let’s just look at the code for writing a value of 3 into the memory location at 5 first:

FIM P0, 5; load address 05h into pair R0,R1

SRC P0   ; set address bus to contents of R0,R1

LDM 3    ; load 3 into accumulator

WRM      ; write accumulator content to memory

That looks a bit strange.

As a 4 bit CPU, the 4004 has 4 bit wide registers and addresses 4 bit nibbles as words in memory. It has only one accumulator on which the majority of operations is performed, but sixteen index registers (R0-R15).

Those index registers are handy for accessing memory: Besides loading values directly from ROM, an instruction exists to load data indirectly, which sets the address bus to the ROM cell’s content. Another instruction performs an indirect jump instead. Other than that, you can just increment index registers, albeit there is the interesting “ISZ” instruction that not only increments, but also branches if the result is not 0.

Because the 4004 uses 8 bits to address the 4 bit nibbles, every two consecutive index registers form a pair, which is then used for memory references.

Note that I explicitly said ROM above. This is because in the 4004 architecture, ROM and RAM are actually vastly different beasts, at least from the assembly programmer’s perspective. You cannot directly access RAM. It always involves index register pairs, manually sending their content to the address bus (with a strangely named instruction “SRC”, which for some reason spells out send register control) and then issuing another instruction which transfers from or to the accumulator.
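The two-step RAM access can be sketched in a few lines of Python. This is a deliberately simplified toy model of the four instructions used in the listing above, not a faithful 4004 emulator: it treats RAM as a flat array of 256 nibbles, which is an assumption made for the sketch, and the class name is invented.

```python
# Toy model of the four 4004 instructions used above. It treats RAM as a
# flat array of 256 nibbles, which is a simplification made for this sketch.

class Toy4004:
    def __init__(self):
        self.registers = [0] * 16   # sixteen 4 bit index registers
        self.accumulator = 0        # single 4 bit accumulator
        self.selected = 0           # address last sent with SRC
        self.ram = [0] * 256        # 4 bit nibbles

    def fim(self, pair, value8):
        """FIM: load an 8 bit immediate into a register pair."""
        self.registers[2 * pair] = (value8 >> 4) & 0xF   # high nibble
        self.registers[2 * pair + 1] = value8 & 0xF      # low nibble

    def src(self, pair):
        """SRC: send the pair's contents out as the RAM address."""
        high = self.registers[2 * pair]
        low = self.registers[2 * pair + 1]
        self.selected = (high << 4) | low

    def ldm(self, value4):
        """LDM: load a 4 bit immediate into the accumulator."""
        self.accumulator = value4 & 0xF

    def wrm(self):
        """WRM: write the accumulator to the previously selected nibble."""
        self.ram[self.selected] = self.accumulator


cpu = Toy4004()
cpu.fim(0, 5)   # FIM P0, 5
cpu.src(0)      # SRC P0
cpu.ldm(3)      # LDM 3
cpu.wrm()       # WRM
print(cpu.ram[5])
```

Note that `wrm` takes no address at all: it relies entirely on whatever `src` selected earlier, which is exactly what makes the sequence look so strange next to a single x86 `mov`.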

Interestingly, accessing regular RAM nibbles is not your only choice among the transfer instructions. You can also transfer from and to I/O ports. But the CPU does not have any direct I/O port; instead, they are available on both the RAM and ROM chips! You can also read and write “RAM status characters”, which to me look like plain regular RAM cells within another namespace. If someone knows, I’d love to hear what they were used for (and if they maybe did behave differently to normal RAM).

Take a look at the data sheet. Within its only 9 pages, the instruction set is depicted on pages 4 and 5. Especially given that fairly reasonable orthogonal instruction sets appear to have been available in multi-chip CPUs, this first single-chip CPU is clearly a strange specialization towards the desk calculator it was meant for (the Busicom 141-PF). It has the aforementioned index register-centered RAM access, separate ROM (although there is a transfer instruction which strangely refers to some optional “read/write program memory”), a three level internal stack which is almost useless for general purpose programming and a lone special purpose instruction for “keyboard process” (KBP).

Original 4004 CPUs go for anything from a few to a few thousand dollars on eBay, depending on their packaging and revision. If you’d like to, you can instead play around with a virtual one in this JavaScript-based, fully fledged assembler, disassembler and emulator, or read the rescued source code of the Busicom 141-PF calculator. There are lots more schematics, data sheets and other resources on the Intel 4004 anniversary project page.

That is, if you are brave enough.

The story of 15 Second Copy for the C-64

by Mike Pall, published with permission.

[This is a follow-up to Thomas Tempelmann’s Story of FCopy for the C-64.]

Ok, I have to make a confession … more than 25 years late:

I’ve reverse-engineered Thomas Tempelmann’s code, added various improvements and spread them around. I guess I’m at least partially responsible for the slew of fast-loaders, fast-copys etc. that circulated in the German C64 scene and beyond. Uh, oh …

I’ve only published AFLG (auto-fast-loader-generator) under my real name in the German “RUN” magazine. It owes quite a bit to TT’s original ideas. I guess I have to apologize to Thomas for not giving proper credit. But back then in the 80s, intellectual property matters weren’t exactly something a kid like me was overly concerned with.

Later on, everyone was soldering parallel-transfer cables to the VIA #1 of the 1541 and plugging them into the C64 userport. This provided extra bandwidth compared to the standard serial cable. It allowed much faster loading of programs with a tiny parallel loader (a file named “!”, that was prepended on all disks). Note that the commercial kits with cables, custom EPROMs and silly dongles followed only much later.

So I wrote “15 second copy”, which worked with a plain parallel cable. Yes, it copied a full 35 track disk in 15 seconds! There was only one down-side: this was only the time for reading/writing from and to disk — you had to swap the floppies seven times (!) and that usually took quite a bit more extra time! ;-)

It worked by transferring the “live” GCR-encoded data from the 1541’s disk head to the C64 and simultaneously doing a fast checksum. Part of the checksumming was done on the 1541, part was done on the C64. There simply weren’t enough cycles left on either side! Most of the transfer happened asynchronously by adjusting for the slightly different CPU frequencies and with only a minimum number of handshakes. This meant meticulous cycle counting and use of some odd tricks.

The raw GCR took up more space (684*324 bytes) in the C64 RAM, so that’s why it required 4 passes. Other copy programs fully decoded the GCR and required only 3 passes. But GCR decoding was rather time-consuming, so they had to skip some sectors and read every track multiple times. OTOH my program was able to read/write at the full 300rpm, i.e. 5 tracks per second plus stepper time, which boils down to 2x ~7.5 seconds for read and write. Yep, you had to swap the floppies every 2 seconds …
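The pass count and timing can be checked with a bit of arithmetic. The Python sketch below uses the figures quoted above; the 60 KiB buffer size is purely an assumption for illustration (I am not claiming that is what the program actually used), and the function names are made up.

```python
import math

# Figures quoted in the text above.
GCR_BLOCKS = 684
GCR_BLOCK_BYTES = 324
DECODED_BLOCK_BYTES = 256
TRACKS = 35
RPM = 300

# Assumption for illustration only: buffer space available on the C64.
ASSUMED_BUFFER_BYTES = 60 * 1024


def passes_needed(total_bytes, buffer_bytes=ASSUMED_BUFFER_BYTES):
    """Number of read/write passes if each pass fills the buffer once."""
    return math.ceil(total_bytes / buffer_bytes)


def disk_swaps(passes):
    """Each pass is one read and one write; the first read needs no swap."""
    return 2 * passes - 1


raw_total = GCR_BLOCKS * GCR_BLOCK_BYTES
decoded_total = GCR_BLOCKS * DECODED_BLOCK_BYTES
raw_passes = passes_needed(raw_total)
decoded_passes = passes_needed(decoded_total)

# One revolution per track at 300 rpm, ignoring head stepping time.
seconds_per_direction = TRACKS / (RPM / 60)

print(raw_total, raw_passes, decoded_passes)
print(disk_swaps(raw_passes), seconds_per_direction)
```

With that assumed buffer, raw GCR needs four passes where decoded data needs three, four passes mean seven swaps, and 15 seconds spread over eight read or write phases comes out at just under 2 seconds each.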

Ok, so I spread the program. For free. I even made a 40 track version, which took 17 seconds. Only to see these coming back in various mutations, with the original credits ripped out, decorated with multiple intros, different groups pretending they wrote it or cracked it (it was free, there was nothing to crack!). The only things they left alone were the copy routines, probably because they were extremely fragile and hard to understand. So it was really easy to recognize my own code. Some of the commercial parallel-cable + ROM kits even bragged about “Backups in 15 seconds!”. These were blatant rip-offs: they basically changed the screen colors and added a check for their dongles. Duh.

Let’s just say this rather frustrating experience taught me a lot and that’s why I’m doing open source today.

So I shelved my plans to write an enhanced version which would try to compress the memory to reduce the number of passes. Ah, yes … I wrote quite a few packers, too … but I’ll save that story for another time.

I still have the disks with the source code somewhere in my basement. But I’m not so sure I’ll be able to read them anymore. They weren’t of high quality to begin with … and I’d have to find my homegrown toolchain, too. ;-)

But I took the time to reverse-engineer my own code from one of the copies that are floating around on the net. For a better understanding of the C64/1541 handshake issues, refer to this article. If you’re wondering about the weird bvc * loops: the 6502 CPU of the 1541 has an SO pin, which is triggered by a full shift register for the data from the disk head. This directly sets the overflow flag in the CPU and allows reading the contents from the shift register with very low latency.

Yes, there’s a lot more weird code in there. For the sake of brevity, here are only the inner loops of the I/O routines for the read, write and verify pass for the C64 and the 1541 side. Enjoy!

  ;--- 1541: Read ---
  ldy #$20
f_read:
  bvc *        ; Wait for disk shift register to fill
  clv
  lda $1c01    ; Load data from disk
  sta $1801    ; Send byte to C64 via parallel cable
  inc $1800    ; Toggle serial pin
  eor $80      ; Compute checksum for 1st GCR byte in $80
  sta $80
  bvc *
  clv
  lda $1c01    ; Load data from disk
  sta $1801    ; Send byte to C64 via parallel cable
  dec $1800    ; Toggle serial pin
  eor $81      ; Compute checksum for 2nd GCR byte in $81
  sta $81
  ; ...
  ; Copy and checksum to $82 $83 $84
  ; And another time for $80 $81 $82 $83 $84 with inverted toggles
  ; ...
  dey
  beq f_read_end
  jmp f_read
f_read_end:
  ; Copy the remaining 4 bytes and checksum to $80 $81 $82
  ; Lots of bit-shifting and xoring to indirectly verify
  ; the sector checksum from the 5 byte xor of the raw GCR data

  ;--- C64: Read ---
  ; Setup ($5d) and ($5f) to point to GCR buffer
  ldy #$00
c_read:
  bit $dd00    ; Wait for serial pin to toggle
  bpl *-3
  lda $dd01    ; Read incoming data (from 1541)
  sta ($5d),y  ; Store to buffer
  iny
  bit $dd00    ; Wait for serial pin to toggle
  bmi *-3
  lda $dd01    ; Read incoming data (from 1541)
  sta ($5d),y  ; Store to buffer
  iny
  bne c_read
c_read2:
  bit $dd00    ; Wait for serial pin to toggle
  bpl *-3
  lda $dd01    ; Read incoming data (from 1541)
  sta ($5f),y  ; Store to second part of buffer
  iny
  bit $dd00    ; Wait for serial pin to toggle
  bmi *-3
  lda $dd01    ; Read incoming data (from 1541)
  sta ($5f),y  ; Store to second part of buffer
  iny
  cpy #$44     ; $100 + $44 = 324 bytes in total
  bne c_read2

  ;--- C64: Write ---
  ; Setup ($5d) and ($5f) to point to GCR buffer
  ldy #$00
  tya
c_write:
  eor ($5d),y  ; Load from buffer and compute checksum
  bit $dd00    ; Wait for serial pin to toggle
  bpl *-3
  sta $dd01    ; Store xor'ed outgoing data (to 1541)
  iny
  eor ($5d),y  ; Load from buffer and compute checksum
  bit $dd00    ; Wait for serial pin to toggle
  bmi *-3
  sta $dd01    ; Store xor'ed outgoing data (to 1541)
  iny
  bne c_write
c_write2:
  eor ($5f),y  ; Load from buffer and compute checksum
  bit $dd00    ; Wait for serial pin to toggle
  bpl *-3
  sta $dd01    ; Store xor'ed outgoing data (to 1541)
  iny
  eor ($5f),y  ; Load from buffer and compute checksum
  bit $dd00    ; Wait for serial pin to toggle
  bmi *-3
  sta $dd01    ; Store xor'ed outgoing data (to 1541)
  iny
  cpy #$44
  bne c_write2
  ldx $5b
  sta $0200,x  ; Store checksum for verify pass
  inx
  stx $5b

  ;--- 1541: Write ---
  ldy #$a2
  lda #$00
f_write:
  bvc *        ; Wait for disk shift register to clear
  clv
  eor $1801    ; Xor with incoming data (from C64)
  sta $1c01    ; Write data to disk shift register
  dec $1800    ; Toggle serial pin
  lda $1801    ; Reload data to undo xor for next byte
  bvc *        ; Wait for disk shift register to clear
  clv
  eor $1801    ; Xor with incoming data (from C64)
  sta $1c01    ; Write data to disk shift register
  inc $1800    ; Toggle serial pin
  lda $1801    ; Reload data to undo xor for next byte
  dey
  bne f_write

  ;--- 1541: Verify ---
  ; Get checksum computed by c_write on the C64 side
  ldy #$a2
f_verify:
  bvc *        ; Wait for disk shift register to fill
  clv
  eor $1c01    ; Xor with data from disk
  bvc *        ; Wait for disk shift register to fill
  clv
  eor $1c01    ; Xor with data from disk
  dey
  bne f_verify
  ; Verify is ok if checksum is zero
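The checksum trick in the write and verify loops is easier to see without the cycle counting. The following Python sketch is my reading of the listing above, reduced to the XOR logic: timing, handshakes and the hardware registers are ignored, the function names are invented, and the block contents are arbitrary test data rather than real GCR.

```python
from functools import reduce
from operator import xor


def c64_encode(block):
    """Model of c_write: send the running XOR of the buffer bytes."""
    running = 0
    sent = []
    for byte in block:
        running ^= byte
        sent.append(running)
    return sent, running   # running now equals the block checksum


def drive_decode(sent):
    """Model of f_write: XOR each incoming byte with the previous one."""
    previous = 0
    decoded = []
    for byte in sent:
        decoded.append(previous ^ byte)
        previous = byte
    return decoded


def drive_verify(checksum, bytes_on_disk):
    """Model of f_verify: the result is zero when the data matches."""
    return reduce(xor, bytes_on_disk, checksum)


# Arbitrary 324 byte test block (not real GCR data).
block = [(i * 37 + 11) & 0xFF for i in range(324)]
sent, checksum = c64_encode(block)
written = drive_decode(sent)

print(written == block, drive_verify(checksum, written))
```

The point of sending the running XOR instead of the plain bytes is that the C64 ends up with the block checksum for free, while the 1541 only needs one extra `eor` per byte to recover the original data.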