Commodore's Assemblers: Part 1: MOS Cross-Assembler

In the series about the assemblers Commodore used for developing the ROMs of their 8-bit computers, this article covers the 1975 “MOS Cross-Assembler”, which was available for various mainframes of the era.

Series Overview

Part 0: Overview
Part 1: MOS Cross-Assembler ← this article
Part 2: MOS Resident Assembler
Part 3: BSO CY6502
Part 4: HCD65
Part 5: 6502ASM

History

MOS Technology, Inc. released two assemblers for the newly introduced 6502 architecture: the “Cross-Assembler”, available for various mainframes and minicomputers, and the “Resident Assembler”, running natively on 6502 systems.

According to Norm Farrington, MOS contracted with the company COMPAS to write the Cross-Assembler. COMPAS specialized in 6502-related software and hardware and developed some of the official MOS peripherals for the KIM-1.

The Cross-Assembler was written in FORTRAN¹ and used a 6-bit (i.e. all uppercase) character encoding. On a CDC Cyber 175 system, it required 120K (!) 60-bit words of memory to run and had “generally acceptable” response times.

The Resident Assembler (part 2 of the series) was then developed (also by COMPAS) using the Cross-Assembler, and designed to be compatible in that it understood the same source format, with the same math features and the same directives and options. This way, MOS defined the basic format supported by all future Commodore assemblers.

Platforms

Today, you would expect a cross-assembler to run on Linux, Windows or Mac. But these were different times: When the 6502 was released, microcomputers were just appearing – and the 6502 was a part of this shift – and even the first fridge-sized minicomputers like the PDP series had only been introduced 10 years earlier. Most companies that used computers at the time used terminals that dialed into mainframes hosted in computing centers. And the market was very fragmented.

The 6502 Programming Manual (6500-50A) states:

The Cross Assembler is available on various time share systems or for batch use on the user’s system.

According to the 6502 Cross Assembler Manual (6500-60P), the supported platforms as of August 1975 were:

GE Timesharing, running on GE/Honeywell mainframes
National CSS time-sharing, running VP/CSS on IBM System/360 and System/370 mainframes

The 1975 MOS Technology marketing brochure adds:

Current plans involve having the software available on several of the more popular Time Sharing services.

In addition, it will be available for deck sales. Batch decks for the CDC, IBM, and PDP-11 class machines are available and we will support several other popular mini and major computer systems in the near future.

Furthermore:

The 1975 MCS6500 Microprocessor Software Support brochure shows the Cross-Assembler running on the United Computing Systems (UCS) time-sharing service.
A 1976 dissertation shows that Iowa State University ran the Cross-Assembler locally on an IBM 360/370.
A 1977 magazine article mentions support for PDP-8, PDP-10 and PDP-11.

Basic Syntax

Back then, not all computer systems used the ASCII encoding, and some computers didn’t even support lower case. The encoding of source files is therefore specific to the platform the Cross-Assembler runs on, and only uppercase characters are allowed.

Here is an example from the manual:

;
; 650X CROSS ASSEMBLER SAMPLE PROGRAM.
;
 *=$C000     DEFINE ORIGIN.
 LDX #$FF    SET UP STACK.
 TXS         LOAD STACK POINTER.
 LDA #$F0    LOAD A WITH HEX F0.
 STA ASAVE   SAVE A IN ASAVE.
;
; ALLOCATE SAVE AREA.
;
 *=$0000
ASAVE *=*
 .END

Lines starting with ; are comments and will be ignored.
Labels start at the first column, everything else is indented.
Comments may follow the operand or the operand-less mnemonic. No ; is necessary.
The Cross-Assembler uses the *= syntax to define the current assembly address.

The rule about the start columns of labels and assembly statements is actually more relaxed, as this very compressed example from the C64 KERNAL shows:

;COMMAND SERIAL BUS DEVICE TO LISTEN
;
LISTN ORA #$20 ;MAKE A LISTEN ADR
JSR RSP232 ;PROTECT SELF FROM RS232 NMI'S
LIST1 PHA

Since all mnemos and register names are reserved keywords and cannot be used for labels, the assembler does not enforce indenting for assembly statements. The JSR RSP232 starting at the first column is legal. In fact, even labels may be indented. This example prefixes the comments after statements with ;, which is legal because the assembler ignores everything after the statement anyway.

If you want to go for maximum readability (and don’t care about the size of the source), you could also indent the above example like this:

; COMMAND SERIAL BUS DEVICE TO LISTEN
;
LISTN  ORA #$20       ; MAKE A LISTEN ADR
       JSR RSP232     ; PROTECT SELF FROM RS232 NMI'S
LIST1  PHA

Labels can be up to 6 characters in length, so one could use 7-character indents for statements so that they always line up. That’s also how the LST output is formatted, which will be described later.
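To illustrate why indentation does not matter, here is a minimal Python sketch of how a line can be split into its fields purely by checking for reserved mnemonics. This is not the original FORTRAN logic: the helper name split_line is made up, it only handles plain instructions (no directives, no *= and no NAME=value assignments), and it assumes operands contain no spaces.

```python
# The 56 documented 6502 mnemonics; all of them are reserved words.
MNEMONICS = set("""
ADC AND ASL BCC BCS BEQ BIT BMI BNE BPL BRK BVC BVS CLC CLD CLI CLV CMP CPX CPY
DEC DEX DEY EOR INC INX INY JMP JSR LDA LDX LDY LSR NOP ORA PHA PHP PLA PLP ROL
ROR RTI RTS SBC SEC SED SEI STA STX STY TAX TAY TSX TXA TXS TYA
""".split())

# Mnemonics that never take an operand, so anything after them is a comment.
IMPLIED = set("""
BRK CLC CLD CLI CLV DEX DEY INX INY NOP PHA PHP PLA PLP RTI RTS SEC SED SEI
TAX TAY TSX TXA TXS TYA
""".split())


def split_line(line):
    """Split one source line into (label, mnemonic, operand, comment).

    Hypothetical helper for illustration; instructions only.
    """
    if line.lstrip().startswith(";"):
        return (None, None, None, line.strip())
    tokens = line.split()
    if not tokens:
        return (None, None, None, "")
    label = None
    # A first token that is not a reserved mnemonic must be a label,
    # no matter which column it starts in.
    if tokens[0] not in MNEMONICS:
        label = tokens.pop(0)
    if not tokens:
        return (label, None, None, "")
    mnemonic = tokens.pop(0)
    operand = None
    if mnemonic not in IMPLIED and tokens:
        operand = tokens.pop(0)
    # Whatever is left is a comment, with or without a leading ';'.
    return (label, mnemonic, operand, " ".join(tokens))
```

With this, split_line("JSR RSP232 ;PROTECT SELF FROM RS232 NMI'S") yields no label even though the statement starts in the first column, and split_line(" TXS         LOAD STACK POINTER.") treats the trailing text as a comment because TXS takes no operand.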

Assembly Statements

The accepted syntax of assembly statements matches the one in the 6502 Programming Manual. This includes the syntax for statements that take the accumulator as the argument:

ASL A
LSR A
ROL A
ROR A

Modern assemblers usually allow omitting the A; the Cross-Assembler does not.

One additional feature is an alternative syntax for the indirect, Y-indexed addressing mode. In addition to

LDA (PNT),Y

the following syntax is accepted:

LDA (PNT)Y

Expressions

Operands of assembly statements can use hexadecimal ($), octal (@), binary (%) and decimal (no prefix) constants. Mathematical expressions using +, -, * and / are possible, but they are always evaluated left-to-right with no operator precedence and no parenthetical grouping. Character/string literals are prefixed with '.
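The practical effect of strict left-to-right evaluation is easiest to see in a small sketch. The following Python code is an illustration only, not the assembler's actual implementation: the function names are made up, symbols are not supported, integer division is an assumption, and character literals are converted using ASCII, whereas the real encoding depended on the host platform.

```python
import re

# A token is a prefixed constant, a character literal, or an operator.
TOKEN = re.compile(r"\$[0-9A-F]+|@[0-7]+|%[01]+|[0-9]+|'.|[-+*/]")


def parse_constant(tok):
    """Convert one constant token to an integer."""
    if tok[0] == "$":
        return int(tok[1:], 16)   # hexadecimal
    if tok[0] == "@":
        return int(tok[1:], 8)    # octal
    if tok[0] == "%":
        return int(tok[1:], 2)    # binary
    if tok[0] == "'":
        return ord(tok[1])        # assumption: ASCII character codes
    return int(tok, 10)           # decimal, no prefix


def evaluate(expr):
    """Evaluate strictly left to right, with no operator precedence."""
    tokens = TOKEN.findall(expr)
    result = parse_constant(tokens[0])
    # Tokens alternate: constant, operator, constant, operator, ...
    for op, tok in zip(tokens[1::2], tokens[2::2]):
        value = parse_constant(tok)
        if op == "+":
            result += value
        elif op == "-":
            result -= value
        elif op == "*":
            result *= value
        else:
            result //= value      # assumption: integer division
    return result
```

So evaluate("2+3*4") returns 20, not the 14 that conventional precedence would give. Since there is no parenthetical grouping, the only way to control the result is to reorder the terms, as in 3*4+2.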

Directives

The Cross-Assembler understands the following directives:

.BYTE: store one or more bytes
.WORD: store one or more words (little endian)
.DBYTE: store one or more words (big endian)
.PAGE: optionally set a section title, and cause an LST page break
.SKIP: insert a number of blank lines into the LST
.END: stop assembly; not required but suggested at end of file
.OPT: set or clear a list of options
- XREF – NOXREF: add a cross-reference to the end of the LST
- ERRORS – NOERRORS: write errors into separate file
- COUNT/CNT – NOCOUNT: add mnemo statistics to the end of the LST
- LIST – NOLIST: enable LST output
- MEMORY – NOMEMORY: enable writing object file
- GENERATE – NOGENERATE: verbose printing of character strings in the LST

Only the first three characters after the period are actually checked.
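As a rough sketch of the three data directives and the three-character matching, here is some Python. The function name emit is hypothetical, and silently truncating out-of-range values is an assumption made to keep the example short; the real assembler may well have reported an error instead.

```python
def emit(directive, values):
    """Return the bytes a data directive would store for integer values."""
    # Only the three characters after the period are looked at.
    key = directive.upper()[1:4]
    out = bytearray()
    for v in values:
        if key == "BYT":
            out.append(v & 0xFF)
        elif key == "WOR":
            # .WORD: little endian, low byte first
            out += bytes([v & 0xFF, (v >> 8) & 0xFF])
        elif key == "DBY":
            # .DBYTE: big endian, high byte first
            out += bytes([(v >> 8) & 0xFF, v & 0xFF])
        else:
            raise ValueError("not a data directive: " + directive)
    return bytes(out)
```

For example, emit(".WORD", [0x06E7]) produces E7 06, while emit(".DBYTE", [0x06E7]) produces 06 E7. Because of the three-character rule, a misspelling like .WORDS would be treated the same as .WORD in this sketch.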

LST File

Like most development tools from the 1970s, the assembler can create a so-called listing file (suggested file extension .LST) during assembly that shows the source and the generated bytes side-by-side and is meant to be printed on paper. Here is an example from the manual:

 CARD # LOC     CODE        CARD
    1                   CR=15
    2                   LF=12
    3                   ; LOW CORE DATA AREAS
    4  0000  E7 06      TEMTBL   .WORD G3TEM, G1TEM
    5  0002  E7 05
    6                   GROUP=B10
    7  0004  00         THI     .BYTE 0
    8  0005  00         TLO     .BYTE 0
    9  0006  00 00 00   3PER    .WORD 0
***** ERROR ** LABEL DOESN'T BEGIN WITH ALPHABETIC CHARACTER - NEAR COLUMN 1
   10  0009  B1 0E      NEXT    LDA (SAVIL)Y
[...]
  269  07C9  C9 3B              CMP #';
  270  07CB  00 00              BEQ DONE
*****  ERROR ** UNDEFINED SYMBOL - NEAR COLUMN 18
  280                           .END

END OF MOS/TECHNOLOGY 650X ASSEMBLY VERSION 4
NUMBER OF ERRORS =    2,   NUMBER OF WARNINGS =    0

The first column (CARD #) is the line number in the source. The LOC field is the memory address, which is followed by the output bytes (CODE) and the source line (CARD), which the assembler re-indented for readability.

The CARD nomenclature stems from 1960s mainframes, where each line of text was represented by one punch card.

Error messages are shown as extra lines after the line that caused the error. Note that the assembler will keep working through the file no matter what, and will output placeholder bytes for lines with errors.

The original LST printout of the KIM-1 ROM, which was part of the user manual, is a real-world example of a 1200-line LST file, with the symbol table and the mnemo statistics (.OPT COUNT) at the end.

OBJ File

The main output of the assembler is the binary program, which is in the form of a so-called “interface file” with a suggested file extension of .OBJ.

The diverse set of platforms that the Cross-Assembler ran on all had different word sizes, and many of them measured memory only in words and did not even have a concept of (8-bit) “bytes”. Therefore, the assembler could not output a binary file, but instead wrote a portable, hex-encoded text file, like this:

;18E500A200A0DC60A228A01960B00786D684D3206CE5A6D6A4D3600D8C
;18E51820A0E5A9008D910285CFA9488D8F02A9EB8D9002A90A8D890C62
 [...]
;06FFFA43FEE2FC48FF0665
;0001200021

Every line consists of the following characters:

;: first character of each record
2 chars: number of data bytes to follow
4 chars: load address of first byte of line
2 chars (repeated): data byte
4 chars: checksum

The last line consists of:

;00: identifier for last line
4 chars: number of preceding lines
4 chars: checksum

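The checksum is the 16-bit sum of all preceding bytes of the line, i.e. the byte count, the two address bytes and the data bytes. For the record ;06FFFA43FEE2FC48FF0665 above, $06 + $FF + $FA + $43 + $FE + $E2 + $FC + $48 + $FF = $0665.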
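The record layout described above can be decoded in a few lines. Here is a minimal Python sketch; the function name parse_record is made up, and wrapping the checksum at 16 bits is an assumption based on the four-character field.

```python
def parse_record(line):
    """Decode one ';' record into (count, field, data).

    For data records, 'field' is the load address. For the final
    record (count 0), it is the number of preceding lines.
    """
    line = line.strip()
    if not line.startswith(";"):
        raise ValueError("record must start with ';'")
    raw = bytes.fromhex(line[1:])
    count = raw[0]
    field = int.from_bytes(raw[1:3], "big")     # high byte first
    data = raw[3:3 + count]
    checksum = int.from_bytes(raw[3 + count:5 + count], "big")
    # Sum of count, address and data bytes (assumed to wrap at 16 bits).
    if sum(raw[:3 + count]) & 0xFFFF != checksum:
        raise ValueError("checksum mismatch")
    return count, field, data
```

Feeding it the records shown above, parse_record(";06FFFA43FEE2FC48FF0665") returns 6 data bytes to be loaded at $FFFA, and parse_record(";0001200021") returns a count of 0 with the line count $0120 in the address field.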

This text-only file is platform-independent and can easily be transferred between different computer systems, and e.g. downloaded from a time-sharing system in order to write it to an EPROM.

Use at Commodore

The Cross-Assembler was used at MOS to create the very first 6502 code, like the KIM-1 ROM (shown above), or the TIM ROM (MCS6530-004).

A large amount of the original Commodore source code has been preserved, and all code before 1984 is in a format very similar to the original MOS definition supported by both the Cross-Assembler and the Resident Assembler. So at first sight, it is not so clear which assembler Commodore used for developing ROMs of the PET, VIC-20 etc. and the disk drives.

The next article in the series will discuss this topic further.

¹ An earlier version of the article stated that the original MOS Cross-Assembler had been implemented by a grad student at the University of Illinois at Urbana-Champaign. In fact, the linked paper is only about the COMPAS/MOS Cross-Assembler as it was used on the university’s CDC computer.

8 thoughts on “Commodore's Assemblers: Part 1: MOS Cross-Assembler”

Mirko

2021-05-15 at 10:52

Thanks for the reconstruction of such a brilliant moment in computing history.

AM

2021-05-16 at 10:35

Cannot wait for more!

Thomas Findeisen

2021-05-20 at 16:30

Thanks for all the effort and the nice work since years….

Thomas Cherryhomes

2021-05-22 at 20:04

Is the cross-assembler preserved? I’d love to compile it on my CYBER 170/865 🙂

-Thom

Norm Farrington

2021-05-22 at 20:41

Thank you for compiling this history. A point of correction or amplification to your “Part One” of this series.

While a graduate student at the University of Illinois perhaps did create a cross assembler, I do not believe he commercialized that effort and it was used only for his thesis.

The 6502 cross assemblers that were available on NCSS, UCS, and GE timesharing and on DEC and several mainframes (in FORTRAN) were commercialized by COMPAS Microsystems and offered by us on those services. Mike Corder was also the VP of COMPAS overseeing the work that he, John Feagans and others did, as well as in the Resident Assembler porting later.

MOS technology contracted with COMPAS to do the original commercially available cross-assemblers and we also did the resident assembler development as well as a higher level language known as CSL.

Thank you,
Norm Farrington,
Past Chairman & CEO of Computer Applications Corporation, DBA COMPAS.
- Michael Steil
  
  2021-05-22 at 21:34
  
  Thank you for the correction! The article has been updated. For reference, the original statement has been moved to a footnote.

Thomas Cherryhomes

2022-06-27 at 06:06

Am still looking for this exact assembler, if anyone has it on a paper punch deck or tape?