In the series about the assemblers Commodore used for developing the ROMs of their 8-bit computers, this article covers the 1975 “MOS Cross-Assembler”, which was available for various mainframes of the era.
- Part 0: Overview
- Part 1: MOS Cross-Assembler ← this article
- Part 2: MOS Resident Assembler
- Part 3: BSO CY6502
- Part 4: HCD65
- Part 5: 6502ASM
MOS Technology, Inc. released two assemblers for the newly introduced 6502 architecture: the “Cross-Assembler”, available for various mainframes and minicomputers, and the “Resident Assembler”, running natively on 6502 systems.
According to Norm Farrington, MOS contracted with the company COMPAS to write the Cross-Assembler. COMPAS specialized in 6502-related software and hardware and developed some of the official MOS peripherals for the KIM-1.
The Cross-Assembler was written in FORTRAN[1] and used a 6-bit (i.e. all uppercase) character encoding. On a CDC Cyber 175 system, it required 120K (!) 60-bit words of memory to run and had “generally acceptable” response times.
The Resident Assembler (part 2 of the series) was then developed (also by COMPAS) using the Cross-Assembler, and designed to be compatible in that it understood the same source format, with the same math features and the same directives and options. This way, MOS defined the basic format supported by all future Commodore assemblers.
Today, you would expect a cross-assembler to run on Linux, Windows or Mac. But these were different times: When the 6502 was released, microcomputers were just appearing – and the 6502 was a part of this shift – and even the first fridge-sized minicomputers like the PDP series had only been introduced 10 years earlier. Most companies that used computers at the time used terminals that dialed into mainframes hosted in computing centers. And the market was very fragmented.
The 6502 Programming Manual (6500-50A) states:
“The Cross Assembler is available on various time share systems or for batch use on the user’s system.”
According to the 6502 Cross Assembler Manual (6500-60P), the supported platforms as of August 1975 were:
- GE Timesharing, running on GE/Honeywell mainframes
- National CSS time-sharing, running VP/CSS on IBM System/360 and System/370 mainframes
And the 1975 MOS Technology marketing brochure states:
“Current plans involve having the software available on several of the more popular Time Sharing services.
In addition, it will be available for deck sales. Batch decks for the CDC, IBM, and PDP-11 class machines are available and we will support several other popular mini and major computer systems in the near future.”
- The 1975 MCS6500 Microprocessor Software Support brochure shows the Cross-Assembler running on the United Computing Systems (UCS) time-sharing service.
- A 1976 dissertation shows that the Iowa State University ran the Cross-Assembler locally on an IBM 360/370.
- A 1977 magazine article mentions support for PDP-8, PDP-10 and PDP-11.
Back then, not all computer systems used the ASCII encoding, and some computers didn’t even support lower case. The encoding of source files is therefore specific to the platform the Cross-Assembler runs on, and only uppercase characters were allowed.
Here is an example from the manual:
    ;
    ;       650X CROSS ASSEMBLER SAMPLE PROGRAM.
    ;
            *=$C000         DEFINE ORIGIN.
            LDX #$FF        SET UP STACK.
            TXS             LOAD STACK POINTER.
            LDA #$F0        LOAD A WITH HEX F0.
            STA ASAVE       SAVE A IN ASAVE.
    ;
    ;       ALLOCATE SAVE AREA.
    ;
            *=$0000
    ASAVE   *=*
            .END
- Lines starting with ; are comments and will be ignored.
- Labels start at the first column, everything else is indented.
- Comments may follow the operand or the operand-less mnemonic. No ; is required.
- The Cross-Assembler uses the *= syntax to define the current assembly address.
The rule about the start columns of labels and assembly statements is actually more relaxed, as this very compressed example from the C64 KERNAL shows:
    ;COMMAND SERIAL BUS DEVICE TO LISTEN
    ;
    LISTN ORA #$20 ;MAKE A LISTEN ADR
    JSR RSP232 ;PROTECT SELF FROM RS232 NMI'S
    LIST1 PHA
Since all mnemonics and register names are reserved keywords and cannot be used for labels, the assembler does not enforce indenting for assembly statements. The JSR RSP232 starting at the first column is legal. In fact, even labels may be indented. This example prefixes the comments after statements with a semicolon (;), which is legal because the assembler ignores everything after the statement anyway.
If you want to go for maximum readability (and don’t care about the size of the source), you could also indent the above example like this:
    ; COMMAND SERIAL BUS DEVICE TO LISTEN
    ;
    LISTN  ORA #$20        ; MAKE A LISTEN ADR
           JSR RSP232      ; PROTECT SELF FROM RS232 NMI'S
    LIST1  PHA
Labels can be up to 6 characters in length, so one could use 7 character indents for statements so that they always line up. That’s also how the LST output is formatted, which will be described later.
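The label rule and the 7-character indent can be sketched in a few lines of Python. This is only an illustration, not the assembler's actual logic: the name `reindent` is made up, and the sketch handles only comment lines and statements that start with a 6502 mnemonic (no directives, no *= and no NAME=value assignments).

```python
# Hypothetical sketch: because mnemonics are reserved words, the first
# token of a line is a label exactly when it is not a mnemonic.
MNEMONICS = set("""
ADC AND ASL BCC BCS BEQ BIT BMI BNE BPL BRK BVC BVS CLC CLD CLI CLV
CMP CPX CPY DEC DEX DEY EOR INC INX INY JMP JSR LDA LDX LDY LSR NOP
ORA PHA PHP PLA PLP ROL ROR RTI RTS SBC SEC SED SEI STA STX STY TAX
TAY TSX TXA TXS TYA
""".split())

def reindent(line: str, indent: int = 7) -> str:
    """Put the label (if any) in column 1 and the statement after `indent` columns."""
    stripped = line.strip()
    if not stripped or stripped.startswith(";"):
        return stripped                       # blank line or full-line comment
    first, _, rest = stripped.partition(" ")
    if first in MNEMONICS:                    # statement without a label
        label, statement = "", stripped
    else:                                     # anything else is taken as a label
        label, statement = first, rest.strip()
    return (label.ljust(indent) + statement).rstrip()

for source_line in ["LISTN ORA #$20", "JSR RSP232", "LIST1 PHA"]:
    print(reindent(source_line))
```

With labels limited to 6 characters, padding the label field to 7 characters always leaves at least one space before the statement.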
The accepted syntax of assembly statements matches the one in the 6502 Programming Manual. This includes the syntax for statements that take the accumulator as the argument:
    ASL A
    LSR A
    ROL A
    ROR A
Modern assemblers usually allow omitting the A; the Cross-Assembler does not.
One additional feature is an alternative syntax for the indirect, Y-indexed addressing mode. In addition to the standard form

    LDA (SAVIL),Y

the following syntax, without the comma, is accepted:

    LDA (SAVIL)Y
Operands of assembly statements can use hexadecimal ($), octal (@), binary (%) and decimal (no prefix) constants. Mathematical expressions using +, -, * and / are possible, but they are always evaluated left-to-right with no operator precedence and no parenthetical grouping. Character/string literals are prefixed with a single quote (').
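To make the left-to-right rule concrete, here is a minimal Python sketch of such an evaluator. It is not taken from the Cross-Assembler: the function names are invented, symbols and character literals are left out, and truncating integer division is an assumption.

```python
import re

# Hypothetical sketch of left-to-right evaluation without precedence.
PREFIX_BASE = {"$": 16, "@": 8, "%": 2}

def parse_constant(token: str) -> int:
    """Convert a constant with an optional $, @ or % prefix to an integer."""
    if token[0] in PREFIX_BASE:
        return int(token[1:], PREFIX_BASE[token[0]])
    return int(token, 10)                     # no prefix: decimal

def evaluate(expr: str) -> int:
    """Apply + - * / strictly in the order they appear."""
    tokens = re.findall(r"[$@%]?[0-9A-F]+|[-+*/]", expr.replace(" ", ""))
    value = parse_constant(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        n = parse_constant(operand)
        if op == "+":
            value += n
        elif op == "-":
            value -= n
        elif op == "*":
            value *= n
        else:
            value //= n                       # assumption: integer division
    return value

print(evaluate("2+3*4"))   # 20, not 14: the addition happens first
```

A programmer who wants 2+(3*4) therefore has to reorder the expression to 3*4+2.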
The Cross-Assembler understands the following directives:
- .BYTE: store one or more bytes
- .WORD: store one or more words (little endian)
- .DBYTE: store one or more words (big endian)
- .PAGE: optionally set a section title, and cause an LST page break
- .SKIP: insert a number of blank lines into the LST
- .END: stop assembly; not required but suggested at end of file
- .OPT: set or clear a list of options; the NO prefix clears an option
  - XREF / NOXREF: add a cross-reference to the end of the LST
  - ERRORS / NOERRORS: write errors into separate file
  - COUNT / NOCOUNT: add mnemonic statistics to the end of the LST
  - LIST / NOLIST: enable LST output
  - MEMORY / NOMEMORY: enable writing object file
  - GENERATE / NOGENERATE: verbose printing of character strings in the LST
Only the first three characters after the period are actually checked.
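The difference between .WORD and .DBYTE, and the three-character matching, can be illustrated with a short Python sketch. The function names are invented for this example, and the value $06E7 is inferred from the bytes E7 06 that the manual's listing shows for the first .WORD operand.

```python
# Hypothetical sketch of the two 16-bit directives and of directive matching.

def encode_word(value: int) -> bytes:
    """.WORD: low byte first (little endian), as the 6502 expects addresses."""
    return value.to_bytes(2, "little")

def encode_dbyte(value: int) -> bytes:
    """.DBYTE: high byte first (big endian)."""
    return value.to_bytes(2, "big")

def directive_key(name: str) -> str:
    """Only the three characters after the period are significant."""
    return name[1:4]

print(encode_word(0x06E7).hex(" ").upper())    # E7 06
print(encode_dbyte(0x06E7).hex(" ").upper())   # 06 E7
print(directive_key(".BYTE") == directive_key(".BYT"))  # True
```

If only three characters are compared, an abbreviated spelling such as .BYT would be treated the same as .BYTE.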
Like most development tools from the 1970s, the assembler can create a so-called listing file (suggested file extension .LST) during assembly that shows the source and the generated bytes side-by-side and is meant to be printed on paper. Here is an example from the manual:
    CARD # LOC     CODE        CARD
       1                       CR=15
       2                       LF=12
       3                       ; LOW CORE DATA AREAS
       4  0000  E7 06          TEMTBL .WORD G3TEM, G1TEM
       5  0002  E7 05
       6                       GROUP=B10
       7  0004  00             THI    .BYTE 0
       8  0005  00             TLO    .BYTE 0
       9  0006  00 00 00       3PER   .WORD 0
    ***** ERROR ** LABEL DOESN'T BEGIN WITH ALPHABETIC CHARACTER - NEAR COLUMN 1
      10  0009  B1 0E          NEXT   LDA (SAVIL)Y
    [...]
     269  07C9  C9 3B                 CMP #';
     270  07CB  00 00                 BEQ DONE
    ***** ERROR ** UNDEFINED SYMBOL - NEAR COLUMN 18
     280                              .END

    END OF MOS/TECHNOLOGY 650X ASSEMBLY VERSION 4
    NUMBER OF ERRORS = 2, NUMBER OF WARNINGS = 0
The first column (CARD #) is the line number in the source. The LOC field is the memory address, which is followed by the output bytes (CODE) and the source line (CARD), which the assembler re-indented for readability.
CARD nomenclature stems from 1960s mainframes, where each line of text was represented by one punch card.
Error messages are shown as extra lines after the line that caused the error. Note that the assembler will keep working through the file no matter what, and will output placeholder bytes for lines with errors.
The main output of the assembler is the binary program, which is in the form of a so-called “interface file” with a suggested file extension of
The diverse set of platforms that the Cross-Assembler ran on all had different word sizes, and many of them measured memory only in words and did not even have a concept of (8-bit) “bytes”. Therefore, the assembler could not output a binary file, but instead wrote a portable, hex-encoded text file, like this:
    ;18E500A200A0DC60A228A01960B00786D684D3206CE5A6D6A4D3600D8C
    ;18E51820A0E5A9008D910285CFA9488D8F02A9EB8D9002A90A8D890C62
    [...]
    ;06FFFA43FEE2FC48FF0665
    ;0001200021
Every line consists of the following characters:
- ;: first character of each record
- 2 chars: number of data bytes to follow
- 4 chars: load address of first byte of line
- 2 chars (repeated): data byte
- 4 chars: checksum
The last line is
- ;00: identifier for last line
- 4 chars: number of preceding lines
- 4 chars: checksum
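The checksum is the 16-bit sum of all preceding bytes of the line, including the byte count and the two address bytes. In the first record above, $18 + $E5 + $00 plus the 24 data bytes add up to $0D8C.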
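The record layout described above can be checked with a small Python sketch. The function name `parse_record` and its return convention are assumptions made for this illustration; the records fed to it are the ones from the sample file.

```python
# Hypothetical sketch: decode one record of the interface file and verify it.

def parse_record(line: str) -> tuple[int, bytes]:
    """Return (load_address, data) for a data record,
    or (record_count, b"") for the final ';00' record."""
    if not line.startswith(";"):
        raise ValueError("record must start with ';'")
    raw = bytes.fromhex(line[1:].strip())      # hex text -> raw bytes
    if len(raw) < 5:
        raise ValueError("record too short")
    body, checksum = raw[:-2], int.from_bytes(raw[-2:], "big")
    if sum(body) & 0xFFFF != checksum:         # sum covers count, address, data
        raise ValueError("checksum mismatch")
    count = body[0]
    field = int.from_bytes(body[1:3], "big")   # address, or line count if count == 0
    data = body[3:]
    if len(data) != count:
        raise ValueError("byte count does not match data length")
    return field, data

address, data = parse_record(";06FFFA43FEE2FC48FF0665")
print(hex(address), data.hex())                # 0xfffa 43fee2fc48ff
print(parse_record(";0001200021"))             # (288, b'')
```

Treating the final record like a data record with zero data bytes keeps the sketch short: its 4-character field then holds the number of preceding lines instead of an address.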
This text-only file is platform-independent and can easily be transferred between different computer systems, and e.g. downloaded from a time-sharing system in order to write it to an EPROM.
Use at Commodore
The Cross-Assembler was used at MOS to create the very first 6502 code, like the KIM-1 ROM (shown above), or the TIM ROM (MCS6530-004).
A large amount of the original Commodore source code has been preserved, and all code before 1984 is in a format very similar to the original MOS definition supported by both the Cross-Assembler and the Resident Assembler. So at first sight, it is not so clear which assembler Commodore used for developing ROMs of the PET, VIC-20 etc. and the disk drives.
The next article in the series will discuss this topic further.
[1] An earlier version of the article stated that the original MOS Cross-Assembler had been implemented by a grad student at the University of Illinois at Urbana-Champaign. In fact, the linked paper is only about the COMPAS/MOS Cross-Assembler as it was used on the university’s CDC computer.