Category Archives: 6502

How many Commodore 64 computers were really sold?

Nobody doubts that the C64 was the greatest selling single computer model of all time, it even made it into the Guinness Book of World Records, but nobody quite knows how many it really was: Most sources say 17 million, others say 22 or even 30 million. With a high degree of confidence, I can now say that Commodore only sold 12.5 million units – how I would know that, you ask, and how do I dare to contradict well-known facts? By analyzing serial numbers!

But let us first examine the existing claims of 17, 22 and 30 million.

Jack Tramiel’s numbers

The numbers 22 and 30 million actually come from Commodore founder Jack Tramiel himself. At the “Impact of the Commodore 64: A 25th Anniversary Celebration” in December 2007 at the Computer History Museum, Tramiel claimed that Commodore sold nearly half a million C64s a month until he left the company in 1984, and he extrapolated this to “between 22 and 30 million units” during the lifetime of the C64. Tramiel’s assistant Michael Tomczyk, who left the company around the same time, uses the same 22 million figure in his 1984 book “Home Computer Wars – An Insider’s Account of Commodore and Jack Tramiel”.

Commodore’s Numbers

But these numbers contradict Commodore’s official sales numbers as researched by Marc Walters: The 1993 Annual Report supposedly states that 17 million units had been sold (production was stopped in April 1994, so this should be very close to the final number). Commodore employee Dr. Peter Kittel confirms the 17 million.

Walters quotes Commodore’s 1993 financial details for the sales numbers between 1990 and 1993:

The following figures are from Commodore annual reports:
Fiscal 1990: 700K - 800K (decline begins),
fiscal 1991: 800K (non official 1M)
fiscal 1992: 650K
fiscal 1993: 150K - 200K
1993 Annual Report: 17M total C64, 4.5M C128

Marc Walters’ estimates

Walters also gives his own estimates for 1982 through 1989:

1982: 150K - 300K
1983: 2M
1984: 2M-3M
1985: 2M -3M
1986: 2M -3M
1987: 1M - 2M
1988: 1M - 1.5M
1989: 1M - 1.5M

This data was later picked up by Jeremy Reimer for his Ars Technica article “Total share: 30 years of personal computer market share figures“.

With these numbers, Commodore would have sold between 13.5 and 19 million units. The following graph shows this, with the 17 million figure added as a scaled version of the ranges given:

Putting all reliable data together

This data is quite consistent, but generic sources contradict these steep numbers in the first years:

This data, with Commodore’s own 1990-1993 data added and some of the missing numbers extrapolated, shows a rather flat and consistent sales curve with a single spike in 1984, totalling about 12.5 million units:

In 1984, about 2 to 2.5 million units were sold, which is about 200,000 units per month. There is no way Tramiel’s statement holds true, given the many sources of the early numbers and their consistency. This also falsifies claims of Tramiel’s 22 million and 30 million sold units, which could be approximated by extrapolating the 2.5 million units across a decade – but the existing data shows that the C64 clearly didn’t perform equally well afterwards.

Examining serial numbers

There are two projects that collect C64 serial numbers:

While the former collects lots of details including case badges and board types, the latter contains information like the video standard as well as photos of all serial number stickers. For the following statistics, we only need the C64 Serial Registry.

Commodore 64 computers were produced in at least 11 different factories worldwide with different and not immediately obvious conventions for serial numbers. But all PCBs, of which there were 16 versions, were manufactured in Hong Kong, and most versions were numbered sequentially, starting at 0 for each new version. The serial number could be found on a sticker on the shield of the cartridge port.

The fact that serial numbers are indeed per board version and are reset to zero for every new board can be seen in the following graph, which was created by sorting the serial numbers by board first and then by number:

If the assumption is correct, serial numbers have to increase linearly until they approach the highest serial number of the board version. All boards should have about the same slope. This is true for all board versions except “250425/A” and “250425/B”, as well as the last two, “250469/A” and “250469/A”.

This can be explained by the bias towards machines from North America in the C64 Serial Registry: About one third of the entries in the database are from the US and Canadian market, but a much smaller percentage of units has actually been sold there – the C64 was very strong in Europe. The first, second and fourth of these board types with a much higher slope were all sold exclusively, and the third one predominantly to PAL markets, i.e. outside North America.

(If the slope there is roughly twice as steep, we can estimate NTSC units are overrepresented in the database by a factor of two, meaning that about 1/6 of all C64 were sold in North America.)

These are the 16 different versions and revisions, the year they first appeared in, the number of boards of this kind in the C64 Serial Registry (that have the board serial number field filled), and the maximum serial number observed:

year board num_seem max_seen
1982 326298 4
1982 326298/A 21 325,512
1982 326298/B 6
1982 326298/C 6
1982 KU-14194HB 16
1983 250407/A 7 208,282
1983 250407/B 50 1,152,644
1983 250407/C 18 1,218,502
1984 250425/- 48 1,273,699
1984 250425/A 9 500,165
1984 250425/B 4 536,345
1986 250466 18 438,001
1987 250469/3 14 444,384
1988 250469/4 29 1,124,586
1989 250469/A 16 1,994,012
1990 250469/B 17 2,242,493

(Note that four of the early versions do not have any easily discoverable serial numbers on them. Also note that the PET 64 and Educator 64 devices, which were basically “326298/A” C64 boards in an all-in-one computer, are also counted, but they sold in very small numbers. The same is true for the ill-fated “250469/B”-based C64GS game console. The portable SX-64 has a different board and is not counted.)

Now we can use the formula for the German Tank Problem to estimate the total number produced for each type of board for which we have the maximum observed serial number:

k is the sample size and m is the highest serial number observed.

year board num_seem max_seen total
1982 326298 4
1982 326298/A 21 325,512 341,012
1982 326298/B 6
1982 326298/C 6
1982 KU-14194HB 16
1983 250407/A 7 208,282 238,036
1983 250407/B 50 1,152,644 1,175,696
1983 250407/C 18 1,218,502 1,286,196
1984 250425/- 48 1,273,699 1,300,233
1984 250425/A 9 500,165 555,738
1984 250425/B 4 536,345 670,430
1986 250466 18 438,001 462,333
1987 250469/3 14 444,384 476,125
1988 250469/4 29 1,124,586 1,163,364
1989 250469/A 16 1,994,012 2,118,637
1990 250469/B 17 2,242,493 2,374,403

For the board types without serial numbers, we can approximate the number of boards produced by scaling the result of board “326298/A” to the number of observed units of the missing ones. Since all boards without serials are from 1982 just like “326298/A”, it is probably not a very bad estimate.

(On the other hand, board “KU-14194HB” has only ever been put into machines produced in Germany and sold in Europe, so because of the bias of the database towards North America, this board type might be underrepresented.)

year board num_seem max_seen total
1982 326298 4 64,955
1982 326298/A 21 325,512 341,012
1982 326298/B 6 97,432
1982 326298/C 6 97,432
1982 KU-14194HB 16 259,818
1983 250407/A 7 208,282 238,036
1983 250407/B 50 1,152,644 1,175,696
1983 250407/C 18 1,218,502 1,286,196
1984 250425/- 48 1,273,699 1,300,233
1984 250425/A 9 500,165 555,738
1984 250425/B 4 536,345 670,430
1986 250466 18 438,001 462,333
1987 250469/3 14 444,384 476,125
1988 250469/4 29 1,124,586 1,163,364
1989 250469/A 16 1,994,012 2,118,637
1990 250469/B 17 2,242,493 2,374,403

So according to this estimate, about 12.5 million Commodore 64 computers were produced, which matches the number above.

(By the way, about 6 million of these had the new smaller board “250469/X” with the HMOS chipset, and almost all machines since 1989, which is about 4.5 million, were sold in Europe.)

Commodore Plus/4, C116, C16 (TED) Technical Documents

The Commodore Plus/4, the C16 and the C116 from 1984 were members of the 6502-based “TED” series, named after the 7360 TED (“Text Editing Device”) video controller. The TED systems were basically the low-cost cousins of the C64: The overall system architecture and the video chip are very similar to the C64′s, but they lack certain features like hardware sprites. On the other hand, there are some added features like extra colors and more control over the internal timing of the video chip.

In the Commodore archive at, there is a collection of GIF images that are scans of the some very interesting technical documents on the TED series, originally provided by Tibor Biczo and published by William Levak and Marko Mäkelä. I sorted the pages and converted them into searchable PDFs that are much nicer to look at:

“TED System Hardware Manual”

(PDF, 48 pages, 7.6 MB)

“TED 7360R0 Preliminary Data Sheet” (Apr 1983)

(PDF, 23 pages, 5.8 MB)

“TED Extra Pages”

(PDF, 5 pages, 1.4 MB)

The “Extra Pages” contain a map of the circuit board, a Plus/4 memory map in German, a TED register map, and a German version of section 4.5.2 of the TED System Hardware Manual.

“Service Manual Model Plus 4 Computer.pdf” (Oct 1984, PN-314001-04)

(PDF, 25 pages, 4.9 MB)

The service manual is also taken from

Final Cartridge III Undocumented Functions

The “Final Cartridge III” has been among the most popular Commodore 64 extensions, providing a floppy speeder, BASIC extensions, a machine language monior, a freezer and even a (rarely used) graphical desktop. The major advantage compared to other C64 cartridges is the consistent way in which the Final Cartridge III extends the C64 experience.

As it turns out, there are several undocumented instructions implemented in the Final Cartridge III.

Filtered Directory


The DOS”$” command passes all characters following the “$” to the disk drive, allowing the user to specify filters, like this:


This feature is not available for the “@” command in the monitor.

Fast Format


The 26 second fast format known from the “DESKTOP” GUI is also available from the command line. Note that this also works with the “@” command in the monitor. If the ID is omitted, this only overwrites BAM and directory, just like the “N” command.

Disk Rename


This commands renames the disk without erasing it. The ID can be up to 5 characters, so the default “2A” can be overwritten.

FC III ROM Banking


B, followed by a digit between 0 to 3, in the monitor enables the view of the ROMs of the Final Cartridge III. The specified ROM bank will be visible between $8000 and $BFFF. B without a parameter switches the ROM back off.

The following commands in the machine language monitor can be used to dump the complete ROM of an FC3 to disk:

B 0
T 8000 BFFF 8000
S "B0",08,8000,C000
B 1
T 8000 BFFF 8000
S "B1",08,8000,C000
B 2
T 8000 BFFF 8000
S "B2",08,8000,C000
B 3
T 8000 BFFF 8000
S "B3",08,8000,C000

Standalone Commodore BASIC on the iPhone/iPad

You might remember the hassle about the Commodore 64 emulator in the iPhone App Store about a year ago: First it was approved, but then pulled again, because it allowed access to the C64′s BASIC – general-purpose interpreters were not allowed. After Apple relaxed this restriction, BASIC was added again.

So now it fills me with joy that Ahmad Hawwash managed to get standalone Commodore BASIC into the App Store! His “Hand BASIC – CBM Flavor” is free of charge, runs on iPhone/iPod touch and iPad, and is based on’s Open Source cbmbasic project, a recompiled version of the original Commodore 64 binary, so the BASIC interpreter is itself not interpreted, but runs natively and at full speed, which is in the order of 500 times faster than on a C64.

“Hand BASIC” has LOAD/SAVE support and comes with several demo programs – just type LOAD"$",8 and LIST to see them and LOAD"NAME",8 and RUN to run them. Type the (nonstandard/added) HIDE keyword to hide the keyboard.

What would be very interesting now:

  • Some cool BASIC programs that run in here – games, maybe?
  • A compiler backend that produces Commodore BASIC code, so I can run any code on the iPhone through this – with a 38911 byte RAM limitation, of course. (Actually, a program is free to set the TOPMEM pointer higher, allowing up to 62 KB of RAM for code and variables.)

Any other ideas? :-)

Internals of BRK/IRQ/NMI/RESET on a MOS 6502

After 35 years of measuring the behaviour of the MOS 6502 CPU to better understand what is going on, the Visual6502 simulator finally allows us insight into the chip, so we can understand what the CPU does internally. One interesting thing here is the question how the 6502 handles BRK, IRQ, NMI and RESET.

The Specification

Let’s revisit the documented part first. The 6502 knows three vectors at the top of its address space:

Signal Vector
  • On a RESET, the CPU loads the vector from $FFFC/$FFFD into the program counter and continues fetching instructions from there.
  • On an NMI, the CPU pushes the low byte and the high byte of the program counter as well as the processor status onto the stack, disables interrupts and loads the vector from $FFFA/$FFFB into the program counter and continues fetching instructions from there.
  • On an IRQ, the CPU does the same as in the NMI case, but uses the vector at $FFFE/$FFFF.
  • On a BRK instruction, the CPU does the same as in the IRQ case, but sets bit #4 (B flag) in the copy of the status register that is saved on the stack.

The four operations are very similar, they only differ in the location of the vector, whether they actually push data onto the stack, and whether they set the B flag.

Signal Vector Push PC and P Set B Flag
NMI $FFFA/$FFFB yes no
IRQ $FFFE/$FFFF yes no
BRK $FFFE/$FFFF yes yes


Ignoring opcode fetches, the PLA ROM defines the following cycles of the BRK instruction (6502 Programming Manual, page 131):

  • store PC(hi)
  • store PC(lo)
  • store P
  • fetch PC(lo) from $FFFE
  • fetch PC(hi) from $FFFF


An IRQ does basically the same thing as a BRK, but it clears the B flag in the pushed status byte. The CPU goes through the same sequence of cycles as in the BRK case, which is done like this:

If there is an IRQ pending and the current instruction has just finished, the interrupt logic in the 6502 forces the instruction register (“IR”) to “0″, so instead of executing the next instruction, the PLA will decode the instruction with the opcode 0×00 – which is BRK! Of course it has to kick in a few cycles later again to make sure a B value of 0 is pushed, but otherwise, it’s just the BRK instruction executing.


Not surprisingly, NMI is done the same way: “0″ is injected into the instruction stream, but this time, some extra logic makes sure that the addresses $FFFA/$FFFB are put onto the address bus when fetching the vector.


RESET also runs through the same sequence, but it is the most different of the four cases, since it does not write the current PC and status onto the stack – but this was hacked trivially: The bus cycles exist, but the read/write line is not set to “write”, but “read” instead. The following trace was created with the transistor data from the Visual 6502 project and shows the first nine cycles after letting go of RESET:

#0 AB:00FF D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $00FF = $00

Cycle 0: When a 6502 is turned on, the stack pointer is initialized with zero. The BRK/IRQ/NMI/RESET sequence pulls the instruction register (IR) to 0.

#1 AB:00FF D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $00FF = $00
#2 AB:00FF D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $00FF = $00
#3 AB:0100 D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $0100 = $00

Cycle 3: The first stack access happens at address $0100 – a push first stores the value at $0100 + SP, then decrements SP. In the BRK/IRQ/NMI case, this would have stored the high-byte of the PC. But for RESET, it is a read cycle, not a write cycle, and the result is discarded.

#4 AB:01FF D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $01FF = $00

Cycle 4: SP is now 0xFF (even if the internal state does not reflect that), so the second stack access (which would have been the low-byte of PC) targets 0x01FF. Again, the result is discarded, and SP decremented.

#5 AB:01FE D:00 R/W:1 PC:00FF A:AA X:00 Y:00 SP:00 P:02 IR:00  READ $01FE = $00

Cycle 5: SP is now 0xFE, and the third stack access, (the status register) happens at 0x01FE. SP is decremented again.

#6 AB:FFFC D:E2 R/W:1 PC:00FF A:AA X:00 Y:00 SP:FD P:06 IR:00  READ $FFFC = $E2

Cycle 6: The internal state of the CPU now shows that SP is 0xFD, because it got decremented 3 times for the three fake push operations. The low-byte of the vector is read.

#7 AB:FFFD D:FC R/W:1 PC:00FF A:AA X:00 Y:00 SP:FD P:16 IR:00  READ $FFFD = $FC

Cycle 7: The high-byte of the vector is read.

#8 AB:FCE2 D:A2 R/W:1 PC:FCE2 A:AA X:00 Y:00 SP:FD P:16 IR:00  READ $FCE2 = $A2

Cycle 8: The first actual instruction is fetched.

Since RESET is not timing critical, it doesn’t matter whether a few cycles are wasted by doing the fake stack cycles.

Measuring the ROR Bug in the Early MOS 6502

The MOS 6502 CPU was introduced in September of 1975, and while the documentation described the three shift/rotate instructions ASL, LSR and ROL, the ROR instruction was missing – the documentation said that ROR would be available in chips starting in June 1976. In fact, the reason for this omission was that the instruction, while being present, didn’t behave correctly. Only few 6502s with the defect are in existence, and nobody seemed to have checked what was actually going on in these chips.

Simon C got my KIM-1 working again, which has a 6502 from week 51 of 1975. There are 512 possible inputs to ROR (8 bit A plus 1 bit C; assuming it doesn’t have dependencies on other registers), and roughly two bytes of output: the 8 bit result and the processor status (flags) register. We ran the following programs on the KIM-1 – note that we had to split the task into several programs, because the KIM-1 doesn’t have enough RAM to hold all results.

Program 1a: for A=00..FF, with C=0, execute “ROR A”, save result into array

.C:0200   A2 00      LDX #$00
.C:0202   8A         TXA
.C:0203   18         CLC
.C:0204   6A         ROR A
.C:0205   9D 00 03   STA $0300,X
.C:0208   E8         INX
.C:0209   D0 F7      BNE $0202
.C:020B   00         BRK

Program 1b: for A=00..FF, with C=1, execute “ROR A”, save result into array

.C:0200   A2 00      LDX #$00
.C:0202   8A         TXA
.C:0203   38         SEC
.C:0204   6A         ROR A
.C:0205   9D 00 03   STA $0300,X
.C:0208   E8         INX
.C:0209   D0 F7      BNE $0202
.C:020B   00         BRK

Program 2a: for A=00..FF, with C=0, execute “ROR A”, save flags into array

.C:0200   A2 00      LDX #$00
.C:0202   8A         TXA
.C:0203   38         SEC
.C:0204   6A         ROR A
.C:0205   08         PHP
.C:0206   68         PLA
.C:0207   9D 00 03   STA $0300,X
.C:020a   E8         INX
.C:020b   D0 F5      BNE $0202
.C:020d   00         BRK

Program 2b: for A=00..FF, with C=1, execute “ROR A”, save flags into array

.C:0200   A2 00      LDX #$00
.C:0202   8A         TXA
.C:0203   38         SEC
.C:0204   6A         ROR A
.C:0205   08         PHP
.C:0206   68         PLA
.C:0207   9D 00 03   STA $0300,X
.C:020a   E8         INX
.C:020b   D0 F5      BNE $0202
.C:020d   00         BRK

These are the results:

Program 1a and 1b (result with C=0 and C=1) produced the same output!

00 -> 00
01 -> 02
02 -> 04
03 -> 06
04 -> 08
05 -> 0A
FF -> FE

This is a shift left (!!!) of the input value; and it is independent of the carry flag – just like ASL.

Program 2a (status when C=0) produced the following output:

00:     7E (Z=1, N=0, C=0)
01..3F: 7C (Z=0, N=0, C=0)
40..7F: FC (Z=0, N=1, C=0)
80:     7E (Z=1, N=0, C=0)
81..BF: 7C (Z=0, N=0, C=0)
C0..FF: FC (Z=0, N=1, C=0)

Program 2b (status when C=1) produced the following output:

00:     7F (Z=1, N=0, C=1)
01..3F: 7D (Z=0, N=0, C=1)
40..7F: FD (Z=0, N=1, C=1)
80:     7F (Z=1, N=0, C=1)
81..BF: 7D (Z=0, N=0, C=1)
C0..FF: FD (Z=0, N=1, C=1)

These are the correct flags corresponding to the incorrect results in A. The carry flag is the same as the input carry flag, i.e. it is unmodified.

Our preliminary summary is this:

  ROR          Broken ROR on pre-June '76 CPU (Memory or Accumulator)
  Operation:       |7|6|5|4|3|2|1|0| <- 0
                   +-+-+-+-+-+-+-+-+                    N Z C I D V
                                                        / / _ _ _ _

So ROR on early 6502s does three things wrong:

  1. It shifts left, instead of right (behaves like ASL)
  2. It shifts a zero in, instead of C (behaves like ASL)
  3. It doesn't update C (as if it wasn't a rotate instruction)

All three problems are flags that are sent to the ALU: the shift direction, the input bit, and the carry writeback.

Unresolved questions:

  • We only tested ROR A; other addressing modes of ROR might behave differently. Other addressing modes might even be working - but I doubt that, since MOS would certainly have documented the working ones then.
  • ROR might have more dependencies than A and C.
  • What is it in the chip that causes the bug? I'm sure the fine guys at will be able to figure this one out soon. It is unlikely to be a bug in the PLA ROM, because a bug there would not affect different addressing modes of the same instruction with very different timings. It is more likely that it is in the "Random Control Logic" part.

Measuring the Entropy of the MOS 6502 CPU

Everything can be expressed in bits. It takes 4 kilobits to decompress ZIP data, 25 kilobits to kill a human, 43 megabits for a working Mac OS X kernel, and 10^120 bits to describe our universe. What is the entropy/complexity of the 6502 CPU, you might wonder?

You have probably already seen Visual 6502, a simulator of the 6502 that operates at the transistor level and visualizes what’s going on inside the CPU. The data the program operates on was derived by converting a high-resolution photograph of the 6502 die into polygons, and converting these polygons into the node and transistor configuration.

If you ignore the polygon data, which is only needed for visualization, the data to fully describe the 6502 consists of a set of nodes and a set of transistors.

There are 1725 nodes in the 6502, and 3510 transistors. Every transistor has one node as an input that turns the transistor on or off. If the transistor is on, it connects two other nodes, and if it is off, it separates them. Every node can either be neutral, or be a pullup – the latter means that the node will behave as if it was connected to VCC whenever the path through this node is otherwise floating, i.e. not connected to VCC or GND (or an external pulldown).

So the raw information that describes the 6502 is a set of nodes, for which there is one bit of information, whether it is a pullup node or a neutral node, as well as a set of transistors, which is a three-tuple consisting of gate, c1 and c2.

The file segdefs.js in visual6502 contains (next to the polygon information) the node state: “+” means pullup, “-” (or a missing node description) means neutral. The file transdefs.js contains the transistor tuples.

Stripped from irrelevant data, the 6502 description would look like this: The node states:

BOOL nodes[] = {
    /*  0 */ 1,
    /*  1 */ 0,
    /*  2 */ 0,
    /*  3 */ 1,
    /*  4 */ 1,
    /*  5 */ 1,
    /*  6 */ 1,
    /*  7 */ 0,
    /*  8 */ 1,
    /*  9 */ 0,
    /* 10 */ 1,

…and the transistors:

struct {
	int gate;
	int c1;
	int c2;
} transistors[] = {
    {357, 558, 217},
    {1608, 657, 349},
    {412, 558, 1146},
    {558, 558, 943},
    {826, 230, 657},

A quick estimate would be 1 bit per node (pullup or neutral), and 3 times 11 bits (3 node numbers from 1 to 1725) per transistor, resulting in about 117 kilobits; but there is still a lot of redundancy in this representation.

The transistor tuples can be represented in a much more compact way. First, there is no ordering for the transistors, so if we sort them by gate, we can encode just the difference between the current gate and the previous gate:

struct {
	int gate;
	int c1;
	int c2;
} transistors[] = {
    {1, 890, 558},
    {4, 558, 11},
    {5, 558, 146},
    {6, 558, 282},
    {6, 874, 657},
    {7, 657, 1591},
    {7, 657, 1591},
    {7, 657, 1591},
    {8, 150, 558},

In practice, the difference between two gate nodes in the transistor list is a number between 0 and 4, so we could use 3 bits (instead of 11) for the gate. But in fact, it is enough to store a single bit: The nodes have no order either, so we can renumber them so that all nodes that are connected to a gate of a transistor are numbered from 0, and all nodes that are not connected to gates will be numbered above the ones with gates:

struct {
	int gate;
	int c1;
	int c2;
} transistors[] = {
    {0, 890, 558},
    {1, 558, 11},
    {2, 558, 146},
    {3, 558, 282},
    {3, 874, 657},
    {4, 657, 1591},
    {4, 657, 1591},
    {4, 657, 1591},
    {5, 150, 558},

This way, there are no holes in the list of gates – a new transistor has either the next node number as its gate, or the same one again (i.e. a node is gate of several transistors). It is enough to store a single bit for the gate.

The example above already shows that the nodes 657 and 558 show up a lot: These are VCC and GND, respectively. In fact, 234 transistors connect to VCC, and 2493 connect to GND. We could Huffman-encode the c1 and c2 nodes, but in practice, all other nodes except these two are relatively uniformly distributed, so let’s just special case them.

There is also no ordering between the two nodes the transistor connects, and no transistor will ever connect VCC/GND to VCC/GND, so we only need to do the special VCC/GND encoding for c1, and leave c2 as it is. Let’s use a single “1″ bit to represent GND, “01″ as VCC, and all other 11 bit node numbers will be prefixed with “00″, making them occupy 13 bits.

This way, the 3510 values of c1 can be described in 2493 * 1 (the GNDs) + 234 * 2 (the VCCs) + 783 * (2 + 11) (the others) bits = 13140 bits. The c2 values occupy 3510 * 11 bits = 38610 bits. If you add the bit per transistor for the gate (i.e. 3510 bits), we’re at 55260 bits for the transistor description.

Using arithmetic coding, we can describe a node number in about 10.75 bits instead of 11. This way, we save (783 + 3510) * 0.25 bits = 1073 bits. The transistor description thus fits into 54187 bits.

As stated previously, we also need one bit per node for the pullup information. About half the nodes are pullup, and we don’t have the freedom of sorting and renaming the nodes any more (we’ve done that already for the transistor gates), so we’ll have to go with the 1725 raw bits.

So the final number is 54187 bits + 1725 bits = 55912 bits. That’s 6989 bytes, or roughly 56 kilobits, and about twice as big as H1N1.

With this number, we can estimate the complexity of a Commodore 64 to be about 200 kilobits (CPU, VIC, SID, 2xCIA, PLA) – plus another 160 kilobits of ROM.

But wait: Did you see the duplicate transistors in the sorted list? Yes, there are some duplicates, in fact, there are only 3288 unique transistors. And when looking at it on a higher level, the 130 entry instruction decode ROM has duplicates too: Only 115 are unique, leading to redundant paths in the node/transistor graph. The reason for this redundancy is routing: There are only two dimensions in a computer chip (plus layers), so you cannot just connect anything to anything.

The homework for today, dear reader, is to develop an algorithm that minimizes the description of the 6502 by removing duplicate transistors as well as equivalent paths.

High-Res Pictures of a MOS KIM-1

The MOS KIM-1 is a quite rare collector’€™s item today. So if you hold one in your hands, you better take some high resolution pictures of the board. Here they are:

Note that this is the original revision of the board (pre-Rev A), and the 6502 CPU is from week 51 of the year 1975 – so it has the ROR bug!

Does anyone know what the three digit numbers 002 and 003 on the 6530 RIOTs mean? Are these the indexes of the ROM images? If so, what is ROM #001 and was there a #000? Also, the back has the number “0372″ on it – is this a serial number? Looking at the dates of the chips, this seems to be the oldest KIM-1 of all those I could find on the internet.