Game Development Archeology: Zelda on Game Boy comes with source

Imagine you’re writing a Game Boy game, and the resulting ROM with all the code and data is just a little over one megabyte in size. No big deal, just pad the game to two megabytes, and use a 2 MB ROM in the cartridge. Just tell the linker to allocate 2 MB of RAM, put the actual data at the beginning, and then write a 2 MB “.gb” image to disk, which will then be sent to the ROM chip factory.

Now imagine you’re doing this in MS-DOS. Your linker, probably written in C, calls malloc() from the C compiler’s runtime library. You already know where this is going?

While modern multi-user operating systems zero memory pages before handing them to a process, so that you cannot get to other processes’ data (malloc() itself may still return leftovers from the same process), this was uncommon in the single-user MS-DOS days. If you allocate 2 MB of RAM (the linker must have used a DOS extender or XMS), you’d get memory with random data in it: leftovers from whatever was in this memory before. (seppel tells me that this can also be caused by seek()ing past EOF in MS-DOS, in which case the previous data on the hard disk will be in the image.)
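To make the failure mode concrete, here is a minimal sketch of how a tool could pad an image without leaking stale memory. It is not the linker that was actually used, and the function name build_padded_image and its parameters are made up for illustration. The only point is that every byte of the buffer gets written before the image goes to disk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated image of image_len bytes
 * holding the payload followed by fill bytes, or NULL on failure.
 * The caller frees the result. */
unsigned char *build_padded_image(const unsigned char *payload,
                                  size_t payload_len,
                                  size_t image_len,
                                  unsigned char fill)
{
    unsigned char *image;

    assert(payload != NULL || payload_len == 0);
    if (payload_len > image_len)
        return NULL;

    /* malloc() makes no promise about the buffer's initial contents. */
    image = malloc(image_len);
    if (image == NULL)
        return NULL;

    /* Fill the whole buffer first so no byte is left indeterminate... */
    memset(image, fill, image_len);
    /* ...then place the real code and data at the beginning. */
    if (payload_len > 0)
        memcpy(image, payload, payload_len);
    return image;
}
```

Using calloc() instead of malloc() plus memset() would work just as well when the fill value is zero; an explicit fill byte is used here only so the padding value is visible in the code.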

This is what happened with the 1998 Game Boy/Game Boy Color game “The Legend of Zelda – Link’s Awakening DX” (MD5: ee0424cf1523f67c5007566aed70696d). If you look at the image starting at 0x106000, you will find all kinds of interesting data, which will tell you a lot about the game’s development. Let’s call this “game development archeology”…
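If you want to go digging yourself, the simplest approach is to look for runs of printable text in the padding area, much like the Unix strings tool does. The following is only a sketch under my own assumptions: the function name next_ascii_run and the choice of “printable” (0x20 to 0x7E plus tab) are mine, and you would load the ROM into a buffer yourself and start scanning at the offset of interest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: finds the first run of at least min_len printable
 * ASCII bytes at or after offset start. Returns 1 and stores the run's
 * position and length, or returns 0 if there is no such run. */
int next_ascii_run(const unsigned char *buf, size_t len, size_t start,
                   size_t min_len, size_t *run_start, size_t *run_len)
{
    size_t i = start;

    assert(buf != NULL && run_start != NULL && run_len != NULL);
    while (i < len) {
        size_t begin = i;

        /* 0x20..0x7E is the printable ASCII range; tab is accepted too. */
        while (i < len &&
               ((buf[i] >= 0x20 && buf[i] <= 0x7E) || buf[i] == '\t'))
            i++;

        if (i > begin && i - begin >= min_len) {
            *run_start = begin;
            *run_len = i - begin;
            return 1;
        }
        if (i == begin)
            i++; /* step over the non-printable byte */
    }
    return 0;
}
```

Calling it in a loop, each time restarting at run_start + run_len, walks through all text fragments in the image. A minimum length of four or more keeps most of the random noise out.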

The ROM image includes big chunks of Borland’s Turbo C IDE (Turbo Vision interface) for DOS, as well as traces of the “QBasic” MS-DOS Editor. It is unclear which editor they used for what, but they might have used Turbo C to write DOS code to support building, as there is a complete copy of this C program in the ROM:

#include <stdio.h>
/* second #include: the header name was lost when this page was rendered */

int main(void) {
	FILE *fp,*f1;
	int a=0xcd;
	int b=0xc6;
	int c=0x29;
	int ch;
	unsigned long i=0;

	if((fp=fopen("zeldag.gb","rb"))==NULL) {
		printf("can't open the file");
		return 0;
	}

	if((f1=fopen("ttmp.asm","wt"))==NULL) {
		printf("can't new file ttmp.asm");
		return 0;
	}

	while((ch=fgetc(fp))!=EOF) {
		if(a==ch) {
			i++;
			ch=fgetc(fp);
			if(b==ch) {
				i++;
				ch=fgetc(fp);
				if(c==ch)
					fprintf(f1,"%lXH, " , i);
			}
		}
		i++;
	}

	fclose(fp);
	fclose(f1);
}

This writes the file offsets at which the bytes 0xcd, 0xc6, 0x29 were found in the ROM image into ttmp.asm. These bytes, interpreted as Z80 machine code, mean “CALL 0x29C6”. In the final ROM image, this sequence appears once, at 0x442B. If you have any idea why they look for this, please post it in the comments.
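Two details of the program above are worth noting. The counter i has already been incremented twice when a match is printed, so the offset written is that of the third byte (0x29), not of the first. And a byte that was read as the second or third candidate is never reconsidered as the start of a new match, so in the sequence CD CD C6 29 the match is missed. Here is a sketch of a straightforward in-memory search that reports start offsets and handles that case. The name find_pattern and its interface are my own, not something found in the ROM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores the start offsets of every occurrence of pat
 * in buf into out (at most max_out of them) and returns the total number
 * of occurrences found. */
size_t find_pattern(const unsigned char *buf, size_t len,
                    const unsigned char *pat, size_t pat_len,
                    size_t *out, size_t max_out)
{
    size_t count = 0;
    size_t i;

    assert(buf != NULL && pat != NULL);
    assert(out != NULL || max_out == 0);
    if (pat_len == 0 || pat_len > len)
        return 0;

    /* Try every possible start position, so overlapping prefixes such as
     * CD CD C6 29 are still found. */
    for (i = 0; i + pat_len <= len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0) {
            if (count < max_out)
                out[count] = i;
            count++;
        }
    }
    return count;
}
```

For a three-byte pattern in a 2 MB image this brute-force loop is more than fast enough; nothing cleverer is needed.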

This is the list of files in their project directory at D:\GAMEBOY:
BANK37.ASM
CLEARKU.ASM
DAMA1.ASM
DAMA2.ASM
END.CPP
ENDEND.ASM
ENDEND.LST
ENDEND1.ASM
FEND1.ASM
FEND2.ASM
FEND3.ASM
FIND.ASM
FIND.CPP
IFCHAENL.ASM
INTWCHA.ASM
TEST.ASM
TTMP.ASM
TTMP.TXT
ZIPUTP.CPP

These filenames also appear in the ROM:
ADDPLAG.ASM
ADDPLAGF.ASM
CH64TBL.ASM
FEND.ASM
G.ASM
H.ASM
INSERTKU.ASM
INTWIN.ASM
KKKKKK.ASM
L.ASM
NOPLAY1.ASM
TAB.ASM
Y.ASM
ZXHPDM.ASM

And here comes the interesting part: There is actually some assembly source in the ROM; here is a small snippet:

JoyPort_1:
                 AND $02 ;LEFT
                 JR  NZ, JoyPort_2
                 CALL LEFTScroll
                 RET
JoyPort_2:
                 AND $04 ;UP
                 JR  NZ, JoyPort_3
                 CALL UPScroll
                 RET
JoyPort_3:
                 AND $08 ;DOWN
                 JR  NZ, JoyPort_4
                 CALL DOWNScroll

Well-documented, it seems. But there is also some assembly code that looks like this:

L_B000_28F7:
                LD A,$7F
                LD BC,$0800
                LD D,A
                LD HL,$9800
L_B000_2900:
                LD A,D
                LD (HLI),A
                DEC BC
                LD  A,B
                OR  A,C
                JR  NZ,L_B000_2900
                RET
L_B000_2914:
                LD  A,(HLI)
                LD  (DE),A
                INC DE
                DEC BC
                LD  A,B
                OR  A,C
                JR  NZ,L_B000_2914
                RET

The label names suggest that this code has been disassembled from existing Z80 machine code. Link’s Awakening DX is a color remake of an older Game Boy game, so it might very well be that they lost the original source, disassembled the old code and used it again for the remake. This could easily be checked by disassembling the original version and looking for this code.

If you want to do game development archeology yourself, you might want to look at titles like “X-Men – Wolverine’s Rage” (MD5: b1729716baaea01d4baa795db31800b0), which contains Windows 9x registry keys and INF files, “Mortal Kombat 4” (MD5: 7311f937a542baadf113e9115158cde3), in which you can find some small source fragments, “Gift” (MD5: e6a51088c8fea7980649064bd3a9f9ff), which will tell you that the developers had some Game Boy emulators installed on their system, or the “BIT-MANAGERS” games “Spirou” (MD5: 5aa012cf540a5267d6adea6659764441, Turbo C, MAP file, source) and “TinTin in Tibet” (Game Boy Color version, MD5: 8150a3978211939d367f48ffcd49f979), which, amongst other things, contains references to Nintendo’s Game Boy Advance (!) SDK (“C:\Cygnus\thumbelf-000512\H-i686-cygwin32\lib\gcc-lib\thumb-elf\2.9-arm-000512”, “/tantor/build/nintendo/arm-000512/i686-cygwin32/src/newlib/libc/stdio/stdio.c”).

If you find any more things like these, please post them (or links to your stories) as comments! Happy hacking!

61 Responses to “Game Development Archeology: Zelda on Game Boy comes with source”

  1. hamtitampti says:

    are games not signed or scrambled with a secret chipper ? wouldn’t it be possible that the chipper algo for signing the game including the keys are in the ram too ?

  2. Hi, just wanted you to know that your RSS feed seems to be broken for this post — it would seem that it is the “

  3. …yes, the left arrow in the “LEFT” comment. Apparently the comment was cut off there too :-)

  4. Michael Steil says:

    @Steinar: Changed it – seems like that wasn’t it. :-(

  5. Michael Steil says:

    @hamtitampti: The 1989 Game Boy didn’t include any security mechanisms other than the requirement for the ROM to include a (copyrighted) Nintendo logo, which was checked by a 256 byte bootloader internal to the CPU. ROM encryption would have had to be handled by hardware, since Game Boys had too little RAM (8 KB) to decode the game code and data into it – instead, they typically ran the code directly from ROM.

  6. Hahaha, nice work!, i should do some game archeology too, looks very funny :-)

  7. caustik says:

    Would be interesting to write a script to grep through a database of ROM files looking for such hidden treasures. Actually, in general, a script that finds source code in an arbitrary set of files would be useful to have around.

  8. Suma says:

    Nice find!!

  9. Chris Baird says:

    At the fresh young age of 14, I discovered the VIC20 cartridge game ‘Mole Attack’ had chunks of its full assembler listing preserved in the ROM! …. there began a journey to understand this new computer language…

  10. Charles says:

    In other important news, I found a torn scrap of paper with writing on it today. It says:

    bread
    milk
    pick up dry clea

  11. Kayamon says:

    You should check out the arcade ROM for Golden Axe 2, if you can. They did a similar thing there, where they accidentally included a fairly large chunk of the assembler source code.

  12. Neat! I wonder how prevalent this is, over the course of video game history.

  13. Somebody says:

    The GoodTools name for the file with the MD5 sum ee0424cf1523f67c5007566aed70696d is “Zelda no Densetsu – Yume no Miru Shima DX (J) (V1.0) [C][p1][T+Chi][!].gbc” – if you are hunting for code left there by Nintendo, you should use clean dumps of the game. None of the source code is present in the clean dumps of the game.

    b1729716baaea01d4baa795db31800b0 X-Men – Wolverine’s Rage (U) [C][!].gbc – clean dump
    7311f937a542baadf113e9115158cde3 Mortal Kombat 4 (E) [C][!].gbc – clean dump
    e6a51088c8fea7980649064bd3a9f9ff Gift (E) [C][t2].gbc – *t*rained, but the clean dump also has those in it…
    5aa012cf540a5267d6adea6659764441 Spirou (U) (M4) [S].gb – no indication if this is a clean dump
    8150a3978211939d367f48ffcd49f979 Tintin in Tibet (E) (M7) [C][!].gbc – clean dump

    But nice finds nonetheless ;)

  14. Dwedit says:

    Air Fortress (Famicom) has big chunks of the source code sitting around in the rom dump, and this is for a clean version!

    http://bmf.rustedmagick.com/cr/airfortress.htm

  15. ProN00b says:

    doesn’t have to be disassembled, some c compilers first create asm and then binary or they wrote some stuff in c for beeing quick, turned it into asm and put it into their existing asm code.

  16. miguel says:

    @caustik

    There is such a tool, it’s called strings.

  17. willie lo says:

    what’s the big deal with playboy? they were supposed to make a game for the gb. what happened to it?

  18. JLennox says:

    I found source code padded to the end of a generic SNES cart copier BIOS I dumped. It’s kind of funny, they used “buttery” instead of “battery” and “fuck” instead of “func” (for function): http://red-stars.net/others/copier_trim.txt

  19. securitynews says:

    Is this also a way to proof how previous games could be ported to the GB?

  20. a monkey says:

    Guys, this stuff isn’t even in the clean ROM. It’s junk left in there by whoever dumped/hacked it. (What is in there, though, is a lot of the game’s dialogue repeated several times in both English and French. O_o)

    There have been games that accidentally left bits of their source code in the ROM, and with more modern systems it’s not uncommon to see lists of filenames intended to be shown on error report screens, but this game isn’t one of them.

  21. Anonymous says:

    Well, considering what Somebody said up there, these things are much more likely to be the contents of memory from the people who dumped the ROMs, not from those who created the data in the first place. A lot of the stuff you found would make much more sense as the contents of a reverse engineerer’s memory.

  22. Korrey says:

    Source code is nothin’. It would be hilarious to find some long-lost ASCII artporn. :D

  23. Ashims says:

    @ willie lo

    Too busy on other pursuits I expect. And of course, I’ve never found hef to be much of a coder. Takes him ages to get anything done these days.

  24. Correction says:

    Modern operating systems do not clear malloc()ed data before returning it. However multiprocessing operating systems should (and almost all do now) only reuse memory from the same process (malloc is a library call — normally there is another way to actually get more memory for the process).

    If you want that, you can use calloc() (clear-alloc), or do a memset() on the returned memory.

  25. Basketball Jones says:

    If you look in the ROM for Robin Hood: Prince of Thieves for NES, there is some assembly source code padded in there also.

  26. Jimi S. says:

    It’s all good!

  27. marc says:

    Hey, nice article, enjoyable read. Thanks!

  28. Sean Riddle says:

    Perhaps the first case of this is in the Channel F cart #26 Alien Invasion. There is some assembly source along the lines of
    ***** DC H’0140′953′MUSICEUTMISL,PUTDOT,PUTLIST
    And Channel F might have the first video game Easter egg:
    http://members.cox.net/seanriddle/chanf.html

  29. It would be nice to gather all this info on a webpage. Maybe a wiki where people could post their findings and explore them collaboratively. This could lead to lots of interesting studies.

  30. Kyle says:

    Didn’t your mother ever tell you to zero your buffers?

  31. corey says:

    Very cool. This brings back memories from my childhood.

  32. mmxbass says:

    I love the first comment in this article “secret chipper” bahahaha.

  33. Ben says:

    “While modern operating systems will always clear all malloc()ed memory, so that you cannot get to other processes’ data [..]”

    Actually that isn’t true. Most UNIXes don’t clear allocated memory. Thus the reason why if you dig through OpenSSH or any other application with private data they always clear it before releasing it back into the pool.

    Some UNIXes provide /etc/malloc.conf which let you set the clear on return behavior, but that does have an impact on performance.

    - Ben

  34. bd_ says:

    Linux, at least, will zero pages before mapping them into a process’s address space, but malloc() will not clear them again, so it’s possible to get a chunk of data from the same process.

    Note that the clearing happens when the page is mapped back in, not when it’s unmapped, so groveling through /proc/kcore could still get such data, which may be why openssh etc will clear the pages in question.

  35. Stoo says:

    The PS2 game Ico has the .s file created by the compiler left on the production disc. Surprising it got through Sony’s TRC actually. Includes all the VU and EE code with fairly clear labels too.

  36. [...] pagetable.com » Blog Archive » Game Development Archeology: Zelda on Game Boy comes with source This is what happened with the 1998 Game Boy/Game Boy Color game “The Legend of Zelda – Link’s Awakening DX” (MD5: ee0424cf1523f67c5007566aed70696d). If you look at the image starting at 0x106000, you will find all kinds of interesting data, which will tell you a lot about the game’s development. Let’s call this “game development archeology”… [...]

  37. Sean Dunlevy says:

    Going back to the C64 days, loading & resetting ‘Thing on a Spring’ allowed you to perform SYS 49152 to access a complete disassembler. Similarly, a reset of Rambo followed by hitting the restore key entered the music editor.

  38. RonK says:

    Too bad the guys developing “TinTin in Tibet” were almost certainly only using cygwin as a platform for cross-compilation; if we could catch them on a GPL violation they’d have to release the full source code.

    Still interesting that they were using an open-source platform for their development.

  39. sk says:

    FOR THE LOVE OF GOD MAN, you’ve been slashdotted…. please get a captua!

  40. KennyMillar says:

    The original ZA Spectrum game ‘Manic Miner’ had loads of source code in the game file (on cassette!) including the cheat code!

  41. KennyMillar says:

    Obviously I meant ZX spectrum..

  42. Antoine Chavasse says:

    I remember something similar in an old amiga game that was called Ballistix. Some portions of the source code could be found in certain unused sectors of the disk.

  43. memals says:

    “nights into dreams” on the dreamcast had a special GDROM that included wallpaer images, also some other dreamcast games did the same.

  44. Marius says:

    I remember I found a FULL Basic compiler at the end of the game Moon Alert for ZX Spectrum :)

  45. Slappy says:

    My friend and I also found a compiler at the end of “Way Of The Exploding Fist” whilst persuading the game to run from a Microdrive cartridge, got that one printed in Microdrive Exchange.

  46. Slappy says:

    Those were the days!

  47. [...] * Game Development Archeology: Zelda on Game Boy comes with source: WARNUNG! Extremstes Geektum voraus! This is what happened with the […] game “The Legend of Zelda – Link’s Awakening DX” […]. If you look at the image starting at 0x106000, you will find all kinds of interesting data, which will tell you a lot about the game’s development. Let’s call this “game development archeology”… [...]

  48. MikeFM says:

    It’d be cool to collect a lot of these bits of code and document them. I’d love to buy a book that had major chunks of code from old games with information on their design and stuff.

  49. [...] Story about a Zelda cartridge produced with the source code in the ROM, probably because they malloc’s a bunch of RAM for the ROM image without clearing it. main page [...]
