Announcement: ‘libcpu’ Binary Translator

December 29th, 2009

I just did a Lightning Talk at the 26th Chaos Communication Congress 26C3 about our new project “libcpu”, and it has already been picked up by Golem.de and reddit.com, so I might as well announce it here:

“libcpu” is an open source library that emulates several CPU architectures, allowing itself to be used as the CPU core for different kinds of emulator projects. It uses its own frontends for the different CPU types, and uses LLVM for the backend. libcpu is supposed to be able to do user mode and system emulation, and dynamic as well as static recompilation.

Here are my slides: 26C3-libcpu.pdf

Here is the video recording: 26C3-libcpu.mp4

Read more at http://www.libcpu.org/

pixelstats trackingpixel

Having Fun with Branch Delay Slots

November 22nd, 2009

Branch Delay Slots are one of the awkward features of RISC architectures. RISC CPUs are pipelined by definition, so while the current instruction is in execution, the following instruction(s) will be in the pipeline already. If there is for example a conditional branch in the instruction stream, the CPU cannot know whether the next instruction is the one following the branch or the instruction at the target location until it has evaluated the branch. This would cause a bubble in the pipeline; therefore some RISC architectures have a branch delay slot: The instruction after the branch will always be executed, no matter whether the branch is taken or not.

So in practice, you can put the instruction that would be before the branch right after the branch, if this instruction is independent of the branch instruction, i.e. doesn’t access the same registers. Otherwise, you can fill it with a NOP. Out-of-order architectures can do this reordering at runtime, so there would be no need for a delay slot. Nevertheless, the delay slot is a feature of the architecture, not the implementation.

Some RISCs like PowerPC and ARM do not have a delay slot, but for example MIPS, SPARC, PA-RISC have it. But there are some variations: MIPS and PA-RISC have an annihilation/nullify/likely bit in the instruction, so the programmer can choose that the instruction in the delay slot only gets executed if the branch is taken.

Other CPUs, like SPARC, PA-RISC or the ill-fated Motorola M88K have optional delay slots: The programmer can set a bit in the opcode if he cannot come up with a good instruction for the delay slot, and save the wasted NOP in the program code this way – the CPU will put a bubble into the pipeline.

Now the interesting question is what happens if the branch and the delay instruction are not independent. What if the delay instruction writes r5 and the branch jumps to r5? What if it’s a branch-and-link, and the delay instruction modifies the link register? On MIPS, this is illegal, and undefined.

In practice, MIPS won’t halt and catch fire though. As you would expect from the design of a CPU pipeline, the CPU basically executes the branch and the delay instruction in order, as they are stored in the instruction stream, and it only delays the write to PC, i.e. the actual jump until after the delay instruction. So, for example, if you modify a register that the branch depends on, it will not influence the branch, but be in effect after the jump.

On the aforementioned Motorola M88K, this behaviour is documented, and GCC even makes use of it:

820:   7d ad 00 08   cmp   r13,r13,0x08    ; compare

824:   d4 6d 00 05   bb0.n 0x03,r13,0x834  ; cond. branch
828:   63 df 00 00   addu  r30,r31,0       ; delay slot

82c:   cc 00 00 7f   bsr.n 0xa28           ; function call
830:   60 21 01 ac   addu  r1,r1,0x1ac     ; delay slot: fix up link

834:   00 00 00 00   nop

The first three instructions are a compare/branch sequence. If the branch is taken, execution will continue at 0×834, otherwise at 0×82c, after the delay slot. The delay instruction is independent of the branch, nothing special here yet.

But now look at the following two instructions: 0×82c and 0×830 are not independent. r1 is the M88K’s link register, so the “bsr” writes the addres of the following instruction after the delay slot (0×834) into r1. The delay slot also writes into r1: It adds 0×1ac to it. These instructions are executed in order, but the actual branch to 0xa28 will only be done after the delay instruction.

So what these two instructions effectively do is call a function, but set up the return address to skip the next 0×1ac bytes (107 instructions) after return. If the conditional branch at 0×824 is taken, the code at 0×834 will be executed, otherwise, 0xa28 is called, and the “taken” case of the conditional branch is skipped. This trick can be used whenever you have C code with an “if” statement of which one case is a single function call:

if (...) {
    call_something();
} else {
    [...]
}
pixelstats trackingpixel

PCEPTPDPTE

October 31st, 2009

Here is a new pagetable entry.

I like Intel. I told you before how Intel messed up the x86 register nomenclature by extending A to AX (A extended) and then to EAX (extended A extended). Then AMD came and extended the register once more, giving it a more sane name: RAX.

I also told you before how Intel messed up the x86 pagetable nomenclature: There were pagetables (PT, level 1) and page directories (PD, level 2) on the i386, and for the Pentium Pro, they added page directory pointers (PDP, level 3). Then AMD came and extended it once more, giving it a more sane name: page map level 4 (PML4).

With the advent of virtualization, both Intel and AMD added a feature to get rid of the slow software shadow pagetables, and added hardware support for nested pagetables, i.e. the guest has 4 levels of pagetables, and the host has another 4 levels.

AMD called these – surprise, surprise! – nested pagetables, NPT. Intel was more creative. With a history of extending architectures, they went with the big E: extended pagetables, EPT.

Let’s practice a bit: A PD is a page directory, a PDE is a page directory entry. You can also call it a PDPTE, a page directory pagetable entry (level 2 PTE), because after all, all these entries on all levels are PTEs, because they share the same format. A PDPPTE is a page directory pointer pagetable entry, aka level 3 entry.

If we use nested paging – excuse me – extended paging on Intel, we need to prepend EPT to our nice little abbreviations. An EPTPTE is a level 1 entry, an EPTPDPTE is level 2 not to be confused with an EPTPDPPTE, which is level 3, and a level 4 entry is EPTPML4PTE.

It get even better. Oracle/Sun/Innotek VirtualBox uses Hungarian Notation for its variable names, so it prepends “P” for pointer and “C” for constant. So what would you call a variable, which is a pointer to a constant level 2 EPT entry?

Of course, PCEPTPDPTE.

/** Pointer to a const EPT Page Directory Pointer Entry. */
typedef const EPTPDPTE *PCEPTPDPTE;

I thought about this for a while, and considered patenting this brilliant idea of mine, but here it is, free of patents and free for everyone to use: Michael’s nomenclature for Intel/AMD pagetables:

new name description old name
P4 pagetable level 4 page PML4
P3 pagetable level 3 page PDP
P2 pagetable level 2 page PD
P1 pagetable level 1 page PT
P4E pagetable level 4 entry PML4E/PML4PTE
P3E pagetable level 3 entry PDPE/PDPPTE
P2E pagetable level 2 entry PDE/PDPTE
P1E pagetable level 1 entry PTE
NP4 nested pagetable level 4 page EPTPML4
NP3 nested pagetable level 3 page EPTPDP
NP2 nested pagetable level 2 page EPTPD
NP1 nested pagetable level 1 page EPTPT
NP4E nested pagetable level 4 entry EPTPML4E/EPTPML4PTE
NP3E nested pagetable level 3 entry EPTPDPE/EPTPDPPTE
NP2E nested pagetable level 2 entry EPTPDE/EPTPDPTE
NP1E nested pagetable level 1 entry EPTPTE

You are welcome.

pixelstats trackingpixel

A Standalone printf() for Early Bootup

September 7th, 2009

A while ago, I complained about operating systems with overly complicated startup code that spends too much time in assembly and does hot have printf() or framebuffer access until very late.

This second post is about printf(): Many systems use POST codes (on i386/x86_64, i.e. writes to port 0×80) or debug LED for debugging, or have complicated and cumbersome implementations for puts() and print_hex() – and printf() is only available very late, because it has some special requirement, like console channels being set up. But printf() is not rocket science: Al it needs is C and a stack. Whatever system you are bringing up on whatever platform: Having printf() as early as possible will prove very useful.

Today, I am presenting a full-featured standalone version of printf() that can be added to arbitrary 32 or 64 bit C code. The code has been taken from FreeBSD (sys/kern/subr_prf.c) and is therefore BSD-licensed. Unnecessary functions have been removed and all typedefs required have been added.

/*-
 * Copyright (c) 1986, 1988, 1991, 1993
 *	The Regents of the University of California.  All rights reserved.
 * (c) UNIX System Laboratories, Inc.
 * All or some portions of this file are derived from material licensed
 * to the University of California by American Telephone and Telegraph
 * Co. or Unix System Laboratories, Inc. and are reproduced herein with
 * the permission of UNIX System Laboratories, Inc.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	@(#)subr_prf.c	8.3 (Berkeley) 1/21/94
 */

typedef unsigned long size_t;
typedef long ssize_t;
#ifdef __64BIT__
typedef unsigned long long uintmax_t;
typedef long long intmax_t;
#else
typedef unsigned int uintmax_t;
typedef int intmax_t;
#endif
typedef unsigned char u_char;
typedef unsigned int u_int;
typedef unsigned long u_long;
typedef unsigned short u_short;
typedef unsigned long long u_quad_t;
typedef long long quad_t;
typedef unsigned long uintptr_t;
typedef long ptrdiff_t;
#define NULL ((void*)0)
#define NBBY    8               /* number of bits in a byte */
char const hex2ascii_data[] = "0123456789abcdefghijklmnopqrstuvwxyz";
#define hex2ascii(hex)  (hex2ascii_data[hex])
#define va_list __builtin_va_list
#define va_start __builtin_va_start
#define va_arg __builtin_va_arg
#define va_end __builtin_va_end
#define toupper(c)      ((c) - 0x20 * (((c) >= 'a') && ((c) <= 'z')))
static size_t
strlen(const char *s)
{
	size_t l = 0;
	while (*s++)
		l++;
	return l;
}

/* Max number conversion buffer length: a u_quad_t in base 2, plus NUL byte. */
#define MAXNBUF	(sizeof(intmax_t) * NBBY + 1)

/*
 * Put a NUL-terminated ASCII number (base <= 36) in a buffer in reverse
 * order; return an optional length and a pointer to the last character
 * written in the buffer (i.e., the first character of the string).
 * The buffer pointed to by `nbuf' must have length >= MAXNBUF.
 */
static char *
ksprintn(char *nbuf, uintmax_t num, int base, int *lenp, int upper)
{
	char *p, c;

	p = nbuf;
	*p = '\0';
	do {
		c = hex2ascii(num % base);
		*++p = upper ? toupper(c) : c;
	} while (num /= base);
	if (lenp)
		*lenp = p - nbuf;
	return (p);
}

/*
 * Scaled down version of printf(3).
 *
 * Two additional formats:
 *
 * The format %b is supported to decode error registers.
 * Its usage is:
 *
 *	printf("reg=%b\n", regval, "*");
 *
 * where  is the output base expressed as a control character, e.g.
 * \10 gives octal; \20 gives hex.  Each arg is a sequence of characters,
 * the first of which gives the bit number to be inspected (origin 1), and
 * the next characters (up to a control character, i.e. a character <= 32),
 * give the name of the register.  Thus:
 *
 *	kvprintf("reg=%b\n", 3, "\10\2BITTWO\1BITONE\n");
 *
 * would produce output:
 *
 *	reg=3
 *
 * XXX:  %D  -- Hexdump, takes pointer and separator string:
 *		("%6D", ptr, ":")   -> XX:XX:XX:XX:XX:XX
 *		("%*D", len, ptr, " " -> XX XX XX XX ...
 */
int
kvprintf(char const *fmt, void (*func)(int, void*), void *arg, int radix, va_list ap)
{
#define PCHAR(c) {int cc=(c); if (func) (*func)(cc,arg); else *d++ = cc; retval++; }
	char nbuf[MAXNBUF];
	char *d;
	const char *p, *percent, *q;
	u_char *up;
	int ch, n;
	uintmax_t num;
	int base, lflag, qflag, tmp, width, ladjust, sharpflag, neg, sign, dot;
	int cflag, hflag, jflag, tflag, zflag;
	int dwidth, upper;
	char padc;
	int stop = 0, retval = 0;

	num = 0;
	if (!func)
		d = (char *) arg;
	else
		d = NULL;

	if (fmt == NULL)
		fmt = "(fmt null)\n";

	if (radix < 2 || radix > 36)
		radix = 10;

	for (;;) {
		padc = ' ';
		width = 0;
		while ((ch = (u_char)*fmt++) != '%' || stop) {
			if (ch == '\0')
				return (retval);
			PCHAR(ch);
		}
		percent = fmt - 1;
		qflag = 0; lflag = 0; ladjust = 0; sharpflag = 0; neg = 0;
		sign = 0; dot = 0; dwidth = 0; upper = 0;
		cflag = 0; hflag = 0; jflag = 0; tflag = 0; zflag = 0;
reswitch:	switch (ch = (u_char)*fmt++) {
		case '.':
			dot = 1;
			goto reswitch;
		case '#':
			sharpflag = 1;
			goto reswitch;
		case '+':
			sign = 1;
			goto reswitch;
		case '-':
			ladjust = 1;
			goto reswitch;
		case '%':
			PCHAR(ch);
			break;
		case '*':
			if (!dot) {
				width = va_arg(ap, int);
				if (width < 0) {
					ladjust = !ladjust;
					width = -width;
				}
			} else {
				dwidth = va_arg(ap, int);
			}
			goto reswitch;
		case '0':
			if (!dot) {
				padc = '0';
				goto reswitch;
			}
		case '1': case '2': case '3': case '4':
		case '5': case '6': case '7': case '8': case '9':
				for (n = 0;; ++fmt) {
					n = n * 10 + ch - '0';
					ch = *fmt;
					if (ch < '0' || ch > '9')
						break;
				}
			if (dot)
				dwidth = n;
			else
				width = n;
			goto reswitch;
		case 'b':
			num = (u_int)va_arg(ap, int);
			p = va_arg(ap, char *);
			for (q = ksprintn(nbuf, num, *p++, NULL, 0); *q;)
				PCHAR(*q--);

			if (num == 0)
				break;

			for (tmp = 0; *p;) {
				n = *p++;
				if (num & (1 << (n - 1))) {
					PCHAR(tmp ? ',' : '<');
					for (; (n = *p) > ' '; ++p)
						PCHAR(n);
					tmp = 1;
				} else
					for (; *p > ' '; ++p)
						continue;
			}
			if (tmp)
				PCHAR('>');
			break;
		case 'c':
			PCHAR(va_arg(ap, int));
			break;
		case 'D':
			up = va_arg(ap, u_char *);
			p = va_arg(ap, char *);
			if (!width)
				width = 16;
			while(width--) {
				PCHAR(hex2ascii(*up >> 4));
				PCHAR(hex2ascii(*up & 0x0f));
				up++;
				if (width)
					for (q=p;*q;q++)
						PCHAR(*q);
			}
			break;
		case 'd':
		case 'i':
			base = 10;
			sign = 1;
			goto handle_sign;
		case 'h':
			if (hflag) {
				hflag = 0;
				cflag = 1;
			} else
				hflag = 1;
			goto reswitch;
		case 'j':
			jflag = 1;
			goto reswitch;
		case 'l':
			if (lflag) {
				lflag = 0;
				qflag = 1;
			} else
				lflag = 1;
			goto reswitch;
		case 'n':
			if (jflag)
				*(va_arg(ap, intmax_t *)) = retval;
			else if (qflag)
				*(va_arg(ap, quad_t *)) = retval;
			else if (lflag)
				*(va_arg(ap, long *)) = retval;
			else if (zflag)
				*(va_arg(ap, size_t *)) = retval;
			else if (hflag)
				*(va_arg(ap, short *)) = retval;
			else if (cflag)
				*(va_arg(ap, char *)) = retval;
			else
				*(va_arg(ap, int *)) = retval;
			break;
		case 'o':
			base = 8;
			goto handle_nosign;
		case 'p':
			base = 16;
			sharpflag = (width == 0);
			sign = 0;
			num = (uintptr_t)va_arg(ap, void *);
			goto number;
		case 'q':
			qflag = 1;
			goto reswitch;
		case 'r':
			base = radix;
			if (sign)
				goto handle_sign;
			goto handle_nosign;
		case 's':
			p = va_arg(ap, char *);
			if (p == NULL)
				p = "(null)";
			if (!dot)
				n = strlen (p);
			else
				for (n = 0; n < dwidth && p[n]; n++)
					continue;

			width -= n;

			if (!ladjust && width > 0)
				while (width--)
					PCHAR(padc);
			while (n--)
				PCHAR(*p++);
			if (ladjust && width > 0)
				while (width--)
					PCHAR(padc);
			break;
		case 't':
			tflag = 1;
			goto reswitch;
		case 'u':
			base = 10;
			goto handle_nosign;
		case 'X':
			upper = 1;
		case 'x':
			base = 16;
			goto handle_nosign;
		case 'y':
			base = 16;
			sign = 1;
			goto handle_sign;
		case 'z':
			zflag = 1;
			goto reswitch;
handle_nosign:
			sign = 0;
			if (jflag)
				num = va_arg(ap, uintmax_t);
			else if (qflag)
				num = va_arg(ap, u_quad_t);
			else if (tflag)
				num = va_arg(ap, ptrdiff_t);
			else if (lflag)
				num = va_arg(ap, u_long);
			else if (zflag)
				num = va_arg(ap, size_t);
			else if (hflag)
				num = (u_short)va_arg(ap, int);
			else if (cflag)
				num = (u_char)va_arg(ap, int);
			else
				num = va_arg(ap, u_int);
			goto number;
handle_sign:
			if (jflag)
				num = va_arg(ap, intmax_t);
			else if (qflag)
				num = va_arg(ap, quad_t);
			else if (tflag)
				num = va_arg(ap, ptrdiff_t);
			else if (lflag)
				num = va_arg(ap, long);
			else if (zflag)
				num = va_arg(ap, ssize_t);
			else if (hflag)
				num = (short)va_arg(ap, int);
			else if (cflag)
				num = (char)va_arg(ap, int);
			else
				num = va_arg(ap, int);
number:
			if (sign && (intmax_t)num < 0) {
				neg = 1;
				num = -(intmax_t)num;
			}
			p = ksprintn(nbuf, num, base, &tmp, upper);
			if (sharpflag && num != 0) {
				if (base == 8)
					tmp++;
				else if (base == 16)
					tmp += 2;
			}
			if (neg)
				tmp++;

			if (!ladjust && padc != '0' && width
			    && (width -= tmp) > 0)
				while (width--)
					PCHAR(padc);
			if (neg)
				PCHAR('-');
			if (sharpflag && num != 0) {
				if (base == 8) {
					PCHAR('0');
				} else if (base == 16) {
					PCHAR('0');
					PCHAR('x');
				}
			}
			if (!ladjust && width && (width -= tmp) > 0)
				while (width--)
					PCHAR(padc);

			while (*p)
				PCHAR(*p--);

			if (ladjust && width && (width -= tmp) > 0)
				while (width--)
					PCHAR(padc);

			break;
		default:
			while (percent < fmt)
				PCHAR(*percent++);
			/*
			 * Since we ignore an formatting argument it is no
			 * longer safe to obey the remaining formatting
			 * arguments as the arguments will no longer match
			 * the format specs.
			 */
			stop = 1;
			break;
		}
	}
#undef PCHAR
}

static void
putchar(int c, void *arg)
{
	/* add your putchar here */
}

void
printf(const char *fmt, ...)
{
	/* http://www.pagetable.com/?p=298 */
	va_list ap;

	va_start(ap, fmt);
	kvprintf(fmt, putchar, NULL, 10, ap);
	va_end(ap);
}

One thing you will have to do is #define or #undef __64__BIT to enable or disable support for printing 64 bit values. If you are on a 64 bit architecture, you will need this to print addresses properly. If you are on 32 bit, this support will require external library functions to do 64 bit arithmetic, so you probably want to turn this off.

On ARM, which doesn't have an assembly statement for division, will always need a library routine. You can use FreeBSD's at sys/libkern/arm/divsi3.S, if you to add the following #defines to make the file compile:

#define __FBSDID(a)
#define RET bx lr
#define ENTRY_NP(a) _##a: .global _##a
#define _KERNEL

All you need to do now to get printf() working is to fill the putchar() function, for example with a call to a serial_putc() implementation, if you have a serial port. On a PC, serial_putc() could look like this:

static void outb(short p, unsigned char a)
{
        asm volatile("outb %b0, %w1" : : "a" (a), "Nd" (p));
}
static unsigned char inb(short p)
{
        unsigned char a;
        asm volatile("inb %w1, %b0" : "=a" (a) : "Nd" (p));
        return a;
}
#define SERIAL_PORT 0x3f8
static void
serial_putc(unsigned char c) {
	while (!(inb(SERIAL_PORT + 5) & 0x20));
	outb(SERIAL_PORT, c);
}

On an "ARM Integrator/CP" board, the default for QEMU emulating ARM, the follwing serial_putc() would apply (make sure you are running the MMU off, or with the serial port MMIO region mapped):

#define SERIAL_PORT 0x16000000
static void
serial_putc(unsigned char c) {
	while ((*(volatile unsigned int *)(SERIAL_PORT + 0x18)) & 0x20);
	*(volatile unsigned int *)(SERIAL_PORT) = (c);
}

Unfortunately, sometimes you are bringing up or debugging a system that does not have a serial port - or at least you haven't found one: Apple Macintosh and various game systems come to mind. In this case, you might be able to set up a framebuffer and print to that. In a later post, I will present a small standalone framebuffer library.

pixelstats trackingpixel

Pictures of Apple Lisa 2 Boards

September 1st, 2009

The Apple Lisa is a quite rare collector’s item today. So if you hold one in your hands, you better take some high resolution pictures of the boards. Here they are:

CPU Board

I/O Board

Memory Board

pixelstats trackingpixel

TI-99/4A BASIC as a Scripting Language

August 24th, 2009

It is a good time for statically recompiled versions of BASIC from old computers.  First there was Apple I BASIC.  Then came Commodore BASIC.  Now, due to overwhelming demand, we’re proud to release TI-99/4A BASIC. For those unfamiliar the TI-99/4A was a home computer by Texas Instruments released in 1981. Unusually for the time it had a 16-bit CPU: the TMS9900.

Download the program from the project page on SourceForge. Binaries for Windows and Mac OS X are available and the source should compile on most POSIX-like systems. If you port it to a new platform please drop us a line; we’d love hear about it.

It supports the same interfaces as the previous projects.  You can use it interactively in direct mode:

$ tibasic
TI BASIC READY
>print 4*atn(1)
 3.141592654

>bye

Or you can write a line-numbered program:

$ tibasic
>10 FOR I=1 TO 10
>20 PRINT TAB(I);"Hello, world!"
>30 NEXT I
>RUN
Hello, world!
 Hello, world!
  Hello, world!
   Hello, world!
    Hello, world!
     Hello, world!
      Hello, world!
       Hello, world!
        Hello, world!
         Hello, world!

** DONE **

>BYE

You can also run programs from a file:

$ cat name.bas
10 INPUT "What is your name? ":N$
20 PRINT "Hello, ";N$;"!"

$ tibasic.exe name.bas
What is your name? James
Hello, James!

There are a few sample programs available on the homepage that you can run in this manner.

How it works

This program works much like the already mentioned Commodore BASIC project. The original program is statically recompiled to produce a new native binary. The recompiler is the work of Michael Steil and the  support for the TI-99/4A is by James Abbatiello.

The output of the recompiler is tibasic.c which is platform independent. The support functions are in runtime_functions.c and it is this file that you would edit to port to a new system or to add new features.

There are a couple quirks of the TI-99/4A which made this a bit trickier to support than the C64 version:

  • The BASIC interpreter was not written in assembly language as you might expect. Instead it was programmed in something called Graphics Programming Language (GPL). GPL was much like an assembly language with some high-level primitives for commonly needed functions. A program written in GPL would be assembled to bytecode and then run on an interpreter that was written in TMS9900 assembly language. This made BASIC programs on the machine rather slow. The CPU would execute the GPL interpreter. The GPL interpreter would run the BASIC interpreter. Finally, the BASIC interpreter would run your program. Only the TMS9900 code is statically recompiled in this program. The GPL still runs interpreted just as it did on the original machine. While it would theoretically be possible to statically recompile the GPL code as well this would be significantly more difficult since GPL was never officially documented by TI. The only definitive reference is the original interpreter.
  • C64 BASIC outputs characters to the screen via one function (CHROUT) which was easy to trap. TI-99/4A BASIC outputs to the screen with direct writes to video memory from multiple different places in the code. This requires some ugly hacks to get anything approximating reasonable on stdout.

Limitations

The Cassette, Disk, and Sound devices are not currently emulated. Contributions in these areas are most welcome.

Links

pixelstats trackingpixel