6502 Language Implementation Approaches

userbinator · on Nov 4, 2018

I still find it to be an interesting intellectual challenge, even though it has no commercial use that I know of.

The huge volume of embedded cores based on a 6502 would disagree strongly --- I'm sure everyone has at one point used a 6502-based embedded system. They're everywhere in things like keyboards and mouses, LCD monitors (the MCU responsible for generating the OSD and such), and toys like the famous Tamagotchi and Furby, as well as keychain picture frames:

https://hackaday.com/2013/05/24/tamagotchi-rom-dump-and-reve...

https://news.ycombinator.com/item?id=17751599

http://spritesmods.com/?art=picframe

Other popular "not very good for HLL" architectures that nevertheless have C compilers (at least for a subset of C) include the 8051 and the Microchip PIC series.

ci5er · on Nov 4, 2018

We (the company I worked for as a Systems Engineer in Japan) sold hundreds of millions (billions?) of HC11s. Pretty much a 6809 SoC.

Now, they were embedded in mass-consumer devices, so only a few people had to program them.

a1369209993 · on Nov 4, 2018

> not very good for HLL

I think it's worth noting that it's very easy to treat the 6502 as having 128 16-bit general purpose registers. If you're okay with poorly optimised code, that makes for a very nice compiler target, while still allowing you to break out to hand-written assembly for inner loops and other sensitive areas.

Edit: On rereading, TFA links to https://dwheeler.com/6502/a-lang.txt, where this approach is discussed under solution 3.
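
A minimal sketch of the "128 16-bit registers" view, written as a Python model rather than real 6502 code. It assumes the whole 256-byte zero page is free (on a real machine the ROM or OS usually claims part of it), and the names `zero_page`, `set_reg`, `get_reg` and `add16` are made up for illustration. The 16-bit add is done the way a simple compiler would emit it: two 8-bit adds chained through the carry.

```python
# Hypothetical model: zero page viewed as 128 little-endian 16-bit pseudo-registers.
zero_page = bytearray(256)

def reg_addr(n):
    """Zero-page address of the low byte of pseudo-register n (0..127)."""
    if not 0 <= n < 128:
        raise ValueError("only 128 pseudo-registers fit in zero page")
    return 2 * n

def set_reg(n, value):
    a = reg_addr(n)
    zero_page[a] = value & 0xFF             # low byte first (little-endian)
    zero_page[a + 1] = (value >> 8) & 0xFF  # then high byte

def get_reg(n):
    a = reg_addr(n)
    return zero_page[a] | (zero_page[a + 1] << 8)

def add16(dst, src1, src2):
    """dst = src1 + src2 using two 8-bit adds; returns the final carry."""
    a, b, d = reg_addr(src1), reg_addr(src2), reg_addr(dst)
    low = zero_page[a] + zero_page[b]                    # CLC / LDA a / ADC b
    carry = low >> 8
    high = zero_page[a + 1] + zero_page[b + 1] + carry   # LDA a+1 / ADC b+1
    zero_page[d] = low & 0xFF                            # STA d
    zero_page[d + 1] = high & 0xFF                       # STA d+1
    return high >> 8

set_reg(0, 0x12FF)
set_reg(1, 0x0001)
add16(2, 0, 1)
print(hex(get_reg(2)))  # 0x1300
```

Every operation costs several loads and stores, which is where the "poorly optimised" part comes from, but the compiler never has to think about the three real registers.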

quickthrower2 · on Nov 4, 2018

Would the Arduino chip be a better choice for the intellectual challenge? You get a lot of modules you can buy from different vendors, and at the end of the day it's a simple to program chip like the 6502 from what I have seen.

octorian · on Nov 5, 2018

There is no "Arduino chip." It's a hardware module and software ecosystem built around the Atmel AVR microcontroller series.

quickthrower2 · on Nov 5, 2018

Yeah I was being lazy, but I meant the chip that comes on e.g. an Arduino Uno, i.e. the ATmega328P.

DonHopkins · on Nov 4, 2018

Apple Contributed Programs Volume 3 included "MICROLISP", written in AppleSoft BASIC by Ole Anderson.

    MICROLISP/16-JUN-78
    COPYRIGHT 1978 APPLE COMPUTER INC
    IT WILL TAKE APPROXIMATELY 1.5 MINUTES
    TO INITIALIZE THE ARRAYS.
    LIST ELEMENTS= 3634

Text AppleSoft BASIC source code: http://apple.rscott.org/tools/view.htm?crc=e7d8e947136190aa&...

DSK file: http://apple.rscott.org/tools/dldsk.htm?crc=e7d8e947136190aa

Documentation: https://archive.org/details/Apple_Software_Bank_Vol_3-5/page...

MicroLisp and other languages (including other Lisps, Logos, etc) are discussed in:

The Apple II Programmer's Catalog of Languages and Toolkits

http://apple2.org.za/gswv/a2zine/GS.WorldView/v1999/Feb/A2.N...

lisper · on Nov 4, 2018

It's actually quite amazing that it's possible to run Lisp in 48k of RAM, but it is.

http://www.flownet.com/ron/plisp.html

Someone · on Nov 4, 2018

http://history.siam.org/sup/Fox_1960_LISP.pdf:

”The current basic LISP system uses about 12,000 of the 32,000 memory of the IBM 704.”

The 704 was 36 bits, so that 12,000 was 54 kilobytes.
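
The conversion can be checked with a few lines of Python. This is only the arithmetic from the comment above, using the round figures from the quoted paper; the function name `words_to_bytes` is made up for illustration.

```python
def words_to_bytes(words, bits_per_word=36):
    """Convert a count of fixed-width machine words to 8-bit bytes."""
    return words * bits_per_word // 8

lisp_bytes = words_to_bytes(12_000)     # the LISP system
memory_bytes = words_to_bytes(32_000)   # the whole machine, as quoted
print(lisp_bytes, memory_bytes)  # 54000 144000
```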

Speed-wise, the 704 had more and larger registers, but ran slower than a 6502 (about 40 kHz for a 704 according to http://bitsavers.org/pdf/ibm/704/24-6661-2_704_Manual_1955.p... vs about a MHz for typical 6502s)

On the plus side, the 704 did multiplications and divisions. https://www-03.ibm.com/ibm/history/exhibits/mainframe/mainfr... gives it about 4,000 multiplications or divisions per second. I think that’s quite a bit faster than a 6502, on 36 bit words, but you don’t need 36-bit arithmetic for a LISP.

It also had floating point, but you don’t need that for a LISP, either.

forinti · on Nov 4, 2018

I used a Pascal on a BBC Micro which consisted of 2 16KB ROMS: one for the editor and one for the compiler. It would have maybe 32KB of RAM minus the screen memory (1KB to 20KB).

rbanffy · on Nov 5, 2018

32K for an editor?! That's insane!

If I'm not clear, that's insanely huge for an 8-bit computer.

Heliodromus · on Nov 5, 2018

Actually...

I was co-author of Acorn ISO Pascal.

The system fit in two 16KB EPROMS for a total of 32KB, but only one of the 16KB ROMS could be mapped into the address space at a time.

The way this worked was that the compiler was self-hosted (i.e. written in ISO Pascal and compiled itself) and generated our own stack-based virtual machine code which we referred to as BL-code based on our own initials.

One of the 16K ROMS contained the BL-code of the self-compiled compiler... which only fit after a considerable amount of effort, including a few "macro" BL-codes designed for the purpose. Remember this was full BSI-certified ISO-Pascal, plus Acorn extensions for graphics etc, not some toy subset.

The other 16K ROM contained everything else, meaning the BL-code interpreter, screen editor, run-time libraries (floating point - which we copied from BBC basic, Pascal I/O, heap, etc), and command line interpreter. The editor, which I wrote, was around 4KB and fairly sophisticated for the time, including full regex global replace.

One interesting tidbit is how the system actually ran given that the compiler was in one ROM, and the interpreter needed to run it in the other ROM, with only one ROM able to be mapped into the address space at a time... The way we handled this was to relocate the interpreter into RAM in order to run the compiler (but run from ROM when running user programs), so the interpreter was organized into pure code, pure data and relocatable address tables to make this possible.
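
To illustrate the general idea of relocation through an address table (not the actual Acorn format, which isn't described here), a rough Python sketch follows. It assumes the table is simply a list of offsets at which a 16-bit little-endian absolute address sits; the function name, the table layout and the RAM address `0x3000` are all hypothetical.

```python
def relocate(image, reloc_offsets, old_base, new_base):
    """Return a copy of image with every listed 16-bit address moved by the base delta."""
    out = bytearray(image)
    delta = new_base - old_base
    for off in reloc_offsets:
        addr = out[off] | (out[off + 1] << 8)   # read little-endian address
        addr = (addr + delta) & 0xFFFF          # rebase, wrapping at 64K
        out[off] = addr & 0xFF
        out[off + 1] = addr >> 8
    return bytes(out)

# Six bytes assembled for $8000: JMP $8005 / NOP / NOP / RTS
rom_image = bytes([0x4C, 0x05, 0x80, 0xEA, 0xEA, 0x60])
ram_image = relocate(rom_image, reloc_offsets=[1], old_base=0x8000, new_base=0x3000)
print(ram_image.hex())  # 4c0530eaea60
```

Keeping code, data and address tables separate means only the table entries need patching; the pure code bytes can be copied as they are.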

Getting the whole system to fit into those two 16K ROMS was a heck of a challenge!

rbanffy · on Nov 5, 2018

Having grown up in Brazil, I never had contact with the BBC micros until I became interested in retro computing. What you folks accomplished is not appreciated enough on the other side of the Atlantic.

forinti · on Nov 5, 2018

I think the editor was just one of the 16KB ROMS. But you are right. Wordwise was an incredible text editor and it also fitted in a 16KB ROM.

32KB was the total RAM you had on a BBC Micro B.

jgrahamc · on Nov 4, 2018

Why do you find it incredible? It really depends on how many features you want to implement. I had LISP in ROM on my BBC Micro. That was about 5.5K of code (6502) on the ROM plus some data. The machine had 32KB of RAM.

stevekemp · on Nov 4, 2018

Agreed, Lisp will work on small environments pretty well. For smaller-still systems I'd recommend FORTH as a good alternative - as discussed in the article itself.

lisper · on Nov 4, 2018

Incredible != Amazing.

sctb · on Nov 4, 2018

Discussed in 2016: https://news.ycombinator.com/item?id=12536211.

david-given · on Nov 5, 2018

...so I wrote this self-hosting compiler for the 6502 and Z80:

http://cowlark.com/cowgol/

I say self-hosting, but on a 64kB BBC Micro second processor with floppy disk it takes about seven minutes to compile Hello World, so I haven't bothered to actually recompile the whole toolchain. (The overwhelming majority of that time is spent doing disk I/O, as there's way too much state to keep in RAM. The compiler is an eight-pass behemoth.)

Here's an accelerated screencast of the thing in action: https://www.youtube.com/watch?v=epTQPSi3IyQ

The language itself is a simple strongly-typed fully compiled thing with a syntax based on Ada, supporting nice stuff like nested subroutines and so on. It has native 8-bit types (unlike C). Its main claim to being interesting is that it statically allocates all variables, using a simple but effective algorithm to walk the call tree and assign multiple variables to the same address if they're not going to be used at the same time. It's super effective. (This is Wheeler's solution 1.) This feature made the entire project possible, because it allowed me to do without stack frames completely. Trying to access the stack on either the 6502 or Z80 is an utter disaster.
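
A toy version of that kind of call-tree allocation, in Python. This is a sketch of the general technique and not Cowgol's actual algorithm: it assumes no recursion, assumes each subroutine needs one fixed-size block of locals, and places each block just past the deepest chain of callers that can reach it. All names and sizes are invented.

```python
def allocate(calls, sizes, root):
    """Map each subroutine to a static base address for its locals."""
    base = {}

    def visit(sub, start, active):
        if sub in active:
            raise ValueError("recursion cannot be statically allocated")
        if base.get(sub, -1) >= start:
            return  # already placed at least this far along
        base[sub] = start
        for callee in calls.get(sub, ()):
            # A callee's locals must not overlap any caller still active.
            visit(callee, start + sizes[sub], active | {sub})

    visit(root, 0, frozenset())
    return base

calls = {"main": ["read_input", "draw"], "draw": ["plot"]}
sizes = {"main": 2, "read_input": 4, "draw": 6, "plot": 3}
base = allocate(calls, sizes, "main")
total = max(base[s] + sizes[s] for s in base)
print(base)   # {'main': 0, 'read_input': 2, 'draw': 2, 'plot': 8}
print(total)  # 11
```

Here `read_input` and `draw` share addresses because they are never live at the same time, so the program needs 11 bytes instead of the 15 a one-slot-per-variable layout would use.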

The 6502 is a _bizarre_ thing to generate code for. 8-bit code is fine, but 16-bit and above is painful --- efficient maths is really hard. I kept finding the generated code breaking down into tiny microloops because when doing arithmetic with 16-bit values it can actually be shorter to use a loop than to inline it (in certain circumstances). The instruction set is orthogonal, except when it isn't; there's no LDA zpg,Y for example, but there's a LDX zpg,Y. Things like moving values from one memory location to another are so expensive that the setup cost in using helper functions frequently outweighs the benefit.

But in general, once you get your head around it and accept that it simply cannot do things like 16-bit signed comparisons in a fashion which won't make you cringe, it's not too bad. Index registers are great, as is zero page indirection (at least for 8-bit offsets). It's fast, taking a few cycles per instruction. It's also fairly sensible: there's frequently only one sane way to do things.
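
For anyone wondering why the signed comparison is cringe-worthy: the usual 6502 idiom is to compare the low bytes, subtract the high bytes with borrow, and then test N XOR V, because there is no single "signed less than" branch. Below is a Python model of that sequence; the helper names are made up, and it models binary-mode `SBC` flag behaviour only.

```python
def sbc8(a, b, carry):
    """Model 8-bit SBC: A - B - (1 - carry). Returns (result, carry, overflow, negative)."""
    diff = a - b - (1 - carry)
    result = diff & 0xFF
    carry_out = 1 if diff >= 0 else 0
    overflow = ((a ^ b) & (a ^ result) & 0x80) != 0  # signed overflow on subtract
    negative = (result & 0x80) != 0
    return result, carry_out, overflow, negative

def signed_less_than_16(x, y):
    """True if x < y, where x and y are 16-bit two's complement bit patterns."""
    # LDA xl / CMP yl : only the carry (no borrow) is kept
    carry = 1 if (x & 0xFF) >= (y & 0xFF) else 0
    # LDA xh / SBC yh : the flags now describe the full 16-bit subtraction
    _, _, v, n = sbc8(x >> 8, y >> 8, carry)
    # BVC skip / EOR #$80 / BMI less : i.e. branch on N XOR V
    return n != v

print(signed_less_than_16(0xFFFF, 0x0001))  # True  (-1 < 1)
print(signed_less_than_16(0x7FFF, 0x8000))  # False (32767 < -32768)
```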

The Z80 drove me nuts, though. It's unbelievably unorthogonal. (You can only do 8-bit direct memory accesses via A --- B, C, D, E, H or L cannot be directly read from or written to memory!) It's slow --- the non-8080 instructions are so painfully slow (ld ix, (abs) is 20 cycles!) that they're only barely worth it. The 16-bit stuff doesn't help nearly as much as you'd think, either; you can only do adds and subtractions, with limited registers, and there's no carry, so they're of no use for wider arithmetic: 32-bit operations have to be done using the 8-bit instructions anyway. I did find the resulting code density to be better than the 6502, but not by much.

I'm really looking forward to doing a 6809 port one day...

rbanffy · on Nov 5, 2018

> The Z80 drove me nuts

I can't respect any CPU that takes FOUR clock cycles to do a NOP. Not sure if it takes longer if the PC points to a page boundary.

slededit · on Nov 5, 2018

Yea but think how fast your Z80 interrupts can be with those shadow registers!

resman · on Nov 5, 2018

The next version of PLASMA has a JIT compiler that will compile PLASMA byte code routines into native machine code based on call frequency. Currently supports 6502 and 65802/65816 backends into a 4K code buffer. It doubles the speed of the PLASMA compiler, itself written in PLASMA.
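
As a rough illustration of call-frequency-triggered compilation into a fixed-size buffer, here is a Python sketch. It is not PLASMA's implementation: the class names, the threshold of 3 calls, and the per-routine `native_size` estimate are all assumptions made for the example, and "compiling" is stood in for by a callback that returns a faster callable.

```python
class Routine:
    """Hypothetical routine with an interpreted form and a way to build a native one."""
    def __init__(self, interpret, compile_native, native_size):
        self.interpret = interpret
        self.compile_native = compile_native
        self.native_size = native_size  # bytes the native code would occupy
        self.calls = 0
        self.native = None

class FrequencyJit:
    def __init__(self, threshold=3, buffer_size=4096):
        self.threshold = threshold
        self.free = buffer_size  # bytes left in the code buffer

    def call(self, routine, *args):
        if routine.native is not None:
            return routine.native(*args)       # already compiled
        routine.calls += 1
        hot = routine.calls >= self.threshold
        if hot and routine.native_size <= self.free:
            routine.native = routine.compile_native()
            self.free -= routine.native_size
            return routine.native(*args)
        return routine.interpret(*args)        # stay on the bytecode path
```

Routines that are called rarely, or that would not fit in what is left of the buffer, simply keep running through the interpreter.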

dmsc · on Nov 5, 2018

For a native 6502 compiler (to a VM similar to the PLASMA one), see FastBasic: https://github.com/dmsc/fastbasic

rbanffy · on Nov 5, 2018

> I briefly used Manx's Aztec C; it was awful. It generated bad code, and was a pain to use.

But it came with a Unix-like shell, make and vi. And probably ed.

purplezooey · on Nov 4, 2018

This thing can use almost 3/4 watt of power at normal clock speed. For that amount of power you could use an ESP32 and get a lot more like Bluetooth etc.

cmrdporcupine · on Nov 5, 2018

TFA: "I still find it to be an interesting intellectual challenge, even though it has no commercial use that I know of. "

I think you might be missing the point.