There is something about RISC-V that really inspires lots of hackers and it’s no...

bee_rider · 2024-10-11T13:31:47 1728653507

It is cool as heck that truly open hardware might actually win in our lifetimes. (ARM was an interesting start, but too much licensing).

marssaxman · 2024-10-11T17:28:16 1728667696

For me, it's the fact that it is a truly open standard, with no licensing entanglements. It has the potential to be a durable ecosystem, worth investing in.

chrisjj · 2024-10-11T13:52:36 1728654756

> There is something about RISC-V that really inspires lots of hackers

"Not Arm" :)

trq01758 · 2024-10-11T14:07:32 1728655652

Simplicity, it's a modern MOS 6502. Base RISC-V has even less instructions than 6502.

Joker_vD · 2024-10-11T15:34:56 1728660896

Also it has less registers (32 vs. 256 in the zero page) and less addressing modes.

wk_end · 2024-10-11T18:37:17 1728671837

But the 6502's "registers" are much smaller and you can do much less with them. You can't really sensibly compare the two approaches so superficially.

snvzz · 2024-10-12T05:14:43 1728710083

Can they really be called registers, when they're bytes in DRAM?

It really is just a convenient short addressing mode.

The 6502 has actual registers, A/X/Y and the specialized S/P/PC.

brucehoult · 2024-10-13T01:49:49 1728784189

> Can they really be called registers, when they're bytes in DRAM?

Yes they can, because that's just an implementation detail.

Registers are nothing more than a conveniently short address for frequently-accessed working storage. Sometimes they are in their own address space (which in modern use usually doesn't have indirect/computed addressing, but can), but sometimes they are in the same address space as RAM e.g. in AVR the first 32 bytes of RAM are the registers (which might or might not be implemented in the same technology). Some early / small AVRs didn't have any other RAM. The same is true of PIC and 8051. And then there is the TMS9900 where the only on-chip registers were the PC and a pointer to where in RAM the working registers were stored.

It seems entirely appropriate to refer to the 6502's Zero Page as "registers" given that 1) it barely has any others, and 2) the very fundamental for modern software base+offset addressing mode exists only using two zero page bytes as the base. You would otherwise be reduced to using self-modifying code for any access via pointer.

If the 6502 ISA had not become obsolete for other reasons -- the desire for more than 8 bit ALUs and 16 bit addresses -- it is entirely likely that as CPUs became faster than RAM and more transistors were able to be put in the CPU then future 6502s would have brought Zero Page on-chip.

chrisjj · 2024-10-13T17:19:52 1728839992

> It seems entirely appropriate to refer to the 6502's Zero Page as "registers" given that 1) it barely has any others,

Enough to write any program, mind.

> and 2) the very fundamental for modern software base+offset addressing mode exists only using two zero page bytes as the base.

Correction: base indirect + offset.

jdjebdndn · 2024-10-13T12:32:59 1728822779

It's not an implementation detail, if it has to go through the system bus

brucehoult · 2024-10-14T01:28:36 1728869316

Going through the system bus IS an implementation detail.

You could build a 6502-compatible CPU with a (extra [1]) 256 byte on-chip register file, and treat, for example, `0x1265` as simply a 16 bit instruction `ADC A,R18`, or `0x0791` as an x86-ish `MOV [R7+Y],A`.

All binary programs would run just as they do on the 1975 6502, just a lot faster.

[1] in the original 6502, the registers aren't in a register file in the modern sense, they're implemented with flip flops and all are accessible simultaneously (with wired-OR on to a bus in some cases if the decode ROM selected several at the same time)

chrisjj · 2024-10-13T17:20:28 1728840028

Or off the CPU regardless, mind.

snvzz · 2024-10-13T05:14:45 1728796485

I'd call it scratch page or something of the sort.

The main issue in my mind is that an actual set of registers which are inside the chip do already exist.

crackez · 2024-10-11T23:24:08 1728689048

Yet the same number of register bytes (when only counting 6502's zero page)... 32x8 = 256 bytes...

brucehoult · 2024-10-12T02:16:30 1728699390

31x4 = 124 bytes :-) The Zero register doesn't have to physically exist, and we're talking about 32 bit CPUs e.g. Pi Pico 2 here, right.

Ok, 128 bytes if you add in the PC.

Except for RV32E -- as seen in the very popular $0.10 CH32V003 -- which has 15x4 = 60 bytes of GPRs, plus the PC.

Plus usually a few CSRs on practical CPUs, though Zicsr is an extension so you don't have to have it.

acegopher · 2024-10-11T17:58:40 1728669520

The same could be said of the ARM Cortex-M0+.

kragen · 2024-10-11T18:08:32 1728670112

The Cortex-M0's Thumb-1 is a really unpleasant instruction set compared to ARM, Thumb-2, RISC-V, or ARM64.

brucehoult · 2024-10-12T02:37:50 1728700670

Though no worse than 16 or 32 bit x86 (without FPU), and probably better because the lower 8 registers are general-purpose.

Also you can get something useful from the "spare" five registers r8-r12 as they support MOV, ADD and CMP with any other register, plus BX. Sadly you're on your own with PUSH/POP except for PUSH LR / POP PC.

Thumb-1 (or ARMv6-M) is fairly similar to RISC-V C extension. It's overall a bit more powerful because it has more opcodes available and because RVC dedicates some opcodes to floating point. RVC only lets you do MV and ADD on all 32 (or 16 in RV32) registers, not CMP (not that RISC-V has CMP anyway). Plus, RVC lets you load/store any register into the stack frame. Thumb-1 r8-r14 need to be copied to/from r0-r7 to load or store them.

But on the other hand, RVC is never present without the full-size 4 byte instructions, even on the $0.10 CH32V003, making that a bit more pleasant than the similar price Cortex M0 Puya PY32F002.

kragen · 2024-10-14T00:46:01 1728866761

My initial experience with Thumb-1 was like stepping on a series of rakes. Can't use ADD? Why not? Oh, it turns out you have to use ADDS. Wait, why am I getting an error when I try to use ADDS? Turns out that inside an ITTE (etc.) block, you can't use ADDS; you have to use ADD. And the various other irregular restrictions on what you can express are similarly unpredictable. Maybe my gripe isn't really with Thumb-1 but with GAS, but even when you learn the restrictions, it still takes extra mental effort to program under them. I did have some similar experiences with 8086 code (it took me a certain amount of trial and error to learn which registers I could use as base registers and index registers, as I recall) but never 80386 code, where all of its registers are just as general-purpose as on Thumb-1, unless you're looking for sizecoding hacks to get your demo down under 64 bytes or whatever.

I agree that RVC is similar in theory, but being able to mix 4-byte instructions into your RVC code largely eliminates the stepping-on-rakes problem, even on Graham Smecher's redoubtable Minimax which Jecel Assumpção mentioned. I still prefer ARM assembly over RISC-V, but both definitely have their merits.

brucehoult · 2024-10-14T01:32:36 1728869556

If you have ITTE (etc.) then you're not on Thumb-1 (e.g. ARM7TDMI) or ARMv6-M (Cortex M0+), you're on Thumb-2.

> but being able to mix 4-byte instructions into your RVC code largely eliminates the stepping-on-rakes problem

Absolutely, which is why I pointed out that no one (at least no one commercial) has ever implemented RVC alone, not even on the 10c CH32V003.

kragen · 2024-10-14T01:59:01 1728871141

Oh, you're right, of course. I misremembered that rake. I stepped on some others I can't remember now, though.

I wouldn't be surprised to see commercial implementations of Minimax. It seems like it would have a much better cost/benefit ratio than SeRV for some applications.

jecel · 2024-10-13T05:00:10 1728795610

It is better to say RVC is almost never present without the full-size 4 byte instructions since we have one counter example:

https://github.com/gsmecher/minimax

This is an experimental rather than practical design that only directly implements the compressed instructions in hardware and then implements the normal RV32I instructions in "microcode" written using the compressed instructions.

kragen · 2024-10-13T18:45:08 1728845108

Minimax is a super cool design! I think it's not really a counterexample, because it does implement the uncompressed instructions, just more slowly.

brucehoult · 2024-10-14T12:04:53 1728907493

The LUT counts do look competitive, until you realise that this doesn't include the cost of the microcode.

Probably fine on FPGA where there's lots of almost free BRAM, but on an ASIC where you'd need to use SRAM or mask ROM, or if you used LUTRAM, it would look very different.

Plus, the speed penalty for the microcoded instructions is huge. perhaps not as huge as SeRV :-)

kragen · 2024-10-14T15:47:33 1728920853

That sounds reasonable, yeah. Presumably you'd write your inner loops purely in RVC instructions; in the situations where you'd use SeRV, you wouldn't be using it for your computational bottlenecks, which you'd build special-purpose hardware for, but just to sort of orchestrate a sequence of steps. But Minimax seems like it could really reduce the amount of stuff you had to design special-purpose hardware for.

rootnod3 · 2024-10-11T19:15:25 1728674125

For me, part of it is also the beauty of the ISA. I think it is just really well thought out with its extensions and namespacing for custom ISAs.

KerrAvon · 2024-10-11T18:56:40 1728673000

It's a bunch of things; many of the reasons are actually technical. It's very simple to compile to RISC-V instructions.

brucehoult · 2024-10-12T03:48:05 1728704885

The offsets for J{AL} and Bcc are a little tricky, though only half a dozen lines of code to sort out.

zyffsht · 2024-10-12T02:46:19 1728701179

Tyyygyg