
I'll defer to you I guess, as you seem to know more about this than me. The only thing is, searching through the 6502.org forums, I don't see a consensus on this. Plenty of people are talking about the advantages of MVN/MVP for bulk transfers. I seem to recall doing the cycle counting myself at one point, too, and finding it advantageous.

One neat trick (which I remember reading about from Alan Cox, I believe), if you have control over the hardware, is to memory-map I/O devices like serial input/output so that consecutive addresses starting at a given base all point to the same physical device/register. E.g. allocate 256 contiguous bytes in your memory map to point to the same thing. This way you can do bulk I/O transfers to/from memory using MVP/MVN instead of "get a byte, put a byte" instruction by instruction.
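To make the mirroring idea concrete, here is a toy Python model, not real hardware or 65C816 code. The base address `0xFE00`, the `Bus` class, and `block_move` are all hypothetical names invented for illustration; the only thing taken from the comment above is the 256-byte window whose addresses all decode to the same register.

```python
# Toy model: a 256-byte window in which every address decodes to one data register.
WINDOW_BASE = 0xFE00   # hypothetical base address, chosen only for this sketch
WINDOW_SIZE = 0x100    # 256 mirrored bytes, as suggested in the comment above


class Bus:
    def __init__(self, device_bytes):
        self.ram = bytearray(0x10000)
        self.device = iter(device_bytes)  # stands in for a serial/disk data port

    def read(self, addr):
        # Any address inside the window returns the next byte from the device.
        if WINDOW_BASE <= addr < WINDOW_BASE + WINDOW_SIZE:
            return next(self.device)
        return self.ram[addr]

    def write(self, addr, value):
        self.ram[addr] = value


def block_move(bus, src, dst, count):
    """Mimic the shape of a block move: source and destination both increment."""
    for i in range(count):
        bus.write(dst + i, bus.read(src + i))


bus = Bus(b"HELLO, SECTOR")
block_move(bus, WINDOW_BASE, 0x2000, 13)
print(bytes(bus.ram[0x2000:0x200D]))
```

Because the source address keeps incrementing but still hits the same register, a plain block move ends up streaming device data into sequential RAM. The window size caps how many bytes one move can transfer before the source has to be reset.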




The trick you describe was used by the Silicon Valley Computer ADP50L IDE controller back in the early nineties (1991). Using memory-mapped I/O instead of traditional x86 port access lets you replace the manual loop with 'rep movsb'; the result can be a 50% speed bump.

https://forum.vcfed.org/index.php?threads/performance-of-lo-...

Port IO Read Speed : 219.39 KB/s

MMIO Read Speed : 310.77 KB/s

Some variants of XTIDE hardware also implement this, as does the free bios.
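As a quick sanity check on the two benchmark figures quoted above (and only those two; other hardware may differ), the ratio can be worked out like this:

```python
# Figures copied from the benchmark lines quoted above, in KB/s.
port_io_kbs = 219.39
mmio_kbs = 310.77

speedup = mmio_kbs / port_io_kbs
percent_faster = 100 * (speedup - 1)
print(f"{speedup:.2f}x, about {percent_faster:.0f}% faster")
```

For this particular pair of numbers the gain comes out a little over 40%, so the "50%" figure is best read as a rough upper estimate.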


Ah here it is: http://forum.6502.org/viewtopic.php?f=2&t=5035 referencing a now-lost G+ post from Alan Cox:

"The emulator also has a fun hack for disk performance I'm hoping will get replicated in some of the upcoming retro 65C816 board design. Like the 6502 the 65C816 sucks at continually reading from an MMIO port and writing it to sequential memory locations. It sucks less than a 6502 because you've got 16bit index registers, but at the same clock it was doing about 100K/second that a Z80 can do 250K (with ini loops). The revised emulated disk interface has the same mmio port replicated across a chunk of address space and this allows a block move instruction (MVN) to do all the work at 6 clocks/byte. At that point the 65C816 suddenly jumps to twice as fast as the Z80 on disk I/O."


MVP/MVN are 7 cycles per byte.

If you're moving memory around in bank 0 (or have memory mapping), you can use the direct page register to read/write anywhere in bank 0 and the stack to read/write anywhere in bank 0.

16-bit LDA dp, PHA is 4 + 4 = 8 cycles for two bytes, or 4 cycles per byte. Best case would be if you know it's constant data beforehand, e.g. LDA #0, PHA, PHA .... 2 cycles per byte!

For general purpose copying MVP and MVN are easier and have better code density.
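The per-byte arithmetic above can be laid out in a few lines of Python. The cycle counts are the ones stated in this comment; the 4 MHz clock is a hypothetical value picked only to turn cycles into a throughput figure, and the sketch ignores setup and loop overhead.

```python
def cycles_per_byte(total_cycles, bytes_moved):
    return total_cycles / bytes_moved


methods = {
    "MVN/MVP": cycles_per_byte(7, 1),                     # 7 cycles per byte moved
    "16-bit LDA dp + PHA": cycles_per_byte(4 + 4, 2),     # 8 cycles per 2 bytes
    "16-bit PHA of a constant": cycles_per_byte(4, 2),    # 4 cycles per 2 bytes
}

CLOCK_HZ = 4_000_000  # hypothetical clock speed, not taken from the thread

for name, cpb in methods.items():
    kb_per_s = CLOCK_HZ / cpb / 1000
    print(f"{name}: {cpb:g} cycles/byte, roughly {kb_per_s:.0f} KB/s")
```

In practice the unrolled LDA/PHA approach pays for its speed in code size, which is the trade-off against MVN/MVP noted above.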


2 cycles per byte! It takes 4 cycles for PHA to push the 16-bit Accumulator, two bytes, onto the stack. There's also 16-bit PHD, PHX and PHY.



