> Cgo calls take about 40ns, about the same time encoding/json takes to parse a single digit integer.
As an aside, that sure feels like a lot of time to parse a single-digit integer. Not catastrophic by any means, but still, 100 to 200 cycles is a sign the program is definitely Doing Something. Perhaps that’s the memory allocator?
They changed the ABI in Go 1.17 to pass arguments in registers instead of on the stack: https://go.dev/doc/go1.17#compiler. So if you used this solution, you might not need the fixup anymore, provided the ABI matches.