
> but their instruction window does not seem to scale.

The issue with scaling instruction windows really has more to do with (i) the scheduling logic and (ii) the physical register file than it does with ISA semantics. In particular, the scheduler is usually quite expensive to scale, because it needs to compare each completed result against the not-yet-fulfilled sources on waiting instructions.

> All the dependency information available to a compiler/runtime is lost when code-generating to x86 (same for ARM), yet a processor core needs to extract the same information again in hardware.

I think this is a bit different from what you describe above w.r.t. software dataflow. When you talk about reactive programming in software, dependences are things like event streams or promises that become fulfilled. That's very different from (say) an individual 32-bit register value flowing from producer to consumer instruction, and the compiler usually only knows about the latter type of dependence. Perhaps you're envisioning a different type of reactive/dataflow programming framework though, like chaining together arithmetic transformations or something similarly low-level?
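A small illustration of the two granularities (illustrative only, using stock Python; neither example is tied to any particular framework):

```python
# (1) Coarse-grained, software-visible dependence: a future/promise whose
#     completion the runtime tracks explicitly.
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as pool:
    fut = pool.submit(lambda: 6 * 7)  # dependence = one task's completion
result = fut.result()
print(result)  # 42

# (2) Fine-grained dependence: each value below flows producer -> consumer
#     through (conceptually) a register. The compiler sees this RAW
#     dependence, but it is erased at the ISA level and must be
#     rediscovered by the out-of-order core's renaming/scheduling logic.
a = 6 * 7   # producer
b = a + 1   # consumer -- the dependence an OoO scheduler has to track
print(b)    # 43
```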

---

All of that said, there has been some interesting work in compiling conventional software down to a more "dataflow-like" architecture. See e.g. the TRIPS project at UT Austin [1] from about 10 years ago -- the idea there was to build a fabric of execution units connected by a mesh interconnect on the chip, and then stream data through.

[1] https://www.cs.utexas.edu/~trips/




I was indeed thinking more of fine-grained dataflow or event-driven tasks, in the style of VAL/SISAL or, very recently, OCR (https://xstack.exascale-tech.com/wiki/index.php/Main_Page).

I heard there were some fundamental problems with TRIPS, but never got any detailed explanation. Any idea what they were? Is fine-grained dataflow still too much overhead to do in hardware, or was there something else?

There were some dataflow-critique papers back in the '80s and '90s, but most of their points seem moot these days: low transistor occupancy (compared to the 90%+ of simple RISC microprocessors) and the high cost of the associative memory needed for the token matchers. These days we are working with billions of transistors instead of tens of thousands, and we commonly ship processors with megabytes of associative memory in the form of SRAM cache.
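For reference, the token matching those critiques targeted looks roughly like this (a sketch of the classic tagged-token idea, not any specific machine): tokens carrying (destination instruction, input port, value) are matched in an associative store, and an instruction fires once all its operands have arrived. The dict here stands in for the hardware CAM whose cost was the original objection.

```python
ARITY = {"add": 2}  # operand count per opcode (hypothetical tiny ISA)
pending = {}        # instr_id -> {port: value}; the "matching store"

def inject(instr_id, op, port, value):
    """Insert a token; fire the instruction if all operands have matched."""
    slot = pending.setdefault(instr_id, {})
    slot[port] = value                 # associative match + insert
    if len(slot) == ARITY[op]:         # full operand set: fire
        del pending[instr_id]
        return op, sorted(slot.items())
    return None                        # still waiting

print(inject(1, "add", 0, 10))  # None -- waiting for the second operand
print(inject(1, "add", 1, 32))  # fires: ('add', [(0, 10), (1, 32)])
```

In hardware the dict lookup becomes a wide CAM search, which was expensive at 1980s transistor budgets but is in the same family as the tag-matching SRAM structures shipped in every modern cache.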


Cool, I have some more reading to do -- thanks for that link!

My understanding of TRIPS is that compiler complexity is an issue -- it depends on being able to chunk the program into hyperblocks with fairly high control locality (i.e., execution remains in one hyperblock for a while), because the transition cost is higher than a simple branch on a conventional superscalar. In the best case this can be great (32 ALUs' worth of ILP, IIRC), but in the worst case (e.g., a large code footprint) it can degrade terribly.

IIRC there are some more practical issues too -- e.g., the ISA is specific to the microarchitecture (because it explicitly maps to ALUs in the fabric), so binary compatibility is an issue.

All that said, still a really interesting idea, I think.



