It works well if it is code-generated down to efficient sequential code. It does not work well if you actually perform fine-grained dataflow execution, where every individual operation can fire as soon as its inputs become available. The former works well on our multicore processors, but the latter is essential to expose and exploit the large amount of parallelism (billion-way parallelism) needed to scale up. There is quite some overhead associated with managing (storing/matching/synchronizing) data tokens, which is best handled in hardware. For instance, the Cray XMT (Threadstorm processor) could do a thread context switch in one clock cycle. How many cycles is a thread switch on a modern Intel core these days? Thousands?
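To make the token-management overhead concrete, here is a rough software sketch of fine-grained dataflow execution. It is only an illustration under my own assumptions, not how the XMT or any real dataflow machine is built: `run_dataflow`, the `nodes`/`edges` layout, and the `stats` counters are all hypothetical names made up for this example.

```python
from collections import defaultdict, deque
import operator


def run_dataflow(nodes, edges, initial_tokens):
    """Toy token-driven dataflow interpreter (illustrative sketch only).

    nodes:  name -> (function, arity)
    edges:  name -> list of (destination node, input port)
    initial_tokens: iterable of (destination node, input port, value)
    """
    waiting = defaultdict(dict)      # matching store: node -> {port: value}
    queue = deque(initial_tokens)    # tokens in flight
    results = {}                     # outputs of nodes with no consumers
    stats = {"tokens": 0, "firings": 0}

    while queue:
        dest, port, value = queue.popleft()
        stats["tokens"] += 1

        func, arity = nodes[dest]
        slots = waiting[dest]
        slots[port] = value          # store the token
        if len(slots) < arity:       # match: are all operands present?
            continue

        args = [slots[p] for p in range(arity)]
        del waiting[dest]            # free the matching-store entry
        stats["firings"] += 1
        out = func(*args)            # the node fires

        targets = edges.get(dest, [])
        if not targets:
            results[dest] = out
        for target, target_port in targets:
            queue.append((target, target_port, out))

    return results, stats


# Graph for (a + b) * (c - d) with a=1, b=2, c=7, d=3.
nodes = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}
edges = {"add": [("mul", 0)], "sub": [("mul", 1)]}
tokens = [("add", 0, 1), ("add", 1, 2), ("sub", 0, 7), ("sub", 1, 3)]

results, stats = run_dataflow(nodes, edges, tokens)
print(results, stats)
```

In this toy run, three arithmetic operations require six token store/match steps, each involving dictionary lookups and queue traffic. That bookkeeping is the part I would argue belongs in hardware.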