Sonnet 3.5 is fast for its quality. But yeah, it's nowhere near Google's Flash models. I assume that's largely just because it's a much smaller model.
Ooh, what are these ASICs you're talking about? My understanding was that we'll see AMD/Nvidia GPUs continue to be pushed and stay very competitive, alongside new system architectures like Cerebras or Groq. I haven't heard about new compute platforms framed as ASICs.
Is Cerebras an integrated circuit or more an integrated wafer? :-)
And yeah, their cost is ridiculous, on the order of high 6 to low 7 figures per wafer. The rack alone looks several times more expensive than the 8x NVIDIA pods [1].
Imagine if Anthropic or someone eventually releases a Claude 3.5 but at, like, a whopping 10x its current speed.
That would be far more useful and game-changing than a slow o1 model that may or may not be x percent smarter.