Well, it is a very simple solver. You can't change it after it's synthesized, and it's not true that it uses no power - you need strong enough incoming light. That's your power source. It's probably safe to assume that each layer removes some power in a multiplicative fashion, so there's only so much you can do in a single glass step before the beam dissipates.
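To put rough numbers on the multiplicative-loss guess, here is a minimal sketch. The per-layer transmission of 0.8 and the 1 W input are made-up illustrative values, not figures from the paper, and `remaining_power` is a hypothetical helper name.

```python
def remaining_power(input_power_w, per_layer_transmission, num_layers):
    """Power left after num_layers, assuming each layer passes a fixed fraction.

    This is only the multiplicative-loss assumption from the comment above:
    P_out = P_in * t ** N. Real losses in the glass device may behave differently.
    """
    if not 0.0 <= per_layer_transmission <= 1.0:
        raise ValueError("per_layer_transmission must be between 0 and 1")
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    return input_power_w * per_layer_transmission ** num_layers


# Hypothetical numbers: 1 W of incoming light, 80% transmitted per layer.
for layers in (1, 5, 10, 20):
    print(layers, "layers ->", remaining_power(1.0, 0.8, layers), "W")
```

Under this assumption the output falls off exponentially with depth, which is why the number of layers you can stack in one pass would be limited by how much light you can put in.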
As far as timing goes - 1 cm of glass delays light by ~50 picoseconds. That corresponds to around 20 gigahertz. Fast, but not mind-blowingly fast.
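The arithmetic behind that estimate, as a quick sketch. It assumes a refractive index of about 1.5 (typical for ordinary glass, but an assumption here) and treats one transit through the glass as one "clock cycle"; the function names are hypothetical.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact by SI definition)


def transit_delay_s(length_m, refractive_index=1.5):
    """Time for light to cross length_m of material: t = n * L / c."""
    return refractive_index * length_m / C_VACUUM_M_PER_S


def equivalent_rate_hz(delay_s):
    """Rate if one full transit counts as one cycle: f = 1 / t."""
    return 1.0 / delay_s


delay = transit_delay_s(0.01)  # 1 cm of glass
rate = equivalent_rate_hz(delay)
print(f"{delay * 1e12:.1f} ps, {rate / 1e9:.1f} GHz")
```

The delay is linear in the path length, so the result depends entirely on how thick the glass actually is.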
Well, it's MNIST - the most basic and classic OCR benchmark out there. Also, there's no info on how much light power is actually necessary, though I'm sure it's exponential in the number of layers. And another thing I forgot to write: how are you going to interface with this thing?
Glass is 100% recyclable. It's conceivable that a robot could be built to fabricate ML models and melt down / update old models when necessary. That would be pretty steampunk.
The device is not 1 cm, it is 'microns each side', right? That should make it roughly 10,000x faster than 20 GHz, which is quite mind-blowing to me. Especially if you imagine multi-layer superstructures for resolving very complex tasks with small amounts of energy.