It's far too early to compare this with any commercial NN accelerators. From the article:
Although the performance of ONNs is not yet competitive with leading-edge electronic processors at >200 TOPS (for example, Google TPU and other chips), there are straightforward approaches towards increasing our performance both in scale and speed. Further, with a single processor speed of 11.3 TOPS, our VCA is approaching this range. The CA is fundamentally limited in data size only by the electrical digital-to-analogue converter memory, and processing 4K-resolution (4,096×2,160 pixels) images at >7,000 frames per second is possible. The 720 synapses of the CNN (72 wavelengths per synapse per neuron, 10 neurons), a substantial increase for optical networks, enabled us to classify the MNIST dataset. Nonetheless, further scaling is needed to increase the theoretical prediction accuracy from 90% to that of state-of-the-art electronics, typically substantially greater than 95%.
For comparison, the Nvidia A100 has about 2 TB/s of memory bandwidth. That's 16,000 Gbps.
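To make the unit conversion concrete, here is a quick back-of-the-envelope check in Python. It also estimates the raw input rate implied by the quoted "4K at >7,000 frames per second" figure. The 1 byte per pixel is my assumption (the quote doesn't state a bit depth), and the function names are just illustrative, so treat the second number as an order-of-magnitude estimate only.

```python
# Back-of-the-envelope bandwidth comparison.
# Assumption (not from the article): 1 byte per pixel, single channel.

BITS_PER_BYTE = 8


def tb_per_s_to_gbps(tb_per_s: float) -> float:
    """Convert terabytes/second to gigabits/second (decimal units)."""
    return tb_per_s * 1000 * BITS_PER_BYTE


def video_gbps(width: int, height: int, fps: float,
               bytes_per_pixel: float = 1.0) -> float:
    """Raw data rate of an uncompressed video stream, in Gbps."""
    return width * height * fps * bytes_per_pixel * BITS_PER_BYTE / 1e9


a100_gbps = tb_per_s_to_gbps(2.0)           # 2 TB/s -> 16,000 Gbps
stream_gbps = video_gbps(4096, 2160, 7000)  # roughly 495 Gbps at 1 byte/pixel

print(f"A100 memory bandwidth: {a100_gbps:,.0f} Gbps")
print(f"4K @ 7,000 fps input:  {stream_gbps:,.0f} Gbps")
```

Under that assumption the 4K stream needs a few hundred Gbps of input, well below the A100's memory bandwidth.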
Typically, when you quote a frame rate as a measure of a processor's performance, you care about throughput, not input data transfer bandwidth. Maximum transfer bandwidth can cap throughput, but usually not at the first layer. For example, a standard benchmark is the ResNet-50 ImageNet frame rate.
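A rough sketch of the distinction: the achievable frame rate is bounded by the smaller of a compute limit and an input-transfer limit. Every number below is an illustrative assumption of mine, not a measurement: about 4 billion multiply-accumulates per 224×224 ResNet-50 inference (counted as 8 GOPs), 3 bytes per input pixel, a hypothetical 10 GB/s input link, and the 11.3 TOPS figure from the quote treated as if it were a sustained rate.

```python
# Upper bound on frame rate: min(compute-bound fps, input-bandwidth-bound fps).
# All constants are illustrative assumptions, not benchmark results.

def max_fps(peak_ops_per_s: float, ops_per_frame: float,
            input_bytes_per_s: float, bytes_per_frame: float) -> tuple:
    """Return (compute_fps, io_fps, bound) for a single device."""
    compute_fps = peak_ops_per_s / ops_per_frame
    io_fps = input_bytes_per_s / bytes_per_frame
    return compute_fps, io_fps, min(compute_fps, io_fps)


PEAK_OPS = 11.3e12               # 11.3 TOPS, from the quoted passage
OPS_PER_FRAME = 8e9              # assumed: ~4 GMACs per ResNet-50 inference
INPUT_BW = 10e9                  # hypothetical 10 GB/s input link
BYTES_PER_FRAME = 224 * 224 * 3  # assumed: 8-bit RGB ImageNet-sized input

compute_fps, io_fps, bound = max_fps(
    PEAK_OPS, OPS_PER_FRAME, INPUT_BW, BYTES_PER_FRAME
)
print(f"compute-bound: {compute_fps:,.0f} fps")
print(f"input-bound:   {io_fps:,.0f} fps")
print(f"upper bound:   {bound:,.0f} fps")
```

With these assumed numbers the compute limit is far lower than the input limit, which is why a benchmark frame rate says more about throughput than about how fast you can feed pixels in.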
absurd bandwidth? try this on for size: the Square Kilometre Array does online (“realtime” loses its meaning when the data processing latency gets long enough) interferometry on millions of antennas, achieving an OUTPUT data rate of a few TB/s, which is then filtered down to a paltry 1 TB/s by an exaflop computer. it’s a mind-blowing instrument, and as a public science project there is lots of fun stuff to read about it :)