Researchers Move Closer to Completely Optical Artificial Neural Network

sytelus · on Aug 10, 2018

Tip: When you read articles that says "researchers might have..." or "researchers get closer to...", read them as follows:

These folks ran out of their funding and need to renew their grants. For that purpose here's their progress report and no, tech is still quite far away. Watch out for same article title next year same time.

To be clear, there is nothing wrong with this. Some progress takes time and funders just need to know that folks are working hard at it.

OscarCunningham · on Aug 10, 2018

I think it has more to do with the way that researchers describe their projects to journalists. Whatever you're working on is to complicated to explain properly, but you can at least say what it might eventually be useful for. So you tell that to the journalist, and (if they are good) the journalist publishes "A step towards X" (if they're a bad journalist you'll see "Scientists discover X").

etaioinshrdlu · on Aug 10, 2018

Can it compute anything non linear? I.e. can it actually have any activation function?

ovi256 · on Aug 10, 2018

Without a non-linear activation function, it wouldn't be an ANN, because multiple linear layers are equivalent to a single layer applying the composition of their transforms.

The article gives no clue about this part. I have no idea if the optical domain can compute non-linear functions.

jf- · on Aug 10, 2018

If you want a non-linear function that can be produced optically, consider an evanescent wave[1], as used in TIRF microscopy. Whether this is suitable in practise as an activation function I couldn’t say, though the curve does to my eye look like something that could be used in place of relu.

[1] https://en.m.wikipedia.org/wiki/Evanescent_field

mmf · on Aug 10, 2018

Optical computation will never become relevant at scale. There are fundamental reasons for this: first, particle size. A photon at usable wavelengths is extremely large, much larger of any modern electron based _devices_ This makes it imossible to scale to usable density. Second, optic-optic (as opposed to electro-optic) non linear effects are based on interaction with electrons, in particular with electron decay from an energy state to another which is tipically extremely slow.

nabla9 · on Aug 10, 2018

Isn't optical Kerr effect (separate from Kerr electro-optic and Magneto-optic Kerr effect) pure optic-optic nonlinearity without electrons?

taneq · on Aug 10, 2018

"If an elderly but distinguished scientist says that something is possible, he is almost certainly right; but if he says that it is impossible, he is very probably wrong." - Arthur C. Clarke

nabla9 · on Aug 10, 2018

"A platitude is a trite, meaningless, or prosaic statement, often used as a thought-terminating cliché, aimed at quelling social, emotional, or cognitive unease" – Wikipedia

taneq · on Aug 10, 2018

"Touche." - Me.

EchoAce · on Aug 10, 2018

It may be my lack of knowledge about optics, but from an ML perspective this seems rather mundane if not useless. Model training involves high levels of parallelism on a large scale for difficult tasks, something I can’t see these optical chips doing. Does anyone have any further information that might enlighten me to otherwise?

stochastic_monk · on Aug 10, 2018

By performing these transformations optically, they primarily get data parallelism (like [GTV]PUs). I expect this to happen. NVIDIA’s ACDC paper provides an FFT-accelerated neural network layer (similar to deep-fried convnets), with an offhand remark that the transformations could be performed optically. I wonder what kind of information bandwidth they can get, though.

dekhn · on Aug 10, 2018

Physicists were using optical lenses to do approximate FFTs over a hundred years ago.

drumttocs8 · on Aug 10, 2018

Can you use something like wavelength division multiplexing to get different data streams and achieve parallelism there?

d--b · on Aug 10, 2018

I can't read the article as it's behind a paywall, but if you can make a chip that's 100% optical, then it means that when you beam your input data at the input end of the chip, you _instantly_ get the output at the end. No need for cycles for multiplying, adding and so on. Plus it wouldn't heat up like silicon does.

p1esk · on Aug 10, 2018

you _instantly_ get the output

That's not how physics works, unfortunately.

lawlessone · on Aug 10, 2018

Makes me think of the positronic brains from Asimovs books.

bra-ket · on Aug 10, 2018

related: "All-optical machine learning using diffractive deep neural networks" https://news.ycombinator.com/item?id=17698135