> According to LightOn, its Appliance can reach a peak performance of 1.5 PetaOPS at 30W TDP and can deliver performance that is 8 to 40 times higher than GPU-only acceleration.
Impressive!
LightOn hasn't received much discussion on here before. Some links have been submitted, and this is the only one I could find comments on: https://news.ycombinator.com/item?id=27797829
From the website of the manufacturer [1] it appears that the co-processor is essentially an analog computer for matrix-vector multiplications. I am quite sceptical about the accuracy and value range of the computations. Even puny single-precision floating point operations are accurate to something like 7 decimal digits and have a dynamic range of hundreds of dB. According to the spec sheet, the appliance only uses 6-bit inputs and 8-bit outputs, so the relative errors are probably on the percent level. This makes it hard to believe that any signal will propagate through something like a DNN without completely drowning in noise.
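A quick back-of-the-envelope check of those numbers (a rough Python sketch; the 6-bit/8-bit figures come from the spec sheet as quoted, the rest is just standard quantization arithmetic):

    import math

    # Uniform quantization to b bits: the worst-case rounding error is half a
    # step, i.e. 1 / 2^(b+1) of full scale.
    def worst_case_quant_error(bits):
        return 1.0 / 2 ** (bits + 1)

    print(f"6-bit inputs:  ~{worst_case_quant_error(6):.2%} of full scale")   # ~0.78%
    print(f"8-bit outputs: ~{worst_case_quant_error(8):.2%} of full scale")   # ~0.20%

    # float32 has a 24-bit significand, i.e. roughly 7 decimal digits.
    print(f"float32: ~{24 * math.log10(2):.1f} decimal digits")

So yes, relative errors on the order of a percent of full scale, versus roughly seven digits for float32.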
> Even puny single-precision floating point operations are accurate to something like 7 decimal digits and have a dynamic range of hundreds of dB. According to the spec sheet, the appliance only uses 6-bit inputs and 8-bit outputs, so the relative errors are probably on the percent level. This makes it hard to believe that any signal will propagate through something like a DNN without completely drowning in noise.
And there have already been successful experiments with stronger quantization, like 8-bit neural nets, or even 1-bit (!) neural nets. There is a lot of evidence that neural networks can be very resilient to quantization noise.
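As a toy illustration of that resilience (a numpy sketch of plain post-training quantization on a made-up random MLP, nothing to do with LightOn's actual hardware):

    import numpy as np

    rng = np.random.default_rng(0)

    def quantize(w, bits=8):
        # Symmetric uniform quantization of a weight matrix to `bits` bits.
        scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
        return np.round(w / scale) * scale

    def mlp(x, weights):
        for w in weights[:-1]:
            x = np.maximum(x @ w, 0.0)   # ReLU hidden layers
        return x @ weights[-1]           # linear output layer

    weights = [rng.standard_normal((64, 64)) / 8 for _ in range(4)]
    x = rng.standard_normal((32, 64))

    y_full = mlp(x, weights)
    y_quant = mlp(x, [quantize(w) for w in weights])

    rel_err = np.linalg.norm(y_quant - y_full) / np.linalg.norm(y_full)
    print(f"relative output error with 8-bit weights: {rel_err:.3%}")

In runs like this the output error typically comes out around a percent or so, which is small compared to the decision margins most classifiers work with.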
I think it's hilarious. When I was in school, microprocessors were 8-bit. Then as the world got digitized it was 16-bit microprocessors, at least in my narrow world. Then 32-bit and floats came along, and finally double floats. Each step was a big product effort and launch, sold as better technology. And just as we finally got past the need to always make both 32-bit and 64-bit versions of everything, we turn around and head back down. Though I saw someone (Qualcomm?) with a press release about their next-gen microprocessor supporting 32-bit floats! Not sure if it counts as another U-turn or they're just a straggler.
I think you may be improperly conflating address or pointer width with the data types used for integer or floating-point arithmetic—we had floats up to 80 bits with "16-bit" x86 processors. Nobody's moving backwards in terms of pointer size. And any history of the progression of bit widths is incomplete without mentioning the SIMD trend that PCs have been on since the mid-90s, starting with 64-bit integer vectors and culminating in today's 512-bit vector instructions that re-use those original 64-bit vector registers as mask registers.
I don't think there's any point at which PCs have ever abandoned 8-bit and 16-bit data types, or ever appeared to be on a trajectory to do so. We've just had some shifts over the years of what kinds of things we use narrower data types for.
That's product differentiation: 32 bits is enough for mass-market GPUs for gamers. They figure the scientific-computing market has more money, so they charge a ton extra if you want decent 64-bit performance, even though it's basically the same hardware. The AMD Radeon VII is an exception to that: no longer made but not really obsolete yet, and you can still find them.
These 1-bit-per-coefficient neural nets need a pretty good floating-point implementation to train them. From what I remember, they are trained with rounding: basically, a floating-point weight gets rounded to -1, 0 or 1, and the computed floating-point gradient is added to the floating-point weight.
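A rough sketch of that training trick (the straight-through-estimator style the parent describes; the toy logistic-regression task, threshold, and learning rate are all made up for illustration):

    import numpy as np

    rng = np.random.default_rng(1)

    # Full-precision "shadow" weights are kept around for training...
    w_float = rng.standard_normal((16, 1)) * 0.1
    x = rng.standard_normal((256, 16))
    y = (x[:, :1] > 0).astype(float)          # toy binary target

    def ternarize(w, threshold=0.05):
        # ...rounded to -1, 0 or +1 for the forward pass.
        return np.sign(np.where(np.abs(w) > threshold, w, 0.0))

    lr = 0.1
    for step in range(200):
        w_tern = ternarize(w_float)
        pred = 1.0 / (1.0 + np.exp(-x @ w_tern))   # forward pass with ternary weights
        grad = x.T @ (pred - y) / len(x)           # float gradient, computed as usual
        w_float -= lr * grad                       # ...added to the float weights

    print("learned ternary weights:", ternarize(w_float).ravel())

The rounding step has no useful gradient of its own, so the update just pretends it isn't there and applies the float gradient to the float copy.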
I think you missed the fact that we were talking about neural networks, not an animal brain self-replicating and branching and competing with itself for billions of years until it becomes aware of itself. Give us the same time.
The earliest and simplest brains were still useful. Even insects can fly around in 3D space; I doubt you need something as complicated as a mouse brain to run a self-driving car, let alone a drone.
I’m not sure how much it matters given this thread looks like it’s going off on several successive tangents, but the important (and hard) thing with a self-driving car is making sure it doesn’t hit stuff, not the actual driving part.
And for drones, I trivially agree: Megaphragma mymaripenne has 7,400 neurones, compared to the 71/14 million in a house mouse nervous system/brain.
Best estimate I have for total cell numbers in the mouse brain: about 75 million neurons and 35 million cells of other types. This estimate is from a 476 mg brain of a C57BL/6J case, the standard mouse used by many researchers.
Based on much other work with discrete neuron populations, the range among different genotypes of mice is probably +/- 40%.
For details see: www.nervenet.org/papers/brainrev99.html
Expect many more (and I hope better) estimates soon from Clarity/SHIELD whole-brain lightsheet counting with Al Johnson and colleagues at Duke and the team at Life Canvas Tech.
If there’s too much noise just lower the dropout probability from 50 to 30%. ;)
Joking aside, it is interesting how much noise and quantization these neural networks can work with. I think there’s a lot of room for low precision noisy computation here.
It's not a transmission line, though; SNR does not apply in the same way. It's more like CMOS, where the signal is refreshed at each gate. Each stage of an ANN applies some weights and an activation. You can think of each input as a vector with a true value plus some noise. As long as that feature stays within some bounds, it is going to represent the same "thought vector".
It may require some architecture changes to make training feasible, but it's far from a nonstarter.
And that is only considering backprop learning. The brain does not use backprop, and has way higher noise levels.
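A quick numpy sketch of that "refreshed at each stage" intuition (a made-up 10-layer random tanh network, not specific to any real model): inject a few percent of noise at every stage and check how much the direction of the activation vector moves.

    import numpy as np

    rng = np.random.default_rng(2)
    layers = [rng.standard_normal((128, 128)) * 0.9 / np.sqrt(128) for _ in range(10)]

    def forward(x, noise_std=0.0):
        for w in layers:
            x = np.tanh(x @ w)                                  # bounded activation
            x = x + noise_std * rng.standard_normal(x.shape)    # per-stage noise
        return x

    x0 = rng.standard_normal(128)
    clean = forward(x0)
    noisy = forward(x0, noise_std=0.02)     # ~2% noise injected at every stage

    cos = clean @ noisy / (np.linalg.norm(clean) * np.linalg.norm(noisy))
    print(f"cosine similarity after 10 noisy layers: {cos:.3f}")

With bounded activations the perturbations don't compound into anything dramatic; in runs like this the cosine similarity typically stays up around 0.95 or better, i.e. the "thought vector" keeps pointing the same way.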
I think the parent was referring to the same noise that you are, compute precision, not transmission, and was suggesting that perhaps it won’t easily stay within bounds due to the fact that some kinds of repeated calculations lose more precision at every step.
Maybe it’s application dependent, maybe NNs or other matrix-heavy domains can tolerate low precision much more easily than scientific simulations. It certainly wouldn’t surprise me if these “LightOPS” processors work well in a narrow range of applications, and won’t improve or speed up just anything that needs a matrix multiply.
But that is what I'm saying: if your vectors are reduced to 8-bit scalar components, you can only represent 256x256x256 worth of detail in the world (it doesn't need to be linear, but that's still really limited detail).
I’m totally out of my expertise here, but I have a question: from my understanding, ray tracing is primarily used for lighting/shadows/reflections. Wouldn’t it be OK for something like shadows to be inaccurate, maybe with some sort of amalgamation over frame refreshes? Real-world light is messy anyway. I’m talking about a game-type scenario, not something scientific.
Maybe another way to ask it: we’re trying to simulate a real-world “analog” scene, so maybe using an analog processing technique could actually be quite faithful at generating it?
Or not. Like I said I don’t understand too much of this.
Think about it like this: you have a spaceship model that fits in 256x256x256 m. To get the maximum resolution while still fitting in 8 bits, you would make each axis in 1 m increments, giving you 256 values, so you can't have sub-1 m details in the geometry. Floating point is different because you have an exponent, so it's not linear: you can technically have a larger scale, but you sacrifice even more precision in the mantissa.
Now I'm not sure what this 6/8-bit precision or analog precision means, so I can't say with confidence, but if your scalars are that low-precision you can't really do much. You could technically encode things with some fancy tricks, like storing the delta from the previous vertex instead of absolute coordinates for each one, but I think this wouldn't work if the device is just some dumb analog matrix multiplier with baked logic.
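To put a number on the resolution argument (a tiny numpy sketch; the 256 m bounding box is just the spaceship example above):

    import numpy as np

    # Model bounding box: 256 m per axis, coordinates stored as 8-bit integers.
    extent_m = 256.0
    bits = 8
    step = extent_m / 2 ** bits
    print(f"smallest representable step: {step} m")      # 1.0 m

    # Quantizing a vertex: anything finer than the step size is lost.
    vertex = np.array([12.34, 200.07, 63.501])
    quantized = np.round(vertex / step) * step
    print("original: ", vertex)
    print("quantized:", quantized)                       # [ 12. 200.  64.]
    print("error:    ", np.abs(vertex - quantized))      # up to 0.5 m per axis

So with straight 8-bit coordinates, every vertex snaps to a one-metre grid; tricks like delta encoding only help if the hardware lets you do the extra bookkeeping.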
Also, having low-detail shadows creates visual artifacts; see this for example [1]
Thank you for the reply. The article didn’t give details on how it actually works, so I went directly to the company’s site and found this [1]
> … leverages light scattering to perform a specific kind of matrix-vector operation called Random Projections.
> … have a long history for the analysis of large-size data since they achieve universal data compression. In other words, you can use this tool to reduce the size of any type of data, while keeping all the important information that is needed for Machine Learning.
So it sounds like the whole point is to reduce the data size and then feed it into GPUs like normal ML. Kind of a neat idea.
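For anyone curious what "random projections" means in practice, here's a minimal plain-numpy sketch of the general idea (not LightOn's actual optical pipeline; their device realizes the random matrix physically via light scattering, and the dimensions here are made up):

    import numpy as np

    rng = np.random.default_rng(4)

    d, k, n = 10_000, 512, 100       # original dim, projected dim, number of samples
    X = rng.standard_normal((n, d))

    # Fixed random projection matrix; the 1/sqrt(k) scaling approximately preserves
    # pairwise distances (Johnson-Lindenstrauss lemma).
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    Y = X @ R                        # compressed features: 10,000 -> 512 dims

    i, j = 0, 1
    print("original distance: ", np.linalg.norm(X[i] - X[j]))
    print("projected distance:", np.linalg.norm(Y[i] - Y[j]))

Because distances and inner products are roughly preserved, you can train the downstream model on the much smaller Y instead of X, which is where the speedup is supposed to come from.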
I've never heard of LightOn, and wish the website had a bit more concrete info on the specifics of the coprocessor, but I am somewhat familiar with a similar photonic coprocessor made by NTT (the Coherent Ising Machine). It's still in the research stage, the logic uses interferometry effects, and requires kilometers of fiber optic cables. Interestingly, there is a simulator based on mean field theory that runs on GPUs and FPGAs(*) that can solve some problems (e.g. SAT) with close to state of the art performance.
(*) disclosure: my company helped build the simulator
Relief is on the horizon. Ethereum should switch to proof of stake in June 2022 and you are about to see an unholy torrent of used GPUs hit the market. I would expect you can pick up any you like for peanuts then.