> According to LightOn, its Appliance can reach a peak performance of 1.5 PetaOPS at 30W TDP and can deliver performance that is 8 to 40 times higher than GPU-only acceleration.
Impressive!
LightOn hasn't received much discussion on here before. Some links have been submitted, and this is the only one I could find comments on: https://news.ycombinator.com/item?id=27797829
From the website of the manufacturer [1] it appears that the co-processor is essentially an analog computer for matrix-vector multiplications. I am quite sceptical about the accuracy and value range of the computations. Even puny single-precision floating point operations are accurate to something like 7 decimal digits and have a dynamic range of hundreds of dB. According to the spec sheet, the appliance only uses 6-bit inputs and 8-bit outputs, so the relative errors are probably on the percent level. This makes it hard to believe that any signal will propagate through something like a DNN without completely drowning in noise.
> Even puny single-precision floating point operations are accurate to something like 7 decimal digits and have a dynamic range of hundreds of dB. According to the spec sheet, the appliance only uses 6-bit inputs and 8-bit outputs, so the relative errors are probably on the percent level. This makes it hard to believe that any signal will propagate through something like a DNN without completely drowning in noise.
And there have already been successful experiments with stronger quantization, like 8-bit neural nets, or even 1-bit (!) neural nets. There is a lot of evidence that neural networks can be very resilient to quantization noise.
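To get a feel for the numbers, here is a minimal sketch (a toy per-tensor int8 quantization of my own, not how any particular framework does it) of how much 8-bit weights perturb a single layer's output:

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(512, 512)).astype(np.float32)   # full-precision weights
    x = rng.normal(size=512).astype(np.float32)

    scale = np.abs(W).max() / 127                        # one scale factor for the whole tensor
    W_q = np.round(W / scale).astype(np.int8)            # int8 weights

    y_ref = W @ x                                        # full-precision result
    y_q = (W_q.astype(np.float32) * scale) @ x           # dequantized int8 result
    print(np.linalg.norm(y_ref - y_q) / np.linalg.norm(y_ref))  # on the order of 1%, in line with the percent-level estimate upthread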
I think it's hilarious. When I was in school, microprocessors were 8-bit. Then as the world got digitized it was 16-bit microprocessors, at least in my narrow world. Then 32-bit and floats came along, and finally double floats. Each step was a big product effort and launch, sold as better technology. And just as we finally got past the need to always make both 32- and 64-bit versions of everything, we turn around and head back down. Though I saw someone (Qualcomm?) with a press release about their next-gen microprocessor supporting 32-bit floats! Not sure if it counts as another U-turn or they're just a straggler.
I think you may be improperly conflating address or pointer width with the data types used for integer or floating-point arithmetic—we had floats up to 80 bits with "16-bit" x86 processors. Nobody's moving backwards in terms of pointer size. And any history of the progression of bit widths is incomplete without mentioning the SIMD trend that PCs have been on since the mid-90s, starting with 64-bit integer vectors and culminating in today's 512-bit vector instructions that re-use those original 64-bit vector registers as mask registers.
I don't think there's any point at which PCs have ever abandoned 8-bit and 16-bit data types, or ever appeared to be on a trajectory to do so. We've just had some shifts over the years of what kinds of things we use narrower data types for.
That's product differentiation: 32 bits is enough for mass market gpus for gamers. They figure the scientific computing market has more money, so they charge a ton extra if you want decent 64 bit, even though it's basically the same hardware. AMD Radeon VII is an exception to that, no longer made but not really obsolete yet, and you can still find them.
These 1-bit-per-coefficient neural nets need pretty good floating point implementation to train them. From what I remember, they are trained with rounding - basically, a floating point weight gets rounded to -1, 0 or 1 and computed floating point gradient is added to a floating point weight.
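A minimal sketch of that scheme as I understand it (in the spirit of BinaryConnect-style straight-through estimators; the tiny squared-error example and variable names are mine):

    import numpy as np

    rng = np.random.default_rng(0)
    w_latent = rng.normal(scale=0.3, size=4)       # full-precision "shadow" weights
    x = np.array([0.5, -1.0, 2.0, 0.3])
    target, lr = 1.0, 0.05

    # one training step
    w_q = np.round(np.clip(w_latent, -1, 1))       # round to {-1, 0, +1} for the forward pass
    y = w_q @ x                                    # prediction uses the quantized weights
    grad_w = 2 * (y - target) * x                  # float gradient of the squared error,
                                                   # passed "straight through" the rounding
    w_latent = np.clip(w_latent - lr * grad_w, -1, 1)   # the update lands on the float weights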
I think you missed the fact that we were talking about neural networks, not an animal brain self-replicating and branching and competing with itself for billions of years until it becomes aware of itself. Give us the same time.
The earliest and simplest brains were still useful. Even insects can fly around in 3D space; I doubt you need something as complicated as a mouse brain to run a self-driving car, let alone a drone.
I’m not sure how much it matters given this thread looks like it’s going off on several successive tangents, but the important (and hard) thing with a self-driving car is making sure it doesn’t hit stuff, not the actual driving part.
And drones, trivially agree: Megaphragma mymaripenne has 7400 neurones, compared to the 71/14 million in a house mouse nervous system/brain.
The best estimate I have for total cell numbers in a mouse brain is about 75 million neurons and 35 million cells of other types. This estimate is from a 476 mg brain of a C57BL/6J case, the standard mouse used by many researchers.
Based on much other work with discrete neuron populations, the range among different genotypes of mice is probably +/- 40%.
For details see: www.nervenet.org/papers/brainrev99.html
Expect many more (and I hope better) estimates soon from Clarity/SHIELD whole-brain lightsheet counting with Al Johnson and colleagues at Duke and the team at Life Canvas Tech.
If there’s too much noise just lower the dropout probability from 50 to 30%. ;)
Joking aside, it is interesting how much noise and quantization these neural networks can work with. I think there’s a lot of room for low precision noisy computation here.
It's not a transmission line though, SNR does not apply in the same way. It's more like CMOS where the signal is refreshed at each gate. Each stage of an ANN applies some weight and activation. You can think of each input vector as a vector with a true value plus some noise. As long as that feature stays within some bounds, it is going to represent the same "thought vector".
It may require some architecture changes to make training feasible, but it's far from a nonstarter.
And that is only considering backprop learning. The brain does not use backprop, and has way higher noise levels.
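A quick numerical sanity check of that intuition (a random 3-layer ReLU network, which is my stand-in, with roughly 1% relative noise injected after every stage):

    import numpy as np

    rng = np.random.default_rng(0)
    Ws = [rng.normal(size=(256, 256)) / np.sqrt(256) for _ in range(3)]

    def forward(x, noise=0.0):
        for W in Ws:
            x = np.maximum(W @ x, 0)                                  # linear stage + ReLU
            if noise:
                x = x + noise * np.abs(x) * rng.normal(size=x.shape)  # per-stage compute noise
        return x

    x = rng.normal(size=256)
    clean, noisy = forward(x), forward(x, noise=0.01)
    print(np.linalg.norm(clean - noisy) / np.linalg.norm(clean))      # typically a few percent, not an avalanche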
I think the parent was referring to the same noise that you are, compute precision, not transmission, and was suggesting that perhaps it won’t easily stay within bounds due to the fact that some kinds of repeated calculations lose more precision at every step.
Maybe it’s application dependent, maybe NNs or other matrix-heavy domains can tolerate low precision much more easily than scientific simulations. It certainly wouldn’t surprise me if these “LightOPS” processors work well in a narrow range of applications, and won’t improve or speed up just anything that needs a matrix multiply.
But that is what I'm saying: if your vectors are reduced to 8-bit scalar components, you can only represent 256x256x256 worth of detail in the world (it doesn't need to be linear, but the detail is still really limited)?
I’m totally out of my expertise here but I have a question - from my understanding ray tracing is primarily used for lighting/shadows/reflections- wouldn’t it be OK for something like shadows to be inaccurate- maybe some sort of amalgamation over frame refreshes? Real world light is messy anyway. I’m talking about a game type scenario not something scientific.
Maybe another way to ask is- we’re trying to simulate a real world “analog” scene - maybe using an analog processing technique could actually be quite faithful for generating it?
Or not. Like I said I don’t understand too much of this.
Think about it like this: you have a spaceship model that fits in 256x256x256 m. To get the maximum resolution while still fitting in 8 bits, you would make each axis go in 1 m increments, giving you 256 values. So you can't have sub-1m details in the geometry. Floating point is different because you have an exponent, so it's not linear: you can technically have a larger scale, but you sacrifice even more precision in the mantissa.
Now I'm not sure what this 6/8-bit precision or analog precision means, so I can't say with confidence, but if your scalars are that low precision you can't really do much. You could technically encode things with some fancy tricks, like storing the delta from the previous vertex instead of absolute coordinates for each one (see the sketch below), but I think this wouldn't work if the device was just some dumb analog matrix multiplier with baked-in logic.
Also having low detail shadows creates visual artifacts, see this for example [1]
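To make the 8-bit geometry argument concrete, a toy sketch (the 256 m ship and the 1 m grid are just the numbers from the example above):

    import numpy as np

    rng = np.random.default_rng(0)
    verts = rng.uniform(0, 256, size=(1000, 3))    # model fits in a 256x256x256 m box

    q = np.floor(verts).astype(np.uint8)           # 8 bits per axis -> 1 m grid
    print(np.abs(verts - q).max())                 # up to ~1 m of positional error per axis

    # the "fancy trick" mentioned above: store deltas between consecutive vertices,
    # so locally small details need fewer bits, though the hardware then has to be
    # able to accumulate the deltas
    deltas = np.diff(verts, axis=0, prepend=verts[:1])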
Thank you for the reply. The article didn't give details on how it actually works, so I went directly to the company's site and found this [1]
> … leverages light scattering to perform a specific kind of matrix-vector operation called Random Projections.
> … have a long history for the analysis of large-size data since they achieve universal data compression. In other words, you can use this tool to reduce the size of any type of data, while keeping all the important information that is needed for Machine Learning.
So it sounds like the whole point is to reduce the data size and then feed it into GPUs like normal ML. Kind of a neat idea.
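For anyone curious what "reduce the size of any type of data while keeping the important information" looks like in practice, here's a tiny sketch of a random projection (a plain Gaussian matrix on the CPU, my stand-in for whatever the optics actually implement):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10_000))                  # 100 samples, 10,000 features
    W = rng.normal(size=(10_000, 256)) / np.sqrt(256)   # fixed random projection matrix
    Y = X @ W                                           # compressed to 256 features

    # pairwise distances are approximately preserved (the Johnson-Lindenstrauss idea),
    # which is why downstream ML can still work on the compressed data; the two
    # numbers below should agree to within roughly 10%
    print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(Y[0] - Y[1]))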
I've never heard of LightOn, and wish the website had a bit more concrete info on the specifics of the coprocessor, but I am somewhat familiar with a similar photonic coprocessor made by NTT (the Coherent Ising Machine). It's still in the research stage, the logic uses interferometry effects, and requires kilometers of fiber optic cables. Interestingly, there is a simulator based on mean field theory that runs on GPUs and FPGAs(*) that can solve some problems (e.g. SAT) with close to state of the art performance.
(*) disclosure: my company helped build the simulator
Relief is on the horizon. Ethereum should switch to proof of stake in June 2022 and you are about to see an unholy torrent of used GPUs hit the market. I would expect you can pick up any you like for peanuts then.
Seems like you can try it out [1]. I find it a bit funny that it is easier to get trial access to a quantum CPU or a light-based GPU than to Cerebras or Graphcore, where trial access means spending thousands of dollars.
The only actual hardware description I could find was in this arXiv link, where you have a laser that is spread, then put through a light-gate chip (as found in projectors), then a random mask, and then onto a CCD camera. This does random multiplies, in parallel, which somehow prove useful.
I'm not super well versed on this topic, but the basic physics of photons and electrons makes light a poor medium for computing compared to electricity. Photons don't interact with each other much when their paths cross, while electrons interfere with each other strongly. It's really difficult to build logic gates out of photons, while electrons work great with semiconducting materials for building gates.
Light is good for communication: its non-interfering nature lets you pump up the bandwidth. There's a saying that light is great for data transmission while electricity is great for computing.
So when people claim making supercomputer out of optical computing, take it with a grain of salt.
Photons may not interact with themselves much, but they interact with other materials in significant and useful ways, and there are plenty of materials that change their optical properties in response to an electric field (which can be either electronic or optical in origin, given that light, particularly coherent light can have fairly strong electric fields).
There are millions of things that an electronic digital computer can do that are unlikely to be replaced by photonics, but a hybrid approach may offer advantages in computing specific things. As we have slowed down getting perf/watt advantages from shrinking processes, more and more specialized hardware has been used for performing calculations. It's not that far-fetched to think that photonics might have a niche where it has performance advantages.
You're right that photons interact with many materials. However, few of those materials exhibit semiconducting properties useful for building logic gates.
Silicon has a near-perfect energy band gap: valence-band electrons can jump to the conduction band as free electrons and allow current to flow. The gap is neither so small that it introduces ambiguity nor so big that it takes too much energy to switch from the non-conductive to the conductive state. Photonics needs to find a semiconducting material/alloy that beats silicon to be a viable option.
Most research in building logic gates with light uses a combination of light and electricity. The most promising recent approach uses a Josephson junction in a superconducting current loop. A photon hitting the loop adds energy to the superconducting current. With enough energy the current reaches the critical current, which moves the Josephson junction from zero voltage to a finite voltage. The raised voltage gives off the energy and the current falls back below critical. Continuous photons hitting the loop cause the Josephson junction to produce an extremely high-frequency AC voltage. That's a photon-controlled gate.
But the research is still really early and it requires low-temperature superconductivity. It's still a long way from being competitive in practice.
I think we are both in agreement that photonics will not replace silicon for logic gates (at least with current technology). However, both in what I was talking about and in the specific example from TFA, photonics is being used for analogue computing, which does not necessarily require logic gates.
The simplest example would be AM modulating a laser with a signal and passing the result through a prism. This calculates a Fourier transform on the signal. This is not a great example because, for most domains, the output is too noisy to be of great use and ADCs/DSPs can do this fairly efficiently already. I believe the computer mentioned in TFA involves matrix projections.
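For a digital flavour of what that prism demo computes, here's a quick sketch (the carrier and modulation frequencies are arbitrary): amplitude-modulate a carrier and look at its spectrum.

    import numpy as np

    fs, f_carrier, f_signal = 10_000, 1_000, 50   # sample rate, carrier and modulation tones (Hz)
    t = np.arange(0, 1, 1 / fs)
    am = (1 + 0.5 * np.cos(2 * np.pi * f_signal * t)) * np.cos(2 * np.pi * f_carrier * t)

    spectrum = np.abs(np.fft.rfft(am))
    freqs = np.fft.rfftfreq(len(am), 1 / fs)
    print(sorted(freqs[spectrum.argsort()[-3:]]))  # ~[950, 1000, 1050] Hz: the carrier plus two sidebands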
As another commenter said, the magic happens not just between photons, but between photons and the non-linear optical materials in which they travel.
While not directly related to computing, I was fascinated to learn that lasers routinely use crystals to cut the wavelength of the light in half ([1], [2]).
If that "1 task" happens to be performing matrix multiplication (or even merely fused multiply add), you can do a heck of a lot with that. You still need digital circuitry to support the IO, but the key idea is doing linear algebra in a way that is faster and/or generates less heat per unit compute.
Not a stupid question. Economically, memory density will hit a brick wall soon. Developers should prefer to waste time and save space, since parallel computation will not hit a similar limit in the foreseeable future. Memory-to-core ratio is going to be falling.
TLDR is you can't. For a very simple example, storing the products of all 3x3 matrices * length 3 vectors in Float16 precision would take 2^193 bytes (which is obviously impractical).
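The arithmetic behind that, for anyone who wants to check (the bytes-per-entry figure is my assumption; the 2^193 above implies a more compact encoding, and the constant factor doesn't change the conclusion):

    # key = one 3x3 Float16 matrix (9 values) plus one length-3 Float16 vector (3 values)
    input_bits = (9 + 3) * 16          # 192 bits of key
    entries = 2 ** input_bits          # 2^192 distinct inputs to tabulate
    bytes_per_entry = 3 * 2            # each product is a length-3 Float16 vector
    total = entries * bytes_per_entry
    print(f"~2^{total.bit_length() - 1} bytes")   # ~2^194 bytes, hopelessly more than all storage on Earth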
I remember hearing about Lifi [1] years ago and thinking we would see it everywhere but this has not been the case. I wonder what has held it back.
Like this supercomputer, LiFi promised unmatched speeds that WiFi cannot deliver. Very neat to see the field is still advancing.
That page is amateurish and riddled with absurd statements, like the claim that visible light travels faster than radio waves. It also claims LiFi offers speeds "100 times faster than WiFi", but their thousand-dollar router can only do 100 Mbps. To add insult to injury, it rips off the Wi-Fi trademark logo as well.
There may be a market for visible light networking but LiFi is total garbage.
It seems that light reflected off walls can achieve up to 70 Mbit/s in a controlled environment. But yeah, it's still hard to think about direct applications in our lives.
There are many use cases that don't require you to be connected 24/7. While driving under a street light, you could download a movie in seconds, then go on your way, loading your car's infotainment with maps and several movies for your kids in the back or whatever. I haven't looked into the tech in years, but the promise of faster downloads and a city-wide array of street lights made it seem like the future.
Aperture fouling (dirt / grime) seems like it would make such a service flaky. Aside from needlessly low data caps, the cell network is pretty great for road trip connectivity. High bandwidth intermittent connectivity would be nice for self-driving car sensor log offload.
Inside my home it would be cool to have 10 gbps to my laptop, but I don't have a real use case where that's meaningfully better than the 500-800 mbps that I already get with WiFi.
The sensor could be mounted behind your windshield, which could be cleaned effortlessly since every car already has windshield washers built in. I don't see that being an issue.
Maybe, but the other side also needs to be cleaned. And the car side TX won't be anywhere near as bright as a street lamp (and probably IR) and will be more vulnerable.
As you point out, all of this is likely surmountable, but... why bother if the status quo is serving the use case?
The flashing happens so fast it is invisible to the human eye. LED lights already do this with pulse-width modulation to make them as efficient as they are, and it is not an issue.
Usually we go to the lower-energy, longer-wavelength part of the spectrum for applications like this, since gamma rays and the high-energy, short-wavelength part of the spectrum is very hard to control and gives you cancer.
I see this "cell phones will use x-rays and gamma rays in the future" claim often enough that I wish an expert in the field would write an article on just how bad the effects would be.
There are some people who view these as a solution to Proof of Work energy use.
I’ve been researching OPUs for this purpose, as in, the concept of optical processors only came to my attention because of the people looking for an edge in mining and energy use.
From what I can tell, it could only be a stopgap or decade-long solution to PoW, as people would just hoard these over time till the energy use was the same.
I don't see how that adds up. Unless production of these light computers is extremely constrained, they are just going to be produced en masse if they provide an edge for PoW mining, and then you'll just have many more computers using the same amount of energy. And if the supply is constrained in such a way that only a tiny number of people have access to the technology, it just means you have a centralized mining pool that can easily perform a 51% attack.
It's always a zero-sum game in the end. The only way to "fix" PoW is to get rid of it.
Indeed, I should have made that clearer. 10 years is what separates the iphone 1 and the iphone 8, or the PlayStation and the PlayStation 3. If there's a breakthrough in computing technology it'll be everywhere within a couple of years IMO, and miners will be among the first ones served because they're willing to pay above market price and in bulk (see the GPU situation at the moment).
I don't understand the relevance of your articles. What would nodes have to do with the issue of who creates the nodes? Why would nodes reject valid blocks arbitrarily? Why would it be an improvement?
I'm talking about a situation where a small group of people would have access to exclusive technology that would let them mine blocks faster than the rest of the miners, with no way for others to compete without losing money. Nodes are irrelevant here unless they decide to arbitrarily reject valid blocks coming from certain miners because they'd deem them "unfair competition", but that's a huge can of worms. Who decides who goes on the list? Based on what? Could it not be trivially worked around?
Pour more computing into it (via your exclusive technology or whatever), the difficulty ramps up and you're back to square one (but you'll mine bitcoin alright).
Were you to start censoring transactions, nodes would start rejecting your blocks; back to square one.
Right, well, that's why there is a new hashing algorithm. And they want other energy-heavy cryptocurrencies to switch to it.
Basically there is a whole row going on in the far corners of crypto land, where some Cambridge and Columbia university alumni have made a new hashing algorithm (HeavyHash) that is heavy on linear operations, specifically so that it would theoretically have an advantage on light-based computers and be a sustainability solution [for now] (rough sketch of the general idea below).
They stood up an example project back in May and forgot about it ("optical bitcoin", oBTC), but people kept mining it, just on GPUs and FPGAs, since the low-power, better processors don't exist yet, so there isn't really anything different at the moment. Because these are students with no funding, the management is very weak.
There is at least one fork of that project that has better management ("photonic bitcoin", pBTC). But they are waiting for OPUs to exist at all, as there are a variety of vaporware companies out there with massive funding.
HeavyHash itself is being used in more and more newly launched cryptocurrencies, all hoping for OPUs to become available to actually make their networks different from the others.
So far they all only aspire to be examples for the other major energy-consuming cryptocurrencies we've heard of to switch to. The university students submitted a proposal (BIP) to Bitcoin Core, and that slow-moving network typically needs real-world examples and fear of missing out for consensus to shift.
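For flavour, a rough sketch of the general idea behind a "matmul-heavy" PoW hash: a matrix-vector product sandwiched between ordinary hashes, so hardware that is good at linear algebra (like an OPU) would get the edge. This illustrates the concept only, not the actual HeavyHash specification (the matrix size, nibble packing and shifts here are made up):

    import hashlib
    import numpy as np

    rng = np.random.default_rng(0)                 # stand-in for a per-block pseudo-random matrix
    M = rng.integers(0, 16, size=(64, 64))         # 64x64 matrix of 4-bit values

    def matmul_heavy_hash(header: bytes) -> bytes:
        d = hashlib.sha3_256(header).digest()                        # first hash, 32 bytes
        v = np.array([b >> 4 for b in d] + [b & 0x0F for b in d])    # 64 input nibbles
        p = (M @ v) >> 10                                            # the "heavy" linear step, back to 4 bits
        packed = (p[0::2] << 4) | p[1::2]                            # repack 64 nibbles into 32 bytes
        mixed = bytes(int(a) ^ int(b) for a, b in zip(d, packed))    # mix with the original digest
        return hashlib.sha3_256(mixed).digest()                      # second hash

    print(matmul_heavy_hash(b"block header goes here").hex())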
The whole point of PoW is that it requires a given amount of effort at a given level of difficulty to maintain a given level of production. (i.e. it's a function of the capital cost of the equipment and the operational cost of running it) All else being equal, if the amount of effort to produce a given result is reduced it will result in an increased level of difficulty netting out to a similar level of energy use.
This is the reason that Bitcoin, for example, keeps ratcheting up the difficulty: to counteract the increased performance of CPUs, then GPUs, then FPGAs and finally ASICs over time. It's an arms race that you can't 'win' for any extended period of time since the difficulty is not a constant, but rather determined by the desired level of production.
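A minimal sketch of the feedback loop being described (simplified Bitcoin-style retargeting; real Bitcoin works on a 256-bit target and clamps each adjustment to a factor of 4):

    def retarget(difficulty, actual_timespan_s, expected_timespan_s=2016 * 600):
        # Bitcoin aims for 2016 blocks per ~two weeks (600 s per block).
        # If blocks arrived faster than intended (more hashpower), difficulty goes up.
        return difficulty * expected_timespan_s / actual_timespan_s

    # hashpower doubles -> the 2016 blocks arrive in half the time -> difficulty doubles,
    # and the energy spent per block drifts back to where it was
    print(retarget(1.0, actual_timespan_s=2016 * 300))   # 2.0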
This is something that I appreciate about ethash (Ethereum's current PoW algo)... it is tied to GPUs (and memory controllers). One side effect is that older GPUs tend to drive higher ROI numbers vs. an arms race of always having the latest smaller nm ASIC technology.
Oh, and CPUs don't work well because that drives people to create botnets. FPGAs kind of sit out there on an esoteric, expensive, and difficult-to-run island all by themselves.
From what I understand, most of the energy loss of our current computers is not due to using irreversible logic, but rather the actual resistance of the materials.
Once we have better materials or different paradigms like photonic computing, then it would make sense to implement the Toffoli gate (https://en.wikipedia.org/wiki/Toffoli_gate), the Fredkin gate (https://en.wikipedia.org/wiki/Fredkin_gate), and whatnot. Btw, Fredkin and Toffoli, and the rest of the kings of reversible computing, were big advocates of cellular automata ;-)
The LightOn device accelerates random linear projections. Its input is a vector x and its output is a vector W*x, where W is a fixed random matrix. This is useful for a class of machine learning algorithms, but it is not frequently used in deep learning.
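A CPU-side sketch of that computational model, with the 6-bit input / 8-bit output precision mentioned upthread bolted on (just an illustration of the math, not LightOn's actual API or calibration):

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_out = 1024, 2048
    W = rng.normal(size=(n_out, n_in))             # fixed random matrix, the device's "weights"

    def opu_like_transform(x):
        x_q = np.round(np.clip(x, 0, 1) * 63) / 63          # ~6-bit input quantization
        y = W @ x_q                                          # the random linear projection
        y = (y - y.min()) / (y.max() - y.min() + 1e-12)      # normalize before readout
        return np.round(y * 255) / 255                       # ~8-bit output quantization

    y = opu_like_transform(rng.random(n_in))
    print(y.shape)   # (2048,), a compressed-then-expanded random feature vector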