Groq, a Stealthy Startup Founded by Google’s TPU Team, Is Raising $60M (crunchbase.com)
115 points by exotree on Sept 6, 2018 | 80 comments



Hardware is hard. For Groq to be successful:

- The boards using their chips need to fit into commodity interfaces (PCIe? DIMM?) in Open Compute hardware

- Someone needs to buy hundreds of thousands of these to even minutely impact their bottom line

- Multiple such big volume wins need to consistently happen.

- Their IP needs to actually be defensible. Otherwise, an Intel/Samsung, whose manufacturing prowess & channel reach are multiple times that of Groq, will undercut the pricing at almost the same performance/watt. Oh, and they'll happily play nice with Open Compute and other standards bodies.

- Most of all, their product needs to work as advertised, at scale, in a reliable fashion. This is easier said than done in semiconductors, especially for the kind of performance gains they're marketing.

If they'd gotten here with ~$30M capital and demonstrated traction in the marketplace, I'd give them a chance, but expecting Google's pay while working at an independent chip startup with $0 in revenue portends financial doom. It's not the founders' fault though - hardware is capital intensive and not compatible with agile development, and an MVP just won't cut it - it needs to be fully functional & reliable right off the bat. I sincerely hope I'm wrong, as I'd like to see the silicon put back in Silicon Valley - having been in the semiconductor industry for 20 years, it just seems unlikely.


> The boards using their chips need to fit into commodity interfaces (PCIe? DIMM?) in Open Compute hardware

I mean, it's not exactly like PCIe interfaces are ultra-hard, must-be-done-in-house, cutting-edge technology anymore. And why the huge emphasis on OCP? Nothing there necessitates any kind of extreme innovation.

> Their IP needs to actually be defensible. Otherwise, an Intel/Samsung, whose manufacturing prowess & channel reach are multiple times that of Groq, will undercut the pricing at almost the same performance/watt. Oh, and they'll happily play nice with Open Compute and other standards bodies.

This has been true historically, but TSMC can currently out-manufacture both Intel and Samsung at 7nm. In addition, the steamroller of Moore's law no longer really holds, with Dennard scaling dead and wire/MOL RC and variation soaking up performance gains.

Btw, most of these criticisms could have been leveled against Nvidia way back when.


Not a criticism, merely truths about the industry.

Also, from Nvidia's founding to now, things have radically changed. It's depressing how winner-take-all semi has become. Nvidia could tape out for $300k; today it's $2M. The number of semi players has shrunk drastically even within the last 5 years.


Nvidia didn't succeed because of their hardware manufacturing. They succeeded because they were dirty and ruthless on the market, and to this day will buy or sue the living fuck out of anything that could threaten their dominance.


If you look at the spaces where ASICs are still successful, one thing that is clearly important that you don't mention is the software kit around the chip(s). Half the team (or more) will be software guys. It's not about fabbing (since these companies are fabless and typically TSMC), it's about software support.


This is a great point; AMD has had chips for quite a while, but only just announced a TensorFlow release that can use them, and it's a binary release, rather than them successfully upstreaming their changes.


They can just be acquired back by Google, or by another player, for about $10M/head.


Yep, this has Google acquisition/acqui-hire written all over it IMO.


The circle of life: Google to startup, back to Google via acquisition.


For what? Acquisition only makes sense if a market opportunity is so close that missing it will affect the bottom line. I believe companies can independently build their own TPUs for less than $60M within 2 years. And even then the volumes for these chips are ... ?

Unless millions of units are being sold/month, it might just be cheaper to have some AI acceleration done in Intel's integrated CPU+FPGA offerings [1].

[1]: https://www.nextplatform.com/2018/05/24/a-peek-inside-that-i...


>I believe companies can independently build their own TPUs for less than $60M within 2 years.

I think we have very different work histories w.r.t. the size/agility/cost of doing things/etc. of the companies that we've worked at :)


50 employees * $250,000/annum * 2 years = $25M. At 3 years, it's $37.5M. Realistically, for $35M you should have a working, reliable TPU if you're already a semi company, like Samsung.
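
As a rough sketch of that back-of-the-envelope math (headcount, salary, and the loaded-cost multiplier are all just assumptions you can argue with; the 1.4x figure below roughly matches the ~40% overhead number cited downthread):

    def team_cost(headcount=50, salary=250_000, years=2, overhead=1.0):
        # overhead=1.0 treats the salary as the fully loaded cost;
        # raise it if you think benefits/space/tooling add more on top
        return headcount * salary * years * overhead

    print(team_cost())              # 25,000,000
    print(team_cost(years=3))       # 37,500,000
    print(team_cost(overhead=1.4))  # 35,000,000

Note this counts salaries only; tape-outs, EDA licenses, and prototyping hardware come on top.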


>$250,000/annum

That is the bare minimum you'd have to pay an engineer, and I'm still not sure you can find somebody able to do hardware design in the AI space for that kind of money today.

Also - the actual cost of an employee is significantly higher than what you pay the employee. Plus everything else you need to run an R&D company, especially if it involves hardware prototyping, etc.

I mean, I can believe that a hot startup, where people take equity and see a good chance of a good exit, can probably reach some reasonable prototype (not actual product) stage on $60M. For any bigger established company it would be several times more expensive, with an order-of-magnitude higher chance of screwing up the project (due to incorrect product fit, internal politics, overall internal inefficiency, managers actually hiring a crowd of "AI hardware designers" at $250K :), etc.)

Look at Cisco - the company understands that they can't do anything new in-house, so they regularly spin off a $100M wad of cash (usually with the same 3 guys attached) and buy it back 1-2 years later for $700-800M. If Cisco attempted to invest those $800M into new tech development in-house, the chances of a good outcome would be much lower.


> >$250,000/annum

> That is the bare minimum you'd have to pay an engineer, and I'm still not sure you can find somebody able to do hardware design in the AI space for that kind of money today.

Maybe in SV. In Dallas, only top analog and digital designers are commanding that kind of salary. There's plenty of fab and board shop capacity there, too.

Also, I don't see why you specifically need "somebody able to do hardware design in [the] AI space". Experienced digital and mixed-signal designers should be sufficient.


Remember, the cost to the employer is roughly double the employee's salary, so a $250,000 per annum cost means the employee's salary is $125,000 per annum. Not exactly top dollar.


Where did you get a 100% overhead for an employer?

I've done back-of-the-envelope estimates before and have come nowhere near that. At best, it's somewhere between 15-25% depending on employer 401(k)/etc matching, deferred comp, and office space pricing.

The worst I could find is 40% - http://web.mit.edu/e-club/hadzima/how-much-does-an-employee-... - and that uses a much different cost basis (not $125k).

Where in the world did you come up with that number?


Sure, everyone has a different number. For example, in the article you linked to, there is this metric: "the fully functioning managed employee costs about 2.7 times the base salary".


The person I was replying to specifically said "pay to an engineer". That is not the fully-loaded cost.


The person he was replying to was talking about cost to design the chip, which would imply employer costs. The error has to stop somewhere.


I think a lot of next generation chips are not just squeezing more compute into hardware, but examining whether we can change the nature of neural net compute to better fit what we can do in hardware. That's Graphcore's pitch at least. And plenty of people are looking into whether you can do non-IEEE-754 math in a way that works for compute, e.g. FP16, quantized arithmetic, etc.

You probably don't need the entire team to be familiar with the math, but someone has to be.

And as other comments have noted, Software is actually going to be a very big deal.
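
For a concrete sense of what "FP16, quantized arithmetic" means above, here's a minimal NumPy sketch (purely illustrative; the symmetric scale-and-round scheme below is the textbook approach, not anything Graphcore or Groq has published):

    import numpy as np

    w = np.random.randn(4, 4).astype(np.float32)   # pretend FP32 weights

    # FP16: same arithmetic, half the bits, reduced precision and range
    w_fp16 = w.astype(np.float16)

    # INT8: store small integers plus one scale factor, do the heavy math
    # in integer units, rescale back to float at the end
    scale = np.abs(w).max() / 127.0
    w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    w_back = w_int8.astype(np.float32) * scale

    print(np.abs(w - w_fp16).max())  # tiny error
    print(np.abs(w - w_back).max())  # larger error, often fine for inference

Hardware that only needs the integer path plus a rescale can be much simpler and denser than a full FP32 pipeline, which is a big part of the accelerator pitch.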


No one gets paid 250K because they know what fp16 minifloats are


At some point it wasn't widely known that you could do NN training with fp16. Now it's common knowledge, but what is the next thing?


Hardware design with synthesis is much easier than you think and costs a lot less.


I'd add one more: they need to be able to ship new product at the cadence of Nvidia, Intel et al. Showing wins in the lab is very different from getting that product to market before the next round of improvements from competitors overtakes you.

And then you need to do it again.


I think this was more true back when Intel was in the thick of the slope of Moore’s law.

If new accelerators slip into existing hardware and don’t increase the power budget, one could get millions of installs in old and existing hardware.

If this is a beefy hot accelerator, that is a totally different story.


Maybe. I am not saying you're wrong, but we're dealing with orders-of-magnitude innovation, and Clayton M. Christensen has shown us that when it comes to disruptive tech, the status-quo standard and incumbents are not the yardstick they once were.

The Future is not what it used to be.

I stand by my "Maybe", 100% guaranteed.


- The boards using their chips need to fit into commodity interfaces

USB? There are USB GPU boxes too.


USB? USB 3.1 is 16X slower than PCIe and maybe 100X slower than DIMM. USB is not an OCP-recommended peripheral interface; it's for consumer removable products.

Then again, I'm not entirely sure about the requirement for a high-speed interface in AI. I'm assuming a fast interface is needed because the amount of data ingested is typically large.
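
For ballpark numbers (rough peak figures; the exact multiples depend on which USB/PCIe/DDR generation and how many lanes/channels you compare):

    # approximate peak bandwidth in GB/s
    links = {
        "USB 3.1 Gen 2 (10 Gb/s)": 10 / 8,      # ~1.2 GB/s
        "PCIe 3.0 x16":            16 * 0.985,  # ~15.8 GB/s
        "DDR4-2666 dual channel":  2 * 21.3,    # ~42.6 GB/s
    }
    usb = links["USB 3.1 Gen 2 (10 Gb/s)"]
    for name, bw in links.items():
        print(f"{name}: {bw:.1f} GB/s ({bw / usb:.0f}x USB)")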


People use DIMM as a general purpose interface?


No, but it can be done. I worked on an FPGA board interfacing through the DRAM interface once: https://www.researchgate.net/publication/4139277_Pilchard_-_...




Delivering an incremental speed boost vs Nvidia chips seems like a tough way to win. Even if they can be 2x faster, people will still stick to CUDA. They either have to be 5-10x faster or find a new market. Definitely possible. Would be pretty embarrassing for Nvidia if that happened.


Nope. People aren't addicted to CUDA. In fact it's hell to work with CUDA. Just to download the binaries you have to go through a registration process, the docs are shit, and version dependencies are a nightmare. The only reason CUDA is in use is because in the old days it was the only game in town and the Caffe framework integrated it from some very early code researchers wrote. Then people kept using that baseline code all the way to TF and PyTorch. Thanks to the TPU, frameworks are already being forced to be agnostic, and new alternatives will be much easier to integrate.

If the Groq chip delivers what it's promising, then you can bet it would be integrated within a few months in most frameworks and people will soon forget about CUDA. Most people who work with deep learning neither write code specific to CUDA nor care that CUDA is being used under the hood, as long as things are being massively parallelized.
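
To illustrate that point, here's what typical framework-level code looks like (PyTorch, purely as an example): the user never writes CUDA, they just ask for whatever device is available, and a new accelerator would slot in as another backend behind the same call.

    import torch

    # pick whatever accelerator is available; nothing below changes either way
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Linear(1024, 10).to(device)
    x = torch.randn(32, 1024, device=device)
    y = model(x)              # dispatched to the CUDA backend or the CPU
    print(y.shape, y.device)  # no CUDA code anywhere in user land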


> If the Groq chip delivers what it's promising, then you can bet it would be integrated within a few months in most frameworks and people will soon forget about CUDA.

Groq seems careful not to promise any price point. Even if Groq delivers every promise, if it's expensive its adoption will be chancy.


A large part of accelerating AI loads in hardware lives in the software stack, not in the chip itself.

If people can run TensorFlow loads directly on the chip via Groq's software stack, who cares about CUDA?

That will be the differentiator for this company, not their hardware manufacturing prowess.


Apparently they have a person who was involved in the TensorFlow TPU port.


Nope, as long as TensorFlow works, a lot of people won't care about CUDA.


The numbers on their website are quite stunning, if true:

- 16X more power efficient than TitanX

- 3X more ops than TitanX

- 25K images/sec vs 5K images/sec inference on Nvidia

I'm completely bewildered why Nvidia hasn't come up with deep-learning-specific chips yet that don't carry the crud of a massive rendering pipeline.

https://blog.groq.com/2017/11/09/69/


NVIDIA already created such a chip: http://nvdla.org/


If it's FP32 or FP16/32, it's interesting. If it's INT8/32 it's incrementally better than a 2080TI GPU, and if it's INT4/32, it's stillborn.


They say "TOPS" which usually refers to INT8


It depends. If the chips are for training, I agree with you. If they're for inference, I think the jury's still out.


It's an inference chip.


I'm kind of curious how many electrical engineers with the talent to design ASICs are out there. Most electrical engineers I've met who designed ASICs at one point or another were doing it mainly for military use or in the semi industry. But the teams that designed the chips were always super small compared to the mechanical or electrical teams.


If you mean full custom, very few, and most are in i/o.

If you mean synthesis, a lot.


Yeah, it seems like full-custom ASICs kind of died off a little when affordable, robust MCUs came out. Kind of nice to see it picking up again!


Founded by _former members_ (albeit founding members) of Google's TPU team.

I wonder how they negotiated this from a legal standpoint. Every employment contract I've ever signed certainly would not allow for starting a project this similar to my employer's core business.


You can’t - in California anyhow - really stop employees from quitting and then competing with you. Most forms of non-compete have been made unlawful, which is a really good thing for competitiveness.

In any case, I don’t see any confirmation that this startup is pursuing something that would be competitive with Google. The whole stealth thing leaves us with little to go on.


But companies can (and Google does) prevent current employees from working on side projects—even on their own time and equipment. So I suppose the relevant question is: did these guys leave and then start with a completely blank slate? Presumably yes! Any IP ownership ambiguity that would arise from moonlighting would have been flushed out during funding due diligence.


Is this true? Moonlighting is legal in CA [1].

[1]: https://danashultz.com/2016/05/31/moonlighting-employees-pro...


They can prevent moonlighting if it creates a conflict of interest, for example if the work is similar to what the employer does or could do. For large companies, the definition of that can amount to “every potential side project”.


Depending on the IP assignment clause in one's employment agreement, even if you are allowed to moonlight on your startup your employer may own the IP. I just checked mine and if I make any inventions related to my employer's business, then per the terms, I automatically grant the company the right to that invention.

I've heard rumors that one can negotiate these clauses. But this is what I expect would be relevant for these engineers.


They can try to enforce that, but my understanding is that your brain is generally yours outside of working hours, and stuff you dream up is also yours.

But, obviously it’s easier to quit and then do your inventing.


It seems less risky to just not moonlight, especially if that moonlighting has the potential to turn into a high-$$$ business. Day-zero IP litigation is the last thing you need to be involved with when you're trying to bootstrap a technology startup. Even if you knew with 100% certainty that you would win, you'd be bled dry fighting BigCompany's hundreds of lawyers.


> did these guys leave and then start with a completely blank slate?

Almost certainly yes, because they also need to avoid Google patents granted for TPU. Source: I investigated this area and there are lots of TPU patents.


Noncompetes are not enforceable in California, and it’s not terribly hard for experts to continue innovating past what they did at a previous employer.


And I think this is a big reason why California is home to so much innovation. It's a great example of how a strong focus on individual worker rights over corporate power provides broad social benefit.


This looks interesting. Whilst I agree with other commenters that it's hard to compete in hardware, I think there's a good niche for this product. Google isn't going to start selling TPUs, so something off-the-shelf for machine learning might make some real traction. Best case, lots of sales to cloud providers (Amazon, Microsoft, etc.) and lots of custom houses. Worst case would be acquisition by MS or Amazon.

Having said that, it's certainly true that Nvidia is tough to beat. But right now we're in a bubble: VCs will throw millions at companies and big corporations will throw billions at acquisitions. So I think it's probably a very profitable move in general.


Disclosure: I work on Google Cloud.

Local inference can be important and even a requirement, so at NEXT we announced our intent to start shipping our Edge TPUs: https://cloud.google.com/edge-tpu/


I’m still waiting patiently for those to arrive so I can order a couple



I believe they got so much funding because they came from Google, not for the product itself. They could have sold even crap hardware.


Is this competitive with Nvidia's chips? How worried should they be?


Not the least bit worried. It's not about performance, it's reliability. When you're a big cloud provider, only a miracle can get you to use a chip from a no-name company that no one else has dogfooded. Then there are the actual numbers being marketed; there's been no independent verification. Right now, this is a fancy PR puff piece.


No, Nvidia should be very worried. There is a huge uproar in the community over some of the practices Nvidia has forced, like requiring the 3X-more-expensive version of the same GPU in data centers. Even consumer GPUs are in short supply. Most people are not using the massive, complex rendering pipelines these GPUs carry, but they are paying for them in price and wattage. There is huge demand for a consumer version of TPU-like chips, and the market will eat up any similarly performing alternatives. A lot of the gain in Nvidia's revenue comes from the blockchain and deep learning segments. Much of that gain is at risk from chips like the TPU or Groq. It is quite surprising that Nvidia hasn't announced any competing product, and I hope they aren't asleep at the wheel while this big wave is about to hit the market.


> Most people are not using the massive, complex rendering pipelines these GPUs carry, but they are paying for them in price and wattage.

This is not how compute on GPUs works now, nor has it been since the G80 was released in 2006. The "massive, complex rendering pipeline" doesn't even light up.


NVIDIA should be worried, if true. That's a big if though.


Are they planning to create cryptocurrency miners?


Are they hiring?


Author of the article here. Yes, they are apparently hiring, or so some of my internet research suggests. One of the founders' LinkedIn profiles said they're mostly building in Haskell.


Thanks. They must be using www.clash-lang.org/


It's probably Bluespec instead.


> a company with a very spartan website

What does the word "spartan" mean in this context? "Serious?" "Utilitarian?" I do not know this usage of the word.

Edit (from Webster):

> 2 b often not capitalized: marked by simplicity, frugality, or avoidance of luxury and comfort. "A spartan room"


Author of the article here. If you look at the company's website, it is very, very plain.


Yeah, I like it! I just have never seen that word used in that context.


Are you a native speaker of English?


English and Mandarin, learned in parallel.


austere


Definitely seems interesting.



