The tiny corp raised $5.1M (geohot.github.io)
629 points by super256 on May 24, 2023 | 317 comments



For background, "geohot" is George Hotz, a well-known hacker / tech personality.[0]

This project fits the pattern of his previous projects: he gets excited about the currently hot thing in tech, makes his own knockoff version, generates a ton of buzz in the tech press for it, and then it fizzles out because he doesn't have the resources or attention span to actually make something at that scale.

In 2016, Tesla and self-driving cars led to his comma one project ("I could build a better vision system than Tesla autopilot in 3 months"). In 2020, Ethereum got hot and so he created "cheapETH". In 2022 it was Elon's Twitter, which led him to "fixing Twitter search". And in 2023 it's NVIDIA.

I'd love to see an alternative to CUDA / NVIDIA so I hope this one breaks the pattern, but I'd be very, very careful before giving him a deposit.

[0] https://en.wikipedia.org/wiki/George_Hotz


My opinion of geohot definitely dropped after he started tweeting how easy it would be to fix Twitter, and then he started soliciting free work. He obviously underestimated the difficulty of shipping a feature across web and mobile. Hacking a prototype is trivial. Making it work well for all platforms, fully accessible, and across all supported languages is a bigger hurdle. It just gave me the impression that he thought frontend development was trivial and he'd just be able to hack it out in a day.


Why is Twitter such a Waterloo for all these obviously accomplished people?

It seems like they've been assuming Twitter is the way it is because it was staffed by technically incompetent leftists, and if only they could apply their own get-things-done attitude and "neutral" politics, then the problem would be trivially fixable.

Where does this fallacy come from? Is it because of the illusory simplicity of the tweet format? Something like: "We just need to come up with the right algorithm and do an embarrassingly parallel run over these tiny 280-character chunks of text. How hard can that be. In my own Very Serious Day Job, I deal with oompabytes of very complex data. This tweet processing stuff should be child's play in comparison."


I think there's a certain type of person, particularly common in tech, who thinks this way about _everything_; "oh, that's way easier than what I do, how hard could it be". A kind of reverse impostor syndrome. See the cryptocurrency space; it's more or less been 15 years worth of crypto people accidentally repeating all the failures of conventional finance from the last couple of centuries, because, after all, how hard could it be?


> Technical people suffer from what I call "Engineer's Disease". We think because we're an expert in one area, we're automatically an expert in other areas. Just recognizing that helps.

https://news.ycombinator.com/item?id=10812804


I think the more interesting question is why this symptom mostly happens to "engineers".

I've seen enough engineers presume they can easily become experts in law; I haven't seen many lawyers presume they can easily become experts in engineering.

Why?


It's certainly not _just_ engineers; you see it in the hard sciences and medicine to an extent, as well. Someone recently posted to HN a study purporting to show harm caused by masks; while its authors didn't appear to include anyone with expertise in the relevant medical specialties, they did include a chemist and a veterinarian. And, if you're a fan of Matt Levine, you'll know that dentists stereotypically tend to think of themselves as being experts at high finance.

But it definitely does seem to be especially pronounced with engineers.

(NB. I am a software engineer, and not a sociologist, so, argh, this is potentially getting a bit meta.)


Re: dentists. I have a few friends who are MDs who say they went into it to "help people," and that if they "just wanted to make money," they would have been "one of those tech CEOs." But when you look at how they run their offices and finances, you see that there is very little crossover from their medical skill into business. They just assume that they would be a successful CEO.


I've noticed it as a pretty widespread phenomenon: anyone who has the subjective experience of being competent tends to think that's enough to translate to other fields.

Super common in hot takes on politics, medical contrarianism, etc.

Though it's probably true that certain fields are more predisposed to it than others.


IMO, there's a certain level of arrogance intrinsic to engineering: to build something new, you need a belief, first and foremost, that you can build it at all and, almost as importantly, build it better. Weed out all the people who don't have that belief, at least to some degree, and you end up with a disproportionate fraction of people who think that way about everything.


I can confirm I think this way about almost everything. Because I can’t see why I can’t be a lawyer, or a farmer, or a dentist given that I spend enough time to learn it.


you could also call it 'the halo effect'.


You hit the nail on the head. There are definitely these kinds of people and they are definitely highly concentrated in tech.



Perfectly describes your non-engineer neighbor or best friend when he encounters an idea he’s never heard of before.


“Reverse imposter syndrome” is a great coinage; I’m going to start using this!


Actually, on second thoughts, I should possibly have called it intruder syndrome :) (Reverse imposter syndrome could just describe Dunning-Kruger depending on which axis you're reversing...)


it's called dunning-kruger, the epidemic syndrome of silicon valley


It's definitely similar, but I think it's _subtly_ different (though it's often found in the same people).

Dunning-Kruger is, approximately "I'm good at the thing I do" (by someone who is actually incompetent).

What I'm talking about is "That thing that other people are doing is really easy; I'd be good at it" (the thing is not easy, and they would not be good at it).

If the person in the latter case actually ends up doing the allegedly easy thing, they may realise that actually they are not good at it, in which case it's not Dunning-Kruger. This is pretty common, I think; person barges in, saying "this will be easy, because I've decided the thing I'm good at is more difficult than it", admits it's not easy, and either leaves or learns. Alternatively of course they may retreat into full Dunning-Kruger; see the Musk Twitter debacle, which is _both_, say.


100 years from now, it'll be known as "the Musk-Trump complex"


I think, and, well, here's a phrase I've never used before, that may be slightly unfair on Trump. Trump does actually take 'expert' advice; he's just astonishingly bad at choosing experts (witness the amazingly weird lawyers he surrounds himself with). But he does seem to at some level realise that he doesn't know everything.


I think it's because not many non-weird folks want to work for him. If he were just 5% better at supporting those who work for him, he might increase his success rate by 50%.


having worked in his general vicinity briefly, I can attest he doesn't take any advice unless it's pushed on him. His PR people, at least, seem to be constantly manhandling/damage-controlling the man


the "Musk-Trump complex" is more commonly known as "narcissism". I think what is being discussed here is somewhat different.


fair :-)


> Dunning-Kruger is, approximately “I’m good at the thing I do” (by someone who is actually incompetent).

Nope, but I can overlook because DK is misunderstood this way by almost everyone, and the authors have caused & encouraged the misunderstanding.

Dunning and Kruger didn’t test anyone who’s actually incompetent at all! The use of that word in the paper is so hyperbolic and misleading it should have been rejected on those grounds alone. They tested only Cornell undergrads. They didn’t check whether people were good at what they do, they only checked how well people could estimate the skill of others around them. The participants had to rank themselves, and the whole mysterious question in the paper is why the ranking wasn’t perfect. (And is that a mystery, really?) It is hypothesized that DK measured nothing more than a statistical case of regression to the mean, which is well explained by having to guess how good others are: https://www.talyarkoni.org/blog/2010/07/07/what-the-dunning-...

Contrary to popular belief, DK did not demonstrate that people wildly overestimate their abilities. The primary data in the paper shows a positive correlation between self-rank and skill. There’s no reversal like most people seem to think. Furthermore, they only tested very simple skills, and at least one of them was completely subjective (ability to get a joke). Other papers have shown that no such effect occurs when it comes to complex subjects like engineering and law; people are generally quite good at knowing they didn’t major in a subject.
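
To make the regression-to-the-mean point concrete, here is a toy simulation (my own illustrative sketch, not from the paper or the linked post): when both the test score and the self-assessment are noisy measures of the same underlying skill, grouping by test quartile mechanically produces the familiar DK-style chart even though nobody in the simulation has any bias.

    import random

    random.seed(0)
    N = 10_000
    skill = [random.gauss(0, 1) for _ in range(N)]
    test = [s + random.gauss(0, 1) for s in skill]   # noisy measured performance
    guess = [s + random.gauss(0, 1) for s in skill]  # noisy self-assessment

    def to_percentile(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        pct = [0.0] * len(xs)
        for rank, i in enumerate(order):
            pct[i] = 100.0 * rank / (len(xs) - 1)
        return pct

    test_pct, guess_pct = to_percentile(test), to_percentile(guess)

    buckets = {q: [] for q in range(4)}
    for t, g in zip(test_pct, guess_pct):
        buckets[min(3, int(t // 25))].append(g)

    for q in range(4):
        avg = sum(buckets[q]) / len(buckets[q])
        print(f"test quartile {q + 1}: mean self-estimate percentile = {avg:.1f}")

The bottom test quartile "overestimates" (its mean self-estimate sits well above 12.5) and the top quartile "underestimates", purely because the two noisy measures are imperfectly correlated.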


Meaning people in SV are subject to the same cognitive biases as everyone else? Knowing how to say Dunning-Kruger doesn’t exempt one from its effects, right? The paper didn’t show less skilled people estimating their abilities to be higher than skilled people, it only showed a self eval / skill curve that has a slope less than 1.


It's because of scale.

Very complicated algorithms and mathematical proofs can still be understood by a single person, and be explored by a small number of people who all know each other. Brain surgery is done by a small team of people. These are typical "smart people" occupations.

Something as simple as Twitter still needs machinery that spans across technical skills, needs 24-hour monitoring, and needs lawyer and accountant support, so no single person can actually do it.

People think they can do it, because it's easy to spin up a demo that sends messages to a few thousand people and then shut it down again. They don't think about how to scan for CSAM, or how to respond to foreign government censorship requests.


I'm a senior developer, and I have to admit I'm one of those guys ;).

WhatsApp had about 55 employees when it was acquired, and to me that sounds about right.

Twitter employed 7,500 people. 7,500!!!! So please tell me: where does the complexity lie? Surely not in the front-end code, I can tell you that.

Let's compare it to something WAY-WAY-WAY more complex, like a game with multiplayer, awesome mod tools, etc.: ROBLOX: 2,200 employees. Do I need to mention they wrote their own physics simulation engine and keep realtime multiplayer going?

So please, explain this to me: how is Twitter more than 3 times more complex than Roblox???

Maybe I'm wrong, that's very possible, I've been wrong in the past. But just explain this one thing then: why does Twitter need more than 3 times the manpower of Roblox?


Well, when Elon Musk took over and gutted half the staff, I distinctly remember HN being full of outrage and predicting (in hindsight, "impotently wishing" would be more accurate) doom, and how Twitter would go down any day now.

Then nothing happened. At least, nothing that I personally observed as a casual Twitter reader. The goalposts were moved to "it will go down with the New Year's Eve spike", and once again nothing happened. Then the narrative became "the cracks will only be noticeable in a few months", and here we are and yet again, nothing.

So Musk and Geohot came out as the saner voices of that whole debacle. Of course Geohot said exaggerated things like "you only need 40 engineers to run Twitter", but if it turns out it takes 300 engineers, then I would consider this as Geohot being proven mostly right.


Did you see the news about DeSantis yesterday? Musk convinced him to announce his presidential candidacy on Twitter, and the live stream just didn’t work.

I don’t think that qualifies as “nothing happened” when features used in high-profile events fail, with the CEO and a potential future president left on the line. No other platform would have struggled with a stream of this size.

I guess you might say that’s just one thing, and other than the CEO’s live streams not working, everything is fine. But there are numerous other examples of accumulating paper cuts and failures at Twitter. I think this is close to what most of those doomsayers expected would happen.


Google also recently had a total failure in a public event. It's not necessarily saying much about Twitter.

https://mashable.com/article/google-ai-maps-search-event-bin...

> the AI falsely said the James Webb Space Telescope took the first ever picture of an exoplanet

> During the announcement about a new Lens feature, the demo phone was misplaced and the presenter wasn't able to show the demo

> Google seemed to say, "let's pretend this never happened," and immediately made the livestream recording private after the event


> the live stream just didn’t work

Are you sure? Others say 6.5M listened to the livestream, which was delayed 20 minutes.


They had to switch to David Sacks's account to do the livestream, and I think there were about 700k listeners on at the time of the announcement. The issues weren't just infrastructure related: Musk had trouble with his mute button, and it was creating feedback because he and Sacks were next to each other on their phones.

But yeah, it could have gone better for various reasons.


so was it no longer live, or did they encounter 20 minutes of technical issues that delayed the start? cuz either way it seems pretty obvious that at least for some amount of time it didn't work


There have been outages, just not as catastrophic as predicted.


I think that depends on who did the predicting :)

There was a lot of "ooh, it will catastrophically fail within weeks", which was fundamentally an assumption that the previous team was entirely incompetent. (Any halfway decent team tries their hardest to build resilient systems, not things that need hand-holding all the time.)

The current trajectory is exactly on the expected failure path predicted by anybody who does actually work on large systems - a steady increase of smaller failures, punctuated by the occasional large failure. (Cf. DeSantis announcement)

In essence, a reduction in staff will result in worse SLO results. It will result in less coverage of edge cases (technical and UX). Smaller teams are more constrained to travel on "the happy path". And the fact that marginal utility of additional engineers decreases means you can usually reduce teams a lot before impacting that path.

In complex systems, reductions also mean you're more vulnerable to a black swan event being irrecoverable, but that still requires a black swan first.


it really is a testament to how well engineered Twitter is/was. I well remember Musk gloating about how the architecture was stupid and he'd fix it. Twitter would be long gone if his remarks were anywhere near the reality


Guess you don't remember the fail whale? Twitter was held together with gum and baling wire for a long time. Yes it got better, but I'm certainly not going to use it as the example for good engineering.


> they've been assuming Twitter is the way it is because it was staffed by technically incompetent leftists

I don't think anyone argued Twitter was run by technically incompetent people. Where was this, if so? By leftists, yes, and by far too many people, yes. Both were argued repeatedly. But those things are now proven objectively true. The Twitter files showed just how systematic their enforcement of left wing orthodoxy was, and Musk fired most of the staff yet the site kept trucking and even launching new changes which is more or less the definition of having been over-staffed.


The web app itself is easy... it's everything around the tech that is hard (scaling, regulatory, moderation)


For me it was when he said that the cardinality of integers is the same as real numbers. Then I saw his twitter and all the politics and crazy stuff about QM.


> the cardinality of integers is the same as real numbers

That's definitely more outrageous than saying that frontend is trivial. Whatever, I never took him seriously anyway.


Isn't that just a trivial misunderstanding of Hilbert's Hotel?

https://en.wikipedia.org/wiki/Hilbert%27s_paradox_of_the_Gra...


Either he's trying to be edgy for edgy's sake, or he's bragging about how he's smarter than the experts in the field while demonstrating a lack of understanding (thinking he doesn't have to prove it to others, they should just trust him). Neither gives me that great of an opinion of him. If you don't understand something, don't tell everyone who studies the thing that they're wrong. If the experts are wrong, show them, embarrass them, get a Fields Medal, another million dollars, and a shit ton of fame. Essentially, pony up or shut up.


This got me thinking, is there a scenario where the number of new guests is uncountable? Seems to me that every kind of ferries/buses/guests story is just going to be countable, since even a countable collection of countable sets is still countable.

Maybe something that pretends to be the real numbers, like a matrioshka doll of infinite containers inside containers.


The easiest analogy I can come up with is an infinite pipe that's completely full with water. When a new amount of water arrives, say 1 Liter, all the water just flows along the pipe a bit further to make space for the new water.


In the hotel fashion, it might be difficult. Think of it this way: if we make the Hilbert Hotel infinitely tall, with infinitely many rooms on each floor (so the rooms correspond to the rational numbers), we can still fit any countable number of new guests on the first floor alone.

I think we could only do this with an even more absurd scenario, like if each room were filled with pregnant women giving birth, and the children rapidly aged and also gave birth at an infinite rate? That would create an infinite nesting-doll-like situation for each guest.


Some light coffee reading, "Cardinality of the continuum" [1]: in short, the cardinality of the real numbers (ℝ) is often called the cardinality of the continuum, and denoted by 𝔠 or 2^ℵ_0 or ℶ_1 (beth-one) [2]; whereas, interestingly [3], the cardinality of the integers (ℤ) is the same as the cardinality of the natural numbers (ℕ) and is ℵ_0 (aleph-null) [perhaps what was meant initially?].

Related: the Schröder–Bernstein theorem [4], "if there exist injective functions f : A → B and g : B → A between the sets A and B, then there exists a bijective function h : A → B.".

Not related, but great: Max Cooper (sound) and Martin Krzywinski (visuals) did a splendid job visualising "ℵ_2" [5].

[1] https://en.wikipedia.org/wiki/Cardinality_of_the_continuum

[2] https://en.wiktionary.org/wiki/%E2%84%B6

[3] "Cardinalities and Bijections - Showing the Natural Numbers and the Integers are the same size", https://www.youtube.com/watch?v=kuJwmvW96Zs

[4] https://en.wikipedia.org/wiki/Schr%C3%B6der%E2%80%93Bernstei...

[5] "Max Cooper - Aleph 2 (Official Video by Martin Krzywinski)", https://www.youtube.com/watch?v=tNYfqklRehM


Adding to this comment on why the two cardinalities are not equal: on one hand we have the set of integers {..., -2, -1, 0, 1, 2, ...}, and they can be put into a bijection with the set of natural numbers {1, 2, 3, 4, ...}; this is done by rearranging the set of integers as {0, -1, 1, -2, 2, -3, 3, ...}. So this is a countably infinite set (one with cardinality ℵ_0).
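
As a tiny illustration of that interleaving (my own sketch; the function name is made up):

    def nat_to_int(n: int) -> int:
        """Map the naturals 1, 2, 3, 4, 5, ... onto 0, -1, 1, -2, 2, ..."""
        return n // 2 if n % 2 else -(n // 2)

    print([nat_to_int(n) for n in range(1, 10)])  # [0, -1, 1, -2, 2, -3, 3, -4, 4]

Every integer appears exactly once and no natural number is skipped, which is all a bijection needs.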

As for the set of real numbers, it contains the subset of irrational numbers, which is uncountably infinite (see Cantor's diagonalization argument), thus making the whole set of real numbers a set whose cardinality is strictly greater than ℵ_0 (namely 2^ℵ_0).

The Annotated Turing book goes into this pretty well in the first couple of pages.


Quite. Then there is the question, is the cardinality of the continuum the first cardinality bigger than the cardinality of the naturals?

It turns out the 'continuum hypothesis' can be true or it can be false. Neither contradicts standard ZFC set theory: the hypothesis is 'independent'.


One way to think about it would be to replace or with and: the continuum hypothesis can be true and false: it is a 'polycomputational object' [1].

[1] Using the concept of polycomputing from There’s Plenty of Room Right Here: Biological Systems as Evolved, Overloaded, Multi-Scale Machines: "Form and function are tightly entwined in nature, and in some cases, in robotics as well. Thus, efforts to re-shape living systems for biomedical or bioengineering purposes require prediction and control of their function at multiple scales. This is challenging for many reasons, one of which is that living systems perform multiple functions in the same place at the same time. We refer to this as 'polycomputing'—the ability of the same substrate to simultaneously compute different things, and make those computational results available to different observers.", https://www.mdpi.com/2313-7673/8/1/110


Interesting, that's not a concept I have come across before. But to be honest, I wasn't sure which conjunction to use (and, or or).


Here is Michael Levin, one of the paper's authors, speaking at length about the polycomputing concept and more: "Agency, Attractors, & Observer-Dependent Computation in Biology & Beyond" [1].

[1] https://www.youtube.com/watch?v=whZRH7IGAq0


To be fair, infinity is not a concept that is in any way well understood or defined.


It is quite thoroughly studied in mathematics, and that particular issue has a definitive answer.


It only has a definitive answer in the mainstream interpretation of mathematics.

On the relative fringes, there are serious studies on alternative interpretations. See for example

https://en.wikipedia.org/wiki/Constructivism_(philosophy_of_...

(You can skip to the part that discusses Cantor's arguments, but I suspect that if you haven't heard about related concepts you probably want to understand what it is first.)


I love that the "hater" thread turns into a discussion of uncountable infinities :)

The "cardinality" section of that Wikipedia page describes my objection well. I don't doubt the real numbers are not recursively enumerable, but that doesn't mean they have a larger cardinality than the integers.


I stand by my statement. Pony up or shut up.

If you're trolling to troll, then expect the hate because you're being annoying.

If you think you're right and all the mathematicians are wrong, pony up. Hell, you'll have a lot more lulz when you win a Fields Medal.

If you don't pony up I don't know why you would be surprised people assume you're arrogant. You can doubt the status quo without being arrogant. We both know that you're not going to take someone's word just because they said so, so why expect others?


No, I don't think there's very much room for controversy here. I mean, I don't know what exactly Hotz has said, since there was no quote (and, honestly, I'm not that interested either), but if somebody is simply saying "the cardinality of integers is the same as real numbers" and leaving it at that, he is just plainly wrong.

Math is all about definitions and what follows from these definitions. So, you can define "integer", "real number", "cardinality", "equals" and so on however you like, and make all sorts of correct statements, as everyone will see by following your arguments all the way from the definitions/axioms to the very end of your proof. But if you don't provide any definitions of your own, then you rely on some other definitions, and everyone has no other choice than to assume that these are very much the "mainstream" ones, since you are referring to them as if they are well-known.

Now, it is unquestionably true (and easily provable) that the set of all computable numbers is countable, and anyone who says otherwise is wrong. But unless you specifically define real numbers as a subset of computable numbers, as constructivists are inclined to do, your listeners won't assume that, since this is not how real numbers are generally defined, and by virtue of not providing your own definition you are implicitly referring to a "general definition". (And, honestly, you shouldn't even call any subset of computable numbers "a set of Real numbers": this name is already taken.)

These general definitions and assumptions lead to all sorts of complications, and I personally have my doubts that real numbers exist in any meaningful sense (although I'm not committed to that statement, since there are several mathematical constructs that I would like to dismiss as "clearly nonsense", except that they allow us to prove some very "no-nonsense" stuff; I don't know how to deal with that, and I've never heard that anybody does). But I definitely cannot say that the cardinality of the integers is the same as the cardinality of the reals, because this is simply not true under the common definitions (which is easily provable). (And, less importantly but worth saying, the contrary is not proven by constructivist methods, as half of the actually useful math in general ends up being, unfortunately.)

So, as somebody who doesn't quite believe in non-computable numbers, I am very sympathetic to anybody who says that Real numbers do not exist. I don't understand how they could, or what it means for an object that we cannot define to exist. Yet, I can accept (as a game) some well-known theory which talks about these not-quite-existent "Real numbers", and prove some statements about them, and one of these easily provable statements is that the cardinality of the continuum does not equal the cardinality of the Natural numbers.


> I personally have my doubts that real numbers exist in any meaningful sense

I think we're in agreement here in principle. (Since we're on the topic, I'd like to add that naming this suspicious set "real" numbers is a tad bit ironic)

That said, I don't like the idea of having a group of people "owning" words as if they had a monopoly over them. The statement "the cardinality of integers is the same as real numbers" can be understood to mean "real numbers should actually be computable numbers".

I didn't bother to look up what Hotz wrote on twitter that triggered this discussion, I was just providing context that the issue of cardinality isn't as settled as some might think. It's probably not fruitful to argue whether a statement from hearsay uses words accurately or not though.


It does not. Logical outcomes that use infinity as an intermediary are inherently not reliable. An example of this is the Ramanujan summation where 1+2+3+... results in -1/12, an outcome which is disputed among mathematicians due to the fact that we have not defined the concept of infinity properly.


> Ramanujan summation

That's not a sum in the traditional sense so don't think about it this way.

Infinities are used quite often in mathematics for rather mundane things; calculus doesn't work without them. They are also quite important to the foundation of many other areas, but this is often hidden unless you get into advanced work (in this sentence we're not considering a typical undergraduate Multivariate Calculus, PDEs, or Linear Algebra course as "advanced").
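
For anyone wondering where the -1/12 actually comes from, a standard aside (my addition, not from the thread): it is the value assigned by analytic continuation of the Riemann zeta function, not the limit of the partial sums, which diverge:

    \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} \quad (\operatorname{Re}\, s > 1),
    \qquad \zeta(-1) = -\tfrac{1}{12} \ \text{(by analytic continuation)}

So "1 + 2 + 3 + ... = -1/12" is shorthand for a statement about ζ(-1), not a claim that the partial sums approach -1/12.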


Calculus works fine without infinity. Finitism is basically a philosophical position without practical consequences. Plenty of serious people have planted their flag there. I don't find it particularly surprising that someone who works with computers, especially at a low level, would be drawn to it.


What does it have to do with the cardinality of the continuum, for which there have been several proofs since Cantor?


> how easy it would be to fix ... obviously underestimated the difficulty of ...

If I had a dollar for every job that I didn't get where I estimated the correct degree of difficulty, and they laughed and went with the person who said it would be easy and they could bang it out in a day in their sleep, I would be rich.

The loud optimist wins the 5.1 mil every time.


That's because there is no penalty for being late in many projects. As soon as there is, this doesn't happen anymore but in software there hardly ever is any real penalty.

As soon as there is a dead serious one, I've noticed everyone gets serious and starts ignoring the rainbows-and-unicorns people. If it's just a slap on the wrist, the rainbows still win.


I don’t remember him saying it was easy. I read some of his tweets and listened (twice) to the Twitter Space he did with Elon, and the big takeaway for me was that he was convinced Twitter wouldn’t be able to move much without a heavy refactor. I can believe that. Elon, who disagreed with him at the time, seems to have changed his mind now.


His attitude towards society appears not to be one of being serious, and legally correct, at all times.

It's likely there was some humor and bravado, as is the culture of his east coast origins.

The truth of his engagement with Twitter, just based on my watching him and his live streams during that time, was that he was looking for a thing to do while he ceded control over comma AI to a new executive leadership group.


People completely forget how complex things can get when you have to serve millions of users across platforms, devices, countries, and accessibility settings, and all of it needs to work because you've got hundreds of millions of advertising dollars paying for views and engagement.


The first part is also probably far easier than the second, not to mention the lawyer work to comply with various laws so you can actually get paid.

...that doesn't change the fact that there are some failures where developers really should know better and design it less shit


To be entirely fair, many things would be easy to fix if you could throw away everything and make a clean implementation. It's the existing codebase and data that make that difficult.

Of course, he should know better than to throw claims like that.


Agree. He was talking about how he could do this and that to fix Twitter, then just fizzled, blamed Twitter's code, and said Twitter needs a rewrite from the ground up.


Twitter doesn't work well and isn't accessible so we needn't burden every product change with these additional concerns.


For reference, comma released a new version of openpilot today, and we now sell comma threes.

https://blog.comma.ai/092release/


Congrats! Just read up on the Taco Bell drive. Great achievement! https://blog.comma.ai/taco-bell/


Congratulations, sincerely.

I'm glad that the company exists. I'm glad that smaller car manufacturers were exploring integrating the self-driving software, too.

And, from a celebrity media perspective, I think your engagement at Twitter was underreported, especially on gossip-mongering, pseudo-serious technology fan sites like this one.

It was an interesting shift for you, as you mature out of your first big startup into kind of a journeyman phase, in my view. (Tasting different experiences.) Something that is relatable to large numbers of us.



First thing I thought of too, talk about life imitating art.

I hadn't laughed so hard in a long time.


Not gonna lie: I watched the 'best of' Bachman and Hanneman immediately after I posted it!


Are you really George Hotz?


You're creating a false narrative man. Him creating tinygrad makes sense given he created a self-driving car company which leverages deep learning (and now uses tinygrad in prod, go figure).

It's not like he decided to hop from self-driving cars to 'AI' because fads changed.


The same self-driving company he left? His "Amazing Journey" post[1] for Comma was illuminating; once a company is "[...] no longer a race car, [but] a boat", GeoHot is likely to bounce. I like AMD as much as the next guy, but I don't want to rely on software provided by a personality who only really gets excited about working on new projects before the "wild success" stage (if we can call it that for Comma).

1. https://geohot.github.io//blog/jekyll/update/2022/10/29/the-...


He is still a part of Comma, it seems; he is just, by his own admission, not the best guy to lead a company that has gotten big enough to need a certain stability and to adhere to a certain bureaucracy, if you will. In some ways, I think it's rather admirable that someone knows when to step aside and let others take the lead.


how big is it?


To me this is fair criticism. I was thinking the same thing. However, the world does need a viable competitor to Nvidia in AI.

AI is not going anywhere. This is not a fad like some of the others mentioned, but more likely than not what the next decade of innovation will be built on.



They're slowly burning through their VC money trying to make a business out of the hobbyist market while Cruise and Waymo have fully autonomous cars deployed in SF and are scaling up.


As a very happy user of Comma, I think it is reasonable to say the company is going to fail, but that ignores that the product they created is still awesome. Comma is light years better than any built-in driving assist in any non-Tesla car. And it's comparable to Tesla for far less money.

The reason the company might fail is that their main thesis, that car manufacturers would just license the self-driving tech from somebody else (like Comma), never came about. Car manufacturers are just too conservative. It was a perfectly reasonable bet to make though. Unfortunately they ended up in the business of selling hardware and giving away software for free when they wanted to be in the business of selling software.


> As a very happy user of Comma, I think it is reasonable to say the company is going to fail, but that ignores that the product they created is still awesome. Comma is light years better than any built-in driving assist in any non-Tesla car. And it's comparable to Tesla for far less money.

I'm a Comma user in one of my cars as well, and I do like it. But when was the last time you tried the built-in driving assist in a Tesla-priced car?

Tesla's driver assist is nothing special nowadays.


I've logged a lot of miles with Comma in a 2023 Hyundai Ioniq and a 2022 Toyota Prius Prime, and the built-in driving assist of both cars is nowhere near Comma, both in terms of steering and accelerator/brake.

Things that Comma handles seamlessly that the built-in cruise in both cars will not:

- Full stop and go

- Sharp turns on the highway that require slowing down (both built-in adaptive cruise modes will gladly just drive you off a cliff at 65 mph)

- Situations where the lane lines are hard to see or are implied

- Non-highway driving

- Not requiring me to touch the steering wheel every 20 seconds

Maybe those things work in higher end cars (though I'd say the Ioniq is a fairly high-end car), but then again with Comma you get it for ~$2k in a ton of cars instead of having to buy a luxury car.

It is true that if you are on a highway, with clear lane lines, the steering assist in both cars is certainly a lot better than nothing, but it's just not nearly matching the reliability and versatility of Comma in any sort of imperfect situation.


> - Not requiring me to touch the steering wheel every 20 seconds

In many countries doing this will void your insurance.

> - Sharp turns on the highway that require slowing down (both built-in adaptive cruise modes will gladly just drive you off a cliff at 65 mph)

While it's probably a given that this will happen, it's also an infrastructure failure. Just place a reduced speed limit sign well before the sharp turn, or fix the road so it doesn't make a sharp turn.


Car manufacturers ended up using Mobileye though...


That wasn't precisely by choice though, IIRC.


Comma didn't exist yet.


No, they have recently been choosing to go with Mobileye too. For example, Porsche announced a collaboration with Mobileye a couple of weeks ago:

https://newsroom.porsche.com/en/2023/company/porsche-mobiley...


I don't mind a bit of skepticism. But let's see: Comma is actually used by people in their cars to achieve some degree of self-driving. He actually shipped something that works! The idea is ambitious as well as commercially interesting: car manufacturers don't reinvent the wheel for every component but rely on OEMs to help them out. This was a similar bet. In that way it is wildly more successful than hundreds of other undifferentiated yet-another-social/lending/flavour-of-the-season apps.

It takes guts to put out such bold bets in writing. We've seen many (senior!) tech people sneer at George's at-times-naive optimism. I actually find the "how hard could it be" attitude refreshing against the "no, no, it is complicated, you can't do that" gatekeeping. Because otherwise we will end up using big tech for lack of an alternative. It is not the critic who counts, and all that...


The scale of investment is wildly different. $5.57M in actual revenue vs $18.1M raised isn’t that far from a viable product and positive ROI for their investors.

Cruise and Waymo have invested billions; they need tens of billions in annual sales or their project is a failure.


> Cruise and Waymo have fully autonomous cars deployed in SF and are scaling up.

Is that it? SF only? After billions invested? There is a laundry list of those that tried and failed, even after burning an enormous amount of VC money on top of billions of their own money.

Lyft: Scrapped and sold their self-driving project. [0]

Uber: Scrapped their robot-taxi project and sold it off. [1]

Zoox: Once valued at $3BN, acquired by Amazon for $1BN after nearly going bankrupt and is still using specialised cars for self driving only in SF. [2]

Cruise: Acquired by GM and still using specialised cars for self driving in SF [3]

Drive.ai: Ran out of money, was nearly bankrupt, and was acquired by Apple. [4] Nowhere to be found on the roads.

Waymo: Same situation as Cruise, but Google keeping them alive.

Comma has lasted longer than these over-valued companies and is already in lots of consumer-grade vehicles beyond SF today, not in specialised cars and taxis, unlike Cruise and Waymo, which are still stuck in SF [5].

[0] https://www.nytimes.com/live/2021/04/26/business/stock-marke...

[1] https://www.npr.org/2020/12/08/944337751/uber-sells-its-auto...

[2] https://www.cnbc.com/2020/06/26/amazon-buys-self-driving-tec...

[3] https://fortune.com/2016/03/11/gm-buying-self-driving-tech-s...

[4] https://techcrunch.com/2019/06/25/self-driving-startup-drive...

[5] https://techcrunch.com/2023/05/18/cruise-waymo-near-approval...


Anyone can call a Waymo robo-taxi right now in Phoenix and it works really well. It's a modified electric Jaguar. That's a pretty big difference from the others.


Good thing everyone lives in Phoenix and SF XD


If it works in Phoenix there's no reason it wouldn't work in a large part of the US, they just haven't rolled it out yet. Proving that it works is the hard part.


How so?

Different roads, geography, etc?


Sure Phoenix is an easy case with dry weather and car-based infrastructure but there are huge areas of the US that are fundamentally the same. If they can solve the problem of driving safely and conveniently in Arizona they can set it up elsewhere too.


Comma's hypothesis could be correct, and its licensing is also very flexible; if they do succeed in making an Android equivalent for self-driving (SDC), they will be used by all these car manufacturers eventually.


Are you trolling? Comma is the only profitable company you mentioned, while it's the opposite for Waymo and Cruise, both of which are not scalable.


> both of which are not scalable

which is the critical element everyone is in denial about, even to the point of saying Tesla has a long way to catch up.


there are just 3 possible options:

1. They are living in an SF / big-US-city-centric bubble.

2. They are very easily influenced by marketing.

3. They are just trolling.

Comma is a product you can buy, and all the code is open source. All the others are just a service, where people could theoretically just be driving remotely and they sell it as self-driving.


4. they're in the "everything Elon is evil" bubble and refuse to entertain any challenges to this belief


Waymo and Cruise have burnt through money like no one else. They want to make their own cars, their own AI chips/lidar, their own self-driving tech. Delusional at best, malicious at worst. They turned a hard SW problem into a hard HW+SW problem. They are zombies unless some magic happens in the world of NNs (which is not unlikely, tbh).

Whereas comma is taking the right approach: a nimble team, iterate fast, ship a working product (even if not L4-5), get cash flow, move to the next milestone.


Cruise is part of GM, so saying they are going to make their own cars is hardly delusional.


So what? Don't the vast majority of companies that take VC funds burn through it before generating profit?

George is courageous, inspiring, and highly intelligent. He stands for what he believes in. He stands up for himself and his beliefs, and talks back to powerful people.

How many tries did it take to invent scalable electricity, or the light bulb?

George Hotz is frikkin awesome


But....Comma still exists, and they launched the 3rd version of their hardware.


Some of the highlights from the Twitter saga:

1. Soliciting others to do his work - and offering internships to others that can help, in a kind of internship MLM (He was an intern at the time).

2. Complaining about how Twitter doesn't run on his laptop


He has been working on tinygrad since at least 2021. https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre...


AI was already red hot in 2021, so not sure that detracts from OP's point.

I know it seems like ages ago at the current pace, but ya gotta remember that GPT-3 was released in mid-2020.


> GPT-3 was released in mid-2020

But not to everyone.

I would argue that it was not until around December 2022 that the world at large got the opportunity to begin really using AI directly. With ChatGPT.


This is a pretty disingenuous take on a guy working with others to create solutions in spaces where there is literally no other comparable competitor.

What kind of comment is this on a site that used to be called Startup News? Even if that doesn’t resonate with you, isn’t what he’s talking about pure hacker ethos anyway?


This is the George Hotz MO: I trust him to create a lot less than I trust him to optimize, and I trust him with optimization only in a very narrow technical sense. "Optimizing" Twitter by reducing headcount to 50, regardless of the social or revenue consequences, is actually stupid. Optimizing a whole mess of software that exists between your tensor and the hardware is a decent idea.


That’s funny, I remember blackra1n and it was pretty clever.


Did comma actually tank? I was under the impression that they actually had a good product and were really honest about it too.


That is indeed still the case; the comment is just false. Comma just released support for Ford cars as well, and openpilot got rated the best self-driving tool, above Tesla, by Consumer Reports.


comma hasn't "fizzled out", it's in active development and drives me to work.


Did he fix Twitter yet?


> I think the only way to start an AI chip company is to start with the software. The computing in ML is not general purpose computing. 95% of models in use today (including LLMs and image generation) have all their compute and memory accesses statically computable.

Agree with this insight. One thing Nvidia got right was a focus on software. They introduced CUDA [1] back in 2007, when the full set of use cases for it didn't seem very obvious. Then their GPUs got Tensor cores, plus more complementary software like TensorRT to take full advantage of them after the deep learning boom.

Right as Nvidia reported insane earnings beat too [2]. Would love more players in this space for sure.

[1] - https://en.wikipedia.org/wiki/CUDA [2] - https://www.cnbc.com/2023/05/24/nvidia-nvda-earnings-report-...
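
To make the "statically computable" point concrete, here is a toy sketch (my own illustration, not tinygrad's or CUDA's actual API): for a fixed input shape, every op and every buffer size of a small MLP is known before any data arrives, so a compiler can plan memory and kernel launches ahead of time.

    BATCH, D_IN, D_HID, D_OUT = 32, 784, 256, 10

    # (op, input shape, weight shape, output shape) -- no data-dependent control flow anywhere
    schedule = [
        ("matmul",  (BATCH, D_IN),  (D_IN, D_HID),  (BATCH, D_HID)),
        ("relu",    (BATCH, D_HID), None,           (BATCH, D_HID)),
        ("matmul",  (BATCH, D_HID), (D_HID, D_OUT), (BATCH, D_OUT)),
        ("softmax", (BATCH, D_OUT), None,           (BATCH, D_OUT)),
    ]

    total_floats = sum(out[0] * out[1] for (_, _, _, out) in schedule)
    print(f"{len(schedule)} ops, {total_floats * 4 / 1e6:.2f} MB of fp32 activations")

Everything above is decided at "compile time"; that is the property the quoted paragraph is pointing at.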


I know the founders of an AI chip company that taped out and got working chips on their first go. They got their chip done, it’s pretty solid. Chip has great perf and is super power efficient, a solid delivery. I knew they'd nail it and they did.

The SW story is a train wreck, though. The problem basically was that they couldn’t hire any good SW people. As I said I know the founders. They are both genuinely decent guys, they put their own money in so they have some (well, minimal) skin in the game, and they know a ton of expert-level embedded and systems coders with between 20 and 40 years of hard core experience. As far as I can tell, they weren't really able to get anyone that we know in common to join. I certainly did not, and no one I know did either. Last I heard they'd had to hire third choice guys in Europe to do the work and it wasn't going well.

There's a pretty good reason for it, and it comes down to a sociological problem. HW people don’t value SW people. It's just basically true and has been true everywhere I've looked. Maybe if you're doing a system (like a router or maybe a drone) then the HW people will begrudgingly admit that the SW is a major part of the delivery, but that isn't true for chip companies (including chips-on-reference-boards).

You can rest assured that at a chip company, all of the high comp people in the company are going to be on the ASIC team and the SW team will never be on the same tier. The argument is always the same, no matter how many times it bites the companies on the ass and sends them careening into the dumpster: “yes, but the chip without SW is the chip! we can buy SW, if we have to. SW without the chip has zero value.”

Almost every chip company ends up like that, and the kind of low level, experienced SW people that work in the space know to avoid them and work at systems companies instead.

As far as I've been able to determine, with _maybe_ the exception of Cerebras - maybe - this is the situation that has played out at all of the 201x AI chip companies. They get founded by ASIC guys, most of whom have more than a small chip on their shoulder about the relative value of ASICs-vs-SW. These guys are all ex-SGI, ex-Sun, ex-Google, ex-Nvidia, ex-Intel HW guys who saw SW people making a lot more, not just in broader industry terms over the last few years, but at hardware-focused companies. In general, ASIC guys make less than SW guys unless they are in the very narrow set of top-level architects. IMHO from a value creation standpoint, that is _super unfair_ and I am not here to justify it, but it is how it is. The result poisons ASIC companies. SW people who know what needs to be done won't go to them most of the time, for good reason, and so they fail.

So I will say, given that, starting with SW first is brilliant.


I also know the founder of an AI chip company (ex-AMD); they taped out in 2022 and got working chips on the first try. Miraculously, they hired a good software team, some even with previous compiler experience. In their brochure they write things like:

> In 2022, FuriosaAI remained the only startup to submit results in MLPerf Inference... This time, through purely enhancements in the compiler, our team was able to double the performance on the exact same silicon.

https://www.furiosa.ai/

Maybe it helped that they are based in South Korea; other places to do systems programming in South Korea are not very attractive.


As a SW person who's worked at an AI chip company, I can confirm this is 100% true.


were the SW folks offered compensation commensurate with their low perceived status? like HW gets C/cofounder, SW gets early employee package. Because that's a different story about money and ownership. of course good people reject wage labor.


I can only speak for myself, but their offer was underwhelming, and I was easily one of the first five people they called. I’ve founded and exited companies myself, so I walked them through why the founding SW engineer offer wasn’t worth my time given the risk profile, and laid out the size and type of SW engineering team they were going to need. They understood, but they also trotted out the “but we really have to make sure we hire the right chip guys” line.

Fair enough. I’m old so these conversations are never personal. I tried to help them by steering younger, less experienced but very high potential engineers toward them, but in the end they failed utterly to put together a viable SW team.

I don’t know any of their investors in terms that would let me ask, but I do know a bunch who passed, and most of them had concluded that it would end up just being more silicon in a crowded market. Being 10-20x better than Nvidia isn’t the point if the market is about to be flooded with a dozen other chips against which you are maybe 1.5-3x better. Without nailing the go-to-market needs, which means “make the stuff people already have work, don’t make the customer learn something new,” etc., you have nothing. That’s all a SW problem.

It’s actually worse, because the engineers in the space are actually pretty bad. A lot of what they have actually barely works to begin with, being a bunch of cobbled together python frameworks of dubious engineering quality and all of the hassle of the ecosystem. So the amount of mental space for “different” is almost zero even if you ignore that they’ve been burned (AMD) before, which you can’t.


Why wouldn't AMD throw a few million at this? Worst case they lose a small amount of money, but best case they finally get good software for their hardware.

The past decade or so, they haven't been able to create any good software for their hardware. They made small improvements but the competition, Nvidia, has also made improvements to their already good software.

It's gotten to the point where their software is the reason most people/companies don't use their products. The drivers for their consumer products are just as bad.

They are very competitive in hardware, but Nvidia dominates them in software, which makes companies buy Nvidia. No one wants to deal with the pain of AMD software.

AMD is a better company to work with than Nvidia, but it's not worth it when it comes to dealing with their software, lol.


AMD can't be bothered. They've had literally years to fix the SW issue that allows Nvidia to stomp them in the space quarter after quarter with no fix in sight.


Being able to make stable software isn't just a matter of wanting it though. It's possible to be both bothered about something and still fail.


You know, I agree with you in a way. There are plenty of clueless teams that will produce broken software for years. I’ve been working since the early nineties and I’ve seen just incredible garbage and especially vendors who can’t stop themselves from just producing pathological orgs that produce bad software.

Look, the reality of software is that it’s actually dozens of markets, from random bloated web crap to building safety-focused critical systems. Each of those areas values different skills and has different knowledge.

But I can attest that AMD's problem - and for that matter Intel's, since I am familiar with both - is that the companies see software outside of a very few niches as an afterthought, a cost of doing business rather than a source of value.

It should in fact amaze you that Nvidia has pulled off being dominant. Jensen is pretty well known to have relatively poor understanding of software and Nvidia for a long time had a reputation exactly as I describe in one of my other comments. But Nvidia stock appreciation made it possible for them to become an attractive destination despite their core corporate mentality and accidentally created a company that ended up with a strong software culture.

AMD could solve this problem tomorrow with the right level of investment and a willingness to bulldoze, from above, the corporate politics that prevent them from having a market-leading software team.

Yes, it is hard. I have worked in companies like that before. I am not just saying “hey, go hire some rockstar coders” because that answer is ALWAYS bullshit. Software people are especially prone to thinking that’s the answer and that isn’t what I am saying.

But there are ways for companies in their situation to structure things to make it work out. The specifics are not appropriate for a public forum as it would identify a number of former employers who were successful in fixing their issues.


> Why wouldn't AMD throw a few million at this?

They tried at some point. As another commenter pointed out elsewhere, HW people just don't care about SW. They think HW is superior and SW is a joke. I don't think much of the culture has changed.

A lot of their understanding = workaround the HW, make it work, etc.


Maybe AMD is the one investing the $5.1M in his company.


AFAIK the "AMDs drivers are bad" meme is outdated. Sure, their AI/ML software is garbage, but the graphics drivers are fine.


I just had my AMD-based machine crash - repeatedly - every time I tried to use Google Maps inside Firefox for longer than a few minutes.

I haven't confirmed, but I strongly assume that either their graphics drivers or something Ubuntu does with Wayland are not fine.


Seems like a Wayland problem to me. Under X, AMD drivers have been nothing but rock solid for years.


To be fair, you have grouped the perfect combo for initiating a crash: a Google product inside Firefox inside a Linux desktop on Wayland.


Of these, only the Linux kernel is the one that can take down the whole machine.


On a desktop machine, there isn't a huge difference between a kernel and a Wayland crash. Either way I lost all my state and will end up rebooting.

In this specific case, I was unable to get to a console and had actual things to get done so I wasn't going to debug it.


The cutting edge for graphics is all raytracing, and Nvidia still dominates. DLSS3, pathtracing, etc, these are for 'graphics', but heavily dependent on AI post processing, so Nvidia still rules.

So in the gaming market, Nvidia still commands a huge premium. No AMD card can play Cyberpunk on overdrive 4k.


> The cutting edge for graphics is all raytracing, and Nvidia still dominates. DLSS3, pathtracing, etc, these are for 'graphics', but heavily dependent on AI post processing, so Nvidia still rules.

IMHO, unless you have a $1000+ GPU, ray tracing is still not worth the performance hit. I prefer playing at 100+ FPS with ray tracing off to turning it on and having my frame rate cut in half.


Good idea. I don't think George Hotz has the skill set to actually deliver on a lot of this stuff (specifically, I suspect trying to replace the compiler for the GPU is something he will make a song and dance about with some simple prototype but then quietly scrap, because even for AI workloads it's still a very, very tricky problem), but he has the strength of vision to get and direct other people to do it for him.


do you think those talented workers will accept giving him ownership over that work and its value?


Traditionally the way startups entice people is by giving them equity


wow ok


I have some experience in this area: I have worked on machine learning frameworks, trained large models in datacenters, and have my own personal machine for tinkering around with.

This makes very little sense. Even if he were able to achieve his goals, consumer GPU hardware is bounded by network and memory, so it's a bad target to optimize. Fast device-to-device communication is only available on datacenter GPUs, and it is essential for training models like LLaMA, Stable Diffusion, etc. Amdahl's law strikes again.
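
As a rough illustration of why interconnect matters (my own back-of-the-envelope sketch with assumed, not measured, numbers): Amdahl's law applied to data-parallel training, where the communication / all-reduce fraction is the part that extra GPUs cannot speed up.

    def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
        serial = 1.0 - parallel_fraction
        return 1.0 / (serial + parallel_fraction / n_workers)

    # comm_fraction: assumed share of a training step spent exchanging gradients
    for comm_fraction in (0.05, 0.30):  # fast datacenter interconnect vs. slow consumer links
        p = 1.0 - comm_fraction
        print(f"comm {comm_fraction:.0%}: 8 GPUs -> {amdahl_speedup(p, 8):.1f}x, "
              f"64 GPUs -> {amdahl_speedup(p, 64):.1f}x")

With a 30% communication share, the 64-GPU speedup tops out near 3x, which is the parent's point about consumer hardware being a poor target for large-scale training.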


Eh... no? Stable Diffusion works fine on a single device. Ditto for the smaller LLaMA models.


That's for inference, I'm referring to training.


Seems like a much better mission for George Hotz to go on than single handedly trying to fix Twitter.


It wasn’t even fixing Twitter. It was marginally improving search.


From what the actual engineers posting on Twitter said, "marginal" is putting it lightly.

The quality of the work, and the arrogance and ignorance of the enterprise SDLC, were akin to those of a first-year grad.


It was so painful watching self-proclaimed tech geniuses displaying the same moronic attitude I had at 22


I mean, imagine being full of vigor to go out and do great things in the tech world, and then having your life almost destroyed at a young age by a stupid lawsuit from people who are much "stupider".

Many of us would absolutely be just as contrarian towards general society as he is, and idolize contrarian figures like Musk.

Doesn't mean that the work he is doing is not valid. It's a shame, though, because his ideology is very likely to hinder progress at his companies (like, for example, the policy against remote work).


I think that was a great mission for the individual, George Hotz. It's beneficial to be humbled by complex systems and imposing bureaucracy -- it makes future endeavors better informed.


There isn't really any evidence he was humbled by the experience.


I thought he was doing AI self-driving. I see we are now moving the goalposts to "generalized intelligence" so I suppose there was no point for him to put up with just cars anymore.


I'm not convinced AI driving can even work without AGI. Seems like a logical next step to the gigantic yak shave that is autonomous driving.


Both Cruise and Waymo have human backup in case of unexpected edge cases.

I suspect they are going to be needed for quite some time to come.


Waymo removed its backup drivers years ago...

https://spectrumnews1.com/ap-top-news/2020/10/08/waymo-remov...

https://www.azfamily.com/2022/08/29/waymo-launches-its-self-...

There was also a waymo press released this year that stated all rides in Arizona are now backup driver free, although I can't find it.


They removed the human from the car. There’s still humans to recover from the edge cases. I don’t know if they actually remote control the cars or if they have to send a driver though.


When I tried waymo a month or two ago a remote "driver" took over. When we called the support center about it later they told us that it's not them literally taking over the controls, but only giving some additional guidance to the system.


Reading the term "moving goalposts" here leaves a really bad impression.

It's as if it's wrong to stop working on one idea and start working on another.


George Hotz is a talented engineer but he absolutely does not have the social science background to "fix" Twitter.


Being an engineer at Twitter is much more akin to being one at an enterprise like a bank or telco.

You need to be a team player and work across different groups e.g. Product, Testing, SRE etc in order to successfully get features into Production.

So being a talented engineer is useful but having high emotional intelligence and being able to negotiate and collaborate is far more important.


I'm surprised more people don't realize this. It happens at every big corp.


You mean talk more and code less?


You don’t need a social science background to fix Twitter.


I think abnormal psychology or perhaps criminology would be helpful in understanding the current Twitter decision making process.


Define "fix." Twitter is a tool used by extremists to reach massive audiences. Some of those extremists have socially destructive and violent objectives. Responding to harassment and hate speech involves understanding the psychology of harassers and racists so that incentives can be designed not to reward those behaviors. IMO, "fixing" Twitter isn't just about fail whales and 500 errors.


I respect Geohot's reputation and this company looks amazing. I might be in the market to work there... except "No Remote."

For such a smart guy, locking yourself out of a ton of talent by requiring software developers to be on-site in 2023 seems...out of character, to put it politely.

(Rephrased, my original post was a bit too ad hominem and accumulating downvotes rapidly. I wanted to delete this entire comment but apparently HN no longer allows comments to be deleted.)


Let's be honest here: remote and in-person is a major tradeoff with significant pros and cons on both sides. Remote work increases your talent pool by 5 orders of magnitude and removes commute overhead, but in-person work increases your communication bandwidth and team cohesion in similarly dramatic ways. Hybrid solutions put a hard cap on the upside of either path and therefore tend to give you the worst of both worlds. It makes perfect sense to be opinionated about this, especially at the startup stage. I respect the decisiveness (even though I would be very reluctant to go back to full-time commute).


Also, in person is a good filter if you are getting too many applications, which I am sure they will, assuming the pay is good (but those $100 bounties imply maybe not - anyone who could do those in, say, 15 minutes would be worth more than $400/h)


It's a filter, but what makes it good? I'd expect average applicant quality to go down.


Let's say that you are someone with a dedicated drive to contribute to AI, with the ability to grasp the complex program space, and spend many hours figuring out the solution to the problem, i.e. the perfect tinygrad candidate.

There is a high chance that you are probably neurodivergent to some extent.

So instead of WFH, where you can remove distractions and work on your own time, you are now forced to abide by someone else's schedule, take time commuting, etc.

In office work is for people/positions that require hands on work with hardware, or you are hiring for replaceable positions where people don't have dedication to the cause and are going to do as little work as possible for the same pay. Tinygrad is neither.


How big is Tiny Corp?

I doubt they need mass volumes of employees at this stage and they maybe want to work closely with the people they choose?


Software is part of it, sure, but I doubt anyone can realistically work on this project/company without being around a bunch of specialized hardware and iterating on prototypes. Hard to contribute to any of that from home.


From Twitter

Remote work is available to everyone on GitHub. If you submit a bunch of good PRs and show me you are easy to work with, I'm down to pay per project.

Source https://twitter.com/realGeorgeHotz/status/166153013618397184...

I'm not anti remote, I'm anti full time remote. It's hard to build a culture


If culture is the thing that is holding your company to its purpose, you aren't going to succeed.

In the same way Comma went from "Our goal is to solve AI, Comma body is the next big thing" to George peacing out because now they are just doing the busy work to make more money.


The big benefit of remote is being able to live 500 miles away and/or in a different country; and that requires being full-time remote.

If he's anti full-time remote, then his pool of candidates is still limited to those who live in San Diego or very close to it.


> For such a smart guy, locking yourself out of a ton of talent by requiring software developers to be on-site in 2023 seems...out of character, to put it politely.

I mean a lot of smart people seem to do their hacking by themselves. I'm thinking like Fabrice Bellard. This is at least a step beyond that.


Remote work requires very different management and tooling. I've seen remote companies fail the last couple of years where this was not taken into account. It's hard to run a remote company.


A lot of people have now had the direct experience that new things which are highly technical or highly collaborative are not really compatible with the remote work thing. I know that's hard for a lot of people to hear, but the world is not web apps (which do remote well) and a lot of projects benefit hugely from being able to grab the two or three people and get into a room with a whiteboard.


what about IP theft though?


what about it?


I clicked through to the previous blog post, to read more about the unit of a "person" of compute [0]. Definitely worth a read, if only for this quote:

> One Humanity is 20,000 Tampas.

I'll never think of humanity the same way!

[0]: https://geohot.github.io/blog/jekyll/update/2023/04/26/a-per...


> There’s a [Radeon RX 7900 XTX 24GB] already on the market. For $999, you get a 123 TFLOP card with 24 GB of 960 GB/s RAM. This is the best FLOPS per dollar today, and yet…nobody in ML uses it.

> I promise it’s better than the chip you taped out! It has 58B transistors on TSMC N5, and it’s like the 20th generation chip made by the company, 3rd in this series. Why are you so arrogant that you think you can make a better chip? And then, if no one uses this one, why would they use yours?

> So why does no one use it? The software is terrible!

> Forget all that software. The RDNA3 Instruction Set is well documented. The hardware is great. We are going to write our own software.

So why not just fix AMD accelerators in pytorch? Both ROCm and pytorch are open source. Isn't the point of the OSS community to use the community to solve problems? Shouldn't this be the killer advantage over CUDA? Making a new library doesn't democratize access to the 123 (fp16) TFLOP accelerator. Fix pytorch and suddenly all the existing code has access to these accelerators, and millions of people along with it. That then puts significant pressure on Nvidia, as they can't corner the DL market. But it's a catch-22, because the DL market is already mostly Nvidia, so Nvidia takes priority. Isn't this EXACTLY where OSS is supposed to help? I get that Hotz wants to make money, and there's nothing wrong with that (it also complements his other company), but the arguments here seem to argue more for fixing ROCm, and specifically the pytorch implementation.

The mission is great, but AMD is in a much better position to compete with Nvidia. They caught up in the gaming market (mostly) but have a long way to go for scientific work (which is what Nvidia is shifting focus to). This is realistically the only way to drive GPU prices down. Intel tried their hand (including in supercomputers) but failed too. I have to think there's a non-obvious reason why this keeps happening.

Note 1:

I will add that supercomputers like Frontier (current #1) do use AMDs and a lot of the hope has been that this will fund the optimization from two places: 1) DOE optimizing their own code because that's the machine that they have access to and 2) AMD using the contract money to hire more devs. But this doesn't seem to be happening fast enough (I know some grad students working on ROCm).

Note 2:

There's a clear difference in how AMD and Nvidia measure TFLOPS. techpowerup shows AMD at 2-3x Nvidia, but performance is similar. Either AMD is crazy underutilized or something is wrong. Does anyone know the answer?


I know a fair amount about this problem, my last startup built a working prototype of a performance-portable deep learning framework that got good performance out of AMD cards. The compiler stack is way harder than most people appreciate because scheduling operations for GPUs is very specific to the workload, hardware, and app constraints. The two strongest companies I'm aware of that are working in this area now are Modular.AI and OctoML. On the new chip side Cerebras and Tenstorrent both look quite interesting. It's pretty hard to really beat NVIDIA for developer support though, they've invested a lot of work into the CUDA ecosystem over the years and it shows.


This. Modular and OctoML are building on top of MLIR and TVM respectively.

> It's pretty hard to really beat NVIDIA for developer support though, they've invested a lot of work into the CUDA ecosystem over the years and it shows.

Yup, strong CUDA community and dev support. That said, more ergonomic domain specific languages like Mojo might finally give CUDA some competition though - it's still a very high bar for sure.


There's also OpenAI Triton. People seem to miss that OpenAI is not using CUDA...


Yeah, also see AMD engineers working on Triton support here: https://github.com/openai/triton/issues/46


Triton outputs PTX which still requires CUDA to be installed.


Sure, but the point is that Triton is not dependent on CUDA language or frontend. Triton also outputs PTX using LLVM's NVPTX backend. Devils are in the details, but at a very high level, Triton could be ported to AMD by doing s/NVPTX/AMDGPU/. Given this, people should think again when they say NVIDIA has CUDA moat.


I thought this was a good overview of the idea Triton can circumvent the CUDA moat: https://www.semianalysis.com/p/nvidiaopenaitritonpytorch

It also looks like they added MLIR backend to Triton though I wonder if Mojo has advantages since it was designed with MLIR in mind? https://github.com/openai/triton/pull/1004


I hadn't looked at Triton before, I took a quick look at it and how it's getting used in PyTorch 2. My read is it really lowers the barrier to doing new hardware ports, I think a team of around five people within a chip vendor's team could maintain a high quality port of PyTorch for a non-NVIDIA platform. That's less than it used to be, very cool. The approach would not be to use any of the PTX stuff, but to bolt on support for say the vendor's supported flavor of Vulkan.


This seems pretty reasonable and matches my suspicions. It is not hard for me to believe that CUDA has a lot of momentum behind it, not just in users, but in optimization and development. And thanks, I'll look more at Octo. As for Modular, aren't they only CPU right now? I'm not impressed by their results, as their edge isn't strong over PyTorch, especially scaling. A big reason this is surprising to me is simply how much faster numpy functions are than torch. Just speed test np.sqrt(np.random.random((256, 1024))) vs torch.sqrt(torch.rand(256, 1024)). Hell, np.sqrt(x) is also a lot slower than math.sqrt(x). It just seems like there's a lot of room for optimization, but I'm sure there are costs.
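A rough sketch to sanity-check that yourself (results depend heavily on array size, since per-call framework dispatch overhead dominates small ops):

    import timeit
    import numpy as np
    import torch

    x_np = np.random.random((256, 1024))
    x_t = torch.rand(256, 1024)

    # At this size you are mostly measuring per-call dispatch overhead,
    # not the actual sqrt throughput.
    print("numpy:", timeit.timeit(lambda: np.sqrt(x_np), number=1000))
    print("torch:", timeit.timeit(lambda: torch.sqrt(x_t), number=1000))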

When we're presented with problems where the two potential answers are "it's a lot harder than it looks" and "the people working on it are idiots" I tend to lean towards the former. But hey, when it is the latter there's usually a good market opportunity. Just I've found that domain expertise is seeing the nuance that you miss when looking at 10k ft.


First you have to figure out what problem to attack. Research, training production models, and production inference all have very different needs on the software side. Then you have to work out what the decision tree is for your customers (so depends who you are in this equation) and how you can solve some important problem for them. In all of this for say training a big transformer numpy isn't going to help you much so it doesn't matter if it's faster for some small cases. If you want to support a lot of model flexibility (for research and maybe training) then you need to do some combination of hand-writing chip-specific kernels and building a compiler that can do some or most of that automatically. Behind that door is a whole world of hardware-specific scheduling models, polyhedral optimization, horizontal and vertical fusion, sparsity, etc, etc, etc. It's a big and sustained engineering effort, not within the reach of hobby developers, so you go back to the question of who is paying for all this work and why. Nvidia has clarity there and some answers that are working. Historically AMD has operated on the theory that deep learning is too early/small to matter, and for big HPC deployments they can hand-craft whatever tools they need for those specific contracts (this is why ROCm seems so broken for normal people). Google built TensorFlow, XLA, Jax, etc for their own workloads and the priorities reflect that (e.g. TPU support). For a long time the great majority of inference workloads were on Intel CPUs so their software then reflected that. Not sure what tiny corp's bet here is going to be.
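To make the fusion part of that concrete, here's a toy illustration of vertical fusion (my own sketch, not any particular framework's implementation): fusing elementwise ops into one kernel removes a temporary buffer and a full extra pass over memory.

    import numpy as np

    x = np.random.random(1_000_000)

    # Unfused: two separate "kernels", one materialized temporary,
    # two full trips through memory.
    tmp = x * 2.0
    y_unfused = np.maximum(tmp, 0.0)

    # Fused, written as the explicit single loop a compiler would generate
    # (slow in pure Python, but it shows the access pattern): one pass over x,
    # no temporary allocated.
    y_fused = np.empty_like(x)
    for i in range(x.size):
        v = x[i] * 2.0
        y_fused[i] = v if v > 0.0 else 0.0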

The change in the landscape I see now is that the models are big enough and useful enough that the commercial appetite for inference is expanding rapidly, hardware supply will continue to be constrained, and so tools that can reduce production inference cost by a percentage are starting to become a straight forward sale (and thus justify the infrastructure investment). This is not based on any inside info but when I look at companies like Modular and Octo that's a big part of why I think they probably will have some success.


> So why not just fix AMD accelerators in pytorch? Both ROCm and pytorch are open sourced. Isn't the point of the OSS community to use the community to solve problems?

Because there's no real evidence that AMD cares about this problem, and without them caring your efforts may well be replaced by whatever AMD does next in the space. Their Brook language[1] is abandoned, OpenCL doesn't compare well, ROCm is like the Sharepoint of GPU APIs (it ticks boxes but doesn't actually work very well).

> So why not just fix AMD accelerators in pytorch

Why not just buy NVidia? They care deeply about the space, will actually help you if you have trouble, etc etc.

Even using Google TPUs is better: Google will help you too.

While everyone using NVidia isn't great for the market as a whole as an individual company or person it makes a lot of sense.

Read "The Red Team (AMD)" section in the linked article:

> The software is called ROCm, it’s open source, and supposedly it works with PyTorch. Though I’ve tried 3 times in the last couple years to build it, and every time it didn’t build out of the box, I struggled to fix it, got it built, and it either segfaulted or returned the wrong answer. In comparison, I have probably built CUDA PyTorch 10 times and never had a single issue.

This is geohot. He knows how to build software, and how to fix problems.

Note that "Our short term goal is to get AMD on MLPerf using the tinygrad framework."

> There's a clear difference in how AMD and Nvidia measure TFLOPS. techpowerup shows AMD at 2-3x Nvidia, but performance is similar. Either AMD is crazy underutilized or something is wrong. Does anyone know the answer?

From the linked article:

> That’s the kernel space, the user space isn’t better. The compiler is so bad that clpeak only gets half the max possible FLOPS. And clpeak is a completely contrived workload attempting to maximize FLOPS, never mind how many FLOPS you get on a real program

[1] https://en.wikipedia.org/wiki/BrookGPU


> This is geohot. He knows how to build software, and how to fix problems.

This is a non sequitur. This feels like when my uncle learns that I know how to program and asks me to build a website. These are two different things. I do ML and scientific computing, I'm not your guy. Hotz is a whiz kid, but why should we expect his talents to be universal? Generalists don't exist.

And we're talking the guy who tweeted about believing that the integers and reals have the same cardinality right? Between that and his tweets on quantum we definitely have strong evidence that his jailbreaking skills don't translate to math or physics.

He's clearly good at what he does. There's no doubt about that. But why should I believe that his skills translate to other domains?

STOP MAKING GODS OUT OF MEN. Seriously, can we stop this? What does Stanning accomplish? It's creepy. It's creepy if it is BTS, Bieber, Elon, Robert Downey Jr, or Hotz.

> Read "The Red Team (AMD)" section in the linked article:

Clearly I did, I quoted from it. You quoted from the next section (So why does no one use it?).


geohot wrote tinygrad. This is not about believing his skills to translate to other domains. It is his domain.

You definitely shouldn't trust what geohot says about infinitary mathematics or (god forbids) quantum mechanics. On the other hand, you generally should trust what he says about machine learning software stack.


Tinygrad isn't a big selling point. I'd expect most people to be able to build something similar after watching Karpathy's micrograd tutorial. Tinygrad doesn't mean expertise in ML and it similarly doesn't mean expertise in accelerator programming. I wouldn't expect a front end developer to understand Template Metaprogramming and I wouldn't expect an engineer who programs acoustic simulations to be good at front end. You act like there are actually fullstack developers and not just people who do both poorly.

This project isn't even about skill in ML, which demonstrates misunderstandings. The project requires writing accelerator code. Go learn CUDA and tell me how different it is. It isn't something you're going to pick up in a weekend, or a month, and realistically not even a year. A lot of people can write kernels, not a lot of people can do it well.


> You act like there are actually fullstack developers and not just people who do both poorly.

If you haven't worked with someone who's smarter and more motivated than you are, then I can see how you'd draw that conclusion, but if you have, then you'd know that there are full stack developers out there who do both better than you. It's humbling to code in their repos. I've never worked with geohot so I don't know if he is such a person, but they're out there.


> Hotz is a wiz kid but why should we expect his talents to be universal?

No of course not. But this is literally his field of expertise, and there's plenty of reasons to think he knows what he is doing. Specifically, the combination of reverse engineering and writing ML libraries means I'd certainly expect he's had reasonable experience compiling things.


It's often less work to start from scratch than to fix an extremely complex broken stack. Of course people also say this when they just want to start from scratch.

RDNA 3 has dual-issue that basically isn't used by the compiler so half the FPUs are idle.


> So why not just fix AMD accelerators in pytorch?

It doesn't fit the business model. I mean sure, they'll sell AMD computers now like a bootleg Puget Systems. But why buy from the bootleg when I can just buy from the real thing (or AWS) and run tinygrad on there if I want?

So the play is, get people using your framework (tinygrad), then pivot to making AI chips for it:

> In the limit, it’s a chip company, but there’s a lot of intermediates along the way.

Seems far fetched but good luck to them.


Nvidia has stuff like hardware sparsity support. Modern methods (RigL) can let you train sparse for a 2X speedup.

Memory bandwidth (sparsity helps) and networking connectivity (Nvidia bought Mellanox and other networking companies) are important too. They are also using a lot of die space on raytracing stuff that they don't waste on the datacenter versions presumably.


The Intel Arc A770 16 GB at $349 has 40 TFLOPS of FP16, which is close to the 7900 XTX on the FLOPS/$ scale.
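Back-of-the-envelope, taking both vendors' peak FP16 numbers at face value: 40 TFLOPS / $349 ≈ 0.115 TFLOPS per dollar for the A770, versus 123 TFLOPS / $999 ≈ 0.123 for the 7900 XTX, so "close" checks out.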

Intel's software is much better (MKL, vtune, etc for GPU) and getting better.


AMD only enabled their ROCm stack on consumer cards last month. This finally corrects a huge mistake - Nvidia made cuda available on all their cards for free from the start and made it easy/cheap for people to get started. Of course once they'd started they stuck with it... I hope it's not too late to turn this around.


> The human brain has about 20 PFLOPS of compute.

Where is this number coming from? The number of spikes per second?

Edit: doing a quick search, it doesn’t seem like there’s a consensus on the order of magnitude of this. Here’s a summary of various estimates: https://aiimpacts.org/brain-performance-in-flops/


No idea. I don't think there's any scientific consensus on even an upper limit of a human brain's FLOP equivalence.


> Where is this number coming from?

Used 20 PFLOPS of compute to simulate it.


I was always surprised at how AMD hasn't already thrown a bunch of money at this problem. Maybe they have and are just incompetent in this area.

My prediction is AMD is already working on this internally, except more oriented around PyTorch not Hotz's Tinygrad, which I doubt will get much traction.


I think AMD is going down a different path, ie. ROCm then partnering with ML frameworks further up the stack for first class support.

https://pytorch.org/blog/pytorch-for-amd-rocm-platform-now-a...


He mentioned ROCm, and apparently had lack luster experience with it.

>The software is called ROCm, it’s open source, and supposedly it works with PyTorch. Though I’ve tried 3 times in the last couple years to build it, and every time it didn’t build out of the box, I struggled to fix it, got it built, and it either segfaulted or returned the wrong answer. In comparison, I have probably built CUDA PyTorch 10 times and never had a single issue.


Not surprising lol. This was also the experience I had while experimenting with MLIR approximately 3 years ago. You'd need to git checkout a very specific commit and then even change some flags in code to have a successful build. I'm sure things are better now but I haven't messed with it since then.


> I'm sure things are better now but I haven't messed with it since then.

I had the same experience ~3 months ago. Gave up and switched to Nvidia 3090s for my workloads.


It's because ROCm is not developed for RDNA (consumer) cards, but CDNA (datacenter) cards. No surprise that he's having trouble with it.


AMD is not going down the path of ROCm; perhaps they claim to do so, but as evidenced by the lack of both effort and results, they clearly are not.

The parent post is surprised that they still aren't making the appropriate investments to make it work. They kind of started to a few years ago, but it fell by the wayside without reaching even table stakes. In my opinion, table stakes would mean providing a ROCm distribution that works out of the box on most of their recent consumer cards, i.e. the cards that enthusiasts, students, advocates and researchers use while choosing which software stack to learn, and on which they later base corporate compute cluster purchasing decisions (does it support the software they wrote for e.g. CUDA+PyTorch?). They seem to be failing at that.


Now if only they would support their hardware on Windows.


AMD is limited by numerous patent and other legal issues. For this reason a small company that releases everything as open source has some chance of beating AMD on their own hardware.


Obscurity is only a viable form of security until the patent holder becomes aware of you.


Patent holders might be aware of you, but they still only go into legal battles where there's money to win.

AMD has a lot of money to lose.

George Hotz's $5M OSS company - well, not so much.


When you click on the Stripe link to preorder the tinybox, it is advertised as a box running LLaMA 65B FP16 for $15,000.

To be fair, the previous page has a bit more details on the hardware.

I can run LLaMA 65B GPTQ4b on my $2300 PC (built from used parts, 128GB RAM, Dual RTX 3090 @ PCIe 4.0x8 + NVLink), and according to the GPTQ paper(§) the quality of the model will not suffer much at all by the quantization.

Just saying, open source is squeezing an amazing amount of LLM goodness out of commodity hardware.

(§) https://arxiv.org/abs/2210.17323


Are you able to memory pool two 3090s for 48gb and if so what's your setup?

I looked into this previously[1] but wasn't super confident it's possible or what hardware is required (2x x8 pcie and official SLI support?). AFAICT still would look like two GPUs to the system.

[1] https://discuss.pytorch.org/t/is-there-will-have-total-48g-m...


You can memory pool with the right software however pytorch supports spreading large models over multiple GPUs OOTB. Just pass the --gpu-memory parameter with two values (one per GPU) to oobabooga's text-generation-webui for example.
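For anyone curious what that looks like at the PyTorch level, here's a minimal hand-rolled sketch of spreading a model over two GPUs (naive layer placement; the webui flag mentioned above automates a fancier version of this):

    import torch
    import torch.nn as nn

    class TwoGPUModel(nn.Module):
        def __init__(self):
            super().__init__()
            # first chunk of layers on GPU 0, second chunk on GPU 1
            self.part1 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:0")
            self.part2 = nn.Linear(4096, 4096).to("cuda:1")

        def forward(self, x):
            x = self.part1(x.to("cuda:0"))
            return self.part2(x.to("cuda:1"))  # activations hop between cards

    model = TwoGPUModel()
    out = model(torch.randn(8, 4096))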


What case/MB/GPUs do you use for your dual 3090 build? Liquid cooled cards?


I'm using air cooling. Gigabyte X570 Pro, RTX 3090 FE, be quiet! Pure Base 500DX mesh case with four 140mm fans currently. It's not quiet under heavy load!

The GPUs have gotten improved thermal pads. The 8 core Ryzen 3700X is a 65W model. It appears to be fast enough not to be a bottleneck for this purpose. 1200W PSU.

I may swap the fan in front of the GPUs for a high-rpm model.

Also, for longer runs I throttle the GPU power draw. It doesn't cost much performance.


This is great news. I’ve oft wondered the same about AMD’s GPUs. NVIDIA’s got a clear monopoly.

He made a very good point about how this isn’t general purpose computing. The tensors and the layers are static. There’s an opportunity for a new type of optimization at the hardware level.

I don’t know much about Google’s TPUs, except that they use a fraction of the power used by a GPU.

For this experiment though, my sincere hope is that all the bugs are software only. Supporting argument - if they were hardware bugs, the buggy instructions would not have worked during gameplay.


Glad he is going ahead with this. Will make for many entertaining live streams no doubt


I don't want to cast any judgement, I just want to ask what the initial product is. The claim is they sell computers, and there's a link to the tinybox. There's a $100 preorder, for a 15k computer (I guess I'd have to pay 14.9k eventually?).

And then we get a computer that... how do I interact with it? Will it have its own OS? Some flavor of linux? Is the intent to work on it directly, or use it as an inference server, and talk over a network?


I think the tinybox is meant to be a training/inference server meant for tinygrad and filled with those AMD cards. Very likely it will run Linux.


The way they make money is for AMD to buy them out if they’re successful


Which would make total sense for AMD if they pull it off.


Why would they buy something that is open source? They could acqui-hire but Hotz doesn't strike me as a person that would stay at a big corp like AMD for a significant amount of time.


To control it. If they are successful at making AMD cards competitive in AI, and I agree that what's missing is only the software, that would create immense value for AMD. Too much to not have control over how it evolves. If they are successful it will not just be Hotz; he will hire other devs and an entire community will form around it.

Sure they could just fork it and try to continue development themselves, but the community and momentum might very well not go with them.


It wouldn't surprise me if those $5M came from AMD.


If/when tinygrad is successful then AMD acquiring control of the stewardship/direction-setting of the software that drives the incremental demand for their hardware is far more valuable than Hotz's talent itself.


Love this overall. Wonderful. But I wouldn't say Cerebras failed just yet -- they're committing to OpenXLA which may provide a better dev-experience in the long run than Nvidia lock-in.


He claims there's a $999 AMD card that gives 123 TFLOPS, and his tinybox will cost $15k for 738 TFLOPS. In other words, the tinybox will have 6 of these GPUs, i.e. about $6,000 in cost price. It seems a steep markup from $6k to $15k, and if the software is open source, I'm not sure why you wouldn't build your own. Or is it worth $9k for a custom motherboard that can fit so many GPUs? Or can you buy 6-GPU motherboards off the shelf? Just curious what people think. Not being disparaging, kudos to geohot :)


As someone who has built their own deep learning rigs in the $10k BoM range, frankly it's a pain in the ass and I would gladly pay that in the future. I probably will pay lambdalabs a much larger markup.


As someone who bought a deep learning rig for about $12k from lambdalabs years ago, I can't recommend them strongly enough. The support (and not having to deal with building it out myself) was well worth the markup. They're also just really great to deal with.


Logic boards and CPUs with the necessary PCIe lanes to feed the GPUs, storage fast enough to do the 30 GB/s in the spec sheet, power supplies, a case to hold such a power-hungry system, etc. are not cheap.

$15k seems kind of low - if you've tried to spec out a server with a similar number of accelerators recently, you'd have trouble hitting that figure, even using consumer-grade GPUs.


You can buy 6+ gpu motherboards, but the consumer ones are solely for mining because no normal person has that many cards and no consumer CPU exposes enough lanes to properly support that many GPUs. A lot of enterprise vendors exist to sell you that kind of system in the enterprise space, but you should expect to still be paying 5 figures minimum. $15k seems like the low end to me.


The end goal is getting acquired by AMD of course.


True entrepreneur. Having a vision and ignoring naysayers. Go George!


Gotta have money to start to keep taking risks


While it's not available to many of us (and especially those outside of the US), it doesn't take that much money to just tinker with whatever you want. You just have to cut time wasted on news and politics, socialising, and literally anything outside of being a hacker or entrepreneur.

Most people just would never take that risk and will stick to their well-paying job.


Was it pg who said it? Not sure but basically - middle class get about one shot. Upper class can keep shooting their whole lives. Of course lower class usually doesn’t get one.

I took my one shot! Any more is too risky for quite a while.


> The main advantage is in the tinygrad IR. It has 12 operations, all of which are ADD/MUL only. `x[3]` is supported, `x[y]` is not.

Can someone educate me why that is the case? Does `x[y]` require a Turing-complete kernel to compute?


The layer of indirection introduces scatter/gather and other dynamic loads, which are tricky to optimize.
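A concrete way to see the difference (just an illustration, not tinygrad's actual IR):

    import numpy as np

    x = np.arange(10)
    y = np.array([7, 2, 5])

    a = x[3]   # constant index: lowers to a load at a fixed, compile-time-known offset
    b = x[y]   # gather: the addresses depend on the runtime *values* in y,
               # so the access pattern can't be planned statically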


I wonder what Tenstorrent is doing. Didn't realize they are Canadian.

https://tenstorrent.com/research/tenstorrent-raises-over-200...


Can anyone comment on the TinyBox they are taking preorders for?

The tinybox

738 FP16 TFLOPS

144 GB GPU RAM

5.76 TB/s RAM bandwidth

30 GB/s model load bandwidth (big llama loads in around 4 seconds)

AMD EPYC CPU

1600W (one 120V outlet)

Runs 65B FP16 LLaMA out of the box (using tinygrad, subject to software development risks)

$15,000
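For what it's worth, the "around 4 seconds" claim roughly checks out against the other specs: 65B parameters at FP16 is about 130 GB of weights, and 130 GB / 30 GB/s ≈ 4.3 s (assuming the load bandwidth holds end to end).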


What is the cost of an equivalent setup using A100s?

I have no idea what I am doing but here goes!

By [1] we have 156 FP16 TFLOPS, taking their non-"*" value (* = with sparsity). So you need 5 of them, roughly $40,000? Plus the other stuff, and someone to make a profit putting it together, say $50,000?

So this setup is about 3 times cheaper for the same compute.

If I am allowed to use the sparsity value it is 1.5 times cheaper.

[1] https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...


A100s have a much worse performance per dollar than 3090s/4090s (which are the direct competitors to the 7900 XTX).


Fwiw, I can't think of a single popular neural net architecture that takes advantage of sparsity.


I like George's style and wish him well. But I'm not optimistic about their chances of selling $15k servers that are $10k in parts (or whatever the exact numbers are).

It's just too easy for anyone to throw together a Supermicro machine with 6x GPUs in it, which is what it sounds like they'll be doing.

My guess is they'll end up creating some premium extensions to the software and selling that to make money. Or maybe they can sell an enterprise cluster manager type thing that comes with support. He's good at software so it makes sense for him to sell software.

And maybe the box will sell well initially just as a "dev kit" type thing.


> selling $15k servers that are $10k in parts

Have you seen what a DGXA100 costs? It starts at $199k for 8 40GB A100's, which have a list price of $10k each. So the GPU costs are $80k. What do you get for the extra $120k? 1TB RAM, 2x 2TB NVMe OS drives, 4x 4TB NVMe general storage, and 8x 200Gbit Infiniband. I would guess no more than $20k for all of the remaining hardware. So that's a ~$100k computer selling for $200k. And that's with NVDA likely making massive margins already on the A100 and the Infiniband hardware.

The reality is that companies want to buy complete solutions, not to build and manage their own hardware. A $15k computer that's $10k in parts is not a large markup at all for something like this.


I agree the DGXA100 is a "complete solution" because it's NVIDIA selling NVIDIA customized integrated/certified/tested/supported hardware and software.

NVIDIA's advantage is that they're a proprietary company and they're the ones actually making the chips they're putting in a box.

That's very far away from a random little open source startup slapping third-party GPUs in a generic box.


To me this looks like a way to mask donations.


> And maybe the box will sell well initially just as a "dev kit" type thing.

Price: $15,000.

If they had a "lite" model that sold for $1500, and were actually shipping....


The lite model is any gaming PC with a 7900 XTX.


> It's just too easy for anyone to throw together a Supermicro machine with 6x GPUs in it, which is what it sounds like they'll be doing.

HPC compute is well advanced past just slapping GPUs into generic supermicro servers anyway. Without semi-custom hardware and equivalents to nvlink/nvswitch AMD won't ever be competitive in the HPC space.


SuperMicro has HGX servers. $300k buys you an 8xH100 chassis with crazy amounts of memory, storage, CPU and GPU compute.


We can't really comment much on it because a bunch of specs are lacking. Is it using 6x 7900 XTXs? Which Epyc CPU (Epycs vary in price from $1K to $11K)?


There's a reason no one uses ATI GPUs in datacenters. Their dev support is shit.

Don't waste your money.

Buy 6 RTX 4090's and a decent ECC-memory server, and call it a day.


Didn't read the article.


I thought you weren't allowed to use Nvidia's consumer GPUs in the datacenter?


You aren't, but who's going to stop you?


Their closed source driver?


And how exactly are they going to know you're running the card in a DC?


> I think the only way to start an AI chip company is to start with the software. The computing in ML is not general purpose computing. 95% of models in use today (including LLMs and image generation) have all their compute and memory accesses statically computable.

> Unfortunately, this advantage is thrown away the minute you have something like CUDA in your stack. Once you are calling in to Turing complete kernels, you can no longer reason about their behavior. You fall back to caching, warp scheduling, and branch prediction.

> tinygrad is a simple framework with a PyTorch like frontend that will take you all the way to the hardware, without allowing terrible Turing completeness to creep in.

I like his thinking here, constraining the software to something less than Turing complete so as to minimize complexity and maximize performance. I hope this approach succeeds as he anticipates.
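A toy version of what "statically computable" buys you (my own framing, not tinygrad code): for a fixed architecture and batch size, every shape, and therefore every buffer size and memory access pattern, is known before any data exists, so a scheduler can plan everything ahead of time.

    # Shape propagation through a fixed MLP needs no input data at all.
    layers = [(784, 512), (512, 256), (256, 10)]  # (in_features, out_features)
    batch = 64

    shape = (batch, layers[0][0])
    for n_in, n_out in layers:
        assert shape[1] == n_in
        shape = (batch, n_out)          # known at "compile time"
        print("matmul output:", shape)  # buffers/tiling can be planned right here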


This 95% of models are statically computable thing really shows how much he is trivializing this problem. I’d be interested to see his SW stack compile MaskRCNN. His ISA is massively under-defined and people will not change their model code to run on this accelerator unless his performance beats cuda significantly and even then they still won’t - usability matters more than performance every time. In the end you need a compiler, and it needs to be compatible with an existing framework which is not trivial at all, since they are written in python.


I agree writing something that needs to be compatible with, say, PyTorch is a significant undertaking, but why is that necessary? I also agree that some models like MaskRCNN are not static, and that people will not change their model code, but I don't think it matters.

Let's say you want to run LLaMA. LLaMA is a tiny amount of code, say, 300 lines. LLaMA is static. It doesn't matter people will implement LLaMA with PyTorch and not tinygrad, geohot can port LLaMA to tinygrad himself. In fact, he already did, it's in tinygrad repository.

What I am saying is while running all models ever invented is harder than running LLaMA and Stable Diffusion (Stable Diffusion port is also in tinygrad repository), that's not necessarily trivializing the problem. It is noticing that you don't need to solve the full problem, there is enough demand for solving the trivial subset.

While developers will choose usability, users will choose cheap price. If they can run what they want on cheaper hardware, they will. I already have seen this happening: people don't buy NVIDIA to run Leela Chess Zero, they just run it on their hardware. It doesn't matter everyone working on LC0 model is using NVIDIA, that's irrelevant to users. LC0 model is fixed and tiny, people already ported the model to OpenCL, OpenCL port is performant, it runs well on AMD. The same will happen to text and image generation models.


Yeah, for inference this is true, there could be a viable subset of models. You're not going to build a viable business on inference though. It's super cheap already and plenty of hardware can do it OOTB with an existing framework, as you're saying. The $$ for selling chips is in training, and researchers trying new architectures are not going to wait for a port of their favorite model in a custom DSL or learn a new language to start prototyping now. You can port models forever, but that isn't an ecosystem or a CUDA competitor. OpenCL + AMD != a from-scratch company.


Can anyone elaborate on how or why Turing completeness requires these sub optimal patterns?

I recall reading about avoiding Turing completeness for similar reasons to avoid the halting problem.

> Other times these programmers apply the rule of least power—they deliberately use a computer language that is not quite fully Turing-complete. Frequently, these are languages that guarantee all subroutines finish, such as Coq.

https://en.m.wikipedia.org/wiki/Halting_problem


It isn’t that there are suboptimal patterns, it’s just the more expressive your language can be at runtime, the less you can reason about statically. An example is data-dependent control flow. If you can’t reason about what branch your code is going to take statically (without your runtime data) it is harder to generate fast code for it.
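A small example of the kind of data-dependent control flow meant here (illustrative only):

    import numpy as np

    def f_static(x):
        return x * 2.0  # same work for every input: shapes and kernels are fixed,
                        # so this is easy to schedule and fuse ahead of time

    def f_dynamic(x, threshold):
        # The branch depends on the data itself, so which kernels run is
        # unknown until runtime, and static scheduling can't see past it.
        if x.sum() > threshold:
            return x @ x.T   # heavy path
        return x * 2.0       # light path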


Nvidia or AMD, the real winner here is truly TSMC.


> I started tinygrad in Oct 2020. It started as a toy project to teach me about neural networks

Shows you what is possible in 2.5 years. Keeps me motivated to learn.


The math isn't super difficult. Some books will try to throw a mess of differential equations at you, but some simple calculus is all you need for backpropagation.
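A tiny worked example of that, just the chain rule applied by hand to loss = (w*x - t)^2:

    w, x, t = 3.0, 2.0, 10.0

    y = w * x                 # forward: y = 6
    loss = (y - t) ** 2       # forward: loss = 16

    dloss_dy = 2 * (y - t)    # d(loss)/dy = -8
    dy_dw = x                 # dy/dw = 2
    dloss_dw = dloss_dy * dy_dw   # chain rule: -16
    print(dloss_dw)           # the "backward pass" is just this, repeated per layer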


I have been through the math thanks to the youtube videos by A. Karpathy. Deriving some of the differentials, e.g. for batchnorm seems fairly hard (hard as in slogging through something with many steps where you can't make a mistake at any step). But the principles are quite simple - I think by design. If they were hard to compute or reason about then the neural net wouldn't work very well!


Doing the compute efficiently, especially from Python, is the tricky part.


Is this the guy who couldn’t add a feature to Twitter?


I don't know whether it's a factor in the alleged software quality issues he mentions, but it's not unusual for a company that thinks of itself as a hardware company to neither understand nor respect software enough.

Even if adopting "hardware/software co-design", leadership might be hardware people, and they might not understand that there's tons more to systems software engineering than the class they had in school or the Web&app development that their 5 year-old can do. That misunderstanding can exhibit in product concepts, resource allocation, scheduling, decisions on technical arguments, etc.

(Granted, the stereotypical software techbro image in popular culture probably doesn't help the respect situation.)


geohot gave me some good advice/feedback from using my Japanese reading app, Manabi Reader: https://reader.manabi.io

I'm always grateful for his user feedback, it led to next-level improvements. Thanks geohot


If they can achieve something competitive with CUDA for $5m, why hasn't AMD done it yet?


Every single startup since maybe 1980 that is not borderline criminal has faced at least one incumbent that is better funded and has, frankly, more talent with better experience.

The reason they sometimes win is huge structural or incentive issues in the large companies. And they don't win often.

AMD has failed to fix the problem for years. Is it because the business structure doesn't incentivize it? Is it because the company's entrenched culture is opposed to compensation that might attract the right talent (often out of "fairness")? Is it because there's some internal owner for the function that keeps fucking up but for political reasons the CEO won't replace them and no one else can work on the thing?

Any of these are possible. I have seen - personally witnessed - all of these and more at large companies. We don't know the reason, but we can sort of guess as to the shape of it.


AMD/ATI has always been terrible at drivers/software for their GPUs going back decades.

It’s really as simple as that and it still hasn’t changed so nvidia is dominating them in AI as a result.


Their stuff was garbage back when there was VLB.


AMD has been tackling exactly the wrong problems. They poured their money into a porting solution for developers to take CUDA code and run it on their GPUs. I guess they didn't find it worth it to really compete. I doubt it's about being able to tackle this problem with $5m, but rather convincing the company they can win.


I still don't understand what problem they're trying to solve in EPYC in the hypervisor space with encryption.

They should've been adding tensor cores and neural acceleration to their CPUs. The need for headed graphics cards is moot and wasteful. NVIDIA solved this with the A100.

NVIDIA may spin into a mainstream enterprise CPU and systems vendor as a sales channel for converged CPU-GPU solutions beyond what they're already doing.


If you're talking about AMD SEV, it's actually a useful technology. Confidential virtual machines not only protect you from possible spying by AWS or Azure, but also make decentralized / P2P compute more feasible.

Of course nothing is perfect and you can never have 100% trust in someone else's hardware, but it's definitely a step in the right direction.


You can say this about every startup that has ever existed.


Vision.

That being said I am still sceptical.


I doubt they have even spent $5M on developing ROCm even though it's been a thing for nearly 10 years. AMD is just notoriously stingy about investing in things outside their core business.


Turing-completeness != un-optimizable! Literally the areas of type systems and compilers exist to serve this endeavor. It's gotta be a meme at this point every time someone brings up the halting problem or rice's theorem.


I don't think anybody claims it's unoptimizable. It's just a harder task, compared to a more constrained system.


Right, but the type system is the constraint. Nobody's gonna take the untyped lambda calculus and run it on an accelerator. Even something like turing-completeness can be a type annotation, for example, the totality effect provided by languages like Koka.


CS education considered harmful!


I’d love to see this succeed, because I own a 7900 XTX already, but there’s so much already built on top of PyTorch. Why would anyone port it all to tinygrad or whatever? Bummer.


In theory pytorch compiler can boil down to 50 or so fundamental functions and tinygrad IR to 12. So possibly you could just re-map a fairly limited set of base instructions. Devil’s in the details though..
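For a flavour of what that re-mapping looks like (a made-up decomposition for illustration, not tinygrad's real op set):

    import numpy as np

    # Frontend ops decompose into a handful of primitives: e.g. a "linear"
    # layer is just a matmul (multiply + add-reduce) followed by an add,
    # and relu is an elementwise compare/select.
    def linear(x, w, b):
        return x @ w + b

    def relu(x):
        return np.maximum(x, 0.0)

    # Something like gelu or softmax similarly boils down to a short chain of
    # elementwise mul/add/exp-style primitives plus reductions.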


people already ported a lot of stuff from pytorch to jax.

if you're a research scientist or grad student, to a certain extent a lot of projects are "greenfield" so it's easy to jump on a new framework if it is nice to use and offers some advantage.


what does he mean by "tape out" chips?


When transforming the logic gate design of a chip into the lithographic plates used for chip production, the plates were originally made by applying tape to create the photo masks. The name stuck and it now means to move a semiconductor project from the design phase to the manufacturing phase.


Industry term for a chip design being sent to manufacturing/being produced.



Hope George pulls this off. Just sent this to my dad, who turned down starting ATI with the Lau family on account of me being born. =)


Where does the figure that the human brain is 20 pflops come from?


Didn't realize the 7900 XTX was so good at this kind of work. Glad I went with it over the 4080.


The whole point is that it’s not good at this work — and it’s a $Xx billion opportunity to make it work.


So much untapped potential in AMD; funny that they keep failing at the software and geohot has to save them.


Why aren't AMD and Intel working on an alternative?


I find it peculiar. Just recently folks were tryharding to make everything and the kitchen sink Turing complete, and now it creeps in menacingly on its own.


Also, I like the spirit of: "I don't want to live in a world of closed AI running in a cloud you've never seen". Shake those shills a bit.


I reached the limit of free articles. Impossible to browse and read the page.


George was on the list of my favourite programmers until a few months ago. Now it's just Jeff Dean, John Carmack and Karpathy.



