I especially like that he outlines an actual plan for an AI chip startup that he thinks will work, and has an update explaining why he was subsequently convinced that it wouldn't work.
I read this article rather than the one actually posted; it has a lot of good points but also gets a fair amount wrong, and the correction on the power usage is just the tip of the iceberg. That's why WaferScale do what they do: the power cost of off-die vs. on-die is massive, and keeping everything on chip means you can feed the design more easily and with less power.
For example, Nvidia's compute and consumer GPU lines diverged a long time ago. Modern A100s have literally only one SM capable of doing normal GPU tasks, probably to support running a display on whatever Quadro version they end up releasing. They diverged in really specific ways; for example, the P100 has hardware scheduling, whereas the 1080 does not (in the same way, at least).
Another issue is that the author spends a long time talking about how important software and ecosystem are, then completely misses that point when talking about their own chip - just because it is RISC-V and compilers exist for that arch does not mean it's equivalent to CUDA. Also, big re-order buffers cost area and heat that could be spent on more SMs. That's why, in order to beat Nvidia, you must get more specialized: they've picked their niche on the CPU-GPU-ASIC continuum, and beating them at the same process node requires ditching some of the stuff an Nvidia GPU carries. Which is why they've been specializing their arch with tensor cores.
It just also turned out those are useful for gaming, using deep learning to upres the graphics, since that's easier to accelerate than driving quadratically higher resolutions.
Interesting! I think Cerebras is exciting too, the problem is that it's so expensive that there will never be a software ecosystem for it. The people who would develop it will never have access to one.
Yep, it's hard to see how Cerebras will ever have much of an ecosystem, since hardware makers are traditionally not very good at building and maintaining one by themselves. But they're probably aiming at very specialized customers only.
Geohot is right about the AI accelerator market's problems, and a competitive 4-digit-dollars device is a great idea even if his initial strategy was way off. Although you could say almost the same thing about the high-performance CPU and GPU duopolies (Apple's chips don't count due to their proprietary OS lock-in, although I wish Asahi Linux luck at fixing that).
[edit] And beyond that, you have TSMC dominating the next-gen fab market, too.
Adding a comment as I don't think this is a fair representation of George. He livestreamed the creation of cheapETH as a technical demonstration of web3 development[0] and continually talks about it being worthless while developing it. He's also on record as being a serial 'no-coiner'.
> tinygrad will always be below 1000 lines. If it isn't, we will revert commits until tinygrad becomes smaller.
I applaud this. Committing to keeping a project small and simple.
So many projects start small and simple, and before long they've been extended in many different directions and now have thousands of options and things to understand before you can get started.
Are there other examples of famous projects that do that, limiting themselves to an X-amount of LOC? Last one I remember was TempleOS, although Terry went a bit over his limit (100k LOC).
>Each chapter consists of a walkthrough of a program that solves a canonical problem in software engineering in at most 500 source lines of code. We hope that the material in this book will help readers understand the varied approaches that engineers take when solving problems in different domains, and will serve as a basis for projects that extend or modify the contributions here.
"In bioinformatics, BLAST (basic local alignment search tool) is an algorithm and program for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA and/or RNA sequences." - could it be used to identify and group similar structures in NNs? (https://en.wikipedia.org/wiki/BLAST_(biotechnology))
I've been thinking of some kind of visual representation of weights, graphs etc. Images and evolving images as the substrate of the neural network. Pixels on a plane, planes affecting each other, activation recorded as brightness/color. Then we can use visual algorithms like SIFT (https://en.wikipedia.org/wiki/Scale-invariant_feature_transf...) to do cool stuff and grow better and better graphic-based NNs.
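Something like this toy sketch is what I have in mind (matplotlib; the weights are just random stand-ins, purely illustrative):

    import numpy as np
    import matplotlib.pyplot as plt

    # stand-in for a real layer's weight matrix
    weights = np.random.randn(64, 64)

    # brightness encodes |weight|; activations could be overlaid as color
    plt.imshow(np.abs(weights), cmap="gray")
    plt.title("layer weights rendered as an image")
    plt.colorbar()
    plt.savefig("weights.png")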
Using bioinformatics tools to understand neural networks. What a fascinating idea! It probably makes sense to go for the direct application of string and pattern matching methods from stringology; BLAST is very much meant for DNA and protein biosequences.
edit: It would be cool to see the evolution of neural networks over their training, and in transfer learning. Comparative neural network genomics.
Minitorch is intended to be both an engine and high-quality didactic material from Cornell University for the course 'Machine Learning Engineering'.
> the full student code for minitorch. It is designed as a single repo that can be completed part by part following the guide book
> Basic Neural Networks and Modules ; Autodifferentiation for Scalars ; Tensors, Views, and Strides ; Parallel Tensor Operations ; GPU / CUDA Programming in NUMBA ; Convolutions and Pooling ; Advanced NN Functions
I am reminded of Andrew Tanenbaum's Minix OS: before Linux became the belle of the ball, Minix implemented many Unix functionalities in a less efficient but clearer way than existing Unix builds (many of which were not open source) and the still nascent BSD/Linux/GNU OS/etc., and it meshed nicely with his textbooks on OS design. It was functional on its own, and by Minix 3 was a full OS in its own right (Intel still uses it in various ways), but it balanced pedagogy with performance.
Look man, this is just too tiny. People are upset in this thread about the excessively short line count of OP as it is. What is this, some sort of autograd for ants?
I'd like to ship a video game that does machine learning and trains on its experiences with the player. Yes, I know there are many potential problems with this.
What is the best way to ship training code in a game? Do I embed Python and PyTorch or something? Do I code my own NN training algorithm? Do I use a library such as Tinygrad?
If your game has enemies, you could give each unit its own brain. Behind the scenes you could have a "gene pool". When a unit dies it's removed from the pool. When it performs well (eg. damages player or wins the game) the unit can reproduce. Then you add the usual stuff (mutate parameters, cross-breeding fittest units).
For training you could just make two AI teams play against each other (off the cuff, 10-100k games should do the trick). Once you release the game, you could sample the best AI from each player's machine. Free distributed neuroevolution cluster ;)
I've wanted to do this for a while but I haven't made any proper games yet. Someday!
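If anyone wants the flavor of it, here's a rough sketch of the gene-pool idea, assuming each unit's "brain" is just a flat list of NN weights (all names here are made up, not from any real game):

    import random

    class Unit:
        def __init__(self, genome):
            self.genome = genome      # flat list of NN weights
            self.fitness = 0.0        # e.g. damage dealt, games won

    def mutate(genome, rate=0.05, scale=0.1):
        # perturb a small fraction of the genes with Gaussian noise
        return [g + random.gauss(0, scale) if random.random() < rate else g
                for g in genome]

    def crossover(a, b):
        # uniform crossover: each gene comes from either parent
        return [ga if random.random() < 0.5 else gb for ga, gb in zip(a, b)]

    def next_generation(pool, size):
        # keep the fittest quarter, refill the pool with their offspring
        survivors = sorted(pool, key=lambda u: u.fitness, reverse=True)[:max(2, size // 4)]
        children = []
        while len(children) < size:
            p1, p2 = random.sample(survivors, 2)
            children.append(Unit(mutate(crossover(p1.genome, p2.genome))))
        return children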
There's actually a lot of research on this topic. You should look into "Genetic Programming" on scholar.google.com and you'll find some good presentations.
Genetic programming sounds like a complicated term, but it's basically an easy way to take successful characteristics and breed them into something else.
Typically, these libraries are unusable for this purpose because you don't want to ship a Python interpreter with your video game.
I usually prefer to rewrite my training step as a pure function, so the model weights are just inputs and the gradient updates are outputs.
You need to serialize your computation graph in some way, so it can be run in C++ or some other low-level language. TensorFlow is known for doing this well, since it was an original design goal of the project. Some of the other frameworks that originally targeted researchers make this harder. Most mature frameworks have some way of doing this now, though, and projects like https://onnx.ai/ may solve this in general.
It gets more complicated if your model has dynamic control flow, but you get the idea.
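For concreteness, a minimal export sketch; SimpleNet and the shapes are made up, but torch.onnx.export is the standard entry point:

    import torch
    import torch.nn as nn

    # toy stand-in for whatever the game's model actually is
    class SimpleNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(16, 4)
        def forward(self, x):
            return torch.tanh(self.fc(x))

    model = SimpleNet().eval()
    dummy_input = torch.randn(1, 16)
    torch.onnx.export(model, dummy_input, "policy.onnx",
                      input_names=["obs"], output_names=["action"])
    # policy.onnx can then be loaded from C++ via ONNX Runtime, no Python needed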
It's likely worth looking into federated learning. You could have a core model, which then trains locally on player inputs. The federated learning research space is focused on this sort of small model update on phones, so should have plenty of people who have thought hard about the problem of keeping binary sizes small and training compute manageable.
You could also compile a neural net into a less Python-tied format, e.g. ONNX or TorchScript. In general a siloed PyTorch env would be massive - I'm assuming at least a gig or two.
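Rough TorchScript sketch of what I mean (model, shapes, and file names are made up):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(16, 4), nn.Tanh()).eval()
    traced = torch.jit.trace(model, torch.randn(1, 16))   # record the graph
    traced.save("policy.pt")   # in C++: torch::jit::load("policy.pt")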
I suppose you could collect training data during gameplay and process it after the fact. Then you can use the heavy python framework, but it doesn't get in the way. Have the user run the training program on the accumulated data whenever they want to increase the level of customization.
Yes, I know how to program and know a good amount about machine learning in Python. The problem is these machine learning frameworks are rather heavy to be shipping around, although I guess you could do it. I was wondering if there are any lighter-weight machine learning libraries, which might not have all the bells and whistles, but would be better for shipping and training models on consumers' computers. Most of the time people ship pretrained models, but don't actually update that model on the end user's computer.
Client-side training with the usual Python stack sounds like a configuration nightmare. Would it try to use the user's GPU if it exists? Would it fall back to CPU? Does your user need a specific GPU card to play?
I think the real issue here is with the training speed. Presumably a GPU is required to run the game, but is it restricted to just NVIDIA GPUs/a subset of those?
I don’t know of any such libraries offhand and my guess would be that’s because the size of the library generally matters less if there’s a requirement to have a powerful GPU.
I like his honest streams. Also, this geohot can code for 12 hours straight. That's just amazing. If you take into account his speed of thought, it's more like 36 hours.
For a novice in this space, can anyone provide a link to an article providing a comparison of: Pytorch, Pytorch lightning, tinygrad, micrograd.....? I would like to point my clients to a reference here. It's no longer enough to say "PyTorch" I guess.
PyTorch lightning is just a convenience wrapper for PyTorch. The rest are toy re-implementations of a very small subset of PyTorch features, likely much slower and certainly less optimized overall.
PyTorch (or Tensorflow or Keras) are the real options.
What kind of performance can you get with micrograd? It looks to be a few python files (and no external library calls?), so I imagine it's insanely slow.
At least with TensorFlow/PyTorch there are some calls to C in the dependencies, so the slowdowns there come from the loops and such written on top of that. What's the deal here?
I like this guy mostly for preserving a hacker-ish culture in the atmosphere and for just being a real human being in public. Too often do I feel like I'm just interacting with interfaces to corporations, or with people who self-censor so much they barely feel human. I'm not sure this library matters much to me, but I maintain a positive perspective on this guy for these reasons.
It's crazy that his company (comma ai) and its actions stick out so much. They're in the auto-driving space and they have a real product that you can buy today, use for better lane-assist, and hook up to most newish cars.
It's open source so you could run it on your own hardware for free.
You have a complaint or question? Just check out their discord where you can talk to engineers working there instead of some outsourced chat bot.
They just figured out the minimum you would have to do to control a car and are making incremental improvements with the insane business strategy of charging more than it takes for them to build it.
Oh and if you're a business executive and want to partner, you can schedule a 30 minute phone call with comma's VP of Business Development for $1,000!
Having gone to college with him and known this guy personally from then, you would have a much different opinion... Maybe he has grown up, but back then... yeeesh
I mean, he's obviously not a normal guy. I don't know what you expect. From a recent profile:
> Apart from a criminal streak, Hotz shares with Raskolnikov, Dostoyevsky’s antihero, a predilection for instrumental reason and an urge to test his own mettle, to know himself by knowing his limits. As a young adult Hotz allowed himself to become addicted to prescription opiates almost as an experience in self-mastery. “I did it, I was addicted, and I quit,” he told me. “I think I had to have that experience. I don’t think I ever could have been the type who never tried it. Because in some ways I feel that if I’m not strong enough to defeat that and overcome it…” He paused for several beats before assuring me he’d never want anyone to follow his example. “In order to quit,” he continued, “it required me to rethink what I wanted out of life. After that, one of the biggest things that changed is I stopped caring about money.”
Yes, reflection is still part of the norm. Luckily... :)
Normal is what reflects the norm. That the natural, observational norm (type, mode) and the deontic, optimal norm ("as it should be") so typically ("normally") diverge - so that the latter is found out in the standard deviation - comes from the very point that was raised initially: growth («To experience, voluntarily, then grow, is the norm»), or the point where you are in it, proceeding towards the right extreme.
There is no escape from the norm (and its negative), you see: if good, then the optimal norm, peripheral in the curve; if lacking, then the observational norm in the centre. And between the two there is a sort of continuity, thresholds aside...
That the "world" looks so abnormal, in the bad way (hence you can call the normal 'abnormal'), is justified in such a framework - especially when you look at it as a playground. And we just say, ok, if it were only possible to reduce the collateral damage...
As someone who somewhat went down that path, the answer for me was "no, I can't quit on my own and man this 'experiment' has done some serious damage to my life"
I'm doing better now! The buprenorphine injection has made my life so much better.
Of course my trauma was one of the real driving forces behind that "experiment" and thought process. Really, my "let's find out what it's like" was a rationalisation, it seems.
I've heard that real serious addictions are never just about the drugs, but also about underlying issues. It makes sense that someone without those issues would have an easier time quitting.
I'm in a similar situation, but not with opioids.
I don't think that this is a crazy idea at all. We're just testing our limits and seeing if we are as smart as we say.
I have gone to college with similar overachievers/braggers, and now that you mention it, if they were in the spotlight as much as he is, I'd roll my eyes pretty hard. Still, having not gone to college with this particular one, somehow I can stand it.
I cannot take this criticism seriously unless you are more specific than “yeesh.” If you aren’t willing to be specific then better to not say anything at all.
If he's being held up as some shining beacon of the "hacker-ish culture" as the grandparent poster is doing, it would be nice if he weren't also a jerk.
Everyone should kill their heroes. They almost never live up to the pedestals people place them on. They are all just people and have flaws. They don’t have to be some perfect person to do cool things we can respect.
I respect geohotz' achievements (generally). I would never hold him up as an example of hacker culture, or his persona as one to emulate. There is a difference between these things.
Linus used to be a massive jerk until he fixed his attitude. Hackers can grow.
I've got massive respect for everybody who addresses their own problems and fixes them. Way too many people only look for the problem in other people, but it's never that simple.
They are mostly the people you never hear about in mainstream media. The quiet engineers and tinkerers of the world. Guys like Fabrice Bellard, jaquesm of hn fame, Stuff Made Here (YouTube). There are a lot of prolific hackers out there that have made or are making contributions.
The poster probably meant that "«it would be nice» - yet largely irrelevant". You can abstract from those "personality traits" as long as """he is one who delivers""" - respectfully saying, as he is not obliged: he provided us with another tool, for our use, for free: he is a benefactor.
I don’t think of him as shining. I like that he’s a real person, flaws and all. Not someone to idolize. Many great hackers that I know, and great hackers in general, are total jerks (I suspect the trait helps with wrangling a machine somehow, but idk). I wouldn’t find it accurate to my experience if we had some polished, perfect-optics guy as a representative (not that that’s what he is).
It's funny you say this, because the repo mentions Andrej Karpathy's repo. I absolutely love his work, especially his ML tutorials, but there was one Tesla automation day where he shuts down this guy asking a question with robotic efficiency. I wish I could find it.
Well, you got your wish. Gonna be honest and say I’m not sure I’m enjoying HN's move to follow Twitter and Facebook in micromanaging discussion to this degree. I really don’t think we were being uncivil in having a discussion about what we appreciate about this guy.
Completely disagree - limiting lines of code as an ethos makes as much sense as using lines of code written as a business metric.
What if they want to add a new feature that takes another ~1000 LoC? Can they just write it as a separate library and include it as a dependency? IMO trying to minimize LoC creates a perverse incentive to split up your package when it might not need it (in the same way maximizing LoC creates a perverse incentive to write the most verbose code possible, not necessarily the most readable/maintainable)
It trades off utility for managing scope creep. Keeping this code base tiny necessarily means that anything that uses it would need to write more code. If the point is a demonstration instead of a usable library, then that is good because it is much easier to follow without excess complexity.
Necessarily is a strong word; there are assuredly many users of this tool who find it sufficient in its scope as it is, based on the project activity. If a user needs significant feature additions they can write them as a separate tool that interacts with this one.
You say 1000 lines is an arbitrary number. Do you actually have a feature in mind that takes it above this artificial limit? Did you even read the code?
Subject: here's two extra lines of precious code (#307)
diff --git a/tinygrad/__init__.py b/tinygrad/__init__.py
index 0deab3e9..31ac75d5 100644
--- a/tinygrad/__init__.py
+++ b/tinygrad/__init__.py
@@ -1,3 +1 @@
-import tinygrad.optim
-import tinygrad.tensor
-import tinygrad.nn
+from tinygrad import optim, tensor, nn
There's a difference between "I will refuse features like X/Y/Z" and "I want the length of the code file to be N lines at most". The former tells you both explicitly and implicitly which features not to bother contributing. The latter is just nonsense.
The "actual code" is being golfed for no other reason than to keep the line count down, for the sake of a promise that never needed to be made. This commit recovered a total of seven lines wherein new code can be added, in exchange for readability. It is not just a quirk, not just some sort of interesting aside like one could argue for suckless' `dwm`: this promise actively makes the code worse over time, for no reason.
No offense taken and no offense intended at all friend.
In my opinion we are down the rabbit hole of arguing style over substance. I just think you are grasping at straws. Tools like black (if there was a just god it would just be part of the language, like gofmt) can easily settle stylistic choices.
Really not worth endlessly debating, what next, tabs vs spaces?
This commit seems fine to me as well. Consider the golfing serves the purpose of "the lols" if you must. Heck this particular commit looks slightly better with the change!
Excessive golfing to the detriment of code quality over time is a well known and wholly irrelevant point to make with something like Tinygrad.
If you are so convinced of what you are saying, fork and rewrite it as verbosely as you prefer and demonstrate some significant improvement to the project quality. Would be insightful.
I think I've proven my point well enough here by making it clear that in order for them to stay under 1kLOC they have had to make golfing-style changes that cannot be automated.
If I were hell-bent on convincing you, I could fork their repo and add literally any feature requiring more than ten LOC and win. They would have to bend over backwards to get enough lines back to be able to merge whatever it is I wrote, with exponentially worse effects per individual line contributed.
But I'm not hell-bent on convincing you. You know that I'm not actually going to write code for you. You might think it's a clever rhetorical strategy, but it's not. Because I have other things to do, like a day job and personal projects, whose preemption of my jumping through your hoop are completely disconnected from the veracity of my line of reasoning.
And it isn't a rhetorical strategy. It is Socratic questioning.
Should you sit down and attempt to write something out of spite (an excellent motivator, as good as any) it would go slightly differently than how you theorize.
In order to add a feature requiring more than ten LOC you'd actually have to come up with one in the first place :)
That would require reading Tinygrad, which shouldn't take very long for obvious reasons. Probably we've been arguing about this for longer.
At which point you'd realize there is not much to add to it and you are splitting hairs over whether it is 1200 lines or 999 lines.
In which case you could just run black on it and leave it at whatever line count it spits out. That wouldn't even require any additional coding on your part!
Then you can compare the black'd version to theirs and realize they made a very, very small sacrifice as a lark. Thereby finally understanding the difference between the principle of "code golf bad" and the reality of "tinygrad l33t demo", you ol' fuddy duddy.
Socratic questioning involves questioning, not unexplained assertions. You definitely don't come off as a modern-day Socrates when you use the latter.
I'm not against something like a 64k or 4k intro competition, because the results are generally that you get to see someone take advantage of every inch of their machine, like in https://linusakesson.net/scene/a-mind-is-born/, filing down an executable byte by byte until it does what they wanted it to do. That's rad as hell.
The results here are that... someone has filed down their Python program in a way that makes it smaller. Not even in number of bytes, just in number of lines. In a really uninteresting way. It's not even like the Python examples on StackExchange's codegolf in that sense, because we could go the complete opposite route and merge most of these statements together with semicolons. It doesn't become more interesting as it gets smaller, or less interesting as it gets bigger. It's the same program regardless of how many lines it occupies, which makes it that much less interesting.
So, yes, to some extent we agree, you can just run this through automated tools and it will be as many or as few lines as you like. I see that as pointless because other restrictions would be much more stimulating, and not restricting yourself so heavily would probably lead to better (or at least more legible) results.
But sure, that'll just be this fuddy duddy's opinion.
> The results here are that... someone has filed down their Python program in a way that makes it smaller. Not even in number of bytes, just in number of lines. In a really uninteresting way.
Egads. No, it isn't the focus of the repo. You keep on missing the forest for the trees.
Tinygrad expresses its ideas succinctly and clearly. It is not an exercise in code golfing. Close to the end of feature completeness they noticed it was hovering around a round number and spent a minor effort pushing it down slightly for street cred.
Those changes did not impair readability or harm the project in any way. They did not go overboard. You say they did, I reckon, just because you are allergic to any such changes - that is your personal subjective opinion. A waste of time, you say! Bah humbug!
Well, yeah, so? Heh.
I say run it through black and make it objective, it is a great tool. It will reveal just how minor the difference is in this specific case.
I keep trying to focus on the substance. Did you actually read this project? Not through the eyes of a human linter? Was it difficult to follow? What are we arguing about if not the absolutely least interesting aspect of it really?
Dev dependencies and release dependencies are distinct. There would be no sense in complaining about hypocrisy if a zero-dependency C project uses clang-format or valgrind, and similarly for a python project using black or mypy.
I think it's a bit unnecessary. A small set of operations should be enough to make it easy to port to new accelerators, and there's lots of nice to haves in many programs that don't affect compatibility.
If you don't plan on ever doing anything but the core it does seem pretty reasonable though.
How about we show simple admiration and appreciation?
We could learn way more if we try to understand the actions rather than criticize.
He's a nerd. If we are here, we are nerds too.
Keep up the nerd sht geohot!