Hacker News
Tinygrad (github.com/geohot)
405 points by tosh on April 4, 2022 | 144 comments



I find this related page more interesting: A Breakdown of AI Chip Companies https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre...

I especially like that he outlines an actual plan for an AI chip startup that he thinks will work, and has an update explaining why he was subsequently convinced that it wouldn't work.


I read this article rather than the one actually posted. It has a lot of good points but also gets a fair amount wrong, and the correction on the power usage is just the tip of the iceberg. That's why WaferScale do what they do: the power cost of off-die vs. on-die is massive, and enabling everything to be on chip means you can feed the design more easily and with less power.

For example, Nvidia's compute and consumer GPU lines diverged a long time ago. Modern A100s have literally only one SM capable of doing normal GPU tasks, probably to support running a display on whatever Quadro version they end up increasing. They diverged in really specific ways, for example the P100 has hardware scheduling, whereas the 1080 does not (in the same way at least).

Another issue is that the author spends a long time talking about how important software and ecosystem are, then completely misses that point when talking about their own chip: just because it is RISC-V and compilers exist for that arch does not make it equal to CUDA. Also, big re-order buffers cost area and heat that could be spent on more SMs. That's why, in order to beat Nvidia, you must get more specialized. They've picked their niche on the CPU-GPU-ASIC continuum, and beating them at the same process node requires ditching some of the stuff in an Nvidia GPU. Which is why they've been specializing their arch with tensor cores.

It just also turned out those are useful for gaming, with deep learning used to upres the graphics, as that's easier to accelerate than driving quadratically higher resolutions.


> for example the P100 has hardware scheduling, whereas the 1080 does not (in the same way at least).

what does hardware scheduling mean in this context?


The detailed follow-up to that post is here: https://geohot.github.io/blog/jekyll/update/2021/12/12/a-cor...


Interesting! I think Cerebras is exciting too, the problem is that it's so expensive that there will never be a software ecosystem for it. The people who would develop it will never have access to one.


Yep, it's hard to see how Cerebras will ever have much of an ecosystem, since hardware makers are traditionally not very good at building and maintaining one by themselves. But they're probably aiming at very specialized customers only.

Geohot is right about the AI accelerator market's problems, and a competitive 4-digit-dollars device is a great idea even if his initial strategy was way off. Although you could say almost the same thing about the high-performance CPU and GPU duopolies (Apple's chips don't count due to their proprietary OS lock-in, although I wish Asahi Linux luck at fixing that).

[edit] And beyond that, you have TSMC dominating the next-gen fab market, too.


This is the guy that did some iOS jailbreaks, reverse engineered the PS3 and now runs a self driving car startup.


And the shady cheapETH business last year: https://oldreddit.com/r/cheapETH/comments/lkzkso/george_hotz...


Adding a comment as I don't think this is a fair representation of George. He livestreamed the creation of cheapETH as a technical demonstration of web3 development[0] and continually talks about it being worthless while developing it. He's also on record as being a serial 'no-coiner'.

[0] https://www.youtube.com/watch?v=9LaIezgiUmw


> continually talks about it being worthless

Kinda weird to try to sneak millions of a worthless coin into your wallet.


That was the whole point of the stream.


That is how you keep cheapETH cheap


Here's his response: https://cheapeth.org/whalegate.html It casts the situation in a rather different light - you should read it.



I had to replace oldreddit by reddit to make that link work...


I'm thinking he meant to link to old.reddit which forces the old (read: non-broken) reddit UI


Yeah should be old.reddit.com, not oldreddit.com.


Who is goku?


I'm not certain but I think it's this guy: https://twitter.com/tomkysar


What makes you think this is the guy? I could not find any reference to geohot in his tweets.


He was in Vegas hanging out with Tom when he was streaming cheapETH creation. You can hear another person in this stream, pretty sure that's him: https://www.youtube.com/watch?v=9LaIezgiUmw&t=1029s&ab_chann...

Also posted a picture of them together in a Ferrari around this time lol (taken down)


So he's a two-bit scammer too.


Not sure if it's a startup when it's profitable and not looking for investors.


Long enough ago that it was iPhone OS jailbreaks.


I watched a few of George's live streams, and I'm pretty impressed with his coding skills and determination to solve a problem.

In the spirit of learning, does anyone else on his level do live streams or have a YouTube channel?

Here's his last 7 hour stream coding Tinygrad.

https://www.youtube.com/watch?v=MeE4Y2862FY


He takes a random IQ test in the middle of it, lol.


He inspired me to look up the research on Raven Progressive Matrices: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68...


do you have the minute of that?


4th hour, 1 minute


> tinygrad will always be below 1000 lines. If it isn't, we will revert commits until tinygrad becomes smaller.

I applaud this. Committing to keeping a project small and simple.

So many projects start small and simple, and before long they've been extended in many different directions and now have thousands of options and things to understand before you can get started.


Are there other examples of famous projects that do that, limiting themselves to an X-amount of LOC? Last one I remember was TempleOS, although Terry went a bit over his limit (100k LOC).


https://github.com/aosabook/500lines

>Each chapter consists of a walkthrough of a program that solves a canonical problem in software engineering in at most 500 source lines of code. We hope that the material in this book will help readers understand the varied approaches that engineers take when solving problems in different domains, and will serve as a basis for projects that extend or modify the contributions here.


"dwm is only a single binary, and its source code is intended to never exceed 2000 SLOC." https://dwm.suckless.org/


Algos run the world.

"In bioinformatics, BLAST (basic local alignment search tool) is an algorithm and program for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA and/or RNA sequences." - could it be used to identify and group similar structures in NNs? (https://en.wikipedia.org/wiki/BLAST_(biotechnology))

I've been thinking of some kind of visual representation of weights, graphs etc. Images and evolving images as the substrate of the neural network. Pixels on a plane, planes affecting each other, activation recorded as brightness/color. Then we can use visual algorithms like SIFT (https://en.wikipedia.org/wiki/Scale-invariant_feature_transf...) to do cool stuff and grow better and better graphic-based NNs.


Using bioinformatics tools to understand neural networks. What a fascinating idea! It probably makes sense to go for the direct application of string and pattern matching methods from stringology. BLAST is very much meant for DNA and protein biosequences.

edit: It would be cool to see the evolution of neural networks over their training, and in transfer learning. Comparative neural network genomics.


Yeah I want extreme cross-disciplinary collaboration and new ideas being tried constantly



Minitorch is intended to be both an engine and high quality didactic material, from Cornell University, for the course 'Machine Learning Engineering'.

> the full student code for minitorch. It is designed as a single repo that can be completed part by part following the guide book

> Basic Neural Networks and Modules ; Autodifferentiation for Scalars ; Tensors, Views, and Strides ; Parallel Tensor Operations ; GPU / CUDA Programming in NUMBA ; Convolutions and Pooling ; Advanced NN Functions

See: http://minitorch.github.io/


I am reminded of Andrew Tanenbaum's Minix os: before Linux became the belle of the ball, minix implemented many unix functionalities in a less efficient but clearer way than existing unix builds (many of which were not open source) and the still nascent bsd/linux/gnuos/etc, and it meshed nicely with his textbooks on OS design. It was functional on its own, and by Minix 3 was a full os in its own right (Intel still uses it in various ways) but balanced pedagogy with performance.

https://en.wikipedia.org/wiki/Minix gives more details.



Look man, this is just too tiny. People are upset in this thread about the excessively short line count of OP as it is. What is this, some sort of autograd for ants?

It is freaking me out, remove this link.


Lol


I'd like to ship a video game that does machine learning and trains on its experiences with the player. Yes, I know there are many potential problems with this.

What is the best way to ship training code in a game? Do I embed Python and PyTorch or something? Do I code my own NN training algorithm? Do I use a library such as Tinygrad?


If your game has enemies, you could give each unit its own brain. Behind the scenes you could have a "gene pool". When a unit dies it's removed from the pool. When it performs well (eg. damages player or wins the game) the unit can reproduce. Then you add the usual stuff (mutate parameters, cross-breeding fittest units).

For training you could just make two AI teams play against each other (off the cuff, 10-100k games should do the trick). Once you release the game, you could sample the best AI from each player's machine. Free distributed neuroevolution cluster ;)

I've wanted to do this for a while but I haven't made any proper games yet. Someday!
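
A minimal sketch of the gene-pool scheme described above, assuming each unit's "brain" is just a flat list of float weights and that the game can report a fitness number per unit. The names `mutate`, `crossover`, `next_generation` and the toy fitness function are hypothetical, chosen for illustration, and not taken from any engine or library.

```python
import random

def mutate(genome, rate=0.1, scale=0.5, rng=random):
    """Return a copy of genome with Gaussian noise added to some genes."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in genome]

def crossover(a, b, rng=random):
    """Uniform crossover: each gene comes from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def next_generation(pool, fitness, keep=0.25, rng=random):
    """Keep the fittest fraction of the pool and refill it with mutated offspring."""
    ranked = sorted(pool, key=fitness, reverse=True)
    survivors = ranked[: max(2, int(len(ranked) * keep))]
    children = []
    while len(survivors) + len(children) < len(pool):
        mom, dad = rng.sample(survivors, 2)
        children.append(mutate(crossover(mom, dad, rng), rng=rng))
    return survivors + children

# Toy stand-in for "how well did this unit do against the player":
# genomes closer to TARGET score higher (assumption for the demo only).
TARGET = [1.0, -2.0, 0.5]

def toy_fitness(genome):
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))

rng = random.Random(0)
pool = [[rng.uniform(-3, 3) for _ in TARGET] for _ in range(40)]
initial_best = max(map(toy_fitness, pool))
for _ in range(50):
    pool = next_generation(pool, toy_fitness, rng=rng)
final_best = max(map(toy_fitness, pool))
```

Because the survivors are carried over unchanged, the best fitness in the pool can never get worse from one generation to the next. In a real game the fitness would come from play results (damage dealt, wins) rather than a fixed target.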


There's actually a lot of research on this topic. You should look into "Genetic Programming" on scholar.google.com and you'll find some good presentations.

For example https://scholar.google.com/scholar?hl=en&as_sdt=0%2C21&q=gen... and that references a paper where they do this with MUGEN (Ai v Ai fighting game) http://irep.ntu.ac.uk/id/eprint/30021/1/PubSub7423_8186_Mart...

Genetic programming sounds like a complicated term, but it's basically an easy way to take successful characteristics and breed them into something else.


Typically, these libraries are unusable for this purpose because you don't want to ship a Python interpreter with your video game.

I usually prefer to rewrite my training step as a pure function, so the model weights are just inputs and the gradient updates are outputs.

You need to serialize your computation graph in some way, so it can be run in C++ or some other low-level language. TensorFlow is known for doing this well since it was an original design goal of the project. Some of the other frameworks that originally targeted researchers make this harder. Most mature frameworks have some way of doing this now, though, and projects like https://onnx.ai/ may solve this in general.

It gets more complicated if your model has dynamic control flow, but you get the idea.
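
A minimal sketch of what "training step as a pure function" can look like, assuming a tiny linear model with squared-error loss and plain Python lists instead of any framework. The name `train_step` and its signature are illustrative assumptions. This version returns the updated parameters rather than the raw gradients, which is a small variation on the idea above.

```python
def train_step(weights, bias, xs, ys, lr=0.1):
    """Pure function: (parameters, batch) -> (new parameters, loss).

    Nothing is mutated and no global state is read, so the same arithmetic
    could be ported to, or exported for, a runtime without Python.
    """
    n = len(xs)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    loss = 0.0
    for x, y in zip(xs, ys):
        pred = sum(w * xi for w, xi in zip(weights, x)) + bias
        err = pred - y
        loss += err * err / n
        for i, xi in enumerate(x):
            grad_w[i] += 2.0 * err * xi / n
        grad_b += 2.0 * err / n
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    new_bias = bias - lr * grad_b
    return new_weights, new_bias, loss

# Fit y = 2*x + 1 on a tiny batch.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = [0.0], 0.0
losses = []
for _ in range(500):
    w, b, loss = train_step(w, b, xs, ys, lr=0.05)
    losses.append(loss)
```

Since all state flows through the arguments and return values, the caller decides where the weights live (a save file, for instance), and the step itself is easy to test in isolation.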


It's likely worth looking into federated learning. You could have a core model, which then trains locally on player inputs. The federated learning research space is focused on this sort of small model update on phones, so should have plenty of people who have thought hard about the problem of keeping binary sizes small and training compute manageable.

https://arxiv.org/abs/1909.11875
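
A minimal sketch of the aggregation step in federated learning, assuming each player's machine sends back a locally trained weight vector plus the number of samples it trained on. `federated_average` is a hypothetical helper written for this example, not an API from any federated learning library.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three players trained the same 2-parameter model on different amounts of data.
updates = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
sizes = [10, 30, 60]
global_weights = federated_average(updates, sizes)
```

Players with more local data pull the shared model further toward their weights. Only the weight vectors leave the machine, not the raw gameplay data.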


You probably want a lightweight reinforcement learning library in C/C++ e.g. something from here https://github.com/search?l=C%2B%2B&q=reinforcement+learning...

You could also compile a neural net into a less python-tied format e.g. ONNX or torchscript. In general a siloed pytorch env would be massive, I'm assuming at least a gig or two.


I suppose you could collect training data during gameplay and process it after the fact. Then you can use the heavy python framework, but it doesn't get in the way. Have the user run the training program on the accumulated data whenever they want to increase the level of customization.
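
A minimal sketch of the "collect now, train later" approach, assuming samples are stored as one JSON object per line. The field names (`state`, `action`, `reward`) and the helpers `log_event` and `load_events` are made up for illustration. The game side only needs the writer; the reader would live in the separate training program.

```python
import io
import json

def log_event(stream, state, action, reward):
    """Append one gameplay sample as a JSON line."""
    stream.write(json.dumps({"state": state, "action": action, "reward": reward}) + "\n")

def load_events(stream):
    """Read every logged sample back for offline training."""
    return [json.loads(line) for line in stream if line.strip()]

# io.StringIO stands in for a file on the player's disk.
log = io.StringIO()
log_event(log, [0.1, 0.2], "jump", 1.0)
log_event(log, [0.3, 0.4], "duck", -1.0)
log.seek(0)
samples = load_events(log)
```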


In theory, one could use Arraymancer to build the tools to do this, and not have to ship a Python interpreter. That said, I've not used it in anger.

https://github.com/mratsim/Arraymancer


I think video games already do this; video game AI existed before deep learning, it just used things like decision trees and MCMC.


Doesn't forza the racing game do this?

Edit: spelling


Do you know how to program? I have no idea what you're asking.


Yes, I know how to program and know a good amount about machine learning in Python. The problem is these machine learning frameworks are rather heavy to be shipping around, although I guess you could do it. I was wondering if there are any lighter-weight machine learning libraries, which might not have all the bells and whistles, but would be better to ship and train models on consumers' computers. Most of the time people ship pretrained models, but don't actually update that model on the end user's computer.


Client side training with the usual python stack sounds like a configuration nightmare. Would it try to use the user's GPU if it exists? Would it fall back to CPU? Does your user need a specific GPU card to play?


I think the real issue here is with the training speed. Presumably a GPU is required to run the game, but is it restricted to just NVIDIA GPUs/a subset of those?

I don’t know of any such libraries offhand and my guess would be that’s because the size of the library generally matters less if there’s a requirement to have a powerful GPU.


I like his honest streams. Also this geohot can code for 12 hours straight. That's just amazing. If you take into account his speed of thought, it's more like 36 hours.


For a novice in this space, can anyone provide a link to an article providing a comparison of: Pytorch, Pytorch lightning, tinygrad, micrograd.....? I would like to point my clients to a reference here. It's no longer enough to say "PyTorch" I guess.


PyTorch lightning is just a convenience wrapper for PyTorch. The rest are toy re-implementations of a very small subset of PyTorch features, likely much slower and certainly less optimized overall.

PyTorch (or Tensorflow or Keras) are the real options.


also Jax, hopefully



I fixed some spelling mistakes: https://github.com/geohot/tinygrad/pull/314


What kind of performance can you get with micrograd? It looks to be a few python files (and no external library calls?), so I imagine it's insanely slow. At least with TensorFlow/PyTorch there are some calls to C in the dependencies - so the speed decreases there are from the loops and such written on top of that. What's the deal here?

Why not just use Julia :)



Upvoted for geohot


I like this guy mostly for preserving a hacker-ish culture in the atmosphere and for just being a real human being in public. Too often do I feel like I'm just interacting with interfaces to corporations or people who self censor themselves so much they barely feel human. I'm not sure this library matters much to me but I maintain positive perspective on this guy for these reasons.


It's crazy that his company (comma ai) and actions stick out so much. They're in the auto-driving space and they have a real product that you can buy today and use it for a better lane-assist and hook it up to most newish cars.

It's open source so you could run it on your own hardware for free.

You have a complaint or question? Just check out their discord where you can talk to engineers working there instead of some outsourced chat bot.

They just figured out the minimum you would have to do to control a car and are making incremental improvements with the insane business strategy of charging more than it takes for them to build it.

Oh and if you're a business executive and want to partner, you can schedule a 30 minute phone call with comma's VP of Business Development for $1,000!


Having gone to college and knowing this guy personally from then, you would have a much different opinion... Maybe he has grown up, but back then... yeeesh


I mean, he's obviously not a normal guy. I don't know what you expect. From a recent profile:

> Apart from a criminal streak, Hotz shares with Raskolnikov, Dostoyevsky’s antihero, a predilection for instrumental reason and an urge to test his own mettle, to know himself by knowing his limits. As a young adult Hotz allowed himself to become addicted to prescription opiates almost as an experience in self-mastery. “I did it, I was addicted, and I quit,” he told me. “I think I had to have that experience. I don’t think I ever could have been the type who never tried it. Because in some ways I feel that if I’m not strong enough to defeat that and overcome it…” He paused for several beats before assuring me he’d never want anyone to follow his example. “In order to quit,” he continued, “it required me to rethink what I wanted out of life. After that, one of the biggest things that changed is I stopped caring about money.”

https://return.life/2022/03/07/george-hotz-comma-ride-or-die...


> not a normal guy

I assure you some people try themselves - and I do not see what is not "normal" about it. To experience, voluntarily, then grow, is the norm.


I don't know. I ran the idea of purposely getting addicted to opioids to see if I can quit by my wife and she assured me I was crazy.


This direction goes off-topic, but: look, if you married her,

-- either you go along well, hence that she agrees with your opinion is a weak test;

-- or you married her to test yourself, which would prove my point.

TND; QED. /J


Maybe she married him to test herself?


What does TND mean?


("Tertium Non Datur" - that analyzed options are exhaustive)


Meaning this only neutrally, and completely respectfully: have you considered that you're a bit abnormal as well?


Yes, reflection is still part of the norm. Luckily... :)

Normal is what reflects the norm. That natural, observational norm (type, mode) and deontic, optimal norm ("as it should be") so typically ("normally") diverge, so the latter is found in the standard deviation, comes from the very point that was raised initially: growth («To experience, voluntarily, then grow, is the norm»), or the point where you are in it, proceeding towards the right extreme.

There is no escape from the norm (and its negative), you see: if good, then optimal norm, peripheral in the curve; if lacky, then observational norm in the centre. And between the two there is a sort of a continuity, thresholds aside...

That the "world" looks so abnormal, the bad way (hence you can call 'abnormal' the normal), is justified in such framework - especially when you look at it as a playground. And we just say, ok, if it were possible just to reduce the collateral damage...


The only normal people are the ones faking it


As someone who somewhat went down that path, the answer for me was "no, I can't quit on my own and man this 'experiment' has done some serious damage to my life"

I'm doing better now! The buprenorphine injection has made my life so much better.

Of course my trauma was one of the real driving forces behind that "experiment" and thought process. Really, my "let's find out what it's like" was a rationalisation it seems.


I've heard that real serious addictions are never just about the drugs, but also about underlying issues. It makes sense that someone without those issues would have an easier time quitting.


I'm in a similar situation, but not with opioids. I don't think that this is a crazy idea at all. We're just testing our limits and seeing if we are as smart as we say.


Agreed. To test my belief in probabilities, I occasionally challenge myself to a few rounds of Russian Roulette.


And what if you aren’t?


[flagged]


Well it seems to be for many, many people


The hubris of framing it that way is kind of wild.


I have gone to college with similar overachievers/braggers, and now that you mention it if they were in the spotlight as much as him I'd roll my eyes pretty hard. Still though, having not gone to college with this particular one somehow I can stand it.


I cannot take this criticism seriously unless you are more specific than “yeesh.” If you aren’t willing to be specific then better to not say anything at all.


Who cares if he’s a nice guy?


If he's being held up as some shining beacon of the "hacker-ish culture" as the grandparent poster is doing, it would be nice if he weren't also a jerk.


Everyone should kill their heroes. They almost never live up to the pedestals people place them on. They are all just people and have flaws. They don’t have to be some perfect person to do cool things we can respect.


I respect geohotz' achievements (generally). I would never hold him up as an example of hacker culture, or his persona as one to emulate. There is a difference between these things.


Is hacker culture not filled with "weird jerks"? Linus doesn't qualify? Who does?


Linus used to be a massive jerk until he fixed his attitude. Hackers can grow.

I've got massive respect for everybody who addresses their own problems and fixes them. Way too many people only look for the problem in other people, but it's never that simple.


Out of curiosity who would you show as an example of hacker culture?


They are mostly the people you never hear about in mainstream media. The quiet engineers and tinkerers of the world. Guys like Fabrice Bellard, jaquesm of hn fame, Stuff Made Here (YouTube). There are a lot of prolific hackers out there that have made or are making contributions.


Why the need to conflate personality traits with skill?

If I have a rare disease, I don't care if the doctor is nice. I want them to get the diagnosis correct.

Van Gogh painted brilliantly. And he cut off his ear.

Eminem is a great rapper. His themes can be violent.


The post I replied to was one about personality traits, not skill.


The poster probably meant that "«it would be nice» - yet largely irrelevant". You can abstract from those "personality traits" as long as """he is one who delivers""" - respectfully saying, as he is not obliged: he provided us with another tool, for our use, for free: he is a benefactor.


I don’t think of him as shining. I like that he’s a real person, flaws and all. Not someone to idolize. Many great hackers that I know and in general are total jerks (I suspect the trait helps wrangling a machine somehow but idk). I wouldn’t find it accurate to my experiences if we had some polished perfect optics guy as representative (not that that’s what he is)


> I suspect the trait helps wrangling a machine somehow

Disruptiveness, for lack of a better term. These are people who are well-acquainted with finding the boundaries of systems and barreling through them.


Talented but bad people breed a new generation of unnecessary untalented bad people.


Maybe you were the one who sucked.


I'm guessing only child syndrome + rich as hell parents.


Jealousy tends to make you react that way


Envy. Some changes in language are fine, but these are very different things and it's nice to have words for both of them.


It's funny you say this because the repo mentions Andrej Karpathy's repo. I absolutely love his work, especially his ML tutorials, but there was one Tesla automation day where he shuts down this guy asking a question with robotic efficiency. I wish I could find it.


This library is so cool. It shows where the value is. Just like commaai . This is how we move forward. Cut to the core and share


I wish Dang would just kill this sub-thread. What a mess.

A whole amateur dissection of this guy's personality. It's so bloody awkward, just comment on the work and move on.

But it's great to know HN readers are all such paragons of virtue.


Well you got your wish. Gonna be honest and say I’m not sure I’m enjoying HN's move to follow twitter and facebook to micromanage discussion to this degree. I really don’t think we were being uncivil in having a discussion about what we appreciate about this guy.


>tinygrad will always be below 1000 lines. If it isn't, we will revert commits until tinygrad becomes smaller.

Love this ethos so much. Wish more projects would follow suit.


Completely disagree - limiting lines of code as an ethos makes as much sense as using lines of code written as a business metric.

What if they want to add a new feature that takes another ~1000 LoC? Can they just write it as a separate library and include it as a dependency? IMO trying to minimize LoC creates a perverse incentive to split up your package when it might not need it (in the same way maximizing LoC creates a perverse incentive to write the most verbose code possible, not necessarily the most readable/maintainable)


> What if they want to add a new feature that takes another ~1000 LoC?

They don't.

The LoC limit exists as much to avoid scope creep and maintain focus as it is about anything else.


It trades off utility for managing scope creep. Keeping this code base tiny necessarily means that anything that uses it would need to write more code. If the point is a demonstration instead of a usable library, then that is good because it is much easier to follow without excess complexity.


Necessarily is a strong word; there are assuredly many users of this tool who find it sufficient in its scope as it is, based on the project activity. If a user needs significant feature additions they can write them as a separate tool that interacts with this one.


This one arguably does have a bit of scope creep it's just in a separate directory.


Maybe don’t add that feature then? Not every library is meant to solve all problems


No, but every library tries to solve at least _one_ problem, and maybe it needs to go over 1000 lines to do that.


This one doesn't, apparently.


Not all software is intended to grow indefinitely. Keeping things small and grokkable is very valuable for learning.


This is a semi-educational project, so it makes sense to keep the number of lines small.


It serves an important gatekeeping function by discouraging low-effort contributions.


simplicity is a virtue. the project goal apparently isn't to have a lot of features, it's to do the core thing in the simplest way possible.


And if someone does some code golf to stay under the 1000 line limit rather than go to 1050 lines that’s an improvement?


Then the human responsible for approving the PR will decide whether it's worth it. Personally, I love it.


no that's another bloated ml library with code compression.

i suspect a part of the motivation for this library is learning and teaching.

and maybe a little flexing on overcomplicated autograd libraries.


You say 1000 lines is an arbitrary number. Do you actually have a feature in mind that takes it above this artificial limit? Did you even read the code?

They like this round number. Fork. Why argue.


It's how you get commits like these, though https://github.com/geohot/tinygrad/commit/050636bcb1068beff6...

  Subject: here's two extra lines of precious code (#307) 
  
  diff --git a/tinygrad/__init__.py b/tinygrad/__init__.py
  index 0deab3e9..31ac75d5 100644
  --- a/tinygrad/__init__.py
  +++ b/tinygrad/__init__.py
  @@ -1,3 +1 @@
  -import tinygrad.optim
  -import tinygrad.tensor
  -import tinygrad.nn
  +from tinygrad import optim, tensor, nn

There's a difference between "I will refuse features like X/Y/Z" and "I want the length of the code file to be N lines at most". The former tells you both explicitly and implicitly which features not to bother contributing. The latter is just nonsense.


This is why the good lord gave us: https://github.com/psf/black

Import line is actually improved with this commit. LGTM, ship it.

We now return to your regularly scheduled bikeshedding.


You're saving yourself from one kind of golfing, but not all of them.

Additionally, I feel like installing an additional tool is kind of against the spirit of 1kLOC simplicity.


I'm getting distinct "uses Eclipse to write Vogon poetry" and "Jira Master" vibes from you.

Any notes on the actual code in the OP beyond the import lines formatting?


I'm sorry to have offended you, but I don't feel personal insults over the README of an open source project are particularly reasonable.

This affects the actual code, too! See, for example, https://github.com/geohot/tinygrad/commit/cfb7a4c41a2b6bcc09..., which includes this gem:

  diff --git a/tinygrad/ops/ops_cpu.py b/tinygrad/ops/ops_cpu.py
  index a454f56f..0686f810 100644
  --- a/tinygrad/ops/ops_cpu.py
  +++ b/tinygrad/ops/ops_cpu.py
  @@ -2,22 +2,15 @@
   from ..tensor import Function
   
   class CPUBuffer(np.ndarray):
  -  def log(x):
  -    return np.log(x)
  -  def exp(x):
  -    return np.exp(x)
  -  def relu(x):
  -    return np.maximum(x, 0)
  -  def expand(x, shp):
  -    return np.broadcast_to(x, shp)
  +  log = lambda x: np.log(x)
  +  exp = lambda x: np.exp(x)
  +  relu = lambda x: np.maximum(x, 0)
  +  expand = lambda x,shp: np.broadcast_to(x, shp)
  +  permute = lambda x,order: x.transpose(order)
  +  type = lambda x,tt: x.astype(tt)
  +  custompad = lambda x,padding: np.pad(x, padding)
     def amax(x, *args, **kwargs):
       return np.amax(x, *args, **kwargs)
  -  def permute(x, order):
  -    return x.transpose(order)
  -  def type(x, tt):
  -    return x.astype(tt)
  -  def custompad(x, padding):
  -    return np.pad(x, padding)
     def toCPU(x):
       return x
     @staticmethod

The "actual code" is being golfed for no other reason than to keep the line count down, for the sake of a promise that never needed to be made. This commit recovered a total of seven lines wherein new code can be added, in exchange for readability. It is not just a quirk, not just some sort of interesting aside like one could argue for suckless' `dwm`: this promise actively makes the code worse over time, for no reason.
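
To make the readability trade-off in that diff concrete, here is a small standalone sketch. The classes `WithDef` and `WithLambda` are made up for illustration and are not tinygrad code. Both spellings behave the same when called, but the lambda version loses its function name, which is what appears in tracebacks and profiler output.

```python
class WithDef:
    def relu(x):
        return max(x, 0)

class WithLambda:
    relu = lambda x: max(x, 0)

# Same behavior when called through the class.
same_result = WithDef.relu(-3) == WithLambda.relu(-3) == 0

# Different introspection: the lambda is anonymous.
def_name = WithDef.relu.__name__        # 'relu'
lambda_name = WithLambda.relu.__name__  # '<lambda>'
```

This is one measurable cost of the one-line form. Whether it matters for a project of this size is exactly what the thread is arguing about.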


No offense taken and no offense intended at all friend.

In my opinion we are down the rabbit hole of arguing style over substance. I just think you are grasping at straws. Tools like black (if there was a just god it would just be part of the language, like gofmt) can easily settle stylistic choices.

Really not worth endlessly debating, what next, tabs vs spaces?

This commit seems fine to me as well. Consider the golfing serves the purpose of "the lols" if you must. Heck this particular commit looks slightly better with the change!

Excessive golfing to the detriment of code quality over time is a well known and wholly irrelevant point to make with something like Tinygrad.

If you are so convinced of what you are saying, fork and rewrite it as verbosely as you prefer and demonstrate some significant improvement to the project quality. Would be insightful.


I think I've proven my point well enough here by making it clear that in order for them to stay under 1kLOC they have had to make golfing-style changes that cannot be automated.

If I were hell-bent on convincing you, I could fork their repo and add literally any feature requiring more than ten LOC and win. They would have to bend over backwards to get enough lines back to be able to merge whatever it is I wrote, with exponentially worse effects per individual line contributed.

But I'm not hell-bent on convincing you. You know that I'm not actually going to write code for you. You might think it's a clever rhetorical strategy, but it's not. Because I have other things to do, like a day job and personal projects, whose preemption of my jumping through your hoop are completely disconnected from the veracity of my line of reasoning.


That's just like, your opinion, man.

And it isn't a rhetorical strategy. It is Socratic questioning.

Should you sit down and attempt to write something out of spite (an excellent motivator, as good as any) it would go slightly differently than how you theorize.

In order to add a feature requiring more than ten LOC you'd actually have to come up with one in the first place :)

That would require reading Tinygrad, which shouldn't take very long for obvious reasons. Probably we've been arguing about this for longer.

At which point you'd realize there is not much to add to it and you are splitting hairs over whether it is 1200 lines or 999 lines.

In which case you could just run black on it and leave it at whatever line count it spits out. That wouldn't even require any additional coding on your part!

Then you can compare the black'd version to theirs and realize they made a very very small sacrifice as a lark. Thereby finally understanding the difference between the principle of "code golf bad" and the reality of "tinygrad l33t demo" you ol' fuddy duddy.


Socratic questioning involves questioning, not unexplained assertions. You definitely don't come off as a modern-day Socrates when you use the latter.

I'm not against something like a 64k or 4k intro competition, because the results are generally that you get to see someone take advantage of every inch of their machine, like in https://linusakesson.net/scene/a-mind-is-born/, filing down an executable byte by byte until it does what they wanted it to do. That's rad as hell.

The results here are that... someone has filed down their Python program in a way that makes it smaller. Not even in number of bytes, just in number of lines. In a really uninteresting way. It's not even like the Python examples on StackExchange's codegolf in that sense, because we could go the complete opposite route and merge most of these statements together with semicolons. It doesn't become more interesting as it gets smaller, or less interesting as it gets bigger. It's the same program regardless of how many lines it occupies, which makes it that much less interesting.
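To make the semicolon point concrete, here is a minimal sketch (the two snippets are made up for illustration and are not taken from tinygrad). It uses the standard `ast` module to show that joining statements with semicolons changes the line count but not the parsed program:

```python
import ast

# Hypothetical three-statement program, one statement per line.
multi_line = "a = 1\nb = 2\nc = a + b\n"

# The same statements merged onto a single line with semicolons.
one_line = "a = 1; b = 2; c = a + b\n"

def count_lines(src):
    """Number of source lines, the only metric a LOC limit looks at."""
    return len(src.splitlines())

def same_program(x, y):
    """Compare parsed syntax trees, ignoring line and column positions."""
    return ast.dump(ast.parse(x)) == ast.dump(ast.parse(y))

print(count_lines(multi_line), count_lines(one_line))
print(same_program(multi_line, one_line))
```

By default `ast.dump` leaves out position attributes, so the comparison only sees the statements themselves.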

So, yes, to some extent we agree, you can just run this through automated tools and it will be as many or as few lines as you like. I see that as pointless because other restrictions would be much more stimulating, and not restricting yourself so heavily would probably lead to better (or at least more legible) results.

But sure, that'll just be this fuddy duddy's opinion.


> The results here are that... someone has filed down their Python program in a way that makes it smaller. Not even in number of bytes, just in number of lines. In a really uninteresting way.

Egads. No, it isn't the focus of the repo. You keep on missing the forest for the trees.

Tinygrad expresses its ideas succinctly and clearly. It is not an exercise in code golfing. Close to the end of feature completeness they noticed it was hovering around a round number and spent a minor effort pushing it down slightly for street cred.

Those changes did not impair readability or harm the project in any way. They did not go overboard. You say they did, I reckon, just because you are allergic to any such changes - that is your personal subjective opinion. A waste of time you say! Bah humbug!

Well, yeah, so? Heh.

I say run it through black and make it objective, it is a great tool. It will reveal just how minor the difference is in this specific case.

I keep trying to focus on the substance. Did you actually read this project? Not through the eyes of a human linter? Was it difficult to follow? What are we arguing about if not the absolutely least interesting aspect of it really?


i think it's an improvement. there's nothing wrong with functional style.


It could be written:

    def log(x): return np.log(x)
    def exp(x): return np.exp(x)
    ...

And keep the same line count, and the functions would keep their __name__.
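A small sketch of the `__name__` difference, assuming the golfed version binds lambdas to names (I haven't checked the exact form in the commit). It uses `math.log` as a stand-in for `np.log` so it runs without numpy, and the names `log_lambda` and `log_def` are just for illustration:

```python
import math

# Assumed golfed style: a lambda bound to a name.
log_lambda = lambda x: math.log(x)

# One-line def: same line count, but the function knows its own name.
def log_def(x): return math.log(x)

print(log_lambda.__name__)  # <lambda>
print(log_def.__name__)     # log_def
```

Both compute the same thing. The difference shows up in tracebacks, profilers, and anything else that reports `__name__`.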


I would merge this PR.


Dev dependencies and release dependencies are distinct. There would be no sense in complaining about hypocrisy if a zero-dependency C project uses clang-format or valgrind, and similarly for a python project using black or mypy.


Agreed. It still leaves the loophole of writing infinitely long lines of code (or is there a line length limit?)

I also enjoyed watching the live-streamed videos https://www.youtube.com/watch?v=Xtws3-Pk69o and the videos documenting the M1 chip's neural engine (ANE) reverse engineering efforts for this project. See video links here: https://news.ycombinator.com/item?id=30852818


I think it's a bit unnecessary. A small set of operations should be enough to make it easy to port to new accelerators, and there are lots of nice-to-haves in many programs that don't affect compatibility.

If you don't plan on ever doing anything but the core it does seem pretty reasonable though.


Bonus challenge: total number of bytes monotonically decreases over time.


How about we show simple admiration and appreciation? We could learn way more if we try to understand the actions rather than criticize. He's a nerd. If we are here we are nerds too. Keep up the nerd sht geohot!



