Hacker News

I have taken several master's-level courses in machine learning -- and even with those credentials, I cannot recommend Andrej's YouTube series, "Neural Networks: Zero to Hero", highly enough. There, he teaches you, from scratch, how to build everything from the automatic gradient calculation system underlying PyTorch, all the way up to a slower version of this model, `minGPT`.

[1] https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThs...

(edit: self-promo: I'm currently working on a TypeScript follow-through of this same series of video lectures, if you want to follow along with stronger types for explanation: https://github.com/Marviel/lab-grad)
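To give a flavor of what the series builds, here is a minimal sketch of a micrograd-style scalar autograd engine. This is illustrative only, not the actual lecture code; class and method names are my own.

```python
class Value:
    """A scalar that records the computation graph so we can backprop."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()


# Usage: c = a*b + a, so dc/da = b + 1 and dc/db = a.
a, b = Value(2.0), Value(3.0)
c = a * b + a
c.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

Note the `+=` in each `_backward`: gradients accumulate, which is what makes reuse of a node (like `a` above, which appears twice) come out correct.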




I can’t believe I just spent 2 and a half hours glued to my phone in bed watching this, for absolutely no reason other than it was such an interesting intro (to a subject I’m already familiar with). Thanks for the recommendation, and thanks Andrej for making this!


How does it compare to fast.ai? As an engineer looking to learn, which should I start with?


Both are good for different things.

fast.ai is great, but it takes a top-down, rather than bottom-up, approach. It takes you from a production-level black box that you don't understand down to the details. The benefit is that you get good high-level intuition for how it behaves at the "let me use this technology for a job" level.

Separately, the fast.ai library itself is also highly recommendable -- it comes with some state-of-the-art image recognition models, and its training wrappers are really helpful, particularly for image-recognition tasks.

Karpathy's "Neural Networks: Zero to Hero" video series starts at the level of individual neurons and works you up to the final product. For some reason, both this style and Karpathy's conciseness appeal to me slightly more. I'm also super detail-oriented, though, and any level of "hand waving" (even if further explanation comes later) always bothers me. He's also got some pretty high-profile industry experience, which carries some weight with me.

But I'll say that both are really high quality. Ultimately, my recommendation would be to follow whichever one speaks most to you personally after the first hour or so.

EDIT: Per Jeremy's response below, if you want the bottom-up approach but like the fast.ai teaching style, you should check out "part 2" of the fast.ai set of tutorials, which is exactly that.


fast.ai has both - the "part 1" section is top-down, and the "part 2" section is bottom up. You can do part 2 without having done part 1. Part 2 starts with implementing matrix multiplication from scratch, then backprop from scratch, then SGD from scratch, etc.
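The "from scratch" starting point described above can be sketched in a few lines: matrix multiplication as three plain Python loops (the course then progressively replaces this naive version with faster ones). This is my own sketch, not the course code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows, with plain loops."""
    n, k, m = len(a), len(b), len(b[0])
    assert len(a[0]) == k, "inner dimensions must match"
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):        # rows of a
        for j in range(m):    # columns of b
            for p in range(k):  # shared inner dimension
                out[i][j] += a[i][p] * b[p][j]
    return out


# Usage: a 2x2 times a 2x2.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```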

There will be a new version of the part 2 course out in a few weeks. It even covers stuff like random number generation from scratch, convolutions from scratch, etc. It gradually works all the way up to Stable Diffusion.
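In the same "from scratch" spirit, here is roughly what the SGD step looks like when written by hand: fit y = 2x + 1 with one randomly chosen sample per step. The learning rate, step count, and toy data are arbitrary choices for the sketch, not values from the course.

```python
import random

random.seed(0)
data = [(x, 2 * x + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]

w, b = 0.0, 0.0
lr = 0.1
for step in range(500):
    x, y = random.choice(data)  # "stochastic": one sample at a time
    pred = w * x + b
    err = pred - y
    # Gradients of the squared error 0.5 * err**2 w.r.t. w and b.
    w -= lr * err * x
    b -= lr * err

print(round(w, 2), round(b, 2))  # converges close to 2.0 and 1.0
```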

@karpathy's and the fast.ai lessons work well together. They cover similar topics from different angles.

(I'm the primary creator of the fast.ai courses.)


That's awesome! I did not know that part 2 was structured this way, and will check it out. Will be really neat to see you teach stable diffusion.

Thanks for your work on fast.ai!


Jeremy @ fast.ai says he takes this pedagogical approach because it's "proven" to be the best way to learn. He's probably right, but I do find it confusing at times, because in the beginning you're just hitting Ctrl+Enter in an IPYNB haha.

Maybe Karpathy's approach will speak to me more--thanks for the recommendation!


Wow, I just watched the first video and it's, hands down, the most crystal clear explanation of neural nets and backpropagation I've ever seen. Bravo.



