Practical Deep Learning for Coders 2022 (fast.ai)
423 points by tmabraham on July 21, 2022 | 80 comments



Hi folks - I'm the creator/teacher of this course. I'd be happy to answer any questions that you have about the course, learning deep learning in general, or the state of deep learning in 2022.


Thank you so much for giving us mortals the power of ethical, modern AI. Your work, and the work of your colleagues, has brought so much good to this world.

It wasn't until the last few years that I saw people start freezing the model and just fine-tuning the last layers. I've watched presenters from Flamingo and Imagen talk about their similar approaches. I heard it here first, at fast.ai.


Haha yes - it's been wonderful to see how (eventually) the deep learning world has taken to transfer learning!


What’s your take on meta learning?


I tried the 2021 course but I didn't finish. I think the biggest friction for me was using the remote machine. I wasn't able to make steady progress like I do with my offline learning projects.

How far away is the fast.ai from working on a Mac? PyTorch recently gained support (https://pytorch.org/blog/introducing-accelerated-pytorch-tra...) but that's only the start. Is this something that is being worked on?


The good news is that every lesson in this course is actually run on Kaggle Notebooks, which is a free cloud environment that includes GPUs. So you don't need to set up anything, and it runs on any computer with a modern web browser!

Mac support for all the libs used in the course will probably continue to improve in the coming months and there should be no reason you won't be able to run the stuff for the course locally on a Mac at that time. Having said that, even the M2 trains deep learning models much slower than even the free NVIDIA GPUs provided by Kaggle. So you'd only want to use local development for the smallest and simplest models. (The course shows how to train models that are fairly cutting edge and some take a while to train even on modern GPUs, so they wouldn't be a good fit for a Mac.)
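For anyone who does want to try running things locally anyway, PyTorch exposes both CUDA and the Apple-silicon MPS backend behind the same device API, so a small helper can fall back gracefully. This is just a minimal sketch (requires PyTorch 1.12+ for MPS), not part of the course materials:

```python
import torch

def pick_device() -> torch.device:
    """Prefer a CUDA GPU, then Apple-silicon MPS, then plain CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
# Tensors and models are then created on (or moved to) that device:
x = torch.randn(8, 3, 224, 224, device=device)
print(device.type)
```

The same code then runs unchanged on Kaggle's NVIDIA GPUs, an M-series Mac, or a plain laptop CPU, just at very different speeds.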


I think last time I tried this, I kinda gave up as soon as it got to the point of hand-waving hardware and telling you to run notebooks on a third party's web service. How is that democratizing AI? It's the very opposite! Not good.

What is the intended audience here? Uni-level students will learn most of this within their programs if they're interested in AI, so they're not really it. Is the intended audience "coders"? If so, most of these "coders" will have to somehow get their employer on board (typically a corporate entity very much NOT interested in "coding" something in a 3rd-party ecosystem) or do it themselves. Hence, I want to take an RTX 3080+, 64+ GB of RAM, a big-ass SSD, and get through the training. Not learn some basics on somebody else's platform (come on, even from the POV of OS, running both training AND notebooks on some 3rd party's private platform is so against the idea of open source...) and call it a day.

What use is that? That may be enough if you want to be a cog in somebody else's machine, but not if you want to do something useful by yourself (I say "may" because big tech generally isn't interested in your "mad AI skillz" unless you also have a student-loan-backed piece of paper proving you successfully learned that over the last couple of years).

There will always be smart individuals and talented small teams that can successfully integrate AI into their products, but it's not thanks to the courses like this.

If you're going to aim at coders, there has to be a clear path demonstrated from beginning to end: from starting up your first notebook on your local dev machine and running training on your local training machine, to setting up inference in the final app (.NET app or whatever).


Thanks Jeremy, I'll give it another go.


I remember you were bullish about Swift a few years ago. What's your current view on non-python deep learning?


I'm disappointed that Google shut down the Swift for Tensorflow project, because I do think Swift is a great option for deep learning.

In some ways Jax is almost "non-python deep learning" since it's treating Python more like a DSL for the XLA backend. Normal Python code doesn't work in Jax. It's a pretty reasonable compromise since you still get all the benefits of the Python ecosystem.
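A tiny example of that "Python as DSL" point: under `jax.jit`, array operations trace and compile fine, but ordinary Python control flow on a traced value fails, because the function sees an abstract tracer rather than a concrete number. A minimal sketch:

```python
import jax
import jax.numpy as jnp

@jax.jit
def scaled_relu(x):
    # Traced array ops compile to XLA without issue
    return jnp.maximum(x, 0.0) * 2.0

@jax.jit
def branch(x):
    # Python-level `if` on a traced value is rejected at trace time
    if x > 0:
        return x
    return -x

print(scaled_relu(jnp.array([-1.0, 2.0])))  # [0. 4.]
try:
    branch(jnp.array(1.0))
except Exception as exc:
    print("jit rejected Python branching:", type(exc).__name__)
```

(JAX provides `jax.lax.cond` and friends for control flow that needs to be traced, which is exactly the "DSL" trade-off.)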

Julia seems like it has the best foundations for deep learning, since everything can be written directly in Julia. But it doesn't have a great ecosystem as a general programming tool.

F# might turn out to be a good option.


Interested in that as well, especially old-school Lisp versus this new AI.


As someone who much prefers reading over watching videos, do you think I would miss much by just going through the book in the GitHub repo? Or are those notebooks mainly supplemental to the videos?


I'd say it's the other way around - the videos are kinda supplemental to the book. The book has a lot more content, but doesn't have the interactive explanations in the course. Also the book is a couple of years old so is missing the more recent developments (but the principles haven't changed).


Thank you so much for this course. I plan to go through it properly.

I have a search problem of my own and I have had a hard time applying what I have learnt (including the coursera DL specialization). The chief characteristics are: (a) It is a fuzzy search of a corpus that is in a non-English language. (b) The search should be able to run on a mobile phone _offline_.

Is this possible? Can training be done elsewhere and transferred to TinyML or some such? What would be a good forum to go seeking answers?


Have you tried...

a) BM25 after some preprocessing (lemmatization etc.)

b) fastText / GloVe (possibly weighted by BM25)

The results can be surprisingly good. Often no need to bother with big language models or GPUs.
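For reference, BM25 itself is only a few lines; no ML framework needed. A from-scratch sketch over pre-tokenized documents (the tiny corpus here is made up for illustration):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [["cheap", "flights", "delhi"],
        ["delhi", "street", "food"],
        ["flight", "status"]]
print(bm25_scores(["delhi", "flights"], docs))
```

In practice a library like `rank_bm25` does the same thing with less fuss, and the preprocessing (lemmatization, normalization) usually matters more than the scoring function itself.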


As far as I understand, BM25 is not for fuzzy searches. For a bit more context, the search terms are in the English script, but the words are basically the closest-sounding transcriptions of sounds in various Indian languages. Different people may render the same word differently in an English transcription. But there's enough crowd-sourced data to account for the ways in which words can vary.

For the same reason, GloVe is of not much use to me.


OK, I understand. That is a different problem indeed. Although I am not sure that BERT would be of much more help there either, unless you have quite a large training corpus. The simplest/cheapest approach might be some kind of transcription normalization. As you say, what is introducing ambiguity here is the act of transcribing these sounds into the English alphabet. There’s not a single source language/alphabet? Also, I am not sure if fastText/GloVe really would not pick up the semantic similarity in spite of differing transcriptions as long as you have a large enough training corpus. I’d experiment with the settings here (bag of words/skipgram, min/max lengths etc.).
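One way to make that normalization idea concrete: map every transcription variant to a canonical form before indexing and before lookup, so that "Meera" and "Mira" collide. The rules below are purely illustrative; real ones would have to be mined from the crowd-sourced corpus:

```python
import re

# Hypothetical transliteration-normalization rules (illustrative only)
RULES = [
    (r"aa+", "a"),       # "naam" / "nam"
    (r"ee+", "i"),       # "meera" / "mira"
    (r"oo+", "u"),       # "koosh" / "kush"
    (r"w", "v"),         # "dewi" / "devi"
    (r"ph", "f"),        # "phal" / "fal"
    (r"(.)\1+", r"\1"),  # collapse doubled letters
]

def normalize(word: str) -> str:
    word = word.lower()
    for pattern, repl in RULES:
        word = re.sub(pattern, repl, word)
    return word

print(normalize("Meera") == normalize("Mira"))  # True
print(normalize("Dewi") == normalize("Devi"))   # True
```

The normalized forms can then feed BM25, fastText, or a plain inverted index, all of which run fine offline on a phone.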


If the volume of data fits on a mobile phone for it to be offline, perhaps you don't need deep learning?


Perhaps. What alternative would you suggest? The search terms are fuzzy, and there are too many variants (not exactly misspellings) for me to encode them explicitly. So I thought I'd rather learn from a crowd-sourced corpus.

In my case, I don't want the model to be general. I can afford for it to be like a database index, tailored to that data.


How many hours do you think this course would take for an experienced developer with plenty of applied maths but ~no machine learning?

How easy is it to do the course on my own hardware rather than cloud notebooks? Would that make it closer to practical deployment?


I'd suggest budgeting about 80 hours for the course given that background. That should get you to a place where you can work on practical projects that are reasonably well within standard applications of deep learning.

Most practical deployment is done to cloud environments rather than local notebooks. The deployment exercise we do in the course is designed to show the key components you'll need for deploying simple models in practice.


Not him nor will I talk about his course, but I’ve been in the field a reasonable amount of time (both on the academia and industry side). Honestly, applied maths will get you a long way and make it easier to digest the concepts (you might just see them as repackaged problems depending on your mileage). If you have good programming skills and discipline you practically have most of what you need.

Re the course, I just skimmed it and I think you can do most things on your own hardware but if you will actually use this for something practical (not just for you or a side project), being familiar with cloud tools is a big thing especially once you scale.


out of curiosity, how much applied math should one bone up on? (Obviously the more the better, but diminishing marginal returns and all that.)


Bare minimum is basic calculus, basic linear algebra and basic statistics. By basic, I probably mean first courses for those in most undergraduate programs.

I disagree with needing none and just going along as needed. That’s how you have machine learning models that look like they work but you don’t understand why they work so there might actually be problems.


That's not quite what I said. I said to look things up if you don't understand, I made the assumption that the one asking has taken maths courses before. I interpreted bone up on as refreshing old knowledge but I could be wrong.


None, just look things up as you go along if there is something you don't understand. You're likely not going to bother understanding how the optimization functions work or how the cost functions actually work anyway. They're implementation details in most cases.


Hey Jeremy, i just want to say that I love your course and the way you teach. I refer everyone to the Fast AI in my YouTube videos on getting started with machine learning. Please keep up the great work!


Thanks for creating this fantastic content, I'm excited to give the 2022 course a look. It's an exciting time for AI. I'm curious about your thoughts on gpt3 and also the state of the art in computer vision, and object detection. All the best


Is there a new version of the book? All the links I find lead to the 2020 edition.


No, the book is continually updated for each reprint, but there isn't a separate edition.


Thank you for creating this course. I started out on TensorFlow, but seeing this material I am in two minds about whether I should abandon my TF book and start this one or save it for later. Most likely I am going to dive in :-)


Both the Aurélien Géron and François Chollet TF books are absolutely terrific, and everything you learn from them will be extremely useful in becoming a deep learning practitioner, regardless of what framework you end up using. So if you've started with one of those books already, keep it up! :) The fast.ai course would actually be a pretty good addition to either book, since you'll get to see a whole different way of doing things, which might be useful to understanding what's going on.


Thank you


My suggestion would be to learn all the stuff from this course, using fast.ai library, and then gradually move towards PyTorch.

fast.ai is a fantastic educational resource and a great way to approach solving problems. But the library itself is lacking, and if you are an experienced programmer, when building real-life projects, you will be frustrated with fast.ai library.

The goal, IMO, should be to learn from Jeremy Howard, a great instructor and communicator; learn his attitude; and then move to PyTorch (keeping the attitude, the knowledge, and the lessons with you).


I am an experienced ML engineer of 10 years and have worked at several large flagship tech companies. I do not agree that fastai is not appropriate for real-life projects. If you know the fastai library well, you know it's a layered API on top of PyTorch, which allows you to customize things to your needs quite easily. For example, it is fairly straightforward to get any PyTorch model out of a Learner object. Furthermore, lots of care has been taken to keep the APIs very consistent with PyTorch as well.

It's also the only library I know of that consistently bakes in best practices like super-convergence techniques or making things like test-time augmentation very seamless. Many libraries lag behind fastai by 1-2 years in this regard, and frankly it can be frustrating to use other frameworks sometimes.

There is a slight learning curve, for example to learn the DataBlocks API or the callback system, but once you really understand what is happening you will understand how nice the API is and how well engineered it is.

Side note: Regarding being an experienced software engineer, I highly recommend digging into how the python language was extended for this project (fastcore) and the development workflow used (nbdev), which I think could be interesting for those software engineers you mention as well as heighten your understanding of the ecosystem of tools.


First of all, a big thanks!

What is your take on the current state of autonomous driving? Do you think we can achieve "full autonomy" with the technology we have currently?

Any new advances in DL that you are excited about?


Honestly I'm not an expert on autonomous driving so I'm not sure I have great insights there. I do know quite a bit about computer vision however so feel qualified to comment on that bit -- I suspect the decision by Tesla to only use CV, and not LIDAR, may turn out to be a mistake. I don't see any reason why we couldn't achieve full autonomy with our current tech including LIDAR, although I don't know if it can be achieved at a practical latency and power budget.

The new advances in DL I'm excited about are things I show in the class: the accessibility of modern NLP thanks to the Hugging Face ecosystem; the power of ConvNeXt for even better computer vision models; the way Gradio and HF Spaces makes it trivially easy to get a working prototype application using DL online.

I'm also excited about hosted models and applications like GPT-3, DALL-E, and Codex. All the illustrations on our course website are from DALL-E, for instance!


I hope you stay as humble as you have been. But you're my personal hero. It is just incredible what you have done for the world.


Will it help me more or achieve more out of learning this course as compared to just directly using GPT-3 or Dall-E as paid user?


Hello Jeremy do you have any specific advice on tackling ASR using fast.ai?


Jeremy, hi.

I have one question, and one only. Please answer:

Second part, when?




I am so grateful the FastAI team exists. It wasn't until I discovered their "Machine Learning for Coders" course that I really started to grok ML. I was in grad school trying to pivot my career from finance to data science. I didn't come from a computer science / math background and things just weren't clicking for me. I remember feeling angry, embarrassed, dumb, and overall that I wasn't smart enough to learn this stuff -- I was incredibly discouraged and felt that I didn't belong there. I was lucky enough to stumble across one of the course videos on YouTube (thanks recommendation algorithm!), and the rest is history.

The amazing thing about these courses is how simple Jeremy (and team) are able to make machine learning. I didn't need to understand Python dependency management in order to learn how to train a really good image classifier. Their approach helped me have lots of little wins, gave me confidence, and helped build the motivation to slog through the harder stuff when I needed to.

From the bottom of my heart, thank you @jph00. You changed my life immeasurably for the better. I learned that I AM good enough, I AM smart enough, and I CAN do hard things... I just had to find the right way to learn them. Your courses completely changed my perspective on what was possible for me and opened the door to some of my life's greatest passions.


[flagged]


Lots of people who've taken the Fast.ai course have similar things to say. It's commonly said it's the fastest way to get into DL.


Yeah, just wanted to say I am also one of the people who have benefited immensely. So it may sound like a paid endorsement, yet lots of people have benefited immensely, including people from India, Nigeria, etc.

Check this article[1] to know a bit about philosophy of fast.ai and why it's so popular

[1] https://future.com/the-rise-of-domain-experts-in-deep-learni...


The first Fast.ai course back in around 2016 changed my life.

I was studying a masters in statistics and computer science that had 1 neural networks lecture and nobody knew anything about deep learning. Fast.ai and Jeremy’s teaching style helped me start playing with deep learning models really quickly and I changed my thesis topic to computer vision.

I ended up consulting on the topic and doing various startups leading to the startup I’m working on now which just finished YC (AiSupervision W22).

I doubt I'd be here without fast.ai. I highly recommend it and appreciate all the work that Jeremy and the rest of fast.ai do!


Watching from afar the great advances in machine learning over the past few years with AlphaZero, GPT-3, and DALL-E 2, I felt it important to start understanding what is going on under the hood. Having just completed the private pre-release of the course run through USQ, as my first foray into machine learning this was a great introduction that had me quickly produce a working image classification system. The videos are really packed with insightful tidbits about practical approaches to iterating quickly and understanding the data better to produce better results. Very much recommend the course.


This is awesome. One question I have always had - is the research on applying DL for images the most developed compared to other things?

Even DL used for audio processing (classification, separation etc) seems to convert audio to spectral graphs and apply DL to that.

Changing a problem to be expressed as image inputs will be an advantage when using DL as a solution. Would you agree?


Working with a spectrogram is definitely similar to working with an image, and it's interesting to think why that's the case.

Take convolutional models, for example. Very effective for working with images because they're (a) parameter efficient, (b) learn local/spatial correlations in input features, and (c) exploit translational invariance. As an oversimplification, we can train models to visually identify "things" in images by their edges.

If you think about what's going on with an audio spectrogram, you can see the same concepts at work. There's local/spatial correlation - certain sounds tend to have similar power coefficients in similar frequency buckets. These are also correlated in time (because the pitch envelope of the word "yes" tends to have the same shape), and convolutional models can also exploit time-invariance (in the sense that convolutional models can learn the word "yes" from samples where the word appears with varying amounts of silence to the left and right).

That being said, the addition of the time domain makes audio quite hard to work with, and (usually) not as simple as just running a spectrogram through a vanilla image classification model. But it's definitely enlightening to think about how these models are "learning".
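A spectrogram really is just a 2-D array you can hand to an image model: slide a window along the signal, FFT each frame, and stack the magnitudes. A minimal numpy sketch (frame and hop sizes are arbitrary choices here):

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: windowed frames, FFT per frame."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    # rows = frequency bins, columns = time steps -- an "image"
    return np.abs(np.fft.rfft(frames, axis=-1)).T

# A pure 440 Hz tone at 8 kHz shows up as one bright horizontal band
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (frequency_bins, time_steps)
```

The brightest row sits near bin 440 / 8000 × 256 ≈ 14, which is exactly the kind of local structure ("edge") a convolutional model latches onto.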


Thanks for that note. I have an audio classification hobby project (for now). Could you point me to things I should learn to get better at audio classification and generation?

Your comment about time domain making audio difficult - before doing some research I thought it would make it impossible. But looks like people have had some success with using spectrograms of short audio samples. What techniques should I try to learn to deal with the time component of audio?

One idea is to chop up the audio into short samples and treat the resulting images as a video. Then look at DL algorithms that deal with video. Am I on the right track?


Good question!

I think a major reason for this is because of transfer learning. For computer vision, there are many good pretrained models that were trained on huge datasets (like ImageNet) that can be fine-tuned for custom tasks. Other fields often do not have such pretrained models and huge datasets to work on, so it turns out transforming a dataset into an image dataset and fine-tuning a pretrained model works better than training from scratch.


Oops dumb question. Watched the first video and got my answer.


I haven’t seen this course content yet, but fully did the 2019 version.

Extremely grateful to have found it. Changed the course of my life.

I can vouch for its quality.

Jeremy is an excellent instructor. So much clarity in his teaching!

I love that this is a hands-on course, and there are ZERO hand-wavings. I also really like the top-down approach of teaching. Now, whenever I try to communicate something or teach someone, I try to do it top-down. And I have Jeremy to thank for that.

Currently, I am attending his APL study group and having a blast!

Only question for @jph00 is: second part, when?


Yes, updated second part of the course when? Any chance you will shift from Swift to Julia?


Fantastic course - Fastai courses are a must for anyone looking to learn Deep Learning/ML.


Hello Jeremy. Thanks for this great content. I have gone through your entire course and learned a ton from you.

Now moving on from here, do you have any resource recommendation where I can dive deeper into machine learning and deep learning theory? And also any resources to become a much much better programmer?

I am currently working in as an assistant in a research lab. My coding skills are not that great.


Can one do these lessons in any order? For example, do CNN first then jump back to NLP. Or skip the implementation from scratch because I have done a similar one in another course.


They're designed to be done in order, but yup if you know how SGD works, for instance, you could certainly skip over that bit. The videos all have youtube timestamps, so if you drag the scrollbar you'll see what each section is about.

Or you could do those bits at 2x speed in case there's some concepts there you haven't seen before.

The NLP lesson could possibly work reasonably well standalone if you already know some DL basics, since it uses a different framework (Hugging Face) to the earlier lessons.

The CNN lesson would probably largely make sense if you already understand multi-layer perceptrons, since it mainly shows how a convolution is just a special case of sparse matrix multiplication.
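That "convolution is a special case of sparse matrix multiplication" point is easy to verify in a few lines of numpy, here in 1-D (the 2-D case is the same idea with block-banded structure):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = np.array([1.0, 0.0, -1.0])  # simple edge-detecting kernel

# Sliding-window form of the (valid, un-flipped) convolution
slide = np.array([x[i:i + 3] @ k for i in range(len(x) - 2)])

# Matrix form: each row is the kernel shifted one step, zeros elsewhere
W = np.zeros((len(x) - 2, len(x)))
for i in range(len(x) - 2):
    W[i, i:i + 3] = k

print(np.allclose(W @ x, slide))  # True
```

The weight matrix `W` is mostly zeros, and every row shares the same three parameters; that parameter sharing is what makes convolutions so efficient compared with a dense layer.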


What is/will be the state of deep learning in 2022 and the next 3-5 years? You hear/read so much news and so many articles on HN about the decline of DL. Is that so?


That's like saying "watch the decline of C++" while using JavaScript. What does JavaScript run on?

Most ML advancements will use DL at the core in interesting ways.


I mean, just looking at OpenAI and Deepmind, they have relatively recently released break-through models for which building upon and extending can be done in relatively straightforward ways (DALLE 2, GPT-3, AlphaFold, OpenAI Codex, ...etc), so I don't think DL will "decline" any time soon ...


I re-did the course with this 2022 version. Highly recommend it :)


The course uses PyTorch, which despite being a big player still has many issues running on any of the latest AMD cards.

Windows is not supported on AMD cards, and Navi series cards aren't supported in general. The ecosystem is heavily biased towards CUDA, despite AMD's drivers being far more open source than Nvidia's.

Remote machines and kaggle notebooks go some way to improving these limits for the course. I'm complaining a bit more in general here, I think.


You guys really are the best. Thanks for all the hard work.


There are too many poor design decisions in the fast.ai library.

One has to invest too much time just to learn the library's weird API, and then to use it.

Doing something custom is too difficult, in contrast to Jax, PyTorch, and even (poor library) TensorFlow.

The coding practices are whimsical. The codebase wouldn’t pass code review in any respectable company.

Variable namings are weird and super-problematic.

I fully stick to what I said. Learn techniques, best practices, and, most importantly, Howard's attitude. Then take them with you and move onto something like PyTorch.

Howard is great with one problem: he kinda hates math. It might also seem that he ends up promoting anti-intellectualism.


You've recently posted repeated comments that cross into personal attack. We ban accounts that do that, so please don't do it again.

We detached this subthread from https://news.ycombinator.com/item?id=32189308.


> Howard is great with one problem: he kinda hates math

I'm sorry what?

I run a math study group 4x per week.

Right now the book I'm reading during my rest time is a calculus book.

I've co-authored a lengthy paper on matrix calculus foundations for deep learning.

I wrote a lot of the math materials in our numerical linear programming course.

It really seems like you have very very little understanding of me or the software library I've created, but yet are nonetheless comfortable publicly pronouncing your opinions about both.


I deleted an earlier, angrier comment of mine.

Can you explain this last sentence (which I understand to be insulting and without basis): > Howard is great with one problem: he kinda hates math. It might also seem that he ends up promoting anti-intellectualism.


He says repeatedly "You don't need math", and stuff like that.

This is not insulting. That man is my hero, and I deeply respect him.

But his 2019/20 course was riddled with such statements. He repeatedly said that one doesn't need math, and showed tools like drawing math symbols on a website to learn their names and ride on that. No further math needed.

It's like you can wing it in Deep Learning without learning Math. His behavior throughout the course reinforced this attitude. It is harmful for new learners.

But I am fortunate that I didn't take that away; instead I learned from a successful alumni example that Howard gave. One woman, who was also a musician ('19/'20), made it big, but Howard mentioned that she did the Ng course and also read the Goodfellow book.

So, I took the cue, and did DL the proper way. Anybody I know in DL made it because they know the Math.

There are some influencer types in the fastai community who have tens of thousands of followers, shill stuff, and do media stuff. Other than those 1-2 people, everyone who made it in DL did it because they knew the math.

So, I think that people might get the wrong idea hearing from Howard that "you don't need math".

This is one fault I find. It's not like I dislike him. I like the rest of him. I love his attitude on almost all other things. I love Jeremy Howard, and he is my hero.


I think you completely misunderstand his stance.

You don't need the math in the beginning to train a model and get first results. Later, you will need the math and Jeremy clearly knows the math.

He gives a great example: In sports, you don't start with learning about physiology and train individual muscles etc. (I paraphrase), you start playing basketball or baseball or soccer, and understand the overall game. And if you like it, you can then become better and better and get deeper and deeper.

It's not helpful to start with linear algebra if - what motivated you - was the application of ML. We lose people who could have otherwise become experts later.


It is far from anti-intellectualism. It is about didactics. And Jeremy is spot on about this.


It is good enough to not need heavy math to begin.

Yeah, I know.

But you need a lot of math to do Deep Learning.

But I do not think Howard tries to communicate that.

You can't show me people who know only high-school math and get to work at FAANG, or do a PhD in DL/related fields, or become CTO of an AI start-up, or anyhow "made it" in DL.


I think co-writing and co-teaching a math course at a deep learning company he co-founded, pinning it to the GitHub repo, and moving it close to the top of the home page make it pretty clear he does see value in math for deep learning. I mean, if you need other evidence beyond the fact that he teaches the math needed to understand and build things from scratch in the deep learning course...

In the courses he has always been clear that you don't need a ton of math to begin. He's also always been clear that as you progress you will encounter math that you need to learn to continue, that it's OK if you don't know it before you start, and that it's fine to learn it when you need it.


> influencer types in the fastai community who have 10ks of followers and shills

Everyone I can see that fits that profile works at real companies doing real deep learning work, or is building infrastructure and tools that we all use: Nvidia, Hugging Face, etc. I don't see pure media stuff at all; most people are developing libraries or doing other applied work and talk about their work publicly. Frankly, your comments come across like you are salty. Being an unpleasant person in online forums who enjoys insulting people seems correlated, which likely doesn't bode well for your professional aspirations, regardless of how much math or Python you do/don't know.

> You don't need math

He's saying you don't need a PhD in math, not that you should ignore math altogether. I have a graduate-level math and CS background and I don't think either of those helped much, other than in overcoming gatekeeping. The thing that's far more important for applied ML is to practice DL on lots of different problems to be effective. PhD-level math might be useful for research, but it isn't necessary in practice for most people.


[flagged]


Personal attacks will get you banned here. Please don't post like this again.

https://news.ycombinator.com/newsguidelines.html


> Sanyam Buhtani does not do real DL work.

Oh wow, it seems like you have made a habit of judging people, even though you don't know much about them at all, as recently as 10 minutes ago: https://news.ycombinator.com/item?id=32197090

Despite the fake apologies, I suppose it is a habit you can't really shake.

By the way, how do you do real DL work if you have so much trouble communicating and interacting with people generally? Seems like that would really get in the way of doing anything of any import.

Throwing insults at people using their full names on anonymous internet forums as "being fake" is a special kind of toxic behavior. They really should not allow you to participate in these forums with this kind of behavior.

I've flagged your comment as inappropriate.


Can you please provide some resources you used to learn?


Definitely do the fast.ai course. Totally worth it.

But also use ISLR, Goodfellow, Bishop, etc.

Start with Andrew Ng's ML, then do the first part of Aurelien Geron book, then do Ng's DL specialization, then do fast.ai. Then learn PyTorch. A great book would be Sebastian Raschka's book. Also d2l.ai. A fast-paced, but really good course would be the Neuromatch DL tutorials.

Then move forward based on your interests.

Yann LeCun has THE best MOOC on DL on YouTube.

For the Math, I majored in Physics, so the stuff came naturally. I suggest Imperial College London's Mathematics for ML specialization, Robert Ghrist's Calculus course, and the VMLS book for Linear Algebra. For stats, I haven't found a good one yet.

What you read, how much- these all depend on what you want to do. Where do you want to see yourself, and so on.

If you just want to brag about DL and put it on your resume so that you can get a job writing SQL queries and making PowerBI presentations as a "Data Scientist", then the bar is low.

If you want to do some DL, then that is another league altogether.

You need to be able to quickly read papers, understand ideas, use those for your own projects or papers.

Makes sense?



