Hey William, it's a great book, and the beginning makes for a great intro.
I think the book is still too long. Some of the passages are huge, with long blocks of text. There are a lot of filler words in there, like "which is a bummer", and a lot of "say..." ellipses. As a reader, you need to spend mental energy figuring out the key point of each paragraph--this would be fine if the book were dense to begin with, but since you are trying to make this a concise intro, it would be best to reduce that mental energy requirement to the absolute minimum.
A bit of feedback: read the paragraphs you wrote, ask yourself "what is the exact point I am trying to convey here?", and then remove any words that can be removed without taking away from the point. You want each paragraph to be as short and concise as possible, since that's what makes your book different from all the other "dense" books on the same topic.
If someone in the bookstore opens your book and skims it, and your paragraphs are clearly small with lots of whitespace between them, then even without reading the text in detail they can immediately tell your book is special, completely different from all the other NN books in the store, and will be more likely to buy it right away.
The beginning was great: the everyday examples and analogies made it quick and simple for a reader to "get" what you mean. But as the chapters progress, e.g. chapter 3, the text gets denser and the examples fewer. In the later chapters it looks like you got more excited about the technical details and the calculus, and those chapters no longer lean on examples as much or stay as concise as before. They start to look more and more like the other, denser literature in the field.
In any case, overall it's a great book. It's unique--I have never seen this condensed approach before--and it stands apart from all the other NN literature out there. It's very refreshing. I'm sure it will inspire a new generation of machine learning scientists who will remember it for years to come!
An editor. Behind every great writer was an even better editor.
For technical documentation, it's actually better to hire a really good technical writing editor than a tech writer. Have the engineer spew out the right ideas, and then the editor puts in the magic that makes it an effective read. It's an easier process than trying to teach the subject to a tech writer.
We have an O'Reilly book on deep learning dropping next month at Strata + Hadoop World in New York. We've been working on it for a few years now, but yes: can confirm, our editor has been amazing. The last 10 feet and the minor tweaks have been the biggest lesson for us in writing this.
(If you click this link: warning: it's not python)
Warning: if you click the above link you will be cookied with an affiliate tracker, zippylab-20. Even if you buy other products, they will be notified of the exact items purchased and will receive a percentage of your purchases for up to the next 24 hours.
The amazon affiliate panel shows an item by item breakdown of any purchases made by cookied users, goodbye private purchases.
Oh, good find! I actually just copied and pasted this from my search history. Thanks for the catch. I will be more careful in the future. Completely my fault there.
I sadly can't (it's past the edit window). @dan is welcome to, though. Believe me, authors don't make money from their books anyway :P (at least when not self-publishing).
When I saw the premise of the book I was initially turned off, fearing an attempt at trivialization of the ML subject matter, but after reading the intro I kind of like it. It's intuitive and would work well as an introduction for hackers.
It seems a good way into ML is to hack your way around the libraries until you get a feel for it, and only after that start reading up on theory or taking ML classes. The other way around is dry.
That's one of the nicest examples of constructive criticism that I've come across in a long time. Bookmarked for future reference to help me become better at this.
Lots of great comments here already. Just some feedback in the hope that it's useful:
- Assume that your target audience is going to be very eager to learn about DL but have no clue about what exactly to learn or where to even start. That's why they are buying your book in the first place and not some other more dense text.
- Hence, telling your readers what to learn and where to find more info is just as important as the subject matter itself. This can be as easy as e.g. telling the readers about certain keywords that they can use in their Google searches.
- The very best texts that I've read on complicated subjects were always "coarse-to-fine", i.e. give the readers the big picture as early as possible, then enable them to go into details at their own pace.
- Conversely, the worst texts I've read on complicated subjects were either fine-to-coarse (trying to explain individual components in detail before getting to the big picture), never explaining the big picture at all, or too verbose in the beginning (slowing down eager readers and killing their motivation). A good example of the latter is Apple's "Programming with Objective-C" [1]. Horrible text IMO.
- Following what was said above, sometimes details aren't even necessary to include in your text, as long as the readers are confident that they can find their way around and get the details elsewhere.
- The very best texts I've read also always had a motivational component. For someone who's just starting out, the field looks vast, unconquerable, and scary. If you show them, in simple words, the boundaries of the field, which areas the experts are working on, and even where current research is struggling, you give your readers confidence and trajectory, so they can strive to become experts too.
Just FYI, there are still a few typos in the sample pdf, I think a spell checker might be useful?
"A neural network learns a function. This might seem confusing since I just told you that it is
a funtion. However, every neural network starts out predicting randomly. In other words, our
starting weight values are random... thus our function predicts randomly. It's a random function.
As you may remember from the previous chapter, a neural network learns how to take
an input dataset and convert it into an output dataset. For example, it might take an input dataset
of Farenheit temperatures and learn to convert it into an output dataset of Celsius temperatures.
It might covert a pixel values dataset"
funtion, Farenheit, covert
Edit:
"We just take each weight... compute its affect on the error... and move it
in the right direction so that the error goes down (to 0)."
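The quoted passage is describing plain gradient descent. As a rough sketch of that idea (my own illustration, not code from the book; the function name, values, and learning rate are all made up):

```python
# Minimal gradient-descent sketch: repeatedly compute each weight's
# effect on the squared error and nudge it so the error moves toward 0.
def train(input_val, goal, weight, lr=0.1, epochs=20):
    for _ in range(epochs):
        pred = input_val * weight
        # derivative of (pred - goal)**2 with respect to weight
        grad = 2 * input_val * (pred - goal)
        weight -= lr * grad  # move the weight in the direction that lowers error
    return weight

w = train(input_val=2.0, goal=8.0, weight=0.5)
print(w)  # approaches 4.0, since 2.0 * 4.0 == 8.0
```

With a single weight the update has one fixed point (prediction equals goal), so the loop converges quickly for this toy learning rate.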
There's another book from this publisher called Grokking Algorithms. That one left me very impressed. Usually, I don't care for any "simplified/easified/dumbed-down" books because they often feel like a compilation of buzzwords with all the important bits removed. I thought Grokking Algorithms was simple, yet very meaty/substantial if that makes sense.
The discussion of gradient descent was excellent. So far I'm quite impressed. As others have said the question is going to be whether you can succeed at building on this base in a way that makes the later topics accessible.
A nitpick is that you use the words "matrices" and "differentiable" at the end of chapter 2. Maybe this is okay because you are signposting that these concepts will be explained, but if you are aiming for readers with high school algebra and some python experience, this could intimidate people.
He did say 'high school math' and not just algebra. It would be a pretty crap high school math course if it didn't cover linear algebra and beginner-intermediate calculus.
I get that paying for the MEAP will eventually get you access to all of the chapters, but it seems a little steep at this point. I'd be a lot more willing to pay, say, $10 for access to only the first three chapters, and then pay more if I get hooked. I'm guessing that isn't possible.
It also stung a bit that the link said 'Click To See the First Few Chapters' when in fact you click to see the first chapter and pay for the rest.
Having gone through the first chapter, I agree, if something like what the poster above mentioned is possible, that'd be preferable for my situation as well. Just my 2 cents.
Author here. The first 3 chapters are in pre-publication. It is my hope that people are willing to check out those chapters and help me refine them wherever they don't live up to the promise. Anyone who does, feel free to reach out to me @iamtrask or via the book's forum.
To the author: What else do I need to know besides basic python and algebra? Since I am not a python programmer, can I translate the ideas into other languages easily?
I've been learning about neural networks lately and implemented mine in golang. The biggest lesson I learned was that python was not chosen randomly: neural network researchers use it because of numpy.
Most importantly, numpy makes it really easy to deal with matrices (~ arrays of arrays). You just perform operations on them as if they were plain numbers (so you can do `a + b`, where both a and b are matrices).
Translating the code into a language without numpy is certainly possible, but expect a bit of intellectual gymnastics.
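To make the `a + b` point above concrete, here's a quick numpy sketch (the array values are arbitrary, just for illustration):

```python
import numpy as np

# numpy arrays behave like plain numbers under arithmetic:
# each operation applies cell by cell across the whole matrix.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[10.0, 20.0], [30.0, 40.0]])

print(a + b)  # [[11. 22.] [33. 44.]]
print(a * 3)  # [[ 3.  6.] [ 9. 12.]]
```

Replicating this in a language without such a library means writing the nested loops (and shape checks) yourself.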
I don't know which language you targeted, but for those who wish to use golang, I made this matrix library: https://github.com/oelmekki/matrix
EDIT: oh, btw, I'm originally a ruby dev. I learnt just enough python to be able to understand NN code, and that was easy (took me an afternoon). I won't pretend this makes me a python developer, but learning just enough to translate code into another language is straightforward.
I plan to use Java since that's what I'm most familiar with. I googled a bit and I think there are a couple [1] of Java libraries that handle n-dimensional arrays. Let's hope it will work out.
It seems like nd4j has everything you need. Just remember, when you see calls to the `np.dot()` function in python's numpy, that it is the "dot product" operation on matrices, also known as standard mathematical matrix multiplication, which nd4j in turn calls `mmul` ("matrix multiplication", in its "Linear Algebra Operations" section).
When numpy uses the * operator between two matrices, by contrast, it just does a cell-by-cell multiplication.
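A quick sketch of that distinction in numpy (values chosen arbitrarily):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b    # cell-by-cell product:      [[ 5 12] [21 32]]
matmul = np.dot(a, b)  # matrix multiplication:     [[19 22] [43 50]]
```

So when porting numpy code, `*` maps to a cell-wise loop (nd4j's `mul`-style ops), while `np.dot` maps to `mmul`; mixing the two up is a classic porting bug.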
Hmm, you certainly can, but many of the intuitions come from reading little bits of python code. It's intuitive but I'd recommend doing a python tutorial first.
Not the author, but in the blog post he says he's using [numpy](http://www.numpy.org/).
With google you might find similar libraries for the language you want to use, with a quick search I just found a quora post with a few similar libraries listed for C++.
Thank you for writing this, especially for recommending memorization. I used to do this myself, though I never thought anybody else would use it to grok a subject, since it sounds like a crazy idea to most people. I was really surprised to see it here.
I hope you will use spaced repetition, so the reader builds a base from which to move on to a deeper level of understanding. Can't wait to buy this book soon.
scikit-learn doesn't have a strong neural network codebase -- for anything not NN based they've largely got you covered (along with good infrastructure tooling for pipelines, cross validation, hyper-parameter searching etc.). Contrary to the impression you may get if you only follow the current buzzwords there is a great deal of value in machine learning right now beyond NNs and deep learning. On the other hand if deep learning is what you want to do, scikit-learn is not currently the best library for that.
scikit-learn would likely fall into the category of "black box" frameworks the author mentions. If I understand correctly, this book will let the reader gain an understanding of the underlying algorithms from an intuitive standpoint.
It is worth noting that scikit-learn does value clear understandable implementations, so you can actually pop open the source code and expect to find something other than a black box. Now, in many cases you'll have optimization work that means a slightly less obvious approach is taken, but the scikit-learn maintainers do work hard to try and ensure that, if you want to learn, you should be able to open up the code and do so.
Is there any chance you (William) could add hand-drawn illustrations and flow-charts? :) This makes a book outright welcoming IMHO. I loved that style in Grokking Algorithms (or even in Getting Started in Electronics by Forrest M. Mims III, if you want a distant example).
Thanks for your initiative, will buy the book soon as it is available.
I don't usually buy programming books. But when I do it's usually with manning. I think this will be the first time I will actually learn about deep learning. Anyone got one of those 50% off coupons?
You have left tons of negative comments throughout this post[1]. We get it, you don't like the book. Writing a book is not profitable, Trask is going to spend a lot of time just to help out other folks and share his knowledge. Be aware that he is trying to do something nice here, and try to empathize.
No, you don't "get it". And I may "love the book", if I ever read the supposed "book".
I believe this is merely a marketing test for a book which likely does not yet exist and news.ycombinator is not a forum for market testing or advertising for books, even books on NN.
Furthermore, a post in trask's defense by another author who also has a book published by Manning, is most unsavory. Publishers and authors and their agents should cease spamming news.ycombinator.
Your comment has caused me to review the Hacker News Guidelines and I find that I have violated no less than two, and now, possibly three, of the guidelines, in particular the following:
"Please don't submit comments complaining that a submission is inappropriate for the site. If you think a story is spam or off-topic, flag it by clicking on its 'flag' link. If you think a comment is egregious, click on its timestamp to go to its page, then click 'flag' at the top. (Not all users see flag links; there's a small karma threshold.)"
" If you flag something, please don't also comment that you did.
....
" Please resist commenting about being downvoted. It never does any good, and it makes boring reading."
I apologize to all for these violations.
Nonetheless I find the initial "chapter" of the aforementioned "book" to be void of significant content and await the full publication before spending any money.
I wish more than the first chapter were free so I could get a sense of the "meat" of the book vs just the intro. I'd even be willing to give up my email for it to be notified of new chapters (hint hint).
I appreciate the offer, but I'd much rather just be able to see at least Chapter 2 or something like that. If it seems like the learning style is my cup of tea, I'd definitely buy it, and even at full price. I just need to be sure the way lessons are presented align with my learning style as I'm a bit picky there. Unfortunately there isn't much taught in Chapter 1.
It's actually quite a shallow treatment. That's OK. Presumably, most folks who just want to use these things don't want to know about transience in chaotic attractors, or VC dimension and Rademacher complexity and stuff like that.
Well, giardini, I'm not a bot. My offer to send you 3 chapters still stands. Sorry you're not happy. I also have plenty of free educational materials on Deep Learning on the above linked blog which you can use to rate my writing. If there's anything else I can do for you, please let me know.
> I've concluded williamtrask is a bot - it really doesn't seem to "get" the point!
You're insulting one of your peers here. Insults, clever or otherwise do not belong on HN, especially not in threads where those peers offer their works for - limited - review.
If you're shocked that someone would write a book with an actual audience here, and even more shocked that they would have the temerity to charge for their work; if you complain that they won't give you 'access', and then insult them when they offer to do just that, you go from 'clever' to 'asshole'.
I've flagged your comments and would really appreciate it if you found it in you to apologize to the topic starter, subthreads like these make me sad.
Python feels like pseudocode and is a popular language among hackers and data scientists. As a result, there are more examples and tutorials for getting started, which is one of the chief reasons many choose it for learning new concepts.
Gonna play Devil's Advocate here: is this the correct way to lower the barrier to entry?
This is like trying to teach monads without having taught lambda calculus, functors, and applicatives.
There is a clear order to knowledge, and people should master the books dealing with prereqs if they want to grok deep learning.
Not jump into deep learning, just because it's the hot shit.
Efforts like this to cheaply popularize CS make computer science look less like a real field and more like a fad.
Nobody will write a book called "Grokking Quantum Physics" claiming that explaining quantum like "you are a five year old" will somehow cover for the necessary mastery of classical physics.
If you read such a book and think you understand quantum physics, you are terribly misguided.
Dunning-Kruger addicts people to feeling like they have mastered a subject, without putting in the effort.
It's not about making it accessible, it's about writing baseless, wrong things.
The author will do more harm than good for a general audience. It's a total waste of time to read this book. Our time is limited; better to spend it on something useful and correct.
Thanks for writing this!