Hacker News
Artificial Neural Networks for Beginners (mathworks.com)
255 points by rdudekul on Sept 17, 2015 | 48 comments



If you're trying to learn about deep learning, I highly suggest using Python (Theano) or Lua (Torch). They're free and used by the experts in the field for research.

Even if you don't want to use the frameworks, you'll still have access to fast linear algebra routines.


Could someone recommend a book about deep learning and/or machine learning for this kind of open-source library? I do not have any background in ML or DL.


Although this is not for beginners of machine learning (learn that first), this is a book on deep learning that is currently in pre-publication, and it's being written by some big names in the field.

http://www.iro.umontreal.ca/~bengioy/dlbook/


NVidia has a free online course going on covering these libraries: https://developer.nvidia.com/deep-learning-courses


There is a lot of great material on using Torch for machine learning and deep learning in this Oxford course: https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearni...


Then you might actually want to start in Matlab/Octave with Michael Ng's Coursera course on ML.


I think you are talking about Andrew Ng's course.

I completed it and couldn't recommend it more highly. It is a really excellent, dense course, and Ng is a very good teacher.

https://www.coursera.org/learn/machine-learning


Geoffrey Hinton's archived course is all about neural nets. I think you can enroll in the archived version; no code, just theory.

https://www.coursera.org/course/neuralnets


I've completed the course as well - have you used any of the knowledge from it on anything in particular after you completed the course?


I'm taking the Coursera course right now. The course page at Stanford has a lot of student projects. The breadth of applications is pretty huge, definitely worth a check if you're looking for an idea.

http://cs229.stanford.edu


I am working through Ng's course currently. It is striking the right tone against my math-phobia, keeping me constantly in that state of semi-understanding that is intuition, a term Ng uses often.

His choice of Octave/MATLAB simplifies issues of dependencies, in particular the soft ones of documentation and community. This is something a lot of academic contexts get wrong with software: the tools are either too open-ended, so students wind up manipulating matrices with for-loops; or there's an inflexible stack of professional tools that requires massive effort to learn and has an orthogonal community; or there's a toy IDE based on a senior thesis.

Octave more or less follows the Unix philosophy of doing one thing, and thus can meet many people where they are rather than imposing one true way.


I think the best introductory resources are Nielsen's book [http://neuralnetworksanddeeplearning.com/] and Hinton's online course [https://www.coursera.org/course/neuralnets]. If you need something specifically for Theano, they have their own tutorial [http://deeplearning.net/tutorial/].


Deep learning is a field still being rapidly advanced by research. A book on it would become obsolete the day it is published.


Theano is great! The learning curve can be a little difficult, but once it "clicks", it's nice to work with.


Are there any good neural network frameworks written in Ruby? The ones I have used (ruby-fann and AI4r) slow down dramatically when you use them on large amounts of data.


Nice article. If anyone is interested in understanding the theory and digging deeper, the machine learning course on Coursera is a great place to start as well.


Matlab? Thanks, but no thanks.


It's still widely used in academia (though I'm extremely glad IPython/Jupyter is being praised by Nature [1]), and it has a lot of "toolboxes" with reasonable integration between them.

Once you step outside it, you need one library for the machine learning part of your project, another for the computer vision aspects, and yet another for reading different data files. When you get to the 20th library you need, plus all the classes and wrappers you have to write on your own to achieve the same result, you start seeing MATLAB in a softer light. That's even more true for those who are not programming-inclined.

[1] http://www.nature.com/news/interactive-notebooks-sharing-the...


There's also Octave [0]

[0]: https://www.gnu.org/software/octave/


Matlab vs Octave is one of those places where it's still worth buying the real one. A personal-use license (including the machine learning toolbox) is under $200. If you're not willing to spend $200 to learn something, you're probably not that interested.

Browse JSTOR at the library versus Googling any historical, scientific or research topic and you'll quickly learn that "internet" offers the shitty version of surprisingly many things. (Shh... it's a secret.)


> If you're not willing to spend $200 to learn something you're probably not that interested.

This is a bit presumptuous. I know people for whom $200 is a month's income. If you only consider their disposable income, $200 would probably take six months to save.


Maybe that's why there's the $49 student license and the $99 one that includes ten toolboxes including machine learning.


I'd agree that Matlab is great, and I honestly thought it would be more expensive. I've done Andrew Ng's class, and I learned a great deal from it. Given that I haven't really figured out what I'd like to do with what I learned, I'd prefer to stick with the "lite" math package until I decide to get more serious.


Actually, I had a great experience learning NNs in MATLAB. It is very fast with matrix multiplication and has a comprehensive set of toolboxes. If you need to prototype anything, MATLAB is your choice.
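
The built-in multiply dispatches to an optimized BLAS, which is where that speed comes from. A minimal timing sketch in plain MATLAB/Octave (the matrix size here is arbitrary):

    A = randn(1000);  B = randn(1000);   % two random 1000x1000 matrices
    tic;  C = A * B;  toc                % vectorized multiply via optimized BLAS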


At times my mechanical engineering courses at university felt like MATLAB tutorials. You had to use it; there was no way around it. Good luck once you're out of university and want to start your own thing; you won't be able to afford it. The computer science courses, in contrast, preferred open-source tools over proprietary ones.


Student copies start at $99 [0]. Copies for home use start at $145 [1]. This is not outrageously expensive. Students can spend more on a single textbook that'll be used for only one semester. For home users, this is the price of 7-10 dinners out.

And as others have pointed out, there's also GNU Octave, which is (or at least used to be) reasonably compatible, and free. And where it's not source-compatible, it still largely retains the same semantics and underlying models, so knowledge of Matlab transfers easily.

[0] http://www.mathworks.com/pricing-licensing/index.html?intend...

[1] http://www.mathworks.com/pricing-licensing/index.html?intend...


The guy cutting my lawn spent more on his tools than MATLAB costs.


Of course it's not only about the price. Open and free tools are important for reproducibility of research, including computer science research, economic research, etc. Having open-access articles/papers, we are now moving to open availability of data for reproducibility of results. Open tools are the third component, allowing complete reproducibility of research, unencumbered by arbitrary lock-in. It is then important that we teach and share open and free tools with science learners, so that as they progress they can freely share fully reproducible research with anyone in the world.

[edit] in summary, in the age of open access, FLOSS science tools become a must -- and a collective responsibility.


The most important thing is giving brilliant minds the best tools, period.

If those tools make it impossible to share data, or to publish results reproducibly, then I'd agree that those tools suck. However, Matlab reads and writes every damn format under the sun.

Don't confuse "create, invent and build" with "export and publish." They're fundamentally different tasks.


I also spent more on video games in the last year than it costs. What's your point? That it's so cheap he should just buy it, even if there are better, cheaper tools out there? That seems like a waste to me.


I was refuting this silly argument: "Good luck once you're out of university and want to start your own thing, you won't be able to afford it."

Yup, startups have costs. Go figure. And sometimes you get what you pay for.

I wish I'd learned Matlab sooner. I still love the Python ecosystem, but Matlab's replaced a LOT of dicking around in Python for me. It does completely different things, and certain things are trivial or impossible in each place. Worth learning both.


Octave is 'cheaper' but I can't call it 'better'.


Cached version, since the database seems to be having issues:

http://webcache.googleusercontent.com/search?q=cache:UhEgP6_...


ANNs are great for the right application. But I'm starting to fear "Deep Learning" is the new "Big Data" buzzword.

I believe ANNs are Turing Complete, meaning they should be able to compute anything (EDIT: + "that is computable by any other Turing Machine"). The questions are, can a training regimen be created to create the right ANN to solve "any" problem, and if so, is it an efficient means to solve that problem?

For example, it's fairly trivial to build an ANN to spit out the right results for a given polynomial function, e.g. "f(x, y, z) = ax + by + cz". Knowing the function ahead of time, you just generate a ton of input/output pairs and feed them into the training of the ANN, and from then on the ANN will spit them back out.

The problem with that is, you didn't learn anything new. You didn't learn how to solve a new problem. It's somewhat useful for teaching people how to program ANNs, but I personally think it's garbage for teaching how to understand ANNs.

ANNs make more sense when we already have the training data, but we don't know the underlying function that maps input to said outputs. In the trivial case of the polynomial function, if someone were to hand us the training set, we could use an ANN to figure out what the polynomial must be.
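
To make that concrete, here is a minimal sketch of fitting such a mystery mapping with an ANN, assuming MATLAB's Neural Network Toolbox (feedforwardnet/train); the coefficients 2, -3, 5 are invented for the example:

    X = rand(3, 500);           % 500 samples of (x, y, z), one sample per column
    T = [2 -3 5] * X;           % "mystery" function: f = 2x - 3y + 5z
    net = feedforwardnet(5);    % one hidden layer with 5 neurons
    net = train(net, X, T);     % backpropagation training
    net([1; 1; 1])              % should print something close to 2 - 3 + 5 = 4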

Except--for this particular example of a polynomial function--this isn't very efficient. For a polynomial with N unknown coefficients, you only need N input/output pairs and some trivial algebra to determine the function. You can use any of the readily available linear algebra libraries to do such a thing. In fact, I wrote a project for a client that does just that: it uses a basic matrix library to crunch a set of GPS data into a quadratic-formula estimation of curves in roads, so that the model can then be resampled, continuously, sans noise.
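
For the same invented f = 2x - 3y + 5z, the direct solve is a one-liner in plain MATLAB/Octave, no toolboxes needed (the three sample inputs are arbitrary, just chosen to be non-singular):

    A = [1 2 3; 4 5 6; 7 8 10];   % three (x, y, z) input samples, one per row
    f = A * [2; -3; 5];           % observed outputs of the mystery function
    coeffs = A \ f                % recovers a = 2, b = -3, c = 5 exactly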

And if that function is not just a simple polynomial--if, say, it includes sines and cosines and square roots, etc.--then the ANN is going to have to be large enough to include ad-hoc arithmetic estimations of sine and cosine and square root sufficient to give the right answers. It might even include several different estimating functions for sine, just because our mystery function requires more than one sine operation. It might even have corner cases where it gets the answer wrong, because you didn't have a sufficiently large data set for it to "figure out" things like the fact that sin(x) is approximately x for small values of x. If one knew the right formula (and yes, that's a big if), it'd be significantly more efficient to write a program that computed the values correctly.
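
(The small-angle fact is easy to sanity-check at the MATLAB/Octave prompt:)

    x = 0.1;  [sin(x), x]   % prints roughly 0.0998 and 0.1000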

All of this is not to pooh-pooh ANNs. ANNs are great tools for when we don't know the function and when the function is sufficiently non-trivial to discover. The polynomial example is like trying to kill a fly on the wall with a swarm of nanomachines designed to evolve and learn how to construct a flyswatter (which is part of the reason I dislike it as a learning tool). But write traditional code to do Optical Character Recognition, I dare you. ANNs are just highly specialized.

Think of setting up your ANN as defining the full width and depth of the space of all possible programs that you'd like to search for the program that solves your problem. You then use feedback to "walk" across that space until you find something that looks like your desired program. We're entering an era where we have the memory and distributed processing capabilities to crank out some rather large ANNs. For some problems, we end up training a computer to write programs for us that we could have written on our own, and running that ANN in production, instead of the equivalent hand-written code, can impact the number of requests you can handle in a given amount of time.

Of course, that is not necessarily bad, either. "Throwing money at the problem" is not the wrong solution when you have a lot more money than time. Technology is supposed to serve us, not the other way around. Why spend a week discovering a formula to map your data when you can train an ANN in a few hours? And perhaps you don't have very high requirements for request handling. Maybe you only need to process one image a minute on your particular system. Have at it.

But you really, really need to know that is the case before you jump on the ANN bandwagon. You have to know what you want out of the ANN. If you don't have the ability to look at a set of inputs and express a desired set of outputs, then an ANN isn't magic pixie dust that will solve the problem for you. If you have experts in your field telling you that your particular problem cannot be easily modeled, then ANNs might be helpful. If you are new to your field and you think "let's try an ANN", you're probably going to have a bad time. And if you end up with an ANN that is estimating a relatively trivial program, while you're trying to provide a SaaS offering that is meant to scale to thousands or millions of concurrent users, the ANN approach could seriously harm your ability to scale.


ANNs have been proven to be universal approximators (https://en.wikipedia.org/wiki/Universal_approximation_theore...) which I think is what you meant when you said 'Turing Complete'.
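
Informally, one common statement of the theorem (my paraphrase, not the parent's):

    A feedforward network with a single hidden layer containing finitely many
    sigmoidal neurons can approximate any continuous function on a compact
    subset of R^n to any desired accuracy.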


There is also "Turing Computability With Neural Nets" (Siegelmann, Sontag, 1991: http://www.sciencedirect.com/science/article/pii/08939659919...)

    This paper shows the existence of a finite neural network, made up of sigmoidal 
    neurons, which simulates a universal Turing machine. It is composed of less than
    10^5 synchronously evolving processors, interconnected linearly. High-order
    connections are not required.


It looks like your argument is essentially 'don't use a neural network when you can do a GLM / other regression instead', which no serious person should disagree with.


> But I'm starting to fear "Deep Learning" is the new "Big Data" buzzword.

You spent several paragraphs criticizing ANNs, but regular ANNs are not deep learning at all.


I didn't criticize ANNs; I criticized the perception of ANNs and learning algorithms as general-purpose solutions.

Also, I have no idea what you mean by ANNs not being involved with deep learning. Recurrent and convolutional networks are types of ANN. https://en.wikipedia.org/wiki/Deep_learning


Regular ANNs are not deep learning.


Lol, what the hell are "regular ANNs"? Both convnets and RNNs have been in use for more than 20 years.


It takes forever to run a simple patternsearch() or fmincon() call if the function gets a bit complicated.

Their mcc compiler is even crappier; it has so many memory leaks that even Valgrind gives up and freezes.

I do not want to run a MATLAB-built ANN over large datasets, no way.

MATLAB schwag: "Do you speak MATLAB?"

me: "No, I don't speak MATLAB, and I don't want to"


I love Matlab, but specifically with neural networks I had bad experiences: generally subpar convergence speed and results. It's better to use Caffe, which is the best neural network toolkit I know. Also, large parts of Caffe are being implemented for GPUs, so performance becomes even better.


Just got 'Database Error' while trying to connect to this page: "Error establishing a database connection." Is the number of connections to this page limited, or what else could throw this kind of error?


Just refresh a few times; the site's probably getting hammered with HN users.


I got this error for line "targetsd = dummyvar(targets);": Undefined function 'dummyvar' for input arguments of type 'double'.
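
That error usually means the Statistics Toolbox, which provides dummyvar, isn't installed. A toolbox-free workaround sketch, assuming targets holds integer class labels 1..k:

    k = max(targets);                      % number of classes
    targetsd = zeros(numel(targets), k);   % one row per sample, one column per class
    targetsd(sub2ind(size(targetsd), (1:numel(targets))', targets(:))) = 1;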


Can someone give me a real-world business need where I can apply RNNs and this type of knowledge? Obviously not looking for a handout, but I'm open to exploring problems in the enterprise, or any other problems worth solving that have a market.

I find that having a goal of what I want to solve or create motivates me to learn. Whereas if I'm studying statistics without a clear goal that motivates me (calculating sports betting odds, say), it's that much harder to master and appreciate its applications.

I guess to me, knowing the application of something before I dive in with both feet is the most important truth for beginners.

As a kid, did you want to make video games, then end up learning programming but ultimately not making video games? No 8-year-old thinks "I'm going to implement lxml in JavaScript one day"; they just think of something they like or are curious about (e.g. video games).




