Abstract: People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy. People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation. We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.
And this algorithm has required hundreds or thousands of previous attempts at machine learning (by these and other researchers) to get to the point where it could replicate that human feature.
"The "R" initial in his name stands for "robot," a naming convention in Asimov's future society; all robot names start with the initial R to differentiate them from humans which they often resemble."
What an amazing design idea that would be for dealing with clickbait though.
A crowd-sourced subtitle could be added to the link and possibly voted on for accuracy (an expanding list would show alternate submissions or allow you to submit your own).
Since headlines can no longer be trusted, we could turn to the crowd.
So a service that provides metadata for URLs: caption, description, user ratings, reviews.
A browser plugin could respond to hovering over a link by querying the service and displaying the response.
Being crowd-sourced, there would have to be some form of contributor reputation tracking.
YouTube link -> Cats react to bananas [83%; funny, cute]
Clickhole link -> Infinitely recursive self-parody of clickbait articles [71%; funny]
Kotaku link -> Gamers overreact to accusations of sexism [43%; news]
NYT link -> Anticonvulsant drug found effective against Alzheimer's [91%; news, advertisement]
Vimeo link -> Police arrest Black Lives Matter protesters in St. Louis [43%; news, violence, NSFW]
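To make the plugin idea concrete, here's a minimal sketch of what its lookup-and-tooltip path could look like. Everything here is invented for illustration: the endpoint URL, the response fields, and the formatting are all hypothetical, not any existing service's API.

```python
import requests  # standard HTTP client; the endpoint below is hypothetical

METADATA_API = "https://example.com/api/v1/metadata"  # invented for this sketch

def hover_summary(url: str) -> str:
    """Fetch crowd-sourced metadata for a link and format it for a tooltip."""
    resp = requests.get(METADATA_API, params={"url": url}, timeout=2)
    resp.raise_for_status()
    # Assumed response shape: {"caption": str, "accuracy": float, "tags": [str]}
    meta = resp.json()
    tags = ", ".join(meta.get("tags", []))
    return f'{meta["caption"]} [{meta["accuracy"]:.0%}; {tags}]'

# e.g. hover_summary("https://youtube.com/watch?v=...")
# -> "Cats react to bananas [83%; funny, cute]"
```

The reputation tracking would live server-side; the plugin itself only needs this one read-only call per hover.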
Right, but consider the impact if there were a crowd-sourced summary on Facebook (for example):
"this jaw dropping thing that happened will make you weep for humanity" `baby drops ice cream, dog eats it`
"Carrie Fisher destroys good morning America" `she throws out a few lighthearted quips, has a one-liner about jabba`
I think it really becomes effective on video content with extreme linkbait.
Personally (and as a card-carrying crotchety old man), I don't follow those links on principle. I'm carrying on my own little "boycott of one".
Instituting a platform change, though, to solve for linkbait inline (without requiring a page load to reach the comments) would change the value proposition of headlines, and I think it would be more likely to reward quality content.
A TL;DR summary is not the same as the "top comment", and the top comment is not always a TL;DR... One line, under the headline, crowd-sourced and voted on. I think it could fundamentally change behavior, and, if the platform doing it were sufficiently large, the composition of the internet.
This x1000. My biggest pet peeve is people's unwillingness to give a valid, time-saving summary. Seriously, if you've just read it and your mind has absorbed it, what the heck is so hard about giving a short recap?
Btw, TvTropes gives the kind of summary you describe under the "laconic" button, and it's a godsend.
And so all news sites approach the design of Slashdot - from non-moderated comments, to (user-)moderated comments (we're still waiting (on the need) for meta-moderation here on HN), to the presentation of stories? :-)
There is tldr.io, with somewhat buggy Chrome integration still, that will display a TL;DR below the reply box on HN (and in an expandable-on-hover box next to the link on the HN front page) iff someone has already written a TL;DR for the article in this service.
I wonder if this is how that same trend started on Slashdot too (or meme, even: "You read TFA? Are you new here?"). Then again, Slashdot has always (as far as I know/remember) provided a summary in addition to the link, similar to how the page looks today.
By 'entire field' you mean handwriting recognition? They reduce the search space by limiting themselves to combinations of typical pen movements. Very smart, but I cannot see how it can be applied to machine learning in general.
>By 'entire field' you mean handwriting recognition?
Classification of high-dimensional data is a huge portion of machine learning.
>They reduce the search space by limiting themselves to combinations of typical pen movements.
"They reduce the search space by limiting themselves to gradients of nearby pixels. Very smart, but I don't see how convolutional neural networks can be applied to machine learning in general."
All of machine learning involves some prior knowledge -- the question is always how much and at what level of abstraction.
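To make the "prior knowledge shrinks the search space" point concrete, here's a toy sketch of one-shot classification with a compositional stroke prior. Everything in it is invented for illustration (the primitives, the 5x5 grid, the Bernoulli noise model); the paper's actual generative model of pen movements is vastly richer, but the shape of the idea is the same: score a test image against candidate *programs* rather than raw pixel templates.

```python
import numpy as np

# Toy stroke primitives: unit moves on a grid. The hypothesis space is
# short compositions of these moves, not arbitrary pixel patterns.
PRIMITIVES = {"right": (0, 1), "down": (1, 0), "diag": (1, 1)}

def render(program, size=5):
    """Draw a stroke program (list of primitive names) as a binary grid."""
    img = np.zeros((size, size))
    r = c = 0
    img[r, c] = 1
    for name in program:
        dr, dc = PRIMITIVES[name]
        r, c = min(r + dr, size - 1), min(c + dc, size - 1)
        img[r, c] = 1
    return img

def log_likelihood(img, template, noise=0.1):
    """Bernoulli pixel-flip likelihood of img given a rendered template."""
    agree = (img == template)
    return np.log(np.where(agree, 1 - noise, noise)).sum()

def classify_one_shot(test_img, example_programs):
    """Pick the candidate program whose rendering best explains test_img."""
    scores = {name: log_likelihood(test_img, render(prog))
              for name, prog in example_programs.items()}
    return max(scores, key=scores.get)

# One example "character" per class, each defined by its stroke program.
classes = {"L-shape": ["down", "down", "right", "right"],
           "slash": ["diag", "diag", "diag"]}

noisy = render(classes["slash"])
noisy[0, 4] = 1  # a stray pixel the noise model must explain away
print(classify_one_shot(noisy, classes))  # -> "slash"
```

The CNN analogy in the parent comment is the same move at a different level of abstraction: local gradient filters are a prior about images in general, stroke programs are a prior about things drawn with a pen.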
As someone who has done a little bit of recent research in AI: pretty much everything high-level (new experiments, new models) is in MATLAB or Python (the exception is C/C++/CUDA when an algorithm or system must be implemented from scratch for performance).
All the libraries are in those ecosystems. It really has nothing to do with syntax, performance, ease-of-learning, quality of IDE, concurrency support, online tutorials, quality of documentation, etc.
Computer vision in particular is dominated by MATLAB because of its superior image/signal processing libraries. This may change as deep learning takes over computer vision (deep learning is dominated by (Python or Lua) + C++/CUDA).
I find OpenCV to be a far better alternative to MATLAB for image processing and CV work. The APIs are extremely similar, and when things don't work the way you expect, you can modify OpenCV to fit your needs.
Plus, if you're a grad student and you write it in C++/Python with OpenCV, your algorithm might finish before it's time to graduate.
> your algorithm might finish before it's time to graduate
MATLAB is quite fast at vectorised and matrix operations. For things that are slow you can write your own C/C++ function and call it from within MATLAB.
Complete agreement - grad students have a few jobs they need to do, and need to do quickly (a rough Python equivalent of 1) and 2) is sketched after this list):
1) Large (within reason) easy matrix manipulations. MATLAB is fantastic at this. Matrices as first-class objects is so important, I don't know why more languages don't include them (though I am pleasantly unaware of the complications inherent to this).
2) Simple, decently pretty visualizations of data.
3) Trivial syntax. If you've programmed at all, it's easy to read MATLAB syntax.
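For comparison, here's that same workflow in Python terms; a sketch, not a claim that it's as terse as the MATLAB, and the specific computations are just arbitrary examples:

```python
import numpy as np
import matplotlib.pyplot as plt

# 1) Easy matrix manipulation: solve A x = b and take an eigendecomposition.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)
eigvals, eigvecs = np.linalg.eigh(A)  # eigh: for symmetric/Hermitian matrices

# 2) Simple, decently pretty visualization of data.
t = np.linspace(0, 2 * np.pi, 200)
plt.plot(t, np.sin(t), label="sin(t)")
plt.xlabel("t")
plt.legend()
plt.show()

print(x, eigvals)
```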
IMO, the only threat to MATLAB on the horizon for academic work is Python + the associated SciPy tools (scikit-learn, etc.), especially as bundled in Spyder, which I have had a blast with. But since cost is typically not an issue at a university (MATLAB is paid for by someone else), and everyone already speaks MATLAB, there's no real impetus to change. (Julia may be in the running, too, but Spyder seems more user-friendly at this point.)
Also, as someone who's been through this a dozen times before - as soon as a professor says "make me this visualization, the command in MATLAB is this" and the grad student says "SciPy (or whatever) doesn't have that", that student is converting his data to a text file, reading it into MATLAB, and making that damn plot :)
Someone down-voted me just for asking. I have no idea about any of this stuff and I was just genuinely curious. Apparently people are pretty polarized about Matlab. Thanks for explaining that to me.
It may be nitpicking, but the phrase you used ('Why do you think X?') is very common for making passive-aggressive contradictions. ("I have a smart cat." "Why do you think cats are smart?") If you've been hanging around the internet for a while you tend to start seeing an unspoken 'you idiot' on the end of it, even when it's not there. You could well have been downvoted more or less on reflex because of that.
Perl had tons of awesome libraries. Yet Python, Ruby, and Go, which had virtually none compared to Perl when they started, still became very popular, and are arguably more popular than Perl now.
Incredible libraries, yes. But you may want to check out Python with numpy, scipy, pandas, scikit-learn, etc. There are bundles like Python(x,y) that bring them all together. You'll probably still want to stick with Python 2.7, though (as most of scientific computing still does). This removes the tedious stumbling block that is forced Unicode.
I think they downvoted you because the phrasing is ambiguous between "what evidence do you have that it's written in Matlab" vs. "why do you think the creators chose Matlab" and they interpreted it as the former.
MATLAB is a terrible programming language. MATLAB is a fantastic workbench for exploring data and implementing numerical algorithms.
There's a lot of excitement around the numerical computing tools that have sprouted up around Python, but I find the visualization capabilities lacking, and writing numerical code in Python is really quite clunky, whereas the same expressions in MATLAB are much clearer and more concise.
In short: If you're writing lots of loops and conditionals, MATLAB is the wrong tool. If you're analyzing data, applying linear algebra or exploring using the tools of mathematics, signal processing and machine learning... it really can't be beat. Everything is just there, everything just works, everything is very well documented and there's a MATLAB implementation of everything new.
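As a concrete (and admittedly small) illustration of that conciseness gap, here's some everyday numerical code in NumPy with my best guess at the MATLAB equivalent in comments; the example computations themselves are arbitrary:

```python
import numpy as np

X = np.random.randn(100, 5)       # MATLAB: X = randn(100, 5);
mu = X.mean(axis=0)               # MATLAB: mu = mean(X);
Xc = X - mu                       # MATLAB: Xc = X - mu;  (or bsxfun in older versions)
C = Xc.T @ Xc / (len(X) - 1)      # MATLAB: C = Xc' * Xc / (size(X,1) - 1);
big_rows = X[X[:, 0] > 0, :]      # MATLAB: bigRows = X(X(:,1) > 0, :);
```

Much of the difference is simply that indexing and linear algebra are built into the MATLAB language rather than imported from a library.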
My guess is that most of the general purpose programming languages are usually poorly suited in one way or another for domain specific programming.
Matlab has a large standard library and a reasonably well integrated development environment. Its language and library ecosystem is also adapted to the domain the authors are working in. Scientific libraries in other languages come close or maybe surpass them, but it is still the de facto standard in certain parts of scientific computing.
The only reasonable alternative I can think of is IPython / Anaconda with the right set of libraries, or maybe Julia, which borrows heavily from Matlab; neither, however, offers all of the IDE features that a typical Matlab user may take for granted.
Other languages like C++ are usually excluded by default because of their obviously poor suitability for rapid exploratory programming.
In an ideal world, maybe, but in reality if the ratio of
(hours invested in the actual project) / (hours invested in fixing tooling and writing high-quality, easy-to-use libraries for data visualisation, data import/export, linear algebra, optimisation, statistics, etc.)
approaches zero, then a language probably won't be used.
Also, in this case, the syntax of Matlab and the convenience with which you can manipulate indexed expressions would probably not be easily trumped even by a DSL embedded in Lisp, not without significant investment in developing such a domain-specific language.
Because Church (probabilistic Scheme) has been mostly discontinued, and Venture (the next big MIT probabilistic programming language) is currently somewhat hard to work with.
Looking at the latest release of Anglican, it looks awesome. I'm quite happy to see a PPL actually being maintained that still uses a functional-style `query` (allowing conditional distributions to be Yet More Distributions) instead of the imperative observe-infer programming from Venture.
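For anyone who hasn't used a PPL, here's a toy, from-scratch Python sketch of why the functional `query` style composes so nicely. To be clear, this is not Anglican's or Venture's actual API; it's plain rejection sampling written to illustrate the idea that a conditional distribution is Yet Another Distribution:

```python
import random

def query(model, condition):
    """Return a sampler for the model's output conditioned on `condition`."""
    def conditioned():
        while True:  # rejection sampling: retry until the condition holds
            trace = model()
            if condition(trace):
                return trace
    return conditioned

def two_coins():
    """A tiny generative model: two independent fair coin flips."""
    return (random.random() < 0.5, random.random() < 0.5)

# "Two fair coins, given at least one came up heads"
at_least_one_heads = query(two_coins, lambda t: t[0] or t[1])

# Because the result is just another sampler, it conditions again for free:
both_heads = query(at_least_one_heads, lambda t: t[0] and t[1])
print(at_least_one_heads(), both_heads())
```

In the imperative observe/infer style, that second conditioning step doesn't fall out of the types the same way; you have to thread state through the inference engine instead.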
In addition to the advantages of Matlab mentioned by others, a lot of uni/college freshman level (and even upper level) STEM curricula include one or more courses involving Matlab. They've gotten themselves included in a few textbooks as well.
That's awesome that Matlab is such a good expensive product. Personally, I'm an open-source guy all the way, so I support R and Python (and Julia, I suppose) at every opportunity. Yeah, maybe there are some rough edges, but my thinking is that since they're good enough at this point, I can work around those rough edges, and I anticipate that hundreds of thousands of users will eventually drive them to be as good as or better than any commercial offering.

Take SAS, for instance: I don't like it, largely because it's so expensive that most companies don't upgrade until they absolutely have to. So you'll run into lots of employers running a very old version of SAS (like 9.2), without any of the extensions which help you solve problems and be more productive. So when I started my new data-science-ish job at my current employer and had the option of which tools to use for machine learning and modeling, I chose R and Python.

But to each their own. I just dislike contributing to the market share of software I could never afford to run on my personal computer.
I'm in agreement with most of the things you said, and I wasn't really meaning to advocate for Matlab, but rather just to point out another reason why someone might choose it. I used it in college, and at one company I worked at. It was much nicer to work with than LabVIEW. I appreciated the hand-holding I got from Matlab, but eventually became competent enough to do my job in several languages, mainly Python.
>I just dislike contributing to the market share of software I could never afford to run on my personal computer.
I agree and that's one reason I tend toward OSS in courses I teach now, unless there is no practical alternative.
Can somebody who's read the paper or code briefly comment on how the algorithm works and where the innovation is compared to the currently used Bayesian-learning algorithms?
Code: https://github.com/brendenlake/BPL