Reptile: A Scalable Meta-Learning Algorithm (blog.openai.com)
140 points by stochastic_monk on March 11, 2018 | 11 comments



Very impressive, but it also has some limitations: the softmax output that indicates confidence is often unreliable (tested empirically). Here is an example of a misclassification with a score of 99%: https://imgur.com/a/wnr8K


I imagine that’s because it’s difficult for the model to decide which features are important. Acknowledging my human bias toward looking for an intuitive explanation, the “circle” in this example looks most like the circle in the picture it identified as most similar. I imagine salience assignment is difficult without either more examples or injected prior knowledge.


Yeah, I was thinking something similar; the alignment of the eyes is also most similar to that class.

The reason we find the frown to be most important is our understanding that this is a common symbolic representation of an emotion.

If we were to look at it as just an abstract shape, though, I think we'd come to the same conclusion as the algorithm.


Is it really that hard to determine what changes more geometrically?

:) :( :| :O :D :/ ;)


Wouldn't the solution be more accurate if you partitioned the picture into smaller segments and then ran the algorithm on each part?
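Roughly what I have in mind, as a toy sketch (the `classify` function and the 2x2 grid are placeholders I'm assuming, not anything from the post):

    import numpy as np

    def classify_by_parts(img, classify, rows=2, cols=2):
        # Split the image into a grid of tiles and score each tile separately.
        h, w = img.shape[:2]
        scores = []
        for i in range(rows):
            for j in range(cols):
                tile = img[i * h // rows:(i + 1) * h // rows,
                           j * w // cols:(j + 1) * w // cols]
                scores.append(classify(tile))
        # Naively average the per-tile score vectors into one prediction.
        return np.mean(scores, axis=0)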


This work is certainly interesting, but don't let the sophisticated formulation fool you: meta-learning is not the best-performing option for few-shot classification. ProtoNets and other simple matching strategies achieve far better performance.
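For context, the "simple matching" idea is roughly nearest-prototype classification; a hedged NumPy sketch (where `embed` is an assumed embedding network, not code from either paper):

    import numpy as np

    def proto_classify(embed, support_x, support_y, query_x):
        # One prototype per class: the mean embedding of its support examples.
        classes = np.unique(support_y)
        protos = np.stack([embed(support_x[support_y == c]).mean(axis=0)
                           for c in classes])
        # Classify each query by its nearest prototype (squared Euclidean).
        q = embed(query_x)                                      # (n_query, d)
        d2 = ((q[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
        return classes[d2.argmin(axis=1)]

Adding a new class then only requires computing its prototype from a few labelled examples, with no retraining.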


I looked up your claim. The ProtoNets paper by Snell et al. reports 1-shot accuracy of 49.42% and 5-shot accuracy of 68.20% on miniImageNet, while the new Reptile paper reports 48.21% and 66.00%, respectively.

I wouldn't call Reptile sophisticated; the method actually looks really simple (perform a few steps of SGD per task, then use the resulting update as a gradient in the outer loop).
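To illustrate, here's a minimal sketch of that outer loop as I read it from the blog post (not OpenAI's actual code; `sample_task` and `loss_grad` are hypothetical placeholders, and the learning rates are made up):

    import numpy as np

    def reptile_step(theta, sample_task, loss_grad, inner_steps=5,
                     inner_lr=0.02, outer_lr=0.1):
        # Draw one few-shot task and adapt a copy of the weights to it
        # with a handful of plain SGD steps.
        task = sample_task()
        phi = theta.copy()
        for _ in range(inner_steps):
            phi -= inner_lr * loss_grad(phi, task)
        # Treat (phi - theta) as the meta-"gradient": nudge the shared
        # initialization toward the task-adapted weights.
        return theta + outer_lr * (phi - theta)

    # e.g., with theta a NumPy parameter vector:
    # theta = np.zeros(d); theta = reptile_step(theta, sample_task, loss_grad)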


OK, I admit I hadn't looked up the actual numbers before I posted that comment, but the Snell paper wasn't the last word on that line of work; it was followed by https://arxiv.org/abs/1703.05175 (which doesn't have miniImageNet results). And there may have been further work in that direction that I'm not aware of.

You're right that Reptile is the simplest recent algorithm in the meta-learning literature, but I would argue that's basically my point: they started from somewhere pretty ambitious (let's learn a learner, or at least an SGD update rule) and ended up learning an initialization that can be updated well with a few steps of SGD.

[EDIT]: I also prefer Matching/ProtoNets-style work as simpler to deploy, since you don't need to retrain to add new classes. Maybe one day meta-learning will be SoTA, but there are a lot of world-class researchers working on it, and the approaches keep tending away from actual meta-learning IMO, so my money is on the matching approach. Though my money is really on integrating with data stores in general rather than squishing everything into weights, so I'm a bit biased here.


I think it’s more about learning a variety of tasks. And I like the emphasis on getting at higher-order derivatives with only first-order methods, which as an abstract idea has a variety of applications.
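Roughly, my reading of the Taylor-expansion argument in the paper (I may be garbling details): with inner learning rate \alpha, per-minibatch gradients g_1, g_2 of the same task, and \bar{H}_2 the Hessian of the second minibatch's loss at \theta, two inner SGD steps give

    \phi_1 = \theta - \alpha\, g_1(\theta), \qquad
    \phi_2 = \phi_1 - \alpha\, g_2(\phi_1)

    g_{\mathrm{Reptile}} \;=\; \frac{\theta - \phi_2}{\alpha}
      \;=\; g_1(\theta) + g_2(\phi_1)
      \;\approx\; g_1 + g_2 - \alpha\, \bar{H}_2\, g_1 + O(\alpha^2)

    \mathbb{E}\!\left[\bar{H}_2\, g_1\right]
      \;=\; \tfrac{1}{2}\,\frac{\partial}{\partial\theta}\,
            \mathbb{E}\!\left[g_1 \cdot g_2\right]
      \quad \text{(expectation over minibatch order)}

so the expected update contains a second-derivative term that rewards gradients of different minibatches of the same task agreeing with each other, even though only first-order SGD steps are ever computed.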


Draw an upside-down triangle and it thinks it's the first picture, which suggests this kind of AI still has a long way to go.


Is the main benefit of this sampling technique that it reduces the contribution from rare outliers with large error variance?




