Very impressive, but it also has some limitations: the softmax output that shows the confidence is often unreliable (tested empirically). Here's an example of a misclassification with a score of 99%: https://imgur.com/a/wnr8K
I imagine that’s because it’s difficult for the model to decide which features are important. Granting my human bias toward looking for an intuitive explanation, the “circle” in this example looks most like the circle in the picture it identified as most similar. I imagine salience assignment is difficult without either more examples or injected prior knowledge.
This work is surely interesting, but don't let the sophisticated formulation fool you: meta-learning is not the best-performing option for few-shot classification. ProtoNets and other simple matching strategies achieve far better performance.
I looked up your claim. The ProtoNets paper by Snell et al. reports 49.42% 1-shot and 68.20% 5-shot accuracy on miniImagenet, while the new Reptile paper reports 48.21% and 66.00% respectively.
I wouldn't call Reptile sophisticated; the method actually looks really simple (perform a couple of steps of SGD per task, then use the resulting parameter update as the gradient in the outer loop).
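To show how simple it is, here's a minimal sketch of the outer loop, assuming hypothetical `sample_task` and `task_loss` helpers (my names, not from the paper's code release):

```python
import copy
import torch

def reptile_step(model, sample_task, task_loss,
                 inner_steps=5, inner_lr=1e-2, outer_lr=0.1):
    # Save the current initialization phi.
    phi = copy.deepcopy(model.state_dict())

    # Inner loop: a few steps of plain SGD on one sampled task.
    x, y = sample_task()
    opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        opt.zero_grad()
        task_loss(model(x), y).backward()
        opt.step()

    # Outer update: move phi toward the adapted weights phi_tilde,
    # i.e. treat (phi_tilde - phi) as the "gradient".
    phi_tilde = model.state_dict()
    new_phi = {k: phi[k] + outer_lr * (phi_tilde[k] - phi[k]) for k in phi}
    model.load_state_dict(new_phi)
```

(Sketch only: assumes all state_dict entries are float tensors, e.g. a plain MLP with no batch-norm counters.)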
OK, I admit I hadn't looked up the actual numbers before I posted that comment, but the Snell paper wasn't the last word on that line of work; it was followed by https://arxiv.org/abs/1703.05175 (which doesn't have miniImagenet results). And there may have been further work in that direction that I'm not aware of.
You're right that Reptile is the simplest recent algorithm in the meta-learning literature, but I would argue that's basically my point: they started from somewhere pretty ambitious (let's learn a learner, or at least an SGD update rule) and ended up with learning an initialization that can be updated well with a few steps of SGD.
[EDIT]: I also prefer Matching/ProtoNets-style work as being simpler to deploy, since you don't need to retrain to add new classes. Maybe one day meta-learning will be SoTA, but there are a lot of world-class researchers on it, and the approaches keep tending away from actual meta-learning IMO, so my money is on the matching approach. Though my money is on integrating with data stores in general and not needing to squish everything into weights, so I'm a bit biased here.
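To make the deployment point concrete, here's a rough sketch of ProtoNets-style inference, assuming an already-trained `embed` network (all names here are illustrative, not from either paper):

```python
import torch

def build_prototypes(embed, support):
    # support: {class_name: tensor of support examples for that class}
    # Each class prototype is the mean embedding of its support examples.
    return {c: embed(xs).mean(dim=0) for c, xs in support.items()}

def classify(embed, prototypes, x):
    # Nearest prototype in embedding space (squared Euclidean distance).
    z = embed(x.unsqueeze(0)).squeeze(0)
    return min(prototypes, key=lambda c: torch.sum((z - prototypes[c]) ** 2))

# Adding a new class needs no retraining; just store its prototype:
# prototypes["new_class"] = embed(new_examples).mean(dim=0)
```

The class "knowledge" lives in a plain data store of embeddings rather than in the weights, which is why new classes come for free.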
I think it’s more about learning a variety of tasks. And I like the emphasis on getting at higher-order derivatives with only first-order methods, which as an abstract idea has a variety of applications.
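A back-of-the-envelope version of that idea, following the Taylor-expansion argument in the Reptile paper (notation mine): with two inner SGD steps of size $\alpha$ on minibatch losses $L_1, L_2$,

$$
\phi_1 = \phi - \alpha g_1, \qquad
\phi_2 = \phi_1 - \alpha \nabla L_2(\phi_1), \qquad
g_i = \nabla L_i(\phi),
$$

and expanding the second gradient around $\phi$,

$$
\nabla L_2(\phi_1) \approx g_2 - \alpha H_2 g_1, \qquad H_2 = \nabla^2 L_2(\phi),
$$

the Reptile "gradient" becomes

$$
\frac{\phi - \phi_2}{\alpha} \approx g_1 + g_2 - \alpha H_2 g_1.
$$

The $H_2 g_1$ term is a Hessian-vector product you get from running nothing but plain SGD, and in expectation over minibatches it rewards initializations where the gradients of different batches agree.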