
> Were you able to utilise any data about Lego parts from Lego's own catalogues (current and historical) or technical specifications?

I tried, but in the end a straight-up train-correct-retrain loop took care of all the edge cases much more quickly and reliably than any of the feature engineering and database correlation I tried before. This is roughly the fourth incarnation of the software and by far the cleanest and most effective. HN pointed me in the direction of Keras a few weeks ago; that, coupled with Jeremy Howard's course, gave me the keys to finally crack the problem in a decisive way.

> It sounds like you trained the classifier manually.

Only the first batch; after that it was mostly corrections. While the machine classifies one batch it saves a log, which gives me more data to feed the classifier for the next training session. There are so few errors now that I can add another 4K images to the training set in half an hour or so.
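In pseudo-code the loop looks roughly like this (simplified; predict_one is a stand-in for the real classifier call, not the actual code):

    import csv, os, shutil

    def classify_batch(model, image_paths, log_path):
        # Classify one batch and log every prediction; mistakes get
        # corrected by hand and the log becomes new training data.
        with open(log_path, "w", newline="") as f:
            out = csv.writer(f)
            for path in image_paths:
                label, confidence = model.predict_one(path)  # stand-in helper
                out.writerow([path, label, confidence])

    def fold_log_into_training_set(log_path, train_dir):
        # After manual correction the log holds the true label per image;
        # copy each image into that class's folder in the training set.
        with open(log_path) as f:
            for path, label, _conf in csv.reader(f):
                dest = os.path.join(train_dir, label)
                os.makedirs(dest, exist_ok=True)
                shutil.copy(path, dest)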

> Further to that, if you have pricing data on sets you have a nice little optimisation problem - given my metric ton of parts, what are the most valuable complete sets I can make?

I'm on that one :) And a few others that are not so obvious. There is a lot to know about Lego, far more than you'd think at first glance.
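For a taste of the optimisation side: with per-set prices and part lists it reduces to an integer program. A toy sketch with PuLP, all numbers invented:

    import pulp

    inventory = {"3001": 120, "3020": 80}  # part -> quantity on hand (made up)
    sets = {
        "set_A": {"value": 40.0, "parts": {"3001": 10, "3020": 4}},
        "set_B": {"value": 25.0, "parts": {"3001": 2,  "3020": 12}},
    }

    prob = pulp.LpProblem("best_sets", pulp.LpMaximize)
    build = {s: pulp.LpVariable(s, lowBound=0, cat="Integer") for s in sets}

    # Maximize total resale value of the sets we assemble.
    prob += pulp.lpSum(sets[s]["value"] * build[s] for s in sets)

    # Can't use more of any part than we actually have.
    for part, qty in inventory.items():
        prob += pulp.lpSum(sets[s]["parts"].get(part, 0) * build[s]
                           for s in sets) <= qty

    prob.solve()
    for s in sets:
        print(s, int(build[s].value()))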




> I tried, but in the end a straight-up train-correct-retrain loop took care of all the edge cases much more quickly and reliably than any of the feature engineering and database correlation I tried

I'd love to hear more about what you tried specifically. I'm considering doing this myself, and I was thinking of building a very large labeled dataset of 3D-rendered images using the LDraw parts library and training on that. I could include hundreds of images per part by varying viewing angle, zoom level, focus, etc. in the rendering process. Did you try anything like that?
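For concreteness, the kind of parameter sweep I have in mind, before any renderer gets involved (angles and zoom levels picked arbitrarily):

    import itertools

    # One render job per combination of viewing angle and zoom:
    # 8 azimuths x 3 elevations x 3 zooms = 72 images per part.
    azimuths   = range(0, 360, 45)
    elevations = (15, 45, 75)
    zooms      = (0.8, 1.0, 1.3)

    def render_jobs(part_id):
        for az, el, zoom in itertools.product(azimuths, elevations, zooms):
            yield {"part": part_id, "azimuth": az, "elevation": el, "zoom": zoom}

    jobs = list(render_jobs("3001"))  # 3001 = the classic 2x4 brick
    print(len(jobs))                  # 72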


That's fairly pointless for me. There are intricacies in the optics that would need to be modeled, as well as the specifics of the camera and the light path, and then you'd still have all the weird gravity-related trickery: parts that end up on top of each other, parts that can only land in one or two orientations on the belt, and so on.

After endless messing around I finally bit the bullet and trained a neural net, from 0 to 100 in a few weeks, and it is rapidly getting more usable now.

The feature-detection code may get a second life though: as a metadata vector to be embedded into the net. But only if it is really necessary.
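If it comes to that, Keras' functional API makes mixing the two inputs straightforward; a minimal sketch, with layer sizes as placeholders:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Two inputs: the camera image and a hand-crafted feature vector.
    image_in = keras.Input(shape=(128, 128, 3))
    meta_in  = keras.Input(shape=(16,))  # e.g. outputs of the old feature detector

    x = layers.Conv2D(32, 3, activation="relu")(image_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Concatenate the learned image features with the metadata vector.
    merged = layers.Concatenate()([x, meta_in])
    out = layers.Dense(1000, activation="softmax")(merged)  # 1000 part classes

    model = keras.Model(inputs=[image_in, meta_in], outputs=out)
    model.compile(optimizer="adam", loss="categorical_crossentropy")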

I'm quite curious whether you can get your method to work, though, especially for the very rare parts and the rare colors.


That's very helpful. Thanks!

I was assuming that at minimum I'd need to do a lot of filtering in order to get the camera images and renders into a state where they are similar enough to work for training.

Any chance that you'll be releasing source code for this project and/or your labeled dataset?


> Any chance that you'll be releasing source code for this project and/or your labeled dataset?

Yes, but not yet. It needs to get a lot better before I'm going to stamp my name on it as a release. Right now it is rather embarrassing from a code-quality point of view; it has been ripped apart and put back together several times now, and every time it gets a lot better, but we're not there yet.


Sounds like a great learning opportunity for beginners and for people who haven't even started with machine learning yet, like me. I'm looking forward to it. If you're so inclined, a new blog post focusing on the machine-learning process would be much appreciated; it's always interesting to read about how somebody solved a real problem and implemented the solution.

Just so I understand the process correctly: did you manually sort some pieces to get a labeled training set, feed those through the machine, train the NN with that, then manually correct the errors when sorting unknown pieces, add all those pictures to the same training set, and finally run the full training again? How many labeled images do you need to start getting acceptable performance? Are you training the NN continuously with every new image, or from scratch with an increasing data set?

Do you think a stereo camera would improve the classification in a meaningful way, or maybe a second camera from a different angle?


> did you manually sort some pieces to get a labeled training set, feed those through the machine, train the NN with that, then manually correct the errors when sorting unknown pieces, add all those pictures to the same training set, and finally run the full training again?

Yes, but that cycle repeats every day, so the training never really stops; it just runs at night while the machine runs during the day. Today it sorted close to 10K parts. Those images will now be added to the training set, then I'll start the training overnight, so tomorrow morning my error rate should be much better than it was today, and so on.

> How many labeled images do you need to start getting acceptable performance?

Good question! Answer: I don't really know, but judging by how fast the error rate is improving, between 100 and 200 per 'class'. That will be 200K images or so (200 images x 1000 classes) when it is done with the 1000 most commonly found parts.

> Are you training the NN continuously with every new image, or from scratch with an increasing data set?

From scratch with every expanded set. I suspect that's the better way, but I have no proof. My intuition is that it is hard to make a neural net learn something entirely new that it has not seen before, and every day totally new stuff gets added. So I re-train all the way from noise.
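In outline the nightly run is something like this (a simplified sketch, not the actual code; directory layout and model shape are illustrative):

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Fresh weights every night: build the net from scratch and train
    # on the full, newly-expanded image set.
    train = tf.keras.utils.image_dataset_from_directory(
        "training_set/", image_size=(128, 128), batch_size=64)

    model = models.Sequential([
        layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(1000, activation="softmax"),  # one output per part class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(train, epochs=20)
    model.save("sorter.h5")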

> Do you think a stereo camera would improve the classification in a meaningful way, or maybe a second camera from a different angle?

You're getting close to the secret sauce :)


Thanks for the answer!

I guess my lack of knowledge in the field shines through. Continual learning is apparently under active research at the moment, and this blog post about it [0] is less than two months old, so your intuition was right.

If I were to guess the secret sauce I'd say that a mirror might be involved. Is depth information not worth the trouble for these kinds of classification problems?

[0] https://deepmind.com/blog/enabling-continual-learning-in-neu...


> If I were to guess the secret sauce I'd say that a mirror might be involved.

You might be right there :)

> Is depth information not worth the trouble for these kinds of classification problems?

Yes, it would be, but there's much more to it than that. Also keep in mind that there are parts that are almost transparent, and that no matter what background color you come up with there will be a bunch of Lego parts that match it.


Would it help to get multiple images of the pieces on the belt, each illuminated from a different direction by some kind of strobe? Then the shadows could be used to recover some shape information. It might even help with translucent pieces.

Colored strobes might also help separate out pieces of different colors, although I expect that would be overkill.


checkerboard background?


Tried that one too, both b/w and purple/w... in the end a transparent background with a bunch of lights behind it works best.


One thing that I think could work is generating imagery from the catalogue for parts you've not seen before: initialize a random starting position, use Bullet to have them fall into a realistic resting position on a plane, then raytrace from the top.

This way you could also cross-reference with part seller sites to see the going rate for a part and determine whether it's worth your time to separate it manually. Have a bin for rare parts worth separating by hand.
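A rough sketch of the physics half with pybullet; the mesh file (converted from LDraw) and camera numbers are placeholders, and pybullet's built-in renderer stands in for a proper raytracer:

    import pybullet as p
    import pybullet_data

    p.connect(p.DIRECT)  # headless physics
    p.setAdditionalSearchPath(pybullet_data.getDataPath())
    p.setGravity(0, 0, -9.81)
    p.loadURDF("plane.urdf")

    # Drop the part from a random pose and let it settle.
    col = p.createCollisionShape(p.GEOM_MESH, fileName="3001.obj")
    vis = p.createVisualShape(p.GEOM_MESH, fileName="3001.obj")
    part = p.createMultiBody(baseMass=0.01,
                             baseCollisionShapeIndex=col,
                             baseVisualShapeIndex=vis,
                             basePosition=[0, 0, 0.2])
    for _ in range(480):  # 2 seconds at pybullet's default 240 Hz
        p.stepSimulation()

    # Render straight down, roughly like a belt camera.
    view = p.computeViewMatrix([0, 0, 0.5], [0, 0, 0], [0, 1, 0])
    proj = p.computeProjectionMatrixFOV(45, 1.0, 0.01, 2.0)
    _, _, rgb, _, _ = p.getCameraImage(256, 256, view, proj)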


That's clever!


You could also use Mechanical Turk to label your dataset.


Can't use MT from NL...


Oh, I forgot about that. Is there no European alternative?


Not that I know of. But in the end I found a much better and faster solution, so in a way that constraint only pushed creativity.


did you first download a pretrained net from a different classification task and use that as a default? also, did you use batch normalization? also, did you try ResNets? you probably don't care at this point, but all of this would __massively__ decrease training time


> did you first download a pretrained net from a different classification task and use that as a default?

Yes, but that did not give me enough accuracy, so now I train from scratch. I had hoped to save having to train the conv layers.

> also, did you use batch normalization?

Yes.

> also, did you try ResNets?

No.

> you probably don't care at this point, but all of this would __massively__ decrease training time

Oh, I care all right :) I'm re-training the net every evening (it's running right now) after adding another batch of training images.
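For anyone who wants to try the suggestion above, a minimal fine-tuning sketch in Keras; the input size and head are placeholders, not the author's setup:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Start from ImageNet weights and train only a new classification head;
    # unfreezing the base afterwards usually recovers more accuracy.
    base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))
    base.trainable = False

    inputs = keras.Input(shape=(224, 224, 3))
    x = keras.applications.resnet50.preprocess_input(inputs)
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1000, activation="softmax")(x)  # 1000 part classes

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")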





