Background removal with deep learning (towardsdatascience.com)
126 points by dsr12 on Feb 12, 2018 | 17 comments



> Hair, delicate clothes, tree branches and other fine objects will never be segmented perfectly, even because the ground truth segmentation does not contain these subtleties. The task of separating such delicate segmentation is called matting, and defines a different challenge.

I did an internship where one of my tasks was to automatically remove the background in X-ray images. I spent basically half a year studying image processing, segmentation, reading papers, skimming image processing books, etc., and never came across the word "matting", which was exactly what I was doing.

Somebody should've told me that earlier; it would've probably saved me a month.
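
For anyone else who hadn't run into the term: matting solves for a fractional alpha per pixel under the standard compositing model (my notation, not the article's):

    I_p = \alpha_p F_p + (1 - \alpha_p) B_p,  \alpha_p \in [0, 1]

where F_p and B_p are the foreground and background colors at pixel p. Binary segmentation is the special case where alpha is forced to 0 or 1; matting has to recover the in-between values along hair, fur, and other soft edges.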


The term comes from VFX/the film industry: matte paintings are the backgrounds, and the shot filmed in front of them is "matted".


Perhaps I'm stating the obvious here, but why not train with a bunch of images that have been recorded in front of an actual green screen? That way, you can insert any random background and generate as many new training images as you like.
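
A rough sketch of what that compositing step could look like with OpenCV, assuming you already have keyed RGBA cutouts (file names here are made up):

    import cv2
    import numpy as np

    # "subject.png" is a hypothetical RGBA cutout already keyed from the
    # green screen; "bg.jpg" is any random background photo.
    fg = cv2.imread("subject.png", cv2.IMREAD_UNCHANGED)   # BGRA
    bg = cv2.imread("bg.jpg")
    bg = cv2.resize(bg, (fg.shape[1], fg.shape[0]))

    alpha = fg[:, :, 3:].astype(np.float32) / 255.0        # H x W x 1

    # Standard compositing: I = alpha * F + (1 - alpha) * B
    composite = (alpha * fg[:, :, :3] + (1.0 - alpha) * bg).astype(np.uint8)

    # Binary ground-truth mask for the segmentation network
    mask = (fg[:, :, 3] > 127).astype(np.uint8) * 255

    cv2.imwrite("train_image.jpg", composite)
    cv2.imwrite("train_mask.png", mask)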


The network is being trained for photographs, not CGI. I suspect the different cues will end up producing wildly different trained networks. But the green screen idea is still an interesting and worthy proposal.


Training via synthesis from at least one pre-existing generative model (image, speech, et cetera) is definitely a thing that people do.


1. I don't think that they have enough images taken in front of a green screen. Just changing the background has diminishing returns because the network may start to memorize the foreground images.

2. The network may rely on differences in lighting, etc., and fail to generalize.


There are commercial services doing this by hand with cheap labor. They're called "clipping path services".[1][2][3] They don't have long to live.

[1] https://www.clippingpathindia.com/ [2] https://www.offshoreclippingpath.com/ [3] http://clippingpathking.com


Unless they decide to replace their people with deep learning. They do have training images, it’s just a matter of adding software.


> Hair, delicate clothes, tree branches and other fine objects will never be segmented perfectly, even because the ground truth segmentation does not contain these subtleties. The task of separating such delicate segmentation is called matting, and defines a different challenge.

The state of the art in natural image matting already confronts fine details, as does work in image segmentation and clustering. Reimplementing papers from 10-12 years ago would give much better results than what's shown here.


To get more training data, you could have filmed using a green screen.

Insert whatever background you like (as long as the result looks natural). That way you can automatically get a good pixel map of the subject. And using a video camera at 30fps, you could get thousands of training images in just a few minutes.

Of course, you might run into an issue of overfitting (it might learn what you as an individual look like and not generalize to other people). However, as long as you green-screen a reasonably large number of people, this shouldn't be an issue.
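
A minimal sketch of the per-frame extraction, assuming simple HSV chroma keying (file names are hypothetical, and the green thresholds would need tuning against real footage):

    import cv2

    cap = cv2.VideoCapture("greenscreen.mp4")
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        # Everything sufficiently green counts as background...
        green = cv2.inRange(hsv, (35, 60, 60), (85, 255, 255))
        mask = cv2.bitwise_not(green)  # ...and the rest is the subject
        cv2.imwrite("frame_%05d.png" % i, frame)
        cv2.imwrite("mask_%05d.png" % i, mask)
        i += 1
    cap.release()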

Edit: darn. Looks like I wasn’t the only person to think of this idea!


I think this is some excellent work, and anything that can help extract items of interest from the background is going to be useful in a lot of different scenarios. That said, I can't help but imagine the mayhem of this combined with the deepfake stuff, where one might put a target person's face on a porn actor, and then that scene is transported to a place where that person might actually be found by replacing the background. Scary indeed.


The ape example might be a bad one, as it shows the loss of the hairs on its head... fun stuff though.


It says later in the article that solving for those sorts of details is out of scope for the project - and is a different problem altogether (matting).


We implemented background removal in an iOS app recently. We went down a similar route but ended up choosing a user-directed GrabCut (heavily modified).

It would be interesting to take the output of this and use the alpha mask as the starting point for the GrabCut mask.
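
A minimal sketch of that idea with OpenCV's grabCut, seeded from a grayscale alpha map (file names and thresholds are made up, not from the parent's app):

    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg")
    alpha = cv2.imread("net_alpha.png", cv2.IMREAD_GRAYSCALE)  # hypothetical network output

    # Translate the soft alpha into GrabCut's four-label trimap
    mask = np.full(alpha.shape, cv2.GC_PR_BGD, np.uint8)  # default: probably background
    mask[alpha > 64] = cv2.GC_PR_FGD                      # probably foreground
    mask[alpha > 230] = cv2.GC_FGD                        # definitely foreground
    mask[alpha < 10] = cv2.GC_BGD                         # definitely background

    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(img, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)

    out = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
    cv2.imwrite("cutout_mask.png", out)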


I integrated GrabCut on our e-commerce website. We tried to build fancy backdrops and sh*t to avoid folds in the background.

The end result is that we preferred to use the blank white wall that was already behind it.


Creator of this app here.

Thank you for the comments :)

This is currently merely a side-project alpha; we are trying to secure funding to take it further.

Glad people are having fun though :)


Need training material? Any VFX-heavy motion picture will produce a ton of it, if you can get access.



