Open Sourcing a Deep Learning Solution for Detecting NSFW Images (yahooeng.tumblr.com)
453 points by pumpikano on Sept 30, 2016 | 145 comments



So can it be reversed to become the ultimate porn-finding neural network?


When you said "reversed" I thought about it being a porn-generative neural network. Just enter your favorite keywords and a unique scene tailored to your needs will be generated just for you!


As long as what you need is a nightmarish version of the requested scene.

What happens here with generating illegal content? If you put a public text->image gan up, and someone uses it to generate child porn, are you responsible?

How could you make sure it couldn't?


I was wondering about illegal content in a totally different sense. I know from personal experience that overfitted generative models end up just memorizing the data set you trained them on. Say I overfit a model on Shakespeare, and you ask it to generate Shakespeare-like sonnets. It will do a fantastic job because it will just spit out one of his sonnets verbatim. Claiming that my network wrote it is ludicrous. The line between overfitting and genuinely generating its own content is fuzzy. For what you're talking about, if your model learns entirely from other people's movies, at what point is it not just a super weird codec for those people's IP?


I would imagine if you're generating then no real people were part of its creation, therefore it would be legal. If I remember correctly, cartoons of children having sex are not illegal (in the United States, as far as I know).

Though that then raises the question: what happens when it can be generated so realistically that it looks indistinguishable from the real thing? Would it still be treated like cartoons? How could you prove one way or the other? Lots of questions here.


In the USA:

Provisions against simulated child pornography were found to be unconstitutional in Ashcroft v. Free Speech Coalition [0] in 2002.

From wiki [1]:

> Referring to [New York v. Ferber, 1982: child pornography is not protected speech], the court stated that "the CPPA prohibits speech that records no crime and creates no victims by its production. Virtual child pornography is not 'intrinsically related' to the sexual abuse of children".

IANAL, but following that logic alone, the degree of realism doesn't seem to be relevant to the legal precedent insofar as photorealistic imagery would still "record no crime" nor "create victims by its production." As to whether it's dangerous for such material to exist because it would create plausible deniability for the production of actual photography while claiming it's simulated...I guess that would be a different matter.

[0]: https://en.wikipedia.org/wiki/Ashcroft_v._Free_Speech_Coalit...

[1]: https://en.wikipedia.org/wiki/Child_pornography_laws_in_the_...


They actually fixed this with the 2003 PROTECT Act, which makes only obscene simulated child pornography illegal. Because obscenity isn't protected by the First Amendment, this has been found to be constitutional.


It's hard to say that changed anything. Obscenity is a notoriously thorny subject in American constitutional law. The legal test for determining obscenity[0] is highly subjective and in the context of the internet very difficult to apply.

[0] https://en.wikipedia.org/wiki/Miller_test


Highlights from: https://en.wikipedia.org/wiki/Miller_test

The test:

* Whether "the average person, applying contemporary community standards", would find that the work, taken as a whole, appeals to the prurient interest,

* Whether the work depicts or describes, in a patently offensive way, sexual conduct or excretory functions specifically defined by applicable state law,

* Whether the work, taken as a whole, lacks serious literary, artistic, political, or scientific value.

Also:

Critics of obscenity law argue that defining what is obscene is paradoxical, arbitrary, and subjective.

---

I think it would be hard to not find generated child porn obscene by this test, unless you have a good lawyer, at which point there is plenty of wiggle room.


I think the technical feats involved in creating such a text-to-image program might allow a talented lawyer to make an argument for scientific value.

There's also the issue that "contemporary community standards" are hard to determine, because what community you're talking about is hard to determine.


While no one was harmed in the direct creation of a generated video, can it not be argued that in order to train such a generative engine, it is highly likely that harm-inducing content was produced/consumed at some stage?

Where does the harm boundary lie? Is harm inflicted if an inanimate object is the only thing consuming the content? Is a generated video borne from harmful content an interest payment on your harm-capital?

Meta, but interesting. I wonder whether this train of thought would hold in a court of law...


In the United States, under the 2003 PROTECT Act, they are actually illegal, but only if they are "obscene", as obscene speech is not protected by the First Amendment.

>Prohibits drawings, sculptures, and pictures of such drawings and sculptures depicting minors in actions or situations that meet the Miller test of being obscene, OR are engaged in sex acts that are deemed to meet the same obscene condition. The law does not explicitly state that images of fictional beings who appear to be under 18 engaged in sexual acts that are not deemed to be obscene are rendered illegal in and of their own condition (illustration of sex of fictional minors). Maximum sentence of 5 years for possession, 10 years for distribution.

Interestingly, the same act also makes illegal all computer-generated child pornography which is "virtually indistinguishable from that of a minor engaging in sexually explicit conduct", with no requirement that it be obscene. I don't think that has been tested in court yet.

https://en.m.wikipedia.org/wiki/PROTECT_Act_of_2003


Maybe today it's legal (or not?), but that's only because generating realistic porn is not possible yet. Does anyone really think that in today's American society, so afraid of sex, this would remain legal? Consider the following situation:

Imagine we're 200 years into the future and nothing much has changed with respect to attitudes about sex, freedom of speech, technology, etc. All images available 200 years ago are still there on the internet to download, including illegal child porn. If you are downloading 200+ year old child porn, where everyone depicted is long dead, how could anyone be harmed by you viewing those images? Yet I cannot imagine anyone successfully using that defense if caught with those images.

It's not about who is being harmed by these images (if anyone), it's more about society hating pedophilia and going after it wherever possible, free speech be damned.


> how could anyone be harmed by you viewing those images?

I think the argument is that pedophiles view existing media and it encourages them to do things to real people.

The same reason why excessively violent movies are not considered acceptable for children: it's not out of concern for the people in the movies, it's the effect it will have on the person viewing it.


By that logic rape porn should be illegal too. And many movies and video games of realistic violence / gore.

And that may be part of the argument, but I think the official position of the justice department is that each time a child porn image is viewed, additional harm is inflicted upon the victim in the image.


Well I think many people would want those things to be illegal too. Censorship laws are not necessarily logically consistent, nor are they black and white. Changing these laws will always be contentious.


Here in Airstrip One, cartoons of children may be illegal.

http://www.mirror.co.uk/news/uk-news/fan-japanese-anime-make...


I've spoken about this many times, and I'll say it again - it's a horrible violation of freedom of speech and expression to not be allowed to possess or create certain drawings.

There is no law that sickens me more than the one that criminalises drawings. It angers me far more than even modern copyright law.

And the worst part is that nobody seems to care about it; even the libertarians I've met in the UK are overcome with repulsion at that type of speech or expression, not caring about the violation of rights.

It's a horrible law backed by no evidence of harm caused, and I think it is truly, truly wrong that people ignore it. I am not exaggerating when I say that the principle is a motivating factor for me to leave the UK.


I didn't see that anyone had replied to my post until now. HN could really do with an inbox feature!

I just wanted to say that I agree 100%. The UK's laws on obscenity and freedom of speech in general are a terrifying mess.

You can be arrested and convicted for as little as wearing a t-shirt with an offensive slogan. [0]

How do we define offensive? Well, nobody really knows. It basically depends on the magistrate or jury you find yourself in front of.

When I discuss cases like this with people, they often say something along the lines of "but how can you defend this person... what they said was racist/homophobic/obscene/insulting to the dead etc". How many times will I have to explain, I'm not defending the person or their opinions, I'm defending the principle of freedom of speech.

Nobody is willing to stand up for a cartoonist who draws creepy pictures, or a football fan with a terrible sense of humour. But really we should all be protesting in the streets over this stuff.

How long before your t-shirt is deemed offensive? Or something you wrote? Or something you drew?

Just like with the cartoons, there is zero evidence of harm caused. Just innocent people persecuted with no justification.

[0] http://www.bbc.co.uk/news/uk-england-hereford-worcester-3674...


I agree, and in my opinion it's a sort of slide further and further down, in which people decide what is offensive or not, what is acceptable or not, and they don't require evidence for it - they just legislate the problems away. And it's often done by people with good intentions. They want society to run smoothly and nicely - but by trying to ensure that with law, we lose individual liberties.

I've noticed that people think of speech in the same way as "you should be allowed to say that" or "you shouldn't be allowed to say this" rather than in the way of "you should be allowed to speak" or "you shouldn't be allowed to speak". This kind of "particular" reasoning leads to examining the contents of speech rather than simply the right.

A magistrate who ruled in a case about the cartoons I mentioned said that "society has no need for these materials", or words to that effect. This really proves my point about putting society's use for something above the individual. Most laws usually considered unjust, we find, are related to protecting people from themselves or to inadequate consideration for personal liberty.


I don't think there are that many questions really. The answers already exist, like you mention. If it's not real and it's generated/created content, it isn't illegal. This is the same reason you can have all the crazy Asian animated adult content.

The only question that needs answering is: what is considered real?


Consult a lawyer first; in some parts of the world constructed images of child pornography are illegal, covering both drawn child pornography and photo-manipulation where a child's head is pasted on to a young-looking but legal woman's body.


>How could you make sure it couldn't?

I just realized the answer to this is pretty obvious, you could have a network trained to classify child pornography and use it to censor the output.

Getting the training set would be an issue, you would probably have to work with law enforcement to do it.
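
The general shape of that idea, for any generator plus a separately trained "unsafe content" classifier, is just rejection sampling over the generator's output. A minimal Python sketch, assuming hypothetical generate() and score_unsafe() functions (neither exists in the Yahoo release):

    UNSAFE_THRESHOLD = 0.2  # err toward rejecting borderline outputs

    def generate_filtered(generate, score_unsafe, max_tries=10):
        """Sample from a generative model, discarding anything the classifier flags."""
        for _ in range(max_tries):
            image = generate()                     # e.g. G(z) for random noise z
            if score_unsafe(image) < UNSAFE_THRESHOLD:
                return image
        return None  # return nothing rather than risk a false negative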


Impossible. There is no accurate way to determine age by external appearance. For example, a 15 year old could look older than an 18 year old.


That doesn't stop law makers from trying. Like the mess over small-breasted porn being banned in Australia in 2010:

> While the ACB claims that there is no blanket ban on small breasts as such, women over the age of 18 with small breasts who might look young ARE banned.

http://www.inquisitr.com/59633/australian-government-censor-...


Yeah but Australia doesn't really have protected speech per se. Different legal system.


Australia doesn't technically have "freedom of speech" as America or other countries may have.


If it's being generated no characters actually have an age.

Any that are plausibly adults would presumably be legally in the clear.


Incidentally, that's almost exactly how generative adversarial networks work.

To generate realistic images, you make two neural networks: one of them (D) takes an image as input and decides whether it's real or whether it's the output of (G). The other (G) takes random noise as input and turns it into an image that will fool the (D) network.

Make them fight until they both get strong, and then use (G) as the final model.
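
For the curious, that alternating training loop looks roughly like this in PyTorch; the tiny fully-connected G and D are placeholders for illustration, not anything from the Yahoo release:

    import torch
    import torch.nn as nn

    # Toy nets for 28x28 images flattened to 784 values;
    # real image GANs use convolutional architectures.
    G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                      nn.Linear(256, 784), nn.Tanh())
    D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                      nn.Linear(256, 1))

    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    def train_step(real_batch):               # real_batch: (batch, 784)
        b = real_batch.size(0)

        # 1) Train D: real images should score 1, G's fakes should score 0.
        fake = G(torch.randn(b, 100)).detach()
        loss_d = bce(D(real_batch), torch.ones(b, 1)) + \
                 bce(D(fake), torch.zeros(b, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # 2) Train G: produce fakes that D scores as real.
        fake = G(torch.randn(b, 100))
        loss_g = bce(D(fake), torch.ones(b, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

After enough alternating steps you keep G as the generator and throw D away (or keep it around as a rough realism score).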


Surely you would just filter on the input?


Maybe, it seems like there might be ways to work around an input filter though.


Would computer generated child porn be illegal? No children are harmed in its creation.


No

" (8) “child pornography” means any visual depiction, including any photograph, film, video, picture, or computer or computer-generated image or picture, whether made or produced by electronic, mechanical, or other means, of sexually explicit conduct, where— (A) the production of such visual depiction involves the use of a minor engaging in sexually explicit conduct; (B) such visual depiction is a digital image, computer image, or computer-generated image that is, or is indistinguishable from, that of a minor engaging in sexually explicit conduct; or (C) such visual depiction has been created, adapted, or modified to appear that an identifiable minor is engaging in sexually explicit conduct.

...

(11) the term “indistinguishable” used with respect to a depiction, means virtually indistinguishable, in that the depiction is such that an ordinary person viewing the depiction would conclude that the depiction is of an actual minor engaged in sexually explicit conduct. This definition does not apply to depictions that are drawings, cartoons, sculptures, or paintings depicting minors or adults. "

In theory, you could argue that the network was trained on child porn, and as such anything produced by it involved the use of a minor engaging in sexually explicit conduct; but I can't imagine a court actually buying that argument.

https://www.law.cornell.edu/uscode/text/18/2256


Damn it, this is serious stuff. Please cease dispensing incorrect legal opinions.


Wouldn't that make it illegal under 8B?


Sufficiently realistic digital child porn would be prohibited under 8B. The original context of this conversation was neural net generated images, so I had not considered the case of photorealistic images. Those would be a violation of 8B (if they are of sufficient quality).

Having said that the segment I quoted is only the definition of child porn. The law prohibiting child porn [0, section c2] provides that:

"It shall be an affirmative defense to a charge of violating paragraph (1), (2), (3)(A), (4), or (5) of subsection (a) that— ...

the alleged child pornography was not produced using any actual minor or minors. No affirmative defense under subsection (c)(2) shall be available in any prosecution that involves child pornography as described in section 2256(8)(C)"

It is worth mentioning the segments of section A that are excluded from this defence:

Section 3B prohibits the advertisement/distribution/solicitation/etc. of material that is claimed to contain (i) "an obscene visual depiction of a minor engaging in sexually explicit conduct; or (ii) a visual depiction of an actual minor engaging in sexually explicit conduct;"

The relevant part of this is (i), where you would need to parse out the definition of "obscene" and "minor". Section 2256 defines minor as "any person under the age of eighteen years", however the courts would probably read it in this context in contrast to the phrase "actual minor". I could not find the definition of "obscene" or "actual minor". Talk to a lawyer.

Section 6 prohibits providing child porn to a minor.

Section 7 requires a depiction of an identifiable minor.

[0] https://www.law.cornell.edu/uscode/text/18/2252A


Based on what you quoted up-thread, a fictional depiction of an identifiable minor would be considered child pornography too.

Moreover, from the above quote it seems that using the visual identity of a minor could count as "using" a minor.

I'd imagine that means if you used any reference images of faces for your neural net, there's a chance of violating the "letter" of this provision.

This of course is not legal advice.


Yes. Whether that is a good idea should probably not be debated here, as it will lead to a 500-comment subthread.


Who doesn't love a 500-comment subthread?!


But many can be harmed because of its creation, exposure to it etc.


This has been done, sort of. NSFW: http://deepdickdreams.tumblr.com/


Someone tell H.R. Giger his job has been automated.


It would be interesting to see the "deep dream" treatment of ordinary images using this neural net.
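
Roughly, the deep dream treatment is gradient ascent on the input image to exaggerate whatever a chosen layer responds to. A hedged PyTorch sketch using a stock torchvision network as a stand-in (the released open_nsfw model is a Caffe model, so you'd have to port or wrap it to do this against the actual NSFW features):

    import torch
    import torchvision.models as models

    net = models.vgg16(pretrained=True).features.eval()  # stand-in network

    def deep_dream(image, layer_idx=20, steps=50, lr=0.05):
        """Gradient-ascend on the input so the chosen layer's activations grow."""
        img = image.clone().requires_grad_(True)
        for _ in range(steps):
            loss = net[:layer_idx](img).norm()   # "dream" objective: big activations
            loss.backward()
            with torch.no_grad():
                img += lr * img.grad / (img.grad.abs().mean() + 1e-8)
                img.clamp_(0, 1)
                img.grad.zero_()
        return img.detach()

    # dreamed = deep_dream(torch.rand(1, 3, 224, 224))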


I've been looking for balance in the universe after VidAngel requires filtering of obscenity. I want to run videos through a neural network to add obscenity and nudity.


That'll be the killer app of Augmented Reality, "see-through" glasses.

"Ok glass, undress her.".



My first thought as well. Integrate this with some web crawlers that test images as they go, so the crawl keeps heading down the path of maximum-NSFW websites.

Though at the same time I fear the type of pornography it may come across.


That's what Bing is for.


And in fact, that's the only thing Bing is good for.

It is really good at it, though.


I can assure you this is not by accident.


Microsoft is intentionally doing a poor job at all the non-porn uses of Bing?


No. From what I understand, Google downplays porn searching and that's the only area left for Bing to shine.


Using it as an adversarial trainer for another net it could generate infinite amounts of porn.


The Internet seems to be accomplishing this on its own just fine already.


My first thought as well, but it turns out that finding porn on the Internet is just not that much of a problem.


"So can it be reversed to become the ultimate porn-finding neural network?"

They have that, it's called 'Google' :)


You can use the nsfw subreddits as a massive dataset for categorizing porn.


Not all of them. gonwild[1] is just a massive polygon pun subreddit.

[1] https://www.reddit.com/r/gonwild


I was a little nervous at first because it asked me to confirm my age and everything, but it looks like you are right. Just a bunch of math geeks, haha.


Very edgy stuff!


Now I can replace Bing!


I should update my sexy map finder: http://exclav.es/2016/05/20/sexy-maps/


Those are some hot map pictures!

I don't understand how Google's algorithm can be misled into finding sexiness in those. I imagine it has something to do with skin tones or flesh colors, but then what about the high-contrast patchwork of green and brown fields Google finds "likely to contain adult content"? That's totally puzzling.

The confusion with medical images is way more understandable. If you squint, you can almost imagine those are pics of skin cancer or lesions.

Oddly enough, I even see violence in the "violent" picture. In an abstract, Rorschach Test sort of way. Well done, Google!


> I don't understand how Google's algorithm can be misled into finding sexiness in those.

I'm reminded of a paper for which the authors generated different pictures of static that fooled neural network image classifiers into confidently identifying them as different objects: https://arxiv.org/abs/1412.1897

Wired summary: https://www.wired.com/2015/01/simple-pictures-state-art-ai-s...

> Computer vision and human vision are nothing alike. And yet, since it increasingly relies on neural networks that teach themselves to see, we’re not sure precisely how computer vision differs from our own. As Jeff Clune, one of the researchers who conducted the study, puts it, when it comes to AI, “we can get the results without knowing how we’re getting those results.”


Tangent: is your blog pure HTML/CSS? I really like the look of it and have been wanting to start something just like yours.


Not OP, but looks like it's built on Jekyll. If you're not familiar, it's a Static Site Generator. It generates static html files from your content (generally markdown files).

There are other SSGs, I use Hugo myself (http://gohugo.io, sample of my blog: http://arianv.com/). I like Hugo cause it's probably the fastest SSG (every time you create a post/change content, you're remaking your entire site from scratch. If you have lots of posts - this adds up!) but Jekyll is the most popular and has great tooling.

Hope that helps!


Awesome, I'll check both of them out. Thanks so much!


Forgive my ignorance of ML, but the last bit, "you'll need your own porn to train on", confused me. Does this mean that they're just exposing the rough topology of their neural net (e.g. depth) and not the actual weights between nodes? I'm curious to learn from an ML expert how much this actually offers.


It looks like they are sharing the trained network, but they aren't sharing the training data set.


The training set is almost certainly composed of copyrighted material.


Interesting thought - doesn't every single porn producer now have a valid copyright claim on the trained network? I don't see how you can argue this isn't a derivative work based on the movies they produced.


I was debating with a friend about just this: whether a text-to-speech model is a derivative work of audio recordings by a given speaker, such that they'd then have claim on ownership of it. (You could almost certainly create an [overfit] model that could re-generate the original performance of a text from said text.)

Moot if it was a work-for-hire, of course; but if I, say, created a Samuel L. Jackson speech model by training on samples from his movies, and sold it as one of those car-navigation voices, could I be sued? By Mr. Jackson? By the copyright-holders of the movies?

And if I could, what does that imply about impersonators, who do the same thing, but with their brains?


I imagine you couldn't put his name on it which would be a huge deal if you wanted to sell it.


I don't think it's a derivative work just because one of the inputs is copyrighted. I think it's more descriptive than derivative. Content producers don't generally own copyright in critics' descriptions of their movies or of plot summaries, even though their copyrighted material is a necessary input to the description's creation.


Also, good luck trying to prove that your film was used to train this network.


You could be compelled to produce your training set during the discovery phase of a lawsuit.


Only if there was any reasonable basis to believe that infringement had taken place. Most countries have some sort of pre-trial hearing before a civil suit to determine if there's merit to the accusation. You're generally not allowed to go on fishing expeditions without a reasonable basis for your claim(s).


It would not be considered derivative work, because what is produced is nothing like the original. There is nothing recognizable in the work produced for a court to rule on.

This is like saying the hash of the text of a book is derivative. If it were ruled that this is the case (that a hash is a derivative work) then suddenly every single number in existence is a derivative of every single other number (since there will always exist some function that will transform X into Y.)


It would fall under "transformative" fair use.


Except the 'derivative' has nothing in common with the original.


I don't think you could argue this is a derivative work and not be sanctioned by the courts for bringing frivolous litigation.


If you want to train, yes. AFAIK, they are releasing a pre-trained model which you can just use right away. There is nothing special about the "topology" of this Yahoo-specific model, as it is a caffenet.


You initialize the net using their weights, and then provide your own data, in this case a list of images with a (porn|no-porn) label, to 'fine-tune' the net towards your use case.
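
In pycaffe terms, that fine-tuning is roughly the following; the file names are illustrative (the released weights file is named something like resnet_50_1by2_nsfw.caffemodel, and the solver prototxt would point at your own labeled training data):

    import caffe

    caffe.set_mode_gpu()  # or caffe.set_mode_cpu()

    # The solver prototxt defines the net, your (image, porn|no-porn) data source,
    # learning rate, etc. -- typically with a lowered base LR and a re-initialized
    # final layer sized for your labels.
    solver = caffe.SGDSolver('nsfw_finetune_solver.prototxt')

    # Start from Yahoo's released weights instead of random initialization.
    solver.net.copy_from('resnet_50_1by2_nsfw.caffemodel')

    solver.step(10000)                               # run fine-tuning iterations
    solver.net.save('my_finetuned_nsfw.caffemodel')  # snapshot the adapted model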


Direct link to Github: https://github.com/yahoo/open_nsfw


Has anyone tried taking the features that are learned at the various layers of a neural net and feeding them into something like this: https://news.ycombinator.com/item?id=12612246?

I imagine we would get some really interesting images back...


Can you run deep dream on this? That would be quite fascinating.


I think you misspelled "horrifying". I can only imagine it producing something akin to the human centipede ... "Infinite Girls, One Cup" ...


> We are not releasing the training images or other details due to the nature of the data, but instead we open source the output model which can be used for classification by a developer.

I'm guessing the one who had to input the data/images had a fun time at work :p


I used to work in a company which had a division doing manual image classification next door. Not a fun time at all, the people who worked there regularly burned out on relentlessly seeing terrible things.


I've often thought that it would be even more helpful to automatically filter violent images. Particularly to spare humans from having to be the filters (and "relentlessly see terrible things").

However, I imagine that's far more difficult to accomplish. How do you detect graphic violence? Looking for blood isn't going to cut it. Also, I can't imagine how you'd separate the fictional from the real - I can watch horror movies with realistic special effects all day, but real violence/mutilation/death bothers me deeply.


Likewise the folks at YouTube who have to evaluate potential CP / gore / etc. all day long.


Or they just turned safe search off and collected everything that doesn't also turn up on moderate search.


They acknowledge that NSFW (or pornographic) is hard to define, a la 'I recognize it if I see it'.

But looking at the meager 3 sample images I'm confused about the scoring already. Why is the one in the middle scoring the highest?

The question is an honest one. The two rightmost images seem to be interchangeable to me and are ~boring~: people at the beach. Is this network therefore already trained to include the biases of the creators?


All ML networks are inherently biased towards their creators. My colleague recently described this issue to me as the "old, white, male" problem. This is why most voice recognition services drastically fail when they are shown foreign accents.


> This is why most voice recognition services drastically fail when they are shown foreign accents

As someone with a broad Norwegian accent: This has gotten massively better over the last few years.

Not that long ago, my local cinema chain started using voice recognition to discriminate between a list of city names, and it would consistently think I said "Birmingham" when I said "London" (!).

These days, both my Amazon Fire and the Youtube app will correctly recognise most things I throw at it, including e.g. names of random Youtube channels that bear no relation to real English words.

It's by no means perfect, but it's getting there. In relation to the "old, white, male" problem (well, I do somewhat fit that), presumably because these systems are now finally trained on huge and varied data sets.


I wonder if anyone at Yahoo tried using this to "deconvolute" noise into Cronenberg nightmare porn?


Sit back, grab some popcorn. Let's see how long it takes people to start running data backwards to get new original porn.


This was done before and posted to HN a while ago (albeit not quite in the same sense as DeepDream). NSFW: http://blog.clarifai.com/what-convolutional-neural-networks-...


I see you never have seen what kind of images that tends to generate.

Or you're really into R'lyehian porn.


Cthulhu has nice tentacles. So I guess it will be run of the mill hentai.


You have no idea. Try googling for "deep dream porn".


>Or you're really into R'lyehian porn.

It's called tentacle hentai...


My first thought was of years ago, when I was pitching open source forensic services to London police (did not get far, bad salesman that I am).

Cataloging and categorising seized pornography is a nasty job, and one that cops across the planet might do better with good common OSS tools.

Hopefully this will help.


My first thought: would probably be very useful for sites to crack down on inappropriate content.

My second thought: I could probably use this to find porn in unexpected places via a webscraping Python program.


Good to see they've automated this (beyond the initial classification of training data). In the early days of the web, such filters were typically based on manually maintained lists of sites. I actually met someone at a party once whose full-time job was to surf for porn, to maintain the filter for a provider of IT services to schools (he worked for a company now called RM Education). He said it was his ideal job for the first few days, but soon grew tiresome (note that back in those days there wasn't really any extremely objectionable material on the web).


Anyone else see the irony in acknowledging that NSFW is subjective and contextual, but assuming that pornographic images are not?


There's no definition of "ironic" that I can think of applying to that. It's like saying that "beautiful painting" is subjective and contextual, but "still life painting" isn't. It just happens that pornography is almost[1] always considered NSFW, but again, I can't see how that is ironic.

[1] Almost, because porn is SFW when your work involves porn.


I think it's easier to agree that a picture of a naked person is NSFW than that it's pornographic.


I didn't see where they said that pornographic images are not subjective and contextual. In fact it seemed like they were using "NSFW" and "pornographic" almost interchangeably, while acknowledging that the current implementation doesn't deal with violent images etc.

For instance, if you read it like this, it still makes sense (I replaced NSFW with pornographic):

Disclaimer: The definition of pornographic is subjective and contextual. This model is a general purpose reference model, which can be used for the preliminary filtering of pornographic images. We do not provide guarantees of accuracy of output, rather we make this available for developers to explore and enhance as an open source project.


I'm not a deep learning person whatsoever, but I do have an interesting use case that I won't disclose publicly: Is there a way to build this, and output detections based on the, ugh, object it has detected?

e.g.

penis 0.94

vagina 0.01


Yes. You would have to have a large training set with these labels but it would be pretty straightforward to train. You would probably want a tagging model not a classifier because there could be multiple objects of interest in the same image. If you get me the training data I could train a model for you pretty quickly.
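
A common way to get that tagging behaviour is one sigmoid output per tag with a per-tag binary cross-entropy loss, rather than a single softmax. A rough PyTorch sketch with a placeholder backbone and made-up tag names:

    import torch
    import torch.nn as nn
    import torchvision.models as models

    TAGS = ['tag_a', 'tag_b', 'tag_c']  # whatever labels your training set uses

    backbone = models.resnet18(pretrained=True)
    backbone.fc = nn.Linear(backbone.fc.in_features, len(TAGS))  # one logit per tag

    loss_fn = nn.BCEWithLogitsLoss()   # independent per-tag binary losses
    optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

    def train_step(images, tag_targets):
        # tag_targets: float tensor of shape (batch, len(TAGS)) with 0/1 per tag
        logits = backbone(images)
        loss = loss_fn(logits, tag_targets)
        optimizer.zero_grad(); loss.backward(); optimizer.step()
        return loss.item()

    def predict_tags(image, threshold=0.5):
        with torch.no_grad():
            probs = torch.sigmoid(backbone(image.unsqueeze(0)))[0]
        return {t: p.item() for t, p in zip(TAGS, probs) if p > threshold}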


And it would be an... interesting job to tag the training set. Although for higher level content, I suppose lots of porn videos have very specific category tags that could be an interesting data set to play with. Uh, to analyze.


IIRC you can use ML to learn the tags as well. They tend to be in text surrounding the media.


What's the current ML pet method for multi-label image classification? It seems like you could string together a bunch of individual classifiers e.g. "Scene contains dog", "Scene contains cat", but is there an efficient (and effective) way of doing it in one go? Does it significantly increase the complexity of the network? I would imagine a cat detector would be far simpler than a cat and/or dog detector.


Maybe Image Captioning: http://cs.stanford.edu/people/karpathy/densecap/ (recently discussed on HN too)


> large training set

Is the training set itself large, or are you meaning a... large training set?

(sorry I couldn't help myself)


There was an Asian paper about this years ago for detecting "them" for blurring purposes; it worked quite well.


I think I'll pass on browsing the deep dream visualizations for this.


With access to Flickr and Tumblr it must have been very easy to create a huge training set for such a task.


Aren't there more important problems to work on than worrying about someone looking at naked people? This is just what we need: more effort spent on censoring and controlling people.


Wouldn't you say that preventing a kid from accidentally viewing porn while searching images with `Safe Search: On` is an important problem?


Well, considering the other shit they look at and participate in, like video games with people killing each other and blood spattering everywhere, I'd rather they were viewing naked, non-violent people.


Reminds me of this post from hackerfactor where he describes his own porn filter based on pHash.

http://www.hackerfactor.com/blog/index.php?/archives/529-Kin...

It'd be interesting to see a direct comparison of the two. Off the cuff, I'd expect the deep neural network to be more accurate and better at generalizing, but much more expensive to train.
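
For comparison, the pHash approach boils down to a Hamming-distance lookup against hashes of images you have already reviewed; here is a small sketch using the Pillow and imagehash packages (placeholder file names), which is the part a learned classifier replaces:

    from PIL import Image
    import imagehash

    # Hashes of images already known to be NSFW, built offline by a reviewer
    # or an existing filter.
    known_nsfw_paths = ['seen_before_1.jpg', 'seen_before_2.jpg']
    known_nsfw_hashes = [imagehash.phash(Image.open(p)) for p in known_nsfw_paths]

    def looks_like_known_nsfw(path, max_distance=8):
        h = imagehash.phash(Image.open(path))
        # Subtracting two ImageHash objects gives their Hamming distance.
        return any(h - known < max_distance for known in known_nsfw_hashes)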


Another work in this field: "Adult video content detection using Machine Learning Techniques". PDF: http://colorlab.no/content/download/37238/470343/file/Victor...


I'll bet this would be a good tool for sysadmins or network administrators to run against their network and see what it finds.


Awesome!

I have been using nude.js to do this ( http://s.codepen.io/icodeforlove/debug/gMrEKV ), which is hit or miss.


To be precise they are only releasing the already trained model. The associated dataset is not being made public.

Thus, it is meant to be for off the shelf use rather than being able to tinker with the network to produce nuanced results.


Or they just don't want to be distributing gigabytes of porn... most of which is probably under copyright.

Making the data set available and whether you can tinker with or retrain it are very different things.


I wonder what would happen if we stopped firing people for watching NSFW images. I mean bosses look at NSFW images all the time and it sounds like a shallow reason to fire someone.


Creates a hostile work environment, however.


Are there any other fairly basic image recognition problems that people want? I'd be happy to provide as long as a dataset is easy to collect.


Has anyone run this NN on the censored Facebook image?


Interesting that the photo of two women on the beach is given a higher NSFW rating than the photo of a man on the beach.


Could this work on mobile to detect 18+ content in images or video? Or would it be a trained library of 50mb+?


I literally just started working on this problem 2 hours ago >_<


That's it! That's what I'm looking for.


Is this just the network or is it a fully trained model? The TechCrunch article suggests the former but the Yahoo post the latter...


It's the trained model. You can use it out of the box or refine it to tailor to your environment.
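
Out-of-the-box use via pycaffe looks roughly like this; the file names and the 'prob' output blob follow my reading of the open_nsfw repo, so double-check against its classify_nsfw.py (which also has the exact preprocessing):

    import numpy as np
    import caffe

    net = caffe.Net('nsfw_model/deploy.prototxt',
                    'nsfw_model/resnet_50_1by2_nsfw.caffemodel',
                    caffe.TEST)

    # Standard pycaffe preprocessing: HWC RGB float in, CHW BGR mean-subtracted out.
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))
    transformer.set_mean('data', np.array([104.0, 117.0, 123.0]))  # approx. BGR mean
    transformer.set_raw_scale('data', 255)
    transformer.set_channel_swap('data', (2, 1, 0))

    img = caffe.io.load_image('some_image.jpg')            # float RGB in [0, 1]
    net.blobs['data'].data[...] = transformer.preprocess('data', img)
    out = net.forward()
    print('NSFW score:', out['prob'][0][1])   # index 1 = NSFW class probability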


Nice, was on phone only at the time and not able to dig deeper.

Took a couple hours to get it all up and running but indeed it works, and not half badly at that!

This is obviously a way cheaper alternative to https://sightengine.com or http://imagevision.com

Kudos to Yahoo for releasing this!


I hope this is ported to TensorFlow soon!


Oh silly Americans, it's just tits.


I would suggest that the link should go to Yahoo's blog post

https://yahooeng.tumblr.com/post/151148689421/open-sourcing-...

which contains some technical details. (And furthermore, I guess the HN crowd has enough Internet experience to come up with stupid jokes of their own design.)


The Yahoo blog[1] post is far more interesting than this techcrunch "article". Suggest changing URL to the Yahoo Blog please.

[1] https://yahooeng.tumblr.com/post/151148689421/open-sourcing-...



So this is what Yahoo was up to for the last 10 years, instead of building any sort of security, keeping Yahoo Messenger working properly, or anything else of value? Heckuva job, Yahoo.



