
I think you've made up an irrelevant argument. The work has been incorporated into a commercial product, intentionally, under the control of someone else. Software isn't a human that pays taxes, appears in court, has rights, etc.



No, the work has not been. The impression that the work leaves on a neural network has been, though.

AIs are not massive repositories of harvested data. The models are relatively small (<20GB).


A resized, smaller, or encoded version of an image is still subject to copyright. Calling an encoding an 'impression' is deceitful.


Not always.

https://www.pinsentmasons.com/out-law/news/google-thumbnails...

> A US court ruled this week that Google's creation and display of thumbnail images does not infringe copyright. It also said that Google was not responsible for the copyright violations of other sites which it frames and links to.


Part of this ruling is about how the images are used -- Fair use -- not just that they were stored in a particular way. If Google was using the smaller versions of the images (thumbnails) in other ways, it could have been infringing.

> The Court said that Google did claim fair use, and that whether or not use was fair depended on four factors: the purpose and character of the use, including whether such use is of a commercial nature or is for non-profit educational purposes; the nature of the copyrighted work; the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and the effect of the use upon the potential market for or value of the copyrighted work.


My take on copyrights and AI models...

Taking copyrighted material and using it to train a model is not a copyright infringement - it is sufficiently transformative and has a different use than the original images.

Note that AI models can be used for different things. No one has ever raised an uproar about a model trained to identify objects in an image emitting "squirrel" in its output text.

The model itself, as a purely mathematical transformation of the original source material, also does not get a copyright. If it needs to be protected, trade secrets are the tool to use. A model is no more copyright-worthy than taking an image and applying `gray = .299 red + .587 green + .114 blue` to it.
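For anyone who hasn't seen that grayscale formula in action, here is a minimal sketch in Python. The function names `to_gray` and `grayscale` and the nested-list image layout are assumptions made up for this example, not any library's API. The point is only that the output is fully determined by the input pixels and fixed weights.

```python
# Hypothetical helpers for illustration only; not taken from any library.

def to_gray(pixel):
    """Apply gray = .299 red + .587 green + .114 blue to one (r, g, b) pixel."""
    red, green, blue = pixel
    return 0.299 * red + 0.587 * green + 0.114 * blue


def grayscale(image):
    """Convert an image, given as rows of (r, g, b) tuples, to rows of gray values."""
    return [[to_gray(pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    # A tiny 1x3 "image": pure red, pure green, pure blue.
    tiny_image = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
    print(grayscale(tiny_image))
```

The three weights sum to 1, so a white pixel stays at full brightness and a black pixel stays at zero.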

The output of a model is ineligible for copyright protection (in the US - and most other places).

The output of a model may fall into being a derivative work of the original content used to train the model.

It is up to the human, who has agency in asking the model to generate certain output, to be responsible for verifying that it does not infringe upon other works if it is published.

Note that the responsibility of the human publishing the work is nothing new with an AI model. It is the same responsibility they would have if they were to copy something from Stack Overflow or commission a random person on Fiverr... it's just that we've overlooked those cases for a long time. It is similarly quite possible for the material from those sources to be copyrighted by and/or licensed to some other entity, and the human copying it into the final product is responsible for any copyright infringement.

Saying "I copied this from Stack Overflow" or "I found this on the web" as a defense is just as good as "Copilot generated this for me" or "Stable Diffusion generated this when I asked for a mouse wearing red pants", and each represents a similar dereliction on the part of the person publishing the content.


It's none of those things. These models train on petabytes of data. They store the relationships of objects to each other, not the objects themselves.



