
All models trained on public data need to be made public. As it is their outputs are not copyrightable, it’s not a stretch to say models are public domain.



You seem to be mixing a few different things together here. There's a huge leap from something not being copyrightable to there being grounds for it to be made public. A lack of copyright would greatly limit the ability of model makers to legally restrict distribution if the models did reach the public, but they'd be fully within their rights to keep them as trade secrets to the best of their ability. Trade secret law and practice is its own thing, separate from copyright. Lots of places have private data that isn't copyrightable (pure facts), but that's not the same as it being made public. Indeed, part of the historic idea behind certain areas of IP like patents was to encourage more stuff to be made public rather than kept secret.

>As it is their outputs are not copyrightable, it’s not a stretch to say models are public domain.

With all respect, this is kind of nonsensical. "Public domain" only applies to stuff that is copyrightable; if something simply isn't, then it just never enters into the picture. And it not being patentable or copyrightable doesn't mean there is any requirement to share it. If it does get out, though, then that's mostly their own problem (though depending on jurisdiction and contract, whoever did the leaking might get in trouble), and anyone else is free to figure it out on their own and share that, and the makers can't do anything about it.


Public domain applies to uncopyrightable works, among other things (including previously copyrighted works). In this case models are uncopyrightable, and I think FB (or any of these newfangled AI cos) would have an interesting time proving otherwise, if they ever try.

https://en.m.wikipedia.org/wiki/Public_domain


I’m honestly not sure. RLHF seems particularly tricky: if someone is shaping a model by hand, it seems reasonable to extend copyright protection to them.

For the moment, I’m just happy to see corporations disarmed from using DMCA takedowns against open source projects. The long-term implications will be interesting.


Aggregating and organizing public knowledge is a fundamentally valuable activity that many companies build their business on.

If I create a website for tracking real estate trends in my area, which is public information, should I not be able to sell that information?

Similarly, if a consulting company analyzes public market macro trends, is it not allowed to sell that information?

Just because the information being aggregated and organized is public does not necessarily mean that the output product should be public too.



