
From the beginning, it's seemed completely intuitive to me that training a computer made of sand on publicly available content and then generating art later should be fair use, so long as it's fair use to train the meat computer in your head on the same content and then use it to generate art later. There's no meaningful difference to me as far as the ethics of the act are concerned.



I think the reasoning behind it is similar to the case where you are allowed to watch a movie in a theater with your eyes, but not record it with a camera.


But you can have someone that watches the movie give you a plot summary, and I assume some sort of AI device that does the same would also be alright.


That's why, for example, fan-made Star Wars movies are totally OK and Disney has no issue with them, right? In such cases not even the plot is based on anything Disney "owns".

Also, it's OK to make a movie out of a plot summary someone who read some book gave you, right?

You really think such things become legal if you have some ML algos in the loop?

https://web.archive.org/web/20220307083651/https://fairuseif...


The meaningful difference is that one is an autonomous person and the other is a machine owned by a company.


I think the tricky bit here is “fair use”, an inherently subjective concept that’s a function of context. What’s fair in one context (meat computer training due to limitations of meat computer) may not be fair in another context (silicon computer trained by Big Tech – unencumbered by meat computer limitations).


There is a big difference in most people's minds between talking to/helping individuals and machines. One establishes a human connection, the other... well, quite the opposite when on top of it they're owned by a megacorp.

I believe it was legal for them to, but a breach of an implied license.


There is no such thing as "generative art". Art cannot be generated, by definition.


The problem is the word "training". It's too anthropomorphic. Humans absolutely do learn from their inputs, but they are also capable of recognizing those influences and adopting or rejecting them. Current-generation AI technology doesn't do that - which is why Stable Diffusion loves to draw, say, incomprehensible mutations of the Getty Images watermark. Machine learning models are trained by rather simplistic fixed code - akin to the expert systems that ML replaced - and that code does not know the difference between the copyrightable and uncopyrightable features of a source image. For example, the training set had a lot of Getty Images watermarks in it, so the model is updated to draw those watermarks.
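To make that concrete, here is a toy sketch (not Stable Diffusion's actual training code, and the network, sizes, and "watermark" band are all made up for illustration): the reconstruction loss treats every pixel identically, so a watermark that appears in most training images is learned like any other statistical regularity of the data.

    # Toy illustration only: a pixel-wise loss has no notion of which
    # image features are copyrightable and which are not.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, padding=1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    def training_step(noisy, clean):
        # The objective is purely "reconstruct the target pixels".
        # Nothing here distinguishes a watermark from the rest of the
        # image; if it recurs across the dataset, gradient descent
        # learns to reproduce it.
        pred = model(noisy)
        loss = nn.functional.mse_loss(pred, clean)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

    # Fake batch standing in for training images that all carry a watermark.
    clean = torch.rand(8, 3, 64, 64)
    clean[:, :, 28:36, 16:48] = 1.0   # crude "watermark" band in every image
    noisy = clean + 0.1 * torch.randn_like(clean)
    print(training_step(noisy, clean))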

In terms of ethics, there IS a difference: AI is cheap. It does not require food, shelter, or much of anything, really. If your fellow man learns how to draw off your work, he does not become an existential threat to your livelihood in the same way an AI model potentially would. This is not an inherent problem with machines that think[1], but with the rationales for investing in AI. Generative models probably can't currently replace real human artists or programmers, but that's what your boss is hearing. The capitalist class is chock full of people who would love nothing more than to fire all workers so that they can reap 100% of the benefits of automation. Economists talk of how robots and automation didn't put humans out of a job; they just shifted the jobs around and made humans more productive. But the reality is that the people who owned the factories were hoping it would. And they're perfectly willing to take further swings at the "problem" of there being a working class.

Remember: Breaking looms was the Luddites' tactic, not their goal.

In terms of copyright (a system nominally intended to protect artists, though it doesn't do a very good job of it): the Internet is not Public Domain[0]. If you - an AI system or a human - train on a copyrighted work and produce something substantially similar to that same copyrighted work, you've infringed.

To put this another way: if Microsoft thinks it's perfectly OK to train on anything on the Internet, then they'll be OK with my LLaMA finetune on Windows NT source code leaks[2]. If I can get the model to output the source code to Windows, that means I own it now, right?

[0] no matter what Eric Bauman thinks

[1] We must negate the machines-that-think. Humans must set their own guidelines. This is not something machines can do. Reasoning depends upon programming, not on hardware, and we are the ultimate program! Our Jihad is a "dump program." We dump the things which destroy us as humans!

[2] I would love to see someone do this, get sued, and then claim Microsoft is estopped from asserting copyright infringement because they said training on other people's work is OK. It wouldn't work and Microsoft would ruin their lives but it'd be funny.



