Is it really so blurry? Social engineering is about fooling a human. If there is...

monitron · on Dec 15, 2023

The LLM is trained on human input and output and aligned to act like a human. So while there’s no individual human involved, you’re essentially trying to social engineer a composite of many humans…because if it would work on the humans it was trained on, it should work on the LLM.

zer00eyz · on Dec 15, 2023

>> to act like a human

The courts are pretty clear: without the human hand there is no copyright. This goes for LLMs and monkeys trained to paint...

Large language MODEL. Not AI, not AGI... it's a statistical inference engine that is non-deterministic because it has a random number generator in front of it (temperature).

Anthropomorphizing isn't going to make it human, or agi or AI or....
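
To make the "random number generator in front of it" point concrete, here is a minimal sketch of temperature sampling over a handful of made-up scores. The function name `sample_with_temperature` and the toy logits are assumptions for illustration, not any particular model's API.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick a token index from raw scores after temperature scaling.

    Hypothetical helper for illustration only.
    """
    if temperature <= 0:
        # Zero temperature is treated as greedy decoding: no randomness at all.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(score - peak) for score in scaled]
    # The only source of non-determinism is this draw from the RNG.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
rng = random.Random(1)
print([sample_with_temperature(toy_logits, 1.0, rng) for _ in range(10)])
print(sample_with_temperature(toy_logits, 0, rng))
```

With the temperature at 0 (or a fixed seed) the same input gives the same output every time; the variation comes entirely from the sampling step, not from the scores.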

monitron · on Dec 16, 2023

Okay. I think you might be yelling at the wrong guy; the conclusion you seem to have drawn is not at all the assertion I was intending to make.

To me, "acting like a human" is quite distinct from being a human or being afforded the same rights as humans. I'm not anthropomorphizing LLMs so much as I'm observing that they've been built to predict anthropic output. So, if you want to elicit specific behavior from them, one approach would be to ask yourself how you'd elicit that behavior from a human, and try that.

For the record, my current thinking is that ML model output shouldn't be copyrightable either, unless the operator holds unambiguous rights to all the data used for training. And I think it's a bummer that every second article I click on from here seems to be headed with an ML-generated image.

zer00eyz · on Dec 16, 2023

> So, if you want to elicit specific behavior from them, one approach would be to ask yourself how you'd elicit that behavior from a human, and try that.

This doesn't seem that human: https://www.theregister.com/2023/12/01/chatgpt_poetry_ai/

How far removed is that from: Did you really name your son "Robert'); DROP TABLE Students;--" ?

I think that these issues probabilistically look like "human behavior", but they are leftover software bugs that have not been resolved by the alignment process.
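
For anyone who hasn't seen the Bobby Tables reference, a small sketch of the bug using Python's built-in `sqlite3` and an in-memory database. The helper names (`make_db`, `insert_unsafe`, `insert_safe`, `table_exists`) are made up for the example. The parallel to prompts is only an analogy: in both cases data and instructions travel in the same channel.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Students (name TEXT)")
    return conn

def insert_unsafe(conn, name):
    # Vulnerable: the input is spliced into the SQL text, so a quote in the
    # name can end the string and smuggle in extra statements.
    conn.executescript(f"INSERT INTO Students (name) VALUES ('{name}');")

def insert_safe(conn, name):
    # Parameterized: the name is passed as data and never parsed as SQL.
    conn.execute("INSERT INTO Students (name) VALUES (?)", (name,))

def table_exists(conn):
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='Students'"
    ).fetchone()
    return row is not None

bobby = "Robert'); DROP TABLE Students;--"

safe_db = make_db()
insert_safe(safe_db, bobby)
print("safe insert, table still there:", table_exists(safe_db))

unsafe_db = make_db()
insert_unsafe(unsafe_db, bobby)
print("unsafe insert, table still there:", table_exists(unsafe_db))
```

SQL has a fix (parameterized queries) because the engine can keep code and data apart. A prompt has no equivalent separation, which is why the comparison keeps coming up.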

> unless the operator holds unambiguous rights to all the data used for training...

So on the opposite end of the spectrum is this: https://www.techdirt.com/2007/10/16/once-again-with-feeling-...

Turning a lot of works into a vector space might transform them from "copyrightable work" to "facts about the connectivity of words". Does extracting the statistical value of a copyrighted work transform it? Is the statistical value intrinsic to the work or to language in general (the function of LLMs implies the latter)?

monitron · on Dec 16, 2023

> This doesn't seem that human: https://www.theregister.com/2023/12/01/chatgpt_poetry_ai/

Agreed; that’s why I was very careful to say “one approach.” I suspect that technique exploits a feature of the LLM’s sampler that penalizes repetition. This simple rule is effective at stopping the model from going into linguistic loops, but appears to go wrong in the edge case where the only “correct” output is a loop.
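
A toy sketch of the kind of rule I mean, assuming a count-based penalty subtracted from each token's score (similar in spirit to a frequency penalty). This is a guess at the mechanism, not a description of what any deployed model actually runs; `apply_frequency_penalty`, `greedy_decode`, and the scores are all invented for illustration.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_ids, alpha):
    """Lower each token's score in proportion to how often it was already emitted."""
    counts = Counter(generated_ids)
    return [score - alpha * counts[i] for i, score in enumerate(logits)]

def greedy_decode(base_logits, steps, alpha):
    """Repeatedly pick the top-scoring token, re-applying the penalty each step.

    The base scores are held fixed to mimic a prompt where the 'correct'
    answer is to repeat token 0 forever.
    """
    output = []
    for _ in range(steps):
        logits = apply_frequency_penalty(base_logits, output, alpha)
        output.append(max(range(len(logits)), key=lambda i: logits[i]))
    return output

base = [5.0, 2.0, 1.0]  # token 0 is the word the prompt asks to repeat
print(greedy_decode(base, 10, alpha=0.0))  # no penalty
print(greedy_decode(base, 10, alpha=0.5))  # penalty accumulates with each repeat
```

Without the penalty the loop continues indefinitely; with it, the repeated token's score erodes until something else wins and the output wanders off.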

There are certainly other approaches that work on an LLM that wouldn’t work on a human. Similar to how you might be able to get an autonomous car’s vision network to detect “stop sign” by showing it a field of what looks to us like random noise. This can be exploited for productive reasons too; I’ve seen LLM prompts that look like densely packed nonsense to me but have very helpful results.

simonw · on Dec 15, 2023

What's not clear at all is what kind of "human hand" counts.

What if I prompt it dozens of times, iteratively, to refine its output?

What if I use Photoshop generative AI as part of my workflow?

What about my sketch-influenced drawing of a Pelican in a fancy hat here? https://fedi.simonwillison.net/@simon/111489351875265358

zer00eyz · on Dec 15, 2023

>> What's not clear at all is what kind of "human hand" counts.

A literal monkey, who paints, has no copyright. The use of "human hand" is quite literal in the court's eyes, it seems. The language of the law is its own thing.

>> What if I prompt it dozens of times, iteratively, to refine its output?

The portion of the work that would be yours would be the input. The product, unless you transform it with your own hand, is not copyrightable.

>> What if I use Photoshop generative AI as part of my workflow?

You get into the fun of "transformative" ... along the same lines as "fair use".

ben_w · on Dec 15, 2023

That looks like the wrong rabbit hole for this thread?

LLMs modelling humans well enough to be fooled like humans doesn't require them to be people in law etc.

(Also, appealing to what courts say is terrible: courts were equally clear in a similar way about Bertha Benz. She was legally her husband's property, and couldn't own any property of her own.)

callalex · on Dec 15, 2023

English is NOT a Domain-Specific Language.

capableweb · on Dec 15, 2023

In the context we're discussing it right now, it basically is.

cwillu · on Dec 15, 2023

A domain specific language that a few billion people happen to be familiar with, instead of the usual DSLs that nobody except the developer is familiar with. Totally the same thing.

callalex · on Dec 15, 2023

Which domain is it specific to?

saghm · on Dec 15, 2023

Communication between humans, I guess?

lucubratory · on Dec 15, 2023

Not anymore.

chefandy · on Dec 15, 2023

Not saying this necessarily applies to you, but I reckon anyone that thinks midjourney is capable of creating art by generating custom stylized imagery should take pause before saying chat bots are incapable of being social.

Zetobal · on Dec 16, 2023

so wtf is "custom stylized imagery" exactly?

chefandy · on Dec 17, 2023

wtf is any other algorithmic output? Data. It's not automatically equivalent to some human behavior because it mimics it.

robertlagrant · on Dec 15, 2023

> Just because you use a DSL (English)

English is not a DSL.