1. Humans are not neural networks. 2. Humans are not allowed to directly copy even rather short snippets of licensed code. 3. Humans do not have the capacity to memorize the entirety of GitHub.



I can't shake the feeling that a lot of the logic around ML models having more or less the same "rights" as humans comes from misleading marketing suggesting that they, in any shape or form, resemble human intelligence. "AI" is a buzzword applied to any kind of algorithm for an activity that people previously thought couldn't be automated.

Back when I was young, graph pathfinding algorithms were called AI. A few decades later they are a well-understood commodity, and I haven't seen anyone call them AI for a while. Maybe that'll happen to LLMs too, given a few years?


An argument in favour of the legality of web scraping is: if a human can look at websites and collect data, why shouldn't they be allowed to do the same programmatically?

This is the same argument applied to open source code: if humans are allowed to use one specific (organic) neural network to read, process, and use open source code, why shouldn't they be allowed to use some other neural network, artificial or otherwise?


This is the slippery slope argument. It's not inconsistent to allow human "web scraping" while disallowing massive machine web scraping. Most importantly, it's about what the owner of the website considers appropriate.

A neural network is closer to a database than to a human brain. So this is akin to saying: I can store your personal data in my human brain (without your consent), so why am I not allowed to do it in PostgreSQL?


No, the case for scraping stems from the principle that a service which places no limits on its access cannot complain that it was accessed.

With code, those limits are denoted via the license, which, when supplied with the code and especially as metadata before downloading (as is the case with GitHub), is the common means by which those limits are placed.

Humans and neural networks process information very differently and it's disingenuous to imply otherwise.


But the analogue for code is not machine learning; it would be automatically downloading code.


The specifics don't matter so much as the general idea: if a human can do something (anything), why can't the human make a tool that does it for them, thus saving them the work?



