About 5 years ago, StackOverflow messed up and declared that they were making all content submitted by users available under CC-BY-SA 4.0 [1]. The error here is that the agreement with users was that all contributions are made available under CC-BY-SA 3.0, with nothing said about any later version. In the middle there were also some confusing licensing problems concerning code vs. non-code content.
I remember thinking that if any of the super answerers really wanted, they could have tried to sue for illegally making their answers available under a different license. But I thought that, without any damages, this was unlikely to succeed.
But now I wonder whether making all content available to AI scrapers, and OpenAI in particular, might be enough to actually base a case on. As far as I can tell, StackOverflow continued being duplicitous about which license applies to which content for half of 2018 and the first few months of 2019. Their current licensing suggests CC-BY-SA 3.0 for things before May 5, 2018, and CC-BY-SA 4.0 for things after. Sometime in early 2019 (if memory serves, it was after the meta post I link to), they made users log in again and accept a new license agreement for relicensing content. But those middle months are murky.
My understanding of licensing law is that something like 3.0 -> 4.0 is very unlikely to be a winnable case in the US.
Programmers think like machines. Lawyers don't. A lot of confusion comes from this. To be clear, there are places where law is machine-like, but I believe licensing is not one of them.
If two licenses are substantively equivalent, a court is likely to rule that it's a-okay. One would most likely need to show a substantive difference to have a case.
IANAL, but this is based on one conversation with a law professor specializing in this area, so it's also not completely uninformed. And it matches up with what you wrote. If your history is right, the 2019 change is where there would be a case.
The joyful part here is that there are about 200 countries in the world, and in many of them, the 3.0 -> 4.0 switch would be a valid complaint. I suspect this would not fly in most common law jurisdictions (the former British Empire), but it would be fine in many statutory law ones (e.g. France). In the internet age, you can be sued anywhere!
> If two licenses are substantively equivalent, a court is likely to rule that it's a-okay. One would most likely need to show a substantive difference to have a case.
Which does exist and can affect the ruling. CC notably didn't grant sui generis database rights until 4.0, and I'm aware of at least one case in South Korea where this could have mattered, because the plaintiff argued that these rights were never granted to the defendant and were therefore violated. Ultimately it was found that the plaintiff didn't have database rights anyway, but it could have gone otherwise.
A super literal reading of some bad wording in 3.0 created an effect the authors say they did not intend and fixed in 4.0. Given that the authors did not intend this interpretation, a judge is likely to assume people using the licence before it came to light also did not, hence switching to 4.0 is fine. Conversely, now that this is widely known, continuing to use 3.0 could be seen as explicitly choosing the novel interpretation (arguably, that would be a substantive change).
> a judge is likely to assume people using the licence before it came to light also did not
Why would the judge have to assume anything? The person suing could simply tell the judge they did mean to use the older interpretation, and that they disagree with the "fix". They're the ones that get to decide, since they agreed to post content using that specific license, not the "fixed" one.
But the people suing aren't trying to choose how the license is interpreted; they're trying to prevent the other party from changing the text. If the change is meant to "fix" how the text should be interpreted (which is what you said), then they're the ones trying to choose the exact interpretation.
I personally write "IANAL", not to reduce my personal legal liability, but rather to give a heads up to those reading that I am not an expert, that I am likely wrong, and that you likely shouldn't listen to me.
I feel there's a common thread that maybe should be some kind of internet law: people who make a point of noting they are not experts are more often correct than people who confidently write as though they are.
You see this particularly with crypto, where "I am not a crypto expert" is usually accompanied by a more factual statement than the one from the self-proclaimed expert elsewhere in the thread.
One cannot legally practice law without a license. The definition of that varies by jurisdiction. Fortunately, in my jurisdiction, "practicing law" generally implies taking money, and it's very hard to get in trouble for practicing law without a license. However, my jurisdiction is a bit of an outlier here. Yours might differ.
In general, the line is drawn at the difference between providing legal information and legal advice.
Generic legal discussions, like this one, are generally not considered practicing law. Legal information is also okay. If I say "the definition of manslaughter is ...," or "USC ___ says ___," I'm generally in the clear.
Where the line is crossed is in interpreting law for a specific context. If I say "You committed manslaughter and not murder because of ____, which implies ____," or "You'd be breaking contract ____ because clause 5 says ____, and what you're doing is ____," that's legal advice.
The reasons cited for this are multifold, but include non-obvious ones, such as that clients will generally present their case from their perspective. A non-lawyer will be unlikely to have experience with what questions to ask to get a more objective view (or even if the client is objective, what information they might need to make a determination). Even if you are an expert in the law, it's very easy to accidentally give incorrect advice, which can have severe consequences.
In practice, most of this is protectionism. Bar associations act like a guild. Lawyers are mostly incompetent crooks, and most are not very qualified to provide legal advice either, but c'est la vie. If you've worked with corporate lawyers, this statement might come off as misguided, but the vast majority of lawyers are two-bit operations handling hit-and-runs, divorces, and similar.
In either case, it's helpful to give the disclaimer so you know I'm not a lawyer, and don't rely on anything I say. It's fine for casual conversation, but if tomorrow you want to start a startup which helps people with legal problems, talk to a qualified lawyer, and don't rely on a random internet post like this one.
I always assumed it was the same type of courtesy as IMHO, and someone taking legal advice from random strangers on the internet wouldn't result in any legal liability on the side of the commenters.
Yes, people have been sued before for giving advice that was acted upon.
I remember hearing about a construction engineer who was sued for giving bad advice, while drunk, to a farmer about fixing a dam. The dam failed and the engineer was found to be liable.
I can see the reasoning behind the case, as the engineer has plausible expertise in the domain and could credibly give actionable advice.
When it comes to lawyers, there is already a legal framework under which lawyers are responsible for the legal advice they give, even when it's not directed at their clients, the same way medical professionals have specific liabilities regarding the medical acts they can perform.
A non-lawyer giving legal advice doesn't fit that framing, unless they explicitly pose as a lawyer. I'd also exclude malicious intent, because whatever the circumstances, if it can be proven and results in actual harm, there's probably no escape for the perpetrator.
That's possible because the engineer is licensed. A random guy giving bad advice, even without disclosing that he's not an engineer, would face no such liability (so long as he didn't suggest he was one).
It is worth remembering that law professors have a vested interest in making sure the system works as you described. If contract law were straightforward, they'd be out of a job.
That's an admirable goal, but if there are any "bugs" in the contract, you probably don't want it executed mindlessly. Human language isn't code, and even code isn't always perfect, so I'd rather not be legally required to throw someone out a window because someone couldn't spell "defederate".
I agree in the abstract, but not in this specific case (the professor in question was one of integrity, and sufficiently famous that this was not an issue).
However, it's worth noting the universe is a cesspool of corruption. If you pretend it works the way it ought to and not the way it does, you won't have a very good time or be very successful. The entire legal system is f-ed, and if you pretend it's anything else, you'll end up in prison or worse.
> if any of the super answerers really wanted, they could have tried to sue for illegally making their answers available under a different license.
They can plausibly sue people other than StackOverflow if those people attempt to reuse the answers under a different license. But I think it's very difficult to find a use that 4.0 permits and 3.0 doesn't.
The blog illustrates that such assumptions about what counts as sufficient attribution are fraught with danger, so "the smallest professional courtesy" can expose you to a $150k risk.
People put their content on the site for the public to use, and now the public is using it; it's just that "the public" includes AIs. Admittedly a non-human public, but a public nonetheless ...
The problem is that LLMs don't provide attribution/credit, which directly violates the license [0].
Otherwise, search engines were already a "non-human public" that scraped the site but linked directly to the answers, which was great. They didn't claim it's their work, like these models do. The problem isn't human vs. non-human. LLMs aren't magic; they don't create stuff out of thin air. What they're doing is simply content laundering.
I should emphasize that I know nothing.
[1]: https://meta.stackexchange.com/q/333089/205676