Hi, developer of Lasso from Entr'ouvert here (sorry for the English, I'm more used to talking about programming than French law):
It took so much time for many reasons :
* first the judge advised us to try mediation; mediation failed because Orange/FranceTelecom did not want to concede anything (we asked in mediation for nearly what we won in the end),
* we returned to the tribunal, and the judge ordered an expert assessment to determine whether Orange linking their program with Lasso inside an Apache module really violated the GPLv2,
* we lost on first instance,
* we appealed and won, but not on the counterfeiting accusation, only on another kind of violation called "parasitism" in French law. The fact is that French legal doctrine on "licences" was that a licence was a contract, and only a contract, so one could only invoke civil contract law and not counterfeiting law,
* we did not agree with that, and neither did the European Court of Justice, so we asked the french equivalent of the 2nd circuit (Cour de cassation) to overrule the court of appeal's doctrine,
* we won in the Cour de cassation,
* so we returned to the appeal court and we won.
And here we are.
The inefficiency comes from many aspects :
* the French judiciary lacks money and resources; it's not new, everything is slow. The ministry of justice's budget per citizen is half or one third of the same budget in Germany. But specialized matters like software counterfeiting go to a special court, which should be faster than ordinary civil justice,
* BUT France is not California: litigation about software licences, and especially free software licences, is extremely rare, and usually between commercial entities which have real contractual obligations between them, so that the court can concentrate on the commercial aspect (you owed so much, you had to deliver this and did not, etc.). Here we did not have any commercial relation with Orange on this project (we had one a long time ago, on another project unrelated to the one for which they used Lasso),
* ALSO France has a very special doctrine called (in French) "non cumul de la responsabilité contractuelle et délictuelle". It means that if there is some kind of contract between you and a third party that obliges them not to violate some law, and violating that law is also an offense/crime, the liability you invoke can only come from the contract and not from the offense/crime law (counterfeiting is an offense in France). So we had to break this doctrine in the "Cour de cassation" before being authorized to litigate on counterfeiting.
> we asked the french equivalent of the 2nd circuit
Minor note: not really the French equivalent of the 2nd circuit
US (federal) courts are like: district, appeals (e.g. 2nd circuit, 9th circuit, cafc, etc), supreme. The circuit courts are all on the same level, they don't hand cases between each other.
French court has another layer in there? And moreover, it's not geographically separated, but rather by subject matter? (US has only one such court, the cafc).
E.g. you can't remand from 2nd circuit down to an appeals court, because there is no appeals court underneath the 2nd circuit.
But you also can't say that court is like the supreme court, because there is a constitutional court above it, which is the analogy to the US supreme court
Yeah, Cassation Court is a sort of sub-Supreme Court.
It has the ability to break rulings (French casser, hence the name) if the law was not applied properly, or if the wrong law was applied to the case, or if some judicial norms (laws, basic rights, jurisprudence) were not balanced correctly in a ruling; it does not rule on the facts of the case, only on the way it was judged.
Rulings of the Cassation court are usually applicable as case law, which is much less common in continental legal systems than it is in Common Law legal systems.
By the way, the French legal system has two main circuits ("orders") : the judiciary order (penal and civil) and the administrative law order, which handles disputes and trials between government entities (local, regional, or national), and between government entities and private entities (in most cases). The equivalent of the Cassation court in the administrative order is the Conseil d'Etat.
I'm just psyched to learn it has multiple separate subject-matter segregated appeals courts. Always looked like a saner approach than geographically separating circuit courts. Get judges with more specialized expertise in various areas.
There are benefits but also drawbacks to such a system. The benefit is, as you say, that judges have more specialized expertise, which also means that other judges don't need to dive deep into expert areas outside of their own. The drawback, however, tends to be a much higher risk of regulatory capture, especially in the areas of patent and copyright. A general judge might have an easier time reaching a balanced decision than a judge who has previously worked as an expert lawyer for large companies for several decades.
The above events illustrate this a bit: they lost at first instance because the system assumed that such cases are about commercial entities which have contractual obligations between them. It took the supreme-like court that is specialized in legal procedure, rather than a court specialized in contracts, to see the issue in the correct context.
And "circuit" in the US is about region, not hierarchy. California has a hierarchy of federal courts and as 9th circuit, they are allowed to rule differently from the hierarchies of federal courts in Texas (in the 5th circuit) or Colorado (in the 10th). They don't have to but they can.
At some point, if the US Supreme Court steps in (which they don't have to), one reasoning gets picked (or yet another one cooked up) which then stands for all regions and the courts are then strongly dissuaded from screwing with it any further.
> That's a factor of about 2.2x, assuming 7% capital gains year-on-year.
Why would you assume 7% year on year? Did you look into Orange's stock price or quarterly/yearly reports, or did you think all companies have the same return as investing in the S&P500 would have?
7% is (roughly) what you can make by investing in ETFs tracking some "world market" index like MSCI World, FTSE All World, et cetera, averaged over long periods.
And most companies' business model isn't investing in ETFs. They usually try to do stuff that provides some value to someone, and some perform better, some perform worse than "world market" ETFs.
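For reference, the "2.2x at 7%" figure quoted above is plain compound growth. A minimal sketch, assuming annual compounding at a constant rate (the year counts in the loop are illustrative, not taken from the case):

```python
import math

def growth_factor(rate: float, years: float) -> float:
    """Multiplier after `years` of compounding at `rate` per year."""
    return (1.0 + rate) ** years

def years_to_reach(factor: float, rate: float) -> float:
    """Years of compounding at `rate` needed to reach `factor`."""
    return math.log(factor) / math.log(1.0 + rate)

rate = 0.07  # the 7% assumption from the quoted comment

# How long does 7% per year take to give a factor of 2.2?
print(round(years_to_reach(2.2, rate), 1))   # 11.7

# Illustrative horizons only
for years in (10, 12, 14):
    print(years, round(growth_factor(rate, years), 2))
# 10 1.97
# 12 2.25
# 14 2.58
```

So 2.2x corresponds to a bit under 12 years at 7%; a different rate assumption changes the factor quickly, which is the point of the objection above.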
It doesn’t end here. I won money this way and getting them to actually pay is near impossible unless you want to burn like $50k on lawyers putting caveats on all their stuff.
Like, the law is so broken, nothing happens if they just don’t pay. Eventually your “win” can expire after 13 years or something and if you didn’t collect the money then it’s no longer collectible.
Not sure what country you are in, but in France, if the other party doesn't comply, you use a "commissaire de justice", a regulated profession with special powers, who can seize money straight from the other party's bank account (while taking a fee and a % for their services). It works whether the other party is a natural or a legal person; they're also allowed to enter a home or office and seize physical things, or even put a property up for sale. See [1].
You need a "titre executoire", which a judgment in court is.
If on the other hand you have a simple, normal, everyday contract and the customer isn't paying, you can ask the court for a titre executoire. The other party can argue, and the judge can check whether it was a real contract / you delivered / ... and if so will give an order to pay. It takes between 2 weeks and 3 months in my experience. If you get the titre and the other party still doesn't pay, then you can use the commissaire de justice to seize from their account.
If your contract is for a sum below 5000 you don't need to involve the court; you only need to have tried an amicable resolution first, and then the unpaid contractual debt acts as a titre executoire by itself, which the commissaire can act on (though at that point the other party is allowed to involve a judge to protect themselves if needed or wanted).
They usually send a notice to the banks, and the banks have to report which accounts they hold for said person, to allow a proportional seizure.
Now this has limits of course, most importantly if the other party is insolvent, or if the commissaire cannot find their assets or they don't have any, etc ... But for a company like Orange, it's a non issue.
No, you don't have to? The judgment itself acts as a "titre executoire": it's an order to pay that you can give to the commissaire de justice, without involving the courts anymore.
If I was a lawyer for this company, I'd be telling my boss we won. Decades of reselling something we didn't own for a pittance of a fine is a win in my book.
They never resold it. I have no proof of that but I'm pretty sure of it; in fact the use of this software stopped in 2016 when the French state site mon.service-public.fr stopped (see https://fr.wikipedia.org/wiki/Mon.service-public.fr). The software was used from 2009 to 2016, by a French administration named DGME, so 7 years. Orange took its own identity provider (used under their old ISP trademark "Wanadoo"), which was some kind of proprietary SSO protocol a bit like CAS, and extended it with Lasso to deliver a real SAML identity provider for this project to the French state; they never had any incentive to use it for themselves and they do not resell software anyway.
The background of all that is only supposition on my part (we participated in the same project as the contractor of another administration, la Caisse des dépôts et Consignation, but we were never inside the project of delivering Mon.Service-public.fr itself, which was the project won by Orange Business Services and using Lasso), but I think it happened because an engineer was expected to deliver the software but his managers never asked him how he would do it, and at some point someone waved the problem of the license away because everything was too late anyway. It's just my own big corporation fiction, your mileage may vary.
What you say is true, and relevant to the discussion, but... it doesn't answer anything. A very long and dragged out process takes away the justness of it all, and the fact that the supreme court was involved does not change that in any way. If you're a victim, the courts are supposed to provide a remedy. Even the supreme court. That's at least part of their reason for existing. If you have to wait 14 years for a remedy, they're not doing their job right, no matter how supreme they are.
The fact that it goes the same in other countries also doesn't help. The system is broken everywhere, that's all.
Sadly, "justice" isn't the job of the supreme court; its job is to interpret the constitution. If it takes 10 months or 10 years, that's not an issue in the grand scheme of things.
For the US, the right to a speedy trial is for criminal cases, not civil. And I don't think breaking software licenses is a criminal act unless a mass of damages were done. That's basically the logic that LLM's are working off of right now; get into dubious lawsuits and become top of the food chain by the time the litigation settles.
Heck, quite a few pieces of tech work that way. Lootboxes came and went in western video games before it even had time to be challenged.
Just so you know, the role of the supreme court may not be 1:1 between what you're used to in the US and the rest of the world
In France the supreme court's role isn't tied to the constitution, just the law. They're to dictate which ways the law is to be interpreted and homogenize the decisions by the different lower courts.
We have different (actually, two...) administrative bodies that ensure that the law follows the constitution
> They're to dictate which ways the law is to be interpreted and homogenize the decisions by the different lower courts.
FWIW, the U.S. Supreme Court also has this role. People focus more on their constitutional cases because those are the most impactful/famous, but a lot of their workload is just interpreting normal laws, especially ones that have diverged among the various lower courts.
>People focus more on their constitutional cases because those are the most impactful/famous, but a lot of their workload is just interpreting normal laws
They have proportionately more constitutional cases because a case hitting the SCOTUS needs to be re-appealed at least twice, and the SCOTUS needs to choose to accept the case (4 out of 9 of the judges to be precise). The SCOTUS only takes a few dozen cases a year, so whatever remains tends to be larger issues.
But yes, very few cases are going to be Brown vs. the Board of Education levels of impactful. Perusing some cases from 2023 reveals one such case "simply" being:
>A case in which the Court will decide whether the "serious drug offense" definition in the Armed Career Criminal Act incorporates the federal drug schedules that were in effect at the time of the federal firearm offense or the federal drug schedules that were in effect at the time of the prior state drug offense.
Basically a bookkeeping case due to the different states as of late re-defining "serious drugs"
There is no explicit limit on the role of the supreme court. Nor is there any actual explicit role in the constitution for it. It's just one sentence. I think they even once tried a criminal case.
It deals with lots of constitutional issues, but Congress could basically just defund it and remit constitutional complaints to other venues if they wanted to. This is done in part with FISA and other national security measures. Congress deemed there to be no venue in the courts for a complaint.
I thought "conseil constitutionnel" on the judicial branch + "conseil d'état" on the administrative branch (french law is split in two branches), but apparently the "conseil constitutionnel" is doing both, the split is "cour de cassation" (supreme court) + "conseil d'état" (administrative supreme court), with the conseil constitutionnel doing its thing upstream unrelated to any of them.
I don't think so, are you sure? The Conseil d'Etat has two distinct roles (and thus two distinct branches):
- serving as the supreme court for the administrative "order" - as a judge, the CE does not check if laws are constitutional
- serving as legal counsel to the government - in which case it does check if bills put forward by the government (i.e. not laws already adopted) are constitutional, but this is merely advisory.
I was involved in a high profile business case that took over 6 years and went up to the Supreme Court. Another case I was involved in years ago took 7 years to resolve.
I've heard stories that India is the worst, with some cases passing down through generations over 50, 60, 70 years.
Are there any (democratic) countries where the courts run efficiently?
I just waited in jail for 10 years in the USA for trial while the lawyers just got a series of continuances and then the prosecution dropped the case. So, the USA is definitely not exempt. I've had civil litigation go on for a decade in the USA too, which I don't think is uncommon. Sometimes you'll go around to an appellate court multiple times which adds, say, two years each time.
If you look at death penalty cases in the USA it can take more than 20 years to exhaust all the appellate courts.
No, sadly, where I live you only get compensated for being in prison after conviction. If you spend 10 years in pre-trial detention and are acquitted you are not eligible for compensation.
It's slow for resolution of an individual instance, but if it's establishing stronger precedent for all future situations then perhaps it's good that the details were worked through.
One thing to note is that the library was also available for commercial licensing from the start. Makes it easy to demonstrate harm and establish damages.
Now we need two more landmark cases:
- one that awards damages for a pure GPL library that doesn't have commercial licensing available
- one that exonerates some company that is LGPL compliant but is sued by the likes of Digia in hope of squeezing some money out of them
Not a lawyer, but IIUC it's never going to be the case that a GPL violation legally compels someone to release the resulting source code. There is no rule in copyright law that says you have to do that, and the GPL is just a license agreement at the end of the day, not a law. The best you can do is say you'll collect some fine for copyright violation or breach of contract if they don't release the source code, which was the issue of substance here too.
As an analogy, if I have a plot of land with a sign that says "$1 million to enter", and someone enters without paying, the worst I can do is something like a trespassing charge, not collect $1 million because the sign said so.
> There is no rule in copyright law that says you have to do that, and the GPL is just a license agreement at the end of the day, not a law. The best you can do is say you'll collect some fine for copyright violation or breach of contract if they don't release the source code, which was the issue of substance here too.
In the US, a GPL can be both a license and a contract at the same time [1]. The distribution rights granted by a GPL to the user are conditional on the user's compliance with the GPL.
Contract law allows for punitive damages. Copyright law allows for 1. permanent injunctions against the defendant's continued use and/or distribution of the copyrighted work and 2. statutory damages xor actual damages. Every GPL requires distribution of source code under the same licensing terms to users already given the object code, so the party suing to enforce the GPL can request that the court compel distribution of the source code; the court can agree or disagree. The court can also decide to give the GPL violator the choice of 1. releasing the source code or 2. no longer distributing any copies or derivatives of the GPL'd software until the violator releases the source code or 3. whatever else. The court may impose a punitive fine in any case.
Tangential question (not a lawyer, either): The damages to the copyright holder are one thing. What about the damages to the users of the software, though, due to the fact that they couldn't access the source code and modify it to their liking, which is what the GPL promises them? And what about damages incurred by competitors who, in their software, complied with the terms of the GPL and made their source code available?
I know, those are all very hard to quantify but still.
This is true until a Judge decides otherwise under penalty of permanent imprisonment.
As for the AGPL, the AGPL is regarded as the most deadly toxic irradiated poison ever concocted to big tech. You'll be in a very serious meeting with your supervisor, HR, and lawyers if you use AGPL code in big tech. But they'll more likely just outright terminate you. If you maintain a prominent AGPL project, they will not hire you and put you on a blacklist when they find out.
The virality of the AGPL is something no lawyer wants to test in court. They're scared shitless of it.
Off-topic, but does anyone know any good lawyers for enforcing the terms of open source licenses based in the EU?
Everyone we’ve used in the US doesn’t want to tackle something from Europe. They’ve recommended a firm but just figured I’d collect a range of opinions from here given the context.
The headline implies a win, but I'm not so sure... $820K USD isn't a deterrent and to the plaintiffs, it's minus 100s of hours, distraction and legal fees which will be high given the complexity.
NAL, but I think these laws typically include all reasonable expenses, but that you'd need to justify/receipt those expenses. So, travel to court each day, accommodation, anything related to running the case, sure. Stress, unlikely because you can't attribute a direct value to it. I think opportunity cost is the biggest for most people, you almost certainly wouldn't be able to claim lost revenue due to your time in court.
No, the French judiciary system clearly favors big actors. The article 700 award (the right to be reimbursed for expenses) is a flat amount estimated by the court, even if you provide every receipt. We got 60k euros, which is less than we spent, which I think is around 100k for 14 years of litigation (we never tried to add it all up, as we knew it would not be useful in the end), plus other costs like expert assessment fees, which were shared 50/50 between us and Orange/FranceTelecom.
Entr'Ouvert's website suggests they could've offered a commercial license. Possibly that would've cost Orange less than $820K/€500K. So maybe we should compare that, instead of the full replacement cost.
Entr'Ouvert's offer in 2004 was 500k€, or 0.4€ per user.
The Orange companies moreover submit to the proceedings an internal email from Entr'Ouvert dated 16 September 2004, evidently part of the commercial negotiations over the licence for the LASSO software, setting out two proposals: either "an unlimited Lasso licence at 500,000€ (open to discussion :)))" or "a per-user Lasso licence at 0.4€ (this is the solution I would gladly push (...)". They also submit another commercial proposal from Entr'Ouvert dated 23 June 2010, concerning in particular a commercial licence for the LASSO software in the context of ongoing France Telecom projects, for which a price of 250,000€ is proposed.
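To make the two 2004 offers comparable (a flat 500,000€ versus 0.4€ per user), here is a quick break-even sketch. The user counts in the loop are hypothetical and only there for illustration; the thread does not say how many users the service actually had.

```python
# Prices in euro cents to avoid floating-point rounding.
FLAT_LICENCE_CENTS = 500_000 * 100   # unlimited licence offer (2004)
PER_USER_CENTS = 40                  # per-user offer (2004): 0.40 euro

def per_user_cost_cents(users: int) -> int:
    """Total cost of the per-user licence for a given number of users."""
    return users * PER_USER_CENTS

def break_even_users() -> int:
    """Number of users at which both offers cost the same."""
    return FLAT_LICENCE_CENTS // PER_USER_CENTS

print(break_even_users())  # 1250000

# Hypothetical user counts, for illustration only
for users in (100_000, 1_000_000, 5_000_000):
    cheaper = "per-user" if per_user_cost_cents(users) < FLAT_LICENCE_CENTS else "flat"
    print(users, per_user_cost_cents(users) // 100, cheaper)
# 100000 40000 per-user
# 1000000 400000 per-user
# 5000000 2000000 flat
```

Below 1.25 million users the per-user offer would have been the cheaper one; above it, the flat licence wins.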
The legal precedent: « IF you’re sued and IF you’re found guilty, then you MAY have to pay LESS than the inflation adjusted cost of paying a commercial license, 20 years in the future »
Not really. This is just the beginning. Governments are leaning into open source and will likely see protections like this, and their enforcement as helping secure the open supply chain. Which is something they’re already trying to do.
I’m not sure about the rest of you out there with companies, but for me the trend seems clear and I’m gonna be much more cautious going forward with the kind of licenses I use both as a vendor and buyer.
Landmark decisions ([0], [1]) set precedent. The conseil constitutionnel has ruled that its precedent is binding [2]. In fact the majority of current constitutional law in France was willed into "constitutional" status by a landmark decision of the conseil [3]. I mean precedent can move _laws_ to the _constitution_ (see [4]). That seems pretty binding and persuasive.
[3] https://fr.wikipedia.org/wiki/D%C3%A9cision_Libert%C3%A9_d%2...: essentially, the council ruled that the preamble of the constitution also had constitutional value (which was not the case previously and still is not for laws). Since it also mentions other texts (the preamble of the 1946 Constitution, the 1789 Declaration of the Rights of Man and of the Citizen, etc.), those texts also have the same legal power as the constitution.
It doesn't even matter if it will be considered for other cases - simply having an example showing that yes the GPL can be enforced will work as a deterrent before things even go to court.
True, in German it is called "Rechtssicherheit", the presence of a clear, agreed understanding of a law and its application. The idea being, some not 110% defined legal questions get litigated once, and then most follow the outcome.
> The compensatory damages were based on both lost profits of the plaintiff and disgorgement of profits of Orange. Moral damages compensate the plaintiff for harm to reputation or other non-monetary injury.
Sounds like they had their profits from the breach taken away, and they also had to pay extra in "moral" damages, and also likely would have had to pay both sides' legal fees, as well as having to sink hundreds or thousands of hours into the case.
Damages for breach of IP rights (or contract) are generally calculated on the basis of well defined principles, it's not like where a regulator is handing down a fine and has a broad level of discretion to come up with whatever number it deems fair in the circumstances.
Not to mention it took 14 years for this case to be resolved. I don't know if the defendant continued infringing over all that time, but $820k for a "license" to do so for 14 years seems like not a big deal.
Sounds like one of those fines that most companies would just shrug off as a cost of doing business. If I were the defendant's lawyer, I think I'd be pretty successful at framing this to my client as a win.
> Orange used the Lasso software in the solution, but did not pass on the rights to its modifications free of charge under GPL, or make the source code to its modifications available.
Has Orange now made the modified code available? The article doesn't mention that.
3 million euros for copyright, 500 000 € for moral damage, 500 000 € for the money Orange made, and at least 100 000 € for the lawyer fees plus 1/3 of the remaining fees.
And 3 ads in IT magazines (to show what Orange did wrong).
In the document :
- Order the companies Orange and Orange Business Services, jointly and severally (in solidum), to pay Entr'Ouvert the following sums:
3 000 000 euros (three million euros) for the negative economic consequences of the infringement of Entr'Ouvert's copyright, including the lost profits and the loss suffered,
500 000 euros (five hundred thousand euros) for the moral damage suffered by Entr'Ouvert,
500 000 euros (five hundred thousand euros) for the profits made by Orange and Orange Business Services.
- Authorize Entr'Ouvert to publish all or part of the forthcoming decision, after service of the judgment, in three specialized trade magazines chosen by Entr'Ouvert, as well as on the websites of Entr'Ouvert and Orange, at Orange's expense, up to a limit of 5 000 euros excluding tax per publication;
- Order the companies Orange and Orange Business Services, jointly and severally (in solidum), to pay Entr'Ouvert the sum of 100 000 euros (one hundred thousand euros) under article 700 of the code of civil procedure, as well as all costs, including bailiffs' fees, among them those relating to the infringement seizure (saisie-contrefaçon).
No, 4.1 million was the request (“Entr'Ouvert, appelante, demande à la cour de...”, i.e. "Entr'Ouvert, the appellant, asks the court to..."; later, “La société Entr'Ouvert soutient que...”, i.e. "Entr'Ouvert argues that..."). What was awarded is in the section “Dispositif” (the operative part); the rest of the document is just a recapitulation of the case, including what was requested by both parties.
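A quick check that the 4.1 million figure matches the sums quoted from the document above. This only adds up the requested amounts as quoted in this thread; it says nothing about what the "Dispositif" actually awarded.

```python
# Requested amounts in euros, as quoted from the document in this thread.
requested_eur = {
    "economic consequences of the copyright infringement": 3_000_000,
    "moral damage": 500_000,
    "profits made by Orange": 500_000,
    "article 700 (legal costs)": 100_000,
}

total = sum(requested_eur.values())
print(total)  # 4100000

# Share of each item in the total request
for item, amount in requested_eur.items():
    print(f"{item}: {amount / total:.1%}")
```

The publication costs (capped at 5 000 euros excluding tax per publication) and the "dépens" are not included, since the quoted text gives no total for them.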
Orange's lawyers had argued that the GPL was in English and that the translation offered was invalid, and that its provisions were contrary to French law. The judges seemed particularly ticked off by this argument, noting that the lawyers did not bother to point out any examples of either point.
That’s really absurd, if two companies from non-English-speaking countries couldn’t sign contracts in English basically all of international business would collapse.
Being sarcastic, I bet the commercial license from Entr'Ouvert was way cheaper than 860k. This horrendous decision should mean firing the managers involved at Orange, but since it's a semi-public company the incentives are wrong and these managers have probably been promoted since then.
Well, turns out the CEO of Orange back then was Thierry Breton, and indeed nowadays Mr. Breton is a European Commissioner, so things turned out pretty well for him.
I wouldn’t say having a reputation as counterfeiter and ripper off of open source IP, as “working out pretty well” but YMMV, I guess it depends on your value system hahahahaha! :)
It was confusing from the beginning, and hasn't stopped being confusing.
Imagine RMS saying, "I'm glad you asked that. By 'free', we don't mean free as in beer (nor what everyone means when they search for some kind of 'free software'); we mean free as in freedom. Which I will proceed to speak about for half an hour, now that you're engaged by this small bit of wordplay that you had no reason to suspect was wordplay."
(Not that the ambiguity of "free software" was the cause of the Orange violation of Lasso. But, really, please stop calling it "free". Say "GPL", or some other license, or "libre", or anything other than "free".)
I wouldn't call it ironic, I'm pretty sure there's two separate words for free in the vast majority of languages. English is the outlier, not the other way around.
In French (which matters in this particular case since all actors are French) there is no confusion between "logiciel libre" and "logiciel gratuit" though. There isn't a word that can mean both.
I think these days in English the acronym FLOSS is the most common word for free as in freedom software.
I think it's fair to say -- sadly -- that most people don't even know of the FSF's goals, and think "free software" and "open source software" are the same thing. And I don't think we're going to fix that problem by insisting random people use the terms properly.
"Free" is a terrible word to use in English for this concept. It gives the absolute wrong impression, and no one is going to sit through RMS lecturing about "free as in speech, not as in beer". It's a pity English has such a common word with two meanings like that.
To use your phrasing, the choice of the word "free" cheapens (literally!) and confuses the goals and purpose of free software.
"Gratis" isn't new in English; unfortunately it's just fallen out of use.
The problem is that even if we were to re-adopt a word like "gratis", it would take decades, at best, before using "free" to mean the same thing would fall out of the collective consciousness. So even if people agreed to start using "gratis" to mean "without monetary cost", you'd still end up with plenty of people getting the wrong impression from a term like "free software".
The best way to fix it is to have a new word for free-as-in-speech. If we could adopt "libre", and start calling it "libre software", it would probably require less explanation. Or, rather, it might still sometimes require explanation (because some people just won't know what "libre" means at all), but it won't cause people to immediately assume the wrong thing, at least.
(I also enjoyed another poster's suggestion to call it "freedom software", and thus immediately gain the support of a certain half of the US population that tends to shut off their brains when they hear the word "freedom".)
Yes, but even reading your sentence, it's not clear whether you intend to mean that unix clone is without cost, or libre. I think most people (especially non-technical people) would assume the former.
Because "free speech" as a term of its own is already well-known and well-defined, fairly universally in US culture. "Free software" is only known in tech circles, and even then it isn't always particularly well-known or well-understood.
In a way it's an education problem, but it's also a "why should I care about this topic?" problem, for which the vast majority of people will -- quite reasonably -- answer "I shouldn't".
> In a way it's an education problem, but it's also a "why should I care about this topic?" problem, for which the vast majority of people will -- quite reasonably -- answer "I shouldn't".
This can also be applied to free speech, or just freedom in general. People are quite happy to trade convenience for liberty in society. Does that mean we should settle for "I shouldn't care" as an answer? I certainly am not going to.
Just set my code on one project from MIT to GPL, since it's a personal project without public funding. I was expecting this kind of consequences if someone breaches the license, I'm glad it's the case.
A more generous interpretation of their comment is that they are glad that some legal precedent has been set. The same case would not go through the courts for another decade.
On the other hand, many companies view GPL software as some sort of kryptonite these days. So a good number of them -- at least the ones that are knowledgeable about software licensing -- won't even go near your software. Personally I'd consider this a good thing; opinions vary, of course.
Certainly there will be some companies that will just see open source code out there and use it, without considering the license at all (not even sure this is most often malice; I expect usually it's just a random developer who doesn't know any better).
But ultimately it's up to the copyright holder if they want to bring some sort of legal action or not. With MIT, you can't stop anyone from doing much of anything, for the most part. With the GPL, you at least give yourself the option, should you decide it's worth pursuing.
> So a good number of them -- at least the ones that are knowledgeable about software licensing -- won't even go near your software
Yes, very good thing for me. I'm building a Web alternative to Google Maps. It's OK if no big company reuses my code. I just want to make it free to see, and to contribute, not to use without contribution.
If I was building a library, that sure would be a different matter.
> The Court of Cassation, which is the supreme court of France, reviewed the case and issued an order on October 5, 2022 overturning the decision of the Court of Appeal. The case was then remanded to the Court of Appeal, which issued its order this week.
I know nothing of France's legal procedures, but it seems strange that the lower court, after being ordered something, could drag its feet for so long.
I'm not familiar with the US system but I don't think the Cour de cassation is actually comparable to the US Supreme Court. The Cour de cassation only overturns judgements on the basis of technical details of law, whether every procedure was followed as it should have been, etc.
It doesn't provide any feedback on the actual judgement it overturns (or not). So the court that does the actual judging is free to get to the same judgement as before if it respects the process correctly this time. But this is essentially a new judgement that starts from zero and takes as much time as the first time.
And the “is this law constitutional?” part is done by the Constitutional council, which is not quite a court. And there’s a second Supreme Court (Conseil d’État) anyway. Quite different from the US.
> this is essentially a new judgement that starts from zero and takes as much time as the first time
Did the Cour de cassation overturn every single detail pertaining to the original judgement? Kinda feels like they should be able to speed up the new judgement, and only change whatever detail or details the superior court said were wrong.
Well, I don't know much about this subject, but these delays make justice less just, and I don't understand how this is not obvious to everyone but the guilty.
The case is almost always sent to a different court from the one that had its judgement overturned. The new court doesn't necessarily have to redo everything from scratch, only the precise points that were overturned, but obviously if it was something early in the procedure then many things might need to be reviewed.
I believe what contributes most to the delay though is just that the French legal system (like most others) is overloaded and it just takes time until the case gets to the front of the queue.
The upper court probably didn't direct the lower court to issue a specific judgment, but clarified a legal question, and then the trial had to be repeated in the lower court, with this interpretation of the legal question in mind.
> I know nothing of France's legal procedures, but it seems strange that the lower court, after being ordered something, could drag its feet for so long.
The Cour de cassation can't rule on the case. It rules on whether the previous ruling was made properly according to legal standards. It is possible that the court quashed the previous decision for a reason unrelated to the substance of the case, and so the lower court can reissue a decision respecting the spirit of its first decision, but without the legal error.
Court procedures take a huge amount of time in France and nothing prevents appeals, etc. So it's pretty much usual for the case to take years in the first court then for one party to appeal, which again takes years, and finally for one party to appeal the appeal, which sends the case to Cassation, which again takes years and possibly sends the case back to appeal (which is what has happened here).
It's not very unusual for criminal cases to drag on for a decade between arrest and end of all procedures, for instance.
It sounds like the reason it took so long was that they had to decide whether this was a copyright claim or a breach of contract claim.
Could someone more versed in license law clarify this? At first glance it seems like an obvious breach of contract if they've just used the entire Lasso library and not followed the license. Was Orange copying specific parts of the library and hoping nobody would notice?
Copyright in Europe fundamentally does not work like Copyright in US. More precisely, the law does not govern the monetary gains possible from copying but rather the author's artistic expression in the work. Which is kind of moot regarding source code, but, it makes for an important decision: author's rights are created automatically and you cannot transfer them under any circumstances whatsoever.
Per default, the law states that the author retains all rights. They can license it, e.g. to their employer, exclusively; the employer can then sublicense that. However, what licenses are possible is exhaustively defined in the law; on an abstract basis at least (e.g. using, creating derivative works, ...). It has not been exactly clear whether conditions imposed by a license still have their roots in these licensable acts, or if they are based on contract law - where literally anything goes as long as it's not against the law or immoral.
Both can be enforced. However, I'd argue that it is good that it was decided that this is a copyright law matter, because this gives authors _much_ more protection than contract law, where all circumstances need to be evaluated for each single case and rulings might as well contradict each other.
> However, I'd argue that it is good that it was decided that this is a copyright law matter, because this gives authors _much_ more protection than contract law
Copyright law may give the authors more power; but unfortunately, experience teaches us that there's usually not much incentive for the authors to enforce their rights. (The case in TFA is an exception, since the author profits directly from dual-licensing.)
For this reason, the SFC has been trying to get the courts to also see it as a contract; and specifically, one in which all other possible users of the software are beneficiaries. This gives random third-parties standing to sue for damages. If it works, it means that the SFC (for example) could go around suing companies which violate the GPL without needing involvement from the original copyright holders.
I'm not entirely sure if this is even possible the way copyright in the EU works today. Be that as it may, at least from my perspective what really prohibits me from actually enforcing GPL on my software is funding. I rarely go beyond sending a stern letter, because anything more would be financial suicide in most cases; I might win but it is incredibly hard to prove damages. Punitive damages don't really exist in the EU (or, to be more precise, in the Roman-Germanic law system), at least not to the extreme seen in common law systems.
At least in my case, I'd fight tooth and nail for my software, if I don't have to bear the financial risk.
I was simplifying. What I described is unified in the European Union; however, the outlined distinction is common across the Roman-Germanic law system whereas the US copyright interpretation is rooted in common law. In Europe, the only countries using common law are UK and Ireland. And while Ireland has mostly unified their copyright law with the rest of the European Union, you still see some roots in common law. For example, it is possible that an Irish company (a legal person) can become an author of a work, which is impossible under e.g. German law.
And yes: in a common law system, copyright governs literally "the right to copy", which is transferable. In other law systems (which is the distinction I made) the law governs the property rights of the author's expression, which is non-transferable; you can only license the rights you have.
I’m glad there are finally some cases setting precedent for the GPL.
I am personally not a fan of GPL family licenses BUT I am sick of fellow OSS developers who keep telling me not to worry about the minutiae of the license. (Usually some form of “Oh don’t worry because we aren’t litigious right now, as long as you stick with our incorrect interpretation of the license”)
My team is responsible for a lot of corporate contributions to open source (some GPL) and I have to pay extra attention to the license terms as a result.
There’s a lot of GPL software that operates in the grey (and many OSS devs who don’t understand their license choices), and I very much like my licenses to be as black and white as possible so I can avoid any risk.
Having legal precedent to point to will help make concerns concrete.
It's a good thing to point out that judges are notoriously indifferent about "he said that she said" type arguments. In the presence of anything that looks like a license text or contract, that's usually what they take as the starting point for a decision, insofar as it is enforceable, of course. Most longer licenses have some language stating what happens when certain things aren't enforceable.
With GPL, the version matters. I'm not aware of any GPL v1 licensed stuff, it probably was a bit short lived. But there's quite a few things licensed under GPLv2 and GPLv3. v3 tried to close a few loopholes v2 had that some companies working with e.g. Linux or Java see as a feature rather than a bug.
Many corporate lawyers don't like GPL style licenses and particularly AGPL because these licenses have a lot of things to say about things like intellectual property, patents, servers, and uses that are/aren't allowed that probably aren't OK with most corporate lawyers worth their money. With GPLv2, this stuff has been through the courts a few times so the industry seems comfortable with it at this point.
Whether those lawyers are right or not is not something engineers should be overruling based on vague notions of fairness, gut feelings, loose interpretations, etc. The whole "it's fine because we're all nice people" doesn't have much value in a court room. Licenses are for when people stop being nice to each other.
I prefer the MIT license for my own projects. It's simple and clear and completely uncontroversial with corporate lawyers. It has decades of use, is well understood, and has very little ambiguity. Lots of OSS is licensed with it. It's fair for developers and users. Users being able to do what they want with the software is fully intentional on my part. That's a freedom I give them with that license and not something I actively want to restrict. Fully understood and intentional.
Yes, but it's a different license! It's not really comparable with regards to preference because the goal of MIT or BSD is different to GPL.
If $PROJECT-1 can be modified and then redistributed as closed-source by $COMPANY, then MIT is a good fit.
If $PROJECT-2 must only ever be redistributed if $COMPANY redistributes their modifications and derivatives as open-source, then GPL is a good fit.
It depends on the specific project, which is why I cannot say "I prefer MIT" or "I prefer GPL", because for some projects I don't want it repackaged and released as proprietary software, while with others I do want others to repackage and redistribute as proprietary software.
Proclaiming a preference for one type of license over another is akin to saying "I prefer bicycles", because you aren't going to prefer bicycles when needing transport to move house.
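As a rough sketch of that goal-first choice (the function and parameter names are made up for illustration, and a real project would weigh more than a single yes/no question):

```python
def pick_license(allow_proprietary_redistribution: bool) -> str:
    """Hypothetical helper: map one project goal to a license family."""
    # $PROJECT-1 case: fine with $COMPANY shipping closed-source modifications.
    if allow_proprietary_redistribution:
        return "MIT"
    # $PROJECT-2 case: redistributed modifications must stay open source.
    return "GPL"


# The license follows from the project's goal, not from the author's taste.
for project, goal in [("$PROJECT-1", True), ("$PROJECT-2", False)]:
    print(project, "->", pick_license(goal))
```

The only input is what the project is for; there is no "author preference" parameter at all.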
> All software and copyright licenses are ideological documents. You don't see the ideology in BSD and MIT because it is your preferred ideology.
Please read what I wrote more carefully. The phrase "in a way that the BSD and MIT licenses are not" is doing important work, namely implying that there is a way in which MIT/BSD are ideological.
Perhaps ideological came off as an aspersion? I certainly didn't mean it as one. It's just an observation, and frankly a pretty obvious one. I'm quite sympathetic to the Free Software movement and, at least in the first order, would be happy to live in a world where all software is copyleft.
BSD/MIT is essentially reverting to the natural conditions that would obtain without statutory copyright. GPL on the other hand is actively attempting to motivate sharing and respect for user freedom in a way that is far beyond the natural state of affairs. Thus it's fairly described as ideological in a way that the former licenses are not.
I think GP's reading of what you wrote is an entirely reasonable reading. Their interpretation was the same as mine before reading your follow-up. It did seem like you were saying that the GPL has ideology but BSD and MIT do not.
It would have been much clearer if you had simply said something like, "The ideology behind the GPL is very different from that of the BSD or MIT licenses".
I read what you wrote and I came to the same conclusion as GP. What's wrong with simply saying "BSD/MIT has a different ideology to the GPL"?
Is that what you mean to say? If it is, why say this (which means something else):
> BSD/MIT is essentially reverting to the natural conditions that would obtain without statutory copyright. GPL on the other hand is actively attempting to motivate sharing and respect for user freedom in a way that is far beyond the natural state of affairs. Thus it's fairly described as ideological in a way that the former licenses are not.
I think you're still trying to portray one more positively than the other; you're writing a lot of words to claim "I'm not really saying that" while actually saying that.
I agree that the GPL's ideology is different, but I don't think you need to agree with the GPL's ideology to use it, just as you don't need to agree with the BSD or MIT licenses' ideologies to use them.
You can simply make a practical choice, based on what your goals are for a specific project. If you're looking to build a community around your project where both hobbyists and corporate users/contributors end up in your commit history, you might choose (A)GPL. That's not to say there are no public corporate contributions to BSD- or MIT-licensed projects, but I expect using the (A)GPL could increase the likelihood. Well, aside from the effect of companies that simply refuse to touch GPL'd software.
You aren’t discussing the controversial part at all. You have laid out the part that everyone agrees with.
GPLv2 means you pass your modifications back - easy.
GPLv2 might mean that calling a library’s static functions from your code is a “modification”, so you must now include code that is not logically a modification at all. This is even more of a grey area if you dynamically link them - hard.
“What is a derivative” has fed millions of lawyers’ families; imo this type of broad interpretation does not belong in OSS. Give it to me or don’t, I promise to pass my modifications back, but don’t pretend that integrating it into my code is a modification that allows you to see what I do with it.
LGPL makes more sense. But GPLv2/3 has a business model and that is why it is popular and complex.
I wasn't discussing any part at all, I was explaining why the goal of the project determines the license and not the preference of the authors.
If the goal is to allow others to close off the product, then one license is suitable. If the goal is to prevent others from turning it into a closed source product, then you choose the other license.
When the goal is "popularity amongst closed off companies", then one license makes sense. If the goal is "popularity for the community" then the other license makes sense.
There is no value in arguing over which is easier for companies unless the goal is popularity with companies and letting them close off the code.
After all, any company that simply wants to use the software without redistribution doesn't really care.
> “What is a derivative” has fed millions of lawyers’ families; imo this type of broad interpretation does not belong in OSS
I am sorry but your view is dying out because it’s unrealistic.
Most things in life have grey areas, most things in life have legal concerns and liability attached.
Even the simplest possible things: if you sell bicycles and someone breaks their neck, they could sue and there will be grey areas as to the bicycle’s quality. If you sell food and someone gets food poisoning, same thing.
The fact that software developers could make millions and never worry about law is a giant privilege, and it’s disappearing because software is no longer just for kittens and porn; it is now everywhere and it’s responsible for life and limb.
One problem I see with the GPL (all versions) is that they make a big deal out of 'linking' your program.
And in a world where C was the only language that mattered, that might have been fine. But more modern languages might have more diverse notions of what linking is and ain't, and there might be gray areas, too.
Dynamic linking (also what most more modern languages do instead of traditional static linking) always has been an utter mess when it comes to GPL "virality".
It's completely untested in court and a lot of legal scholars doubt the reasoning and interpretations given by the FSF (namely that dynamically linking with a GPLed work transfers the same GPL license to that work) would be allowed by current copyright law.
More or less amounts to the question of "do you need to use parts of the original work to make a derivative work".
Another, possibly far bigger problem, is that it's entirely possible to effectively neuter the GPL with a secondary contract. We know this is legally possible because of what Red Hat does (cut you off as a customer if you redistribute the source code of RHEL) as well as a far more direct legal case in Germany where a WordPress theme maker was able to prevent distribution of a theme they made by arguing that the contract made when the theme was sold overrode the terms of the GPL and prevented the buyer from sharing the theme because of agreed to terms in the contract.
Unless the GPL'd work provides a novel API that the GPL code implements; that has been tested in court. The API is protected, by copyright, and linking to that API requires the use of the API. That's why you can't just link to GPL'd code will-he–nill-he, without the possibility of serious repercussions.
“Will he nill he…” huh, I wonder if that’s the etymology of the modern expression. “Willy-nilly” just seems like a nonsense phrase. “Will he nill he” could be read as “will he or will he not,” right? It seems like only a couple short conceptual hops to go from “whether or not he decides to do something” to “look at all the random things he could do.”
I suppose I should have looked it up! For some reason I just looked up the definition of nill and tried to work out a plausible folk etymology from there.
API protected by copyright? Sounds like a judge somewhere dun goofed, that just makes no sense on the face of it. The specific IMPLEMENTATION of an API maybe, but not the API itself.
> Another, possibly far bigger problem, is that it's entirely possible to effectively neuter the GPL with a secondary contract.
I don't really think that's a problem, per se.
Regarding your second example, that's totally fair: effectively the theme maker (who is presumably also the sole copyright holder) did not actually give the theme to the other party licensed under the GPL: they licensed it differently in that sale, which is 100% their right to do.
The Red Hat example is a little more complex, I think. The GPL says you cannot add extra conditions to redistribution. If Red Hat owned the copyright for everything they distribute, that would be fine: they couldn't say RHEL was GPL-licensed, but they'd be in the clear.
I'm actually surprised they've gotten away with this: as far as "ambiguity in GPL enforcement" goes, I feel like it's not all that ambiguous that RH is committing a GPL violation by adding that condition. All it would take would be for rightsholders of a major component of RHEL (like perhaps the Linux kernel) to complain about this. (Then again, there may be no publicly-stated contract or policy here; RH may just cut off customers who share, and cite vague "we reserve the right to cut you off for any reason" policies.)
(Also I might be getting details wrong here; I'm just going by your description and haven't done my own research into these cases.)
It's pretty clear what it means, based on the goals of the license: To allow the user to easily replace/modify the (L)GPL licensed (part) of a program.
For GPL, that means you can only include GPLed code if you make the whole program available for user to modify and redistribute.
For LGPL, you can include LGPLed code if you make just the LGPLed part of the app available for easy replacement. (so normally this means some form of dynamic linking, and very commonly used in practice with eg. glibc)
So if some company statically links (L)GPLed code to their app or makes it in other ways hard to modify/replace, it's a violation.
Except the "whole program" bit is in question here. A lot of legal scholars doubt that using the GPL in the way the FSF wants it to be used when it comes to linking would actually confer the terms of that license to the final product (because the FSF wants it to be considered a derivative work of the GPL code). In the EU, it's to the point where the EU themselves have written guidance that they doubt this interpretation applies[0]. This would effectively limit the scope of the GPL to always function like the LGPL/MPL (or any other copyright license that limits the copyleft element to the parts of the program actually licensed as such, as opposed to spreading it across the whole work.)
The problem is that the GPL isn't the law; it's something (a license or a contract, the interpretation of which is a toss-up; the FSF wants it to be a license since it allows shrinkwrapping the terms, all legal enforcement has it treated as a contract) that relies on existing law to function. Which fails to work if it tries to do something that legal framework doesn't allow you to do. You can't enforce copyright over derivatives if it's not considered a derivative to begin with.
If the FSF conceded that the GPL was actually a contract and were to write more explicit terms stating that this is a requirement for using GPL'ed source code, they'd probably not run into this problem, but because they insist on forcing it through the framework of copyright, the virality of the license only goes as far as the definition of derivative works goes.
> as well as a far more direct legal case in Germany where a WordPress theme maker was able to prevent distribution of a theme they made by arguing that the contract made when the theme was sold overrode the terms of the GPL and prevented the buyer from sharing the theme because of agreed to terms in the contract.
> It's fair for developers and users. Users being able to do what they want with the software is fully intentional on my part. That's a freedom I give them with that license and not something I actively want to restrict. Fully understood and intentional.
Sure, but GPL is fair as well. Just differently.
I like MPLv2, because it seems like it is exactly equivalent to MIT for projects that don't modify my code, and if they do, it forces them to distribute the changes they made to my code. In practice, I strongly believe that it gives developers leverage against their management: as a developer, I can say "I must upstream my changes because it's MPLv2" (which is not exactly right but managers don't really understand it anyway). MIT doesn't give me that leverage: managers know too well that they can just pretend it's theirs (often even forgetting attribution).
The main downside with GPL licenses, from the perspective of corporations, is the requirement to share your changes if you distribute the code. GPL v3 and AGPL were both designed to close loopholes that were useful for certain corporations that didn't want to open source their own code. I believe the main ones were "Tivoization" (devices where you can't run modified code due to DRM) and cloud SaaS services where modified GPL code is run without sharing the changes because technically the cloud is not "distribution". GPL v3 was designed to stop the former and AGPL to stop the latter.
FAANG companies generally hate AGPL and ban the use of code licensed under it which is a backdoor way of admitting that their cloud platforms include modified GPL code whose changes are not publicly available.
The permissive licenses like MIT and BSD allow corporations to take your code and modify it without paying you or even sharing the changes with you. All they have to do is give you credit. That's not something any sane corporate lawyer is going to care about.
Both of these types of licenses permit commercial use without paying the people who wrote the code and the entire FOSS community rejects non-commercial licenses as "not open source" at all. Corporations only object to GPL because it prevents them from taking other people's work, modifying it, distributing it and profiting from it without sharing their changes. That can reduce corporate profits by allowing competitors to build derivative works off of the same code.
Personally, I'd prefer if there was a major non-commercial license for code because that would ensure nobody is profiteering off of my work without paying me. People who aren't in this line of work find it strange when you explain to them that somebody could write a portion of a highly successful product and never see a cent in royalties from it.
> I'd prefer if there was a major non-commercial license for code
Non-commercial licenses are considered non-free.
If you intend to share your code with the world and don't want others to profit on your work without giving anything in return, go with something like (A)GPL. Sure, people will be able to profit on your work, but in return, you get all their bug fixes, testing, support, and improvements.
You have to realize that most contributions to GPL software are made by people who work for money, something that a non-commercial license would not let them do. So by releasing your software with such a license, you prevent the best people from eventually contributing. And why would you open source your software if not to take advantage of the contribution of others (in code or otherwise)? Another reason why you would want to publish your software under GPL (or compatible) is to make use of other GPL (or compatible) software and libraries; you can't do that if you publish under a non-commercial license. So, lots of drawbacks, few advantages.
> And why would you open source your software if not to take advantage of the contribution of others (in code or otherwise)?
Because some people do things for different reasons than you might?
I'm not the person you're replying to, and usually use GPL for things that I care about, and BSD 3-clause for things where I don't care so much. But I can see the appeal of a non-commercial license, about only allowing something for personal (or, say, non-profit) use, and refusing to allow anyone to profit monetarily off your work.
It's fine if you don't want to use something like that, but I think it's unfair to cast aspersions on the concept.
Of course the corporate lawyers and enterprise settings are going to take this position, but your flippant dismissal of the GPL's core uses and purposes suggests you are just in a group that prefers the non-user-friendly licensing.
That’s ok, but you can’t hand wave away that GPL, especially v3+, is about protecting and defending the user (like tron), while MIT/BSD is not, by calling it “vague notions of fairness” as if GPL hasn’t already seen wins in the courts… on this point I think some research might be due for me though.
I’m just so very tired of the “hey me and my banker/lawyer/insurance friends all hate this - that means it’s useless and dangerous for everybody”
I say the GPL and copyleft and the four freedoms for the user are the tools we use to fight these types of corporations in the first place, and I truly believe not only is GPL not dead but that it will play a vital role in our near and long term future.
Imo the question is: do you want to protect passive users' access to the code, or developers' freedom to use the code how they want, even if that means they take existing code, modify it slightly, put it in an embedded device, and enable tamper protections on the MCU preventing anyone but them from signing new firmware images? If that embedded device is something harmless like a video player or gaming console most open source developers would probably like to use the license as a crowbar to force the device open, but what if it's a medical device, or a payment terminal the user is explicitly not trusted to modify?
If anything medical devices, especially implants, should be modifiable. Otherwise you're stuck with an unmaintainable device (possibly in your body) once the manufacturer goes out of business or the support window ends or whatever. Right to repair and all that.
To me it's not solely about modification. It's also about transparency. While I most likely would shy away from trying to modify the software of a medical device, I think it should be illegal to lock down those devices so that people can't at least see the code that's running it.
If you're going to build something that's going to directly affect my health, I would much appreciate it if it were easy for researchers to find bugs and security issues with the software that controls it.
But ultimately I don't care about the whole "legally locked down" thing, and think that concept is a net negative for society. If I can buy a medical device to use at home, then it should be on me if I decide to modify it and it injures or kills me. I expect the law hasn't really kept up with that notion; I wouldn't be surprised that in many cases the manufacturer might still be liable. But that's a legal problem (that should be fixed!), not an ideological one.
> ... or a payment terminal the user is explicitly not trusted to modify
This is the worst thing, IMO. I wholeheartedly reject the concept of software that I am given to use, but am not "trusted" to modify. Screw that. Unfortunately I do have to use software like that. But I think the entire concept is garbage.
For payment terminals the core problem is that the trust model is ass-backwards. The user is the one at risk from a faulty authn device, so they should be the side "controlling" the authn flow. Merchants would be fine just carrying dumb plastic cards with an ID number. Though these days there's little reason not to just make both sides smart… it could even run on the phones they already have!
> That’s ok, but you can’t hand wave away that GPL, especially v3+, is about protecting and defending the user (like tron), while MIT/BSD is not, by calling it “vague notions of fairness” as if GPL hasn’t already seen wins in the courts… on this point I think some research might be due for me though.
It sacrifices the user's freedoms for this. Which is exactly the concern corporate lawyers have with this. Fine if that's what you want but the net result tends to be the vast majority of big companies pretending your project does not exist. Can't look, can't touch, can't use, etc. If a blanket ban on your project across most commercial users is what you are after, AGPL does the job.
To me, frankly, that would be the goal. For some of the things I work on, I would much rather a company be allergic to it, than believe they can do whatever they want with it.
In practice, precedent has about the same legal power in both common law and civil law jurisdictions.
That is, civil law courts tend to go with the precedent most of the time on the one hand. And on the other hand, a common law judge can always do a bit of nitpicking to argue why the case before her is special and thus different enough from the precedent (or she can pick from multiple different precedents, and then come up with a justification for why the one she picked matches the case before her the most).
Some, but not all, precedents are absolutely legally binding in the French system, they have the same force as the law (unless a later law contradicts them).
Yes, not sure if this is the case but higher courts in civil law systems tend to have some precedent setting powers. It's just more limited than in common law. I'm not sure if this is the case in this ruling in france, but even if it isn't
In a civil law system judges are still aware of previous decisions in similar cases, and consistent application of the law is still a principle which exists. The precedent might not be binding, but a judge who is in doubt will still use existing case law and so will appeal courts. They will just always refer to the law as the source of the decision, not the previous cases.
I can’t give specific examples without calling out some projects.
But usually it’s a misunderstanding of what derivative works are. E.g building an extension to a GPL app. Many don’t believe that simply importing a GPL Python module is enough to make your extension have to comply to GPL.
Or things like what constitutes distribution. E.g employees vs contractors vs third party collaborators etc…
But when I have to deal with said project owners, they bring up that the GPL is about the spirit of the license, not the exact words. They often say not to worry about litigation.
This case doesn’t specifically clarify those terms but it does highlight that GPL violation is a serious issue that can’t just be treated as a suggestion.
That any lawsuit has taken place and won means that
1. I can point to cases where the license choice is impactful when working with existing projects or when setting up new OSS projects in my industry
2. It also means that the license proponents may feel more empowered (as they should) to take action against infringement.
Basically, very few people take software licenses as seriously as they should. Cases like these help make them a more serious matter.
The more seriously people take it, the easier my job gets for multiple reasons.
And since someone might ask, the license I advocate for the most is Apache 2.0. It covers the most ground, with the most clarity and the most protections for all parties involved.
>But usually it’s a misunderstanding of what derivative works are. E.g building an extension to a GPL app. Many don’t believe that simply importing a GPL Python module is enough to make your extension have to comply to GPL.
The GPL doesn't explicitly define the concept of derivative work. The FSF's interpretation is that linking creates a derivative work, but there's no universal consensus among lawyers on this issue.
The term "derivative work" (or similar) is sometimes defined in law, but of course very abstractly (i.e. not specific to software) and differently depending on jurisdiction. (See https://en.wikipedia.org/wiki/Derivative_work) IMO it's exactly because of this ambiguity that many developers' understanding of the term is likely to be incorrect, and that most of us don't really understand what our license choice entails.
I think a better way to say this is that every developer's understanding is absolutely incorrect in some jurisdictions. Maybe they're correct in their jurisdiction, but once you start talking about this online, as with every discussion of legalities on HN, people start talking past each other forgetting that many European countries will disagree with each other, and especially forgetting that the US may have several dozen wildly different interpretations.
> Many don’t believe that simply importing a GPL Python module is enough to make your extension have to comply to GPL.
I can almost hear the chairs squeaking under the many SaaS devs reading this. IANAL, but as far as I know the debate is not settled on whether this clause applies to SaaS, that is when you do not distribute the software per se, and the general consensus so far is that it does not. This is the "raison d'être" for the AGPL license, which covers the SaaS use case.
Interesting examples. Also, a weird take for them to think that’s not triggering the virality, if not that, what then? Directly using the source code only?
> the GPL is about the spirit of the license, not the exact words. They often say not to worry about litigation.
Even weirder. Especially since then, why pick the GPL at all? The only reasons I can see is that you care about free software, and want virality for that reason, and/or you want companies to pick a commercial license.
I believe this is coming from the developers of a piece of GPL software who want to allow proprietary extensions. However, extensions to their software must import a python module that they created to register with their software or something. They don't want to invest time in separating that module and its dependencies into a different project with a different license (some variant of the LGPL), so they choose to believe that importing a python module is somehow different from linking a C library.
From the point of view of the owner of the software, I can see how you can think that your intention matters more than the word of the license. It's wrong in a legal sense, of course, but it makes some kind of sense if you say "nah, we'd never sue for that".
If the writer is the only copyright holder, if they just put the exception in writing, it's still OK. Like the exception for the Linux kernel to load proprietary binaries.
But many projects are years and years old, with many many contributors. Getting them to relicense is a burden that they don’t see the worth in if they cannot first agree on the pessimistic interpretation of the license
Sure, but if the copyright holder is a foundation, you actually need some legal representative to put that in writing, a single dev writing something on GitHub wouldn't mean much, even if they are the sole maintainer.
IANAL that seems to be a distinction without a difference.. you say yourself "the copyright holder" then make a gray case about some entity holding legal ownership that is not the copyright holder.
first to say, the interpretation will be different in different legal regimes, but.. here in the USA, I believe that the copyright holder is the title that determines who gets the say about changes to the license.
Sorry, I should have been more clear. I meant that it is fairly common for a project that has a single author and contributor to nevertheless be developed as part of an organization such as Apache. So, even if that single author is the de facto owner, they may not be the legal owner of the copyright.
Usually, the developers (and this is over multiple different GPL projects with different leads) have the following thought:
1. If you’re not touching the core code, it’s not under the GPL. I try and get them to LGPL license the extension points but they don’t see the benefit in trying to relicense it with all their contributors.
2. For your second point, the issue is that the GPL is intentionally vague as a deterrent for corporate abuse. The issue becomes that, as a corporate user: I have to take the most pessimistic view to protect myself. If I switch my hat to someone non corporate, then I take the most optimistic view. In those cases, they often say to not take the pessimistic view because that’s not the spirit as they understand it.
> GPL is intentionally vague as a deterrent for corporate abuse.
no the GPL was intentionally set the way it was.. vague is not the word for that IMHO.. thirty+ years of computer markets has brought every possible challenge plus a few more. So the original LICENSE has been deeply scrutinized and the markets have changed... the parts that needed clarification have had changes as a result.
"vague" is mildly insulting, and there is plenty of that going on as part of this discussion.
> 1. If you’re not touching the core code, it’s not under the GPL. I try and get them to LGPL license the extension points but they don’t see the benefit in trying to relicense it with all their contributors.
I sense this comes down to a really poor definition of "linking" (is that the word in the text? Can't remember and don't have it to hand).
It's not at all obvious for each software/language ecosystem which specific act constitutes "linking".
> 2. For your second point, the issue is that the GPL is intentionally vague as a deterrent for corporate abuse. The issue becomes that, as a corporate user: I have to take the most pessimistic view to protect myself. If I switch my hat to someone non corporate, then I take the most optimistic view. In those cases, they often say to not take the pessimistic view because that’s not the spirit as they understand it.
100%, and this is my gripe against things like the BSL too. It's disingenuous because the aim of the license is to create risk and uncertainty through poor definitions and vagueness.
Both things can be true. The wealth of comments in this very discussion from people who aren't sure what exactly is and isn't permitted under the GPL shows that it is vague.
I'm just not convinced that the lack of clarity which tends to steer large companies away from such software is just a bug, not a feature.
It doesn’t practically change much for most developers, but imo it underlines a bit of the challenge of misunderstandings and the complexity around the sharp corners of open source, in particular outside of the US as most licenses were designed in the US by Americans.
The conclusion of the article says "[...] so far linking (even statically) is done for interoperability, does not [...]". IANAL but it could mean that linking against a GPL library for interoperability is OK in EU, but it does not mean it's true for general case of linking with GPL libraries.
Interesting opinion, but even if the courts agree it is not relevant, because most software is intended to be distributed globally (including at the very least the US market) and will have to comply with the most restrictive interpretation of the license.
2. USSC claims Google's copying Java headers is fair use.
3. Many jurisdictions do not have a fair use doctrine.
So the question is, do you think Google is infringing the GPL by copying headers into Android, according to the most restrictive interpretation or the most restrictive copyright regime?
I think the unfortunate truth is that, for some reason, only US copyright law seems to matter.
One argument for you could be that if there is no problem, why aren't they using MIT instead that works the way they want it? No need to use the wrong license if they mean something else.
I also hate when projects take BSD/Apache licensed code and change it to GPL.
While it seems technically permissible, to me it also muddies the original code’s licensing as well, especially if not well known.
It is also bizarre that someone could take BSD code, make trivial modifications, and now such cannot be sent upstream or used under the original license.
I happily call out a surprisingly under-appreciated case of this…
We only have KVM virtualization on x86 today because Citrix meticulously developed an x86 instruction emulator (many thousand lines of dense code).
The Linux kernel took that BSD code and relicensed it as GPL. Sure, at some point one might argue that later modifications by the Linux developers are significant contributions that make it unique… If I observe a good change Linux made, it is perhaps now questionable whether, as a 3rd party, I can create a PR to push the same change upstream… This is an accidentally evil side of GPL that isn’t talked about.
But at least in the early days, it also means one cannot freely use Linux’s copy but can do so from Xen —- even though it is the exact same code!!
Licensing also invites a trap where careless developers relicense GPL as BSD, and then a conscientious party unknowingly uses the BSD implementation in a closed source product.
The only saving grace is most code is horrible for reuse purposes and it is often easier to either reimplement it (from scratch) or use a well known BSD-licensed gem.
>It is also bizarre that someone could take BSD code, make trivial modifications, and now such cannot be sent upstream or used under the original license.
No it is not bizarre. That is precisely how the BSD license is intended to work. If you want to ensure that changes fall under the original license, then you want GPL.
> We only have KVM virtualization on x86 today because Citrix meticulously developed an x86 instruction emulator (many thousand lines of dense code).
> ...Linux’s copy but can do so from Xen —- even though it is the exact same code!!
Do you have any references for this? I went looking, but couldn't find anything to support your assertions. Neither code re-use/re-licensing nor origin at Citrix.
Origin at Citrix appears especially dubious, as KVM was present in linux kernel version 2.6.20, released almost a year before Citrix acquired Xensource. Also, Xen was originally developed at Cambridge University.
The only thing I thought both projects shared was their use of Fabrice Bellard's GPLv2 licensed qemu (some parts of qemu are under other licenses, but the main project is GPLv2).
I don't think you can change the license of existing code at all, unless you're the copyright holder. Best you can do is release someone's code as part of a work under another license. Anyway,
> It is also bizarre that someone could take BSD code, make trivial modifications, and now such cannot be sent upstream or used under the original license.
That's the case for anyone taking BSD sources and turning it into a closed product, isn't it? Even if you get the BSD-licensed sources, the trivial modification will not be upstreamable.
That's kind of the entire point of permissive licenses, as opposed to copyleft, isn't it?
> Many don’t believe that simply importing a GPL Python module is enough to make your extension have to comply to GPL
Wasn't this the, "if the source is in your tree everything is GPL" and otherwise "if it's downloaded module as a package it is not"? Or was that LGPL? Asking to show clarification for everyone.
edit: the question was stated as ignorant to get a few decent explanations. Thank you for the effort!
Looking closer at the GPL, it seems like most requirements only kick in once you "convey" GPL-covered code. If you make your users get the GPL component themselves from a 3rd party (e.g. PyPI or other package repository), then you might be okay. I'd be curious for input from others, but it seems like the following flow avoids GPL virality by avoiding "conveying" the GPL-covered code to the end-user:
1. You give your user a non-GPL python package with requirements.txt file (no bundled dependencies)
2. Your user pip-installs the dependencies (including some GPL-licensed ones)
3. Your user runs the application
As long as your country doesn't consider use of an API prohibited under the copyright of the implementing code, I think steps 1-3 would be fine (though not very practical for a product).
I'd be curious for others' input, though, as this has bugged me for a while in the R community where several core libraries (like the Matrix package) are GPL licensed but many packages that depend on GPL packages claim to be licensed under MIT or some other license.
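For what it's worth, here is a minimal sketch of the mechanics behind steps 1-3, assuming the application never bundles the dependency and only resolves it from the user's own environment at run time. The helper names `load_optional_backend` and `describe_backend` are made up for illustration, and this only shows the technical flow; it says nothing about whether a court would agree that nothing was "conveyed".

```python
import importlib
import importlib.util


def load_optional_backend(module_name: str):
    """Return the module if the user installed it themselves, else None.

    The application ships without the dependency; it is only resolved at
    run time from whatever is present in the user's environment.
    """
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)


def describe_backend(module_name: str) -> str:
    """Report whether the user-supplied dependency is available."""
    backend = load_optional_backend(module_name)
    if backend is None:
        return f"{module_name} not installed; run: pip install -r requirements.txt"
    return f"using user-installed {backend.__name__}"
```

Calling `describe_backend("json")` uses a standard-library module as a stand-in for the dependency; in the scenario above the name would be whatever GPL package is listed in requirements.txt.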
Yeah for interpreted languages this may be OK. For compiled languages it becomes harder to not "convey" anything derived from GPL code you linked against. The GPL even explicitly exempts system components for this reason.
Even for compiled languages, you can get around this with a properly architected plugin architecture. Your core project (non-GPL) exposes a runtime plugin interface. It has no clue what plugins will be used, but provides all the operations needed for the plugin to do what it needs. Create a plugin which links to the GPL code. GPL the plugin itself. The user can then be directed to install the GPL plugin. I can’t see how this would be a violation (by the core developer) of the GPL, as the core doesn’t have the first clue what license its plugins are using, and indeed, multiple plugins could be used that have contradictory licenses.
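A rough sketch of that kind of runtime plugin interface, in Python for brevity even though the comment is about compiled languages. Everything here (`register`, `run`, the `shout` plugin) is hypothetical and only meant to show that the core dispatches by name without knowing which plugins exist or how they are licensed.

```python
from typing import Callable, Dict

# The core only knows this signature; it never imports a plugin by name.
PluginFn = Callable[[str], str]

_REGISTRY: Dict[str, PluginFn] = {}


def register(name: str, fn: PluginFn) -> None:
    """Called by a plugin at load time to expose one operation to the core."""
    _REGISTRY[name] = fn


def run(name: str, payload: str) -> str:
    """Core entry point: dispatch to whichever plugin the user installed."""
    try:
        fn = _REGISTRY[name]
    except KeyError:
        raise LookupError(f"no plugin registered for {name!r}") from None
    return fn(payload)


# --- what a separately distributed plugin would do (hypothetical) ---
def _shout(payload: str) -> str:
    # A real plugin would call into the GPL library here.
    return payload.upper()


register("shout", _shout)
```

With the plugin loaded, `run("shout", "hi")` goes through the registry; without it, the core raises `LookupError` and otherwise keeps working.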
The end result will most probably violate the GPL. However, this is only realized by each user; you could probably argue that you never test your software and only look at API documentation.
And then it becomes a problem of proving your users violating the GPL. So you'd have to go after each one of them, which will be incredibly difficult, and proving damages would be even more difficult.
It's an asshole way of exploiting "Wo kein Kläger, da kein Richter" (where's no plaintiff, there's no judge) since actually proving that the developers violated the GPL will be difficult, unless they have a CI system that readily documents this.
Yes, all of the above is on the condition that some distribution happened (and you can prove that).
However, distribution also happens in places you might not expect. As a business, I'd stray far away from such constructs even if I only use this construct internally.
However, this is purely based on the wording of the GPL. For example, the EUPL explicitly covers the creation of derivative works - and I'd argue that the proposed circumvention would create a derivative work.
Yeah, the crux of the issue would definitely be whether use of an API is prohibited by default under copyright law for a country (i.e. does using a library make something a derivative work of the library). In the US, at least, the Google v Oracle case makes me think this is worst case fair use (for many contexts) and best case too functional to be covered by copyright in the first place.
Though I can certainly imagine that a multinational company might not be confident of the copyright status of API usage in all countries they operate in.
I'd further argue that it would be important if your program does anything useful without the GPL parts; if so, you can probably argue that it is not a derivative work. If you however only build an extension to some GPL'ed software that fundamentally needs the GPL'ed code to properly function and cannot (easily) be used by any other software, then you probably will create a derivative work.
And pipe/exec not because it's a magic boundary, but because that kind of interface is often so generic and simple that using it doesn't constitute a derivative work. But one could easily come up with an example where it does.
It's extremely entertaining to see people here arguing that programming and software interfaces should be copyrightable and therefore protected by the GPL.
Pretty sure the prevalent opinion was that it was going to be the end of the world if the US Supreme Court ruled that APIs were copyrightable in the Oracle v Google case.
Another process is not necessarily "another program altogether". What presents as "a program" to the user may be a collection of several processes using a variety of IPC methods to interface with each other. "Process" has a fairly concrete technical meaning but "program" is more loosey-goosey. A shell script could be called "a program" even though it's executed by a shell process and invokes dozens of other processes, without which it is nothing.
Within this discussion if you take a GPL'd piece of software, a "program", and interface to it via pipe/exec you're just using an external program, you're not linking to it and are 'immune' from GPL 'contamination'. If not it would effectively not be possible, or very difficult, to run commercial license programs on Linux, for instance.
I think that's probably an interesting case, and whether it's a valid defence is going to depend on what you're doing.
If you are (for example) shipping a product which is effectively a thin wrapper around a GPL'd component, but instead of linking to a GPL'd library you've written and dutifully GPL'd a small shim program that turns the library into an executable and reads function arguments over a pipe, then you call that from your 'non-GPL' code... I don't think you'll find the process boundary is some sort of absolute defence in law.
In fact you may be looked upon unkindly in court for having knowingly attempted a license circumvention mechanism.
The law is not necessarily sensitive to the exact inner workings of a turing machine, and the definition of a derivative work may or may not be affected by such shenanigans.
If the whole product you're selling is just a thin wrapper around GPL code, why not simply sell the GPL code or GPL the wrapper? Surely the wrapper would be trivial for a competitor to replicate so the reason you're able to make money from it must be something else like other services you provide with it, or just the value of you having chosen and distributed that particular GPL program.
It was more of a thought experiment to show that a process boundary isn’t likely to be some sort of perfect defence, than a real example of how someone might operate.
If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works.
I use the pipe/exec loophole to include a GPL program in a proprietary one. I personally decided that the proprietary code can be considered an independent and separate work because it still does something useful without the GPL component. Also, you could swap the GPL component for a 3rd party proprietary one, which actually exists and uses the same interface for communicating via files. So to me, it's just a generic interface and I'm distributing the GPL program together as "aggregation" of separate works. For the sake of human relations, I also got approval from the GPL program's author to do this but he admitted he wasn't sure, and neither am I.
That quote is part of section 2, I believe. That section states "You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program...", so if my program uses a GPL'd program via pipe/exec I am not modifying the GPL's program at all, just using it, and so my interpretation is that section 2 (and in fact the whole GPL license), and therefore your quote, does not apply at all.
An example of what I have in mind: Let's say that you write a program and for some functionality it needs to grep files. Now, instead of programming this from scratch you just exec/pipe grep and grab its output. That does not make your program GPL'd although grep is.
Packaging grep with your program to make distribution easier (if needed) does not, either.
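To make the exec/pipe case concrete, here is a minimal sketch, assuming the external tool is used unmodified and only text crosses the pipe. The function name `filter_lines` is made up for illustration.

```python
import subprocess
from typing import List, Sequence


def filter_lines(argv: Sequence[str], text: str) -> List[str]:
    """Feed text to an external program's stdin and return its stdout lines.

    The external tool runs unmodified in its own process; only text
    crosses the pipe.
    """
    result = subprocess.run(
        list(argv),
        input=text,
        capture_output=True,
        text=True,
        check=False,  # grep exits with 1 when nothing matches, which is not an error
    )
    # Following grep's convention, treat exit codes above 1 as real failures.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

On a system with grep installed this would be called as `filter_lines(["grep", "-n", "TODO"], source_text)`; any other command-line filter could be swapped in without touching the calling code.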
I assume it never matters if you modify it or not since you're fully allowed to fork and redistribute GPL code. It doesn't even prohibit modifications that enable compatibility with proprietary software.
The ambiguity is in whether the two programs (proprietary and GPL) together count as a "work based on the program". Your grep example seems to support the idea that they don't, and it wouldn't fall under the operating system components exemption because it "accompanies the executable". But then where is the boundary? Others have said you can't make a GPL shim to convert a library to an executable then use it in this way.
Well, in the beginning of section 2, which I quoted, a "work based on the program" is defined as a modification of the program. If you use the program as is then you do not modify it and therefore your own program is not a "work based on the program".
A GPL library is a trickier case because it's all about linking to it and potentially how that is done.
My guess is that if the shim converts the library into an executable that is usable in itself generically then you're fine (e.g. you turn some 'libgrep' into grep then use grep). But if it only implements something like RPC through pipes specifically for use with your own program then someone might decide that this is just convoluted 'linking' and falls within the licence.
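To show what the second kind of shim might look like, here is a rough sketch of "RPC through pipes": a loop that reads one JSON-encoded function call per line and writes back the result. The names `serve` and `count_matches` are hypothetical, and `count_matches` is just a stand-in for a call into the wrapped library. Note how the wire format mirrors function signatures rather than offering a generic command-line interface, which is the property being worried about above.

```python
import json
from typing import Callable, Dict, TextIO


def count_matches(pattern: str, text: str) -> int:
    # Stand-in for a call into the wrapped library (hypothetical).
    return text.count(pattern)


def serve(requests: TextIO, responses: TextIO, functions: Dict[str, Callable]) -> None:
    """Read one JSON call per line, invoke the named function, write the result."""
    for line in requests:
        line = line.strip()
        if not line:
            continue
        call = json.loads(line)
        fn = functions.get(call["fn"])
        if fn is None:
            reply = {"error": f"unknown function {call['fn']}"}
        else:
            # Arguments are passed positionally, exactly like a function call.
            reply = {"result": fn(*call.get("args", []))}
        responses.write(json.dumps(reply) + "\n")
        responses.flush()
```

A shim process would run something like `serve(sys.stdin, sys.stdout, {"count_matches": count_matches})`, with the calling program writing requests to its stdin.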
I see. It almost seems to be starting to make sense. That even brings clarity to:
"But when you distribute the same sections as part of a whole which is a work based on the Program..."
No matter how you distribute them, they never formed a "work based on the Program" anyway so the GPL doesn't spread.
If all that's correct, it seems to leave a loophole that I mentioned where you can fork the original program to make it specially compatible with your proprietary program and redistribute it under the GPL. Now when you combine them you're not modifying it because you (or your friend if need be) did that before you obtained it and the modification is none of your business.
yes! And in the general case it gets really twisty and interesting. The classic example is a GPL program cloning the functionality of a closed source program, down to arguments and options. Now, if you made some sort of closed source plugin for the GPL program, that's OK, if the plugin API was part of the originally cloned program.
However, if the plugin API was an innovation in the GPL program, then a proprietary plugin can easily be infringing. (And as always with GPL, only on distribution.)
There is no difference whatsoever between the GPL and the AGPL for software you distribute.
The only difference between the AGPL and the GPL relates to software you don't distribute, but that you make available over a network, such as hosting an AGPL web server. The AGPL says that as long as public users interact with this web server, they have the right to get the code and deployment instructions and AGPL rights to be able to run the exact same service themselves, just as if they had downloaded the software.
The FSF's (very reasonable) position is that linking to a piece of software makes your software a derivative work, but that launching it as a separate process or accessing it over a network generally doesn't. However, if you exchange very complex internal data structures with this GPL software running in a separate process, whether you do it over a network or through pipes or command line arguments or whatever, then your program is still a derivative work, in their opinion.
That is, if you run gcc in a separate process but then mmap() its memory and search for its entry points and jump to the memory locations of GCC functions in order to generate optimized code for a piece of GCC AST you provide, then your program is a derivative of GCC (according to the FSF, at least) even though GCC is running in a separate process.
A few articles I've read suggest that the AGPL terms trigger if you MODIFY AGPL covered code. So if you (1) setup an AGPL service over the network "as-is" with no changes, (2) use the AGPL-covered service from your proprietary code over the network (ie, across a process boundary), then you are in the clear, and your proprietary code can remain closed.
If you instead modify/customize the source of the AGPL-covered service, then you are required to make your changes to the AGPL-covered product available under an AGPL license.
Is this a correct understanding of the situation? If the product implements a publicly documented protocol to communicate over the network (ie, not "complex internal structures", and/or replaceable with another product), is that permitted?
I am not a lawyer, but my understanding is the same as yours - if you host a proprietary service that has some AGPL components (say, an AGPL authentication service that users interact directly with to get a token, that they then pass to the proprietary parts), then you only have to make sure that the source code for the AGPL part is made available (or not even that if you are using it completely unmodified).
> LGPL allows dynamic linkage without open sourcing your components.
Also static linkage provided you provide users with the object files so that they can re-link with a modified library.
> GPL requires any linkage to open source your parts
With the exception of system components. Maybe that is what is causing the confusion because it's not intuitive why you can ship proprietary software that uses glibc but can't ship proprietary plugins for a GPL program.
> AGPL is the nuclear don’t touch because it is vital to all possible usage types.
That is an absurd viewpoint bordering on FUD. Affero licenses are only a problem if you are trying to work around the license terms. The license does fix loopholes for networked use that allow SAAS providers to skirt the GPL but it is NOT any more viral than the GPL. Commercial use of AGPL software is absolutely possible - see e.g. cloud providers. Most GPL software probably should be AGPL.
> The [AGPL] license does fix loopholes for networked use that [allows] SAAS providers to skirt the GPL but it is NOT any more viral than the GPL.
One person’s skirting is another person’s complying. Most of what AGPL proponents argue is skirting GPL by SaaS companies seems like straightforward compliance/conformance to me.
glibc is licensed under the LGPL so that it can be linked without falling under the GPL
And I’m not saying that AGPL can’t be used, in the same way I’m not saying other GPL can’t be used. You’re misunderstanding the context of the comment.
The comment was in regards to: what are you allowed to do without having to fall under the purview of the license.
In that sense, the AGPL specifically exists to close off many areas that the GPL didn’t touch.
> But usually it’s a misunderstanding of what derivative works are. E.g building an extension to a GPL app. Many don’t believe that simply importing a GPL Python module is enough to make your extension have to comply to GPL.
AIUI that's a far from settled question in copyright law (with e.g. Oracle vs Google being mostly reversed on appeal, and different courts applying different standards). For someone who's not in the business of copyright minutiae, it's not a completely unreasonable belief to have.
Not really. Including code in your code would always be a copyright violation.
The Oracle vs Google case is very different. The OP is talking about copying implementations while Oracle vs Google is about copying APIs.
The equivalent of Oracle vs Google here would be if you rewrote the exact same python module with exact same API names, but all your implementation was clean-room implementation. Here the question is, if the APIs themselves are copyrightable.
> Including code in your code would always be a copyright violation.
Well, not quite - shipping code with your code would be, because that is direct distribution. But if I ship a script to download your GPL library from PyPI and just use it as it comes, the argument on whether my application is a "derivative" of yours is not an easy one.
It's potentially not an easy one in law, but it is one that the GPL is explicitly intended to capture. The only question is whether it has successfully done so (and that actually depends on the specifics of the usage, as I understand it).
> Including code in your code would always be a copyright violation. The Oracle vs Google case is very different. The OP is talking about copying implementations while Oracle vs Google is about copying APIs.
Importing a module doesn't generally mean you include that module in your code. It only means you made use of that module's API. So whether that API was copyrightable is very much relevant.
Extensions, importing, linking? If your (A)GPL licensed app has a plugin framework, does that 'infect' all the plugins to be similarly licensed? There's likely different situations here, or no? Like e.g. Hashicorp go-plugin where separately compiled plugins are dropped into a directory and communicated to via gRPC cross-process. Or WebAssembly Components being composed together.
Apparently this is fair use according to the US Supreme Court. Technically speaking the case that went to the USSC concerned some prior closed source version of Java, but Oracle doesn't seem to think there's a material difference since they're not suing Google again for that, and neither do I.
The FSF's argument that linking to GPL libraries requires you to relicense the code to GPL rests on the legal argument that APIs, headers, etc. are copyrightable. USSC decided that, well, only for Google's case, sure, fair use, whatever.
IMHO the GPL was drafted and marketed specifically to make its legal position seem uncertain, and the outcome of the Oracle v Google case doesn't make things clear. I'm not sure whether you'd actually see a court make a decision on the "GPL" given that the case would almost certainly be a matter of copyright infringement, not a matter of license interpretation, so the license text does not matter. If I were a lawyer I'd advise corporate clients to steer waaay clear of anything GPL.
Most importantly, the only reason the courts decided Google wasn't in violation was that Google's purpose, providing a compatible re-implementation of Java for Android, constituted fair use of the interface definition, so Google didn't need a license to use the OpenJDK headers. If Google had copied the OpenJDK headers in order to implement an entirely different system that used the OpenJDK code in a different way, the case would have gone in a fundamentally different direction.
Secondly, the legal reasoning for the GPL goes like this: by default, you don't have any right to re-distribute anyone's code (unless some fair use exception applies). The copyright holder can give you certain limited rights through a license - in our case, the GPL. The GPL can essentially impose any conditions, and if you think they are too onerous, then you just don't redistribute the code, period. If the GPL says "you can't re-distribute my code if you link your program with it unless you also distribute the sources of your program, including all precise compilation and installation instructions, and give your users the same rights and obligations", then there is nothing to dispute. The GPL could have also said "you can't redistribute my code unless you throw a penny in a prominent wishing well and light a candle in 3 different churches", and that would have been binding as well, if you wanted to re-distribute their code.
The subtleties come when you don't actually want to re-distribute GPL code, but would like to interface with it. Say you want to distribute a program that only works if dynamically linked with GCC. If you publish a bundle on your site with your proprietary program + gcc, you are clearly subject to the GPL, since you are distributing GCC and are dynamically linking with it. However, what if you don't distribute GCC at all, and just tell your users "download my program, and then install GCC under /opt/propritary/utils/gcc"? In this case, is your program bound by the GPL or not?
This is a much more complex question, and it is here where you get in the nitty-gritty of what constitutes a derivative work in software. If your program is considered a derivative work of GCC, then you are distributing a derivative of GCC and so you need to respect the terms of the GPL. However, if your program is not a derivative of GCC, then you can entirely ignore the GPL, since you are not stepping on anyone's copyright and so don't need a license. I believe this question is entirely unsettled, and there are plausible arguments (to a layman like myself, at least) both for and against it.
I don't disagree with your take (though I don't know which part of mine you disagree with...)
The idea of derivative work is clear, the confusion is how the FSF kept pushing their interpretations of it.
Programmers are usually not lawyers, and taking legal "advice" (or rather trying to learn legal concepts) from the FSF is not exactly the wisest thing to do.
There's been a lot of confusion why linking GPL libraries would be a "derivative work", and apparently the FSF's stance is... well unless it doesn't use inline functions and macros ( https://lkml.indiana.edu/hypermail/linux/kernel/0301.1/0362.... )... but let's just tell everyone you can't use GPL libraries in proprietary code.
So we have a bunch of rumors about linking mechanisms, speculations that the GPL cares about static vs dynamic linking, kernel interface calls, classpath exceptions, etc.... which shouldn't be a part of the conversation at all, at least not without considering the broader context of how the proprietary software relates to the GPL software.
> very few people take software licenses as seriously as they should.
Isn't the fact that it has taken this long, and required a straight-up violation of the direct terms of the licence, proof that people were probably justified in not caring until now?
FFMPEG. Look at the amount of legal horse fertilizer here: https://www.ffmpeg.org/legal.html ... and every time I've seen people asking what they can and can't do, someone from the ffmpeg devs (once in a couple of years when they actually respond to someone) just says "talk to your lawyer". I mean, flip a few switches and suddenly you're under a different licence ... and all this you have to look up yourself. Many (especially smaller) devs just don't care and distribute ffmpeg together with their (often non-oss) tools. Given the amount of these in the wild, if someone really had an axe to grind, they could sue hundreds, maybe thousands of devs of all sizes. It's a total minefield. And one of the most popular (L)GPL packages in existence.
My experience with FFMPEG is that it’s usually not directly distributed for probably those reasons. The only times I remember getting it directly with software, was for GPL software.
Eh, I'm just gonna point out that you can find ffmpeg code (proprietary decoders even) distributed with Electron. I'm not even going to think about how many proprietary apps built on top of Electron have no idea about that. Look at the answer here: https://stackoverflow.com/questions/67799370/legal-issues-in...
It is currently pretty clearly described. Compile it in LGPL mode and you are good for proprietary use cases.
There is nothing fundamentally dangerous with the LGPLv2.1.
It is mainly annoying in case of static linking due to the obligation of providing object files but will not contaminate your codebase like the GPL does.
What constitutes linking, dynamic linking, and static linking, if you use python and plugins? Can the rights holder of proprietary software contribute to or distribute a GPL or LGPL plugin for that software?
The important part with LGPL is not really about static vs dynamic linking.
The important part is that you should be able to modify the LGPL library that the application uses. In the case of a python module or plugins, it is possible, so that's ok for LGPL. You can even statically link LGPL libraries with your proprietary application as long as you provide a way to relink your application with a modified version of these libraries.
Fair enough, but it's still not entirely clear what constitutes linking, especially when it comes to Python, and what the situation is with plugins. The GPL FAQ says:
> It depends on how the main program invokes its plug-ins. If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program. A main program that uses simple fork and exec to invoke plug-ins and does not establish intimate communication between them results in the plug-ins being a separate program.
> If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins. If the main program dynamically links plug-ins, but the communication between them is limited to invoking the ‘main’ function of the plug-in with some options and waiting for it to return, that is a borderline case.
No company should ever do business under terms that are this unclear. It's just asking for trouble. Use another licence that is not this ambiguous.
Even saying it's OK with Python is a simplification. In almost all cases it is, but what if you compile Python to a binary, strip its symbols, statically link, then for good measure, upload the binary to an app store?
> If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins.
You don’t have to distribute the plugin as xGPL but you do need to make it compliant.
Python imports would widely be considered dynamic linking under the L/GPL.
Many companies will instead do shim/bridges that aren’t GPL themselves but something like MIT. The bridge talks to proprietary code and the GPL but doesn’t bridge the license over as a result.
This too is somewhat of a grey area though that the GPL is vague on.
> You don’t have to distribute the plugin as xGPL but you do need to make it compliant.
My reading of the GPL is that if you are the rights holder of some software, and there is a xGPL licensed plugin to said software, you can only distribute changes to the plugin if you also distribute your software as xGPL. In other words it is impractical for you to make changes to xGPL licensed plugins to your software.
> Many companies will instead do shim/bridges that aren’t GPL themselves but something like MIT. The bridge talks to proprietary code and the GPL but doesn’t bridge the license over as a result.
Unless the GPL was enshrined in some law I'm not aware of, it's likely to be a breach of contract, not necessarily "illegal". If a court rules against you, in most cases the worst that can happen is having to pay some redress to the developer and stopping what you were doing.
> Unless the GPL was enshrined in some law I'm not aware of, it's likely to be a a breach of contract, not necessarily "illegal".
The GPL licenses give you specific permission regarding copyrighted work, provided you comply with the terms. If you do not comply with the terms, you do not get the permissions. If you then do something like distribute the work you are not in breach of contract, as you had no contract to breach. You are in breach of copyright, which is not legal as there are laws prohibiting copyright infringement.
> If a court rules against you, in most cases the worst that can happen is having to pay some redress to the developer and stopping what you were doing.
Why do you want to leave this possibility open? Why not just avoid GPL and use a licence which is clearer, where you don't have to first go through litigation to understand the terms of the licence?
To be supremely pernickety one might argue that copyright infringement is a tort, so it is tortious infringement, which is unlawful (not allowed by laws) but not illegal (criminal).
But then one might also argue about how many copies of the GPL fit on the head of a pin.
However, in some jurisdictions some copyright infringing acts are deemed criminal; so not only is such argument futile but it can also be wrong according to the facts of the case.
> Usually some form of “Oh don’t worry because we aren’t litigious right now, as long as you stick with our incorrect interpretation of the license
Where do you see that attitude in the OSS world? In my experience that's how the games industry operates wrt to mods but for open source projects you usually have a clear license IME.
I worked for a Multinational US company that was doing business with a Scandinavian company. The Scandinavian company wrote a GPL plugin to our software and wanted us to make contributions to it. We didn't think we could do that and still be within the terms of the GPL, they insisted that we could because they would be fine with it.
As the author of the GPL code they have the authority to dual licence it, so surely the fact that they state they are fine with it (especially if in writing) would be enough?
I'd worry the other way, if you're contributing to the GPL code, you'd need to relinquish your rights to any IP contributed to the third party.
Our concern was not that our contribution would be GPL licensed, our concern was that we didn't have permission under GPL to distribute the contributions even if we were fine with them being GPL licensed. For details as to why we were concerned, see https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins - for any company that should be a clear signal to stay away.
An employee of a company saying it's fine in an email is also not issuing another licence. We did suggest they use another licence than GPL, which they eventually did after months of arguing with them that just saying "it's okay because we don't care" makes no material difference to us and that GPL does not unambiguously provide us permission to do what they want us to do.
Hah, yeah, an employee saying that without executive authority would be a big no.
But according to this [0], I read it as quite clearly not allowed to GPL a plugin for distribution within a non free program, even if the plugin was unmodified.
But that is contingent on whether it is one work or not, which in GNU's speculation depends on how it's linked (and it is not clear how to understand that for Python) and on the intimacy of the communication, which is not something we can objectively measure. We believed it would likely be one work; the plugin’s author believed it would not be. All of this was just months of wasted energy, spent arguing about licence terms instead of coding.
Of course it was fine, they were the copyright holder. I assume your contribution copyrights were transferred to the other company? As they were publishing the whole plugin as GPL.
> I assume your contribution copyrights were transferred to the other company?
I'm not sure why you would assume this, but it would not have happened.
Furthermore, given this https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins - and given there were no exceptions in the plugin, it just seemed like a complete mess to us. Of course the Scandinavian company disagreed because, as I said, they figured we should not care because they are okay with it.
Just quoting from above:
> It depends on how the main program invokes its plug-ins. If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program. A main program that uses simple fork and exec to invoke plug-ins and does not establish intimate communication between them results in the plug-ins being a separate program.
> If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins. If the main program dynamically links plug-ins, but the communication between them is limited to invoking the ‘main’ function of the plug-in with some options and waiting for it to return, that is a borderline case.
I think it's strange how you can be entirely sure about something which the GNU is not sure about, but the thing is neither the GNU's speculation nor yours would be relevant; it would have to go through costly and time-consuming litigation to know for sure.
I think where they are coming from is that the copyright holder has the power to pick any license, multiple licenses, or make up a license. So they can give you rights that the GPL doesn't, as they are the copyright holder. IANAL so I don't know exactly how it would be made legally binding, but they do have that power.
But we are talking about the GPL. If they used a different licence (as they eventually did after months of argument) it would not be relevant, but just the terms we got under GPL were not clearly sufficient, and their first response was "it's okay because we are fine with it". If it was an "it's okay because we are fine with it" in licence form, then it would not be GPL, but a different licence altogether.
France has a civil law system [0] which doesn't place nearly as strong an emphasis on precedent as common law does, so this specific case probably won't help speed anything up.
For the first time, I can see the advantage of AI for a forum like HN. Have the AI scan the comments and generate a summary and links to show/hide, as an example:
80% of comments are pedantic.
10% of comments discuss the actual topic.
5% of comments are excited/hopeful.
30% of comments are probable astroturfing.
20% of comments are by users who are most likely millennials.
1% of comments are by former /.'ers.
etc.
Ideally this could just run locally and operate on any website as needed, like browser userscripts now.
Seems it would have simply been prudent for the company to have obtained a commercial OSS exemption, and if one was not offered, to have requested one. Shady undertones that companies seem almost to want to use OSS without ever raising that commercial issue, as if sneaking around hoping the authors will never wise up to their rights, just to save a few bucks?
Naively, if a model is smart enough to spit / whitewash a bit of code, it is also smart enough to add attribution, isn't it? Wouldn't correct attribution be just a master prompt line away?
From what I can tell this is a judgment against some sort of government services provider. So the damages will just end up being paid by the citizens via taxation.
I don't think this is correct. The case was against Orange, which is a publicly traded company. I don't see how they could "invoice" it to their customers. Also, the service was rolled out ~14 years ago, I assume it was paid by this time. I guess they might increase the maintenance fee, though, to absorb this over time.
But, most importantly, it is not relevant. If an entity is found guilty, they are supposed to cover the damages, whoever they or their customers are. Even if it turns out to be covered from my taxes in a legitimate way, I am glad to pay for it - this is how justice is served, and we all get to keep a healthy judiciary and executive process.
I guess the fact that the software was also offered under a commercial license could be taken to mean that the violation prejudiced the legitimate interests of the library's author. IANAL, but I think this might be a one-off (most GPL libraries do not have a commercial license alternative, so...).
I don’t know… that’s kind of a negative way to see it. But I mean, those auto DMCA takedown services already kind of do something like this, right? Why shouldn’t IP violation checking as a service be a real thing that open developers can access? I think your framing of this is wrong. More empowerment and legal protection and enforcement for developers that underpin the ecosystem is a good thing.
OpenAI recently fed everything on GitHub into a language model which will cheerfully reproduce GPL'd software without license or attribution.
I think that's the end of "free software" in the GPL sense. There is zero point to writing AGPL on a program if copying it into a language model and then back out into a text file without the license is a legitimate thing to do.
I can't imagine this being countered effectively by lawsuit. Microsoft has built a weapon that they can use to disassemble open source software, I think they'll take the legal side of that seriously enough that GPL will be a distant memory by the time the courts conclude in either direction.
People don't want to acknowledge it, because they deem that CoPilot and similar tools are too useful. I personally left the day CoPilot became GA. It's an open secret that xGPL licensed applications are fed into AI models, but opponents' voice is drowned under "We're doing something amazing, we don't need any permission. Plus, it's all fair use, anyway" choir.
This is also important for Source Available (for your eyes only) licenses. People don't care about them either.
This whole AI thing is a boundless ethics violation in every category, yet people don't care, because "it's awesome!" for reasons I don't understand.
Addendum: The reason why OpenAI has a non-profit part is to allow them to claim that they do non-profit research, which fair-use doctrine requires to work.
As much as I agree with the conclusion (my AGPL code was stolen by OpenAI, as was a lot of other code), your logic is extreme. This is not good. A more reasoned argument is more convincing.
1. "Fair use"
It is fair use to use snippets of my code in other projects. AGPL doesn't change that. If I have a 3-line sort routine, and Copilot copies that verbatim, it's okay.
The place and reason Copilot is not okay is because I can ask it to spit out code which, in some versions, was almost verbatim code I wrote, and in others, a paraphrase. The same abstractions and high-level design I invented is copied by GPT.
It's like writing a book set in the Star Trek universe and (not as a parody), changing "Spock" to "Spork" and "Kirk" to "Dirk."
It's a problem for me because I compete with commercial systems, and chose the AGPL for very specific reasons: People have a right to know how their data is analyzed, and AGPL preserves that. Copilot allows companies to quickly replicate and whitewash my code.
If Microsoft uses a snippet in Windows -- e.g. a small use in an unrelated context, fair use clearly allows that. As much as I don't want my code stolen, degrading fair use would be the far greater evil.
2. "The reason why OpenAI has a non-profit part is to allow them to claim that they do non-profit research, which fair-use doctrine requires to work."
No, OpenAI was set up as a genuine, if ill-formed, non-profit. When it clearly became big $$$, bad people (likely breaking the law) formed a for-profit arm to personally profit from it. The non-profit arm is there for historical reasons.
Your statement is false, and there is ample historical record that what you're stating is not how it happened.
That's dangerous, since the wrong lessons would be drawn. The lesson this reiterates, which I learned many times the hard way, is that for non-profits, checks-and-balances matter. If you don't want your non-profit going south, this means everything -- licenses, corporate charter, bylaws, contracts, etc. -- need to be properly structured to act as a permanent network of checks-and-balances so the organization cannot go south even if bad people are involved, and that incentives are structured to support good behavior.
We do need a for-profit like the original OpenAI, but properly structured and resourced.
> It is fair use to use snippets of my code in other projects. AGPL doesn't change that. If I have a 3-line sort routine, and Copilot copies that verbatim, it's okay.
That depends. You may have a 30 line function which can be paper-worthy. I'm not supporting software patents, but reasonable attribution. If somebody uses my GPL licensed novel algorithm (which is a single function) to get unfair commercial advantage and doesn't adhere to the GPL license, can we say it's fair use? Example? See [0].
> It's a problem for me because I compete with commercial systems, and chose the AGPL for very specific reasons: People have a right to know how their data is analyzed, and AGPL preserves that. Copilot allows companies to quickly replicate and whitewash my code.
We're on the same page. I don't want my algorithms to be used by people verbatim w/o attribution or without proper license. Don't like the license? Reimplement it, paper is there.
> No, OpenAI was set up as a genuine, if ill-formed, non-profit. ... [snipped for brevity]
You're right. Let me rewrite my statement: "The reason why OpenAI keeps a non-profit part and has this convoluted many-company structure is to allow them to claim that they do non-profit research, which fair-use doctrine requires to work."
Hope that it's better now.
> Your statement is false, and there is ample historical record that what you're stating is not how it happened.
About OpenAI's structure, or GPL being fed to CoPilot? I didn't say anything about OpenAI's history AFAIK, but I have seen a tweet containing a snapshot of an e-mail saying that "Copilot is indeed trained on every public repository on GitHub". The tweet was later deleted, probably because of threats. Just because of this tweet, I'm keeping this list [1].
Actually we're literally on the same page. Only difference is I keep my "plagiarism" threshold at "function" level, because they can contain some bigger magic than themselves, and you keep it at "design" level where the magic happens as a collection of functions + design.
This is probably due to different domains we operate in, in mine, a single function can affect accuracy and speed so much that it can be considered secret sauce. Welcome to HPC / Scientific Computing. :)
> We do need a for-profit like the original OpenAI, but properly structured and resourced.
When you do for-profit, fair use goes out of the window. See [2].
> That depends. You may have a 30 line function which can be paper-worthy
We have no disagreement. That's why I gave a pretty generic 3-line function. 90% of the upside of ML tools for me, by the way, is generating very routine code, so there's not even too much friction yet. I can prototype a parser with e.g. 5 XML libraries I've never used in 15 minutes, which means I make better design choices.
> You're right. Let me rewrite my statement: "The reason why OpenAI keeps a non-profit part and has this convoluted many-company structure is to allow them to claim that they do non-profit research, which fair-use doctrine requires to work."
> Hope that it's better now.
Nope. It's not. The reason the non-profit part is still around is that, right now, they're skirting the law. If it went away, they'd be very clearly committing fraud.
The non-profit received support from donors -- including from the government in the form of a tax waiver and the general public in the forms of non-financial support (such as positive buzz / articles / scientific work) -- based on a 501(c)(3) registration and a belief it would act in the public interest.
The instant the 501(c)(3) part went away, this would all be fraud. Everyone who helped make OpenAI successful, from the grad student who spoke about it glowingly at a conference to the federal government, would have been defrauded.
I feel bad about having supported them, and I bet others do too.
As a blanket statement? I don't think so. The reason we even think of anything as "dubious" now, is a moral compass, that has (at the large scale) been staggeringly readjusted for increased human well-being.
It's also why these days I do not host any of my code on Github. Only then did I realize how much of an iron grip it has on the ecosystem. People expect Github Actions, Pull requests and well now people want to use CoPilot for all their work.
Me neither. I moved to Source Hut. I'll be restarting my work on my previous research and pondering on whether I should keep its code at a Forgejo instance buried under a VPN, until it's ready to release.
If there's also MPL 1.1 licensed code in the training set then you end up in an impossible situation where you can't comply with both GPL and MPL 1.1 at the same time.
It is difficult to see an argument that the output of a language model is not derived from the language model, other than people would prefer it wasn't.
But what is derivative? What does being derivative w.r.t. the GPL mean? Can I copy a function from a GPL’ed project? What if I rename the variables? What if I re-order some of the lines in an inconsequential way? This starts looking like the software patent problem, or the ship of Theseus.
So, if I looked at a function, understood it, then reproduced the function from my grasp (not necessarily verbatim, but by understanding the logical flow), would that be subject to the GPL? What if it varied? How much does it need to vary not to be considered derivative?
If there is a threshold, then it is possible to rewrite the original code in ways that do not resemble the original code verbatim, but are effectively the same.
> So, if I looked at a function, understood it, then reproduced the function from my grasp (not necessarily verbatim, but by understanding the logical flow), would that be subject to the GPL? What if it varied? How much does it need to vary not to be considered derivative?
That's an interesting question. In general, the answer is no, but could change depending on if you had the function open at the time of writing the new function.
But a lot of us have knowledge of basic data structures, etc, from reading GPL-licensed code (or perhaps even proprietary code in textbooks). So, in general, coming up with your own understanding is fine.
The entire model weight file is a "compiled GPL binary" and its partial outputs would be too. If we would follow GPL precedents, everything that the model generates would be.
But can we get around the GPL by transforming a function, in trivial ways, to produce an equivalent function to avoid having it fall under "derived work"?
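A quick way to see why renaming alone is such a weak transformation: the sketch below uses Python's `ast` module to replace every identifier with a positional placeholder and then compares what is left. The three snippets and the `shape` helper are made up for illustration, and structural equality is in no way a legal test for a derived work; it only shows that a rename leaves the structure of the code untouched.

```python
import ast


class _Normalizer(ast.NodeTransformer):
    """Replace every identifier with a placeholder based on first appearance."""

    def __init__(self):
        self.names = {}

    def _placeholder(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._placeholder(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node


def shape(source):
    """Return a dump of the syntax tree with all identifiers normalized."""
    return ast.dump(_Normalizer().visit(ast.parse(source)))


ORIGINAL = (
    "def total(items):\n"
    "    acc = 0\n"
    "    for x in items:\n"
    "        acc += x\n"
    "    return acc\n"
)
RENAMED = (
    "def sum_all(values):\n"
    "    result = 0\n"
    "    for v in values:\n"
    "        result += v\n"
    "    return result\n"
)
REWRITTEN = "def sum_all(values):\n    return sum(values)\n"

renamed_matches = shape(ORIGINAL) == shape(RENAMED)
rewritten_matches = shape(ORIGINAL) == shape(REWRITTEN)
```

Here `renamed_matches` comes out true and `rewritten_matches` false. Reordering independent lines or swapping loop styles would defeat this particular check, which is the ship-of-Theseus problem again: any mechanical threshold invites a mechanical workaround.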
This is tricky, though, because that LLM was likely also fed code with a license incompatible with the AGPL. So in theory, that code (whether source or compiled) may not be distributed under any license at all.
Being hard to solve doesn't make a problem go away, though. This is a place where I'm somewhat curious where it ends up. Frankly I think the outcome is more interesting for artistic works (e.g. poetry or images) than practical ones like code, since current generations of AI seem so much better at the former. Still interesting either way.
The other thing is that a bunch of stuff it auto-completes is way below any reasonable threshold for a copyrightable work, even if it reproduces code from a piece of AGPL software bit for bit. There are also only a finite number of sane ways to implement basic data structures like single- and double-linked lists in any given language. If I lay claim to the copyright for all of them should that prohibit anyone else from implementing the data structure (without unreasonable contortions)?
Why do you assume it's a legitimate thing to do? Nothing changed from a copyright perspective. You can copy a GPL file directly, or take a photo of it, then decode the JPG, OCR the bitmap and put the result in a file, or you can train a LLM to reproduce it. It's all the same, if a judge feels you've copied a substantial part of the file then it's an infringement.
The same way an LLM would generate that code without the license, another LLM could find that code and determine that it infringes. If anything, the emergence of LLM powered agents will empower the free software movement to find infringements.
The only way free software would "end", is if judges would not be able to enforce copyright, because for example the copyright infringement is obfuscated too much. For example if LLM powered agents are going to both generate and run the code without ever saving it to disk. We're not quite there yet.
If legitimate, then failure. The precondition implies the postcondition.
I don't personally think running copyright material through a magic box that you write lots of pretty marketing about removes copyright. Much like running it through grep to delete the copyright notices doesn't remove the copyright. Or deleting the licence file from a fork.
Seems totally obvious to me, but only because I don't buy that there's an artificial intelligence composing code from scratch that just happens to have the same expletives in comments that the GPL code did.
Ah, right, so that's only true if you believe that all output from Github Copilot infringes on copyright. Most responses from LLMs don't include copyrighted information, or at least not substantial portions of such.
AFAIK, having humans recite copyrighted songs from memory and then performing it without permission is illegal in most jurisdictions.
However, copyright law seems to make a distinction between expression/implementation of an idea (which can be copyrighted), and the idea itself (which generally speaking, cannot be). This is why clean room implementation is a legally sanctioned protocol to copy functionality without running afoul of copyright law.
Whether the language model is copying the "expression" (or a particular implementation) of a computational function, or whether it is just somehow re-implementing it given the user's instructions, is going to be an interesting debate, and I really doubt that people who think it's clearly one way or the other have actually understood the nuances.
I'm unsure what you're getting at, but when people sing copyrighted songs or act in copyrighted plays, they do, in fact, get a license. So yes, if you train a human on someone's copyrighted content, I do expect you to have a license to it.
I'm sure you've heard of covers? Well every cover that is published affords a royalty to (at least) the original authors of both the lyrics and composition. The artist may get some money, depending on how the work was licensed.
Yes, someone still needed to be trained on the original copyrighted sheet music before they could train someone else. Ultimately the source of that knowledge was a copyrighted piece of material that was licensed appropriately.
Yes someone may teach another person based on their memory, but even if that person still performs that work they still are legally required to license the work.
Yes, if the only thing they were doing was training, sure. But it’s not. They’re training and then presenting the data and given the way LLMs are trained, there is no guarantee a transformation even takes place.
At the end of the day LLMs should be licensed under the current copyright system. Maybe OpenAI need to donate some money to a few politicians for that to change.
Also, those laws say that remembering a song is fine. And playing it in your living room too. But that you'll have to get the licensing right if you play that song in front of an audience, in a pub or on street. At least in most countries.
To translate that to LLMs: training an LLM on songs (or code) is fine. It's the output and what you do with that output that might be problematic.
> To translate that to LLMs: training an LLM on songs (or code) is fine. It's the output and what you do with that output that might be problematic.
Only if you assume that LLMs are treated like people. Does an LLM have a right for decent working conditions? Is it illegal to make an LLM work 24h a day?
LLMs are not humans, therefore it seems reasonable to not blindly translate everything that applies to humans.
The courts often handle new technologies by default simply by assigning responsibility to the people who operated it for the specific uses in question, such as reproduction.
What the courts do is one thing, what the law should say is another one.
One problem I see with LLMs is that it is killing the job of many artists by using the copyrighted work of those very artists. The more the artists work, e.g. by doing truly original content that an LLM cannot generate because it was not trained on that, the better the LLMs get at killing the job of the artists.
It is a very weird situation: the LLMs need the artists in order to finish killing the artists. When that is finally done, it's not clear at all that LLMs will be able to generate new truly original content or whether they will be stuck in the world that the artists created before dying.
It's very different from, say, a calculator that replaced the job that real person had. Because the calculator can do at least as well as the human was doing, and the calculator was not feeding from the output of those humans.
I think the most likely outcome is that AI will eventually produce art that most humans prefer to that created by humans, and humans may not even remember a time when art came from humans.
Are you old enough to remember when humans planned in advance which video you would watch next?
> I think the most likely outcome is that AI will eventually produce art that most humans prefer to that created by humans
Well if it prevents humans from producing art, of course this will happen. Doesn't mean it will be better art. I would argue that it will be worse by definition.
It doesn't really matter what my or anyone's view on it is, unless they're either a law maker or a judge.
People do not get sued for having abilities. They get sued for publishing copyrighted works. What tools they use to create or publish those works is not relevant.
The only interesting legal question in my opinion is whether the large language model itself can be considered a carrier of copyrighted information, like for example an encrypted DVD would be.
If you can come up with a prompt that itself does not contain substantial copyrighted information, and that prompt leads the model to come up with a substantial amount of copyrighted information, then in my opinion it's no different from an encrypted DVD, and as such would be illegal to distribute without licensing. I also believe it would be illegal to provide access to it over the web like OpenAI does.
IANAL, but I wouldn't hold stock in any company that publishes (access to) models that are easily goaded into producing substantial copyrighted information.
I think the encrypted DVD is the best analogy I've heard so far.
In fact, I once asked an actual lawyer about the information-theoretic questions raised by illegal/infringing content known to be on an encrypted storage device.
Her answer was "Yeahhhh, no, the Judge is probably not going to see it that way." :-)
If you reproduce those songs and charge a subscription fee for anyone to use them you will be going to court. If you use samples of a song (which I think is much closer to an LLM) you will pay per sample use.
Also, comparing a human to a service that runs on billions of dollars of infrastructure is silly.
> I think that's the end of "free software" in the GPL sense. There is zero point to writing AGPL on a program if copying it into a language model and then back out into a text file without the license is a legitimate thing to do.
Why would you think this is 'legitimate'? Firstly, LLMs are not legal agents, only humans are, so any human caught doing this is violating the GPL, whether or not the creation of the LLM is legal. Bits have color.
Sorry, to be clear, I am saying, regardless of whether it is legitimate to do X, it is still illegitimate to distribute copyrighted code intentionally without following the terms of the license.
I tend to agree with the parent: if it is legitimate to take copyrighted code, pass it through an LLM, output it in a new text file, and say "it doesn't have any copyright anymore because it was generated by an LLM", then copyright is essentially dead as a concept.
If copyright is dead, then no need for licenses anymore: anyway people can remove your copyright by laundering your code through an LLM.
> : if it is legitimate to take copyrighted code, pass it through an LLM, output it in a new text file, and say "it doesn't have any copyright anymore because it was generated by an LLM", then copyright is essentially dead as a concept.
But it's not legitimate. That's what I said. If you (a human) take copyrighted code and redistribute it against the terms of the license, you are guilty of a crime in most places.
That was my point, whether the language model was built legally or not, if you have control of such a language model and use it to produce copyrighted code and then distribute that, you will be held as violating copyright.
I think there's a lot of confusion here because some on the thread seem to believe a large language model can in and of itself be guilty of a crime. It can't. The trainer may be guilty, or the user could be guilty, or both could be guilty. The model simply is. It might not be redistributable depending on how courts interpret the training procedure (remains to be seen).
I feel like you are being pedantic here. If we all agree that an LLM being trained on copyrighted material without authorization is a copyright-laundering machine, then the consequence is that copyright is dead.
I don't really care who we should theoretically punish: once it's out there, it's out there for good. Maybe it's already too late, and the LLM people just broke copyright for everybody (or worse: they broke it for honest people). So much for making the world a better place.
Microsoft has indemnified customers who use its Copilot to generate code. If you use code generated by Copilot and you're found to be in copyright violation, Microsoft will pay the fines for you.
"if a third party sues a commercial customer for copyright infringement for using Microsoft’s Copilots or the output they generate, we will defend the customer and pay the amount of any adverse judgments or settlements that result from the lawsuit, as long as the customer used the guardrails and content filters we have built into our products."
As this case shows, it takes over a decade to successfully litigate a pretty obvious copyright violation. So Microsoft feels quite confident in paying for you.
I disagree. Somewhere out there is an ambitious and voracious law firm just chomping at the bit for a chance at Microsoft’s or OpenAI’s billions. They will sue and the situation will get clarification.
Definitely. It'll take a decade or two and cost millions of dollars. Doesn't actually matter which side wins, GPL will no longer be a thing either way after that long. It's a spectacular play by Microsoft.
It's not mentioned because it's neither relevant to this article nor true at all? As far as courts are concerned right now, LLMs are simply tools and all existing rules and procedures around copyright still apply: a human is liable for its use, only the work and skill of a human is copyrightable, etc.
Nor are LLMs novel in any way when it comes to copyright violation. Microsoft is already rich enough they could ask developer(s) to reproduce a GPL code base but slightly different and then make an argument of coincidental, parallel thought. Or just pinky promise "we totally didn't steal this." Heck, it'd just be easier to copy and paste without license headers and then just run some basic human-readable obfuscation.
Furthermore, AI still isn't even close to being able to pull this off for even remotely complex projects. The context sizes are too small, the models aren't sophisticated enough, etc. They save time as an assistant, but novel output is extremely finicky and prone to errors. Getting a model that can be trusted to do what you're saying would be approaching AGI, in which case the economic ramifications are far more concerning than those on OSS.
Finally, how exactly would this "disassemble" OSS? The GPL has been violated forever, especially behind closed doors. If this would kill it, the GPL is already dead and capitalism killed it. After all, if Microsoft - a company with a revenue stream larger than that of some countries - decides to steal from virtually any OSS project and the developers - a gaggle of volunteers with few liquid assets - try to take them to court, they're about to engage in a battle of attrition with legal fees. The legal system notoriously favors the rich, which is a far greater threat.
> The context sizes are too small, the models aren't sophisticated enough, etc.
Yeah, I wonder what GPL-ed code is actually infringed upon. I vividly remember the time Oracle claimed Google infringed their copyrights because they copied a couple small functions, and the tech community just laughed saying the functions were too trivial. (Mind you, Java was licensed under the GPL.)
People harping that the GPL is somehow more affected by LLMs is kind of misguided. If there is copyright infringement, then almost all other licenses would be infringed too, since the vast majority of OSS licenses require keeping the copyright notice at the very least.
And the enmity towards Microsoft, whose "fault" is just having acquired shares of OpenAI, as opposed to almost every other Big Tech company that also released LLMs (and obviously ingested a lot of GitHub code), feels like the early 2000s again.
No, either this is fair use, so no license is required, or the current licences must be followed (including attribution which seems like the first thing that gets forgotten).
Copilot does small-ish snippets. It does not recreate a whole project. I am on the fence about the morality, but it won't recreate Elasticsearch for Amazon to resell as a service.
I find it fascinating to see the same-ish group of people arguing that "Algorithms cannot be patented" or "information wants to be free" or "free software, free access" and such, now turning 180° and demanding to control where, how and when information is used.
At the very least it's somewhat ironic. At worst it's plainly hypocritical.
In the interest of the best possible interpretation of the group argument:
Group A : "Information wants to be free and we shouldn't patent algorithms"
Group B : "If you want to use our corporate stuff, you must pay"
Group A : "If that is the route you choose, we will still not charge for our stuff but if you want to use it, you must share your changes"
Group B : "If we use this tool that generates code, trained on the code that Group A created, it doesn't count as Group A's code and we can ignore their rules"
Group A : "It does count as our code, please abide by the rules."
More interesting to me would be; If I trained a LLM on leaked MS sources and built an OS or Office suite. Would MS be ok with that?
The rules remain the same. If you reuse a significant amount of code as is, you are infringing copyright. If you internalized some new knowledge based on some code and come up with a new implementation... this is how we humans learn. This is expected. Why would it be different for an LLM?
And that is what this breaks down to, is a LLM learning? or is it just regurgitating?
From my limited knowledge, it looks like it is just regurgitating. This is based on what I have heard about outputs, i.e. the output stays the same if the random seed is fixed.
Also with code there are finite ways to create something functional as opposed to an opinion paper where there are a range of viewpoints and how to express those viewpoints.
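The "fixed rand" observation above can be sketched with a toy sampler. This is only an illustration of the reproducibility being described, not how any real model samples: the vocabulary, weights, and the function name `sample_tokens` are made up for the example.

```python
import random


def sample_tokens(seed, steps=8):
    """Draw `steps` tokens from a fixed toy distribution using a seeded RNG."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the draws
    vocab = ["a", "b", "c", "d"]        # hypothetical vocabulary
    weights = [0.4, 0.3, 0.2, 0.1]      # hypothetical token probabilities
    return [rng.choices(vocab, weights=weights, k=1)[0] for _ in range(steps)]


first = sample_tokens(seed=42)
second = sample_tokens(seed=42)
print(first == second)  # True: same seed, same sequence
```

With a fixed seed the sequence is identical on every run; change the seed and the draws will generally differ.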
"And that is what this breaks down to, is a LLM learning? or is it just regurgitating?"
You can make this argument for any artist. We study whole art movements which are periods of time where all art looks the same. Somehow, we want code production to be completely novel with no prior inspiration?
To your second point: I agree. There is limited expressivity in code implementation. As a human, how would you cope with that? You can't recall any prior code you may have seen, and you must write code completely different from anything in existence or face copyright litigation?
> If I trained a LLM on leaked MS sources and built an OS or Office suite. Would MS be ok with that?
Why not?
If your implementation is serving back literal chunks of code, I'd presume not, and they can easily use copyright laws to punish you.
If your implementation uses that as "inspiration", it's hardly different from you reading that code and learning from it. Obviously, it's almost impossible to read the code, learn from it, and be 100% sure you'll never reproduce any of their code. Hence "clean room" teams and such.
I think they would have a hard time using copyright:
>Section 102 of the U.S Copyright Act sets forth what is and what is not subject to copyright protection.[3] The first part of the statute states the eight types of works which are subject to copyright protection such as; literary works, musical works, dramatic works, pantomimes and choreographic works, pictorial, graphic, and sculptural works, audiovisual works, sound recordings; and architectural works.[4] The second part of the statute states that copyright protection does not “extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery…”[5]
https://easlerlaw.com/software-computer-code-copyrighted
Which is why we have software patents in the first place.
It feels like you completely miss the point of copyleft licenses. They definitely do impose constraints. Saying "I believe that LLMs are abusing my license" is absolutely not hypocritical.
I'm not saying that criticizing LLMs for abusing copyleft licences is hypocritical.
I'm saying that the arguments that define this "abuse" often are.
What I specifically find hypocritical is this: we cannot suddenly put restrictions on which clients may or may not access, read, store and see our content. "Only a human may access my content" is just as bad as the numerous attempts at DRM that "Hollywood" has tried to force onto the web.
More practically: I have a blog, with a CC-SA licence on my content. If an LLM starts spewing my exact paragraphs and articles without the same license (the SA-part asks this), I'd be pissed and have a right to call it "abuse" and could quite easily take the company behind it to court (If I wanted to).
But when that LLM merely hashes, vectorifies, trains on, and stores copies of my content so that their project can mimic my tone of voice, ideas and other content, so be it: I find it immoral to impose restrictions on what, who, where and how my content is read. Because I find DRM and such immoral.
What if I copy your article by hand? Or run it through an algorithm that translates it into another language. Can I sell it as my own? Or does it have to go through an LLM for people to sell your work as theirs?
Think about this: what if your job was to write articles, and suddenly nobody buys them anymore because they can read them cheaper on another website that serves them after the LLM laundering? Is it moral for that other website to make money out of your work?
> I find it immoral to impose restrictions on what, who, where and how my content is read.
Did you think that through? Do you find it immoral for a book author to sell their book, and to make it illegal for you to sell copies of it? It feels like you only consider your hobby-blog and ignore those who try to make a living by producing content (e.g. a photographer).
A judge will have to decide which it is. So far, they're leaning to "it's not a copy machine"
Edit: and yes. I find it immoral when an author has a say in how I use the book after buying it. Whether I wipe my ass with the paper, or display it in a shrine is none of his business. I bought the carrier, it is mine. And under copyright law (EU), I'm free to sell my copy to a second-hand bookstore, and that bookstore is free to sell it and make a profit.
I understand the nuances of digital carriers that can duplicate their contained knowledge much more easily than a paper book can, but I find it immoral that the ebook I buy cannot be read by my wife on her ereader without her buying it as well.
I see an LLM as a machine that takes an input and automatically gives an output. If that input is copyrighted, and the output is essentially a translation of that input, then it is a problem.
BTW using a camera to take a photo of your code, and then use text recognition to render it (with potential mistakes), and then use an algorithm to correct the mistakes, all that is not exactly "something that copies stuff": you will agree that there is a lot happening. Still you would say that it breaches the copyright "because it is exactly the same text". The thing is, "copyright" is not defined as "exactly the same text" (otherwise you could rename a few variables and be done with it).
> I find it immoral when an author has a say in how I use the book after buying it.
So you find it immoral for an author to sue you because you bought the book, copied it and sold cheaper copies on Amazon? Pretty sure you don't. You can't just ignore all the nuances that don't support your argument... I don't like DRMs either, but it does not mean that the concept of copyright is always wrong.
I keep reading this argument, and it's hard for me to grasp how people think it makes sense. Why do people keep comparing an LLM to a human?
If I copy-paste your code, can I say that my computer "read your code, learned from it and later wrote something inspired and based on the knowledge it internalized"? If I take a photograph of a photograph, can I sell it as mine, given that my camera "observed the original and learned from it"?
My opinion is that if you read my code, learn from it and later write something inspired and based on the knowledge you internalized, then that is fine. Because you are a human. But if you do anything that is functionally an obfuscated copy-paste and pretend that you actually created that, then that is not fine.
Likewise, it is hard for me to understand how people think your argument makes sense.. I work in ML, although not LLM, and I do understand how they work. I don't think you can reduce LLMs to obfuscating copy-pasters. I believe they are closer to our internalizing process and they can come up with novel solutions to never seen before problems.
> Likewise, it is hard for me to understand how people think your argument makes sense..
My argument is "LLMs are not humans, therefore we should be careful with such comparisons". Do you also think that LLMs should have rights? Should it be illegal to make an LLM work 24h a day?
> I work in ML, although not LLM, and I do understand how they work.
I never meant to underestimate ML. However it feels like you underestimate humans :-).
> I don't think you can reduce LLMs to obfuscating copy-pasters. I believe they are closer to our internalizing process
Wow. If you think that LLMs are closer to humans than to copy-pasters... you must not think much of living organisms. Now I understand why ML people think that AGI is coming soon :).
> OpenAI recently fed everything on GitHub into a language model which will cheerfully reproduce GPL'd software without license or attribution.
Currently, when does reproducing code become a violation of the GPL? If I see an interesting function in a project under the GPL, can I copy it? What if I rename the variables? Or what if I reimplement the function exactly? Approximately? Add perturbations? Etc., etc. I get ship of Theseus vibes.
The most common example for this is the fast inverse square root function from Quake engines. Start the function signature and the "auto-complete" will fill in the body implementing it.
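For readers who haven't seen it, the algorithm in question is short enough to sketch. This is a Python transliteration of the widely published bit trick, not the original C source; the function name and the use of `struct` to reinterpret the bits are choices made for this example, and it assumes a positive input within 32-bit float range.

```python
import struct


def fast_inverse_sqrt(x):
    """Approximate 1/sqrt(x) for a positive float using the 32-bit float bit trick."""
    # Reinterpret the bits of the 32-bit float as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Integer shift plus the well-known magic constant gives a rough first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a 32-bit float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


print(fast_inverse_sqrt(4.0))  # close to 0.5
```

The body is a handful of lines built around one distinctive constant, which is part of why it gets reproduced so readily from just a function signature.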
> To prove that GitHub Copilot trains on non permissive licenses, we just disable any post-generation filters and see what GPL code we can generate with minimal context
This seems like a key line.
I can get my browser to output GPL code with a carefully crafted Google search query, and there might even be unlicensed use of GPL code in my browser itself. But as I understand it, as long as I don’t copy GPL code, I should be fine.
In this case they’ve disabled the GPL filter, and gotten GPL’d outputs. Fair enough. The question though: when the filter is on, can I trust it? Not sure!
Not to mention it seems like most of the blog posts and tweets about this are also very purposefully prompting to get GPL code.
Secondly, if you take this seriously you should already be doing some kind of static analysis to detect it, because what if one of your employees copy/pastes some GPL code from somewhere?
It's practically a non issue that people are blowing out of proportion.
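As a rough idea of what such a static check could look like, here is a minimal sketch that compares a snippet against known code after replacing identifiers with a placeholder, so that simply renaming variables does not hide a copy. Everything here (the function names, the tiny keyword set, the 5-token window) is an assumption for illustration; real license scanners are far more thorough.

```python
import re

# Deliberately tiny keyword set for the sketch; a real tool would use a proper lexer.
KEYWORDS = {"def", "return", "if", "else", "for", "while", "in", "import", "from", "class"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Tokenize and replace identifiers/numbers with placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            tokens.append("ID")
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


def shingles(tokens, n=5):
    """Set of all n-token windows over the token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(a, b, n=5):
    """Jaccard similarity of the normalized token shingles of two snippets."""
    sa, sb = shingles(normalize(a), n), shingles(normalize(b), n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "def area(width, height):\n    return width * height\n"
renamed = "def f(a, b):\n    return a * b\n"
unrelated = "for x in items:\n    print(x)\n"

print(similarity(original, renamed))    # 1.0
print(similarity(original, unrelated))  # 0.0
```

A high score only flags a candidate for human review; whether something is actually infringing is still a legal question, not a numeric one.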
> It's practically a non issue that people are blowing out of proportion.
I don't think it is. If it is legal to input copyrighted stuff into an LLM and have the LLM magically remove the copyright, then the very concept of copyright is dead. LLMs are then a copyright-laundering machine.
Worse: if it is considered "fair use", then I don't think that I could even sell my copyrighted material with a contract that says "you are not allowed to feed it to your copyright-laundering machine". If it is fair use, anyone can do it, right?
There are two different things people are upset about here that I see.
1. The LLM might output GPL (or other restricted) code if you turn off the filters.
This one I see as a non issue.
2. The LLM benefited from GPL code and various copyrighted content in its training.
This is more complicated and for the courts to decide. My personal opinion is that as long as you as the end user are not publishing the output copyrighted content it shouldn't matter, and if you are then the owner of that copyright should pursue justice through the existing processes.
I am genuinely not sure if you just ignored all that I said or if it is your way of saying "it is fine for me if the very concept of copyright is dead".
You've said nothing of note. I honestly don't care about your personal feelings about copyright. All I was doing was clarifying the issue I was originally addressing because you replied to me about something different.
Right, at least that's clear now. You said that "It's practically a non issue that people are blowing out of proportion", I believe that the end of copyright is not a non-issue.
But yeah, ignoring the issues others have is a good way of living in a world with no issues at all.
Of course you can't trust it. It's trained on code from GitHub repos and, surprise, people copy-paste GPL code all the time into repos for which they randomly chose a permissive license when the repo was created.
It's theoretically helpful to at least put in a no-warranties clause. But sqlite as maybe the most popular public domain project worldwide doesn't (instead having just the declaration and a blessing). I mostly settled on the Unlicense https://unlicense.org/ over just declaring 'public domain' or 'CC0' as a simple text blob to paste in, and in the event of a significant contribution from someone else, there's a simple text blurb to ask them to say 'yes' to in order to keep the code unencumbered.
How could such a simple thing take 14 years to untangle?
How could anyone trust in courts that have such a spectacular efficiency?