What causes people to need to sprinkle license boilerplate everywhere, including in files which are otherwise completely empty (like r2/r2/config/__init__.py), and then to have to update them every year? See this commit: https://github.com/reddit/reddit/commit/90cfcaaecc56cf35e758...
It just seems to defy reason that we must make humans increment a number every year in every file in our projects, never mind the fact that the top 20-30 lines of every file in our projects have been taken over by stuff most readers don't actually need to read (again and again).
Is this really the best we can do without somehow letting the bad guys take our home away due to some licensing gotcha? Like simply having this at the top of each file:
In essence it's because there is a very large number of lazy programmers who live by cut and paste. We're not just talking about "I've found a solution on Stack Overflow and will use that", but more "I've searched GitHub for keyword + language and this file does what I need".
The files are copied into their projects in their entirety (sometimes whole libraries are), and those programmers never bother to check how a project is licensed.
Once this process has been repeated a few times the code is firmly detached from the licence and any original license is ignored.
If I use the suggestion you make, then by them copying files into their project they have changed the licence of a file (it now inherits whatever their project uses).
Though I do like the idea of a stub instead of the full thing:
That would be enough to describe the licence for the file in a way that survives cut and paste, whilst also providing a URL for the full licence details.
Programmers who copy and paste can always copy the fragment they need, no need to pay attention to headers. Ultimately licensing relies on people operating in good faith. Even without explicit licensing, there is an implicit copyright on works such as source code, so taking code from random places without checking out the license is never warranted either way.
> If I use the suggestion you make, then by them copying files into their project they have changed the licence of a file (it now inherits whatever their project uses).
No, the only one who can actually change the license of a file is the rights holder, so the person who copied the code while ignoring the license misrepresents matters but does not change anything about how the work is licensed.
> What causes people to need to sprinkle license boilerplate everywhere, including in files which are otherwise completely empty
The Apache 2 license has language indicating that the intended use is to put a bit of the license in every file. That's why. It's easy enough to maintain a license at the top of files with an IDE like IntelliJ.
Perhaps in places like GitHub. With central versioning systems where the server is under our control we simply run daemons that check the copyright. Each user can define how it should work for them. If the copyright is not okay, the user can either a) have the submit fail so he is notified that it needs fixing or b) let it be fixed automatically by the daemon.
This fixing also includes adding a copyright notice to new files that didn't have any. Nicely defined depending on the file type.
The implementation was a one-time effort which now saves us from doing exactly what they are doing now: manually going through thousands of files to fix a copyright.
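A rough sketch of what such a server-side check could look like, in Python. This is not the poster's actual daemon: the notice text, the `COMMENT_PREFIX` table, and the function names are all made up for illustration. It either rejects a submit that lacks a notice or fixes it automatically, using a comment style chosen by file extension.

```python
import re

# Hypothetical notice text and per-extension comment styles (assumptions,
# not taken from any real project); a real setup would load these from config.
NOTICE = "Copyright (c) {year} Example Corp. See LICENSE for details."
COMMENT_PREFIX = {".py": "# ", ".sh": "# ", ".js": "// ", ".c": "// "}

NOTICE_RE = re.compile(r"Copyright \(c\) \d{4}")


def has_notice(text: str, head_lines: int = 5) -> bool:
    """Return True if a copyright line appears in the first few lines."""
    head = "\n".join(text.splitlines()[:head_lines])
    return NOTICE_RE.search(head) is not None


def check_submission(text: str, ext: str, year: int, autofix: bool) -> str:
    """Return the text to store, or raise ValueError to reject the submit."""
    # Unknown file types and files that already carry a notice pass through.
    if ext not in COMMENT_PREFIX or has_notice(text):
        return text
    if not autofix:
        raise ValueError("missing copyright notice")
    header = COMMENT_PREFIX[ext] + NOTICE.format(year=year) + "\n"
    # Keep a shebang line first so scripts stay executable.
    if text.startswith("#!"):
        first, _, rest = text.partition("\n")
        return first + "\n" + header + rest
    return header + text
```

With `autofix=True` an empty file simply comes back as the one-line header, which is how an otherwise empty `__init__.py` ends up containing nothing but a notice.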
I remember watching an interview with Stephen Fry, who mentioned that placing the Copyright symbol once on your piece of work is sufficient to claim Copyright. But is placing the symbol once on a book the same as placing a Copyright/License block once in a project directory?
As a matter of fact, you do not need to even declare copyright anywhere in the text to claim copyright (at least in the US). Copyright exists from the moment of the work's creation. [1] And placing a copyright notice does not afford you any other benefits without registration anyway. Once you've registered with the US Copyright Office, you may place a copyright notice if you want, but your work is still protected even if you don't. [2]
I'm dating a lawyer. According to them, what you've written is true, however, speaking practically, there is a significant advantage to be gained from presenting evidence. If two parties show up to a dispute with identical source code, the one that has a copyright in it has an advantage. Sure, it's easily faked, and that could be argued, however, it would be trying to argue away evidence that exists which is much more difficult than arguing in favor of something that does exist. So if you want to lock in a victory and reduce court time, use copyright notices (and other legal notices like trespassing signs, etc.) liberally.
> It just seems to defy reason that we must make humans increment a number
Ah, but why do you assume a human did that? Writing a script to update the year in all files doesn't take more than a few minutes. Chances are he simply ran "update_license_year" and committed.
Likewise for having a license on every single file: it may simply be a git hook that prepends it to every file with a certain extension.
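For what it's worth, a year-bumping script along those lines might look roughly like this in Python. It is only a sketch: the header format matched by `YEAR_RE` and the name `bump_year` are assumptions, not what the reddit commit actually used.

```python
import re

# Matches "Copyright (c) 2012" or "Copyright (c) 2006-2012".
# The exact header wording is an assumption made for this example.
YEAR_RE = re.compile(r"(Copyright \(c\) )(\d{4})(?:-(\d{4}))?")


def bump_year(text: str, new_year: int) -> str:
    """Extend the first copyright year (or year range) up to new_year."""

    def repl(match: re.Match) -> str:
        start = int(match.group(2))
        end = int(match.group(3) or match.group(2))
        if end >= new_year:
            return match.group(0)  # already current, leave untouched
        return f"{match.group(1)}{start}-{new_year}"

    # Only the first match is rewritten, so years mentioned further down
    # in the file are left alone.
    return YEAR_RE.sub(repl, text, count=1)
```

Running it over a tree is then a loop over something like `pathlib.Path(".").rglob("*.py")`, reading each file, calling `bump_year`, and writing the file back only if the text changed.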
> you may not use this file except in compliance with the License
This is patently absurd: the file contains no content other than the license itself, and arguably its name (which is shared by millions of other __init__.py files around the world).
Ah. Sorry, I didn't understand this point. I agree it is totally superfluous to declare a license on an empty file.
But I assume that they have this header on every file as part of their internal process. Hence, they don't make an exception for empty files. I would book it as a cost of this process.
An empty file with no surrounding context is just an empty file, but is an otherwise blank file embedded within proprietary software whose presence is required for the software to function somehow public domain? The contents of the file are trivial, but its existence may not be.
It's a pain because they have no community support, new security bugs can go unnoticed, and many other issues can arise from the lack of maintenance.
Pylons most probably won't be a roadblock for them, but it will definitely bring a lot more challenges.
I once was an intern in a company that wanted to rewrite the whole Reddit code in .net. The founder was a charming person and managed to raise a huge pile of money. "We can redo this with current technology and elegant design! We will run circles around Reddit!".
We had a great time. Free snacks, lots of parties, luxurious office furniture, skateboarding in the hall... In the end, the company ran out of money before the product reached a useful state.
Good times. It has been some time since then and I have a "normal" job now. Last thing I heard about the founder is that he started a new VC-backed company destined to run circles around something.
>FYI, that includes pg's thinking on the use of lisp for viaweb
Yeah, I for one think this was equally flawed. People have made successful and quickly iterable web services/apps in all kinds of languages, including Perl and PHP.
Plus, a single data source like the one pg had is never that accurate, and the fact that he and Martin were already Lisp gurus helped them in their use of it.
Well, they're right pretty much by definition - otherwise it wouldn't matter at all what technology you use. You could code your CRUDs in Brainfuck connected to MongoDB. In the real world, you gain consumer advantage by e.g. choosing the right database for the problem, or playing to a language's strengths (which is what pg did at Viaweb).
That or more hardware to overcome bottlenecks caused by bad code. "It's running slow, we need more memory!". After some investigation, really... you've got 8 joins without using keys, and you're getting paid more than us how?
I must have spent my entire time between 2004 and 2008 working for companies that did that sort of thing (rewrite to .Net). Plenty of cash to burn with no real possibility of success and no business plan.
In two cases, the guys running it knew it was going to fail from day one and their business model was to do this in two year chunks, syphon the cash out of the VCs after talking the product up, live the high life and disappear for a bit.
I felt no shame working for them back then but I do now.
Honestly the premise, the model and database matter a whole lot more than the language. The premise is the most important as how it's presented to everyone is determined by it. The model is only to help you get your head around it; users don't (and shouldn't have to) care what you do in the backend. The database will dictate what you choose to store and how often.
Even rubbish code, in any language, can survive for a lot longer by shifting the spotlight of scrutiny to the biggest bottleneck, the database. Which will also be the deciding factor in reducing growing pains.
To be fair you can build a decent anything out of any language / framework / environment; the point is, technology doesn't matter that much if your product is good. Look at Twitter or something for example, they built a product in a language they were comfortable with, and evolved from there.