British Library has released over a million images onto Flickr Commons (britishlibrary.typepad.co.uk)
193 points by mds on Dec 13, 2013 | hide | past | favorite | 57 comments



This is fantastic, but I do have one small quibble.

    > These images were taken from the pages of 17th, 18th and 19th century books
    > digitised by Microsoft who then generously gifted the scanned images into
    > the Public Domain.
    
The images were already in the public domain. Merely scanning a public-domain image is not a sufficiently transformative process to create a new work protected by copyright.

However, Microsoft was under no obligation to share the scanned images publicly, and it was generous of them to do so. But this is different to dedicating a work protected by copyright to the public domain, e.g. using the CC0 dedication [0]. To gift a work to the public domain, you must first hold the copyright of that work.

The distinction may seem pedantic to some, but it seems important to me.

[0] https://creativecommons.org/publicdomain/zero/1.0/


You seem to be mistaken in assuming that US copyright law (or more precisely the decision set down in Bridgeman Art Library v. Corel Corp.) somehow applies to the world and not just the US.


Are you declaring that there are copyright doctrines more restrictive than the US? I'm very interested in hearing specifics!


> Are you declaring that there are copyright doctrines more restrictive than the US?

For copyright specifically (not IP in general, and definitely not patents), the US is very soft.

For instance, many mainland European countries have a concept of author's rights (or moral rights), which may well be perpetual, inalienable, unwaivable and unassignable (in a company, the moral rights belong to the employee who created the work; the company has an exclusive license to the economic rights of the work). In French law, moral rights provide 4 sub-rights:

* divulgation right, the author is sole holder of the right to decide the original disclosure of the work

* paternity right, respect must be given to the parental relation between the author and the work

* work respect right, the author can forbid any and all transformative work. This alone means "fair use" is much, much broader in the US than in France

* repent right, even after divulgation the author can "uncirculate" the work (although he may have to compensate license holders)

and because these moral rights are perpetual and inalienable, you can't put something in the public domain in France, you can only provide a universal license. The work will only fall in the public domain once copyright has expired.


Lots of copyright doctrines are more restrictive than the US. Most notably, a lot of jurisdictions don't have a concept of fair use nearly as broad as the US concept.

Also, the US uses trade treaties to cajole other countries into adopting very restrictive copyright doctrines. It's more common than you'd think.


The lack of fair use is a problem in non-US jurisdictions.

However, I'm not aware of anywhere where copyright lasts longer than the US (except in special cases like Peter Pan[1], even if that is really just perpetual rights to royalties from performance[2]).

Edit: Just noticed that [2] says copyright in Mexico is actually longer than in the US, but I'm not aware of the specifics of how it works.

[1] http://www.jmbarrie.co.uk/copyright/

[2] http://www.gosh.org/gen/peterpan/copyright/faq/#Copyright


Honduras has life + 75 years. Also some countries provide circumstantial extensions e.g. France's copyright is life + 70, but if the author is "Mort pour la France" (died for France, which is usually applied to service members but may also apply to civilians) it's life + 80.


France also added a few years here and there. For example the period in WW2 when France was occupied by Nazi Germany doesn't count for copyright.


A few countries did that, but those extensions were mostly predicated on a 50-year copyright term (the "morts pour la France" extension is too: it's 30 years tacked onto the original 50). When copyright got normalised to 70 years by EU rules (or was separately extended before the EU or before EU entry), these extensions became moot in EU countries. IIRC Russia is the only country where war copyright extensions still apply, as a 4-year extension was specifically added when they rewrote their copyright laws in 1993.


In this case what you're looking for is the "sweat of the brow" doctrine which holds that copyright is achievable through hard work and not just originality.

There are also entire classes of IP protection which the US doesn't recognise but are common elsewhere, for example sui generis rights.


Well, yes. Australia has no such thing as fair use, for a start.


"I'm very interested in hearing specifics!"

Sure you are. Just not interested enough to Google the specific case that was helpfully referenced in the comment you replied to.


That.. that is a US case.* I asked about places where there are stricter rules than the US, because I'm used to the US being the one pushing stronger copyright.

*And it was argued that the same result would have held in the UK.


Yes, the case is a US case. That's the point. It is a US case which has, in effect, granted people a degree of freedom not generally found elsewhere.

That is exactly what you asked for: a specific example of US law being more lenient than the law in other countries.


For any HN'ers based in London the British Library in St Pancras is a fantastic resource, and their Business and IP Centre ( http://www.bl.uk/bipc/ ) has an easy signup for a Readers Pass for entrepreneurs.


I tried getting some work done there a couple of years ago. All the desks were taken and the internet was a nightmare to get onto and highly restricted. Things may have changed since then. I tried going into the Business centre but lacked the required ID.


I think it's set up to help the people looking at the physical resources they have there (older and rare books on paper mostly), rather than someone looking to work exclusively with an internet connection. If you're not interested in the books, it is of course not a great place to work, though the cafe is pretty good for meetings.

I'm not sure it's really an appropriate place for entrepreneurs unless they are interested in accessing or using the books somehow. When all the other libraries have shut down or turned into Internet access centers, this one will still be valuable as a repository of the books we choose to keep as physical artefacts.


It's a Library rather than a Starbucks, innit?


You're saying that it's okay that the British Library doesn't have adequate infrastructure to support people trying to work and access information?

The equivalent would be if Starbucks, the coffee shop, only served inferior quality coffee at a ridiculously high price … wait a minute …


It's not really that kind of library, it's a central store for valuable (or just old) physical objects. They didn't even previously give out membership unless you had a request for a specific title that they hold which wasn't available elsewhere. Accessing all information is not their central purpose, accessing specific information on paper is.


Great, but why should the British Library, a non-profit charity and UK taxpayer assisted institution put these images on the servers of a US based for-profit company?

There are a lot of talented Brits in the Bay Area, but there are also many great UK based programmers and companies in the UK who could have developed a home-grown solution. This is embarrassing on many levels.


I'm of the mindset that developing a home-grown/in-house solution when there are time-tested & proven solutions out there is generally a mistake unless you're trying to innovate and offer it to the public. I worked at a place where the devs thought apache2, django, rails, etc. were too slow so they made their own web-server re-implementing the whole HTTP 1.1 protocol from scratch....


A UK government agency tried to release wartime aerial photos for free on the web and it was a disaster because the site was always overwhelmed with traffic. Now you can only get those photos by paying. This story could easily get picked up by a large news organisation and it needs to be hosted on something with lots of capacity.


There is rarely an existing solution for your specific needs. And sometimes, at some point it becomes more work to tweak an existing thing into what you want it to be than to just do it from scratch. Yes, you can't always know that beforehand, but that doesn't mean it's always safer to err on the side of taking something off the shelf.

That said, seeing how the images are creative commons licensed, it would still make sense to put them on flickr, too :P The more, the merrier.


oh god, that sounds... excruciating


You're speaking there as a tech-informed taxpayer who is being slightly nationalistic.

Most taxpayers, even if they agree with you, are likely to be more worried by the prospect of the British Library developing its own tools for access to its digital collections when perfectly adequate ones already exist.

By leveraging Flickr, the library frees itself of the problems of dealing with tens of millions of users attempting to access its material. Memory institutions have centuries of experience dealing with the lone, dedicated researcher. They're less used to dealing with massive numbers of researchers accessing collections at the same time.


> By leveraging Flickr, the library frees itself of the problems of dealing with tens of millions of users attempting to access its material.

We already have an open and free technology for that: Bittorrent.


Yeah, that's totally reasonable. Just tell people to torrent a couple of terabytes, then sift through the pictures on their hard drive, using only the file system to guide them. The hardware cost is probably several hundred pounds and the download should take about two weeks if they've got a good broadband connection.

But, hey, you cut out the evil evil American company Flickr (boo), so everybody's better off.
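
For a rough sense of the numbers in the comment above, here is a back-of-the-envelope sketch. It assumes "a couple of terabytes" means 2 TB (decimal) and a constant sustained download rate; `download_days` is a hypothetical helper, and real BitTorrent throughput varies with the swarm.

```python
def download_days(size_terabytes: float, megabits_per_second: float) -> float:
    """Days needed to transfer size_terabytes at a constant line rate."""
    bits = size_terabytes * 1e12 * 8                 # decimal terabytes -> bits
    seconds = bits / (megabits_per_second * 1e6)     # Mbit/s -> bit/s
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    # Assumed figures: 2 TB at a sustained 13.2 Mbit/s.
    print(f"{download_days(2, 13.2):.1f} days")
```

Under these assumptions, "about two weeks" corresponds to holding roughly 13 Mbit/s for the whole transfer.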


You shouldn't assume I necessarily agree with the condemnation of using Flickr just from what I wrote; it's not an approach conducive to good debate. My opinion is that Bittorrent would be a good technology for the purpose, since it frees the organization from the distribution problem without tying it to a third-party service. But I have nothing against also using Flickr, on the contrary. As PavlovsCat wrote, the more the merrier.


The vast majority of the population do not have the knowledge to use that. Surely it's good to spread the history and these documents as far as possible. Putting up requirements and barriers like "you must use BitTorrent" excludes people.


Sure, but what about using both?


> Great, but why should the British Library, a non-profit charity and UK taxpayer assisted institution put these images on the servers of a US based for-profit company?

Why shouldn't they? It's to everyone's benefit for these to be as widely distributed as reasonably possible.

> There are a lot of talented Brits in the Bay Area, but there are also many great UK based programmers and companies in the UK who could have developed a home-grown solution.

These folks can now use the Flickr API to download each and every one of the photos and code to their hearts' content.
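
A minimal sketch of what that could look like, assuming the Flickr REST method `flickr.people.getPublicPhotos` and Flickr's documented static image URL pattern. The API key and user ID are placeholders you would supply yourself, the helper names are made up for illustration, and the sketch only builds the request URL and parses a response body; fetching pages and handling rate limits are left out.

```python
import json
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def public_photos_url(api_key: str, user_id: str, page: int = 1, per_page: int = 500) -> str:
    """Build the request URL for one page of an account's public photos."""
    params = {
        "method": "flickr.people.getPublicPhotos",
        "api_key": api_key,      # placeholder: your own Flickr API key
        "user_id": user_id,      # placeholder: the account's Flickr user ID (NSID)
        "page": page,
        "per_page": per_page,    # 500 is the documented maximum per page
        "format": "json",
        "nojsoncallback": 1,     # ask for plain JSON rather than JSONP
    }
    return REST_ENDPOINT + "?" + urlencode(params)


def image_urls(response_text: str) -> list[str]:
    """Turn one JSON response page into direct image URLs."""
    data = json.loads(response_text)
    return [
        f"https://live.staticflickr.com/{p['server']}/{p['id']}_{p['secret']}.jpg"
        for p in data["photos"]["photo"]
    ]
```

To mirror the collection you would request page 1, read the page count from the response, and loop over the remaining pages, saving each image URL as you go.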


> Why shouldn't they? It's to everyone's benefit for these to be as widely distributed as reasonably possible

Because Flickr could one day be shut down, and eventually will be, however far in the future that may be. For resources like this such a disruption will be painful.

I'm not saying it is a bad idea, only that I understand the long-term hesitation in this.


Precisely.

We don't know what the future of Yahoo and Flickr will be. The terms of the deal could change drastically down the road, especially if Yahoo/Flickr get new owners.

There are also legal issues. Visitors from around the world will be accessing data from a US company as opposed to a UK organization bound by UK and EU data protection laws (for what that's worth!).


These are images dating up to the 19th century - there's nothing there affected by data protection laws.

As for the "terms of the deal" - that is only relevant if nobody bothers to download these images now. The images are all so old that the originals are out of copyright, and as far as I can see, the British Library have tagged them all "no known copyright restrictions", as Microsoft, who did the scanning, donated them to the public domain.

So these complaints are meaningless: Have a concern about the hosting? Mirror the images. As I'm sure various people will.

Even if Flickr were to shut down tomorrow, the British Library still have them, and likely Microsoft too. And the books they are from still exist. It is the British Library's job to ensure the preservation of the source material, and they can provide them to other parties.


I think that yapcguy's concern around the data protection laws is not in relation to the contents of the picture collection, but rather the personal data, access logs, etc of the people using Flickr to search and view the collection.


How would that not be a concern regardless of what entity hosted it?


It is a concern either way. His point seemed to be that if it were hosted by an EU entity, then at least it would be bound by EU data privacy laws. I'm not super familiar, but I think those are more strict than US privacy laws, for whatever that's worth.

The governments will be spying either way though, so I personally don't think it's much of a difference.


If the images were hosted in the EU, then there are limitations on what the hosting company can store and process about visitors. For the USA, there is less protection.


How would a nonprofit service be any more insulated from shutting down some day? They made the data available. You could personally choose to re-host it using an endowment to fund it for a century. They're happy seeing it immediately and reliably accessible.


I generally trust the long-term stability/accessibility of libraries and archives more than I do something like Flickr. Even if it's not shut down, Flickr could well limit API access or do any number of other things.

IMO the U.S. Library of Congress does it right. They do actually have a Flickr page, for people who prefer that interface: http://www.flickr.com/photos/library_of_congress/sets/

But there is also the canonical digital archive at: http://www.loc.gov/pictures/


> Because Flickr could one day be shut down, and eventually will be, however far in the future that may be. For resources like this such a disruption will be painful.

Which is precisely why it's good services like Flickr host stuff like this. The more services - private and public - that do, the less likely a shutdown will shut off access.


Over the years, there have been many instances of short-sighted technology selection by publicly funded British institutions. However, I don't think this is such a case. That these documents are going into the public domain means Flickr can assert no rights over them. If Flickr is able to profit from hosting them then competitors will surely do so too. Meanwhile, this will cost the Library nothing to implement or operate. They can just rehost if and when Yahoo! Flickrs out. Developing against the Flickr API presents the usual risks to third-parties, but that need not be the concern of the Library.


The first few lines in the article mention that the bulk of the work was done by Microsoft in digitizing the photos.


> [B]ut there are also many great UK based programmers and companies in the UK who could have developed a home-grown solution.

Would that have been more cost effective?


To make them more readily available. Their job is to preserve and provide access to them. The images are in the public domain (donated by Microsoft who did the scanning, by the way) - if any UK based programmers want to do something with them, they can.


Why not Wikimedia Commons?

Several major museums and libraries (including the US's National Archives) have donated major collections.


Wikimedia Commons could quite easily download them and store them on their servers.


Brightsolid is a Dundee-based company that has digitised quite a bit of British library material as a sort of public-private partnership. For example, the British Newspaper Archive - http://www.britishnewspaperarchive.co.uk/

I'm of two minds as to whether this has been a good thing. On one hand it's digitised content that was previously only in paper form, but on the other that content is now charged for, and likely will remain so for a long time due to contracts. Effectively it has been put back into copyright.

Flickr is a perfectly reasonable tool for the job.


This is a great initiative but as long as these images aren't reasonably tagged they'll remain undiscovered.


Yes but they have a strategy to address that:

> We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open licence.


Don't count out dogged historians. We spend a lot of our lives looking at material not quite relevant to our research in order to find that one gem that answers a question.

That said, I am very pleased to read the other reply that mentioned their plans. I'm not waiting, though. I'm very excited to dive in immediately.


Suggestion: 3D scan everything in the British Museum as well, and release the scans as open source 3D models.


This is wonderful. 17th/18th/19th century images notwithstanding, some of these are quite beautiful.

It's good that they put them up on the internet, not just for the exposure (someone sitting half a world away seeing them), but also because they can be used by someone now.


How do you filter a search in flickr to only search this library?


I'm also trying to search this library and have had no success, not even using Google image search. The collection is useless if one cannot search for images inside it. It's not practical to browse through millions of images to find one. If there is any way to search for keywords in the collection, please tell me.


How big is the entire collection?



