> Does this suggest noncommercial library books will offer research capabilities that commercial ebooks will not?
If they are scanned noncommercial library books, I guess.
I'm not sure how this is really relevant though. Scanning for this use may be fair use (hooray! some sanity), but the publishers aren't providing an ASCII file to the libraries either, the libraries are making a copy...hence the need for a fair use decision.
Should be a format that can be grep'd, made into PostScript, DjVu, etc. Versatile. ASCII works well for that. Open, non-proprietary format.
Not only that, working with raw text is so much faster than anything else. less(1) is the most responsive "ebook reader" I've ever used. It never chokes on huge files. less -n
Ever try reading 100 or more PDFs in one sitting?
But with 100 ASCII docs, it's possible to skim them fast using less, or mine some text with some UNIX utility.
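To make the "mine some text" part concrete, here is a rough sketch in Python of searching a pile of plain-text files at once. The function name `search_plain_text`, the `*.txt` glob, and the directory layout are all assumptions for illustration; `grep -rn` does much the same job from the shell.

```python
import re
from pathlib import Path


def search_plain_text(directory, pattern, glob="*.txt"):
    """Yield (file name, line number, line) for every line matching pattern.

    Hypothetical helper: assumes the books are plain-text files sitting
    directly inside `directory`.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for path in sorted(Path(directory).glob(glob)):
        # errors="replace" keeps the scan going on stray non-UTF-8 bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path.name, number, line.rstrip("\n")
```

Because the files are read line by line, this stays cheap even on very large texts, which is the same property that makes less pleasant to use on them.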
I like what this judge said. Digitizing books is not just about putting the book on a "paper-like" screen with beautiful fonts, it's about enabling new utility. Working with the text.
If you're targeting grep as a tool to operate on your data, then HTML isn't the best 'raw' format as grep isn't HTML-aware. It can't ignore markup, or translate encoded values (e.g. &amp; => &).
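A minimal sketch of what "HTML-aware" would have to mean, using Python's standard `html.parser`: drop the tags, decode the entities, and only then search. The names `TextExtractor`, `html_to_text`, and `grep_html` are made up for this example, and the sample markup is invented.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes only; tags are dropped.

    Caveat: the contents of <script> and <style> are also reported as
    text by HTMLParser, so a real tool would need to filter those out.
    """

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into &.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered
    return "".join(parser.parts)


def grep_html(pattern, markup):
    """Return the lines of the extracted text that match pattern."""
    text = html_to_text(markup)
    return [line for line in text.splitlines() if re.search(pattern, line)]


sample = "<p>Barnes &amp; <em>Noble</em> sold it.</p>\n<p>Nothing here.</p>"
print(grep_html("Barnes & Noble", sample))
```

Searching the raw `sample` string for "Barnes & Noble" finds nothing, because the entity and the `<em>` tag sit in the middle of the phrase; searching the extracted text finds the line.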
Technology doesn't make this an either-or situation, and some time reading RFCs really makes the pain of ASCII obvious.
Just ask for reflowable PDF. Figures, equations, cross-references (with hyperlinks), actual typography and design, full-text search, and the ability to adapt to your viewer.
If you don't like that implementation, then may I suggest looking into improvements to the algorithm, instead of brutally stripping away content from books to fit an unnecessarily crude model?
Based on reading the article (and no legal knowledge whatsoever), it would appear to depend entirely on who was doing it and why. If libraries were doing it so they could automatically subtitle the work and make it available to deaf viewers, then probably so. If you are doing it so you can hawk DVD versions on street corners for a buck apiece, likely not.
Well, that certainly. But also more. If it's legal to do this on an industrial scale with books, why not with movies or music?
I think there's some complicated reasoning in the decision, and I'm curious what it will mean in the bigger picture.
Why can't I (or a company) take any analog media and transform it to digital? I'm adding transformative functionality, after all. Digital search capabilities, for example.
Not for resale of the work, of course, but for my own vast database that I sell research access to.
And even if the work has already been translated to a digital format (DVD or CD for example), the decision seems to say there's still nothing wrong with me transferring from analog to digital. If there were, point 4 would be circular, as the decision says.
So it seems like this says Google could go digitize a bunch of LPs, VHS tapes, and even celluloid film for similar projects. Google Film. Google Albums. Etc.
I don't really think this holds up, but I can't pin down why it wouldn't.
Bonus thought: what if I took software and made it run on a new platform? Is that also fair use under this decision? Wouldn't that also be transformative use?
Please resume scanning books, old magazines and newspapers, which you "paused" some time back. It was sensible to take a wait-and-see approach until this decision was reached.
Hmmm. It's not so easy to do this with ebooks viewed in a graphical ebook reader. Note he didn't say "text search". He said mining.
Does this suggest noncommercial library books will offer research capabilities that commercial ebooks will not?
I want my ebooks in ASCII format. And that certainly goes for textbooks.
less(1) is my "ebook reader".