Pdf.js: PDF Reader in JavaScript

jpallen · on Sept 14, 2012

I'm really excited for this for http://www.sharelatex.com and other similar sites that are actually generating a PDF for you. With native PDF viewers there is no way to interact with the viewer via javascript and even just having the viewer stay on the same page when you reload a document (with minor changes) is impossible. Pdf.js means that we'll be able to do this easily, as well as other cool things like letting the user sync between the PDF and source.

jcheng · on Sept 14, 2012

Our web-based IDE[1] does exactly this, including syncing between source and PDF, using PDF.js. Works great for the most part!

[1] http://rstudio.org

bpatrianakos · on Sept 15, 2012

Are you guys actually generating PDFs from user input? If so, I'd be really interested in finding out how you do it if you don't mind sharing. I've found a number of apps that allow you to generate PDFs server side but often the installation is near impossible (dependency issues), the rendering is not good enough, or the API makes using it with a web app clunky. I did find a solution (don't recall the name) that actually used Webkit to render web pages and then turned them into PDFs but I was never able to get it to work in an app. It only seemed to work some of the time and only through ssh. But if you can't say, then I can respect that.

jcheng · on Sept 28, 2012

(Just saw this comment, sorry for not replying earlier)

We use LaTeX to generate the PDF. You could try pandoc if you're going from HTML (or markdown) to PDF.

jcfrei · on Sept 14, 2012

nice, this is quite a sophisticated solution! have you heard of airxcell? they're doing something similar (with spreadsheets).

jcheng · on Sept 14, 2012

Hadn't heard of airxcell, looks interesting. Thanks!

winter_blue · on Sept 14, 2012

I used to use PDF.js for a while (on Linux), until I switched to KParts because it was having difficulty rendering certain kinds of PDF documents. KParts uses the same underlying engine that powers Okular (KDE's default PDF reader.) It renders everything properly and is much faster than PDF.js. It reminded me of Foxit on Windows. KParts might be only available on Linux though...

ianb · on Sept 14, 2012

I use this a lot, and it really does work. It renders everything, and renders it well. The one thing that doesn't work is maps – just too many vectors, and Javascript/Canvas/etc just can't keep up. Otherwise I'm very happy and don't feel nearly as much resentment towards PDFs as I used to.

mrb · on Sept 14, 2012

(I assume you are mentally comparing pdf.js to Adobe Reader.)

Have we reached the point where Adobe Reader is so resource-guzzling that people prefer a JavaScript (gasp!) PDF interpreter to Reader? Wow.

fafner · on Sept 14, 2012

Yes, I really love PDF.js. I was considering switching to Chrome just because of the (proprietary) pdf viewer. But the Chrome PDF viewer even lacks the concept of pages! I had already kicked out the Adobe Reader plugin but using an external viewer every time makes PDF just so much more annoying.

Mozilla is also working on something similar for Flash: https://github.com/mozilla/shumway . I'm a bit sceptical but with the success of PDF.js I can't wait to use it.

IanDrake · on Sept 14, 2012

Another Ian here, also using pdf.js. Been working on a paperless office app that allows contracts to be signed on an iPad. It works surprisingly well.

cpeterso · on Sept 14, 2012

Firefox already bundles the pdf.js reader. See https://bugzil.la/714712.

thebigshane · on Sept 14, 2012

Two questions:

1) In Firefox 15, the demo page adds two new options to my right click menu: Rotate clockwise and Rotate Counter-clockwise. Is Firefox recognizing pdf.js (since it appears that they are related) or pdf.js adding menu options? I didn't know JS could do that.

2) Isn't Javascript an embeddable language inside PDFs? I'm pretty sure I read that javascript is used, not necessarily for animations but for run-time dynamic layouts. If that's true, is pdf.js "eval"-ing that javascript?

evmar · on Sept 14, 2012

HTML5 context menu API: http://www.whatwg.org/specs/web-apps/current-work/multipage/...

thebigshane · on Sept 14, 2012

Thanks.

I've seen custom contextmenus generated on pages[0] that override the built-in context menu but for some reason I've yet to come across a page[1] that adds to the existing context menu until today.

[0] replacing example: http://developer.yahoo.com/yui/examples/menu/tablecontextmen...

[1] appending example: https://bug617528.bugzilla.mozilla.org/attachment.cgi?id=554...

notatoad · on Sept 14, 2012

>but for some reason I've yet to come across a page that adds to the existing context menu until today.

the reason is likely because browser compatibility for the context menu js api is pretty minimal [1]. firefox can use it on pages that are only going to be displayed in firefox, but for the internet at large it's pretty useless.

[1]http://caniuse.com/menu

bpatrianakos · on Sept 15, 2012

What about Google Docs? Their context menu seems to work cross browser (even on ie9!). Maybe my assumption that they're using HTML5 for this is wrong but if I'm right then what you see on Gdocs is a complete replacement of the regular menu which is pretty cool but also quite annoying for me at times. Please correct me if I'm wrong (which is very likely).

notatoad · on Sept 15, 2012

Google docs is not appending stuff to the browser's context menu. they're catching the right-click action and replacing the standard context menu with their own one drawn in HTML.
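
A minimal sketch of the catch-and-replace approach described above, in TypeScript. This is not Google Docs' actual code: `showCustomMenu`, the two interfaces, and the `custom-menu` id are hypothetical names chosen for illustration. The only browser features assumed are the standard `contextmenu` event and `preventDefault()`.

```typescript
// Minimal shape of the parts of a MouseEvent this sketch needs.
interface RightClickEvent {
  clientX: number;
  clientY: number;
  preventDefault(): void;
}

// Minimal shape of the custom menu element (e.g. an absolutely positioned <div>).
interface MenuElement {
  style: { left: string; top: string; display: string };
}

// Suppress the browser's own menu and show the page's menu at the pointer position.
function showCustomMenu(menu: MenuElement, event: RightClickEvent): void {
  event.preventDefault(); // stops the native context menu from opening
  menu.style.left = `${event.clientX}px`;
  menu.style.top = `${event.clientY}px`;
  menu.style.display = "block";
}

// In a browser this would be wired up roughly as follows (hypothetical id):
//   const menu = document.getElementById("custom-menu")!;
//   document.addEventListener("contextmenu", (e) => showCustomMenu(menu, e));
```

Because the native menu is suppressed entirely, nothing is appended to it; the page has to draw every entry itself, which is why this works across browsers without the HTML5 context menu API.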

uams · on Sept 14, 2012

This is super cool.

While I can't imagine myself using it anytime soon, it's clear that web applications are improving at a far faster rate than native applications and, with t large enough, the first derivative means that web will eclipse native.

This seems like an academic exercise at the moment; it's to prove that you can replicate a native experience only.

However, it seems that this could be vastly improved by playing to the strengths of the internet. The only online apps that have beat native ones so far have been because of cloud storage and collaboration. First, use filepicker.io or something so this can open my online files. Second, bake some collaboration into it.

dkhylan · on Sept 14, 2012

GroupDocs currently provides an app for online annotation and collaboration, including accessing your files from different cloud storage providers. Azure & Amazon S3 are supported at the moment: http://groupdocs.com/apps/annotation/try-it-now

Mizza · on Sept 14, 2012

XSS injections on these are gonna be fun..

bpatrianakos · on Sept 15, 2012

I came across this a few months ago while trying to implement a solution for turning HTML into PDFs server side. This is definitely cool and useful but its usefulness is limited for now as native PDF readers on the desktop are preferable. Even on iOS the built in reader is nicely done. Chrome on Windows and Mac always opens PDFs in a tab and handles it well I think. That said, this can definitely be of use in Chromebook type situations. I'm sure it'll end up in Firefox OS too which I have high expectations for. The awesome thing about Firefox OS is that it's all JavaScript and good old fashioned web technologies under the hood so this will fit right in.

So alas, I'm still searching for an easy way to convert HTML to PDF server or client side. I haven't looked at the code yet but I do wonder if one could get that functionality out of this if they wrestled with it enough. (I know there are other ways to turn HTML to PDF but a client or server side script to do so really is the best solution for my situation).

pingpong_table · on Sept 15, 2012

This is about PDF to HTML conversion.

bpatrianakos · on Sept 15, 2012

Right, I got that. But when I found it I was hoping it could work in the opposite direction.

dutchbrit · on Sept 14, 2012

As a big user of PDF.js, I have to say it's great for basic PDF documents. However, complex vectors don't render nicely with this.

wheaties · on Sept 14, 2012

Now if someone would just do this for .docx, .xlsx, and such I'd be set.

dkhylan · on Sept 14, 2012

GroupDocs provides a commercial online viewer and APIs for supporting those formats as well as .pdf

crisnoble · on Sept 14, 2012

That is awesome! link for the lazy: http://groupdocs.com/

retroguy · on Sept 15, 2012

http://groupdocs.com/ seems to cope with a whole bunch of file formats.

crisnoble · on Sept 14, 2012

Is there a way you could tell your browser to open those links in google docs by default?

emillon · on Sept 14, 2012

Proxying all your documents to Google is something that should be avoidable.

crisnoble · on Sept 14, 2012

True. And possibly undesirable depending on the sensitivity of the files.

wmf · on Sept 14, 2012

Maybe Web Intents?

senko · on Sept 14, 2012

I first saw this a year ago (when it was publicly announced, IIRC). It was a cute tech demo but easily broken, and quite slow.

This ... is mind blowing.

andrewla · on Sept 14, 2012

Just as interesting, in my mind, is the inverse library -- jspdf [1] lets you create pdfs in javascript. For automatic document generation, I find I can quickly whip something up in jsbin or jsfiddle that will give me a pdf I can download and do whatever I want with.

[1] http://jspdf.com/

mwexler · on Sept 14, 2012

I presume that copy to clipboard could be added to this as well, yes? Cool project.

klr · on Sept 14, 2012

I have this error with Firefox 9.0.1:

currentPage is undefined http://mozilla.github.com/pdf.js/web/viewer.js Line 285

AndrewDucker · on Sept 14, 2012

I'd go with "That's because any version of Firefox below 15 is unsupported".

nnethercote · on Sept 14, 2012

You should update to Firefox 15. It's better than 9, and secure too.

Roritharr · on Sept 14, 2012

i stumbled upon PDF.js a while ago because i was looking for a js tool that allows me to extract data from pdfs... sadly i'm still looking for a good lib to do just that.

antman · on Sept 14, 2012

Apache Tika. Works on many filetypes and languages.

dkhylan · on Sept 14, 2012

If you're looking for a web based API for manipulating PDFs, SaaSpose provides a commercially supported solution

mattdeboard · on Sept 14, 2012

iText has a lot of tools for extracting info from PDFs. We use it extensively.

robodale · on Sept 14, 2012

I have used iText(sharp) for several years. It works very well.

mattdeboard · on Sept 14, 2012

Yeah we use the C# version as well, I've done some hacking on a Clojure wrapper around the java version. it's nontrivial :P

jrl · on Sept 14, 2012

This is great, I love it. I can read PDF files without leaving the browser, in any browser. I find it slightly distracting to switch to a third-party application.

jjmanton · on Sept 14, 2012

from someone who has worked a lot with PDF, excellent work.

Aissen · on Sept 14, 2012

It's been in Firefox since version 14, but disabled by default. It can be activated with "preview in Firefox" in options/filetypes/pdf.

gbraad · on Sept 15, 2012

Next up, a good ePub reader for use in firefox and firefoxos. Breaking free of the only two rendering engines in use...

leberwurstsaft · on Sept 14, 2012

On an iOS device with retina display it's awfully blurry, probably just not rendering to a big enough canvas.

kerrishotts · on Sept 15, 2012

Feels like they aren't taking window.devicePixelRatio into account so that they end up with a legitimately retina-sized canvas... My canvases were always blurry like this until I took dpr into account.
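
A rough TypeScript sketch of what taking `window.devicePixelRatio` into account usually looks like. This is not pdf.js code; `sizeCanvasForDisplay` and the `CanvasLike` interface are hypothetical names. The idea is to make the canvas backing store larger by the device pixel ratio while keeping its CSS size unchanged.

```typescript
// Minimal shape of the canvas properties this sketch touches.
interface CanvasLike {
  width: number;  // backing-store width in device pixels
  height: number; // backing-store height in device pixels
  style: { width: string; height: string };
}

// Size the backing store in device pixels while keeping the CSS size fixed.
// Returns the scale factor the caller should apply to its drawing context.
function sizeCanvasForDisplay(
  canvas: CanvasLike,
  cssWidth: number,
  cssHeight: number,
  devicePixelRatio: number
): number {
  // Fall back to 1 when the ratio is missing or invalid (0, NaN, negative).
  const ratio = devicePixelRatio > 0 ? devicePixelRatio : 1;
  canvas.width = Math.round(cssWidth * ratio);
  canvas.height = Math.round(cssHeight * ratio);
  canvas.style.width = `${cssWidth}px`;
  canvas.style.height = `${cssHeight}px`;
  return ratio;
}

// Assumed browser usage:
//   const ratio = sizeCanvasForDisplay(canvas, 600, 800, window.devicePixelRatio || 1);
//   canvas.getContext("2d")!.scale(ratio, ratio);
```

On a display with a ratio of 2, a 600x800 CSS-pixel canvas ends up with a 1200x1600 backing store, so drawing is no longer upscaled and blurred by the browser.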

chj · on Sept 14, 2012

This is amazing, but sadly slow.

darkstalker · on Sept 14, 2012

On linux, anything is better than that crappy 32-bit only proprietary plugin that hangs the browser for a second every time I want to open a pdf file.

tete · on Sept 14, 2012

Works nicely since it is Firefox's default viewer. No more need to install one, yay!

antonpug · on Sept 14, 2012

Sweet. Going to keep this in mind for when my site needs a pdf viewer. Awesome tool

famoreira · on Sept 14, 2012

This is pretty cool! Anyone know if there is support for PDF annotations?

hirenj · on Sept 15, 2012

Looks like it has basic support for annotations, but is limited in its scope. I'm hoping I can use this to read annotations from PDFs so I can hook it into a journal paper reading/organisation workflow.

davedx · on Sept 14, 2012

The demo looks really impressive, well done. Adding this to my toolbox! :)

3ds · on Sept 14, 2012

On Firefox OS this will be the default PDF viewer.

dude8 · on Sept 15, 2012

Good Job!!

jowiar · on Sept 14, 2012

1) From a technical perspective, this is damn cool - exceedingly well done. Color me very impressed.

2) I hope I never actually see anyone using this on a website, attempting to make things "easier." Between Scribd and Slideshare, and Adobe trying to force its hideous crash-prone plugins into my browser, there are already enough people making a mess out of what is one of the more well-thought-out aspects of OS X. Give me a link to a PDF, which Preview.app handles in wonderful fashion any day.

3) It would make a sweet browser plugin on browser-in-a-box platforms and other platforms that don't have a nice native implementation (which upon further reading seems to be the goal).

crisnoble · on Sept 14, 2012

Personally I wish all sites used this instead of scribd/slideshare or opening up external programs.

This will just work on different browsers, different OS etc.

Plus you can send links like this: http://mozilla.github.com/pdf.js/web/viewer.html#page=6&... to send a user to a specific part of the document. Right now usually you just send the pdf link and have to say, "check out the image on page 6".

Keeping the pdf in the browser and accessible to and from javascript opens up a world of possibilities.
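
As a small illustration of the page-link idea above, here is a TypeScript sketch that builds and reads a `#page=N` fragment. Only the `page` parameter comes from the link in the comment; the helper names `linkToPage` and `pageFromHash` are hypothetical, and the rest of the fragment in the original link is elided, so no other parameters are assumed.

```typescript
// Build a link that opens the viewer at a given page, e.g. ".../viewer.html#page=6".
function linkToPage(viewerUrl: string, page: number): string {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  const base = viewerUrl.split("#")[0]; // drop any existing fragment
  return `${base}#page=${page}`;
}

// Read the page number back out of a fragment such as "#page=6&foo=bar"
// ("foo=bar" stands in for any other, unspecified parameters).
// Returns undefined when no valid page parameter is present.
function pageFromHash(hash: string): number | undefined {
  const fragment = hash.startsWith("#") ? hash.slice(1) : hash;
  for (const pair of fragment.split("&")) {
    const [key, value] = pair.split("=");
    if (key === "page") {
      const page = Number(value);
      return Number.isInteger(page) && page >= 1 ? page : undefined;
    }
  }
  return undefined;
}
```

With helpers like these, "check out the image on page 6" becomes a link the recipient can simply click.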

geofft · on Sept 16, 2012

> This will just work on different browsers, different OS etc.

For values of "different" that mean "as long as they all have fast processors and accelerated JavaScript". I tried this in a non-Apple browser on my iPad, which means no JITted Javascript interpreter, and it basically hung. I haven't even attempted it on my smartphone, which does have a native PDF-reading application.

RandallBrown · on Sept 14, 2012

I think Firefox has been shipping with this for awhile to view PDFs. It may be an about:config value you have to turn on though.

qxcv · on Sept 14, 2012

I only noticed it in FF15, but it may have been there for longer. Change pdfjs.disabled to false in about:config (followed by a browser restart) if you want to turn it on. Here's a test file[0] should you wish to test it out.

I've been using pdf.js for a while on my Windows box (where it is enabled by default on the FF beta channel) and it has performed remarkably well. Sometimes it's a bit slower than Adobe Reader or messes up colour or formatting, but on the whole I've quite enjoyed using it.

[0]: https://svn.torproject.org/svn/projects/design-paper/tor-des...

Note that the text in [0] doesn't scale properly, and the page previews in the navigation pane just show garbled text. Usable, but only just.

SilasX · on Sept 14, 2012

Sweet! (It voids the non-existent FF warranty though...)

What's better is, it allows the (awesome) extension Pentadactyl to click the buttons in the viewer (since it's part of the browser). Have to click on (or tab to?) the document itself, though, in order to be able to use keyboard navigation on it :-/

JasonFruit · on Sept 14, 2012

I've been using it for a few weeks now, and have noticed that it:

- renders very well

- is extremely slow on large documents with images, but pretty good with text-only ones

- offers a more integrated experience than Adobe's plugin

I think it's very promising, and I look forward to seeing it become more polished.

Argorak · on Sept 14, 2012

It is included, but off in release versions at the moment.

In about:config :

name: pdfjs.disabled status: default type: boolean value: false

pucinators · on Sept 14, 2012

I'm no usability expert, but to me it would make more sense to name that parameter "pdfjs.enabled" true/false

pilif · on Sept 14, 2012

about:config isn't really optimized for usability and there's some advantage to the double negative in that the absence of a value could be treated as a false boolean value.

So if at one point pdfjs should be enabled by default, they can just get rid of the preference in the default preferences file altogether instead of having to ship one where it says enabled: true (because absence of the value would be treated as false)
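
The reasoning above can be sketched in a few lines of TypeScript. This is only an illustration of the argument, not Firefox's actual preference code: `getBoolPref` and `isViewerEnabled` are hypothetical helpers, and a plain `Map` stands in for the preference store.

```typescript
// Look up a boolean preference; a missing entry is treated as false.
function getBoolPref(prefs: Map<string, boolean>, name: string): boolean {
  return prefs.get(name) ?? false;
}

// The viewer is on unless "pdfjs.disabled" is explicitly set to true.
function isViewerEnabled(prefs: Map<string, boolean>): boolean {
  return !getBoolPref(prefs, "pdfjs.disabled");
}

// With no entry at all, the viewer counts as enabled:
//   isViewerEnabled(new Map())                            // true
//   isViewerEnabled(new Map([["pdfjs.disabled", true]]))  // false
```

Under this convention, shipping with the feature on by default requires no entry in the default preferences at all, whereas a hypothetical `pdfjs.enabled` preference would need an explicit `true`.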

alter8 · on Sept 14, 2012

Does that mean I can just remove the Firefox extension installed from https://addons.mozilla.org/en-US/firefox/addon/pdfjs/ and that turning on this setting will enable a built-in PDF viewer?

mistercow · on Sept 14, 2012

>Give me a link to a PDF, which Preview.app handles in wonderful fashion any day.

Really? I can't stand that. My download folder ends up cluttered with (inevitably weirdly named) PDF files. Worse, I suddenly am switching between programs and tabs, rather than just tabs, and the two programs have very different search interfaces (and Preview.app's is downright clumsy). Since I rarely look at a PDF if I'm not researching something, navigation and searching are really important considerations.

I'm quite happy with the Chrome PDF viewer though.

jowiar · on Sept 14, 2012

For me it doesn't put anything in my download folder. It opens in the browser in a reasonable, non-resource-abusive fashion. I'm on Safari 6/Mountain Lion, so I'm not sure in what version this happened.

mistercow · on Sept 14, 2012

OK, that's new in Mountain Lion apparently (Lion was the OS X that chased me away). I assumed it was the same as in previous versions since you mentioned Preview.app, but it appears that Preview.app isn't involved anymore when you open links in Safari; it just uses a built-in PDF viewer the same way Chrome does.

Swizec · on Sept 14, 2012

A few months after this was first shown on HN Chrome started shipping with a built-in pdf reader that looks exactly the same on both MacOS and Linux.

I always assumed it was pdf.js

fafner · on Sept 14, 2012

No, Chrome has its own binary plugin. IIRC it is closed source and not available in Chromium.

Pdf.js is developed by Mozilla and now shipped with Firefox and it works quite well so far.

esolyt · on Sept 14, 2012

That's correct. I use Chromium on Arch Linux and it doesn't come with the proprietary pdf reader by default.

mcrittenden · on Sept 14, 2012

You said "by default" which is correct but just for the sake of completeness, the built in Chrome PDF reader can easily be added to Chromium on Arch using this package from the AUR: https://aur.archlinux.org/packages.php?ID=44148

I've been using it for almost a year now without issue.

esolyt · on Sept 16, 2012

Me too! In fact, that's why I said "by default" :)

Similarly, Pepper Flash Player can also be integrated from AUR.

hey_lu · on Sept 14, 2012

There is also a Google Chrome/Chromium extension provided by the pdf.js team. (It's not in the webstore, though.)

I've used it for about half a year and I like it a lot. (Not having to download PDFs all the time and having PDFs in tabs alongside the sites I read.)

esolyt · on Sept 14, 2012

I agree. Separate windows for PDFs are terrible.

You can also separately download the proprietary PDF reader of Chrome and use it with Chromium. If you are using Arch Linux, it is available in the repositories.

knowtheory · on Sept 14, 2012

I believe all the other replies to your comment to be incorrect.

Both Chrome and Android use the skia (http://code.google.com/p/skia/ ) library as far as i am aware (see http://code.google.com/searchframe#OAMlx_jo-ck/src/third_par... )

cygal · on Sept 14, 2012

This is not a PDF reader but a graphics library.

knowtheory · on Sept 14, 2012

My understanding is that it reads PDFs: http://code.google.com/p/skia/source/browse/trunk/include/pd...

alexlarsson · on Sept 14, 2012

That code is for drawing to a pdf.

wlesieutre · on Sept 14, 2012

I believe Chrome's pdf plugin is based on Foxit

fafner · on Sept 14, 2012

I found Preview.app always annoying. E.g., the maximize windows button not maximizing the window. Sure it is better than Adobe Reader. But there are quite a few very nice alternative PDF viewers better than it. And pdf.js is great as a browser plugin because opening PDFs won't disturb your browsing any more.

thomasfrank09 · on Sept 14, 2012

I agree about Preview - but mainly I just hate having to download PDFs to read them. Most of the time I don't need them long-term.

chrismonsanto · on Sept 15, 2012

I wouldn't say it's exceedingly well done... yet. It is so slow it is unusable on my OC'd core i5, and can't search pages you haven't looked at yet. The last one is a deal breaker for me. Really the only reason I use chrome for my day to day browser is for the PDF plugin. Once this project matures I'm out.

pwenzel · on Sept 14, 2012

How well does printing work when PDF.js handles the rendering?