
I stumbled upon PDF.js a while ago because I was looking for a JS tool that lets me extract data from PDFs... sadly, I'm still looking for a good lib to do just that.



Apache Tika. Works on many filetypes and languages.


If you're looking for a web-based API for manipulating PDFs, SaaSpose provides a commercially supported solution.


iText has a lot of tools for extracting info from PDFs. We use it extensively.


I have used iText(sharp) for several years. It works very well.


Yeah, we use the C# version as well. I've done some hacking on a Clojure wrapper around the Java version; it's nontrivial :P



