Since Adobe is taking a more aggressive stance on monetizing Acrobat, I am trying to replace selected PDF workflows with OSS. Here are some of the tools I use.
qpdf
removing passwords, unlocking PDFs, conversion
install in WSL with apt-get install qpdf
remove password with qpdf --decrypt --password="" input.pdf output.pdf
PDF4QT - Open Source PDF Editing
Deleting, Sorting, Extracting Pages
Currently, no choco release available, must be installed manually from PDF4QT/releases
Inkscape, LibreOffice Draw
editing PDFs, adding text
MuPDF
Command line tool (mutool) and Python package (PyMuPDF) for parsing, filling forms, adding text
SumatraPDF
Viewing of PDFs
pdfplumber
Awesome python package to extract tables from PDFs into data pipelines. Use with Jupyter Lab
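As a rough illustration of the pdfplumber workflow mentioned above, here is a minimal sketch. The helper names (`table_to_records`, `extract_first_table`) are made up for this example, and it assumes the first row of the detected table is a header row, which will not hold for every PDF.

```python
from typing import Dict, List, Optional

Row = List[Optional[str]]


def table_to_records(table: List[Row]) -> List[Dict[str, Optional[str]]]:
    """Turn a list-of-rows table into a list of dicts.

    Assumption: the first row holds the column names.
    """
    if not table:
        return []
    # Header cells can be None or padded with whitespace.
    header = [(cell or "").strip() for cell in table[0]]
    return [dict(zip(header, row)) for row in table[1:]]


def extract_first_table(pdf_path: str, page_index: int = 0) -> List[Dict[str, Optional[str]]]:
    """Extract the largest table pdfplumber detects on one page (hypothetical helper)."""
    import pdfplumber  # imported lazily so table_to_records works without it

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        table = page.extract_table()  # returns None when no table is found
    return table_to_records(table or [])
```

In Jupyter Lab, the result of something like `extract_first_table("statement.pdf")` (file name is a placeholder) can be passed straight to `pandas.DataFrame(...)` for further cleaning.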
FYI, you can use Firefox for viewing, signing, and adding text to PDFs.
You can also use it to remove a password (just print to PDF after unlocking the file).
I got all excited - then realised "signing" just means inserting a picture. Notably absent are open source tools for digitally signing and verifying PDFs. Apparently pdftk does it in a paid version.
It's funny in a way - in this thread we have people wanting ways to modify a PDF. Yet to me, being able to prove it's not modified (e.g., that it's a statement provably issued by some bank saying they transferred funds to my bank on behalf of person XYZ) is far more important. Instead we have companies offering paid "document signing services" which are built on sand - you can easily forge / modify any signed document they issue.
PDFTK and pdfjam are two other useful command line tools. I use PDFTK for merging PDFs, extracting/deleting/duplicating pages, and decompressing so I can extract and manipulate text/data in raw PDF commands. I use pdfjam for n-up and adjusting page size and margins.
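For anyone scripting these, here is a small sketch of the pdftk and pdfjam invocations described above, wrapped as Python command builders. The function names and file names are placeholders of my own, and it assumes `pdftk` and `pdfjam` are on the PATH; check the options against your installed versions.

```python
import subprocess
from typing import List


def pdftk_merge_cmd(inputs: List[str], output: str) -> List[str]:
    """Concatenate several PDFs into one."""
    return ["pdftk", *inputs, "cat", "output", output]


def pdftk_extract_cmd(src: str, page_range: str, output: str) -> List[str]:
    """Keep only a page range such as '2-5' from src."""
    return ["pdftk", src, "cat", page_range, "output", output]


def pdftk_uncompress_cmd(src: str, output: str) -> List[str]:
    """Decompress page streams so the raw PDF operators are readable."""
    return ["pdftk", src, "output", output, "uncompress"]


def pdfjam_nup_cmd(src: str, output: str, layout: str = "2x1") -> List[str]:
    """Place several pages per sheet; layout is columns x rows."""
    return ["pdfjam", "--nup", layout, src, "--outfile", output]


def run(cmd: List[str]) -> None:
    """Run a built command and raise if the tool exits with an error."""
    subprocess.run(cmd, check=True)
```

Usage would look like `run(pdftk_merge_cmd(["a.pdf", "b.pdf"], "merged.pdf"))`.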
For extracting to tables I've been using http://tabula.technology/ for a couple of years. It seems to do a pretty good job even with some fairly complex tables and I've not had any problems with it.
Actually, SumatraPDF is using MuPDF now. But it has some limitations in rendering PDF and eBook files, for example in the formatting of some PDF files or in displaying Unicode characters in EPUB files.
I like k2pdfopt for reformatting pdfs for my e-reader.
I've also used poppler's pdfimages, but I'd prefer something less buggy for my use case; every version I've tried had problems with one PDF made by Adobe InDesign.
Also, tesseract can create a PDF from images with the OCR text embedded. It is also built into k2pdfopt.
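A minimal sketch of the tesseract step, again as a Python command builder. `tesseract_pdf_cmd` is a made-up helper name, the file names are placeholders, and it assumes the `tesseract` binary and the requested language data are installed.

```python
import subprocess
from typing import List


def tesseract_pdf_cmd(image: str, output_base: str, lang: str = "eng") -> List[str]:
    """Build a command that OCRs one image into a searchable PDF.

    tesseract appends the extension itself, so output_base 'scan'
    produces 'scan.pdf'.
    """
    return ["tesseract", image, output_base, "-l", lang, "pdf"]


def ocr_to_pdf(image: str, output_base: str, lang: str = "eng") -> str:
    """Run tesseract and return the path of the PDF it should have written."""
    subprocess.run(tesseract_pdf_cmd(image, output_base, lang), check=True)
    return output_base + ".pdf"
```

The page image stays visible in the resulting PDF, with the recognized text stored as an invisible layer so the file can be searched and copied from.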
Okular is my go-to document reader across operating systems. In addition to PDF, it can open EPUB, DjVu, JPEG, PNG, GIF, TIFF, WebP, CBR, CBZ, DVI, XPS, ODT and other formats.
The Adobe Acrobat Reader installer is also almost a 1 GB download these days. One thing I do find that Acrobat does better is compression. I can usually reduce a PDF down to about 30%-40% of its original size without much loss in quality. I've tried other tools and they haven't worked nearly as well.
It's funny that you are being downvoted, but FR8 was way better at OCRing documents scanned on an office printer, even compared with much later versions of FR; I saw the comparison on the same source documents.
It matches the pdf style/colors to your emacs theme! Sort of like a dark reader for pdfs, but it automatically adjusts to any theme based on some good but likely imperfect heuristics.