> For fun, though, I'll dust off an old concept since you're talking printing. One might start by printing them to a virtual screen like in Nitpicker GUI with the untrusted reader. Aside from isolation, there could be a feature to convert what's on the virtual screen or page into a compressed image. A PDF with N pages becomes a zip of N images or a single image of some size. That itself could be distributed to run in the trusted, safe viewers we already should have, right?
Which is literally what Qubes "Convert to trusted PDF" does.
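A rough sketch of the trusted half of that idea, in Python: the untrusted side renders each page and hands over only declared dimensions plus raw RGB bytes, and the trusted side refuses anything that doesn't match exactly. The names (`validate_page`, `MAX_DIM`, `BYTES_PER_PIXEL`) and the limits are made up for illustration; this is not Qubes' actual code.

```python
# Hypothetical limits for illustration; a real tool would choose its own.
MAX_DIM = 10_000        # maximum width or height in pixels
BYTES_PER_PIXEL = 3     # raw RGB, no alpha channel


def validate_page(width: int, height: int, pixels: bytes) -> bytes:
    """Accept a rendered page only if it is exactly width*height RGB pixels."""
    if not (isinstance(width, int) and isinstance(height, int)):
        raise ValueError("dimensions must be integers")
    if not (0 < width <= MAX_DIM and 0 < height <= MAX_DIM):
        raise ValueError("dimensions out of range")
    expected = width * height * BYTES_PER_PIXEL
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(pixels)}")
    # Return a plain bytes copy: nothing but pixel data crosses the boundary.
    return bytes(pixels)
```

The point of the design is that the trusted side never parses PDF at all; the only "format" it has to understand is two integers and a byte count.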
> My first solution would be improving reader security by starting with one with decent code (Espie suggested MuPDF), compiling it with something that makes it memory-safe, and running it in a sandbox on separation kernel (eg Genode or Muen). Then, a memory-safe conversion tool turns it into something more trustworthy.
It would of course be preferable to have a secure PDF reader to begin with, but the complexity of the PDF format isn't really conducive to that.
Oh, it's neat that that's what they're doing. As far as a secure PDF reader goes, you can definitely reduce the risks it poses with mitigations, which reduce headaches even when they don't reduce attacks. The ones I was thinking of do it with acceptable overhead these days. At the far end, the CPU solution already compiles legacy C to run capability-secure on FreeBSD, with the OS and CPU available to download and run. You just have to buy the board, which has other uses.
So, there are more possibilities to explore on top of these existing solutions.
> It would of course be preferable to have a secure PDF reader to begin with, but the complexity of the PDF format isn't really conducive to that.
I thought pdf.js was a JavaScript application in a browser on a full OS, with all the risks that come with that, versus memory-safe native code in a deprivileged partition or container. Web tech isn't my strong area, so I could be wrong. Do correct me if it's not browser or JS tech built in an unsafe language.
And it's a little strange your reply to memory-safe code for a PDF reader is that an "unsafe one exists, just use it" when you or your colleagues are currently applying my recommendation to the browser hosting it via Rust and Quantum.
You're doing one thing that matches the language part of my recommendation while saying we should do the opposite about a type of program that's similarly high risk. Quite the contradiction.
This is really confusing for me since you keep implying JavaScript is all we need for safe, secure, efficient, and/or low-TCB apps like this one parsing and rendering PDFs. Yet you aren't rewriting Firefox parsers and renderers in JavaScript: you are using a new language with the properties I just named. Properties shared with the safe C/Java/Ada subsets used in embedded, but with even more safety added (the borrow checker). That's probably because you didn't trust JavaScript to do the job efficiently, securely, and without leaks.
Now, you do in this thread if it involves a risky format attackers love. I don't. I think complex languages running in large apps increase attack surface. So I still recommend strong sandboxing of whatever parser/renderer one uses, plus developers in security-focused projects (e.g. Qubes) using compilers or languages that offer safety if they have resources to spare. Everyone contributing a little gives us more building blocks over time.
And as far as your other comment goes, there are always new ways to turn C code safe or secure being developed. C++ might also be able to use them via a C++ to C compiler, but it has stuff like SaferCPlusPlus to help. For C, options to try include Softbound+CETS, SAFEcode, Code Pointer Integrity, and dataflow integrity. At least three are FOSS, with one I haven't checked yet. So, they exist. They could also be in even better shape if security tool builders put more time into them.
That's all I'm saying on this, since you seem set on JavaScript for efficient, secure apps. We aren't going to agree on that premise.
I'm not saying pdf.js is fast. I'm saying that it's fast enough to be a useful tool to read most PDFs securely (which in fact millions of Firefox users do!), and it has the large advantage of actually existing, unlike complex schemes involving vaporware compilers and Ada in the kernel. (If you care about fast secure PDF viewing, write a new PDF renderer in Rust or Java or Go or whatever. This doesn't have to be a complex problem.)
By the way, SaferCPlusPlus is not memory safe, and porting a PDF rendering code base to use it would be about as much work as rewriting the renderer in a safe language.
> This is really confusing for me since you keep implying JavaScript is all we need for safe, secure, efficient, and/or low-TCB apps like this one parsing and rendering PDFs. Yet you aren't rewriting Firefox parsers and renderers in JavaScript: you are using a new language with the properties I just named. […] That's probably because you didn't trust JavaScript to do the job efficiently, securely, and without leaks.
JavaScript is a memory-safe language thanks to a well-known runtime trick called a «garbage collector» … Until Rust came along, GC was the only viable way to have a memory-safe language. Unfortunately, it has significant performance drawbacks, which make a GC-ed language unsuitable for writing a browser. But for 99% of the code written every day (including a PDF renderer), GC is a good enough solution for writing memory-safe code.
Also, Rust has been designed to make parallel code safe, something a GC can't give you.
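On the memory-safety point, here is a minimal sketch in Python of what it buys a parser facing hostile input. The record layout (a 4-byte big-endian length followed by a payload) and the name `read_record` are invented for illustration and are not taken from the PDF format. A lying length field ends in an exception instead of a read past the end of the buffer, which is the kind of thing an unchecked C parser can get wrong.

```python
import struct


def read_record(buf: bytes, offset: int = 0) -> bytes:
    """Read one record: a 4-byte big-endian length, then that many bytes."""
    # unpack_from raises struct.error if fewer than 4 bytes remain.
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    # Slicing silently truncates, so reject a length that overruns the buffer.
    if end > len(buf):
        raise ValueError("declared length exceeds available data")
    return buf[start:end]
```

Either failure mode is a clean error the caller can handle; neither exposes memory outside `buf`. Logic bugs are still possible, of course, which is where the sandboxing argument comes in.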
> So, I still recommend strong sandboxing whatever parser/renderer one uses plus developers in security-focused projects (eg Qubes)
Browsers are probably the most exposed piece of software nowadays, and the vendors already do a lot of work to provide secure sandboxing. When using JavaScript, you're using a memory-safe language in a sandboxed environment, which means you need two exploits to get out of it (a bug in the JS VM and a sandboxing bug). There's no guarantee that using another sandboxing system instead would offer better security, especially because you'd have just one layer of security.
> And as far as your other comment goes, there are always new ways to turn C code safe or secure being developed. C++ might also be able to use them via a C++ to C compiler, but it has stuff like SaferCPlusPlus to help. For C, options to try include Softbound+CETS, SAFEcode, Code Pointer Integrity, and dataflow integrity. At least three are FOSS, with one I haven't checked yet. So, they exist. They could also be in even better shape if security tool builders put more time into them.
If there's an easy way to give C or C++ code an acceptable level of memory safety, why aren't developers using it? (Don't tell me people already do, because that would be proof that those tools aren't able to reach the «acceptable level».) Note that if such a tool were invented tomorrow, it would also benefit browsers and increase the security offered by JavaScript.