What is the PDF format good for? Nothing (2006) (tuwien.ac.at)
32 points by pointfree on July 19, 2018 | 32 comments



PDF is good for creating documents that are decently accurate at preserving the form of the content, both on screen and on paper, while being accessible on a wide range of platforms.

Sure, there are formats better for print presentation. There are also formats that are better at on-screen presentation. But PDF offers a decent tradeoff and can be opened, dare I say, on any device.

The article is being simplistic. It completely fails to address the main advantage of PDF, which is access on all platforms.


I don't think it's fair to say that preserving page structure and formatting is never a good idea. For example, at least one good use for PDFs is sheet music/tabs. Formatting is obviously important here, as the music tends to lose meaning without it, but equally important is the page structure and the per-page nature of reading vs scrolling. As a (hobbyist) musician, when I'm playing an instrument, I don't have any free hands to scroll with, so it's important to me that I don't have to do it very often.


I happen to work in an industry (international shipping) where we scan billions of PDFs daily, and I have no clue how we would share docs with all the parties involved "sanely" if it were not for PDF. TIFF isn't good either. PDF docs are around 3-10 KB, other than physical scans of paper; "generated" PDF docs are 5 KB at best.


It's scary how differently PDFs can be output depending on the exporter. I have some Word docs for user and installation guides that can be 100% larger depending on whether the content writer saves it as PDF, exports it as PDF, or prints the document to a PDF.


Not really. Each of those PDFs has a different intent and thus a different amount of metadata embedded in it.


For the purposes of what we're doing with them, we don't care about any of that, however. The real answer would be to automate the process so that it just happens and people don't have to think about it or have the opportunity to do it wrong, but it hasn't been a high enough priority to bother getting right.


It seems that the post was written in 1997 [1], with some parts added in 2005 [2].

The author is generally a computer science professor with some idiosyncratic (at least for the current time) opinions. Though I enjoyed the classes I attended with him a few years ago and some of his points I found quite convincing.

For example, in the class on stack-based languages (Forth and PostScript mostly) he also showed how using PS instead of PDF allows him to put results from the analysis in the source code of papers and generate the plots from them. I try to mimic this in my PDFs by adding shortened links to the source and data for each plot (example: [3]), as I've often been annoyed with figures in papers that omit crucial information on what they show.

[1]: https://web.archive.org/web/19971016211209/https://www.compl...

[2]: https://web.archive.org/web/20051231210211/https://www.compl...

[3]: https://i.imgur.com/CfuYeOO.png


It's a widely used standard: The more people who can open and use PDF files, the more people who will prefer to create and distribute PDF files. Whether the PDF format is better or worse than the alternatives doesn't factor into people's decision to use it.


21 years after, nobody cares about postscript.

This ends up being a good example of how decent tech fails if not backed up by reasonable commercial strategy. In the world we live in, it's not enough to be good - heck, you don't even need to be good at anything, just be good at marketing it.


I'm not sure. "Decent tech" failed because it wasn't made available on different architectures in a decent form. Then one architecture (DOS/Windows) took off and left TeX behind. There was another in an area where typesetting mattered (Mac OS X), and that one, too, did not have "decent tech" support. Even though the open source movement probably put way more total effort into screen readers and typesetters than Adobe ever put into PDF, there was never a coherent strategy, and everybody did what this article tries to advocate:

let's make one TeX distro that's a bit better at screen reading. One that's a bit better at math typesetting. One that can also contain C programs, along with text. One that has graphs.

Let's all make them with zero integration and each of them only working on one distro, one platform (because that one's the best).

Despite what this article claims, nobody ever tried to do the minimum necessary to make the suggested solutions succeed. Instead, there was infighting and perfectionism in tiny details with zero care for the global picture.

This is the issue with academics. Call it "depth first" development vs "breadth first" development. If you want large-scale uptake, commercial success, and usefulness for many people, obviously the correct strategy is breadth first: well-supported, few-problems, many-usecase, easy-to-use, kinda-basic software. Then, over a LOT of time, make it less basic.

The way to succeed in academia is to go depth first. Make the state-of-the-art neural network for distinguishing left-handed pomeranians from mostly-left-but-sometimes-ambidextrous pomeranians by 0.005%? Guaranteed publication! 20 followup papers! Great hero! Speaker invites across the globe! Who's ever going to use it? Let's nod politely and say "many other academics".


> 21 years after, nobody cares about postscript.

Except most of the print trade.


This post - and every technical post on the internet - needs a date stamp.

Please never blog or post anything without putting a date on it.


In counterpoint: please look at Last-Modified headers.


Because they are not always present, not convenient, and don't necessarily reflect the actual date when the content was written / updated. Maybe a file was copied and the modification time changed although the content on the page didn't. Maybe the author fixed a spelling mistake 10 years later but didn't update anything else.

I want a human-decided, human-readable date on the page for the same reason I want a human-readable headline. For a technical post, they're about equally important, and should not require examining protocol headers to determine.
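
For anyone who does want to check the header discussed above, here is a minimal Python sketch. The function names `parse_last_modified` and `fetch_last_modified` are made up for this example, and the code only reports what the server sends; as noted, the header can be missing or can reflect a file copy rather than a real content change.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional
from urllib.request import Request, urlopen


def parse_last_modified(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Last-Modified date from a header mapping, or None.

    None means the header is absent or not a parseable HTTP date.
    (Hypothetical helper; a plain dict lookup is case-sensitive.)
    """
    value = headers.get("Last-Modified")
    if value is None:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None


def fetch_last_modified(url: str, timeout: float = 10.0) -> Optional[datetime]:
    """Send a HEAD request and return the server's Last-Modified date, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_last_modified(response.headers)
```

Calling `fetch_last_modified` on a page URL returns either a datetime or `None`, which is exactly the ambiguity the comment above complains about.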


Also research papers please.


"...Acrobat reader for DOS that I have is dog slow..."

We're done here.


Adobe Reader was always dog slow across all platforms, compared to alternative PDF readers.

And his other points are still valid, even more so now that smartphones are everywhere. The platform should be able to adapt content to best utilize resources (screen size). See how e-books are not published in PDF format. E-book readers mostly support PDFs, but it's a painful experience to actually read through them.


While I think the author makes some good points about the PDF format itself, this article is painfully outdated with many complaints about Acroreader and Netscape which are no longer remotely relevant.

Though the article does serve as an interesting time capsule. Kinda shows you how the more things change, the more they stay the same.


So one of the common uses for PDF is filling in digital forms, typically published by your government. I didn't even see this addressed in the gigantic rant by the author.

Instead he complained that Acrobat Reader for DOS is slow and offered other random anecdotal evidence, none of which IMO supports his claim that PDF is good for nothing.


What's the benefit of PDF forms over HTML forms? Other than being able to share the "digital" and dead-tree designs, which may be an advantage for someone, but not for the person filling it in.


The benefit is that you can print the PDF form and fill it in on paper (such as handing out the printed doc at the reception), or you can fill it in on your computer and print the result, preserving page layout.


Also, you may need a signature on the document, which you would do by hand, while filling out the other fields with the keyboard.


PDF is mostly a subset of Postscript. It improves on Postscript in terms of speed and security.

In contrast to Postscript, it is not a Turing-complete programming language. What's left is a bunch of commands placing graphics objects on pages.

That means PDF viewers cannot be DoSed in an obvious way by placing infinite loops in the document.

It also means that any page in a document can be directly rendered, without rendering the rest of the document first. That's not possible with Postscript, because Postscript documents are programs that need to be executed from the start.
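
To make the direct-access point concrete: a PDF file ends with a `startxref` pointer to a cross-reference table that lists the byte offset of each object, so a viewer can seek straight to the objects a page needs. Below is a rough Python sketch, not a real parser. The names `build_sample`, `find_startxref`, and `read_xref_table` are made up here, the sample file is a toy, and the code assumes a classic (uncompressed) xref table with no incremental updates.

```python
import re


def build_sample():
    """Build a toy PDF-like file with two objects and a classic xref table."""
    data = b"%PDF-1.4\n"
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    offsets = []
    for obj in objects:
        offsets.append(len(data))  # byte offset where this object starts
        data += obj
    xref_pos = len(data)
    data += b"xref\n0 3\n0000000000 65535 f \n"
    for offset in offsets:
        data += b"%010d 00000 n \n" % offset
    data += b"trailer\n<< /Size 3 /Root 1 0 R >>\n"
    data += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return data


def find_startxref(data):
    """Return the byte offset written after the last 'startxref' keyword."""
    pos = data.rfind(b"startxref")
    if pos == -1:
        raise ValueError("no startxref keyword found")
    match = re.match(rb"startxref\s+(\d+)", data[pos:])
    if match is None:
        raise ValueError("startxref is not followed by an offset")
    return int(match.group(1))


def read_xref_table(data):
    """Map object number -> byte offset for in-use entries (classic table only)."""
    lines = data[find_startxref(data):].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("not a classic xref table (could be an xref stream)")
    table = {}
    i = 1
    while i < len(lines) and lines[i].strip() != b"trailer":
        first, count = (int(x) for x in lines[i].split())
        for n in range(count):
            offset, _generation, kind = lines[i + 1 + n].split()
            if kind == b"n":  # 'n' marks an in-use object, 'f' a free one
                table[first + n] = int(offset)
        i += 1 + count
    return table
```

With `data = build_sample()`, `read_xref_table(data)` gives the offset of each object, and slicing `data` at that offset lands directly on the object without reading anything before it.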


PDF allows JavaScript, making aspects of it Turing-complete.
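
As a rough illustration, scripts in a PDF are attached through dictionary entries such as `/S /JavaScript` with the code under a `/JS` key, so a crude check is to look for those names in the raw bytes. This is only a heuristic sketch under that assumption: the helper name `mentions_javascript` is made up, and it will miss names that are hex-escaped or that sit inside compressed streams.

```python
import re

# PDF name objects that mark a JavaScript action or its script payload.
JS_MARKERS = re.compile(rb"/(JavaScript|JS)\b")


def mentions_javascript(data):
    """Return True if the raw bytes contain a /JavaScript or /JS name.

    Heuristic only: escaped names and compressed object streams are not handled.
    """
    return JS_MARKERS.search(data) is not None
```

A hit does not mean the script does anything harmful, and a miss does not prove the file is script-free.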


Math! Math typesetting is still a huge problem -- same as back in 1997 with a few improvements here and there.

I do remember the days when postscript was far more of a default, or dvi -- there were people who refused to use pdf format at that time. Almost no one has stuck with that battle.

Standardized, high-quality math typesetting in html has still just not quite happened although it's improving continuously.


Pagination is better than scrolling on mobile devices. I often prefer it on a desktop screen as well.

I read PDFs all the time on my iPad Pro. They fit the form factor perfectly.


Acrobat Reader is dog slow on...DOS? When was this published? The title should have [year].


The Internet Archive's earliest capture is from October 16, 1997.

http://web.archive.org/web/19971016211209/http://www.complan...


The Acrobat Reader specified in the post is 3.0 which was out around 1997/98.

This post has missed the whole point of PDF files.


The Last-Modified header says 2006, some of the links date to 1998


Archive.org has the first snapshot of the page in 1997


Can we get a title update to 1997? (First published year)



