The Sunset HTTP Header Field (ietf.org)
159 points by okket on May 16, 2019 | 82 comments



I'm aware that the following is typical HN middlebrow matter, but I'm asking anyway. Does anyone know why RFCs are still formatted as if they were written on a typewriter in the seventies? I mean, here's a sentence quoted verbatim from this document:

    Sunset header fields will be served as soon as the sunset date is

    Wilde                           Informational                     [Page 9]
    --------------------------------------------------------------------------
    RFC 8594                        Sunset  Header                    May 2019

    less than some given period of time.
What problem does this solve? How is this more useful than it is cumbersome? Also, how do people write this? Do they manually space space space space align the footer and then copy it every 30 or so lines?


Don't worry, there's an RFC for that!

RFC 7990 -- RFC Format Framework: https://tools.ietf.org/html/rfc7990

There are various tools to automatically format things as necessary, just like any other kind of text wrapping.

As far as the overall "philosophy" behind keeping it this way, the honest answer is that the IETF is just a particularly unlikely group to change things without a clear need, and there are likely all sorts of tools small and large that expect RFCs to follow these conventions at this point.


As an example of this (though for a non-RFC document):

Here's the "source" XML that is authored: https://openid.net/specs/openid-connect-core-1_0.xml

That can be compiled into this HTML: https://openid.net/specs/openid-connect-core-1_0.html

Or to this RFC-like plaintext: view-source:https://openid.net/specs/openid-connect-core-1_0.txt

Most new RFCs are authored this way.


First line in the Abstract for that RFC:

> In order to improve the readability of RFCs while supporting their archivability, the canonical format of the RFC Series will be transitioning from plain-text ASCII to XML using the xml2rfc version 3 vocabulary;

Is it readable? Yeah

Is it archivable? Yeah, XML is (AFAIK) one of the most closely followed standards I can think of.


The legal world works the same way too. They're amazingly low tech. A lawyer in my family mentioned that it's basically a combination of:

- everyone can do it with any software (or even a typewriter)

- consistency with legacy documents. The format doesn't just change on you as you're reading through legal history

- it works fine, why change it?

I'd also add a guess:

- There's no room for implementation detail to affect formatting. Last thing you want is a whole bunch of formats that are similar but not identical, just because someone's software is a bit different.

- could you imagine trying to get everyone to change? We should be so lucky that everyone's already this consistent


> could you imagine trying to get everyone to change?

Yes, this is exactly what I do for a living, consolidating policy and legal documents and their related business workflows into modern applications. My requirements for how the text editors work are far more meticulous than your average app exactly for the reasons you stated. Concerns with formatting that most products would blow off as trivial are deal-breakers in this industry.


Is there a reason to not store the document in an abstract format that is more easily handled by systems useful for legal analysts (e.g. giving you the ability to diff text), and just “renders” to the accepted format? (I’m picturing storing the docs as LaTeX, but anything like that would work. Maybe there could be a legal “theme” for a markdown processor, for example.)

Because, in such cases, it wouldn’t really matter if the editor renders the source to text incorrectly, as long as the proofer renders it correctly. Just like with WYSIWYG desktop-publishing software.


Storage formats aren't the issue. We diff and merge documents just fine, and do render them in different formats in some use cases for specific audiences. Nor is it about a final rendered document. It is the details of the workflows and collaborations that happen before a document is ever finalized where the editing and reading experiences must match.


Very likely: because they are managed by software that has existed for decades, built when expectations and needs were very different from today's, but which works well and correctly. Why invest time and effort into changing a system that is working well enough? Whether you need to view it on a screen or print it out, this format works.

Also, I guarantee there are any number of downstream consumers of RFCs which take this sort of format as a given, and which will break on even a minor change. And why break those downstream systems if you don't have to?

Basically, any changes will break something. So the benefits of the changes need to be bigger than the costs of the changes. Not to mention the cost in wasted time of all the humans bikeshedding how to change it to make it "better".

Dealing with the ongoing cost of humans having to read across artificial page breaks is a pretty minor concern compared to the costs of all that.


You can read it with any software you like, now and in 30 years when Microsoft Word is a quaint relic in a museum.


How does that not hold for a plain .txt document without page-marker-ascii-art?


I presume there exists a documented way to print them so that everything lines up properly.


The files contain ASCII form-feeds between pages. If you send that directly to a printer, it will cause it to start a new page.
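If you ever want to re-split one programmatically, a minimal sketch in Python (assuming a local copy of the RFC text file) is just to split on that character:

  # Split an RFC plain-text file into its pages using the ASCII
  # form-feed character (0x0C) that separates them.
  with open("rfc8594.txt", encoding="ascii") as f:
      pages = f.read().split("\f")
  print(len(pages), "pages")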


As I first encountered MS Word running on a Xenix system 31 years ago, I suspect it will be alive and well in 30 years' time!


To be fair, I recall having conversations in the 1990s about how MS Word was such an evil proprietary format and that one day we wouldn’t be able to read it. And here we are nearly 30 years later and Word docs (in a much evolved file format) are still here and still widely supported and easy to read using many tools, including open source ones.

Not saying that proprietary formats aren’t still a bad idea for other reasons, but predictions of unreadability don’t seem to have panned out for any common file formats.


How do you open Word documents from the '90s?!? Do you have a Windows 95 VM or something?

Even with modern Microsoft Word, the formatting of old documents is often mangled.

To this day, up-to-date PowerPoint can’t reliably display presentations made with up-to-date PowerPoint on a different machine, let alone OS!


> How do you open word documents from the 90’s?

LibreOffice


I open them in Word 2016.


> still widely supported and easy to read using many tools, including open source ones

Only someone who has not tried could possibly say that.

The numerous doc file formats are a constant headache for anyone doing document processing. Not even Word itself can read its own older formats reliably. Sometimes you have better luck with LibreOffice, sometimes not.

And that's the most widely used document file format. Anything else from the same era is completely dead in the water. Manually viewing them can be done in emulators with a bit of work, but any automatic processing is a huge undertaking.


Good luck editing these WordPerfect and CorelDraw files!


I put my vaccination history into a ClarisWorks document. At least, I assume I did from the file name…

Could be worse. My dad used a video tape format even more obscure than Betamax.


Yikes. At least laserdiscs weren't homemade so I could replace what little I had.


I'm not sure about Word docs, but FWIW 90s era Excel files have become progressively harder to open.


Unlike with Word, I actually spent a few years of my life working on this.

At the surface layer, this era of Excel ("BIFF" documents) isn't too bad: getting, say, a table of small integers representing people's annual salaries out of an XLS file is very doable, and many programs today will get that right.

As you start to dig down it gets nastier pretty quickly. Formulae require implementations that match not just what Microsoft's published documents (I have loads of these on a shelf I rarely look at now) say, but what Excel actually did, bug for bug, back in the 1990s. Maybe the document says this implements a US Federal tax rule, but alas Excel got the year 1988 wrong, so actually it's "US Federal tax rule except in 1988".

You also run into show stoppers that prevent the oft-imagined "Just transform it to some neutral format" because Excel isn't a typed system. What is 4? Did you think it's the number 4? Because the sheet you're trying to parse assumes it's actually the fourth day of the Apple Macintosh epoch in one place, but in another place uses it to index into an array. Smile!
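To make that concrete, here's a small Python illustration of the same raw value meaning different things (hypothetical cell value; the 1904 system is the classic Mac default, the 1900 system the Windows one, and Excel's famous 1900 leap-year quirk only matters for later dates):

  from datetime import date, timedelta

  serial = 4  # the raw cell value; the file itself carries no type information for it
  # As a date under the 1904 (classic Mac) date system:
  print(date(1904, 1, 1) + timedelta(days=serial))    # 1904-01-05
  # As a date under the 1900 (Windows) date system:
  print(date(1899, 12, 31) + timedelta(days=serial))  # 1900-01-04
  # Or it might simply be an index into a range somewhere else in the sheet.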

Finally in complicated sheets (often "business critical") there's a full-blown Turing complete programming language, complete with machine layer access to the OS. Good luck "translating" that into anything except an apologetic error message.


> Good luck "translating" that into anything except an apologetic error message.

I'm going to have to steal that line. :)


> Does anyone know why RFCs are still formatted as if they were written on a typewriter in the seventies?

They are formatted in plain text with fixed page sizes because that's what they've always done, it works fine, and there's no compelling reason to change.

> also, how do people write this?

The thing about keeping the same format for a few decades rather than changing it with each shift in popular fashion is that there is plenty of supporting tooling.

https://www.rfc-editor.org/pubprocess/tools/

https://tools.ietf.org/


It's super readable with any plain text editor or browser. And it's really nice to read something not filled with images, crazy fonts, colours and other junk. It's just pleasant.


It sucks on e-readers. It would be much nicer if there were no headers and footers, and no newlines except between paragraphs...



It could help applications using capability URLs, like in password reset links. Maybe the exact date isn't important, but the problem with capability URLs is that they contain a (temporary) secret, yet browsers and servers happily log all requests, even if that part of the URL is protected through TLS (edit: to outsiders).

Maybe having such a field will help treat those URLs differently from "normal" ones, so that the secret is better protected.

edit: I failed to read the question correctly.



Here's the Sunset RFC draft converted from XML to JSON.

https://www.dropbox.com/sh/duhmxzaehy0dwuc/AADyKPN5UVU1HKT9M...

Looking at the JSON, the structure is pretty basic. You could see it being rendered in any format/style pretty easily.



Yes please? What tool is this? Are you suggesting that ASCII art is required for paging text?


Lynx[0], a text-based browser (now primarily) used for bash scripting.

[0] https://en.wikipedia.org/wiki/Lynx_(web_browser)


I can use Lynx for bash scripting?


It can be used in non-interactive mode from scripts, e.g.

  lynx -dump $URL > $FILE
This is a simple way to extract the content of a web page as plain text.


Interesting. Even as an experienced web scraper, it would have never occurred to me to use Lynx.


Wow, it sure looks pretty in lynx.


Evidently. Why does the text exist twice? Once in blue and green, and once in a smaller dark shadow behind it?


I believe it's a CLI tool to display RFCs, on a translucent terminal, with a browser window behind it. That still leaves other questions...


Google could use this whenever they launch a new service!


This RFC is really indicative of what I hate about certain standards.

This is all very abstract. No user agent will use this in the same way. You'll end up writing code per application anyway. All this is is a convention that people will read their own interpretations into. This is not going to be interoperable.

The epitome of this is ActivityPub.


What does it matter what the UA does with the information? The point is that the information is supplied; the UA can derive whatever benefit from it that the user wants.

For example, I was just thinking that forum software could have their backends scrape image links you attempt to embed; and, if they find that the link has a sunset policy (e.g. it was from an anonymous image upload service that expires posts after a week), then they could rehost the image instead of hotlinking to it. (Or, cache the image for now, and then rewrite the link from a hotlink to a cached link when the deadline hits.)
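A rough sketch of that check (hypothetical helper; assumes the Python requests library, and the real rehosting logic would obviously be application-specific):

  import email.utils
  import requests

  def should_rehost(url):
      # Rehost instead of hotlinking if the resource announces a Sunset date.
      resp = requests.head(url, allow_redirects=True, timeout=5)
      sunset = resp.headers.get("Sunset")
      if sunset is None:
          return False
      # The value is an HTTP-date, e.g. "Sat, 31 Dec 2019 23:59:59 GMT".
      goes_away = email.utils.parsedate_to_datetime(sunset)
      print(url, "goes away at", goes_away, "- cache a local copy")
      return True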

That’s not something you would standardize behaviour on. But, once you have servers providing the information—not to a particular client, but just spewing it out as an objective fact out into the world—clients can do all sorts of things.


Standardisation does nothing for this. The software could've always just used an X-Sunset header or just another field in its DTO.

My point was that there will never be a client that will blindly read that value and understand what to do with it. Which is why the only difference from "X-Sunset" is that there's an RFC with suggestions on how it might be interpreted, along with a syntax restriction.

ActivityPub is the epitome of this because they supply a lot of vocabulary and syntax variation to express it (including very complex ways to normalise that syntax; JSON-LD is not fun to parse) but don't actually make implementations commit to anything. It's setting up a bunch of arbitrary rules for nothing. Every single AP implementation still orients itself around compatibility with Mastodon.


> Standardisation does nothing for this

Sure it does. If the information is standardized then the community as a whole can innovate around it. Browser plugins and the like can be made that make use of the information, etc.


Or HTTP-based APIs could check for it and write out log entries or send out alerts to notify ops or the devs that they'll need to investigate further as to how this affects them.
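For instance, a thin client wrapper could do that check on every call (just a sketch, assuming the Python requests library and the standard logging module):

  import logging
  import requests

  log = logging.getLogger("api_client")

  def api_get(url, **kwargs):
      resp = requests.get(url, **kwargs)
      sunset = resp.headers.get("Sunset")
      if sunset:
          # Surface the announced retirement date so ops/devs can plan a migration.
          log.warning("Endpoint %s is scheduled for retirement on %s", url, sunset)
      return resp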


It's not surprising: almost anyone can publish an RFC by going through the appropriate bureaucracy, so there are plenty of RFCs that no one really cares about.


ActivityPub is more about living in a small part of a very big linked data graph and letting very different kinds of apps federate amongst themselves disjointly, encouraging more diverse apps that can actually interoperate with these disjoint groups and provide niche value without the networks being a walled garden.

Very different than an HTTP header.


ActivityPub is too complicated and loosey-goosey. There are a couple of dozen partial implementations out there, but most of them don't federate properly or at all (which I thought was the entire point of the spec!).

Ideally such a standard should be really easy to implement, so we get true diversity, and democratisation, but implementing ActivityPub is a nightmare. Of course it doesn't help that the test suite has been down for months. https://test.activitypub.rocks


(You should note you're telling the author of go-fed how hard it is to implement AS)


Ha! Do you agree?


ActivityPub is not easy to implement for two reasons. One: most applications are not linked-data aware, which, depending on the tools, could require different ways of retrofitting an existing application. Two: the tooling around the loosey-goosey behaviors isn't as well defined as, say, something like Swagger. Since AP is machine readable and RDF based, it shouldn't be a stretch to evolve an ecosystem that helps developers.

I chose to go down a very difficult route with go-fed, something I don't think anyone else has done with JSON-LD. So my perspectives on the challenges in the area are very warped. For example, over the course of a mere 8 hours of work I've used go-fed to get my personal blog federated. But I stand very biased.


Hey CJ.

I don't think that graph will ever be readable without vendor-specific hacks applied for every piece of software participating. AS will meet the same fate as XMPP. Some very core features might work, but the diversity in user agents and servers and their quirky implementations will ultimately be to its detriment. Maybe that's leftover pessimism from the time I participated in the XMPP WG. I'd love to be wrong.

I have a lot of respect for implementing AS in Go. I tried and didn't even finish my relay implementation. It was just too counter to the Go core competency and mindset. Or I'm just not a fan of wrangling interface{}s around.


Hey! I totally understand the pessimism. I am untainted by XMPP for better or worse.

I started typing out a reply on my phone but I think I need to sit with a keyboard and just make a blog post on it -- I hope you don't mind a wait. I hope I'll do a fair job characterizing the problems you identify and give them my alternate outlook.


In this particular case, this RFC isn't Standards Track; it is merely Informational, presumably partly because it isn't normative (it's not about behavior but about conventions/hints).


This RFC reserves an HTTP header name. It is not just informational.


Many capital-I Informational RFCs reserve things like HTTP header names. Some of the easiest examples are April Fools RFCs that reserve all kinds of things, and while some people do like to use HTTP 418 I'm a Teapot (https://http.cat/418), that's not a capital-S Standard.

The IETF has workflows for Informational things versus Standard things. It's still useful to "reserve" things in Informational RFCs in case the RFC later moves to the Standards Track. Also, as some of the April Fools RFCs seem to indicate, the internet takes its jokes pretty seriously, and you never know when an Informational thing may be adopted just because it was interesting to some developer for some project.


I'm kind of torn here, because I could see this being really useful for web archivists -- e.g. the people who jumped to try to preserve as much of Tumblr as possible as that seems to be "sunsetting" -- but the sites that are actually likely to configure this properly are the very ones I'd expect to have some of the best already-existing archives of. If someone cares enough to set this header, they probably also care enough to try to preserve things reasonably well, or at least have users who do.

The other use cases listed in the RFC don't seem incredibly compelling. Anyone have one that comes to mind?


The most compelling use-case for me is web API deprecation. It very clearly tells you when a service will no longer be available. Currently there is no standard mechanism for this; it's mostly driven by documentation. If an API used this, then a consumer could look for it and raise alarm bells.


There are also plenty of services that host user content, but only for a short period of time. Pastebin entries with a set expiration could for example set this header to inform clients about the expiration date. That way software can detect links to content that will expire and act accordingly.
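On the serving side it's a one-liner to emit. A minimal sketch using Flask (hypothetical in-memory paste store, with each paste's expiry stored as a UTC datetime):

  from email.utils import format_datetime
  from flask import Flask, abort

  app = Flask(__name__)
  PASTES = {}  # paste_id -> (text, expires_at); hypothetical store, expiry as UTC datetime

  @app.route("/paste/<paste_id>")
  def get_paste(paste_id):
      if paste_id not in PASTES:
          abort(404)
      text, expires_at = PASTES[paste_id]
      resp = app.make_response(text)
      # Tell clients when this paste will disappear (RFC 8594).
      resp.headers["Sunset"] = format_datetime(expires_at, usegmt=True)
      return resp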


Yes, and I think the smart thing about this is that they clearly came from a stance of deprecation and realized they could make it more general. Good work, imo.


That is the first thing I thought of. We had several cases of partner APIs being deprecated where we were not notified in a timely manner. This could be used to trigger monitoring alerts.


I am not sure if Sunset is the best term here. Sunsetting in such a context would indicate retirement:

https://en.wikipedia.org/wiki/Application_retirement

However, the example given is indicative of a session duration, not a sunset period.

> For example, a pending shopping order represented by a resource may already list all order details, but it may only exist for a limited time unless it is confirmed and only then becomes an acknowledged shopping order.

Admittedly, naming things is hard; perhaps this could have been called Expiry:, Unavailable-After:, or Valid-Until:


I hate Sunset too, albeit for a different reason. It's businessese, like Legacy, Utilize, and for that matter, Retire (in this sense).

Choose the word that is old, short, and plain. Sunset is a little better than others because at least it is poetic. But for my HTTP headers I would prefer something a little more straightforward, rather than euphemistic --- like Expires, but that's already taken.


Well, we already have Expires for marking responses as stale in cache, and it also takes just a date string, so things could get confusing there.
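Both take an HTTP-date, so a response could end up carrying the two side by side (illustrative values only):

  Expires: Thu, 23 May 2019 08:00:00 GMT
  Sunset: Sat, 31 Dec 2019 23:59:59 GMT

Expires only says when a cached copy goes stale; Sunset says the resource itself is expected to stop being served.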


Perhaps we could Sunset the Expires header first...


My impression was the shopping example would be more aligned with something like delivery info -- it's not there just for a session, but a month after the package is marked "delivered" it's probably at least marked for deletion.

(To be fair, I'm not sure announcing that kind of mid-term expiry is a real need-to-be-filled in today's web.)


I'm afraid this won't be useful in the real world. API consumers who continue to use legacy endpoints after being told not to are unlikely to update their code to listen to a new HTTP header. This strikes me as a great idea in theory, but unlikely to be useful to websites actually looking to deprecate endpoints.


I don’t think this is intended to be about endpoints, but rather about the resources under an endpoint. E.g., “this image will be deleted from this image-hosting service in 30 days” or “this is a link to an item in your Dropbox Trash folder, and the Trash is emptied every 48 hours” or even “this is a temporary share link and will work for 24 hours.”

Given these use-cases, I could also see the use of a version of the header that doesn’t specify when the retirement of the resource will happen, but just specifies that the resource is, by design, not going to stick around forever. E.g. the URL of a “previous version” of something, where the system only keeps around N previous versions (so as soon as someone adds enough newer versions, the version you’ve linked to will disappear.)

Another fun use-case is sticking this header on everything on a given domain (by e.g. reconfiguring your web load-balancer to emit the header.) Archive.org could then use this as a signal to automatically prioritize archiving content from the domain, before any human being realizes the service is being retired and prioritizes that archiving.


At WS-REST 2015 (I think), I talked about this with Erik Wilde (@dret), and we discussed the fact that there is no good way to deprecate a bookmarked URL or a URL stored in an uncommitted transaction. The idea was to add information so clients and user agents could keep track of resources: you can check a resource before its sunset date, and the resource could then change the sunset date or redirect you to the new URL.


I'm happy to see this is finally entering the RFC phase. I've already seen this used in the wild for an API (Imgur [0]), and I was surprised when they linked to the proposal from 2017 [1].

[0] https://apidocs.imgur.com/?version=latest#api-deprecation

[1] https://tools.ietf.org/id/draft-wilde-sunset-header-03.html


Terribly vague naming. As a non-native speaker I would never have guessed the purpose, and I've been in the English-speaking tech sphere for many years. Does anybody actually say "this website will sunset in 5 years"?


I've heard the term used for deprecation before, but not in a long time. I agree it's not great. At least they spelled it right, though!


Tech is never really deprecated; instead, like the setting sun, it simply departs to rise again another morning (when a web developer independently re-discovers it).


Some big companies do, but I've never heard a startup use the term.


I really like this idea. I used to work in advertising/marketing and we were launching a lot of purpose-built, short-lived landing pages, e.g. to collect entries in a contest, then announce the winners. These websites were live for a period specified in the contest terms, then usually archived. I could see myself routinely implementing this header in such cases.

Also, from the resource consumer's point of view, this could help guide efforts like archive.org to make snapshots before it's too late.


Was looking for a way to tell API clients about endpoint deprecation and stumbled upon another RFC from Feb 2019: https://tools.ietf.org/html/draft-dalal-deprecation-header-0...

Actually, "E. Wilde" is on of the authors there too.


That is an early-stage draft, not an RFC.


tools.ietf.org (and only that site) seems to be down for me at present, failing in the TLS handshake, PR_CONNECT_RESET_ERROR.

Here’s another URL for the content: https://www.rfc-editor.org/rfc/rfc8594.txt


FWIW tools.ietf.org is working for me.


The header that inspired Evan Spiegel



