In many places in the code, closures are used to handle requests (see the flink code there). If the fns* list is cleared (say, news.arc is restarted, or harvest-fnids purges them), then you'll get that message.
The use of closures in this manner means that the code needed to handle, say, a form submission is very compact and is set up at the same time the form itself is generated.
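To make that concrete, here's a rough sketch in Racket rather than the actual Arc from srv.arc; fns* stands in for the real table, and new-fnid is a made-up id generator:

    #lang racket
    ;; Rough sketch only, not the real srv.arc code. fns* stands in for the
    ;; actual table; new-fnid is a hypothetical id generator.
    (define fns* (make-hash))                       ; fnid -> closure

    (define (new-fnid)
      (number->string (random 1000000000) 16))      ; hypothetical id generator

    (define (flink f)                               ; register closure, return its URL
      (define id (new-fnid))
      (hash-set! fns* id f)
      (string-append "/x?fnid=" id))

    (define (dispatch fnid req)                     ; called when the link is followed
      (define f (hash-ref fns* fnid #f))
      (if f
          (f req)                                   ; closure still around: run it
          "Unknown or expired link."))              ; purged or restarted: the error

The closure handed to flink grabs whatever it needs (the item, the user, the page number) straight from its environment, which is why the handler code stays so small.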
I would be interested to know how much state is in those closures. If it is less than 200 or so bytes, it would not be impractical to encode it (base64) in the URL for the next page (rather than a reference to the state).
You don't want to execute code from URLs. (Yes, you can use cryptography to "sign" URLs you create. Don't try that at home unless you know the difference between MACs and hashes, and how to avoid timing attacks.)
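For the shape of it, here's a Racket sketch. hmac-sha256 is a hypothetical stand-in for a real, vetted MAC implementation (don't roll your own); the constant-time comparison is the part people usually get wrong:

    #lang racket
    (require net/base64)

    ;; Sketch only. hmac-sha256 is hypothetical; the shape is what matters:
    ;; MAC the payload with a secret that never leaves the server, and verify
    ;; with a constant-time comparison before trusting anything.
    (define secret-key #"server-side secret, never sent to the client")

    (define (constant-time-equal? a b)              ; don't leak where bytes differ
      (and (= (bytes-length a) (bytes-length b))
           (zero? (for/fold ([acc 0]) ([x (in-bytes a)] [y (in-bytes b)])
                    (bitwise-ior acc (bitwise-xor x y))))))

    (define (sign-state payload)                    ; payload : bytes
      (define tag (hmac-sha256 secret-key payload)) ; hypothetical MAC call
      (format "~a.~a" (base64-encode payload) (base64-encode tag)))

    (define (state-trusted? payload tag)            ; tag : bytes, decoded from the URL
      (constant-time-equal? tag (hmac-sha256 secret-key payload)))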
It's not a druthers kind of thing. If you need to trust that it hasn't been tampered with, you must sign it.
And if you don't care, you might as well not add authentication, because without signing it's just a fancy CRC - i.e., totally replicable by an attacker. Since cookies and links are sent over TCP, there should be vanishingly few errors in transmission; you're far more likely to introduce false positives with buggy code, and ...
You need to sanity-check your inputs anyway. Just do it. That's also how you avoid ordinary bugs.
You said "sign or sanity check it", as if you could do whichever you want. But in the context being suggested, the two have vastly different security implications.
How many corrupted web pages do you see because of CRC failure in TCP?
I wouldn't call it a closure unless it has something like a "next-instruction" field, which is probably enough to get control - and certainly enough to do nasty things (think 'debug-mode, 'restart-server, 'shutdown).
The PLT Scheme (aka Racket) folks do this. It's kind of awesome, and I used it for a side project. It works pretty well, except some browsers have a maximum URL length, so you can only safely shove so much data to the client side, and then you're stuck with old-fashioned IDs again. Still lots of fun to play with :)
Racket has support for serialization of closures and continuations, which its web server makes extensive use of. It seems like it'd be possible to expose the relevant functions to make that happen, though I have no idea how large the closures actually are once serialized.
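If memory serves, racket/serialize plus serial-lambda (from the web server's stateless-servlet support) is enough to poke at the sizes. A throwaway sketch, with the module path and behavior recalled from memory rather than checked:

    #lang racket
    ;; Throwaway sketch: roughly how big is a serialized closure?
    ;; Ordinary lambdas can't be serialized; serial-lambda builds ones that can.
    (require racket/serialize
             racket/port
             web-server/lang/serial-lambda)

    (define item-id 12345)
    (define page 2)

    (define k (serial-lambda (req)                  ; captures item-id and page
                (list 'render item-id page req)))

    (define encoded
      (with-output-to-bytes (lambda () (write (serialize k)))))

    (printf "serialized closure: ~a bytes\n" (bytes-length encoded))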
In particular, harvest-fnids has a maximum number of allowed fnids. If there are too many, it purges any fnids that are older than their expiration time, plus the oldest 10%.
Thus, the more fnids created (i.e. the more users), the sooner fnids will get harvested and you'll get the expired error.
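In pseudocode (Racket-ish, not the actual Arc; the limit and lifetime numbers are made up, and I'm assuming each table entry records its creation time), the policy described above looks roughly like:

    #lang racket
    ;; Sketch of the purge policy described above, not the real harvest-fnids.
    (require racket/list)

    (define max-fnids 20000)                ; made-up limit
    (define fnid-lifetime (* 60 60))        ; made-up lifetime, in seconds

    (define fns* (make-hash))               ; fnid -> (cons created-seconds closure)

    (define (harvest-fnids!)
      (when (> (hash-count fns*) max-fnids)
        (define now (current-seconds))
        ;; 1. purge anything past its expiration time
        (for ([id (in-list
                   (for/list ([(id entry) (in-hash fns*)]
                              #:when (> (- now (car entry)) fnid-lifetime))
                     id))])
          (hash-remove! fns* id))
        ;; 2. purge the oldest 10% of whatever survived
        (define by-age (sort (hash->list fns*) < #:key cadr))
        (for ([p (in-list (take by-age (quotient (length by-age) 10)))])
          (hash-remove! fns* (car p)))))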
It doesn't just happen when writing comments; it can happen almost anywhere on the site if you linger too long. It's the most glaring fault with the software behind this site, and it would be completely impractical if the site weren't populated by technology-minded people who aren't bothered by error messages.
I just open up each page I'm interested in reading in its own tab; I've never really had a problem with the error message unless I take too long on a comment.
I've become accustomed to getting the error when I linger on a page for a bit, and even in that context it's pretty irritating, but tolerable. Just this morning, however, I'm able to click the logo link at top-left, immediately navigate to the bottom, select "more," and get the nasty error - that's remarkably dysfunctional.
PG is using a really cool programming technique that I'm afraid is ahead of its time relative to current hardware. An upgrade to the HN server should alleviate the problem.
To see the potential, look at this code snippet from an academic paper on the topic. The web server presents a form asking for a number, then presents a form asking for another number, then displays their product. This technique makes event-driven web applications feel (to the programmer) like sequential imperative programs.
  ;; main body
  `(html (head (title "Product"))
         (body
          (p "The product is: "
             ,(number->string (* (get-number "first") (get-number "second"))))))
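The magic is in get-number. A minimal version against Racket's web server (my own paraphrase of the paper's approach, so treat it as a sketch) looks roughly like this: send/suspend captures "the rest of the program" and embeds a key to it in the form's action URL, and when the form is submitted, execution resumes right there with the request in hand.

    #lang racket
    (require web-server/servlet)

    ;; Sketch of get-number in the send/suspend style.
    (define (get-number label)
      (define req
        (send/suspend
         (lambda (k-url)                            ; k-url resumes this computation
           (response/xexpr
            `(html (head (title "Enter a number"))
                   (body (form ([action ,k-url])
                               ,(string-append "Enter the " label " number: ")
                               (input ([name "n"]))
                               (input ([type "submit"] [value "OK"])))))))))
      (string->number (extract-binding/single 'n (request-bindings req))))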
It's not so much that it's ahead of its time relative to hardware as it is something you do in the early versions of a program.
Using closures to store state on the server is a rapid prototyping technique, like using lists as data structures. It's elegant but inefficient. In the initial version of HN I used closures for practically all links. As traffic has increased over the years, I've gradually replaced them with hard-coded urls.
Lately traffic has grown rapidly (it usually does in the fall) and I've been working on other things (mostly banning crawlers that don't respect robots.txt), so the rate of expired links has become more conspicuous. I'll add a few more hard-coded urls and that will get it down again.
Over the last week the home page appears to have been cached longer than the arc timeout, no doubt due to the spike in traffic. As I throw away cookies when closing the browser, I need to log in daily. It's been impossible to log in from the HN home page because of this. Refreshing the page doesn't help; I've had to click through to a story to be able to log in.
The problem there is that we switched to a new deliberately slow hashing function for passwords.
Edit: I investigated further, and actually you're right, the problem was due to caching. It should be better now because we're not caching for as long. But I will work on making login links not use closures.
Gauche Scheme has a bcrypt implementation, but I don't know what the compatibility story is between mzscheme and Gauche. I think they're both R5RS compliant, so it should work.
I see that newer versions of Arc run on Racket, but I have no idea if that's what HN is using or not.
I haven't seen a Scheme-powered PBKDF2 implementation, so I'd guess that's out.
The only other expensive KDF I can think of is scrypt, but I would be pretty surprised if that has a Scheme implementation.
Of course, I guess pg could have decided to call out to the OS to run any of those functions too.
Out of curiosity, are there any places where the hn codebase would be smaller if you used full continuations instead of just closures, allowing code akin to what I quoted from the PLT paper?
The HN server uses a table of closures to implement those links (the id code for the closure is the bit after fnid= in the url).
When the HN server starts running out of memory, it drops entries from this table. When your browser asks for an entry that is no longer in this table, you get the "Unknown or expired link" error.
This is a crazy design, but unless someone would like to patch the source code and get PG to accept it, we're stuck with it.
It's not crazy at all; it greatly simplifies development to use callbacks for actions rather than manually encoding the necessary state into the URL. Techniques like this are what enable a single developer to be so productive: they automate the boring, time-consuming stuff.
Except it doesn't appear to work robustly, which makes it poor design. "Automating boring and time-consuming stuff" is all well and good if it actually produces a functional system, but that concern is secondary to robustness.
To you, but that's a value judgement; it was obviously the other way around for pg. Had he not taken those shortcuts, there would be no Hacker News at all; be thankful he automated that boring stuff and bothered to build the site.
It'd work robustly enough if the links didn't expire, and if we believe other posts on this page, the links are expiring due to memory limits on the system. (The other possibility is a timeout, I guess, which is easily fixed.) If it's running out of memory to store the closures it would run out of memory to store the interaction state.
In other words, there's a problem here, but it's not the programming model that pg chose.
Except the links do expire, so it's not robust. I expect that when I visit a web page, I can let it sit for an extended period of time before moving on to the next page and have it work. HN doesn't work.
Furthermore, the technique of holding important state authoritatively in memory like this is not good web-development practice, for various reasons. Doubly so if it's state data that can be round-tripped. Links should not break when the web server or cache (I'm not sure which one it is) runs low on memory. So yes, there is a problem with the programming model that pg chose.
If he hadn't used that technique, there would be no Hacker News for you to use at all. You're entirely missing the point: this is a technique to make hobby programming more fun. It's not about being robust or following best practice; it's about making programming simple enough that pg found it worth his time to build the site in the first place.
blahedo's point was that the technique was not fundamentally a problem from a robustness point of view, and I disagree with that point. It is a problem, and I was pointing that out.
Your point seems to be that since Hacker News is a "hobby project," that we may forgive sacrificing a bit of robustness to make the programming exercise more pleasant. That point was not clear to me from your original posting. Rather, the point seemed to be that the technique was good because it was clever and fun, and I disagreed with that sentiment.
PG seems to be saying elsewhere that it was used as a rapid prototyping technique. That seems to be a fair justification of the technique, in my estimation.
> If it's running out of memory to store the closures it would run out of memory to store the interaction state.
Not necessarily. The way I would approach this is to keep the link cache in memory, but have the links contain the minimal necessary state to reconstruct the link from disk-based storage in the case where the cache is gone. That gives the same excellent median performance without any breakage.
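Something along these lines (a Racket-ish sketch; fns* is the closure table discussed elsewhere in this thread, and render-front-page is a hypothetical stateless renderer):

    #lang racket
    ;; Sketch of the fallback idea: fast path through the in-memory closure,
    ;; stateless slow path rebuilt from the parameters the URL itself carries.
    (require web-server/servlet)

    (define fns* (make-hash))                       ; fnid -> closure

    (define (handle-more req)
      (define bs (request-bindings req))
      (define fnid (extract-binding/single 'fnid bs))
      (define page (string->number (extract-binding/single 'p bs)))
      (cond
        [(hash-ref fns* fnid #f) => (lambda (k) (k req))]  ; closure still in memory
        [else (render-front-page page)]))                  ; rebuild from URL state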
> The way I would approach this is to keep the link cache in memory, but have the links contain the minimal necessary state to reconstruct the link from disk-based storage in the case where the cache is gone.
It's not a cache, and you clearly don't understand the issue. These are callbacks to closures; embedding the state in the URL is exactly what they avoid, because doing that is tedious.
Sorry, bad choice of words. I was just shooting from the hip here in response to the parent that suggests you can't both keep something in RAM for the majority of cases and still make it robust.
The fact that embedding state in the URL is tedious is neither here nor there, and in any case it's a pretty garbage excuse. You know what's more tedious than writing code to pass a few integers around in links? Thousands of people losing carefully written paragraphs of enlightened prose on a regular basis. In fact, the more carefully considered the text, the more likely it is to be lost. If it took 24 or even 12 hours for links to expire, then maybe you could justify the approach, but it seems to be well under an hour on average before a given closure is purged. This site is hardly so complex as to gain much from a pure continuation approach, and if you can't ease this problem in a Lisp, then are all of us building services for non-hackers doomed to a life of bitter tedium?
You seem not to be aware of the architecture of this site: it's all run out of RAM, no database, just simple lazily loaded, on-demand files on a single server running in a single process. Because of this, memory is tight; that's why those closures are purged. It simply means pg hasn't had the time to convert more of the prototyped code into production stateless code that always works. But that's his prerogative; this site is a hobby for him, and he doesn't need to make excuses about anything. If you don't like his site, go somewhere else.
There's something on disk, isn't there? Or does a power outage mean poof, it's all gone?
Anyway, you're totally right that it's his prerogative to build a site however he sees fit, and it's my prerogative to leave, but it's also my prerogative to complain about it and call it half-assed. I do build websites as well, so I'm not just armchair commenting.
Poof, the closures are all gone; the state of the articles and comments is of course rebuilt from disk on an as-needed basis.
> I do build websites as well, so I'm not just armchair commenting.
I appreciate that; I just use a similar framework and understand why one would choose to use callbacks and never bother replacing them. It has to matter enough to bother, and to pg it doesn't yet.
Simplifies development? This is not a complicated piece of software and the techniques to build it are well known. You wouldn't need any more state than the id number you need for the callback anyway.
Building broken software is always much easier than building robust correct software so this is hardly a good argument.
No, I understand the issue. But you're presupposing a specific implementation here. If you were just designing this in, for example, PHP, then you'd just need one piece of state: the page # (for the "More" link) or the parent comment id (for the comment), and so on.
The real issue is that there's a whole bunch of saved state on the server for operations that could be (and should be) completely stateless.
No, the issue is time, specifically, pg's time; using callbacks takes less programmer time than manually building every URL statelessly. Yes, it could be done another way, but it wouldn't exist at all if he'd had to do that for every link because it'd have taken too much of his time.
What is it about Arc or the architecture of HN that makes callbacks so easy and stateless URLs so hard? I've written a forum in under 3 hours the "traditional" way in PHP. I find the stateless concept a lot simpler in general. None of these links should require any server state at all.
Generated HTML; callbacks make for rapid prototyping by grabbing necessary state directly from the environment rather than making you specify the state field by field in the URL. Your 3-hour prototype doesn't come close to the features in this forum, so it's not comparable. No links ever require server state if you take the pains to manually specify routing and parameter information for your links, but that's something callbacks eliminate the need for, by trading server state for programmer time.
I think you massively overestimate the functionality of this site. I can actually see the immense value of using closures for maintaining state in an application that actually needs state (any site containing a progression of forms, for example). But HN is ridiculously simple for a site, with few requests requiring much in the way of previous state.
I understand why it's designed this way -- he's got a tool meant for more complex tasks than what it's used for on this site. He used that tool because it's what he knows. But I really can't see it saving that much programmer effort in general.
I think you overestimate what you can do in three hours and underestimate how much non-UI stuff there is in trying to keep out garbage, prevent voter rings, score karma, vary behavior based on that karma, and probably a dozen other small things.
I certainly don't think 3 hours is enough for this site, but then pg has put in a lot more time than just his initial prototype as well. Still, the functionality of closures and server-side state isn't needed for this kind of site at all. Nothing you mentioned above seems related to that functionality either.
It's not relevant whether they're needed if you use closures as your default linking mechanism, because they're always the easier route. They're not needed for any site, ever; they're just damn convenient. You go back through the app replacing closures with more verbose, direct linking as time permits; clearly pg hasn't found the time yet.
You could route URLs for this site with a moderately sized switch statement. The code to route everything would be smaller than the code needed just to route closures, before you've set up a single one.
Obviously replacing closures with directly linking is more difficult than just using direct linking in the first place.
Incorrect; routing closures is free, automated by the web framework, which is something no web framework does for direct linking. Look at Rails: every Rails programmer spends a large amount of time figuring out and managing routing. Look at Seaside, which uses callbacks just like this site: you can build the entire app without spending a second on routing URLs, because callbacks do it for free as part of the framework, completely automated.
You get crappy URLs with closures; no other web framework produces ugly URLs like that -- but if they did, they could do it with the same amount of effort. Routing in Rails is specifically designed to separate URL presentation from the underlying action.
Hell, you can get routing for free in PHP if you just name your files like: comment.php, topic.php, upvote.php, downvote.php, etc.
Users don't care what the URL looks like; Amazon does quite a bit of business with its crappy URLs, and closures allow linking to actions without having to have a resource for every action. This is extraordinarily useful when building complex applications.
Look, I use both styles daily, and I'm telling you it'll be over my dead body before I let someone take closure-style ugly links away from me. You aren't going to convince me that manually routed URLs are always preferable.
Users who bookmark and search engines care about non-random non-expiring urls. The urls on HN are wrong in every way a url can be wrong.
Closures create a resource for every closure instance, which isn't very scalable or efficient -- it is in fact the problem with this very site. They might be useful for building complex applications, but they're just a liability and a waste for something as simple and busy as HN.
I'm not trying to convince you that manually routed URLs are always preferable; I'm saying they should be for a site like this.
No, it's a design choice: he favors ease of programming over user experience. You might not agree with that choice, but it's not a flaw; he did it on purpose and knew the consequences.
A better name might be technical debt. It is a flaw from the perspective of someone seeing the error message. But for the developer it is a way of saving development time, which can be paid off later to fix it.
You're right that pg did it this way in the beginning to save time, but years have gone by and the site is now more central to his business - especially as a tech demo. This back-and-forth argument presupposes that there isn't a better fix than the naive 'use old-style code' solution.
Anyway, the discussion is worth having. Only by pointing out problems do you fix them.
You're justifying code that doesn't work on the grounds that it was quick to write. (Facetious comment: if the website doesn't have to work, I can write the whole thing in under a minute. Someone wrote a HN clone on these lines a few weeks back, but I don't know how long it took them.)
Given that the HN code was written by an increasingly busy man in his spare time, his use of an unreliable but quick-to-write implementation technique may be an acceptable trade-off. But it doesn't make the design any less crazy.
I find it interesting that most discussions I've seen about this exact topic are about Arc and closures... instead of about the fact that this may well be an interesting programming thing to do but it's a moronic user experience thing to do.
Your comment is in a sense its own refutation, because the ultimate test of user experience is whether users continue to use the software.
Getting user experience right depends on the users. I wouldn't use this technique in an online store. Random online shoppers would be confused by expired links, and you'd lose sales. But HN users aren't confused by them. What HN users care about is the quality of the stuff on the site.
Since I can't work full time on HN, I focus on the things that matter most. What I spend my time thinking about is e.g. detecting voting rings. Those affect what you see on the frontpage, which is what users of this site care most about.
I think you underestimate how annoying the issue is. It's one of those things you put up with because of the content, but which are annoying enough that they detract from the site experience.
So far I'd rate the user experience of the site around 3/5 and the content 5/5. You don't need to work any more on the content unless it starts dropping!
I've been here a long time and seen this expired linky thing happen roughly every other week; on a very few occasions it's been an annoyance, but mostly it makes me smile; after reading news.arc, it's a reminder of what a hack HN is.
I couldn't care less if this issue got fixed. I have never once felt, "man, I'd definitely jump to another site if it didn't have this expired linky thing happen".
(Now, politics stories on the front page, on the other hand... I've often wished for a site with as good a crowd as HN but without the politics...)
You're right, but at times, I've found it nearly impossible to log in because I can't seem to get a new login URL. Nothing works except waiting it out. So please look into that if you can. I almost submitted a story like this because I've spent 5+ minutes trying to log in several times now.
That said: I come for the community - and the community has obviously noticed that the site occasionally throws up an annoying 'error'. The fact that you've done something cool programmatically has no bearing on what I get from HN.
> because the ultimate test of user experience is whether users continue to use the software.
TIL Windows has always been an amazing user experience. Just look at their numbers.
> But HN users aren't confused by them.
"About 2,200,000 results" -- says they are.
I have a question: What advice would you give to one of your YC startups if they were having this same issue?
"Ah, your users won't be confused. Ignore all evidence that says they are."
I can tell you exactly why people continue to use Hacker News. It's because it's YOUR SITE. They put up with the broken web design where links die after a few minutes.
Your site won't get beat out by a competitor because it has you, and you fund people. So people continuing to use the site is orthogonal to whether the user experience is any good. You have lots of feedback indicating that it isn't.
Aren't you the one who says "listen to your users"?
The software stores the current state in a closure. The closure gets cached. When the cache is full, the older closures get flushed, hence the error message.
Good question, but in my opinion it's somewhat rhetorical. Given bugs like these, the ongoing optimization battle, and fairly reasonable feature requests (see the huge HN topic on that), isn't it about time Paul Graham hired someone full or part time to work all these issues out? Given how important HN is to YC, I would think it's worth it. Are any of these tasks really things pg has to do or are the best use of his time?
The first complaints are from 1575 days (over 4 years) ago, including complaints about the More button breaking, so I am guessing pg has no interest in fixing it.
Or it's so ingrained in the architecture of the software that a fix isn't possible without completely rewriting it and changing the entire design philosophy.
Sort of yes, sort of no. It's a rapid prototyping technique. Essentially you fix it case by case, by taking individual bits of code that use this technique and replacing them with the uglier and less flexible but more efficient alternative of a hard-coded url.
Paul, I really do not mean any disrespect here because you are truly a class act and first rate player in the start up world. You are also a great hacker that loves to push the limits. You've created an amazing community here that I have been able to learn a ton from.
I have to ask, and I'll probably get down voted to hell because I'm naive or something, but what is so elegant about a coding technique that breaks under normal usage conditions? If I put out a customer facing piece of code, especially after 4 years, wouldn't it make sense to use an "uglier and less flexible but more efficient alternative" that doesn't break?
I understand your previous explanations of why this happens and of rapid prototyping etc. But at what point does the architecture actually get changed to eliminate this bug?
It doesn't make sense to call any specific amount of traffic "normal conditions."
What's good about this technique, and about rapid prototyping in general, is that you can write an initial version quickly in very little code, then gradually make it more efficient as the demands on the app increase.
The rate of expired links says more about how busy I personally have been lately than about the desirability of storing state in closures.
I'm not referring to any specific amount of traffic. I'm referring to how users expect a website to work. If the user sees a link, especially a More or Login link then the user expects it to do just what it says. When those don't work I would call that a bug. I'm in agreement that this technique can be useful for rapid prototyping, but I also think this site is probably the most active and mass used prototype I've ever seen. ;)
My goal for a web site or web app is to have 0 expired links. Sometimes stuff you link to outside your site will go dead, and it must be fixed or removed or whatnot. But for your own internal stuff... I don't know... something doesn't feel right about an architecture that allows that systematically. How much time could you save if you didn't even have to worry about fixing any expired links? Any idea on what the ROI on your time would be?
Anyway, just thinking out loud. Thanks again for the site though. I do indeed enjoy it very much regardless.
Back when it took a lunch break to cause this issue it didn't bother me at all. These days it happens so fast that it is constantly interrupting me while actively browsing through the site.
This happens to me a lot while reading HN. I hit "More" and by the time I am done reading a few comments on a handful of entries, the next "More" has expired, and so has the current.
Actually _right_now_ I cannot click "more" (on the first page) without hitting the "Unknown or expired link" page ... so I cannot go past the first page :/ -- Someone should submit a patch :)
Everyone who has literally answered the question "Why?" has completely missed the point. What would PG say about a primary site feature that is so completely broken that it drives users to complain actively, and maybe stop using the site? That it is their problem because they don't understand the technical details? Well, obviously, no one is losing any money here, so maybe that's the answer after all.
I also just tried to log in about 6-7 times in a row (clicking the "login" link on the front page, reloading the front page in between attempts), and I repeatedly received the "expired link" page.
It also reliably happens clicking the "next page" link on the bottom of the front page; by the time I'm done reading the front page the next page link usually expires.
The problem could be made less painful by including a link back to http://news.ycombinator.com on the "Unknown or expired link" page. That would save me fishing around with the mouse and the back button to get a new start.
I've been wondering about this for a long time, but just as a data point if anyone cares, it has reached the point recently that HN is basically unusable for me a lot of the time, and I really am starting to give up on trying and spend more time elsewhere instead.
Perhaps one visitor is no great loss -- I'm hardly the personality around here that someone like patio11 is -- but I hope my contribution is constructive, and my comment scores have always suggested so.
However, subjectively, it seems like the quality of posting and voting has taken a sharp nosedive since the "Unknown or expired link" problems have become a several-times-per-session occurrence over the past few weeks. I can't help wondering whether long-standing regular contributors are being put off as a result. If positive contributors can't even log in to refute an objectively incorrect post with a verifiable link or downvote Redditesque diversions, a downward slide seems inevitable, and then the loss of high quality posting and voting becomes a self-sustaining decline.
Indeed, that became a habit for me a while ago, after certain social discussion sites and online tools I use frequently went all Web 2.0 and broke the back button when a form submission failed, typically because the form fields were only added dynamically with JS, so when you go back they simply aren't there anymore as far as your browser is concerned. Mercifully, HN has yet to introduce that particular "improvement".
That's not really the point, though, is it? The important thing is whether posters who want to offer a useful comment and/or mitigate a poor comment can do so. Once HN gets into unknown/expired mode at the moment, it seems common that even basic things like "More" links and logging in can fail as soon as you load/refresh a page, at which point the site is effectively unusable: you can't contribute even if you have something worthwhile to add saved away in your clipboard from the previous failed attempt.
Funny, I suspect it's just the opposite. Long-standing regular contributors are unlikely to be put off by the error messages, especially if they're technically knowledgeable and understand why the error is occurring.
On the other hand, new users who might not be accustomed to The Way We Do Things Around Here would be more likely to get upset at the superficial inconveniences and leave.
It's probably even the case that improving the site or adding features to it would work against its best interests by making it more accessible. HN's implementation is such that the more traffic it sees, the more frequently those errors will occur -- and as an unexpected side-effect, the popularity and instability of the site will work against each other until equilibrium is reached.
A small barrier to entry like Reddit's spartan design or MeFi's $5 fee can go a long way toward delaying the onset of the entertainment-seeking masses.
I'm a long-time user (created: 1668 days ago), and I hate this error. I understand why it's occurring, but it seems bizarre that such an obvious flaw has gone unfixed for so long. It feels amateurish. (That said, pg has bigger fish to fry, and he's probably right to ignore this. C'est dommage.)
I'll just throw in a "me too" with the other responders and say this error annoys the dickens out of me -- and I'm a long-time user and a medium-long-time initiate in the knowledge of the error's source (I tracked it down in the source in a fit of pique about 6 months ago after getting the error for the Nth time).
Besides the actual annoyance of the error, what's extra rankling is it is an example of privileging a neat trick over user-experience, which is one of my Least Favorite Things Ever that programmers tend to do.
(As an aside, I am extremely skeptical that increasing rates of this error occurring will help keep the original user community of the site -- it seems equally likely that longtime users will just get fed up and wander off.)
While there are plenty of cautionary tales of fora that failed their original purpose due to popularity, growth, and loss of focus, there are equally many cautionary tales of fora that failed due to insular communities, group-think, and stagnation.
It's a fine line to walk and it may not be wise to rely on programming bugs to point the way.
I think some pages are generated statically and expire. Why the login/logout pages "expire", I can't really guess at.
Wow, not sure why this is so deserving of downvotes. Trying to find a source, but I thought there was a previous discussion of many HN pages being statically generated and served quickly with links that expire after a certain time (or become invalid because of what may happen on the server side of things). But oh well.
One reason might be that posts/users are getting axed. Would be nice if everything except spam was unmoderated, imo. Also, the current rate limiters to keep spam out are blocking those that would otherwise be more active.