No, I think it is just a joke: if you parse "*2" as "times two", you could optimize it in code by shifting the bits left, since bit-shifting operations are cheaper than multiplication.
And to continue on this tangent, if your code is multiplying by the constant 2 (e.g. "x = z * 2") the compiler's probably going to optimize that into a shift anyway, so just keep the "* 2" to keep future human readers of the code happy.
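For instance, a quick sketch (TypeScript, illustrative values only) of the equivalence in question:

```typescript
// For 32-bit integer values these produce the same result, and an
// optimizing compiler/JIT will typically emit the shift on its own.
const z = 21;
const readable = z * 2;  // what a human should write
const shifted = z << 1;  // what the machine may actually execute
console.log(readable === shifted); // true
```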
I was just going to say that! Before I saw this post I was on LinkedIn looking for something, and you simply cannot miss it when something like this shows up in your address bar:
"Oh you want the URL of your LinkedIn Profile? Don't use the URL in the address bar, silly! Use this other URL we are providing randomly down the page."
I was referring to the URL scheme of the LinkedIn app in general, not just the profile page - I somehow missed that this subthread was only about the profile page. Yes, I agree they dropped the ball completely on that. Especially tragic, as they could fix the 80% case with a front-end one-liner (window.history.replaceState).
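A minimal sketch of that kind of front-end one-liner (the canonical-link lookup is an assumption, not LinkedIn's actual code):

```typescript
// Hypothetical fix: once the page loads, swap the tracking-laden address-bar
// URL for the page's canonical URL without triggering a navigation.
const canonical = document.querySelector<HTMLLinkElement>('link[rel="canonical"]');
if (canonical && canonical.href !== location.href) {
  window.history.replaceState(null, '', canonical.href);
}
```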
They are IDs used to look up closures in a database. They time out to stop the database overflowing ;) It's called continuation-based web development [1], popular with Lisp and Smalltalk-based web servers (because who else has continuations?)
The fnids do not expire randomly due to restarts; they expire when there are too many or when they time out, so memory doesn't fill up with these continuations. Personally, I don't like this continuation-based approach, since "Unknown or expired link" is a really bad user experience.
Way back I wrote a bunch of documentation on the Arc web server if you want details: http://www.arcfn.com/doc/srv.html
Look at harvest-fnids, which expires the fnids.
The fnid is an id that references the appropriate continuation function in an Arc table. The basic idea is that when you click something, such as "more" or "add comment", the server ends up with the same state it had when it generated the page for you, because the state was stored in a continuation. (Note that these are not Scheme's ccc first-class continuations, but basically callbacks with closures.)
(The HN server is written in Arc, which runs on top of Racket (formerly known as PLT Scheme or mzscheme))
Edit: submitted in multiple parts to avoid expired fnids. Even so, I still hit the error during submission, which seems sort of ironic.
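A hedged sketch (TypeScript, not the Arc source) of the fnid-to-closure lookup described above:

```typescript
// Store a closure under a random id; the id goes into the generated link,
// and clicking the link looks the closure back up. Expired or evicted ids
// produce the familiar "Unknown or expired link." message.
type Handler = () => string;

const fnids = new Map<string, Handler>();

function registerFnid(handler: Handler): string {
  const id = Math.random().toString(36).slice(2, 14); // stands in for the real fnid token
  fnids.set(id, handler);
  return id; // embedded in the page as e.g. href="/x?fnid=<id>"
}

function handleFnid(id: string): string {
  const handler = fnids.get(id);
  return handler ? handler() : 'Unknown or expired link.';
}
```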
Racket (Arc's host language) keeps continuations on the filesystem, or you can write your own "stuffer" to do what you want with them (store them in a database or whatever). But you have to keep them somewhere, or else (assuming the server uses continuations) you can't keep track of the user's path through your code as they click through links and such.
Racket does have an option to serialize the continuations, gzip them, sign them with HMAC, and then send all of that to the client so the server doesn't have to keep track of anything, but HN doesn't use it.
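A hedged sketch (Node/TypeScript, not Racket's actual stuffer API) of that serialize, gzip, HMAC-sign, and ship-to-the-client pattern:

```typescript
import { gzipSync, gunzipSync } from 'zlib';
import { createHmac, timingSafeEqual } from 'crypto';

const SECRET = 'server-side-secret'; // assumption: loaded from config in real code

// Serialize the state, compress it, and prepend an HMAC so the client can
// carry it for us without being able to tamper with it.
function pack(state: unknown): string {
  const body = gzipSync(Buffer.from(JSON.stringify(state)));
  const mac = createHmac('sha256', SECRET).update(body).digest(); // 32 bytes
  return Buffer.concat([mac, body]).toString('base64url');
}

function unpack(token: string): unknown {
  const raw = Buffer.from(token, 'base64url');
  const mac = raw.subarray(0, 32);
  const body = raw.subarray(32);
  const expected = createHmac('sha256', SECRET).update(body).digest();
  if (!timingSafeEqual(mac, expected)) throw new Error('tampered or corrupt token');
  return JSON.parse(gunzipSync(body).toString());
}
```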
Then where is the data that is associated with "b7VO4wED8MRumCeiX5fCnF" stored? How is that data requested? There certainly is a database; it is just most likely not the kind of traditional database most people think of.
Oh, wow. I had assumed that people who visited around the same time got the same next page URL, maybe as part of a caching strategy or something.
This way seems impractical, TBH. Certainly for the user - the expiration is a bit of a nuisance, as I'll get it more often than not if I read a couple of stories and then click 'More'.
I never understood why the continuation couldn't instead be addressed by a URL path. It could even get reconstructed from URL/query data if it falls out of memory, so keeping continuations in memory would only be a caching mechanism.
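As a hedged sketch of that alternative (parameter names are made up; this is not how HN actually works), the 'More' link would carry enough data in the URL to rebuild the page, so server memory becomes a cache rather than the source of truth:

```typescript
// Build a "More" link that carries its own state in the query string.
function moreLink(lastItemId: number, page: number): string {
  return `/news?after=${lastItemId}&page=${page}`; // illustrative parameters
}

// Any server process can answer it by reconstructing the listing from
// persistent data, whether or not a cached continuation is still around.
function handleMore(query: URLSearchParams): number[] {
  const after = Number(query.get('after') ?? 0);
  return fetchStoriesAfter(after); // hypothetical data-layer call
}

declare function fetchStoriesAfter(id: number): number[];
```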
Yes, but the way HN works, the continuations map URIs to browser sessions, so they can be expired on a more granular basis than all at once, or LRU, or what have you. I'm guessing here, though; I've not looked through the code.
[addendum] I reread your comment and realized that wasn't what you meant at all. What would be the benefit of path-based URIs over query-string params in HN's case? I only see how they would be equal, not better.
I think THOMAS (http://thomas.loc.gov/), the search engine provided by the US Library of Congress for searching federal legislation, has the worst URLs I've seen. Here's a random one:
While it's been reskinned, THOMAS dates back to 1995 and the core reflects a much earlier era of web development. It's slowly being replaced by congress.gov, which has far more palatable URLs:
It's clearly a set of arguments (separated by ':') to a function, i.e. it's an RPC invocation. The first seems to be the session of Congress (113th and 107th). The second, perhaps the nth item in that collection? The third, a temporary filename, probably where the search result is stored so you don't have to re-run the search when you hit "back". The last bit is probably a later addition that manages breadcrumb navigation.
Sometimes, sometimes not. You can often compress a lot of
&parameter=value
into a simple
/value/
rewrite. This - by most accounts - makes it more SEO-friendly. Granted, putting full sentences that match an article title is never necessary, but there are a lot of SEO tricks that can make a URL not only "nicer" looking but also shorter.
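A minimal sketch of that kind of rewrite (the route and parameter names are illustrative, not from the original post):

```typescript
// Map the pretty path /watch/casio onto the internal ?brand=casio form.
function rewritePrettyPath(path: string): URLSearchParams {
  const params = new URLSearchParams();
  const match = path.match(/^\/watch\/([^/]+)\/?$/);
  if (match) params.set('brand', decodeURIComponent(match[1]));
  return params;
}

// rewritePrettyPath('/watch/casio').get('brand') === 'casio'
```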
Legal portals are also often vulnerable to a form of directory traversal, where you walk up the URL hierarchy by cropping off the last path segment, e.g. /documents/17683/ would become /documents/. Doing the same thing for parameters can do wonders.
So far I've found login portals to a few banks, telecom operators, and the parliament and military systems of my country. In addition, I've hit several FTP directories of organizations such as my state's public welfare system, which included software and documents.
I sometimes report these incidents as I find them, anonymously and without contact information, since nobody ever responds to these reports.
It's an underscore followed by 4 bytes, possibly the integer 2650072859 or 468186269. If they're intentionally trying to obfuscate their URLs to prevent crawling, it might be further encrypted somehow.
"I can't imagine the skill required to do this without the experience to know it's a bad idea" (can't find the source for this quote, but you should get the sentiment).
The way you end up with URLs like http://www.tsa.gov/TSA-Pre✓™ is a CMS that replaces spaces with dashes in the title to make the URL. No skill required.
It was up until last week, but http://www.tsa.gov/tsa-pre✓™ was the canonical version. Now it redirects to the version you can type without any ALT-keyboard jockeying.
You've got to go green by conserving bits. You can use up to a radix of 36 in JS, so why use those pesky base-10 values when you can go base 36 with no additional overhead? Heck, use base 62 for the easily URL-passable values too. Just 4 chars to encode your 14M WP articles.
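For what it's worth, a quick illustration of the radix trick (the article-count figure is just the one from the comment above):

```typescript
// ~14M ids fit comfortably in a handful of base-36 characters.
const articleId = 13_999_999;
console.log(articleId.toString(10)); // "13999999" -- 8 characters
console.log(articleId.toString(36)); // "8c2gv"    -- 5 characters
// A base-62 alphabet (62^4 = 14,776,336) would squeeze this into 4 characters,
// but needs a hand-rolled encoder since toString() stops at radix 36.
```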
Well, I do remember years ago when I helped a colleague debug a problem with a web app. Seems that IE was the only browser we could find that crashed with %00 in the URL. I'm pretty certain there was a NUL byte exploit we could have dug into.
Let's put the tech sensation aside. I'm glad to know that the HN Folk have a good sense of humor :)
BTW: you can add multiple routes pointing to the same page, but allow only the SEO URLs to be indexed. This keeps the cryptic URLs around for the entertainment of the users/crawlers.
How would you do this? (Leaving aside the question of 'why?')
I suppose you could try blocking crawlers from the raw URLs with an aggressive robots.txt and then put a sitemap (with friendly/SEO URLs in) somewhere for them to discover instead. Would that work?
Paranoid web spiders could flag the site as suspicious, though. Such schemes might make it seem like the website is presenting one view to the spider, and another to real visitors. Almost like it was trying to hide malware from a scanner.
You could simply add an "index" robots directive to the page when it's accessed via /seo/url and "noindex" when it's accessed via the cryptic URL. You can also enforce that using .htaccess or nginx rules. Your framework and HTTP router class just have to support multiple URLs per page.
Basically, your CMS or framework must allow multiple routes like site.com/best/watch/casio and site.com/→@ðŋ]æ~@¢“«¢“¹²³»«@€^ linking to the same page.
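A hedged sketch (Express-style, with made-up route names) of the multiple-routes-per-page idea, letting only the SEO route be indexed:

```typescript
import express from 'express';

const app = express();

// One handler, two routes: the readable route is indexable, the cryptic one is not.
function casioPage(res: express.Response, indexable: boolean): void {
  if (!indexable) res.set('X-Robots-Tag', 'noindex'); // or emit a <meta name="robots"> tag
  res.send('<h1>Best Casio watches</h1>');
}

app.get('/best/watch/casio', (_req, res) => casioPage(res, true));
app.get('/x/:token', (_req, res) => casioPage(res, false)); // stands in for the cryptic URL

app.listen(3000);
```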
I've used that in the past to switch languages depending on URL path + browser language. /en/my-article would show that English article to a German visitor, but everything else on the site, like nav, terms, etc., would be German. To access the English site, the German visitor would have to click the appropriate flag. I could have easily added the feature to read that same article in German, by a click on a flag in the breadcrumb's mini drop-down. Example: blog»my-article[v]; a click on [v] would open blog»mein-artikel, etc.
My understanding is that you can use e.g. "rel=canonical" links to tell bots what the indexable URL of the current page is. Other tools in the box include UA sniffing and sitemap.xml.
A former coworker of mine created django-unfriendly[1], which seems like it's probably worse. On the other hand, django-unfriendly is meant to obfuscate on purpose.
Weird characters aside, having URLs of the form example.com?stuff has many advantages. For one, you don't need any weird magic to get relative URLs working properly.
What makes this bad? Or rather, what makes it objectively bad? I feel like the URL schemes espoused by people who judge them are basically BS. The success of the website, IMO, seems like the only objective measure, and by that measure pretty much any URL scheme is fine, given that some of the most popular sites use schemes that people like jakub_g complain about.
I can see where maybe an API URL might have objectively better and worse schemes, but a content URL? Show me the research results, not just fashion opinions.
Actually, thanks for that. I didn't realise so many people were against it, but I shall refrain from using the word "epic" in contexts where it shouldn't be used.
http://www.linkedin.com/profile/view?id=23081590&authTyp...