Show HN: Two-faced URLs (brokenthings.org)
85 points by ipsin on July 6, 2011 | hide | past | favorite | 16 comments



This is a URL shortener that gives you a link that sends "preview" processes one place and actual users somewhere else. It works against Google+, Facebook, and several other "preview" bots. It's a proof of concept (I don't like URL shorteners in general), so it only links to a restricted list of sites.


Yet another reason for URL shorteners to be banished from the web (not that this will ever happen). Besides the arcane input requirements of Twitter and the half-broken behavior of certain email clients, I haven't found a good use case for them that couldn't have been solved better with proper link formatting or better content presentation.

When the web is just a sea of opaque pointers to pointers to pointers, with various hilarity like these URLs mixed in, we'll all wonder why the hell we stood by and let it happen.


I'm biased, since I have run more than one shortener/tracker myself. This isn't meant to be a rant, more of a defense for the uninformed.

It's a bit naive to think that most websites are referenced by a simple link to static content. The web has become complex, and URLs now reflect that. Short URLs are useful to many people for many different purposes.

For an example of how complex URLs have become, take a look at SEO'd blog-post or newspaper URLs. They have become strings of keyword-rich terms meant to improve search ranking. The content served is assembled dynamically and may in fact change over time. Ads, sidebars, and comments are pulled in after the fact, often based on the content, the user's origin, or whether they have a cookie. It's complicated, and URL redirection (which may already be happening in this flow) doesn't add much to it.

Advocating for "proper link formatting or better content presentation" reflects the fact that you've never had to dig deep into the complex SEO, user, customization, and other general business requirements that arise when running a CMS.

Short URLs are valuable because they let people simplify increasingly long URLs for use in many different media, without worrying about breaking or losing portions of the URL.

Yes, they often do allow the creator to change them, but that's no different than what any website operator can do natively on their website.

There are problematic situations that can arise, such as when a URL shortener goes down. It has happened, and it hasn't caused any catastrophic problems. It's inconvenient, but it usually only affects old content on Twitter or Facebook. I'm not saying it's ideal, and it definitely doesn't add value to the internet, but it's not that bad.

Regarding the 'arcane' input requirements of Twitter or Facebook: I think you've missed the point of these networks. Twitter's 140-character limit is part of what defines it. There will always be short-message media; they exist because people want short bursts of information. They aren't going away.

I hope that helps explain that shorteners arose because there was a need, not just to add complexity or more evil to the internet.


> Advocating for "proper link formatting or better content presentation" reflects the fact that you've never had to dig deep into the complex SEO, user, customization, and other general business requirements that arise when running a CMS.

Erm. I don't think you understand what I meant. I meant: if a URL is too long to fit within a block of text in your medium, enable hyperlinking it with short, descriptive link text, the way just about any medium other than Twitter or plain text (or HN markdown, but I digress) allows. Have we already regressed from the concept of the well-made hyperlink?

I appreciate that designing URL structure for a site can be difficult. I don't think it's as hard as you make it out to be, because just about any free CMS package will take care of SEO'd URLs for you and set up redirects even if you change your content. Hell, you can customize all of this in WordPress without touching code. But that's an orthogonal problem, and shortened URLs neither contribute to SEO nor constitute a viable long-term URL structure for any site.

> Short URLs are valuable because they let people simplify increasingly long URLs for use in many different media, without worrying about breaking or losing portions of the URL.

This is what I don't understand. OK, sure, URLs get long, but they could be long before we started putting keywords in them; that's not new. Besides email and Twitter, what media could you possibly be talking about where I'm dropping bits of a URL on the floor? And if bad email clients can't linkify long URLs properly and Twitter won't allow proper link formatting, why are we just rolling over and enabling their problems?

>Twitter's 140-character limit is part of what defines it. There will always be short-message media, ...

They could and should make an exception for URLs. Imagine tweeting:

    URL shortener controversy all over again!
    [Read it on HN|http://news.ycombinator.com/item?id=2734728]
and the URL portion and formatting wouldn't count toward the limit, only the link text. Visually, that's only 55 characters' worth of information, and that's how the limit should be formulated. You can't tell me that http://bit.ly/blsa23848 is more informative, or serves Twitter's character limit better by virtue of being shorter than my formatted link with link text. In fact, it's less informative, and it obscures the link's true destination without deshortening (and as the OP shows, deshortening is fallible).


It's actually several things, including user-agent strings and the HTTP Accept/Accept-Language headers.

The underlying script allows you to route between n different destinations based on header matching and netmasks. The shortened URL is just a hash of the rule set, with a sequence number appended in case of collisions.
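To make that concrete, here is a minimal sketch of the routing idea in Python. The rule format, names, and destinations are my own invention for illustration, not the actual script's; it shows header matching with a fallthrough default, and a slug built from a truncated hash plus a collision sequence number.

```python
import hashlib

# Rule sets are keyed by slug; each maps header-matching rules to a
# destination, with a fallthrough default for everyone else.
# (Hypothetical structure, not the actual script's format.)
RULES = {
    "example": {
        "rules": [
            (("User-Agent", "facebookexternalhit"), "https://example.com/innocuous"),
            (("User-Agent", "Twitterbot"), "https://example.com/innocuous"),
        ],
        "default": "https://example.com/real-destination",
    }
}

def route(slug, headers):
    """Pick a destination by matching request headers against the rules."""
    entry = RULES[slug]
    for (name, needle), dest in entry["rules"]:
        if needle in headers.get(name, ""):
            return dest
    return entry["default"]

def make_slug(ruleset_repr, seq=0):
    """Short code = truncated hash of the rule set, with a sequence
    number appended in case of collisions."""
    digest = hashlib.sha1(ruleset_repr.encode()).hexdigest()[:7]
    return digest if seq == 0 else "%s-%d" % (digest, seq)
```

Hashing the rule set means identical rule sets always produce the same short code, which is why the sequence number is only needed when two different rule sets collide.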


Clever. I assume the only way for Facebook and Twitter to circumvent this would be to fake all of the HTTP headers sent by, say, a modern version of Firefox. I don't think it's cheating - even though the tools are bots, their job is to preview the content their users are about to see.

Even then, the author could probably find out which IP addresses are used by Twitter and Facebook's lookup engines and send them the innocuous version. However, I just did a test on my own server, and Facebook used two different IPs (69.171.228.246 and 69.171.224.245) for the same request. Pretty big range.
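IP matching wouldn't need an exhaustive address list, just netblocks. A quick sketch using Python's ipaddress module; the /19 below is only a guess that happens to cover both addresses above, not an authoritative Facebook range:

```python
import ipaddress

# Hypothetical crawler netblock list. 69.171.224.0/19 is a guess that
# covers the two observed Facebook addresses, not an official range.
CRAWLER_NETS = [ipaddress.ip_network("69.171.224.0/19")]

def is_crawler_ip(addr):
    """True if the request IP falls inside any known crawler netblock."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CRAWLER_NETS)
```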


That's one thing that mildly annoys me about Twitter. It clearly knows where a shortened link is heading, because it reveals the full URL on mouseover. Given that information, why does it need the shortened link at all?


This actually could be quite useful. Sometimes people need to post a nicely described link with an adequate thumbnail, yet the Facebook bot finds only some random image. With this little service one can prepare a dummy page for illustrative purposes and then let brokenthings.org handle the bot and the redirection.


I've been waiting for this since URL shorteners came to be. Sadly, I don't think shorteners will lose popularity anytime soon, and I will continue to avoid them as I always have.


At a guess: HEAD requests give one URL, GET requests give another?


I think it knows the user agent of specific bots and serves different pages based on that, as social networks tend to do a full GET request in order to grab a screenshot.


This isn't working for me. Whenever I change the source link, Facebook gives me no preview.


Very clever/sneaky!


To defend against this, social networking sites should simply cache the link that the preview bot sees, and send users there directly instead.


Clever use of content negotiation!


This is great! Lots of fun.



