Tell HN: Catalog of JavaScript libraries and CSS resources hosted on public CDNs (cdncatalog.com)
80 points by andrewdavey on Feb 28, 2010 | 27 comments



Correct me if I'm wrong, but the high speed of the CDN isn't the only (or even primary) benefit. In my opinion, browser caching is a much bigger plus. As soon as a user downloads one of these hosted libraries (by visiting a site that calls for it), their cache is primed for every other relying site they visit. Correct?

Assuming this is true, why would someone use the Google API Loader javascript call ("google.load") instead of linking to the file directly? Linking directly allows the browser to use its cache, while using google.load needs to make an external call to Google, bypassing a big benefit of the caching.


"Correct me if I'm wrong, but the high speed of the CDN isn't the only (or even primary) benefit. In my opinion, browser caching is a much bigger plus. As soon as a user downloads one of these hosted libraries (by visiting a site that calls for it), their cache is primed for every other relying site they visit."

It depends on the CDN and what cache information it returns. I'm not going to go through all the links and verify that they each return the right headers. I do recall Yahoo's YUI documentation page promising the use of far-future Expires, which means the browser very likely never even hits the CDN; but nothing stops a CDN from using ETags instead, which is still pretty fast but does show up as a hit on the CDN that they could track. That said, it's entirely possible that every link there uses far-future Expires.

Personally, I'd just grab the file and use far-future Expires myself. If you've got a web application that uses code that way, odds are this is not going to be your bottleneck anyhow; far-future Expires just as thoroughly fails to hit your server as it fails to hit the CDN.
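
For example, a minimal sketch of the "grab the file and serve it yourself with far-future Expires" approach, using Node.js (purely illustrative; the port and file path are made up):

  // Minimal sketch: serve a local copy of jQuery with far-future caching
  // headers. Port and paths are hypothetical.
  var http = require('http');
  var fs = require('fs');

  http.createServer(function (req, res) {
    if (req.url === '/js/jquery-1.3.2.min.js') {
      var oneYear = 365 * 24 * 60 * 60; // seconds
      res.writeHead(200, {
        'Content-Type': 'text/javascript; charset=UTF-8',
        'Cache-Control': 'public, max-age=' + oneYear,
        'Expires': new Date(Date.now() + oneYear * 1000).toUTCString()
      });
      fs.createReadStream('./public/jquery-1.3.2.min.js').pipe(res);
    } else {
      res.writeHead(404);
      res.end();
    }
  }).listen(8080);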


Thanks for the reply. I took a look at the headers returned for https://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.mi... and here's what I see:

  200 OK
  Content-Type: text/javascript; charset=UTF-8
  Last-Modified: Tue, 09 Feb 2010 23:05:02 GMT
  Date: Mon, 01 Mar 2010 00:09:20 GMT
  Expires: Mon, 28 Feb 2011 21:55:22 GMT
  Cache-Control: public, max-age=31536000
  Vary: Accept-Encoding
  X-Content-Type-Options: nosniff
  Server: sffe
  Content-Encoding: gzip
  X-XSS-Protection: 0

It looks like Google is setting an Expires header one year in the future, meaning (I believe) that the browser will use the cached copy. So, by using these libraries, you have a decent chance that a brand-new visitor will arrive with a warm cache, speeding up that initial page load (and first impression!) a little.


According to various resources, google.load() can do the work asynchronously, so the whole page load will not be stalled by the <script> loading. Also, you may want to load that .js only on demand rather than on page load. That said, IMHO most pages are fine with a hardcoded <script> tag.
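
For reference, a minimal sketch of the two approaches (library and version are just examples):

  <!-- Direct link: the browser can reuse a cached copy primed by any other
       site that referenced the same URL. -->
  <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>

  <!-- Loader: fetch the jsapi loader first, then ask it for jQuery, with a
       callback once the library is ready. This can also be done on demand. -->
  <script src="http://www.google.com/jsapi"></script>
  <script>
    google.load("jquery", "1.3.2");
    google.setOnLoadCallback(function () {
      // jQuery is available here
    });
  </script>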


You need to be careful about user privacy with things like this. You're leaking user information to third party services, and you need to make this clear to your users, especially if you're dealing with anything that could be considered at all sensitive.


I'm particularly annoyed by banks doing this. I assume I can trust the bank itself, but why would I want to also pull down content and scripts from cdn.whoknowswhere.com or tangentialservice.com?

If I'm signed in and doing monetary transactions, I don't want to deal with any domain but the primary one.


I installed a firewall at a doctor's office last week. I set the defaults to block pretty much everything so I could whitelist the sites the office staff truly needed. The following Monday, I sat around and whitelisted things as the staff stumbled on what they needed. I was amazed at how many third-party sites were being hit from health insurance and government sites. This stuff is supposed to be secure, but everything from login to home page to submitting claims made requests to third-party domains. I'd say less than half of those third-party requests appeared to be CDNs.


I’d sort of assumed the traffic information was one of Google’s primary motivations for hosting all these libraries.


True, although it's not any different from running Analytics or AdWords.


Yep, but those are more conscious business decisions. This is the kind of thing someone might decide to do on a whim, without thinking of the repercussions and making suitable adjustments to privacy policies, etc.


Care to explain yourself?

All using a CDN for JS libraries does is get the exact same code to the user faster without eating your bandwidth. Unless the CDN threw some extra code in, which someone would eventually notice (seriously marring the company's reputation), it doesn't send any info to the company hosting the code.

But it is a good point. You should make sure you really trust the company hosting the code for you.

Edit: Oh, right. When I read the comment I was thinking more of it letting them steal all of the info on the page, or something, which would be doable if they inserted extra code.


> Edit: Oh, right. When I read the comment I was thinking more of it letting them steal all of the info on the page, or something, which would be doable if they inserted extra code.

Stealing info from the page is still possible. It would be hard to crowd-source the detection of such things if they were: 1) randomly inserted into <1% of all requests; 2) targeted at specific IPs/IP ranges; or 3) targeted at requests coming from specific sites.


I think the OP's point was that you're giving all of the traffic information to the CDN, i.e. they can monitor the people that you're sending to them.


You've never heard of access logs?


Is it really faster? I always assumed that the additional client side DNS lookup would eliminate the increased bandwidth advantage that the CDNs have. Anybody know if anyone has A/B tested this?


  I always assumed that the additional client side DNS lookup
  would eliminate the increased bandwidth advantage that the
  CDNs have
CDNs offer a bit more than a bandwidth advantage. First, if you have already visited a site which used one of these CDNs, your browser most likely won't even send a request for the resource (if the CDN sets the expiry time in the future). Second, even if this is the first request for a resource, it helps work around the "connections per hostname" limitation in browsers. Say your browser allows 4 connections per hostname; then you can fetch only four resources from that host in parallel. Using a CDN (with a different hostname) lets you circumvent that, leaving you limited only by the total number of connections the browser allows, and those numbers differ by an order of magnitude: 2 (IE7) to 6 (IE8, latest FF and Chrome versions) connections per hostname vs. 20 (Opera) to 60 (IE, Safari, Chrome) total connections.
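
For illustration (hostnames are made up), splitting assets across hosts lets each per-hostname limit apply separately:

  <!-- Up to the per-hostname limit of parallel connections to www.example.com... -->
  <link rel="stylesheet" href="http://www.example.com/css/site.css">
  <script src="http://www.example.com/js/app.js"></script>

  <!-- ...plus a separate allowance for the CDN hostname, so more fetches can overlap. -->
  <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>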

Check out http://www.browserscope.org/ — it may give some ideas why Chrome is so fast :)


Yahoo's performance guidelines advise 3-4 domains per site.

http://yuiblog.com/blog/2007/04/11/performance-research-part...


FYI, Yahoo still maintains its own CDN for YUI, which has additional features such as submodule selection, so you can roll a package that fits you and we'll host it.

Disclaimer: I work for Y!
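
For example (a sketch based on YUI 3's published usage, not anything specific to this comment; version and module names are illustrative), the seed file plus use() pulls only the modules you ask for from the Yahoo CDN:

  <script src="http://yui.yahooapis.com/3.0.0/build/yui/yui-min.js"></script>
  <script>
    // Loads only the 'node' module (and its dependencies) from the CDN,
    // then runs the callback once they are available.
    YUI().use('node', function (Y) {
      Y.one('#demo').set('innerHTML', 'loaded');
    });
  </script>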


Google went down one day and all of these CDNs stopped working at our colo. We got over 50 calls from people complaining that their sites were down. It turned out the sites were blocking while waiting for the JavaScript (timing out after a few minutes).

That is the only time it has happened, but this is what makes me fear being fully dependent on these CDNs. So be advised: add a fallback mechanism.
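
For example, a common fallback pattern (a sketch, not necessarily what we used; the local path is hypothetical):

  <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
  <script>
    // If the CDN copy failed to load, window.jQuery is undefined, so fall
    // back to a copy served from our own host.
    window.jQuery || document.write(
      '<script src="/js/jquery-1.3.2.min.js"><\/script>');
  </script>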


The caching aspect is one pro of using these services, but by using these CDNs, your exposure to failure is greatly increased. In addition to factoring in the risk that your own server or network may be unavailable or slow, you now need to account for the 1, 2, or 3 other networks that you are grabbing files from.


On the other hand, I bet Google's CDN has better uptime than your server does.


So? This does not help me at all. If my host is down, then my site is down. If my JavaScript CDN is down and I have no fallback, my site is down as well. You imply that "totalDownTime = CDNDownTime", while really it is "totalDownTime = WebhostDownTime + CDNDownTime". Thus using a CDN, while it might make the site faster 99.999% of the time, always increases the total downtime.
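
A rough worked example with illustrative numbers, assuming independent failures:

  host uptime  99.90%  ->  ~8.8 hours/year down
  CDN  uptime  99.99%  ->  ~0.9 hours/year down
  page needs both      ->  0.9990 * 0.9999 = ~99.89% up, ~9.6 hours/year down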


Doesn't this prevent anyone from whitelisting your code in NoScript? As a user, how could I ever know that you've reviewed what the CDN is serving right now? ECMAScript sandboxes are not nearly good enough yet to run untrustworthy code.


I really need something like this for my CSS frameworks. Fast DNS resolution + gzip.

I think Google and the others (Microsoft, Yahoo) should give free hosting to more public libraries, so we can all benefit.


Thanks! Good to know, even if I may end up copying the files to my own server anyway.

Edit: I assume the plan is to keep this list updated with the latest version of each lib. (?)


I'm working on a script to crawl the Google and Microsoft listing pages and keep things up to date. Eventually I may expose an RSS feed as well.


This is really helpful for a lean startup. It helps pull some load away from your servers, and you get all the benefits of a CDN and caching.



