This is an incredibly useful post for anyone looking to implement their own custom client-side performance metrics and error tracking. I've worked on solutions to both these problems before and needed to piece together a solution from multiple half-baked, error-prone sources. Having a reference implementation in one post is very helpful.
We've done similar performance analysis at Khan Academy by using the Navigation Timing API and throwing the data into Graphite. The most interesting part of this post for me is seeing that their numbers (server-side times, median client times) are roughly the same as ours (http://i.imgur.com/wKglMTc.png).
One thing I see in both GitHub's data and ours is a daily periodic increase in both server and client times. Is that just load? We had theories that the client-side spikes might be caused by users in areas with poorer CDN coverage, or by an increase in mobile usage, but we were never fully satisfied by those explanations.
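For anyone who wants to derive similar client-side numbers, here's a minimal sketch using the Navigation Timing (Level 1) fields. The timing object below is fabricated so the example is self-contained; in a browser you'd pass `window.performance.timing` instead:

```javascript
// Derive common page-load metrics from a Navigation Timing (Level 1)
// object such as window.performance.timing. All fields are epoch
// milliseconds, so differences give durations.
function deriveMetrics(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,   // DNS resolution
    connect: t.connectEnd - t.connectStart,         // TCP (and TLS) setup
    ttfb: t.responseStart - t.requestStart,         // time to first byte
    domReady: t.domContentLoadedEventEnd - t.navigationStart,
    pageLoad: t.loadEventEnd - t.navigationStart,   // full load time
  };
}

// Fabricated sample values, purely for illustration.
const sample = {
  navigationStart: 1000,
  domainLookupStart: 1005,
  domainLookupEnd: 1030,
  connectStart: 1030,
  connectEnd: 1080,
  requestStart: 1081,
  responseStart: 1200,
  domContentLoadedEventEnd: 1900,
  loadEventEnd: 2500,
};

console.log(deriveMetrics(sample));
// { dns: 25, connect: 50, ttfb: 119, domReady: 900, pageLoad: 1500 }
```

Each derived number is exactly the kind of thing you'd ship to Graphite and graph the medians of.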
I'm glad GitHub have started this blog, it's great when an organisation as large as theirs shares information about how they're tackling common problems.
A little meta, but it's a good decision that GitHub has separated their technical blog posts from the rest (previously they combined all posts in github.com/blog). This was much needed given the nature of their customers.
They have both RUM (requires injecting a 3rd-party script, in this case AppNeta) and Synthetic (does not require any modification of your client-side artifacts).
Why would you consider these analytics scripts useless? The article mentioned several reasons they want to be tracking this timing information (monitoring of slow backends/CDNs and the performance of their own scripts) - having that information is vital for ensuring the usability of the site [1]. If a portion of your userbase all of a sudden starts loading the webpage much slower than usual (e.g. from issues as diverse as bad network routing or just a specific browser choking on your scripts), having the information available to you to quickly diagnose and fix the problem is a lifesaver. After all, the website is their product, so they need to ensure it's always available and usable for all their customers.
All that is nice, but simply stating the reasons doesn't cause me to think that something is actually beneficial in a way that's real and tangible to me. In practice, I often find that disabling analytics scripts improves the performance of many websites. My RAM and CPU usage goes down, and the site feels much more responsive. Other times, disabling JS completely causes websites to fall back to their HTML/non-JS pages, which are significantly better to me from a performance standpoint. YMMV.
Nothing against websites that are trying to improve user experience, but if I'm donating my CPU time and RAM for someone else's business needs, I'd want a way to opt out of that.
To piggyback on this, does anyone have any metrics on how much time is taken up by restaurant servers asking me how my food is today? I'm paying for the food and drinks, not to be spammed for analytics by the server.
It seems like they could collect those metrics on the back-end (kitchen / food processor(s)) too, or use tip % as a proxy metric for QoS. Not sure how they'd track e.g. CTR, but straight conversions could be easy enough if you instrument the servers such that they're tracking clients per table. Of course, if you're trying to solve for optimal Drink Refresh Rate, you might need something more granular along the lines of sugar.
Sure, but those are not the same. "Did you like the food?" can be met with "Could you take it back and add some more salt" or "The steak seems a bit raw, could you fix that?"
Web analytics, OTOH, are just aggregations. All the company cares about is some metric like bounce rate, "engagement", click-through rate, or what have you. You can supply a horrible product and still get a temporary boost in any of those metrics: e.g., redirect 10% of all users to a full-page modal ad without a continue button and see the CTR spike. Disallow or hinder copy-paste functionality on the text and watch as people keep coming back.
But you can't bring out a turd on a plate and have the customer be excited about it.
If a site I frequented could (impractical, I know) ask me for some direct piece of feedback that also had a reasonable chance of being taken seriously, I'd be all for it.
An interesting thing about this is that it appears to just use window.performance.timing, which is filled in by the browser automatically. They're just grabbing data that the browser is generating anyway and forwarding it along to themselves.
On my machine, `JSON.stringify(window.performance.timing).length` comes out to 623, so the serialized timing data is only 623 bytes of text. Pretty much zero overhead here.
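To make the "just forward it along" part concrete, here's a rough sketch of what that could look like. The `/_stats` endpoint and the use of `navigator.sendBeacon` are my assumptions for illustration; the post doesn't say exactly how GitHub ships the data:

```javascript
// Serialize the browser-provided timing data and forward it to a
// collection endpoint. Endpoint name and transport are illustrative
// assumptions, not GitHub's actual implementation.
function reportTiming(timing, endpoint) {
  const payload = JSON.stringify(timing);
  // sendBeacon queues the request so it can survive page unload;
  // the guard lets this sketch run in non-browser environments too.
  if (typeof navigator !== "undefined" && navigator.sendBeacon) {
    navigator.sendBeacon(endpoint, payload);
  }
  return payload; // a few hundred bytes, per the measurement above
}

// In a browser: reportTiming(window.performance.timing, "/_stats");
```

Since the payload is only a few hundred bytes and the browser fills the timing object in for free, the client-side cost really is close to zero.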
Thanks, GitHub team!