Uhh... this article is kind of awful. It consistently confuses hits with pageviews. There's no mention of the most popular early open source (Analog) or closed source (Webtrends) analytics tools. And how can you spend multiple paragraphs telling the story of how GA came to be the most popular analytics platform without mentioning that it's a freebie? GA is an add-on to AdWords, whereas New Relic costs hundreds per month, and Webtrends/Omniture cost thousands. And why say "New Relic is its nearest competitor..." when New Relic doesn't even compete with GA? This article is just a shambles.
And where is sessionization? Isn't going through a logfile and associating the hits with specific users what made the Webtrends founders rich back in the late 1990s? Nope, it never existed, just like web data warehouses, OLAP, etc. In this alternate reality, web analytics moved directly from "glorified hit counters" to "event tracking".
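For anyone who hasn't run into the term, here is a minimal sketch of what sessionization does, not how Webtrends or any other product actually implemented it. It assumes the log has already been parsed into (visitor key, timestamp, URL) tuples, that the visitor key is something like IP plus user agent or a cookie ID, and that a 30-minute inactivity gap ends a session. The function name and the timeout are illustrative choices.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity closes a session.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor_key, timestamp, url) hits into per-visitor sessions.

    Returns a list of (visitor_key, [(timestamp, url), ...]) tuples.
    """
    # Collect each visitor's hits together, regardless of log order.
    by_visitor = defaultdict(list)
    for visitor, ts, url in hits:
        by_visitor[visitor].append((ts, url))

    sessions = []
    for visitor, visitor_hits in by_visitor.items():
        visitor_hits.sort(key=lambda hit: hit[0])  # order by timestamp
        current = []
        last_ts = None
        for ts, url in visitor_hits:
            # A gap longer than the timeout starts a new session.
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append((visitor, current))
                current = []
            current.append((ts, url))
            last_ts = ts
        if current:
            sessions.append((visitor, current))
    return sessions


# Hypothetical sample data: visitor "a" returns after a two-hour gap.
sample_hits = [
    ("a", datetime(2015, 6, 1, 9, 0), "/"),
    ("b", datetime(2015, 6, 1, 9, 1), "/pricing"),
    ("a", datetime(2015, 6, 1, 9, 5), "/features"),
    ("a", datetime(2015, 6, 1, 11, 5), "/"),
]
sample_sessions = sessionize(sample_hits)
```

With the sample data, visitor "a" ends up with two sessions and visitor "b" with one. Pageviews, visits and unique visitors all fall out of this grouping, which is why skipping it leaves a hole in the history.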
This is not a history of web analytics, it's marketing material for Amplitude, whoever they are. I'm going to go out on a limb and guess that they're an analytics vendor and that event tracking is one of their main features.
I use Google Analytics and have occasionally tried other products that promised real-time data before Google caught up to them, but nothing stood out. I also trialed MouseFlow and ClickTale thinking that mouse tracking and watching individual users navigate my site was the way of the future... but the time required to set up each testing scenario, wait a few hours, and then watch a handful of users clumsily click around my site didn't end up being useful either.
So what good event-driven tracking apps are out there right now? GA is really too big and cumbersome to set up properly, and it seems nobody else makes it super easy either.
Heap's approach to analytics is a little different. Instead of forcing you to manually instrument code, Heap automatically captures all user actions. You can then define events retroactively, without writing a bunch of brittle tracking code.
Would love to hear your feedback on this. Does Heap's approach get you closer to what you need?
This is easy enough to do with GA if you use custom events. You don't even really need to set up much on the server side; if you fire the events at GA it will understand them. I think the real problem is expectations: Google is just doing statistics on big numbers, and for statistics to tell you anything, you have to have enough observations.
The real problem is that you need a critical mass of customers -- millions of uniques per month -- before this type of workflow tells you anything meaningful. And even then, diminishing returns set in pretty quickly: 80% of your legitimate users probably fit into 3 or 4 usage profiles that will stand out, so to identify the 'long tail' traits and market to them individually requires exponentially more users.
Basically, the type of deep analytics and tracking Google and others promote is only available to customers with enough data to tease it out. This is generally enough to exclude small businesses. You can get some basic stats about your users, but probably nothing you didn't know already. It can help you highlight glaring problems in your workflow (like a broken checkout page) but it's not going to give you magic insight.
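To put a rough number on "enough observations", here is a back-of-the-envelope sketch using the textbook normal-approximation sample size for comparing two proportions. This is not what GA computes internally; the function name, the 5% significance level and the 80% power are assumptions picked for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(p_base, p_new, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH group to tell p_base from p_new.

    Uses the standard two-sided, two-proportion normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired power
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_base - p_new) ** 2)


# Hypothetical scenario: a 2% conversion rate and a hoped-for lift to 2.2%.
small_lift = visitors_per_variant(0.02, 0.022)
# A much larger lift (2% to 3%) needs far fewer visitors to detect.
large_lift = visitors_per_variant(0.02, 0.03)
```

Under these assumptions, detecting a lift from 2% to 2.2% takes roughly 80,000 visitors per variant, while a jump from 2% to 3% takes a few thousand. Small sites can only ever see the glaring effects, which is the point above.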
Nobody makes it super easy because it's just a hard problem :)
Scale is a function of what you track, though. Taking e-commerce as an example: most people will track conversion rate (product purchases / product page views); most also track cart add rate, checkout rate, etc. But far fewer will track product impressions - that is, the event where a search returns the product in its results, or a category displays it in its catalog, and the user "eyeballs" it.
You get, rule of thumb, perhaps 1 conversion for 10 views; you also get 1 view for 10 impressions. So, tracking impressions (and therefore click-through-rate) decuples the amount of data you're tracking, and allows you to draw conclusions much faster than by waiting for the (admittedly, stronger) signal of conversion rate to tell you what you want to know about your product.
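A quick worked example of that rule of thumb, as a sketch: the 10,000 impressions figure is made up, the 1-in-10 ratios are the ones quoted above, and the standard error is the usual binomial approximation sqrt(p(1-p)/n).

```python
from math import sqrt


def rate_standard_error(rate, trials):
    """Binomial standard error of an observed rate over `trials` events."""
    return sqrt(rate * (1 - rate) / trials)


# Hypothetical traffic, following the 1-in-10 rule of thumb.
impressions = 10_000
views = impressions // 10    # 1 view per 10 impressions
conversions = views // 10    # 1 conversion per 10 views

# Click-through rate is measured over impressions,
# conversion rate over the much smaller pool of views.
ctr_se = rate_standard_error(0.10, impressions)
conversion_se = rate_standard_error(0.10, views)

# How much noisier the conversion estimate is than the CTR estimate.
noise_ratio = conversion_se / ctr_se
```

Over the same period the conversion-rate estimate is about sqrt(10), roughly 3.2 times, noisier than the click-through estimate, so you would need about ten times as long to pin conversion rate down to the same precision.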
The real problems with Google are the ridiculous fees (150k USD/year for Premium? and I don't even get to talk to an actual Google employee if I have issues, I have to go through a reseller?) and the fact that there is a lot of "secret sauce" that you do not see and that is undocumented or only sparsely documented (compare Google's last click attribution to what you're actually seeing appear in your raw data as the last click, for a simple example). Google wants you to use their superior insights; whether or not they are superior, I prefer having access to the full picture and drawing my own conclusions.
I can't speak for the parent post, but my problem with GA is that the mechanics of setting up new events and flows and tests and then mining that data is such a chore. Generating enough data for statistical significance is, of course, a problem for all solutions.
GA is not "live"; it takes about 1.5 days for the "raw" data (hit level) to be available in BigQuery, and, as the cherry on the cake, they'll charge you extra for the privilege.
I was fond of Webtrekk, which whilst only one day late, cost 1/10th of GA and had a relational schema to their raw data.
If you're going to use your data for something other than watching users click on your site (e.g. using the live data to have the site adapt to the user's actions), you absolutely need to be controlling the DB in which the hits are being recorded.
I haven't had the chance yet but I think only Piwik (or a bespoke tracker) will let you do that. Given a choice, I'd have Piwik write to its own DB, replicate it live, and use that to feed my recommendation algorithms. Bonus: your customers' data doesn't end up on someone else's servers (although in practice, it will, because DoubleClick, Facebook, etc. will all have their tags on your page anyway).
I've been happy with Mixpanel for event-driven analytics. Straightforward API and good customer service.
And the real killer feature for me is you can get a dump of every event you've ever logged with all their properties. Which means I can do complex analysis on the raw data and it also means I can take my data with me if I ever decide to leave.
I also evaluated keen.io which was very nice, but is geared more towards developers.
Hah, we just decommissioned a server that had an on-premises Urchin installation on it a couple of weeks ago. http://i.imgur.com/481dUsZ.png is what the home page looked like right before it left us.