Analytics tracking code is a serious headache. You rarely know upfront which events you'll care about later, it's a pain to set up and maintain, and there's a principal-agent problem: the person implementing the tracking (a dev) isn't the one getting value from it (a non-dev).
I think we'll start seeing teams adopt more automated approaches to analytics tracking over time.
Disclaimer: I founded Heap (https://heapanalytics.com/) with the idea that manual instrumentation should be the exception, not the norm.
I've looked at Heap before but the lack of pricing between free and “custom” has turned me off trying it.
Really hard to recommend it to anyone else either since there's no information about what it would end up costing (assuming that adding the Heap badge is out of the question).
Hey MatM. Big fan of Heap right here. In fact, I just did a new install for a new project yesterday. Heap definitely makes life much, much easier, but when there's a need to collect data from multiple sources, i.e. not just the website but also, say, FB ads, Intercom, etc., to create a 360-degree customer view, it becomes harder to rely solely on Heap. Now, ordinarily we use Segment and pipe the data to a data warehouse, but sending Heap data to a data warehouse is a very pricey option. I wish it wasn't so pricey.
+1 for "manual instrumentation should be the exception, not the norm." However, many times there's such an instinct to log everything that you don't pay enough attention to which one metric is most valuable to your business. And when you need it, it's just too tough to get at, and the system slows down because of logging everything (if building in-house).
Btw, Autotrack is a direct competitor to Heap. How are you going to maintain differentiation? Or is there something for which Heap gives a 10x improvement that Autotrack doesn't?
You alluded to some of this already, but automatically capturing everything isn't just a feature you can tack onto an existing analytics system. You have to rethink everything from the ground up in order to support it (the client-side SDKs, the infrastructure, the tools to organize the data, the analysis interface, etc.).
This becomes obvious when you try both products side-by-side (and I encourage you to do so!). I haven't come across a single customer who sees this as a viable substitute for Heap.
Also, collecting everything doesn't mean you should pay attention to every possible metric all the time. It just makes it 1000x easier to home in on the right metric at the right time.
That said, this is an awesome GA extension, and philipwalton deserves a lot of credit for building it.
At the most fundamental level: Heap's products - our SDKs, infrastructure, interface, pricing - were built from the ground up and optimized for this "capture everything" philosophy. Mixpanel tacked it onto an experience that's still built around manual instrumentation.
This becomes obvious when you actually use both products. I encourage you to do so and witness the differences yourself.
A few deficiencies of Mixpanel's approach relative to Heap:
* Performance. If you click around Mixpanel's site, you'll notice a tracking request sent for each and every interaction. Click 10 times in a row, and Mixpanel's SDK will issue 10 separate requests in succession. Heap's SDK does the right thing and batches events (see the sketch after this list). This is because Heap's SDK is optimized for automatic event tracking, while Mixpanel's was built for legacy manual tracking.
* Data trustworthiness. Form submissions and link clicks can unload the page before an analytics request gets sent, which means Mixpanel will drop a large percentage of events (~30% in our experience). This gets exacerbated on mobile devices with poor internet connections. Heap does the right thing in each of these cases (it's actually a surprisingly tricky technical problem). This sort of data gap is dangerous for any real analysis, because you're basing decisions upon figures that are fundamentally wrong (especially on mobile!). The "best-effort" approach works for manual tracking, but not for automatic tracking.
* Data completeness. Mixpanel fails to capture some key client-side events, including pushState and hashchange events. If you have a single-page webapp, pushState/hashchange events are critical for understanding a user's flow through your product. Heap captures these (and other interactions) seamlessly. Mixpanel doesn't.
* Scale. Owler is the only customer cited as using Mixpanel's autotrack feature in production (at least as far as I can tell from their press release). Heap's automatic tracking, on the other hand, is live and battle-tested on some of the largest websites on the internet (heapanalytics.com/customers). It'll be interesting to see how Mixpanel's approach scales to their customer base. It's clearly not fully figured out: you'll note a "X-MP-CE-Backoff" response header to each of their analytics requests, presumably to pause data collection when their backend load is too high.
* Pricing. Mixpanel still applies traditional per-event pricing here. This causes a few issues: 1) costs can balloon unpredictably as you track more events, 2) you're disincentivized from exploring new data, and 3) you have to do a cost/benefit analysis for each new event you're thinking of tracking. This is a major deterrent to ad-hoc, retroactive, exploratory analysis, which is the primary benefit of capturing everything.
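To make the batching, unload, and single-page-app points concrete, here's a rough sketch of what unload-safe batched tracking can look like client-side. It's illustrative only: the endpoint, constants, and function names are invented, and this is not Heap's (or Mixpanel's) actual SDK code.

```typescript
// Hypothetical batching tracker: events are buffered, flushed periodically in
// one request, and handed to navigator.sendBeacon on pagehide so clicks that
// unload the page aren't silently dropped.

type AnalyticsEvent = { name: string; props?: Record<string, unknown>; ts: number };

const ENDPOINT = "https://collector.example.com/track"; // placeholder, not a real endpoint
const FLUSH_INTERVAL_MS = 2000;
const MAX_BATCH = 50;

const queue: AnalyticsEvent[] = [];

export function track(name: string, props?: Record<string, unknown>): void {
  queue.push({ name, props, ts: Date.now() });
  if (queue.length >= MAX_BATCH) flush();
}

function flush(): void {
  if (queue.length === 0) return;
  const batch = queue.splice(0, queue.length);
  // keepalive lets an in-flight request outlive the page in modern browsers.
  void fetch(ENDPOINT, {
    method: "POST",
    body: JSON.stringify(batch),
    keepalive: true,
  }).catch(() => {
    // On failure, put the batch back so the next flush retries it.
    queue.unshift(...batch);
  });
}

// One request per interval instead of one request per click.
setInterval(flush, FLUSH_INTERVAL_MS);

// Flush when the page is going away: sendBeacon queues the payload with the
// browser, so it survives navigations triggered by link clicks and form submits.
window.addEventListener("pagehide", () => {
  if (queue.length === 0) return;
  navigator.sendBeacon(ENDPOINT, JSON.stringify(queue.splice(0, queue.length)));
});

// Single-page apps: wrap history.pushState and listen for hashchange so
// virtual pageviews get captured too.
const origPushState = history.pushState.bind(history);
history.pushState = (...args: Parameters<typeof history.pushState>) => {
  origPushState(...args);
  track("pageview", { url: location.href });
};
window.addEventListener("hashchange", () => track("pageview", { url: location.href }));
```

A real SDK obviously also has to deal with retries, payload size limits, and deduplication, but batching plus the pagehide handler are the parts that matter for the performance and data-loss points above.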
It's clear that automatic data collection is the way of the future, and there are still so many more ways to evolve it (stay tuned!). I think we'll see more and more tools adopt this approach over time. But dealing with a surplus of data requires a fundamental rethinking of analytics practices, and it's not as simple as shoehorning features into traditional experiences.
Thanks for pointing out some areas where we can improve. We will work on making Autotrack great over the year, specifically for those things, but we haven't seen any major problems with our large customers, from performance to pricing. If we do, we are committed to changing things quickly.
Hey there, I was the product manager for this product at Mixpanel. I can help explain how Mixpanel addresses those points and how we differentiate ourselves.
A few things Mixpanel does really well:
* Mixpanel is the most complete reporting platform: We have been around since 2009 and have evolved to let customers do deep analysis of their data. Mixpanel has more than double the reports and lets you do things like predict which users will convert, A/B test outcomes, and identify issues in user flows, all of which helps our customers make data-driven decisions, understand their revenue, and more. Additionally, with Mixpanel you can dive even deeper into your data to make smart business or product decisions.
* Mixpanel offers a fully integrated marketing suite: Using Mixpanel you can also learn about and re-engage your users with a full suite of notifications, including email, push, SMS, in-app, and surveys, and get full reporting on the impact.
* Mixpanel is truly cross-platform: Mixpanel is the one-stop shop for cross-platform analytics without a developer. We can track data on web and mobile, no matter the device or platform.
Regarding some of your specific points:
* Performance: We have done extensive performance testing and haven't seen or heard of any issues from customers using the product. The requests previously mentioned are asynchronous and have an extremely small performance impact on our customers' products. If it turns out this is really a selling point, adding client-side queueing can be done in less than a week.
* Data trustworthiness: A 30% data discrepancy is extreme, and in our testing it works well; I would recommend just trying it out yourself. Again, special handling for link tracking can easily be added if we detect issues for customers (we are constantly monitoring data accuracy).
* Scale: Mixpanel has been around since 2009 and has significant experience dealing with scale. Owler is just one example of a customer using Autotrack; we have a paying customer base of 4,200, with tens of thousands more customers using Mixpanel for free. Many of those customers are already using Autotrack to send many millions of events every day.
* Pricing: Mixpanel doesn't charge for the retroactive data we have collected, only for the data you use. While you do have to decide at the event level whether an event is worth it, that decision is most closely tied to the value you receive from Mixpanel. You only pay for data that is actionable and can drive ROI. With session-based pricing, even if you have users that aren't truly using your product, you still have to pay, even though they are not giving you great insights about your business.
Hey, don't take my word for it, just sign up and see.
Because it's really hard to build, test and maintain a non-trivial application in a dynamically-typed language like JavaScript.
I think it's telling that a lot of companies with the most mature products (Facebook, Dropbox, Asana) eventually resort to optional static typing in their products, whether that means extending their runtimes (e.g. Hack) or using a transpiler like TypeScript.
The implicit conversions and "we'll just randomly give you undefined to propagate onwards" stuff is a much bigger source of bugs in JS than dynamic typing. Picky dynamically typed languages like Erlang or Python will pretty reliably produce an error/exception at the scene of the crime. (More so than C/C++!)
(Also, I don't think optional static typing has gotten off the ground in Python land, or been adopted in production at Dropbox?)
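To make the undefined-propagation point concrete, here's a toy sketch (not code from any of the products or codebases discussed): when a field is missing, plain JS either throws far from the source or quietly turns the result into NaN, while TypeScript with strictNullChecks rejects the unsafe access at compile time and forces the missing case to be handled.

```typescript
// Toy example: users may or may not have a plan.
interface User {
  name: string;
  plan?: { price: number };
}

function monthlyRevenue(users: User[]): number {
  // In plain JS, `u.plan.price` throws a TypeError at runtime when plan is
  // missing; if plan existed but price didn't, the access would quietly yield
  // undefined and the sum would become NaN, surfacing far from the real bug.
  // With strictNullChecks, the line below is a compile error
  // ("Object is possibly 'undefined'"):
  //   return users.reduce((sum, u) => sum + u.plan.price, 0);

  // The compiler forces the missing-plan case to be handled explicitly:
  return users.reduce((sum, u) => sum + (u.plan?.price ?? 0), 0);
}

console.log(monthlyRevenue([{ name: "a", plan: { price: 10 } }, { name: "b" }])); // 10
```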
What is it telling? It might just be telling that everyone is doing what you're doing here, and looking at each other to see what they should do. Another name for this is "fashion."
Sorry to hear this! I work at Heap, and Heap certainly shouldn't have any effect on your ajax calls. I also can't see how react.js would interfere with Mixpanel.
Maybe something else is happening. Could you shoot us an email (support@heapanalytics.com)? We'd love to fix this for you.
Thanks for responding. Yeah I'm sort of at a loss, but this is my first codebase with React.js and it's the only one that has had these issues. I'll shoot you a message.
Heap's approach to analytics is a little different. Instead of forcing you to manually instrument code, Heap automatically captures all user actions. You can then define events retroactively, without writing a bunch of brittle tracking code.
Would love to hear your feedback on this. Does Heap's approach get you closer to what you need?
Heap is building analytics infrastructure for web and iOS. Unlike other tools, which require you to manually instrument code, Heap captures all user actions automatically, and then lets you answer questions retroactively. Instead of writing a bunch of new tracking code every time you want to answer a question, the data is already in Heap waiting to be analyzed.
As an engineer at Heap, you will work on our in-house distributed system that ingests billions of events a week and processes queries over hundreds of terabytes of data in seconds. To learn more about our distributed system, see our talks at PGConf [1] or our recent blog post on how we index our data [2].
We have a small eng team of 13 engineers: nine in San Francisco and four scattered around the globe.
Our interview process consists of a one-hour technical phone interview, a three-hour take-home problem, and a full-day onsite in which you'll build a fake-but-plausible Heap feature.
We offer all of our employees unlimited vacation with a three-week minimum.
We enjoy talking to everyone who interviews, so please apply: https://heapanalytics.com/jobs.
[1] https://www.youtube.com/watch?v=iJLq3GV1Dyk
[2] https://blog.heapanalytics.com/running-10-million-postgresql...