
> The turnaround time also imposes a welcome pressure on experimental design. People are more likely to think carefully about how their controls work and how they set up their measurements when there's no promise of immediate feedback.

This seems like a cranky rationalization of the lack of a fairly ordinary system.

Sure, you shouldn't draw conclusions about potentially small effects from < 24 hours of data. But of course if you've done any real world AB testing, much less any statistics training, you should already know that.

What this means is you can't tell whether an experiment launch has gone badly wrong. Small effect size experiments are one thing, but you can surely tell if you've badly broken something in short order.

Contrary to encouraging people to be careful, it can make people risk-averse for fear of breaking something. And it slows down the process of running experiments a lot. Every time you want to launch something, you probably have to launch it at a very small % of traffic, then wait a full 24-36 hours to know whether you've broken anything, then increase the experiment size. Versus some semi-realtime system: launch, wait 30 minutes, did we break anything? No? OK, let's crank up the group sizes...

Without semi-realtime, you basically have to add two full days, times 1 + the probability of doing something wrong and requiring a relaunch (compounding, of course), to the development time of everything you want to try. Plus, if you have the confidence that you haven't broken anything, you can run much larger experiment sizes, so you get significant results much faster.
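To put rough numbers on that last point (the 20% relaunch probability and the delays below are illustrative assumptions, not figures from the article; just a sketch):

    # Expected extra wall-clock time per experiment spent waiting for
    # "did we break anything?" feedback. If each launch attempt has probability
    # p of being botched and needing a relaunch, the expected number of attempts
    # is the geometric series 1 + p + p^2 + ... = 1 / (1 - p).
    def expected_waiting_days(feedback_delay_days, p_relaunch):
        return feedback_delay_days / (1.0 - p_relaunch)

    p = 0.2  # assumed chance a launch is botched and has to be redone
    print(expected_waiting_days(1.5, p))   # 24-36h feedback: ~1.9 days of waiting
    print(expected_waiting_days(0.02, p))  # ~30 min feedback: ~0.03 days of waiting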




> But of course if you've done any real world AB testing, much less any statistics training, you should already know that.

Are most product managers and designers running multivariate tests thus trained? My experience says 'no'.

In fact, if there's one error I see companies making again and again, it's confusing customer signal with user noise.

I do agree with the rest of your comment however.


If the people running experiments do not know and cannot be told how to do the fundamental thing that they are trying to do, then you have bigger problems.

Which is ultimately what this post points to: the author doesn't trust his team, isn't listening to them, and doesn't expect them to listen to him. Regardless of the degree to which the author is correct in his assumptions, the problem is more than just engineering.


The article mentions the difference between operational and product analytics; his complaints about real-time data apply only to product analytics.

(I hope) No one in the article or this thread is suggesting your uptime or error metrics should be on a 24-hour delay.


You can totally break some product functionality somehow without necessarily triggering a software exception or server crash! You really do need to know the target events per experiment group.
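One way to operationalize "knowing the target events per experiment group" is a sample-ratio-mismatch style check on event counts; a rough sketch (the function name, counts, and threshold are my own assumptions, not anything from the thread):

    # Does each arm's event count look like the traffic split we configured?
    # A large chi-square statistic means the experiment (or its logging) is
    # probably broken, even though nothing threw an exception or crashed.
    def arms_look_healthy(observed_counts, expected_shares, threshold=3.84):
        # threshold ~3.84 is the 95th-percentile chi-square value for 1 degree
        # of freedom (a two-arm test); more arms need a higher critical value
        total = sum(observed_counts)
        expected = [share * total for share in expected_shares]
        chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
        return chi_sq < threshold

    # a 50/50 test where the treatment arm logs far fewer events than it should
    print(arms_look_healthy([10500, 8200], [0.5, 0.5]))  # False -> investigate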


> Versus some semi-realtime system: launch, wait 30 minutes, did we break anything? No?

I don't think the author is suggesting that you don't have realtime _correctness_ feedback.


The author explicitly mentions that operational metrics are a different thing, and I think realtime correctness feedback would fall under those.


You can break your product without noticeably affecting things like the HTTP 500 error rate, CPU utilization %, etc. that you would likely see on some ops dashboard.


On our ops dashboard we see stuff like the number of ID syncs, number of events processed (by type), etc. - I'd argue that if something is truly "broken" you see it.

If you're using funnel analytics to decide whether the product is broken, I'd say you're probably doing something wrong.
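That "you see it" check is easy enough to automate; a naive sketch (the counter names and the 50% drop threshold are made up for illustration):

    # Flag ops-style counters (events processed by type, ID syncs, ...) that
    # fall well below their recent baseline for a comparable window.
    def dropped_counters(current, baseline, min_ratio=0.5):
        return [name for name, base in baseline.items()
                if base > 0 and current.get(name, 0) / base < min_ratio]

    print(dropped_counters(
        {"id_syncs": 9800, "pageview_events": 1200, "click_events": 51000},
        {"id_syncs": 10000, "pageview_events": 48000, "click_events": 50000},
    ))  # ['pageview_events'] -> pageview logging (or the page itself) likely broke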


> What this means is you can't tell whether an experiment launch has gone badly wrong.

Personally I prefer automated testing to tell me if a feature has gone badly wrong, not conversion numbers. Then I find out before I launch too.

Or do you mean that the UX is so badly designed that users cannot use your software anymore? In which case, maybe there are bigger problems than real-time analytics.


This approach is for bugs that testing can't / won't pick up; you should be doing both.



