
First of all, kudos for quantifying your results instead of hand-waving them. Yes, your results look like a ~60% improvement in conversion rate from variant A to variant B, with a p-value of 0.02 and a statistical power of around 80% for a two-tailed test. So that's good.

However, context is important: at this level of significance you'd expect to see a similarly strong, but ultimately spurious, difference between A and B about 1 in 50 times even if the two variants actually performed the same.
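
For concreteness, here's a minimal sketch of that arithmetic as a two-proportion z-test, with made-up counts chosen to land near your quoted figures (your actual numbers aren't in the thread, so treat these as placeholders):

    # Hypothetical counts only: ~2.5% vs ~4.0% conversion, 1500 visitors per arm.
    from math import sqrt
    from scipy.stats import norm

    conv_a, n_a = 38, 1500   # variant A
    conv_b, n_b = 60, 1500   # variant B

    p_a, p_b = conv_a / n_a, conv_b / n_b
    lift = (p_b - p_a) / p_a                    # relative improvement, ~58% here

    # Two-proportion z-test with a pooled standard error under the null.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))               # two-tailed; ~0.024 for these counts

    print(f"lift = {lift:.1%}, z = {z:.2f}, p = {p_value:.3f}")
    # A p-value around 0.02 means a difference at least this large would show up
    # roughly 1 time in 50 even if A and B truly converted at the same rate.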

Since you're not working on something safety-critical, that's probably an acceptable false-positive rate for you. But generally speaking, and in particular here since the absolute numbers and changes are quite small, I would be wary of trusting such a result. It seems promising but inconclusive. Maybe run a few more tests with disjoint (or nearly so) samples of visitors?
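
If you do rerun it, one simple way to combine the replications is Fisher's method, which assumes the runs are independent (exactly why the samples should be disjoint). A sketch with placeholder p-values, not your data:

    # Combine p-values from independent replications run on disjoint visitor samples.
    from scipy.stats import combine_pvalues

    replication_pvalues = [0.02, 0.09, 0.04]    # hypothetical results from three runs

    stat, combined_p = combine_pvalues(replication_pvalues, method="fisher")
    print(f"combined p = {combined_p:.4f}")
    # Fisher's method treats the runs as independent tests of the same null,
    # so overlapping visitor samples between runs would invalidate it.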

There are a few other things that could confound the result. Off the top of my head, your screenshots look like different pages between the A and B test. I'm not sure if that's how you ran the experiment or if you just happened to use two different page screenshots, but if the two variants really did differ by more than the change you meant to test, that would typically disqualify the result and require another test.




The way I'm seeing it: sure, the error bars are huge, but it's very unlikely to be a regression, and the team likes it better.


> screenshots look like different pages between the A and B test

I was also wondering about that.



