
First of all, kudos for quantifying your results instead of hand-waving them. Yes, your results look like a ~60% improvement in conversion rate from variant A to variant B, with a p-value of 0.02 and a statistical power of around 80% for a two-tailed test. So that's good.

However, context is important: at this level of significance you'd expect to see a similarly strong, but ultimately spurious, difference between A and B about 1 in 50 times even if the two variants actually performed the same.
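
For concreteness, here's a minimal sketch of that arithmetic as a two-proportion z-test, with made-up counts chosen to land near your quoted figures (your actual numbers aren't in the thread, so treat these as placeholders):

    # Hypothetical counts only: ~2.5% vs ~4.0% conversion, 1500 visitors per arm.
    from math import sqrt
    from scipy.stats import norm

    conv_a, n_a = 38, 1500   # variant A
    conv_b, n_b = 60, 1500   # variant B

    p_a, p_b = conv_a / n_a, conv_b / n_b
    lift = (p_b - p_a) / p_a                    # relative improvement, ~58% here

    # Two-proportion z-test with a pooled standard error under the null.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))               # two-tailed; ~0.024 for these counts

    print(f"lift = {lift:.1%}, z = {z:.2f}, p = {p_value:.3f}")
    # A p-value around 0.02 means a difference at least this large would show up
    # roughly 1 time in 50 even if A and B truly converted at the same rate.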

Since you're not working on something safety-critical, that's probably an acceptable false-positive rate for you. But generally speaking, and in particular here since the absolute numbers and changes are quite small, I would be wary of trusting such a result. It seems promising but inconclusive. Maybe run a few more tests with disjoint (or nearly so) samples of visitors?
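
If you do rerun it, one simple way to combine the replications is Fisher's method, which assumes the runs are independent (exactly why the samples should be disjoint). A sketch with placeholder p-values, not your data:

    # Combine p-values from independent replications run on disjoint visitor samples.
    from scipy.stats import combine_pvalues

    replication_pvalues = [0.02, 0.09, 0.04]    # hypothetical results from three runs

    stat, combined_p = combine_pvalues(replication_pvalues, method="fisher")
    print(f"combined p = {combined_p:.4f}")
    # Fisher's method treats the runs as independent tests of the same null,
    # so overlapping visitor samples between runs would invalidate it.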

There are a few other things that could confound the result. Off the top of my head, your screenshots look like different pages between the A and B test. I'm not sure if that's how you ran the experiment or if you just happened to use two different page screenshots, but if the two variants really did differ by more than the change you meant to test, that would typically disqualify the result and require another test.




The way I'm seeing it: sure, the error bars are huge, but it's very unlikely to be a regression, and the team likes it better.


> screenshots look like different pages between the A and B test

I was also wondering about that.



