Why we’re pausing our Pay Later program (interviewing.io)
83 points by leeny on Sept 29, 2022 | hide | past | favorite | 55 comments



>We launched our Pay Later Program at the end of December 2021. As we usually do with new features, we A/B tested a few different variants, with the intent of increasing enrollment as much as possible each time.

Another example to add to the annals of A/B horror stories. Sample size too small, test duration too short, and extremely relevant KPIs like chargebacks or fraud complaints not included.

A/B the background color of the home page all you want, but A/B testing legal disclaimers is going to be, uh, risky.

And this is a very tired and boring thing to say, but startups are not AAAMM. If you're on a product team at Meta you can do ten A/B tests a week because tens of millions of people are using your product. If you're an early-stage startup looking for market fit, an A/B test really will authentically take weeks or months to get enough data. If you're on a short runway I could imagine it would be painful and unpleasant to just sit on your hands the whole time, but otherwise you wander down blind alleys like the above.


MASSIVE POINT

> If you're an early-stage startup looking for market fit, an A/B test really will authentically take weeks or months to get enough data.

Most of the time at startups you will never have enough data to parse out exactly why something is or isn't working. It's more like being an artist: you need taste to see where things are going.

Once you have that - then you can start to optimize with data.

Seen so many super-smart data guys at FAANG companies switch to startups and get stuck in the mud because they're looking at dashboards every day and wondering why they're seeing a data anomaly instead of looking at the bigger picture.

Sometimes there's not a huge reason why traffic/conversions/sales are lower. Sometimes it's just Thursday.


Would it be safe to say that A/B testing should only be used for things that can "afford to fail/be wrong"? I'm not saying don't test in other cases, but that this particular technique is not a good fit for situations where the consequences of a mistake are high. Some other kind of validation (rigorous or not) is necessary in those instances.


IMO it's only safe to A/B test things where the product vision doesn't matter one bit. Stuff like where to place the button on the screen to drive conversions on a landing page or what copy to use to target X market segment.

Edit: A better way to put that might be - only use A/B testing to make tactical decisions, but never use it to decide strategy or logistics.


What's the threshold of "affording to be wrong?"

A/B testing is a tool. It gives you data, not answers.

Would I make a color change off of 100 data points and a p value of 0.05? Well, multiply the risk of being wrong against the cost of being wrong. A 1 in 20 chance of being wrong isn't terrible when the cost of being wrong is low.

Would I make sweeping changes to my sales funnel at n=100 and p=0.05? Maybe not. n=10000 and p=0.001? Slam dunk yes. (And you should have probably halted the test earlier... imagine all the potential customers who got the "bad" leg and bounced?)

If results are currently at n=5000 and p=0.06001, and it's going to take 6 more months to hit n=10000, and you've only got 5 months of runway... Well then, give up on that test and do something else. Opportunity cost is a cost too!
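To make the arithmetic behind those n/p thresholds concrete, here's a minimal sketch (standard-library Python; the function name and the example numbers are mine, not from the thread) of the pooled two-proportion z-test that produces p-values like these:

```python
import math

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates.

    conv_a / conv_b are conversion counts; n_a / n_b are sample sizes.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # P(|Z| > z) for a standard normal, via the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

# A 5% vs. 6% conversion difference is pure noise at n=100 per leg...
print(two_proportion_pvalue(5, 100, 6, 100))          # p ≈ 0.76
# ...but decisive at n=10000 per leg.
print(two_proportion_pvalue(500, 10000, 600, 10000))  # p ≈ 0.002
```

Same effect size, wildly different conclusions — which is exactly why early-stage traffic volumes can't answer these questions quickly.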

At the end of the day, it's up to you to make an informed business decision using your best judgement. "Bounded rationality." All decisions are made with imperfect information and under time pressure. Like in TFA, if your chargeback rate spikes after a change, is it because the change was bad, or because you're expanding out of early adopters and into choosier (more annoying, more normal) customers? Or it's just a random Thursday and your sales volume is so low that every day is spiky and weird?

As @joelrunyon notes above, in the very earliest days, when your total sales number in the double digits, no A/B test in the world is going to be powered enough, short of "normal landing page/404 error". You just have to be good at your job, and lucky.


A/B tests are statistical tests, so you should pre-register an appropriate confidence level and stick to it. The width of your confidence interval should be commensurate with the risks you are taking.
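Pre-registering mostly means fixing your significance level, power, and the minimum effect you care about before the test starts, which pins down the required sample size. A rough sketch (standard-library Python, normal approximation; the function and its defaults are my assumptions, not the commenter's):

```python
import math
from statistics import NormalDist

def required_n_per_arm(p_base, min_lift, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-proportion test
    powered to detect `min_lift` over a baseline rate `p_base`."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)          # desired power
    p_alt = p_base + min_lift
    p_bar = (p_base + p_alt) / 2
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p_base * (1 - p_base)
                                   + p_alt * (1 - p_alt))) ** 2
    return math.ceil(numerator / min_lift ** 2)

# Detecting a 1-point lift over a 5% baseline needs roughly 8000+ users
# per arm — which is why early-stage traffic rarely supports these tests.
print(required_n_per_arm(0.05, 0.01))
```

Running the number before the test tells you up front whether the experiment is even feasible on your runway.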


Implicit in your question is that A/B testing is used to the exclusion of other decision-making tools. Definitely not the way to go.

The right starting point for thinking about A/B testing is that every experience you test should feel safe to put into production. That's what you're doing, but something about it being a "test" makes it not feel real to people. Maybe that threshold is low for some call-to-action copy; maybe it's much higher for something more critical. Use whatever tools you otherwise would to get to this point. Then you can A/B test to see if it actually works.

The second part is that in theory AB testing doesn't specifically drive to local optimizations (you can test and iterate on big changes) but in practice it sure does. It also tends to encourage short term thinking, maybe comically so, because those things are faster to test. It's local optimization on the time axis. "Changing the expected ship date for custom work from 90 days to 30 days increased sales 4% with no increase in support tickets!" and they ran the test for 4 whole weeks, so how could it be wrong? Maybe someone should tell manufacturing about this test and how great it went...


>Would it be safe to say that A/B testing should only be used for things that can "afford to fail/be wrong"?

No. Definitely not. I worked at a company that lent money for car loans. We A/B tested the HELL out of our lending strategies. Testing, data collection, dashboards, and monitoring were a HUGE part of our strategic advantage.


You've gotta A/B test the right thing. They A/B tested to increase sign-ups, but the outcome they really care about is getting paid. If they had designed the test right, the experiment would have failed due to the higher chargebacks/non-payments.


The fact that a service exists to make you better at interviews implies that interviews don't evaluate competency. If interviews evaluated competency then the way to get better at interviews would be the same as the way to get better at software development (books, projects, etc.). I feel like improving my competency is worthwhile but improving my interviewing is rent seeking.


I couldn't agree more. Most interviews assess your ability to interview well, not do the job well. Stanford even runs a class on the software engineering interview! I'm currently working on a product to help companies build realistic interviews, and I'm seeing some other startups focus on this too, so hopefully it will become a thing of the past.

The Stanford course, for the curious: https://web.stanford.edu/class/cs9/

The product we're working on: https://devscreen.io


Got a typo for you: https://devscreen.io/solutions/candidate-experience

>By choosing DevScren, you send a clear signal to candidates

Best of luck to you!


Great implementation for solving a problem that is often discussed but never executed! I have always thought that pair programming works better than whiteboarding.

Do you have any reference clients? Would love to interview at such companies.


Pair programming has been my most successful and least stressful way of interviewing. I can display all my skills in a short amount of time and, most important to me, in a realistic and fun way.

I don't feel so pressured when I interview through pair programming which helps a lot to showcase my normal self.

It's easy to display your technical skills (and any gaps in them), soft skills (such as communication, perseverance, thought process, etc.), problem-solving skills, and so on.

I'd much rather spend a 5-hour interview just pairing with different people than going through rounds and rounds of whiteboarding. Whiteboarding makes me feel like I'm a lab subject, being prodded in different directions while the experimenters take notes on their clipboards. It's an extremely artificial way to evaluate any real-life scenario.


Everything needs to be sold. Being good at what you do may or may not get you some local reputation, but you won't get as far as you could without being able to make people aware of yourself and your abilities.

Is it "rent seeking" to film a TV commercial for your plumbing business or to work on your online marketing skills? It probably doesn't make you a better plumber. It probably makes you more money and may even bring in opportunities you wouldn't have had access to that could lead you to develop skills.


I'm not sure if "rent-seeking" is the right term to describe it, but one inevitable thing about your hypothetical plumber is that while he may be making more money, his new customers are getting a worse plumber for the price (ads are expensive).

He will gain market power in his local market though, which could lead to his expansion and the closing of his least successful competition, bringing down area plumbing wages and allowing him to scoop up bad and inexperienced plumbers at a discount.

After that, we don't have to worry about his skills because he isn't working as a plumber anymore. He'll instead be paying bad plumbers bad wages, and keeping all the profits for himself. His desire will be to lower prices while expanding advertising, so the only variable to be played with is plumbers' wages vs. profit margins. He could then make high wage offers to the more skilled plumbers working for his competition in order to kill them off, and each time another independent shop dies, more of the wages will be shifted to profits.

Eventually, the market is served by the same set of plumbers as before (or likely fewer because some will leave the profession or the area), working at lower wages, but charging higher prices, whose excess goes directly to the shrewd marketer plumber. That's when our plumber enters local politics.

Maybe it is rent-seeking.


> Is it "rent seeking" to film a TV commercial for your plumbing business or to work on your online marketing skills?

Could be, depends


Wouldn't that depend on how much the plumbing business engages in actual plumbing business to drive success vs. how much it engages in marketing to drive "success"?

I've certainly seen both kinds of businesses before.


> The fact that a service exists to make you better at interviews implies that interviews don't evaluate competency.

I don't think so - it just means that your competency can be hidden, and it's possible to make it more visible to the interviewer.

For an extreme example if I go into an interview and remain mute, then I won't pass. If you teach me that I have to speak then I would do better in the interview. My technical competency hasn't changed between the two interviews.


What you said doesn't contradict what I said.


Read the definitions of recall and precision. Interviews focus on precision at the expense of recall. That doesn’t mean interviews don’t evaluate competency. It’s just that not every competent candidate can pass.
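In those terms, a quick toy calculation (Python; the numbers are made up for illustration) shows how an interview loop can score well on precision while recall suffers:

```python
def precision_recall(outcomes):
    """outcomes: (hired, actually_competent) boolean pairs, one per candidate."""
    tp = sum(1 for hired, comp in outcomes if hired and comp)      # good hires
    fp = sum(1 for hired, comp in outcomes if hired and not comp)  # bad hires
    fn = sum(1 for hired, comp in outcomes if not hired and comp)  # competent rejects
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical loop: hires 5 candidates (4 of them competent), while
# rejecting 5 competent candidates along with 10 incompetent ones.
toy = ([(True, True)] * 4 + [(True, False)]
       + [(False, True)] * 5 + [(False, False)] * 10)
print(precision_recall(toy))  # precision 0.8, recall ≈ 0.44
```

Few bad hires (high precision), many competent candidates turned away (low recall) — exactly the trade-off described above.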


Well, I meant interviews don't perfectly evaluate competency; there's certainly a correlation. Oh, and "read" the HN Guidelines.


> The fact that a service exists to make you better at interviews implies that interviews don't evaluate competency. If interviews evaluated competency

The only interpretation of this statement I can think of is “interviews do not measure competency.” That’s not true. They do, but they are biased toward high precision instead of high recall. That is, it’s better to pass on a competent person than to hire an incompetent one.

Regarding the guidelines, I’m not sure what you mean. One of the guidelines is to assume good faith, however.


But it does kind of explain why an employer might pay them to offer candidates mock interviews -- it makes the not-mock more predictive.


I'm not sure why I'm jumping in to defend interviews, but... this merely proves that interviews are not perfect at evaluating competency. Suppose that the result of a typical interview is X% determined by competency, and (100-X)% determined by other factors. It could be possible to provide a service that helps you quickly improve the "other factors". And yet if X is large enough, interviews might still be of some value for evaluating competency.


I don't think it's that interviews don't try to evaluate competency, but that the process has become relatively homogenized and it has allowed new incentives to come into play that target interview taking development rather than skill development. It may also be that companies use most of the interview process primarily as a weeding out of candidates, and then when you get to the final X number of candidates, it becomes a toss up essentially.


To be fair, even if interviews were perfect and exactly measured software development skills, that wouldn't stop enterprising entrepreneurs (/s) from offering services to help you prep for them anyways


Same with all arbitrary low-pass filters: SAT, ACT, GRE. Interviewers are trying to measure competency, but the only way you could possibly know for sure is to have the gift of prophecy, so they have to rely on proxies. But in a competitive job market where everyone is top tier, the game stops being "just go in and present yourself honestly" and becomes "study and practice giving off all the right signals, because the margins are tight."


Of course they don't directly evaluate competency. Just like a math test doesn't directly evaluate the ability to do math (real math, as done by mathematicians - not schoolwork). The hope is that you can evaluate some other thing that's positively correlated with the desired property and is easier to measure.

But you can study for the interview just like you study for the test.


Unfortunately the high-paying jobs are only looking for people good at interviews; everybody is doing LeetCode now.


It’s just shitty licensure at this point. A service like this is no different than an exam prep course, in spirit. Except unlike other professions, we get to take our medical boards / bar exam every time we talk to another company.


I've never heard of this organisation, and maybe it's because I don't work in tech but I don't really understand why they need to exist. But in any case, I'm entirely unsurprised to see the customer response to these changes. It comes off as incredibly predatory - AB testing the perfect, most "frictionless" experience which is obviously going to lead people to signing up without understanding they're committing to hundreds of dollars of future expenses. Gross.


There are many dark patterns out there but what they've described and the UI screenshots don't scream all that predatory to me. It looks like a good faith effort that went awry. They're certainly handling it well afterward.

I could have made the same mistake. Their customers are highly technical engineers applying to FAANGs and other top companies, I'd personally expect a greater understanding of terms and payment details from that cohort.


>Their customers are highly technical engineers applying to FAANGs and other top companies

They don't, as far as I can tell, vet their customers. They deliberately cast as wide a net as possible, so it was entirely foreseeable they were going to hook some less sophisticated users.


Exactly - this is a HUGE common problem at startups, describing their ideal customer who doesn't really exist and completely ignoring their actual customer base and what it is made up of.

The vast majority of the people signing up are going to be wannabe engineers - what percentage are actually highly technical needs to be determined.


Right, the issue is that they're going to hook people who are aspirationally looking to work at a FAANG and will not by any stretch of the imagination pass their interviewing process. They picked Pay Later because they figured they could pay for it later if they got the job, or given the funnel hacking discussed in the post, they never considered that they would have to pay for it later even if they didn't get the job.

This is a lower-scale and lower-cost version of the for-profit bootcamps and IT schools. Yes, if you go in with the tools you need to succeed, you will probably gain something and get a good start at a FAANG or similar. The programs cannot work miracles and have a vested interest in casting as wide a net as possible, even though that's a disservice to their customers. This service figured out they need to rein in that net to confirm people can actually pay their cost.


I wouldn’t describe most of the interviewing body in software as “highly technical”; that’s why there is such an issue finding qualified talent. I’d personally bet a good deal of people would not and did not understand what they were getting themselves into.

Tech has become popular because of the salaries, the amount of people interviewing has heavily increased, but the talent levels are quite diluted. Especially because any engineer worth their salt is off the market in under a month right now, if not shorter.


It shouldn't need to exist but unfortunately interviewing is still a separate skill that engineers need to learn.

People can get so caught up in the numbers on a dashboard or AB test that they forget that there's a human on the other end that just got duped into a 4 figure expense. Gross indeed!


Optimizing conversion rate is a best practice. Don't hate the player. Hate the game.



Due diligence is a best practice for lenders.


I appreciate the candor that Aline has shown in this write up.

>Practically speaking, I’ve learned that creating some amount of friction is necessary when you’re asking people to promise to pay you ~$1000 in the future. Removing that friction can create short-term wins but may hurt you (and disappoint your users) in the long run.

But that people are going to be confused by a "click this button and you owe us a grand" UI seems blindingly obvious. I suspect they got greedy and thought it wasn't going to be as big of a deal as it turned out to be.


Feels to me like a boiling frog situation - they probably started with something reasonable and made a series of incremental changes, each of which didn't make it drastically worse, but by the end it looked nothing like how it started.


I think that's an uncharitable way of reading this. Looking at the UI the TLDR is pretty clear, plus there's the friction of putting down a credit card.


It depends on what the screen before this one looked like. I'd definitely blow past the T&Cs they showed. So the previous screen would have to have been very explicit that you were agreeing to pay something.


You would put down your credit card on a website without reading anything on the page? Why?


Because most T&Cs are boilerplate nonsense

Because I have CC disputes as a backup

Because I'm lazy


OK but this wasn't a 40 page T&C, this was a text box with three statements. What would you think you were getting when putting in your credit card?


> Then hiring basically froze when COVID-19 happened, and to survive, we started charging engineers

Err, this doesn’t line up with any of my experience, nor that of any of my peers or the multiple employers I worked for throughout the pandemic. Maybe things slowed down for a month or so, but there was so much job mobility and choice it was bonkers!


The screenshots of the deal are in gray-on-white text, which looks like the writer doesn't want the reader to read it.

Additionally, they wrap up the gray/small print with the final word, on what the reader is supposed to think, in boldface, and delimited by triple asterisks: "94% of our users find a job within 4 months of starting to practice, and of those, the majority get offers from FAANG".

There seems to be a conscious component of persuasion in that, so I think no surprise if some customers later felt they'd been misled or manipulated.


The language really does sound like "pay only if/when you get a job".


I'm really amazed they got a 90% collection rate...

If you'd asked me to guess, I would have reckoned only 30% would pay.

A chunk will have used credit cards with no balance, or cards about to expire. A chunk will dispute charges. A chunk will never see your emails because they land in spam.


The sign-up form at first glance really does look like "we are giving you a $1024 free credit to get started on our platform"


I wonder how many of these users thought "It's just a web page, it's not going to send around a debt collector, I won't ever need to pay".

Those types of users tend to be put off by PDF documents you have to print, sign and return, because then they feel it's more likely that they will get a court summons for not paying...


Never mind Charles. Cash is king again.



