Net Promoter Score Considered Harmful (usejournal.com)
209 points by arnklint on Dec 26, 2017 | 91 comments



>We Can’t Reduce User Experience To A Single Number

This is the crux of the issue - CEOs do need a simple metric to summarize performance in each of their business units in order to evaluate performance and prioritize company resources. Metrics are only useful for decisions if they are easy to understand and consistently measured. Just as Earnings is used to measure profitability, and Revenue is used to measure growth, NPS is the best they've come up with so far for customer experience. Of course none of these 3 metrics is perfect, but they still need to be measured and complemented by more detailed reporting. To better understand Earnings you look at changes in the component parts, revenue and cost. To better understand NPS you look at the mix of scores (1s, 2s, 3s, etc.) and follow-up questions like actual referrals, which differ by business (unlike NPS, which can be applied more broadly).

The author devotes most of the article to pointing out that there are areas where follow-up questions would give a richer understanding of customer experience, which of course all NPS proponents would agree with.

What he could do a better job of is convincing us why executives shouldn't even attempt to summarize their company's performance in customer experience, just as they summarize lots of other complex activities to report to shareholders. Too many companies focus only on hard metrics like revenue and profits, which is why they find NPS a helpful way to steer the company's focus back to the customer.


Well, let's let the author try to convince you that NPS is a harmful, horrible number to summarize a company's performance on.

He would tell you that NPS is only like earnings or revenues if we allowed either to have 50% or more of their data filled with arbitrary numbers, not audited data collected from state-licensed specialists who would lose their jobs if it was discovered the data was manufactured out of whole cloth.

The author would also tell you that NPS is easily gamed and there's no check on whether that has happened. He wrote extensively in the article about the various techniques folks can use to game the numbers. If this is a number reported to shareholders, shareholders should insist (No, Demand!) that the numbers be corroborated by a neutral third party that will accept liability for any errors. (No surety insurer will guarantee such a liability; the risk of error or misrepresentation is far too high.)

As you stated, most use follow-up questions to get a richer understanding of the customer. What the author would tell you is that the NPS recommendation question clearly taints those follow-up questions and diminishes their validity and inherent value. If the true goal is to gain a richer understanding of customer experience, there are many better ways to achieve it.

In other words, the author believes that if executives want a simple metric that is better than NPS, a random number generator is the fastest and cheapest way to achieve it. Why bother with customers at all, if all you're going to do is squander your interaction with them on such a foolish metric?

— The author.


>if executives want a simple metric that is better than NPS, a random number generator is the fastest and cheapest way to achieve it

>In fact, NPS measures nothing in particular.

These types of sweeping statements aren't a helpful way for the author to advocate his broader point.

The strengths of the article lie in the more detailed points, which bring to light some great examples of how NPS, and surveys in general, are misused (and some examples that don't actually manifest as real problems in daily use very often).

My thoughts on the examples chosen above:

>50% or more of their data filled with arbitrary numbers

I'm not sure what this refers to, but the way the NPS equation works, every respondent's score matters and mathematically impacts the overall NPS (each one feeds in as either a promoter, neutral, or detractor). Some of the author's own recommended questions also offer only 3 answer inputs.

>Not collected from state-licensed specialists

Almost all operational data collected by companies for management reporting is not collected by state-licensed specialists, but is still useful.

>Easily gamed

All survey questions could be gamed in ways similar to the article's examples. A good executive will make sure the survey is asked of his own organization in the same way as of peers, and in the same manner over time. He or she won't let agents do things like cherry-pick which customers to survey; otherwise the money invested in the survey won't actually help run the company.

>Shareholders should insist the numbers be corroborated by a neutral third party

Correct; companies that are serious about measuring NPS accurately do often hire third parties to run surveys (see the example of JD Power running NPS: http://www.jdpower.com/press-releases/bain-certified-net-pro...)

>NPS recommendation question taints followup questions

All survey questions can be tainted by preceding questions. When writing a survey it is fairly straightforward to A/B test the order to make sure this isn't a major factor.


> For some reason, NPS thinks that a 6 should be equal to a 0. Nobody else thinks this. Remember, if you worked at a company like Intuit, all that hard work to get everyone to move from a 0 to a 6 would not be rewarded. Your executive would not get their bonus. It’s as if you didn’t do anything.

This seems perfectly reasonable to me. Outcomes matter -- not effort -- and reaching 6 is not the outcome NPS wants.


Separately, the distribution will never be that narrow in practice. Once the highest rater reaches 7, NPS will start improving. The author even states himself that the input has noise, so the "everyone's a 6" argument is a straw man.


I think it depends on how the organisation is using NPS. If the company itself is cheating on NPS, then it’s cheating itself.

It’s like “Capital A” Agile. Sure, it’s easily gamed, but you’re only screwing yourself over when you do that.


I'm sure all the executives who are getting bonuses based on NPS improvements feel that gaming the system is cheating themselves.

How many executives get bonuses for a gameable version of Agile?


Again, it comes down to the company, I guess. I’ve never worked at a place where execs get bonuses based on NPS.


Not convincing, considering executives often set their own goals.


> let's let the author try to convince you that NPS is a harmful, horrible number to summarize a company's performance on.

None of your arguments here are based on data. Do you have some evidence that a measured NPS score proved that the metric is bad? The WP link you posted to criticisms is all arguing relative merits. None of them are particularly strongly opposed, and none claimed that NPS doesn't work.

> The author would also tell you that NPS is easily gamed

Do you have data showing NPS scores being gamed?

Easily gamed and actually gamed are two completely different things. Having tried to measure NPS before, I found that 0 people appeared to be gaming the system; my customers told me honestly that my product was mediocre.

To suspect that the polls are being gamed, you assume there's something in it for the respondent, right? What benefit do you think there is for respondents to answer dishonestly?

> In other words, the author believes if executives want a simple metric that is better than NPS, a random number generator is the fastest and cheapest way to achieve it.

I hate to say it, but this kind of hyperbolic statement is having the opposite of the intended effect; it reflects poorly on the author.


> Well, let's let the author try to convince you that NPS is a harmful, horrible number to summarize a company's performance on.

This is not how it's used in practice. Meaning: No company measures performance based solely on an NPS metric. NPS is one data point among many used to measure company performance.

There's no more sense in demonizing NPS as there is in worshipping it.


In practice, I have worked at several companies that have only used NPS as a metric.


There seems to be a silly meme, often directed at "reductionist" engineer types, that taking a measurement makes you understand less.

Nope. Feynman phrased it well:

"Poets say science takes away from the beauty of the stars - mere globs of gas atoms. I too can see the stars on a desert night, and feel them. But do I see less or more? The vastness of the heavens stretches my imagination - stuck on this carousel my little eye can catch one - million - year - old light. A vast pattern - of which I am a part... What is the pattern, or the meaning, or the why? It does not do harm to the mystery to know a little about it. For far more marvelous is the truth than any artists of the past imagined it. Why do the poets of the present not speak of it? What men are poets who can speak of Jupiter if he were a man, but if he is an immense spinning sphere of methane and ammonia must be silent?"


I work as a consultant to PE companies as a source of cash income while doing other entrepreneurial stuff. Much of this work involves pre-deal commercial diligence and/or post-deal portfolio company strategy review. In the context of this type of work, which often involves formal customer / partner interviews, I think NPS is actually a very useful metric.

For sure, NPS has its limitations. Broadly speaking, I think it's a bad metric for something like asking random strangers how much they like their iPhone. But if you're trying to figure out the strength of, for example, an ERP software company's customer franchise, NPS is really useful, for two reasons:

1. When you're talking to people who know a product well, and you're talking to them in a professional context (e.g., a scheduled 15-minute interview that is part of their work day), they tend to be thoughtful about the 0-10 rating.

2. The follow-up question traditionally used in NPS surveys, "What could [company] do to make that score a 10 instead of [number]?" almost never fails to provide valuable feedback. I believe this is because of the way the question is set up: first the respondent is asked to provide a broad indicator of their customer satisfaction, and then, with that number identified, they're asked for details as to why they chose that number. I have no background in psychology but I think it's pretty intuitive that this question-flow focuses the mind and prompts people to provide insightful constructive recommendations.


To your 2nd point, there is some danger there with big-ticket products like ERP software.

Asking "What can we do to make that score a 10" of existing customers is fine as long as you've got it balanced with some other work to help you gauge the thinking of the portions of the market that aren't already sending you a 6 figure annual check. But I bet many companies are lulled into a false sense of security by numerous answers to that sort of question that just request minor improvements to existing functionality. Probably frequently given by people who answered 9 to "would you recommend?" and then migrated to some hip new cloud offering a year later.


In terms of gaining/maintaining a holistic understanding of a given market and the competitive opportunities within it, I fully agree with your point that understanding the needs of one's existing customers is necessary, but not remotely sufficient.

That broader context is outside the scope of the work I usually do, though -- I'm usually focusing much more specifically on understanding the strength of existing customer franchises.

Certainly, two customers who give 9-out-of-10 scores might have very different levels of satisfaction. Sometimes it's a 9 because "I love the product and the customer service is great but on principle perfection is impossible and I don't give 10s" and sometimes it's a 9 because "the product is pretty good and does everything we need, but our main sales rep is really hard to get ahold of". This is why, in practice, I find that more insight comes out of the "why not 10" questions than the pure NPS number.


>Any normal statistician would just report on the mean of all the scores they collected from respondents

What? No!!! Any self-respecting statistician would know that the mean of a quantity where addition is not well-defined is meaningless.


Wow, yeah, agree. It's not the only point in the article, and some of the other ones are fairer criticisms, but this line was a bit painful.

Use the median!!!


I winced as well! But at least he grabbed an actual mathematical concept used to describe a set of data... as opposed to whatever NPS is using.


It's not just mean and median - it's proportions. The metric is about growth, so it's specifically concerned with the top of the distribution.


> For reasons never fully explained

The reason is obvious. The entire theory is that for your business to be successful you need to have exceptionally satisfied customers who will PROMOTE your business.

All you have to do is read the name of the thing to understand exactly how and why it works, which apparently everyone besides this author can do.


Author is arguing it doesn't work. You're begging the question.


I've never heard this before, and a skim of the wiki page doesn't mention it as a prerequisite. Mind explaining? The scores are just integers, so the addition is well defined. So you're saying that the context is what's relevant?

Not that the mean is the only (or even the most useful) statistic.


Well, take decibels for example. They are a log-scale physical intensity, so averaging them makes no sense whatsoever (e.g. absolute silence is negative infinity dB). I would argue these scores are more like labels than actual numbers (i.e. a 3 plus a 5 doesn't really equal an 8 in any real sense). You can of course take the mean of any collection of numbers, but I've heard many a statistician lament such careless practices. The median is at least more easily interpreted for cases like this.


Ah, thanks. The decibels example makes sense (an alternative would be to convert back to linear intensity first, and then re-take the log after averaging?), and I can see how the 0-10 system can also be viewed as categorical rather than discrete.
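Concretely, here's what I mean, as a minimal Python sketch with made-up values (the naive dB mean vs. averaging in the linear power domain):

    import numpy as np

    # dB is a log scale: value_db = 10 * log10(power / reference),
    # so dB values aren't additive and a naive mean is misleading.
    db_values = np.array([0.0, 60.0])   # near-silence and a loud conversation

    naive_mean = db_values.mean()                 # 30.0 dB
    powers = 10 ** (db_values / 10)               # 1 and 1,000,000 (relative)
    power_mean_db = 10 * np.log10(powers.mean())  # ~57.0 dB

    print(naive_mean, power_mean_db)  # the two "averages" differ by ~27 dB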


It's because the NPS rating numbers are ordinal, meaning that you can put the rating numbers in order, but the likelihood gap between the numbers may not be equal.

For example, 6 on the NPS scale would be less likely to recommend compared to 7, and 7 would be less likely than 8. However, the gap between 6 and 7 and the gap between 7 and 8 may not be equal. If you were to take the mean of 6, 7, and 8, you would get a value of 7, but there is no guarantee that the average of the participants' likelihood to recommend was actually equal to a 7.

Taking the mean would work if the intervals between the numbers were equal. See here for details: https://en.wikipedia.org/wiki/Level_of_measurement


> Any self-respecting statistician would know that the mean of a quantity where addition is not well-defined is meaningless.

There goes the Dow Jones average.


> the Dow Jones average

The Dow Jones was invented in the 1890s to be easy to calculate. It is widely panned and is not used by anyone in industry for anything serious.


NPS is first and foremost a marketing tool for Bain & Co to sell expensive consultants and def not "the one question to ask".

Several marketing scholars have pointed out the psychometrically undesirable properties of the metric, and there is conflicting evidence on, for example, the metric's predictive validity in relation to company success.

(All in all, that old HBR marketing article contains very little detail to back up the claim that it is a good and valid metric.)

If you can go with behavioural outcomes for measuring success, e.g. purchases, I think that that will always be more powerful than what a user says in a survey.

And if you can then causally link (e.g. through experiment) something you can influence (e.g. some dimension of "UX") to that outcome, you actually have a quite good decision tool too.

NPS does not provide that.


I regularly screw these scores up by honestly answering the question I was asked.

> On a scale of 0-10, how likely are you to recommend Netflix to a friend or colleague?

0

> You answered 0 to the last question. Why wouldn't you recommend Netflix?

It's 2018, people. Everyone I know either already uses Netflix, used to use Netflix, or doesn't own a computer. If I started going around making a recommendation like that, they would think I'm a prat.


I think the methodology assumes that only a small fraction of people are pedantic enough to do that.


I gave an extreme example for comedic value.

The kind of situation I'd expect to see more often in real life is that NPS scores are inflated for fun luxury goods, because they naturally inspire more enthusiasm, and deflated for essentials, because very few people are even capable of getting excited about socks.

Mostly I was meaning to hint that where I've seen things like NPS go off the rails is when people assume they measure what the marketing pitch says they measure. The reality is invariably more subtle.


I had a similar call just after a 10-day stay for a kidney transplant: they rang up to ask how likely I would be to recommend the hospital to someone!


The English Friends and Family Test is a large-scale implementation of this kind of scoring system.

Data is available here: https://www.england.nhs.uk/fft/


Recommendations are also skewed by the needs and interests of the pool of friends a customer has. I'd recommend (and brag about) a good computer keyboard or monitor to almost anybody I know, because 90% of my acquaintances spend a lot of time in front of a computer; I can make an educated guess about the small minorities who might be interested in a certain restaurant (location-constrained) or a board game or a bootleg CD; but I don't think I could "promote" to anybody things like old roleplaying games (I'm the most eager collector I know) or a graphic tablet (the handful interested people I know already have a good one).

And what about the buying potential of one's friends? Consider plants: students could help sell their student friends something inexpensive that fits in a small vase in a small dorm room, farmers could help sell other farmers enormous amounts of product.


There are a handful of really good sock companies out there right now which I regularly recommend:

* Darn Tough

* Stance

* Bombas

NPS +10


Also the "Why did you give that rating" follow up question can help spot whether a significant number of people interpret the question differently. I've seen surveys for financial institutions where a lot of people write "I love my bank but I don't feel comfortable recommending financial products to people" which is another example of an individual's score not accurately representing their experience as a customer.


I used to do that; then I realised that a minimum wage call centre employee's salary/bonus/happiness probably depends in some small way on my answer, so I stopped being clever and just gave the answer they're clearly asking for.

(I do keep an exception for NPS surveys on automated interactions. No, Barclaycard, I will absolutely not recommend you to friends based on the text message you sent me about my increased credit limit, are you literally insane?)


It’s a truly broken system. I always answer 10 for call center employees, because often they’re not empowered to actually fix anything, and answering anything else often hurts that person instead of shedding light on the broken system they’re stuck in.

The only time I answer less than 10 for a CS person is if they’re actually rude or not trying to be helpful. I much prefer the surveys that ask multiple specific questions that split the interaction between the rep and the outcome.


You are not screwing it up as much as you think. A zero rating has the same impact as a 6 on NPS.

0-6 counts as a -1

7-8 counts as a 0

9-10 counts as a +1

NPS is an average of these numbers, so if every user gave a seven or eight, the NPS would be zero (which is a mediocre but not terrible NPS score).

https://cdn.business2community.com/wp-content/uploads/2016/0...


It's actually the percentage of (+1) minus the percentage of (-1). 7-8's are literally not being counted (as you said 0's). Imagine if 990/1000 give 7 or 8. Out of the remaining 10, 7 give a 9 and 3 any number below 6. The count would be: 70%-30%=40 (which would be a good outcome). But only 1% of everyone that participated ended up determining the actual NPS Score.


I don't think that's right, and the example given in the article seems to agree with me. The 7s&8s are indeed counted, indirectly: the score is the percent of all responses (including 7s&8s) that are promoters minus the percent of detractors.

So, using your example, 990/1000=99% of people are passive, 7/1000=0.7% are promoters, and 3/1000=0.3% are detractors, so the NPS score is 0.7-0.3, or 0.4 (not 40). Passives aren't a part of the last calculation, but they affect the percent values for promoters and detractors.
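In code, the arithmetic looks something like this (a minimal Python sketch; the sample data mirrors the hypothetical numbers above):

    def nps(scores):
        """Net Promoter Score: % promoters (9-10) minus % detractors (0-6).
        Passives (7-8) drop out of the subtraction, but they still dilute
        both percentages because they count toward the total."""
        total = len(scores)
        promoters = sum(1 for s in scores if s >= 9)
        detractors = sum(1 for s in scores if s <= 6)
        return 100 * (promoters - detractors) / total

    # 990 passives, 7 promoters, 3 detractors, as above:
    print(nps([8] * 990 + [9] * 7 + [0] * 3))  # 0.4, not 40

    # It also shows how NPS diverges from the mean: both samples average 5.
    print(nps([0, 10]))  # 0
    print(nps([5, 5]))   # -100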


That’s actually a fair answer. The quantitative aspect of NPS is to project growth. And the text is for useful verbatims.


Not sure that's a terribly uncommon scenario. Maybe a little dramatic, but not uncommon. This whole idea of recommending needs a little bit more context to be valuable.

How about "Your friend has decided to ditch cable and try using a video streaming service. How likely are you to recommend Netflix?"

It's even more critical with services that have very specific use-cases. How about a dermatologist? I don't think many people would say they recommend this type of service without some context. So, "If someone you know develops a similar condition, how likely are you to recommend DermatologyCo?"


You're presenting a bit of an extreme, pedantic example, but I think you're right in a more general sense. There are plenty of things that I wouldn't recommend to others simply because they're niche or are just in a category of things that wouldn't come up in normal conversation (perhaps they're just mundane every-day things). I might be 100% satisfied with something, but that doesn't mean I'm going to go telling everyone (or even anyone) about it.


Reminds me of the internal surveys at my company, but I don't think we're given room to elaborate.

> How likely are you to refer a friend to a position at <company>?

The answer is 0. That doesn't mean I think it's a terrible place to work, it means that I don't know anyone who is both interested and qualified.


My feeling about NPS has always been that the very wide rating scale polarizes users towards the two extremes; most people aren’t analytical, and they don’t build up a mathematical way to objectively come up with a number between 0 and 10. They instinctively vote 10 if they’re happy, 0 if they’re angry, and something in the middle if they’re not fully satisfied. The way they map partial satisfaction to a number is unscientific, and this is why NPS only rewards near-perfect scores (9 or 10).

Similar findings were made by Google with the YouTube rating system, which was switched from 1-5 stars to a simple upvote/downvote after realizing that most users polarize towards the extremes: https://techcrunch.com/2009/09/22/youtube-comes-to-a-5-star-...


Recent experience with NPS, both for internal use (are the employees happy?) and with users, parallels this. Somewhat happy users appear to start with a score of 10 and discount points based on various negatives. Unhappy users start with 0 and only adjust upwards, albeit with more inertia.

To be fair, everything I've read about NPS says don't use it in isolation. The article covers this with the "why?" question. You need to follow up with more detailed questions to get any real benefit from it. But that reduces the survey to a simple tool for flushing out your unhappy users. It gets more complicated because, like unhappy families, everyone is unhappy in their own way.

Some additional things that muddy the waters:

1. Cultural fit - not giving a good score is seen as impolite. As a result you have to be pretty damned unhappy before considering anything else. When your users have reached that point you've probably lost them regardless of what you do.

2. Time of use - new users will be enthusiastic and give good scores because new users are enthusiastic and give good scores. By the time they have used the platform for long enough to make a valuable contribution they have generally lost the will to tell you in detail why your product is great or sucks.

3. Sample size, sample size, sample size. The math here is not precise, but it's precise enough to tell you if your sample size is too small. The main effect of too small a sample size is that the variation in the score is so great it is impossible to infer anything, and you may as well abandon the approach lest you end up chasing shadows trying to make something better. (A rough way to gauge that variation is sketched after this comment.)

4. Internal use - simply don't do this - ever. There's a vast amount of politicking and employee uneasiness associated with this. Nothing is ever anonymous no matter what is said publicly.

So, in the absence of anything else NPS is probably a reasonable measure if only to generate a conversation internally and for trying to gauge whether your efforts are moving the needle and in which direction.
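On point 3, here's the sketch mentioned above: a rough percentile bootstrap for the sampling error of an NPS estimate (Python; the sample data is made up for illustration):

    import random

    def nps(scores):
        n = len(scores)
        return 100 * (sum(s >= 9 for s in scores) - sum(s <= 6 for s in scores)) / n

    def nps_interval(scores, n_boot=10_000, alpha=0.05):
        """Rough percentile-bootstrap confidence interval for NPS."""
        stats = sorted(
            nps([random.choice(scores) for _ in scores]) for _ in range(n_boot)
        )
        return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2))]

    # With only 30 responses, the interval is wide enough to swamp most
    # month-over-month movement in the score.
    sample = [9] * 10 + [8] * 10 + [5] * 10
    print(nps(sample), nps_interval(sample))  # 0.0 and roughly (-33, +33)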


This is why I almost stopped reading when the article suggested a "normal statistician" would just use the mean score (it does later address this, for those who haven't read it yet).

IMO, NPS doesn't provide a good enough signal to be "the one number" you should care about, but it recognizes the actual numbers don't matter all that much. I've seen the same question used with a 5 point Likert scale which I imagine is easier to work with. You'd still have to deal with people responding like "0/10 can't think of any friend I'd recommend this to" :/


There's a lot to unpack in this article, but one thing caught my eye. In the "The Ultimate Question 2.0" book, the author claims that NPS needs to be considered as a relative measure against competitors. So if we follow this advice, United needs to aim to have the highest score across all airlines. To give an example, an NPS of -20 is decent if competitors are below -30.


I like the idealism of this, but I highly doubt reality would follow through. At least not until organisational transparency is proven to be a winner in Business degree courses.


> We Can’t Reduce User Experience To A Single Number

Growth is a single number, and NPS is measuring growth, not UX.

I'm no big lover of NPS, but this analysis is awful! He claims it's bad because he doesn't understand it. That's not very scientific either.

I'm not sure NPS works well at all, but the idea behind it is obvious. It's a growth metric. The goal of NPS is to tell you how many new customers an existing customer will refer to you over the lifetime of their account.

This is just like population statistics. NPS is trying to measure your customer birth rate by asking how many customers are (or intend to be) pregnant. It's not an accident that there are only two thresholds, and it's wrong to conclude that these thresholds indicate a problem with the method.

If a customer recommends less than one new customer during the entire time they're with you, then you have a replacement rate less than 1, and you're losing customers over time. If you have a replacement rate between 1 and 2, and your customer lifetime is long (say, years), then you aren't growing fast. If your replacement rate is 2 or higher, and your customer lifetime is short (say, weeks), then you are growing more than 200% month over month, virally, without the need for marketing.

What the people who designed NPS did, I am sure (meaning I'm speculating, but giving the strongest possible interpretation), is measure some responses and compare them to the number of actual referrals, then drew the lines where the referral rates cross from negative growth to neutral growth, and from neutral growth to positive growth. That's what I would do. And it seems plausible that people who give a score of 6 or less won't end up referring anyone, on average.

Sadly, the article doesn't conclude with any real alternatives for measuring growth. Since NPS is an indirect growth metric, the better answer may be to simply measure your growth directly. That means understanding engagement and activity, not just counting how many accounts exist, but other than that, counting your active customers is a single number that will reliably tell you growth, and can't be gamed by your customers -- it can only be gamed by yourself and your team.
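For what it's worth, the replacement-rate intuition above can be sketched as a toy cohort model (Python; every number here is hypothetical, and real referral behavior is messier):

    def simulate(initial, referrals_per_lifetime, lifetime_months, months):
        """Each customer stays `lifetime_months` months and refers
        `referrals_per_lifetime` new customers, spread evenly over that stay."""
        cohorts = [float(initial)]  # cohorts[m] = customers acquired in month m
        monthly_rate = referrals_per_lifetime / lifetime_months
        active_history = []
        for m in range(months):
            active = sum(cohorts[max(0, m - lifetime_months + 1): m + 1])
            active_history.append(active)
            cohorts.append(active * monthly_rate)
        return active_history

    # Replacement rate below 1 decays; above 1 it compounds.
    print(round(simulate(1000, 0.8, 12, 48)[-1]))  # shrinking customer base
    print(round(simulate(1000, 1.5, 12, 48)[-1]))  # growing customer base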


> Growth is a single number, and NPS is measuring growth, not UX.

Connect the dots for me on how NPS measures growth. Where does it tie to growth at all?

> NPS is trying to measure your customer birth rate by asking how many customers are (or intend to be) pregnant.

Horrible analogy, but ok. I'd say, if there's any equivalent, it's asking how many people think it's likely they might ever get pregnant.

> What the people who designed NPS did, I am sure (meaning I'm speculating, but giving the strongest possible interpretation), is measure some responses and compare it to the number of actual referrals, then drew the lines where the referral rates cross from negative growth to neutral grow, and from neutral growth to positive growth.

They didn't do anything like that.

> And it seems plausible that people who give a score of 6 or less won't end up referring anyone, on average.

It does seem plausible. It isn't validated by any science, but it's certainly plausible. (Like the earth is plausibly flat.)

> Since NPS is an indirect growth metric, the better answer may be to simply measure your growth directly.

Agreed!


>> What the people who designed NPS did...

> They didn't do anything like that.

Now I'm really confused by your statements. I just read the link to the original source that you posted on hbr.org. What Reichheld described is exactly what I said above: he correlated survey responses against actual growth rates, and drew the lines between negative and positive growth rates. Not only that, he asked the question multiple different ways, and found out which question statistically landed the most accurate answers.

Why are you claiming they didn't do that? Are you saying the article is lying about the data they used to come up with NPS?


I'm not defending NPS. But your first and biggest argument in the article is unscientific and anti-statistical. You're making an emotional case that it looks weird because there are thresholds. You said "For some reason, NPS thinks that a 6 should be equal to a 0." and "Make that data set to be all nines: 9, 9, 9, 9, 9, 9, 9, 9, 9, and 9. The average is 9. And miraculously, NPS is 100!" Your reasoning here is faulty. You threw in sarcastic irrelevant comments about bonuses to make the idea of getting your NPS score wrong feel like it'll do damage.

Instead of investigating the possibly legitimate reason NPS people might be doing this, you put up a straw man argument about all respondents giving the same score. The likelihood of all respondents in a large survey giving 8's is very, very close to zero. The likelihood of your NPS score suddenly flipping from 0 to 100 is very, very close to zero.

So you got my analogy and suggested an alternative, but you still don't see how probability of referral (or birth) is an indicator of business (or population) growth? You do seem to get it, so I don't understand what you're missing. I'm not sure how to (or if I need to) explain it better.

Polling a bunch of people how likely they are to refer a friend is like sampling the derivative of the growth function you want to estimate. If everyone responds accurately and tells the truth, and they refer people at the rate they said they would, you can use the data to predict your growth.

The fact that NPS puts the negative-growth line at 6 (60% of the scale) says, to me, that they concluded that people inflate their self-reported referral probabilities.

There is a mapping between what people report, and what they do. NPS might have the mapping wrong, but there is a mapping. I don't expect the NPS mapping to be very accurate, but if it's wrong I'd like to hear why. You haven't explained why it's wrong because you don't seem to understand why it might be right.

> Horrible analogy, but ok. I'd say, if there's any equivalent, it's asking how many people think they are likely they might get pregnant ever.

I don't understand what you're arguing (or why); you're splitting a very fine hair here, and the difference between what you suggested and what I said is subtle at best. The NPS question is how probable you are to recommend this service to a friend. Someone with a low probability is likely to recommend to 0 friends. Someone with a (self-reported) medium probability may be likely to refer 1 friend. Someone with a high probability may be likely to recommend 5 friends.

Asking a yes/no question about how likely one is to ever get pregnant would be a worse proxy for population growth than asking how many pregnancies you expect in your lifetime. The NPS question doesn't exactly ask either of those; it can be interpreted either way.

> They didn't do anything like that.

So what did they do? Your post ignores that question and argues it's purely bogus. I don't even know what they did, and I don't buy that NPS is pure fiction with nothing at all to back it. I totally would buy that the NPS scale was based on a small sample, and that it doesn't fit many companies very well.

> Like the earth is plausibly flat.

Not sure I get where the snark here is coming from. There exists an average response to this survey that is between 1 and 10 where below that number, statistically people will not refer anyone. What is that number? Why does 6 seem as plausible to you as the earth being flat?


You're giving the people behind this metric A LOT of credit if you think they picked a precise cutoff based on historical evidence, rather than "hmm, 2 and then another 2 sounds good".


I think it's dangerous to fail to give any benefit of the doubt at all, and to assume they pulled a number completely out of their asses.

I am giving them the benefit of the doubt, I assume it was based on more than a guess.

Now, instead of assuming, I'm going to look it up.

It turns out they described what they did, and it seems that it was based on some actual data:

"So what would be a useful metric for gauging customer loyalty? To find out, I needed to do something rarely undertaken with customer surveys: Match survey responses from individual customers to their actual behavior—repeat purchases and referral patterns—over time. I sought the assistance of Satmetrix, a company that develops software to gather and analyze real-time customer feedback—and on whose board of directors I serve. Teams from Bain also helped with the project.

"We started with the roughly 20 questions on the Loyalty Acid Test, a survey that I designed four years ago with Bain colleagues, which does a pretty good job of establishing the state of relations between a company and its customers. (The complete test can be found at http://www. loyaltyrules.com/loyaltyrules/acid_test_customer.html.) We administered the test to thousands of customers recruited from public lists in six industries: financial services, cable and telephony, personal computers, e-commerce, auto insurance, and Internet service providers.

"We then obtained a purchase history for each person surveyed and asked those people to name specific instances in which they had referred someone else to the company in question. When this information wasn’t immediately available, we waited six to 12 months and gathered information on subsequent purchases and referrals from those individuals. With information from more than 4,000 customers, we were able to build 14 case studies—that is, cases in which we had sufficient sample sizes to measure the link between survey responses of individual customers of a company and those individuals’ actual referral and purchase behavior.

"The data allowed us to determine which survey questions had the strongest statistical correlation with repeat purchases or referrals. We hoped that we would find at least one question for each industry that effectively predicted such behaviors, which can drive growth. We found something more: One question was best for most industries. “How likely is it that you would recommend [company X] to a friend or colleague?” ranked first or second in 11 of the 14 cases studies. And in two of the three other cases, “would recommend” ranked so close behind the top two predictors that the surveys would be nearly as accurate by relying on results of this single question."

https://hbr.org/2003/12/the-one-number-you-need-to-grow
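To illustrate the methodology that quote describes, here's a sketch of the "match survey questions to later behavior" step on synthetic data (Python; the question names and the data-generating assumptions are mine, not Reichheld's actual dataset):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500

    # Stand-in behavioral record: referrals observed 6-12 months post-survey.
    referrals = rng.poisson(1.0, n)

    # Stand-in survey answers (0-10), each tied to behavior to a different degree.
    questions = {
        "would_recommend": np.clip(referrals * 2 + rng.normal(4, 1.5, n), 0, 10),
        "overall_satisfaction": np.clip(referrals + rng.normal(5, 3, n), 0, 10),
        "deserves_my_loyalty": np.clip(rng.normal(6, 2, n), 0, 10),
    }

    # Rank each question by how well answers correlate with actual referrals.
    for name, answers in questions.items():
        r = np.corrcoef(answers, referrals)[0, 1]
        print(f"{name}: r = {r:.2f}")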


If you’re going to ask what someone _will_ do, it’s best to keep it to a simple binary decision and always remember that that’s only representative of a single moment in time/the current customer service cycle for that individual.

As this article identifies correctly, the more important question is the harder-to-quantify “why?”.

Armed with this, the fluctuating and heavily biased score will have more meaning and context. Using that data to improve the service/product for the next cycle may do more to realistically reflect sentiment change than any weirdly random maths.

Of course, this still ignores what someone actually does do. But it’s possible to track certain cases of referral and actually to encourage trackable referrals using the correct approach. This is a more solid measure (if possible) than any hypothetical gauge of future behaviour - again as the article points out.


Let me first say that the burden of proof here lies with Fred Reichheld (or whoever else champions NPS now) to defend NPS from the claims in the cited papers.

I totally agree that NPS feels like placing too much faith in something magical. However, I don't feel totally convinced by this article; when people calculate NPS, do they really ignore any other analysis on the raw input (i.e. all 0's vs all 6's)? Yes, there are totally exceptional datasets that make the NPS look like it hasn't moved; has that been a problem for people in the wild?

Also, is picking one ecommerce customer out of the data enough to say that there's no correlation between NPS and future behavior?

I've seen NPS used as a way of keeping a pulse on a community; if it drops sharply, something is clearly wrong in a way that normal monitoring can't surface.


> I've seen NPS used as a way of keeping a pulse on a community; if it drops sharply, something is clearly wrong in a way that normal monitoring can't surface.

If this is your goal (and it's a good goal), there are way better questions than NPS to use here. I'd go with a simple "How did we do today?" question, versus the convoluted NPS mechanism.


Yeah, probably. It's the sort of thing where someone's going to want to know your NPS anyways, so if you're collecting that data you may as well break it apart a little. And by no means do I think you shouldn't be doing other sorts of user research.


NPS is great when used and managed correctly. Unfortunately most companies send out an NPS survey and ONLY look at the score. The score is a lagging indicator of the company taking time to actually read the comments gathered from the NPS survey and act on them to improve the customer experience.

NPS is not harmful, poor leadership and failure to correctly manage a feedback program is.


Exactly. Also, many execs make the wrong comparisons with the score (comparing across industries, for example).

The real lessons are in the verbatims.


You are exactly right. Most criticisms of NPS as a system are based on a single component (overall score) or bad execution.


I think the NPS question could benefit from some more specific answers, such as:

- Yes, I would bring up how much I like [product]

- Yes, if the subject of [product category] came up

- Yes, if my friend had a problem that [product] could solve

- No, because I don't care enough about [product category]

- No, because I don't like [product] enough

- No, because I dislike [product]


A long article with a mix of good and some poorly informed arguments, that ends up a sales pitch for a UX workshop.


The article ended and then there was an advertisement in italics; I didn't read the article as an advertisement for their book so much as just a comprehensive slamming of NPS.

Which is a terrible metric that pretty much nobody should use.


I agree with that; I think it's an interesting tactic to increase the visibility of your product.

1. Create a strongly worded post that attracts attention

2. Attach to this post your sales pitch.


I don't know if this tactic applies in this case. Jared Spool is a very well-known UX expert, he's also been ranting about NPS for ages. It feels more like the other way round, he wrote the post and attached a sales pitch, rather than (as you say) attaching a post to the sales pitch.


I kind of like NPS: it doesn't try to make sophisticated distinctions, and distinguishes between enthusiasm and acceptable mediocrity. It's oriented towards growth because of that emphasis on enthusiasm. The very low end of the scale doesn't really matter, as many of the people who score the product very low have already stopped using the product, so grouping 0 responses with 5 responses seems reasonable to me.

But the NPS number is more a number for executives than people working directly on the product. NPS doesn't give UX people anywhere near enough information to do their job. None of these top-line numbers have enough information for that.

I found myself thinking about this: http://www.ianbicking.org/blog/2016/04/product-journal-data-... – there is a real dysfunction that the author is seeing, and is common. Top-down process design means that executives design the work process and give it to their managers, and managers design the process and give it to their reports, and you end up with processes that aren't designed to help individuals do their own jobs, instead everyone is supporting someone else's job. Executives make broad decisions: is this product succeeding? If we continue our current approach where will that take us? Are the teams performing well? Someone who is doing UX shouldn't be worried about these questions, they should be concerned about specific details, because those specific details are what a UX designer can change.


I believe the NPS mechanism of "6 is bad" intends to capture more psychology than the statistical analysis on the article gives it credit for.

That being said, I agree that a three point scale would be better in most cases.

The focus on the number proves the saying: when a metric turns into a goal, it ceases to be useful. The goal should be "dollars earned." All the metrics should just assist in improving this metric. Else you'll get the bad statistical hacking described. Those hacks are not just for NPS; I think we've all seen them from various companies.

Of course, dollars earned also ends up being gamed (see also: Enron) but at least there are usually better safeguards against this in most organizations. And increasing dollars earned are usually a good end state.


I've seen people who barely understood the NPS formula obsess over their NPS score, as well as customers who wrote quizzical responses such as "I love your service it is the best, everything is perfect" and graded an 8. Overall I find it's an interesting vanity metric.


There are products I love but which I never promote to my friends. I provide the company with revenues, but will never contribute to their growth. On the NPS scale, that’s a solid 8.


I worked for a large Japanese company that settled on NPS as one of their main KPIs. It was spearheaded by their American educated CEO and was implemented across all their products. The data looked good even when the company was hemorrhaging customers. Despite the flaws I can't help but wonder if questions were asked about its applicability to Japanese audiences.


NPS is a great tool if you want to understand your word-of-mouth growth. The issues described in the article are issues of data interpretation and not of the actual tool. E.g. "I gave a 0 because I don't know anyone to recommend the product to" is a typical word-of-mouth problem to work on.


In customer service, a 10 is good, a 9 is good with room for improvement, and everything below an 8 is bad. The author makes a big deal about the difference between a 0 and a 6. But both mean the customer is dissatisfied. There’s no real difference.


No real difference? Six might mean I'm willing to give you another chance. Zero might mean I'm so disgusted I'm going to bad-mouth your company at every opportunity, and do everything I can to see you fail.

For a big company (Forbes 500), or one with an effective monopoly (cable companies, airlines), or one with big enough backers (some SV startups) it may not matter. In those cases, the company doesn't need any single individual, and the individual doesn't have the power to really hurt the company.

For your local "mom and pop" company in a small town with only a few employees, the difference between a 6 and a 0 might be massive, perhaps to the point of staying in business or not.


I work in Support and have for almost 10 years now. In NPS theory, you want to focus first on the 8s, as they are closest to a promoter (9), and then work your way down. Usually the 8s have easier fixes than the 0s as well.

You can choose to go down and focus on the 0s too, but they do often help to spot bigger pain points that could shape the product.

Again, the thing to remember with any number is that we optimize for what we measure. NPS can be helpful, but you need to have the NPS feedback machine running smoothly or else you are going to spend your time building/running that vs. actually doing your thing.


Agreed. NPS isn’t perfect but what is? The non linear scale encourages measuring excellence rather than incremental improvements to mediocrity (which is a common problem in the corporate world).

The simplicity of NPS also encourages its implementation rather than the most common alternative of measuring nothing.


That doesn't account for external scale, though. Let's say I go and check in to the Holiday Inn. The bed is clean, the receptionist smiles. They met my expectations, but if I call that a 9 or a 10, what do I call the experience at a proper 5 star hotel?


So why not just ask the customer to choose between "good/good but with room for improvement/bad", rather than picking a point value on an 11-point scale? Simple questions are more likely to be answered honestly than complicated ones.


NPS: like Klout for Customer Satisfaction


Only it's not similar in any way.


This field was studied and explored in the '90s by the University of Wu-Tang Clan in their seminal paper C.R.E.A.M.: https://www.youtube.com/watch?v=PBwAxmrE194


...Article considered harmful: Constant JS thread use, avoid.


"NPS thinks that a 6 should be equal to a 0." No : It considers that at 6 you're not a promoter. But your score will be 6. And the progress from 0 to 6 will be reflected on your score.


> No: it considers that at 6 you're not a promoter. But your score will be 6. And the progress from 0 to 6 will be reflected in your score.

That's the thing though: The progress from 0 to 6 is not reflected in any score.


That is a problem in the contrived examples here, but I'm not convinced it's a problem in the real world. If I look at actual NPS scores of brands I'm familiar with, they match what I hear about them from people. E.g., Tesla 96, Apple 72, Comcast -3. (from http://indexnps.com/ )

The theory of NPS is that what matters is what people say about you. If people from 0-6 are all going to say negative things when asked, then lumping them in the same bucket is reasonable. It may not be as good as a more subtle scale, but it may be much better than thinking the numbers are linear.

It's possible that one could come up with a mapping that's even better, of course. But NPS is simple enough that even executives understand it. A marginally more-accurate number that nobody understands is probably worse, because people will trust it less. The point of the NPS score is not theoretical accuracy, it's motivating change.


The problem is that if I'm giving a company a 5 or a 6, I probably just sort of tolerate the company in lieu of reasonable competitors, ie McDonalds being the only quick food anywhere near where I work. If I'm giving a company a zero it means I hate them and have a strong desire to see them go out of business (ie Google), and will also help with any endeavor to speed that along if it's easy for me. There's a massive difference between those two.


I get that, and maybe that's how it works for everybody. There's certainly a massive difference in feeling; maybe that really does translate to a big difference in an individual's behavior. But does that translate to much of a difference in terms of word-of-mouth growth? There I'm not so sure.

Even if it did, though, it's not clear to me that there's much difference in the utility of the NPS metric. Are companies with a lot of zeroes also companies that are sincerely seeking to improve? Would a more complex scoring system motivate more change? If so, does the benefit gained outweigh the extent to which the added complexity harms NPS adoption elsewhere?

In practice, if some company had an unusually high number of zeros relative to sixes and were very serious about change and the metric didn't shift much when a bunch of people moved from zero to six, you can bet that someone would explain this in a meeting and everybody would still be excited. So although I get that this would be a problem if NPS were the only number used, I'm just not persuaded that some sort of NPS++ metric would be any better in actual use.


No, that's incorrect. If all your customers give 0s, your NPS is -100. If all your customers give 6s, your NPS is still -100.



