I have explained several times that NPS is a score, and that it is not the best metric for measuring trends. When you try to show improvement by tracking two or more NPS scores over time, you face two statistical issues:
- Your NPS votes are not normally distributed
- Using a top-two-box scoring technique, such as NPS, loses information.
The solution? Use the NPS for your C-level executives, but when it is time to show a trend, please show means and standard deviations.
The Normal (or Gaussian) Distribution
What is a Normal Distribution? If we represent data on a classical histogram, we can observe a certain order: it can be spread out more on the left, more on the right or it can be all jumbled up. But there are many cases where the data tends to be around a central value with no bias left or right, and it gets close to a “Normal Distribution” (or Gaussian).
Normal Distributions share some properties:
- mean = median = mode
- symmetry about the center
- 50% of values less than the mean and 50% greater than the mean
- 68% of the values within 1 standard deviation of the mean
- 95% of the values within 2 standard deviations of the mean
- 99.7% of the values within 3 standard deviations of the mean
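The three percentages above can be checked directly from the normal curve. The snippet below is a minimal sketch using Python's standard library; the helper name `share_within` is only an illustrative choice.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The exact values are slightly above the rounded figures in the list (roughly 68.3%, 95.4% and 99.7%).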
Many things closely follow a Normal Distribution:
- heights of people
- size of things produced by machines
- errors in measurements
- blood pressure
- marks on a test
The normal distribution is important because of the Central Limit Theorem. In simple terms, if you have many independent variables that may be generated by all kinds of distributions, then as long as nothing too crazy happens (for example, none of them has infinite variance or dominates the rest), the aggregate of those variables will tend toward a normal distribution. This universality across different domains makes the normal distribution one of the centerpieces of applied mathematics and statistics.
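A small simulation makes the Central Limit Theorem tangible. The sketch below assumes an exponential distribution as the "skewed" source (my choice for illustration, not something specific to surveys) and compares raw draws with averages of 50 draws each, using the "68% within 1 standard deviation" property as a rough normality check.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the sketch is repeatable


def skewed_draw() -> float:
    """One draw from an exponential distribution, which is strongly right-skewed."""
    return random.expovariate(1.0)


def sample_means(n_per_sample: int, n_samples: int) -> list[float]:
    """Average n_per_sample skewed draws, repeated n_samples times."""
    return [mean(skewed_draw() for _ in range(n_per_sample)) for _ in range(n_samples)]


def share_within_one_sd(values: list[float]) -> float:
    """Fraction of values within one standard deviation of their own mean."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)


raw = [skewed_draw() for _ in range(5000)]
averaged = sample_means(n_per_sample=50, n_samples=5000)

print(f"raw draws within 1 SD:    {share_within_one_sd(raw):.1%}")
print(f"sample means within 1 SD: {share_within_one_sd(averaged):.1%}")
```

The raw draws should land well above 68%, which is a sign of skew, while the averaged values should sit close to the 68% a normal distribution predicts.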
Another corollary is that the normal distribution makes math easy – things like calculating moments, correlations between variables, and other calculations that are domain specific. For that reason, even if a distribution isn’t actually normal, it is useful to assume that it is normal to get a good, first-order understanding of a set of data.
Survey rating-scale data, such as responses on the 11-point NPS scale, typically don’t follow a normal distribution.
Unfortunately, Net Promoter Data Don’t Look Normal
The popular Net Promoter Score measures customer loyalty using the following question: “How likely are you to recommend a product to a friend?” with responses on an 11-point rating scale.
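For readers who have not computed it by hand, here is a minimal sketch of the score next to the mean and standard deviation of the same responses. It assumes the commonly published cut-offs, which this post does not spell out: 9-10 are promoters, 7-8 are passives, 0-6 are detractors, and NPS is the percentage of promoters minus the percentage of detractors. The ratings are made up.

```python
from statistics import mean, stdev


def nps(ratings: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


# Made-up responses on the 0-10 scale, for illustration only.
ratings = [10, 9, 9, 8, 7, 10, 6, 3, 9, 10]

print(f"NPS  = {nps(ratings):+.0f}")
print(f"mean = {mean(ratings):.2f}, SD = {stdev(ratings):.2f}")
```

With these ten responses there are six promoters and two detractors, so the NPS is +40, while the mean rating is 8.1.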
If you plot histograms of several NPS distributions, the graphs hardly look like bells, and they certainly are not normal. It’s no wonder researchers have concerns about using common statistical techniques like confidence intervals, t-tests or even the mean and standard deviation.
However, this is unlikely to affect the accuracy of statistical calculations, because the error in the measurement is approximately normally distributed even when the raw responses are not.
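One way to convince yourself is to simulate it. The sketch below assumes a hypothetical, clearly non-normal set of response shares (the `WEIGHTS` are invented), draws many samples of 100 ratings, and counts how often a normal-approximation 95% confidence interval around the sample mean contains the true mean.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(7)  # fixed seed so the sketch is repeatable

SCALE = list(range(11))  # ratings 0..10
# Hypothetical, deliberately non-normal response shares (skewed toward the top).
WEIGHTS = [5, 1, 1, 2, 2, 5, 4, 10, 20, 20, 30]
TRUE_MEAN = sum(r * w for r, w in zip(SCALE, WEIGHTS)) / sum(WEIGHTS)
Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def interval_covers_truth(n: int) -> bool:
    """Draw n ratings and check whether the 95% interval for the mean contains TRUE_MEAN."""
    sample = random.choices(SCALE, weights=WEIGHTS, k=n)
    margin = Z95 * stdev(sample) / n ** 0.5
    return abs(mean(sample) - TRUE_MEAN) <= margin


trials = 2000
coverage = sum(interval_covers_truth(100) for _ in range(trials)) / trials
print(f"true mean {TRUE_MEAN:.2f}, coverage of nominal 95% intervals: {coverage:.1%}")
```

If the coverage comes out close to 95%, the interval is doing its job even though a histogram of the ratings looks nothing like a bell. With much smaller samples the approximation gets rougher.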
Why Normality is Important
Normality is important for two reasons:
- Statistical tests assume the error in our measurement is normally distributed.
- We can’t speak accurately about the percentage of responses above and below the mean if our data is not normal.
The Net Promoter Score (NPS) is very similar to top-two-box scoring of a rating scale. Top-two-box scoring can provide an easy way to summarize or segment your data in the absence of a benchmark or comparison test. The appeal of top-box scores is that they are intuitive. It doesn’t matter if the ratings are about agreeing, purchasing or recommending. You’re basically cutting to the chase and only considering the highly opinionated folks.
There are two major disadvantages to the Net Promoter scoring method: you lose information about both precision and variability. When you go from 11 response options to two or three categories, a response of 1 becomes the same thing as a 5. Information is lost.
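To see the loss concretely, the sketch below collapses two made-up sets of ratings into NPS categories (assuming the usual cut-offs of 9-10 promoter, 7-8 passive, 0-6 detractor). Every respondent in the second set gives a higher rating than in the first, yet nobody changes category.

```python
from collections import Counter
from statistics import mean


def bucket(rating: int) -> str:
    """Collapse a 0-10 rating into its NPS category (assumed standard cut-offs)."""
    if rating >= 9:
        return "promoter"
    if rating >= 7:
        return "passive"
    return "detractor"


def nps_from_buckets(counts: Counter) -> float:
    """% promoters minus % detractors, computed from the collapsed categories only."""
    total = sum(counts.values())
    return 100 * (counts["promoter"] - counts["detractor"]) / total


# Made-up ratings, for illustration only: everyone moves up, nobody changes bucket.
before = [1, 1, 1, 1, 7, 7, 9, 9, 9, 9]
after = [6, 6, 6, 6, 8, 8, 10, 10, 10, 10]

for label, ratings in (("before", before), ("after", after)):
    counts = Counter(bucket(r) for r in ratings)
    print(f"{label}: NPS = {nps_from_buckets(counts):+.0f}, mean = {mean(ratings):.1f}")
```

Both sets score an NPS of 0, while the mean moves from 5.4 to 8.0. The improvement is real, but the collapsed score cannot see it.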
Losing precision and variability means it’s harder to track improvements, such as changes in attitudes after a new product or service was launched.
There will always be value in segmenting responses into groups for concise reporting (especially to executives). But when you want to determine whether your score has statistically improved, you’ll want to use the mean and standard deviation because they provide more precision at smaller sample sizes. Doing so means that you need to consider the distribution of your data.
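As a rough illustration of testing a change in means, the sketch below computes Welch's t statistic for two made-up survey waves. Welch's test is my choice here, not something this post prescribes. Because the standard library has no t distribution, the p-value uses a normal approximation, which is reasonable only because the degrees of freedom are large in this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a: list[int], b: list[int]) -> tuple[float, float]:
    """Return Welch's t statistic for mean(b) - mean(a) and its degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(b) - mean(a)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up survey waves of 60 responses each, for illustration only.
# Both have 15 promoters (9-10) and 30 detractors (0-6), so the NPS is -25 in each.
before = [1] * 15 + [5] * 15 + [7] * 10 + [8] * 5 + [9] * 10 + [10] * 5
after = [4] * 15 + [6] * 15 + [7] * 5 + [8] * 10 + [9] * 5 + [10] * 10

t, df = welch_t(before, after)
# Normal approximation to the two-sided p-value; acceptable here because df is large.
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
print(f"means {mean(before):.2f} -> {mean(after):.2f}, "
      f"t = {t:.2f}, df = {df:.0f}, p ~ {p_approx:.3f}")
```

In this invented data the NPS does not move at all, yet the difference in means gives a t of about 2.4 and an approximate p-value below 0.05.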
The Net Promoter scoring system has the benefit of simplicity but at the cost of losing information. Even rather large changes can be masked when rating scale data with many options is reduced to two or three options. It can mean the difference between showing no improvement and a statistically significant one.
The Net Promoter Score has its place for quickly assessing results and especially for stand-alone studies when there’s no meaningful comparison or benchmark. If the results ever get compared though, you’ll want a more precise scoring system to have a good chance of detecting any differences in attitudes from design changes.