This post is a re-posting of an article first published at oreilly.com
The introduction of data science into the business world has contributed far more than recommendation algorithms; it has also taught us a lot about the efficacy with which we manage our businesses. Specifically, data science has introduced rigorous methods for measuring the outcomes of business ideas. These are the strategic ideas that we implement in order to achieve our business goals. For example, “We’ll lower prices to increase demand by 10%” and “we’ll implement a loyalty program to improve retention by 5%.” Many companies simply execute on their business ideas without measuring if they delivered the impact that was expected. But, science-based organizations are rigorously quantifying this impact and have learned some sobering lessons:
- The vast majority of business ideas fail to generate a positive impact.
- Most companies are unaware of this.
- It is unlikely that companies will increase the success rate for their business ideas.
These are lessons that could profoundly change how businesses operate. In what follows we flesh out the three assertions above with the bulk of the content explaining why it may be difficult to improve the poor success rate for business ideas. Despite the challenges, we conclude with some recommendations for better managing your business.
(1) The vast majority of business ideas fail to generate positive results
To properly measure the outcomes of business ideas, companies are embracing experimentation (aka Randomized Controlled Trials or AB Testing). The process is simple in concept. Before rolling out a business idea, you test; you try the idea out on a subset group of customers1 while another group - a control group - is not exposed to the new idea. When properly sampled, the two groups will exhibit the same attributes (demographics, geographics, etc) and behaviors (purchase rates, life-time-value, etc). Therefore, when the intervention is introduced - ie. the exposure to the new business idea - any changes in behavior can be causally attributed to the new business idea. This is the gold standard in scientific measurement used in clinical trials for medical research, biological studies, pharmaceutical trials, and now to test business ideas.
For the very first time in many business domains, experimentation reveals the causal impact of our business ideas. The results are humbling. They indicated that the vast majority of our business ideas fail to generate positive results. It’s not uncommon for 70-90% of ideas to either have no impact at all or actually move the metrics in the opposite direction of what was intended. Here are some statistic from a few notable companies that have disclosed their success rates publicly:
- Microsoft declared that roughly one-third of their ideas yield negative results, one-third yield no results, and one-third yield positive results (Kohavi and Thomke, 2017).
- Streaming service Netflix believes that 90% of its ideas are wrong (Moran, 2007).
- Google reported that as much as 96.1% of their ideas fail to generate positive results (Thomke, 2020).
- Travel site Booking.com shared that 9 out of 10 of their ideas fail to improve metrics (Thomke, 2020).
To be sure, the statistics cited above reflect a tiny subset of the ideas implemented by companies. Further, they probably reflect a particular class of ideas - those that are conducive to experimentation2 such as changes to user interfaces, new ad creatives, subtle messaging variants, and so on. Moreover, the companies represented are all relatively young and either in the tech sector or leverage technology as a medium for their business. This is far from a random sample of all companies and business ideas. So, while it’s possible that the high failure rates are specific to the types of companies and ideas that are convenient to test experimentally, it seems more plausible that the high failure rates are reflective of business ideas in general and that the disparity in perception of their success can be attributed to the method of measurement. We shouldn’t be surprised — high failure rates are common in many domains. Venture capitalists invest in many companies because most fail; similarly, most stock portfolio managers fail to outperform the S&P 500; in biology, most mutations are unsuccessful; and so on. The more surprising aspect of the low success rates for business ideas is most of us don’t seem to know about it.
(2) most companies are unaware of the low success rates for their business ideas
Those statistics should be sobering to any organization; collectively, business ideas represent the roadmap companies rely upon to hit their goals and objectives. However, the dismal failure rates appear to be known only to the few companies that regularly conduct experiments to scientifically measure the impact of their ideas. Most companies do not appear to employ such a practice and seem to have the impression that all or most of their ideas are or will be successful. Planners, strategists, and functional leaders rarely convey any doubts about their ideas. To the contrary, they set expectations on the predicted impact of their ideas and plan for them as if they are certain. They attach revenue goals and even their own bonuses to those predictions. But, how much do they really know about the outcomes of those ideas? If they don’t have an experimentation practice, they likely know very little about the impact their roadmap is actually having.
Without experimentation, companies either don’t measure the outcomes of their ideas at all or use flimsy methods to assess their impacts. In some situations ideas are acted upon so fluidly that they are not recognized as something that merits measurement. For example, in some companies an idea such as “we’ll lower prices to increase demand by 10%” might be made on a whim by a marketing exec and there will be no follow up at all to see if it had the expected impact on demand. In other situations, a post-implementation assessment of a business idea is done, but in terms of execution, not impact (“Was it implemented on time?” “Did it meet requirements?” Etc., not “What was the causal impact on business metrics?”). In other cases still, post hoc analysis is performed in an attempt to quantify the impact of the idea. But, this is often done using subjective or less-than-rigorous methods to justify the idea as a success. That is, the team responsible for doing the analysis often is motivated either implicitly or explicitly to find evidence of success. Bonuses are often tied to the outcomes of business ideas. Or, perhaps the VP whose idea it was is the one commissioning the analysis. In either case, there is a strong motivation to find success. For example, a company may seek qualitative customer feedback on the new loyalty program in order to craft a narrative for how it is received. Yet, the customers willing to give feedback are often biased towards the positive. Even if more objective feedback were to be acquired it would still not be a measure of impact; customers often behave differently from the sentiments they express. In still other cases, empirical analysis is performed on transaction data in an attempt to quantify the impact. But, without experimentation, at best, such analysis can only capture correlation - not causation. Business metrics are influenced simultaneously by many factors, including random fluctuations. Without properly controlling for these factors, it can be tempting to attribute any uptick in metrics as a result of the new business idea. The combination of malleable measurements and strong incentives to show success likely explain why so many business initiatives are perceived to be successful.
By contrast, the results of experimentation are numeric and austere. They do not care about the hard work that went into executing on a business initiative. They are unswayed by well-crafted narratives, emotional reviews by customers, or an executive’s influence. In short, they are brutally honest and often hard-to-accept3. Without experimentation, companies don’t learn the sobering truth about their high failure rate. While ignorance is bliss, it is not an effective way to run your business.
(3) It is unlikely that companies will increase the success rate for their business ideas.
At this point, you may be thinking - “we need to get better at separating the wheat from the chaff, so that we only allocate resources to the good ideas”. Sadly, without experimentation, we see little reason for optimism as there are forces that will actively work against your efforts.
One force that is actively working against us is the way we reason about our companies.
We like to reason about our businesses as if they are simple, predictable systems. We build models of their component parts and manage them as if they are levers we can pull in order to predictably manage the business to a desired state. For example, a marketer seeking to increase demand builds a model that allows her to associate each possible price with a predicted level of demand. The scope of the model is intentionally narrow so that she can isolate the impact price has on demand - other factors like consumer perception, the competitive assortment, operational capacity, the macroeconomic landscape, and so on - are out of her control and assumed to remain constant. Equipped with such an intuitive model, she can identify the price that optimizes demand. She’s in control and hitting her goal is merely a matter of execution.
However, experimentation reveals that our predictions for the impact of new business ideas can be radically off - not just a little off in terms of magnitude, but often in the completely wrong direction. We lower prices and see demand go down. We launch a new loyalty program and it hurts retention. Such unintuitive results are far more common than you might think.
The problem is that many businesses behave as complex systems which cannot be understood by studying its components in isolation. Customers, competitors, partners, market forces - each can adjust in response to the intervention in ways that are not observable from simple models of the components. Just as you can’t learn about an ant colony by studying the behaviors of an individual ant (Mauboussin, 2009), the insights derived from modeling individual components of a business in isolation often have little relevance to the way the business behaves as a whole.
It’s important to note that our use of the term complex does not just mean ‘not simple.’ Complexity is an entire area of research within Systems Theory. Complexity arises in systems with many interacting agents that react and adapt to one another and their environment. Examples of complex systems include weather systems, rain forest ecology, economies, the nervous system, cities, and yes, many businesses.
Reasoning about complex systems requires a different approach. Rather than focusing on component parts, attention needs to be directed at system-wide behaviors. These behaviors are often termed “emergent,” to indicate that they are very hard to anticipate. This frame orients us around learning - not executing. It encourages more trial and error with less attachment to the outcomes of a narrow set of ideas. As complexity researcher Scott E. Page says, “An actor in a complex system controls almost nothing but influences almost everything.” (Page, 2009)
Another force that is actively working against our efforts to discern good ideas from bad is our cognitive biases. You might be thinking: “Thank goodness my company has processes that filter away bad ideas, so that we only invest in great ideas!”. Unfortunately, probably all companies try hard to select only the best ideas and yet we assert that they are not particularly successful at separating good from bad ideas. We suggest that this is because these processes are deeply human in nature, leaving them vulnerable to cognitive biases.
Cognitive biases are systematic errors in human thinking and decision making (Tversky & Kahneman, 1974). They result from the core thinking and decision making processes that we developed over our evolutionary history. Unfortunately, evolution adapted us to an environment with many differences from the modern world. This can lead to a habit of poor decision making. To illustrate: we know that a healthy bundle of kale is better for our bodies than a big juicy burger. Yet, we have an innate preference for the burger. Many of us will decide to eat the burger tonight. And tomorrow night. And again next week. We know we shouldn’t. But yet our society continues consuming too much meat, fat, and sugar. Obesity is now a major public health problem. Why are we doing this to ourselves? Why are we imbued with such a strong urge - a literal gut instinct - to repeatedly make decisions that have negative consequences for us? It’s because meat, fat, and sugar were scarce and precious resources for most of our evolutionary history. Consuming these resources at every opportunity was an adaptive behavior, and so humans evolved a strong desire to do so. Unfortunately, we remain imbued with this desire despite the modern world’s abundance of burger joints.
Cognitive biases are predictable and pervasive. We fall prey to them despite believing that we are rational and objective thinkers. Business leaders (ourselves included) are not immune. These biases compromise our ability to filter out bad business ideas. They can also make us feel extremely confident as we make a bad business decision. See the sidebar for examples of cognitive biases manifesting in business environments and producing bad decisions.
A final force that is actively working against efforts to discern good ideas from bad is your business maturing. A thought experiment: Suppose a local high school coach told NBA superstar Stephen Curry how to adjust his jump shot. Would implementing these changes improve, or hurt, his performance? It is hard to imagine it would help. Now, suppose the coach gave this advice to a local 6th grader. It seems likely that it would help the kid’s game.
Now, imagine a consultant telling Google how to improve their search algorithm, versus advising a startup on setting up a database. It’s easier to imagine the consultant helping the startup. Why? Well, Google search is a cutting edge system that has received extensive attention from numerous world class experts - kind of like Steph Curry. It’s going to be hard to offer a new great idea. In contrast, the startup will benefit from getting pointed in a variety of good directions - kind of like a 6th grader.
To use a more analytic framework, imagine a hill which represents a company’s objective function4 like profit, revenue, or retention. The company’s goal is to climb to the peak, where its objective is maximized. However, the company can’t see very far in this landscape. It doesn’t know where the peak is. It can only assess (if it’s careful and uses experimentation…) whether it’s going up or downhill by taking small steps in different directions - perhaps by tweaking its pricing strategy and measuring if revenue goes up.
When a company (or basketball player) is young, its position on this objective function (profit, etc.) landscape is low. It can step in many directions and go uphill. Through this process, a company can grow (walk up Mount Revenue). However, as it climbs the mountain, a smaller proportion of the possible directions to step will lead uphill. At the summit a step in any direction will take you downhill.
This is admittedly a simple model of a business (and we already discussed the follies of using simple models). However, all companies will eventually face the truism that as they improve, there are fewer ways to continue to improve (the low apples have been plucked), as well as the extrinsic constraints of market saturation, commoditization, etc. that make it harder to improve your business as it matures.5
So, what to do
We’ve argued that most business ideas fail to deliver on their promised goals. We’ve also explained that there are systematic reasons that make it unlikely that companies will get better, just by trying harder. So where does this leave you? Are you destined to implement mostly bad ideas? Here are a few recommendations that might help:
- Run experiments and exercise your optionality. Recognize that your business may be a complex system, making it very difficult to predict how it will respond to your business ideas. Instead of rolling out your new business ideas to all customers, try them on a sample of customers as an experiment. This will show you the impact your idea has on the company. You can then make an informed decision about whether or not to roll out your idea. If your idea has a positive impact, great. Roll it out to all customers. But in the more likely event that your idea does not have the positive impact you were hoping for you can end the experiment and kill the idea. It may seem wasteful to use company resources to implement a business idea only to later kill it, but this is better than unknowingly providing on-going support to an idea that is doing nothing or actually hurting your metrics - which is what happens most of the time.
- Recognize your cognitive biases, collect a priori predictions, and celebrate learnings. Your company’s ability to filter out bad business ideas will be limited by your team member’s cognitive biases. You can start building a culture that appreciates this fact by sending a survey to all of a project’s stakeholders before your next big release. Ask everyone to predict how the metrics will move. Make an anonymized version of these predictions and their accuracy available for employees. We expect your team members will become less confident in their predictions over time. This process may also reveal that big wins tend to emerge from a string of experiments, rather than a single stroke of inspiration. So celebrate all of the necessary stepping stones on the way to a big win.
- Recognize that it’s going to get harder to find successful ideas, so try more things, and get more skeptical. As your company matures, it may get harder to find ways to improve it. We see three ways to address this challenge. First, try more ideas. It will be hard to increase the success rate of your ideas, so try more ideas. Consider building a leverageable and reusable experimentation platform to increase your bandwidth. Follow the lead of the venture world: fund a lot of ideas to get a few big wins6. Second, as your company matures - you might want to adjust the amount of evidence that is required before you roll out a change — a more mature company should require a higher degree of statistical certainty before inferring that a new feature has improved metrics. In experimental lingo, you might want to adjust the “p-value thresholds” that you use to assess an experiment. Or to use our metaphor, a 6th grader should probably just listen whenever a coach tells them to adjust their jump shot, but Steph Curry should require a lot of evidence before he adjusts his.
This may be a hard message to accept. It’s easier to assume that all our ideas are having the positive impact that we intended. It’s more inspiring to believe that successful ideas and companies are the result of brilliance rather than trial and error. But, consider the deference we give to mother nature. She is able to produce such exquisite creatures - the giraffe, the mighty oak tree, even us humans - each so perfectly adapted to their environment that we see them as the rightful owners of their respective niches. Yet, mother nature achieves this not through grandiose ideas, but through trial and error … with a success rate far more dismal than that of our business ideas. It’s an effective strategy if we can convince our egos to embrace it.
Arkes, H. R., & Blumer, C. (1985), The psychology of sunk costs. Organizational Behavior and Human Decision Processes, 35, 124-140.
Gneezy, U., & Rustichini, A. (2000). A Fine is a Price. The Journal of Legal Studies, 29(1), 1-17. doi:10.1086/468061
Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: A failure to disagree. American Psychologist, 64(6), 515-526. https://doi.org/10.1037/a0016755
Kohavi, R. & Thomke, S. “The Surprising Power of Online Experiments,” Harvard Business Review 95, no. 5 (September-October 2017)
Mauboussin, M. J. (2009). Think Twice: Harnessing the Power of Counterintuition. Harvard Business Review Press.
Milgram, S. (1963). “Behavioral Study of obedience”. The Journal of Abnormal and Social Psychology. 67 (4): 371-378.
Moran, M. Do It Wrong Quickly: How the Web Changes the Old Marketing Rules . s.l. : IBM Press, 2007. 0132255960.
Nickerson, R. S. (1998), “Confirmation bias: A ubiquitous phenomenon in many guises”, Review of General Psychology, 2 (2): 175-220.
Page, S. E. (2009). Understanding Complexity - The Great Courses - Lecture Transcript and Course Guidebook (1st ed.). The Teaching Company.
Thomke, S. H. (2020). Experimentation Works: The Surprising Power of Business Experiments. Harvard Business Review Press.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.
Whyte, W. H., (1952). “Groupthink”. Fortune, 114-117, 142, 146.
Do not confuse the term ‘test’ to mean a process by which a nascent idea is vetted to get feedback. In an experiment the test group receives a full-featured implementation of an idea. The goal of the experiment is to measure impact - not get feedback. ↩
In some cases there may be insufficient sample size, ethical concerns, lack of a suitable control group, and many other conditions that can inhibit experimentation. ↩
Even trained statisticians can fall victim to pressures to cajole the data. “P-hacking”, “significance chasing” and other terms refer to the temptation to use flawed methods in statistical analysis. ↩
We believe that these types of factors are only obvious in hindsight because the signals are often unobserved until we know to look for them (Kahneman & Klein, 2009). ↩
One reason among many why this mental picture is oversimplified is that it implicitly takes business conditions & the world at large to be static — the company “state vector” that maximizes the objective function today is the same as what maximizes the objective function tomorrow. In other words, it ignores that, in reality, the hill is changing shape under our feet as we try to climb it. Still, it’s a useful toy model. ↩
Finding a new market (jumping to a new “hill” in the “Mount Revenue” metaphor), as recommended in the next section, is one way to continue improving business metrics even as your company matures. ↩