Master the Art of Business
A world-class business education in a single volume. Learn the universal principles behind every successful business, then use these ideas to make more money, get more done, and have more fun in your life and work.
Figuring out which Changes or investments will give you the best outcomes is a major area of study in probability theory, which is best illustrated through the “multi-armed bandit” problem.
Imagine walking into a casino to play the slot machines—“one-armed bandits.” There’s a row of machines, each of which has a different probability of paying a reward when you pull the lever. Some machines pay more—some much more—than the others, but you're not sure which machine has the highest return.
If you knew the best machine in advance, you'd just pull that lever all the time, but you don't know which lever is best, and no one is going to tell you. The only way to discover the best lever is to start pulling levers at random, keep track of what works and what doesn’t, and analyze the results.
There's an important Trade-Off inherent in this approach: when you choose to pull a lever you haven’t pulled before, you get new information about that particular option, and that information is valuable in finding the best overall machine. Pulling the less tested lever, however, has an Opportunity Cost: you're not pulling the lever you currently think will give you the best return. There's a risk that the lever you pull will return less than what you would've brought in pulling the current optimal lever, and that’s a very real cost.
Information is valuable, but it comes at a price: experimentation is sometimes a form of Malinvestment. That insight is the key to solving the bandit problem.
Without going into the math, the solution to the bandit problem is easy to understand: the optimal strategy is to start with a period of Exploration, where you pull levers at random and gather information. Once you have more information about what works and what doesn’t, you shift to spending the majority of your time pulling the best lever you’ve discovered so far (Exploitation), but you keep exploring the other options in case your current best option isn’t the very best that exists.
It’s important to emphasize the last part of the optimal strategy: the Exploration phase never stops. Even if you’re certain you’ve found the best possible option, you never stop experimenting because the information you gather by experimenting is always valuable. The only way to beat the bandit is to keep trying new things.
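One way to make the strategy above concrete is a small simulation. The sketch below uses a simple, well-known rule usually called "epsilon-greedy": on a small fraction of pulls (epsilon) you pick a machine at random, and on the rest you pull the machine with the best observed average so far. This is an illustration under stated assumptions, not the mathematically optimal solution to the bandit problem. The payout probabilities, the 10 percent exploration rate, and every name in the code are hypothetical choices made for the example.

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy player on a row of slot machines.

    payout_probs: assumed chance that each machine pays 1 unit on a pull
                  (hypothetical numbers; a real player never sees these).
    epsilon:      share of pulls spent exploring a randomly chosen machine.
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    counts = [0] * n        # how many times each lever has been pulled
    estimates = [0.0] * n   # observed average payout for each lever
    total = 0

    for _ in range(pulls):
        # Exploration: pull a random lever some of the time, and always
        # while at least one lever has never been tried.
        if 0 in counts or rng.random() < epsilon:
            arm = rng.randrange(n)
        # Exploitation: otherwise pull the best lever found so far.
        else:
            arm = max(range(n), key=lambda i: estimates[i])

        reward = 1 if rng.random() < payout_probs[arm] else 0
        counts[arm] += 1
        # Running average: nudge the estimate toward the new result.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward

    return counts, estimates, total


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per machine:", counts)
    print("estimated payouts:", [round(e, 2) for e in estimates])
    print("total reward:", total)
```

Because epsilon never drops to zero, the player keeps sampling the other machines for the whole run, which mirrors the point that the Exploration phase never stops. Setting epsilon to 0 in this sketch removes that ongoing exploration, so the player can settle on whichever machine happened to look good early.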
In real-world circumstances, you have a major advantage over the situation in this Thought Experiment: other people are playing the same game, and you can watch what they do to gather information about what works and what doesn’t without having to try the same approach yourself. Learning from the actions of others, and observing their results, is an efficient way to gather useful information.
Extending this approach to day-to-day Decisions is straightforward: experiment as much as possible, with as much variation as you can, and pay close attention to the experiments other people are doing. As you find things that appear to produce the outcomes you want, spend more time and energy doing them. As your efforts produce results and your certainty in a given option increases, increase your investment in that option. The more you experiment, the more you learn: you’ll have more information and options at your disposal, and a better chance of discovering the things that will produce the outcomes you desire.
You can’t make positive discoveries that make things better if you never try anything new. Start experimenting and never stop.
"We must conquer the truth by guessing, or not at all."
C. S. Peirce, nineteenth-century philosopher and mathematician
https://personalmba.com/explore-exploit/