Welcome to You To Me Research from our algorithms for life series by Valor Angles. We're so glad you could join us. So today we are tackling a problem that I think keeps a lot of people awake at night. And I don't mean like a fun logic puzzle. No, not a crossword. Not a crossword. I mean a genuine sort of existential crisis that hits us all at some point. It's a problem of, well, of settling. How do you know if you've found the one? Or, you know, the right job? How do you know if you should keep interviewing candidates or just hire the person who's sitting right in front of you? Exactly. Should you take that job offer or hold out for something that might be better? It's really the universal anxiety of the modern world, isn't it? We're just, we're drowning in options. And there's this terror that if we commit to one, we're missing out on something better just around the corner. The ultimate FOMO. But if we don't commit, we end up with nothing. It's just, it's paralysis by analysis on a life-altering scale. Totally. And to kick this whole thing off, I want to tell you a story about a guy who tried to solve this exact problem with pure cold logic. Oh, this should be good. His name is Michael Trick. He's a specialist in operations research at Carnegie Mellon. So this is a guy who literally optimizes systems for a living. He does things like scheduling for Major League Baseball. He optimizes supply chains. That is a dangerous skill set to apply to romance, usually. Bringing a spreadsheet to a first date is rarely a winning move. Oh, absolutely. But Michael decides he's going to apply the most famous result from his field, it's called optimal stopping theory, to his love life. He really did it? He did. He sits down. He runs the numbers. And he calculates the exact optimal strategy. He determines that he needs to spend a specific amount of time just dating around. He calls it the exploration phase. Okay. And in this phase, he has to reject everyone, no matter what.
And according to his calculation, based on when he started looking and when he wanted to settle down, that phase ended precisely at age 26. Okay. So up until 26, he is essentially just collecting data. He's trying to build a baseline of what good even looks like in the dating market for him. Exactly right. He is purely exploring. But the moment he turns 26, the algorithm flips. It switches to the commitment phase. The leap phase. The leap phase. And the rule is simple. He must propose to the very next person he meets who exceeds every single previous partner he has ever had. That's the rule. No hesitation. No second thoughts. This is the classic look-then-leap strategy. You look for a while to set the standard. And then you leap at the very first person who beats it. So he executes it. He follows the plan. He dates until he's 26. Then shortly after the deadline, he meets a woman. And she is, by all accounts, fantastic. She surpasses all his benchmarks. She is statistically the optimal match. The algorithm says go. Algorithm says execute. So he proposes. What happened? She said no. Of course. The algorithm worked perfectly on his end. He found the optimal stopping point for him. But he forgot one tiny little variable. The algorithm has absolutely no mechanism for the other party saying no. No. It assumes the world just bends to your choice. It assumes that once you've decided to stop looking, the object of your search, in this case a human being with her own agency, is just sitting there waiting to be chosen. And that heartbreak, that little twist of reality, is exactly where we're starting today. Right. Because this story illustrates the central tension of this whole deep dive. The math of when to stop searching is, you know, provably optimal. We have mathematical proofs going back to the 1960s. Sure. But those proofs rely on a set of assumptions that almost never, ever hold up in real life. Correct. But here's the nuance we're really going to unpack.
Just because the assumptions are flawed doesn't mean the whole idea is useless. I mean, we face these decisions constantly: hiring, apartment hunting, choosing a parking spot, marriage. All the time. And the trap we fall into is usually one of two extremes. We either commit way too early, which is settling, or we search for way, way too long, which becomes that paralysis or FOMO we talked about. So here's our roadmap for this deep dive. We are going to try and solve this tension. First, we're going to explore the famous 37% rule. You might have heard of it. We're going to look at why that specific number is almost always wrong, but why the principle behind it is pure gold. Then we're going to dig into the evidence. What happens when real humans, real organizations like Kodak and Nokia, and of course, real dating apps, collide with this explore-exploit trade-off. And there's a spoiler there. Yes, spoiler alert. Humans are actually smarter than the models often give us credit for, but our organizations, they're often much stupider. And finally, we're going to give you specific frameworks. Actionable stuff you can use, like the five-question stopping test. So you can know when you've explored enough and when it's really time to commit. It's time to stop looking and start living. Okay, let's do it. Part one, the foundation. We need to start with the origin story here. We have to talk about the secretary problem. This is the classic setup. It's a thought experiment. Imagine you are hiring a secretary. You have a pool of candidates, let's say 100 people apply for the job. You interview them one by one in completely random order. Okay. And here's the catch. After each interview, you have to make an immediate, irrevocable decision. You either hire them on the spot and the search is over, or you reject them. And if you reject them, they're gone forever. You can't call them back a week later. Wow. That is a high-pressure interview. Take the job this second or get out.
Extremely. And the only information you have at any given moment is how the person in front of you compares to the people you've already seen. You don't know if the next person is a genius or a disaster. You just know if this current person is better than the last five. So if I hire the very first person, I have absolutely no idea if they're the best because I haven't seen the other 99. Exactly. But if I wait until the very last person, I'm stuck with them, even if they're terrible, because I already rejected everyone else. You've got it. It's a perfect balance of risk. And in the early 1960s, mathematicians like Lindley and Dynkin proved there is an optimal strategy to maximize your chances of picking the single best candidate from the pool. And that's where the number comes from. That's where the number comes from. It's called the 37% rule, or sometimes the 1/e rule, where e is Euler's number, which is roughly 2.718. I just love that Euler's number shows up in dating advice. It feels so cosmic and weirdly right. It shows up everywhere. It's one of those fundamental constants of the universe. The strategy is that look-then-leap rule we mentioned. You take the first 37% of the pool. So if it's 100 candidates, that's the first 37 people. Right, the first 37 people, and you reject them all, unconditionally. It does not matter if candidate number 2 is a Nobel laureate. You reject them. You use that time solely to gather data and establish your baseline of quality. That sounds terrifying. Rejecting a genius just to get a baseline, that feels so wrong. It feels completely wrong, but it's mathematically necessary to optimize your chances. After you pass that 37% mark, that's when you enter the leap phase. From that point on, you select the very next person who is better than everyone you saw in that first phase. The first person to beat the best of the first 37. Precisely. And the math says this works. It works surprisingly well. I mean, think about it.
If you just picked a candidate at random, your chance of getting the single best one is one in 100. So 1%. Right. Terrible odds. With the 37% rule, your success rate jumps to 37%. Wow. That is a 37-fold improvement over random chance. And what's really fascinating to me is that the roughly 37% success rate holds whether the pool is 100 people or 100 million people. The math is incredibly robust. Right. But under those very specific, very rigid assumptions. Okay. Under those specific assumptions, that really is the catchphrase of this whole deep dive, isn't it? It has to be. Because life is not a math problem where you can't go back to a previous candidate, at least not always. Exactly. And this is where it all gets messy. Real life violates the assumptions of the secretary problem constantly. And the moment you start to tweak those assumptions, that 37% number starts to break down almost immediately. Let's talk about some of those variants. You mentioned recall, the ability to, you know, go back. Right. In the classic problem, rejection is permanent, gone forever. But in real life, say apartment hunting or even dating, you can sometimes circle back. Maybe that apartment is still on the market a week later. Or you could text someone you went on a date with a month ago. So there's a chance of recall. And a researcher named Petruccelli proved in 1993 that if you can recall a rejected candidate with just a 50% probability, so half the time you can go back. Okay. Then the optimal exploration threshold doesn't stay at 37%. It jumps all the way to 61%. Wait, wait. So I should search more? That feels counterintuitive. Yes. Think about why. Because the risk of passing the best candidate is lower. You have a safety net, you might be able to go back and get them. So you can afford to gather more information. You can be pickier for longer. You explore 61% of your options before you even start thinking about committing. Okay. That actually makes sense when you put it that way. Yeah.
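The look-then-leap rule and its roughly 37% success rate can be checked with a small simulation. What follows is a minimal sketch, not anything from the research discussed: the function names `look_then_leap` and `success_rate`, the trial count, and the fixed random seed are all illustrative assumptions. It models only the classic setup (random order, no recall, only relative rank visible).

```python
import random


def look_then_leap(ranks, cutoff):
    """Apply the look-then-leap rule to one ordering of candidates.

    `ranks` lists candidates in interview order, where rank 1 is the best.
    The first `cutoff` candidates are rejected unconditionally; after that,
    the first candidate better than all of them is hired.
    """
    best_seen = min(ranks[:cutoff]) if cutoff > 0 else float("inf")
    for rank in ranks[cutoff:]:
        if rank < best_seen:
            return rank
    # Nobody beat the baseline, so we are stuck with the last candidate.
    return ranks[-1]


def success_rate(n, cutoff, trials=10_000, seed=0):
    """Estimate how often the rule hires the single best of n candidates."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    wins = 0
    for _ in range(trials):
        ranks = list(range(1, n + 1))
        rng.shuffle(ranks)  # candidates arrive in random order
        if look_then_leap(ranks, cutoff) == 1:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    for cutoff in (0, 5, 37, 80):
        print(cutoff, round(success_rate(100, cutoff), 3))
```

A cutoff of 0 means hiring the first person interviewed, which succeeds about 1% of the time with 100 candidates. Under these assumptions a cutoff near 37 should do noticeably better than either a very short or a very long exploration phase.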
The cost of an error is lower. But what about the goal itself? The secretary problem assumes I need the absolute best candidate, number one out of 100. But in real life, usually, I just want someone who is really, really good. Yes. I don't need the global maximum. I need a great partner or a great employee, not necessarily the single best one on planet Earth. And that is a huge variant. It's called the cardinal payoff variant. If your goal is different, if you're satisfied with someone in, say, the top 10%, rather than only the absolute single best person, the math changes completely. And how does it change? Bearden showed in 2006 that if you just want a good outcome, not the perfect one, you should explore much, much less. Specifically, the rule of thumb becomes the square root of N, where N is your total pool size. The square root, okay. So if I have 100 candidates, the square root of 100 is 10. So you only explore 10 people to set your baseline, not 37. That's a huge difference. It's massive. Because if you search for 37 people when you'd be happy with anyone in the top 10%, you are just wasting time. You're suffering from opportunity cost. You could have hired a great person at candidate number 11 and been done with it. This brings us to a really major point of contention. Because on the one hand, we have this elegant, simple 37% rule. But then we have these variants that swing the number from 10% all the way up to 61%. It creates a real conflict in how we're supposed to interpret this science. It's not a clear-cut answer. Right, and I want to actually take a position here. I think the 37% rule is genuinely elegant and useful, even if the number itself isn't perfect. I mean, think about it. We just said it offers a 37-fold improvement over randomness. Which is true. Even Brian Christian and Tom Griffiths, who wrote Algorithms to Live By, they call it one of the most useful ideas in all of decision science. It gives us a concrete roadmap.
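The claim that a shorter exploration phase wins when you care about how good the hire is, rather than whether it is the single best, can be illustrated with a small simulation. This is a rough sketch under stated assumptions: each candidate's quality is drawn uniformly at random between 0 and 1, the payoff is simply the quality of whoever gets hired, and the names `chosen_value` and `average_payoff` are made up for this example.

```python
import math
import random


def chosen_value(values, cutoff):
    """Look-then-leap on raw quality scores; returns the hired candidate's score."""
    benchmark = max(values[:cutoff]) if cutoff > 0 else float("-inf")
    for value in values[cutoff:]:
        if value > benchmark:
            return value
    # No one beat the benchmark, so the last candidate is hired by default.
    return values[-1]


def average_payoff(n, cutoff, trials=10_000, seed=0):
    """Average quality of the hire, assuming uniform(0, 1) quality scores."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    total = 0.0
    for _ in range(trials):
        values = [rng.random() for _ in range(n)]
        total += chosen_value(values, cutoff)
    return total / trials


if __name__ == "__main__":
    n = 100
    short_cutoff = math.isqrt(n)   # square-root rule: 10 of 100
    long_cutoff = round(n / math.e)  # 1/e rule: 37 of 100
    print("sqrt(N) cutoff:", round(average_payoff(n, short_cutoff), 3))
    print("37% cutoff:    ", round(average_payoff(n, long_cutoff), 3))
```

Under these assumptions the square-root cutoff should give a higher average quality than the 37% cutoff, because the longer exploration phase more often ends with the best candidate already rejected and a mediocre last candidate hired by default.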
Yeah, it tells you don't just drift. Have a plan, explore first, then commit. It forces you to be intentional. I see where you're coming from and I get the appeal of having a roadmap. But I have to push back on that pretty strongly. I think we need to be incredibly careful about over-applying that metaphor. I'm with Robert Wiblin from 80,000 Hours on this. Wiblin argues the secretary problem is, and I'm quoting him here, such a poor approximation of real life that we should not see it as useful. Wow, that seems really harsh. It's a model. I mean, all models are wrong, but some are useful, right? That's the old saying. But is it useful, though? I mean, really, look at the range we just discussed. Depending on whether you can call someone back or whether you just want a good outcome, the math tells you to stop somewhere between 10% and 61%. That is a massive gap. That is not a useful range. That is, I would argue, false precision. If you tell a listener, follow the 37% rule, they might literally reject the love of their life at the 30% mark because of a math problem that doesn't actually apply to their situation. That's not a roadmap. That's like having a GPS that's programmed to drive you into a lake. Okay, that is a fair point. If you follow the number blindly, you drive off a cliff. But aren't you and Wiblin sort of throwing the baby out with the bathwater here? Yeah. Because if I ignore the rule entirely, what am I left with? Just go with your gut? Well, my gut is terrible at probability. My gut buys lottery tickets and stays in bad relationships for way too long. The rule at least gives my gut some structure. And that's it. That's the synthesis we need to reach here. Both sides are right, but in different ways. The number 37% is fragile. It is brittle. It breaks the moment you touch any of the assumptions. But the principle, the core idea of explore deliberately, then commit decisively, that is incredibly robust.
That structure survives every single modification to the problem. Whether the right number is 10% or 61%, the sequence is always the same. There must be a period where you are just learning what good looks like. Yes. A dedicated exploration phase. And then you must have a period where you are ready and willing to pull the trigger. Exactly. So the takeaway isn't a magic number. It's a strategy. It's about having distinct phases of action instead of just wandering through life, hoping for the best. Don't obsess over the 37%, but obsess over the sequence. Don't commit to the first apartment you see, but also don't look forever. You got it. Okay, so we've established the math and the philosophy. I think now let's look at what the data says about real humans. Because we are not algorithms. Do we actually follow this look-then-leap pattern naturally? We do, surprisingly. But we tend to leap just a little too early. We're impatient creatures. We are. In laboratory studies, like one by Seale and Rapoport back in 1997, they basically set up the secretary problem for real people with real, albeit small, monetary stakes. And consistently, people stop looking around the 31% mark, not 37%. That's pretty close though. I'm actually impressed. It is surprisingly close. We're within about 6 percentage points of the mathematical optimum, which is not bad for our messy human brains. But the deviation, stopping a little bit early, is very consistent across studies. So why do we do that? Are we just more risk-averse than the math says we should be? It looks like an error in the lab, but in the real world, it's actually a really rational adaptation. The math problem assumes that search is free. It assumes that interviewing candidate number 99 costs you the exact same amount of time and energy as interviewing candidate number one. But in real life, searching has costs. It takes time. It takes mental energy. It takes rent money while you're looking for that perfect apartment. Right. The cost of search.
If I spend six more months looking for a slightly cheaper apartment, I've lost six months of my life living in a hotel or on a friend's couch. That has a real cost. Exactly. So humans, I think, intuitively price in that cost and stop a little earlier. We're surprisingly smart about that. But there is one area where our intuition fails us completely. And it leads to something called the satisficing paradox. Oh, I love this term. Satisficing. It sounds like satisfying, but it's a very specific technical term, right? Yes. It was coined by the Nobel laureate Herbert Simon. It's a blend of the words satisfy and suffice. And it's distinct from its opposite, which is maximizing. Okay. So a maximizer is someone who needs to find the absolute best option no matter what. They have to check everything. Right. A satisficer, on the other hand, sets a threshold, a good-enough bar. They might say, I need a job that pays at least $50,000 and is within a 30-minute commute. And they'll take the very first option that meets that threshold. And the paradox comes from a famous study on job seekers, right? Yes. A really famous one by Iyengar, Wells, and Schwartz from 2006. They tracked graduating seniors who were looking for their first jobs. And they first identified which students were maximizers, the ones who were refreshing job boards at 2am, applying to everything, comparing every last detail of every offer, and which ones were satisficers. And who did better? I'm guessing the maximizers. Objectively, you are correct. The maximizers did better. They secured starting salaries that were on average roughly $7,500 higher than the satisficers. Wow. At the time, that was about a 20% bump in pay. 20%. That's huge. If you're 22 years old, $7,500 is a lot of money. That's life-changing. It is. But here's the paradox. Those same maximizers, the ones with more money in their pockets, were significantly less satisfied with the jobs they ended up taking. So they felt worse about the outcome? Much worse.
They experienced more negative affect, more anxiety, more stress, and more regret during and after the whole process. So they got objectively better outcomes, more money, but subjectively worse outcomes. They were richer, but more miserable. Precisely the paradox. Wait, okay, I have to stop you there. Because I want to challenge this idea that their misery automatically makes their strategy wrong. I mean, $7,500 is real money. It is. That compounds over a career. If I'm advising a student or if I'm a parent, shouldn't I tell them to maximize to get that 20% bump? We shouldn't just discount the economic reality because they feel a little anxious during the job search. The anxiety passes, the compounding interest remains. I'd argue the maximizers are winning the game of capitalism, even if they're stressing out about it. I see the logic, and it's the really intuitive response, right? Suck it up, get the money. But let me push back on that using a later paper by Cheek and Schwartz from 2016. They looked deeper into why the maximizers were so miserable. And the key insight is that the issue isn't the ambition, it's the mechanism. What do you mean by that? The misery doesn't come from the goal of having high standards. It's totally fine to want the best job or the highest salary. The part that destroys your soul, that causes all the negative affect, is the strategy of exhaustive comparison. Ah, so it's the process. It's the process. Maximizers are miserable because they are constantly comparing, constantly looking over their shoulder, wondering what if, and checking the job boards even after they've accepted a great offer. They can't let go of the counterfactuals. The ghost of the even better job haunts them. So the advice isn't lower your standards? No, absolutely not. The resolution, the healthier approach, is what we call strategic satisficing. You can and should have high standards. You can want that $7,500 bump. But the strategy is this.
Once you find a job that meets your high threshold, you stop the exhaustive comparison. You don't ask, is there something even better out there? You say, this meets my standard for excellence, I'm done. So, want the best, but don't shop for the best. That's a great way to put it. The misery is in the endless shopping, not in the high standards. This feels incredibly relevant to the world of dating apps, because if there was ever a machine built for exhaustive, endless comparison, it's Tinder or Hinge or Bumble. Oh, absolutely. We're talking about what researchers call an infinite pool. The secretary problem assumes a finite pool. You know there are 100 candidates. On a dating app, the pool effectively never ends. There are 350 million users globally. And Tinder users swipe, what, something like 140 profiles a day? Around that, yeah. It's a staggering number. And does that make us better at choosing? Or does it just break our brains? The evidence suggests it fundamentally changes how we respond to potential partners. A study by Pronk and Denissen found that across a single session of swiping, the probability of a user accepting a match drops by 27%. We get pickier the longer we swipe. That seems backward. We get into what they call a rejection mindset. We start looking for flaws. We become numb to faces. Everyone becomes a caricature. Oh, he has a fish in his photo. Reject. Oh, she used the wrong emoji. Reject. But hold on a second. Yeah. Because I've seen the big Scheibehenne meta-analysis from 2010. They looked at this whole idea of choice overload, the theory that too many options is bad for us. And they found no reliable, universal choice overload effect. Right. That's a famous paper. Sometimes more options are good. There's even a PNAS study showing that marriages that start online have slightly lower divorce rates. Mathematically, having access to 350 million people should lead to a better match than just picking someone from your small town.
I'm not convinced that too many options is the real villain here. I think you are overlooking the psychological cost of the interface itself. Tell the no-choice-overload theory to someone who is swiping 140 times a day and feels empty inside. I'd cite studies by D'Angelo and Toma in 2017 and Brady et al. in 2022. They found that daters choosing from a pool of 24 profiles were less satisfied with their choice than those who chose from a pool of only six. So the bigger menu made them less happy with their meal? Exactly. Because the perceived abundance of options decreases your readiness to commit to any single one, you're always thinking about the 23 you didn't choose. The Scheibehenne meta-analysis is right that overload is moderated by context. But dating apps create the exact specific context, the time pressure, the infinite scroll, the rapid visual comparison, where choice overload thrives. It turns us all into maladaptive maximizers. So it's not the number of people out there that's the problem. It's the design of the access to them. The medium is the message, and the medium is saying keep swiping. Something better is just one scroll away. Fair point. Okay, we've established how individuals navigate this and often how we struggle with it, especially in the digital age. But now I want to zoom out, because organizations, big companies, face this exact same trade-off. They do, every single day. They have to decide. Do we exploit our current profitable products, you know, make money right now? Or do we explore new risky ideas that might become the future? This is the core concept of organizational ambidexterity. The struggle for an organization to exploit and explore at the same time. The seminal paper on this was written by James March back in 1991. And his conclusion was, frankly, pretty grim. What did he find? He found that organizations naturally, almost inevitably, drift toward exploitation. And why is that?
Because the returns on exploitation are, in his words, proximate, precise, and certain. If you improve your current flagship product by 1%, you can calculate exactly how much more money you'll make this quarter. It's safe. If you explore some weird new idea, you might make a billion dollars in 10 years, or you might, and probably will, fail completely. So companies get addicted to the sure thing, the quarterly earnings report. Exactly. And this leads directly to the exploitation trap. They get so good at what they currently do that they become incapable of doing anything new, and then the world changes and they die. And we have two perfect, almost tragic case studies for this. Kodak and Nokia. The poster children for the exploitation trap. Let's start with Kodak. The classic story everyone tells is, they didn't see digital photography coming. Which is completely false. It's the opposite of what happened. Right. A Kodak engineer named Steve Sasson invented the first digital camera at Kodak in 1975. They literally had the patent. They held the future in their hands. But at that same time, they also had 90% of the US film market. The profit margins on film and photo paper were astronomical. A cash cow. A massive one. So management looked at this amazing digital prototype and basically said, that's cute, but don't tell anyone about it. They actively suppressed the technology to protect their film market share. So that's a failure of structure. They couldn't explore because the exploitation engine was just too powerful and too profitable. Precisely. Now compare that story to Nokia. At their peak, Nokia had something like 40% of the global mobile phone market share. They were untouchable. And yet by the end, they had 57 incompatible versions of their own operating system. It was a complete catastrophe. But the failure there wasn't just structural, it was emotional. A study by INSEAD found that the root cause of Nokia's failure was fear. Fear? Fear of what?
Middle managers were terrified of top management. The top executives were described as temperamental and prone to shooting the messenger. So middle managers didn't want to bring them bad news. They didn't want to be the one to say, hey, that new iPhone from Apple, their operating system is actually way better than ours. So what did they do? They lied. The study says, and I'm quoting, top management was directly lied to about the state of their own software. Wow. So Kodak failed because they loved their old product too much. And Nokia failed because everyone was too scared to tell the truth about the future. Both are exploitation traps. One was driven by greed and institutional inertia. The other was driven by fear. Is there a success story here? Who actually gets this balance right? Amazon is a fascinating counterexample. Do you remember the Fire Phone? Vaguely. I remember it was a total disaster. Right. A $170 million disaster. A complete write-off. It was a massive public failure. Of exploration. But, and this is the absolute key, Jeff Bezos didn't fire the whole team. They took that team and they took the learnings from the failure, specifically the voice recognition and hardware experience they had developed. And they pivoted it into the Echo, into Alexa, and the entire smart speaker market, which they now dominate. They took a failed exploration and turned it into a brand new, massively profitable exploitation engine. That is organizational ambidexterity. You have to be willing to lose $170 million on a phone to get the smart speaker market. This reminds me of that whole mythology around Google's 20% time policy. The idea that you just give engineers 20% of their time to explore whatever they want and magic happens. We always hear that Gmail and AdSense came from that. It's a great story, a great piece of corporate branding. But it's mostly a myth. A myth? For the most part. In reality, very few engineers actually used it.
Marissa Mayer, when she was a top executive there, famously said it's really 120% time. Meaning you have to do your full-time 100% job. And then you can work on your passion project on the weekends and at night. It wasn't protected time. And this brings us to another one of those crucial conflicts in how we think about innovation, doesn't it? It really does. Because wait, even if it was 120% time, it did give us Gmail. It did give us AdSense. So clearly, just giving smart people permission to explore works, even if it's messy. I'd argue that the lesson is, you just need to hire smart people and get out of the way. If you try to manage innovation too much, you kill it. Just let the smart people play. I really have to disagree with that. I think the permission model is romantic, but it's ultimately ineffective. If you look at the hard research by O'Reilly and Tushman, they're the godfathers of this field. They studied how organizations actually succeed at ambidexterity. They found that companies with separate, structured exploration units, meaning dedicated teams, protected budgets, separate P&Ls, not just free time, those companies succeeded over 90% of the time. Over 90%. And the permission model? Just letting people play? The companies that just had unsupported teams or these vague permission-based policies like 20% time, their success rate in launching new ventures was 0%. Zero, not 10 or 5. Zero. The takeaway is crystal clear. You don't need permission. You need structure. Exploration is fragile. It's a seedling. If you put it in a cage match with the giant of exploitation, exploitation wins every single time because it makes money today. You have to build a walled garden around exploration to protect it. That's a powerful insight. You can't just hope for innovation. You have to budget for it. You have to institutionalize it. You have to protect it. Okay, let's pivot to one more crucial area of evidence. And that's education.
Because we see this explore versus exploit tension in how we raise and train our kids. Should you specialize early, get your 10,000 hours in as fast as possible? Or should you sample a bunch of different things? This is the classic Tiger Woods versus Roger Federer debate, right? Woods was golfing from the age of two. Federer played a dozen different sports until his late teens and then focused on tennis. And there's a fascinating natural experiment on this, a study by Malamud comparing the educational systems in England and Scotland. Yes, it's a perfect setup. In England, the system forces students to specialize very early, around age 16. They have to pick a specific track like sciences or humanities. In Scotland, the system is much broader and allows for general study for the first two years of university before declaring a major. So you have the early specializers in England and the late specializers in Scotland. And who wins? Well, in the short term, the English students, the early specializers, do get a bit of a head start. They graduate with more specific skills and their initial wages are slightly higher. Right. And I'd argue that in a hyper-competitive global economy, that head start is massive. If I'm hiring a software engineer, I want the kid who has been coding since they were 16, not the one who spent two years studying philosophy and then decided to try coding. Depth produces excellence. The whole 10,000-hour rule suggests the English students should be dominating in the long run. But that's not what happens. Here's the catch. Malamud found that the English students, the early specializers, were significantly more likely to switch to entirely unrelated fields later in their career. So they quit their specialization. They quit the very field they specialized in because they were forced to commit before they actually knew what they liked or what they were good at. They had high initial skill, but they had low match quality.
And the Scottish students? The Scottish students, who explored for longer, were less likely to make those dramatic career shifts. They found a better fit the first time. So the early specializers win the sprint out of college, but the late specializers win the career marathon. And the lesson is that match quality, finding the thing you are actually suited for and interested in, outweighs the benefit of a few extra years of early practice. You can always catch up on skills. You can't get back the lost years you spent in a career you fundamentally hate. That is a profound validation of the gap year. It is. Exploration often looks like wasting time in the short run, but it's actually the most efficient path in the long run. Okay, this has been fascinating. We've covered the problem, the core math, and the evidence from individuals, companies, and schools. Now let's get practical. Let's move to part three. Application. How do I actually use any of this tomorrow? We need to give you some algorithms you can actually run in your own head. Let's start with one that has a great name. The multi-armed bandit. It's a fantastic name. It comes from the old slang for a slot machine, the one-armed bandit. So imagine you are in a casino. You're standing in front of a row of slot machines. You have a bucket of coins. You know that the machines have different payout rates. Some are generous, some are stingy, but you don't know which is which. So I have to pull the levers to find out which ones are good. That's the exploration part. Right. But every time you pull a lever on what turns out to be a bad machine just to check, you are losing money. That's the cost of exploration. But if you find a pretty good machine on your first try and just pull that one lever forever, that's exploitation. You'll never know if the machine right next to it pays out double. So it's the same dilemma. How do you solve it? There are a few different strategies, a few different algorithms.
The simplest one to understand is called Epsilon Greedy. Epsilon Greedy. It sounds like a Wall Street villain. It does. Epsilon is just a math term for a very small value. The strategy is this. Exploit your best known option most of the time. Say 90% of the time. But 10% of the time, that's the epsilon part, you force yourself to explore a random option. If I have a favorite restaurant that I know is great, nine times out of ten, I should go there. But one time out of ten, I have to force myself to try the new place across the street. Even if I suspect it's going to be worse. Exactly. Because that 10% of forced exploration protects you from getting stuck in a rut or what economists call a local maximum. It ensures you never completely stop learning about the world. Your favorite restaurant might close or a new, better one might open. Then there's another one, the Gittins index. This one is a bit more complex mathematically, but the core insight is just beautiful. It's one of my favorite ideas. The Gittins index is a way of assigning a numerical value to the unknown. And it proves mathematically that an unknown option is worth more than a known option that has a decent but not amazing payout. Why would it be worth more? Because the unknown has uncapped upside. Think about it. If you try a new restaurant and it's terrible, you lose the cost of one meal. That's a very capped downside. But if it turns out to be amazing, you gain a new favorite spot for the rest of your life. That is an enormous, uncapped upside. The math says we should be optimistic and overvalue uncertainty, at least for a while. Optimism is mathematically optimal. I love that. In the exploration phase, absolutely. Okay, so let's turn these powerful ideas into some actionable frameworks. I promised the listeners at the top of the show the five-question stopping test. 
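The epsilon-greedy strategy described above can be sketched in a few lines of Python. This is only an illustrative simulation, not something from the episode: the function name `epsilon_greedy`, the payout rates, the number of pulls, and the fixed random seed are all assumptions chosen for the example. Each machine pays 1 coin with a hidden probability, and the player explores a random machine 10% of the time.

```python
import random

def epsilon_greedy(payout_rates, pulls=10_000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy play on slot machines with hidden win rates.

    payout_rates, pulls, epsilon and seed are illustrative assumptions.
    Returns (pulls per machine, total winnings).
    """
    rng = random.Random(seed)
    n = len(payout_rates)
    counts = [0] * n      # how often each machine was pulled
    totals = [0.0] * n    # winnings observed on each machine
    reward = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon or not any(counts):
            # Explore: pick a machine at random (also used for the very first pull).
            arm = rng.randrange(n)
        else:
            # Exploit: pick the best observed average; untried machines count as 0.
            arm = max(range(n),
                      key=lambda i: totals[i] / counts[i] if counts[i] else 0.0)
        win = 1.0 if rng.random() < payout_rates[arm] else 0.0
        counts[arm] += 1
        totals[arm] += win
        reward += win
    return counts, reward

counts, reward = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, reward)
```

With enough pulls, most of the coins end up in the most generous machine, while the 10% of random pulls keeps the estimates for the other machines from going stale.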
This is for when you're stuck in a decision loop. Maybe you're hiring for a role or you're dating or buying a house and you just can't pull the trigger. This is a diagnostic tool to figure out if you should stop searching or keep looking. Question one, can you articulate specifically what great looks like? If the answer to that is no, you have to keep exploring. It's a clear signal. You haven't gathered enough data to even have a baseline yet. You are still in the first 37% of the secretary problem. Question two, are new options teaching you anything new? This is so key. If you're interviewing candidates for a job and the 20th person looks and sounds just like the 10th person, you've likely hit diminishing returns. Your exploration is no longer productive. If you aren't learning, it's time to stop exploring. Question three, does your current best option meet your predefined threshold? This goes right back to strategic satisficing. Before you started, you set your high standards. Have you found something that actually meets them? If the answer is yes and you're still looking, you are in the danger zone of over-shopping. You're now a maximizer and you're searching for misery. Question four, has your best guess stopped changing? This comes from the career advice people at 80,000 Hours. If you've been thinking about what career to pursue for two years and your best guess for what to do has been the same for the last six months, more thinking probably won't help. You have reached the limit of simulation. You have to commit and take an action to get new data. And finally, question five, would you regret not trying one specific known thing? This is the Gittins index question. Is there a mystery box that is haunting you? If there is one specific alternative, a specific city you've always wanted to live in, a specific company you've always wanted to work for, that you haven't checked out yet, go check it. Resolve the uncertainty. Then and only then can you commit. I love that. 
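The five questions above can be written down as a small decision function. This is a rough sketch, not a procedure given in the episode: the names `SearchState` and `stopping_advice` are hypothetical, and the order in which the questions are checked is an assumption, since the hosts present the questions one by one without saying how to combine them.

```python
from dataclasses import dataclass

@dataclass
class SearchState:
    can_define_great: bool        # Q1: can you say what "great" looks like?
    still_learning: bool          # Q2: are new options teaching you anything?
    best_meets_threshold: bool    # Q3: does your best option meet your threshold?
    best_guess_stable: bool       # Q4: has your best guess stopped changing?
    haunting_alternative: bool    # Q5: is there one specific untried option?

def stopping_advice(s: SearchState) -> str:
    """Map answers to the five questions onto advice (check order is an assumption)."""
    if not s.can_define_great:
        return "keep exploring"                     # no baseline yet
    if s.haunting_alternative:
        return "check that one option, then decide" # resolve the mystery box first
    if s.best_meets_threshold:
        return "stop and commit"                    # further search is over-shopping
    if not s.still_learning or s.best_guess_stable:
        return "commit to your best guess and act"  # more searching adds no new data
    return "keep exploring"

print(stopping_advice(SearchState(True, False, True, True, False)))
```

Putting question one first reflects the point that without a baseline nothing else can be judged; putting question five second reflects "then and only then can you commit".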
It's like a flow chart for getting your brain unstuck. It is. It forces you to diagnose your paralysis. We also have a framework that's specifically for careers called Plan A, B, Z. This is also from 80,000 Hours and it's a brilliant way to structure your professional risk. Plan A is your best guess. It's what you're doing right now or what you plan to do next. You commit to it for a set period of time, say two to three years. You are in exploit mode on Plan A. Okay. Plan B is your nearby alternative. It's what you would pivot to if Plan A doesn't work out. But, and this is the critical part, you have to set a trigger in advance. What do you mean by a trigger? You decide the stopping rule before you are emotional and invested. You say, if I don't get promoted by January 2027, I will activate Plan B. Or, if my startup isn't profitable in 18 months, I will start looking for a new job. And Plan Z. What's that? Plan Z is the lifeboat. This is your absolute worst-case scenario plan. This is, if I run out of all my money and my house burns down, I will move back in with my parents and work at the local Starbucks. Why do you need a Plan Z? That sounds so pessimistic. Because knowing you have a lifeboat allows you to take bigger, smarter risks in Plan A. If you know deep down that you won't literally starve or be homeless, you have the psychological safety to swing for the fences in your career. That makes so much sense. One last concept before we wrap all this up. We've talked a lot about exploring when you're young and exploiting when you're old. But there's a really important nuance there, isn't there? Yes. It comes from the work of Laura Carstensen, and it's called socioemotional selectivity theory. It's a bit of a mouthful. A little bit. But the core idea is that the explore-exploit balance isn't strictly about your chronological age. It's about your perceived time horizon. Explain that difference. 
If you're 20 years old, you usually feel like you have a long, open-ended future, a long time horizon. So you explore. But imagine you are 20 and you just found out you have to move to a new city in two weeks. Do you spend your last two weeks exploring new restaurants in your current city? No, of course not. I go to my absolute favorite pizza place every single night, because my time there is about to end. Exactly. Your time horizon for that city shortened dramatically, so you immediately switched from exploration to exploitation. Conversely, imagine you are 50, but you just started a brand new career that you plan to do for the next 20 years. You should be in explore mode for that career. You should be acting like a 20-year-old in that specific domain of your life. So it's not about "I'm too old to explore." It's about asking, how much time do I have left in this specific game? Correct. You need to calibrate by domain. You can be an exploiter in your 30-year marriage, hopefully, and a complete explorer in your new woodworking hobby. Okay, let's bring this all home. We have covered a lot of ground. We started with a math problem that failed a love life. We saw that we humans get tripped up by the endless shopping part of satisficing. We saw that entire companies can die because they get addicted to the sure thing. And we saw that the solution again and again is structure. Algorithms are really just structures for our thinking. So if we boil this entire deep dive down, what are the three essential things our listeners need to walk away with? Okay, takeaway number one. The principle is greater than the number. Forget 37 percent. Just let that number go. The real lesson is the sequence. Explore. Deliberately set aside a conscious period of time where your only job is to learn what good looks like. Then commit decisively. Don't drift between the two modes. Be in one or the other. I like that. Takeaway number two. Strategic satisficing. 
Want the best, but do not shop for the best. The misery of the maximizer comes from the process, not from the high standards. So set your threshold. Find the first thing that meets it. And then, and this is the hardest part, stop looking. Delete the app. Unsubscribe from the job alerts. And takeaway number three. Calibrate by domain. Don't just decide to settle down in all aspects of your life because you turned 30 or 40. Look at your time horizon for this specific decision. If you're new to a city, explore it like you're 20. If you're new to a career, explore it. If you are 50 years deep into a loving marriage, for God's sake, exploit that relationship. Deepen it. Enjoy it. Don't start looking for new. And I want to bring it right back to where we started. To Michael Trick. The man who proposed to the statistically optimal woman and got rejected. It's such a tragic story, but it's also a beautiful one, I think. It is. Because it reminds us that all these algorithms, all these models, they are single-player games. The secretary problem assumes you are the only one making a choice. But life. Life is a multiplayer game. It absolutely is. The algorithm told him when to commit. It was perfect for that. But it couldn't tell him if that commitment would be reciprocated. It couldn't measure chemistry or her feelings or the timing on her end. But, and here's the real kicker to the story, he didn't give up. He didn't say, well, math is fake and love is a lie. He realized that the algorithm was a guide, not a God. He kept looking. And he eventually found someone else. He's happily married now. Because commitment is what transforms all that exploration into a life well-lived. The math gets you to the door, but you have to be the one to choose to walk through it. And someone has to choose to walk through it with you. And so, here's a final thought for you. The next time you feel stuck making a decision, ask yourself this simple question. Am I still learning anything new? 
Or am I just stalling? Because if the data has stopped changing, the exploration phase is over. It's time to leap. And if you leap and you miss, well, that's just more data for the next round. Exactly. You can find full research and all the sources we talked about at research.yoda.me. That's yuda.eismg. Thanks for diving deep with us. See you next time.