In 1968, the social psychologists Stanley Milgram, Leonard Bickman, and Lawrence Berkowitz decided to cause a little trouble. First, they put a single person on a street corner and had him look up at an empty sky for sixty seconds. A tiny fraction of the passing pedestrians stopped to see what the guy was looking at, but most just walked past. Next time around, the psychologists put five skyward-looking men on the corner. This time, four times as many people stopped to gaze at the empty sky. When the psychologists put fifteen men on the corner, 45 percent of all passersby stopped, and increasing the cohort of observers yet again made more than 80 percent of pedestrians tilt their heads and look up.
This study appears, at first glance, to be another demonstration of people’s willingness to conform. But in fact it illustrated something different, namely the idea of “social proof,” which is the tendency to assume that if lots of people are doing something or believe something, there must be a good reason why. This is different from conformity: people are not looking up at the sky because of peer pressure or a fear of being reprimanded. They’re looking up at the sky because they assume—quite reasonably—that lots of people wouldn’t be gazing upward if there weren’t something to see. That’s why the crowd becomes more influential as it becomes bigger: every additional person is proof that something important is happening. And the governing assumption seems to be that when things are uncertain, the best thing to do is just to follow along. This is actually not an unreasonable assumption. After all, if the group usually knows best (as I’ve argued it often does), then following the group is a sensible strategy. The catch is that if too many people adopt that strategy, it stops being sensible and the group stops being smart.
Consider, for instance, the story of Mike Martz, the head coach of the St. Louis Rams. Going into Super Bowl XXXVI, the Rams were fourteen-point favorites over the New England Patriots. St. Louis had one of the most potent offenses in NFL history, had led the league in eighteen different statistical categories, and had outscored their opponents 503 to 273 during the regular season. Victory looked like a lock.
Midway through the first quarter, the Rams embarked on their first big drive of the game, moving from their own twenty yard line to the Patriots’ thirty-two. On fourth down, with three yards to go for a first down, Martz faced his first big decision of the game. Instead of going for it, he sent on field-goal kicker Jeff Wilkins, who responded with a successful kick that put the Rams up 3 to 0.
Six minutes later, Martz faced a similar decision, after a Rams drive stalled at the Patriots’ thirty-four yard line. With St. Louis needing five yards for a first down, Martz again chose to send on the kicking team. This time, Wilkins’s attempt went wide left, and the Rams came away with no points.
By NFL standards, Martz’s decisions were good ones. When given the choice between a potential field goal and a potential first down, NFL coaches will almost always take the field goal. The conventional wisdom among coaches holds that you take points when you can get them. (We’ll see shortly why “conventional wisdom” is not the same as “collective wisdom.”) But though Martz’s decisions conformed to the conventional wisdom, they were wrong.
Or so, at least, the work of David Romer would suggest. Romer is an economist at Berkeley who, a couple of years ago, decided to figure out exactly what the best fourth-down strategy actually was. Romer was interested in two different variations of that problem. First, he wanted to know when it made sense to go for a first down rather than punt or kick a field goal. And second, he wanted to know when, once you were inside your opponent’s ten yard line, it made sense to go for a touchdown rather than kick a field goal. Using a mathematical technique called dynamic programming, Romer analyzed just about every game—seven hundred in all—from the 1998, 1999, and 2000 NFL seasons. When he was done, he had figured out the value of a first down at every single point on the field. A first-and-ten on a team’s own twenty yard line was worth a little bit less than half a point—in other words, if a team started from its own twenty yard line fourteen times, on average it scored just one touchdown. A first-and-ten at midfield was worth about two points. A first-and-ten on its opponent’s thirty yard line was worth three. And so on.
Then Romer figured out how often teams that went for a first down on fourth down succeeded. If you had a fourth-and-three on your opponent’s thirty-two yard line, in other words, he knew how likely it was that you’d get a first down if you went for it. And he also knew how likely it was that you’d kick a field goal successfully. From there, comparing the two plays was simple: if a first down on your opponent’s twenty-nine yard line was worth three points, and you had a 60 percent chance of getting the first down, then the expected value of going for it was 1.8 points (3 × 0.6). A field goal attempt from the thirty-one yard line, on the other hand, was worth barely more than a single point. So Mike Martz should have gone for the first down.
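The comparison in this paragraph can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Romer's actual model: the function name is hypothetical, a failed attempt is assumed to be worth zero points, and the field-goal figure of 1.1 is a placeholder, since the text says only that the kick was worth "barely more than a single point."

```python
def expected_points(success_prob, value_if_success, value_if_failure=0.0):
    """Probability-weighted average of the two outcomes of a play."""
    return (success_prob * value_if_success
            + (1 - success_prob) * value_if_failure)

# Figures from the text: a 60 percent chance of converting, and a
# first down at that spot on the field worth three points.
go_for_it = expected_points(0.6, 3.0)

# Placeholder (assumption): the text gives no exact value for the kick,
# only "barely more than a single point".
field_goal_value = 1.1

print(f"Go for it:  {go_for_it:.1f} points")
print(f"Field goal: {field_goal_value:.1f} points")
print("Better call:", "go for it" if go_for_it > field_goal_value else "kick")
```

With these numbers, going for it comes out well ahead, and it would stay ahead for any field-goal value below 1.8 points.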
The beauty of Romer’s analysis was that it left nothing out. After all, when you try a fifty-two yard field goal, it isn’t just the potential three points you have to take into account. You also have to consider the fact that if you fail, your opponents will take over at their own thirty-five yard line. Romer could tell you how many points that would cost you. Every outcome, in other words, could be compared to every other outcome on the same scale.
Romer’s conclusions were, by NFL standards, startling. He argued that teams should pass up field goals and go for first downs far more often than they do. In fact, just about any time a team faced a fourth down needing three or fewer yards for a first, Romer recommended they go for it, and between midfield and the opponent’s thirty yard line—right where the Rams were when Martz made his decisions—Romer thought teams should be even more aggressive. Inside your opponent’s five yard line, meanwhile, you should always go for the touchdown.
Romer’s conclusions were the kind that seem surprising at first and then suddenly seem incredibly obvious. Consider a fourth down on your opponent’s two yard line. You can take a field goal, which is essentially a guaranteed three points, or go for a touchdown, which you will succeed at scoring only 43 percent of the time. Now, 43 percent of seven points is roughly three points, so the value of the two plays is identical. But that’s not all you have to think about. Even if the touchdown attempt fails, your opponent will be pinned on its two yard line. So the smart thing to do is to go for it.
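The goal-line argument can be laid out the same way. The sketch below is an illustration under stated assumptions, not Romer's calculation. The 43 percent success rate and the three- and seven-point scores come from the text. The two field-position values are assumptions: 0.5 points for the opponent's possession after a kickoff (roughly the value the text gives a first-and-ten on a team's own twenty) and 0.0 points for a possession that starts pinned on its own two, a hypothetical figure the text does not supply.

```python
TOUCHDOWN_POINTS = 7.0
FIELD_GOAL_POINTS = 3.0
TD_SUCCESS_PROB = 0.43  # from the text

def field_goal_value(opp_value_after_kickoff):
    """Near-certain three points, minus what the opponent's next
    possession (after the kickoff) is worth to them."""
    return FIELD_GOAL_POINTS - opp_value_after_kickoff

def go_for_touchdown_value(opp_value_after_kickoff, opp_value_when_pinned):
    """Weigh a touchdown followed by a kickoff against a failed try
    that leaves the opponent on its own two yard line."""
    p = TD_SUCCESS_PROB
    scored = TOUCHDOWN_POINTS - opp_value_after_kickoff
    stopped = 0.0 - opp_value_when_pinned
    return p * scored + (1 - p) * stopped

# Scoring alone: the two plays are nearly identical.
print(round(TD_SUCCESS_PROB * TOUCHDOWN_POINTS, 2), "vs", FIELD_GOAL_POINTS)

# Adding field position (both values are assumptions, see above).
after_kickoff = 0.5
pinned = 0.0
go = go_for_touchdown_value(after_kickoff, pinned)
kick = field_goal_value(after_kickoff)
print(f"Go for it: {go:.3f}  Field goal: {kick:.3f}")
```

Under these assumptions, going for it comes out ahead whenever a drive that starts on the opponent's own two is worth less to them than a drive that starts after a kickoff, which is the point the paragraph makes about the opponent being pinned.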
Or consider a fourth-and-three at midfield. Half the time you’ll succeed, and half the time you’ll fail, so it’s a wash (since no matter what happens, either team will have the ball at the same place on the field). But the 50 percent of the time that you succeed, you’ll gain an average of six yards, leaving you better off than your opponent is when you fail. So, again, aggressiveness makes sense.
Obviously there were things that Romer couldn’t factor in, including, most notably, the impact of momentum on a team’s play. And his numbers were averaged across the league as a whole, so individual teams would presumably need to do some adjusting to figure out their particular chances of success on fourth down. Even so, the analysis seems undeniable: coaches are being excessively cautious. And, as for Mike Martz, his two decisions in that Super Bowl game were about as bad as decisions get. Martz refused to go for a first down on the Patriots’ thirty-two yard line when the Rams needed just three yards. Romer’s calculations suggest that Martz would have been justified in going for a first down even if the Rams had needed nine yards (since at that place on the field, the chances of missing a field goal are high, and the field-position cost is slight). And that’s with an average team. With an offense like the Rams’, the value of going for it would presumably have been much higher. While it’s impossible to say that any one (or two) decisions were responsible for the final outcome, it’s not exactly surprising that the Rams lost that Super Bowl.
Again, though, Martz was not alone. Romer looked at all the first-quarter fourth-down plays in the three seasons he studied, and found 1,100 plays where the teams would have been better off going for it. Instead, they kicked the ball 992 times.
This is perplexing. After all, football coaches are presumably trying their best to win games. They are experts. They have an incentive to introduce competitive innovations. But they’re not adopting a strategy that would help them win. It’s possible, of course, that Romer is wrong. Football is a remarkably complex, dynamic game, in which it’s hard to distinguish among skill, strategy, emotion, and luck, so there may be something important that his computer program is missing. But it’s not likely. Romer’s study suggests that the gains from being more aggressive on fourth down are so big that they can’t be explained away as a fluke or a statistical artifact. Teams that became more aggressive on fourth down would unquestionably have a competitive edge. But most NFL coaches prefer to be cautious instead. The interesting question is: Why?
The answer, I think, has a lot to do with imitation and social proof and the limits of group thinking. First, and perhaps most important, playing it conservatively on fourth down is as close to a fundamental truth in professional football as you get. In the absence of hard evidence to the contrary, it’s easier for individuals to create explanations to justify the way things are than to imagine how they might be different. If no one else goes for it, then that must mean that it doesn’t make sense to go for it.
The imitative impulse is magnified by the fact that football—like most professional sports—is a remarkably clubby, insular institution. To be sure, there have been myriad genuine innovators in the game—including Martz himself—but in its approach to statistical analysis the game has been strangely hidebound. The pool of decision makers is not, in other words, particularly diverse. That means it is unlikely to come up with radical innovations, and even more unlikely to embrace them when they’re proposed. To put it another way, the errors most football coaches make are correlated: they all point in the same direction. This is exactly the problem with most major-league baseball teams, too, as Michael Lewis documented so well in his book about the recent success of the Oakland A’s, Moneyball. Billy Beane and Paul DePodesta, the brain trust of the A’s, have been able to build a tremendously successful team for very little money precisely because they’ve rejected the idea of social proof, abandoning the game’s conventional strategic and tactical wisdom in order to cultivate diverse approaches to player evaluation and development. (Similarly, the one current NFL coach who appears to have taken Romer’s ideas seriously—and perhaps even used them in games—is the New England Patriots’ Bill Belichick, whose penchant for rejecting the conventional wisdom has helped the Patriots win two Super Bowls in three years.)
Another factor shaping NFL coaches’ caution may be, as Romer himself suggests, an aversion to risk. Going for it on fourth-and-two makes strategic sense, but it may not make psychological sense. After all, Romer’s strategy means that teams would fail to score roughly half the time they were inside their opponent’s ten yard line. That’s a winning strategy in the long run. But it’s still a tough ratio for a risk-averse person to accept. Similarly, even though punting on fourth down makes little sense, it at least limits disaster.
The risk-averse explanation makes additional sense if you think about the pressures that any community can bring to bear on its members. That doesn’t mean that NFL coaches are forced to be conservative. It just means that when all of one’s peers are following the exact same strategy it’s difficult to follow a different one, especially when the new strategy is more risky and failure will be public and inescapable (as it is for NFL coaches). Under those conditions, sticking with the crowd and failing small, rather than trying to innovate and run the risk of failing big, makes not just emotional but also professional sense. This is the phenomenon that’s sometimes called herding. Just as water buffalo will herd together in the face of a lion, football coaches, money managers, and corporate executives often find the safety of numbers alluring—as the old slogan “No one ever got fired for buying IBM” suggests.
The striking thing about herding is that it takes place even among people who seem to have every incentive to think independently, like professional money managers. One classic study of herding, by David S. Scharfstein and Jeremy C. Stein, looked at the tendency of mutual-fund managers to follow the same strategies and herd into the same stocks. This is thoroughly perplexing. Money managers have jobs, after all, only because they’ve convinced investors that they can outperform the market. Most of them can’t. And surely herding only makes a difficult task even harder, since it means the managers are mimicking the behavior of their competitors.
What Scharfstein and Stein recognized, though, was that mutual-fund managers actually have to do two things: they have to invest wisely, and they have to convince people that they’re investing wisely, too. The problem is that it’s hard for mutual-fund investors to know if their money manager is, in fact, investing their money wisely. After all, if you knew what investing wisely was, you’d do it yourself. Obviously you can look at performance, but we know that short-term performance is an imperfect indicator of skill at best. In any one quarter, a manager’s performance may be significantly better or worse depending on factors that have absolutely nothing to do with his stock-picking or asset-allocation skills. So investors need more evidence that a mutual-fund manager’s decisions are reasonable. The answer? Look at how a manager’s style compares to that of his peers. If he’s following the same strategy—investing in the same kinds of stocks, allocating money to the same kinds of assets—then at least investors know he’s not irrational. The problem, of course, is that this means that, all other things being equal, someone who bucks the crowd—by, say, following a contrarian strategy—is likely to be considered crazy.
This would not matter if investors had unlimited patience, because the difference between good and bad strategies would eventually show up in the numbers. But investors do not have unlimited patience, and even the smartest investor will fail a significant percentage of the time. It’s much safer for a manager to follow the strategy that seems rational rather than the strategy that is rational. As a result, managers anxious to protect their jobs come to mimic each other. In doing so, they destroy whatever information advantage they might have had, since the mimicking managers are not really trading on their own information but are relying on the information of others. That shrinks not only the range of possible investments but also the overall intelligence of the market, since imitating managers aren’t bringing any new information to the table.