The Wisdom of Crowds: January 2010

Ask the Audience

A good analysis of the Ask the Audience lifeline from Who Wants to be a Millionaire is available here, while on the other hand we have a slight sceptic here.

In April 1946, at a forum organized by the New York Herald-Tribune, General Wild Bill Donovan gave a speech entitled “Our Foreign Policy Needs a Central Intelligence Agency.” During World War II, Donovan had been the head of the Office of Strategic Services, the United States’ chief wartime intelligence organization, and once the war ended he became a loud public advocate for the creation of a more powerful peacetime version of the OSS. Before the war, the United States had divided intelligence-gathering responsibilities among the different military services. But the failure of any of those services to anticipate the attack on Pearl Harbor—despite what seemed, in retrospect, to be ample evidence that a major Japanese strike was in the works—had pointed up the system’s limitations and suggested the need for a more comprehensive approach to intelligence gathering. So, too, did the prospect of conflict with the Soviet Union, which even in 1946 loomed as a real possibility, and the advent of new technologies—Donovan cited “the rocket, the atomic bomb, bacteriological warfare”—that made America’s borders seem far from impregnable. In his April speech, Donovan hit on all of these themes, arguing that what the United States needed was “a centralized, impartial, independent agency” to take charge of all of the country’s intelligence operations.

Donovan’s public speaking didn’t do much for his own career, since his sharp criticisms alienated the intelligence community and probably doomed his chances of returning to government service. Nonetheless, in 1947, Congress passed the National Security Act and created the Central Intelligence Agency. As historian Michael Warner has put it, the goal of the law was to “implement the principles of unity of command and unity of intelligence.” Fragmentation and division had left the United States vulnerable to surprise attack. Centralization and unity would keep it safe in the future.

In fact, though, the centralization of intelligence never happened. Although the CIA was initially the key player in the postwar period, as time passed the intelligence community became more fragmented than ever, divided into a kind of alphabet soup of agencies with overlapping responsibilities and missions, including not just the CIA but also the National Security Agency, the National Imagery and Mapping Agency, the National Reconnaissance Office, the Defense Intelligence Agency, and the intelligence arms of each of the three major military services. In theory, the director of the CIA was in charge of the U.S. intelligence community as a whole, but in practice he exercised very little supervision over these agencies, and most of the money for intelligence operations came from the Department of Defense. In addition, the FBI—which was responsible for domestic law enforcement—operated almost completely outside the orbit of this intelligence community, even though information about foreign terrorists operating inside the United States would obviously be of interest to the CIA. In place of the centralized repository of information and analysis that Donovan had envisioned, the U.S. intelligence community evolved into a collection of virtually autonomous, decentralized groups, all working toward the same broad goal—keeping the United States safe from attack—but in very different ways.

Until September 11, 2001, the flaws of this system were overlooked. The intelligence community had failed to anticipate the 1993 bombing of the World Trade Center, the 1998 bombing of the U.S. embassy in Kenya, and the 2000 bombing of the USS Cole in Yemen. But not until September 11 did the failure of U.S. intelligence gathering come to seem undeniable. The Congressional Joint Inquiry into the attacks found that the U.S. intelligence community had “failed to capitalize on both the individual and collective significance of available information that appears relevant to the events of September 11.” Intelligence agencies “missed opportunities to disrupt the September 11th plot,” and allowed information to pass by unnoticed that, if appreciated, would have “greatly enhanced its chances of uncovering and preventing” the attacks. It was, in other words, Pearl Harbor all over again.

The congressional inquiry was unquestionably a classic example of Monday-morning quarterbacking. Given the sheer volume of information that intelligence agencies process, it’s hardly surprising that a retrospective look at the data they had on hand at the time of the attack would uncover material that seemed relevant to what happened on September 11. That doesn’t necessarily mean the agencies could have been realistically expected to recognize the relevance of the material beforehand. In her classic account of the intelligence failures at Pearl Harbor, Pearl Harbor: Warning and Decision, Roberta Wohlstetter shows how many signals there were of an impending Japanese attack, but suggests that it was still unreasonable to expect human beings to have picked the right signals out from “the buzzing and blooming confusion” that accompanied them. Strategic surprise, Wohlstetter suggests, is an intractable problem to solve. And if a massive Japanese naval attack comprising hundreds of planes and ships and thousands of men was difficult to foresee, how much harder would it have been to predict a terrorist attack involving just nineteen men?

And yet one has to wonder. Given the almost complete failure of the intelligence community to anticipate any of four major terrorist attacks from 1993 through 2001, is it not possible that organizing the intelligence community differently would have, at the very least, improved its chances of recognizing what the Joint Inquiry called “the collective significance” of the data it had on hand? Predicting the actual attacks on the World Trade Center and the Pentagon may have been impossible. But coming up with a reasonable, concrete estimate of the likelihood of such an attack may not have been.

That, at least, was the conclusion that Congress reached: better processes would have produced a better result. In particular, they stressed the lack of “information sharing” between the various agencies. Instead of producing a coherent picture of the threats the United States faced, the various agencies produced a lot of localized snapshots. The sharpest critic of the agencies’ work, Senator Richard Shelby, argued that the FBI in particular was crippled by its “decentralized organizational structure,” which “left information-holdings fragmented into largely independent fiefdoms.” And the intelligence community as a whole was hurt by a failure to put the right information in the hands of the right people. What needed to be done, Shelby suggested, was to abolish the fiefdoms and return to the idea for which Bill Donovan had argued half a century ago. One agency, which could stand “above and independent from the disputatious bureaucracies,” needed to be put in charge of U.S. intelligence. Decentralization had led the United States astray. Centralization would put things right.

Chapter Four, Part II

In challenging the virtues of decentralization, Shelby was challenging an idea that in the past fifteen years has seized the imagination of businessmen, academics, scientists, and technologists everywhere. In business, management theories like reengineering advocated replacing supervisors and managers with self-managed teams that were responsible for solving most problems on their own, while more utopian thinkers deemed the corporation itself outmoded. In physics and biology scientists paid increasing attention to self-organizing, decentralized systems—like ant colonies or beehives—which, even without a center, proved robust and adaptable. And social scientists placed renewed emphasis on the importance of social networks, which allow people to connect and coordinate with each other without a single person being in charge. Most important, of course, was the rise of the Internet—in some respects, the most visible decentralized system in the world—and of corollary technologies like peer-to-peer file sharing (exemplified by Napster), which offered a clear demonstration of the possibilities (economic, organizational, and more) that decentralization had to offer.

The idea of the wisdom of crowds also takes decentralization as a given and a good, since it implies that if you set a crowd of self-interested, independent people to work in a decentralized way on the same problem, instead of trying to direct their efforts from the top down, their collective solution is likely to be better than any other solution you could come up with. American intelligence agents and analysts were self-interested, independent people working in a decentralized way on roughly the same problem (keeping the country safe). So what went wrong? Why did those agents not produce a better forecast? Was decentralization really the problem?

Before we answer that question, we need to answer a simpler one first: What do we mean by “decentralization,” anyway? It’s a capacious term, and in the past few years it’s been tossed around more freely than ever. Flocks of birds, free-market economies, cities, peer-to-peer computer networks: these are all considered examples of decentralization. Yet so, too, in other contexts, are the American public-school system and the modern corporation. These systems are dramatically different from each other, but they do have this in common: in each, power does not fully reside in one central location, and many of the important decisions are made by individuals based on their own local and specific knowledge rather than by an omniscient or farseeing planner.

In terms of decision making and problem solving, there are a couple of things about decentralization that really matter. It fosters, and in turn is fed by, specialization—of labor, interest, attention, or what have you. Specialization, as we’ve known since Adam Smith, tends to make people more productive and efficient. And it increases the scope and the diversity of the opinions and information in the system (even if each individual person’s interests become more narrow).

Decentralization is also crucial to what the economist Friedrich Hayek described as tacit knowledge. Tacit knowledge is knowledge that can’t be easily summarized or conveyed to others, because it is specific to a particular place or job or experience, but it is nonetheless tremendously valuable. (In fact, figuring out how to take advantage of individuals’ tacit knowledge is a central challenge for any group or organization.) Connected with this is the assumption that is at the heart of decentralization, namely that the closer a person is to a problem, the more likely he or she is to have a good solution to it. This practice dates back to ancient Athens, where decisions about local festivals were left up to the demes, as opposed to the Athenian assembly, and regional magistrates handled most nonserious crimes. It can also be seen in Exodus, where Moses’ father-in-law counseled him to judge only in “great matter[s]” and to leave all other decisions to local rulers.

Decentralization’s great strength is that it encourages independence and specialization on the one hand while still allowing people to coordinate their activities and solve difficult problems on the other. Decentralization’s great weakness is that there’s no guarantee that valuable information which is uncovered in one part of the system will find its way through the rest of the system. Sometimes valuable information never gets disseminated, making it less useful than it otherwise would be. What you’d like is a way for individuals to specialize and to acquire local knowledge—which increases the total amount of information available in the system—while also being able to aggregate that local knowledge and private information into a collective whole, much as Google relies on the local knowledge of millions of Web-page operators to make Google searches ever-smarter and ever-quicker. To accomplish this, any “crowd”—whether it be a market, a corporation, or an intelligence agency—needs to find the right balance between the two imperatives: making individual knowledge globally and collectively useful (as we know it can be), while still allowing it to remain resolutely specific and local.

Chapter Four, Part III

In 1991, Finnish hacker Linus Torvalds created his own version of the Unix operating system, dubbing it Linux. He then released the source code he had written to the public, so everyone out there—well, everyone who understood computer code—could see what he had done. More important, he attached a note that read, “If your efforts are freely distributable, I’d like to hear from you, so I can add them to the system.” It was a propitious decision. As one history of Linux points out: “Of the first ten people to download Linux, five sent back bug fixes, code improvements, and new features.” Over time, this improvement process became institutionalized, as thousands of programmers, working for free, contributed thousands of minor and major fixes to the operating system, making Linux ever-more reliable and robust.

Unlike Windows, which is owned by Microsoft and worked on only by Microsoft employees, Linux is owned by no one. When a problem arises with the way Linux works, it only gets fixed if someone, on his own, offers a good solution. There are no bosses ordering people around, no organizational charts dictating people’s responsibilities. Instead, people work on what they’re interested in and ignore the rest. This seems like—in fact, it is—a rather haphazard way to solve problems. But so far at least, it has been remarkably effective, making Linux the single most important challenger to Microsoft.

Linux is clearly a decentralized system, since it has no formal organization and its contributors come from all over the world. What decentralization offers Linux is diversity. In the traditional corporate model, top management hires the best employees it can, pays them to work full-time, generally gives them some direction about what problems to work on, and hopes for the best. That is not a bad model. It has the great virtue of making it easy to mobilize people to work on a particular problem, and it also allows companies to get very good at doing the things they know how to do. But it also necessarily limits the number of possible solutions that a corporation can come up with, both because of mathematical reality (a company has only so many workers, and they have only so much time) and because of the reality of organizational and bureaucratic politics. Linux, practically speaking, doesn’t worry much about either. Surprisingly, there seems to be a huge supply of programmers willing to contribute their efforts to make the system better. That guarantees that the field of possible solutions will be immense. There’s enough variety among programmers, and there are enough programmers, that no matter what the bug is, someone is going to come up with a fix for it. And there’s enough diversity that someone will recognize bugs when they appear. In the words of open-source guru Eric Raymond, “Given enough eyeballs, all bugs are shallow.”

In the way it operates, in fact, Linux is not all that different from a market, as we saw in Chapter 2 on diversity. Like a bee colony, it sends out lots of foragers and assumes that one of them will find the best route to the flower fields. This is, without a doubt, less efficient than simply trying to define the best route to the field or even picking the smartest forager and letting him go. After all, if hundreds or thousands of programmers are spending their time trying to come up with a solution that only a few of them are going to find, that’s many hours wasted that could be spent doing something else. And yet, just as the free market’s ability to generate lots of alternatives and then winnow them down is central to its continued growth, Linux’s seeming wastefulness is a kind of strength (a kind of strength that for-profit companies cannot, fortunately or unfortunately, rely on). You can let a thousand flowers bloom and then pick the one that smells the sweetest.

Chapter Four, Part IV

So who picks the sweetest-smelling one? Ideally, the crowd would. But here’s where striking a balance between the local and the global is essential: a decentralized system can only produce genuinely intelligent results if there’s a means of aggregating the information of everyone in the system. Without such a means, there’s no reason to think that decentralization will produce a smart result. In the case of the experiment with which this book opened, that aggregating mechanism was just Francis Galton counting the votes. In the case of the free market, that aggregating mechanism is obviously price. The price of a good reflects, imperfectly but effectively, the actions of buyers and sellers everywhere, and provides the necessary incentive to push the economy where the buyers and sellers want it to go. The price of a stock reflects, imperfectly but effectively, investors’ judgment of how much a company is worth. In the case of Linux, it is the small number of coders, including Torvalds himself, who vet every potential change to the operating-system source code. There are would-be Linux programmers all over the world, but eventually all roads lead to Linus.

Now, it’s not clear that the decision about what goes into Linux’s code needs to be or should be in the hands of such a small group of people. If my argument in this book is right, a large group of programmers, even if they weren’t as skilled as Torvalds and his lieutenants, would do an excellent job of evaluating which code was worth keeping. But set that aside. The important point here is that if the decision were not being made by someone, Linux itself would not be as successful as it is. If a group of autonomous individuals tries to solve a problem without any means of putting their judgments together, then the best solution they can hope for is the solution that the smartest person in the group produces, and there’s no guarantee they’ll get that. If that same group, though, has a means of aggregating all those different opinions, the group’s collective solution may well be smarter than even the smartest person’s solution. Aggregation—which could be seen as a curious form of centralization—is therefore paradoxically important to the success of decentralization. If this seems dubious, it may be because when we hear centralization we think “central planners,” as in the old Soviet Union, and imagine a small group of men—or perhaps just a single man—deciding how many shoes will be made today. But in fact there’s no reason to confuse the two. It’s possible, and desirable, to have collective decisions made by decentralized agents.
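The claim about aggregation can be made concrete with a toy calculation. The sketch below is only an illustration, not anything from the book: the true value and the seven guesses are invented numbers, and a plain average stands in for whatever aggregating mechanism a real group might use. It compares the error of the averaged guess with the error of the single best guesser.

```python
from statistics import mean

# Hypothetical data, invented for illustration: the quantity being
# estimated and seven independent guesses scattered around it.
TRUE_VALUE = 100
guesses = [70, 85, 95, 104, 110, 118, 125]


def crowd_estimate(values):
    """Aggregate the individual guesses with a simple average."""
    return mean(values)


def best_individual_error(values, truth):
    """Error of the single closest guess (the 'smartest person')."""
    return min(abs(v - truth) for v in values)


crowd = crowd_estimate(guesses)
crowd_error = abs(crowd - TRUE_VALUE)
best_error = best_individual_error(guesses, TRUE_VALUE)

print(f"crowd estimate: {crowd}, error: {crowd_error}")
print(f"best individual error: {best_error}")
```

With these made-up numbers the individual errors point in different directions and largely cancel, so the average lands closer to the truth than any one guesser does. That outcome depends on the data chosen here; it is not guaranteed for every set of guesses. Note also that without some aggregating step, the group has no way of knowing in advance which of its members is the closest one.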

Understanding when decentralization is a recipe for collective wisdom matters because in recent years the fetish for decentralization has sometimes made it seem like the ideal solution for every problem. Obviously, given the premise of this book, I think decentralized ways of organizing human effort are, more often than not, likely to produce better results than centralized ways. But decentralization works well under some conditions and not very well under others. In the past decade, it’s been easy to believe that if a system is decentralized, then it must work well. But all you need to do is look at a traffic jam—or, for that matter, at the U.S. intelligence community—to recognize that getting rid of a central authority is not a panacea. Similarly, people have become enamored of the idea that decentralization is somehow natural or automatic, perhaps because so many of our pictures of what decentralization looks like come from biology. Ants, after all, don’t need to do anything special to form an ant colony. Forming ant colonies is inherent in their biology. The same is not, however, true of human beings. It’s hard to make real decentralization work, and hard to keep it going, and easy for decentralization to become disorganization.

A good example of this was the performance of the Iraqi military during the U.S.-Iraq war in 2003. In the early days of the war, when Iraqi fedayeen paramilitaries had surprised U.S. and British troops with the intensity of their resistance, the fedayeen were held up as an example of a successful decentralized group, which was able to flourish in the absence of any top-down control. In fact, one newspaper columnist compared the fedayeen to ants in an ant colony, finding their way to a “good” solution while communicating only with the soldiers right next to them. But after a few days, the idea that the fedayeen were mounting a meaningful, organized resistance vanished, as it became clear that their attacks were little more than random, uncoordinated assaults that had no connection to what was happening elsewhere in the country. As one British commander remarked, it was all tactics and no strategy. To put it differently, the individual actions of the fedayeen fighters never added up to anything bigger, precisely because there was no method of aggregating their local wisdom. The fedayeen were much like ants—following local rules. But where ants who follow their local rules actually end up fostering the well-being of the colony, soldiers who followed their local rules ended up dead. (It may be, though, that once the actual war was over, and the conflict shifted to a clash between the occupying U.S. military and guerrillas using hit-and-run terrorist tactics, the absence of aggregation became less important, since the goal was not to defeat the United States in battle, but simply to inflict enough damage to make staying seem no longer worth it. In that context, tactics may have been enough.)

The irony is that the true decentralized military in the U.S.-Iraq war was the U.S. Army. American troops have always been given significantly more initiative in the field than other armies, as the military has run itself on the “local knowledge is good” theory. But in recent years, the army has dramatically reinvented itself. Today, local commanders have considerably greater latitude to act, and sophisticated communications systems mean that collectively wise strategies can emerge from local tactics. Commanders at the top are not isolated from what’s happening in the field, and their decisions will inevitably reflect, in a deep sense, the local knowledge that field commanders are acquiring. In the case of the invasion of Baghdad, for instance, the U.S. strategy adapted quickly to the reality of Iraq’s lack of strength, once local commanders reported little or no resistance. This is not to say, as some have suggested, that the military has become a true bottom-up organization. The chain of command remains essential to the way the military works, and all battlefield action takes place within a framework defined by what’s known as the Commander’s Intent, which essentially lays out a campaign’s objectives. But increasingly, successful campaigns may depend as much on the fast aggregation of information from the field as on preexisting, top-down strategies.

Chapter Four, Part V

When it comes to the problems of the U.S. intelligence community before September 11, the problem was not decentralization. The problem was the kind of decentralization that the intelligence community was practicing. On the face of it, the division of labor between the different agencies makes a good deal of sense. Specialization allows for a more fine-grained appreciation of information and greater expertise in analysis. And everything we know about decision making suggests that the more diverse the available perspectives on a problem, the more likely it is that the final decision will be smart. Acting Defense Intelligence Agency director Lowell Jacoby suggested precisely this in written testimony before Congress, writing, “Information considered irrelevant noise by one set of analysts may provide critical clues or reveal significant relationships when subjected to analytic scrutiny by another.”

What was missing in the intelligence community, though, was any real means of aggregating not just information but also judgments. In other words, there was no mechanism to tap into the collective wisdom of National Security Agency nerds, CIA spooks, and FBI agents. There was decentralization but no aggregation, and therefore no organization. Richard Shelby’s solution to the problem—creating a truly central intelligence agency—would solve the organization problem, and would make it easier for at least one agency to be in charge of all the information. But it would also forgo all the benefits—diversity, local knowledge, independence—that decentralization brings. Shelby was right that information needed to be shared. But he assumed that someone—or a small group of someones—needed to be at the center, sifting through the information, figuring out what was important and what was not. But everything we know about cognition suggests that a small group of people, no matter how intelligent, simply will not be smarter than the larger group. And the best tool for appreciating the collective significance of the information that the intelligence community had gathered was the collective wisdom of the intelligence community. Centralization is not the answer. But aggregation is.

There were, and are, a number of paths the intelligence community could follow to aggregate information without adopting a traditional top-down organization. To begin with, simply linking the computer databases of the various agencies would facilitate the flow of information while still allowing the agencies to retain their autonomy. Remarkably, two years after September 11, the government still did not have a single unified “watch list” that drew on data from all parts of the intelligence community. In some sense, quite simple, almost mechanical steps would have allowed the intelligence community to be significantly smarter.

Other, more far-reaching possibilities were available, too, and in fact some within the intelligence community tried to investigate them. The most important of these, arguably, was the FutureMAP program, an abortive plan to set up decision markets—much like those of the IEM—that would have, in theory, allowed analysts from different agencies and bureaucracies to buy and sell futures contracts based on their expectations of what might happen in the Middle East and elsewhere. FutureMAP, which got its funding from the Defense Advanced Research Projects Agency (DARPA), had two elements. The first was a set of internal markets, which would have been quite small (perhaps limited to twenty or thirty people), and open only to intelligence analysts and perhaps a small number of outside experts. These markets might actually have tried to predict the probability of specific events (like, presumably, terrorist attacks), since the traders in them would have been able to rely on, among other things, classified information and hard intelligence data in reaching their conclusions. The hope was that an internal market would help circumvent the internal politics and bureaucratic wrangling that have indisputably had a negative effect on American intelligence gathering, in no small part by shaping the kinds of conclusions analysts feel comfortable reaching. In theory, at least, an internal market would have placed a premium not on keeping one’s boss or one’s agency happy (or on satisfying the White House) but rather on offering the most accurate forecast. And since it would have been open to people from different agencies, it might have offered the kind of collective judgment that the intelligence community has found difficult to make in the past decade.

The second part of FutureMAP was the so-called Policy Analysis Market (PAM), which in the summer of 2003 became the object of a firestorm of criticism from appalled politicians. The idea behind PAM was a simple one (and similar to the idea behind the internal markets): just as the IEM does a good job of forecasting election results and other markets seem to do a good job of forecasting the future, a market centered on the Middle East might provide intelligence that otherwise would be missed.

What distinguished PAM from the internal market was that it was going to be open to the public, and that it seemed to offer the possibility of ordinary people profiting from terrible things happening. Senators Ron Wyden and Byron Dorgan, who were the leaders of the effort to kill PAM, denounced it as “harebrained,” “offensive,” and “useless.” The public, at least those who heard about PAM before it was unceremoniously killed, seemed equally appalled.

Given the thesis of this book, it will not surprise you to learn that I think PAM was potentially a very good idea. The fact that the market was going to be open to the public did not mean that its forecasts would be more inaccurate. On the contrary, we’ve seen that even when traders are not necessarily experts, their collective judgment is often remarkably good. More to the point, opening the market to the public was a way of getting people whom the American intelligence community might not normally hear from—whether because of patriotism, fear, or resentment—to offer up information they might have about conditions in the Middle East.

From the perspective of Shelby’s attack on the intelligence community, PAM, like the internal markets, would have helped break down the institutional barriers that keep information from being aggregated in a single place. Again, since traders in a market have no incentive other than making the right prediction—that is, there are no bureaucratic or political factors influencing their decisions—and since they have that incentive to be right, they are more likely to offer honest evaluations instead of tailoring their opinions to fit the political climate or satisfy institutional demands.

Senator Wyden dismissed PAM as a “fairy tale” and suggested that DARPA would be better off putting its money into “real world” intelligence. But the dichotomy was a false one. No one suggested replacing traditional intelligence gathering with a market. PAM was intended to be simply another way of collecting information. And in any case, if PAM had, in fact, been a “fairy tale,” we would have known it soon enough. Killing the project ensured only that we would have no idea whether decision markets might have something to add to our current intelligence efforts.

The hostility toward PAM, in any case, had little to do with how effective it would or would not be. The real problem with it, Wyden and Dorgan made clear, was that it was “offensive” and “morally wrong” to wager on potential catastrophes. Let’s admit there’s something viscerally ghoulish about betting on an assassination attempt. But let’s also admit that U.S. government analysts ask themselves every day the exact same questions that PAM traders would have been asking: How stable is the government of Jordan? How likely is it the House of Saud will fall? Who will be the head of the Palestinian Authority in 2OO? If it isn’t immoral for the U.S. government to ask these questions, it’s hard to see how it’s immoral for people outside the U.S. government to ask them.

Nor should we have shied from the prospect of people profiting from predicting catastrophe. CIA analysts, after all, don’t volunteer their services. We pay them to predict catastrophes, as we pay informants for valuable information. Or consider our regular economy. The entire business of a life-insurance company is based on betting on when people are going to die (with a traditional life-insurance policy, the company is betting you’ll die later than you think you will, while with an annuity it’s betting you’ll die sooner). There may be something viscerally unappealing about this, but most of us understand that it’s necessary. This is, in some sense, what markets often do: harness amorality to improve the collective good. If the price of better intelligence was simply having our sensibilities bruised, that doesn’t seem like too high a price to have paid. And surely letting people wager on the future was less morally problematic than many of the things our intelligence agencies have done and continue to do to get information. If PAM would actually have made America’s national security stronger, it would have been morally wrong not to use it.

There were serious problems that the market would have had to overcome. Most notably, if the market was accurate, and the Department of Defense acted on its predictions to stop, say, a coup in Jordan, that action would make the traders’ predictions false and thereby destroy the incentives to make good predictions. A well-designed market would probably have to account for such U.S. interventions, presumably by making the wagers conditional on U.S. action (or, alternatively, traders would start to factor the possibility of U.S. action into their prices). But this would be a problem only if the market was in fact making good predictions. Had PAM ever become a fully liquid market, it would probably also have had the same problems other markets sometimes have, like bubbles and gaming. But it is not necessary to believe that markets work perfectly to believe that they work well.

More important, although most of the attention paid to PAM focused on the prospect of people betting on things like the assassination of Arafat, the vast majority of the “wagers” that PAM traders would have been making would have been on more mundane questions, such as the future economic growth of Jordan or how strong Syria’s military was. At its core, PAM was not meant to tell us what Hamas was going to do next week or to stop the next September 11. Instead, it was meant to give us a better sense of the economic health, the civil stability, and the military readiness of Middle Eastern nations, with an eye on what that might mean for U.S. interests in the region. That seems like something about which the aggregated judgment of policy analysts, would-be Middle Eastern experts, and businessmen and academics from the Middle East itself (the kind of people who would likely have been trading on PAM) would have had something valuable to say.

We may yet find out if they do, because in the fall of 2003, NetExchange, the company that had been responsible for setting up PAM, announced that in 2004, a new, revised Policy Analysis Market (this one without government involvement of any sort) would be opened to the public. NetExchange was careful to make clear that the goal of the market would not be to predict terrorist incidents but rather to forecast broader economic, social, and military trends in the region. So perhaps the promise of PAM will actually get tested against reality, instead of being dismissed out of hand. It also seems plausible, and even likely, that the U.S. intelligence community will eventually return to the idea of using internal prediction markets—limited to analysts and experts—as a means of aggregating dispersed pieces of information and turning them into coherent forecasts and policy recommendations. Perhaps that would mean that the CIA would be running what Senators Wyden and Dorgan scornfully called “a betting parlor.” But we know one thing about betting markets: they’re very good at predicting the future.

Chapter Three, Part V

What makes information cascades interesting is that they are a form of aggregating information, just like a voting system or a market. And the truth is that they don’t do a terrible job of aggregation. In classroom experiments, where cascades are easy to start and observe, cascading groups pick the better alternative about 80 percent of the time, which is better than any individual in the groups can do. The fundamental problem with cascades is that people’s choices are made sequentially, instead of all at once. There are good reasons for this—some people are more cautious than others, some are more willing to experiment, some have more money than others. But roughly speaking, all of the problems that cascades can cause are the result of the fact that some people make their decisions before others. If you want to improve an organization’s or an economy’s decision making, one of the best things you can do is make sure, as much as possible, that decisions are made simultaneously (or close to it) rather than one after the other.

An interesting proof of this can be found in one of those very classroom experiments I just mentioned. This one was devised by economists Angela Hung and Charles Plott, and it involved the time-honored technique of having students draw colored marbles from urns. In this case, there were two urns. Urn A contained twice as many light marbles as dark ones. Urn B contained twice as many dark marbles as light ones. At the beginning of the experiment, the people in charge chose one of the two urns from which, in sequence, each volunteer drew a marble. The question the participants in the experiment had to answer was: Which urn was being used? A correct answer earned them a couple of dollars.

To answer that question, the participants could rely on two sources of information. First, they had the marble they had drawn from the urn. If they drew a light marble, chances were that it was from Urn A. If they drew a dark marble, chances were that it was from Urn B. This was their “private information,” because no one was allowed to reveal what color marble they had drawn. All people revealed was their guess as to which urn was being used. This was the second source of information, and it created a potential conflict. If three people in front of you had guessed Urn B, but you drew a light marble, would you still guess Urn A even though the group thought otherwise?
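One way to put numbers on that conflict is a small Bayes calculation. What follows is only a sketch under assumptions the text does not state: the two urns are equally likely to be chosen, each marble is an independent draw, and each earlier guess is read as if it revealed that person's own marble (which is only strictly true for the first few guessers). The function name posterior_urn_a is made up for illustration.

```python
from fractions import Fraction

def posterior_urn_a(light_seen, dark_seen):
    """Probability that Urn A is in use after seeing (or inferring) some marbles.

    Assumes a 50/50 prior, Urn A is 2/3 light, Urn B is 2/3 dark,
    and independent draws. All of these are simplifying assumptions.
    """
    like_a = Fraction(2, 3) ** light_seen * Fraction(1, 3) ** dark_seen
    like_b = Fraction(1, 3) ** light_seen * Fraction(2, 3) ** dark_seen
    # With equal priors the posterior is just the normalized likelihood.
    return like_a / (like_a + like_b)

# Your own light marble, taken alone.
alone = posterior_urn_a(light_seen=1, dark_seen=0)

# Your light marble plus three earlier Urn B guesses, each treated as a dark marble.
after_three_b = posterior_urn_a(light_seen=1, dark_seen=3)

print(alone, after_three_b)  # prints 2/3 1/5
```

On these assumptions a lone light marble favors Urn A two to one, but three inferred dark marbles outweigh it, leaving Urn A at one chance in five.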

Most of the time the student in that situation guessed Urn B, which was the rational thing to do. And in 78 percent of the trials, information cascades started. This was as expected. But then Hung and Plott changed the rules. The students still drew their marbles from the urn and made their decisions in order. But this time, instead of being paid for picking the correct answer, the students got paid based on whether the group’s collective answer—as decided by majority vote—was the right one. The students’ task shifted from trying to do the best they could individually to trying to make the group as smart as it could be.

This meant one thing had to happen: each student had to pay more attention to his private information and less attention to everyone else’s. (Collective decisions are only wise, remember, when they incorporate lots of different information.) People’s private information, though, was imperfect. So by paying attention to only his own information, a student was more likely to make a wrong guess. But the group was more likely to be collectively right. Encouraging people to make incorrect guesses actually made the group as a whole smarter. And when it was the group’s collective accuracy that counted, people listened to their private information. The group’s collective judgment became, not surprisingly, significantly more accurate than the judgments of the cascading groups.
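The trade-off described here can be sketched in a toy simulation. This is not Hung and Plott's procedure, only an illustration under stated assumptions: each marble points to the true urn with probability 2/3, students who are paid individually follow the crowd once the public evidence leads by two (and follow their own marble otherwise, including on ties), and students who are paid for the group's answer always report their own marble. The group size of 15 and the names group_is_right and accuracy are hypothetical.

```python
import random

def group_is_right(rng, n, p, paid_for_group):
    """Simulate one group of n students; True if most guesses name the true urn."""
    net = 0              # informative guesses for the true urn minus those against it
    correct_guesses = 0
    for _ in range(n):
        # +1 means this student's marble points to the true urn.
        signal = 1 if rng.random() < p else -1
        if paid_for_group or abs(net) < 2:
            # The guess follows the private marble, so it carries information.
            guess = signal
            net += signal
        else:
            # Cascade: the public lead outweighs any single marble, so follow it.
            guess = 1 if net > 0 else -1
        if guess == 1:
            correct_guesses += 1
    return correct_guesses * 2 > n

def accuracy(n, p, paid_for_group, trials=20000, seed=0):
    """Share of simulated groups whose majority guess is correct."""
    rng = random.Random(seed)
    hits = sum(group_is_right(rng, n, p, paid_for_group) for _ in range(trials))
    return hits / trials

cascading = accuracy(15, 2 / 3, paid_for_group=False)
voting = accuracy(15, 2 / 3, paid_for_group=True)
print(cascading, voting)
```

Under these assumptions the first figure should come out near 0.80 and the second near 0.91: each student who sticks with his own marble is wrong a third of the time, yet the majority of such students is right more often than a cascading group.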

Effectively what Hung and Plott did in their experiment was remove (or at least reduce) the sequential element in the way people made decisions, by making previous choices less important to the decision makers. That’s obviously not something that an economy as a whole can do very easily—we don’t want companies to have to wait to launch products until the public at large has voted yea or nay. Organizations, on the other hand, clearly can and should have people offer their judgments simultaneously, rather than one after the other. On a deeper level, the success of the Hung and Plott experiment—which effectively forced the people in the group to make themselves independent—underscores the value and the difficulty of autonomy. One key to successful group decisions is getting people to pay much less attention to what everyone else is saying.

Chapter Three, Part IV

So should we just lock ourselves up in our rooms and stop paying attention to what others are doing? Not exactly (although it is true that we would make better collective decisions if we all stopped taking only our friends’ advice). Much of the time imitation works. At least in a society like America’s, where things generally work pretty well without much top-down control, taking your cues from everyone else’s behavior is an easy and useful rule of thumb. Instead of having to undertake complicated calculations before every action, we let others guide us. Take a couple of everyday examples from city life. On a cloudy day, if I’m unsure of whether or not to take an umbrella when I leave my apartment, the easiest solution—easier, even, than turning on the Weather Channel—is to pause a moment on the doorstep to see if the people on the street are carrying umbrellas. If most of them are, I do, too, and it’s the rare time when this tactic doesn’t work. Similarly, I live in Brooklyn, and I have a car, which I park on the street. Twice a week, I have to move the car by 11 a.m. because of street cleaning, and routinely, by 10:45 or so, every car on the street that’s being cleaned has been moved. Occasionally, though, I’ll come out of the house at 10:40 and find that all the cars are still on the street, and I’ll know that that day street cleaning has been suspended, and I won’t move my car. Now, it’s possible that every other driver on the street has kept close track of the days on which street cleaning will be suspended. But I suspect that most drivers are like me: piggybacking, as it were, on the wisdom of others.

In a sense, imitation is a kind of rational response to our own cognitive limits. Each person can’t know everything. With imitation, people can specialize and the benefits of their investment in uncovering information can be spread widely when others mimic them. Imitation also requires little top-down direction. The relevant information percolates quickly through the system, even in the absence of any central authority. And people’s willingness to imitate is not, of course, unconditional. If I get a couple of tickets because of bad information, I’ll soon make sure I know when I have to move my car. And although I don’t think Milgram and his colleagues ever followed up with the people in their experiment who had stopped to look at the sky, one suspects that the next time they walked by a guy with his head craned upward, they didn’t stop to see what he was looking at. In the long run, imitation has to be effective for people to keep doing it.

Mimicry is so central to the way we live that economist Herbert Simon speculated that humans were genetically predisposed to be imitation machines. And imitation seems to be a key to the transmission of valuable practices even among nonhumans. The most famous example is that of the macaque monkeys on the island of Koshima in Japan. In the early 1950s, a one-year-old female macaque named Imo somehow hit upon the idea of washing her sweet potatoes in a creek before eating them. Soon it was hard to find a Koshima macaque who wasn’t careful to wash off her sweet potato before eating it. A few years later, Imo introduced another innovation. Researchers on the island occasionally gave the monkeys wheat (in addition to sweet potatoes). But the wheat was given to them on the beach, where it quickly became mixed with sand. Imo, though, realized that if you threw a handful of wheat and sand into the ocean, the sand would sink and the wheat would float. Again, within a few years most of her fellow macaques were hurling wheat and sand into the sea and reaping the benefits.

The Imo stories are interesting because they seem to be in stark contrast to the argument of this book. This was one special monkey who hit on the right answer and basically changed macaque “society.” How, then, was the crowd wise?

The wisdom was in the decision to imitate Imo. As I suggested in the last chapter, groups are better at deciding between possible solutions to a problem than they are at coming up with them. Invention may still be an individual enterprise (although, as we’ll see, invention has an inescapably collective dimension), but selecting among inventions is a collective one. Used well, imitation is a powerful tool for spreading good ideas fast—whether they be in culture, business, sports, or the art of wheat eating. At its best, you can see it as a way of speeding up the evolutionary process—the community can become more fit without the usual need for multiple generations of genetic winnowing. Scientists Robert Boyd and Peter J. Richerson have pioneered the study of the transmission of social norms, trying to understand how groups arrive at collectively beneficial conclusions. They’ve run a series of computerized simulations looking at the behavior of agents who are trying to discover which of two different behaviors is best suited to the environment they’re living in. In the simulation, each agent can try out a behavior for himself and see what happens, but he can also observe the behavior of someone else who’s already made a decision about which behavior is best. Boyd and Richerson found that under these circumstances, everyone benefits when a sizable percentage of the population imitates. But this is only true as long as people are willing to stop imitating and learn for themselves when the benefits of doing so become high enough. In other words, if people just keep following the lead of others regardless of what happens, the well-being of the group suffers. Intelligent imitation can help the group—by making it easier for good ideas to spread quickly—but slavish imitation hurts.

Distinguishing between the two kinds of imitation is, of course, not easy, since few people will admit that they’re mindlessly conforming or herding. But it does seem clear that intelligent imitation depends on a couple of things: first, an initially wide array of options and information; and second, the willingness of at least some people to put their own judgment ahead of the group’s, even when it’s not sensible to do so.

Do such people exist? Actually they’re a lot more common than you’d expect. One reason is that people are, in general, overconfident. They overestimate their ability, their level of knowledge, and their decision-making prowess. And people are more overconfident when facing difficult problems than when facing easy ones. This is not good for the overconfident decision makers themselves, since it means that they’re more likely to choose badly. But it is good for society as a whole, because overconfident people are less likely to get sucked into a negative information cascade, and, in the right circumstances, are even able to break cascades. Remember that a cascade is kept going by people valuing public information more highly than their private information. Overconfident people don’t do that. They tend to ignore public information and go on their gut. When they do so, they disrupt the signal that everyone else is getting. They make the public information seem less certain. And that encourages others to rely on themselves rather than just follow everyone else.

At the same time, even risk-averse people do not, for the most part, slavishly fall in line. For instance, in 1943 the sociologists Bryce Ryan and Neal Gross published a study of the way Iowa farmers adopted a new, more productive hybrid seed corn. In their study, which became the most influential study of innovation in history, Ryan and Gross found that most farmers didn’t investigate the corn independently as soon as they heard about it, even though there was good information available that showed it increased yields by 20 percent. They waited until other farmers had success with it and then followed their example. So that suggests that a cascade was at work. But in fact, even after witnessing the success of their neighbors, the farmers did not seed their entire fields with the hybrid corn. Instead, they set aside a small part of a field and tested the corn for themselves first. Only after they were personally satisfied with it did they start using the corn exclusively. And it took nine years from the time the first farmer planted his field with the new corn to the time half of the farmers in the region were using it, which does not suggest a rash decision-making process.

Similarly, in a fascinating study of how farmers in India decided whether or not to adopt new high-yielding-variety crop strains during the Green Revolution of the late 1960s, Kaivan Munshi shows that rice farmers and wheat farmers made their decisions about new crops in very different ways. In the wheat-growing regions Munshi looked at, land conditions were relatively uniform, and the performance of a crop did not vary much from farm to farm. So if you were a wheat farmer and you saw that the new seeds substantially improved your neighbor’s crop, then you could be confident that it would improve your crop as well. As a result, wheat farmers paid a great deal of attention to their neighbors, and made decisions based on their performance. In rice-growing regions, on the other hand, land conditions varied considerably, and there were substantial differences in how crops did from farm to farm. So if you were a rice farmer, the fact that your neighbor was doing well (or poorly) with the new crop didn’t tell you much about what would happen on your land. As a result, rice farmers’ decisions were not that influenced by their neighbors. Instead, rice farmers experimented far more with the new crop on their own land before deciding to adopt it. What’s telling, too, is that even the wheat farmers did not use the new strains of wheat until after they could see how the early adopters’ new crops did.

For farmers, choosing the right variety of corn or wheat is the most important decision they can make, so it’s perhaps not surprising that they would make those decisions on their own, rather than simply mimicking those who came before them. And that suggests that certain products or problems are more susceptible to cascades than others. For instance, fashion and style are obviously driven by cascades, which we call fads, because when it comes to fashion, what you like and what everyone else likes are clearly wrapped up with each other. I like to dress a certain way, but it’s hard to imagine that the way I like to dress is disconnected from the kind of impression I want to make, which in turn must have something to do with what other people like. The same might also be said, though less definitively, about cultural products (like TV shows) where part of why we watch the show is to talk about it with our friends, or even restaurants, since no one likes to eat in an empty restaurant. No one buys an iPod because other people have them—the way they might, in fact, go to a movie because other people are going—but many technology companies insist that information cascades (of the good kind, they would say) are crucial to their success, as early adopters spread the word of a new product’s quality to those who come after. The banal but key point I’m trying to make is that the more important the decision, the less likely a cascade is to take hold. And that’s obviously a good thing, since it means that the more important the decision, the more likely it is that the group’s collective verdict will be right.

Chapter Three, Part III

Herders may think they want to be right, and perhaps they do. But for the most part, they’re following the herd because that’s where it’s safest. They’re assuming that John Maynard Keynes was right when he wrote, in The General Theory of Employment, Interest and Money, “Worldly wisdom teaches that it is better for reputation to fail conventionally than to succeed unconventionally.” And yet there is the fact that the crowd is right much of the time, which means that paying attention to what others do should make you smarter, not dumber. Information isn’t in the hands of one person. It’s dispersed across many people. So relying on only your private information to make a decision guarantees that it will be less informed than it could be. Can you safely rely on the information of others? Does learning make for better decisions?

The answer is that it depends on how we learn. Consider the story of plank-road fever, which the economist Daniel B. Klein and the historian John Majewski uncovered a decade ago. In the first half of the nineteenth century, Americans were obsessed with what were then known as “internal improvements”—canals, railroads, and highways. The country was growing fast and commerce was booming, and Americans wanted to make sure that transportation—or rather the lack of it—didn’t get in the way. In 1825, the Erie Canal was completed, linking New York City to Lake Erie via a 363-mile-long channel that cut travel time from the East Coast to the western interior in half and cut shipping costs by 90 percent. Within a few years, the first local rail lines were being laid, even as private companies were busy building private turnpikes all over the eastern part of the country.

There was a problem, though, that all this feverish building did not solve. Although the canals and railroads would do an excellent job of connecting major towns and cities (and of turning small villages into thriving commercial hubs merely by virtue of going through them), they made it no easier for people who lived outside of those towns—which is to say, most Americans—to get their goods to market, or for that matter to get from one small town to the next. There were local public roads, different stretches of which were maintained by individual villages (much as in a city people take care, at least in theory, of the patch of sidewalk in front of their apartment), but these roads were usually in pretty bad shape. “They had shallow foundations, if any, and were poorly drained,” write Klein and Majewski. “Their surfaces were muddy ruts in wet weather, dusty ruts in dry; travel was slow and extremely wearing on vehicles and on the animals that drew them.”

An engineer named George Geddes, though, believed he had uncovered a solution to this problem: the plank road. The plank road—which, as its name suggests, consisted of wooden planks laid over two lines of timber—had been introduced in Canada in the early 1840s, and after seeing evidence of its success there, Geddes was convinced it would work in the United States as well. There was no question that a plank road was superior to a rutted, muddy path. What wasn’t clear was whether a plank road—which would, in most cases, be privately owned and supported by tolls—would last long enough to be cost-effective. Geddes believed that a typical road would last eight years, more than long enough to provide a reasonable return on investment, and so, in 1846, he convinced some of his fellow townsmen in Salina, New York, to charter a company to build the state’s first plank road.

The road was a roaring success, and soon plank-road fever swept through first New York, then through the mid-Atlantic states and the Midwest. Geddes became a kind of spokesman for the industry, even as other promoters played a similar role in states across the country. Within a decade, there were 352 plank-road companies in New York, and more than a thousand in the United States as a whole.

Unfortunately, the whole business was built on an illusion. Plank roads did not last the eight years Geddes had promised (let alone the twelve years that other enthusiasts had suggested). As Klein and Majewski show, the roads’ actual life span was closer to four years, which made them too expensive for companies to maintain. By the late 1850s, it was clear that the plank road was not a transportation panacea. And though a few roads—including a thirteen-mile stretch along what is now Route 27A in Jamaica, Queens—remained in operation until the 1880s, by the end of the Civil War almost all of them had been abandoned.

Plank-road fever was a vivid example of a phenomenon that economists call an “information cascade.” The first Salina road was a success, as were those which were built in the years immediately following. People who were looking around for a solution to the problem of local roads had one ready-made at hand. As more people built plank roads, their legitimacy became more entrenched, and the desire to consider other solutions shrank. It was years before the fundamental weakness of the roads—they didn’t last long enough—became obvious, and by that time plank roads were being built all over the country.

Why did this happen? The economists Sushil Bikhchandani, David Hirshleifer, and Ivo Welch, who offered the first real model of an information cascade, suggest that it works like this. Assume you have a large group of people, all of whom have the choice of going to either a new Indian restaurant or a new Thai place. The Indian restaurant is better (in an objective sense) than the Thai place. And each person in the group is going to receive, at some point, a piece of information about which restaurant is better. But the information is imperfect. Sometimes it will be wrong—that is, it will say the Thai place is better when it’s not—and will guide a person in the wrong direction. So to supplement their own information, people will look at what others are doing. (The economists assume that everyone knows that everyone else has a piece of good information, too.)

The problem starts when people’s decisions are not made all at once but rather in sequence, so that some people go to one of the two restaurants first and then everyone else follows in order. Remember, the information people have is imperfect. So if the first couple of people happen to get bad information, leading them to believe that the Thai restaurant is great, that’s where they’ll go. At that point, in the cascade model, everyone who follows assumes—even if they’re getting information telling them to go to the Indian restaurant—that there’s a good chance, simply because the Thai place is crowded, that it’s better. So everyone ends up making the wrong decision, simply because the initial diners, by chance, got the wrong information.
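How often the crowd ends up at the wrong restaurant can be worked out in a simplified version of this story. The sketch below is an illustration, not the economists' own formulation, and it rests on assumptions the text does not give: every diner's information is right with the same probability p, a diner whose information exactly offsets the public evidence goes with his own information, and a cascade locks in as soon as two diners in a row, starting from an even split, pick the same place. The function name cascade_odds and the value p = 3/5 are hypothetical.

```python
from fractions import Fraction

def cascade_odds(p, max_pairs=50):
    """Chances of a right cascade, a wrong cascade, and no cascade yet.

    Diners are taken two at a time. From an even split, a pair starts a
    right cascade with probability p*p, a wrong one with q*q, and leaves
    the split even (one signal each way) with probability 2*p*q.
    """
    q = 1 - p
    right = Fraction(0)
    wrong = Fraction(0)
    undecided = Fraction(1)   # probability that the split is still even
    for _ in range(max_pairs):
        right += undecided * p * p
        wrong += undecided * q * q
        undecided *= 2 * p * q
    return right, wrong, undecided

right, wrong, undecided = cascade_odds(Fraction(3, 5))
print(float(right), float(wrong), float(undecided))
```

In this simplified setup the chance of a wrong cascade tends toward q*q / (p*p + q*q), which is 4/13, a little under a third, when each diner's information is right three times out of five. Even fairly good private information leaves plenty of room for the whole group to follow the first unlucky pair.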

In this case, a cascade is not the result of mindless trend-following, or conformity or peer pressure. (“Everyone likes that new Britney Spears song, so I will, too!”) People fall in line because they believe they’re learning something important from the example of others. In the case of the plank roads, for instance, it wasn’t simply that George Geddes was a smooth talker, or that townspeople across the country said, “We just have to have a new plank road because the town across the river has one.” Plank-road fever spread because plank roads really seemed to be a better solution. They cut travel time between towns in half. You could ride on them in any kind of weather. And they allowed small farmers to expand the markets for their goods far beyond what had previously been possible. These were genuine improvements, and as more and more plank roads were built, the fact that those improvements were real and long lasting seemed increasingly plausible. Each new road that was built was in a sense telling people that plank roads worked. And each new road that was built made coming up with an alternative seem increasingly improbable.

The fundamental problem with an information cascade is that after a certain point it becomes rational for people to stop paying attention to their own knowledge—their private information—and to start looking instead at the actions of others and imitate them. (If everyone has the same likelihood of making the right choice, and everyone before you has made the same choice, then you should do what everyone else has done.) But once each individual stops relying on his own knowledge, the cascade stops becoming informative. Everyone thinks that people are making decisions based on what they know, when in fact people are making decisions based on what they think the people who came before them knew. Instead of aggregating all the information individuals have, the way a market or a voting system does, the cascade becomes a sequence of uninformed choices, so that collectively the group ends up making a bad decision—spending all that money on plank roads.

That original model is far from the only theory of how cascades work, of course. In The Tipping Point, for instance, Malcolm Gladwell offered a very different account, which emphasized the importance of particular kinds of individuals—what he called mavens, connectors, and salesmen—in spreading new ideas. In Bikhchandani, Hirshleifer, and Welch’s model of cascades, everyone had as much private information as everyone else. The only thing that made the early adopters of a product more influential was the fact that they were early, and so their actions were the ones that everyone who came after them observed. In Gladwell’s world, some people are far more influential than others, and cascades (he writes of them as epidemics) move via social ties, rather than being a simple matter of anonymous strangers observing each other’s behavior. People are still looking for information, but they believe that the ones who have it are the mavens, connectors, and salesmen (each of whom has a different kind of information).

Do cascades exist? Without a doubt. They are less ubiquitous than the restaurant-going model suggests, since, as Yale economist Robert Shiller has suggested, people don’t usually make decisions in sequence. “In most cases,” Shiller writes, “many people independently choose their action based on their own signals, without observing the actions of others.” But there are plenty of occasions when people do closely observe the actions of others before making their own decisions. In those cases, cascades are possible, even likely. That is not always a bad thing. For instance, one of the most important and valuable innovations in American technological history was made possible by the orchestrating of a successful information cascade. The innovation was the humble screw, and in the 1860s a man named William Sellers, who was the most prominent and respected machinist of his era at a time when the machine-tool industry was the rough equivalent of the technology industry in the 1990s, embarked on a campaign to get America to adopt a standardized screw, which happened to be of his own design. When Sellers started his campaign, every American screw had to be handmade by a machinist. This obviously limited the possibilities for mass production, but it also allowed the machinists to protect their way of life. In economic terms, after all, anything tailor-made has the advantage of locking in customers. If someone bought a lathe from a machinist, that person had to come back to the machinist for screw repairs or replacements. But if screws became interchangeable, customers would need the craftsmen less and would worry about the price more.

Sellers understood the fear. But he also believed that interchangeable parts and mass production were inevitable, and the screw he designed was meant to be easier, cheaper, and faster to produce than any other. His screws fit the new economy, where a premium was placed on speed, volume, and cost. But because of what was at stake, and because the machinist community was so tight-knit, Sellers understood that connections and influence would shape people’s decisions. So over the next five years, he targeted influential users, like the Pennsylvania Railroad and the U.S. Navy, and he successfully created an air of momentum behind the screw. Each new customer made Sellers’s eventual triumph seem more likely, which in turn made his eventual triumph more likely. Within a decade the screw was on its way to becoming a national standard. Without it, assembly-line production would have been difficult at best and impossible at worst. In a sense, Sellers had helped lay the groundwork for modern mass production.

Sellers’s story is of a beneficial cascade. The screw’s design was, by all accounts, superior to its chief competitor, a British screw. And the adoption of a standard screw was a great leap forward for the U.S. economy. But there is an unnerving idea at the heart of Sellers’s story: if his screw was adopted because he used his influence and authority to start a cascade, we were just lucky that Sellers happened to design a good screw. If the machinists were ultimately following Sellers’s lead, rather than acting on their own sense of which screw was better, it was pure chance that they got the answer right.

In other words, if most decisions to adopt new technologies or social norms are driven by cascades, there is no reason to think that the decisions we make are, on average, good ones. Collective decisions are most likely to be good ones when they’re made by people with diverse opinions reaching independent conclusions, relying primarily on their private information. In cascades, none of these things are true. Effectively speaking, a few influential people—either because they happened to go first, or because they have particular skills and fill particular holes in people’s social networks—determine the course of the cascade. In a cascade, people’s decisions are not made independently, but are profoundly influenced—in some cases, even determined—by those around them.

We recently experienced perhaps the most disastrous information cascade in history: the bubble of the late 1990s in the telecommunications business. In the early days of the Internet, traffic was growing at the rate of 1,000 percent a year. Beginning in 1996 or so, that rate slowed dramatically (as one would expect). But no one noticed. The figure “1,000 percent” had become part of the conventional wisdom, and had inspired telecom companies to start investing tens, and eventually hundreds, of billions of dollars to build the capacity that could handle all that traffic. At the time, not investing seemed tantamount to suicide. Even if you had doubts about whether the traffic would ever materialize, everyone around you was insisting that it would. It wasn’t until after the bubble burst, when most of the telecom companies were either bankrupt or on the verge of going out of business, that the conventional wisdom was seriously questioned and found wanting.

Chapter Three, Part II

In 1968, the social psychologists Stanley Milgram, Leonard Bickman, and Lawrence Berkowitz decided to cause a little trouble. First, they put a single person on a street corner and had him look up at an empty sky for sixty seconds. A tiny fraction of the passing pedestrians stopped to see what the guy was looking at, but most just walked past. Next time around, the psychologists put five skyward-looking men on the corner. This time, four times as many people stopped to gaze at the empty sky. When the psychologists put fifteen men on the corner, 45 percent of all passersby stopped, and increasing the cohort of observers yet again made more than 80 percent of pedestrians tilt their heads and look up.

This study appears, at first glance, to be another demonstration of people’s willingness to conform. But in fact it illustrated something different, namely the idea of “social proof,” which is the tendency to assume that if lots of people are doing something or believe something, there must be a good reason why. This is different from conformity: people are not looking up at the sky because of peer pressure or a fear of being reprimanded. They’re looking up at the sky because they assume—quite reasonably—that lots of people wouldn’t be gazing upward if there weren’t something to see. That’s why the crowd becomes more influential as it becomes bigger: every additional person is proof that something important is happening. And the governing assumption seems to be that when things are uncertain, the best thing to do is just to follow along. This is actually not an unreasonable assumption. After all, if the group usually knows best (as I’ve argued it often does), then following the group is a sensible strategy. The catch is that if too many people adopt that strategy, it stops being sensible and the group stops being smart.

Consider, for instance, the story of Mike Martz, the head coach of the St. Louis Rams. Going into Super Bowl XXXVI, the Rams were fourteen-point favorites over the New England Patriots. St. Louis had one of the most potent offenses in NFL history, had led the league in eighteen different statistical categories, and had outscored their opponents 503 to 273 during the regular season. Victory looked like a lock.

Midway through the first quarter, the Rams embarked on their first big drive of the game, moving from their own twenty yard line to the Patriots’ thirty-two. On fourth down, with three yards to go for a first down, Martz faced his first big decision of the game. Instead of going for it, he sent on field-goal kicker Jeff Wilkins, who responded with a successful kick that put the Rams up 3 to 0.

Six minutes later, Martz faced a similar decision, after a Rams drive stalled at the Patriots’ thirty-four yard line. With St. Louis needing five yards for a first down, Martz again chose to send on the kicking team. This time, Wilkins’s attempt went wide left, and the Rams came away with no points.

By NFL standards, Martz’s decisions were good ones. When given the choice between a potential field goal and a potential first down, NFL coaches will almost always take the field goal. The conventional wisdom among coaches holds that you take points when you can get them. (We’ll see shortly why “conventional wisdom” is not the same as “collective wisdom.”) But though Martz’s decisions conformed to the conventional wisdom, they were wrong.

Or so, at least, the work of David Romer would suggest. Romer is an economist at Berkeley who, a couple of years ago, decided to figure out exactly what the best fourth-down strategy actually was. Romer was interested in two different variations of that problem. First, he wanted to know when it made sense to go for a first down rather than punt or kick a field goal. And second, he wanted to know when, once you were inside your opponent’s ten yard line, it made sense to go for a touchdown rather than kick a field goal. Using a mathematical technique called dynamic programming, Romer analyzed just about every game—seven hundred in all—from the 1998, 1999, and 2000 NFL seasons. When he was done, he had figured out the value of a first down at every single point on the field. A first-and-ten on a team’s own twenty yard line was worth a little bit less than half a point—in other words, if a team started from its own twenty yard line fourteen times, on average it scored just one touchdown. A first-and-ten at midfield was worth about two points. A first-and-ten on its opponent’s thirty yard line was worth three. And so on.

Then Romer figured out how often teams that went for a first down on fourth down succeeded. If you had a fourth-and-three on your opponent’s thirty-two yard line, in other words, he knew how likely it was that you’d get a first down if you went for it. And he also knew how likely it was that you’d kick a field goal successfully. From there, comparing the two plays was simple: if a first down on your opponent’s twenty-nine yard line was worth three points, and you had a 60 percent chance of getting the first down, then the expected value of going for it was 1.8 points (3 x .6). A field goal attempt from the thirty-one yard line, on the other hand, was worth barely more than a single point. So Mike Martz should have gone for the first down.
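The comparison in the paragraph above can be written out as a short calculation. This is only a sketch of the simplified arithmetic quoted in the text, not Romer's dynamic-programming model: the function name is made up for illustration, a failed attempt is counted as worth zero points just as the text's "3 x .6" does, and the field-goal value of 1.1 is a placeholder standing in for "barely more than a single point."

```python
def expected_points(success_probability: float,
                    points_if_success: float,
                    points_if_failure: float = 0.0) -> float:
    """Expected value of a play that either succeeds or fails."""
    return (success_probability * points_if_success
            + (1.0 - success_probability) * points_if_failure)


# Figures quoted in the text: a first down on the opponent's twenty-nine
# is worth about 3 points, and the attempt succeeds 60 percent of the time.
go_for_it = expected_points(0.6, 3.0)

# ASSUMPTION: the text only says the field goal attempt is worth "barely
# more than a single point"; 1.1 is an illustrative placeholder.
field_goal = 1.1

better_play = "go for it" if go_for_it > field_goal else "kick"
print(round(go_for_it, 2), field_goal, better_play)
```

The third argument is there because a fuller analysis would also put a (usually negative) value on what happens after a failed attempt; the text's arithmetic leaves that term at zero.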

The beauty of Romer’s analysis was that it left nothing out. After all, when you try a fifty-two yard field goal, it isn’t just the potential three points you have to take into account. You also have to consider the fact that if you fail, your opponents will take over at their own thirty-five yard line. Romer could tell you how many points that would cost you. Every outcome, in other words, could be compared to every other outcome on the same scale.

Romer’s conclusions were, by NFL standards, startling. He argued that teams should pass up field goals and go for first downs far more often than they do. In fact, just about any time a team faced a fourth down needing three or fewer yards for a first, Romer recommended they go for it, and between midfield and the opponent’s thirty yard line—right where the Rams were when Martz made his decisions—Romer thought teams should be even more aggressive. Inside your opponent’s five yard line, meanwhile, you should always go for the touchdown.

Romer’s conclusions were the kind that seem surprising at first and then suddenly seem incredibly obvious. Consider a fourth down on your opponent’s two yard line. You can take a field goal, which is essentially a guaranteed three points, or go for a touchdown, which you will succeed at scoring only 43 percent of the time. Now, 43 percent of seven points is roughly three points, so the value of the two plays is identical. But that’s not all you have to think about. Even if the touchdown attempt fails, your opponent will be pinned on its two yard line. So the smart thing to do is to go for it.
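The goal-line argument has two steps: the raw expected points are about equal, and the field position after a failure tips the balance. A rough sketch of that reasoning follows. The 43 percent and seven-point figures come from the text; the function name and the 0.5-point value placed on pinning the opponent at its own two yard line are assumptions made up for illustration, since the text gives no number for it.

```python
TOUCHDOWN_POINTS = 7.0   # the text counts a touchdown as seven points
FIELD_GOAL_POINTS = 3.0  # "essentially a guaranteed three points"


def go_for_touchdown_value(p_success: float,
                           failure_field_position_value: float) -> float:
    """Expected value of going for it, including what a failure is worth."""
    return (p_success * TOUCHDOWN_POINTS
            + (1.0 - p_success) * failure_field_position_value)


# Step 1: ignore field position. 43 percent of seven is roughly three.
points_only = go_for_touchdown_value(0.43, 0.0)

# Step 2: ASSUMPTION, a hypothetical 0.5-point value for leaving the
# opponent pinned on its own two yard line after a failed attempt.
with_field_position = go_for_touchdown_value(0.43, 0.5)

print(round(points_only, 2), round(with_field_position, 3), FIELD_GOAL_POINTS)
```

Any positive value for the pinned-opponent term pushes the touchdown attempt above the field goal, which is the point the paragraph makes.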

Or consider a fourth-and-three at midfield. Half the time you’ll succeed, and half the time you’ll fail, so it’s a wash (since no matter what happens, either team will have the ball at the same place on the field). But the 50 percent of the time that you succeed, you’ll gain an average of six yards, leaving you better off than your opponent is when you fail. So, again, aggressiveness makes sense.

Obviously there were things that Romer couldn’t factor in, including, most notably, the impact of momentum on a team’s play. And his numbers were averaged across the league as a whole, so individual teams would presumably need to do some adjusting to figure out their particular chances of success on fourth down. Even so, the analysis seems undeniable: coaches are being excessively cautious. And, as for Mike Martz, his two decisions in that Super Bowl game were about as bad as decisions get. Martz refused to go for a first down on the Patriots’ thirty-two yard line when the Rams needed just three yards. Romer’s calculations suggest that Martz would have been justified in going for a first down even if the Rams had needed nine yards (since at that place on the field, the chances of missing a field goal are high, and the field-position cost is slight). And that’s with an average team. With an offense like the Rams’, the value of going for it would presumably have been much higher. While it’s impossible to say that any one (or two) decisions were responsible for the final outcome, it’s not exactly surprising that the Rams lost that Super Bowl.

Again, though, Martz was not alone. Romer looked at all the first-quarter fourth-down plays in the three seasons he studied, and found 1,100 plays where the teams would have been better off going for it. Instead, they kicked the ball 992 times.

This is perplexing. After all, football coaches are presumably trying their best to win games. They are experts. They have an incentive to introduce competitive innovations. But they’re not adopting a strategy that would help them win. It’s possible, of course, that Romer is wrong. Football is a remarkably complex, dynamic game, in which it’s hard to distinguish among skill, strategy, emotion, and luck, so there may be something important that his computer program is missing. But it’s not likely. Romer’s study suggests that the gains from being more aggressive on fourth down are so big that they can’t be explained away as a fluke or a statistical artifact. Teams that became more aggressive on fourth down would unquestionably have a competitive edge. But most NFL coaches prefer to be cautious instead. The interesting question is: Why?

The answer, I think, has a lot to do with imitation and social proof and the limits of group thinking. First, and perhaps most important, playing it conservatively on fourth down is as close to a fundamental truth in professional football as you get. In the absence of hard evidence to the contrary, it’s easier for individuals to create explanations to justify the way things are than to imagine how they might be different. If no one else goes for it, then that must mean that it doesn’t make sense to go for it.

The imitative impulse is magnified by the fact that football—like most professional sports—is a remarkably clubby, insular institution. To be sure, there have been myriad genuine innovators in the game—including Martz himself—but in its approach to statistical analysis the game has been strangely hidebound. The pool of decision makers is not, in other words, particularly diverse. That means it is unlikely to come up with radical innovations, and even more unlikely to embrace them when they’re proposed. To put it another way, the errors most football coaches make are correlated: they all point in the same direction. This is exactly the problem with most major-league baseball teams, too, as Michael Lewis documented so well in his book about the recent success of the Oakland A’s, Moneyball. Billy Beane and Paul DePodesta, the brain trust of the A’s, have been able to build a tremendously successful team for very little money precisely because they’ve rejected the idea of social proof, abandoning the game’s conventional strategic and tactical wisdom in order to cultivate diverse approaches to player evaluation and development. (Similarly, the one current NFL coach who appears to have taken Romer’s ideas seriously—and perhaps even used them in games—is the New England Patriots’ Bill Belichick, whose penchant for rejecting the conventional wisdom has helped the Patriots win two Super Bowls in three years.)

Another factor shaping NFL coaches’ caution may be, as Romer himself suggests, an aversion to risk. Going for it on fourth-and-two makes strategic sense, but it may not make psychological sense. After all, Romer’s strategy means that teams would fail to score roughly half the time they were inside their opponent’s ten yard line. That’s a winning strategy in the long run. But it’s still a tough ratio for a risk-averse person to accept. Similarly, even though punting on fourth down makes little sense, it at least limits disaster.

The risk-averse explanation makes additional sense if you think about the pressures that any community can bring to bear on its members. That doesn’t mean that NFL coaches are forced to be conservative. It just means that when all of one’s peers are following the exact same strategy it’s difficult to follow a different one, especially when the new strategy is more risky and failure will be public and inescapable (as it is for NFL coaches). Under those conditions, sticking with the crowd and failing small, rather than trying to innovate and run the risk of failing big, makes not just emotional but also professional sense. This is the phenomenon that’s sometimes called herding. Just as water buffalo will herd together in the face of a lion, football coaches, money managers, and corporate executives often find the safety of numbers alluring—as the old slogan “No one ever got fired for buying IBM” suggests.

The striking thing about herding is that it takes place even among people who seem to have every incentive to think independently, like professional money managers. One classic study of herding, by David S. Scharfstein and Jeremy C. Stein, looked at the tendency of mutual-fund managers to follow the same strategies and herd into the same stocks. This is thoroughly perplexing. Money managers have jobs, after all, only because they’ve convinced investors that they can outperform the market. Most of them can’t. And surely herding only makes a difficult task even harder, since it means the managers are mimicking the behavior of their competitors.

What Scharfstein and Stein recognized, though, was that mutual-fund managers actually have to do two things: they have to invest wisely, and they have to convince people that they’re investing wisely, too. The problem is that it’s hard for mutual-fund investors to know if their money manager is, in fact, investing their money wisely. After all, if you knew what investing wisely was, you’d do it yourself. Obviously you can look at performance, but we know that short-term performance is an imperfect indicator of skill at best. In any one quarter, a manager’s performance may be significantly better or worse depending on factors that have absolutely nothing to do with his stock-picking or asset-allocation skills. So investors need more evidence that a mutual-fund manager’s decisions are reasonable. The answer? Look at how a manager’s style compares to that of his peers. If he’s following the same strategy—investing in the same kinds of stocks, allocating money to the same kinds of assets—then at least investors know he’s not irrational. The problem, of course, is that this means that, all other things being equal, someone who bucks the crowd—by, say, following a contrarian strategy—is likely to be considered crazy.

This would not matter if investors had unlimited patience, because the difference between good and bad strategies would eventually show up in the numbers. But investors do not have unlimited patience, and even the smartest investor will fail a significant percentage of the time. It’s much safer for a manager to follow the strategy that seems rational rather than the strategy that is rational. As a result, managers anxious to protect their jobs come to mimic each other. In doing so, they destroy whatever information advantage they might have had, since the mimicking managers are not really trading on their own information but are relying on the information of others. That shrinks not only the range of possible investments but also the overall intelligence of the market, since imitating managers aren’t bringing any new information to the table.

Chapter Three, Part I

In the early part of the twentieth century, the American naturalist William Beebe came upon a strange sight in the Guyana jungle. A group of army ants was moving in a huge circle. The circle was 1,200 feet in circumference, and it took each ant two and a half hours to complete the loop. The ants went around and around the circle for two days until most of them dropped dead.

What Beebe saw was what biologists call a “circular mill.” The mill is created when army ants find themselves separated from their colony. Once they’re lost, they obey a simple rule: follow the ant in front of you. The result is the mill, which usually only breaks up when a few ants straggle off by chance and the others follow them away.

As Steven Johnson showed in his illuminating book Emergence, an ant colony normally works remarkably well. No one ant runs the colony. No one issues orders. Each individual ant knows, on its own, almost nothing. Yet the colony successfully finds food, gets all its work done, and reproduces itself. But the simple tools that make ants so successful are also responsible for the demise of the ants who get trapped in the circular mill. Every move an ant makes depends on what its fellow ants do, and an ant cannot act independently, which would help break the march to death.

So far in this book, I’ve assumed that human beings are not ants. In other words, I’ve assumed that human beings can be independent decision makers. Independence doesn’t mean isolation, but it does mean relative freedom from the influence of others. If we are independent, our opinions are, in some sense, our own. We will not march to death in a circle just because the ants in front of us are.

This is important because a group of people—unlike a colony of ants—is far more likely to come up with a good decision if the people in the group are independent of each other. Independence is always a relative term, but the story of Francis Galton and the ox illustrates the point. Each fairgoer figured out his estimate of the weight of the ox on his own (with allowances made for kibitzing), relying on what economists call “private information.” (Private information isn’t just concrete data. It can also include interpretation, analysis, or even intuition.) And when you put all those independent estimates together, the combined guess was, as we’ve seen, near perfect.

Independence is important to intelligent decision making for two reasons. First, it keeps the mistakes that people make from becoming correlated. Errors in individual judgment won’t wreck the group’s collective judgment as long as those errors aren’t systematically pointing in the same direction. One of the quickest ways to make people’s judgments systematically biased is to make them dependent on each other for information. Second, independent individuals are more likely to have new information rather than the same old data everyone is already familiar with. The smartest groups, then, are made up of people with diverse perspectives who are able to stay independent of each other. Independence doesn’t imply rationality or impartiality, though. You can be biased and irrational, but as long as you’re independent, you won’t make the group any dumber.
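The first reason, that independent errors cancel while correlated errors do not, can be seen in a small simulation. This is a toy sketch under stated assumptions rather than anything taken from the book: the true value of 1,000, the crowd size of 500, the noise levels, and all the function names are hypothetical, and a "correlated" crowd is modeled crudely as one in which everyone shares the same random bias on top of private noise.

```python
import random
import statistics


def crowd_error(true_value, n_people, shared_bias_sd, private_noise_sd, rng):
    """Absolute error of the crowd's average guess in a single trial."""
    # The shared bias is drawn once per trial and applied to every guess,
    # so it cannot average out no matter how large the crowd is.
    shared_bias = rng.gauss(0.0, shared_bias_sd)
    guesses = [true_value + shared_bias + rng.gauss(0.0, private_noise_sd)
               for _ in range(n_people)]
    return abs(statistics.fmean(guesses) - true_value)


def average_crowd_error(shared_bias_sd, private_noise_sd, trials=200, seed=0):
    """Mean crowd error over many trials, with a fixed seed for repeatability."""
    rng = random.Random(seed)
    errors = [crowd_error(1000.0, 500, shared_bias_sd, private_noise_sd, rng)
              for _ in range(trials)]
    return statistics.fmean(errors)


# Same private noise in both cases; only the shared component differs.
independent = average_crowd_error(shared_bias_sd=0.0, private_noise_sd=100.0)
correlated = average_crowd_error(shared_bias_sd=100.0, private_noise_sd=100.0)

print(f"independent crowd: {independent:.1f}")
print(f"correlated crowd:  {correlated:.1f}")
```

Each individual in the independent crowd is just as noisy as before, yet the average lands close to the truth; once the errors share a direction, adding more people does not help.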

Now, the assumption of independence is a familiar one. It’s intuitively appealing, since it takes the autonomy of the individual for granted. It’s at the core of Western liberalism. And, in the form of what’s usually called “methodological individualism,” it underpins most of textbook economics. Economists usually take it as a given that people are self-interested. And they assume people arrive at their idea of self-interest on their own.

For all this, though, independence is hard to come by. We are autonomous beings, but we are also social beings. We want to learn from each other, and learning is a social process. The neighborhoods where we live, the schools we attend, and the corporations where we work shape the way we think and feel. As Herbert A. Simon once wrote, “A man does not live for months or years in a particular position in an organization, exposed to some streams of communication, shielded from others, without the most profound effects upon what he knows, believes, attends to, hopes, wishes, emphasizes, fears, and proposes.”

Even while recognizing (how could they not?) the social nature of existence, economists tend to emphasize people’s autonomy and to downplay the influence of others on our preferences and judgments. Sociologists and social-network theorists, by contrast, describe people as embedded in particular social contexts, and see influence as inescapable. Sociologists generally don’t view this as a problem. They suggest it’s simply the way human life is organized. And it may not be a problem for everyday life. But what I want to argue here is that the more influence a group’s members exert on each other, and the more personal contact they have with each other, the less likely it is that the group’s decisions will be wise ones. The more influence we exert on each other, the more likely it is that we will believe the same things and make the same mistakes. That means it’s possible that we could become individually smarter but collectively dumber. The question we have to ask in thinking about collective wisdom, then, is: Can people make collectively intelligent decisions even when they are in constant, if erratic, interaction with each other?

Chapter Two, Part IV

In part because individual judgment is not accurate enough or consistent enough, cognitive diversity is essential to good decision making. The positive case for diversity, as we’ve seen, is that it expands a group’s set of possible solutions and allows the group to conceptualize problems in novel ways. The negative case for diversity is that diversity makes it easier for a group to make decisions based on facts, rather than on influence, authority, or group allegiance. Homogeneous groups, particularly small ones, are often victims of what the psychologist Irving Janis called “groupthink.” After a detailed study of a series of American foreign-policy fiascoes, including the Bay of Pigs invasion and the failure to anticipate Pearl Harbor, Janis argued that when decision makers are too much alike—in worldview and mind-set—they easily fall prey to groupthink. Homogeneous groups become cohesive more easily than diverse groups, and as they become more cohesive they also become more dependent on the group, more insulated from outside opinions, and therefore more convinced that the group’s judgment on important issues must be right. These kinds of groups, Janis suggested, share an illusion of invulnerability, a willingness to rationalize away possible counterarguments to the group’s position, and a conviction that dissent is not useful.

In the case of the Bay of Pigs invasion, for instance, the Kennedy administration planned and carried out its strategy without ever really talking to anyone who was skeptical of the prospects of success. The people who planned the operation were the same ones who were asked to judge whether it would be successful or not. The few people who voiced caution were quickly silenced. And, most remarkably, neither the intelligence branch of the CIA nor the Cuban desk of the State Department was consulted about the plan. The result was a bizarre neglect of some of the most elemental facts about Cuba in 1961, including the popularity of Fidel Castro, the strength of the Cuban army, and even the size of the island itself. (The invasion was predicated on the idea that 1,200 men could take over all of Cuba.) The administration even convinced itself that the world would believe the United States had nothing to do with the invasion, though American involvement was an open secret in Guatemala (where the Cuban exiles were being trained).

The important thing about groupthink is that it works not so much by censoring dissent as by making dissent seem somehow improbable. As the historian Arthur Schlesinger Jr. put it, “Our meetings took place in a curious atmosphere of assumed consensus.” Even if at first no consensus exists—only the appearance of one—the group’s sense of cohesiveness works to turn the appearance into reality, and in doing so helps dissolve whatever doubts members of the group might have. This process obviously works all the more powerfully in situations where the group’s members already share a common mind-set. Because information that might represent a challenge to the conventional wisdom is either excluded or rationalized as obviously mistaken, people come away from discussions with their beliefs reinforced, convinced more than ever that they’re right. Deliberation in a groupthink setting has the disturbing effect not of opening people’s minds but of closing them. In that sense, Janis’s work suggests that the odds of a homogeneous group of people reaching a good decision are slim at best.

One obvious cost of homogeneity is also that it fosters the palpable pressures toward conformity that groups often bring to bear on their members. This seems similar to the problem of groupthink, but it’s actually distinct. When the pressure to conform is at work, a person changes his opinion not because he actually believes something different but because it’s easier to change his opinion than to challenge the group. The classic and still definitive illustration of the power of conformity is Solomon Asch’s experiment in which he asked groups of people to judge which of three lines was the same size as a line on a white card. Asch assembled groups of seven to nine people, one of them the subject and the rest (unbeknownst to the subject) confederates of the experimenter. He then put the subject at the end of the row of people, and asked each person to give his choice out loud. There were twelve cards in the experiment, and with the first two cards, everyone in the group identified the same lines. Beginning with the third card, though, Asch had his confederates begin to pick lines that were clearly not the same size as the line on the white card. The subject, in other words, sat there as everyone else in the room announced that the truth was something that he could plainly see was not true. Not surprisingly, this occasioned some bewilderment. The unwitting subjects changed the position of their heads to look at the lines from a different angle. They stood up to scrutinize the lines more closely. And they joked nervously about whether they were seeing things.

Most important, a significant number of the subjects simply went along with the group, saying that lines that were clearly shorter or longer than the line on the card were actually the same size. Most subjects said what they really thought most of the time, but 70 percent of the subjects changed their real opinion at least once, and a third of the subjects went along with the group at least half the time. When Asch talked to the subjects afterward, most of them stressed their desire to go along with the crowd. It wasn’t that they really believed the lines were the same size. They were only willing to say they were in order not to stand out.

Asch went on, though, to show something just as important: while people are willing to conform even against their own better judgment, it does not take much to get them to stop. In one variant on his experiment, for instance, Asch planted a confederate who, instead of going along with the group, picked the lines that matched the line on the card, effectively giving the unwitting subject an ally. And that was enough to make a huge difference. Having even one other person in the group who felt as they did made the subjects happy to announce their thoughts, and the rate of conformity plummeted.

Ultimately, diversity contributes not just by adding different perspectives to the group but also by making it easier for individuals to say what they really think. As we’ll see in the next chapter, independence of opinion is both a crucial ingredient in collectively wise decisions and one of the hardest things to keep intact. Because diversity helps preserve that independence, it’s hard to have a collectively wise group without it.

Chapter Two, Part III

The fact that cognitive diversity matters does not mean that if you assemble a group of diverse but thoroughly uninformed people, their collective wisdom will be smarter than an expert’s. But if you can assemble a diverse group of people who possess varying degrees of knowledge and insight, you’re better off entrusting it with major decisions rather than leaving them in the hands of one or two people, no matter how smart those people are. If this is difficult to believe—in the same way that March’s assertions are hard to believe—it’s because it runs counter to our basic intuitions about intelligence and business. Suggesting that the organization with the smartest people may not be the best organization is heretical, particularly in a business world caught up in a ceaseless “war for talent” and governed by the assumption that a few superstars can make the difference between an excellent and a mediocre company. Heretical or not, it’s the truth: the value of expertise is, in many contexts, overrated.

Now, experts obviously exist. The play of a great chess player is qualitatively different from the play of a merely accomplished one. The great player sees the board differently, he processes information differently, and he recognizes meaningful patterns almost instantly. As Herbert A. Simon and W. G. Chase demonstrated in the 1970s, if you show a chess expert and an amateur a board with a chess game in progress on it, the expert will be able to re-create from memory the layout of the entire game. The amateur won’t. Yet if you show that same expert a board with chess pieces irregularly and haphazardly placed on it, he will not be able to re-create the layout. This is impressive testimony to how thoroughly chess is imprinted on the minds of successful players. But it also demonstrates how limited the scope of their expertise is. A chess expert knows about chess, and that’s it. We intuitively assume that intelligence is fungible, and that people who are excellent at one intellectual pursuit would be excellent at another. But this is not the case with experts. Instead, the fundamental truth about expertise is that it is, as Chase has said, “spectacularly narrow.”

More important, there’s no real evidence that one can become expert in something as broad as “decision making” or “policy” or “strategy.” Auto repair, piloting, skiing, perhaps even management: these are skills that yield to application, hard work, and native talent. But forecasting an uncertain future and deciding the best course of action in the face of that future are much less likely to do so. And much of what we’ve seen so far suggests that a large group of diverse individuals will come up with better and more robust forecasts and make more intelligent decisions than even the most skilled “decision maker.”

We’re all familiar with the absurd predictions that business titans have made: Harry Warner of Warner Bros. pronouncing in 1927, “Who the hell wants to hear actors talk?,” or Thomas Watson of IBM declaring in 1943, “I think there is a world market for maybe five computers.” These can be written off as amusing anomalies, since over the course of a century, some smart people are bound to say some dumb things. What can’t be written off, though, is the dismal performance record of most experts.

Between 1984 and 1999, for instance, almost 90 percent of mutual-fund managers underperformed the Wilshire 5000 Index, a relatively low bar. The numbers for bond-fund managers are similar: in the most recent five-year period, more than 95 percent of all managed bond funds underperformed the market. After a survey of expert forecasts and analyses in a wide variety of fields, Wharton professor J. Scott Armstrong wrote, “I could find no studies that showed an important advantage for expertise.” Experts, in some cases, were a little better at forecasting than laypeople (although a number of studies have concluded that nonpsychologists, for instance, are actually better at predicting people’s behavior than psychologists are), but above a low level, Armstrong concluded, “expertise and accuracy are unrelated.” James Shanteau is one of the country’s leading thinkers on the nature of expertise, and has spent a great deal of time coming up with a method for estimating just how expert someone is. Yet even he suggests that “experts’ decisions are seriously flawed.”

Shanteau recounts a series of studies that have found experts’ judgments to be neither consistent with the judgments of other experts in the field nor internally consistent. For instance, the between-expert agreement in a host of fields, including stock picking, livestock judging, and clinical psychology, is below 50 percent, meaning that experts are as likely to disagree as to agree. More disconcertingly, one study found that the internal consistency of medical pathologists’ judgments was just 0.5, meaning that a pathologist presented with the same evidence would, half the time, offer a different opinion. Experts are also surprisingly bad at what social scientists call “calibrating” their judgments. If your judgments are well calibrated, then you have a sense of how likely it is that your judgment is correct. But experts are much like normal people: they routinely overestimate the likelihood that they’re right.

A survey on the question of overconfidence by economist Terrance Odean found that physicians, nurses, lawyers, engineers, entrepreneurs, and investment bankers all believed that they knew more than they did. Similarly, a recent study of foreign-exchange traders found that 70 percent of the time, the traders overestimated the accuracy of their exchange-rate predictions. In other words, it wasn’t just that they were wrong; they also didn’t have any idea how wrong they were. And that seems to be the rule among experts. The only forecasters whose judgments are routinely well calibrated are expert bridge players and weathermen. It rains on 30 percent of the days when weathermen have predicted a 30 percent chance of rain.
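
The idea of calibration is easy to check with a little arithmetic. What follows is only a rough sketch, with made-up forecasts rather than data from any of the studies mentioned above: it groups predictions by the probability the forecaster stated and compares that with how often the event actually happened. The function name `calibration_table` and the two hypothetical forecasters are assumptions introduced purely for illustration.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each stated probability to the observed frequency of the event."""
    buckets = defaultdict(list)
    for stated, happened in zip(forecasts, outcomes):
        buckets[stated].append(1 if happened else 0)
    return {stated: sum(hits) / len(hits) for stated, hits in sorted(buckets.items())}


# Hypothetical weatherman: ten days forecast at 30% rain, ten days at 80%.
weather_forecasts = [0.3] * 10 + [0.8] * 10
weather_outcomes = [True] * 3 + [False] * 7 + [True] * 8 + [False] * 2
weather_table = calibration_table(weather_forecasts, weather_outcomes)

# Hypothetical overconfident trader: claims 90% certainty, is right 6 times in 10.
trader_forecasts = [0.9] * 10
trader_outcomes = [True] * 6 + [False] * 4
trader_table = calibration_table(trader_forecasts, trader_outcomes)

for label, table in (("weatherman", weather_table), ("trader", trader_table)):
    for stated, observed in table.items():
        print(f"{label}: stated {stated:.0%}, observed {observed:.0%}")
```

A well-calibrated forecaster shows stated and observed figures that roughly match in every row. An overconfident one shows observed frequencies that fall short of the stated ones.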

Armstrong, who studies expertise and forecasting, summarized the case this way: “One would expect experts to have reliable information for predicting change and to be able to utilize the information effectively. However, expertise beyond a minimal level is of little value in forecasting change.” Nor was there evidence that even if most experts were not very good at forecasting, a few titans were excellent. Instead, Armstrong wrote, “claims of accuracy by a single expert would seem to be of no practical value.” This was the origin of Armstrong’s “seer-sucker theory”: “No matter how much evidence exists that seers do not exist, suckers will pay for the existence of seers.”

Again, this doesn’t mean that well-informed, sophisticated analysts are of no use in making good decisions. (And it certainly doesn’t mean that you want crowds of amateurs trying to collectively perform surgery or fly planes.) It does mean that however well-informed and sophisticated an expert is, his advice and predictions should be pooled with those of others to get the most out of him. (The larger the group, the more reliable its judgment will be.) And it means that attempting to “chase the expert,” looking for the one man who will have the answers to an organization’s problem, is a waste of time. We know that the group’s decision will consistently be better than most of the people in the group, and that it will be better decision after decision, while the performance of human experts will vary dramatically depending on the problem they’re asked to solve. So it is unlikely that one person, over time, will do better than the group.
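
To make the pooling argument concrete, here is a minimal sketch that compares the simple average of a group's guesses with each member's own guess. The numbers are invented for illustration and are not drawn from any study cited here, and `crowd_report` is a hypothetical helper name.

```python
from statistics import mean


def crowd_report(guesses, truth):
    """Compare the error of the averaged guess with each individual's error."""
    crowd_guess = mean(guesses)
    crowd_error = abs(crowd_guess - truth)
    individual_errors = [abs(guess - truth) for guess in guesses]
    # Count the members whose own guess was further off than the average.
    beaten = sum(1 for error in individual_errors if error > crowd_error)
    return {
        "crowd_guess": crowd_guess,
        "crowd_error": crowd_error,
        "mean_individual_error": mean(individual_errors),
        "share_beaten": beaten / len(guesses),
    }


# Hypothetical guesses at a quantity whose true value is 100.
truth = 100
guesses = [70, 85, 90, 95, 101, 110, 120, 140]
report = crowd_report(guesses, truth)

for name, value in report.items():
    print(f"{name}: {value}")
```

In this made-up case one member does better than the average, but the average still beats seven of the eight. One part of the result is not an accident of the numbers chosen: the error of the averaged guess can never be larger than the average of the individual errors, because overestimates and underestimates partly cancel.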

Now, it’s possible that a small number of genuine experts (that is, people who can consistently offer better judgments than those of a diverse, informed group) do exist. The investor Warren Buffett, who has consistently outperformed the S&P 500 Index since the 1960s, is certainly someone who comes to mind. The problem is that even if these superior beings do exist, there is no easy way to identify them. Past performance, as we are often told, is no guarantee of future results. And there are so many would-be experts out there that distinguishing between those who are lucky and those who are genuinely good is often a near-impossible task. At the very least, it’s a job that requires considerable patience: if you wanted to be sure that a successful money manager was beating the market because of his superior skill, and not because of luck or measurement error, you’d need many years, if not decades, of data. And if a group is so unintelligent that it will flounder without the right expert, it’s not clear why the group would be intelligent enough to recognize an expert when it found him.
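
A back-of-the-envelope calculation shows why luck is so hard to tell apart from skill. The sketch below rests on a deliberately crude assumption that is not in the text: every money manager has no skill at all and simply has an independent 50 percent chance of beating the market in any given year. The function names and the parameter `p_win` are hypothetical.

```python
def expected_lucky_streaks(num_managers, years, p_win=0.5):
    """Expected number of skill-free managers who beat the market every year."""
    return num_managers * p_win ** years


def prob_at_least_one_streak(num_managers, years, p_win=0.5):
    """Chance that at least one skill-free manager beats the market every year."""
    p_streak = p_win ** years
    return 1 - (1 - p_streak) ** num_managers


# Ten-year winning streaks among pools of coin-flipping managers.
for pool in (100, 1_000, 10_000):
    expected = expected_lucky_streaks(pool, 10)
    chance = prob_at_least_one_streak(pool, 10)
    print(f"{pool} managers: expected streaks {expected:.2f}, "
          f"chance of at least one {chance:.2f}")
```

Under these assumptions, a pool of roughly a thousand managers is more likely than not to contain someone with a perfect ten-year record, and a pool of ten thousand should contain several. A long streak on its own therefore says little about whether its owner is skilled.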

We think that experts will, in some sense, identify themselves, announcing their presence and demonstrating their expertise by their level of confidence. But it doesn’t work that way. Strangely, experts are no more confident in their abilities than average people are, which is to say that they are overconfident like everyone else, but no more so. Similarly, there is very little correlation between experts’ self-assessment and their performance. Knowing and knowing that you know are apparently two very different skills.

If this is the case, then why do we cling so tightly to the idea that the right expert will save us? And why do we ignore the fact that simply averaging a group’s estimates will produce a very good result? Richard Larrick and Jack B. Soll suggest that the answer is that we have bad intuitions about averaging. We assume averaging means dumbing down or compromising. When people are faced with the choice of picking one expert or picking pieces of advice from a number of experts, they try to pick the best expert rather than simply average across the group. Another reason, surely, is our assumption that true intelligence resides only in individuals, so that finding the right person (the right consultant, the right CEO) will make all the difference. In a sense, the crowd is blind to its own wisdom. Finally, we seek out experts because we get, as the writer Nassim Taleb asserts, “fooled by randomness.” If there are enough people out there making predictions, a few of them are going to compile an impressive record over time. That does not mean that the record was the product of skill, nor does it mean that the record will continue into the future. Again, trying to find smart people will not lead you astray. Trying to find the smartest person will.

The Wisdom of Crowds