CHAPTER FIVE

THE EMPIRICAL EVIDENCE

True ideas are those that we can assimilate, validate, corroborate and verify. False ideas are those that we cannot.

WILLIAM JAMES

Empirical evidence of the benefits of cognitive and identity diversity takes multiple forms. It includes correlational data, controlled experiments, and case studies. In this chapter, I summarize some of that evidence, paying particular attention to diversity bonuses. In some domains, prediction in particular, the evidence of significant diversity bonuses will be unequivocal. As would be expected from the theory, direct evidence for identity diversity bonuses will be more mixed. It will exist in some cases but not in others.

In addition to presenting direct evidence of bonuses, I also describe other evidence consistent with diversity bonuses, such as the growth of teams. For more than two decades, organizational scholars have noted the increased predominance of team-based work.1 Teams now manage a majority of mutual funds, write most software and apps, and provide input into most high-value business decisions. The shift to teams may be most pronounced in the academy. A generation ago, the modal academic paper had a single author. Today, a scientific paper is three times more likely to have six or more authors than to have a single author.

The trend toward teams has even occurred within creative endeavors. Teams of three or more songwriters now write a majority of the Billboard 100 hits.2 Many people are aware that Paul McCartney and John Lennon, who often wrote as a duo, sit atop the list of songwriters with the most number-one Billboard hits. Few people know that in third place on that list sits a forty-five-year-old Swede, Martin Sandberg (aka Max Martin), although many know Martin’s songs. He wrote “I Want It That Way” for the Backstreet Boys, “That’s the Way It Is” for Céline Dion, “DJ Got Us Fallin’ in Love” for Usher, both “Roar” and “I Kissed a Girl” for Katy Perry, and Maroon 5’s “One More Night.”

I should say, he cowrote those songs. Martin did not compose any of the aforementioned songs by himself. He was a member of songwriting teams credited with those hits. His career reflects the broader industry trend toward collaborations.

Why teams? Simple: teams perform better. When teams compete against individuals on difficult tasks, teams generally win. That is no idle claim. Later in this chapter, I present studies of more than fifteen million research papers, a decade of mutual fund returns, and the largest prediction contest ever conducted.3 In each case, teams outperform individuals by a substantial margin.

Teams win because they can draw from larger cognitive repertoires. A team possesses more information, more ideas, more knowledge, and more ways of thinking than a single person. A team can access more perspectives and more tools. This abundance of cognitive tools allows teams to produce more ideas and to find improvements in the ideas they encounter. It allows them to partition reality more finely and avoid blind spots. This abundance depends on the team consisting of accomplished individuals who are collectively diverse.

Consider Max Martin’s collaborators: Cuban American rapper Armando Christian Pérez (better known as Pitbull), Swedish DJ Denniz Pop, Indian American record producer Savan Harish Kotecha, Canadian record producer Henry Russell Walter (aka Cirkut), classically trained American songwriter Bonnie McKee, African American rapper Juicy J, and British songwriter Cathy Dennis—a talented lot no doubt, but what jumps out is their diversity. They specialize in different musical styles, they come from diverse backgrounds, and they belong to different identity groups. Their diversity enables them to write songs that appeal to wide audiences.

In examining the evidence, we should not expect diversity bonuses on all tasks or from all diverse groups. Cognitive diversity does not produce bonuses on all tasks. To add value, cognitive diversity must be germane. Writing computer code requires skills that are different from those required to write hit songs or identify subatomic particles. The CEO of CERN, the European Organization for Nuclear Research, would be wise to pass on an application by Max Martin. Martin’s repertoire of songwriting skills, as impressive as it may be, would not be applicable to furthering our understanding of subatomic particles. Even in cases in which diversity could produce a bonus, say, asking Max Martin to help create music for a video game, a collaboration could fail to produce bonuses if the team lacks a shared mission, fails to create an inclusive culture, or cannot communicate effectively.

In this chapter, I engage four disparate literatures. I first look at aggregate correlative data relating employee and leadership diversity to organizational success. Those data make a strong correlative case for diversity bonuses. Next, I look at a specific case, the change in Norwegian law that required increased gender diversity on boards.

I then look at academic studies of groups and teams. That research leads to more nuanced and modest conclusions. Diversity bonuses appear on some problems and not all. And too much diversity can be a problem. Last, I look at evidence showing a trend toward more team-based work. This last set of studies reveals strong evidence of diversity bonuses. It may be a best-case scenario for identifying diversity bonuses. The teams studied pursue common goals, they have good information about the repertoires of potential team members, and their performances can be measured quantitatively.

We should not view this empirical evidence as a final arbiter. We must keep in mind that the data reveal the world as it is, not as it could be. With experience, diverse teams could perform even better, a point I take up in the next chapter.

THE VIEW FROM TEN THOUSAND FEET

The empirical evidence for diversity bonuses includes compelling correlations, confusing causal studies, large-N evidence from teams, and over a half century of conflicting experimental studies. The correlational evidence from ten thousand feet appears unequivocal. These studies include a 2015 McKinsey analysis of top management teams of 366 companies in the United States, the United Kingdom, Canada, and Latin America that finds a positive linear relationship between diversity and financial performance. Companies with management teams in the top quartile for gender diversity outperform those in the bottom quartile by 15 percent. Companies in the top quartile for ethnic diversity outperform those in the bottom quartile by 35 percent.4

The correlations between diverse leadership and performance are even more striking. A 2014 analysis of twenty-seven thousand senior managers at three thousand large firms revealed positive correlations between the percentage of women in leadership roles and firm performance. In a continuation of that study covering the two and a half years from January 2014 to July 2016, researchers found even stronger correlations. The market values of firms at which women compose more than one-fourth of senior leadership grew at nearly 3 percent over market averages. Firms at which women filled more than half of senior leadership positions beat the market by more than 10 percent annually.5

An earlier, more intensive McKinsey analysis of 180 companies in France, Germany, the United Kingdom, and the United States in 2012 compared return on equity (ROE) and earnings before interest and taxes (EBIT) for companies in the upper and lower quartiles of executive board diversity. Here, diversity is measured by the percentage of women and foreign nationals on the executive board.6 For Germany and the United Kingdom, ROE was over 66 percent higher for firms in the top quartile than for those in the lowest quartile. In the United States, ROE was nearly 100 percent higher. For France, the difference in ROE was not significant.

The results on EBIT reveal different country-level patterns. In Germany, the most diverse firms had an EBIT 82 percent higher than the least diverse. In France and the United States, the increase was around 50 percent, and in the United Kingdom, the increase was 30 percent. A separate study of the effect of gender diversity on the firms in Standard & Poor’s top 1,500 firms from 1992 to 2016 finds that women on boards increase firm value, though only for firms that focus on innovation.7

The literature that analyzes city and regional racial and cultural diversity and economic performance also shows strong correlative evidence. Racial diversity significantly increases performance in advertising, finance, entertainment, legal services, health services, hotels, bars and restaurants, and computer manufacturing.8 A one standard deviation increase in racial diversity (relocating from South Dakota to Michigan) increases productivity by more than 25 percent in legal services, health services, and finance and results in an increase of more than 10 percent in advertising.

The industry-level analysis suggests that racial diversity improves performance when workers solve problems, think creatively, and must understand their customers. As would be expected from the diversity-bonus logic, increased racial diversity does not increase performance in industries that involve physical labor. Firms that produce aircraft parts, fabricated metals, machinery and nondurable goods, paper products, and transportation do not become more efficient by increasing the racial diversity of their employees.9

Studies also show that city-level productivity increases with the diversity of professions.10 Not surprisingly, professional diversity also scales with city size, which may help explain why workers in larger cities are more productive.

As powerful as they may seem, all of these studies can be challenged on the grounds that they only report correlations. In the studies that show that diverse firms earn higher profits, the causal effect could well run in the opposite direction: successful firms may be able to afford diversity. If that were true, the firms would need some reason for pursuing diverse workers—perhaps to promote social justice, or in anticipation of demographic trends. More likely, it might be that firms with more inclusive cultures attract more diverse employees and earn higher returns. If so, diversity would correlate with firm performance but not cause it.

THE CAUSALITY CONUNDRUM

Our goal should be to find a causal relationship between diversity and performance. Correlative results only show how one quantity, in this case profits or market share, varies systematically with another, in this case identity or cognitive diversity. Correlation need not imply causation. A person’s blood pressure at age fifty probably negatively correlates with having granite kitchen countertops. The granite itself does not cause a drop in blood pressure. A third attribute, income level, causes both. On average, higher-income people have lower blood pressure for reasons related to diet, stress, and exercise. They are also more likely to own homes with granite countertops.

Direct tests for causality entail manipulating the variable of interest. We would need to select a random subset of homes and replace granite with Formica in some and replace Formica with granite in others and then measure the effect of those changes on blood pressure.
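The granite example can be made concrete with a small simulation. The sketch below is a toy model, not an analysis of real data: the income distribution, the blood pressure formula, and the granite threshold are all assumed numbers chosen for illustration. Blood pressure depends only on income, yet granite and blood pressure correlate in the observational sample. When a coin flip assigns countertops instead, the correlation should shrink toward zero.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Return (observational correlation, randomized correlation)."""
    rng = random.Random(seed)
    observed, assigned, pressure = [], [], []
    for _ in range(n):
        income = rng.gauss(0.0, 1.0)  # standardized income (assumed scale)
        # Blood pressure depends on income and noise only; countertops never enter.
        pressure.append(130.0 - 5.0 * income + rng.gauss(0.0, 8.0))
        # Observational world: higher income makes granite more likely.
        observed.append(1 if income + rng.gauss(0.0, 1.0) > 0.5 else 0)
        # Experimental world: a coin flip decides who gets granite.
        assigned.append(1 if rng.random() < 0.5 else 0)
    return pearson(observed, pressure), pearson(assigned, pressure)


observational, randomized = simulate()
print(f"observational correlation: {observational:.2f}")
print(f"randomized correlation:    {randomized:.2f}")
```

Under these assumptions the observational correlation comes out clearly negative while the randomized one stays near zero, which is the sense in which manipulating the variable of interest separates correlation from causation.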

If we desired, we could perform the Formica experiment. Unfortunately, we cannot perform an analogous experiment on identity diversity in groups. We cannot manipulate a person’s race or gender and rerun a group problem-solving exercise. We would have to swap out the whole person, and that would manipulate more than race.

For this reason, direct testing of the causal effect of race or gender will always lie out of our grasp.11 We can, however, identify the causal effects of race or gender in discrimination by manipulating identity attributes on job and loan applications. Changing an applicant’s name from Emily or Greg to Lakisha or Jamal can result in up to a 50 percent reduction in the probability of a callback for a job.12

The impossibility of manipulating primary identity attributes does not rule out other approaches to identifying causal effects. Researchers often evaluate natural experiments, in which unexpected events cause identity compositions to change for a random sample. They can then compare that sample to the population not affected by the event.

The 2003 law requiring Norwegian boards of directors to be 40 percent female by 2008 is an example of a natural experiment. By most accounts, the new law was not anticipated. And, at the time the law passed, only 9 percent of Norwegian board members were women. Thus, we have a random change in gender composition that we can test for gender effects.

Kenneth Ahern and Amy Dittmar interpret that random change as a natural experiment to estimate gender effects. When the law passed, some firms had 0 percent female board representation, others had 10 percent, and others had 20 percent. Those firms with no women on their boards had to increase their percentage of women more than the firms that already had 20 percent women board members. The complete data set consists of more than two hundred firms, each of which had to add some percentage of women to comply with the law.
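The logic of this design can be sketched in a few lines. The code below is only an illustration of the comparison, not the authors' actual method or data: the six firms and their performance changes are invented, and the helper names `required_increase` and `ols_slope` are hypothetical. Each firm's "dose" is the gap between its initial share of women and the 40 percent quota, and the slope measures how the change in performance varies with that gap.

```python
def required_increase(initial_share, target=0.40):
    """Gap a board must close to reach the quota (zero if already compliant)."""
    return max(0.0, target - initial_share)


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Invented firms: (initial female board share, change in performance after the law).
firms = [
    (0.00, -0.06), (0.00, -0.04),
    (0.10, -0.03), (0.10, -0.01),
    (0.20, -0.02), (0.20, 0.01),
]

gaps = [required_increase(share) for share, _ in firms]
changes = [change for _, change in firms]
slope = ols_slope(gaps, changes)
print(f"slope of performance change on required increase: {slope:.3f}")
```

A negative slope in such a comparison would indicate that firms forced to change the most performed worst; a positive slope would indicate the opposite. The invented numbers here say nothing about which pattern holds in the Norwegian data.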

In an ideal natural experiment, the women who joined the boards would be identical to the male board members on other relevant characteristics such as experience levels or educational backgrounds. Unfortunately, the women were younger and had less experience. Half as many of the new women board members had been CEOs as the men they replaced. Differences in experience contribute to board dynamics and decision making. Furthermore, some appointees were family members of the owners, raising the possibility that some firms tried to skirt the law.13 One would expect that dismantling or expanding an existing board and adding younger, less experienced people, regardless of gender, would hurt performance in the short term.14

Though not perfect, as few natural experiments are, the Norwegian example provides a solid test case. If those firms that added relatively more women performed worse, then we can infer that adding women to boards hurt performance. If those firms that added more women performed better, then we can infer that adding women to boards improved performance. The evidence from the Norwegian case reveals a negative effect of gender diversity. The boards that most increased their gender diversity performed less well after the law was implemented. The decrease in return on equity was found to be as high as 20 percent for some firms.15 That finding has been corroborated in other studies.16

These findings should not be surprising, except to those people who believe that diversity bonuses occur by magic. To see why, we need to return to the logic. For cognitive diversity to produce a bonus, it must be germane to the task. That same logic applies to identity diversity. For women, by virtue of being women, to create immediate diversity bonuses, women’s repertoires—their knowledge, information, models, heuristics, and representations—would have to produce more accurate predictions, more creative ideas, better solutions to problems, or more comprehensive evaluations of projects. A female policy maker crafting legislation for education reform or a public health official developing wellness protocols to reduce health disparities may well possess information or knowledge that stems from her identity. A chemist studying amyloids may not.

With that logic in mind, we can return to the case at hand. Norway’s main industries are petroleum, natural gas, metals, fishing, pulp paper, chemicals, machinery, timber, textiles, and mining. Two features stand out. First, these are all capital-intensive industries competing in commodity markets. Unlike fast food, retail, tech, entertainment, or media, these industries cannot or do not roll out new product lines each quarter. Furthermore, companies that manufacture chemicals, drill for oil, harvest timber, and build machinery do not sell directly to consumers, nor do they need to understand the tastes of diverse consumers. With the possible exception of textiles, none of these industries would seem situated for identity diversity bonuses in their primary markets.

Gender would and surely does matter in these industries on operational, cultural, and strategic dimensions. Over the past decade, I have spent some time at a leading Scandinavian manufacturing company. A few decades ago the great majority of their engineers and management were native men. Now, they find that a representative percentage of the top students are recent immigrants and women.

The process of on-boarding young immigrants and women into a predominantly male firm involves complex personnel issues. Building an inclusive culture that enables women and the growing immigrant population, currently 12 percent of the Norwegian population, to contribute as well is, unlike building an oil platform, a task for which we might expect board gender diversity to add substantial value.

Women on boards could influence these long-term strategic and organizational actions. Even if they did, we would not expect an immediate positive effect. The increase in women on boards must have had some effect on action. Otherwise, the empirical differences in performance would be inexplicable. It just must be that in the short run, those actions hurt performance.

Analyses show that firms that added more women directors made more acquisitions and added more employees. Given that this occurred during an economic downturn, we should not be surprised by the negative market impact.17 Those actions can be read as evidence of poor monitoring or as prescient long-term strategic thinking.

The evidence that supports the poor-monitoring argument may have little to do with gender directly. Whether because of risk aversion or adherence to existing norms of what constituted a qualified candidate, boards overselected from a small set of women. One woman served on eighteen boards. Her effectiveness, as well as that of others, must have been compromised. Second, to comply with the law, boards either had to expand in size or remove existing members. On average, nearly a third of board members changed. That percentage turnover would disrupt any deliberative body.18

In sum, if we evaluate the Norwegian policy change objectively, we should not be surprised by the findings. The law implemented a bang-bang approach to increase the number of women on boards. The women appointed had less experience and were spread thin. Their addition disrupted boards during a downturn. And, finally, the primary industries affected do not jump to front of mind as those for which gender could produce immediate bonuses. Had the large increase in board gender diversity produced anything but an immediate downturn, it would have been astounding.

The Norwegian policy should be seen for what it is: the planting of a large, golden carrot for future consumption. Mid- and early-career Norwegian women now have greater opportunity. The guarantee of near-equal opportunity on boards will encourage them to build repertoires that make them valuable board members. In the long run, Norway will be able to draw from a larger talent pool and achieve diversity bonuses. Given seats of power and wealth, women may also diversify Norway’s economy into industries in which gender diversity would be expected to produce larger bonuses.

FIFTY YEARS OF EMPIRICAL RESEARCH

The next strand of research I cover consists of empirical studies of team performance that span more than fifty years. It encompasses experiments, case studies, and industry-level studies. While a few generalities emerge, this strand produces less clarity than would be expected given its scale and scope.

When evaluating any experimental study, we must take into account its replicability, which will depend on the size of the sample, the quality of the data, and the magnitude of the result. Small magnitude effect sizes found in small samples may be artifacts of time and place and not an empirical regularity. Even the most careful studies need not hold up in replication. A recent effort by 250 scholars to reexamine one hundred psychology experiments reproduced significant results in fewer than 40 percent of the replications.19

Replicability aside, there have been so many studies on diversity that we can still draw inferences. Much of that literature tests for direct effects of diversity. The literature, for the most part, distinguishes between informational, knowledge, and skill diversity and social category diversity. The first category corresponds to what I call cognitive diversity and the second to identity diversity. A typical study treats one of the two types of diversity as the independent variable and takes team performance, job turnover, or job satisfaction as the outcome variable.

The inseparability of identity and cognitive diversity implies that the same study can support both cognitive and identity-based diversity bonuses. For example, one study involving 699 participants found that a measure of collective intelligence predicts group performance in solving difficult problems better than the team members’ individual IQs, a finding that corroborates the “no test exists” claim for problem solving.20 In that study, team success correlates with the number of women. But that effect is largely washed out when one accounts for the ability to read emotions, a cognitive skill.

The complexities of interpersonal behavior imply that nearly anything can and does happen. One study might show that newly formed teams fail to create sufficient trust and thus perform poorly. Another study might find that people in long-standing teams think alike and lose the potential benefits of diversity. We are left with evidence showing that newly formed teams perform worse than established teams and that they also perform better.

Nevertheless, generalities do emerge from the data, and they align with the main theoretical threads developed here. First, as we would expect from the theory, identity diversity does not improve performance on routine tasks.21 That finding, though negative, aligns with what the logic implies.

Second, both cognitive and identity diversity increase perspective taking, which correlates with but does not guarantee better group performance. The experiments, on the whole, show that cognitive and identity diversity produce more, though not necessarily better, solutions and that cognitive diversity improves outcomes when making predictions and solving problems.22 Evidence from crowdsourced innovation sites shows that cognitively diverse communities often solve problems that perplex groups of experts.23

As mentioned, one study finds that gender diversity improves performance on difficult tasks and does so through improved communication and social perceptiveness.24 The best teams consist of a mix of men and women.25 Findings on the effects of racial and ethnic diversity are more mixed. Our analysis would suggest they might have a negative effect when coordination plays the dominant role and a positive effect on more creative and innovative tasks.

Industry studies find that increasing social category diversity correlates with higher job turnover. And some studies find that too much diversity of any kind can hinder performance on almost any task, at least initially.26

Another conclusion that jumps out from the thousands of studies of diverse groups is that all types of diversity have costs. Cognitive and identity diversity create challenges. A diversity of perspectives or models can produce misunderstandings. Identity diversity can undermine trust, personal validation, and commitment to a group’s goal. It can result in less communication and engagement. All of these effects make managing diverse groups a challenge.

Given how much sand diversity tosses into the gears, we might expect that diverse groups would always perform worse than homogenous groups. The fact that diverse groups often perform better should thus be seen as especially strong evidence of bonuses. One might even claim that when an experiment yields equal performance between an identity-diverse group and a homogenous group, the results do not reject diversity bonuses, because without some bonuses, the diverse group would have performed worse.

That inference becomes even stronger when one takes into account the fact that a majority of the experimental papers and many of the observational studies analyze groups meeting for the first time. Effective diverse groups need time to gel.27 Effective teams consist of people who believe that diversity will improve outcomes and therefore validate each person’s membership in the team.28 All of these conditions held for the teams in the Netflix Prize competition.

Some of the most convincing evidence comes from studies of predictive tasks. Here, I highlight two studies that reveal diversity bonuses. The first study concerns a forecasting contest run by the Intelligence Advanced Research Projects Activity from 2011 to 2014. More than twenty-five thousand forecasters, who collectively made more than a million predictions, participated. Many of the forecasts concerned international politics: Would Vladimir Putin remain in power? Would North Korea test nuclear weapons? Would Scotland leave the United Kingdom?29

Unlike in the Netflix Prize competition, participants did not work from a common data set. They relied on their own knowledge and on qualitative models to make probabilistic estimates. They did not, as a rule, construct empirical models fitting parameters to data. The Good Judgment Project headed by Barb Mellers and Phil Tetlock won the tournament. The most accurate individuals were 36 percent more accurate than random. With training, some of these participants could increase their accuracy to 41 percent better than random.

After the first year, the researchers identified a set of sixty superforecasters. The superforecasters were found to have high fluid intelligence. They could recognize patterns, solve logic problems, and reason from data better than most people.30

In the second year, they randomly assigned these superforecasters to five teams of size twelve. When formed into teams, these superforecasters shared more information and articles than members of other teams. These teams of superforecasters then possessed even more information and knowledge and engaged with more predictive models. These teams subsequently performed 66 percent better than random and significantly better than teams comprising the top individuals not categorized as superforecasters.31 The increase from 41 to 66 percent resulted from the team being able to tap into their diversity.

The second study consists of a meta-analysis of twenty-eight thousand predictions on six economic indicators by professional economists. The mean prediction of all forecasters was 21 percent more accurate than a randomly chosen forecaster and 10 percent more accurate than the best individual forecaster up to that time. Averaging the prediction of the six most accurate forecasters to date resulted in predictions 25 percent more accurate than an average forecaster and 15 percent more accurate than the best forecaster.32

Without knowledge of the diversity prediction theorem, these results would be counterintuitive. By adding in the predictions of the second-, third-, fourth-, fifth-, and sixth-best forecasters, who are demonstrably less accurate, we improve on the best forecaster. This can only happen if those other forecasters add diversity. They do. And they produce a substantial diversity bonus.
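The diversity prediction theorem states that the squared error of the average prediction equals the average individual squared error minus the diversity of the predictions, where diversity is the average squared distance of each prediction from the average prediction. A minimal sketch follows. The six forecasts and the true value are invented for illustration and are not taken from the study of economists; the function name `diversity_prediction` is likewise hypothetical.

```python
def diversity_prediction(forecasts, truth):
    """Return (crowd squared error, average squared error, diversity)."""
    n = len(forecasts)
    crowd = sum(forecasts) / n
    crowd_error = (crowd - truth) ** 2
    average_error = sum((f - truth) ** 2 for f in forecasts) / n
    # Diversity: how far, on average, forecasts sit from the crowd's forecast.
    diversity = sum((f - crowd) ** 2 for f in forecasts) / n
    return crowd_error, average_error, diversity


# Invented example: the best forecaster misses by 3, the others miss by more.
truth = 100.0
forecasts = [103.0, 95.0, 106.0, 93.0, 108.0, 91.0]

crowd_error, average_error, diversity = diversity_prediction(forecasts, truth)
best_error = min((f - truth) ** 2 for f in forecasts)

print(f"crowd error:             {crowd_error:.3f}")
print(f"average error:           {average_error:.3f}")
print(f"diversity:               {diversity:.3f}")
print(f"average error-diversity: {average_error - diversity:.3f}")
print(f"best individual error:   {best_error:.3f}")
```

In this invented case the less accurate forecasters err in different directions, so their errors partly cancel and the average beats even the best individual. Had they all erred in the same direction as the best forecaster, diversity would be low and averaging them in would make the prediction worse.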

THE TEAM, THE TEAM, THE TEAM

The first strand of data shows correlative evidence of greater diversity at leading firms. The second strand, consisting of thousands of studies, paints a mixed, though broadly supportive, picture. In the past decade or so a third strand of literature on team performance has emerged that provides some of the strongest evidence of diversity bonuses.33

This strand leverages enormous data sets based on academic research, patents, economic forecasts, and returns on equity funds. The data sets encompass tens of millions of academic papers, millions of patents, and thousands of teams of portfolio managers. The tasks covered in these studies are challenging: advancing knowledge, coming up with innovative ideas, forecasting economic growth, and managing an equity fund.

These studies show substantial benefits to teams and significant contributions to team success attributable to diversity. The undeniable success of teams contradicts a widespread belief that scientific, technological, and artistic breakthroughs originate from the minds of singular geniuses. As John Steinbeck writes in East of Eden, “Nothing was ever created by two men. There are no good collaborations, whether in music, in art, in poetry, in mathematics, in philosophy.”34 While Steinbeck and others can point to Isaac Newton, Thomas Edison, Marie Curie, Albert Einstein, and Wolfgang Amadeus Mozart, data show these great minds to be the exceptions, not the rule, particularly in recent times.

To be fair, Steinbeck was writing in 1952, before Francis Crick and James Watson uncovered the structure of DNA, before Lennon and McCartney redefined popular music, before Steve Jobs and Steve Wozniak changed the meaning of the word apple, before Ben and Jerry mixed up Cherry Garcia, and before Sergey Brin and Larry Page launched Google. And, in further defense of Steinbeck, he did say “two men,” allowing for the collaboration between Marie and Pierre Curie and leaving open the possibility of Patti Smith and Bruce Springsteen.

Then again, maybe we should be less generous. Steinbeck surely knew of the Wright brothers. He also must have been aware that even history’s most lauded individuals had assistants. Raphael and Michelangelo worked with teams of assistants. Marcel Grossmann helped with the foundational math for Einstein’s general relativity theory.35 Steinbeck’s contemporaries F. Scott Fitzgerald, Thomas Wolfe, and Ernest Hemingway all benefited from the wisdom of the same editor, Maxwell Perkins, whose sharp pencil improved on their prose.36

This flurry of anecdotes is backed by overwhelming aggregate evidence. Studies of patents reveal that teams, not individuals, dominate and that the notion of the “heroic lone inventor” lacks empirical support.37 The same holds for academic research. The most influential papers are written by teams.38

Furthermore, the teams that make the most significant scientific advances, construct the investment portfolios that generate the highest returns, and construct the most accurate predictive models are not arbitrary assemblages. They agree on a mission. Their members trust one another enough to challenge ideas. And, most important, they are diverse, both cognitively and in their identities.39 They consist of people who bring diverse experiences, perspectives, knowledge, and training.40

Unlike the teams thrown together in experiments, these teams work together for sustained periods of time. They therefore develop trust. And they play for real. They race for patents, they compete for academic prestige, and they try to make the most money.

Studies of the academy, research labs, and the financial world reveal strong support for diversity bonuses. In science and engineering research, 90 percent of published papers are team efforts. In social science, 60 percent of papers are coauthored.41 Similarly, more than half of all patents are now written by teams.42 Last, more than three-fourths of equity mutual funds are managed by teams.43

The data on academic papers show the dominance of teams within subfields as well (see figure 5.1). Social science can be divided into 54 subfields. In every one, coauthored papers outnumber single-authored papers. Science and engineering papers can be divided into 171 subfields. Coauthored papers outnumber single-authored papers in 170 of these subfields. In medical research, the ratio of coauthored to single-authored papers exceeds three to one. The same subfield dominance holds for patents. Stefan Wuchty, Benjamin F. Jones, and Brian Uzzi evaluate the more than two million patents issued by the United States since 1975.44 Teams predominate in all thirty-six categories of patents.

Figure 5.1  Percent of Categories in Which Teams Predominate (Wuchty, Jones, and Uzzi, “Increasing Dominance of Teams”)

Data from a National Academy of Sciences report on team-based science show the marked increase in coauthored papers from 1960 to 2014 (see figure 5.2).45 A similar trend can be found in investment teams. Twenty-five years ago more than two-thirds of equity funds were managed by individuals; now more than 70 percent are managed by teams.

Teams Win

The growth of teams requires an explanation given that teams cost more money and take more time to reach decisions because they require coordination. We can thus infer that teams would not be so prevalent if they did not outperform individuals. Direct comparisons show that to be the case. Teams win.

In science and engineering and the social sciences, coauthored papers earn more citations. This is also true within subfields. Coauthored papers have higher average citations in every one of the 54 social science subfields, and in 167 of the 171 scientific subfields. The same holds for patents. Team-authored patents earn more citations overall and do so in thirty-two of the thirty-six patent categories.46

For patents written between 1986 and 1995 from inventors based in the United States, a sample with more than half a million patents, citations increase with team size. Solo-authored patents earn on average nine and a half citations, patents written by teams of size three receive twelve and a half, and patents with more than six authors receive on average more than seventeen citations.47 Few people coauthor for the fun of it. They coauthor to produce better research.48
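The size of these gaps is easier to see as relative gains. The short Python sketch below only restates the averages quoted above; treating "more than seventeen" as exactly 17.0 is an assumption made for illustration, so the last figure is a lower bound.

```python
# Average citations per patent by team size, as quoted in the text.
# "More than seventeen" is entered as 17.0, an assumption for illustration.
avg_citations = {
    "solo": 9.5,
    "team of three": 12.5,
    "more than six": 17.0,
}


def relative_gain(team_avg: float, solo_avg: float) -> float:
    """Return the fractional increase of team_avg over solo_avg."""
    return team_avg / solo_avg - 1.0


for label, avg in avg_citations.items():
    gain = relative_gain(avg, avg_citations["solo"])
    print(f"{label}: {avg} citations, {gain:.0%} above solo")
```

On these figures, a three-person team earns about 32 percent more citations than a solo inventor, and a team of more than six earns at least about 79 percent more.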

Figure 5.2  Trend in Coauthored Papers in Social Sciences and Science and Engineering (National Research Council, Enhancing the Effectiveness of Team Science)

These findings describe averages. Analysis of the best papers and patents shows that teams also perform best. Team-authored papers are four and a half times more likely to receive more than one hundred citations (a common benchmark for excellence) in both science and engineering and the social sciences. Team-authored papers are more than six times as likely to earn one thousand citations in science and engineering.49 Team-authored patents are 28 percent more likely than sole-authored patents to be in the top 5 percent and 9 percent less likely to receive no citations.50

The data from equity fund managers also show that teams win. Accounting for risk, funds run by three people outperform funds run by a single individual by about 60 basis points.51 This fact goes a long way toward explaining why teams, not individuals, now run most funds.
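A basis point is one one-hundredth of a percentage point, so 60 basis points is 0.6 percentage points. The sketch below shows how such an edge compounds. It assumes the 60 basis points apply to annual returns, and the 5 percent baseline return, the $10,000 stake, and the twenty-year horizon are all hypothetical values chosen for illustration, not figures from the studies cited.

```python
def basis_points_to_fraction(bp: float) -> float:
    """Convert basis points to a decimal fraction (1 bp = 0.01 percent)."""
    return bp / 10_000


def final_value(principal: float, annual_return: float, years: int) -> float:
    """Compound a principal at a fixed annual return for a number of years."""
    return principal * (1 + annual_return) ** years


# Hypothetical inputs, not taken from the text.
BASE_RETURN = 0.05   # assumed annual return of a solo-managed fund
PRINCIPAL = 10_000   # assumed initial investment in dollars
YEARS = 20           # assumed holding period

edge = basis_points_to_fraction(60)  # the team advantage quoted in the text
solo = final_value(PRINCIPAL, BASE_RETURN, YEARS)
team = final_value(PRINCIPAL, BASE_RETURN + edge, YEARS)

print(f"edge per year: {edge:.2%}")
print(f"solo fund after {YEARS} years: {solo:,.0f}")
print(f"team fund after {YEARS} years: {team:,.0f}")
```

Under these assumptions, the gap between the two final values widens every year, which is why a difference that looks small in a single year matters to long-term investors.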

Diversity and Team Performance

The data leave little doubt that teams are outperforming individuals in the academy, in scientific research, and on Wall Street. The question remains as to whether we can attribute any part of that team success to cognitive diversity. The increase in team-based work and team success could be the result of teams taking on bigger projects. One person cannot build and operate a particle collider. Bigger teams can dissect more sea slug brains, run more model specifications, and conduct more experiments. Teams might be winning because of scale, not cognitive diversity.

We can start with qualitative accounts that support diversity bonuses. Steve W. J. Kozlowski and Bradford S. Bell, in a survey of studies of team-based work, note the value of a diversity of “skills, expertise, and experience.”52 The quantitative evidence is also compelling. High-impact papers include deep, diverse thinkers.53 Jones, Uzzi, and Wuchty’s analysis of collaborations among researchers, which uses the same data set of nearly twenty million papers, finds that two scientific researchers employed at different institutions have a 7 percent greater chance of writing a high-impact paper than two researchers who work at the same institution. Two social scientists from different schools have a nearly 12 percent greater chance than two colleagues from the same school.54

Richard Freeman and Wei Huang study one and a half million scientific papers written from 1985 to 2008 in the United States. They find that the number of citations a paper receives increases with the number of authors, the number of e-mail addresses with distinct institutional domains, and the number of references contained in the paper: more authors from more schools result in more citations. Freeman and Huang also find that increased ethnic diversity of authors correlates with more citations, controlling for other factors.55

The fact that people work at different schools or come from different ethnic backgrounds does not ensure cognitive diversity, though it’s more likely, a point discussed in previous chapters. A more direct approach for identifying cognitive diversity relies on the references in patents and papers. References provide a proxy for the knowledge domains of a paper’s authors. If a paper references species-area laws from biology, we can assume that an author knows or at least speaks to the literature on that topic. If a paper references papers that use some analytic tool, say spectral analysis, we can assume an author knows that tool.

Two other studies use this approach and find evidence consistent with the claim that the best teams consist of accomplished diverse thinkers. One study examines the 5,529,055 patents filed between 1976 and 2006 in the United States.56 It develops a proximity measure for each patent based on whether it cites patents from common or uncommon combinations of categories. The most-cited patents have low proximity, that is, they reference rare combinations. The same technique applied to a sample of nearly two million papers shows that the best papers also have low proximity.

The second study applies random sampling to predict the likelihood of each pair of references appearing in the same paper.57 It is more likely that two papers on social behavior in communities of naked mole rats would be cited together than it is that either paper would be cited with a paper on fast oscillations in cortical-striatal networks in the brain. The analysis assigns a conventionality score to each pair of papers in a reference list. This captures the likelihood that the two papers would be cited together. Each paper can then be assigned a median conventionality, taken across all pairs of papers in its list of references.

A paper is classified as conventional if its median lies in the upper half of all papers. This divides the twenty million papers into two groups of approximately equal size: those with high median conventionality of pairs and those with low median conventionality of pairs. For each paper, it is also possible to calculate the conventionality threshold for the bottom 10 percent of its pairs of references, that is, the score below which its least conventional pairs fall. Papers with a lower-than-average 10 percent threshold are classified as novel, as those papers cite pairs of papers that are infrequently cited together.

These two categorizations bin the papers into four types: conventional and novel, conventional but not novel, novel but not conventional, and neither conventional nor novel.58 Conventional and novel papers combine atypical pairings with conventional pairings. An economics paper that contained forty standard economics references and twenty references from theoretical biology would fall into that group. The many pairs of references from economics and pairs from biology would be likely. The combinations of economics and biology references would be unlikely.
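The two-step classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code. The pair scores are invented toy numbers, the paper names are hypothetical, and the cutoffs follow one reading of the text: a paper is conventional if its median pair score exceeds the median across papers, and novel if its 10 percent threshold falls below the average across papers.

```python
from statistics import median, quantiles


def paper_stats(pair_scores):
    """Return (median, 10th-percentile) conventionality of a paper's reference pairs."""
    deciles = quantiles(pair_scores, n=10, method="inclusive")
    return median(pair_scores), deciles[0]


def classify(papers):
    """Map each paper name to a (conventional, novel) pair of booleans."""
    stats = {name: paper_stats(scores) for name, scores in papers.items()}
    # "Upper half of all papers": compare with the median of per-paper medians.
    median_cut = median(m for m, _ in stats.values())
    # "Lower-than-average 10 percent threshold": compare with the mean threshold.
    tail_cut = sum(t for _, t in stats.values()) / len(stats)
    return {
        name: (m > median_cut, t < tail_cut)
        for name, (m, t) in stats.items()
    }


# Toy conventionality scores for each paper's reference pairs (invented values;
# higher means the two references are more commonly cited together).
toy_papers = {
    "econ_plus_biology": [9, 9, 9, 9, 8, 8, 8, 1, 1, 1],
    "standard_econ":     [9, 9, 9, 9, 9, 9, 8, 8, 8, 8],
    "scattered":         [4, 4, 4, 3, 3, 3, 1, 1, 1, 1],
    "middling":          [5, 5, 5, 5, 5, 4, 4, 4, 4, 4],
}

for name, (conventional, novel) in classify(toy_papers).items():
    print(f"{name}: conventional={conventional}, novel={novel}")
```

In this toy data, the paper that mixes many familiar pairings with a handful of rare ones is the only one flagged as both conventional and novel, mirroring the economics-and-biology example.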

For each of five decades, the researchers compute the probability of a paper from each group becoming a hit paper. They define hit papers as those that fall in the top 1 percent, 5 percent, or 10 percent of papers by citations. All fifteen analyses (five decades, three thresholds) produce similar results: Papers that are either novel or conventional but not both become hits at slightly above expected rates: 1.2 percent of novel but not conventional papers in the 1950s fall in the top 1 percent, as do 1.2 percent of conventional but not novel papers published in the 1990s. Papers that are neither conventional nor novel become hits at about half their expected rates.

Papers in the novel and conventional category produce hits at twice the expected rate. In every decade, more than 2 percent of novel and conventional papers belong to the top 1 percent of papers, almost 10 percent belong to the top 5 percent, and about 18 percent belong to the top 10 percent. A similar analysis of a selection of highly cited papers in social science finds that atypical combinations increase the likelihood of a hit.59 We might replace the words conventional and novel with coherent and diverse. We can then restate their result as follows: the best papers address existing literatures and combine diverse ideas.
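The phrase "twice the expected rate" is a ratio: if group membership did not matter, 1 percent of any group would land in the top 1 percent, 5 percent in the top 5 percent, and so on. The Python sketch below computes that ratio from the shares quoted above. The shares are rounded approximations of the text's "more than 2," "almost 10," and "about 18" percent, so the output is indicative only.

```python
def lift(observed_pct: float, threshold_pct: float) -> float:
    """Ratio of a group's observed hit rate to the rate expected by chance."""
    return observed_pct / threshold_pct


# Percent of novel-and-conventional papers in each top tier
# (rounded from the text; keys are the tier thresholds in percent).
novel_and_conventional = {1: 2.0, 5: 10.0, 10: 18.0}

for threshold, observed in novel_and_conventional.items():
    print(f"top {threshold}%: {lift(observed, threshold):.1f}x expected")

# Papers that are novel or conventional but not both: 1.2 percent in the top 1 percent.
print(f"single-category papers, top 1%: {lift(1.2, 1):.1f}x expected")
```

The ratios come out at roughly 2.0, 2.0, and 1.8 for papers that are both novel and conventional, against 1.2 for papers with only one of the two properties.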

That same analysis examines the isolated effect of adding authors, taking into account the number and diversity of papers. It finds that adding authors decreases the likelihood of a hit paper. All else being equal, adding an author makes the paper less successful. Working in teams can be difficult. Miscommunications can arise. Coauthors can differ in writing style. Though significant, that effect is small compared to the bonus created by more and new ideas.

We might expect that more authors would imply more citations because people cite their friends’ and colleagues’ work. They may, but whatever additional citations arise through friendships are more than wiped out by the difficulty of working in a team. The sizes of the teams that write papers and patents and manage equity funds and the diversity of those teams are chosen strategically. Teams may add or drop members along the way to improve performance. So, though these studies reveal benefits from diversity, they do not imply, much less prove, that more diversity is always better. Nor did the logic presented earlier. Too much diversity can cause problems.

Data support that intuition. An analysis of the publication records of recipients of National Science Foundation grants, using the number of publications as a measure, finds that adding more researchers increases the number of publications.60 However, as the number of authors increases, the addition of people from other institutions and other disciplines decreases the number of publications. This interaction effect is small relative to the direct effect of larger teams. Nevertheless, the finding aligns with the intuition that too much diversity can hinder large teams.

Taken as a whole, the third strand of studies provides powerful evidence of the effectiveness of diverse teams. In evaluating any data, we should consider the possibility of selection bias. Team-based work surely exhibits selection bias. Teams choose people whom they expect to make the team better. They do not select randomly. Nor should they: randomly selected teams would not perform well. What should perform well are mindfully constructed diverse teams of talented people. And they do; they significantly outperform talented individuals who work alone.

SUMMARY

As a rule, we place greater weight on empirical evidence than on theoretical claims. And, taken in its entirety, the evidence weighs strongly in favor of diversity bonuses in many contexts. The success and growth of teams, together with the corresponding indications that those teams leverage diverse knowledge bases and tools, provide compelling evidence that cognitive diversity contributes to team success.

In some cases, diversity bonuses have to exist. On predictive tasks, the only way not to have a diversity bonus is to have no diversity. It is therefore the magnitude and not the existence of the bonus that catches our attention. The 5, 10, and 20 percent improvements in the accuracy of economic forecasts are meaningful, as are 60 basis point increases in returns.

Even in those cases in which we do not see bonuses (most notably the lack of immediate gender diversity bonuses in the Norwegian board case), we find support for the core logic. None of the major industries jump to mind as domains in which gender might produce immediate gains.

Finally, to tee up the final chapters, I reiterate that these data reveal where we are now. They come from segregated societies with histories of exclusion. With practice guided by theory and evidence, diverse teams should produce larger bonuses. Many of these results come from teams working without rudder or compass.
