3

EXPERTISE

From Machine Learning to Machine Teaching

In Human + Machine, we identified novel roles and jobs that grow from human-machine partnerships in what we called the missing middle—new ways of working that are largely missing from economic research and reporting on jobs. With increasingly sophisticated AI technologies that enable human-machine collaborations, developing the missing middle has become one of the key components of innovation.

Among the six human-machine hybrid activities we identified, three entail machines augmenting humans by (1) amplifying our powers, as in providing otherwise unattainable data-driven insights; (2) interacting with us through intelligent agents; and (3) embodying us, as with robots that extend our physical capabilities. In turn, humans complement machines by (1) training them, as in labeling data for machine learning systems; (2) explaining them, to bridge the gap between technologists and business leaders; and (3) sustaining them by ensuring that AI systems are functioning properly, ethically, and in the service of humans, rather than the other way around. To those latter three, we now add a fourth: teaching machines by endowing them with the experience of experts.

In the new world of humans teaching machines, the people who ultimately help your organization wring the maximum competitive advantage from AI will not be data scientists or computer engineers or AI vendors. All of those roles will remain relevant, but the real difference-makers for your business will be the domain experts in your organization. Machine teaching can unleash the expertise of people at all levels of the organization and greatly multiply its value as you reimagine business processes around the new possibilities that top-down AI opens up.

Three Dimensions of Expertise

In 2015, the experimental sound artist Holly Herndon released Platform, the then 35-year-old’s second full-length album. It was something of a break-up record—with technology. Or at least a certain approach to technology. Known as “laptop girl” for performing on stage with her computer as her instrument, she had used software to manipulate her voice on her first album. But on Platform—the title refers to the internet—she extended those techniques to create, in part, a critique of technology’s potential to dominate humans.

She’s no Luddite. In the ensuing four years, she earned a PhD at Stanford University’s Center for Computer Research in Music and Acoustics (the home of, among other things, the Stanford Laptop Orchestra, founded in 2008). With her partner Mat Dryhurst and computer artist Jules LaPlace, Herndon built a gaming PC that houses an artificial neural network. They called it their “AI baby,” christened it “Spawn,” and used it to help make her 2019 album Proto. The title alludes to the protocols of algorithms.

But instead of simply training Spawn on a vast set of vocal samples, Herndon and her human bandmates taught it through call-and-response singing sessions. Their style harked back to folk music from around the world, including Tennessee, where Herndon sang in her church’s choir as a child. On Proto, Spawn’s sounds are married with the sounds of the human ensemble. The result is something new: “a choir of women’s voices, harmonizing and ricocheting in counterpoint that could be Balkan or extraterrestrial,” as the chief pop music critic for the New York Times put it in a rapturous review.1

“The whole album is drawing on various folk traditions, and a lot of it has to do with the individuals who were in the ensemble and the different experiences they have,” Herndon told the BBC. “I got really interested in this idea of all these different vocal traditions that happen all around the world as this almost inherent human technology inside of us.”2

That contrasts sharply with approaches to musical AI trained on samples of a musical genre or a particular artist (even dead ones) and then switched on to automatically generate “new” music in that style. Says Herndon, “I don’t want to recreate music; I want to find a new sound and a new aesthetic. The major difference is that we see Spawn as an ensemble member, rather than a composer. Even if she’s improvising, as performers do, she’s not writing the piece. I want to write the music!”3

Herndon is onto one of the most radically human turns of all: from machines “learning” by processing mountains of data to humans teaching machines based on human experience and expertise. Machine teaching represents the next logical step on the path of human-machine collaboration: humans tutoring machines rather than training them only; leveraging top-down human expertise, not just bottom-up machine empiricism; and imposing natural intelligence directly onto AI. “Machine learning is all about algorithmically finding patterns in data,” says Gurdeep Pall, Microsoft’s corporate vice president of business AI. “Machine teaching is about the transfer of knowledge from the human expert to the machine learning system.”4

Machine teaching includes three distinct areas of human expertise that AI has long struggled to incorporate: professional experience, collective social experience, and personal experience (the innate and acquired individual abilities of human beings). For Herndon, these three areas are, respectively, her professional musical experience, the social context of folk traditions and ensemble singing, and her personal ability to compose. The result: genuine, highly specific innovation that competitors can’t duplicate.

Machine teaching isn’t confined to the far corners of experimental techno-artpop. In pathbreaking approaches to AI, companies and researchers are reimagining the role of professional, social, and personal experience in AI. These pioneers are finding new ways to build human professional experience into value-creating systems unique to their businesses. They’re immersing powerful systems in heretofore elusive collective social systems like natural language, consumer perceptions of style, and intricate webs of other intelligent agents. And they are finding ways to leverage innate and acquired human abilities that give new life to the notion of “putting the human in the loop.” Along with the radically human turn in intelligence and data, this new emphasis on professional, collective, and personal expertise—the E in IDEAS—opens entirely new avenues of innovation across industries.

Professional Expertise: Making AI Innovation Business Specific

Consider Microsoft’s ambitious machine teaching efforts, begun more than a decade ago and now beginning to come to fruition. Their goal is to make it easier for workers of all kinds to use AI tools for their express purposes—to let them, in effect, “write the music” themselves. Developers or subject matter experts with little AI expertise, such as lawyers, accountants, engineers, nurses, or forklift operators, can impart important abstract concepts to an intelligent system, which then performs the machine learning mechanics in the background.

Someone who understands the task at hand decomposes the problem into smaller parts and sets up rules and criteria for how the autonomous device should operate. Then, using simulation software, the expert provides a limited number of examples—the equivalent of lesson plans—that help the machine-learning algorithms solve the problem. If the device consistently makes the same mistake, additional examples can be added to the digital curriculum. “It’s not randomly exploring, it’s exploring in a way that’s guided by the teacher,” says Mark Hammond, Microsoft general manager for business AI.5 Once the curriculum is in place, the system automates the process of teaching and learning across hundreds or thousands of simulations at the same time.
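The teaching loop just described (train on the expert’s examples, probe for consistent mistakes in simulation, and extend the curriculum where the learner stumbles) can be sketched in a few lines of Python. The class and function names here are ours and purely illustrative; they are not Microsoft’s actual machine teaching API:

```python
class MemorizingLearner:
    """Toy learner: answers from taught examples, says 'unknown' otherwise."""
    def __init__(self):
        self.rules = {}

    def fit(self, curriculum):
        self.rules.update(curriculum)

    def predict(self, situation):
        return self.rules.get(situation, "unknown")


def simulate(learner, probes):
    """Return teacher corrections for every probe the learner gets wrong."""
    return {x: y for x, y in probes.items() if learner.predict(x) != y}


def teach(learner, curriculum, probes, rounds=5):
    """Curriculum-style teaching loop: train, check simulated behavior,
    and add teacher-supplied examples wherever mistakes persist."""
    for _ in range(rounds):
        learner.fit(curriculum)
        corrections = simulate(learner, probes)
        if not corrections:
            break  # the learner has mastered the lesson plan
        curriculum.update(corrections)  # extend the digital curriculum
    return learner
```

The point of the sketch is that exploration is guided: the teacher’s examples, not random search, steer what the learner attends to next.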

The principal program manager for the Microsoft Machine Teaching Group, Alicia Edelman Pelton, offers the simple example of a company that wants to use AI to scan through all its documents to find out how many quotes that were sent out resulted in a sale.6 First, the system has to be able to distinguish a contract from an invoice from a proposal and so on. In all likelihood, no labeled training data exists, especially if different salespeople do things differently. Under a pure machine learning regime, the company would have to outsource the job of creating training data, sending thousands of sample documents and detailed instructions to the vendor, whose army of labelers would need months to complete the task. Once the company was sure the data was free of errors, it would need a high-priced, hard-to-find machine learning expert to build a model. And if salespeople started using formats the machine wasn’t trained on, its performance would deteriorate.

But with machine teaching, someone inside the company—a salesperson or other experienced employee—would identify the defining features of a quote and keywords like “payment terms.” The expert’s language would be translated into language the machine could understand, and a preselected algorithm would perform the task. Thus, using in-house experts, companies can use machine teaching to rapidly build customized solutions.
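In rough terms, the expert’s contribution is a handful of defining features per document type. Here is a minimal sketch, with invented keyword lists standing in for the expert’s language:

```python
def classify_document(text, concepts):
    """Score a document against expert-defined concepts: each concept maps
    a document type to the keywords an experienced employee says define it.
    (Illustrative sketch; the keyword lists are invented.)"""
    text = text.lower()
    scores = {doc_type: sum(kw in text for kw in kws)
              for doc_type, kws in concepts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


# Hypothetical expert-supplied features, e.g. "payment terms" for a quote
CONCEPTS = {
    "quote":    ["payment terms", "valid until", "quoted price"],
    "invoice":  ["invoice number", "amount due", "remit to"],
    "proposal": ["scope of work", "objectives", "deliverables"],
}
```

A real machine teaching system would translate the expert’s concepts into features for a preselected learning algorithm rather than match keywords literally, but the division of labor is the same: the expert supplies the defining features, the machinery does the rest.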

Innovating in Industrial Settings

Scores of organizations are now trying Microsoft’s machine teaching software.7 Delta Air Lines is testing whether the technology can improve baggage handling. Schneider Electric, the venerable Paris-based multinational provider of energy management and automation solutions, wants to see how it works with heating and cooling controls for buildings. Carnegie Mellon University used it to run a mine exploration robot that won a DARPA challenge.

Microsoft is not alone. Amazon and Google are also working on machine teaching techniques that would enable engineers without AI expertise to program complicated AI models. Amazon’s SageMaker Autopilot, for example, can be used by people without machine learning experience to easily produce a model. Google is teaching its AI to learn how humans talk by monitoring the way our faces move when we lip-sync a song. This is possible without any specialist motion tracking hardware—only a phone’s camera.8

Teaching a machine what a human expert would do in the face of high uncertainty and little data can beat data-hungry approaches for designing and controlling many varieties of factory equipment. Siemens is using top-down AI to control the highly complex combustion process in gas turbines, where air and gas flow into a chamber, ignite, and burn at temperatures as high as 1,600°C.9 The volume of emissions created and ultimately how long the turbine will continue to operate depends on the interplay of numerous factors, from the quality of the gas to air flow and internal and external temperature.

Using bottom-up machine learning methods, the gas turbine would have to run for a century before producing enough data to begin training. Instead, Siemens researchers Volkmar Sterzing and Steffen Udluft used methods that required little data in the teaching phase for the machines. The monitoring system that resulted makes fine adjustments that optimize how the turbines run in terms of emissions and wear, continuously seeking the best solution in real time, much like an expert knowledgeably twirling multiple knobs in concert.

Who Counts as an Expert?

Experts can be found at all levels in an organization, as a recent experiment our firm conducted with medical coders demonstrates.10 In healthcare, medical coders (not to be confused with programmers who write computer code) analyze individual patient charts and translate complex information about diagnoses, treatments, medications, and more into alphanumeric codes that are then submitted to billing systems and health insurers for payment and reimbursement.

The medical coders in our experiment, all of them registered nurses, already had experience with AI, as it was used to scan charts and find links between medical conditions and treatments and suggest proper codes. We wanted to see if it was possible to transform these medical coders into AI teachers, enriching the system with their knowledge and improving its performance.

The coders could review the links within the knowledge graphs where there was disagreement between human coders and the AI in determining the relationships between nodes of the graph (symptom X is associated with condition Y, for instance). Based on their expertise, the coders could directly validate, delete, or add links and provide a rationale for their decisions, which would later be visible to their coding colleagues. In addition, they were encouraged to follow their inclination to use Google (often with WebMD) to research drug-disease links, going beyond what they regarded as the existing AI’s slow look-up tool.

This overlay of human expertise has a significant multiplier effect. Instead of merely assessing single charts, the medical coders added medical knowledge that affects all future charts. Further, with the AI taking on the bulk of the routine work, the need for screening of entire medical charts is greatly reduced, freeing coders to focus on particularly problematic cases. Meanwhile, data scientists are freed from the tedious, low-value work of cleansing, normalizing, and wrangling data.

In the new system, coders were encouraged to focus less on volume of individual links and more on instructing the AI on how to handle a given drug-disease link in general, providing research when required. Links could now be considered for addition to the knowledge graph AI with a lesser burden of quantitative evidence. The AI would learn more regularly and dynamically, especially about rare, contested, or new drug-disease links.

In their new roles, the coders quickly came to see themselves not just as teachers of the AI, but as teachers of their fellow coders. Most importantly, they saw that their reputations with other members of the team would rest on their ability to provide solid rationales for their decisions. They spoke often of the importance of those rationales to the confidence of a subsequent medical coder encountering an unfamiliar link.

The medical coders also indicated that they felt more satisfied and productive when executing the new tasks, using more of their knowledge and acquiring new skills to help build their expertise. They also felt more positive about working with AI on a daily basis.

Making in-house subject matter experts and their experience the driving forces behind AI offers numerous advantages. By transforming people who are not data scientists into AI teachers, like our medical coders, companies can apply and scale the vast reserves of untapped expertise unique to their organizations at every level. Instead of having experienced people remain passive consumers of AI outputs, they become creators of AI. Instead of extracting knowledge from data alone, they put their specialized knowledge to full use. That knowledge includes not only their functional and domain expertise, but their fine-grained understanding of the business itself: how it makes money, how it competes, and where it could be improved.

Collective Expertise: Teaching AI Social Contexts

Humans operate, often effortlessly, in collective and social contexts of immense complexity. These contexts overlap and interpenetrate and are constantly evolving on short and long timescales. When we maneuver a car through an urban environment, we are negotiating a dense web of social systems: We’re processing and anticipating the movements of other vehicles and the intentions of their drivers. We’re reading the body language of pedestrians. We’re following (and maybe bending) the formal rules of the road and engaging in the informal ones embedded in our culture. For instance, in some cultures, flashing your headlights on and off means you are yielding to another vehicle; in other cultures, it means you’re coming through and the other vehicle damn well better give way.

With language, we negotiate the innumerable complexities and nuances, including formal rules, informal usage, slang, colloquialisms, jokes, tone, style, diction, and—sometimes most important of all—what goes unsaid. In matters of etiquette, we respond to well-established social cues. With art, we identify styles in works of the imagination. Any intelligible inference, prediction, action, or utterance by an individual is necessarily situated in social contexts. As Aristotle put it in Politics, society precedes the individual.

The Wisdom of Crowds + Machine

On May 4, 2019, North Korea’s wildly unpredictable leader Kim Jong Un launched his country’s first missile test in seven months. Except, in this instance, his action was correctly predicted by a group of ordinary civilians interacting with an AI system. These prescient human “forecasters” are part of a joint project between IARPA, a research arm of the US government intelligence community, and the University of Southern California’s Viterbi Information Sciences Institute (ISI).

IARPA stands for the Intelligence Advanced Research Projects Activity. Staffed by spies and PhDs, its mission is to provide decision makers with accurate predictions of geopolitical events. Since 2017, it has been working with USC’s ISI on a project called SAGE: Synergistic Anticipation of Geopolitical Events.11 The goal is to generate forecasts from the combination of humans + AI that are more accurate than the predictions of a human expert or a machine.

Forecasting geopolitical events is notoriously difficult. Experts’ predictive accuracy, tracked over time, has been shown to be comparable to a random guess.12 One way to improve forecasts is to crowdsource them, aggregating a large number of human forecasts into a single estimate of probability. This “wisdom of crowds” approach, first detailed in James Surowiecki’s 2004 book of the same name, holds that large groups of people outperform small, elite groups of experts at solving problems, making wise decisions, and predicting the future.13 In other words, collective expertise in the broadest sense can sometimes be superior to highly specific individual expertise. At the same time, advances in machine learning have led to models that produce fairly reasonable forecasts for a number of tasks. The SAGE project combines the power of crowdsourcing with advances in AI—hence the term “synergistic” in its name—to generate more accurate predictions than either method could on its own.

More than 500 of ISI’s publicly recruited forecasters have made predictions on more than 450 questions pertaining to geopolitics, sports, medicine, climate, and more. The forecasters select what they’d like to predict from a set of quantitative and qualitative questions. For instance, a quantitative question might read: “What will be the daily closing price of Japan’s Nikkei 225 index on [this date]?” A qualitative question might be: “Will Pakistan execute or be targeted in an acknowledged national military attack before [this date]?”

For quantitative questions, time series data can be used to give the human forecaster a sense of a base rate (how often the historical value has fallen within each answer option) and to give the machine model historical data to create a time series forecast. For qualitative questions like political disruptions, where historical data is rarely available to generate a time series forecast, the system obtains a base rate by mapping the question to a framework for similar historical events.
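The mechanics of the quantitative case, deriving a base rate from history and then blending human and machine estimates, can be illustrated simply. The 50/50 blending weight below is our assumption for illustration, not SAGE’s actual aggregation rule:

```python
def base_rate(history, bins):
    """Fraction of historical values falling in each answer bin; this is
    the base rate shown to forecasters as a starting point (sketch only)."""
    counts = [sum(lo <= v < hi for v in history) for lo, hi in bins]
    total = sum(counts) or 1
    return [c / total for c in counts]


def hybrid_forecast(crowd_probs, machine_probs, weight=0.5):
    """Blend the aggregated human forecast with the machine model's
    forecast; the equal weighting is an illustrative assumption."""
    return [weight * c + (1 - weight) * m
            for c, m in zip(crowd_probs, machine_probs)]
```

For example, if the crowd puts 60 percent probability on the first answer option and the time-series model puts 20 percent, an equal-weight blend lands at 40 percent; tuning that weight by question type is where the “synergy” lives.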

In addition to making predictions based on information provided by the machine learning methods, users can interact with fellow forecasters on discussion boards and comment on forecast results. In this respect, it differs from traditional crowdsourcing, which captures input from group members, who do not communicate, and then statistically analyzes the aggregate results.

In a competition held to test the accuracy of forecasting systems, SAGE was tested against two competing systems throughout 2019. All three systems were given the same set of more than 400 forecasting questions. SAGE won.

The wisdom of crowds also gives Tesla a big advantage in the race to develop self-driving technology—a half million drivers teaching its Autopilot feature to get better. Each of those cars, with more being added every day, is connected to the internet. Combined, they are driving around 15 million miles a day, or more than 5.4 billion miles a year, collecting vast amounts of camera and other sensor data, even when Autopilot is not engaged. The data is uploaded to Tesla so that its neural network can learn how humans drive and directly predict the correct steering, braking, and acceleration in virtually any situation. Meanwhile, most competitors have to accumulate real-world miles the hard way—having (and paying) human safety drivers riding along in test cars.

The Hive Mind of Humans + Machine

What knowledge, then, can be received from the collective?

Bustle Digital Group, the largest publisher for millennial women, wanted to predict Christmas sales for eight women’s sweaters from a major fashion retailer.14 It turned to a platform developed by Unanimous AI called Swarm to leverage the expertise, intuition, and experiential knowledge of a group of randomly selected millennial women who self-identified as being fashion-conscious and having no sales forecasting experience. Using their own personal computers, the participants connected remotely to the Swarm platform and were quickly taught how to use it. They were then asked to predict the retailer’s relative unit sales of eight women’s sweaters during the upcoming holiday season. The participants gave assessments first as individuals using an online survey and then by “thinking together” as an AI-optimized system using Swarm.

As the name of the platform implies, it’s leveraging the concept of “swarm intelligence,” a natural phenomenon in which groups of organisms appear to exhibit collective intelligent behavior without a central control mechanism. Birds flocking, bees swarming, fish schooling, and colonies of ants are all examples of collectives that work in unison to efficiently converge on optimal solutions to complex problems. Swarm intelligence has been known for decades, but only recently has it been joined with AI, with particular hope that in the future, robot swarms can perform dangerous or difficult tasks such as search-and-rescue, undersea mapping, cleaning up toxic spills, and more.

In the Bustle example, the swarm algorithms evaluated complex collective actions, reactions, and interactions of the fashion-conscious participants in real time. By relying on observable behaviors rather than participants’ self-reported feelings, the system produced an optimized sales ranking of the items from one to eight, as well as a scaled ranking that showed the relative spacing between each. The top three sweaters were rated as having broad appeal and being moderately to very trendy.

Ultimately, using Swarm, Bustle was able to predict two of the three top sellers from the group of eight. Significantly, the three items ranked highest by Swarm outsold the bottom three by a factor of 1.5—a remarkable result, given that the only differences between the items were color and graphic treatments. In addition, on ratings of trendiness, breadth of appeal, and sales forecast, the Swarm predictions were significantly more predictive of actual unit sales volume than the traditional survey, explaining 34 percent of the variance compared with only 4 percent for the survey.

In another example of the Swarm AI platform, Stanford researchers leveraged the expertise from a group of radiologists who assessed chest radiographs for pneumonia, which, because it looks like other diseases, is particularly difficult to diagnose using images alone.15 Every X-ray was examined in real time with the individual radiologists contributing their opinions. Throughout the session, each participant could manipulate an icon to express to the other participants how strongly they felt about their position at any time, while the algorithms inferred participants’ confidence based on the relative motions of their icons. In the end, Swarm was 33 percent more accurate than individual radiologists and 22 percent more accurate than a Stanford machine-learning program that had previously bested radiologists.

In other industries, swarm technologies have been used to amplify the intelligence of networked teams to produce more accurate forecasts and better decisions. For instance, groups of financial traders were asked to forecast the weekly trends of four widely followed market instruments (SPX, GLD, GDX, and crude oil) over a period of nineteen consecutive weeks.16 When predicting weekly trends on their own, individual forecasters averaged 56.6 percent accuracy. When predicting together as real-time swarms they increased their accuracy to 77 percent. Further, if the group had invested on the basis of these swarm-based forecasts, they would have netted a 13.3 percent return on investment (ROI) over the nineteen weeks, compared to the individuals’ 0.7 percent ROI.

Teaching Natural Language Processing Systems in Real Time

Zendesk isn’t a really chill office desk or an alternative band. It’s a customer service software provider that serves 150,000 customers in 160 countries and territories. As the company was building this global footprint, its huge volume of support documents for its offerings was written in English only. So the company’s support team set a goal of having all of the articles translated into five target languages. After some initial investigation, they soon found themselves up against the dilemma faced by many enterprises (and governments) needing to translate mountains of complex documents: Using human translators yields high-quality translations but is prohibitively expensive; using machine translation is cost-effective, but yields low-quality results. Further, even highly talented human translators face the challenge of understanding industry-specific language. And for humans and machines alike, translation requires the ability to comprehend not one, but two, huge, constantly evolving linguistic social systems.

To complicate matters for Zendesk, their articles ranged from extremely popular, high-traffic content to seldom-viewed articles, all of which needed continual updating. In Lilt, a translation services provider, they found a partner whose AI-assisted translation combined the efficiency and cost-effectiveness of an advanced neural machine translation (MT) model with the expertise of human translators.

In simple terms, the end-to-end translation pipeline works like this:17 Zendesk designates which of its documents need a human-in-the-loop translator and which can go straight to the neural model for purely machine translation and then uploads them to Lilt. To facilitate the transfer of content, the system is connected directly to Zendesk’s content management system. The documents that need a human touch are routed to a translator aligned with the client’s industry. The translator uses the neural machine translator as a prompt for her translation but has the final say in the finished product—incorporating Zendesk’s specific vocabulary and adding the nuanced contextual understanding that only a human can provide. Further, the translator’s polishing of the machine translator’s prose teaches the machine how to improve the quality of the suggestions it subsequently shows to all human translators in Lilt. The human + machine translations and pure machine translations are sent back to Zendesk and also stored in Lilt’s centralized translation memory so they can be rapidly updated in the future.
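Stripped to its essentials, that routing-and-feedback flow looks something like the following sketch (the function names are illustrative, not Lilt’s API):

```python
def translate_pipeline(docs, mt_translate, human_post_edit, memory):
    """Sketch of a human-in-the-loop translation pipeline in the spirit
    of the Zendesk/Lilt flow: each document is routed either straight
    through machine translation or through a human editor, and the
    editor's corrections are stored so future suggestions improve."""
    results = {}
    for doc_id, (text, needs_human) in docs.items():
        # Reuse a past human-approved translation if one exists
        draft = memory.get(text) or mt_translate(text)
        if needs_human:
            final = human_post_edit(draft)  # translator has the final say
            memory[text] = final            # feedback teaches the engine
        else:
            final = draft
        results[doc_id] = final
    return results
```

In the real system, the corrections update the neural model itself, not just a lookup table; the dictionary here simply stands in for that feedback channel.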

The real-time teaching of the machine translator by the human translator continually improves throughput. Further, because the machine translator engine is taught through translation feedback in real time, there’s no need to externally retrain the engine. Most importantly, machine teaching imbues the translation with the social context for which language is a principal medium. The machine translator learns the context of a word and its nuanced meaning in each individual document, industry, and dialect (such as European French versus Canadian French).

Machine teaching also reduces the time and expense of machine training. Because a model taught on a specific industry or dialect by knowledgeable humans is more effective than a generalist approach, Lilt is able to train machine translation tools with up to 400 times less data than tools that leverage large, generalist datasets. Focused on a specific industry or dialect, the model doesn’t waste its early learning on less applicable words and phrases, though it gradually builds a more general understanding over time. This contextual learning, rather than generalist, data-hungry correlation, allows the Lilt translation tool to deliver specific, inexpensive, and accurate translations by learning from its human teachers.

Shopping in Style on Etsy

At Etsy, the online marketplace for vintage and handmade goods, the motto is “keep commerce human.” And it was to humans they turned first when they wanted to teach their search engine how to recognize what is the crux of many purchasing decisions—aesthetic style.18 When considering an item for purchase, buyers look not only at functional aspects of an item’s specification like its category, description, price, and ratings, but also at its stylistic and aesthetic aspects. Like language and the other collective contexts we have been discussing here, styles exist in a constantly evolving social context that humans take for granted and AI struggles with.

For Etsy, capturing style is particularly challenging. Unlike mass produced goods, which can be easily classified, most of the items listed for sale on Etsy are one-of-a-kind homemade creations. Many buyers and sellers, though they know what they like, are unable to adequately articulate their notions of style. Many items may borrow from a number of styles or exhibit no strong style at all. And there are some 50 million items on offer at any given time.

In the past, style-based recommendation systems produced unexplainable style preferences for groups of users. The AI assumed that two items must be similar in style if they were frequently purchased together by the same group of users. Another approach used low-level attributes like color and other visual elements to group items by style. Neither method has been able to understand how style affects purchase decisions.

Who better to school AI in subjective notions of style than Etsy’s merchandising experts?

Based on their experience, the merchandisers developed a set of forty-two styles that captured buyers’ taste across Etsy’s fifteen top-level categories from jewelry to toys to crafts.19 Some are familiar from the art world (art nouveau, art deco). Some evoke emotions (fun and humor, inspirational). Some refer to lifestyles (boho, farmhouse) and some to cultural trends (scandi, hygge). They even produced a list of 130,000 items distributed across their forty-two styles.

Etsy’s technologists then turned to buyers who tend to use search terms related to style like “art deco sideboard.” For each such query, Etsy assigned that style name to all the items the user clicked on, “favorited,” or bought during that search.

From just one month of such queries, Etsy was able to collect a labeled dataset of 3 million instances against which to test its style classes. Etsy then trained a neural network to use textual and visual cues to best distinguish between the forty-two style classes for each item.
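The weak-labeling step, in which a style name is assigned to every item a user engaged with during a style-related search, can be sketched like this (an illustration of the idea, not Etsy’s production pipeline):

```python
def label_from_queries(query_log, style_names):
    """Weak labeling in the spirit of Etsy's approach: when a search query
    mentions a known style (e.g. "art deco sideboard"), assign that style
    to every item the user clicked, favorited, or bought during the search.
    query_log is a list of (query, engaged_items) pairs."""
    labeled = []
    for query, engaged_items in query_log:
        for style in style_names:
            if style in query.lower():
                labeled.extend((item, style) for item in engaged_items)
    return labeled
```

The resulting (item, style) pairs form exactly the kind of labeled dataset the passage describes, ready to test the forty-two expert-defined style classes against a neural classifier.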

The result was style predictions for all 50 million active items on Etsy.com. Items from a single shop as well as items purchased or favorited by a user tend to have similar style. Etsy also found that incorporating a style module into the site’s recommender system increased revenue.

In another test, the Etsy team quantified “strength of style”—how intensely an item exhibits one of the styles. For example, a piece of wall art that depicts an anchor, a sailboat, and a whale strongly represents the nautical style. Items with a strong style outperform items whose style is weak or spread across several styles at once. Says Mike Fisher, Etsy’s CTO, “We can help sellers determine whether they have a strong style or not.”20
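One simple proxy for that strength-of-style score—our illustration, not Etsy’s published metric—is the top probability the style classifier assigns to an item:

```python
def style_strength(style_probs):
    """Score how intensely an item exhibits a single style.
    1.0 means all predicted probability sits on one style class;
    a low top probability means the style is spread across classes."""
    return max(style_probs)

nautical_art = [0.92, 0.05, 0.02, 0.01]   # anchor, sailboat, whale: strongly nautical
eclectic_item = [0.30, 0.28, 0.22, 0.20]  # no dominant style

print(style_strength(nautical_art))   # 0.92
print(style_strength(eclectic_item))  # 0.3
```

Other peakedness measures, such as the entropy of the predicted distribution, would serve the same purpose.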

The style classes also track seasonality well. For example, sales of items deemed tropical peaked in the summer months; romantic items surged around Valentine’s Day and other holidays associated with gift-giving. Inspirational style flourished during May and June, when many students graduate. The warm and cozy farmhouse style prospered in the fall, peaking in November. Sellers might use this information to tailor their products to appropriate styles at different times of the year.

When the pandemic struck and the supply chains of mass retailers broke down, many buyers turned to Etsy—the company’s revenues doubled to $10 billion, and its market value rose to $25 billion.21 One of the hottest selling items? Masks tailored to the aesthetic sensibilities of customers. Sales of masks went from virtually nothing at the beginning of April 2020 to $740 million the rest of the year, allowing buyers to find one, said Etsy CEO Josh Silverman, “that expressed their sense of taste and style.” Buyers, he said, “discovered you can keep commerce human.”22

Personal Expertise: Inherent Human Technology

For decades, AI researchers have struggled with how to imbue machines with the basic building blocks of human intelligence. But as we said in chapter 1, the human turn in intelligence is not about recreating human consciousness. Instead, it’s about solving problems by mimicking the most powerful cognitive characteristics of humans and supplementing them with the most powerful abilities of computers.

The radically human turn in personal expertise is about directly leveraging, not mimicking, the innate and acquired intelligence of humans—to augment AI. This can be a more subtle kind of teaching, a kind where tacit skills—some that the teacher may not even know they possess—are transferred to a learning system.

Putting the Whole Person in the Loop

In traditional human-in-the-loop (HITL) machine learning, people train, tune, and test algorithms. They label the data; the machine learns to make decisions or predictions based on the data; the humans tune the algorithm, score its outputs, and try to improve its performance—all in a virtuous circle. Though effective in training machine learning models, this narrow definition of HITL doesn’t begin to encompass all the rich possibilities of machine teaching.
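That virtuous circle can be sketched in a few lines. Everything here—the toy model, the stand-in “human,” the stopping rule—is a hypothetical illustration of the pattern, not any production system:

```python
class MajorityModel:
    """Toy model: predicts whichever label it has seen most often."""
    def fit(self, labeled):
        labels = [y for _, y in labeled]
        self.guess = max(set(labels), key=labels.count)
    def predict(self, items):
        return [self.guess for _ in items]

class HumanTeacher:
    """Stand-in for the person in the loop: labels data, scores output."""
    def label(self, x):
        return "small" if x < 6 else "large"
    def review(self, preds):
        # Toy scoring rule: fraction of predictions the reviewer accepts.
        return preds.count("small") / max(len(preds), 1)

def human_in_the_loop(model, pool, human, rounds=3, batch=4):
    labeled = []
    for _ in range(rounds):
        chunk, pool = pool[:batch], pool[batch:]
        labeled += [(x, human.label(x)) for x in chunk]  # 1. humans label data
        model.fit(labeled)                               # 2. machine learns from it
        if not pool or human.review(model.predict(pool)) >= 0.95:
            break                                        # 3. humans score; stop when good
    return model

model = human_in_the_loop(MajorityModel(), list(range(12)), HumanTeacher())
print(model.guess)
```

The point of the sketch is the shape of the loop—label, learn, score, repeat—not the trivial model inside it.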

Like many robotics companies during the pandemic, Kindred AI saw its development rapidly accelerated by the pressing need for warehouse robots. But while many such companies were stalled by the need to perfect their AI before they could widely deploy their robots, Kindred’s insertion of a “whole person” in the loop from the start enabled the company to ramp up quickly to meet the needs of one of its first significant customers, the Gap clothing retailer.

As Covid-19 spread across North America, Gap was forced to close many of its stores, including its Old Navy and Banana Republic outlets. Online orders skyrocketed, but there were fewer warehouse workers to fulfill them due to social distancing measures the company had put in place. For Kindred, that meant that what had been a pilot program at the Gap in 2019 suddenly turned into an order for 106 of the company’s eight-foot-tall robot stations.23

Powered by reinforcement learning and proprietary grasping technology, Kindred’s “SORT” machines help assemble multi-item orders. Items from a customer’s checkout cart slide down a chute into a basin where a robotic arm, equipped with suction and a physical grip, scans each item’s bar code and places it in a nearby bin. Once all the items in the order have been placed in a bin, a worker puts it on a conveyor for packing and delivery.

What distinguishes Kindred from other warehouse robot providers is its “robot pilots.” They’re stationed in a Toronto office monitoring fleets of robots and teaching them best practices.24 A stereo camera on the robotic arm lets the pilots monitor the robots’ behavior. When the robot makes a mistake or is stumped about how to grasp an item or where to place it, the pilot steps in to guide it. The machine begins to learn the pattern, receiving a reward signal each time it succeeds, eventually achieving a level of proficiency that no longer requires human assistance.
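That teach-by-intervention pattern—act when confident, defer to the pilot when stumped, reinforce on success—can be sketched in miniature. Kindred’s actual system is proprietary; the classes, the confidence rule, and the success check below are all illustrative assumptions:

```python
class GraspPolicy:
    """Toy tabular policy: tracks a value estimate per (state, action)."""
    def __init__(self, actions):
        self.actions, self.q = actions, {}
    def best_action(self, state):
        scores = [self.q.get((state, a), 0.0) for a in self.actions]
        return self.actions[scores.index(max(scores))], max(scores)
    def reinforce(self, state, action, reward):
        key = (state, action)
        self.q[key] = self.q.get(key, 0.0) + reward

class Pilot:
    """Stand-in for the human robot pilot's demonstration."""
    def demonstrate(self, state):
        return "suction" if state == "flat_item" else "grip"

def pick(policy, pilot, state, confidence=0.5):
    action, score = policy.best_action(state)
    if score < confidence:                 # the robot is stumped...
        action = pilot.demonstrate(state)  # ...so the pilot steps in to guide it
    success = action == pilot.demonstrate(state)  # toy stand-in for a real pick
    policy.reinforce(state, action, 1.0 if success else 0.0)
    return action

pilot = Pilot()
policy = GraspPolicy(["grip", "suction"])
first = pick(policy, pilot, "flat_item")   # pilot guides the first attempt
second = pick(policy, pilot, "flat_item")  # now the policy acts on its own
print(first, second)  # suction suction
```

After the reward signal accumulates, the policy no longer needs the pilot for that situation—the proficiency handoff the Kindred example describes.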

Early on, the company modified the movements of the arm and gripper so that it could learn to reach and grasp with the same fluidity that pilots can. Picking things up with ease is one of those skills people exercise almost unthinkingly, like riding a bike, touch typing, or singing. They are part of our “inherent technology”—what the philosopher Michael Polanyi dubbed “tacit knowledge.”

Tacit knowledge, which is hard to verbalize, contrasts with explicit knowledge, which we can easily verbalize or write down. “We know more than we can tell,” as Polanyi put it, which is why explicitly transferring tacit knowledge to another person or a machine is so challenging. Kindred’s pilots need only be able to show the robots how to do something, not tell them. When Gap put in its rush order in May 2020, Kindred deployed the additional robots at four of Gap’s fulfillment centers around the country in a matter of weeks, months ahead of schedule. If online demand contracts once the brick-and-mortar stores fully reopen, Kindred’s Smart Robots as a Service (SraaS) model, where customers pay per pick, enables easy scaling down of capacity. In addition, the pilot plus reinforcement learning model enables the system to be quickly repurposed to operate anywhere unstructured sets of small components need to be arranged, and the company is now looking toward the automotive, electronics, and manufacturing industries.

More-Efficient Knowledge Transfer

Thanks to machines that can absorb the tacit knowledge of practitioners and experts alike, future systems will require far less data for their construction and training, enabling them to capture the specialized knowledge of experts. It’s a turn from data-hungry AI to more data-efficient AI, from systems built from the bottom up to systems finessed from the top down.

At a competition organized by the University Hospital of Brest and the Faculty of Medicine and Telecom Bretagne in Brittany, France, competitors vied to see whose medical imaging system could most accurately recognize which tools a surgeon was using at each instant in minimally invasive cataract surgery.25

The winner was an AI machine vision system trained in six weeks on only fifty videos of cataract surgery—forty-eight operations by a renowned surgeon, one by a surgeon with one year of experience, and one by an intern. Accurate tool recognition systems enable medical personnel to rigorously analyze surgical procedures and look for ways to improve them. Such systems have potential applications in report generation, surgical training, and even real-time decision support for surgeons in the operating room of the future.

Harnessing Expertise in Your Organization

Machine teaching will take its place alongside the other six human-machine hybrid activities we identified in Human + Machine, keeping humans securely in the driver’s seat while transforming AI into an even more powerful engine of innovation. Further filling out the “missing middle,” the human turn from machine learning to machine teaching will create a variety of new, satisfying human-centered jobs. Most important for companies, machine teaching unleashes the often-untapped expertise throughout the organization, allowing a much broader swath of your people to use AI in new and sophisticated ways. Because machine teaching is customizable for your business situation, it opens the way to real innovation and advantage—you no longer are simply playing technology catch-up. In supervised learning scenarios, machine teaching is particularly useful when little or no labeled training data exists for the machine learning algorithms—as it often doesn’t because an industry or company’s needs are so specific.

To get the greatest value out of both systems and knowledge workers, organizations will need to reimagine the way specialists as well as nonspecialists interact with machines. You can begin by imbuing your domain experts with the digital fluency (detailed in chapter 6) to efficiently combine their expertise with company processes and technology. Such fluency will also equip them to develop creative ways to apply AI to the business. At the same time, companies should recognize the potential of people at all levels of the organization to be experts. Like our medical coders, AI-empowered personnel can transform tedious, low-level tasks into high-value knowledge work, increase employee engagement, and ease the burdens on your data scientists.

Meanwhile, the ease and efficiency with which AI techniques can harness the collective knowledge of human groups opens up new competitive possibilities for companies across the board—from industrial giants like Tesla to purveyors of fashion goods like Bustle. Think of it as the difference between market research, which seeks empirical data and observations about what people want, and the market directly teaching your products and services how to behave.

Finally, don’t hesitate to put humans in the loop to directly impose their innate and acquired human abilities onto AI systems. A robot pilot, for example, or a human translator doesn’t represent a failure of AI; it represents the highest and best use of human and machine, the former providing the near infinite and unsayable nuance of what we know and the latter supplying a superhuman efficiency. That is what it means for humans and machines to meet—and powerfully merge their abilities—in a middle soon to be no longer missing.

You should now have a fix on the three basic building blocks of AI-enabled business innovation in a world where innovation is being turned upside down: intelligence (I), data (D), and expertise (E). The remaining challenge is to tie them together through systems architecture (A), which produces innovation across the enterprise—the challenge to which we turn in chapter 4.
