Chapter 20

Distancing Through Differencing: An Obstacle to Organizational Learning Following Accidents

Richard I. Cook

David D. Woods

The future seems implausible; the past seems incredible.

Woods & Cook (2002)

Introduction

A critical component of high resilience in organizations is continuous learning from events, ‘near miss’ incidents, and accidents (Weick et al., 1999; Ringstad & Szameitat, 2000). As illustrated by the many cases referenced in this book, incidents and failures provide information about the resilience or brittleness of the system in the face of various disruptions. This chapter explores some of the barriers that can limit learning even by generally very high quality organizations.

Despite terrible consequences, accidents, as fundamentally surprising events, offer an opportunity for learning and change as there is a profound sense among many that the usual concepts and policies are insufficient to cope with what has happened (Woods et al., 1994). The immediate aftermath of a serious failure produces an atmosphere of inquiry and frees up resources normally dedicated to production, which are refocused on the accident and its consequences. The lines that divide participants, management, regulators, and victims from each other are momentarily thin. As one National Transportation Safety Board observer put it, “when the vividness of the tragedy is fresh in everyone’s mind, and the broken wreckage is still smoldering … people … have only one pressing goal, and that is to determine precisely what happened and see that it does not happen again” (NTSB, Accident Investigation Symposium, 1983, page 8). This period of cooperation and focus makes it possible to ask questions that are not usually asked, gather data not usually gathered, and probe issues not usually open to inquiry.

Not all stakeholders are in the same position relative to the surprising event. Some are closer; others more distant in knowledge, point of view, and in experiencing the consequences. Distance from the work context where an accident occurs appears to alter what is learned from incidents and accidents. Those people who are at the epicenter of high consequence accidents are usually devastated and entirely caught up in the consequences and reactions to failure (Dekker, 2003b). Conversely, people who are far distant from the epicenter are too divorced from the complex context of technical work and accept skeletal descriptions of the events that reach them (the first stories in Cook et al., 1998). As a result, those distant from the technical work area tend to fall back on oversimplified sterile responses.

But near the epicenter are people who both have a detailed understanding of the context of work and are sufficiently distant from the consequences. Because their technical work corresponds closely to conditions of work at the epicenter, the event has direct relevance. Because they are some distance from the epicenter, their attention is not captured by the need to react to the event itself and they have an opportunity to extract deeper information (they can begin to explore the second stories of Cook et al., 1998), i.e., to learn about how safety is created.

Barriers to Learning

Learning in the aftermath of incidents and accidents is extraordinarily difficult because of the complexity of modern systems. Layers of technical complexity hide the significance of subtle human performance factors. Awareness of hazard and the consequences of overt failure lead to the deployment of (usually successful) strategies and defenses against failure. These efforts create a setting where overt failures only occur when multiple small faults combine. The combination of multiple contributors and hindsight bias makes it easy for reviewers after the fact to identify an individual, group or organization as a culprit and stop. These characteristics of complex systems tend to hide the real characteristics of systems that lead to failures.

When an organization experiences an incident, there are real, tangible and sometimes tragic consequences associated with the event which create barriers to learning:

•  the negative consequences are emotional and distressing for all concerned,

•  failure generates pressure from different stakeholders to resolve the situation,

•  a clear understandable cause and fix helps stakeholders move on from a tragedy, especially when they continue to use or participate in that system,

•  managing financial responsibility for ameliorating the consequences and losses from the failure,

•  desire for retribution from some stakeholders and processes of defense against punitive actions,

•  confronting dissonance and changing concepts and ways of acting is painful and costly in non-economic senses.

In this chapter we present a case study of learning from incidents. Analysis of the case reveals a discounting or distancing process whereby reviewers focus on differences, real and imagined, between the place, people, organization and circumstances where an incident happens and their own context. By focusing on the differences, they see no lessons for their own operation and practices or only narrow, well-bounded responses. We call this pattern distancing through differencing.

Examining how this particular organization struggled to recognize and overcome distancing through differencing also provides useful insights on how to support the organizational learning process.

An Incident

A chemical fire occurred during maintenance on a piece of process machinery in the clean room of a large, high technology product manufacturing plant. The fire was detected and automatically extinguished by safety systems that shut off the flow of reactants to the machine.

The reactant involved in the fire was only one of many hazards associated with this expensive machine and the machine was only one of many arranged side by side in a long bay. Operation and maintenance of the machine also involved exposure or potential exposure to thermal, chemical, electrical, radio frequency, and mechanical hazards. Work in this environment was highly proceduralized and the site had repeatedly undergone ISO 9000 certification and review. Both the risks of accident and the high value of the machine and its operation had generated elaborate formal procedures for maintenance and required two workers (buddy system) for most procedures on the machine.

The manufacturer had an extensive safety program that required immediate and high level responses to an incident such as this, even though no personal injury occurred and damage was limited to the machine involved. High level management directed immediate investigations, including detailed debriefings of participants, reviews of corporate history for similar events, and a ‘root cause’ analysis. Company policy required completion of this activity within a few days and formal, written notification of the event and related findings to all other manufacturing plants in the company. The cost of the incident may have been more than a million dollars (and the plant’s score card suffered significantly).

Two things prompted the company to engage outside consultants for a broader review of the accident and its consequences. First, a search for prior similar events in the company files discovered a very similar accident at a manufacturing plant in another country earlier in the year. Second, a few weeks earlier one of the authors (RIC) had been in the plant to study the use of a different machine where operator ‘error’ seemed prevalent but only with economic consequences. He had identified a systemic trap in this other case and provided some education about how complex systems fail. During that visit, he pointed out how other systemic factors could contribute to future incidents that threatened worker safety in addition to economic losses and suggested the need for broader investigations of future events.

Following the incident the authors returned, visited the accident scene, and debriefed the participants in the event and those involved in its investigation. They studied operations involving the machine in which the fire occurred. They also examined the organizational response to this accident and to the prior fire.

Organizational Learning in this Case

The obstacles to learning from failure are nearly as complex and subtle as the circumstances that surround a failure itself. Because accidents always involve multiple contributors, the decision to focus on one or another of the set, and therefore what will be learned, is largely socially determined.

In the incident just described, the formal process of evaluating and responding to the event proceeded along a narrow path. The investigation concentrated on the machine itself, the procedures for maintenance, and the operators who performed the maintenance tasks. For example, the investigators identified the fact that the chemical reactant lines were clearly labeled outside the machine but not inside it where the maintenance took place. These local deficiencies were corrected quickly. In a sense, the accident was a ‘normal’ occurrence in the company; the event was regretted, undesirable, and costly but essentially the sort of thing for which the company’s incident procedures had been designed and response teams created. The main findings of this formal, internal investigation were limited to these rather concrete, immediate, local items.

A broader review, conducted in part by outsiders, was based on using the specific incident as a wedge to explore the nature of technical work in context and how workers coped with the significant hazards inherent in the manufacturing process. This analysis yielded a different set of findings regarding both narrow human engineering deficiencies and organizational issues. In addition to the relatively obvious human engineering deficiencies in the machine design discovered by the formal investigation, the event pointed to deeper issues that were relevant to other parts of the process and other potential events.

There were significant limitations in procedures and policies with respect to operations and maintenance of the machine. For example, although there were extensive procedural specifications contained in maintenance ‘checklists’, the workers had been called on to perform multiple procedures at the same time and had to develop their own task sequencing to manage the combination. Similarly, although the primary purpose of the buddy system was to increase safety by having one worker observe another to detect incipient failures, it was impossible to have an effective buddy system during critical parts of the procedures and parts of this maintenance activity. Some parts of the procedures were so complex that one person had to read the sequence from a computer screen while the other performed the steps. Other steps required the two individuals to stand on opposite sides of the machine to connect or remove equipment, making direct observation impossible.

Surprisingly, the formal process of investigating accidents in the company actually made deeper understanding of accidents and their sources more difficult. The requirement for immediate investigation and reporting contributed to pressure to reach closure quickly and led to a quick superficial study of the incident and its sources. The intense concern for ‘safety’ had led the company to formally lodge responsibility for safety in a specific group of employees rather than the production and maintenance workers themselves. Treating safety as an abstract goal generated the need for these people as a separate entity within the company. These ‘safety people’ had highly idealized views of the actual work environment, views uninformed by day to day contact with the realities of clean room work conditions. These views allowed them to conceptualize the accident as flowing from the workers rather than the work situation. They were captivated in their investigation by physical characteristics of the workplace, especially those characteristics that suggested immediate, concrete interventions that could be applied to ‘fix’ the problems that they thought led to the accident.

In contrast, the operators regarded the incident investigation and proposed countermeasures as derived from views that were largely divorced from the realities of the workplace. They saw the ‘safety people’ and their work as being irrelevant. They delighted in pointing out, for example, how few of them had any practical experience with working in the clean room. Privately, the workers said that production pressures were of paramount importance in the company. This view was communicated clearly to the workforce by multiple levels of management. Only after accidents, they noted, was safety regarded as a primary goal; during normal operations, safety was always a background issue, in contrast to the primary need to maintain high rates of production. The workers themselves internalized this view. There were significant incentives provided directly to workers to obtain high production and they generally sought high levels of output to earn more money.

During the incident investigation, it was discovered that a very similar incident had occurred at another manufacturing plant in another country earlier in the year, a precursor event or rehearsal from the point of view of this manufacturing facility. Every incident within the company, including the previous overseas fire, was communicated to safety people and then on to other relevant parties. However, the formal report writing and dissemination about this previous incident had been slow and incomplete, relative to when the second event occurred. Part of the recommendations following from the second incident addressed faster production and circulation of reports (in effect, increasing the pressure to reach closure when investigating incidents).

Interestingly, the relevant people at the plant knew all about the previous incident as soon as it had occurred through more informal communication channels. They had reviewed the incident and noted many features that were different from their plant (non-US location, slightly different model of the same machine, different safety systems to contain fires). The safety people consciously classified the incident as irrelevant to the local setting, and they did not initiate any broader review of hazards in the local plant. Overall they decided the incident “couldn’t happen here.”

This is an instance of a discounting or distancing process whereby reviewers focus on differences, real and imagined, between the place, people, organization and circumstances where an incident happens and their own context. By focusing on the differences, they see few or no lessons for their own operation and practices.

Notice how speeding up formal notification does nothing to enhance what is learned and does nothing to prevent or mitigate discounting the relevance of the previous incident. The formal review and reports of these incidents focused on their unique features. This made it all the easier for audiences to emphasize what was different and thereby limit the opportunity to learn before they experienced their own incident.

It is important to stress that this was a company taking safety seriously. Within the industry it had an excellent safety record and invested heavily in safety. Its management was highly motivated and its relationships with workers were good, especially because of its strong economic performance that led to high wages and good working conditions. It recognized the need to make a corporate commitment to safety and to respond quickly to safety-related events. Strong pressures to act quickly to ‘make it safe’ provided incentives to respond immediately to each individual accident. But these demands in turn directed most attention after an accident towards specific countermeasures designed to prevent recurrence of that specific accident. This, in turn, led to the view that accidents were essentially isolated, local phenomena, without wider relevance or significance.

The management of the company was confronted with the fact that the handling of the overseas accident had not been effective in preventing the local one, despite their similarities. They were confronted by the effect of social processes working to isolate accidents and making them seem irrelevant to local operations. The prior fire overseas was noticed but regarded as irrelevant until after the local fire, when it suddenly became critically important information. It was not that the overseas fire was not communicated. Indeed it was observed by management and known even to the local operators. But these local workers regarded the overseas fire not as evidence of a type of hazard that existed in the local workplace but rather as evidence that workers at the other plant were not as skilled, as motivated and as careful as they were; after all, they were not Americans (the other plant was in a first world country). The consequence of this view was that no broader implications of the fire overseas were extracted locally after that event.

Interestingly (and ominously), the distancing through differencing that occurred in response to the external, overseas fire was repeated internally after the local fire. Workers in the same plant, working in the same area in which the fire occurred but on a different shift, attributed the fire to lower skills of the workers on the other shift. (Workers and managers of other parts of the manufacturing process also saw little relevance or potential to learn from the event.) They regarded the workers to whom the accident happened as inattentive and unskilled. Not surprisingly, this meant that they saw the fire as largely irrelevant to their own work. After all, their reasoning went, the fire occurred because the workers to whom it happened were less careful than we are. Despite their beliefs, there was no evidence whatsoever that there were significant differences between workers on different shifts or in different countries (in fact, there was evidence that one of the workers involved was among the better skilled at this plant).

Contributing to this situation was, paradoxically, safety. Over a span of many years, the incidence of accidental fires with this particular chemical and in general had been reduced. But as a side effect of success, personnel’s sensitivity to the hazard the chemical presented in the workplace was reduced as well. Interviews with experienced ‘old hands’ in the industry indicated that such fires were once relatively common. New technical and procedural defenses against these events had reduced their frequency to the point that many operators had no personal experience with a fire. These ‘old hands’ were almost entirely people now in management positions, far from the clean room floor itself. Those working with the hazardous materials were so young that they had no personal knowledge of these hazards, while those who did have experience were no longer involved in the day to day operations of the clean room.

In contrast with the formal investigation, the more extensive look into the accident that the outside researchers’ visit provoked produced different findings. Discussion of the event prompted new observations from within the plant. Two examples may be given. One manager observed that the organization had extensive and refined policies for the handling of the flammable chemical delivery systems (tanks, pipes, valves) that stopped at the entrance to the machine. Different people, policies, and procedures applied to the delivery system. He made an argument for carrying these rules and policies through to the machine itself. This would have required more extensive (and expensive) preparation for maintenance on the machine than was currently the case, but would have eliminated the hazardous chemical from within the machine prior to beginning maintenance. Another engineer suggested that the absence of appropriate labeling on the machine involved with the accident should prompt a larger review of the labeling in all places where this chemical was used or transported.

These two instances are examples of using a specific accident to discover characteristics of the overall system. This kind of reasoning from the specific to the more general is a pronounced departure from the usual approach of narrowly looking for ways to prevent a very specific event in a specific place from occurring or reoccurring.

The chemical fire case reveals the pressures to discount or distance ourselves from incidents and accidents. In this organization, effective by almost all standards, managers, safety officers, and workers took a narrow view of the precursor event. By narrowing in on local, concrete, surface characteristics of the precursor event, the organization limited what could be learned.

Extending or Enhancing the Learning Opportunity

An important question for resilience management is how the window of opportunity for learning can be extended or enhanced following incidents. The above case illustrates one general principle that could be put into action by organizations: do not discard other events because they appear on the surface to be dissimilar. At some level of analysis, all events are unique; while at other levels of analysis, they reveal common patterns.

Promoting means for organizations to look for and consider similarities between their own operation and the organization where an incident occurred could reduce the potential for distancing through differencing. This will require shifting analysis of the case from surface characteristics to deeper patterns and more abstract dimensions (Cook et al., 1998). Each kind of contributor to an event then can guide the search for similarities.

When this process of learning moved past the obstacle of distancing through differencing in this case, the organizational response changed. The organization derived and shared with us a new lesson – safety is a value of an organization, not a commodity to be counted or a priority set among many other goals.
