Reactive analysis is a popular—and sometimes required—process intended to help organizations delve deeper into adverse events or near misses.
The process typically follows a preset structure and usually involves mapping of events, causal factors, and root causes. The method aims to identify and address deeper organizational issues (systems issues) that contribute to events in an attempt to prevent future events. The interactive Process for One RCA Model provides a brief outline of one approach to root cause analyses (RCAs).
Different models offer different methodologies, tools, and steps and in some cases use different terms. For example, some models use the term "causal factors" to encompass factors that are not deep enough to be root causes, while other models use the term "proximal causes" or "contributory factors." Causal factors are sometimes further subcategorized into other concepts, such as direct causes, unsafe acts, unsafe conditions, or failures in barriers.
Unfortunately, many reactive analyses fail. Some limitations are inherent in studying an event retrospectively. First, many reactive analysis models are linear, which suggests direct causality. Difficulty in capturing the real complexity of systems and the myriad circumstances of events and conditions can bias investigators into believing that the factors identified are the only "causes" of an accident. In addition, the appearance of linearity can make it easy to fall into traps such as hindsight bias, and lead investigators to subjectively choose the events (including the initiating event) and conditions and, ultimately, the end or stopping point of the analysis. These limitations make the event look linear in retrospect; in reality, the circumstances did not appear linear to the people who were involved as the event unfolded.
In addition, investigators must be skilled and experienced in conducting investigations and performing such analyses; reactive analysis is not an undertaking that can be completed superficially, just by following a form. The investigation may also uncover lessons that apply across organizations (e.g., device design issues), but such lessons often are not shared (Wu et al.).
ECRI Institute PSO reviews RCAs submitted by its members and collaborating organizations in a protected environment and offers suggestions to improve the RCA process. Based on its evaluation of hundreds of RCAs, ECRI Institute PSO has identified common areas for improvement in RCAs, including the following (ECRI Institute PSO "Safe Table"):
- More complete event investigation
- Ability to look beyond the "sharp end" of an event to identify the underlying causes for it
- A focus on systems issues rather than on individuals as causes of errors
- Corrective actions designed to address the root causes of error identified from the analysis
- Inclusion of metrics to measure the effectiveness of the action plans
Other reasons many reactive analyses fail relate to how they are done and the mindset the organization and team apply in approaching them. The interactive Root-Cause Analysis: Match the Strategies to the Pitfalls explores such problems, as described in industry literature and as identified through ECRI Institute's own experience.
ECRI Institute has endorsed RCA guidelines published by the Institute for Healthcare Improvement (IHI; recently merged with the National Patient Safety Foundation) to help healthcare organizations improve how they investigate adverse events and near misses. Called RCA2, the model adds to "improving root cause analyses" the words "and actions" to emphasize that if actions derived from an RCA are not implemented and measured, the entire exercise will have been a waste of time and resources. Recommendations include the following (IHI):
- Use a risk- rather than a harm-based approach to prioritize safety events, hazards, and vulnerabilities.
- Form RCA teams that include subject matter experts as well as staff who are naïve to the subject, a leader with strong knowledge of safety science and the practice of RCA, and a patient representative.
- Use interviewing techniques, flow diagramming, action hierarchy, and other tools to facilitate the investigation and develop the strongest appropriate actions.
The Centers for Medicare and Medicaid Services (CMS) requires that hospitals and nursing homes, as part of their quality assessment and performance improvement programs, ensure that their performance improvement activities "track medical errors and adverse [patient or resident] events, analyze their causes, and implement preventive actions and mechanisms that include feedback and learning throughout the [facility]" (42 CFR § 482.21[c]; 42 CFR § 483.75[e]).
Many states require certain types of healthcare organizations, such as hospitals and nursing homes, to report adverse events, often through a state reporting system. States frequently mandate reporting of only certain types of adverse events but may accept reports of other types. A number of these states require organizations to submit RCAs, or similar analyses, of events. (OIG) Risk managers should investigate requirements in their states.
Joint Commission expects accredited hospitals, nursing care centers, ambulatory settings, and home care agencies to "identify and respond appropriately to all sentinel events." Joint Commission states the following:
A sentinel event is a Patient Safety Event that reaches a patient and results in any of the following:
- Death
- Permanent harm
- Severe temporary harm and intervention required to sustain life
Joint Commission also considers the following to be sentinel events (see the accreditation manual for further details regarding each):
- Suicide in an around-the-clock care setting or within 72 hours of discharge
- Unanticipated death of a full-term infant
- Infant discharge to the wrong family
- Patient abduction
- Elopement from an around-the-clock care setting that leads to death, permanent harm, or severe temporary harm
- Hemolytic transfusion reaction involving blood group incompatibility
- Rape, assault (leading to death, permanent harm, or severe temporary harm), or homicide of a patient
- Rape, assault (leading to death, permanent harm, or severe temporary harm), or homicide of a staff member, licensed independent practitioner, visitor, or vendor
- Wrong-patient or wrong-site procedure or wrong procedure
- Unintended retention of a foreign object
- Severe neonatal hyperbilirubinemia
- Prolonged fluoroscopy totaling more than 1,500 rads to a single field or radiotherapy delivered to the wrong body region or at more than 25% above the planned dose
- Fire, flame, or unanticipated smoke, heat, or flashes during patient care
- Intrapartum maternal death or severe morbidity
Each organization must define the term "patient safety event" for its own purposes, but the definition must encompass sentinel events as defined by Joint Commission. See Resource List for information on accessing the agency's sentinel event policies for each accreditation program.
Response to all sentinel events. Beginning in 2015, Joint Commission specified that "all sentinel events must be reviewed by the [organization] and are subject to review by the Joint Commission" (previously, only a subset of sentinel events were subject to review by Joint Commission). Appropriate response to a sentinel event includes completing a "comprehensive systematic analysis." Joint Commission states that RCAs are the most common type of such analysis but that other methodologies may be used.
Reporting of sentinel events. Accredited organizations are "strongly encouraged, but not required" to report sentinel events to Joint Commission. The accrediting agency may otherwise learn of such events through surveys; patients or residents, families, or staff; or the media, for example.
If Joint Commission learns of a sentinel event through means other than the accredited organization, the agency performs a preliminary assessment of the event. Joint Commission usually does not review events that happened more than a year previously but in such a case may request a written response from the accredited organization.
Submission of the analysis and action plan. If Joint Commission becomes aware, by any means, of a sentinel event, the organization is expected to do the following (Joint Commission):
- Conduct a "thorough and credible" comprehensive systematic analysis and create an action plan within 45 days of the event or of the time it became aware of the event
- Submit its comprehensive systematic analysis and action plan to Joint Commission (or otherwise arrange Joint Commission evaluation of the organization's response to the sentinel event) within 45 days
Joint Commission response to submitted analyses and action plans. Joint Commission will review the comprehensive systematic analysis and action plan to determine whether they are acceptable (the agency outlines the criteria it uses in its sentinel event policy and procedures). If the analysis and action plan are acceptable, the agency assigns a follow-up activity, such as one or more Sentinel Event Measures of Success. A Sentinel Event Measure of Success is "a numerical or quantifiable measure, ideally with a numerator and denominator, that indicates whether a planned action was effective and sustained" (Joint Commission).
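As a rough illustration of a measure with a numerator and a denominator, the Python sketch below computes a compliance rate for each monitoring period and checks whether a target was met in every period. The class name, the example action (documented independent double checks), the counts, and the 90% target are all hypothetical; Joint Commission does not prescribe this calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureOfSuccess:
    """One monitoring period for a hypothetical measure of success."""
    description: str
    numerator: int    # audited cases in which the planned action was carried out
    denominator: int  # all audited cases in the period
    target: float     # assumed goal, expressed as a fraction (0.90 = 90%)

    def rate(self) -> float:
        if self.denominator <= 0:
            raise ValueError("denominator must be positive")
        return self.numerator / self.denominator

    def met(self) -> bool:
        return self.rate() >= self.target


def sustained(periods: list[MeasureOfSuccess]) -> bool:
    """True only if there is at least one period and every period met its target."""
    return bool(periods) and all(p.met() for p in periods)


# Invented audit results for three consecutive months.
monthly = [
    MeasureOfSuccess("Independent double check documented", 46, 50, 0.90),
    MeasureOfSuccess("Independent double check documented", 48, 50, 0.90),
    MeasureOfSuccess("Independent double check documented", 47, 50, 0.90),
]

if __name__ == "__main__":
    for month, period in enumerate(monthly, start=1):
        print(f"Month {month}: {period.rate():.0%} (target met: {period.met()})")
    print("Sustained:", sustained(monthly))
```

Tracking the rate over several periods, rather than once, speaks to the "sustained" part of the definition: a single good month does not show that the planned action has held.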
If the response is not acceptable, Joint Commission will consult with the organization and allow additional time for resubmission. If the response is still unacceptable, the organization's accreditation may be affected.
Use of sentinel event data. Deidentified data from sentinel events are included in the agency's Sentinel Event Database, which is used to develop National Patient Safety Goals and issue Sentinel Event Alerts, among other purposes (Joint Commission).
Embrace Organizational Learning
Action Recommendation: Embrace organizational learning.
Embracing organizational learning may help the organization be more effective in its efforts to unearth the deeper systems issues that contribute to events.
According to Peter Senge, a learning organization is "an organization in which people at all levels are, collectively, continually enhancing their capacity to create things they really want to create" (O'Neil). One model represents organizational learning as a three-legged stool (SOL "Core Competencies").
A learning organization appropriately emphasizes continuous reflection and progress toward the shared vision. To that end, it is willing to confront organizational, managerial, and systems issues, which is a requisite for effective reactive analysis. See Resource List for resources on organizational learning.
Get in the Mindset
Action Recommendation: Embrace the "new view" of why events happen.
Many organizations and individuals continue to espouse all or parts of the "old view" of why events happen, and as a result, the reactive analyses they perform do not lead to better care. Sidney Dekker, an expert on human error and safety, has written about the old view versus the new view (Dekker).
According to the old view, a system may be or can be made essentially safe. The approach typically involves attempts to remove unsafe features from the system, or to establish barriers to prevent unsafe features from occurring, with the idea that the resulting system will then be safe. However, complex systems are not, and cannot be, essentially safe. Rather, people at all levels of the organization create safety by using tools and technologies while balancing many competing goals. Safety is what staff do every day; it is created by people and is dynamic. It is not a static property of a system.
When the old view holds sway, individual events often come as a surprise, especially because they frequently happen when staff are doing things they usually do—things that normally have no negative consequences. The new view sees events as a natural consequence of operating within complex systems. But organizations can still take steps to improve safety.
When a very bad event occurs, the old view assumes that someone did something bad or that there is a gaping failure in the system. But Newton's third law of motion (for every action, there is an equal and opposite reaction) does not hold true in event causation in complex systems. In complex systems, big, bad outcomes can happen after relatively minor "missteps." (Dekker) Organizations can embrace the new view, and other underpinnings of effective event investigation and analysis, by educating management and investigators on these concepts. In addition, organizations can use Root-Cause Analysis: Questions for Discussion and Self-Assessment to critically appraise their current approach.
See from the Perspective of the Staff Member
Action Recommendation: In investigating an adverse event, seek to understand the perspectives of the people involved as the event unfolded.
A crucial element that is missing from many reactive analyses is an effort to view things from a perspective that will allow learning about the organization—namely, the perspectives of the people who were involved as the event unfolded.
When following the old view, investigators judge the behavior of the people involved in an adverse event based on what they know now. This is called hindsight bias. It often leads them to ask questions such as "Why didn't they do X?" and "How could they not have seen Y?" Investigations premised on this mindset typically focus on what people did do but shouldn't have, or didn't do but should have. This can happen even if the organization has committed to taking a "just culture" approach to staff involved in events.
To minimize hindsight bias, investigators must try to see things from the unfolding, uncertain perspectives of the people involved in an event. Had those involved known what was going to happen, they would not have done what they did. As Dekker has stated, "The point of understanding human error is to reconstruct why actions and assessments that are now controversial made sense to people at the time. You have to push on people's mistakes until they make sense—relentlessly." Investigating the perspectives of the people involved leads to interesting questions about the organization. (Dekker)
A primary means of seeking the perspectives of the people involved is by conducting interviews that focus on the cognitive tasks and decisions of those involved. But interviews can easily become, or at least seem, confrontational, even disciplinary. Dekker recommends one way to reduce the risk of this happening:
- Have the interviewee tell the story from his or her point of view.
- Repeat it back to make sure you understand.
- Work together to find critical points in the story.
- Probe to better see how things looked to the interviewee at the time.
Unfortunately, it is not uncommon for interviews to focus on asking interviewees why they failed to follow a particular rule, policy, or procedure. The critical decision method, instead, uses probing questions that are nonjudgmental and seek to elicit the individual's perspective; examples are as follows (Klein; Dekker):
- What were you focusing on?
- If you had to describe the situation to your colleague at this point, what would you have said?
- Did this situation fit a standard scenario?
- What did you notice that caused you to change your assessment of the situation?
- Were you trained to deal with this situation?
- What goals governed your actions at the time?
- What other actions were possible?
- What were you expecting to happen?
These are only examples; many other possibilities exist.
Focus on "Work as Done"
Action Recommendation: Focus on work as staff actually perform it, rather than evaluating staff members' actions against formal policies and procedures.
Too often, investigations focus on evaluating the behavior of the people who have been involved in an event against existing rules, such as formal policies and procedures. This approach offers little benefit to organizational learning, especially if the existing rules no longer reflect how the work is done by most (or all) staff. Instead, investigations should focus on "work as done," not "work as written."
The old view maintains that a system can be made and kept safe by ensuring performance remains within set boundaries. When a problem is discovered, a typical response is to establish more rules, policies, and procedures; implement new technology; and issue reprimands. However, it is common for work practices to change as pressures, workload, and the working environment change—often slowly and incrementally over time—even if existing rules do not. These small changes in work practices can mount and push the organization closer to the boundaries of unacceptable performance, potentially resulting in adverse events (Cook and Rasmussen).
If an investigation concludes that an existing rule has been "broken," organizations that hold to the old view often do not investigate deeper organizational reasons staff may have deviated from the rules. The individual may be disciplined, or the rule reemphasized, or the consequences for breaking the rule tightened. (Dekker) Discipline is not always managerial and not always official; it may consist of shaming, giving an employee the "cold shoulder," or reeducating only the staff members who were involved in the incident.
The new view recognizes that people obey unwritten rules at least as well as they obey written ones, and often better. Some rules, such as those prescribed by regulations, are nonnegotiable. Other types of rules, such as standardization, may be helpful, but they must be grounded in reality. Still other rules may be out of touch with the way work is actually done or may be overly prescriptive. An important corollary for risk managers is that a written rule that staff are unable or unlikely to follow carries liability implications. A plaintiff's attorney may present the rule as evidence of what the organization should have done and argue that any deviation constituted negligence.
A good approach is to monitor gaps between policy and practice and seek to understand why, from an organizational-learning perspective, those gaps exist. This should happen after an event. But such evaluations should also be done on a continual basis, which may help identify safety problems before an event occurs.
Action Recommendation: Seek root causes that reflect deeper systems issues.
Many reactive analyses do not delve deeply enough into events. When this happens, the actions taken as a result do not address the systems issues at play, and the organization is unable to make care appreciably better or safer. The effort and resources spent on the reactive analysis are wasted.
A common issue is that investigators stop at what they believe is a root cause but is in reality an effect of factors deeper in the system, such as organizational, managerial, and other systems issues. A culprit that is often wrongly blamed in this manner is human error. Investigation and analysis may even erroneously seek to uncover human error, and the real work of the analysis ceases once human error is identified.
Human error may be blamed under other guises, such as loss of situational awareness, complacency, laziness, poor supervision, and bad design (Dekker). Usually when loss of situational awareness is identified as a root cause, it means simply that the individuals involved did not understand what was going on around them at the time. This is another artifact of hindsight bias: The true circumstances were unclear to the people involved at the time; they are clear only in hindsight. More insightful questions might include the following:
- What was the individual focusing on at the time the incident occurred?
- What scenario did the person believe he or she was dealing with?
- How did the person's understanding change as the event progressed?
The recommendation to conduct such an analysis does not mean that it is a mistake to train staff in skills to improve situational awareness, such as situation monitoring, in which staff members actively scan and assess the situation (AHRQ). Similarly, design issues should be considered, evaluated, and shared with others (e.g., manufacturers, care partners). The new view, however, stresses that these things are not root causes; they should not be the end of the investigation.
If human error is not the end of the investigation, what do such errors signify? Dekker explains as follows:
Human error is the inevitable byproduct of the pursuit of success in an imperfect, unstable, resource-constrained world. The occasional human contribution to failure occurs because complex systems need an overwhelming human contribution for their safety. (Dekker)
Human error is also a sign of trouble deeper within the system. It does not signify the end of investigation; it means the investigation is just beginning. The goal is to understand the behavior of the people involved and the organizational and management issues that influenced them. (Dekker)
What deeper systems issues might contribute to events? Organizations often underestimate the impact of conflicting goals on the day-to-day work of staff—and, consequently, the role of such conflicts in causing adverse events.
Leaders may care about safety, talk about safety, and expect staff to know that "safety always comes first." But staff know the many things that are important to the organization on a day-to-day basis and the relative importance of each. If leaders and managers spend most of their time emphasizing the importance of efficiency and cost control, staff recognize that these are extremely high organizational priorities. They constantly juggle goals as they provide services. In other words, the actions of frontline staff reflect goal conflicts at the point of service.
Having goals other than safety is not a bad thing. In fact, the organization would likely not continue to operate if it did not work to achieve other goals. The following are examples of goals other than safety that may be important to the organization:
- Satisfaction (of patients or residents, family members, or staff)
- Patient- or resident-centeredness
- Staff health and safety
- Public image
- Regulatory compliance
Usually, goal conflicts cannot be totally eliminated, but they should not be ignored. Below are only a few examples of steps an organization might take to address the conflicting goals staff face; actions an organization might take depend on the goals it is trying to balance, among many other factors.
- Make safety the top priority for leaders and frontline staff.
- Create safety-focused job tasks, performance goals, and metrics for people at all levels of the organization.
- Address goal conflicts in the organizational culture.
- During strategic planning and planning of changes to operations or services, plan safety elements as thoroughly as business and operational elements.
- Consider the impact of changes all the way to workflow at the front line.
- Consider performing predictive analysis (e.g., failure mode and effects analysis), with participation of frontline staff.
- Involve staff in making necessary rules.
- Continuously monitor gaps between "work as written" and "work as done."
- Continuously evaluate whether existing rules are appropriate, realistic, and effective.
  - For rules that do not meet these criteria, work with frontline staff to revise them.
  - For rules that do, revise other rules to enable staff to follow them.
- Promote (and reward) helping, task sharing, and cooperation, as these behaviors may help alleviate conflicts between safety and production pressures.
- Use simulation to develop critical thinking.
- For safety-critical sequences of steps that are quick and closely linked, insert a break or an intermediary step to slow down the process (e.g., independent double checks for certain high-alert medications).
- Use leading indicators to proactively monitor safety (see the discussion Develop Leading Indicators).
Address Root Causes
Action Recommendation: Develop recommendations to address the identified root causes, and create measures to monitor the implementation, effectiveness, and sustainability of recommended actions.
Even if a reactive analysis effectively identifies deep organizational issues that contribute to events, the work and resources spent on the investigation and analysis are wasted if the actions taken as a result do not address these issues.
As shown in Root-Cause Analysis: Common Pitfalls and Strategies to Avoid Them, ECRI Institute and other organizations have observed common problems with recommendations and action plans developed as a result of RCAs. For example, the recommendations may not be linked to the root causes that the analysis identified. Often, outside resources (e.g., literature, guidelines) are not consulted when the recommendations are developed. The recommendations may be generic, vague, or not measurable. Action plans may rely mainly or exclusively on weak strategies, such as memoranda, writing or revision of policies, or limited staff educational efforts (e.g., education regarding only the specific processes involved in the event, education of only the individuals involved). Recommendations are sometimes not implemented or not fully implemented.
To be effective, recommendations and action plans should address the root causes identified by the analysis and use strong, specific strategies. Short-, medium-, and long-term recommendations may be necessary (Vanden Heuvel et al.). Organizations should also create measures to monitor the implementation, effectiveness, and sustainability of recommended actions.
Develop Leading Indicators
Action Recommendation: Develop leading indicators to evaluate safety continuously and proactively, while continuing to track adverse events and near misses.
After recommendations are implemented, the organization should continue to track events and near misses. But those are "lagging" indicators—reactive measures related to events that have already happened. In addition, such events do not happen very often and can happen because of a confluence of a variety of factors; thus, by themselves, these indicators do not always provide useful information on whether the organization has become safer in the aftermath of an event.
In addition to continuing to track events and near misses, organizations should develop "leading" indicators. Leading indicators monitor the performance of safety processes. They are forward-looking metrics that target hazards that arise from the interaction of people with processes and the environment.
Leading indicators measure what the organization wants to occur and how it wants to perform in the future. In safety-critical industries such as healthcare, leading indicators more proactively reflect performance related to key work processes, operating discipline, and layers of protection that prevent incidents. Thus, they facilitate continuous improvement and early detection of coming problems. (Hinze et al.) In other words, tracking leading indicators gives the organization a better picture of its safety performance because it indicates what is really occurring in the organization in the day-to-day work of staff.
Some of the measures created to monitor the implementation, effectiveness, and sustainability of recommended actions identified by a reactive analysis may serve as leading indicators, assuming the measures are proactive and forward-looking. Other examples of leading indicators are as follows:
- Management of change
- Training and competency (as well as assessment of the effectiveness of these measures)
- Hazard reporting
- Observed use of safety practices
- Maintenance of equipment or facility integrity
- Follow-up on action items
These are only a few examples. The leading indicators an organization might choose to track depend on its safety challenges and goals.
The following scenario illustrates how leading indicators might be developed in response to a reactive analysis. During the day shift, a nursing home resident used her call bell because she needed to use the bathroom. When a certified nursing assistant (CNA) responded more than 30 minutes later, the resident was found on the bathroom floor. She had suffered a serious head injury.
The RCA team identified multiple root causes and contributing factors, creating an action plan with steps to address each. One root cause was that the administration, supervisors, and human resources processes focused heavily on CNAs' ability to complete all tasks for their assigned residents on each shift. As a result, CNAs largely took care of only the residents to whom they were assigned, helping coworkers with other residents only if the coworker was a friend, the circumstances were serious, or a supervisor was watching.
To address this root cause, the team revised job descriptions and performance criteria to include helping behaviors. In addition, the organization instituted a process for staff to report when a coworker had helped them; supervisors and administrators could also complete such reports if they saw staff helping each other. The organization developed three leading indicators related to this issue: the number of such reports, stratified by shift and job role; the percentage of staff recognized through such reports over the course of a month; and time to respond to call bells. The team developed leading indicators to address the other root causes and contributing factors as well.
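The three leading indicators in this scenario could be calculated along the following lines. This is a minimal sketch in Python; the report records, staff roster, response times, and function names are invented for illustration, and summarizing call bell response time with the median is an assumption (the scenario does not say how response time is summarized).

```python
from collections import Counter
from statistics import median

# Hypothetical "helping" reports for one month: (staff member recognized, shift, job role).
helping_reports = [
    ("cna_01", "day", "CNA"),
    ("cna_02", "day", "CNA"),
    ("cna_01", "evening", "CNA"),
    ("rn_01", "night", "RN"),
]

# Hypothetical roster and call bell response times (in minutes) for the same month.
all_staff = {"cna_01", "cna_02", "cna_03", "rn_01", "rn_02"}
call_bell_response_minutes = [4, 7, 3, 12, 6]


def reports_by_shift_and_role(reports):
    """Indicator 1: number of helping reports, stratified by shift and job role."""
    return Counter((shift, role) for _, shift, role in reports)


def percent_staff_recognized(reports, staff):
    """Indicator 2: percentage of rostered staff named in at least one report."""
    if not staff:
        raise ValueError("staff roster must not be empty")
    recognized = {name for name, _, _ in reports} & staff
    return 100 * len(recognized) / len(staff)


def median_response_minutes(times):
    """Indicator 3: median time to respond to call bells (assumed summary statistic)."""
    return median(times)


if __name__ == "__main__":
    for (shift, role), count in sorted(reports_by_shift_and_role(helping_reports).items()):
        print(f"{shift} / {role}: {count} report(s)")
    print(f"Staff recognized: {percent_staff_recognized(helping_reports, all_staff):.0f}%")
    print(f"Median call bell response: {median_response_minutes(call_bell_response_minutes)} min")
```

Reviewing these figures monthly alongside event and near-miss counts would let the organization see whether helping behaviors are spreading before another fall occurs.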