If reliability improvements elude you, look deeper into the ‘Tree of Unreliability’

What is causing your organization to struggle to get to its desired level of asset reliability? You’ve been using root cause analysis (RCA) and the Five Whys and addressing the problems you’ve found, but disruptions or unplanned downtime persist.

Shon Isenhour, a veteran reliability engineering trainer and the founding partner at Eruditio, suggests pushing your RCA deeper. In a September 2020 webinar titled “Exploring the Tree of Unreliability and what drives downtime,” Isenhour challenged reliability teams to take their investigations further.

“While there are infinitely many causes for downtime and lost production, there are some very common physical roots and underlying latent roots to our reliability issues,” he says. You may have dealt with the more apparent roots at the surface, but organizational issues can be harder to identify and, if left unaddressed, can haunt reliability efforts.

Figure 1. The “tree of unreliability” involves five layers of roots, according to Shon Isenhour.

Most root cause analyses diligently examine possible causes at the Physical and Human layers of RCA, says Isenhour. But pushing the line of questioning into the Systemic and Latent layers is trickier. Isenhour calls the full set of five RCA layers (see the diagram above) the “Tree of Unreliability,” and he urges teams to pursue their questions through all of them.

Physical defects refer to machine issues that you can see or touch, such as the gearbox locking up or a bearing failure. Human factors typically mean that someone did or didn’t do something. Human factors often have to do with the team’s capabilities and include issues such as not following the PM workflow.

But why are those human issues occurring? It’s tempting, says Isenhour, to point the finger at someone on the team and blame things like lack of training or poor communication. While there may be some truth to that, to truly discover the root cause, Isenhour says you must dig deeper into the Systemic and Latent levels of RCA. “Training alone,” he says, “is not your problem.”

Digging deeper can mean asking uncomfortable questions

If inadequate training is suspected, then ask: Was the organization ready for the training? Was the topic prioritized? Does the team see the value of the training? Is leadership coaching them through it, making them apply it, and setting clear expectations for usage?

Digging deeper can mean asking uncomfortable questions. “Sometimes, we have pet areas that we want to focus on,” Isenhour says, “and that leads us to focus too narrowly, too quickly.” An exemplary fault tree, he says, should have no fewer than 20 boxes and should reach down to the last layer.
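For teams that keep their fault trees in software, that rule of thumb can be checked automatically. The Python sketch below is only an illustration, not a tool Isenhour describes: the `Box` class, the `review` function, and the lowercase layer labels are all hypothetical names, and it assumes "the last layer" means the Latent layer.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """One box in a fault tree: a cause tagged with its RCA layer (hypothetical model)."""
    cause: str
    layer: str  # e.g. "physical", "human", "systemic", "latent"
    children: list = field(default_factory=list)


def walk(box):
    """Yield the box and every box beneath it."""
    yield box
    for child in box.children:
        yield from walk(child)


def review(root, min_boxes=20, deepest_layer="latent"):
    """Return warnings if the tree is too small or never reaches the deepest layer.

    The 20-box threshold comes from the article; treating "latent" as the
    last layer is an assumption.
    """
    boxes = list(walk(root))
    layers = {b.layer for b in boxes}
    warnings = []
    if len(boxes) < min_boxes:
        warnings.append(f"only {len(boxes)} boxes; expected at least {min_boxes}")
    if deepest_layer not in layers:
        warnings.append(f"no cause recorded at the {deepest_layer} layer")
    return warnings


# A deliberately shallow tree that stops at the systemic layer.
tree = Box("Gearbox locked up", "physical", [
    Box("PM workflow not followed", "human", [
        Box("No criteria for evaluating PM tasks", "systemic"),
    ]),
])

for warning in review(tree):
    print(warning)
```

Run against this three-box example, the check reports both that the tree is too small and that it never reaches the latent layer. That is the cue to keep asking why.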

To discover systemic issues, ask questions such as:

Has the team had proper training?
Are the correct standards in place?
Does the team have enough knowledge of condition-based maintenance?
Do they have the appropriate tools to assess asset status?

For example, if the fault tree discovers that the team is not evaluating the effectiveness of a task before adding the PM, ask whether the evaluation criteria exist (are correct standards in place?). If the team is not optimizing the PMs, ask whether the optimization process exists, or whether enough time is available.

For discovering latent issues, the questions might include:

Does the company value precision maintenance? Is it encouraged and prioritized?
Does the team have a plan, objectives, and/or guiding principles, and are they communicated?

The importance of a three-year reliability plan

Latent issues are usually cultural and center on leadership, vision, mission, and attitude. “Is it clear what you mean by reliability and how you plan to get there? Is the team encouraged to consider the consequences and ask questions? Does everyone know where they are headed and how everything will get done?”

To address latent issues, Isenhour emphasizes the importance of developing, implementing, and maintaining a three-year reliability plan. In many cases, leadership knows what to do, but they’re overwhelmed and don’t have enough of a plan. Carve out time, determine your vision, map out the stages incrementally, and use pilots to test the strategy and demonstrate quick wins. “Success breeds more success,” Isenhour says.

Be aware: The plan will change over time, he says, and communication of the changes is critical. Also, the plan cannot be a one-person effort; the whole team has to be involved and invested.

For more coaching on identifying systemic and latent causes of reliability issues and developing a stronger reliability plan, watch Isenhour’s September 2020 webinar in full at Accelix.com.
