Mapping from Failure to Successful Repair in a High Volume Environment



Achievements and Results
- Potential failure paths mapped out
- New production system created to quickly identify and repair defects
The Problem
In a high-volume maintenance environment serving a fleet of several thousand mobile vehicles, several dozen failures occurred each day. Although it was recognized that repeated failure patterns existed, no comprehensive, disciplined system was in place to map incidents from initial symptom identification through successful repair to elimination of the root cause.
The expert was tasked with developing a system to improve reliability in this environment by quickly identifying and repairing defects.
The Solution
In this situation, it is important to recognize that there are essentially no new failure modes or failure mechanisms. It is therefore possible to identify and map all likely combinations. From the point at which initial symptoms are identified, the entire failure/repair scenario can take only a relatively limited number of paths. In addition, the relative likelihood of each path can be quantified if the number of failures following each path is properly identified and recorded after each incident.
The cycle of a typical failure includes the following steps:
Malfunction Reporting - Identify which critical function has been affected and how it is currently behaving. Each unit has a specific number of functions, and each function has a specific number of recognizable abnormal behaviors. Interactive malfunction reporting can include questions about fault logs or other conditions that help narrow down the most likely problem.
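As a minimal sketch of how path likelihoods might be quantified from recorded incidents, the Python below counts how often each failure/repair path occurs and converts the counts to relative frequencies. The function name `path_likelihoods`, the tuple layout and the sample incidents are illustrative assumptions, not details of the system described in this case study.

```python
from collections import Counter


def path_likelihoods(recorded_paths):
    """Return the relative frequency of each recorded failure/repair path."""
    counts = Counter(recorded_paths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {path: n / total for path, n in counts.items()}


# Hypothetical incident log: (symptom, failure mode, repair) per incident.
incidents = [
    ("no start", "dead battery", "replace battery"),
    ("no start", "dead battery", "replace battery"),
    ("no start", "loose cable", "tighten terminal"),
    ("no start", "dead battery", "replace battery"),
]

likelihoods = path_likelihoods(incidents)
for path, share in sorted(likelihoods.items(), key=lambda kv: -kv[1]):
    print(f"{share:.0%}  {' -> '.join(path)}")
```

The estimate is only as good as the recording discipline behind it: every incident has to be assigned to exactly one path for the frequencies to be meaningful.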
Diagnostics - Based on reported symptoms, the repair system needs to be capable of doing three things:
a. Quick fix - Can repairs be made instantaneously? For example, simply power cycling a computer will frequently eliminate faults and restore functions.
b. Triage - If quick fix is not possible, identify the repair approach that is likely to be most effective, require the least downtime and cost the least.
c. Order of Attack for Troubleshooting - When there is more than one possible cause of the symptoms and troubleshooting is needed, which path is most likely? A procedure needs to be established to determine what the troubleshooter checks first, second, and so on.
Troubleshooting - Effective troubleshooting has several important characteristics:
a. First, it should begin with the most likely path and proceed from most likely to least likely.
b. Second, it needs to clearly identify the tasks needed to fix the problem.
c. Third, it needs to record and properly bucket the actual failure mode.
Fault Analysis - Once the failure mode is known, the avenue that leads to permanently eliminating similar failures begins with identifying the failure mechanism. This is easier than one might initially think. Again, there are a relatively small number of failure mechanisms. For mechanical devices, for instance, the failure mechanism must be corrosion, erosion, fatigue or overload. There are no other choices. If we know the failure was caused by corrosion, we can begin looking for the cause of the corrosion.
Root Cause Analysis - Total elimination of a problem requires that we identify the three levels of cause: physical cause, human cause and latent or systemic cause. By tracking the problem back to the root cause, we not only fix the current problem; we also eliminate future problems.
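The steps above could be represented in software roughly as follows. This is a sketch under stated assumptions: the class and field names (`Mechanism`, `Candidate`, `FailureRecord`, `order_of_attack`) and the sample causes are hypothetical, and breaking likelihood ties by shorter check time is an added assumption, not something the case study specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    """The four mechanical failure mechanisms named in the text."""
    CORROSION = "corrosion"
    EROSION = "erosion"
    FATIGUE = "fatigue"
    OVERLOAD = "overload"


@dataclass
class Candidate:
    """One possible cause of a reported symptom (hypothetical fields)."""
    cause: str
    likelihood: float      # estimated from past incidents
    check_minutes: float   # assumed time needed to confirm or rule out


@dataclass
class FailureRecord:
    """What gets bucketed once an incident is closed out."""
    symptom: str
    failure_mode: str
    mechanism: Mechanism
    physical_cause: str
    human_cause: str
    latent_cause: str


def order_of_attack(candidates):
    """Most likely cause first; ties go to the quicker check (assumption)."""
    return sorted(candidates, key=lambda c: (-c.likelihood, c.check_minutes))


# Hypothetical candidates for a single reported symptom.
candidates = [
    Candidate("failed sensor", 0.2, 15),
    Candidate("loose connector", 0.5, 5),
    Candidate("software fault", 0.3, 10),
]
plan = [c.cause for c in order_of_attack(candidates)]
print(plan)
```

Keeping the mechanism as a closed enumeration mirrors the point that the set of choices is small and fixed, which makes consistent bucketing easier.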
Benefits Realized
With the above in mind, a user-friendly, readily accessible computer-based system was developed. The system helps personnel find their way through the likely failure/repair paths. It is designed to "bucket" data properly as each incident is recorded and to update the likelihood of each path as failure patterns change with the age or condition of the equipment.
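One way such updating might work is to weight recent incidents more heavily than old ones, so that the estimated likelihoods drift as failure patterns change. The sketch below uses exponential decay for this. The technique, the `PathTracker` class and the `decay` parameter are assumptions chosen for illustration; the case study does not say how the actual system updates its likelihoods.

```python
from collections import defaultdict


class PathTracker:
    """Tracks path likelihoods, discounting older incidents (illustrative)."""

    def __init__(self, decay=0.9):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.weights = defaultdict(float)

    def record(self, path):
        """Bucket a new incident: age existing weights, then add the new one."""
        for key in self.weights:
            self.weights[key] *= self.decay
        self.weights[path] += 1.0

    def likelihoods(self):
        """Return each path's share of the total decayed weight."""
        total = sum(self.weights.values())
        if total == 0:
            return {}
        return {path: w / total for path, w in self.weights.items()}


# Hypothetical sequence: path "A" twice, then the newer pattern "B".
tracker = PathTracker(decay=0.5)
for path in ["A", "A", "B"]:
    tracker.record(path)
current = tracker.likelihoods()
print(current)
```

With `decay=1.0` the tracker reduces to plain counting; smaller values make it respond faster to new patterns at the cost of noisier estimates.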
To see the resume of the expert associated with this case study, see the link below.
Resume of ZCO
Mechanical, Locomotive Reliability, Quality Expert Consultant Resume