Abstract
A Failure Modes and Effects Analysis (FMEA) is an equipment-oriented technique for identifying system failures and for ranking risk. The method examines the ways in which an equipment item can fail (its failure modes) and the effects or consequences of such failures on safety, reliability or environmental performance.
If the criticality of each failure is also assessed, the method becomes a Failure Modes, Effects and Criticality Analysis (FMECA).
Methodology
- Assemble a cross-functional team (how this can be done is discussed in a related knol: HAZOP Team Selection and Management). The team should represent designers, operations, maintenance and the end user.
- Define the physical scope of the FMEA. For example, in the heat exchanger analysis discussed below, determine if utility systems such as the cooling water supply are to be included.
- Define the purpose of the equipment item being examined.
- Identify the failure modes for that equipment item and determine their consequences and their likelihood.
- Determine if the analysis is to incorporate the effect of safeguards and controls.
Team Process
Like other types of hazards analysis, an FMEA should be carried out by a team. In most cases, however, only two or three team members, each a specialist in one of the required fields, are involved. The FMEA scribe completes a form such as that shown in Table 2, which is written for the heat exchanger shown in Figure 1.
| # | Failure Mode | Cause(s) | Indications / ‘Announcement’ | Predicted Frequency | Consequences | Risk |
|---|---|---|---|---|---|---|
| 1 | Tube failure | Corrosion from fluids (shell side). | Odors at the cooling tower. Hydrocarbon detector on the tower. | Frequent: has happened twice in ten years. | Hydrocarbon is at higher pressure than the cooling water. Therefore flammable materials could enter the cooling tower and cause a major fire. | A |
| 2 | Tubesheet failure | See tube failure. Vibration of the tubes may cause the sheet to fail even if the tubes hold up. | See #1. | Rare | See #1. | B |
| 3 | Relief valve fails open | 1. Mechanical failure 2. External impact | Hydrocarbons to atmosphere: fire and environmental hazard | Rare | Serious | C |
| 4 | Relief valve fails closed | 1. Mechanical failure 2. Polymer buildup | None (passive failure) | Uncommon | Critical | B |
| 5 | Erosion of tubes | High velocity of cooling water | See tube failure | Rare | Critical: see tube failure | B |
| 6 | Vent valve fails open | Mechanical failure | See relief valve fails open | Rare | Serious | C |
| 7 | Vent valve fails closed | Mechanical failure | None (passive failure) | Rare | Minor: could lead to problems for turnaround maintenance | C |
| 8 | Drain valve fails open | Mechanical failure | See relief valve fails open. | Rare | Serious | C |
| 9 | Drain valve fails closed | See vent valve fails closed. | | | | C |
| 10 | Corrosion (tube side) | Incorrect process composition. | See tube failure | Uncommon | Critical | B |
- The first column is the number of the failure mode for that item of equipment.
- The second column identifies the failure mode.
- The third column lists possible causes of the failure mode. Although identification of causes is not a requirement of the FMEA process, they do need to be identified so that appropriate corrective actions can be taken.
- The fourth column lists the signs by which operations personnel know that the event has happened.
- The fifth column provides an estimate of how often the failure mode is likely to occur.
- The sixth column identifies the potential consequences of the failure mode. As already noted, the consequences will vary depending on the magnitude of the failure. The consequence that is usually of most interest is injury of personnel. However, environmental impact and economic loss can also be considered. Some practitioners have two levels of consequence: immediate and ‘end effect’. In the first row, the immediate effect of a tube failure is hydrocarbons in the cooling tower; the ultimate effect could be a catastrophic fire in the cooling tower.
- The last column provides an estimate for the level of risk associated with the failure mode. (A discussion of risk estimation is provided in two other knols in this series: Risk Analysis and Risk Matrices and ALARP (As Low as Reasonably Practical) Risk.)
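The seven columns described above map naturally onto a simple record type. The following is a minimal sketch, not a standard FMEA tool: the class name `FmeaRow`, its field names and the helper functions are hypothetical, and it assumes that risk letter A ranks higher than B, and B higher than C, which the worksheet suggests but does not state. Three rows are transcribed from the heat exchanger table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FmeaRow:
    """One worksheet row; the fields mirror the seven table columns."""
    number: int
    failure_mode: str
    causes: Tuple[str, ...]
    indications: str
    predicted_frequency: str
    consequences: str
    risk: str  # letter rank from the worksheet; "A" is ASSUMED to be highest


# Three rows transcribed from the heat exchanger worksheet.
rows = [
    FmeaRow(7, "Vent valve fails closed", ("Mechanical failure",),
            "None (passive failure)", "Rare",
            "Minor: could lead to problems for turnaround maintenance", "C"),
    FmeaRow(1, "Tube failure", ("Corrosion from fluids (shell side)",),
            "Odors at the cooling tower; hydrocarbon detector on the tower",
            "Frequent",
            "Flammable materials could enter the cooling tower "
            "and cause a major fire", "A"),
    FmeaRow(4, "Relief valve fails closed",
            ("Mechanical failure", "Polymer buildup"),
            "None (passive failure)", "Uncommon", "Critical", "B"),
]


def by_risk(items):
    """Order rows by risk letter, then row number (assumes A > B > C)."""
    return sorted(items, key=lambda r: (r.risk, r.number))


def unannounced(items):
    """Rows whose failure gives operations personnel no indication."""
    return [r for r in items if r.indications.startswith("None")]


if __name__ == "__main__":
    for row in by_risk(rows):
        print(row.number, row.failure_mode, row.risk)
```

Keeping the worksheet as structured records makes it straightforward to list the highest-ranked failure modes first, or to pull out the passive failures that give no warning, when the team decides where corrective actions are needed.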
Severity
- None. No effects observed.
- Minor. System operable with some loss of efficiency or quality.
- Low. System operation will cause some equipment damage but should not create a safety hazard.
- Moderate. System operation will cause equipment damage and could create a safety hazard.
- High. System operation will cause significant equipment damage and is likely to jeopardize safety.
- Very High. System operation will lead to destructive failure with a significant chance of someone being hurt and/or the creation of a major environmental problem.
In all cases, the severity of the event will depend on whether it occurs with or without warning; a failure that occurs without warning is the more serious of the two.
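The severity scale and the effect of warning can be sketched as an ordered enumeration. This is an illustration only: the numeric values 0 to 5 and the rule of raising severity by one level when there is no warning are assumptions made for the example, since the text gives only the ordering of the levels and says that an unannounced failure is more serious, not by how much.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The six severity levels; numeric values are ASSUMED, only order matters."""
    NONE = 0
    MINOR = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4
    VERY_HIGH = 5


def effective_severity(base: Severity, has_warning: bool) -> Severity:
    """Raise severity one level when the failure occurs without warning.

    The one-level increase is a hypothetical rule for illustration.
    NONE is left unchanged because no effects are observed either way.
    """
    if has_warning or base is Severity.NONE:
        return base
    return Severity(min(base + 1, Severity.VERY_HIGH))


if __name__ == "__main__":
    print(effective_severity(Severity.MODERATE, has_warning=False).name)
```

A team using a different adjustment, or none at all, would change only the body of `effective_severity`.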
Process Hazards Analysis
- What-If;
- Checklist;
- What-If / Checklist;
- Hazard and Operability Study (HAZOP);
- Failure Modes and Effects Analysis (FMEA);
- Fault Tree Analysis; or
- An appropriate equivalent methodology.