L2 HazId Overview I4.1
Lesson 2
Overview of Hazard Identification Techniques
“Once these (hazards) have been identified, the battle is more than half won”
FP Lees: Loss Prevention in the Process Industries
© 2024 Risktec Solutions Limited
This lesson provides an overview of the hazard identification techniques that are
available, before exploring the principal techniques in greater detail in later lessons.
Introduction
Different industries, applications, lifecycle timings, etc. demand different solutions, and many
different tools have been developed. Some have wide applicability; others are very narrow in
scope or were developed for one-off applications. Equally, the level of detail and
documentation available ranges from the very extensive to the barely present.
Before choosing any technique for use, always consider (and where possible briefly test)
whether it is likely to do the job you are asking of it.
It can be a good idea to combine more than one technique. For example, combining a top-down
approach (e.g. fault tree analysis) with a bottom-up approach (e.g. FMEA)
increases confidence that you are seeing the total picture.
Top-down – starting from the entire system or operation, considering how it may fail and
then seeking to identify how those failures may occur.
Bottom-up – starting at the lowest levels and considering how failures may affect the overall
system.
Both approaches have the potential to miss something.
Context
Choice of approach
[Figure: choice of approach against risk level and complexity. Risk level axis: Broadly Acceptable, ALARP Region, Intolerable. Approach: Qualitative, Semi-Quantitative, Quantitative. Complexity axis: low complexity (solution is obvious; situation covered by codes and standards) to high complexity (difficult solution; one-off situation; no relevant standards/guidance). Start simple and ask “Is this adequate, Y/N?” before moving to the next level. After: Guidance on Risk Assessment for Offshore Industries, HSE 3/2006.]
The diagram was originally in the context of risk assessment, but as a basic principle also
applies to the choice of hazard identification technique to be used.
Repeating the notes from a previous slide:
The basic rule of thumb is to start simple and get more complicated until you have a level of
detail that you are satisfied with.
The choice of tool will range from ‘gut-feel’ through to detailed, complex (and expensive)
structured techniques, and there are many things that will influence the choice of technique.
Performing a line-by-line procedure review of the steps required for a manufacturing process
may well be appropriate for a complex, major hazard installation. Although possibly very
accurate, it may be inappropriate (and unnecessarily disruptive and expensive) for routine
chemicals mixing in a laboratory.
The corollary for the latter situation, however, is this: just because nothing has yet gone
wrong, are you confident that this is because of your risk management process and not just
because you have been lucky all these years?
Broad areas and principal methods
▪ Process
− Material Properties
− Matrices
− Pilot Plants
− Critical Examination
− Codes and Standards
− Hazard Indices
− Coarse Hazard Assessment
− Functional Integrated Hazard Identification
− Checklists
− What if?
− Scenario development
− HAZOP
▪ Hardware
− Safety Audits
− FMEA
− Fault Trees/Event Trees
− Maintenance Analyses
− Sneak Analysis
▪ Control
− CHAZOP
− Structured methods
▪ Human
− Task Analysis
− Action Error Analysis
…plus many others!
This is a selection of the principal methods available, grouped by the areas to which they apply.
It is not an exhaustive list and you are encouraged to read further around this
subject. Some suggested reading is shown in the final slides of this course, but please note
these are examples only.
The highlighted ones are dealt with in more detail in later lessons.
See also the handout from ISO 31010.
Main CCPS Techniques
Non-Scenario Based
▪ Preliminary Hazard Analysis
▪ Safety Review
▪ Relative Ranking
▪ Checklist Analysis
Scenario Based
▪ What-if
▪ What-if/Checklist
▪ HAZOP
▪ Fault Tree Analysis
As an alternative guide to selecting a hazard identification technique, the Center for
Chemical Process Safety (CCPS) ‘Guidelines for Hazard Evaluation Procedures’ classifies
hazard identification methods as scenario based or non-scenario based
(instead of by hardware, process etc. as on the previous slide), in addition to three basic
methods.
Checklists, HAZOP and FMEA
Much more on these three techniques in later lessons
Checklists, HAZOP and FMEA are the three most commonly used techniques for hazard
identification. More information on these is provided in later lessons, together with the
opportunity to practise the methods on simple systems.
Checklists
One of the simplest methods of hazard identification is the checklist. It is a means of passing
on lessons learned from experience, whether previous risk assessments or past failures. No
matter how expert you may be, well-designed checklists can improve outcomes, especially in
complex, technology-based sectors.
Most hazard identification techniques rely on some form of checklist. Frequently the term
HAZID is used interchangeably with a checklist approach. There are many checklists available
but you need to make sure that the one selected is appropriate for both the application and the
level of information available. At the end of the day, however, the checklist is only going to be
as good as the people using it; it does not automatically create experts. If you are generally
interested in the value of checklists, read “The Checklist Manifesto” by Atul Gawande.
HAZOP Study
HAZOP is the acronym for HAZard and OPerability study. It is a structured and systematic
examination of a product, process, procedure or system in order to identify hazards and
operability issues. The method systematically examines how each part of the “design” will
respond to deviations in key parameters by using suitable guidewords. The guidewords can be
customised or generic words can be used that encompass all types of deviation. It is generally
carried out by a multi-disciplinary team in workshops led by an independent HAZOP facilitator
and formally recorded on worksheets by a HAZOP scribe/secretary.
The HAZOP technique was initially developed in the 1960s to analyse chemical process
systems and it is very commonly used today throughout the process industries. Its success led
to the method being extended to other types of systems and complex operations, including
mechanical and electronic systems, procedures and software systems, and even to
organisational changes and to legal contract design and review. The method is well
documented, e.g. BS 61882.
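The way guidewords are combined with key parameters to prompt the team can be sketched in a few lines of Python. This is a minimal illustration only, not part of any standard: the guideword list is a generic subset of the kind found in IEC 61882, while the node name, the parameters and the `generate_deviations` helper are hypothetical.

```python
from itertools import product

# Generic guidewords of the kind listed in IEC 61882 (a subset).
GUIDEWORDS = ["No", "More", "Less", "Reverse", "As well as", "Part of", "Other than"]


def generate_deviations(node, parameters, guidewords=GUIDEWORDS):
    """Return one worksheet prompt per (parameter, guideword) pair for a node."""
    return [
        {"node": node, "parameter": p, "guideword": g, "deviation": f"{g} {p}"}
        for p, g in product(parameters, guidewords)
    ]


# Hypothetical node and parameters, for illustration only.
rows = generate_deviations("Feed line to reactor", ["flow", "pressure", "temperature"])
for row in rows[:3]:
    print(f'{row["node"]}: {row["deviation"]}')
print(len(rows), "prompts to review")
```

In a real study the team would discard combinations that are not meaningful for the node, and would record causes, consequences and safeguards against each remaining deviation.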
FMEA
FMEA is the acronym for Failure Modes and Effects Analysis. It seeks to identify hazards by
identifying potential failure modes of the various parts of a system, the effects these failures
may have on the system, the mechanisms of failure and how to avoid the failures and/or
mitigate the effects of the failures. FMECA (Failure Modes, Effects and Criticality Analysis) extends an FMEA so that each failure mode
identified is ranked according to its criticality (importance).
Although originally intended for equipment items, it has also been adapted for other uses, for example procedures and
software. It is very good at identifying single point failure modes, but not
combinations of failure modes. Generally an FMEA is conducted by a single analyst, so it is
less labour intensive than a HAZOP, but it can alternatively be performed in a group
workshop. The method is well documented, e.g. IEC 60812.
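The ranking step that FMECA adds can be illustrated with a small Python sketch. The lesson does not prescribe a scoring scheme, so the 1 to 5 scales and the rule "criticality = severity x likelihood" are assumptions chosen for illustration, and the worksheet entries are invented examples.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    mode: str
    effect: str
    severity: int    # assumed scale: 1 (minor) to 5 (catastrophic)
    likelihood: int  # assumed scale: 1 (remote) to 5 (frequent)

    @property
    def criticality(self) -> int:
        # Assumed ranking rule: simple product of the two scores.
        return self.severity * self.likelihood


def rank_by_criticality(modes):
    """Return failure modes sorted from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# Hypothetical worksheet rows, for illustration only.
worksheet = [
    FailureMode("Cooling pump", "fails to start", "loss of cooling", 4, 2),
    FailureMode("Level transmitter", "reads low", "tank overfill", 5, 3),
    FailureMode("Manual valve", "left closed", "no flow to unit", 2, 4),
]

ranked = rank_by_criticality(worksheet)
for fm in ranked:
    print(f"{fm.criticality:>2}  {fm.item}: {fm.mode} -> {fm.effect}")
```

Each row still describes a single failure mode, which mirrors the limitation noted in the text: combinations of failures are not captured by this kind of table.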
On to the other techniques now...
Material properties
▪ Many databases widely available
▪ Physical and chemical properties, e.g.
− Flammability, explosibility, toxicity
− Decomposition
▪ Material Safety Data Sheets
− 91/155/EEC and UK CHIP regulations
− Responsibility of supplier to supply MSDS
▪ Impurities
− May have harmful effects or cause unwanted reactions
▪ Instability e.g. in manufacture, storage, thermal
− Desk screening, small scale tests
▪ Explosibility – mainly used for assessing safety during transportation
Primarily this is aimed at chemical processes, but by analogy it can also be extended to many
other types of hazards.
If you are going to use a particular substance, e.g. in a process, to perform a task etc., then
there is a very high probability that the properties and potential hazards of the material have
already been assessed and are available to you. Frequently information can be obtained from
suppliers (e.g. MSDS) though there are many other sources of information available, e.g.
NIOSH publish an online pocket guide (http://www.cdc.gov/niosh/npg/) to common chemicals
and their properties.
Information availability varies by substance, but will frequently include toxicity, flammability,
instability etc.
It is also worth considering whether any impurities may be present and whether these are
an issue (both by themselves and in combination with others).
Instability, particularly in manufacturing processes, may be a large problem. Whilst it may
be possible to identify for a single substance, it will be a lot harder for combinations of
materials. It may be possible to derive some information from small scale tests, and this is
part of the rationale behind pilot plants (see later). Lees – Loss Prevention in the
Process Industries, Chapter 33, contains more information.
When considering the hazards posed by materials, you should consider both normal and
abnormal conditions; the hazards posed by a flammable liquid atomised at high pressure may
be different to when it is cold and in a liquid pool.
Once we know the hazards inherent to the materials being worked with, then we can start
looking at how they can be managed.
Interaction Matrices
▪ Best used during early stages of process design
▪ Can consider
− Chemicals present in the process (reactants,
intermediates, products)
− Materials of construction
− Operator
− Utilities (inc. cleaning solutions, lubricants)
− Energy source
− Air, land and sea
▪ A start to more detailed analysis…but
− Interaction of only 2 components
− Matrices can get very large, very quickly
− Best done by an individual
These can range from the very simple, considering only the interaction of two substances, to
three-dimensional matrices, e.g. to include by-products.
The basic question is “what if A comes into contact with B, with C etc.?”. The matrix
can get very big very quickly, but may be a good starting point for identifying areas of
uncertainty. It is difficult to gauge the effects of three components mixing, unless it is possible to
consider two components as a single product.
As the matrix is likely to get very big, care is needed to ensure usability and presentation.
It would be difficult (and time consuming) to do as a team but, equally, if it is only done by an
individual, will there be problems with knowledge limitations?
See also CCPS and HSL for further reading.
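How quickly a two-component interaction matrix grows can be seen with a short Python sketch. The component names and the `interaction_pairs` helper are hypothetical; the only fact relied on is that n components give n(n - 1)/2 unordered pairs.

```python
from itertools import combinations


def interaction_pairs(components):
    """List every unordered pair of components: one matrix cell per pair."""
    return list(combinations(components, 2))


# Hypothetical components drawn from the categories on the slide.
components = [
    "reactant A", "product B", "carbon steel",
    "cooling water", "air", "cleaning solution",
]

pairs = interaction_pairs(components)
for a, b in pairs[:3]:
    print(f"What if {a} comes into contact with {b}?")
print(len(pairs), "pairs for", len(components), "components")

# Growth of the matrix: n components give n * (n - 1) / 2 pairs.
for n in (5, 10, 20, 50):
    print(n, "components ->", n * (n - 1) // 2, "pairs")
```

Six components already give 15 questions to answer, and 50 components give 1225, which is why presentation and usability need attention early on.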
Pilot plants/scale models
“Make your mistakes on a small scale and your profits on a large scale”
L.H. Baekeland
As the quote says, if we’re going to get it wrong, better to do it on a small scale than find out
when the plant is running that laboratory processes do not scale up well to full scale
production.
As well as safety, use of a pilot plant also allows for better examination of other aspects such
as economics, operating methods, practicalities.
The real world is unlikely to be as ‘clean’ as a laboratory, and a pilot plant may give a lot of
useful information on the presence of, and effects from, impurities in the process e.g. from the
feedstock, but also from contamination, leaks, corrosion etc. within the process.
The design and operation of a pilot plant carries its own set of potential risks
and may require a very careful approach.
See also Lees Chapter 8.5 and Appendix 10.
Critical Examination
▪ Largely a critical examination of the inherent safety (i.e. differs from HAZOP)
▪ Checklist approach based on a statement of the design intent to identify strengths and
weaknesses of
− the materials
− the method
− the equipment
Critical Examination is part of the ‘Process Safety Review Systems’ developed by Wells et al (see Lees – 8.19). It
is a guideword approach to looking at the inherent safety of the system (rather than the
possibility that things may go wrong, as in a HAZOP).
It requires consideration of what is to be achieved (design intent) and then whether this could
be improved, e.g. by eliminating chemicals, process steps or storage, or by reducing the amount of
piping, frequency of operation etc. Further sets of guidewords are available for (for example) Avoid,
Modify, Prevent, Increase, Segregate and Improve (see next slide and Activity Book).
Plus points are that it may encourage innovation in inherent safety (particularly if applied early
on in the design stage) and can be used to start other detailed analysis. It may, however,
require experienced personnel to identify what is possible/practicable to do.
Critical Examination: keywords
Extracts from the key words for critical examination. The full list is shown in the Activity Book
Codes and Standards
Codes and standards provide a greater collective knowledge than may be present within a single organisation
and, in particular, standards are continually improved to reflect knowledge gained from
incidents.
They are a form of hazard identification as they frequently contain information on hazards that
must be protected against, although sometimes this may be implicit within the standards
rather than explicitly stated.
In addition to these, and looking for experience/knowledge within your own organisation, there
is also the possibility to conduct literature searches for other incidents and experiences.
Codes and Standards
Advantages
▪ Provide authoritative guidance giving minimum standards
▪ Can be used as a start to a more detailed technique
Disadvantages
▪ Careful consideration required to determine which of the many codes and standards apply (need
to avoid ‘mix and match’)
▪ Can be time consuming to understand
▪ Will change with time
▪ May not be applicable to new processes
Codes and standards may, in some cases, be viewed as providing the minimum level to be
applied, in part because there will be an element of “one size fits all”.
One of the biggest potential problems is playing mix-and-match and cherry-picking controls
from different sets of standards which leads to problems when the whole picture of a particular
standard is not viewed.
They are not going to be the easiest (or most entertaining!) read and some standards can get
very large, requiring a lot of effort to read and to stay on top of as they are going to change
with time.
Hazard Indices
A number of hazard indices have been developed, of which the principal ones may be
considered as the Dow, Mond and IFAL (Instantaneous Fractional Annual Loss).
They are of a similar format and were originally designed for hand calculation, although some have
been automated, and later indices (e.g. the IFAL) are computer-based. The first index (Dow)
was originally developed in 1964 to guide the selection of fire protection requirements.
The basis of the methods is similar in that they look at basic inventories of process
chemicals, to which a number of modifiers (e.g. for the process conditions, or whether the
material may suffer an exothermic or spontaneous reaction) can be applied. Depending on
the method chosen, further modifiers are then applied to permit estimation of the frequency and
consequences of releases.
Lees Chapter 8.6 provides a good overview of each method.
These index tools provide information about both identification and assessment aspects of a
risk management approach, in that they allow understanding of which process units may
present a hazard (identification) and also what the consequences may be (assessment). The
later tools also allow for limited review of the role of risk mitigation measures as well.
[Flowchart: Dow Fire and Explosion Index procedure. Recoverable steps: select pertinent process unit; calculate F1 (general process hazards factor); calculate F2 (special process hazards factor); determine replacement value in exposure area; determine Business Interruption.]
This is the basic flowchart for the Dow F&E Index method. The basic information required is
the flow sheet for the plant and its plot plan. If economic effects are to be calculated,
information is also required on replacement costs (e.g. plant value per unit area) and the
monthly value of production.
For each process unit a material factor (MF) is determined (based on look-up tables), which is
largely a measure of the potential energy of the material, modified for its pressure and
temperature. General (e.g. material handling, drainage etc.) and Special (e.g. toxic material,
corrosion considerations, rotating equipment) process hazard factors are calculated and
multiplied together to determine a Process Unit Hazard Factor (PUHF). The PUHF is multiplied
by the MF to give a Fire and Explosion Index (F&EI) value, which is in its own right a guide
to the level of fire and explosion hazard.
The F&EI can then be further processed to determine an exposure radius which, given a
value for unit area costs, can be used to derive a value for the area of exposure.
The damage factor is calculated based on the PUHF and the MF and is then applied to give a
base Maximum Probable Property Damage (MPPD). Credit factors (for process control,
material isolation and fire protection) are applied to the base MPPD to give an actual MPPD.
The Maximum Probable Days Outage (MPDO) is determined based on the actual MPPD
which in turn can be used to calculate a Business Interruption (BI) value.
The majority of values used are taken from look up tables, with some based on interpreting
graphs.
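The chain of calculations described in these notes can be followed with a short Python sketch. It is illustrative only: in practice the MF, F1, F2, damage factor and credit factor come from the look-up tables and graphs in the Dow guide, so every number below is a made-up placeholder. The function names are hypothetical, and treating the damage factor and credit factor as simple multipliers is an assumption about how they are "applied".

```python
def process_unit_hazard_factor(f1, f2):
    """PUHF = general process hazards factor (F1) x special process hazards factor (F2)."""
    return f1 * f2


def fire_explosion_index(material_factor, f1, f2):
    """F&EI = material factor (MF) x PUHF."""
    return material_factor * process_unit_hazard_factor(f1, f2)


def property_damage(value_in_exposure_area, damage_factor, credit_factor):
    """Return (base MPPD, actual MPPD).

    Assumption: both factors act as multipliers, i.e.
    base MPPD = value x damage factor, actual MPPD = base MPPD x credit factor.
    """
    base = value_in_exposure_area * damage_factor
    return base, base * credit_factor


# Placeholder inputs; real values are read from the guide's tables and graphs.
mf, f1, f2 = 16, 1.5, 2.0
fei = fire_explosion_index(mf, f1, f2)
base_mppd, actual_mppd = property_damage(
    value_in_exposure_area=10_000_000, damage_factor=0.5, credit_factor=0.75
)

print("PUHF:", process_unit_hazard_factor(f1, f2))
print("F&EI:", fei)
print("Base MPPD:", base_mppd)
print("Actual MPPD:", actual_mppd)
```

The exposure radius, MPDO and Business Interruption steps are left out because they depend on relationships taken from the guide's graphs rather than on anything stated in these notes.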
Principal Indices
▪ Dow Fire and Explosion Index
− Widely used
− Originated in 1964, several iterations since
− Risk management tool as well as fire protection/loss prevention
▪ Mond Fire, Explosion and Toxicity Index
− Extension of Dow Index, e.g.:
− Wider range of process and storage
− Inclusion of aspects of toxicity
− Include range of safety factors
▪ Instantaneous Fractional Annual Loss (IFAL)
− Primarily for insurance assessment
− Scientific basis rather than historical loss
− Fires and explosions
− Computer based only
▪ Dow Chemical Exposure Index
− Toxicity index (previously incorporated in Dow F&EI)
The original version of the Mond index was described in 1979, having been developed by the Mond
Division of ICI. The principal modifications to the Dow Index were to:
• cover a wider range of processes and storage installations
• cover the processing of chemicals with explosive properties
• account for the high heat of combustion of hydrogen and allow distinctions to be made
between processes where a given fuel is reacted with different reactants
• include additional hazard considerations
• include aspects of toxicity
• include a range of offsetting factors, e.g. good plant design, safety instrumentation
• show how the results can be applied in the design of plants
The basic method is similar to that described for the Dow Index, although it includes a hazard
review step to examine if there is potential to reduce the hazard by making design changes.
The method calculates seven indices: Overall, Fire Load, Unit Toxicity, Major Toxicity Incident,
Explosion, Aerial Explosion and Overall Risk Rating.
The IFAL is a separate index developed primarily for insurance assessment purposes, allowing for
a more scientifically based assessment than one based on historical loss records. The plant is
divided into blocks and each major equipment item reviewed for its contribution to the Index. The
main hazards considered are Pool and Vapour Fires, Unconfined Vapour Cloud Explosions,
Confined VCEs and Internal Explosions. For all except pool fires, frequencies and hole size
distributions are used, and for each case the consequences are modelled to allow estimation of
damage effects.
The IFAL is sufficiently complex that it is solely computer based (unlike the Dow and Mond
Indices).
The Chemical Exposure Index was originally part of the Dow Index and may be used to
provide toxicity information, e.g. for emergency response planning. It requires the physical
and chemical properties of the material, a simplified PFD and plant and surrounding area
plans, together with information on toxicity limits, e.g. Emergency Response Planning
Guidelines (ERPG) values. The Dow CEI Guide then provides guidance on the selection of
release scenarios and look-up tables for physical properties and ERPG values.
Hazard Indices
Advantages
▪ Good structured approach
▪ Well documented, extensive history of application
Disadvantages
▪ Can be conservative
▪ Some can be very complex and require computer support
▪ Some indices treat toxic effects separately
These tools represent very well established and well documented approaches. However,
because they are essentially tabular/spreadsheet based approaches, they tend to err on
the conservative side. The analyst thus needs to balance the time to complete a study against
the accuracy required when comparing these index approaches with detailed modelling tools.
Coarse Hazard Analysis
This type of approach may be referred to by a large variety of names. Although there are
slight differences between the techniques, they are essentially a checklist and/or history-based
review of plant at an early stage in its planning or development.
As there is generally little information available at this early stage,
the reviews tend to concentrate on the major issues rather than detailed design, by
considering the intent of the design or the major process blocks present.
For a history-based approach, the review team looks at previous accidents and
operating history to identify possible hazards associated with, for example, a process block,
and then considers whether these can be designed out or reduced, and what the protection devices
are.
A checklist-based approach is essentially the same, but is more guideword-based.
Coarse Hazard Study
▪ Similar to previous; originally developed for use with HAZOPs (“pre-HAZOP” or HAZOP I)
▪ Guideword based approach
▪ Performed in early project stages, e.g. block layout of plant items to identify potential problems
for further study, e.g.
− Chemical data availability
− Hazard understanding
− Layout or siting issues
− Process design
▪ Team-based review using block layouts, site plans etc.
There are two types of study that may be referred to under the heading of HAZOP: a
coarse checklist type approach used at a high level in the early stages of a plant design, and a
guideword/deviation based approach which is generally used when detailed plant drawings
are available.
In the nuclear industries these tend to be referred to as HAZOP I and HAZOP II; in the
process industries the studies would generally be referred to as coarse (or preliminary) hazard
review and HAZOP.
Further details on HAZOP are given in lesson 5 of this module.
Coarse hazard study: example guidewords
▪ Fire
▪ Explosion
▪ Toxicity
▪ Corrosion
▪ Smell
▪ Effluent
▪ Additional, e.g.
− Radiation
− Vibration
− Asphyxia etc.
▪ Energy
▪ Interactions
− People
− Materials
− Equipment
▪ Environment
− Climate
− Geotechnical
− Biological
This is an example set of guidewords used for a coarse hazard study/HAZOP I. Alternative
approaches simply take the broad headings from the ISO 17776 listing (see handouts
and Lesson 6) together with some location/plant specific issue prompts.
Many other checklists are available (see Lesson 6).
Concept Hazard Analysis: example keywords
▪ Flammables ▪ Thermodynamic
− Ignition − Over-, under- pressure
− Fire − Over-, under- temperature
− Emergency
Functional Integrated Hazard Identification (FIHI)
The TOMHID project was initiated by the Commission of the European Communities in 1991
to develop an overall methodology to provide assistance and guidance for hazard
identification purposes. The project set out to provide a comprehensive framework to
represent the plant as a socio-technical system, with the methodology including technical,
human and organisational aspects to allow identification of critical areas and the need for
further analysis using other existing methods.
FIHI requires a plant functional model to be defined in terms of:
• Intents
• Methods (how the intent is to be satisfied)
• Constraints (any restrictions on how the intent can be achieved).
Any methods or constraints containing further intents should be analysed in the same fashion
using a hierarchical approach. Concept Hazard Analysis type guide words are then applied to
each intent to allow discussion of possible causes and consequences of any deviations.
Example guide words include:
• Flammables
• Chemicals
• Reactions
• Mode of Operation
• Operator Performance
• Management System
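The hierarchical functional model and the application of guide words to each intent can be sketched as a small recursive structure in Python. This is only one possible representation, not the TOMHID project's own: the `Intent` class, the `discussion_prompts` helper and the example plant model are all hypothetical, and only the guide words are taken from the list above.

```python
from dataclasses import dataclass, field

# Example guide words from the notes.
GUIDE_WORDS = [
    "Flammables", "Chemicals", "Reactions",
    "Mode of Operation", "Operator Performance", "Management System",
]


@dataclass
class Intent:
    description: str
    methods: list = field(default_factory=list)      # how the intent is to be satisfied
    constraints: list = field(default_factory=list)  # restrictions on achieving the intent
    sub_intents: list = field(default_factory=list)  # intents found within methods/constraints


def discussion_prompts(intent, guide_words=GUIDE_WORDS, depth=0):
    """Walk the hierarchy and yield (depth, intent, guide word) for every intent."""
    for word in guide_words:
        yield depth, intent.description, word
    for child in intent.sub_intents:
        yield from discussion_prompts(child, guide_words, depth + 1)


# Hypothetical two-level plant model, for illustration only.
transfer = Intent(
    "Transfer solvent to reactor",
    methods=["Pump from storage tank"],
    constraints=["Flow rate below pump rating"],
)
plant = Intent(
    "Produce batch of product",
    methods=["Batch reaction in stirred vessel"],
    constraints=["Temperature kept below decomposition onset"],
    sub_intents=[transfer],
)

prompts = list(discussion_prompts(plant))
for depth, description, word in prompts:
    print("  " * depth + f"{description}: {word}?")
```

Each printed line is a prompt for the team to discuss possible causes and consequences of deviations from that intent.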
MIMIX (Method for Investigating Management Impact to Causes and Consequences of
Specific Hazards) was also developed as part of the TOMHID project.
MIMIX consists of three main stages:
• Preparation of incident scenarios
• Worker Interviews
• Management Interviews
The worker interviews are used to identify undesired conditions in the plant which might
promote errors and violations affecting the incident. Deficiencies in the management means
and practices used for correcting these undesired conditions are then investigated in the
management interviews.
Activity 4
Advantages
▪ Good basis for more detailed studies
▪ Relatively quick
▪ Early identification of major issues
Disadvantages
▪ Major hazards only - might miss more complex hazards
FIHI provides a comprehensive approach for the entire lifecycle, but may also be expected to
be very time consuming and require experienced personnel.
‘What-if’
The ‘What-if’ technique allows a group of experienced personnel to ask questions and voice concerns about
possible hazards associated with the operation, plant etc.
It can be applied at virtually any stage of the operational life, and may form part of an
optioneering study comparing different approaches.
The team generally starts at the beginning (of the process, procedure, operation etc.) and proceeds
logically through it, questioning throughout and testing whether all possibilities have been
considered.
There is also SWIFT (Structured What-If study) where the leader has very specific issues for
review (based on a checklist), but also flexibility to consider other issues that may come up
during the review. SWIFT may thus be viewed as a sort of pre-HAZOP study, e.g. for when
sufficient detail is not available to perform a full HAZOP.
‘What-if’ method applications
▪ Commonly used to examine proposed changes to an existing facility/unit. Can also be used
during process development or at pre-startup
▪ Can study wide range of facilities and operations
▪ Can also be used for procedural reviews
▪ Input information required includes detailed documentation of the facility/unit, the process, and
the operating procedures
▪ Requires users to be experienced
What-if? example checklist
Storage of Raw Materials, Products, Intermediates
Storage Tanks: Design separation, Inerting, Materials of Construction
Dikes: Capacity, Drainage
Emergency Valves: Remote control hazardous materials
Inspections: Flash arrestors, Relief Devices
Procedures: Contamination prevention, Analysis
Specifications: Chemical, Physical, Quality, Stability
Limitations: Temperature, Time, Quantity
Materials Handling
Pumps: Relief, Reverse rotation, Identification of materials of construction
Ducts: Explosion relief, Fire Protection, Support
Activity 5
Advantages
▪ Simple principle, easy to use, very flexible
▪ Rapid focus on major issues
▪ Can explore non-process issues
▪ Group consensus technique
Disadvantages
▪ Experienced assessors and leaders required to avoid missing hazards
▪ Unstructured
▪ Complete coverage not guaranteed
▪ Difficult to achieve quality control
As it is a more wide ranging technique, What-if can also be used to examine non-process
related issues that a traditional HAZOP would struggle to review.
The results of the study might be less detailed than other studies, and its unstructured nature
may mean that people spend a lot of time on trivial issues.
Constraining the review using a SWIFT approach can help to manage some of
the issues surrounding the unstructured nature of the technique, but this may come at the cost
of some creativity from the team.
The choice of this method will in part come down to balancing the creativity that can result
from its free-form nature against the lack of structure/discipline that this nature allows.
Scenario development/QRA
▪ HAZOP tends to assume that plants are basically well designed and looks for deviations
▪ QRA tends to look for hazardous releases whilst the plant is within its design range, e.g. from:
− Pipework (flanges, valves, rupture etc.)
− Compressors, pumps, agitators
− Vessels and tanks
− Relief devices
− Flares
▪ Need to consider both location and type (e.g. gas, liquid) of release
▪ Role of escalation also needs to be considered
QRA – Quantitative Risk Analysis (nuclear industry tends to use the term PSA – Probabilistic
Safety Analysis).
In addition to the comparisons made above between QRA and HAZOP, the other key difference is
that a HAZOP may identify that a process failure could give rise to a release, but will not
identify the location of such a release. By considering where releases may occur, it is possible
to start identifying hazard scenarios that need to be addressed by, e.g., the emergency
response plans, the protective devices etc.
Outside of a detailed QRA study, this could also be achieved from a review of the line
drawings for a plant, or from consideration of a prompt list (see e.g. Lees 8.17).
To gain a full consideration of the range of potential scenarios that may exist, it will be
necessary to consider the nature of the release (e.g. a spreading liquid pool will require
different response than a gas cloud) and also whether escalation or domino failures may
occur as one scenario initiates another, e.g. flame from one release playing on another
storage vessel.
Safety Audits
Safety audits can be applied at any stage of the design. The walk-down of an existing plant was one of the
first tools employed within the chemical industries in looking for hazards. With the
increasing application of computer aided design, it is now also possible to do ‘fly-throughs’ of
3D models of plants before they are built to identify potential hazards, e.g. equipment clashes,
ergonomic issues, etc.
Walking down an existing plant will allow consideration of how the plant is now actually
operating (which may differ from what is designed) but also will identify hazards that may have
evolved over time, e.g. creeping developments and modifications, corrosion affecting integrity,
presence of adjacent facilities, roads etc.
The general heading ranges from internally led reviews by plant personnel (e.g. hazard
hunts, supervisor inspections etc.) to external reviews led by third parties, such as insurance
audits and periodic company audits. Internal audits have the advantage that they are
performed by personnel knowledgeable about how the plant actually works, but may suffer
from a reluctance to challenge the status quo, or from accepting how things are.
It is an easily scalable technique, allowing reviews of a single activity (e.g. are the Permit to Work
controls being applied and are they adequate?), walkdowns of a particular process unit, or
whole plant management system audits.
By involving personnel in both internal and external audits and reviews, it allows for
knowledge transfer to occur, both in expected standards but also in experience from other
locations, activities etc.
Safety Audits: example areas
− Machine guarding
Advantages
▪ Can be easily tailored for purpose
▪ Can be quick to perform
▪ Can use inexperienced personnel to repeat parts
Disadvantages
▪ Can be time consuming
▪ Heavily dependent on experience (and independence) of personnel
Safety audits are easy to tailor for a specific purpose, and many pre-populated checklists exist
to aid this. They can be a relatively quick and easy way of reviewing an existing plant (noting
that they can also be used on plant designs) and once completed, can be repeated by
inexperienced personnel to check for changes.
However, depending on the scope, size, etc. of the audit, it can be very time consuming to
gain a complete picture of a facility, especially where it is large. Time-sensitive
operations also need to be considered, e.g. operations that are only performed occasionally
and therefore may not be under way when the audit is conducted.
Experienced personnel are generally necessary to ensure a good overview is maintained and
if they are independent then that may allow for greater freedom in identifying (and correcting)
issues that may exist, though the downside is that they may be less familiar with, e.g., the
operation being conducted.
Fault Trees and Event Trees
Fault Tree
▪ Graphical development of the contributing causes to an event – what might cause a hazard
Event Tree
▪ Development of outcome sequences from an event – what hazards may occur
Fault trees and event trees are techniques with a long history and are widely used. Probably
their most frequent usage is in deriving frequencies of unwanted occurrences (fault trees) and
of unwanted sequences (event trees), but they also have a role in hazard identification in
allowing understanding of how hazard sequences may arise and also how they may develop.
Both techniques can be performed qualitatively, i.e. to describe the sequence of faults that
may occur, or, with appropriate numerical information, to quantify a particular outcome.
Both techniques are addressed in detail in a separate module and are only briefly described in
the following slides.
Fault Tree Analysis
▪ “Top-down” approach
− Qualitatively for basic causes of events
(Figure: example fault tree with top event ‘Water Supply Fails’)
A fault tree allows the identification of the contributing factors that may result in an undesired
event, but the event itself will need to be identified by other techniques.
A fault tree is built from the top (unwanted event) down, by considering what might occur (or
combine) to get to that point, giving a graphical picture such as shown. The fault tree
structure is developed using a series of logical combination events, the most common of
which are:
AND gate – the event occurs if all the inputs are true (the flat base symbol at the top of the
picture)
OR gate – the event occurs if any of the inputs are true (the curved base symbol on the
second row)
EVENT – a basic event or occurrence (represented by a circle); these may be hardware
faults, human errors or external events.
Fault trees may be solved by hand or, more generally, by computer programs. The solution
process identifies common events that fail multiple areas, e.g. a common electrical supply,
the failure of which will disable multiple protective devices. This makes a fault tree an
ideal way of reviewing systems where there are multiple redundant layers. The solving
process will also identify combinations of basic events (called cut-sets) which, when they
occur together, will result in the unwanted (top) event occurring. Depending on how the fault
tree has been developed, this may be a qualitative description, e.g. this failure and that failure
combine to cause X, or quantitative, e.g. the frequency of X is 0.1.
In the example shown, the unwanted event (water supply fails) will occur if both pump A and
pump B fail. Pump A will fail if there is a mechanical failure, the power fails or the water
supply to the pump fails, and so on.
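As a rough illustration of the qualitative solving process, the sketch below encodes the water supply example as nested AND/OR gates and lists its minimal cut-sets. The causes for pump A follow the text; the causes for pump B, the event names and the assumption that both pumps share one power supply are illustrative only and are not taken from the slide.

```python
from itertools import product

# Gates are tuples ("AND" | "OR", [children]); basic events are plain strings.
# ASSUMPTIONS: pump B mirrors pump A, and "power fails" is one shared supply.
PUMP_A = ("OR", ["pump A mechanical failure", "power fails", "water feed to pump A fails"])
PUMP_B = ("OR", ["pump B mechanical failure", "power fails", "water feed to pump B fails"])
TOP = ("AND", [PUMP_A, PUMP_B])  # water supply fails only if both pumps fail


def cut_sets(node):
    """Return every set of basic events that is sufficient to cause `node`."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any one child's cut-set is enough on its own.
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":
        # One cut-set from every child must occur together.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Drop any cut-set that strictly contains another cut-set."""
    unique = set(cut_sets(node))
    minimal = [cs for cs in unique if not any(other < cs for other in unique)]
    return sorted(minimal, key=lambda cs: (len(cs), sorted(cs)))


if __name__ == "__main__":
    for cs in minimal_cut_sets(TOP):
        print(sorted(cs))
```

Under these assumptions the shared power supply appears as a single-event cut-set, which is the kind of common event the solving process is meant to expose; the remaining cut-sets each pair one pump A failure with one pump B failure.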
Event Tree Analysis
▪ “Bottom-up” approach
▪ Time sequence of how event develops
▪ Mostly used with binary branches
− Yes/No
− Success/Failure
▪ Can be multiple outcomes
▪ For each sequence, event trees are constructed to model event development to outcomes
(e.g. jet fire, pool fire, explosion, etc.)
Example event tree (branch probabilities in brackets)
Column headings:
A Large LPG release
B Immediate Ignition
C Delayed Ignition
D Explosion not Flash Fire
E Jet Flame Impinges on tank
Branches and outcomes:
B Yes (0.1)
    E Yes (0.2): BLEVE
    E No (0.8): Local Fire
B No (0.9)
    C Yes (0.9)
        D Yes (0.5): VCE
        D No (0.5): Flash Fire
    C No (0.1): Dispersion
An event tree may be considered as a bottom up approach in that it starts with an event and
then works through a series of stages (generally time-based) to identify possible results from
that original event.
In the example shown, following a gas release one of the first questions is ‘does it
immediately catch fire – yes/no?’. If it doesn’t immediately ignite then the possibility exists
that the gas may accumulate and find an ignition source later; which is the second question
shown – note that this 2nd question is not asked on the upper ‘Yes’ branch, because there the
gas is considered to have immediately ignited and hence the question about later ignition is
redundant.
The example shown thus contains information about numerous potential outcomes, e.g. the
upper branch is for a situation where the gas release immediately ignites (causing a jet fire)
which may or may not play on an adjacent vessel; if it does, then we may have a Boiling Liquid
Expanding Vapour Explosion (BLEVE), if it doesn’t then we still have a jet fire (from the
immediate ignition) which may result in a local fire in the plant.
On the lower branch, if there is no immediate ignition and also no later (delayed) ignition, then
the gas remains unignited and disperses (the bottom most branch). If it does find a delayed
ignition source, then, depending on the degree of confinement, this may result in a flash (or
cloud) fire or an explosion.
This simple example could be further expanded to show questions about detection, fire
fighting, evacuation etc.
Each node (i.e. each question) must resolve one way or another, i.e. the probabilities of the
branches leaving a node must sum to 1. In general, event trees are constructed using Yes/No
nodes (something happens or it doesn’t), but it is possible to construct complex nodes where
there are multiple possible outcomes (e.g. which way does the smoke travel? North, East,
South or West, each with a 25% chance of occurrence).
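The arithmetic behind the example event tree can be sketched as follows. The branch probabilities are those shown on the slide; the data layout and function name are assumptions made for illustration. Each outcome probability is the product of the branch probabilities along its path, and the code rejects any node whose branches do not sum to 1.

```python
import math

# Each node holds a question and its branches: label -> (probability, next node or outcome).
EVENT_TREE = {
    "question": "Immediate ignition?",
    "branches": {
        "yes": (0.1, {
            "question": "Jet flame impinges on tank?",
            "branches": {"yes": (0.2, "BLEVE"), "no": (0.8, "Local fire")},
        }),
        "no": (0.9, {
            "question": "Delayed ignition?",
            "branches": {
                "yes": (0.9, {
                    "question": "Explosion rather than flash fire?",
                    "branches": {"yes": (0.5, "VCE"), "no": (0.5, "Flash fire")},
                }),
                "no": (0.1, "Dispersion"),
            },
        }),
    },
}


def outcome_probabilities(node, path_probability=1.0, results=None):
    """Multiply branch probabilities along each path and total them per outcome."""
    results = {} if results is None else results
    if isinstance(node, str):  # reached an outcome
        results[node] = results.get(node, 0.0) + path_probability
        return results
    branches = node["branches"]
    total = sum(probability for probability, _ in branches.values())
    if not math.isclose(total, 1.0):
        raise ValueError(f"branches of '{node['question']}' sum to {total}, not 1")
    for probability, child in branches.values():
        outcome_probabilities(child, path_probability * probability, results)
    return results


if __name__ == "__main__":
    for outcome, probability in outcome_probabilities(EVENT_TREE).items():
        print(f"{outcome}: {probability:.3f}")
```

These are conditional probabilities given that the release has occurred (for instance, BLEVE is 0.1 × 0.2 = 0.02 and VCE is 0.9 × 0.9 × 0.5 = 0.405); multiplying each by a release frequency would give outcome frequencies. A four-way node such as the smoke direction example fits the same structure, with four branches of 0.25 each.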
Activity 6
Advantages
▪ Can be used to give numerical results
▪ Logical representation of causes and sequences
▪ Wide application e.g. hardware, software, human, process failures
Disadvantages
▪ Experienced assessors required
▪ Can be time consuming and expensive
▪ Easy to miss fault tree top events
Maintenance Analyses
▪ Maintenance and Operability (MOp)
Team-based question approach
• Can the equipment be isolated and/or drained for maintenance?
• Are there plans to deal with mechanical failure?
• Are there plans for critical spares to be available?
▪ Maintenance Analysis
Question-based approach for each equipment item, e.g.
• What failures can occur?
• How would faults be identified?
• What preparations are required for repair?
• How can failed items be removed, repaired, replaced?
▪ Model reviews
• e.g. with respect to access, maintainability, exposure etc.
• Can the maintenance introduce hazards post maintenance (e.g. valve left in wrong line up, safety
system isolated)?
MOp is intended as an early design stage process and is similar in approach to HAZOP in that
the plant/system/P&ID etc. is broken down into sections and questions are asked about each.
It is intended to identify the hazards associated with the maintenance of plant items, rather
than the intrinsic hazards of the plant items themselves.
It will, however, require experienced personnel with a good understanding of possible
maintenance approaches etc. Considering how many items of equipment may require
maintenance, this may be quite a time-consuming exercise.
Maintenance Analysis is primarily intended to identify equipment availability, but can also
cover hazards associated with performing the maintenance. For each equipment item, a series
of questions is asked, such as those shown on the slide. As with MOp, this gives a systematic
review of maintenance activities but suffers from the same drawbacks.
Reviews during the design phase can also be used to provide information about potential
hazards to the maintenance crew. These may be paper-based, scale models or, increasingly,
Computer Aided Design (CAD) giving 3D views.
Sneak Analysis
▪ A Sneak is a design condition (possibly in conjunction with a single-point failure) which gives rise
to an unintended event or which inhibits an intended event
▪ Originated in aerospace industries based on Sneak Circuit Analysis used for electronic circuits
▪ Two principal methods
− Path or Tree analysis: to identify possible sneak flows
− Check (‘clue’) lists: for other sneak paths
▪ May also be used to augment HAZOPs
As noted in Lees, in contrast to general methods such as HAZOP and FMEA, Sneak analysis
is a niche method. It was originally developed from Sneak Circuit Analysis – a method of
identifying design errors in electronic circuits.
The method has identified six basic types of sneaks, which are shown on the next slide.
The original sneak method (referred to in Lees as ‘topological’) was based on decomposing
electrical circuits into standard sub-networks, but this proved difficult to translate to a process
setting.
Path analysis was developed by JR Taylor and is based on decomposing a P&ID into
functionally independent sections and then identifying sources and targets, source-target pairs
and paths between sources and targets. Typically this might be done by adding coloured
lines to a P&ID to trace various paths.
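The path-tracing step can be pictured as a search over a connection graph. The sketch below is only an illustration of the idea, not Taylor's published procedure: the small drain-header layout, the tag names and the list of intended source-target pairs are all hypothetical. Any path found between a source and a target that is not an intended pair is reported as a candidate sneak flow for the team to review.

```python
# Hypothetical P&ID section. Connections are treated as undirected because
# flow can go either way depending on the pressures at each end.
CONNECTIONS = [
    ("V1", "D1"),        # vessel V1 to its drain valve
    ("D1", "header"),    # drain valve to common drain header
    ("V2", "D2"),        # vessel V2 to its drain valve
    ("D2", "header"),
    ("header", "sump"),  # header to drain sump
]
SOURCES = ["V1", "V2"]
TARGETS = ["V1", "V2", "sump"]
INTENDED = {("V1", "sump"), ("V2", "sump")}  # the design-intent flows


def build_graph(connections):
    """Turn the connection list into an adjacency mapping."""
    graph = {}
    for a, b in connections:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def simple_paths(graph, start, goal, visited=()):
    """Yield every path from start to goal that visits no item twice."""
    path = visited + (start,)
    if start == goal:
        yield path
        return
    for neighbour in sorted(graph.get(start, ())):
        if neighbour not in path:
            yield from simple_paths(graph, neighbour, goal, path)


def sneak_candidates(connections, sources, targets, intended):
    """List paths between source-target pairs that are not design intent."""
    graph = build_graph(connections)
    found = []
    for source in sources:
        for target in targets:
            if source == target or (source, target) in intended:
                continue
            found.extend(simple_paths(graph, source, target))
    return found


if __name__ == "__main__":
    for path in sneak_candidates(CONNECTIONS, SOURCES, TARGETS, INTENDED):
        print(" -> ".join(path))
```

For this layout the only candidates are the two vessel-to-vessel paths through the common header, which is the computational equivalent of tracing a coloured line between the vessels on the P&ID. Whether a real flow is possible still depends on valve positions and relative pressures, which the team has to judge.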
The other method, ‘clue’, is based on structured checklists (‘sneak clue lists’) to aid
identification by specifying possible causes to see if they can occur.
Lees (Chapter 8.14) also gives a reference for performing sneak analysis to augment a
HAZOP by using the sneak clues against the P&IDs to check for potential sneaks.
Sneak Analysis: basic error paths
▪ Sneak flow – unintended flow from a source to a target, e.g. two vessels at a different pressure
on a common drain header
▪ Sneak indication – incorrect or ambiguous indication, e.g. showing ‘signal to’ rather than ‘valve
position’ (Three Mile Island)
▪ Sneak label – incorrect or ambiguous labelling of indicators, chemicals or equipment
▪ Sneak energy – unintended presence or absence of energy, e.g. from untreated materials in a
process caused by layering or agitation failure
▪ Sneak reaction – unintended reaction e.g. from unanticipated changes in process conditions,
presence of catalysts
▪ Sneak procedure or sequence – occurrence of events in an unintended or conflicting order
Sneak Analysis
Advantages
▪ Systematic evaluation with well defined methodology available
Disadvantages
▪ Time consuming and expensive especially for complex plant where there are many plant items
▪ Requires experienced personnel to perform the review
Control systems - faults
In his book Computer Control and Human Error, Trevor Kletz discusses seven basic types of
incident that may affect computer systems, but these could equally well be applied to control
systems in general, e.g. including Programmable Electronic Systems (PES), Distributed
Control Systems (DCS), Programmable Logic Controllers (PLC), etc.
Equipment faults: a design review of a process plant will pick up the hardware, e.g. pump,
valve etc. faults, but will it necessarily consider the control systems? How many times have
we been told that something isn’t working because the computer isn’t working? Can we really
develop hardware without considering the interaction with the (controlling) software?
Software faults: as Kletz points out, hardware faults tend to be random; they can occur at any
point during the design lifecycle. Software faults, however, are systematic: they will occur
whenever a set of conditions is satisfied. The problem is that those conditions may be very
infrequent, or unforeseen.
Black box: people will generally interpret instructions in the context of the situation in which
they are given (hence the old joke about “When I nod my head, you hit it”). Computers will do
exactly what we tell them to do. Has the software designer fully understood the hardware
designer’s intent, and were all eventualities considered?
Operator response: in the same way that we consider human factors when designing
equipment, should we not also be considering the way in which operators interact with
computers and control systems?
Data Entry Errors: entering the wrong amount, entering in the wrong units, etc.
Change control: uncontrolled changes to control systems can be as dangerous as
uncontrolled changes to plant and equipment.
Interference: bypassing limits, systems, etc., to allow the job to be done faster, but also
maintenance/testing work accessing systems, and external malicious intent.
Control systems: methods
▪ CHAZOP
▪ Structured methods
− Structured English
− Specification language
− Structured analysis and design techniques
▪ State Transition Diagrams
▪ Petri-nets
▪ GRAFCET
Many of the methods discussed elsewhere within this lecture could easily be applied to the
hazard identification for control systems, for example the human interaction methods, and
also What If or FMECA for looking at control system hardware failures. Gould (HSL/2005/58)
lists a number of possible techniques, which we’ll look at briefly. The first, CHAZOP
(Computer or Control HAZOP), we’ll cover in more detail in lesson 5.
Structured methods are generally an early stage process used to try and identify appropriate
software structures.
Structured English: turning the often vague design intent into definite functional statements
using a restricted set of verbs and nouns.
Specification language: similar to the above, but starts with constructing a requirements net
diagram to show flows (data, actions etc.) through the control system, which is then
developed further using a restricted set of verbs and nouns.
Structured Analysis and Design Technique (SADT): another graphical visualisation used to
describe how a control system operates. Two basic types of diagram are used: Activity
Diagrams, with nodes for activities connected by arcs for data flow, and Data Diagrams, with
data as nodes connected by arcs for control activities.
State Transition Diagrams: represent the operation of a PES by control loop diagrams;
easily understandable, but can get complex quickly.
Petri-nets: another graphical methodology using bubbles (places) and arcs (transitions that
occur) to represent a computer system.
GRAFCET: Graphe Fonctionnel de Commande des Étapes et Transitions was developed in
France in the 1970s as a graphical method of specifying control sequences, and is mostly
tailored to batch processes as it shows the sequence of operations (e.g. opening and closing
valves) that is required.
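To show how a state transition description can support hazard identification, the sketch below writes a small batch sequence as a transition table and then lists every state/event combination the table does not cover. The states, events and transitions are entirely hypothetical; the point is only that gaps in the table become prompts for review (what should the system do if this event arrives in this state?).

```python
# Hypothetical batch sequence for illustration only.
STATES = ["idle", "filling", "heating", "draining"]
EVENTS = ["start", "level_high", "temp_reached", "level_low", "trip"]

# (current state, event) -> next state
TRANSITIONS = {
    ("idle", "start"): "filling",
    ("filling", "level_high"): "heating",
    ("heating", "temp_reached"): "draining",
    ("draining", "level_low"): "idle",
    ("filling", "trip"): "idle",
    ("heating", "trip"): "draining",
}


def next_state(state, event):
    """Return the next state; events with no defined transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="idle"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


def unspecified_pairs():
    """List state/event combinations that the table says nothing about."""
    return [(s, e) for s in STATES for e in EVENTS if (s, e) not in TRANSITIONS]


if __name__ == "__main__":
    print(run(["start", "level_high", "temp_reached", "level_low"]))
    for state, event in unspecified_pairs():
        print(f"Review: what should happen on '{event}' while '{state}'?")
```

Even this four-state example leaves 14 of the 20 combinations undefined, which gives a feel for how quickly such diagrams become complex.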
Human error analysis
Humans are widely implicated in many incidents, and it thus follows that the more we can
understand and identify the possible ways in which a person may make a mistake, the better
our chances of stopping the error from occurring and a hazard being released.
There are multiple tools available to assist in identifying human errors and greater detail is
contained in the Human Factors module.
In addition to the specific methods that have been identified, other hazard identification
methods may well also bring human issues out, particularly in discussion about why
something might occur. Tripod Delta is a questionnaire-based, audit-type tool designed to
identify weaknesses in Basic Risk Factors such as Procedures, Communication, Training, etc.
The basic types of human error are listed on the next slide.
Human Error
Unintended actions: skill-based (automatic)
▪ Lapses (failure to carry out an action)
− Memory failure/loss of attention
− Change in nature/environment of task
− Action performed out of sequence/step in sequence is missed
▪ Slips (intend to carry out correct action but do it incorrectly)
− Familiar tasks carried out automatically without thinking
− Distraction, preoccupation, attention failure
▪ Prevented / reduced e.g. by colour coding, checklists, interlocks
Intended actions (conscious)
▪ Mistakes / errors of judgement
− Rule based: rules exist but person applies incorrect rule (failure to recognise correct
application or rule deficiencies)
− Knowledge based: no rules available (organisational deficiency) so rely on
▪ Prevented / reduced by providing training, comprehensive procedures & equipment design
Skill-based human errors occur when people are carrying out automatic, repetitive routines
requiring little conscious attention. The errors are unintended. They can be foreseen and
measures can be taken to prevent them or reduce their likelihood, e.g. colour coding, a
checklist, an interlock, etc.
Lapses occur when a sequence is interrupted or something unusual happens – a truck driver
has completed unloading his delivery of petrol and is about to disconnect his hose when he is
called away to answer the telephone. On returning to the truck, he drives away, forgetting to
disconnect the hose.
Slips occur when the mind wanders or the person becomes distracted during the repetitive
sequence or act, and their action is not as planned (a deviation from intent). Examples
include someone counting a stack of bricks who, preoccupied by an earlier row with their
partner, has to keep re-starting the count; an operator who flicks the wrong switch on a
control panel because they are thinking about what they have to do next; or hitting “send”
on a draft email you meant to “delete”.
Mistakes arise due to more complicated failures of thought processes and the person believes
that their action is right, i.e. the action is intended but the intent is incorrect. Mistakes involve
deficiencies or failures in the judgement process. The person may choose to ignore
something, believing it must be a hardware fault or is not relevant to the decision. Mistakes
can be prevented by providing a comprehensive set of procedures and ensuring equipment
design minimises the potential for error.
For rule-based mistakes, the person has a set of rules for a certain situation and applies the
wrong rule. For example, a control room operator has procedures telling them to perform a
controlled shutdown on seeing alarm A and to conduct an emergency shutdown and blowdown
on alarm B, but they incorrectly initiate an emergency shutdown and blowdown in the event of
alarm A.
For knowledge-based mistakes, the person is faced with an unfamiliar situation for which they
have no rules, so they use their knowledge and work from first principles, but come to the
wrong conclusion. For example, if the cooling pump overheat light comes on and there are no
defined rules about what to do, do you turn the pump off, leave it running or turn off the entire
unit? The person may assume that the correct action is to turn off the pump when this may
cause more problems (loss of cooling).
Violations are the deliberate disregard of rules and can be at individual, group or company
level. They can become normal behaviour in certain environments and we need to
understand why they occur. They are probably more important than mistakes or errors but
can be anticipated or foreseen and avoided by providing training, establishing simple practical
rules, routine monitoring and supervision. Examples include cutting corners to save time
because of a belief that rules are too restrictive and are not enforced, e.g. operating a
circular saw with the guard removed, or the Chernobyl accident.
Taking a simple example of driving your car, we have the following types of human error:
Having filled the car, you return the nozzle to the pump then your mobile rings. You answer it
(violation), then walk to the kiosk and pay before driving off, forgetting to replace the car’s
petrol filler cap (lapse).
Or, having had a row with your passenger, you are preoccupied when you arrive at the petrol
pump and fill up with diesel instead of unleaded (slip).
You believe the speed limit in this area is 40mph so you are proceeding at 38mph and are
caught by a speed camera as the limit is actually 30mph (rule-based mistake). Or, you know
that the speed limit is 30mph, but you deliberately drive at 40mph (violation).
While driving overseas you reach a crossroads with traffic lights, which are flashing amber –
you assume this means you can proceed with caution when in fact it means you should stop
(knowledge-based mistake).
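The distinctions drawn above can be summarised as a short decision sequence. The sketch below is a simplification for illustration, not a formal taxonomy: the field names are hypothetical, and it assumes an error has already occurred and only asks which category it falls into.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    """Hypothetical description of an erroneous act (field names are assumptions)."""
    action_as_intended: bool            # did the person do what they meant to do?
    step_omitted: bool = False          # was a step forgotten or missed out?
    knew_rule_was_broken: bool = False  # was a rule deliberately disregarded?
    rule_available: bool = True         # did a rule exist for the situation?


def classify(act):
    """Classify an error using the lesson's categories."""
    if not act.action_as_intended:
        # Unintended, skill-based errors.
        return "lapse" if act.step_omitted else "slip"
    if act.knew_rule_was_broken:
        return "violation"
    # Intended action, but the intent itself was wrong.
    return "rule-based mistake" if act.rule_available else "knowledge-based mistake"


if __name__ == "__main__":
    examples = {
        "forgot to replace the filler cap": Act(False, step_omitted=True),
        "filled up with diesel instead of unleaded": Act(False),
        "drove at 38mph believing the limit was 40mph": Act(True),
        "knowingly drove at 40mph in a 30mph limit": Act(True, knew_rule_was_broken=True),
        "misread unfamiliar flashing amber lights": Act(True, rule_available=False),
    }
    for description, act in examples.items():
        print(f"{description}: {classify(act)}")
```

The order of the questions matters: whether the action was as intended is asked first, because lapses and slips are unintended regardless of what rules exist.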
Task analysis
▪ Breaking down task into component parts
− Goal: the required outcome
− Operation: the stages required to implement the goal
− Plans: methods used and surrounding conditions
▪ Typical questions
− What actions are performed?
− How do personnel respond to different cues in the environment?
− What errors might be made?
− How might errors be recovered or deviations controlled?
− How do personnel plan their actions?
▪ Writing a procedure constitutes an informal type of hazard analysis
▪ Hierarchical Task Analysis is similar, breaking down complex tasks into simpler sub-tasks,
producing a tree structure
Action Error Analysis
▪ Typically takes output from task analysis to identify hazards that may result from errors
▪ Error categories
− Unconscious slips
− Mistakes in planning or understanding
− Conscious violation of operating procedure
− Sabotage
▪ Example guidewords
TOO EARLY TOO LATE TOO MUCH
TOO LITTLE TOO LONG TOO SHORT
WRONG DIRECTION ON WRONG OBJECT WRONG ACTION
▪ Generally restricted to single initial errors and to equipment which is physically or psychologically
close to the correct object
Action Error Analysis takes each step analysed in a task and seeks to identify errors which
may be committed and what their effects on the process might be. As may be imagined, for
single actions this may be relatively straightforward, but for multiple incorrect actions, it can be
very complex to identify what hazards may result.
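A minimal sketch of how the guidewords might be applied is given below: each task step is paired with each guideword to produce a blank worksheet row for the team to complete. The guidewords are those on the slide; the tanker unloading steps and the worksheet layout are hypothetical. The final line counts the possible pairs of errors, which gives a feel for why analysing multiple errors becomes complex.

```python
import math
from itertools import product

GUIDEWORDS = [
    "TOO EARLY", "TOO LATE", "TOO MUCH",
    "TOO LITTLE", "TOO LONG", "TOO SHORT",
    "WRONG DIRECTION", "ON WRONG OBJECT", "WRONG ACTION",
]

# Hypothetical task steps, as might come out of a task analysis.
STEPS = [
    "Connect unloading hose",
    "Open tanker outlet valve",
    "Close tanker outlet valve",
    "Disconnect unloading hose",
]


def worksheet(steps, guidewords):
    """Build one blank row per (step, guideword) pair for the team to fill in."""
    return [
        {"step": number, "action": action, "guideword": guideword, "effect": ""}
        for (number, action), guideword in product(enumerate(steps, start=1), guidewords)
    ]


if __name__ == "__main__":
    rows = worksheet(STEPS, GUIDEWORDS)
    print(f"{len(rows)} single-error prompts")
    print(f"{math.comb(len(rows), 2)} possible pairs of errors")
    for row in rows[:3]:
        print(f"Step {row['step']}: {row['action']} / {row['guideword']}")
```

Not every pairing will be meaningful (a hose cannot sensibly be connected "too much"), so in practice the team would discard prompts that do not apply.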
Human error analysis (in general)
Advantages
▪ Allows complex tasks to be analysed in detail by splitting into simpler ones
Disadvantages
▪ Only applicable to human interaction with a process
▪ Time consuming and expensive: a large number of tasks are required for complex processes
▪ Difficult to identify effects of multiple errors
Key learning points
Context is everything. Consider what questions you want answered, how much information is
available, what stage the design is at, etc.; the answers will help to point towards methods
that may be appropriate.