SYMPOSIUM SERIES NO. 151 © 2006 IChemE
A HUMAN FACTORS APPROACH TO THE OPTIMISATION
OF STAFFING IN THE PROCESS INDUSTRY
Joanne Stokes1, Karl Rich2, and Tony Foord3
1 Senior Human Factors Consultant, Human Engineering Limited, Shore House,
68 Westbury Hill, Westbury-on-Trym, Bristol BS9 3AA
2 Associate Director, Human Engineering Limited, Shore House, 68 Westbury Hill,
Westbury-on-Trym, Bristol BS9 3AA
3 Principal Engineer, 4-sight Consulting, Southern Office, 51 Cowper Road, Harpenden,
AL5 5NJ
Staffing arrangements for process plant have been studied extensively in recent years
taking account of the many interdependent variables influencing system performance
(individual competence, equipment, teamwork and communications, procedures and
organisation). This paper describes work to determine the optimum staffing using
human engineering tools such as task analysis, function allocation, workload analysis,
and human reliability analysis.
INTRODUCTION
For several decades, staff rationalisation has been driven by a wide range of factors.
Advances in technology, organisation, education and training have enabled significant
increases in productivity. As a result of best practice, major incidents and regulatory
pressure, facility operators have also striven to reduce exposure to major hazards.
This has left the industry with highly automated plant and processes that in many
instances have been over-alarmed, and a residual workforce obliged to multi-task and
take ownership of secondary and tertiary activities. In some instances this has led to
exposure to high workload, fragmented jobs and risk that has been displaced rather than
eliminated.
Some companies are responding to this by adopting a user-centred, risk-based
approach to staffing. This requires the participative rationalisation of processes and their
associated alarms and procedures and a better allocation of function between operators
and equipment. This can lead to an improvement in, and better prioritisation of, task
allocation.
This can be achieved by developing a better understanding of the target audience
and utilising this knowledge through task analysis and allocation of function. The baseline
data can then be subjected to workload and human reliability analysis (to a level
appropriate for the perceived risk). This allows the definition of roles, and appropriate and
coherent job design. From this, training which is targeted, cost-effective and matches the
requirement can be developed.
The Health and Safety Executive Contract Research Report "Assessing the safety
of staffing arrangements for process operations in the chemical and allied industries"
[HSE CRR 348, 2001] and the Best Practice Guide [Energy Institute, 2004] provide
techniques for checking whether a particular staffing arrangement is sufficient to meet the
requirements for safety. The report [HSE CRR 348, 2001] states that "It is not designed to
calculate the minimum or optimum number of staff", and says of other human factors
techniques for assessing staffing:
"it is concluded that many of the techniques are research tools, requiring
specialist skills to interpret even though they may be straightforward to apply. A
method tailored to assessing staffing arrangements, and designed for general use, has
not been produced."
This paper describes an approach that still requires the use of human factors
specialists, but is suitable for general use to determine the optimum workload and the
optimum staffing.
BACKGROUND
For several decades, improvements in technology, organisation, education and training
have enabled significant increases in productivity, often described as reductions in
workload, staff rationalisation or reduction in staffing. For example, before 1995 a gas
processing unit had 40 staff, mainly on shifts. Following a review this was reduced to 25
staff in 1996. Subsequent experience meant that the numbers were increased by one or two,
but the number of staff is still well below 30. Another unit had a large design team in a
building close to a compressor processing large volumes of natural gas at high pressure.
The design team was moved off-site and the number of people exposed to the hazards from
the compressor house was reduced to two staff making brief inspections once a shift.
Examples of the improvements that have enabled this type of reduction are:
. Improved materials that last longer reducing maintenance
. Evolution of process technology, for example small hydro-cyclones replacing
cumbersome separators
. High speed telecoms enabling remote operation
. Increased scale of facilities producing more output per operator
. Better education and training resulting in multi-skilled operators
. More automation reducing manual tasks
Initiatives like these have been adopted by industry for a number of reasons such as
increasing safety and production, minimising negative environmental effects, and
reducing the number of staff exposed to major hazards.
However, the impact of these changes on the remaining workforce is not always
well defined or understood. Problems can arise during and after implementation of the
changes; for example, existing staff can become overworked or may not be competent
to undertake their new roles. This could result in staff making errors, cutting corners, or
taking unacceptable risks. Since the introduction of the Control Of Major Accident Hazards
(COMAH) Regulations [HSE web-site, 2005] in 1999, the understanding of human factors
within industry has grown, but it is not yet well defined or consistently applied.
Particular concerns with reduced staffing are:
. That staff may be able to manage normal operation, but the number may not be
sufficient for abnormal or emergency operation; and
. Not all changes reduce workload.
For example, many processes have been upgraded by replacing the control systems with a
modern system. Unfortunately the ease with which alarms may be configured on modern
computer control systems has resulted in many operators being presented with alarms at an
unreasonably high frequency from the improved system. Other systems have been
designed using telecoms to enable remote operation, even when the facility is off-shore.
Operators and maintenance technicians then find the improved design requires daily
visits to the unmanned site involving significant travel by car or even helicopter. Not
only is the workload increased but the risks to staff are significantly increased because
of all the additional travel.
The issues with alarms are already well documented by the Engineering Equipment
and Materials Users Association [EEMUA, 1999]. The issues with assessing whether staff
may be able to manage abnormal or emergency operation are much more difficult. The
workload is not fixed and depends upon how the staff are educated, trained and organised.
Thus the key questions are: what is the optimum automation, workload and organisation,
and what is the optimum staffing to handle it?
OUTLINE OF USER-CENTRED APPROACH
This paper describes a user-centred approach which tackles these key questions. The
approach uses established human factors methods, which when integrated into the
design of a new system, or into a change management plan, can ensure that the workload
and staffing requirements are fully understood and optimised for that system. This helps to
reduce the risk of staff-related incidents and to contribute to safer systems. Specifically,
this approach can optimise the staffing by:
. Appropriately distributing responsibilities across roles;
. Predicting and managing the workload experienced by each role; and
. Identifying and controlling or mitigating the risks associated with the change.
The output from this assessment can also be used to define:
. Job role specifications;
. The competencies, skills, and knowledge required to perform these duties;
. A suitable organisational structure providing adequate supervision and support;
. Communication and user requirements;
. Training and continued performance requirements;
. Ergonomic designs and layout for equipment; and
. A change management plan.
This human factors approach can also be used to provide a key input into the safety case
for COMAH [HSE web-site, 2005].
Figure 1 presents the overall human factors approach discussed in this paper.
UNDERSTANDING THE TASK REQUIREMENTS
Understanding how the system operates is the first step in defining the optimal number
of staff in any system; that is, identifying every activity that is required to operate and
maintain that system in all conditions (for example, normal, abnormal and upset). This
provides an overall picture of the task requirements for that system. The process by which
this can be achieved is called task analysis.
TASK ANALYSIS
Task analysis takes the existing (and, where appropriate, predicted) tasks and produces a
model of the activities necessary to operate and maintain the system. This model can take a
variety of forms, for example hierarchical, cognitive, or tabular task analyses. An
extract of a theoretical tabular task analysis for a generic process plant is presented in
Figure 2.
The task analysis forms the baseline data upon which to allocate roles and
responsibilities. The generation of this data should therefore be discussed and validated by key
stakeholders (including current staff) to ensure that it represents an accurate picture of the
activities required for that system.
As Figure 2 illustrates, the data collected does not have to be merely lists of tasks,
but can also answer a large range of questions, including:
. What initiated the task?
. Who is responsible for the task?
. What information is required by staff to complete the task?
. What system is used to achieve the task?
. How is it achieved?
. What is the output?
. What verbal communication is required?
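The questions above map naturally onto the columns of a tabular task analysis. The following is a minimal sketch, in Python, of how one row might be recorded and how rows with no recorded communication could be flagged for discussion with stakeholders. The field names, the helper function and the two example rows are assumptions made for illustration; they are not taken from Figure 2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskRecord:
    """One row of a tabular task analysis (field names are illustrative)."""
    task: str
    initiator: str                 # what initiated the task
    responsible_role: str          # who is responsible for the task
    information_required: List[str]
    system_used: str
    method: str                    # how the task is achieved
    output: str
    verbal_communication: List[str] = field(default_factory=list)


def tasks_without_communication(records: List[TaskRecord]) -> List[str]:
    """List tasks with no recorded verbal communication, for stakeholder review."""
    return [r.task for r in records if not r.verbal_communication]


# Hypothetical rows; a real analysis would be populated with current staff.
records = [
    TaskRecord(
        task="Weekly test of remote isolation valve",
        initiator="Maintenance schedule",
        responsible_role="Operator",
        information_required=["Valve location", "Test procedure"],
        system_used="Remote emergency isolation equipment",
        method="Operate valve from remote station and confirm closure",
        output="Test result logged",
    ),
    TaskRecord(
        task="Respond to compressor trip alarm",
        initiator="High priority DCS alarm",
        responsible_role="Control room operator",
        information_required=["Alarm list", "Trend data"],
        system_used="DCS",
        method="Diagnose cause and instruct field operator",
        output="Plant returned to a safe state",
        verbal_communication=["Radio call to field operator"],
    ),
]

unreviewed = tasks_without_communication(records)
print(unreviewed)
```

A task that appears in this list is not necessarily a problem; it is simply a prompt to ask whether anyone else needs to know that the task exists or has been done.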
This user-centred approach ensures that the task analysis represents those activities
that actually take place, rather than merely summarising operating or maintenance
procedures for how it should be done. This approach is beneficial as it incorporates
on-the-job or tacit knowledge into the task analysis, minimising the potential for this
knowledge to be lost over time or when staff change jobs. This approach also facilitates
identification of possible gaps or issues with current working practices. For instance, in
the following example:
Two foremen unnecessarily placed themselves at risk by entering a cloud of
propylene vapour in an attempt to isolate a leak. Fortunately the vapour did
not ignite. Upon complaining that he should not have been expected to
place himself under such risk, he recalled that remotely located emergency
equipment had been put in 8 years earlier, after a similar incident. Although
the operation of this equipment was carried out once a week by one of the
operators, neither foreman had been in contact with these valves during
this time, and had long since forgotten that they were there.
Figure 1. Human factors methodology
Figure 2. Extract from a generic process plant tabular task analysis
A task analysis of the plant area could have identified:
. Who was responsible for the safety of the plant area;
. Who operated/checked on the safety valve;
. What information was required by staff in the event of this type of incident occurring;
. What communication links existed between staff on site (for example, foreman and
operator);
. What plant safety procedures and/or briefings exist for that plant area;
. What safety procedures and working practices were in place.
The task analysis would have highlighted the missing communications link between the
foremen and the operator. Moreover, asking staff to verbalise and consider their roles
during the development of the task analysis may also have brought these issues to light.
ALLOCATION OF FUNCTION
A critical part of understanding staffing requirements is to identify which of the activities
identified in the task analysis should be assigned to which part of the system. This can be
achieved through allocation of function. Allocation of function is the identification of
activities undertaken by the staff, those that can be performed automatically by the
system, and those that require interaction between the staff and the system. During this
time, any proposed changes to the system (for example, automation, remote support,
reallocation of tasks, etc.) need to be considered.
Allocating tasks to either the staff or the system should be determined by the nature
of that task. For example, the characteristics of humans place certain limitations on the
types of task that they can be expected to perform safely and reliably. The following
represents Fitts' list [Fitts, 1962], a definition of the types of tasks best suited to
humans and/or machines.
Humans may surpass machines in their ability to:
. Detect small amounts of visual or acoustic energy.
. Perceive patterns of light or sound.
. Improvise and use flexible procedures.
. Store very large amounts of information for long periods and to recall relevant facts at
the appropriate time.
. Reason inductively.
. Exercise judgement.
Whereas machines currently surpass humans with regard to the ability to:
. Respond quickly to control signals, and to apply great force smoothly and precisely.
. Perform repetitive, routine tasks.
. Store information briefly and then to erase it completely.
. Reason deductively, including computational ability.
. Handle complex operations; that is, to do many different things at once.
However, as technology has advanced the ability of humans to exceed machines for some
activities has reduced. For example, machines are now able to employ pattern recognition
techniques that can rival and often surpass the human eye.
Therefore, it is important to also consider the context within which the activities are
taking place in order to maximise the abilities of both humans and machines in that
context; rather than relying on a set of predefined and somewhat rigid parameters.
[Sherry and Ritter, 2002] have provided the following guidance for defining appropriate
human/machine allocation:
. Minimise the impact of interruptions on the operator (to reduce the risk of mistakes
being made, especially when performing critical tasks);
. The human should be an active participant rather than a passive monitor (active
operators minimise the risks of reduced vigilance, complacency, loss of skills/
situational awareness);
. Humans have responsibility and must be given control authority (operators
provided with sufficient information/mechanisms are better disposed to safely and
effectively evaluate and control the system);
. The automation should clearly indicate its behaviour and state (clear information
ensures that the operator can maintain awareness and understanding of the current system
status);
. The automation must be capable of inferring the human and environment context
and state (system awareness can facilitate communication, coordination and
development of a shared understanding).
It is also important to consider other sites and industries that use similar systems. For
example, lessons can be learned from their existing system configurations, working
practices and any incident reports/databases.
ESTABLISHING BASELINE DATA
Once tasks have been systematically labelled as human tasks, the responsibilities
associated with those tasks can be logically allocated into the proposed staff roles. For
example, this could be achieved by separating responsibilities according to plant processes
or areas, or levels of responsibility. These staff roles form the baseline data upon which to
demonstrate that the system can feasibly be operated safely and efficiently by the proposed
staffing. As illustrated in Figure 1, this is achieved by assessing the workload placed upon
each proposed role, and by identification, control, and mitigation of operator-related risks
associated with this structure, through workload analysis and through targeted human
reliability analysis.
FEASIBILITY AND PERFORMANCE
WORKLOAD ANALYSIS
Workload Analysis provides assurance that staff are capable of performing all necessary
tasks associated with their job role in normal (including start-up, shut-down and
maintenance), abnormal, and upset conditions, without being subjected to periods of unacceptably
high or low workload. This analysis is beneficial in establishing the appropriate staffing for
the system; it also verifies the appropriate allocation of functions between systems and
operators, and can be used to assess the human-machine interface in terms of the stresses
and demands it places upon the workforce.
The type of workload assessment carried out depends on the project requirements.
For example, a qualitative assessment can be carried out relatively inexpensively and is
particularly beneficial in the early design stages of a system. Its results can also be
compared with user assessments carried out when testing mock-ups of the final system. A
qualitative assessment may be carried out by a human factors specialist, supported by the
stakeholders, to assess the predicted operating conditions of a system and identify
periods of high, medium and low workload.
A quantitative analysis can, however, provide a more in-depth analysis. For
example, a quantitative assessment requires the specialist to explore operations at the
task level, identifying the type of task an operator carries out (visual, auditory, cognitive
or psychomotor). The specialist then interrogates a number of developed scenarios to
determine whether the operator would be required to perform two competing tasks
(that is, respond to two visual stimuli, or perform two cognitive tasks) simultaneously.
The success of a quantitative workload assessment depends upon the level of detail
held within the task analysis.
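The check for competing tasks described above can be sketched as a search for tasks that overlap in time and draw on the same channel. This is a simplified illustration, not a formal workload method: the TimedTask structure, the timings and the scenario are hypothetical, and a real assessment would take its task durations from the task analysis.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class TimedTask:
    name: str
    channel: str   # "visual", "auditory", "cognitive" or "psychomotor"
    start: float   # seconds from the start of the scenario
    end: float


def competing_pairs(tasks: List[TimedTask]) -> List[Tuple[str, str]]:
    """Return pairs of tasks that overlap in time and use the same channel."""
    conflicts = []
    for a, b in combinations(tasks, 2):
        overlaps = a.start < b.end and b.start < a.end
        if overlaps and a.channel == b.channel:
            conflicts.append((a.name, b.name))
    return conflicts


# Hypothetical scenario for a control room operator.
scenario = [
    TimedTask("Read compressor trip alarm", "visual", 0, 10),
    TimedTask("Scan trend display", "visual", 5, 20),
    TimedTask("Radio call to field operator", "auditory", 5, 15),
    TimedTask("Diagnose cause of trip", "cognitive", 20, 60),
]

conflicts = competing_pairs(scenario)
print(conflicts)
```

Here only the two visual tasks are reported, because the radio call overlaps them in time but uses a different channel.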
Identifying an acceptable workload over all working conditions indicates that the
operator should be able to perform the tasks required to operate/maintain that system.
However, any peaks or troughs of workload identified indicate that the operator may
have difficulty in completing those activities.
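Identifying peaks and troughs can be sketched as labelling each period of a scenario against upper and lower workload limits. The limits used below (0.3 and 0.8 of available time) and the example profile are assumptions chosen only for illustration; the paper does not prescribe threshold values, and appropriate limits would need to be agreed for each project.

```python
from typing import List


def classify_workload(utilisation: List[float],
                      low: float = 0.3,
                      high: float = 0.8) -> List[str]:
    """Label each period 'low', 'acceptable' or 'high' (thresholds are assumptions)."""
    labels = []
    for u in utilisation:
        if u > high:
            labels.append("high")
        elif u < low:
            labels.append("low")
        else:
            labels.append("acceptable")
    return labels


# Hypothetical fraction of each period that the operator is occupied.
profile = [0.2, 0.5, 0.95, 0.6]
labels = classify_workload(profile)
print(labels)
```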
For example, a workload analysis was carried out in a control room on a
Floating Production, Storage and Offloading (FPSO) vessel. The operator
was observed to experience up to 15,000 alarms in a 12-hour shift period,
which equates to approximately 1 alarm every 3 seconds (assuming, at best, an
equal distribution of alarms). This number of alarms is clearly too many
for one operator to contend with in such a short space of time.
In the FPSO example above it was recommended that the alarms interface
be rationalised to remove any obsolete or redundant alarms, using the alarms
guidance defined in [EEMUA, 1999], and that a post-implementation
workload analysis be carried out to ensure that the adverse workload
caused by the alarms had been controlled.
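The alarm rate quoted in the FPSO example can be checked with a short calculation. The sketch below assumes, as the example does, that the alarms are spread evenly over the shift; the function name is illustrative.

```python
def mean_alarm_interval_seconds(alarm_count: int, shift_hours: float) -> float:
    """Average time between alarms, assuming an even spread over the shift."""
    return shift_hours * 3600 / alarm_count


interval = mean_alarm_interval_seconds(15_000, 12)
print(f"One alarm every {interval:.2f} seconds on average")
```

The result, 2.88 seconds, is consistent with the figure of approximately one alarm every 3 seconds. In practice alarms tend to arrive in bursts, so the peak rate would be higher than this average.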
Another example is a gas storage facility that conducted a large-scale alarm
rationalisation study, based upon [EEMUA, 1999], removing a number of
obsolete, duplicate, and information-only alarms from its Distributed
Control System (DCS). During this process it was identified that an operator
could receive a large number of alarms all relating to the same event, e.g. a
compressor trip. In cases such as this, alarms were grouped together, resulting
in an initial high priority alarm to notify the operator and suppressing the
remaining alarms for a defined period of time. Thus the number of alarms presented
to the operator was reduced, with those remaining optimised.
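The grouping described in the gas storage example can be sketched as follows: the first alarm for an event is presented, and further alarms for the same event are suppressed for a defined period. This is an illustration of the idea only, not the facility's DCS configuration; the Alarm structure, the event tags, the messages and the 300-second window are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alarm:
    time: float    # seconds
    event: str     # hypothetical tag for the plant event the alarm relates to
    message: str


def group_alarms(alarms: List[Alarm], window: float) -> List[Alarm]:
    """Present the first alarm for each event and suppress later alarms for the
    same event raised within `window` seconds of the presented one."""
    last_presented: Dict[str, float] = {}
    presented = []
    for alarm in sorted(alarms, key=lambda a: a.time):
        previous = last_presented.get(alarm.event)
        if previous is None or alarm.time - previous >= window:
            presented.append(alarm)
            last_presented[alarm.event] = alarm.time
    return presented


# Hypothetical alarm sequence: one compressor trip generating several alarms.
raw = [
    Alarm(0, "compressor_trip", "Compressor tripped"),
    Alarm(1, "compressor_trip", "Low discharge pressure"),
    Alarm(2, "compressor_trip", "Lube oil pump stopped"),
    Alarm(3, "tank_level", "High level"),
    Alarm(400, "compressor_trip", "Compressor tripped"),
]

shown = group_alarms(raw, window=300)
print([(a.time, a.message) for a in shown])
```

In this sketch five raw alarms become three presented alarms. Deciding which alarms belong to the same event is the difficult part in practice and is assumed here to have been done during the rationalisation study.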
These examples illustrate how the amount of information presented to staff can be
reduced. However, the amount of workload experienced by the operator is not simply a
function of the number of alarms presented. Although alarms rationalisation does
reduce the number of unnecessary alarms presented to the operator, it also increases the
requirement for the operator to respond to each alarm; as each remaining alarm now
requires a response.
Where adverse workloads have been identified, it is important to understand which
activities may pose a significant problem. A screening process can be applied to those
tasks, to identify which tasks have potentially serious consequences in terms of
the environment, production or safety. For these critical tasks, functions and features of
the system may require redesign in order to limit their capacity to cause human error.
Redesign may involve allocating functions to the system rather than the operator (in the
event of cognitive overload) or modifying the interface or timing of events. It may also
require that the job roles and responsibilities are re-specified, or that staff numbers are
increased (where workload has been identified as too high) or decreased (where staff are
underloaded) accordingly.
Some companies have reduced the exposure to major hazards (e.g. on off-shore
installations) by introducing remotely-located experts. These experts perform a number
of different activities from full DCS monitoring to analysing trend data to providing
ad-hoc advice as required. Outsourcing some of the responsibilities associated with
operating the installation obviously reduces the number of staff required off-shore. However,
problems can arise when remotely located staff have not visited the installation and do not
have a vital visual representation of the installation. This is compounded by
the fact that they do not have the same physical perceptions as a locally located operator;
for example, they are not able to see the plant outside or feel the impact of a storm.
Different working practices and safety assurances must be introduced, and the ability to
communicate with the field operators on the installation becomes of paramount importance.
Remotely-located experts were used on a trial basis on an off-shore installation
to provide immediate contextual advice on abnormal situations. The use
of effective communication media (including real-time video) to transmit
information was identified as a critical influence on the success of the project.
Sufficient consideration has not always been given to which responsibilities and
communication mechanisms would be most appropriate and how these remote activities
might fit into the current systems of work. Reorganisation of responsibilities of staff is
a natural by-product of most system and organisational changes. These changes usually
affect the competencies required from the staff and need to be systematically analysed
in order to reduce operating costs safely.
In the 1990s a contract organisation were supporting 6 to 8 offshore platforms
using technicians for planned maintenance. They had the opportunity to take
on additional platforms through a change of ownership so that they then
supported 14 offshore platforms in total. As a result of a Criticality Analysis
the maintenance strategy was changed, and they were able to take on the extra
work with only a small increase in staff. In summary, they doubled the
number of assets they maintained with only a 10% increase in staff.
Iteration of the workload analysis can therefore determine whether the re-designs
have resulted in appropriate workloads. Where it is not possible to design out the
potential for human error, critical activities can be subject to a human reliability analysis to
identify, control and/or mitigate the potential for human error to ensure that the system
as a whole is safer.
HUMAN RELIABILITY ANALYSIS (HRA)
The use of human factors works on the premise that staff are part of the system. As part of
the system, individuals are affected by previous, current and future tasks that they have
performed or will have to perform. They are also affected by other factors that are outside
the control of the system (fatigue, personal factors, etc.). A human reliability analysis is
aimed at understanding human performance, and creating a system that accommodates these
characteristics to achieve a safer system.
There are a number of formal human reliability analysis methods; however, the overall
methodology for the identification and mitigation of operator-related errors is very similar.
As outlined in Figure 1, HRA is a four-step process, aimed at answering the following
questions:
1. What human errors are possible?
2. What is the likelihood of them occurring and what are the chances of recovery?
3. What are the potential consequences of each error occurring?
4. How can human errors be controlled or mitigated?
To determine the potential for human error during each critical activity, a realistic and
detailed scenario needs to be produced. This scenario should describe in detail the tasks
required to safely maintain plant operations during normal, abnormal and upset conditions.
In this manner the potential for operator-related risks associated with performing plant
operations under all operating conditions is examined.
For example, it is more likely that an operator would make an operational
error when under time pressure, such as handling a gas flare incident, than
when the operator is monitoring plant operations in a steady state.
Once an exhaustive list of errors has been identified for each task, the likelihood of
each error occurring can be determined; the risk to the environment, production and safety
can be defined; and the potential to recover from the error ascertained. This process is
usually undertaken in collaboration with stakeholders who have experience of the
system (operators, other staff, etc) as their knowledge in the generation and assessment
of errors can be invaluable.
The following presents an example output from an HRA as it may have been used to
define the potential, likelihood and consequences of human error relating to the propylene
vapour leak detailed earlier.
. Role: Foreman
. Critical task: Isolating the propylene vapour leak.
. Direct Risk: Foreman does not communicate knowledge of the leak to the control room
operator and is not aware of the safety procedure to isolate the leak (external error).
. Description: The Foreman and the operator do not communicate sufficiently even
though the operator has an overview of the plant and has the ability to isolate the
leak remotely.
. Consequence: The Foreman tackles the leak at the source and suffers severe burns.
. Risk rating: Medium
. Potential mitigation measures:
  ◦ Mandate regular familiarisation of plant specific operating and safety procedures;
  ◦ Update safety procedures to mandate informing operator of problem (where
    appropriate);
  ◦ Conduct regular safety drills;
  ◦ Provide short and long-term plant training and awareness programmes;
  ◦ Increase formal (weekly) communication between staff.
It is not always practical or rational to apply reduction measures to all of the errors
identified during the HRA. In some instances the cost of implementing the measure
would far outweigh the benefits. Therefore, implementation of risk reduction measures
is usually based upon a cost-benefit analysis whereby those errors with a high impact
and/or a poor likelihood of recovery are subjected to appropriate control or mitigation
strategies to reduce their risk to As Low As Reasonably Practicable (ALARP).
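One simple way to focus effort on high impact, poorly recoverable errors is to score and rank the errors identified in the HRA. The sketch below is a screening aid under stated assumptions, not a cost-benefit analysis or an ALARP demonstration: the three-point scales, the scoring formula and the example errors are hypothetical and are not part of any formal HRA method.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ErrorAssessment:
    description: str
    likelihood: int       # 1 (rare) to 3 (frequent); hypothetical scale
    consequence: int      # 1 (minor) to 3 (severe); hypothetical scale
    recovery_chance: int  # 1 (poor) to 3 (good); hypothetical scale


def priority(error: ErrorAssessment) -> int:
    """Illustrative score: higher means a stronger candidate for mitigation."""
    return error.likelihood * error.consequence * (4 - error.recovery_chance)


def rank_for_mitigation(errors: List[ErrorAssessment]) -> List[ErrorAssessment]:
    """Sort errors from highest to lowest priority score."""
    return sorted(errors, key=priority, reverse=True)


# Hypothetical assessments agreed with stakeholders.
errors = [
    ErrorAssessment("Operator acknowledges alarm without checking cause", 3, 1, 3),
    ErrorAssessment("Foreman enters vapour cloud to isolate leak", 2, 3, 1),
    ErrorAssessment("Technician omits step in routine valve test", 2, 2, 2),
]

ranked = rank_for_mitigation(errors)
for e in ranked:
    print(priority(e), e.description)
```

The ranking only indicates where to look first; whether a given measure is justified still depends on comparing its cost with the risk reduction it achieves.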
Once the control or mitigation measures have been identified they should be
incorporated into the project risk register and into the system design. Where strategies have
been identified, it is beneficial to re-assess the potential for errors to ensure that the risk
has been removed or is ALARP. This can involve reiteration of the workload analysis
under the revised configuration to ensure that the workload now falls within an acceptable
range. Iteration of the human reliability analysis should also be performed as necessary.
STAFFING OPTIMISATION
This human factors approach provides the basis upon which operators can provide
assurance that the human element of the system has been systematically and rigorously
analysed in system design and in the change management process. It can also demonstrate
that the staffing identified is optimised to ensure safe and efficient system operations.
This approach also provides a key human factors input into the safety case for
COMAH by demonstrating that the operator has:
. Understood how humans, as well as engineering fallibility, can initiate accidents;
. Identified all system critical tasks and activities;
. Systematically analysed the system to ensure that staff can safely conduct plant
operations by:
  ◦ Avoiding adverse workload;
  ◦ Identifying, analysing and mitigating the potential for human errors;
  ◦ Defining appropriate staffing;
  ◦ Identifying an organisational structure with appropriate management and
    supervisory capabilities;
. Identified initial training needs;
. Provided competency assurance;
. Encouraged employee involvement and communication;
. Gathered sufficient information to inform:
  ◦ Ergonomically sound workplace design and layouts;
  ◦ Short-term and long-term training and performance requirements;
  ◦ On-going data collection, risk assessment and support.
REFERENCES
HSE web-site, 2005, Control of Major Accident Hazards Regulations, 1999. Definition and
guidance can be found at [Link] Website last accessed 1st October 2005.
HSE CRR 348, 2001, Assessing the safety of staffing arrangements for process operations
in the chemical and allied industries.
Energy Institute, 2004, IP Safe staffing arrangements user guide for CRR348/2001
methodology and its extension to automated plant and/or equipment, ISBN 0 85293
411 4.
Fitts, P.M., 1962, Functions of man in complex systems. Aerospace Engineering. 21
January. Describes a list of human attributes and limitations and specifies those tasks
more suitable for a machine.
Sherry, R.R. and Ritter, F.E., 2002, Dynamic Task Allocation: Issues for
Implementing Adaptive Intelligent Automation. School of Information Sciences and
Technology, The Pennsylvania State University. Technical Report No. ACS 2002-2.
8 July 2002.
EEMUA (Engineering Equipment and Materials Users Association), 1999, Alarm
systems. A guide to design, management and procurement. Publication No. 191.