Functional Safety Standards tell you what to do, how to do it and include an approach that covers all three ways that a failure might occur. It is comprehensive, but proportionate - it works.
It Ain't What You Do, It’s The Way That You Do It
In the world of Functional Safety this isn't true. The functional safety standards demand that you consider what you do and the "way that you do it".
What You Do
The "What You Do" is given initially as an overall summary Lifecycle – a list of tasks arranged in a sensible order. Sometimes you are free to choose your own approach – for example, the technique you use for hazard identification is left open, so that you can use one relevant to your own industry and/or project. Sometimes the standard is quite specific on how to implement a particular task, where the approach can be defined in some detail irrespective of the particular project.
The "Way That You Do It"
The "Way That You Do It" is called Functional Safety Management (FSM). When experienced engineers are asked to brainstorm the key issues of delivering "good engineering practice" for a project, they typically list the following as being the essential elements of a good project:
- Comprehensive procedures
- Planning
- Peer review
- Competency
- Documentation
Functional Safety Management (FSM)
Each of these elements makes up part of FSM. With one exception*, the elements of FSM are familiar to all of us who have experience of real-world projects. We all know these basic truths:
- poor procedures will lead to mistakes
- lack of planning causes confusion
- inadequate peer review will leave errors undetected
- lack of competency and poor documentation undermine anything you try to do.
Functional Safety Management addresses each of these elements of good engineering practice – which should be no surprise. How could we expect the approach to a “safety” project to be specified as anything other than a well-managed project?
Functional Safety Assessment (FSA)
* The "Exception" is that FSM includes something called "Functional Safety Assessment". This isn’t typically familiar to people new to functional safety – it isn’t included in most conventional projects, but it is an interesting innovation developed to provide an extra level of assurance for safety projects. It is an additional quality assurance step, done in addition to a conventional peer review, but taking a different angle of approach.
The "different approach" is to evaluate whether the project achieved its safety objectives – by confirming compliance with the target functional safety standard. Experience suggests that this additional check delivers real value in spotting mistakes and omissions that would otherwise go undetected.
To summarise, functional safety standards tell you WHAT TO DO (the lifecycle and detail on some of the tasks of that lifecycle) and HOW TO DO IT (Functional Safety Management to deliver good engineering practice, supported by Functional Safety Assessments).
Three Ways Things Go Wrong
The functional safety standards require us to consider and manage how things could go wrong because of hardware failures (stuff just fails), because of faults by humans (they are human after all) and because some external event catches us by surprise (lightning might not strike twice, but it often strikes once).
The standards term these three failure categories random hardware, systematic and common cause failures. A range of different approaches is taken to minimise or eliminate failures from each of these categories.
1. Random hardware failures
Random hardware failures are reduced by designing and choosing reliable equipment and (sometimes) specifying additional redundancy to protect against single faults. It seems that the one thing people typically know about functional safety is that it requires some detailed reliability calculations, but there is far more to it than that.
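To give a flavour of the reliability calculations mentioned above, here is a minimal sketch using a widely quoted simplified approximation for the average probability of failure on demand (PFDavg) of a periodically proof-tested device. The failure rate, the proof test interval and the function names are illustrative assumptions, not values or methods taken from any particular standard or project, and common cause effects are deliberately ignored.

```python
def pfd_avg_1oo1(lambda_du: float, test_interval_h: float) -> float:
    """Simplified PFDavg for a single device: lambda_DU * TI / 2."""
    return lambda_du * test_interval_h / 2.0


def pfd_avg_1oo2(lambda_du: float, test_interval_h: float) -> float:
    """Simplified PFDavg for two identical redundant devices (1oo2),
    ignoring common cause failures: (lambda_DU * TI)^2 / 3."""
    return (lambda_du * test_interval_h) ** 2 / 3.0


if __name__ == "__main__":
    # Assumed, illustrative inputs only.
    lambda_du = 2.0e-6   # dangerous undetected failures per hour
    ti_hours = 8760.0    # proof test once a year

    print(f"Single device:  {pfd_avg_1oo1(lambda_du, ti_hours):.2e}")
    print(f"Redundant pair: {pfd_avg_1oo2(lambda_du, ti_hours):.2e}")
```

With these assumed inputs the redundant pair comes out far more reliable than the single device, which is why redundancy is sometimes specified. A real assessment would use more detailed equations and justified failure data.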
2. Systematic failures
Systematic failures are reduced by the various aspects of Functional Safety Management, as already discussed. How do we stop humans making mistakes? Look again at the elements of FSM: together they constitute a comprehensive approach to minimising human error.
3. Common cause failures
Common cause failures are reduced by first trying to identify what might trigger a common cause failure (flooding, loss of instrument air, lightning, corrosion, etc.) and then working out how to ensure that not all of the protection is lost if the event were to occur. The solution is often to implement diverse redundancy in one form or another - different locations of equipment, different sensing technology, etc.
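The effect of a common cause on a redundant pair can be illustrated with the beta-factor model, in which an assumed fraction beta of dangerous failures takes out both channels at once. The sketch below uses a simplified approximation. The failure rate, test interval and both beta values are illustrative assumptions, as is the idea that diverse redundancy can be represented simply as a smaller beta.

```python
def pfd_avg_1oo2_with_ccf(lambda_du: float, test_interval_h: float, beta: float) -> float:
    """Simplified PFDavg for a redundant (1oo2) pair using a beta-factor model.

    independent part:  ((1 - beta) * lambda_DU * TI)^2 / 3
    common cause part: beta * lambda_DU * TI / 2
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    independent = ((1.0 - beta) * lambda_du * test_interval_h) ** 2 / 3.0
    common_cause = beta * lambda_du * test_interval_h / 2.0
    return independent + common_cause


if __name__ == "__main__":
    # Assumed, illustrative inputs only.
    lambda_du = 2.0e-6   # dangerous undetected failures per hour
    ti_hours = 8760.0    # proof test once a year

    for label, beta in [("identical, co-located pair", 0.10),
                        ("diverse, separated pair", 0.02)]:
        pfd = pfd_avg_1oo2_with_ccf(lambda_du, ti_hours, beta)
        print(f"{label} (beta={beta:.2f}): PFDavg = {pfd:.2e}")
```

With these assumed numbers the common cause term dominates the result for the identical pair, so reducing beta through diversity or separation improves the figure more than making each channel slightly more reliable.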
Product manufacturers have some additional requirements placed on them relating to the management of the three failure categories; these are outside the scope of this paper.
Why does Functional Safety Work?
As we’ve seen, Functional Safety Standards tell you what to do, how to do it and include an approach that covers all three ways that a failure might occur. It is comprehensive, but proportionate - it works.