GB2258311A - Monitoring a plurality of parameters
- Publication number
- GB2258311A (application GB9215907A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- class
- condition
- belonging
- inputs
- network
- Prior art date
- Legal status
- Granted
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/02—Detecting, measuring or recording for evaluating the cardiovascular system, e.g. pulse, heart rate, blood pressure or blood flow
- A61B5/0205—Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/60—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
- G16H40/63—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for local operation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
Abstract
An equipment monitor 10 which is capable of learning how to respond to particular inputs (e.g. a neural network) is connected to a plurality of instruments 12 (e.g. medical instruments in an intensive care ward or instruments in an industrial plant). During an initial training session, a human supervisor monitors the instruments to ensure that no potential alarm condition is encountered whilst the monitor assimilates the gamut of signals representative of "safe" or "healthy" conditions. Thereafter the equipment is left to signal an alarm if the collection of signals it is monitoring strays out of the range encountered during the training session. A button may be provided for indicating to the monitor that responses which give rise to false alarms should be included in its "safe" responses. The monitor may be provided with some rules prior to its learning phase. <IMAGE>
Description
"Apparatus and Method for Monitorins" This invention relates to apparatus and methods for monitoring a plurality of input signals or parameters of a system and for determining the condition of the system. In particular, but not exclusively, the invention relates to such apparatus and methods for identifying novel input signals indicative of an alarm state.
The apparatus and method have very many specific applications, but one typical application is in an intensive care ward where a patient is connected to various instruments which observe various parameters of the patient, for example, his heart rate, pulse rate, respiratory rate, ABP,
CVP and so on. These instruments are conventionally monitored by a nurse who determines whether the patient's condition is stable or whether action should be taken.
Attempts have been made to relieve the nurse's workload by equipping some of these instruments with primitive monitors which sound an alarm when the reading of the instrument strays outside pre-determined limits.
However, the bounds of acceptability are to some extent arbitrary and require setting by someone with prior knowledge of the patient's condition. Additionally, the individual single instrument monitor has no information regarding the other parameters of the patient's body that are being instrumented. By treating each bodily parameter as distinct and unrelated to any other bodily parameter much useful information regarding the interdependence of these measurements is discarded. The human body is, after all, a unified whole whose parts must function in accord.
A need exists, therefore, for an apparatus and method functioning as a network monitor which learns the interrelationships between the instruments' samples of the bodily state. Any departure of the entire system from normality is detected and typically signalled as an alarm.
Accordingly, in one aspect, this invention provides apparatus for monitoring a plurality of parameters of a system and for identifying or predicting the condition of the system, said apparatus comprising:
a plurality of inputs each for receiving data representative of a respective parameter of the system;
processing means for processing data received via said inputs to determine or predict the condition of the system and provide output data representative of said condition;
training means operable to identify to the processing means monitored input signals which belong to a first class which indicates a first predetermined system condition (e.g. healthy), and
data generating means operable to generate or synthesise and present to said processing means input signals which belong to, or have a high probability of belonging to, a second class which indicates a second predetermined system condition (e.g. alarm condition),
the processing means being operable during a learning phase to be taught to distinguish between input signals in said first class and said second class.
The generating means preferably simulates the statistical distribution of said second class of input signals by using a pseudo-random generator. The distribution of said random process may be non-uniform, to allow for incorporation of prior knowledge of the monitored system.
In one arrangement, the output provides data representing the probability of the monitored input signals falling in said second class. The processing means preferably is configured as or operates as an artificial neural network.
The configuration or operation of the processing means is preferably structured as a network with a capability equivalent to a three or more layered network, corresponding or analogous to an input layer, at least one "hidden" or intermediate layer and an output layer.
In another aspect, this invention provides apparatus for monitoring a plurality of system parameters and thereby deducing the condition of said system, said system including processing means capable of being taught to distinguish between a set of monitored system parameters belonging to a Class indicating one system condition and a set of monitored system parameters belonging to another Class indicating another system condition, means for supplying and identifying the Class of a plurality of sets of parameters from one of said Classes, and means for randomly or pseudo-randomly generating or synthesising sets of data and for identifying these to the processor as belonging to the other Class.
In yet another aspect, this invention provides a method for monitoring a plurality of parameters of a system and for identifying or predicting the condition of the system, the method comprising the steps of:
receiving a plurality of inputs each representing a particular parameter of the system,
supplying said inputs to a processor for processing thereof to determine or predict the condition of the system,
said processor having been trained by identifying to it those input signals belonging to a first class indicating a first predetermined system condition (e.g. healthy) and by providing said processor with synthesised or otherwise generated input signals belonging to, or having a high probability of belonging to, a second class indicating a second predetermined system condition (e.g. alarm condition) and identifying said synthesised or otherwise generated input signals as belonging to said second class.
Whilst the invention has been described above, it extends to any inventive combination of the features set out above or in the following description.
The invention may be performed in various ways and an embodiment thereof will now be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is a schematic view of an example of a neural network monitor in accordance with the invention;
Figure 2 is a diagram of a simple three-layer neural network;
Figure 3 represents the first stage of error-backpropagation;
Figure 4 represents the second stage of error-backpropagation;
Figure 5 is a diagram illustrating the ability of the system to generalise;
Figure 6 is a diagram illustrating the insensitivity of the system to noise;
Figure 7 shows the outputs of a number of devices connected to an example of the network monitor, together with the output of the network monitor itself; and
Figure 8 shows a typical non-alarm or "healthy" distribution and a suitable default distribution for explaining the operation of a monitor using a non-uniform default distribution for training.
The neural network monitor 10 receives outputs from a number of instruments 12 and provides an output indicating a condition requiring attention. The neural network monitor under consideration does not require the explicit statement of rules, as would a conventional expert system, but may be taught by example. The invention, however, also extends to network monitors which use rules known prior to learning, and prior knowledge may be incorporated in different ways. For example, prior knowledge may be incorporated by starting with weights other than small random ones, and in other ways as discussed elsewhere herein.
Referring to the present case, after a period of supervision, during which it is given experience of the normal, or healthy, response to be expected from the instruments it is monitoring, the neural network monitor is left to receive input without guidance. At first the behaviour is somewhat "nervous" and the device will sound its alarm bell at any possibly "unhealthy" response from the instruments. The (human) supervisor may then indicate to the neural network monitor that the response from the instruments that caused the alarm should be included in its concept of a normal or healthy response. After repeated reassurances, the neural network monitor settles down to giving infrequent false alarms.
A neural network consists of a number of simple processors, or neurons, linked together as in Figure 2. The neurons combine their inputs and subsequently produce an output which is passed to other neurons. The links between neurons contain weights which control the amplitude of the signal passing through. In addition, each neuron has an associated bias, which is effectively a connection to a neuron which is always in the on-, or 1-, state. It is the weights and biases that embody the information required to classify the input signals, just as in the mammalian brain it is the links between the neurons that determine its function. In the training stage, the weights and biases are iteratively improved by applying input and output pairs to the network.
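As a concrete illustration of this neuron model, the following is a minimal sketch (Python with NumPy; the layer shape, the logistic activation and all names are illustrative assumptions rather than anything prescribed by the embodiment):

```python
import numpy as np

def logistic(x):
    # A differentiable approximation to the threshold function.
    return 1.0 / (1.0 + np.exp(-x))

def layer_forward(inputs, weights, biases):
    # Each neuron sums its weighted inputs plus a bias (effectively a
    # connection to a unit that is always in the 1-state), then applies
    # the activation function to produce its output.
    net = weights @ inputs + biases
    return logistic(net)

# Example: 5 inputs feeding 3 hidden neurons, with weights initialised
# to small random values as described in the text.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(3, 5))
b = np.zeros(3)
print(layer_forward(rng.uniform(-1, 1, size=5), w, b))
```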
One way of viewing the operation of the neural network is as interpolation in a space defined by its parameters.
The training patterns are the representative examples of classes. After training, previously unseen patterns are classified according to an interpolation between the training examples. The number of neurons in the network determines the complexity of the space in which interpolation is done. In this way, with enough neurons, a division of feature-space by arbitrarily complex boundaries can be made, assimilating the fine distinguishing features of the input patterns. Alternatively, by providing only a few neurons, the network is forced to generalise and the outliers in its training will be effectively ignored.
The knowledge of the network is contained in the values of the weights between neurons. Initially these are set to small random values. The process by which the values of the weights are refined to represent better the mapping of the network's input to the required output is known as "error-backpropagation".
In the error-backpropagation process, the objective is to obtain some $\Delta w$ for each weight such that, when the weight vector is changed from $\mathbf{w}$ to $\mathbf{w} + \Delta\mathbf{w}$, the error (i.e. the difference between the actual output $y$ and the desired output $d$) is reduced. Let us first define the error, summed over all training examples $c$, as

$$E = \frac{1}{2} \sum_c \sum_j \left( y_{j,c} - d_{j,c} \right)^2 \qquad (1)$$

where the $j$ subscript denotes in turn each output unit as in Figure 3.

The net input to a neuron in the $j$ layer is obtained by multiplying all its separate inputs by their respective weights and adding:

$$x_j = \sum_i y_i w_{ji} \qquad (2)$$

where the $i$ subscript denotes in turn each unit contributing to output unit $j$. Let the neuron's output be some differentiable function, $A$, of this net input:

$$y_j = A(x_j) \qquad (3)$$

The threshold function is not differentiable, but it can be approximated by a function that is differentiable, called the "logistic", "sigmoid" or hyperbolic tangent.

From equation 2 we get the derivatives

$$\frac{\partial x_j}{\partial w_{ji}} = y_i \qquad (4)$$

and

$$\frac{\partial x_j}{\partial y_i} = w_{ji} \qquad (5)$$

and from equation 3 we get the derivative

$$\frac{\partial y_j}{\partial x_j} = A'(x_j) \qquad (6)$$

The process of error-backpropagation starts at the output of the network after input pattern number $c$ has been propagated forwards from the input to the output. The actual output, in this case $y_{j,c}$, is compared with the desired output. Therefore, for just one of the $c$, the derivative of this error with respect to the $j$th neuron's output is

$$\frac{\partial E}{\partial y_j} = y_j - d_j \qquad (7)$$

This starts the error-backpropagation process, whose objective is to find the error derivative with respect to the weights:

$$\frac{\partial E}{\partial w_{ji}} = \frac{\partial E}{\partial y_j} \, \frac{\partial y_j}{\partial x_j} \, \frac{\partial x_j}{\partial w_{ji}} \qquad (8)$$

where the derivatives on the right-hand side can be evaluated from equations 7, 6 and 4 above.

The next stage is to evaluate $\partial E / \partial y_i$ to enable us to continue down the network:

$$\frac{\partial E}{\partial y_i} = \sum_j \frac{\partial E}{\partial y_j} \, \frac{\partial y_j}{\partial x_j} \, \frac{\partial x_j}{\partial y_i} \qquad (9)$$

where the derivatives on the right-hand side can be evaluated from equations 7, 6 and 5 above.

Now we have completed the necessary calculations for the top layer. Let us re-label the layers as shown in Figure 4 to enable us to continue with the next layer down. We have already obtained $\partial E / \partial y_j$ by equation 9, where the old $i$ layer is now the new $j$ layer. From this we can calculate $\partial E / \partial w_{ji}$, as in equation 8. If we have not yet reached the bottom layer of the network, we calculate $\partial E / \partial y_i$ as in equation 9. The chaining process continues to evaluate $\partial E / \partial w$ followed by $\partial E / \partial y$ for all the layers of the network, thereby obtaining the error derivative of all the weights in the network.
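To tie equations 1 to 9 together, here is a hedged sketch of a single error-backpropagation update for a two-layer network (Python with NumPy; the logistic activation, for which $A'(x) = y(1-y)$, and the learning rate are assumptions for illustration):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_update(x, d, w1, b1, w2, b2, lr=0.5):
    # Forward pass (equations 2 and 3): input -> hidden -> output.
    h = logistic(w1 @ x + b1)
    y = logistic(w2 @ h + b2)
    # Output-layer error derivative (equation 7), combined with
    # A'(x) = y(1 - y) for the logistic (equation 6).
    delta2 = (y - d) * y * (1 - y)
    # Weight derivatives for the top layer (equation 8).
    dw2 = np.outer(delta2, h)
    # Propagate the error derivative down to the hidden layer (equation 9).
    delta1 = (w2.T @ delta2) * h * (1 - h)
    dw1 = np.outer(delta1, x)
    # Gradient-descent step on all weights and biases (the deltas are
    # also the bias gradients, since a bias sees a constant input of 1).
    w1 -= lr * dw1; b1 -= lr * delta1
    w2 -= lr * dw2; b2 -= lr * delta2
    return w1, b1, w2, b2
```

A full training run would repeat this update, accumulating the derivatives over all patterns in the training set as described above.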
Having described the process of error-backpropagation in detail we now turn to the technique of "default classification", according to which, if no examples of one class of output are available, the complementary class must be inferred by default. For example, if a particular system is being monitored for some dangerous condition, and if that dangerous condition cannot be produced at will to train the network explicitly, then the only examples available for training will be consistent with the healthy, non-dangerous state of the system. Training a network only on one class of inputs, with no counter-examples, causes the network to classify everything as the only class it has been shown.
However, by training the network on examples of the "healthy" class but also on random inputs for the "dangerous" class, any input which occurs after training which does not resemble one of the previously encountered "healthy" inputs will automatically be classified as "dangerous". The network effectively behaves as a novelty detector.
To illustrate this, a network was trained as follows.
It had five inputs and one output. The "healthy", class 1, input vectors consisted of elements a, b, c, d, e such that

b < c, d < c, a < b, e < d,    (10)

as illustrated in Table 1. 50 training examples satisfying condition (10) were generated. A network with 5 inputs, 3 hidden neurons and one output was trained to output class 1 for this data. Additionally, random data with the same first-order statistics as the data for output class 1 was synthesised, for which the network was trained to output class 0. When tested on a new set of 100 inputs, half of which were produced by explicitly following condition (10), and half of which were generated randomly, the performance was 93% correct. For the randomly generated data there is a finite probability of fulfilling condition (10), and it can be shown that the network is in fact performing to within 1% of the inherent upper limit of performance.
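The experiment can be reproduced in outline as follows; this is a hedged sketch using scikit-learn's MLPClassifier as a modern stand-in for the 5-3-1 network, and the uniform sampling range is an assumption about the original data:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

def healthy_example():
    # Draw until condition (10) holds: a < b < c and e < d < c.
    while True:
        v = rng.uniform(-1, 1, size=5)
        a, b, c, d, e = v
        if a < b < c and e < d < c:
            return v

X1 = np.array([healthy_example() for _ in range(50)])   # class 1 ("healthy")
X0 = rng.uniform(-1, 1, size=(50, 5))                    # random class 0
X = np.vstack([X1, X0])
y = np.array([1] * 50 + [0] * 50)

net = MLPClassifier(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
net.fit(X, y)

# Test on 100 fresh inputs, half healthy and half random.
Xt = np.vstack([np.array([healthy_example() for _ in range(50)]),
                rng.uniform(-1, 1, size=(50, 5))])
yt = np.array([1] * 50 + [0] * 50)
print("accuracy:", net.score(Xt, yt))
```

Under uniform sampling, condition (10) holds with probability about 5% (6 of the 120 orderings of five values), so the rejection loop is cheap; the exact accuracy depends on the seed and optimiser and will not reproduce the reported 93% exactly.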
A more rigorous justification for synthesising the available data with random numbers follows from the fact that training seeks to minimise the sum squared error over the training set. Consider a binary classification network with a single input $v$ producing an output $f(v)$. The required outputs are 0 if the input is a member of class A and 1 if the input is a member of class B. If the prior probability of any data being a member of class A is $P_A$ and the prior probability of any data being a member of class B is $P_B$, and if the probability distribution functions of the two classes as functions of the input $v$ are $p_A(v)$ and $p_B(v)$, then the sum squared error, $E$, over the whole training set is given by:

$$E = \int \left[ P_A \, p_A(v) \, f(v)^2 + P_B \, p_B(v) \left( f(v) - 1 \right)^2 \right] dv \qquad (11)$$

a | b | c | d | e | output class
---|---|---|---|---|---
0.150160 | 0.241971 | 0.496722 | 0.338752 | 0.163327 | 1.0
0.752625 | -0.258011 | 0.050505 | -0.144331 | 0.085486 | 0.0
0.390102 | 0.582408 | 0.667979 | 0.252589 | 0.037113 | 1.0
-0.894841 | 0.933459 | -0.331432 | -0.835807 | -0.459371 | 0.0
-0.406147 | -0.263851 | 0.074262 | -0.330817 | -0.446870 | 1.0
0.024406 | 0.173021 | 0.477517 | 0.743378 | 0.155935 | 0.0
0.769972 | 0.797939 | 0.964704 | 0.768290 | 0.724058 | 1.0
0.705362 | 0.622344 | 0.909775 | 0.808566 | -0.722170 | 0.0
0.574456 | 0.622694 | 0.686187 | 0.684142 | 0.684017 | 1.0
0.173560 | -0.250082 | -0.946428 | -0.070469 | 0.570686 | 0.0

Table 1: Output class 1 vectors are obtained using condition (10). Output class 0 vectors are synthesised from random numbers having the same mean as class 1.
Differentiating this with respect to the function $f$:

$$\frac{\partial E}{\partial f} = 2 P_A \, p_A(v) \, f(v) + 2 P_B \, p_B(v) \left[ f(v) - 1 \right]$$

and equating this to zero:

$$f(v) = \frac{P_B \, p_B(v)}{P_A \, p_A(v) + P_B \, p_B(v)}$$

which is exactly the probability of the correct classification being B given that the input was $v$. So by training for the minimisation of sum squared error, and using as targets 0 for class A and 1 for class B, the output from the network assumes a value equal to the probability of class B.

Substituting the $f(v)$ corresponding to a trained network back into equation 11 we get

$$E = \int \frac{P_A \, p_A(v) \, P_B \, p_B(v)}{P_A \, p_A(v) + P_B \, p_B(v)} \, dv$$

and so the minimum attainable error, obtained when there is zero overlap between the distributions, is $E = 0$.

In this way it is possible to model the default class to produce an error less than the error to be expected from using uniformly distributed random variates.
As a generalisation, in a situation where a network is apprenticed to a human to learn to distinguish a healthy class of signal from anything else that might come along, it is important that the network should reach a state of learning where it can be left alone, as soon as possible.
The training stage should be as brief as possible, leaving only the occasional false alarm to recall the human operator to give the benefit of his judgement. Reaching a useful level of performance with only a few examples is only possible if those examples are representative of all members of the class.
To demonstrate that networks are able to generalise with relatively few examples, the training-set described above was used with different numbers of training examples. With only 5 unique examples of class 1 the performance is above 80% (see Figure 5). As before, 50 random vectors were used to synthesise class 0. In this example the network has 5 inputs, 3 hidden neurons and 1 output. There is evidence to suggest that as the hidden layer of neurons is made smaller, so the network is obliged to generalise better. This generalisation is at the expense of being able to classify correctly the outliers of the training-set.
Turning to the noise tolerance of the network, the ability of a network with few hidden neurons to generalise suggests that a compact representation of the data is being made. If the data is corrupted by zero-mean noise, the essence of the data is retained from sample to sample while the noise is changing. Within limits this has little effect on the ability of the network to learn classifications. Figure 6 shows the effect on classification performance of unseen data with an increasing amount of noise present on both the training and test data. The score is 74% even with a signal to noise ratio of 1:1.
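A hedged sketch of the noise experiment (Python with NumPy; modelling the corruption as zero-mean Gaussian noise scaled to a target signal-to-noise power ratio is an assumption about the original setup):

```python
import numpy as np

def add_noise(X, snr, rng):
    # Zero-mean Gaussian noise scaled so that signal power / noise power = snr.
    signal_power = np.mean(X ** 2)
    noise = rng.normal(scale=np.sqrt(signal_power / snr), size=X.shape)
    return X + noise

# Applying add_noise(..., snr=1.0, ...) to both the training and test
# sets models the 1:1 signal-to-noise condition at which the text
# reports a score of 74% on unseen data.
```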
For scalar inputs which vary slowly with time and have no syntax, an appropriately structured artificial neural network which is layered and has total connectivity between layers is as good as any other. However, certain types of input may have an underlying generator which undergoes well-defined state transitions.
For these types of input the ideal network should contain the hardware (written in terms of the network formalism) able to exploit the regularity of the data and to extract parameters from it to be fed to the rest of the network.
Since the network learns from example, the allocation of signals to inputs of the network is arbitrary. This is certainly the case for a homogeneous network consisting, for instance, of totally interconnected layers; however, particular applications may require a structured network predisposed to address the characteristic variation in certain types of input. In this case inputs to the network will favour certain types of signal and must be allocated accordingly.
The envisaged method of operation of the neural network equipment monitor of Figure 1 is very simple and consists of:

Connect it up: Connect the various instruments 12 whose output requires monitoring to the neural network equipment monitor 10. If the instruments do not provide a line-output then it is usually a simple matter to provide one. If, however, breaking into the circuitry of the instrument is not allowed, then a pick-up coil mounted on the surface of the instrument will pick up any high-frequency signal, such as a video signal, which can be de-modulated and sent to the neural network equipment monitor. As discussed above, the neural network equipment monitor will adapt to virtually any type of signal and is tolerant of noise. There is an increasing tendency to equip intensive care wards with data collection centres which serve to collect the vital function data of a ward of patients. An instrumentation such as this would be an ideal platform for incorporating a neural network equipment monitor.

Teach it about "healthy" signals: Once the instruments are turned on and registering signals characteristic of a "healthy" state, press the OK button 14 and keep it pressed (it can be equipped with a latch) for several minutes while the neural network equipment monitor learns the concept of a healthy signal, making sure during this time that the signals are characteristically healthy.

Let it work alone: Release the OK button. If the concept of a healthy signal has been well represented by the training examples given so far, the false alarm rate will be low. If, however, a signal is produced which does not fit the network's concept of a healthy signal, the alarm will sound, requiring the human operator to press the OK button if indeed the signal is a healthy one. If false alarms are too frequent, a subsequent period of training may be required. Otherwise the neural network equipment monitor can be left to monitor the instruments unattended.
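The three steps above amount to a simple event loop. The following is a hedged sketch of that loop (Python; `read_instruments`, `ok_button_pressed`, `sound_alarm` and the `network.retrain`/`network.classify` methods are hypothetical stand-ins for the hardware and network interfaces, not part of the embodiment):

```python
def run_monitor(network, read_instruments, ok_button_pressed, sound_alarm):
    # Hypothetical event loop for the equipment monitor of Figure 1.
    healthy_examples = []
    while True:
        pattern = read_instruments()        # one vector of instrument outputs
        if ok_button_pressed():
            # Supervised phase: the operator asserts the signals are healthy,
            # so the pattern joins the "healthy" class and the network retrains
            # (with synthesised default-class counter-examples, as above).
            healthy_examples.append(pattern)
            network.retrain(healthy=healthy_examples)
        elif network.classify(pattern) == "alarm":
            # Unsupervised phase: anything outside the learned healthy region
            # raises the alarm for a human to confirm or override.
            sound_alarm()
```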
The simulations used as examples in previous sections were implemented on a Sun 3 and took 75 seconds for 500 updates, each having calculated the error derivatives over the entire 100 training patterns. Once the network has learned, the input patterns can be processed at an approximate rate of 500 patterns per second. Even an 8-bit processor running 100 times slower than the Sun 3 would therefore be able to cope adequately with real-time input at a rate of 5 patterns per second, though the learning time of two hours might be impractical. Of course, the learning can take place on a powerful machine leaving a much slower processor in charge of the monitoring once the network has learned.
If real-time learning is required, or if a much larger network is needed, there is no technology-limited upper limit to performance if the algorithm is implemented using parallel processors. A practical design using transputers allows the use of the language Occam, which addresses the parallelism in a program. Using this formalism, a forwards pass through a network that has already learned the correct weight values would be
PROC forward.pass
  SEQ
    PAR
      calculate output for each neuron in input layer
    PAR
      calculate output for each neuron in next layer
    PAR
      calculate output for each neuron in output layer
A backward pass, wherein the weight updates are calculated, would have a similar, but inverted, structure.
A complete learning cycle would consist of
PROC learn.cycle
  SEQ
    forward.pass
    backward.pass
The computing power of a T800 Transputer is very roughly equivalent to a Sun 3, and from this the number of Transputers required for a network of a given size to operate at a given speed can be calculated. For development, one of the many commercially available Transputer systems hosted by a PC would be used. However, for a conveniently packaged system suitable for use in a hospital, an expandable board containing 5 Transputers with memory, power-supply, etc., would fit into a box the size of a small briefcase.
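As a back-of-envelope version of that sizing calculation (Python; the target pattern rate is a hypothetical example, the throughput figure comes from the timings quoted above, and linear scaling across Transputers is assumed):

```python
# From the text: a Sun 3 classifies about 500 patterns per second once
# trained, and a T800 Transputer is taken as roughly a Sun 3 equivalent.
patterns_per_sec_per_t800 = 500
target_rate = 2000                       # hypothetical required patterns/sec
n_transputers = -(-target_rate // patterns_per_sec_per_t800)  # ceiling divide
print(n_transputers)                     # -> 4, within the 5-Transputer board
```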
A trial with real data was conducted. A hospital intensive care ward was approached to obtain vital function data from a number of patients over a period of 24 hours.
The data was scaled to lie between -1 and +1, and an artificial neural network trained to output 1 for the healthy data and 0 for uniformly distributed uncorrelated noise. Results are shown in Figure 7. In Figure 7 the output of the artificial neural network is given at the bottom, and is mostly high, indicating a healthy response from the patient. The low dips indicate alarm conditions and with each of these can be associated a departure from normal of one or more of the vital function traces above.
The modification of the statistical distribution of the default class is possible in many ways. Consider, for example, the case given above, where a process generates data a, b, c, d, e subscribing to the (secret) relationship a < b < c and e < d < c. This data relationship is unknown to the neural network monitor at the beginning of training. A naive random synthesis process will generate uniform distributions over the same range for a, b, c, d, e. A cursory examination of the data, however, will reveal that the distributions of a and e will be biased towards the low end and the distribution of c will be biased towards the high end. Performance will be improved if this knowledge is incorporated into the training process. Given no knowledge about the statistical distribution of the default class, a uniform uncorrelated distribution is assumed. However, if we have such knowledge it may be incorporated into the statistical distribution of the default class. This may be implemented either at the outset of operation or may be continuously adapted to incoming data. This is a good example of making use of the first-order statistics of the data without necessarily having access to the more subtle second-order information in terms of the interrelationships between the data.
To reduce the error below that expected from a uniform distribution we can synthesise default inputs with a distribution which takes account of prior knowledge of the distribution of the non-alarm class.
If, for instance, the non-alarm inputs are expected to fall within certain bounds, then the default inputs can be synthesised to fall only outside these limits. In this way the overlap between the two distributions (the default distribution and the non-alarm distribution) is zero and therefore the expected error will be zero. In practice, estimation of the limits of the non-alarm class will not be 100% reliable and so the expected error will be greater than zero. Also, leaving a gap in the default distribution in this way requires the neural network to generalise well and interpolate smoothly, which, in turn, is only guaranteed when the architecture of the neural network is well suited to the problem. Thus in some cases it may be safer to allow the default distribution to erode slightly into the expected distribution of the non-alarm data so that there is less reliance on the good behaviour of the neural network.
Figure 8 below shows a typical non-alarm, or healthy, distribution and a suitable default distribution. Of course, in most real systems these distributions will be multi-dimensional.
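A hedged sketch of synthesising such a default distribution (Python with NumPy; the per-input bounds, the erosion margin and the use of rejection sampling are illustrative assumptions about how the scheme might be realised):

```python
import numpy as np

def sample_default(lo, hi, erosion=0.05, size=1000, rng=None):
    # Draw default-class (alarm) examples over the full input range but
    # reject those inside the expected non-alarm bounds [lo, hi]. The
    # rejection box is shrunk by a small erosion margin, so the default
    # distribution erodes slightly into the healthy region as discussed.
    rng = rng or np.random.default_rng()
    lo_in, hi_in = lo + erosion, hi - erosion
    samples = []
    while len(samples) < size:
        v = rng.uniform(-1, 1, size=len(lo))
        if not np.all((v > lo_in) & (v < hi_in)):   # keep points outside
            samples.append(v)
    return np.array(samples)

# Example: healthy data expected within [-0.5, 0.5] on each of 5 inputs.
defaults = sample_default(np.full(5, -0.5), np.full(5, 0.5))
```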
Another aspect of incorporation of prior knowledge is in the structuring of a network so that it is predisposed to perform well for a particular problem domain. An example: consider a device that is looking at an image of some kind.
Suppose that translating an object in the image does not affect the identity of the image in any way. It is possible to build into the network this prior knowledge of translation invariance. This is done by sharing weights (weight replication), using non-standard activation functions for the neurons and other architectural features.
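As a hedged illustration of weight sharing (Python with NumPy; a one-dimensional signal and a two-tap kernel stand in for the image case, and the kernel values are arbitrary):

```python
import numpy as np

def shared_weight_layer(signal, kernel):
    # The same weight vector (kernel) is applied at every position, so a
    # translated input produces a correspondingly translated output and
    # the features detected do not depend on where the object appears.
    return np.convolve(signal, kernel, mode="valid")

signal = np.zeros(20); signal[5:8] = 1.0    # an "object" at position 5
shifted = np.roll(signal, 4)                 # the same object translated
k = np.array([1.0, -1.0])                    # illustrative edge detector
print(shared_weight_layer(signal, k))
print(shared_weight_layer(shifted, k))       # same response, shifted
```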
The major advantages of the described embodiment and the artificial neural network for equipment monitoring include:
(1) Learn by example.
(2) Generalise from a representative training-set.
(3) Tolerant of noise.
(4) Incorporation of prior knowledge.
(5) Adaptive to any combination of inputs.
(6) Extremely simple to operate - no special skill required.
(7) Small network implementable on standard PC AT + interface.
(8) No technology-limited upper bound to potential size of network.
Claims (12)
1. Apparatus for monitoring a plurality of parameters of a system and for identifying or predicting the condition of the system, said apparatus comprising:
a plurality of inputs each for receiving data representative of a respective parameter of the system;
processing means for processing data received via said inputs to determine or predict the condition of the system and provide output data representative of said condition;
training means operable to identify to the processing means monitored inputs which belong to a first class which indicates a first predetermined system condition (e.g. healthy), and
data generating means operable to generate or synthesise and present to said processing means input values which belong to, or have a high probability of belonging to, a second class which indicates a second predetermined system condition (e.g. alarm condition),
the processing means being operable during a learning phase to be taught to distinguish between inputs in said first class and said second class.
2. Apparatus according to Claim 1, wherein said data generating means simulates the statistical distribution of said second class of inputs by using a pseudo-random generator.
3. Apparatus according to Claim 2, wherein the distribution of said random process is non-uniform.
4. Apparatus according to any preceding claim, wherein the output provides data representing the probability of the monitored input signals falling in said second class.
5. Apparatus according to any preceding claim, wherein the processing means is configured as or operates as an artificial neural network.
6. Apparatus according to Claim 5, wherein the configuration or operation of the processing means is structured as a network with a capability equivalent to a three or more layered network, corresponding or analogous to an input layer, at least one "hidden" or intermediate layer and an output layer.
7. Apparatus for monitoring a plurality of system parameters and thereby deducing the condition of said system, said system including processing means capable of being taught to distinguish between a set of monitored system parameters belonging to a Class indicating one system condition and a set of monitored system parameters belonging to another Class indicating another system condition, means for supplying and identifying the Class of a plurality of sets of parameters from one of said Classes, and means for randomly or pseudo-randomly generating or synthesising sets of data and for identifying these to the processing means as belonging to the other Class.
8. A method for monitoring a plurality of parameters of a system and for identifying or predicting the condition of the system, the method comprising the steps of:
receiving a plurality of inputs each representing a particular parameter of the system,
supplying said inputs to a processor for processing thereof to determine or predict the condition of the system,
said processor having been trained by identifying to it those input signals belonging to a first class indicating a first predetermined system condition (e.g. healthy) and by providing said processor with synthesised or otherwise generated input signals belonging to, or having a high probability of belonging to, a second class indicating a second predetermined system condition (e.g. alarm condition) and identifying said synthesised or otherwise generated input signals as belonging to said second class.
9. A method according to Claim 8, wherein said signals belonging to or having a high probability of belonging to said second class have a uniform uncorrelated distribution.
10. A method according to Claim 9, wherein said signals belonging to or having a high probability of belonging to said second class have a given statistical distribution based on information obtained either at the outset of operation or continuously adapted to incoming data.
11. Apparatus substantially as hereinbefore described with reference to and as illustrated in any of the accompanying Figures.
12. A method substantially as hereinbefore described with reference to and as illustrated in the accompanying drawings.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB919116255A GB9116255D0 (en) | 1991-07-27 | 1991-07-27 | Apparatus and method for monitoring |
Publications (3)
Publication Number | Publication Date |
---|---|
GB9215907D0 GB9215907D0 (en) | 1992-09-09 |
GB2258311A true GB2258311A (en) | 1993-02-03 |
GB2258311B GB2258311B (en) | 1995-08-30 |
Family
ID=10699097
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB919116255A Pending GB9116255D0 (en) | 1991-07-27 | 1991-07-27 | Apparatus and method for monitoring |
GB9215907A Expired - Fee Related GB2258311B (en) | 1991-07-27 | 1992-07-27 | Apparatus and method for monitoring |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB919116255A Pending GB9116255D0 (en) | 1991-07-27 | 1991-07-27 | Apparatus and method for monitoring |
Country Status (1)
Country | Link |
---|---|
GB (2) | GB9116255D0 (en) |
- 1991-07-27: GB GB919116255A patent/GB9116255D0/en active Pending
- 1992-07-27: GB GB9215907A patent/GB2258311B/en not_active Expired - Fee Related
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0240679A1 (en) * | 1986-03-27 | 1987-10-14 | International Business Machines Corporation | Improving the training of Markov models used in a speech recognition system |
GB2231698A (en) * | 1989-05-18 | 1990-11-21 | Smiths Industries Plc | Speech recognition |
WO1991000591A1 (en) * | 1989-06-30 | 1991-01-10 | British Telecommunications Public Limited Company | Pattern recognition |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0856826A2 (en) * | 1997-02-04 | 1998-08-05 | Neil James Stevenson | A security system |
EP0856826A3 (en) * | 1997-02-04 | 1999-11-24 | Neil James Stevenson | A security system |
GB2352815A (en) * | 1999-05-01 | 2001-02-07 | Keith Henderson Cameron | Automatic health or care risk assessment |
EP1609412A1 (en) * | 2001-05-31 | 2005-12-28 | Isis Innovation Limited | Patient condition display |
US7031857B2 (en) | 2001-05-31 | 2006-04-18 | Isis Innovation Limited | Patient condition display |
WO2006056721A1 (en) * | 2004-11-26 | 2006-06-01 | France Telecom | Suppression of false alarms among alarms produced in a monitored information system |
FR2878637A1 (en) * | 2004-11-26 | 2006-06-02 | France Telecom | DELETING FALSE ALERTS AMONG ALERTS PRODUCED IN A MONITORED INFORMATION SYSTEM |
EP2122537A2 (en) * | 2007-02-08 | 2009-11-25 | Utc Fire&Security Corporation | System and method for video-processing algorithm improvement |
EP2122537A4 (en) * | 2007-02-08 | 2010-01-20 | Utc Fire & Security Corp | System and method for video-processing algorithm improvement |
Also Published As
Publication number | Publication date |
---|---|
GB2258311B (en) | 1995-08-30 |
GB9215907D0 (en) | 1992-09-09 |
GB9116255D0 (en) | 1991-09-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PCNP | Patent ceased through non-payment of renewal fee |
Effective date: 20030727 |