DE19527521C1 - Neural network training method

Info

Publication number: DE19527521C1
Application number: DE19527521A
Authority: DE
Inventors: Martin Dr Ing Schlang; Einar Dr Rer Nat Broese; Otto Dr Ing Gramckow; Frank-Oliver Malisch
Original assignee: Siemens AG; Siemens Corp
Current assignee: Siemens AG; Siemens Corp
Priority date: 1995-07-27
Filing date: 1995-07-27
Publication date: 1996-12-19
Anticipated expiration: 2015-07-28

Abstract

Neural networks must be trained with training data before they acquire their ability to generalise. The neural network is trained using a continuously running on-line adaptation in combination with a cyclically repeated adaptation of network parameters. The on-line adaptation is limited to selected parameters of the neural network. The on-line adaptation is carried out for each new data point, while the cyclically repeated adaptation is carried out each time as a function of a training data set formed from the data points accumulated up to that point.

Description

Before neural networks acquire their ability to generalise, they must first be trained with learning or training data. Collecting this training data is often lengthy and laborious.

An example of this is the neural network known from DE-OS 44 16 364, which calculates a process parameter as its network response from a plurality of precalculated input variables supplied to it; this process parameter serves to preset a system for controlling a technical process. In a rolling process, for example, a predicted value for the rolling force is calculated as a function of the rolling stock temperature, the thickness reduction and other material-specific and plant-specific input variables. The relationship between the rolling force and the input variables modelled by the neural network is adapted on-line to the real process after each process run, i.e. after each pass of rolling stock. For this purpose, the input variables measured during the process run and subsequently recalculated are combined with the rolling force into a data point, which is then used to adapt parameters of the neural network. The adaptation takes place with each newly determined data point, i.e. on-line. The adaptation must be particularly stable, since it is often carried out directly on the plant executing the process and without supervision by an expert. Therefore, only non-critical parameters of the neural network are adapted during on-line training, using adaptation algorithms that guarantee the stability of the method; e.g. minimisation of the quadratic error function between the network response and the recalculated rolling force, where the error function has only a global minimum and no local minima.

DE 43 38 615 A1 discloses a method in which an analytical model of a process, consisting of several sub-models, is corrected by means of a neural network. WO 95/14277 discloses using a neural network in parallel with a conventional controller. DE 43 01 130 A1 discloses influencing the parameters of a control model by means of a neural network. DE 40 12 278 A1 discloses using a neural network for diagnosis. IEEE Transactions on Systems, Man and Cybernetics, Vol. 24, No. 4, April 1994, pages 678-683, describes the basic procedure for adapting a neural network. DE 44 39 986 A1 discloses a process optimisation model implemented in a neural network. The model of the process is changed by changing the weighting of the neural network in accordance with a learning algorithm of the neural network. IEEE Transactions on Neural Networks, Vol. 4, No. 3, May 1993, pages 462-469, describes the implementation of a Kohonen neural network on a chip. IEEE Transactions on Neural Networks, Vol. 6, No. 1, Jan. 1995, pages 144-156, describes a neural-network-based controller architecture in which two neural networks are used, one for control and one for parameter identification.

So that the known neural network predicts at least approximately meaningful rolling force values right from the start of on-line training, it can be pre-trained using a rolling force model that calculates the rolling force as a function of randomly specified input variables. If such a model is not available, the prior knowledge needed for pre-training can be acquired by collecting training data, for example on comparable plants, and introduced into the neural network.

The object of the invention is to enable the neural network to show meaningful behaviour after only a few data points, directly on the plant and without pre-training, in cases where a prior collection of training data is to be dispensed with. This applies in particular when a plant is newly commissioned, when an existing plant controlled by neural networks is substantially modified, or when an existing plant is converted to neural-network-based control. In addition, long-term drifts of the plant are to be detected and compensated.

According to the invention, this object is achieved by the learning method specified in claim 1.

Advantageous developments of the method according to the invention are specified in the subclaims.

To explain the invention, reference is made below to the figures of the drawing, in which:

Fig. 1 shows an example of a neural network with a continuously running on-line adaptation algorithm and a cyclically repeated adaptation algorithm,

Fig. 2 shows an example of the combination according to the invention of the on-line adaptation and the cyclically repeated adaptation of the neural network,

Figs. 3 and 4 show examples of the training data set used in the cyclically repeated adaptation,

Fig. 5 shows an example of the training data set when learning with exponential forgetting, and

Fig. 6 shows an example of the typical error curve of the learning method according to the invention in comparison with a reference learning method.

Fig. 1 shows a neural network to which a plurality of input variables, combined in an input vector x, is supplied and which generates a response y_NN, possibly also multidimensional, as a function of them. The response y_NN depends on adjustable parameters p of the neural network 1. In the exemplary embodiment shown, the neural network 1 serves to model the relationship between influencing variables of a technical process, represented by the input vector x, and a process parameter y, represented by the response y_NN. An example of this is the prediction of the rolling force in a rolling process as a function of material-specific and plant-specific influencing variables such as, among others, the rolling stock temperature, the rolling stock strength, the rolling stock thickness and the thickness reduction.

In order to learn the relationship to be modelled and to adapt the neural network 1 to the actual process, the parameters p of the neural network 1 are modified with the aid of adaptation algorithms 2 and 3 so as to reduce the error between the response y_NN supplied by the neural network 1 and the actual value of the process parameter y.

After every nth (n ≥ 1) process run, in this case after every pass of rolling stock, an on-line adaptation is carried out by means of the adaptation algorithm 2: the influencing variables x_nach, measured during the process run just completed and then recalculated, are applied to the neural network 1, and the resulting response y_NN is compared with the process parameter y, which is likewise measured or recalculated. Implausible values or measurement errors are eliminated by the recalculation. Depending on the error y - y_NN determined in this way, selected parameters p₁ of the neural network 1 are modified so as to reduce the error. Non-critical parameters p₁ and adaptation algorithms are chosen that guarantee the stability of the on-line adaptation and allow rapid changes in the process state to be followed. Non-critical parameters are, for example, the slopes of the regression planes; one possible adaptation method is gradient descent.
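
The on-line step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the model is reduced to a single regression plane whose slopes play the role of the non-critical parameters p₁, and the names `online_adapt`, `slopes` and `learning_rate` are hypothetical.

```python
def online_adapt(slopes, x, y, learning_rate=0.1):
    """Perform one gradient-descent step on the squared error 0.5 * (y - y_nn)**2.

    slopes: selected parameters p1 (assumed here: slopes of one regression plane)
    x:      recalculated input variables of the current data point
    y:      measured or recalculated process parameter (reference value)
    """
    y_nn = sum(w * xi for w, xi in zip(slopes, x))  # network response
    error = y - y_nn
    # The gradient of the squared error with respect to slope i is -error * x_i,
    # so stepping against the gradient adds learning_rate * error * x_i.
    new_slopes = [w + learning_rate * error * xi for w, xi in zip(slopes, x)]
    return new_slopes, error


# Two consecutive passes with the same (made-up) data point: the error shrinks.
slopes = [0.0, 0.0]
slopes, first_error = online_adapt(slopes, [1.0, 2.0], 5.0)
slopes, second_error = online_adapt(slopes, [1.0, 2.0], 5.0)
```

Because this assumed model is linear in the adapted parameters, its squared error has no local minima, which is the kind of property the description asks of the on-line algorithm.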

The input variables of the input vector x determined after every nth pass of rolling stock, together with the measured or recalculated process parameter y, which serves as the reference variable for comparison with the response y_NN, form a data point that is stored in a storage device 4. On the basis of a training data set formed from several stored data points, where appropriate by clustering, selecting or averaging, the neural network 1 is trained at cyclic intervals, the parameters p₂ of the neural network 1 being adaptively modified by means of the adaptation algorithm 3. The parameters p₁ and p₂ can be identical, partly different or entirely different. If the cyclically repeated training is carried out as background training, it is not subject to on-line real-time constraints and can therefore work with training data sets of any size and with time-consuming, globally optimising learning algorithms that are not suitable for on-line use.

After completion of a cyclically repeated training run, the neural network 1 is first subjected to an on-line adaptation using at least some of the data points underlying the training data set, before the neural network 1 is reactivated for controlling the plant and is trained further on-line with new data points. This ensures that, after the global background training, the neural network immediately readapts to the current day-to-day condition of the plant to be controlled.
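
The interplay of data storage, cyclic training and the subsequent on-line readaptation can be sketched as follows. The sketch rests on several assumptions that are not taken from the patent: the model is a single regression plane, repeated gradient passes over the stored data stand in for the unspecified globally optimising algorithm, the training runs sequentially rather than in the background, and all names and default values (`run_combined_learning`, `cycle`, `epochs`, `replay`, `rate`) are hypothetical.

```python
def predict(slopes, x):
    """Response of the assumed linear model."""
    return sum(w * xi for w, xi in zip(slopes, x))


def gradient_step(slopes, x, y, rate):
    """One gradient-descent step on the squared error for one data point."""
    error = y - predict(slopes, x)
    return [w + rate * error * xi for w, xi in zip(slopes, x)]


def run_combined_learning(data_stream, n_inputs, cycle=5, epochs=50, replay=3, rate=0.05):
    slopes = [0.0] * n_inputs
    memory = []  # plays the role of the storage device 4
    for x, y in data_stream:
        slopes = gradient_step(slopes, x, y, rate)  # on-line adaptation
        memory.append((x, y))                       # store the data point
        if len(memory) % cycle == 0:
            # Cyclically repeated training on all stored data points.
            for _ in range(epochs):
                for xs, ys in memory:
                    slopes = gradient_step(slopes, xs, ys, rate)
            # On-line readaptation with part of the stored data points
            # before new data points are processed again.
            for xs, ys in memory[-replay:]:
                slopes = gradient_step(slopes, xs, ys, rate)
    return slopes, memory


# Synthetic, noise-free example data (assumption): y = 2*x1 - 1*x2.
stream = []
for i in range(20):
    x = [(i % 5) / 4, ((i * 3) % 7) / 6]
    stream.append((x, 2.0 * x[0] - 1.0 * x[1]))
final_slopes, stored = run_combined_learning(stream, n_inputs=2)
```

In this sketch the readaptation replays the most recent stored points; which part of the training data set is replayed is left open in the description.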

Fig. 2 illustrates the learning method according to the invention by means of two blocks 5 and 6, of which block 5 denotes the continuously running on-line adaptation and block 6 the cyclically performed adaptation. The cyclically performed adaptation 6 consists of an initial learning phase 7 and a later re-learning phase 8.

These two different learning phases 7 and 8 are illustrated in Fig. 3, in which the number N of data points used for the cyclic training of the neural network 1 is plotted against the contribution K of these data points to the training. During the initial learning phase 7, the cyclically repeated training is carried out with a steadily growing training data set 9, each time using all data points stored since the start of the learning method. The frequency of the cyclically repeated training is a parameter to be optimised for the given application; a new adaptation takes place, for example, after each new data point, after a predetermined percentage growth of the training data set, or when the error y - y_NN exceeds a certain value. In addition, the size of the neural network 1 can be varied as a function of the size of the available training data set, starting with a small neural network that is slowly enlarged over time. The network size is determined, for example, by cross-validation methods or by other heuristics such as residual errors in the input space.
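
The criteria named in this paragraph for starting a new cyclic adaptation can be sketched as a simple decision function. This is an illustration only; the function name `should_retrain` and the default thresholds `growth_fraction` and `error_limit` are assumptions, since the description leaves these values to be optimised for the application.

```python
def should_retrain(current_size, size_at_last_training, error,
                   growth_fraction=0.1, error_limit=1.0):
    """Return True if a new cyclic adaptation should be started.

    current_size:          number of data points now in the training data set
    size_at_last_training: number of data points at the previous cyclic training
    error:                 current error y - y_NN of the on-line prediction
    """
    # Criterion 1: the training data set has grown by a given percentage.
    grown = current_size >= size_at_last_training * (1 + growth_fraction)
    # Criterion 2: the prediction error exceeds a given value.
    too_inaccurate = abs(error) > error_limit
    return grown or too_inaccurate
```

Setting `growth_fraction` to 0 reproduces the first option mentioned in the text, retraining after each new data point.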

After a predetermined time, or when the training data set size N reaches a predetermined value N_F, the re-learning phase begins, in which the size of the training data set 10 is kept constant. The value N_F can be static or can be determined dynamically, for example by cross-validation techniques. The frequency of the cyclically performed adaptation and the size of the neural network 1 are either constant or are determined in a manner similar to that used in the initial learning phase.
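
The transition from a growing training data set to one of constant size N_F can be sketched with a bounded queue. This is a minimal sketch with hypothetical names (`training_set_sizes`, `n_f`); data points are represented here simply by their running index.

```python
from collections import deque


def training_set_sizes(num_points, n_f):
    """Track the training data set size over num_points incoming data points.

    Until n_f points are stored the set grows (initial learning phase);
    afterwards each new point displaces the oldest one (re-learning phase).
    """
    window = deque(maxlen=n_f)  # drops the oldest entry once n_f is reached
    sizes = []
    for index in range(num_points):
        window.append(index)
        sizes.append(len(window))
    return sizes, list(window)


sizes, remaining = training_set_sizes(6, n_f=4)
# sizes     -> [1, 2, 3, 4, 4, 4]
# remaining -> [2, 3, 4, 5]
```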

When a neural-network-controlled plant is first commissioned, a great many plant parameters are usually still being changed in order to optimise the plant. The neural network 1 should, however, forget the suboptimal behaviour immediately after commissioning as far as possible. For this reason, as shown in Fig. 4, the training data set 11, which grows continuously in the initial learning phase 7, is not used in full; instead, the oldest data points are successively forgotten. The training data set 11 must of course grow faster than it is forgotten. The rate of forgetting can be set as a constant fraction of the growth rate of the training data set 11, as a function of the error y - y_NN, or on the basis of the expert knowledge of the commissioning engineer.
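
Forgetting at a constant fraction of the growth rate can be sketched as follows. The sketch assumes the simplest variant, in which one oldest data point is dropped for every `forget_every` new points; the function name and this parameter are hypothetical, and data points are again represented by their running index.

```python
def grow_with_forgetting(num_points, forget_every=3):
    """Grow a training data set while forgetting the oldest points more slowly.

    For every forget_every new data points, the single oldest one is removed,
    so the rate of forgetting is 1/forget_every of the rate of growth.
    """
    training_set = []
    sizes = []
    for index in range(num_points):
        training_set.append(index)         # new data point
        if (index + 1) % forget_every == 0:
            training_set.pop(0)            # forget the oldest data point
        sizes.append(len(training_set))
    return training_set, sizes


remaining, sizes = grow_with_forgetting(6, forget_every=3)
# sizes     -> [1, 2, 2, 3, 4, 4]
# remaining -> [2, 3, 4, 5]
```

With any `forget_every` greater than 1 the set still grows overall, which matches the requirement that growth be faster than forgetting.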

Fig. 5 shows an example of an exponentially decaying "natural" forgetting function of a training data set 12, which results from the data points in the training data set 12 being weighted with an ever smaller weighting factor as their age increases.
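
Exponential forgetting by age-dependent weighting can be sketched as follows. The decay factor and the use of the weights in a weighted mean squared error are assumptions made for illustration; the description only states that older data points receive ever smaller weights.

```python
def forgetting_weights(num_points, decay=0.9):
    """Weights for data points ordered from oldest (index 0) to newest.

    The newest point has weight 1; each step back in age multiplies
    the weight by decay (assumed to satisfy 0 < decay < 1).
    """
    return [decay ** (num_points - 1 - i) for i in range(num_points)]


def weighted_squared_error(errors, decay=0.9):
    """Weighted mean of squared errors y - y_NN, emphasising recent points."""
    weights = forgetting_weights(len(errors), decay)
    return sum(w * e * e for w, e in zip(weights, errors)) / sum(weights)


weights = forgetting_weights(3, decay=0.5)
# weights -> [0.25, 0.5, 1.0]
```

A training algorithm that minimises such a weighted error is influenced less and less by old data points without any of them being removed outright.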

Fig. 6 shows, for the learning method according to the invention, the typical error curve 13 of the residual error F of the neural network 1 as a function of the number N of available data points, in comparison with the error curve 14 of a reference learning method, here an inheritance file in the steady state.

Claims

1. Learning method for a neural network (1), consisting of a combination of a continuously running on-line adaptation (5) and a cyclically repeated adaptation (6) of parameters (p) of the neural network (1), wherein the on-line adaptation (5) takes place in each case as a function of a current data point, which consists of input variables (x) currently supplied to the neural network (1) and of reference variables (y) to be compared with the resulting response (y_NN) of the neural network (1), and wherein the cyclically repeated adaptation (6) takes place as a function of a training data set (9 to 12) formed from the data points (x, y) that have occurred up to that point.

2. Learning method according to claim 1, characterised in that the on-line adaptation (5) is limited to selected parameters (p₁) of the neural network (1).

3. Learning method according to claim 1 or 2, characterised in that, in an initial learning phase (7), the cyclically repeated adaptation (6) is carried out each time on the basis of the training data set (9, 11), which grows with each intervening on-line adaptation (5), and that, in a later re-learning phase (8), the training data set (10) for the cyclically repeated adaptation (6) is bounded by forgetting the oldest data points contained in the training data set (10) to the same extent as new data points are added to the training data set (10).

4. Learning method according to claim 3, characterised in that, during the initial learning phase (7), the oldest data points of the training data set (11) are forgotten to a lesser extent than new data points are added to the training data set (11).

5. Learning method according to claim 3 or 4, characterised in that the data points are forgotten by being removed from the training data set (11).

6. Learning method according to claim 3 or 4, characterised in that the data points are forgotten by being weighted with an ever smaller weighting factor as their age increases.

7. Learning method according to one of the preceding claims, characterised in that the size of the neural network (1) is changed as a function of the size of the available training data set (9 to 12).

8. Learning method according to one of the preceding claims, characterised in that a cyclically performed adaptation is first followed by an on-line adaptation using at least some of the data points underlying the training data set (9 to 12), before the neural network (1) is trained with new data points.