Apparatus and method for training neural networks and for validating the same
Technical Field
Various embodiments relate generally to an apparatus and method for training a neural network and an apparatus and method for validating a neural network.
Background
Different neural networks are used, for example, for classifying data. These neural networks may be trained on ground-truth data. Because such ground-truth data do not cover all possible scenarios or contexts, undesirable correlations may be learned, which may lead to misclassifications. Classification may therefore need to be performed independently of the context of the data. Furthermore, within the scope of a validation process and/or a verification process, it may be desirable to validate or verify the context-independence of the classification determined by the neural network.
One method for determining a saliency map is described in K. Simonyan et al., "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps", ICLR Workshop, 2014.
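As a non-authoritative sketch of the referenced technique, a gradient-based saliency map can be computed roughly as follows (PyTorch; the model, input shape, and class index are illustrative assumptions, not part of this disclosure):

```python
import torch

def saliency_map(model: torch.nn.Module,
                 image: torch.Tensor,       # (C, H, W) input image
                 class_index: int) -> torch.Tensor:
    """Importance of each image point for the score of `class_index`."""
    image = image.detach().clone().requires_grad_(True)  # track pixel gradients
    scores = model(image.unsqueeze(0))                   # (1, num_classes) scores
    scores[0, class_index].backward()                    # d(class score)/d(pixel)
    # Per Simonyan et al.: maximum absolute gradient over the colour channels.
    return image.grad.abs().amax(dim=0)                  # (H, W) saliency map
```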
Disclosure of Invention
The method and the device having the features of the independent claims 1 (first example) and 11 (twenty-ninth example) make it possible to train the neural network for context-free classification.
The method and the apparatus having the features of the independent claims 14 (fortieth example) and 15 (fifty-fourth example) make it possible to validate the neural network with respect to context-free classification.
At least a portion of the neural network may be implemented by one or more processors. The features described in this paragraph in combination with the first example form a second example.
The classification importance value of each input datum of the saliency map assigned to a category may indicate the importance of the corresponding input datum when assigning that category. In other words, at least one of the classified input data may have a category of a plurality of categories, and a saliency map may be generated for that category based on the classified input data, wherein the saliency map has a classification importance value for each input datum and wherein the classification importance value specifies the importance or relevance of the respective input datum when assigning the category to the at least one input datum having the category. The features described in this paragraph form a third example in combination with the first or second example.
The generation of the saliency map also has: a first classification importance value and a second classification importance value are assigned for each of a plurality of classification importance values. The features described in this paragraph form a fourth example in combination with one or more of the first to third examples.
Each classification importance value below a threshold may be assigned the first classification importance value, while each classification importance value above the threshold may be assigned the second classification importance value. This has the advantage that each input datum can be marked as either "unimportant" or "important" for the assignment of the category. In other words, the first or second classification importance value may state whether the corresponding input datum has an influence on the assignment of the category. The features described in this paragraph in combination with the fourth example form a fifth example.
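A minimal sketch of this thresholding, assuming binary values "0" and "1" and an arbitrary threshold (neither value is fixed by this example):

```python
import torch

def binarize_importance(saliency: torch.Tensor, threshold: float) -> torch.Tensor:
    # Below the threshold -> first value "0" ("unimportant"),
    # above the threshold -> second value "1" ("important").
    return (saliency > threshold).to(saliency.dtype)
```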
The determination of the first classification error may have: determining a first loss value based on a first loss function for the assigned one of the plurality of classes. The features described in this paragraph form a sixth example in combination with one or more of the first through fifth examples.
The first loss function may be a cross-entropy loss function. The features described in this paragraph in combination with the sixth example form a seventh example.
For each of the input data, the class membership of the target segmentation may have one of a plurality of classes and one segmentation. A segmentation may have multiple input data, which may be assigned to the same category. In other words, a plurality of input data having the same category may be assigned to one segmentation. This has the advantage that, if the input data contain a plurality of objects, the input data belonging to a respective object can be assigned to a respective segmentation, so that each object of the plurality of objects is distinguished from the other objects. The features described in this paragraph form an eighth example in combination with one or more of the first through seventh examples.
Each of the input data may be assigned to exactly one segmentation of the plurality of segmentations. The features described in this paragraph in combination with the eighth example form a ninth example.
Each segmentation of the plurality of segmentations may be different from all other segmentations of the plurality of segmentations. This has the advantage that each input datum can be assigned to exactly one of the plurality of objects, so that each object is unambiguously distinguishable from the other objects. The features described in this paragraph in combination with the eighth example or the ninth example form a tenth example.
The comparison of these classified input data with the target segmentation may have: the classified input data is compared to a category of the plurality of categories assigned to respective ones of the input data. Features described in this paragraph form an eleventh example in combination with one or more of the eighth example through the tenth example.
The comparison of the saliency map with the target segmentation may have: comparing the classification importance value that the saliency map of a class assigns to the respective input datum with the segmentation of the plurality of segmentations assigned to that input datum. In other words, the classification importance value of an input datum may be compared with the segmentation assigned to the class. The features described in this paragraph form a twelfth example in combination with one or more of the eighth to eleventh examples.
Each classification importance value of the plurality of classification importance values whose assigned input datum has the category of the plurality of categories may be set equal to the value "0". In other words, the classification importance value of each input datum assigned to the segmentation of the respective category may be set equal to the value "0". This has the advantage that the plurality of input data assigned to the segmentation of the respective category may be considered "important" for the assignment of that category; they may have a large influence on the assignment of the category, yet have no influence on the second classification error. The features described in this paragraph in combination with the twelfth example form a thirteenth example.
The determination of the second classification error may have: a second loss value is determined for the saliency map assigned to one of the classes. The features described in this paragraph form a fourteenth example in combination with one or more of the first to thirteenth examples.
The second loss value assigned to a class may have: a sum of all classification importance values of the plurality of classification importance values of the saliency map assigned to that class. The features described in this paragraph in combination with the fourteenth example form a fifteenth example.
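Taken together with the thirteenth example, the second loss value could be sketched as follows (a hedged illustration; the tensor layout and names are assumptions):

```python
import torch

def second_loss_value(saliency: torch.Tensor,        # importance value per input datum
                      target_classes: torch.Tensor,  # class per input datum
                      class_index: int) -> torch.Tensor:
    # Importance values of input data assigned to the class are set to "0";
    # the second loss value is the sum of all remaining importance values.
    keep = (target_classes != class_index).to(saliency.dtype)
    return (saliency * keep).sum()
```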
The adaptation of the neural network may have: minimizing at least one total loss value based on a total loss function, wherein the total loss value may be based on a first classification error of the assigned class and a second classification error of the saliency map assigned to that class. This has the advantage that not only the error in assigning a class, that is, the first classification error, but also the influence of input data not assigned to the class on the assignment of the class, that is, the second classification error, can be reduced. Features described in this paragraph form a sixteenth example in combination with one or more of the first to fifteenth examples.
The total loss value for the assigned class may be a sum of the first classification error and the second classification error. That is, the total loss value may be a sum of the first loss value and the second loss value. The features described in this paragraph in combination with the sixteenth example form a seventeenth example.
The total loss value for the assigned category may be a weighted sum: the first classification error multiplied by a first weighting factor plus the second classification error multiplied by a second weighting factor. This has the advantage that a higher relevance may be assigned to either the first classification error or the second classification error. The features described in this paragraph in combination with the sixteenth example or the seventeenth example form an eighteenth example.
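A one-line sketch of this weighted sum; the weighting factors are free hyperparameters and are not fixed by the example:

```python
def total_loss(first_error, second_error, w1: float = 1.0, w2: float = 0.1):
    # Larger w1 emphasizes correct class assignment; larger w2 emphasizes
    # suppressing importance outside the class's segmentation.
    return w1 * first_error + w2 * second_error
```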
The method for training the neural network may be repeated until the total loss value meets a predefined target criterion. The features described in this paragraph form a nineteenth example in combination with one or more of the sixteenth to eighteenth examples.
The determination of the first classification error may have: a first classification error is determined for each of the plurality of classes. The features described in this paragraph, in combination with one or more of the first through nineteenth examples, form a twentieth example.
The generation of the saliency map may also have: a saliency map is generated for each of the plurality of classes. Features described in this paragraph form the twenty-first example in combination with one or more of the first to twentieth examples.
The determination of the second classification error may further have: determining a second classification error for each saliency map of the plurality of saliency maps. The features described in this paragraph in combination with the twenty-first example form a twenty-second example.
The adaptation of the neural network may have determining a total loss value for each of the plurality of classes and may have minimizing each of the plurality of total loss values. That is, the neural network may be trained for each of the plurality of classes. Features described in this paragraph in combination with one or more of the first through twenty-second examples form a twenty-third example.
The method for training the neural network may be repeated until each of the plurality of total loss values meets a respectively predefined target criterion. The features described in this paragraph in combination with the twenty-third example form a twenty-fourth example.
The plurality of total loss values may have a common total loss value and the method for training the neural network may be repeated until the common total loss value meets a predefined common target criterion. This has the following advantages: each of the plurality of categories may be assigned a different relevance, such as by a weighting factor. The features described in this paragraph in combination with the twenty-third example form a twenty-fifth example.
The target segmentation may be provided by at least one additional neural network. Features described in this paragraph in combination with one or more of the first through twenty-fifth examples form a twenty-sixth example.
The input data may comprise digital image data, and each of the input data may comprise or be formed by one of a plurality of image points. Each image point may be assigned color values (for example three or four color values; in the case of the RGB color space, a red, a green and a blue value) and/or luminance values or other values, depending on the color space used. The features described in this paragraph form a twenty-seventh example in combination with one or more of the first through twenty-sixth examples.
The input data may have a plurality of digital image data. Each input datum of the assigned image data of the plurality of image data can have an image point of the plurality of image points or be formed from such an image point. The method for training a neural network may be performed for each image data of the plurality of image data. In other words, the input data may have a plurality of digital images. Each of the plurality of digital images may have a plurality of image points. The method for training a neural network may be performed for each digital image of the plurality of digital images. The features described in this paragraph in combination with the twenty-seventh example form a twenty-eighth example.
At least a portion of the neural network may be implemented by one or more processors. The features described in this paragraph in combination with the twenty-ninth example form a thirtieth example.
The system may have a device according to the twenty-ninth example or the thirtieth example. The system can have a sensor, for example an imaging sensor, which is set up to provide the input data. The features described in this paragraph form a thirty-first example.
The imaging sensor may be a video sensor. The features described in this paragraph in combination with the thirty-first example form a thirty-second example.
The input data may have a plurality of digital image data. Each input datum of the assigned image data of the plurality of image data can have an image point of the plurality of image points or be formed from such an image point. The neural network may be set up for processing each of these image data. In other words, the input data may have a plurality of digital images, each of which may have a plurality of image points, and the neural network may be set up for processing each of the plurality of digital images. Features described in this paragraph in combination with the thirty-first example or the thirty-second example form a thirty-third example.
The system may also have at least one additional neural network which is set up to provide the target segmentation. Features described in this paragraph in combination with one or more of the thirty-first through thirty-third examples form a thirty-fourth example.
The system may be a medical imaging system. Features described in this paragraph in combination with one or more of the thirty-first through thirty-fourth examples form a thirty-fifth example.
The vehicle may have a driving assistance system. The driving assistance system may have a system according to one or more of the thirty-first to thirty-fifth examples. The features described in this paragraph form a thirty-sixth example.
The vehicle may have at least one imaging sensor, which is set up to provide digital image data. These digital image data may have a plurality of image objects. The vehicle may also have a driving assistance system. The driving assistance system may have a neural network trained according to the twenty-seventh example or the twenty-eighth example. The neural network of the driving assistance system may be set up for classifying and segmenting the digital image data. The driving assistance system may be set up to control the vehicle on the basis of the classified or segmented digital image data. In other words, the driving assistance system can be designed to process the classified or segmented digital image data and to be able to output at least one control instruction based on the classified or segmented digital image data. This has the following advantages: each image object can be classified context-independently, that is to say independently of the other image objects of the plurality. That is, the number of misclassifications due to context may be reduced (e.g., prevented). In other words, these image objects can be correctly classified with higher accuracy. The features described in this paragraph form a thirty-seventh example.
The computer program may have program instructions which are set up to: when implemented by one or more processors, the program instructions implement the methods according to one or more of the first through twenty-eighth examples. The features described in this paragraph form the thirty-eighth example.
The computer program may be stored in a machine readable storage medium. The features described in this paragraph in combination with the thirty-eighth example form a thirty-ninth example.
At least a portion of the neural network may be implemented by one or more processors. The features described in this paragraph in combination with the fortieth example form a forty-first example.
The neural network may have been trained by the method for training according to one of the first to twenty-eighth examples. The features described in this paragraph in combination with the fortieth example or the forty-first example form a forty-second example.
Each classification importance value of the plurality of classification importance values whose assigned input datum has a predefined class membership may be set equal to the value "0". In other words, each classification importance value of the saliency map assigned to a class, whose assigned input datum has the class assigned to the saliency map, may be set equal to the value "0". The features described in this paragraph, in combination with one or more of the fortieth to forty-second examples, form a forty-third example.
The method may further have: validating the neural network if the segmentation error of the saliency map assigned to a class is less than a predefined value. This segmentation error can indicate how strongly input data that are not assigned to the category affect the assignment of the category. That is, the neural network may be validated if input data not assigned to the class have no substantial effect on the assignment of the class, that is to say if the segmentation error is less than the predefined value. The features described in this paragraph, in combination with one or more of the fortieth to forty-third examples, form a forty-fourth example.
The generation of the saliency map may also have: a saliency map is generated for each of the plurality of classes. The features described in this paragraph, in combination with one or more of the fortieth to forty-fourth examples, form a forty-fifth example.
The determination of the segmentation error may also have: a segmentation error is determined for each saliency map of the plurality of saliency maps. The features described in this paragraph in combination with the forty-fifth example form a forty-sixth example.
The determination of whether the segmentation error is less than a predefined value may have: determining whether each of the plurality of segmentation errors assigned to the saliency map is less than a respective predefined value. The features described in this paragraph, in combination with one or more of the forty-fifth through forty-sixth examples, form a forty-seventh example.
The plurality of segmentation errors may have a common segmentation error, and it may be determined whether the common segmentation error is less than a predefined common value. The features described in this paragraph, in combination with one or more of the forty-fifth to forty-sixth examples, form a forty-eighth example.
The target segmentation may be provided by at least one additional neural network. The features described in this paragraph, in combination with one or more of the fortieth to forty-eighth examples, form a forty-ninth example.
The input data may comprise digital image data, and each of the input data may comprise or be formed by one of a plurality of image points. The features described in this paragraph in combination with one or more of the forty-fifth through forty-ninth examples form a fiftieth example.
The input data may have a plurality of digital image data. Each input datum of the assigned image data of the plurality of image data can have an image point of the plurality of image points or be formed from such an image point. The method for validating a neural network may be performed for each image data of the plurality of image data. The features described in this paragraph in combination with the fiftieth example form a fifty-first example.
The computer program may have program instructions which are set up to: when implemented by one or more processors, the program instructions implement a method according to one or more of the fortieth example through the fifty-first example. The features described in this paragraph form the fifty-second example.
The computer program may be stored in a machine readable storage medium. The features described in this paragraph in combination with the fifty-second example form a fifty-third example.
At least a portion of the neural network may be implemented by one or more processors. The features described in this paragraph in combination with the fifty-fourth example form a fifty-fifth example.
Drawings
Embodiments of the invention are illustrated in the drawings and are described in detail below. In the drawings, like numerals generally refer to like parts throughout the several views. The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention.
Wherein:
FIG. 1 shows an apparatus according to various embodiments;
FIG. 2 illustrates an imaging device in accordance with various embodiments;
FIG. 3 illustrates an exemplary digital image;
FIG. 4 illustrates a processing system for training a neural network, in accordance with various embodiments;
FIG. 5A illustrates an exemplary classified digital image;
FIG. 5B illustrates an exemplary object segmentation;
FIG. 5C illustrates an exemplary saliency map;
FIG. 5D illustrates an exemplary processed saliency map;
FIG. 6 illustrates a method for training a neural network, in accordance with various embodiments;
FIG. 7 shows a vehicle according to various embodiments;
FIG. 8 illustrates a processing system for validating a neural network, in accordance with various embodiments; and
Fig. 9 illustrates a method for validating a neural network, in accordance with various embodiments.
Detailed Description
In one embodiment, a "circuit" may be understood as any type of logic-implementing entity, which may be hardware, software, firmware, or a combination thereof. Thus, in one embodiment, a "circuit" may be a hardwired logic circuit or a programmable logic circuit, such as a programmable processor, e.g. a microprocessor (e.g. a CISC (complex instruction set computer) or RISC (reduced instruction set computer) processor). A "circuit" may also be software implemented or executed by a processor, e.g. any type of computer program, for example a computer program using virtual machine code such as Java. According to an alternative embodiment, any other type of implementation of the respective functions described in more detail below may be understood as a "circuit".
Fig. 1 shows a system 100 according to various embodiments. The system 100 may have one or more sensors 102. The sensor 102 may be set up to provide input data 104. The sensor 102 may be an imaging sensor. Alternatively, the sensor 102 may be a LIDAR sensor or else a microphone. According to various embodiments, the input data 104 have digital image data (within the scope of the present description, detected LIDAR sensor signals are also understood to be image data). Each input datum of the input data 104 may have or be formed by one of a plurality of image points. The sensors of a plurality of sensors may be of the same sensor type or of different sensor types.
The system 100 may also have a storage device 106. The storage device 106 may have a memory. The memory may be used, for example, in processing performed by the processor. The memory used in these embodiments may be: volatile memory such as DRAM (dynamic random access memory); or a non-volatile memory, such as PROM (programmable read-only memory), EPROM (erasable PROM), EEPROM (electrically erasable PROM); or a flash memory such as a floating gate memory device, a charge trap memory device, an MRAM (magnetoresistive random access memory) or a PCRAM (phase change random access memory). The storage device 106 can be set up to store the input data 104. The system 100 may also have at least one processor 108 (e.g., exactly one processor, such as two processors, such as more than two processors). As described above, the at least one processor 108 may be any type of circuitry, that is, any type of logic implementing entity. In various embodiments, the at least one processor 108 is designed to process the input data 104.
Hereinafter, the embodiments are described in terms of a digital image as input data. However, it should be noted that: other (digital) input data may also be used, such as digital audio data or digital video data.
Fig. 2 shows an imaging system 200 in which the sensor is implemented as an imaging sensor 202, according to various embodiments. The imaging sensor 202 may be a camera sensor or a video sensor. The imaging sensor 202 may be configured to provide digital image data 204. The digital image data 204 have at least one digital image, for example a plurality of digital images 210. Each digital image of the plurality of digital images 210 may show a scene having a plurality of image objects, such as roads, cars, pedestrians, cyclists, and so on. According to various embodiments, the imaging system 200 has a plurality of imaging sensors.
Each digital image of the plurality of digital images 210 may have a plurality of image points. A digital image may have one or more image objects. An image object may be assigned a plurality of image points of the plurality of image points.
Fig. 3 shows a digital image 300 having a plurality of image points 302. The digital image 300 may have a plurality of image objects such as roads, a plurality of vehicles, a plurality of cyclists, pedestrians, houses, trees, and so forth. Illustratively, the digital image 300 may have a first image object 304 and a second image object 306.
As is shown in fig. 3, the first image object 304 may be, for example, a vehicle, and the second image object 306 may be, for example, a cyclist. The first image object 304 may have a plurality of first object image points 308 and the second image object 306 may have a plurality of second object image points 310.
Fig. 4 illustrates a processing system 400 for training a neural network, in accordance with various embodiments. The processing system 400 may have a storage device 106 for storing digital image data 204, such as a digital image 300. The processing system 400 may also have at least one processor 108. The processor 108 implements at least a portion of the neural network 402.
The neural network 402 is set up to process the digital image data 204. The neural network 402 may be set up to process, for example to classify, each image point of the plurality of image points of one or more digital images 204. For each image point of the plurality of image points 302, the classified digital image 404 may have an assigned category of a plurality of categories. Fig. 5A exemplarily shows a classified digital image 504. The classified digital image 504 may be generated by the neural network 402 based on the digital image 300. For each image point of the plurality of image points 302, the classified digital image 504 may have an assigned category x̂, wherein the category x̂ of one image point may be different from the categories x̂ of the other image points of the plurality of image points 302. For example, the first object image points 308 assigned to the first image object 304 may have a common assigned first category of the plurality of categories, while the second object image points 310 assigned to the second image object 306 may have a common assigned second category of the plurality of categories. The common assigned second category may be different from the common assigned first category.
As shown in fig. 4, the storage device 106 is also set up to store the target segmentation 408. The target segmentation 408 may be provided at least in part (e.g. a portion of the target segmentation, such as the entire target segmentation) by an additional neural network. The target segmentation 408 may have an assigned class membership for each of the plurality of image points 302 of the digital image 300. In various embodiments, for each image point of the plurality of image points 302, the class membership of the target segmentation 408 has an assigned class of a plurality of classes. Furthermore, the target segmentation 408 may assign a plurality of object image points of the plurality of image points 302 to an image object. The object image points assigned to an image object can have the same assigned class. The class of the plurality of classes which is assigned to the object image points of an image object may be different from all other classes of the plurality of classes. A segmentation may have the object image points assigned to an image object. That is, the target segmentation 408 may have multiple segmentations, where each segmentation may be different from the other segmentations of the multiple segmentations. That is, the matrix $K$ (see equation (1)) may have the size of the digital image 300; that is, each matrix element $k_{ij}$ may be assigned exactly one image point of the plurality of image points 302. For a digital image with $x \times y$ image points, the matrix $K$ may thus have the size $x \times y$:

$K \in \mathbb{N}^{x \times y}$ (1)

For each matrix element $k_{ij}$ whose assigned image point corresponds to an object image point assigned to an image object, the matrix $K$ may have the class assigned to that image object, and all other matrix elements $k_{ij}$ may have one or more other assigned categories. Each matrix element $k_{ij}$ may have a natural number.
The category membership may be provided by a first additional neural network. The segmentation, that is to say the assignment of object image points to an image object, can be provided by a second additional neural network.
Fig. 5B exemplarily shows the target segmentation 508. The target segmentation 508 may have an object segmentation of the digital image 300. For each image point of the plurality of image points 302, the target segmentation 508 may have a class membership, i.e. an assigned class of the plurality of classes. According to the exemplary target segmentation 508 illustrated in fig. 5B, the first object image points 308 assigned to the first image object 304 may have a category X1 and the second object image points 310 assigned to the second image object 306 may have a category X2. The category X1 and the category X2 may be different from each other. Image points of the plurality of image points 302 which are assigned neither to the first image object 304 nor to the second image object 306 may have a category X0, which differs from the category X1 and the category X2, respectively. The first object image points 308 assigned to the first image object 304 may form a first segmentation. The second object image points 310 assigned to the second image object 306 may form a second segmentation.
As shown in fig. 4, the neural network 402 may also be set up to generate a saliency map 406. The saliency map 406 may be generated for one of the plurality of categories. The saliency map 406 may be based on the classified digital image 404. For each image point of the plurality of image points 302, the saliency map 406 may have an assigned classification importance value. The classification importance value of each image point of the plurality of image points 302 of the saliency map 406 assigned to a category may indicate the importance of the corresponding image point when assigning that category. In other words, the classification importance value may specify the importance or relevance of the respective image point of the plurality of image points 302 when assigning the category. That is, the matrix $S$ (see equation (2)) may have the size of the digital image 300; that is, each matrix element $s_{ij}$ may be assigned exactly one image point of the plurality of image points 302. The matrix $S$ may thus have the size $x \times y$:

$S \in \mathbb{R}^{x \times y}$ (2)

For each matrix element $s_{ij}$, the matrix $S$ may have the assigned classification importance value. Each matrix element $s_{ij}$ may have a real number.
The processor 108 is set up to assign each classification importance value of the plurality of classification importance values a first classification importance value or a second classification importance value. The processor 108 may be set up to assign the first classification importance value to each classification importance value below a threshold and the second classification importance value to each classification importance value above the threshold. According to various embodiments, the first classification importance value is equal to the first binary value "0" and the second classification importance value is equal to the second binary value "1". Fig. 5C exemplarily shows the saliency map 506A. The saliency map 506A may be generated based on the classified digital image 404. The saliency map 506A may be generated for the category X1 and may have an assigned classification importance value Ŷ for each image point of the plurality of image points 302. That is, each classification importance value Ŷ of the plurality of classification importance values may indicate the importance of an image point of the plurality of image points 302 when assigning the category X1 to that image point. Each classification importance value of the plurality of classification importance values may be different from the other classification importance values of the plurality of classification importance values.
As shown in fig. 4, the processor 108 may be set up to determine a first classification error 410. The first classification error 410 may be determined by a comparison of the classified digital image 404 with the target segmentation 408. The comparison of the classified digital image 404 with the target segmentation 408 may be performed for each image point of the plurality of image points 302: the class assigned to the respective image point of the plurality of image points 302 of the classified digital image 404 is compared with the class membership assigned to the respective image point of the plurality of image points 302. According to various embodiments, the determination of the first classification error 410 has: determining a first loss value based on a first loss function. A first loss value may be determined for the assigned category of the plurality of categories. The first loss function may be a cross-entropy loss function.
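As a hedged sketch, the first loss value could be computed with a per-image-point cross-entropy loss (the shapes are assumptions for a segmentation-style network):

```python
import torch
import torch.nn.functional as F

def first_classification_error(logits: torch.Tensor,         # (N, C, H, W) class scores
                               target_classes: torch.Tensor  # (N, H, W) class indices
                               ) -> torch.Tensor:
    # Cross-entropy between the predicted class scores and the class
    # membership of the target segmentation, averaged over image points.
    return F.cross_entropy(logits, target_classes)
```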
The processor 108 may also be set up to determine a second classification error 412. The second classification error 412 may be determined by a comparison of the saliency map 406 with the target segmentation 408, for example by a comparison of the saliency map 406 with the target segmentation 408 for each image point of the plurality of image points 302. In this case, the classification importance value assigned to the respective image point of the plurality of image points 302 of the saliency map 406 assigned to a class can be compared with the assigned segmentation of the image object assigned to this class. The processor 108 may be set up to set one or more classification importance values of the plurality of classification importance values equal to "0". In various embodiments, each classification importance value of the plurality of classification importance values whose assigned object image point is assigned to an image object of the class is set equal to "0". In other words, each classification importance value of the saliency map assigned to a class whose assigned image point has the class assigned to the saliency map is set equal to "0". This may be implemented as follows: each matrix element $k_{ij}$ of the matrix $K$ given in equation (1) having the class assigned to the saliency map is set equal to the value "0", while each matrix element $k_{ij}$ without the assigned class is set equal to the value "1", so that a matrix $K'$ is obtained. Each matrix element $k'_{ij}$ of the matrix $K'$ may be multiplied by the associated matrix element $s_{ij}$ of the matrix $S$ given in equation (2), so as to obtain a matrix $S'$ having matrix elements $s'_{ij}$.
In various embodiments, the classification importance value assigned to the respective image point of the plurality of image points 302 is the first classification importance value or the second classification importance value. In various embodiments, the determination of the second classification error 412 has: determining a second loss value. A second loss value may be determined for the saliency map 406 assigned to a class. The second loss value may be determined after each classification importance value of the plurality of classification importance values whose assigned object image point is assigned to an image object of the category has been set equal to "0". The second loss value assigned to a class may have: the sum of all classification importance values of the plurality of classification importance values of the saliency map 406 assigned to that class. That is, the second loss value $V_2$ (see equation (3)) may be determined based on the matrix $S'$, and $V_2$ may have the sum of all matrix elements $s'_{ij}$:

$V_2 = \sum_{i=1}^{x} \sum_{j=1}^{y} s'_{ij}$ (3)
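A small worked sketch of equations (1) to (3) as reconstructed above, with illustrative values (the 3x3 size and the class indices are assumptions):

```python
import torch

K = torch.tensor([[1, 1, 0],
                  [1, 0, 2],
                  [0, 2, 2]])              # target segmentation: class per image point
S = torch.rand(3, 3)                       # saliency map for the class with index 1

K_prime = (K != 1).to(S.dtype)             # "0" where class 1 is assigned, else "1"
S_prime = K_prime * S                      # s'_ij = k'_ij * s_ij
V2 = S_prime.sum()                         # second loss value, equation (3)
```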
Fig. 5D exemplarily shows the processed saliency map 506B. The processed saliency map 506B may be generated based on the saliency map 506A. The saliency map 506A may be generated for the category X1. The first object image points 308 assigned to the first image object 304 may be assigned the category X1. For each first object image point 308 assigned to the first image object 304, the processed saliency map 506B may have a classification importance value Ŷ equal to "0". In other words, the processor 108 may be set up to set each classification importance value Ŷ of the first object image points 308, that is to say of the first segmentation having the class X1, equal to "0". The second loss value may have: the sum of all classification importance values Ŷ of the plurality of classification importance values of the processed saliency map 506B assigned to the category X1.
As shown in fig. 4, the processor 108 may also be set up to adapt the neural network 402 (in other words, to train the neural network 402), for example by minimizing at least one total loss value 414. The total loss value 414 is based on the first classification error 410 of the assigned class and the second classification error 412 of the saliency map 406 assigned to that class. The total loss value 414 for the assigned category may be a sum (optionally a weighted sum) of the first classification error 410 (e.g. the first loss value) and the second classification error 412 (e.g. the second loss value).
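A hedged sketch of one adaptation step, assuming a per-image-point classifier and the gradient-based saliency map sketched earlier; `create_graph=True` keeps the saliency differentiable so that the second classification error can be minimized as well. All names and weighting factors are illustrative assumptions, not the disclosed implementation:

```python
import torch
import torch.nn.functional as F

def adapt_step(model, optimizer, image, target_classes, class_index,
               w1: float = 1.0, w2: float = 0.1) -> float:
    image = image.detach().requires_grad_(True)         # (1, C, H, W) input image
    logits = model(image)                               # (1, num_classes, H, W)
    first = F.cross_entropy(logits, target_classes)     # first classification error 410
    # Saliency for `class_index`: gradient of its summed score w.r.t. the image.
    score = logits[:, class_index].sum()
    grad, = torch.autograd.grad(score, image, create_graph=True)
    saliency = grad.abs().amax(dim=1).squeeze(0)        # (H, W) importance values
    mask = (target_classes.squeeze(0) != class_index).float()
    second = (saliency * mask).sum()                    # second classification error 412
    loss = w1 * first + w2 * second                     # total loss value 414
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```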
According to various embodiments, the processing system 400 also has a sensor 102.
The processing system 400 may also have at least one additional neural network which is set up to provide at least a part of the target segmentation 408 (e.g. the entire target segmentation).
The processing system 400 may be a medical imaging system.
According to one embodiment, a computer controlled machine, such as a robot, vehicle, household appliance, power tool, production tool, intelligent personal assistant or access control system is provided. A computer controlled machine may have a processing system 400.
According to one embodiment, a vehicle is provided having a driving assistance system. The driving assistance system may have a processing system 400.
FIG. 6 illustrates a method 600 for training a neural network, in accordance with various embodiments. The method 600 may have: the input data 104 is classified by the neural network 402. The input data 104 may have digital image data 204, such as a digital image 300. The method 600 may have: the digital image 300 is classified (at 602) by the neural network 402. The classification of the digital image 300 may have: each of the plurality of image points 302 is assigned one of a plurality of categories. The method 600 may also have: a saliency map 406 is generated (at 604). Saliency map 406 may be generated for one of the plurality of categories. A saliency map 406 may be generated based on the classified digital image 404. The generation of saliency map 406 may have: each image point of the plurality of image points 302 is assigned a classification importance value. The method 600 may have: a target segmentation 408 is provided (at 606). For each of the plurality of image points 302, the target segmentation 408 may have an assigned class membership. The method 600 may also have: a first classification error 410 is determined (at 608). A first classification error 410 may be determined by comparison of the classified digital image 404 with the target segmentation 408. The method 600 may also have: a second classification error 412 is determined (at 610). The second classification error 412 may be determined by comparison of the saliency map 406 with the target segmentation 408. The method 600 may have: the neural network 402 is adapted (at 612). The neural network 402 may be adapted based on the first classification error 410 and the second classification error 412. The adaptation of the neural network 402 may have a minimization of the total loss value 414, and the total loss value 414 may be based on the first classification error 410 (e.g., a first loss value) and the second classification error 412 (e.g., a second loss value). According to various embodiments, the method 600 is repeated until the total loss value 414 meets a predefined target criterion.
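A minimal loop sketch of method 600 with the repeat-until criterion; `dataset`, `target_bound` and `adapt_step` (sketched above) are assumed names:

```python
for image, target_classes in dataset:           # one digital image per iteration
    loss = float("inf")
    while loss > target_bound:                  # predefined target criterion
        loss = adapt_step(model, optimizer, image, target_classes,
                          class_index=1)        # e.g. the class of image object 304
```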
According to various embodiments, the generation of the saliency map 406 may have: generating a respective saliency map 406 for each class of the plurality of classes. According to various embodiments, the determination of the first classification error 410 has: determining a first classification error 410 for each class of the plurality of classes. In various embodiments, the determination of the second classification error 412 has: determining a second classification error 412 for each saliency map 406 of the plurality of saliency maps. According to various embodiments, the adaptation of the neural network 402 has: determining a total loss value 414 for each class of the plurality of classes. The adaptation of the neural network 402 may also have: minimizing each total loss value 414 of the plurality of total loss values. According to various embodiments, the method 600 is repeated until each total loss value 414 of the plurality of total loss values meets a respective predefined target criterion. The plurality of total loss values may have a common total loss value, and the method 600 may be repeated until the common total loss value meets a predefined common target criterion.
Fig. 7 shows a vehicle 700 according to various embodiments. The vehicle 700 may be a vehicle having an internal combustion engine, an electric vehicle, a hybrid vehicle, or a combination thereof. The vehicle 700 may also be an automobile, a truck, a watercraft, an unmanned aerial vehicle, an aircraft, and so on.
The vehicle 700 may have at least one sensor (e.g. an imaging sensor) 702 (e.g. the sensor 102). The vehicle 700 may have a driving assistance system 704. The driving assistance system 704 may have a storage device 106. The driving assistance system 704 may have a processor 108. The processor 108 may implement a neural network. The neural network of the driving assistance system 704 may be set up for classifying and segmenting digital image data. According to various embodiments, the neural network is trained according to the method 600 for training a neural network, so that the neural network can classify digital image data in a context-free manner. The context-free classification prevents undesirable correlations from being learned that could lead to misclassifications. That is, the driving assistance system 704 can better assign a plurality of image objects to the respective categories, better segment the plurality of image objects, and thereby better recognize the plurality of image objects. One aspect is therefore to provide a vehicle that enables improved identification of image objects in digital image data.
The driving assistance system 704 can be set up to control the vehicle 700 on the basis of the context-free classified or segmented digital image data. In other words, the driving assistance system 704 may be designed to process the context-free classified or segmented digital image data and to be able to output at least one control instruction to one or more actuators of the vehicle 700 on the basis of the context-free classified or segmented digital image data. That is, the driving assistance system 704 may influence the current driving behavior based on these context-free classified or segmented digital image data, e.g., may maintain or change the current driving behavior. The change to the driving behavior may be, for example, an intervention in the driving behavior for safety reasons, such as emergency braking.
Fig. 8 illustrates a processing system 800 for validating a neural network, in accordance with various embodiments. The processing system 800 may have a storage device 106 for storing input data 104. The input data 104 may be provided by the sensor 102. The processing system 800 may also have a processor 108. The processor 108 may be set up to implement at least a portion of the neural network 802.
The neural network 802 may be set up to classify digital image data 204, such as the digital image 300. For each image point of the plurality of image points 302, the classified digital image 804 may have an assigned category of a plurality of categories. The neural network 802 may also be set up to generate a saliency map 806. The saliency map 806 may be generated for one of the plurality of categories. The saliency map 806 may be based on the classified digital image 804. For each image point of the plurality of image points 302, the saliency map 806 may have an assigned classification importance value. The classification importance value of each image point of the plurality of image points 302 of the saliency map 806 assigned to a category may indicate the importance of the corresponding image point when assigning that category. The processor 108 may be set up to assign each classification importance value of the plurality of classification importance values a first classification importance value or a second classification importance value. The processor 108 may be set up to assign the first classification importance value (e.g. "0") to each classification importance value below a threshold and the second classification importance value (e.g. "1") to each classification importance value above the threshold.
The storage device 106 may also be set up to store the target partition 808.
The processor 108 may be set up to process the target segmentation 808, for example to determine a segmentation error 812. The segmentation error 812 may be determined by a comparison of the saliency map 806 with the target segmentation 808, for example by means of a loss function.
The processor 108 may be set up to determine (at 814) whether the segmentation error 812 is less than a predefined value. The processor 108 may be set up to validate the neural network 802 (at 816) if the segmentation error 812 is less than the predefined value ("yes" at 814), and not to validate the neural network 802 (at 818) if the segmentation error 812 is greater than or equal to the predefined value ("no" at 814).
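The validation decision at 814 to 818 could be sketched as follows (the saliency-mask form of the segmentation error and the bound `epsilon` are assumptions consistent with the description above, not the disclosed loss function):

```python
import torch

def validate(saliency: torch.Tensor,        # (H, W) classification importance values
             target_classes: torch.Tensor,  # (H, W) class per image point
             class_index: int,
             epsilon: float) -> bool:
    # Segmentation error 812: total importance outside the class's segmentation.
    mask = (target_classes != class_index).to(saliency.dtype)
    segmentation_error = (saliency * mask).sum().item()
    return segmentation_error < epsilon     # "yes" at 814 -> validated (816)
```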
Fig. 9 illustrates a method 900 for validating a neural network, in accordance with various embodiments. The method 900 may have: the input data (e.g., digital image data) 104 is classified by the neural network 802.
The method 900 may have: the digital image 300 is classified (at 902) by the neural network 802. The classification of the digital image 300 may have: each of the plurality of image points 302 is assigned one of a plurality of categories. According to various embodiments, the neural network 802 is trained or adapted according to the method 600 for training a neural network. The method 900 for validating a neural network may also have: a saliency map 806 is generated (at 904). Saliency map 806 may be generated for one of the plurality of categories. A saliency map 806 may be generated based on the classified digital image 804. The generation of saliency map 806 may have: each image point of the plurality of image points 302 is assigned a classification importance value. The method 900 may have: a target segmentation 808 is provided (at 906). For each image point of the plurality of image points 302, the target segmentation 808 may have an assigned class membership. The method 900 may have: a segmentation error 812 is determined (at 908). Segmentation error 812 may be determined by comparison of saliency map 806 with target segmentation 808. The method 900 may also have: it is determined whether the segmentation error 812 is less than a predefined value (at 910). According to various embodiments, the method 900 has: if the segmentation error 812 is less than a predefined value, the neural network 802 is validated.