CN115984662B - Multi-mode data pre-training and identifying method, device, equipment and medium - Google Patents
Multi-mode data pre-training and identifying method, device, equipment and medium Download PDFInfo
- Publication number
- CN115984662B CN115984662B CN202310272537.2A CN202310272537A CN115984662B CN 115984662 B CN115984662 B CN 115984662B CN 202310272537 A CN202310272537 A CN 202310272537A CN 115984662 B CN115984662 B CN 115984662B
- Authority
- CN
- China
- Prior art keywords
- defect
- scene
- data
- information
- factors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 74
- 238000012549 training Methods 0.000 title claims abstract description 55
- 230000007547 defect Effects 0.000 claims abstract description 428
- 238000001514 detection method Methods 0.000 claims abstract description 54
- 239000013598 vector Substances 0.000 claims abstract description 29
- 230000004927 fusion Effects 0.000 claims abstract description 16
- 239000011159 matrix material Substances 0.000 claims description 52
- 230000006870 function Effects 0.000 claims description 43
- 238000001228 spectrum Methods 0.000 claims description 38
- 230000008569 process Effects 0.000 claims description 34
- 238000004519 manufacturing process Methods 0.000 claims description 25
- 238000004590 computer program Methods 0.000 claims description 24
- 238000000605 extraction Methods 0.000 claims description 19
- 238000013507 mapping Methods 0.000 claims description 15
- 238000009826 distribution Methods 0.000 claims description 13
- 238000004364 calculation method Methods 0.000 claims description 12
- 238000003860 storage Methods 0.000 claims description 12
- 238000013519 translation Methods 0.000 claims description 8
- 238000013461 design Methods 0.000 claims description 7
- 210000000988 bone and bone Anatomy 0.000 claims description 6
- 239000004744 fabric Substances 0.000 claims description 6
- 230000007246 mechanism Effects 0.000 claims description 6
- 239000002184 metal Substances 0.000 claims description 6
- 239000004065 semiconductor Substances 0.000 claims description 6
- 235000012431 wafers Nutrition 0.000 claims description 6
- 239000002023 wood Substances 0.000 claims description 6
- 238000010276 construction Methods 0.000 claims description 5
- 238000000227 grinding Methods 0.000 claims description 5
- 239000000203 mixture Substances 0.000 claims description 5
- 238000005096 rolling process Methods 0.000 claims description 5
- 238000010008 shearing Methods 0.000 claims description 5
- 230000001502 supplementing effect Effects 0.000 claims 1
- 239000000047 product Substances 0.000 description 13
- 230000014616 translation Effects 0.000 description 7
- 238000010586 diagram Methods 0.000 description 5
- 238000013135 deep learning Methods 0.000 description 4
- 238000007689 inspection Methods 0.000 description 4
- 239000013589 supplement Substances 0.000 description 4
- 238000007254 oxidation reaction Methods 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 238000009411 base construction Methods 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000003647 oxidation Effects 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 229910000831 Steel Inorganic materials 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 230000008485 antagonism Effects 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000000875 corresponding effect Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000009776 industrial production Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 239000010959 steel Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-mode data pre-training and identifying method, device, equipment and medium, which are used for constructing a defect scene rule database by carrying out multi-source heterogeneous data fusion on defect basic data acquired by acquisition; extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database; constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, carrying out data and rule matching training, and generating a modal identification model; and carrying out defect recognition on the sample to be detected according to the modal recognition model. The product defect detection accuracy and the robustness of the model can be improved.
Description
Technical Field
The present invention relates to the field of image recognition, and in particular, to a method, apparatus, device, and medium for training and recognizing multimodal data.
Background
With the rapid development of the precision manufacturing industry, the loss caused by the surface defects of high-precision instruments is up to the level of one hundred billion yuan every year, and the high-precision defect detection requirements of industrial products are increasingly strong. Particularly, the industrial production environment has highly complex conditions such as noise, shielding, vibration, dim light and the like, so that the defect detection has to have the requirements of intelligence, high precision, long time and high efficiency.
Although the defect accuracy is improved to a certain extent by applying the deep learning algorithm at the present stage, the defect sample is small and unbalanced in the existing high-precision defect detection process, and meanwhile, the defect sample is easily influenced by environments such as shielding, oxidization and vibration, and the problems of low product defect detection accuracy and weak robustness of a model exist.
Disclosure of Invention
In order to solve the technical problems, the invention provides a multi-mode data pre-training and identifying method, device, equipment and medium, which improve the accuracy of product defect detection and the robustness of a model.
The embodiment of the invention provides a multi-mode data pre-training and identifying method, which comprises the following steps:
carrying out multi-source heterogeneous data fusion on the acquired defect basic data to construct a defect scene rule database;
Extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, carrying out data and rule matching training, and generating a modal identification model;
and carrying out defect recognition on the sample to be detected according to the modal recognition model.
Further, the multi-source heterogeneous data fusion is performed on the acquired defect basic data to construct a defect scene rule database, which specifically comprises the following steps:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database of association of defect scenes, defect types, positions and scales;
the defect scene rule database comprises: a surface defect dataset, a defect rule dataset, a inspection system dataset, and a process scene dataset.
As an improvement of the above-described scheme, the surface defect data set d1= [ surface defect ID, defect geometric feature, spatial distribution data, defect statistics data, defect spectrum data ];
the defect rule data set d2= [ defect rule ID, detected object type, defect classification statistics, damage-causing mechanism data, defect cause rule, defect grade ];
the detection system data set d3= [ detection system ID, equipment type, production line design data, technical choice ];
the process scene data set d4= [ process scene data ID, detected object type, scene factor, production procedure ];
the defect geometry includes: dotted line-plane defects, boundaries, bones, shape, position, size, stretching, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data comprise a gray level co-occurrence matrix, an autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal values and a defect spectrum subset;
the histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median.
The fractal values include a stretched, translated fractal dimension and a porosity;
The defect spectrum subset includes texture spectrum, stain spectrum, and saw tooth spectrum;
the defect classification statistical data specifically refers to a fault mode of automatic defect division;
the defect level includes the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and wood;
the scene factors comprise job scale and equipment type selection;
the production process comprises blank making, grinding, rolling, shearing, bundling and finished product production.
Preferably, the extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, extracting scene information from the detection system dataset and the process scene dataset;
for the defect Z, a layered matrix Z multiplied by T multiplied by R is constructed according to the extracted defect type information, characteristic information and scene information;
for defect-feature association information, a first extraction factor a is adopted ij Mapping and extracting from the matrix Z multiplied by T to obtain the previous defect scene factorAccording to all the extracted forefront defect scene factors +.>Form the forepart scene factor->;
For the feature-scene association information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T X R to obtain the late defect scene factorAccording to all the extracted postamble defect scene factors +.>Form postamble scene factor->;
Determining the scene factors according to the extracted front scene factors and back scene factors;
wherein,,,T,n is the number of defect categories, j is the feature vector dimension, Z i j For the values of elements in the defect matrix, T i j For the element values in the characteristic information matrix, R i j I=1, 2, … n for the element values in the scene information matrix;,,If you are->=0,If you are->;;,,If you are->=0,Time then;。
Preferably, the constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, and performing matching training of data and rules to generate a modal identification model, which specifically comprises:
applying the previous scene factors in the scene factors to the encoder of the self-coding network structure model to extract effective features;
Applying the latter scene factors in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W coded by sample data of various defects, introducing scene factors in the structure of a basic operation block by referring to the idea of a residual error network, so that the scene factors are hidden in a hierarchical structure in the stacking of the self-coding network structure model, and decoding and outputting to obtain scene rule output [ type, feature and scene ];
and outputting the scene rules through semi-supervised stacked self-encoders, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
As a preferred scheme, the objective function of the self-coding network structure model is specifically:
;
the loss function of the self-coding network structure model is specifically as follows:
;
wherein V (G, D) is the whole defined objective function, N is the number of labels to which the source belongs,representing probability P of original tag in output data (x) of defect sample x after self-encoding network,/I>The probability P of the original tag in the output data z (x) after the self-coding network carrying the defect knowledge sample x is shown; d (X) is a conditional probability calculation function, G (z) is the probability of outputting information y under the condition of the category model G (z) in the applied classification category data; / >Representing whether or not there is +.>Class defects; a. b, w, h, c is the composition variable of each grid in defect detection, a, b are the points at the lower left corner of the grid, w, h are the width and height of the grid, c is the grid confidence, and->Representing the coordinate loss of the defect boundary box represented by the mean square error calculated through the position information;Representing the size loss of the defect boundary box by calculating absolute mean square error of the size information;By determining whether or not to belong toThe defect type calculates a confidence loss.
Preferably, the scene rule output further continuously generates and updates a defect scene rule by hidden layer training stacked from the encoder, and supplements the defect scene rule database.
The embodiment of the invention also provides a multi-mode data pre-training and identifying device, which comprises:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data to construct a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, feature information and scene information from the defect scene rule database, carrying out data association and extracting scene factors of the defect scene rule database;
The model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, and carrying out data and rule matching training to generate a modal identification model;
and the defect recognition module is used for recognizing defects of the sample to be detected according to the modal recognition model.
Preferably, the database construction module is specifically configured to:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database of association of defect scenes, defect types, positions and scales;
the defect scene rule database comprises: a surface defect dataset, a defect rule dataset, a inspection system dataset, and a process scene dataset.
Further, the surface defect data set d1= [ surface defect ID, defect geometrical feature, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set d2= [ defect rule ID, detected object type, defect classification statistics, damage-causing mechanism data, defect cause rule, defect grade ];
The detection system data set d3= [ detection system ID, equipment type, production line design data, technical choice ];
the process scene data set d4= [ process scene data ID, detected object type, scene factor, production procedure ];
the defect geometry includes: dotted line-plane defects, boundaries, bones, shape, position, size, stretching, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data comprise a gray level co-occurrence matrix, an autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal values and a defect spectrum subset;
the histogram statistical characteristics comprise a range, a mean value, a geometric mean value, a harmonic mean value, a standard deviation; median value
The fractal values include a stretched, translated fractal dimension and a porosity;
the defect spectrum subset includes texture spectrum, stain spectrum, and saw tooth spectrum;
the defect classification statistical data specifically refers to a fault mode of automatic defect division;
the defect level includes the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and wood;
The scene factors comprise job scale and equipment type selection;
the production process comprises blank making, grinding, rolling, shearing, bundling and finished product production.
Preferably, the scene factor extraction module is specifically configured to:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, extracting scene information from the detection system dataset and the process scene dataset;
for the defect Z, a layered matrix Z multiplied by T multiplied by R is constructed according to the extracted defect type information, characteristic information and scene information;
for defect-feature association information, a first extraction factor a is adopted ij Mapping and extracting from the matrix Z multiplied by T to obtain the previous defect scene factorAccording to all the extracted forefront defect scene factors +.>Form the forepart scene factor->;
For the feature-scene association information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T X R to obtain the late defect scene factorAccording to all the extracted postamble defect scene factors +.>Form postamble scene factor->;
Determining the scene factors according to the extracted front scene factors and back scene factors;
Wherein,,,T,n is the number of defect categories, j is the feature vector dimension, Z i j For the values of elements in the defect matrix, T i j For the element values in the characteristic information matrix, R i j I=1, 2, … n for the element values in the scene information matrix;,,If you are->=0,If you are->;;,,If you are->=0,Time then;。
Preferably, the model generation module is specifically configured to:
applying the previous scene factors in the scene factors to the encoder of the self-coding network structure model to extract effective features;
applying the latter scene factors in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W coded by sample data of various defects, introducing scene factors in the structure of a basic operation block by referring to the idea of a residual error network, so that the scene factors are hidden in a hierarchical structure in the stacking of the self-coding network structure model, and decoding and outputting to obtain scene rule output [ type, feature and scene ];
and outputting the scene rules through semi-supervised stacked self-encoders, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
Preferably, the objective function of the self-coding network structure model is specifically:
;
the loss function of the self-coding network structure model is specifically as follows:
;
wherein V (G, D) is the whole defined objective function, N is the number of labels to which the source belongs,representing probability P of original tag in output data (x) of defect sample x after self-encoding network,/I>The probability P of the original tag in the output data z (x) after the self-coding network carrying the defect knowledge sample x is shown; d (X) is a conditional probability calculation function, G (z) is the probability of outputting information y under the condition of the category model G (z) in the applied classification category data;Representing whether or not there is +.>Class defects; a. b, w, h, c is the composition variable of each grid in defect detection, a, b are the points at the lower left corner of the grid, w, h are the width and height of the grid, c is the grid confidence, and->Representing the coordinate loss of the defect boundary box represented by the mean square error calculated through the position information;Representing the size loss of the defect boundary box by calculating absolute mean square error of the size information;By determining whether or not to belong toThe defect type calculates a confidence loss.
Further, the scene rule output also continuously generates and updates the defect scene rule through hidden layer training of the stacked self-encoder, and supplements the defect scene rule database.
The present invention also provides a computer readable storage medium, where the computer readable storage medium includes a stored computer program, where when the computer program runs, the device where the computer readable storage medium is located is controlled to execute the multi-modal data pre-training and identifying method according to any one of the foregoing embodiments.
The invention also provides a terminal device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the multi-modal data pre-training and recognition method according to any of the above embodiments when executing the computer program.
The invention provides a multi-mode data pre-training and identifying method, device, equipment and medium, which are used for constructing a defect scene rule database by carrying out multi-source heterogeneous data fusion on acquired defect basic data; extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database; constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, carrying out data and rule matching training, and generating a modal identification model; and carrying out defect recognition on the sample to be detected according to the modal recognition model. The product defect detection accuracy and the robustness of the model can be improved.
Drawings
FIG. 1 is a schematic flow chart of a multi-modal data pre-training and recognition method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a multi-modal data pre-training and recognition method according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a multi-modal data pre-training and recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a multi-mode data pre-training and identifying method, referring to fig. 1, which is a schematic flow chart of the multi-mode data pre-training and identifying method provided by the embodiment of the invention, wherein the steps S1-S4 of the method are as follows:
s1, carrying out multi-source heterogeneous data fusion on acquired defect basic data, and constructing a defect scene rule database;
S2, extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
s3, constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, and performing data and rule matching training to generate a modal identification model;
s4, carrying out defect recognition on the sample to be detected according to the modal recognition model.
In the implementation of the embodiment, defect basic data are collected, wherein the defect basic data are specifically historical defect data of a sample to be detected, multi-source heterogeneous data facing to defect detection are fused, and a basic defect scene rule database containing static defect characterization, dynamic defect evolution, defect classification, defect-scene rules and other information is constructed through multi-source heterogeneous data fusion;
extracting scene factors according to a defect scene rule database, constructing a three-dimensional vector matrix containing defect type information, feature information and scene information, forcing a self-encoder to consider which parts of input data need to be optimally copied and which parts need to be discarded by using the matrix constraint, so that the self-encoder can learn effective features of the data, discard irrelevant features, generate more defect scene rules, perform data association and extract the scene factors of the defect scene rule database;
Researching a scene rule knowledge base construction based on a semi-supervised self-coding network, designing a stacked self-coding network structure carrying defect scene information, introducing scene factors to enable the scene factors to be hidden in a hierarchical structure in the stacking of the self-coding network, inputting feature vectors obtained by coding sample data of various defects, carrying out data and rule matching training, and generating a modal identification model;
and identifying the defects of the sample according to the generated modal identification model.
According to the invention, under the conditions of low defect sampling rate and unbalanced samples, the material characteristics, manufacturing process data and high-resolution defect image sub-pixel characteristics are fused by combining a production process scene, a scene rule knowledge base is constructed by a sample generation method based on the material process data, a high-resolution defect image sub-pixel characteristic coding method and a deep learning classification method, and a self-coding network can well process various mapping relations in small sample defect data, perform characteristic coding and knowledge modeling, and can solve core problems of difficult defect identification and classification, weak robustness, large image capacity to be detected, low calculation efficiency, difficult defect source tracing and the like caused by the use of the deep learning method in the defect detection process under the complex background of shielding, oxidation, vibration and the like.
In yet another embodiment of the present invention, the step S1 specifically includes:
performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database of association of defect scenes, defect types, positions and scales;
the defect scene rule database comprises: a surface defect dataset, a defect rule dataset, a inspection system dataset, and a process scene dataset.
In the implementation of this embodiment, the sources of the defect basic data include historical experience data, common rule data and defect standard data, where the historical experience data is specifically the historical data of the expert on defect judgment;
common industrial product defects are mainly: defects such as lines, scratches, greasy dirt, points, shadows, textures and saw teeth are reflected in an image when the defects are detected, the scene analysis is combined with the characteristics of business activities by combining the common defect image data representation condition, and the detected industrial products belong to links in the business, so that the scene judgment formed by the defect detection is influenced. Finally, through the association of each data set, a defect scene rule database which is associated with the defect type, the position and the scale of the defect scene is formed;
The defect scene rule database comprises a surface defect data set, a defect rule data set, a detection system data set and a process scene data set.
By classifying and correlating complex backgrounds such as shielding, oxidation, vibration and the like in the micron-sized visual image defect detection process, accurate defect identification is realized.
In yet another embodiment provided by the present invention, the surface defect dataset d1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set d2= [ defect rule ID, detected object type, defect classification statistics, damage-causing mechanism data, defect cause rule, defect grade ];
the detection system data set d3= [ detection system ID, equipment type, production line design data, technical choice ];
the process scene data set d4= [ process scene data ID, detected object type, scene factor, production procedure ];
the defect geometry includes: dotted line-plane defects, boundaries, bones, shape, position, size, stretching, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data comprise a gray level co-occurrence matrix, an autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal values and a defect spectrum subset;
The histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance, and median
The fractal values include a stretched, translated fractal dimension and a porosity;
the defect spectrum subset includes texture spectrum, stain spectrum, and saw tooth spectrum;
the defect classification statistical data specifically refers to a fault mode of automatic defect division;
the defect level includes the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and wood;
the scene factors comprise job scale and equipment type selection;
the production process comprises blank making, grinding, rolling, shearing, bundling and finished product production.
In the implementation of the present embodiment, the surface defect dataset specifically includes defect geometric features (dotted line-plane defects, boundaries, bones, shapes, positions, sizes, stretching, translations), spatial distribution data (entropy, contrast, consistency and correlation), defect statistics (gray level co-occurrence matrix, autocorrelation coefficients, mathematical morphology, histogram statistics (range, mean, geometric mean, harmonic mean, standard deviation, variance and median), and fractal values (stretching, fractal dimension of translations and porosities)), defect spectrum data (texture spectrum, stain spectrum and saw tooth spectrum).
Wherein, entropy is used for reflecting the randomness of the image reflecting pixels, and the larger and the coarser are; contrast refers to the average difference in brightness of the defective scene image; consistency refers to the degree of consistency of the measurement angles in the batch of images; correlation refers to the degree of correlation of an acquired image with a detected scene. In general, these specific data sets are actually detection data sets of image data, and different subsets are formed by classifying from different angles, so as to facilitate image processing and identification.
The defect rule data set includes defect classification statistics (defects automatically classified into corresponding failure modes), damage causing mechanism data, defect cause rules, and defect levels (inspection object types (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.)). The detection system data set comprises equipment type, production line design data and technical model selection;
the process scene data includes the type of object detected (semiconductor, circuit board, wafer, fabric, metal surface, wood, etc.), scene factors (job scale, equipment type selection), production process (blank making, grinding, rolling, shearing, bundling, finished product, etc.).
The surface defect data set, the defect rule data set, the detection system data set and the process scene data set are respectively expressed as the following data sets in the form of data sets:
surface defect dataset d1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
defect geometric feature subset= [ surface defect ID, defect geometric feature ID, dotted line surface defect, boundary, bone, shape, position, size, stretching, translation ];
spatial distribution subset= [ surface defect ID, spatial distribution ID, entropy, contrast, consistency, correlation ];
defect statistical subset= [ surface defect ID, defect statistical ID, gray level co-occurrence matrix, autocorrelation coefficient, mathematical morphology, histogram statistical feature, fractal value ];
the defect statistical subset refers to a data value obtained by calculating defect data from a statistical perspective. Although the defect characteristics are not directly described, the statistical data of the characteristic distribution is mastered, so that the relationship between the defect type and the common characteristics is favorably analyzed. This is intersected in the D2 dataset, i.e. these statistics will eventually be correlated with defect rules, making it easier to form defect scene rules.
Histogram statistical feature subset= [ surface defect ID, defect statistics ID, histogram statistics ID, range, mean, geometric mean, harmonic mean, standard deviation, variance, and median ];
a subset of fractal values= [ surface defect ID, defect statistics ID, fractal value ID, fractal dimension of stretching, translation and porosity sign ];
the split value can show the stretching and deformation degree of the defect, and the whole stretching of the accessory is often caused by improper application of the process level in the manufacturing process of the product, so that the industrial gap defect and the like are caused.
Defect spectrum subset= [ surface defect ID, defect spectrum ID, texture spectrum, stain spectrum, saw tooth spectrum ];
the defect spectrum is really the spectrum characteristic of the defect image, but the spectrum characteristic formed by the texture, the stain and the saw tooth is different, and the data set is the spectrum characteristic of the defect image which is collected by the texture, the stain and the saw tooth in the image defect process.
The defect rule data set d2= [ defect rule ID, detected object type, defect classification statistics, damage-causing mechanism data, defect cause rule, defect grade ];
the device type refers to a missing test device, and the detection object type refers to a detected object, such as PCB board detection, steel detection, chip detection, mobile phone accessory detection, and the like. Different detection objects have different detection scenarios.
The detection system data set d3= [ detection system ID, equipment type, production line design data, technical choice ];
the process scene data set d4= [ process scene data ID, detected object type, scene factor, production procedure ].
In yet another embodiment of the present invention, the step S2 specifically includes:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, extracting scene information from the detection system dataset and the process scene dataset;
for the defect Z, a layered matrix Z multiplied by T multiplied by R is constructed according to the extracted defect type information, characteristic information and scene information;
for defect-feature association information, a first extraction factor a is adopted ij Mapping and extracting from the matrix Z multiplied by T to obtain the previous defect scene factorAccording to all the extracted forefront defect scene factors +.>Form the forepart scene factor->;
For the feature-scene association information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T X R to obtain the late defect scene factorAccording to all the extracted postamble defect scene factors +.>Form postamble scene factor- >;
Determining the scene factors according to the extracted front scene factors and back scene factors;
wherein,,,T,n is the number of defect categories, j is the feature vector dimension, Z i j For the values of elements in the defect matrix, T i j For the element values in the characteristic information matrix, R i j I=1, 2, … n for the element values in the scene information matrix;,,If you are->=0,If you are->;;,,If you are->=0,Time then;。
In the implementation of this embodiment, the scene factors are extracted according to the basic knowledge base, and these scene factors together constitute a three-dimensional vector matrix including types, features and scenes, and the matrix constraint is used to force the self-encoder to consider which parts of the input data need to be optimally copied and which parts need to be discarded, so the self-encoder can learn the effective features of the data and discard the irrelevant features, thereby generating more defect scene rules.
And after data cleaning, data association and conversion are carried out on the defect scene rule database, a three-dimensional vector matrix containing type information, characteristic information and scene information is finally formed.
Extracting defect type information from the surface defect dataset D1; extracting feature information from the surface defect data set D1 and the defect rule data set D2; extracting scene information from the detection system data set D3 and the process scene data set D4;
For defect Z, it can be expressed asFor characteristic information, it can be expressed as T +.>For scene information, it can be expressed as +.>Finally, a layered matrix of Z×T×R is formed.
Wherein n is the number of defect categories, j is the feature vector dimension, j is the vector dimension of the pointing quantity dimension, sample or feature; for example, for defect Z, the surface defect data set D1 and the defect rule data set D2 represent feature information, and j represents 1 to 11 assuming that the sum of the fields of the surface defect data set D1 and the defect rule data set D2 is 11;
Z i j for the values of elements in the defect matrix, T i j For the element values in the characteristic information matrix, R i j I=1, 2, … n for the element values in the scene information matrix;
for the defect-feature association information, the mapping information is extracted from Z×T, and the first extraction factor is adopted from the defect to the featureExtracting the front defect scene factor +.>;/>
Wherein,,is a stepwise expression used in the calculation process, < ->If you are->=0,Time then;
Based on the extracted forefront defect scene factorFormed antecedent scene factor->;
For the feature-scene association information, mapping information is extracted from the T X R, and a second extraction factor is adopted from the defect to the feature Extracting the postterm defect scene factor +.>;
Wherein,,is a stepwise expression used in the calculation process, < ->If you are->=0,Time then;
Based on the extracted forefront defect scene factorFormed intoFront scene factor->;
Scene factor= [ front scene factor, back scene factor ].
The previous scene factor represents: the information during the association of the defect features can be used for guiding effective feature extraction before an encoder, so that sample noise is reduced;
the postamble scene factor represents: the information when the feature is associated with the scene can be used for guiding the rule generation after the decoder and before the rule generation, and filtering invalid rules.
In yet another embodiment of the present invention, the step S3 specifically includes applying a previous scene factor of the scene factors to an encoder of the self-coding network structure model, and performing efficient feature extraction;
applying the latter scene factors in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W coded by sample data of various defects, introducing scene factors in the structure of a basic operation block by referring to the idea of a residual error network, so that the scene factors are hidden in a hierarchical structure in the stacking of the self-coding network structure model, and decoding and outputting to obtain scene rule output [ type, feature and scene ];
And outputting the scene rules through semi-supervised stacked self-encoders, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
In the implementation of this embodiment, referring to fig. 2, a flowchart of a multi-mode data pre-training and identifying method according to another embodiment of the present invention is shown;
in fig. 2, a scene rule knowledge base construction based on a semi-supervised self-coding network is studied, and a stacked self-coding network structure carrying defect scene information is designed;
applying a leading-term scene factor containing defects and features in the scene factors to an encoder of the self-coding network structure model to extract effective features; applying the background scene factors containing the features and scenes in the scene factors to a decoder of the self-coding network structure model, and carrying out rule generation to enable the scene factors to be hidden in a hierarchical structure in the stacking of the self-coding network, and adding coding structures and classification feature information after the stacking of the self-coding network, so that the constructed model has the functions of modal identification and scene prejudgment;
Firstly, in a stacked self-coding network, an encoder and a decoder are in a symmetrical structure model, and a basic operation block structure of the network is designed in the coding network. Introducing scene factors in the structure of basic operation blocks by referring to the thought of a residual error network during superposition, so that the scene factors are hidden in a hierarchical structure in the stacking of the self-coding network;
inputting a characteristic vector W formed by sample data W1-Wi obtained after data preprocessing of input sample data X1-Xi into a self-coding network structure model, introducing scene factors in the structure of a basic operation block by referring to the thought of a residual network, enabling the scene factors to be hidden in a hierarchical structure in the stacking of the self-coding network structure model, and decoding and outputting to obtain scene rule output [ type, characteristic and scene ];
and outputting the scene rules through semi-supervised stacked self-encoders, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
The mode identification and scene prejudging method based on the semi-supervised self-coding network is characterized in that a basic defect scene knowledge base containing static defect characterization, dynamic defect evolution, defect classification, defect-scene rules and other information is constructed through multi-source heterogeneous data fusion. Then, based on a self-coding network, introducing scene factors into the stacked self-coding network, coding the data samples to obtain feature vectors through learning of the data samples, learning the mapping from the image space of a certain type to the potential space, generating feature models of various types, positions and degrees, and carrying out data and rule matching training; through the construction and application of the defect scene knowledge base, the defect detection model has the scene pre-judging function, can promote the generation cause according to the defect information, and is helpful for the optimization of the production line design and the process of the industrial defect products.
In yet another embodiment of the present invention, the objective function of the self-coding network structure model is specifically:
;
the loss function of the self-coding network structure model is specifically as follows:
;
wherein V (G, D) is the whole defined objective function, N is the number of labels to which the source belongs,representing probability P of original tag in output data (x) of defect sample x after self-encoding network,/I>The probability P of the original tag in the output data z (x) after the self-coding network carrying the defect knowledge sample x is shown; d (X) is a conditional probability calculation function, G (z) is the probability of outputting information y under the condition of the category model G (z) in the applied classification category data;Representing whether or not there is +.>Class defects; a. b, w, h, c is the composition variable of each grid in defect detection, a, b are the points at the lower left corner of the grid, w, h are the width and height of the grid, c is the grid confidence, and->Representing the coordinate loss of the defect boundary box represented by the mean square error calculated through the position information;Representing the size loss of the defect boundary box by calculating absolute mean square error of the size information;By determining whether or not to belong toThe defect type calculates a confidence loss.
When the embodiment is specifically implemented, the self-coding network structure model carrying the defect scene information designed by the patent is applied to classification and identification, and the designed objective function is as follows:
;
wherein V (G, D) is the whole objective function defined, the objective function is calculated with the maximum contribution angle, the objective function is a conditional probability calculation function for generating an improvement D (X) of the antagonism network formula, the function is divided into three parts, the first part: embodying the objective function calculation of the encoding stage, pursuing that the calculation of the stage and the integral function calculation are as large as possible at the moment so as to obtain the most representative characteristic information; the second part is a decoding stage, the output calculated value of the stage is required to be as small as possible, but the whole formula calculation is as large as possible, so that the decoding difference is smaller; when the third part is object classification and identification, G (z) is the probability of outputting information y under the condition of the class model G (z) in the applied classification class data, and the probability can represent the accuracy degree of classification;the probability P that the defect sample x is the original tag in the output data (x) after the self-coding network is shown,the probability P of the original tag in the output data z (x) after the self-coding network carrying the defect knowledge sample x is shown; / >Measuring for the central point;
the loss function is:
the loss function of the self-coding network structure model is specifically as follows:
;
wherein a, b, w, h, c is the composition variable of each grid in defect detection, N is the number of labels to which the source belongs, a and b are points at the lower left corner of the grid, w and h are the width and height of the grid, c is the grid confidence,representing the coordinate loss of the defect boundary box represented by the mean square error calculated through the position information;Representing the size loss of the defect boundary box by calculating absolute mean square error of the size information;Indicating whether or not by judging whether or not it is->The defect type calculates a confidence loss.
In yet another embodiment of the present invention, the scene rule output further continuously generates and updates the defect scene rules by stacking hidden layer training from the encoder, and supplements the defect scene rules database.
When the embodiment is implemented, the post-decoder output result can not only realize the classification function by semi-supervised stacking of the self-encoders, but also continuously generate and update defect-scene rule knowledge by hidden layer training of the stacked self-encoders and supplement the defect-scene rule knowledge to the defect scene rule database. Further perfecting the knowledge base of the defect and scene mapping rules.
When the embodiment is implemented, referring to fig. 2, according to the rule generated by the output after the decoder, the scene rule knowledge base is supplemented, namely, the scene factors are extracted through the last postterm factors [ Yi-1], the scene layering matrix is updated for the self-coding network structure model, and the scene layering matrix is also supplemented into the input feature vector according to the vector matrix [ Yi ] of the extracted scene factors;
the stacks are all stacked in the form of a scene factor structure. In the stacked sub-structure, the front scene factors are merged into the first layer of training, and the rear scene factors are merged into the second layer of training; the usage is the same, one is threshold value usage, and the other is weight amplification; the threshold value use is to influence the activation function, on the basis of original full connection, through the matrix check of the front-term/back-term scene factors, the defect characteristics with too small threshold value can be directly abandoned, so that excessive characteristics/scene information can be prevented, and finally, the generation of overfitting can be prevented in application; on the other hand, the effective characteristics are further amplified, so that gradient disappearance phenomenon which is easy to generate in deep learning can be prevented, and the loss of the effective characteristics is prevented. By both aspects, rules formed by training stacked from the coding network are made more suitable for defect scenarios.
In still another embodiment of the present invention, referring to fig. 3, a schematic structural diagram of a multi-mode data pre-training and identifying device according to an embodiment of the present invention is provided, where the device includes:
the database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data to construct a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, feature information and scene information from the defect scene rule database, carrying out data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, and carrying out data and rule matching training to generate a modal identification model;
and the defect recognition module is used for recognizing defects of the sample to be detected according to the modal recognition model.
It should be noted that, the multi-mode data pre-training and identifying device provided by the embodiment of the present invention can execute the multi-mode data pre-training and identifying method described in any embodiment of the foregoing embodiments, and specific functions of the multi-mode data pre-training and identifying device are not described herein.
Referring to fig. 4, a schematic structural diagram of a terminal device according to an embodiment of the present invention is provided. The terminal device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor, such as a multimodal data pretraining and recognition program. The steps in the embodiments of the multi-modal data pre-training and identifying method described above, such as steps S1-S5 shown in fig. 1, are implemented when the processor executes the computer program. Alternatively, the processor may implement the functions of the modules in the above-described device embodiments when executing the computer program.
The computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to accomplish the present invention, for example. The one or more modules/units may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments are used for describing the execution of the computer program in the terminal device. For example, the computer program may be divided into modules, and specific functions of each module are not described herein.
The terminal equipment can be computing equipment such as a desktop computer, a notebook computer, a palm computer, a cloud server and the like. The terminal device may include, but is not limited to, a processor, a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a terminal device and does not constitute a limitation of the terminal device, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the terminal device may further include an input-output device, a network access device, a bus, etc.
The processor may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, which is a control center of the terminal device, and which connects various parts of the entire terminal device using various interfaces and lines.
The memory may be used to store the computer program and/or module, and the processor may implement various functions of the terminal device by running or executing the computer program and/or module stored in the memory and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart Media Card (SMC), secure Digital (SD) Card, flash Card (Flash Card), at least one disk storage device, flash memory device, or other volatile solid-state storage device.
Wherein the terminal device integrated modules/units may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as stand alone products. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above. Wherein the computer program comprises computer program code, which may be in the form of code, object code, executable files, or in some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
It should be noted that the above-described apparatus embodiments are merely illustrative, and the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the invention, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.
Claims (9)
1. A method for multi-modal data pre-training and recognition, the method comprising:
Carrying out multi-source heterogeneous data fusion on the acquired defect basic data to construct a defect scene rule database;
extracting defect type information, feature information and scene information from the defect scene rule database, performing data association, and extracting scene factors of the defect scene rule database;
constructing a self-coding network structure model carrying defect scene information, merging the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, carrying out data and rule matching training, and generating a modal identification model;
performing defect recognition on the sample to be detected according to the modal recognition model;
the defect scene rule database comprises: a surface defect dataset, a defect rule dataset, a detection system dataset, and a process scene dataset;
the method for extracting the scene factors of the defect scene rule database specifically comprises the following steps of:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, extracting scene information from the detection system dataset and the process scene dataset;
For a certain defect, constructing a layered matrix Z multiplied by T multiplied by R according to the extracted defect type information, the characteristic information and the scene information;
for defect-feature association information, a first extraction factor a is adopted ij Mapping and extracting from the matrix Z multiplied by T to obtain the previous defect scene factorAccording to all the extracted forefront defect scene factors +.>Form the forepart scene factor->;
For the feature-scene association information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T X R to obtain the late defect scene factorAccording to all the extracted postamble defect scene factors +.>Form postamble scene factor->;
Determining the scene factors according to the extracted front scene factors and back scene factors;
wherein the defect matrixCharacteristic information matrix->Scene information matrixN is the number of defect categories, j is the feature vector dimension, Z ij For the values of elements in the defect matrix, T ij For the element values in the characteristic information matrix, R ij I=1, 2, … n for the element values in the scene information matrix;,,If you are->=0,If you are->;;,,If you are->=0,If you are->;。
2. The multi-modal data pre-training and identifying method as defined in claim 1, wherein the multi-source heterogeneous data fusion is performed on the acquired defect basic data to construct a defect scene rule database, and the method specifically comprises the following steps:
And performing multi-source heterogeneous data fusion on defect basic data consisting of historical experience data, common rule data and defect standard data to form a defect scene rule database of association of defect scenes and defect types, positions and scales.
3. The multi-modal data pre-training and recognition method of claim 2 wherein the surface defect dataset d1= [ surface defect ID, defect geometry, spatial distribution data, defect statistics, defect spectrum data ];
the defect rule data set d2= [ defect rule ID, detected object type, defect classification statistics, damage-causing mechanism data, defect cause rule, defect grade ];
the detection system data set d3= [ detection system ID, equipment type, production line design data, technical choice ];
the process scene data set d4= [ process scene data ID, detected object type, scene factor, production procedure ];
the defect geometry includes: dotted line-plane defects, boundaries, bones, shape, position, size, stretching, and translation;
the spatial distribution data includes: entropy, contrast, consistency, and correlation;
the defect statistical data comprise a gray level co-occurrence matrix, an autocorrelation coefficient, mathematical morphology, histogram statistical characteristics, fractal values and a defect spectrum subset;
The histogram statistical features include range, mean, geometric mean, harmonic mean, standard deviation, variance and median;
the fractal values include a stretched, translated fractal dimension and a porosity;
the defect spectrum subset includes texture spectrum, stain spectrum, and saw tooth spectrum;
the defect classification statistical data specifically refers to a fault mode of automatic defect division;
the defect level includes the detection object type;
the detection object types comprise semiconductors, circuit boards, wafers, fabrics, metal surfaces and wood;
the scene factors comprise job scale and equipment type selection;
the production process comprises blank making, grinding, rolling, shearing, bundling and finished product production.
4. The method for pre-training and identifying multi-modal data according to claim 1, wherein the constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, performing data and rule matching training, and generating a modal identification model, specifically comprises:
applying the previous scene factors in the scene factors to the encoder of the self-coding network structure model to extract effective features;
Applying the latter scene factors in the scene factors to a decoder of the self-coding network structure model to generate rules;
inputting a feature vector W coded by sample data of various defects, introducing a scene factor in the structure of a basic operation block during superposition, so that the scene factor is hidden in a hierarchical structure in the stacking of the self-coding network structure model, and decoding and outputting to obtain scene rule output [ type, feature and scene ];
and outputting the scene rules through semi-supervised stacked self-encoders, adding a classifier in a decoding stage to realize a classification function, optimizing the self-encoding network structure model classifier through matching training of data and rules, and generating the modal identification model.
5. The multi-modal data pre-training and recognition method of claim 1, wherein the objective function of the self-encoding network structure model is specifically:
;
the loss function of the self-coding network structure model is specifically as follows:
;
wherein V (G, D) is the whole defined objective function, N is the number of labels to which the source belongs,representing probability of original tag in output data (x) of defect sample x after self-encoding network,/I >Representing the probability of the original tag in the output data z (x) after the self-coding network carries the defect knowledge sample x; d (·) is a conditional probability calculation function, G (z) is the probability of outputting information y under the condition of the category model in the applied category data;representing whether or not there is +.>Class defects; a. b, w, h, c is the composition variable of each grid in defect detection, a, b are the points at the lower left corner of the grid, w, h are the width and height of the grid, c is the grid confidence, and->Representing the coordinate loss of the defect boundary box represented by the mean square error calculated through the position information;Representing the size loss of the defect boundary box by calculating absolute mean square error of the size information;Indicating whether or not by judging whether or not it is->The defect type calculates a confidence loss.
6. The multi-modal data pre-training and recognition method as set forth in claim 4 wherein the scene rule output is further continuously generating and updating defect scene rules by hidden layer training stacked from the encoder and supplementing into the defect scene rule database.
7. A multi-modal data pre-training and recognition apparatus, the apparatus comprising:
The database construction module is used for carrying out multi-source heterogeneous data fusion on the acquired defect basic data to construct a defect scene rule database;
the scene factor extraction module is used for extracting defect type information, feature information and scene information from the defect scene rule database, carrying out data association and extracting scene factors of the defect scene rule database;
the model generation module is used for constructing a self-coding network structure model carrying defect scene information, integrating the scene factors into the self-coding network structure model, inputting feature vectors obtained by coding sample data of various defects, and carrying out data and rule matching training to generate a modal identification model;
the defect recognition module is used for carrying out defect recognition on the sample to be detected according to the modal recognition model;
the defect scene rule database comprises: a surface defect dataset, a defect rule dataset, a detection system dataset, and a process scene dataset;
the scene factor extraction module is specifically configured to:
extracting defect type information from the surface defect dataset, extracting feature information from the surface defect dataset and the defect rule dataset, extracting scene information from the detection system dataset and the process scene dataset;
For a certain defect, constructing a layered matrix Z multiplied by T multiplied by R according to the extracted defect type information, the characteristic information and the scene information;
for defect-feature association information, a first extraction factor a is adopted ij Mapping and extracting from the matrix Z multiplied by T to obtain the previous defect scene factorAccording to all the extracted forefront defect scene factors +.>Form the forepart scene factor->;
For the feature-scene association information, a second extraction factor b is adopted ij Mapping and extracting from the matrix T X R to obtain the late defect scene factorAccording to all the extracted postamble defect scene factors +.>Form postamble scene factor->;
Determining the scene factors according to the extracted front scene factors and back scene factors;
wherein the defect matrixCharacteristic information matrix->Scene information matrixN is the number of defect categories, j is the feature vector dimension, Z ij Is a defect matrixElement value, T of ij For the element values in the characteristic information matrix, R ij I=1, 2, … n for the element values in the scene information matrix;,,If you are->=0,If you are->;;,,If you are->=0,If you are->;。
8. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the multi-modal data pre-training and recognition method according to any one of claims 1 to 6.
9. A terminal device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the multimodal data pre-training and recognition method according to any of claims 1 to 6 when the computer program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310272537.2A CN115984662B (en) | 2023-03-21 | 2023-03-21 | Multi-mode data pre-training and identifying method, device, equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310272537.2A CN115984662B (en) | 2023-03-21 | 2023-03-21 | Multi-mode data pre-training and identifying method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115984662A CN115984662A (en) | 2023-04-18 |
CN115984662B true CN115984662B (en) | 2023-08-04 |
Family
ID=85958593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310272537.2A Active CN115984662B (en) | 2023-03-21 | 2023-03-21 | Multi-mode data pre-training and identifying method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115984662B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117036141B (en) * | 2023-10-08 | 2023-12-08 | 交通运输部公路科学研究所 | Data processing methods and data interaction systems for the entire life cycle of highways |
CN117376632B (en) * | 2023-12-06 | 2024-02-06 | 中国信息通信研究院 | Data recovery method and system based on intelligent depth synthesis |
CN118505704B (en) * | 2024-07-18 | 2024-10-22 | 成都数之联科技股份有限公司 | A universal model building and detection method for panel production line defect detection |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919934A (en) * | 2019-03-11 | 2019-06-21 | 重庆邮电大学 | A liquid crystal panel defect detection method based on multi-source domain deep transfer learning |
CN112164067A (en) * | 2020-10-12 | 2021-01-01 | 西南科技大学 | A medical image segmentation method and device based on multimodal subspace clustering |
CN113066070A (en) * | 2021-03-31 | 2021-07-02 | 广东电网有限责任公司 | Multi-source data fusion interaction method in three-dimensional scene |
CN113436184A (en) * | 2021-07-15 | 2021-09-24 | 南瑞集团有限公司 | Power equipment image defect judging method and system based on improved twin network |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6869490B2 (en) * | 2018-12-28 | 2021-05-12 | オムロン株式会社 | Defect inspection equipment, defect inspection methods, and their programs |
WO2022116109A1 (en) * | 2020-12-03 | 2022-06-09 | Boe Technology Group Co., Ltd. | Computer-implemented method for defect analysis, apparatus for defect analysis, computer-program product, and intelligent defect analysis system |
CN115456929A (en) * | 2021-05-20 | 2022-12-09 | 富泰华工业(深圳)有限公司 | Defect detection method, computer device and storage medium |
-
2023
- 2023-03-21 CN CN202310272537.2A patent/CN115984662B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919934A (en) * | 2019-03-11 | 2019-06-21 | 重庆邮电大学 | A liquid crystal panel defect detection method based on multi-source domain deep transfer learning |
CN112164067A (en) * | 2020-10-12 | 2021-01-01 | 西南科技大学 | A medical image segmentation method and device based on multimodal subspace clustering |
CN113066070A (en) * | 2021-03-31 | 2021-07-02 | 广东电网有限责任公司 | Multi-source data fusion interaction method in three-dimensional scene |
CN113436184A (en) * | 2021-07-15 | 2021-09-24 | 南瑞集团有限公司 | Power equipment image defect judging method and system based on improved twin network |
Non-Patent Citations (1)
Title |
---|
基于深度强化学习的木材缺陷图像识别及分割模型研究;张旭中 等;电子测量技术(17);第86-92页 * |
Also Published As
Publication number | Publication date |
---|---|
CN115984662A (en) | 2023-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115984662B (en) | Multi-mode data pre-training and identifying method, device, equipment and medium | |
CN111462120B (en) | Defect detection method, device, medium and equipment based on semantic segmentation model | |
CN111833306B (en) | Defect detection method and model training method for defect detection | |
CN114092474B (en) | Method and system for detecting processing defects of complex texture background of mobile phone shell | |
CN110148117B (en) | Power equipment defect identification method and device based on power image and storage medium | |
CN114155244B (en) | Defect detection method, device, equipment and storage medium | |
CN110136130A (en) | A kind of method and device of testing product defect | |
CN111507357B (en) | Defect detection semantic segmentation model modeling method, device, medium and equipment | |
CN112989995B (en) | Text detection method and device and electronic equipment | |
CN114565916B (en) | Target detection model training method, target detection method and electronic equipment | |
CN117523087B (en) | Three-dimensional model optimization method based on content recognition | |
CN117036243A (en) | Method, device, equipment and storage medium for detecting surface defects of shaving board | |
CN115330940A (en) | Three-dimensional reconstruction method, device, equipment and medium | |
CN119006434B (en) | Belt tear detection method and system based on dual visual state space model | |
CN115937182A (en) | A multi-view visual detection method for mechanical defects | |
CN116205918B (en) | Multi-mode fusion semiconductor detection method, device and medium based on graph convolution | |
CN114708230B (en) | Vehicle frame quality detection method, device, equipment and medium based on image analysis | |
CN116109627B (en) | Defect detection method, device and medium based on transfer learning and small sample learning | |
CN111967579B (en) | Method and device for performing convolution calculation on image using convolutional neural network | |
CN118429649B (en) | Image segmentation method and device, electronic equipment and storage medium | |
CN118887428B (en) | Image alignment method, device, storage medium and electronic equipment | |
CN113888522B (en) | Target detection method and system based on digital image and electronic equipment | |
CN114429519B (en) | A method for constructing VR content library based on visual modeling and word meaning analysis | |
CN116777848B (en) | Jade ware similarity analysis method and system | |
CN118711204B (en) | Building model construction method and system based on AI drawing recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | ||
CB03 | Change of inventor or designer information |
Inventor after: Luo Liang Inventor after: Lin Zhu Inventor after: Li Haiwei Inventor after: Ma Zhiping Inventor after: Feng Diehua Inventor before: Luo Liang Inventor before: Lin Zhu Inventor before: Li Haiwei Inventor before: Ma Zhiping Inventor before: Feng Zhihua |
|
GR01 | Patent grant | ||
GR01 | Patent grant |