CN114266941A - Method for rapidly detecting annotation result data of image sample - Google Patents
Method for rapidly detecting annotation result data of image sample
- Publication number
- CN114266941A (application CN202111605624.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- sample
- result
- rapidly detecting
- result data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a method for rapidly detecting image sample annotation result data, covering data preprocessing, model calling, algorithm testing, sample evaluation, and pre-annotated image modification. The method makes annotation results easier to check and evaluate, and improves interactivity, operability, and practicability. The method comprises the following steps: S1: perform image preprocessing on the sample image; S2: test the preprocessed sample image to obtain a test result; S3: evaluate the test result to obtain an evaluation result; S4: compare the evaluation result with the labels in the sample image and calculate mAP, recall rate, and accuracy; S5: judge whether the mAP, recall rate, and accuracy of the sample image fall below their thresholds; if none does, the sample image is deemed qualified and exported; otherwise it is deemed unqualified and imported into the pre-annotated image modification module, and the corrected annotation information is exported after modification.
Description
Technical Field
The invention relates to the technical field of artificial intelligence such as computer vision and deep learning, in particular to a method for rapidly detecting annotation result data of an image sample.
Background
In recent years, image target detection technology based on deep learning has matured and been applied in fields such as smart grids and smart factories, showing strong practicability. However, training deep learning models requires a large amount of labeled sample data, and thus a large amount of manual labeling. The labeling work is time-consuming and labor-intensive, and problems such as missing labels and incorrect labels often occur, which greatly hinders model development.
To catch the hidden risks caused by missing and incorrect labels, several checking methods have been proposed, among which the use of a target detection algorithm stands out. Specifically, the labeled sample data is first imported into a target detection algorithm, the preprocessed samples are then predicted, and finally the labeling result and the algorithm's prediction are compared in a precision evaluation. On this basis, the target detection algorithm is used to analyze and check the sample data and thereby improve sample accuracy.
Traditional manual labeling is usually adopted for image annotation, which wastes large amounts of human resources and time, and missing or incorrect labels arise from the inconsistency of annotators. Deep learning algorithms, for their part, place very high demands on the user's expertise, since the user must understand the principles and be familiar with the operating procedures, which greatly inconveniences the detection work. Manual inspection is likewise time-consuming and labor-intensive, and its quality depends directly on the quality inspectors.
When a deep learning method is used, the quality inspection, the model training and calling, and the review of pre-labeled pictures are usually independent rather than systematic. In addition, small-target detection remains a weakness of target detection algorithms: models miss or falsely detect many small targets with large scale variation, which hampers inspection of the labeling.
Disclosure of Invention
The invention designs a method for rapidly detecting image sample annotation result data, covering data preprocessing, model calling, algorithm testing, sample evaluation, and pre-annotated image modification. The method makes annotation results easier to check and evaluate, and improves interactivity, operability, and practicability.
The invention provides a method for rapidly detecting annotation result data of an image sample, which comprises the following steps:
carrying out image preprocessing on the sample image;
testing the preprocessed sample image to obtain a test result;
evaluating the test result to obtain an evaluation result;
comparing and analyzing the evaluation result with the label in the sample image, and calculating mAP, recall rate and accuracy;
and judging whether the mAP, the recall rate, and the accuracy of the sample image are lower than their thresholds; if none is lower, the sample image is deemed qualified and exported; otherwise it is deemed unqualified, imported into the pre-annotated image modification module, and the corrected annotation information is exported after modification.
Optionally,
step S1 includes denoising, contrast enhancement, and watermark removal on the image sample to remove irrelevant information in the image.
Optionally, step S2 includes:
identifying a subject target;
when the main target accords with the scene logic, detecting and classifying the main target to obtain a final result;
and when the main target does not accord with the scene logic, carrying out false detection processing on the recognition result.
Optionally,
after the main target is identified, targets with low confidence and results whose size, as a fraction of the full image, is smaller than the normal size fraction are excluded, so as to reduce false detections.
Optionally,
after the main target is identified, it is normalized to a fixed size to ensure that the targets to be identified enter the inference model at a consistent scale.
Optionally, step S2 includes:
and carrying out data sharing on the task image through the NFS.
Optionally, step S2 includes:
and selecting the trained test model from the model calling module according to the sample type.
Optionally, step S2 includes:
training and testing each test model on data sets of the same type of scene and different batches, wherein the precision index meets the requirement;
the model inference process follows the business logic of identifying targets.
Optionally,
the identification result information comprises the first-step identification result and the second-step target identification result, including a result return code, a return description, a category serial number, a confidence, and position coordinates.
Optionally,
the acceptable threshold for mAP is 50%;
the recall acceptable threshold is 60%;
the acceptable threshold for accuracy is 70%.
Compared with the prior art, the application has the following beneficial effects:
the invention provides a method for rapidly detecting successful data of image sample annotation, and the system integrates an algorithm testing module, a sample evaluation module, a model calling module and a pre-annotated image modification module. The operability, the interactivity and other aspects of the user are optimized to a great extent, and the detection technology is not limited to professionals any more. At this moment, the user only needs to import sample data, so that the detection efficiency is greatly improved, the manpower and material resources required by detection are reduced, and the cost is reduced.
The system integrates an algorithm testing module, a sample evaluation module, a model calling module and a pre-labeled image modification module, and a series of operations of detection personnel on a detection task can be completed in the system, so that the system is very simple and convenient.
Data preprocessing is carried out with an independently developed method: the image is denoised, contrast-enhanced, and watermark-stripped using techniques such as Gaussian filtering, histogram equalization, and wavelet transformation, eliminating irrelevant information in the image.
The improved ESRGAN algorithm is used to enhance image resolution, further improving image clarity and the accuracy of algorithm identification.
And eliminating irrelevant information in the image, performing resolution enhancement on the data and the like, and further improving the precision of a target detection model later.
Sample detection uses a detect-then-classify approach (first detect the target position, then classify), which greatly improves detection precision; the resulting model identification accuracy is sufficient to serve as the basis for judging whether the quality of the labeled data meets requirements.
Using distributed computing and an NFS file sharing system improves efficiency; the sample precision is greatly improved, and the method discriminates missing labels and incorrect labels well.
The pre-marked image modification module of the system is utilized to modify the sample images with missing marks and wrong marks, so that the loss of manpower and time caused by the repeated labor of marking personnel is avoided, and the accuracy of the sample images is greatly ensured.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flowchart of a method for rapidly detecting annotation result data of an image sample according to a first embodiment of the present invention;
FIG. 2 is a flowchart illustrating a second embodiment of a method for rapidly detecting annotation result data of an image sample according to the present invention.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. The words "a", "an" and "the" as used herein are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, the terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Referring to fig. 1, a first embodiment of a method for rapidly detecting annotation result data of an image sample according to the present invention includes:
101: and carrying out image preprocessing on the sample image.
In this embodiment, the sample image is first imported into a preprocessing module for image preprocessing. The preprocessing operations are consistent with those used in model training: techniques such as Gaussian filtering, histogram equalization, and wavelet transformation are used for denoising, contrast enhancement, and watermark removal, eliminating irrelevant information in the image. The improved ESRGAN algorithm is used to enhance image resolution, further improving image clarity and the accuracy of algorithm identification.
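As a rough illustration of the denoising and contrast-enhancement steps, a NumPy-only sketch follows. This is a minimal stand-in: the pipeline described here also uses wavelet transformation and an improved ESRGAN, neither of which is reproduced, and the mean filter below substitutes for the Gaussian filtering named in the text.

```python
import numpy as np

def box_blur(gray, k=3):
    """Simple denoising: a k x k mean filter over an 8-bit grayscale
    image (a stand-in for the Gaussian filtering named in the text)."""
    pad = k // 2
    padded = np.pad(gray.astype(np.float32), pad, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return (out / (k * k)).astype(np.uint8)

def equalize_histogram(gray):
    """Contrast enhancement: classic histogram equalization of an
    8-bit grayscale image with values in 0..255."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # first non-zero value of the CDF
    if cdf[-1] == cdf_min:         # constant image: nothing to spread
        return gray.copy()
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[gray]               # apply the lookup table per pixel
```

Both functions operate on plain `(H, W)` uint8 arrays, so they can be chained in whatever order the preprocessing module requires.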
It should be noted that, according to the effectiveness requirement of the image recognition application and the complexity of the model, a plurality of image recognition algorithm model calculation servers are deployed in the cloud system, and data are transmitted back to the cloud system and are analyzed by the cloud system in a unified manner.
102: and testing the preprocessed sample image to obtain a test result.
In this embodiment, the preprocessed sample image is imported into an algorithm testing module, and a trained testing model is selected from a model calling module according to the sample type. And splitting the task, and performing data sharing on the task image through an NFS system, wherein each test model is trained and tested on data sets of the same type of scene in different batches, and the precision index meets the requirement. The model reasoning process follows the service logic of the identified targets, the main targets are identified in the first step, whether the identified main targets accord with the scene logic or not is judged, and then the targets are detected and classified on the main targets identified in the first step to obtain the final result.
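The two-step inference flow just described can be sketched as a small orchestration function. The three callables are hypothetical stand-ins for the trained models and the business rules; the patent does not specify concrete APIs for them.

```python
from typing import Callable, List

def two_step_inference(image,
                       detect_main: Callable,
                       check_scene_logic: Callable,
                       detect_and_classify: Callable) -> List:
    """Step 1: identify main targets. Targets violating the scene
    logic are treated as false detections and dropped. Step 2: detect
    and classify sub-targets within each surviving main target."""
    results = []
    for main in detect_main(image):
        if not check_scene_logic(main):
            continue  # false detection: discard this main target
        results.extend(detect_and_classify(image, main))
    return results
```

The callables can be swapped per sample type, mirroring the model calling module's selection of a trained test model.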
103: and evaluating the test result to obtain an evaluation result.
In this embodiment, the output result of the algorithm testing module is imported into the sample evaluation module. The identification result information includes the first step identification result and the second step target identification result, including result return code, return description, category serial number, confidence, coordinate information of position, etc.
104: comparing and analyzing the evaluation result with the label in the sample image, and calculating mAP, recall rate and accuracy;
105: judging whether the mAP, the recall rate, and the accuracy of the sample image are lower than their thresholds; if none is lower, executing 106, otherwise executing 107;
106: deeming the sample image qualified and exporting it;
107: deeming the sample image unqualified, importing it into the pre-annotated image modification module, and exporting the annotation information after correction.
In this embodiment, if the mAP, the recall rate, and the accuracy of the sample image are all not lower than their thresholds, the sample image is deemed qualified and exported. If any of the mAP, recall rate, or accuracy is lower than its threshold, the sample image is deemed unqualified and imported into the pre-annotated image modification module, and the annotation information is exported after correction.
Note that the acceptable threshold for the mAP is 50%, the acceptable threshold for the recall rate is 60%, and the acceptable threshold for the accuracy is 70%.
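The qualification decision of steps 105 to 107 can be sketched as follows. This is a minimal illustration: the metric names and dictionary API are our own, and the 70% accuracy threshold follows the value stated in claim 10.

```python
# Acceptable thresholds: 50% mAP and 60% recall as stated here;
# 70% accuracy per claim 10.
THRESHOLDS = {"mAP": 0.50, "recall": 0.60, "accuracy": 0.70}

def is_qualified(metrics: dict) -> bool:
    """A sample batch is qualified only if no metric falls below its
    acceptable threshold; otherwise it is routed to the pre-annotated
    image modification module."""
    return all(metrics[name] >= t for name, t in THRESHOLDS.items())
```

A failing batch would then be imported into the modification module rather than exported.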
In this embodiment, the system uses the NFS (Network File System) to share images: multiple computing servers access them over the network much as they would a local file system, enabling common operation. A distributed computing method distributes the many compute-intensive tasks to active computing servers for parallel operation, greatly reducing the running time.
When detecting an image sample, the data is first processed through a series of preprocessing steps to eliminate irrelevant image information; the algorithm testing module in the system then automatically determines the position and category information of objects in the image sample and writes the algorithm prediction result into the sample evaluation module. Precision evaluation is performed against the labeling result using reference indexes such as recall rate, accuracy, and mAP.
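The precision evaluation against the labels can be illustrated with a minimal IoU-based matcher. This uses greedy one-to-one matching at a fixed IoU threshold; the patent does not specify the matching rule, and the full mAP computation over confidence sweeps is omitted.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def precision_recall(pred_boxes, gt_boxes, iou_thr=0.5):
    """Greedily match predictions to ground-truth labels one-to-one.
    Unmatched ground truths hint at missing labels; unmatched
    predictions hint at incorrect labels."""
    matched, tp = set(), 0
    for p in pred_boxes:
        for i, g in enumerate(gt_boxes):
            if i not in matched and iou(p, g) >= iou_thr:
                matched.add(i)
                tp += 1
                break
    precision = tp / len(pred_boxes) if pred_boxes else 0.0
    recall = tp / len(gt_boxes) if gt_boxes else 0.0
    return precision, recall
```

Comparing these numbers against the acceptable thresholds is what routes a sample to export or to the modification module.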
The second embodiment of the method for rapidly detecting image sample annotation result data provided by the invention differs from the first embodiment in that testing the preprocessed sample image to obtain a test result comprises the following steps:
first-step recognition;
excluding targets with low confidence and results whose size, as a fraction of the full image, is smaller than the normal size fraction;
In this embodiment, after the first-step recognition is completed, targets with low confidence and results whose size fraction of the full image is smaller than the normal size fraction are excluded to reduce false detections.
And judging the logical relationship: for the targets identified in the first step, judge whether the relations between them conform to the scene logic and whether the subordination relations are correct; if not, treat the first-step identification result as a false detection.
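One simple way to encode the subordination relation is a centre-containment test between boxes; the patent leaves the exact scene-logic rule open, so this is purely illustrative.

```python
def center_inside(inner, outer):
    """Subordination check: a sub-target is taken to belong to a main
    target if its box centre lies inside the main target's box.
    Boxes are (x1, y1, x2, y2)."""
    cx = (inner[0] + inner[2]) / 2
    cy = (inner[1] + inner[3]) / 2
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]
```

A sub-target failing this check for every main target would be treated as a false detection, as described above.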
And normalizing: the targets identified in the first step are normalized to a fixed size, ensuring that the targets to be identified enter the inference model at a consistent scale.
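The size normalization can be sketched as a nearest-neighbour resize of the cropped main target; the target side length of 224 is an illustrative choice, not one the patent specifies.

```python
import numpy as np

def normalize_size(crop, size=224):
    """Nearest-neighbour resize of a cropped target to size x size,
    so all targets enter the inference model at a consistent scale."""
    h, w = crop.shape[:2]
    ys = (np.arange(size) * h // size).clip(0, h - 1)  # source rows
    xs = (np.arange(size) * w // size).clip(0, w - 1)  # source cols
    return crop[ys][:, xs]
```

A real pipeline would likely use an interpolating resize, but the scale-consistency property is the same.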
And finally, identifying the target: within the main bodies identified in the first step, the second-step final target detection and classification of the corresponding classes is performed, and low-confidence results are eliminated to obtain the final target output.
Referring to fig. 2, the following describes a flow of the technical solution of the present embodiment by using an example of practical application, including:
201: carrying out image preprocessing on the sample image;
202: first-step recognition, which may include but is not limited to recognition of a main target; the second-step final target detection and classification of corresponding classes is later performed within the main bodies identified in the first step.
203: excluding targets with low confidence and results whose size, as a fraction of the full image, is smaller than the normal size fraction;
204: judging whether the relation between the targets accords with scene logic or not;
if yes, go to step 205, otherwise go to step 206;
205: when the main target accords with the scene logic, detecting and classifying the main target to obtain a final result;
206: and when the main target does not accord with the scene logic, carrying out false detection processing on the recognition result.
207: identifying a final target to obtain final target output;
208: evaluating the test result to obtain an evaluation result;
209: comparing and analyzing the evaluation result with the label in the sample image, and calculating mAP, recall rate and accuracy;
210: judging whether the mAP, the recall rate, and the accuracy of the sample image are lower than their thresholds; if none is lower, executing 211, otherwise executing 212;
211: deeming the sample image qualified and exporting it;
212: deeming the sample image unqualified, importing it into the pre-annotated image modification module, and exporting the annotation information after correction.
In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Artificial intelligence is the discipline of making computers simulate certain human mental processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly comprise computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (10)
1. A method for rapidly detecting image sample annotation result data is characterized by comprising the following steps:
s1: carrying out image preprocessing on the sample image;
s2: testing the sample image after the pretreatment is finished to obtain a test result;
s3: evaluating the test result to obtain an evaluation result;
s4: comparing and analyzing the evaluation result with the label in the sample image, and calculating mAP, recall rate and accuracy;
s5: and judging whether the mAP, the recall rate, and the accuracy of the sample image are lower than their thresholds; if none is lower, the sample image is deemed qualified and exported; otherwise it is deemed unqualified, imported into the pre-annotated image modification module, and the corrected annotation information is exported after modification.
2. The method for rapidly detecting image sample annotation result data according to claim 1, wherein:
step S1 includes denoising, contrast enhancement, and watermark removal on the image sample to remove irrelevant information in the image.
3. The method for rapidly detecting image sample annotation result data according to claim 1, wherein step S2 includes:
identifying a subject target;
when the main target accords with the scene logic, detecting and classifying the main target to obtain a final result;
and when the main target does not accord with the scene logic, carrying out false detection processing on the recognition result.
4. The method for rapidly detecting image sample annotation result data according to claim 3, wherein:
after the main target is identified, targets with low confidence and results whose size, as a fraction of the full image, is smaller than the normal size fraction are excluded, so as to reduce false detections.
5. The method for rapidly detecting image sample annotation result data according to claim 3, wherein:
after the main target is identified, it is normalized to a fixed size to ensure that the targets to be identified enter the inference model at a consistent scale.
6. The method for rapidly detecting image sample annotation result data according to claim 1, wherein step S2 includes:
and carrying out data sharing on the task image through the NFS.
7. The method for rapidly detecting image sample annotation result data according to claim 1, wherein step S2 includes:
and selecting the trained test model from the model calling module according to the sample type.
8. The method for rapidly detecting image sample annotation result data according to claim 1, wherein step S2 includes:
training and testing each test model on data sets of the same type of scene and different batches, wherein the precision index meets the requirement;
the model inference process follows the business logic of identifying targets.
9. The method for rapidly detecting image sample annotation result data according to claim 1, wherein:
the identification result information comprises the first-step identification result and the second-step target identification result, including a result return code, a return description, a category serial number, a confidence, and position coordinates.
10. The method for rapidly detecting image sample annotation result data according to claim 1, wherein:
the acceptable threshold value of mAP is 50%
The recall acceptable threshold is 60%
The acceptable threshold for accuracy is 70%.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111605624.2A CN114266941A (en) | 2021-12-25 | 2021-12-25 | Method for rapidly detecting annotation result data of image sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114266941A true CN114266941A (en) | 2022-04-01 |
Family
ID=80830256
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118094234A (en) * | 2024-04-26 | 2024-05-28 | 广东电网有限责任公司 | Automatic data labeling method and device based on multi-source power data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||