Background
Digital display instruments have long been one of the most important means of displaying equipment parameters, for example in natural gas meters, temperature and humidity meters, electricity meters, and system monitoring equipment in other fields. However, traditional data acquisition depends on manual reading, which is time-consuming and labor-intensive. In recent years, with the development of technologies such as artificial intelligence and the Internet of Things, there is a need to provide a more convenient and lower-cost reading mode using these existing technologies.
At present, the widely applied approach realizes reading with traditional machine learning methods. The main techniques comprise data preprocessing, reading positioning, and digit classification and recognition: data preprocessing mainly uses binarization, filtering, histogram equalization and tilt correction; reading positioning mainly uses the projection method; digit classification and recognition mainly uses techniques such as SVM and KNN. These techniques are briefly described below:
1) Binarization
Binarization mainly comprises converting the original data to grayscale and mapping the gray values to 0 or 1. Its main defect is that the picture loses detail information and rich color distribution information in the process.
2) Filtering
Filtering is a denoising (blurring) step applied to the binarized data using Gaussian filtering and bilateral filtering. Because the core of these algorithms is a weighted average of the pixels neighboring each pixel, filtering loses image detail, and the loss of detail features is large.
3) Histogram equalization
Histogram equalization is performed on the filtered data. Although it can enhance the data to some extent, it may destroy the distribution of the original data.
4) Tilt correction
The Hough transform is used to correct the tilt of the image. However, when the image contains complex or unclear lines, the reference straight line in the image is difficult to detect, so the accuracy of tilt correction is low.
5) Projection method
The projection method projects the preprocessed data in the X and Y directions and selects the discontinuous regions of the projection as the basis for locating and segmenting digits. It therefore has difficulty with multi-line, repetitive and overlapping data.
6) SVM and KNN techniques
Digit recognition classifies and recognizes the feature data obtained after projection using traditional machine learning algorithms such as SVM and KNN. Because the data generally has a complex mathematical structure and distribution, and complex dimension information is difficult to estimate, recognition with traditional machine learning methods is difficult.
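By way of illustration only, the projection method of item 5) above may be sketched as follows. The sketch assumes a binarized image stored as rows of 0/1 values and segments digits along the X direction only; the function names column_projection and split_on_gaps are hypothetical and do not appear in the prior art described here.

```python
from typing import List, Tuple


def column_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground pixels (value 1) in each column of a binarized image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def split_on_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column spans where the projection is non-zero.

    Each span is treated as one candidate digit; empty columns act as separators.
    """
    spans = []
    start = None
    for x, value in enumerate(profile):
        if value > 0 and start is None:
            start = x                      # entering a run of foreground columns
        elif value == 0 and start is not None:
            spans.append((start, x))       # leaving the run
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


if __name__ == "__main__":
    image = [
        [0, 1, 1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 1, 0],
    ]
    print(split_on_gaps(column_projection(image)))
```

Because the split relies on empty columns between digits, touching or overlapping characters merge into a single span, which is consistent with the limitation noted above.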
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide an improved method, a system, a processing device and a storage medium for intelligently identifying the reading of a digital display instrument based on video, which greatly improves the reading identification accuracy of the digital display instrument.
In order to achieve the purpose, the invention adopts the following technical scheme:
In a first aspect, the invention provides a method for intelligently identifying the reading of a digital display instrument, which comprises the following steps:
S1: carrying out size transformation on the original image data to obtain preprocessed image data with consistent size;
S2: inputting the preprocessed image data obtained in step S1 into a pre-trained YOLOv3 network model to obtain a preliminary prediction result;
S3: carrying out threshold judgment on the preliminary prediction result to obtain a plurality of single character objects meeting the threshold condition;
S4: comprehensively judging the horizontal and vertical coordinates of each single character object obtained in step S3 to obtain each combined character string and coordinate information thereof;
S5: outputting each character string and the coordinate information thereof obtained in step S4 to realize intelligent identification of the reading of the digital display instrument.
Further, step S1 is preceded by the step of acquiring raw image data.
Further, the original image data is acquired by a camera device or by directly importing an original image.
Further, in step S2, the obtained preliminary prediction result includes the category identifiers of all the characters in the preprocessed image data and the coordinate information of the four vertices corresponding to the detection box.
Further, in step S3, the threshold judgment of the preliminary prediction result is performed as follows: character objects whose confidence is smaller than a preset threshold are filtered out, yielding a plurality of single character objects meeting the threshold condition.
Further, in step S4, the method for obtaining the combined character strings and the coordinate information thereof includes the following steps:
performing a difference operation on the character coordinates of every two characters, and if the differences of the horizontal coordinates and of the vertical coordinates of the two character objects are each smaller than a respective preset threshold value, determining the two character objects to be data in the same character string; otherwise, the two character objects are considered to belong to different character strings, and a plurality of combined character strings are finally obtained;
based on the obtained combined character strings, the minimum horizontal and vertical coordinates among the characters in each combined character string are taken as the upper left corner coordinate of the character string, and the maximum horizontal and vertical coordinates among the characters in each character string are taken as the lower right corner coordinate of the character string.
In a second aspect of the present invention, there is provided a digital display instrument reading intelligent identification system, which includes:
the data acquisition unit is used for acquiring original image data;
the data preprocessing unit is used for carrying out size transformation on the original image data to obtain preprocessed image data with consistent size;
the data prediction unit is used for inputting the obtained preprocessed image data into a pre-trained YOLOv3 network model to obtain a preliminary prediction result;
the threshold judgment unit is used for carrying out threshold judgment on the preliminary prediction result to obtain a plurality of single character objects meeting the threshold condition;
the character processing unit is used for comprehensively judging the horizontal and vertical coordinates of each obtained single character object to obtain a combined character string and coordinate information thereof;
and the output unit is used for outputting the obtained character strings and coordinate information to realize intelligent identification of the reading of the digital display instrument.
Further, the character processing unit includes:
the character coordinate judging unit is used for carrying out difference value operation on the character coordinates of every two characters, and if the difference values of the horizontal coordinates and the vertical coordinates of the two character objects are respectively smaller than a preset threshold value, the two characters are considered to be data in the same character string and are combined; otherwise, the character strings are considered to belong to different character strings, and a plurality of combined character strings are finally obtained;
and the character string coordinate extraction unit is used for taking the smallest horizontal and vertical coordinate of all characters in each combined character string as the upper left corner coordinate of the character string and taking the largest horizontal and vertical coordinate of all characters in each character string as the lower right corner coordinate of the character string based on the obtained combined character strings.
In a third aspect of the present invention, a processing device is provided, where the processing device at least includes a processor and a memory, where the memory stores a computer program, and the processor executes the computer program to implement the steps of the digital display meter reading intelligent identification method.
In a fourth aspect of the present invention, a computer storage medium is provided, on which computer readable instructions are stored, the computer readable instructions being executable by a processor to implement the steps of the method for intelligently identifying the reading of the digital display meter.
Due to the adoption of the technical scheme, the invention has the following advantages: 1. the method realizes the single character recognition of the digital display instrument reading by introducing the YOLOv3 target detection model, and performs character combination on the recognized result to obtain the character string, so that the accuracy rate of the digital character string reading recognition of the digital display instrument reaches more than 95%. 2. According to the method, all characters recognized by YOLOv3 are comprehensively judged through the position relation, so that a complete character string is obtained through combination, and the accuracy rate of character string recognition is guaranteed. The invention can be widely applied to the field of digital display instrument reading identification.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" may be intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless specifically identified as an order of performance. It should also be understood that additional or alternative steps may be used.
The method, system, processing device and storage medium for intelligently identifying the reading of a digital display instrument provided by the embodiments of the invention are implemented as follows: carrying out size transformation on the original image data; inputting the resized image data into a pre-trained YOLOv3 network model to obtain a preliminary prediction result; carrying out threshold judgment on the preliminary prediction result to obtain single character objects; and comprehensively judging the horizontal and vertical coordinates of each single character object so obtained to obtain the combined character strings and their coordinate information. The method realizes single character recognition of the digital display instrument reading by introducing the YOLOv3 target detection model, and combines the recognized characters into character strings, so that the accuracy of character string reading recognition for the digital display instrument reaches more than 95%.
In order to facilitate understanding of the contents of the embodiments of the present invention, abbreviations and key terms appearing in the embodiments of the present invention are explained below.
SVM: the support vector machine is a binary classification model whose basic idea is to maximize the margin defined on the feature space; margin maximization distinguishes it from other algorithms, and the samples lying on the maximum-margin boundary are the support vectors that give the method its name. The SVM also includes the kernel trick, which makes it essentially a non-linear classifier. SVMs remain widely used in various intelligent production environments because they are mathematically well founded and easy to implement.
KNN: the k-nearest neighbors algorithm is one of the simplest methods among traditional machine learning techniques. Its main idea is that each sample can be represented by its K nearest neighbors. Owing to its simple principle and complete theory, it is widely applied to classification tasks on simple data sets.
YOLOv3: a deep neural network framework based on convolutional neural networks, widely applied in production environments in the field of image recognition owing to its small model file, high accuracy and high recognition speed. In general, the input of the YOLOv3 model is an image, and the output includes category identifications and corresponding detection boxes. A detection box is the position, in the input image, of a character recognized by the YOLOv3 model; it is the box covering the character and is identified by the coordinates of its four vertices. The category identification is the category of each detected character, i.e., which character it is.
Example 1
As shown in fig. 1, the present embodiment provides an intelligent identification method for digital display meter reading, which can realize high-speed and high-accuracy identification of meter reading, and specifically includes the following steps:
S1: acquiring an original image as input data.
Specifically, the input data may be obtained using a camera device or by directly importing the original image.
S2: carrying out size transformation on the original image data to obtain preprocessed image data with consistent size.
Specifically, since the imported original image data may have different sizes, the original image data is converted to a preset size in order to reduce the influence of size differences on recognition accuracy and efficiency, so that all converted image data have the same size.
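By way of illustration only, the size transformation of step S2 may be sketched as follows. The sketch assumes a grayscale image stored as rows of pixel values and uses nearest-neighbor sampling; the function name resize_nearest, the image representation and the interpolation scheme are assumptions made for the example, and the embodiment does not specify them. A practical implementation would more likely call the resize routine of an image library.

```python
from typing import List

# Rows of grayscale pixel values; an illustrative representation only.
Image = List[List[int]]


def resize_nearest(image: Image, out_w: int, out_h: int) -> Image:
    """Resize an image to (out_w, out_h) using nearest-neighbor sampling."""
    in_h = len(image)
    in_w = len(image[0])
    return [
        # Map each output pixel back to the source pixel it falls on.
        [image[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


if __name__ == "__main__":
    small = [[1, 2],
             [3, 4]]
    for row in resize_nearest(small, 4, 4):
        print(row)
```

Applying the same preset size to every input, whatever its original dimensions, gives the network model inputs of consistent shape.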
S3: inputting the preprocessed image data obtained in step S2 into a pre-trained YOLOv3 network model to obtain a preliminary prediction result.
Specifically, the preliminary prediction result includes category identifications of all characters in the preprocessed image data and coordinate information of four vertices of the corresponding detection box.
S4: carrying out threshold judgment on the preliminary prediction result to obtain a plurality of single character objects meeting the threshold condition.
Specifically, the preliminary prediction result obtained in step S3 is filtered by threshold judgment to obtain a plurality of single character objects meeting the threshold condition, which reduces the probability of false detection to a certain extent, that is, reduces target jump.
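By way of illustration only, the threshold judgment of step S4 may be sketched as follows. The CharDetection record, its field names, the function name filter_by_confidence and the default threshold of 0.5 are all hypothetical; the sketch also assumes an axis-aligned detection box, so that the four vertices are represented by two opposite corners.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharDetection:
    """One detected character (hypothetical record layout)."""
    label: str         # category identification, e.g. "7"
    confidence: float  # detection confidence in [0, 1]
    x_min: float       # upper left corner of the detection box
    y_min: float
    x_max: float       # lower right corner of the detection box
    y_max: float


def filter_by_confidence(
    detections: List[CharDetection], threshold: float = 0.5
) -> List[CharDetection]:
    """Drop detections whose confidence is smaller than the preset threshold."""
    return [d for d in detections if d.confidence >= threshold]


if __name__ == "__main__":
    raw = [
        CharDetection("3", 0.92, 10, 10, 28, 40),
        CharDetection("8", 0.31, 12, 11, 30, 41),  # low-confidence duplicate
        CharDetection("5", 0.50, 30, 10, 48, 40),
    ]
    print([d.label for d in filter_by_confidence(raw, 0.5)])
```

A detection whose confidence equals the threshold is retained here, since only detections with confidence smaller than the threshold are filtered out.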
S5: comprehensively judging the horizontal and vertical coordinates of each single character object obtained in step S4 to obtain the combined character strings and coordinate information thereof.
Specifically, since the data obtained in step S4 is data of single character objects, it is necessary to comprehensively judge the horizontal and vertical coordinates of every two character objects, which includes the following steps:
performing a difference operation on the character coordinates of every two characters, and if the differences of the horizontal coordinates and of the vertical coordinates of the two character objects are each smaller than a respective preset threshold value, determining the two character objects to be data in the same character string; otherwise, they are considered to be data in different character strings.
Based on the obtained combined character strings, the minimum horizontal and vertical coordinates among the characters in each combined character string are taken as the upper left corner coordinate of that character string, and the maximum horizontal and vertical coordinates among the characters in each character string are taken as the lower right corner coordinate of that character string.
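By way of illustration only, the character combination of step S5 may be sketched as follows. The sketch makes several assumptions that the embodiment does not state: the upper left corner of each detection box is used as the character coordinate, pairwise matches are merged transitively (so that the first and last characters of a long string join through their neighbors), and characters within a string are ordered from left to right. The name group_characters and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

# (label, x_min, y_min, x_max, y_max); a hypothetical layout for one character.
Char = Tuple[str, float, float, float, float]
Box = Tuple[float, float, float, float]


def group_characters(
    chars: List[Char], max_dx: float, max_dy: float
) -> List[Tuple[str, Box]]:
    """Combine single characters into strings and compute each string's box."""
    parent = list(range(len(chars)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise difference test on horizontal and vertical coordinates.
    for i in range(len(chars)):
        for j in range(i + 1, len(chars)):
            dx = abs(chars[i][1] - chars[j][1])
            dy = abs(chars[i][2] - chars[j][2])
            if dx < max_dx and dy < max_dy:
                parent[find(i)] = find(j)  # same character string

    groups: Dict[int, List[Char]] = {}
    for i, c in enumerate(chars):
        groups.setdefault(find(i), []).append(c)

    results = []
    for members in groups.values():
        members.sort(key=lambda c: c[1])  # left-to-right reading order
        text = "".join(c[0] for c in members)
        box = (
            min(c[1] for c in members), min(c[2] for c in members),  # upper left
            max(c[3] for c in members), max(c[4] for c in members),  # lower right
        )
        results.append((text, box))
    results.sort(key=lambda r: (r[1][1], r[1][0]))  # top-to-bottom, then left
    return results


if __name__ == "__main__":
    detected = [
        ("2", 30, 10, 48, 40), ("1", 10, 10, 28, 40), ("3", 50, 11, 68, 41),
        ("7", 10, 100, 28, 130), ("8", 30, 101, 48, 131),
    ]
    for text, box in group_characters(detected, max_dx=25, max_dy=10):
        print(text, box)
```

In this sketch the horizontal threshold would need to exceed the spacing between adjacent characters, and the vertical threshold would need to stay below the spacing between display rows; suitable values depend on the instrument and the preset image size.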
S6: outputting the character strings and the coordinate information obtained in step S5 to realize intelligent identification of the reading of the digital display instrument.
Example 2
The embodiment 1 provides an intelligent identification method for digital display instrument reading, and correspondingly, the embodiment provides an intelligent identification system for digital display instrument reading. The identification system provided by this embodiment can implement the digital display instrument reading-based intelligent identification method of embodiment 1, and the identification system can be implemented by software, hardware, or a combination of software and hardware. For example, the identification system may comprise integrated or separate functional modules or functional units to perform the corresponding steps in the methods of embodiment 1. Since the identification system of this embodiment is basically similar to the method embodiment, the description process of this embodiment is relatively simple, and reference may be made to the partial description of embodiment 1 for relevant points.
The digital display instrument reading intelligent recognition system that this embodiment provided includes:
the data acquisition unit is used for acquiring original image data;
the data preprocessing unit is used for carrying out size transformation on the original image data to obtain preprocessed image data with consistent size;
the data prediction unit is used for inputting the obtained preprocessed image data into a pre-trained YOLOv3 network model to obtain a preliminary prediction result;
the threshold judgment unit is used for carrying out threshold judgment on the preliminary prediction result to obtain a plurality of single character objects meeting the threshold condition;
the character processing unit is used for comprehensively judging the horizontal and vertical coordinates of each obtained single character object to obtain a combined character string and coordinate information thereof;
and the output unit is used for outputting the obtained character strings and coordinate information to realize intelligent identification of the reading of the digital display instrument.
Further, the character processing unit includes: the character coordinate judging unit is used for carrying out difference value operation on the character coordinates of every two characters, and if the difference values of the horizontal coordinates and the vertical coordinates of the two character objects are respectively smaller than a preset threshold value, the two characters are considered to be data in the same character string; otherwise, the character string is considered to belong to another character string, and finally each character string after combination is obtained;
and the character string coordinate extraction unit is used for taking the minimum horizontal and vertical coordinate from all the characters of the combined character string as the upper left corner coordinate of the character string and taking the maximum horizontal and vertical coordinate from all the characters of the character string as the lower right corner coordinate of the character string.
Example 3
The present embodiment provides a processing device corresponding to the digital display instrument reading intelligent identification method provided in embodiment 1, where the processing device may be a processing device for a client, such as a mobile phone, a notebook computer, a tablet computer, a desktop computer, and the like, to execute the identification method of embodiment 1.
The processing device comprises a processor, a memory, a communication interface and a bus, wherein the processor, the memory and the communication interface are connected through the bus so as to communicate with one another. The memory stores a computer program that can be run on the processor, and the processor executes the computer program to perform the digital display instrument reading intelligent identification method provided in embodiment 1.
In some implementations, the memory may be a high-speed random access memory (RAM), and may also include a non-volatile memory, such as at least one disk memory.
In other implementations, the processor may be various general-purpose processors such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), and the like, and is not limited herein.
Example 4
The method for intelligently identifying the reading of the digital display meter according to embodiment 1 may be embodied as a computer program product, and the computer program product may include a computer readable storage medium on which computer readable program instructions for executing the method for intelligently identifying the reading of the digital display meter according to embodiment 1 are loaded.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any combination of the foregoing.
It should be noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application should be defined by the claims.