CN114973255B - Single-point character recognition method and device - Google Patents
- Publication number
- CN114973255B (application CN202210523887.7A)
- Authority
- CN
- China
- Prior art keywords
- sub
- character
- picture
- pictures
- summarized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING; G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition; G06V30/10—Character recognition; G06V30/14—Image acquisition
- G06V30/141—Image acquisition using multiple overlapping images; Image stitching
- G06V30/1444—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
- G06V30/1452—Selective acquisition, locating or processing of specific regions based on positionally close symbols, e.g. amount sign or URL-specific characters
- G06V30/146—Aligning or centring of the image pick-up or image-field
Abstract
A single-point character recognition method and device. The method judges the user's usage state: the judgment process acquires the picture sequence scanned by the user and performs sub-picture characterization; the inter-frame similarity difference between sub-pictures is then computed from the characterization results, and when that difference is smaller than a given similarity-difference threshold the usage state is judged to be single-point mode. In single-point mode, the sub-pictures of the sequence are stitched into a summary picture; the left and right boundaries of the stitched summary picture are selected and defined, and character recognition is performed on the bounded summary picture and the result output. When the inter-frame similarity difference is not smaller than the threshold, the usage state is judged to be sliding mode: the picture sequence is stitched into a summary picture, and character recognition and output proceed directly. The invention improves the recognition accuracy and user experience of a scanning pen in the single-point scenario while preserving its effectiveness in the sliding scenario.
Description
Technical Field
The invention belongs to the technical field of text recognition, and particularly relates to a single-point character recognition method and device.
Background
At present, cameras, picture-stitching technology and OCR are widely applied in scanning-pen products. Existing products target the sliding-scan scenario, in which the camera-based pipeline comprises a picture-stitching module and an OCR recognition module: the camera photographs the region the user slides across to obtain a picture sequence, the stitching module splices the sequence into a single picture, and the OCR module recognizes the stitched picture to obtain the text of the slid-over region.
However, at the present stage the effect of picture stitching and OCR recognition in the single-point scenario, where the user points the pen at a single word rather than sliding, still needs improvement: an erroneous recognition result is easily produced, or none at all, for two main reasons. First, compared with the sliding scenario, the characters at the front and rear positions of a single-point shot are incomplete. Second, parameter configurations tuned for the sliding scenario do not suit the single-point scenario, so words are easily truncated or no valid stitching result is produced. Improving character-recognition accuracy in the single-point scenario while guaranteeing the usability of the sliding scenario is the technical problem that urgently needs solving.
Disclosure of Invention
Therefore, the invention provides a single-point character recognition method and device, solving the problems that in the single-point scenario characters are easily truncated or no valid stitched picture is produced, recognition accuracy is low, and the effectiveness of the sliding scenario cannot simultaneously be guaranteed.
To achieve the above object, the invention provides the following technical solution. A single-point character recognition method comprises the following steps:
1) Judge the user's usage state: the judgment process acquires the picture sequence scanned by the user and performs sub-picture characterization on the sub-pictures in the sequence;
2) Compute the inter-frame similarity difference between sub-pictures from the characterization results; when the difference is smaller than a given similarity-difference threshold, judge the usage state to be single-point mode;
3) When the usage state is single-point mode, stitch the sub-pictures of the sequence into a summary picture;
4) Select and define the left and right boundaries of the stitched summary picture, then perform character recognition on the bounded summary picture and output the result.
In step 1), when the sub-pictures in the picture sequence are characterized, each sub-picture is first compressed, and the hash value of the compressed sub-picture is taken as its picture characterization.
In step 3), when the sub-pictures of the sequence are stitched into the summary picture and the region containing its characters is determined, starting-position correction is applied to characters with a top-bottom structure and to characters with insufficient projection.
As a preferred embodiment of the single-point character recognition method, step 4) comprises:
41) Binarize the stitched summary picture and project it in the Y-axis direction to obtain a projection map;
42) Scan and evaluate the pixel values in the projection map to obtain character boundary information;
43) From the character boundary information, determine the width of each candidate character, the relative position of the center of the leftmost candidate character from the left edge of the summary picture, and the relative position of the center of the rightmost candidate character from the right edge;
44) Obtain the average width of the characters other than the leftmost and rightmost candidates;
45) Judge whether the leftmost character is complete from the leftmost candidate and the average character width, and whether the rightmost character is complete from the rightmost candidate and the average character width;
46) If the leftmost character is incomplete, determine the left boundary of the summary picture from the relative position of the center of the leftmost candidate character from the left edge;
if the rightmost character is incomplete, determine the right boundary of the summary picture from the relative position of the center of the rightmost candidate character from the right edge.
In step 2), when the inter-frame similarity difference is not smaller than the given similarity-difference threshold, judge the usage state to be sliding mode, stitch the picture sequence into the summary picture, and perform character recognition and output directly.
The invention also provides a single-point character recognition device, comprising:
a usage-state judging module for judging the user's usage state, the usage-state judging module comprising:
a sub-picture characterization sub-module for acquiring, during the judgment process, the picture sequence scanned by the user and performing sub-picture characterization on the sub-pictures in the sequence;
a similarity judging sub-module for computing the inter-frame similarity difference between sub-pictures from the characterization results and, when the difference is smaller than a given similarity-difference threshold, judging the usage state to be single-point mode;
a sub-picture stitching module for stitching the sub-pictures of the sequence into a summary picture when the usage state is single-point mode;
a boundary selection module for selecting and defining the left and right boundaries of the stitched summary picture;
and a recognition output module for performing character recognition on the bounded summary picture and outputting the result.
As a preferred embodiment of the single-point character recognition device, the usage-state judging module further comprises:
a compression processing sub-module for compressing each sub-picture in the picture sequence during characterization and taking the hash value of the compressed sub-picture as its picture characterization.
As a preferred embodiment of the single-point character recognition device, the sub-picture stitching module comprises:
a position correction sub-module for applying starting-position correction to characters with a top-bottom structure and characters with insufficient projection when the sub-pictures are stitched into the summary picture and the region containing its characters is determined.
As a preferred embodiment of the single-point character recognition device, the boundary selection module comprises:
a projection processing sub-module for binarizing the stitched summary picture and projecting it in the Y-axis direction to obtain a projection map;
a character boundary extraction sub-module for scanning and evaluating the pixel values in the projection map to obtain character boundary information;
a character boundary analysis sub-module for determining, from the character boundary information, the width of each candidate character, the relative position of the center of the leftmost candidate character from the left edge of the summary picture, and the relative position of the center of the rightmost candidate character from the right edge;
an average width acquisition sub-module for obtaining the average width of the characters other than the leftmost and rightmost candidates;
a character integrity judging sub-module for judging whether the leftmost character is complete from the leftmost candidate and the average character width, and whether the rightmost character is complete from the rightmost candidate and the average character width;
a character boundary defining sub-module for determining, if the leftmost character is incomplete, the left boundary of the summary picture from the relative position of the center of the leftmost candidate character from the left edge,
and, if the rightmost character is incomplete, the right boundary of the summary picture from the relative position of the center of the rightmost candidate character from the right edge.
As a preferred embodiment of the single-point character recognition device, in the similarity judging sub-module, when the inter-frame similarity difference is not smaller than the given similarity-difference threshold, the usage state is judged to be sliding mode; the picture sequence is stitched into the summary picture, and character recognition and output are performed directly.
The invention has the following advantages. The user's usage state is judged: the judgment process acquires the picture sequence scanned by the user and characterizes its sub-pictures; the inter-frame similarity difference between sub-pictures is computed from the characterization results, and when it is smaller than a given similarity-difference threshold the usage state is judged to be single-point mode. In single-point mode the sub-pictures are stitched into a summary picture, its left and right boundaries are selected and defined, and character recognition is performed on the bounded summary picture and the result output. When the inter-frame similarity difference is not smaller than the threshold, the usage state is judged to be sliding mode: the picture sequence is stitched into a summary picture, and recognition and output proceed directly. The invention avoids word truncation and invalid stitching results in the single-point scenario, improves the recognition accuracy and user experience of a scanning pen in that scenario, and preserves effectiveness in the sliding scenario.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. It will be apparent to those of ordinary skill in the art that the following drawings are exemplary only, and that other implementations can be derived from them without inventive effort.
The structures, proportions and sizes shown in this specification are provided only for purposes of illustration and description and do not limit the scope of the invention, which is defined by the claims; any structural modification, change of proportion or adjustment of size that does not affect the efficacy or attainable objects of the invention falls within that scope.
FIG. 1 is a schematic flow chart of a single-point character recognition method according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of a boundary defining process of single-point character recognition according to embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of a single-point character recognition device according to embodiment 2 of the present invention.
Detailed Description
Other advantages and effects of the invention will become apparent to those skilled in the art from the following detailed description, which illustrates the invention by way of certain specific embodiments, but not all embodiments. All other embodiments obtained by those skilled in the art from the embodiments of the invention without inventive effort fall within the scope of the invention.
Example 1
Referring to fig. 1, embodiment 1 of the present invention provides a single-point character recognition method, which includes the following steps:
S1, judge the user's usage state: the judgment process acquires the picture sequence scanned by the user and performs sub-picture characterization on the sub-pictures in the sequence;
S2, compute the inter-frame similarity difference between sub-pictures from the characterization results; when the difference is smaller than a given similarity-difference threshold, judge the usage state to be single-point mode;
S3, when the usage state is single-point mode, stitch the sub-pictures of the sequence into a summary picture;
S4, select and define the left and right boundaries of the stitched summary picture, then perform character recognition on the bounded summary picture and output the result.
In this embodiment, when step S1 characterizes the sub-pictures in the picture sequence, each sub-picture is first compressed and the hash value of the compressed sub-picture is taken as its picture characterization. In step S2, when the inter-frame similarity difference is not smaller than the given similarity-difference threshold, the usage state is judged to be sliding mode, the picture sequence is stitched into the summary picture, and character recognition and output are performed directly.
Specifically, when computing the similarity of sub-pictures in the picture sequence, the n-th, 2n-th, …, k×n-th sub-pictures are selected, subject to k×n < r×fs, where r is a parameter set to 0.5 and fs is the camera frame rate. To speed up the computation, each selected frame is first compressed to size M, and the hash value of the compressed sub-picture is taken as the picture characterization.
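The frame sampling and hash characterization above can be sketched as follows. This is a minimal sketch, not the patent's implementation: the patent does not specify which hash is used, so an average hash (aHash) over an 8×8 downscale is assumed here, and the function names are illustrative.

```python
import numpy as np

def subsample_frames(frames, n, fs, r=0.5):
    """Select the n-th, 2n-th, ..., k*n-th frames, subject to
    k*n < r*fs, with r = 0.5 and fs the camera frame rate."""
    limit = int(r * fs)
    return [frames[i] for i in range(n - 1, limit, n) if i < len(frames)]

def average_hash(img, size=8):
    """Compress a grayscale frame to size x size (nearest-neighbour
    downscale) and take a simple average hash as the picture
    characterization. The aHash choice is an assumption."""
    h, w = img.shape
    ys = (np.arange(size) * h) // size
    xs = (np.arange(size) * w) // size
    small = img[np.ix_(ys, xs)]                 # size x size thumbnail
    return (small > small.mean()).astype(np.uint8).ravel()
```

With n = 5 and fs = 30, only frames within the first half second are sampled, which keeps the mode decision cheap before stitching begins.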
Meanwhile, the inter-frame similarity difference Δ is computed from the characterization results; if Δ is smaller than the similarity-difference threshold, the user's operation is judged to be single-point mode, otherwise sliding mode.
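With the hash characterizations in hand, the mode decision can be sketched as below. The patent does not define Δ's exact form, so the mean Hamming distance between consecutive binary frame hashes is assumed:

```python
import numpy as np

def classify_mode(hashes, delta_threshold):
    """Inter-frame similarity difference, assumed here to be the mean
    Hamming distance between consecutive binary frame hashes.
    A small delta means near-identical frames: single-point mode."""
    diffs = [int(np.count_nonzero(a != b)) for a, b in zip(hashes, hashes[1:])]
    delta = sum(diffs) / len(diffs)
    return "single-point" if delta < delta_threshold else "sliding"
```

In a single-point hold the camera sees essentially the same word every frame, so consecutive hashes barely differ; a slide changes the field of view each frame and pushes Δ above the threshold.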
In single-point mode, step S3 stitches the sub-pictures of the picture sequence into a summary picture; when the region containing the characters of the summary picture is determined, starting-position correction is applied to characters with a top-bottom structure and to characters with insufficient projection.
Specifically, compared with stitching in sliding mode, the influence of glyph structure on cropping must be considered when determining the character region of the stitched summary picture, so the character starting position is corrected to allow greater redundancy. The correction targets Chinese characters with a top-bottom structure, such as the character rendered "yes", and simple characters with insufficient projection, such as the character for "one".
Referring to fig. 2, in the single point mode, step S4 includes:
S41, binarize the stitched summary picture and project it in the Y-axis direction to obtain a projection map;
S42, scan and evaluate the pixel values in the projection map to obtain character boundary information;
S43, from the character boundary information, determine the width of each candidate character, the relative position of the center of the leftmost candidate character from the left edge of the summary picture, and the relative position of the center of the rightmost candidate character from the right edge;
S44, obtain the average width of the characters other than the leftmost and rightmost candidates;
S45, judge whether the leftmost character is complete from the leftmost candidate and the average character width, and whether the rightmost character is complete from the rightmost candidate and the average character width;
S46, if the leftmost character is incomplete, determine the left boundary of the summary picture from the relative position of the center of the leftmost candidate character from the left edge;
if the rightmost character is incomplete, determine the right boundary of the summary picture from the relative position of the center of the rightmost candidate character from the right edge.
Specifically, the purpose of step S4 is to solve the recognition errors caused by incomplete characters near the left and right boundary regions in single-point mode.
Specifically, in step S41, binarizing the summary picture sets the grey value of each pixel to 0 or 255, so that the whole picture shows a clear black-and-white effect. The projection map is obtained by projecting the binarized summary picture in the Y-axis direction, i.e. a horizontal projection. Binarization and projection are themselves prior art, with mature implementations available.
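A minimal sketch of the binarization and Y-axis projection, under the assumption of a fixed grey threshold (the patent does not specify one; Otsu thresholding would also fit). Ink pixels are counted per column, so runs of non-zero columns mark candidate characters:

```python
import numpy as np

def binarize(gray, threshold=128):
    """Set each pixel to 0 (ink) or 255 (background).
    The fixed threshold value is an assumption for illustration."""
    return np.where(gray < threshold, 0, 255).astype(np.uint8)

def y_axis_projection(binary):
    """Project along the Y axis: count ink (0-valued) pixels in each
    column of the binarized summary picture."""
    return np.count_nonzero(binary == 0, axis=0)
```

A column whose count is zero contains no ink, so zero-valued stretches of the projection separate candidate characters from one another.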
Specifically, from the character boundary information, the width W of each candidate character, the relative position B_left of the center of the leftmost candidate character from the left edge of the summary picture, and the relative position E_right of the center of the rightmost candidate character from the right edge are determined. The leftmost and rightmost candidates are then set aside and the average width W_average of the remaining characters is computed. If the absolute difference between the leftmost candidate's width W_leftmost and W_average exceeds a given limit threshold, the leftmost character is judged incomplete; likewise, if the absolute difference between the rightmost candidate's width W_rightmost and W_average exceeds the threshold, the rightmost character is judged incomplete.
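The boundary scan and the width-based completeness test can be sketched as follows. The run-scanning details and function names are assumptions; the patent only specifies comparing W_leftmost and W_rightmost against W_average under a limit threshold.

```python
def character_runs(proj):
    """Scan the column projection for contiguous non-zero runs;
    each run (start_col, end_col) is a candidate character."""
    runs, start = [], None
    for x, v in enumerate(proj):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(proj) - 1))
    return runs

def incomplete_edges(runs, limit):
    """Judge the leftmost/rightmost candidates incomplete when their
    width deviates from the interior characters' average width
    (W_average) by more than the given limit threshold."""
    widths = [e - s + 1 for s, e in runs]
    interior = widths[1:-1]
    w_avg = sum(interior) / len(interior)
    return (abs(widths[0] - w_avg) > limit,
            abs(widths[-1] - w_avg) > limit)
```

A character clipped by the camera's field of view shows up as an unusually narrow run at one edge, which is exactly what the width deviation detects.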
When the leftmost character is judged incomplete, the left boundary of the summary picture is placed at twice the relative position B_left of the leftmost candidate's center, measured in from the left edge.
When the rightmost character is judged incomplete, the right boundary is placed at twice the relative position E_right of the rightmost candidate's center, measured in from the right edge.
Specifically, placing the left boundary at 2×B_left from the left edge removes the leftmost incomplete character, and placing the right boundary at 2×E_right from the right edge removes the rightmost incomplete character; the left and right boundaries of the stitched summary picture are thereby defined.
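Under the reading above, the boundary definition can be sketched as below; the column coordinates, inclusive run ends, and the function name are illustrative assumptions:

```python
def crop_boundaries(width, runs, left_incomplete, right_incomplete):
    """Define the summary picture's crop columns: if an edge character
    is incomplete, place the boundary at twice the distance from that
    edge to the character's center, which removes it entirely."""
    left = 0
    if left_incomplete:
        s, e = runs[0]
        b_left = (s + e) / 2.0                 # center's distance from left edge
        left = int(2 * b_left)
    right = width
    if right_incomplete:
        s, e = runs[-1]
        e_right = (width - 1) - (s + e) / 2.0  # center's distance from right edge
        right = width - int(2 * e_right)
    return left, right
```

Since the edge character's center sits B_left (or E_right) from the edge, a cut at twice that distance lands just past the character's far side, dropping the partial glyph while keeping every complete one.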
In summary, this embodiment judges the user's usage state: the judgment process acquires the picture sequence scanned by the user and performs sub-picture characterization; the inter-frame similarity difference between sub-pictures is computed from the characterization results, and when it is smaller than a given similarity-difference threshold the usage state is judged to be single-point mode. In single-point mode the sub-pictures are stitched into a summary picture, its left and right boundaries are selected and defined, and character recognition is performed on the bounded picture and the result output. When the inter-frame similarity difference is not smaller than the threshold, the usage state is judged to be sliding mode: the picture sequence is stitched into a summary picture, and recognition and output proceed directly. The invention thus avoids word truncation and invalid stitching results in the single-point scenario, improves the recognition accuracy and user experience of a scanning pen in that scenario, and preserves effectiveness in the sliding scenario.
Example 2
Referring to fig. 3, embodiment 2 of the present invention further provides a single-point character recognition device, comprising:
the usage-state judging module 1, for judging the user's usage state, the usage-state judging module comprising:
the sub-picture characterization sub-module 11, for acquiring, during the judgment process, the picture sequence scanned by the user and performing sub-picture characterization on the sub-pictures in the sequence;
the similarity judging sub-module 12, for computing the inter-frame similarity difference between sub-pictures from the characterization results and, when the difference is smaller than a given similarity-difference threshold, judging the usage state to be single-point mode;
the sub-picture stitching module 2, for stitching the sub-pictures of the sequence into a summary picture when the usage state is single-point mode;
the boundary selection module 3, for selecting and defining the left and right boundaries of the stitched summary picture;
and the recognition output module 4, for performing character recognition on the bounded summary picture and outputting the result.
In this embodiment, the usage-state judging module 1 further comprises:
the compression processing sub-module 13, for compressing each sub-picture in the picture sequence during characterization and taking the hash value of the compressed sub-picture as its picture characterization.
In this embodiment, the sub-picture stitching module 2 comprises:
the position correction sub-module 21, for applying starting-position correction to characters with a top-bottom structure and characters with insufficient projection when the sub-pictures are stitched into the summary picture and the region containing its characters is determined.
In this embodiment, the boundary selection module 3 includes:
the projection processing sub-module 31 is configured to binarize the spliced summarized picture, and perform Y-axis direction projection to obtain a projection image;
A character boundary extraction sub-module 32, configured to scan and determine pixel values in the projection map, so as to obtain character boundary information;
The character boundary analysis sub-module 33 is configured to determine, according to the character boundary information, width information of the candidate characters, a relative position of a center position of a leftmost candidate character from a left edge of the summarized picture, and a relative position of a center position of a rightmost candidate character from a right edge of the summarized picture;
An average width acquisition sub-module 34 for acquiring an average width of characters other than the leftmost candidate character and the rightmost candidate character;
The character integrity judging sub-module 35 is configured to judge whether the leftmost character is complete according to the leftmost candidate character and the character average width, and judge whether the rightmost character is complete according to the rightmost candidate character and the character average width;
a character boundary defining sub-module 36, configured to determine a left boundary of the summarized picture according to a relative position of a center position of the leftmost candidate character from a left edge of the summarized picture if the leftmost character is a incomplete character;
and if the rightmost character is the incomplete character, determining the right boundary of the summarized picture according to the relative position of the center position of the rightmost candidate character from the right edge of the summarized picture.
In this embodiment, in the similarity determination sub-module 12, when the inter-frame similarity difference is not smaller than the given similarity difference threshold, the use state of the user is determined to be a sliding mode; the picture sequence is then spliced into the summarized picture, and character recognition and output are performed directly.
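The compress-and-hash sub-picture characterization and the inter-frame similarity test that separate single-point mode from sliding mode can be sketched as below. The average-hash scheme, the 8×8 downscale, and the difference threshold of 10 are assumptions for illustration; the description only requires compressing the sub-pictures, characterizing them by hash values, and comparing the inter-frame similarity difference against a threshold.

```python
import numpy as np

def average_hash(gray, size=8):
    """Compress a sub-picture to size x size by block means and
    characterize it with an average hash: 1 where a cell is brighter
    than the overall mean (assumes dimensions divisible by size)."""
    h, w = gray.shape
    small = gray.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming_difference(hash_a, hash_b):
    """Similarity difference between two sub-picture hashes."""
    return int(np.count_nonzero(hash_a != hash_b))

def classify_mode(frames, diff_threshold=10):
    """Single-point mode when every inter-frame hash difference stays
    below the threshold; otherwise sliding mode."""
    hashes = [average_hash(f) for f in frames]
    diffs = [hamming_difference(a, b) for a, b in zip(hashes, hashes[1:])]
    return "single-point" if all(d < diff_threshold for d in diffs) else "sliding"
```

A stationary pen produces near-identical frames (hash difference ~0), while a sliding pen produces frames whose hashes differ in many bits, which trips the threshold.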
It should be noted that, since the information interaction and execution processes between the modules/sub-modules of the above device are based on the same concept as the method embodiment in Embodiment 1 of the present application, they bring the same technical effects as that method embodiment; for details, refer to the description of the foregoing method embodiment, which is not repeated here.
Example 3
Embodiment 3 of the present invention provides a non-transitory computer-readable storage medium having stored therein program code of a single-point character recognition method, the program code including instructions for performing the single-point character recognition method of embodiment 1 or any possible implementation thereof.
The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), etc.
Example 4
Embodiment 4 of the present invention provides an electronic device, including: a memory and a processor;
The processor and the memory communicate with each other through a bus; the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the single-point character recognition method of Embodiment 1 or any possible implementation thereof.
Specifically, the processor may be implemented in hardware or in software. When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented in software, the processor may be a general-purpose processor realized by reading software code stored in a memory, which may be integrated in the processor or reside outside the processor as a standalone component.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions; when the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another, for example by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices, and may optionally be implemented as program code executable by computing devices, so that they can be stored in a memory device and executed by those devices. In some cases, the steps shown or described may be performed in a different order than that shown or described; alternatively, the modules or steps may each be fabricated as an individual integrated circuit module, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
While the invention has been described in detail through the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Such modifications and improvements made without departing from the spirit of the invention are intended to fall within the scope of the invention as claimed.
Claims (10)
1. A single-point character recognition method, characterized by comprising the following steps:
1) Judging the use state of a user, wherein the use state judging process acquires a picture sequence scanned by the user and performs sub-picture characterization on the sub-pictures in the picture sequence;
2) Calculating the inter-frame similarity difference between sub-pictures according to the sub-picture characterization result, and judging that the use state of the user is a single-point mode when the inter-frame similarity difference is smaller than a given similarity difference threshold;
3) When the use state of the user is the single-point mode, splicing the sub-pictures in the picture sequence into a summarized picture;
4) Selecting and defining the left and right boundaries of the summarized picture after the sub-pictures are spliced, and performing character recognition and output on the summarized picture with the defined left and right boundaries.
2. The single-point character recognition method according to claim 1, wherein in step 1), when the sub-pictures in the picture sequence are subjected to sub-picture characterization, the sub-pictures in the picture sequence are compressed, and the hash values of the compressed sub-pictures are selected for picture characterization.
3. The single-point character recognition method according to claim 2, wherein in step 3), the sub-pictures in the picture sequence are spliced into a summarized picture, and, when determining the area where the characters in the summarized picture are located, starting position correction is performed on characters with an upper-lower structure and characters with insufficient projection.
4. The single-point character recognition method according to claim 3, wherein step 4) includes:
41) Binarizing the spliced summarized picture and performing Y-axis direction projection to obtain a projection map;
42) Scanning and determining the pixel values in the projection map to obtain character boundary information;
43) Determining, according to the character boundary information, the width information of the candidate characters, the relative position of the center position of the leftmost candidate character from the left edge of the summarized picture, and the relative position of the center position of the rightmost candidate character from the right edge of the summarized picture;
44) Acquiring the average width of the characters other than the leftmost candidate character and the rightmost candidate character;
45) Judging whether the leftmost character is complete according to the leftmost candidate character and the average character width, and whether the rightmost character is complete according to the rightmost candidate character and the average character width;
46) If the leftmost character is an incomplete character, determining the left boundary of the summarized picture according to the relative position of the center position of the leftmost candidate character from the left edge of the summarized picture; and if the rightmost character is an incomplete character, determining the right boundary of the summarized picture according to the relative position of the center position of the rightmost candidate character from the right edge of the summarized picture.
5. The single-point character recognition method according to claim 1, wherein in step 2), when the inter-frame similarity difference is not smaller than the given similarity difference threshold, the use state of the user is determined to be a sliding mode, the picture sequence is spliced into the summarized picture, and character recognition and output are performed directly.
6. A single-point character recognition device, characterized by comprising:
The use state judging module is used for judging the use state of the user; the use state judging module comprises:
the sub-graph representation sub-module is used for acquiring a picture sequence scanned by a user in the use state judging process and carrying out sub-graph representation on sub-graphs in the picture sequence;
The similarity judging sub-module is used for calculating the inter-frame similarity difference of the sub-pictures according to the sub-picture characterization result, and judging that the use state of the user is a single-point mode when the inter-frame similarity difference is smaller than a given similarity difference threshold value;
the sub-picture splicing module is used for splicing sub-pictures in the picture sequence into summarized pictures when the use state of the user is a single-point mode;
The boundary selection module is used for selecting and defining the left and right boundaries of the summarized picture after the sub-picture is spliced;
and the identification output module is used for carrying out character identification and output on the summarized pictures with the left and right boundaries defined.
7. The single-point character recognition device according to claim 6, wherein the use state judging module further comprises:
and the compression processing sub-module is used for compressing the sub-pictures in the picture sequence when the sub-pictures in the picture sequence are subjected to sub-picture characterization, and selecting hash values of the compressed sub-pictures to carry out picture characterization.
8. The single-point character recognition device according to claim 7, wherein the sub-picture splicing module comprises:
The position correction sub-module, which is used for splicing the sub-pictures in the picture sequence into a summarized picture and, when determining the area where the characters in the summarized picture are located, correcting the starting positions of characters with an upper-lower structure and characters with insufficient projection.
9. The single-point character recognition device according to claim 8, wherein the boundary selection module comprises:
The projection processing sub-module, used for binarizing the spliced summarized picture and performing Y-axis direction projection to obtain a projection map;
The character boundary extraction sub-module, used for scanning and determining the pixel values in the projection map to obtain character boundary information;
The character boundary analysis sub-module, used for determining, according to the character boundary information, the width information of the candidate characters, the relative position of the center position of the leftmost candidate character from the left edge of the summarized picture, and the relative position of the center position of the rightmost candidate character from the right edge of the summarized picture;
The average width acquisition sub-module, used for acquiring the average width of the characters other than the leftmost candidate character and the rightmost candidate character;
The character integrity judging sub-module, used for judging whether the leftmost character is complete according to the leftmost candidate character and the average character width, and whether the rightmost character is complete according to the rightmost candidate character and the average character width;
The character boundary defining sub-module, used for determining the left boundary of the summarized picture according to the relative position of the center position of the leftmost candidate character from the left edge of the summarized picture if the leftmost character is an incomplete character, and for determining the right boundary of the summarized picture according to the relative position of the center position of the rightmost candidate character from the right edge of the summarized picture if the rightmost character is an incomplete character.
10. The device according to claim 9, wherein the similarity judging sub-module determines that the use state of the user is a sliding mode when the inter-frame similarity difference is not smaller than the given similarity difference threshold, splices the picture sequence into the summarized picture, and directly performs character recognition and output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210523887.7A CN114973255B (en) | 2022-05-14 | 2022-05-14 | Single-point character recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114973255A CN114973255A (en) | 2022-08-30 |
CN114973255B true CN114973255B (en) | 2024-09-10 |
Family
ID=82983870
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210523887.7A Active CN114973255B (en) | 2022-05-14 | 2022-05-14 | Single-point character recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114973255B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2254325A1 (en) * | 2009-05-20 | 2010-11-24 | Dacuda AG | Image processing for handheld scanner |
WO2021232593A1 (en) * | 2020-05-22 | 2021-11-25 | 平安国际智慧城市科技股份有限公司 | Product protocol character recognition-based method and apparatus for recognizing malicious terms, and device |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100514355C (en) * | 2005-09-05 | 2009-07-15 | 富士通株式会社 | Method and device for assigned text line extraction |
WO2009114967A1 (en) * | 2008-03-19 | 2009-09-24 | 东莞市步步高教育电子产品有限公司 | Motion scan-based image processing method and device |
CN102156867B (en) * | 2010-12-27 | 2013-02-20 | 汉王科技股份有限公司 | Method and device for splicing image sequence |
CN102324027B (en) * | 2011-05-27 | 2013-05-29 | 汉王科技股份有限公司 | Scanning and identifying device and method |
CN103116751B (en) * | 2013-01-24 | 2016-07-06 | 河海大学 | A kind of Method of Automatic Recognition for Character of Lcecse Plate |
CN109377834B (en) * | 2018-09-27 | 2021-02-09 | 成都快眼科技有限公司 | Text conversion method and system for assisting blind person in reading |
CN111027546B (en) * | 2019-12-05 | 2024-03-26 | 嘉楠明芯(北京)科技有限公司 | Character segmentation method, device and computer readable storage medium |
CN113220204A (en) * | 2021-06-02 | 2021-08-06 | 深圳云知声信息技术有限公司 | Dictionary pen scanning method and device, electronic equipment and storage medium |
CN113628113B (en) * | 2021-08-11 | 2024-07-23 | 科大讯飞股份有限公司 | Image stitching method and related equipment thereof |
CN113642559B (en) * | 2021-08-18 | 2023-10-17 | 百度在线网络技术(北京)有限公司 | Text acquisition method, device and equipment based on scanning equipment and storage medium |
CN113971808B (en) * | 2021-10-26 | 2025-03-11 | 福建云知声智能科技有限公司 | A streaming text recognition method, device, electronic device and storage medium |
CN114359910B (en) * | 2021-12-30 | 2025-03-28 | 科大讯飞股份有限公司 | Text point reading method, computer equipment and storage medium |
- 2022-05-14 CN CN202210523887.7A patent/CN114973255B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN114973255A (en) | 2022-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5318122B2 (en) | Method and apparatus for reading information contained in bar code | |
JP5896245B2 (en) | How to crop a text image | |
CN103400099B (en) | Terminal and two-dimensional code identification method | |
JP2010218061A (en) | Image processing device | |
CN111445424B (en) | Image processing method, device, equipment and medium for processing mobile terminal video | |
KR20150031766A (en) | Method and Apparatus for Using Image Stabilization | |
CN111368574A (en) | Bar code identification method and device | |
KR20100019961A (en) | Character recognition device, program, and method | |
US8538191B2 (en) | Image correction apparatus and method for eliminating lighting component | |
JP2001266068A (en) | Method and device for recognizing table, character- recognizing device, and storage medium for recording table recognizing program | |
US8472078B2 (en) | Image processing apparatus for determining whether a region based on a combined internal region is a table region | |
US20250078496A1 (en) | System and method for capturing by a device an image of a light colored object on a light colored background for uploading to a remote server | |
US8768058B2 (en) | System for extracting text from a plurality of captured images of a document | |
CN113657369B (en) | Character recognition method and related equipment thereof | |
CN113657370B (en) | Character recognition method and related equipment thereof | |
CN114973255B (en) | Single-point character recognition method and device | |
WO2013177240A1 (en) | Textual information extraction method using multiple images | |
KR101473713B1 (en) | Apparatus for recognizing character and method thereof | |
US10373329B2 (en) | Information processing apparatus, information processing method and storage medium for determining an image to be subjected to a character recognition processing | |
CN113538337B (en) | Detection method, detection device and computer readable storage medium | |
CN112541429B (en) | Intelligent image capture method and device, electronic equipment and storage medium | |
CN108447107B (en) | Method and apparatus for generating video | |
CN112233134B (en) | Control segmentation method and device, storage medium, and electronic device | |
CN117993407A (en) | Method, device, equipment and medium for identifying specific object in blurred image | |
CN112911305A (en) | Image compression method for high-speed processing and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||