
CN118334674B - Automatic identification method and system for document shooting image - Google Patents

Automatic identification method and system for document shooting image

Info

Publication number
CN118334674B
Authority
CN
China
Prior art keywords
image
confidence
text
contrast
pixel point
Prior art date
Legal status
Active
Application number
CN202410758895.9A
Other languages
Chinese (zh)
Other versions
CN118334674A (en)
Inventor
折大伟
章小花
季飞行
姬婵
乔栋栋
王左丽
Current Assignee
Xi'an Huoda Network Technology Co ltd
Original Assignee
Xi'an Huoda Network Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Xi'an Huoda Network Technology Co ltd filed Critical Xi'an Huoda Network Technology Co ltd
Priority to CN202410758895.9A priority Critical patent/CN118334674B/en
Publication of CN118334674A publication Critical patent/CN118334674A/en
Application granted granted Critical
Publication of CN118334674B publication Critical patent/CN118334674B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G06V30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • G06V30/16 Image preprocessing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The invention relates to the field of image processing, and in particular to a method and system for automatically identifying a document shooting image. The method comprises the following steps: dividing the delivery receipt area image into a plurality of image blocks; adjusting the contrast limiting parameter of each image block to obtain a first parameter, thereby obtaining an improved limited-contrast adaptive histogram equalization, wherein the first parameter is positively correlated with the initial value of the contrast limiting parameter and negatively correlated with the average text confidence of all pixel points in the image block; performing equalization on each image block with the improved limited-contrast adaptive histogram equalization and combining all enhanced image blocks into a final enhanced image; and identifying the enhanced image to obtain accurate text information of the delivery bill. By dynamically adjusting the degree of contrast enhancement according to the text confidence of each pixel point, the invention obtains the complete text information of the delivery bill and improves the accuracy of the system's text recognition.

Description

Automatic identification method and system for document shooting image
Technical Field
The present invention relates to the field of image processing, and in particular to an automatic identification method and system for a document shooting image.
Background
In large-scale coal transportation logistics, the huge volume of transported goods brings long transportation routes, complicated supply-chain nodes, and large numbers of freight drivers and freight vehicles. In such bulk logistics scenes based on coal transportation, the information coordination and statistical settlement problems of batch transportation are generally solved by scanning delivery documents to extract information such as the freight driver, the freight vehicle, the delivery amount and the settlement amount.
The patent application with publication number CN113903020A discloses a logistics document information acquisition method based on machine learning: a scanning/photographing device scans or photographs a logistics document entering the identification area from all directions, uploads the scanned logistics document picture information to an image identification device, acquires the corresponding column image information of the logistics document, controls the index parameters of the scanning device during data acquisition, and performs character recognition on the scanned image using OCR (optical character recognition). In this way the character data on the surface of the logistics document is acquired effectively and accurately, the definition of the logistics document picture is ensured during acquisition, and blurred pictures that would affect the acquisition of picture information are avoided.
Although this method ensures the definition of the acquired logistics document pictures and the efficiency of data acquisition, defects such as folds may exist on the surface of the logistics document, so the complete character data of the document cannot always be detected accurately; recognition errors result and the progress of logistics work is affected.
Disclosure of Invention
In order to solve the technical problem of how to accurately detect the data information of the complete logistics document text, the invention provides the following aspects.
In a first aspect, a method for automatically identifying a document-captured image includes:
acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks;
adjusting the contrast limiting parameter of each image block to obtain a first parameter, thereby obtaining an improved limited-contrast adaptive histogram equalization;
performing equalization on each image block with the improved limited-contrast adaptive histogram equalization to obtain enhanced image blocks, and combining all the enhanced image blocks into a final enhanced image;
Identifying the enhanced image to obtain accurate text information of the delivery bill;
The first parameter is positively correlated with the initial value of the contrast limiting parameter and negatively correlated with the average value of the text confidence of all pixel points in each image block; the text confidence is used to represent the credibility that the corresponding pixel point is text;
the first parameter is:
β_i = β_0 · exp(-Z_i); where β_i denotes the first parameter of the i-th image block, β_0 is the initial value of the contrast limiting parameter of the i-th image block, Z_i is the average value of the text confidence of all pixel points in the i-th image block, and exp(·) is the exponential function with the natural constant e as its base.
After the delivery receipt area image is acquired, it is divided into a plurality of image blocks and the contrast limiting parameter of each block is adjusted dynamically according to the text confidence: image blocks with high text confidence are given a lower limiting parameter so that the text is enhanced relative to the background, while blocks with low text confidence are given a higher limiting parameter so that noise is not over-enhanced. Different parts of the image are therefore contrast-enhanced adaptively according to their own characteristics, and excessive enhancement of the image is avoided. Applying the improved limited-contrast adaptive histogram equalization to each image block finally yields an enhanced image in which the text information is clearer, so accurate and complete text data of the delivery bill are obtained while the processing efficiency of the image and the accuracy of text recognition are improved.
In one embodiment, the text confidence obtaining process includes:
For each image block, acquiring information of all pixel points in the image block, and calculating a first confidence coefficient of the corresponding pixel point; the information comprises gradient amplitude and contrast of the pixel points; the first confidence coefficient is the confidence coefficient that the pixel point belongs to the text region;
Calculating differences between the pixel points and average gradient features in the corresponding surrounding neighborhood in all gradient directions to obtain a difference set, calculating variance of the difference set, and taking the product of the variance and the first confidence as text confidence of the pixel points.
By calculating the first confidence of each pixel point, whether the pixel point belongs to a text area can be preliminarily determined, so that subsequent text detection and recognition can be carried out in a targeted manner. Taking into account the difference between the gradient features of a pixel point and those of its surrounding neighborhood allows the boundary of the text region to be determined more accurately.
In one embodiment, when calculating the first confidence coefficient of the pixel point, calculating a second confidence coefficient of the pixel point, and correcting the first confidence coefficient based on the second confidence coefficient; the second confidence coefficient is the confidence coefficient that the pixel point belongs to the outlier noise area.
By calculating the second confidence level, the system can be helped to identify the outlier noise region and distinguish the outlier noise region from the text region, and the situation that the noise region is mistakenly identified as the text region can be reduced.
In one embodiment, performing the equalization process on each image block includes:
calculating a gray histogram of each image block;
Taking the first parameter as a multiple of the average column height of columns corresponding to all gray levels in the corresponding gray level histogram, and cutting the gray level histogram of each image block; the parts exceeding the multiple of the average column height are cut off, and the cut parts are added to all columns on average;
and carrying out equalization treatment on the cut gray level histogram.
By clipping the gray level histogram and equalizing it, excessive enhancement and image distortion that may occur during contrast enhancement can be reduced. By limiting the range of contrast enhancement, it is ensured that the image remains natural and realistic, avoiding the occurrence of excessive enhancement.
In one embodiment, the text confidence is:
Z_j = α · Norm(P_j · σ²); where Z_j denotes the text confidence of the j-th pixel point, α is a custom parameter for adjusting the value range of the text confidence of the j-th pixel point, P_j denotes the first confidence of the j-th pixel point, σ² denotes the variance of the difference set, and Norm(·) is a normalization function.
The probability that a pixel point belongs to the text region is quantified by calculating its text confidence, and the custom parameter α is used to adjust the value range of the text confidence so as to adapt to different application scenarios and requirements.
In one embodiment, the first confidence level is:
P_j = (G_i^max / G_i^avg) · exp(-O_j · |C_i - C_g|); where P_j denotes the first confidence of the j-th pixel point, C_i denotes the contrast of the i-th image block, C_g denotes the contrast of the global extent of the delivery document area image, G_i^max denotes the maximum gradient magnitude of the pixel points in the i-th image block, G_i^avg denotes the average gradient magnitude of the pixel points in the i-th image block, exp(·) is the exponential function with the natural constant e as its base, and O_j denotes the second confidence of the j-th pixel point, which is used to adjust the contrast term of the corresponding pixel point.
In one embodiment, the second confidence level is:
O_j = Norm(μ_j · exp(-v_j)); where O_j denotes the second confidence of the j-th pixel point, μ_j denotes the average gray value of the pixels in the search area corresponding to the j-th pixel point, v_j denotes the variance of the gray values of those pixels in the search area whose gray value is smaller than μ_j, Norm(·) is a normalization function, exp(·) is the exponential function with the natural constant e as its base, and the search area is the intersection of a circular range of set radius R centered on each pixel point and the region of the delivery document area image.
In a second aspect, an automatic document shooting image recognition system includes: the automatic identification device comprises a processor and a memory, wherein the memory stores computer program instructions which are executed by the processor to realize the automatic identification method of the photographed image of the bill.
The beneficial effects are as follows: the invention first divides the whole delivery receipt area image into a number of small image blocks so that the image information of each part can be processed more precisely; it then calculates the text confidence within each image block to dynamically adjust the contrast limiting parameter, and applies the improved limited-contrast adaptive histogram equalization to enhance the contrast of each image block according to its own characteristics, finally obtaining the enhanced image. Accurate and complete data of the delivery receipt text are thus obtained while the processing efficiency of the image and the accuracy of character recognition are improved.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. In the drawings, embodiments of the invention are illustrated by way of example and not by way of limitation, and like reference numerals refer to similar or corresponding parts and in which:
fig. 1 is a flowchart of a method for automatically recognizing a document photographed image in steps S1 to S4 according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses an automatic identification method for a document shooting image which, referring to FIG. 1, comprises the following steps S1 to S4:
When a delivery document image is processed to identify the text information in it, the image usually needs to be pre-processed to increase the accuracy of recognition. In the embodiments of the present invention, limited-contrast adaptive histogram equalization (CLAHE) is employed to improve the image contrast and highlight the text features. CLAHE limits the amplification of noise and avoids excessive enhancement by clipping the histogram, so that the contrast of each small region is improved while the uniformity of the overall image is preserved.
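For reference, conventional CLAHE is available directly in OpenCV; the following minimal sketch shows the baseline that the present method modifies (the file name and the fixed clip limit of 2.0 are illustrative assumptions, not values from the embodiment):

```python
import cv2

# Standard (unmodified) CLAHE: a single fixed clip limit is applied to every tile.
# The method described below replaces this fixed limit with a per-block value
# derived from the text confidence.
gray = cv2.imread("delivery_note.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)
cv2.imwrite("delivery_note_clahe.jpg", enhanced)
```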
S1: and acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks.
A flat delivery bill image is acquired and converted into a grayscale image, and the delivery bill area in the image is extracted by semantic segmentation to obtain the delivery bill area image.
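A minimal sketch of this step, assuming the delivery bill area has already been extracted by a semantic segmentation model (not shown here); the 8×8 grid matches the 64-block division used later in the embodiment, but the block count is otherwise a free choice:

```python
import numpy as np

def split_into_blocks(region: np.ndarray, rows: int = 8, cols: int = 8) -> list:
    """Equally divide a grayscale delivery-bill region into rows * cols blocks.

    Pixels beyond an exact multiple of the block size are simply trimmed here;
    a production implementation might pad instead.
    """
    h, w = region.shape
    bh, bw = h // rows, w // cols
    return [region[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]
```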
S2: and adjusting the contrast limiting parameters of the corresponding image blocks to obtain a first parameter, and further obtaining improved limiting contrast self-adaptive histogram equalization.
In the prior art of limited-contrast adaptive histogram equalization (CLAHE), the gray histogram of an image is clipped directly with a fixed contrast limiting parameter and histogram equalization is then performed, thereby enhancing the contrast of the image and highlighting the character information in it. However, writing traces (for example where the ink flow of the pen refill is interrupted) can be very close in gray value to the delivery bill paper, so such character information is difficult to highlight effectively in the image.
The text area of the image is formed by writing or printing, but sometimes the ink of the pen refill runs dry or the printing is unclear, so that only shallow marks close to the color of the delivery bill paper are left on the delivery bill; meanwhile, under poor preservation conditions the delivery bill may develop folds, which affects the identification of the text information in it. In order to highlight the text in the image, the text region therefore needs to be extracted initially.
Firstly, traversing all pixel points in each image block, finding out the maximum gray value and the minimum gray value in the image block, and similarly finding out the maximum gray value and the minimum gray value of the image of the delivery receipt area, thereby obtaining the corresponding local contrast and global contrast.
The Sobel operator is used to calculate the gradient of the image, and the gradient intensity of each pixel point in the image is represented by calculating the magnitude of the gradient (i.e., gradient amplitude).
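A sketch of the per-block measurements used below; defining the contrast as the difference between the maximum and minimum gray values is an assumption consistent with the preceding paragraph, and the Sobel kernel size is illustrative:

```python
import cv2
import numpy as np

def block_statistics(block: np.ndarray, region: np.ndarray):
    """Local/global contrast plus per-pixel gradient magnitude and direction."""
    local_contrast = float(block.max()) - float(block.min())
    global_contrast = float(region.max()) - float(region.min())
    gx = cv2.Sobel(block, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(block, cv2.CV_64F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)                        # gradient amplitude
    direction = np.degrees(np.arctan2(gy, gx)) % 360.0  # gradient direction in [0, 360)
    return local_contrast, global_contrast, magnitude, direction
```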
Illustratively, take the i-th image block, and select the j-th pixel point in it as the object of analysis.
According to the information of the j-th pixel point in the i-th image block, the confidence that the pixel point belongs to the text region, namely the first confidence, is calculated. The specific calculation formula is:
P_j = (G_i^max / G_i^avg) · exp(-O_j · |C_i - C_g|)
where P_j denotes the first confidence of the j-th pixel point, C_i denotes the contrast of the i-th image block, C_g denotes the contrast of the global extent of the delivery document area image, G_i^max denotes the maximum gradient magnitude of the pixel points in the i-th image block, G_i^avg denotes the average gradient magnitude of the pixel points in the i-th image block, exp(·) is the exponential function with the natural constant e as its base, and O_j denotes the second confidence of the j-th pixel point, which is used to adjust the contrast term of the corresponding pixel point.
In other embodiments, feature extraction is performed with a convolutional neural network and sequence recognition is performed by combining the convolutional neural network with connectionist temporal classification (CTC), so that the text content and its confidence can be predicted directly from the image block by training a deep network.
In image processing, outliers generally refer to pixels that are significantly different from surrounding pixels, which may be due to errors in the image acquisition or transmission process. The text area refers to those areas containing important information such as characters, symbols, and the like. Therefore, the distribution condition of the gray values of the pixel points around each pixel point in the image of the delivery receipt area is analyzed to determine the confidence that each pixel point belongs to the outlier noise area, namely the second confidence.
Specifically, a circular range with a set radius R is taken with the j-th pixel point as its center, and the intersection of this circular range with the delivery bill area image is used as the search area of the corresponding pixel point; the gray values of the pixel points in this intersection are extracted. In the embodiment of the present invention R = 45, and the practitioner can adjust it according to the resolution of the image acquisition device and the image. According to the gray values of the pixel points in the search area of the j-th pixel point, the second confidence of the pixel point is determined. The specific calculation formula is:
O_j = Norm(μ_j · exp(-v_j))
where O_j denotes the second confidence of the j-th pixel point, μ_j denotes the average gray value of the pixels in the search area corresponding to the j-th pixel point, v_j denotes the variance of the gray values of those pixels in the search area whose gray value is smaller than μ_j, Norm(·) is a normalization function, and exp(·) is the exponential function with the natural constant e as its base.
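A hedged sketch of this computation; the circular search area and the two statistics follow the description above, while the way they are combined (and the deferred normalization) mirrors the reconstructed formula and should be read as an assumption:

```python
import numpy as np

def second_confidence_score(region: np.ndarray, y: int, x: int, radius: int = 45) -> float:
    """Un-normalized outlier-noise score O_j for the pixel at (y, x)."""
    h, w = region.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2   # circle clipped to the region
    values = region[mask].astype(np.float64)
    mu = values.mean()                                    # mean gray value in the search area
    below = values[values < mu]
    var_below = below.var() if below.size > 0 else 0.0    # variance of gray values below the mean
    return mu * np.exp(-var_below)  # bright, uniform surroundings -> likely outlier noise
```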
The above determination of the second confidence of the j-th pixel point is mainly based on the gray information around the pixel. In other embodiments, the color values or texture features of the pixel points may be used as data points and the LOF (local outlier factor) algorithm may be used to detect outlier noise points, or the histogram of the pixel points may be analyzed to determine the outlier noise points.
G_i^max / G_i^avg represents the complexity of the edge texture within the i-th image block: the greater this value, the more likely the i-th image block belongs to a text region, and the smaller it is, the more likely the block belongs to a blank region. |C_i - C_g| represents the difference between the contrast of the i-th image block and the contrast of the delivery document area image, i.e., the difference between the local contrast and the global contrast: the larger this value, the larger the difference between the maximum/minimum gray values of the pixel points in the i-th image block and those of the whole delivery document area image, and the more likely the i-th image block belongs to a blank area in the delivery document area image; conversely, the smaller it is, the more likely the block belongs to a text area in the delivery document area image.
By introducing O_j to correct the classification probability of the pixel point, whether the pixel point belongs to an outlier noise area or a text area can be judged more accurately. The smaller O_j is, the less the term exp(-O_j · |C_i - C_g|) attenuates the contrast difference, so the greater the first confidence of the j-th pixel point and the more likely the pixel point belongs to the text region; the larger O_j is, the more likely the j-th pixel point belongs to an outlier region, which typically lies in a blank region.
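Under the reconstructed formula above, the first confidence can be sketched as follows; the functional form is inferred from the stated monotonic relationships and should be read as an assumption rather than the original expression:

```python
import numpy as np

def first_confidence(o_j: float, local_contrast: float, global_contrast: float,
                     grad_max: float, grad_mean: float) -> float:
    """Reconstructed first confidence P_j of one pixel in block i."""
    edge_complexity = grad_max / max(grad_mean, 1e-9)      # G_i^max / G_i^avg
    contrast_diff = abs(local_contrast - global_contrast)  # |C_i - C_g|
    return edge_complexity * np.exp(-o_j * contrast_diff)  # smaller O_j -> larger P_j
```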
Secondly, after the first confidence of the j-th pixel point is obtained, the differences between the average gradient features of the pixel point and those of its surrounding neighborhood in all gradient directions are calculated to obtain a difference set, the variance of the difference set is computed, and the product of this variance and the first confidence is taken as the text confidence of the j-th pixel point, thereby further highlighting the text information.
It should be noted that the size of the surrounding neighborhood of the pixel point to be analyzed is determined according to the specific application scenario and requirements; the neighborhood may be a 3×3, 5×5, or other size window.
Specifically, the original gradient direction of the j-th pixel point is mapped to obtain the mapped gradient direction of the pixel point. In the embodiment of the present invention, the mapping divides the angle corresponding to the original gradient direction by 45 degrees and rounds up, so that the gradient direction is mapped onto eight directions numbered 1 to 8; specifically, (0°, 22.5°) maps to 1, (22.5°, 67.5°) maps to 2, (67.5°, 112.5°) maps to 3, (112.5°, 157.5°) maps to 4, (157.5°, 202.5°) maps to 5, (202.5°, 247.5°) maps to 6, (247.5°, 292.5°) maps to 7, and (292.5°, 337.5°) maps to 8.
Wherein the mapped gradient direction can highlight the main directional characteristic in the image, which is helpful for distinguishing different image contents, such as characters, lines, etc.
The specific calculation formula of the text confidence of the j-th pixel point is then:
Z_j = α · Norm(P_j · σ²)
where Z_j denotes the text confidence of the j-th pixel point, α is a custom parameter for adjusting the value range of the text confidence (i.e., the probability that the pixel point belongs to the text region), P_j denotes the first confidence of the j-th pixel point, σ² denotes the variance of the difference set, and Norm(·) is a normalization function.
The larger the first confidence P_j and the variance σ² of the difference set are, the larger the text confidence Z_j of the pixel point is, which indicates that the gradient differences among the directions within the i-th image block are larger, the lines inside the image block are more complex, the credibility that the corresponding pixel point belongs to the text region is higher, and the more likely the i-th image block is a text region.
The difference set is obtained as follows:
Calculate, for the i-th image block, the sum S_k of the gradient magnitudes of all pixel points in the k-th gradient direction.
Obtain the number N_k of pixel points in the k-th gradient direction of the i-th image block.
Obtain the average gradient magnitude A_k = S_k / N_k of the pixel points in the k-th gradient direction and the average gradient magnitude A_avg over the eight gradient directions, and then compute their difference. The specific calculation formula is:
D_k = A_k - A_avg
where D_k is the difference between the average gradient magnitude of the pixel points in the k-th gradient direction of the i-th image block and the average gradient magnitude over the eight gradient directions, that is, the difference between the average gradient magnitude of the pixels in an individual gradient direction and the average gradient magnitude of the entire image block, and A_avg is the average value of A_1, ..., A_8.
The differences obtained for the eight gradient directions of the i-th image block form the difference set {D_1, ..., D_8}, and the variance σ² of this difference set is calculated.
Here k = 1, 2, ..., 8. In the present embodiment the custom parameter α is set to a fixed value; in other embodiments it may be adjusted according to the actual situation.
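Putting the difference-set construction and the text confidence together, a hedged sketch follows; the eight-direction binning uses the divide-by-45-degrees-and-round-up rule described above, and the role of the custom parameter alpha follows the reconstructed formula (an assumption):

```python
import numpy as np

def text_confidence_score(magnitude: np.ndarray, direction: np.ndarray,
                          first_conf: float, alpha: float = 1.0) -> float:
    """Un-normalized text confidence Z_j from the gradient statistics of a block."""
    bins = np.ceil((direction % 360.0) / 45.0).astype(int)  # map directions onto 1..8
    bins[bins == 0] = 8                                      # treat an exact 0 angle as wrap-around
    per_dir_avg = np.array([magnitude[bins == k].mean() if np.any(bins == k) else 0.0
                            for k in range(1, 9)])           # A_1 .. A_8
    differences = per_dir_avg - magnitude.mean()             # difference set D_1 .. D_8
    variance = float(differences.var())                      # variance of the difference set
    return alpha * first_conf * variance                     # Norm(.) applied afterwards over the image
```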
Finally, the initial value of the contrast limiting parameter in the limited-contrast adaptive histogram equalization (CLAHE) is set to β_0, the average value Z_i of the text confidences Z_j of all pixel points in the i-th image block is computed from the text confidences obtained above, and the contrast limiting parameter of the i-th image block is adjusted accordingly. It should be noted that in the embodiment of the present invention β_0 is set to a fixed value; in other embodiments the parameter may be adjusted according to the actual situation.
The calculation formula of the first parameter obtained after the adjustment is:
β_i = β_0 · exp(-Z_i)
where β_i denotes the first parameter of the i-th image block, Z_i is the average value of the text confidence of all pixel points in the i-th image block, and exp(·) is the exponential function with the natural constant e as its base.
The improved limited-contrast adaptive histogram equalization is thereby obtained.
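Given the per-pixel text confidences of a block, the adjusted contrast limiting parameter follows directly; β_0 is left as an argument in this sketch because its concrete value is set empirically in the embodiment:

```python
import numpy as np

def first_parameter(text_conf_block: np.ndarray, beta_0: float) -> float:
    """Adjusted clip limit beta_i = beta_0 * exp(-mean text confidence of the block)."""
    return float(beta_0 * np.exp(-text_conf_block.mean()))
```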
S3: and carrying out equalization treatment on each image block by adopting improved limited contrast self-adaptive histogram equalization to obtain enhanced image blocks, and combining all the enhanced image blocks into a final enhanced image.
Since histogram equalization has a stronger enhancement effect on gray values that occur frequently, gray values that occur rarely are enhanced poorly or even swallowed. In an image block belonging to a text area, the gray values of the white background occur far more often than the gray values of the text, so during histogram equalization the contrast among the background gray values may be increased while the text is not emphasized, which is unfavorable for recognition. For text areas a lower contrast limiting parameter is therefore adopted: the frequency of the gray values corresponding to the white background is reduced, excessive enhancement of the white background is avoided, the contrast between text and background becomes larger, and the text enhancement effect is better.
An image block corresponding to a blank area contains no characters, only white background and noise points. The frequency of the gray values of the white background is very high, but some noise is also present, and histogram equalization would enhance the contrast of the whole block and thereby amplify the noise. To avoid excessive enhancement of noise, image blocks whose pixel points have low text confidence are given a higher contrast limiting parameter, which limits the enhancement of the noise so that non-text information in the image has less influence on the recognition of text information.
Specifically, with the delivery receipt area image equally divided into 64 image blocks as described above serving as the tiling of the limited-contrast adaptive histogram equalization (CLAHE), a gray histogram is obtained for each image block and the improved limited-contrast adaptive histogram equalization is applied: using the adjusted contrast limiting parameter β_i, the part of each column of the gray histogram that exceeds the height β_i × h, where h is the average column height, is cut off, and all the cut-off pixel counts are distributed equally over all columns. After the text information has been enhanced in this way, all the enhanced image blocks are merged into the final enhanced image.
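A sketch of the per-block equalization with the adjusted limit; the clipping and even redistribution follow the description above, while applying the mapping by a plain per-block lookup (without the bilinear blending of standard CLAHE) is an assumption:

```python
import numpy as np

def equalize_block(block: np.ndarray, beta_i: float) -> np.ndarray:
    """Clip the block histogram at beta_i times the average column height, then equalize.

    The block is assumed to be an 8-bit (uint8) grayscale array.
    """
    hist, _ = np.histogram(block, bins=256, range=(0, 256))
    hist = hist.astype(np.float64)
    clip = beta_i * hist.mean()                      # beta_i x average column height
    excess = np.clip(hist - clip, 0, None).sum()     # total clipped pixel count
    hist = np.minimum(hist, clip) + excess / 256.0   # redistribute the clipped part evenly
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1e-9)
    lut = np.round(cdf * 255.0).astype(np.uint8)
    return lut[block]                                # apply the equalization mapping
```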
S4: and identifying the enhanced image to obtain accurate text information of the delivery bill.
After the enhanced image is acquired, the worker can recognize the text information in the enhanced image through the relevant recognition equipment, so that the worker can process and manage the freight flow more efficiently, errors are reduced, and accuracy is improved.
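The patent does not name a specific recognition engine; as one possible back-end, the enhanced image could be passed to Tesseract via pytesseract, shown here purely as an illustration with the simplified-Chinese language pack assumed to be installed:

```python
import cv2
import pytesseract

enhanced = cv2.imread("delivery_note_enhanced.png", cv2.IMREAD_GRAYSCALE)  # hypothetical path
text = pytesseract.image_to_string(enhanced, lang="chi_sim")  # simplified-Chinese model
print(text)
```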
After the image of the delivery bill area is acquired, it is divided into a plurality of image blocks and the contrast limiting parameter is adjusted according to the text confidence: when the text confidence of a block is high, a lower limiting parameter is used so that the text in that block is enhanced more strongly relative to the background; when the text confidence is low, a higher limiting parameter is used so that noise is not over-enhanced. In this way the dynamic contrast adjustment of the image is realized: different parts of the image are contrast-enhanced adaptively according to their own characteristics, and excessive enhancement is avoided. The improved limited-contrast adaptive histogram equalization adjusted in this way is applied to each image block, and an enhanced image is finally obtained in which the text information is clearer, so accurate and complete data of the delivery bill text are obtained while the processing efficiency of the image and the accuracy of text recognition are improved.
The embodiment of the invention also discloses an automatic identification system for the bill shooting image, which comprises a processor and a memory, wherein the memory stores computer program instructions, and the automatic identification method for the bill shooting image is realized when the computer program instructions are executed by the processor.
The system further comprises other components known to those skilled in the art, such as communication buses and communication interfaces, the arrangement and function of which are known in the art and therefore will not be described in detail herein.
In the context of this patent, the foregoing memory may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, the computer-readable storage medium may be any suitable magnetic or magneto-optical storage medium, such as resistive random access memory (RRAM), dynamic random access memory (DRAM), static random access memory (SRAM), enhanced dynamic random access memory (EDRAM), high-bandwidth memory (HBM), hybrid memory cube (HMC), or any other medium that may be used to store the desired information and that may be accessed by an application, a module, or both. Any such computer storage media may be part of, accessible by, or connectable to the device.
In the description of the present specification, the meaning of "a plurality" or "a number" is at least two, for example two, three or more, unless explicitly defined otherwise.
While various embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Many modifications, changes, and substitutions will now occur to those skilled in the art without departing from the spirit and scope of the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.

Claims (3)

1. An automatic identification method for a document shooting image is characterized by comprising the following steps:
acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks;
adjusting the contrast limiting parameter of each image block to obtain a first parameter, thereby obtaining an improved limited-contrast adaptive histogram equalization;
performing equalization on each image block with the improved limited-contrast adaptive histogram equalization to obtain enhanced image blocks, and combining all the enhanced image blocks into a final enhanced image;
Identifying the enhanced image to obtain accurate text information of the delivery bill;
The first parameter is positively correlated with the initial value of the contrast limiting parameter and negatively correlated with the average value of the text confidence of all pixel points in each image block; the text confidence is used to represent the credibility that the corresponding pixel point is text;
the first parameter is:
β_i = β_0 · exp(-Z_i); where β_i denotes the first parameter of the i-th image block, β_0 is the initial value of the contrast limiting parameter of the i-th image block, Z_i is the average value of the text confidence of all pixel points in the i-th image block, and exp(·) is the exponential function with the natural constant e as its base;
For each image block, acquiring information of all pixel points in the image block, and calculating a first confidence coefficient of the corresponding pixel point; the information comprises gradient amplitude and contrast of the pixel points; the first confidence coefficient is the confidence coefficient that the pixel point belongs to the text region; calculating differences between the pixel points and average gradient features in surrounding adjacent areas corresponding to the pixel points in all gradient directions to obtain a difference set, calculating variances of the difference set, and taking products of the variances and the first confidence as text confidence of the pixel points;
when calculating the first confidence coefficient of the pixel point, calculating a second confidence coefficient of the pixel point, and correcting the first confidence coefficient based on the second confidence coefficient; the second confidence coefficient is the confidence coefficient that the pixel point belongs to the outlier noise area;
The text confidence is as follows:
Z_j = α · Norm(P_j · σ²); where Z_j denotes the text confidence of the j-th pixel point, α is a custom parameter for adjusting the value range of the text confidence of the j-th pixel point, P_j denotes the first confidence of the j-th pixel point, σ² denotes the variance of the difference set, and Norm(·) is a normalization function;
The first confidence is:
P_j = (G_i^max / G_i^avg) · exp(-O_j · |C_i - C_g|); where P_j denotes the first confidence of the j-th pixel point, C_i denotes the contrast of the i-th image block, C_g denotes the contrast of the global extent of the delivery document area image, G_i^max denotes the maximum gradient magnitude of the pixel points in the i-th image block, G_i^avg denotes the average gradient magnitude of the pixel points in the i-th image block, exp(·) is the exponential function with the natural constant e as its base, and O_j denotes the second confidence of the j-th pixel point, which is used to adjust the contrast term of the corresponding pixel point;
The second confidence is:
O_j = Norm(μ_j · exp(-v_j)); where O_j denotes the second confidence of the j-th pixel point, μ_j denotes the average gray value of the pixels in the search area corresponding to the j-th pixel point, v_j denotes the variance of the gray values of those pixels in the search area whose gray value is smaller than μ_j, Norm(·) is a normalization function, exp(·) is the exponential function with the natural constant e as its base, and the search area is the intersection of a circular range of set radius R centered on each pixel point and the region of the delivery document area image.
2. The automatic document shooting image recognition method according to claim 1, wherein the equalizing processing of each image block comprises:
calculating a gray histogram of each image block;
Taking the first parameter as a multiple of the average column height of columns corresponding to all gray levels in the corresponding gray level histogram, and cutting the gray level histogram of each image block; the parts exceeding the multiple of the average column height are cut off, and the cut parts are added to all columns on average;
and carrying out equalization treatment on the cut gray level histogram.
3. An automatic document shooting image recognition system, comprising: a processor and a memory storing computer program instructions which, when executed by the processor, implement the document capture image automatic identification method of any one of claims 1-2.
CN202410758895.9A 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image Active CN118334674B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410758895.9A CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410758895.9A CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Publications (2)

Publication Number Publication Date
CN118334674A CN118334674A (en) 2024-07-12
CN118334674B true CN118334674B (en) 2024-08-13

Family

ID=91779299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410758895.9A Active CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Country Status (1)

Country Link
CN (1) CN118334674B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819772A (en) * 2021-01-28 2021-05-18 南京挥戈智能科技有限公司 High-precision rapid pattern detection and identification method
CN115082672A (en) * 2022-06-06 2022-09-20 西安电子科技大学 Infrared image target detection method based on bounding box regression

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9171224B2 (en) * 2013-07-04 2015-10-27 Qualcomm Incorporated Method of improving contrast for text extraction and recognition applications
US20230394670A1 (en) * 2020-10-20 2023-12-07 The Johns Hopkins University Anatomically-informed deep learning on contrast-enhanced cardiac mri for scar segmentation and clinical feature extraction

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819772A (en) * 2021-01-28 2021-05-18 南京挥戈智能科技有限公司 High-precision rapid pattern detection and identification method
CN115082672A (en) * 2022-06-06 2022-09-20 西安电子科技大学 Infrared image target detection method based on bounding box regression

Also Published As

Publication number Publication date
CN118334674A (en) 2024-07-12

Similar Documents

Publication Publication Date Title
CN111275129B (en) Image data augmentation policy selection method and system
TWI774659B (en) Image text recognition method and device
CN109409374B (en) A joint-based method for cutting the answer area of the same batch of test papers
JP5616308B2 (en) Document modification detection method by character comparison using character shape feature
CN110348264B (en) QR two-dimensional code image correction method and system
CN112183038A (en) Form identification and typing method, computer equipment and computer readable storage medium
US20120294528A1 (en) Method of Detecting and Correcting Digital Images of Books in the Book Spine Area
CN112926564B (en) Picture analysis method, system, computer device and computer readable storage medium
WO2019056346A1 (en) Method and device for correcting tilted text image using expansion method
CN105374015A (en) Binary method for low-quality document image based on local contract and estimation of stroke width
CN111178290A (en) Signature verification method and device
CN114926839A (en) Image identification method based on RPA and AI and electronic equipment
CN107016363A (en) Bill images managing device, bill images management system and method
CN109447080B (en) Character recognition method and device
US20200293811A1 (en) Method and device for obtaining image of form sheet
CN112419207A (en) Image correction method, device and system
CN117830315B (en) Real-time monitoring method and system for printing machine based on image processing
CN111814673A (en) Method, device and equipment for correcting text detection bounding box and storage medium
CN112183325B (en) Road vehicle detection method based on image comparison
CN112991536A (en) Automatic extraction and vectorization method for geographic surface elements of thematic map
US7961941B2 (en) Color form dropout using dynamic geometric solid thresholding
CN118334674B (en) Automatic identification method and system for document shooting image
JP7377661B2 (en) Image semantic region segmentation device, region detection sensitivity improvement method, and program
CN112183531A (en) Method, device, medium and electronic equipment for determining character positioning frame
CN114596564A (en) Recognition method and system for layered characters of optical delivery box

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant