[go: up one dir, main page]

CN118334674B - Automatic identification method and system for document shooting image - Google Patents

Automatic identification method and system for document shooting image Download PDF

Info

Publication number
CN118334674B
CN118334674B CN202410758895.9A CN202410758895A CN118334674B CN 118334674 B CN118334674 B CN 118334674B CN 202410758895 A CN202410758895 A CN 202410758895A CN 118334674 B CN118334674 B CN 118334674B
Authority
CN
China
Prior art keywords
image
confidence
text
contrast
pixel point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410758895.9A
Other languages
Chinese (zh)
Other versions
CN118334674A (en
Inventor
折大伟
章小花
季飞行
姬婵
乔栋栋
王左丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Huoda Network Technology Co ltd
Original Assignee
Xi'an Huoda Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Huoda Network Technology Co ltd filed Critical Xi'an Huoda Network Technology Co ltd
Priority to CN202410758895.9A priority Critical patent/CN118334674B/en
Publication of CN118334674A publication Critical patent/CN118334674A/en
Application granted granted Critical
Publication of CN118334674B publication Critical patent/CN118334674B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/15Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/16Image preprocessing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The invention relates to the field of image processing, in particular to a method and a system for automatically identifying a document shooting image, wherein the method comprises the following steps: dividing the delivery receipt area image into a plurality of image blocks; adjusting contrast limiting parameters of corresponding image blocks to obtain first parameters, and further obtaining improved limiting contrast self-adaptive histogram equalization; the first parameter is respectively positively and negatively correlated with an initial value of the contrast limiting parameter and an average value of the text confidence coefficient of all pixel points in each image block; performing equalization processing on each image block by adopting improved limiting contrast self-adaptive histogram equalization, and combining all the enhanced image blocks into a final enhanced image; and identifying the enhanced image to obtain accurate text information of the delivery bill. According to the invention, the degree of contrast enhancement is dynamically adjusted according to the text confidence of each pixel point, so that complete text information of the delivery bill is obtained, and the accuracy of text recognition of the system is improved.

Description

Automatic identification method and system for document shooting image
Technical Field
The present invention relates to the field of image processing. The invention relates to an automatic identification method and system for a document shooting image.
Background
In large logistics service of coal transportation, due to huge amount of transported goods, a longer transportation route span can exist in more complicated supply chain nodes, huge freight driver groups and freight car numbers. Information coordination problems and statistical settlement problems of batch transportation are generally solved by scanning delivery documents to extract information such as freight drivers, freight vehicles, delivery amounts and settlement amounts in a bulk logistics scene based on coal transportation.
The patent application document with the publication number of CN113903020A discloses a logistics document information acquisition method based on machine learning, wherein a scanning/photographing device is used for carrying out omnibearing scanning/photographing on a logistics document entering an identification area, the scanning device is used for uploading scanned logistics document picture information to an image identification device, acquiring corresponding column image information of the logistics document, controlling index parameters of the scanning device in the process of acquiring data of the logistics document, and carrying out character recognition on the scanned image by using an OCR (optical character recognition) technology, so that the data information of characters on the surface of the logistics document is effectively and accurately acquired, the definition of the logistics document picture in the process of acquiring the logistics document picture is ensured, and the acquired picture is prevented from affecting the acquisition of picture information in a fuzzy manner.
The method ensures that the acquired data information is accurate and efficient although the definition of the acquired logistics document pictures is ensured, the defects such as folds and the like can be formed on the surface of the logistics document, the data information of complete logistics document characters can not be accurately detected, errors are caused in recognition, and the progress of logistics work is influenced.
Disclosure of Invention
In order to solve the technical problem of how to accurately detect the data information of the complete logistics document text, the invention provides the following aspects.
In a first aspect, a method for automatically identifying a document-captured image includes:
acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks;
adjusting contrast limiting parameters of corresponding image blocks to obtain first parameters, and further obtaining improved limiting contrast self-adaptive histogram equalization;
Each image block is subjected to equalization processing by adopting improved limiting contrast self-adaptive histogram equalization, so that an enhanced image block is obtained, and all the enhanced image blocks are combined into a final enhanced image;
Identifying the enhanced image to obtain accurate text information of the delivery bill;
The first parameter is respectively positively and negatively correlated with an initial value of the contrast limiting parameter and an average value of the text confidence coefficient of all pixel points in each image block; the text confidence is used for representing the credibility of the corresponding pixel point as text;
the first parameter is:
; in the method, in the process of the invention, Represent the firstA first parameter of the image block is calculated,Is the firstThe initial values of the contrast limiting parameters of the individual image blocks,Is the firstAn average value of text confidence of all pixels in the image block,To be with natural constantIs an exponential function of the base.
After the image of the delivery receipt area is acquired, dividing the image into a plurality of image blocks, dynamically adjusting contrast limiting parameters according to the text confidence, and carrying out stronger contrast enhancement on the corresponding image blocks when the text confidence is high, otherwise, reducing the contrast limiting parameters. This means that different parts of the image will be adaptively contrast enhanced according to their characteristics, and excessive enhancement of the image can be avoided. The self-adaptive histogram equalization with improved limiting contrast is applied to each image block, and finally, an enhanced image is obtained, so that the text information is clearer, and the processing efficiency of the image and the accuracy of text recognition are improved while the accurate and complete data information of the text of the delivery bill is obtained.
In one embodiment, the text confidence obtaining process includes:
For each image block, acquiring information of all pixel points in the image block, and calculating a first confidence coefficient of the corresponding pixel point; the information comprises gradient amplitude and contrast of the pixel points; the first confidence coefficient is the confidence coefficient that the pixel point belongs to the text region;
Calculating differences between the pixel points and average gradient features in the corresponding surrounding neighborhood in all gradient directions to obtain a difference set, calculating variance of the difference set, and taking the product of the variance and the first confidence as text confidence of the pixel points.
By calculating the first confidence coefficient of each pixel point, whether the pixel point belongs to a text area or not can be determined in an auxiliary mode, and therefore subsequent text detection and recognition can be conducted in a targeted mode. The difference between the gradient characteristics of the pixel points and the gradient characteristics of the neighborhood around the pixel points is considered, so that the boundary of the text region is determined more accurately.
In one embodiment, when calculating the first confidence coefficient of the pixel point, calculating a second confidence coefficient of the pixel point, and correcting the first confidence coefficient based on the second confidence coefficient; the second confidence coefficient is the confidence coefficient that the pixel point belongs to the outlier noise area.
By calculating the second confidence level, the system can be helped to identify the outlier noise region and distinguish the outlier noise region from the text region, and the situation that the noise region is mistakenly identified as the text region can be reduced.
In one embodiment, performing the equalization process on each image block includes:
calculating a gray histogram of each image block;
Taking the first parameter as a multiple of the average column height of columns corresponding to all gray levels in the corresponding gray level histogram, and cutting the gray level histogram of each image block; the parts exceeding the multiple of the average column height are cut off, and the cut parts are added to all columns on average;
and carrying out equalization treatment on the cut gray level histogram.
By clipping the gray level histogram and equalizing it, excessive enhancement and image distortion that may occur during contrast enhancement can be reduced. By limiting the range of contrast enhancement, it is ensured that the image remains natural and realistic, avoiding the occurrence of excessive enhancement.
In one embodiment, the text confidence is:
; in the method, in the process of the invention, Represent the firstThe text confidence of each pixel point,To adjust the firstThe custom parameters of the text confidence value range of each pixel point,Represent the firstA first confidence level for each pixel point,Representing the variance of the set of differences,Is a normalization function.
The probability of the pixel points belonging to the text region is quantified by calculating the text confidence coefficient of the pixel points, and the custom parameters are setThe method is used for adjusting the value range of the text confidence coefficient so as to adapt to different application scenes and requirements.
In one embodiment, the first confidence level is:
The first confidence is:
; in the method, in the process of the invention, Represent the firstA first confidence level for each pixel point,Represent the firstThe contrast of the individual image blocks is determined,Representing the contrast of the global extent of the delivery document area image,Represent the firstThe maximum value of the gradient amplitude of the pixel points in the image block,Represent the firstAverage value of gradient amplitude of pixel points in each image block,To be with natural constantAs an exponential function of the base number,Represent the firstAnd the second confidence coefficient of each pixel point is used for adjusting the contrast of the corresponding pixel point.
In one embodiment, the second confidence level is:
; in the method, in the process of the invention, Represent the firstA second confidence level for each pixel point,Represent the firstEach pixel corresponds to an average value of pixel gray values in the search area,Represent the firstThe gray value in the corresponding exploration area of each pixel point is smaller thanThe variance of the gray value of the pixel of (c),As a function of the normalization,To be with natural constantAn exponential function of the base, the search area being based onAnd the circular range with the set radius and the region where the image of the delivery document region intersect are formed by taking each pixel point as the circle center.
In a second aspect, an automatic document shooting image recognition system includes: the automatic identification device comprises a processor and a memory, wherein the memory stores computer program instructions which are executed by the processor to realize the automatic identification method of the photographed image of the bill.
The beneficial effects are that: the invention firstly divides the image of the whole delivery receipt area into a plurality of small image blocks so as to more accurately process the image information of each part, then calculates the confidence coefficient of the characters in each image block to dynamically adjust the parameters for enhancing the contrast number, further applies the improved self-adaptive histogram equalization technology for limiting the contrast ratio to enhance the contrast ratio of each image block according to the characteristics of different image blocks, finally obtains the enhanced image, and improves the processing efficiency of the image and the accuracy of character recognition while acquiring the data information of the accurate and complete delivery receipt characters.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. In the drawings, embodiments of the invention are illustrated by way of example and not by way of limitation, and like reference numerals refer to similar or corresponding parts and in which:
fig. 1 is a flowchart of a method for automatically recognizing a document photographed image in steps S1 to S4 according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses an automatic identification method for a document shooting image, which comprises the following steps S1-S4 with reference to FIG. 1. The method comprises the following steps:
In processing a delivery document image to identify text information therein, it is often necessary to pre-process the image to increase the accuracy of the identification. In embodiments of the present invention, limited contrast adaptive histogram equalization (CLAHE) is employed for improving image contrast, highlighting text features. Limiting contrast adaptive histogram equalization (CLAHE) limits the amplification of noise and avoids excessive enhancement by clipping the histogram so that the contrast in each small region is improved while guaranteeing the uniformity of the overall image.
S1: and acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks.
And acquiring a flat delivery bill image, converting the flat delivery bill image into a gray image, and extracting a delivery bill area in the image by semantic segmentation to obtain the delivery bill area image.
S2: and adjusting the contrast limiting parameters of the corresponding image blocks to obtain a first parameter, and further obtaining improved limiting contrast self-adaptive histogram equalization.
In the prior art of limiting contrast adaptive histogram equalization (CLAHE), gray level histogram of an image is directly regulated through fixed contrast limiting parameters, then histogram equalization is carried out, and further contrast of the image is enhanced, so that character information in the image is highlighted, but writing traces (when ink is cut off from a refill) are quite close to gray level values of delivery bill paper, and character information is difficult to highlight in the image effectively.
The text area of the image is formed by writing or printing, but sometimes the condition that the ink of a pen core is broken or the printing is unclear can exist, so that only shallow marks similar to the color of the paper of the delivery bill are left in the delivery bill, and meanwhile, the delivery bill can generate certain folds under poor preservation conditions, so that the identification of text information in the delivery bill is influenced. In order to highlight the text in the image, the text region needs to be initially extracted.
Firstly, traversing all pixel points in each image block, finding out the maximum gray value and the minimum gray value in the image block, and similarly finding out the maximum gray value and the minimum gray value of the image of the delivery receipt area, thereby obtaining the corresponding local contrast and global contrast.
The Sobel operator is used to calculate the gradient of the image, and the gradient intensity of each pixel point in the image is represented by calculating the magnitude of the gradient (i.e., gradient amplitude).
Illustratively, in the firstFor example, the first image block is selectedThe individual pixels are the object of analysis.
According to the firstAnd calculating the confidence coefficient of the pixel point belonging to the text region, namely the first confidence coefficient, according to the information of the pixel point in each image block. The specific calculation formula is as follows:
The first confidence is:
In the method, in the process of the invention, Represent the firstA first confidence level for each pixel point,Represent the firstThe contrast of the individual image blocks is determined,Representing the contrast of the global extent of the delivery document area image,Represent the firstThe maximum value of the gradient amplitude of the pixel points in the image block,Represent the firstAverage value of gradient amplitude of pixel points in each image block,To be with natural constantAs an exponential function of the base number,Represent the firstAnd the second confidence coefficient of each pixel point is used for adjusting the contrast of the corresponding pixel point.
In other embodiments, the feature extraction is performed by using a convolutional neural network, and the sequence recognition is performed by combining the convolutional neural network and the connection time sequence classification, so that the text content and the confidence level thereof can be directly predicted from the image block by training a deep network.
In image processing, outliers generally refer to pixels that are significantly different from surrounding pixels, which may be due to errors in the image acquisition or transmission process. The text area refers to those areas containing important information such as characters, symbols, and the like. Therefore, the distribution condition of the gray values of the pixel points around each pixel point in the image of the delivery receipt area is analyzed to determine the confidence that each pixel point belongs to the outlier noise area, namely the second confidence.
Specifically, in the firstWith a single pixel point as the centerAnd extracting pixel point gray values of the intersecting part for the exploratory area of the corresponding pixel points of the intersecting part of the circular range of the radius and the image of the delivery bill area. In an embodiment of the present invention,=45, And the practitioner can adjust according to the resolution of the image acquisition device and the image. According to the firstDetermining the gray value of the pixel point in the exploration area of each pixel pointAnd a second confidence level of each pixel point. The specific calculation formula is as follows:
In the method, in the process of the invention, Represent the firstA second confidence level for each pixel point,Represent the firstEach pixel corresponds to an average value of pixel gray values in the search area,Represent the firstThe gray value in the corresponding exploration area of each pixel point is smaller thanThe variance of the gray value of the pixel of (c),As a function of the normalization,To be with natural constantIs an exponential function of the base.
The above determination of the firstThe second confidence level of each pixel point is mainly based on gray level information around the pixel, in other embodiments, color values or texture features of the pixel point can be used as data points, an LOF algorithm can be used to detect outlier noise points, or the histogram of the pixel point can be analyzed to determine the outlier noise points.
Represent the firstThe complexity of the edge texture within the image block, the greater the value, the more indicative of the firstThe more likely an image block belongs to a text region, and the more likely an image block belongs to a blank region; Represent the first The difference between the contrast of the individual image blocks and the contrast of the delivery document area image, i.e. the difference between the local contrast and the global contrast, the larger the value is, the more indicative of the firstThe larger the difference value between the maximum gray value/minimum gray value of the pixel points in the image blocks and the maximum gray value/minimum gray value of the whole image of the delivery bill area is, the description of the firstThe more likely an image block will belong to a blank area in the delivery document area image, and conversely the more likely it will belong to a text area in the delivery document area image.
By introduction ofThe classification probability of the pixel points is corrected by the value, so that whether the pixel points belong to an outlier noise area or a text area can be judged more accurately. When (when)The smaller the time that is taken for the device to be,For a pair ofThe greater the degree of expansion of (a)The greater the first confidence of each pixel point, the more likely it is that the pixel point belongs to the text region; when (when)The larger the bigger the firstThe more likely an individual pixel is to belong to an outlier region, which typically belongs to a blank region.
Second, at acquisition of the firstAfter the first confidence coefficient of each pixel point is calculatedDifferences between the average gradient characteristics of each pixel point in all gradient directions and the corresponding surrounding adjacent areas are obtained to obtain a difference set, variance of the difference set is calculated, and the product of the variance and the first confidence is taken as the first confidenceThe text confidence of each pixel point, thereby further highlighting the text information.
It should be noted that, according to specific application scenario and requirement, the first to be analyzed is determinedThe size of the surrounding neighborhood of the individual pixels. The neighborhood may be 33、55 Or other size window.
Specifically, count the thOriginal gradient directions of each pixel point are mapped to obtain a first gradient directionGradient direction after mapping of each pixel point. In the embodiment of the invention, the mapping mode divides the angle corresponding to the original gradient direction by 45 degrees and then rounds upwards, so that the gradient direction is mapped to eight directions from 1 to 8, specifically: (0 DEG, 22.5 DEG) mapping is 1, (22.5 DEG, 67.5 DEG) mapping is 2, (67.5 DEG, 112.5 DEG) mapping is 3, (112.5 DEG, 157.5 DEG) mapping is 4, (157.5 DEG, 202.5 DEG) mapping is 5, (202.5 DEG, 247.5 DEG) mapping is 6, (247.5 DEG, 292.5 DEG) mapping is 7, (292.5 DEG, 337.5 DEG) mapping is 8.
Wherein the mapped gradient direction can highlight the main directional characteristic in the image, which is helpful for distinguishing different image contents, such as characters, lines, etc.
Then the first step isThe specific calculation formula of the text confidence of each pixel point is as follows:
In the method, in the process of the invention, Represent the firstThe text confidence of each pixel point,To adjust the firstThe self-defining parameters of the probability value range of each pixel point in the text region,Represent the firstA first confidence level for each pixel point,The variance of the set of differences is represented,Is a normalization function.
First confidence levelAnd a set of differencesThe larger the variance of the pixel point, the text confidence of the pixel pointThe larger the indication isThe larger the gradient difference of each direction of the pixel points in each image block, the more complex the lines in the image block, the higher the credibility of the corresponding pixel points belonging to the text region, and the firstThe more likely an image block is a text region.
Wherein the difference setThe acquisition process of (a) is as follows:
calculate the first In the image blockThe sum of the gradient magnitudes of all pixels of each gradient direction
Acquisition of the firstIn the image blockNumber of pixels in each gradient direction
Acquisition of the firstAverage gradient amplitude of pixel points in gradient directionAverage gradient magnitude for eight gradient directionsAnd further obtaining a difference value of the two values, wherein the specific calculation formula is as follows:
Is the first In the image blockThe difference between the average gradient amplitude of the pixel points in the gradient direction and the average gradient amplitude in the eight gradient directions, namely the firstThe difference between the average gradient magnitude of the pixels in the individual gradient directions and the average gradient magnitude of the entire image block,Is thatIs used for the average value of (a),Is thatAverage value of (2).
And then obtain the firstThe difference between the average gradient amplitude of each gradient direction of the pixel points in each image block and the average gradient amplitude of the whole image block is obtained to obtain a difference set, and the variance of the difference set is calculated
Wherein,. In the present embodiment of the present invention, in the present embodiment,In other embodiments, the adjustment may be made according to the actual situation.
Finally, setting the initial value of the contrast limiting parameter in the limiting contrast adaptive histogram equalization (CLAHE) asBy the obtained firstCalculating the corresponding text confidence of each pixel pointThe average value of the text confidence coefficient of all pixel points in each image is further adjustedContrast limiting parameters for the individual images. It should be noted that, in the embodiment of the present invention,In other embodiments, parameter adjustment may be performed according to actual situations.
The calculation formula of the first parameter obtained after the adjustment is as follows:
In the method, in the process of the invention, Represent the firstA first parameter of the image block is calculated,Is the firstAn average value of text confidence of all pixels in the image block,To be with natural constantIs an exponential function of the base.
And improved constrained contrast adaptive histogram equalization.
S3: and carrying out equalization treatment on each image block by adopting improved limited contrast self-adaptive histogram equalization to obtain enhanced image blocks, and combining all the enhanced image blocks into a final enhanced image.
Since histogram equalization has a better enhancement effect on gray values with a larger frequency, the enhancement effect on gray values with a smaller frequency is poor and even phagocytized. Therefore, in the image block of the text area, the frequency of the gray value belonging to the white background is larger than that of the gray value corresponding to the text, the contrast between a plurality of gray values corresponding to the white background is possibly improved in the histogram equalization process, the text cannot be emphasized and is unfavorable for the recognition of the text, and therefore, for the text area, a lower contrast limiting parameter is adopted, so that the frequency of the gray value corresponding to the white background is reduced, the excessive enhancement of the white background is avoided, the contrast between the text and the background is larger, and the text enhancement effect is better.
The image block corresponding to the blank area does not contain characters, only contains white background and noise points, the frequency of gray values corresponding to the white background is very high, but also contains some noise, the contrast of the whole image block is enhanced by histogram equalization, so that the noise is enhanced, in order to avoid the excessive enhancement of the noise, higher contrast limiting parameters are given to the image block with lower text confidence of the pixel points, the enhancement of the noise is limited, and the influence of non-text information in an image on text information identification is smaller.
Specifically, with the above-described image segmentation of the delivery receipt area image equally divided into 64 image blocks as a limiting contrast adaptive histogram equalization (CLAHE) and obtaining a gray level histogram for each image block, improved limiting contrast adaptive histogram equalization is applied, wherein the adjusted contrast limiting parameters are usedWill exceed the average column height h in the gray level histogramThe portion of the corresponding column of x h height is cut out and all the cut-out pixel values are equally distributed to all the columns. And merging all the image blocks after the text information is enhanced into a final enhanced image after the text information is enhanced.
S4: and identifying the enhanced image to obtain accurate text information of the delivery bill.
After the enhanced image is acquired, the worker can recognize the text information in the enhanced image through the relevant recognition equipment, so that the worker can process and manage the freight flow more efficiently, errors are reduced, and accuracy is improved.
After the image of the delivery bill area is acquired, the image is divided into a plurality of image blocks, the contrast limiting parameters are adjusted according to the text confidence, when the text confidence is high, the contrast of the corresponding image blocks can be enhanced more strongly, otherwise, the contrast limiting parameters are reduced, and the dynamic contrast adjustment of the image is realized. This means that different parts of the image will be adaptively contrast enhanced according to their characteristics, and excessive enhancement of the image can be avoided. The self-adaptive histogram equalization of the limiting contrast, which is improved in adjustment, is applied to each image block, and finally, an enhanced image is obtained, so that the text information is clearer, and the processing efficiency of the image and the accuracy of text recognition are improved while the accurate and complete data information of the text of the delivery bill is obtained.
The embodiment of the invention also discloses an automatic identification system for the bill shooting image, which comprises a processor and a memory, wherein the memory stores computer program instructions, and the automatic identification method for the bill shooting image is realized when the computer program instructions are executed by the processor.
The system further comprises other components known to those skilled in the art, such as communication buses and communication interfaces, the arrangement and function of which are known in the art and therefore will not be described in detail herein.
In the context of this patent, the foregoing memory may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, the computer-readable storage medium may be any suitable magnetic or magneto-optical storage medium, such as, for example, resistance change Memory RRAM (Resistive Random Access Memory), dynamic Random Access Memory DRAM (Dynamic Random Access Memory), static Random Access Memory SRAM (Static Random-Access Memory), enhanced dynamic Random Access Memory EDRAM (ENHANCED DYNAMIC Random Access Memory), high-Bandwidth Memory HBM (High-Bandwidth Memory), hybrid storage cube HMC (Hybrid Memory Cube), or the like, or any other medium that may be used to store the desired information and that may be accessed by an application, a module, or both. Any such computer storage media may be part of, or accessible by, or connectable to, the device.
In the description of the present specification, the meaning of "a plurality", "a number" or "a plurality" is at least two, for example, two, three or more, etc., unless explicitly defined otherwise.
While various embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Many modifications, changes, and substitutions will now occur to those skilled in the art without departing from the spirit and scope of the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.

Claims (3)

1. An automatic identification method for a document shooting image is characterized by comprising the following steps:
acquiring a delivery receipt area image, and equally dividing the delivery receipt area image into a plurality of image blocks;
adjusting contrast limiting parameters of corresponding image blocks to obtain first parameters, and further obtaining improved limiting contrast self-adaptive histogram equalization;
Each image block is subjected to equalization processing by adopting improved limiting contrast self-adaptive histogram equalization, so that an enhanced image block is obtained, and all the enhanced image blocks are combined into a final enhanced image;
Identifying the enhanced image to obtain accurate text information of the delivery bill;
The first parameter is respectively positively and negatively correlated with an initial value of the contrast limiting parameter and an average value of the text confidence coefficient of all pixel points in each image block; the text confidence is used for representing the credibility of the corresponding pixel point as text;
the first parameter is:
; in the method, in the process of the invention, Represent the firstA first parameter of the image block is calculated,Is the firstThe initial values of the contrast limiting parameters of the individual image blocks,Is the firstAn average value of text confidence of all pixels in the image block,To be with natural constantAn exponential function that is a base;
For each image block, acquiring information of all pixel points in the image block, and calculating a first confidence coefficient of the corresponding pixel point; the information comprises gradient amplitude and contrast of the pixel points; the first confidence coefficient is the confidence coefficient that the pixel point belongs to the text region; calculating differences between the pixel points and average gradient features in surrounding adjacent areas corresponding to the pixel points in all gradient directions to obtain a difference set, calculating variances of the difference set, and taking products of the variances and the first confidence as text confidence of the pixel points;
when calculating the first confidence coefficient of the pixel point, calculating a second confidence coefficient of the pixel point, and correcting the first confidence coefficient based on the second confidence coefficient; the second confidence coefficient is the confidence coefficient that the pixel point belongs to the outlier noise area;
The text confidence is as follows:
; in the method, in the process of the invention, Represent the firstThe text confidence of each pixel point,To adjust the firstThe custom parameters of the text confidence value range of each pixel point,Represent the firstA first confidence level for each pixel point,Representing the variance of the set of differences,Is a normalization function;
The first confidence is:
; in the method, in the process of the invention, Represent the firstA first confidence level for each pixel point,Represent the firstThe contrast of the individual image blocks is determined,Representing the contrast of the global extent of the delivery document area image,Represent the firstThe maximum value of the gradient amplitude of the pixel points in the image block,Represent the firstAverage value of gradient amplitude of pixel points in each image block,To be with natural constantAs an exponential function of the base number,Represent the firstThe second confidence coefficient of each pixel point is used for adjusting the contrast of the corresponding pixel point;
The second confidence is:
; in the method, in the process of the invention, Represent the firstA second confidence level for each pixel point,Represent the firstEach pixel corresponds to an average value of pixel gray values in the search area,Represent the firstThe gray value in the corresponding exploration area of each pixel point is smaller thanThe variance of the gray value of the pixel of (c),As a function of the normalization,To be with natural constantAn exponential function of the base, the search area being based onAnd the circular range with the set radius and the region where the image of the delivery document region intersect are formed by taking each pixel point as the circle center.
2. The automatic document shooting image recognition method according to claim 1, wherein the equalizing processing of each image block comprises:
calculating a gray histogram of each image block;
Taking the first parameter as a multiple of the average column height of columns corresponding to all gray levels in the corresponding gray level histogram, and cutting the gray level histogram of each image block; the parts exceeding the multiple of the average column height are cut off, and the cut parts are added to all columns on average;
and carrying out equalization treatment on the cut gray level histogram.
3. An automatic document shooting image recognition system, comprising: a processor and a memory storing computer program instructions which, when executed by the processor, implement the document capture image automatic identification method of any one of claims 1-2.
CN202410758895.9A 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image Active CN118334674B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410758895.9A CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410758895.9A CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Publications (2)

Publication Number Publication Date
CN118334674A CN118334674A (en) 2024-07-12
CN118334674B true CN118334674B (en) 2024-08-13

Family

ID=91779299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410758895.9A Active CN118334674B (en) 2024-06-13 2024-06-13 Automatic identification method and system for document shooting image

Country Status (1)

Country Link
CN (1) CN118334674B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819772A (en) * 2021-01-28 2021-05-18 南京挥戈智能科技有限公司 High-precision rapid pattern detection and identification method
CN115082672A (en) * 2022-06-06 2022-09-20 西安电子科技大学 Infrared image target detection method based on bounding box regression

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9171224B2 (en) * 2013-07-04 2015-10-27 Qualcomm Incorporated Method of improving contrast for text extraction and recognition applications
US20230394670A1 (en) * 2020-10-20 2023-12-07 The Johns Hopkins University Anatomically-informed deep learning on contrast-enhanced cardiac mri for scar segmentation and clinical feature extraction

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819772A (en) * 2021-01-28 2021-05-18 南京挥戈智能科技有限公司 High-precision rapid pattern detection and identification method
CN115082672A (en) * 2022-06-06 2022-09-20 西安电子科技大学 Infrared image target detection method based on bounding box regression

Also Published As

Publication number Publication date
CN118334674A (en) 2024-07-12

Similar Documents

Publication Publication Date Title
CN111275129B (en) Image data augmentation policy selection method and system
TWI774659B (en) Image text recognition method and device
CN109409374B (en) A joint-based method for cutting the answer area of the same batch of test papers
JP5616308B2 (en) Document modification detection method by character comparison using character shape feature
CN110348264B (en) QR two-dimensional code image correction method and system
CN112183038A (en) Form identification and typing method, computer equipment and computer readable storage medium
US20120294528A1 (en) Method of Detecting and Correcting Digital Images of Books in the Book Spine Area
CN112926564B (en) Picture analysis method, system, computer device and computer readable storage medium
WO2019056346A1 (en) Method and device for correcting tilted text image using expansion method
CN105374015A (en) Binary method for low-quality document image based on local contract and estimation of stroke width
CN111178290A (en) Signature verification method and device
CN114926839A (en) Image identification method based on RPA and AI and electronic equipment
CN107016363A (en) Bill images managing device, bill images management system and method
CN109447080B (en) Character recognition method and device
US20200293811A1 (en) Method and device for obtaining image of form sheet
CN112419207A (en) Image correction method, device and system
CN117830315B (en) Real-time monitoring method and system for printing machine based on image processing
CN111814673A (en) Method, device and equipment for correcting text detection bounding box and storage medium
CN112183325B (en) Road vehicle detection method based on image comparison
CN112991536A (en) Automatic extraction and vectorization method for geographic surface elements of thematic map
US7961941B2 (en) Color form dropout using dynamic geometric solid thresholding
CN118334674B (en) Automatic identification method and system for document shooting image
JP7377661B2 (en) Image semantic region segmentation device, region detection sensitivity improvement method, and program
CN112183531A (en) Method, device, medium and electronic equipment for determining character positioning frame
CN114596564A (en) Recognition method and system for layered characters of optical delivery box

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant