
CN113095307A - Automatic identification method for financial voucher information - Google Patents


Info

Publication number
CN113095307A
Authority
CN
China
Prior art keywords
information
sub
value
image
confidence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110640739.9A
Other languages
Chinese (zh)
Other versions
CN113095307B (en)
Inventor
郑瑛
孙泉辉
程嵩
高翔
郭端宏
潘军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Zhejiang Electric Power Co Ltd, Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202110640739.9A
Publication of CN113095307A
Application granted
Publication of CN113095307B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an automatic identification method for financial voucher information. An image analysis module extracts image feature information from the image information of a paper financial voucher, then analyzes the image feature information to obtain data information, and the data information is stored electronically. The invention stores not only images of financial vouchers but also their data in digital form, which saves labor cost and reduces manual recording errors, while recognition of the verification information further improves recognition accuracy.

Figure 202110640739

Description

Automatic identification method for financial voucher information
Technical Field
The invention relates to the technical field of image recognition and storage, in particular to an automatic recognition method for financial voucher information.
Background
In standard enterprise cost accounting, financial settlement is unavoidable. It generally includes settlement between internal units, settlement with suppliers, settlement with banks and the like, and financial vouchers are commonly used in the settlement process. An existing paper financial voucher generally contains a summary, an account subject, an amount, a voucher preparation date, a debit or credit direction and the like. However, paper financial vouchers are difficult to preserve and troublesome to check and audit, so they need to be stored electronically. The existing digitization process merely captures and records an image of the paper financial voucher; it cannot automatically obtain the voucher's category information, amount information or anti-counterfeiting information. This information has to be identified manually, which is inefficient and error-prone.
Chinese patent publication No. CN112598062A, published on April 2, 2021 and entitled "A method and device for identifying image", discloses an image recognition method and device. The method comprises: acquiring an image to be recognized; randomly cropping an image block of a preset size from the image to be recognized; inputting the image block into a trained neural network classification model to obtain a classification result for the image block; determining a classification confidence from the classification result; determining from the classification confidence whether the current classification result is to be used as the final image recognition result; and, when it cannot be, iteratively obtaining a next image block according to the feature map and a positioning strategy network and obtaining a next classification confidence from that block, until the current classification result is determined to be the final image recognition result. The disadvantage of that invention is that it only discloses a general image recognition method, which cannot be applied to image recognition of paper financial vouchers and cannot reliably recognize information such as the amount on a paper financial voucher.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide an automatic identification method of financial voucher information.
The purpose of the invention is realized by the following technical scheme:
an image analysis module extracts image characteristic information of image information of a paper financial voucher, then the image analysis module analyzes the image characteristic information to obtain data information, and the data information is electronically stored;
the image characteristic information comprises: the category information of the paper financial voucher, the amount information of the paper financial voucher and the verification information of the paper financial voucher,
in the identification of the category information of the paper financial voucher, the image analysis module determines the category of the paper financial voucher according to different category information corresponding to different types of paper financial vouchers;
in the identification of the amount information of the paper financial voucher, the image analysis module completes the extraction of image characteristic information by the following steps, as shown in fig. 1:
step 1, the image analysis module obtains the category of the paper financial voucher and determines, according to that category, the range and the type of amount information to be extracted, wherein the type of amount information to be extracted is Arabic numeral information;
step 2, performing feature extraction on the amount information within the amount information extraction range and then converting the image information of the amount information into numerical value information, specifically: determining the image area of each numerical value in the amount information, the numerical value in an image area being a numerical value to be recognized; recognizing each image area through a convolutional neural network, and comparing the numerical value to be recognized in the image area with the numerical value information pre-trained by the convolutional neural network to determine a confidence; if the confidence between the numerical value to be recognized and one piece of numerical value information is greater than or equal to a set threshold while its confidence with all other numerical value information is less than the set threshold, determining the numerical value to be recognized to be that numerical value information; and if the confidence between the numerical value to be recognized and every piece of numerical value information is less than the set threshold, performing an error correction analysis step so that specific numerical value information is determined for the numerical value to be recognized;
step 3, once all the numerical values to be recognized have been recognized, the extraction of the image characteristic information of the amount information is complete;
in the identification of the verification information of the paper financial voucher, the image analysis module restores the verification information according to a preset mapping rule to verify the authenticity.
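The threshold rule of step 2 can be illustrated with a minimal Python sketch. The function name `decide_digit`, the dictionary of per-digit confidences and the sample scores are hypothetical and not taken from the patent; the text also does not say what happens at this stage when several digits reach the threshold, so the sketch assumes that case is handed to error correction as well.

```python
from typing import Dict, Optional


def decide_digit(confidences: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the digit that alone reaches the threshold, else None.

    None means the value must go through the error correction analysis step.
    """
    reached = [digit for digit, conf in confidences.items() if conf >= threshold]
    if len(reached) == 1:
        return reached[0]
    # All confidences below the threshold (case described in the text), or
    # several at or above it (assumption: also treated as undecided).
    return None


# Hypothetical confidences produced by a classifier for one image area.
scores = {"3": 0.91, "8": 0.42, "5": 0.10}
accepted = decide_digit(scores, threshold=0.8)
undecided = decide_digit({"3": 0.50, "8": 0.40}, threshold=0.8)
```

Here `accepted` is the single digit that clears the threshold, while `undecided` signals that error correction analysis is needed.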
The error correction analysis step specifically comprises:
the image area of the numerical value to be recognized is decomposed into a plurality of sub-image areas, and the numerical value information pre-trained by the convolutional neural network is correspondingly decomposed into a plurality of pieces of sub-numerical value information; each sub-image area is recognized by the convolutional neural network, and the sub-numerical value to be recognized in the sub-image area is compared with the pre-trained sub-numerical value information to determine a sub-confidence; if the confidence between the sub-numerical value to be recognized in a sub-image area and one piece of sub-numerical value information is greater than or equal to a set sub-confidence threshold while its confidence with the other sub-numerical value information is less than the set sub-confidence threshold, the numerical value information corresponding to that sub-numerical value information is determined as the numerical value information of the numerical value to be recognized.
In this scheme, the paper financial voucher is divided into three kinds of areas to be recognized. During image characteristic information extraction, the category information of the paper financial voucher is extracted first. The category information is generally easy to recognize and can be a two-dimensional code or a bar code; for a larger number of categories, colors can be used in combination. The category information is therefore read with high accuracy and errors are avoided. Once the category is determined, the preset amount information extraction range corresponding to that category is retrieved, so that image characteristic information of the amount information can be extracted quickly. In this scheme the amount information is recognized by the convolutional neural network, and the image information is converted into specific amount data for storage. In addition, the scheme provides verification information, which is generally used for anti-counterfeiting identification of the paper financial voucher and can be an anti-counterfeiting code or an anti-counterfeiting watermark; besides establishing the authenticity of the paper financial voucher, it can also be used to check whether the category information was recognized correctly. In summary, the scheme stores not only images of financial vouchers but also their data in digital form, which saves labor cost and reduces manual recording errors, while recognition of the verification information further improves recognition accuracy.
The amount information on a paper financial voucher may be filled in by hand or printed. With handwriting, the size and shape of the numerals vary from writer to writer; with printing, the print density may vary. Either can cause the convolutional neural network to extract the image characteristic information of the amount information incorrectly. Therefore, this scheme provides an error correction analysis step for the extraction of the image characteristic information of the amount information: the image area of the numerical value to be recognized is decomposed into a plurality of sub-image areas, and if any sub-image area matches the corresponding sub-numerical value information, the numerical value information corresponding to that sub-numerical value information can be determined as the numerical value information of the numerical value to be recognized. This design improves the capability of extracting the image characteristic information of the amount information.
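The lookup of a preset amount extraction range per voucher category might be sketched as follows. The category names, the pixel coordinates and the names `Region`, `AMOUNT_REGIONS` and `amount_region_for` are all illustrative assumptions; the patent does not specify how the ranges are stored.

```python
from typing import Dict, NamedTuple


class Region(NamedTuple):
    """Rectangle in pixel coordinates of the scanned voucher image."""
    left: int
    top: int
    width: int
    height: int


# Hypothetical preset ranges: one amount region per voucher category.
AMOUNT_REGIONS: Dict[str, Region] = {
    "invoice": Region(left=400, top=120, width=220, height=40),
    "bank_receipt": Region(left=350, top=300, width=260, height=36),
}


def amount_region_for(category: str) -> Region:
    """Return the preset amount extraction range for a recognized category."""
    try:
        return AMOUNT_REGIONS[category]
    except KeyError:
        raise ValueError(f"no amount region configured for category {category!r}")


region = amount_region_for("invoice")
```

Keeping the ranges in a table keyed by category means that adding a new voucher type only requires a new entry, not a change to the recognition code.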
Preferably, the error correction analyzing step further includes:
if the confidence between the sub-numerical value to be recognized in a sub-image area and at least two pieces of sub-numerical value information is greater than or equal to the set sub-confidence threshold, the numerical value information corresponding to those pieces of sub-numerical value information is taken as candidate numerical value information, and confidence analysis is performed on the other sub-image areas of the numerical value to be recognized; if, in any other sub-image area, the confidence between the sub-numerical value to be recognized and one piece of sub-numerical value information is greater than or equal to the set sub-confidence threshold while the confidence with the other sub-numerical value information in that sub-image area is less than the set sub-confidence threshold, the numerical value information corresponding to that sub-numerical value information is the numerical value information of the numerical value to be recognized, wherein the confidence degrees of the sub-numerical.
When the image area is decomposed into a plurality of sub-image areas, a given sub-image area may reach the set sub-confidence threshold for at least two pieces of sub-numerical value information. For example, when the lower half of the Arabic numeral 6 or 8 is the sub-image area, the confidences for both 6 and 8 as sub-numerical value information exceed the set sub-confidence threshold. In that case the confidences of the other sub-image areas of the numerical value to be recognized are analyzed; since the other sub-image areas of 6 and 8 differ considerably, they distinguish the two. This design further improves the capability of extracting the image characteristic information of the amount information.
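The sub-region rule, including the 6 versus 8 case, can be sketched in Python as below. This is an interpretation under stated assumptions rather than the patented procedure: per-sub-region confidences are assumed to be already available as dictionaries, the name `correct_with_subregions` is hypothetical, and restricting the decision to the candidates of ambiguous sub-regions is an assumption about the intent of the text.

```python
from typing import Dict, List, Optional, Set

SubScores = Dict[str, float]  # digit -> sub-confidence for one sub-image area


def correct_with_subregions(sub_scores: List[SubScores],
                            sub_threshold: float) -> Optional[str]:
    """Decide a digit from sub-image confidences, or return None."""
    candidates: Optional[Set[str]] = None  # digits allowed by ambiguous areas
    unique_hits: List[str] = []            # digits matched unambiguously
    for scores in sub_scores:
        hits = [d for d, c in scores.items() if c >= sub_threshold]
        if len(hits) == 1:
            unique_hits.append(hits[0])
        elif len(hits) >= 2:
            # Ambiguous area: its matches become candidate values.
            candidates = set(hits) if candidates is None else candidates & set(hits)
    for digit in unique_hits:
        if candidates is None or digit in candidates:
            return digit
    return None


# Hypothetical scores: the lower half fits both 6 and 8, the upper half only 6.
lower_half = {"6": 0.90, "8": 0.88, "0": 0.40}
upper_half = {"6": 0.85, "8": 0.30, "0": 0.20}
digit = correct_with_subregions([lower_half, upper_half], sub_threshold=0.8)
```

In this toy input the ambiguous lower half narrows the candidates to 6 and 8, and the upper half, which matches only 6, settles the result.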
Preferably, the plurality of sub-image areas are equal in size and divide the image area into equal horizontal parts or equal vertical parts; the error correction analysis step specifically comprises the following substeps:
substep a, for a single sub-image area, searching each piece of numerical value information for the first region that has the same size as the sub-image area and the highest confidence, recording the first region and its confidence, and determining the highest confidence among the confidences corresponding to all the numerical value information, the numerical value information corresponding to the highest confidence being the candidate numerical value information;
substep b, selecting a sub-image area adjacent to the single sub-image area, searching the candidate numerical value information for the second region that has the same size as the adjacent sub-image area and the highest confidence, and recording the second region;
substep c, comparing the first region with the second region; if the first region and the second region are adjacent, or the distance between them is less than a set error value, or the area of their overlapping region is less than a set area value, judging that the candidate numerical value information is the final numerical value information and determining the numerical value information of the numerical value to be recognized as the recognized value corresponding to the final numerical value information; otherwise, jumping to substep d;
substep d, deleting the numerical value information corresponding to the highest confidence and repeating substeps a to c until the recognized value corresponding to the numerical value to be recognized is determined.
Handwritten amount information varies greatly from person to person. If, during extraction of the image characteristic information of the amount information, the image area of the numerical value to be recognized were decomposed into a plurality of sub-image areas and the numerical value information pre-trained by the convolutional neural network were correspondingly decomposed into a plurality of pieces of sub-numerical value information, determining confidence between sub-image areas and sub-numerical value information might fail to recognize handwritten information. Therefore, in this preferred scheme the pre-trained numerical value information is not simply decomposed into pieces of sub-numerical value information; instead, each sub-image area is matched by traversal against the pre-trained numerical value information to obtain the best matching result, which first ensures the best possible match for the handwritten amount information. A sub-image area adjacent to that sub-image area is then matched against the pre-trained numerical value information in the same way. The first region and the second region may fail to be exactly adjacent, or may overlap, but as long as the distance between them is less than a set error value or the area of their overlapping region is less than a set area value, the matching is considered successful.
Preferably, the substep d is executed at most twice, and if the final numerical value information is not determined after the substep d is executed twice, it is determined that the identification of the numerical value to be identified fails.
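Substeps a to d, with the limit of two repetitions, can be sketched as a small loop. The sketch makes several simplifying assumptions that are not in the patent: regions are reduced to one-dimensional intervals along the split axis, so overlap is measured as a length rather than an area; the search for the best-matching region is hidden behind a hypothetical `locate` callback; and `TABLE` holds made-up match results.

```python
from typing import Callable, Iterable, Optional, Tuple

# (sub-image index, digit) -> (start of best-matching template region, confidence)
Locator = Callable[[int, str], Tuple[int, float]]


def regions_consistent(first: int, second: int, size: int,
                       max_gap: int, max_overlap: int) -> bool:
    """Substep c in 1-D: adjacent, or a small gap, or a small overlap."""
    gap = abs(second - first) - size  # 0: adjacent, >0: gap, <0: overlap
    if gap == 0:
        return True
    if gap > 0:
        return gap < max_gap
    return -gap < max_overlap


def recognise(locate: Locator, digits: Iterable[str], size: int,
              max_gap: int = 2, max_overlap: int = 2,
              max_retries: int = 2) -> Optional[str]:
    remaining = list(digits)
    for _ in range(max_retries + 1):  # first pass plus at most two repeats
        if not remaining:
            break
        # Substep a: candidate with the highest confidence for sub-image 0.
        candidate = max(remaining, key=lambda d: locate(0, d)[1])
        first, _conf = locate(0, candidate)
        # Substep b: best region for the adjacent sub-image in that candidate.
        second, _conf = locate(1, candidate)
        # Substep c: accept if the two regions fit together.
        if regions_consistent(first, second, size, max_gap, max_overlap):
            return candidate
        # Substep d: drop the candidate and try again.
        remaining.remove(candidate)
    return None  # recognition failed


# Made-up match results for two sub-images of height 10.
TABLE = {(0, "8"): (0, 0.95), (1, "8"): (25, 0.60),
         (0, "6"): (0, 0.90), (1, "6"): (10, 0.85)}


def toy_locate(index: int, digit: str) -> Tuple[int, float]:
    return TABLE.get((index, digit), (0, 0.0))


result = recognise(toy_locate, "68", size=10)
```

In the toy data the first candidate, 8, is rejected because its two regions lie far apart, and the second pass accepts 6, whose regions are exactly adjacent.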
Preferably, the amount information further includes Chinese capital numeral information, which is recognized through OCR. After recognition is completed, the numerical value corresponding to the Chinese capital numeral information is compared with the numerical value corresponding to the Arabic numeral information; if the two values are equal, the Arabic numeral information and the Chinese capital numeral information are determined to have been recognized successfully, and if they are not equal, a correction step is executed to ensure that the amount information is recognized successfully.
The amount information on a paper financial voucher is sometimes written by hand, in which case the recognized Chinese capital numerals and Arabic numerals may be inconsistent. The scheme therefore uses the fact that a financial voucher contains both Arabic numeral information and Chinese capital numeral information to judge whether the convolutional neural network's feature recognition is wrong: the two are matched against each other, and inconsistent values can be corrected into consistent ones.
Preferably, the correction step specifically comprises: searching the Chinese capital numeral information for possible matching results whose confidence is greater than a set Chinese capital numeral threshold, searching the Arabic numeral information for possible matching results whose confidence is greater than a set Arabic numeral threshold, and, if corresponding numerical values among the possible matching results are equal, correcting the unequal values to the equal value.
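The cross-check between Chinese capital numerals and Arabic numerals can be sketched as follows. The sketch assumes whole-yuan integer amounts, that the OCR and the convolutional neural network each return a list of (text, confidence) readings, and that the highest-confidence agreeing Arabic reading wins; the names `chinese_capital_to_int` and `reconcile` and the sample readings are hypothetical.

```python
from typing import List, Optional, Tuple

DIGITS = {ch: i for i, ch in enumerate("零壹贰叁肆伍陆柒捌玖")}
UNITS = {"拾": 10, "佰": 100, "仟": 1000}
SECTIONS = {"万": 10**4, "亿": 10**8}


def chinese_capital_to_int(text: str) -> int:
    """Convert a whole-yuan Chinese capital numeral string to an integer."""
    total = section = digit = 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in UNITS:
            section += (digit or 1) * UNITS[ch]
            digit = 0
        elif ch in SECTIONS:
            total += (section + digit) * SECTIONS[ch]
            section = digit = 0
        elif ch in "元圆整":
            continue  # currency and "exact" markers carry no value
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return total + section + digit


Reading = Tuple[str, float]  # (recognized text, confidence)


def reconcile(arabic: List[Reading], chinese: List[Reading],
              arabic_threshold: float, chinese_threshold: float) -> Optional[int]:
    """Return a value on which both sets of plausible readings agree, or None."""
    chinese_values = {chinese_capital_to_int(t)
                      for t, c in chinese if c > chinese_threshold}
    plausible = [(t, c) for t, c in arabic if c > arabic_threshold]
    for text, _conf in sorted(plausible, key=lambda r: r[1], reverse=True):
        if int(text) in chinese_values:
            return int(text)
    return None


value = chinese_capital_to_int("壹仟贰佰叁拾肆元整")
agreed = reconcile(arabic=[("1284", 0.70), ("1234", 0.60)],
                   chinese=[("壹仟贰佰叁拾肆", 0.90)],
                   arabic_threshold=0.5, chinese_threshold=0.5)
```

In this example the top Arabic reading disagrees with the Chinese capital numerals, so the second reading, which matches, is taken as the corrected amount.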
Preferably, the automatic identification method for financial voucher information further comprises a method for optimizing the convolutional neural network. In the initial convolutional neural network each numerical value corresponds to an initial image; after the convolutional neural network has been trained a number of times, each numerical value corresponds to a plurality of images. For any numerical value, all of its images are compared for similarity with the initial image, and the images whose similarity is less than or equal to a set threshold are retained as candidate images. The candidate images are then compared for similarity in pairs: if the similarity between two candidate images is less than a set value, the one with the lower similarity to the initial image is retained and the one with the higher similarity is discarded; if the similarity between two candidate images is greater than or equal to the set value, both are retained. All retained candidate images, together with the initial image, are kept as the images corresponding to that numerical value.
This design optimizes the training of the convolutional neural network by keeping, as far as possible, the different images that may appear for the same numerical value; these images differ from one another but can all be regarded as images of that value. The design improves the recognition efficiency of the convolutional neural network and further ensures the recognition accuracy for the amount information of the paper financial voucher.
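The image selection rule can be sketched literally as described in the translated text. Everything concrete here is an assumption for illustration: images are short tuples of pixel intensities, `similarity` is a toy measure (one minus the mean absolute difference), ties keep the earlier candidate, and the names `prune_templates`, `init_threshold` and `pair_value` are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Set

Image = Sequence[float]


def similarity(a: Image, b: Image) -> float:
    """Toy similarity in [0, 1] for intensities in [0, 1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def prune_templates(initial: Image, images: List[Image],
                    init_threshold: float, pair_value: float,
                    sim: Callable[[Image, Image], float] = similarity) -> List[Image]:
    """Select the images kept for one numerical value."""
    # Keep as candidates the images that differ enough from the initial image.
    candidates = [img for img in images if sim(img, initial) <= init_threshold]
    discarded: Set[int] = set()
    for i, j in combinations(range(len(candidates)), 2):
        if i in discarded or j in discarded:
            continue
        if sim(candidates[i], candidates[j]) < pair_value:
            # Of the pair, discard the one more similar to the initial image.
            si = sim(candidates[i], initial)
            sj = sim(candidates[j], initial)
            discarded.add(i if si > sj else j)
    kept = [c for k, c in enumerate(candidates) if k not in discarded]
    return [initial] + kept


initial = (1.0, 1.0, 0.0, 0.0)
near_copy = (1.0, 1.0, 0.0, 0.1)   # too similar to the initial image
variant_b = (1.0, 0.0, 0.0, 0.0)
variant_c = (0.0, 0.0, 1.0, 1.0)
kept = prune_templates(initial, [near_copy, variant_b, variant_c],
                       init_threshold=0.9, pair_value=0.5)
```

With these toy images the near copy is never a candidate, and of the two remaining variants only the one less similar to the initial image survives the pairwise comparison.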
The invention has the beneficial effects that:
1. the scheme stores not only images of financial vouchers but also their data in digital form, which saves labor cost and reduces manual recording errors, while recognition of the verification information further improves recognition accuracy;
2. an error correction analysis step is provided for the extraction of the image characteristic information of the amount information, improving the capability of that extraction;
3. the convolutional neural network is optimally trained, further ensuring the recognition accuracy for the amount information of the paper financial voucher.
Drawings
FIG. 1 is a flow chart of the identification of the amount information of the paper financial voucher of the present invention;
FIG. 2 is a schematic view of the paper financial voucher of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples.
Example 1:
an image analysis module extracts image characteristic information of image information of a paper financial voucher, then the image analysis module analyzes the image characteristic information to obtain data information, and the data information is electronically stored;
the image characteristic information comprises: the category information of the paper financial voucher, the amount information of the paper financial voucher and the verification information of the paper financial voucher,
in the identification of the category information of the paper financial voucher, the image analysis module determines the category of the paper financial voucher according to different category information corresponding to different types of paper financial vouchers;
in the identification of the amount information of the paper financial voucher, the image analysis module completes the extraction of the image characteristic information by the following steps, as shown in fig. 1:
step 1, the image analysis module obtains the category of the paper financial voucher and determines, according to that category, the range and the type of amount information to be extracted, wherein the type of amount information to be extracted is Arabic numeral information;
step 2, performing feature extraction on the amount information within the amount information extraction range and then converting the image information of the amount information into numerical value information, specifically: determining the image area of each numerical value in the amount information, the numerical value in an image area being a numerical value to be recognized; recognizing each image area through a convolutional neural network, and comparing the numerical value to be recognized in the image area with the numerical value information pre-trained by the convolutional neural network to determine a confidence; if the confidence between the numerical value to be recognized and one piece of numerical value information is greater than or equal to a set threshold while its confidence with all other numerical value information is less than the set threshold, determining the numerical value to be recognized to be that numerical value information; and if the confidence between the numerical value to be recognized and every piece of numerical value information is less than the set threshold, performing an error correction analysis step so that specific numerical value information is determined for the numerical value to be recognized;
step 3, once all the numerical values to be recognized have been recognized, the extraction of the image characteristic information of the amount information is complete;
in the identification of the verification information of the paper financial voucher, the image analysis module restores the verification information according to a preset mapping rule to verify the authenticity.
The error correction analysis step specifically comprises:
the image area of the numerical value to be recognized is decomposed into a plurality of sub-image areas, meanwhile, the numerical value information pre-trained by the convolutional neural network is also correspondingly decomposed into a plurality of pieces of sub-numerical value information, for any sub-image area, the convolutional neural network is used for recognizing, the sub-numerical value to be recognized in the sub-image area is compared with the sub-numerical value information pre-trained by the convolutional neural network to determine the sub-confidence degree, if the confidence degree of the sub-numerical value to be recognized and certain sub-numerical value information in the sub-image area is larger than or equal to a set sub-confidence degree threshold value, and meanwhile, the confidence degree of the sub-numerical value to be recognized and other sub-numerical value information in the sub-image area is smaller than the set sub-confidence degree threshold value, the numerical value information corresponding to the sub.
According to the scheme, the paper financial voucher is divided into three types of areas to be identified, in the image characteristic information extraction process, the type information of the paper financial voucher is extracted firstly, the type information is generally easily identified and can be a two-dimensional code or a bar code, more types of information can be combined with colors, the type information identification reading accuracy of the paper financial voucher is high, and mistakes are avoided. After the category information of the paper financial voucher is determined, according to the category of the paper financial voucher, the preset range of money amount information extraction corresponding to each category is called, so that image characteristic information extraction can be rapidly performed on the money amount information, the money amount information is identified by the convolutional neural network in the scheme, and the image information is converted into specific money amount data to be stored. More, this scheme has still designed verification information, and verification information is generally used for the anti-fake discernment to paper financial document, and verification information can be anti-fake code or anti-fake watermark, can also verify whether the discernment is correct to the discernment of the classification information of paper financial document when guaranteeing paper financial document true and false. In conclusion, the scheme not only stores images for the financial voucher storage records, but also stores the data of the financial vouchers in a datamation mode, so that labor cost is saved, errors of manual records are reduced, and the identification accuracy is further improved for identification of verification information.
The amount information on a paper financial voucher may be filled in by hand or printed. Handwriting differs from person to person, so the size and shape of handwritten digits are not uniform, and printed data may vary in print density; either can cause the convolutional neural network to extract the image feature information of the amount incorrectly. The scheme therefore adds an error correction analysis step to the extraction of the amount's image feature information: the image area of the value to be identified is decomposed into several sub-image areas, and if any sub-image area matches exactly one piece of the corresponding sub-value information, the value information to which that sub-value information belongs is taken as the value information of the value to be identified. This design improves the ability to extract the image feature information of the amount.
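The decision rule of the error correction analysis step can be sketched as follows. This is a minimal sketch that assumes the per-sub-region confidences have already been produced by the network and are passed in as dictionaries; the function names and the example numbers are hypothetical.

```python
def unique_match(confidences, threshold):
    """Return the only label whose confidence is >= threshold, else None."""
    above = [label for label, conf in confidences.items() if conf >= threshold]
    return above[0] if len(above) == 1 else None


def correct_by_subregions(subregion_confidences, sub_threshold):
    """Scan the sub-regions in order; the first unambiguous one decides the digit.

    subregion_confidences: one dict per sub-image area, mapping a digit label
    to the confidence against that digit's corresponding sub-value information.
    """
    for confidences in subregion_confidences:
        digit = unique_match(confidences, sub_threshold)
        if digit is not None:
            return digit
    return None  # no sub-region gave a unique match


# Upper half is inconclusive, lower half matches "3" only.
example = [{"3": 0.55, "8": 0.58}, {"3": 0.91, "8": 0.40}]
result = correct_by_subregions(example, sub_threshold=0.8)
```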
The error correction analyzing step further includes:
If the confidences between the sub-value to be identified in a sub-image region and at least two pieces of sub-value information are greater than or equal to the set sub-confidence threshold, the value information corresponding to those pieces of sub-value information is taken as candidate value information, and confidence analysis is performed on the other sub-image regions of the value to be identified. If, in any other sub-image region, the confidence between the sub-value to be identified and one piece of sub-value information is greater than or equal to the set sub-confidence threshold, while its confidence with every other piece of sub-value information is below the threshold, and the value information corresponding to that sub-value information is the same as one of the candidate value information, then that value information is the value information of the value to be identified.
When the image area is decomposed into several sub-image regions, a given sub-image region may reach the set sub-confidence threshold for at least two pieces of sub-value information. For example, when the lower half of the Arabic numeral 6 or 8 is used as the sub-image region, the confidences for both 6 and 8 as sub-value information exceed the set sub-confidence threshold. In that case the other sub-image regions of the value to be identified are analyzed; because the other sub-image regions of 6 and 8 differ greatly in confidence, the ambiguity can be resolved. This design further improves the ability to extract the image feature information of the amount information.
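The 6-versus-8 case can be sketched as below. As before, this assumes the sub-region confidences are already available as dictionaries; the function name and the numbers are hypothetical, chosen only to mirror the example in the text.

```python
def resolve_ambiguous(subregion_confidences, sub_threshold):
    """Resolve a sub-region that matches two or more digits.

    The ambiguous sub-region supplies the candidate digits; another
    sub-region that matches exactly one digit, which must be one of the
    candidates, decides the result.
    """
    ambiguous_index, candidates = None, set()
    for i, conf in enumerate(subregion_confidences):
        above = {label for label, c in conf.items() if c >= sub_threshold}
        if len(above) >= 2:
            ambiguous_index, candidates = i, above
            break
    if ambiguous_index is None:
        return None  # nothing ambiguous to resolve

    for j, conf in enumerate(subregion_confidences):
        if j == ambiguous_index:
            continue
        above = [label for label, c in conf.items() if c >= sub_threshold]
        if len(above) == 1 and above[0] in candidates:
            return above[0]
    return None


# Index 0 is the upper half, index 1 the lower half of the digit image.
halves = [{"6": 0.93, "8": 0.30}, {"6": 0.90, "8": 0.88}]
digit = resolve_ambiguous(halves, sub_threshold=0.8)
```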
The amount information also includes Chinese capital (financial) numeral information, which is recognized by OCR. After recognition, the value corresponding to the Chinese capital numeral information is compared with the value corresponding to the Arabic numeral information. If the two values are equal, both the Arabic numeral information and the Chinese capital numeral information are judged to have been recognized successfully; if they are not equal, a correction step is executed to ensure that the amount information is recognized successfully.
The amount information on a paper financial voucher is sometimes written by hand, in which case the recognized Chinese capital numerals and Arabic numerals may disagree. The scheme therefore uses the fact that the voucher contains both the Arabic numeral information and the Chinese capital numeral information to judge whether the convolutional neural network's feature recognition is wrong: the two are matched against each other, and inconsistent values can be corrected to a consistent value.
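Comparing the two representations requires converting both to the same numeric form. The sketch below is one way to do that on the already recognized strings; it is not the patent's implementation. It assumes formal financial notation (for example 壹拾 rather than a bare 拾), amounts in which 万 does not precede 亿, and it works in integer fen to avoid floating-point rounding. All function names are hypothetical.

```python
from decimal import Decimal

DIGITS = {ch: i for i, ch in enumerate("零壹贰叁肆伍陆柒捌玖")}
SMALL_UNITS = {"拾": 10, "佰": 100, "仟": 1000}
BIG_UNITS = {"万": 10**4, "亿": 10**8}


def chinese_amount_to_fen(text):
    """Convert a Chinese capital amount such as 壹佰元伍角 to integer fen."""
    yuan = section = digit = fen = 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            section += digit * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            yuan += (section + digit) * BIG_UNITS[ch]
            section = digit = 0
        elif ch in "元圆":
            yuan += section + digit
            section = digit = 0
        elif ch == "角":
            fen += digit * 10
            digit = 0
        elif ch == "分":
            fen += digit
            digit = 0
        elif ch in "整正":
            continue  # closing marker, carries no value
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return yuan * 100 + fen


def arabic_amount_to_fen(text):
    """Convert an Arabic amount string such as '1,234.56' to integer fen."""
    return int(Decimal(text.replace(",", "")) * 100)


def amounts_agree(chinese_text, arabic_text):
    """True when both recognized amounts denote the same value."""
    return chinese_amount_to_fen(chinese_text) == arabic_amount_to_fen(arabic_text)
```

For instance, `amounts_agree("壹仟贰佰叁拾肆元伍角陆分", "1234.56")` holds, while a misread digit on either side makes the comparison fail and would trigger the correction step.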
The correction step is as follows: search the Chinese capital numeral information for possible matching results whose confidence is greater than the set Chinese capital numeral threshold, and search the Arabic numeral information for possible matching results whose confidence is greater than the set Arabic numeral threshold; if a value among the possible matching results of one equals a value among those of the other, the unequal values are corrected to that equal value.
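A minimal sketch of the correction step, assuming each recognizer returns a list of (value, confidence) alternatives. The text does not say what to do when more than one value appears in both lists; picking the one with the highest summed confidence is an assumption made here, and the function name is hypothetical.

```python
def reconcile(chinese_candidates, arabic_candidates,
              chinese_threshold, arabic_threshold):
    """Return a value plausible in both readings, or None.

    Each candidate list holds (value, confidence) pairs; only candidates
    whose confidence exceeds the respective threshold are considered.
    """
    plausible_cn = {v: c for v, c in chinese_candidates if c > chinese_threshold}
    plausible_ar = {v: c for v, c in arabic_candidates if c > arabic_threshold}
    common = set(plausible_cn) & set(plausible_ar)
    if not common:
        return None
    # Assumption: break ties by the highest combined confidence.
    return max(common, key=lambda v: plausible_cn[v] + plausible_ar[v])


# Amounts in fen; the top readings disagree, the alternatives overlap.
cn = [(123456, 0.95), (128456, 0.70)]
ar = [(128456, 0.90), (123456, 0.75)]
agreed = reconcile(cn, ar, chinese_threshold=0.6, arabic_threshold=0.6)
```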
The automatic identification method for financial voucher information further comprises a method for optimizing the convolutional neural network. In the initial convolutional neural network, each value corresponds to an initial image. After the network has been trained several times, each value corresponds to several images. All images of a given value are compared with the initial image for similarity, and the images whose similarity is less than or equal to a set threshold are retained as candidate images. The candidate images are then compared pairwise. If the similarity between two candidate images is less than a set value, the one of the two with the lower similarity to the initial image is retained and the one with the higher similarity is discarded; if the similarity between two candidate images is greater than or equal to the set value, both are retained. All retained candidate images, together with the initial image, are kept as the images corresponding to that value.
This design optimizes the training of the convolutional neural network by keeping, as far as possible, the different images that may occur for the same value: the images differ from one another but can all be regarded as images of that value. It improves the recognition efficiency of the convolutional neural network and further ensures the accuracy of recognizing the amount information of paper financial vouchers.
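The selection rule can be sketched as below, following the wording of the rule literally. The similarity measure is supplied by the caller, since the text does not define one. Skipping pairs in which one image has already been discarded is an assumption of this sketch (it makes the result depend on iteration order), and all names are hypothetical.

```python
from itertools import combinations


def prune_templates(initial, images, similarity, keep_threshold, pair_threshold):
    """Select the images to keep for one value.

    1. Keep as candidates the images whose similarity to the initial image
       is <= keep_threshold.
    2. For each candidate pair with similarity < pair_threshold, keep the one
       less similar to the initial image and discard the other; pairs with
       similarity >= pair_threshold are both kept.
    3. Return the initial image plus the surviving candidates.
    """
    candidates = [img for img in images
                  if similarity(img, initial) <= keep_threshold]
    discarded = set()
    for i, j in combinations(range(len(candidates)), 2):
        if i in discarded or j in discarded:
            continue  # assumption: a discarded image takes no further part
        if similarity(candidates[i], candidates[j]) < pair_threshold:
            sim_i = similarity(candidates[i], initial)
            sim_j = similarity(candidates[j], initial)
            discarded.add(i if sim_i > sim_j else j)
    kept = [img for k, img in enumerate(candidates) if k not in discarded]
    return [initial] + kept


# Toy stand-in: "images" are integers, similarity is 100 minus their distance.
def toy_similarity(a, b):
    return 100 - abs(a - b)


kept = prune_templates(0, [5, 30, 32, 60], toy_similarity,
                       keep_threshold=90, pair_threshold=75)
```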
The paper financial voucher of the invention is shown in Figure 2; it is a simulated electricity invoice of an electric power company. In this embodiment the invoice code is the category information of the paper financial voucher, and the category of the voucher is determined from it. Because the template of the electricity invoice is preset, the extraction range of the amount information can be obtained. The range covers amounts in Arabic numerals and amounts in Chinese capital numerals: the Arabic numerals include the user's account number, the start and end meter readings, the actual electricity consumption, the electricity price, the electricity charge and so on, and the Chinese capital amount is the entry in the "Total RMB (in capitals)" field. For the electricity charge, either only the total amount or each itemized charge together with the total can be obtained. If the itemized amounts are obtained, they can be summed and checked against the total, which further improves the reliability of identification.
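The check of itemized charges against the total can be sketched with exact decimal arithmetic. The function name and the sample amounts are hypothetical; the inputs are assumed to be the recognized amount strings.

```python
from decimal import Decimal


def itemized_sum_matches_total(itemized, total):
    """True when the recognized itemized charges add up to the recognized total.

    Amounts are passed as strings and summed as Decimal so that values such
    as 0.10 are not distorted by binary floating point.
    """
    return sum((Decimal(x) for x in itemized), Decimal("0")) == Decimal(total)


ok = itemized_sum_matches_total(["120.50", "35.25", "4.25"], "160.00")
```

A failed check does not say which field was misread, only that at least one of the itemized amounts or the total should be re-examined.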
The simulated electricity invoice shown in this embodiment is printed. Printed data is regular, but the print may still be unclear: for example, some parts may be printed too lightly to be recognized, or the print may fade in places after long storage. The scheme therefore optimizes the reading of the amount information, which greatly improves the accuracy and effectiveness of reading it. Besides the temporary simulated electricity invoice, handwritten electricity invoices also exist. Because handwriting differs from person to person, the traditional approach to recognizing handwriting requires extensive training of the convolutional neural network and is inefficient, whereas this method can recognize handwritten digits quickly and with high accuracy.
Example 2: The principle and implementation of this automatic identification method for financial voucher information are basically the same as in Example 1, with the following differences: the sub-image areas are all equal in size and divide the image area into equal horizontal or equal vertical parts; and the error correction analysis sub-steps are specifically:
Sub-step a: for a single sub-image area, search each piece of value information for the first region that is equal in size to the sub-image area and has the highest confidence, and record that first region and its confidence; among the confidences corresponding to all the value information, determine the highest, and take the value information corresponding to it as the candidate value information.
Sub-step b: select a sub-image area adjacent to the single sub-image area, search the candidate value information for the second region that is equal in size to the adjacent sub-image area and has the highest confidence, and record that second region.
Sub-step c: compare the first region and the second region. If they are adjacent, or the distance between them is less than a set error value, or the area of their overlap is less than a set area value, the candidate value information is judged to be the final value information, and the value to be identified is determined to be the identification value corresponding to the final value information; otherwise, jump to sub-step d.
Sub-step d: delete the value information corresponding to the highest confidence and repeat sub-steps a to c until the identification value corresponding to the value to be identified is determined. Sub-step d is executed at most twice; if the final value information has not been determined after sub-step d has been executed twice, recognition of the value to be identified is judged to have failed.
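Sub-steps a to d can be sketched on small binary grids as follows. This is an illustration under several assumptions that are not in the text: templates and sub-images are lists of 0/1 rows, confidence is the fraction of equal pixels in a window, distance is measured as the gap between the two rectangles, and the overlap test is applied only when the regions actually overlap. All names and default limits are hypothetical.

```python
def best_window(patch, template):
    """Slide patch over template; return (confidence, (row, col)) of the best fit."""
    ph, pw = len(patch), len(patch[0])
    best = (-1.0, (0, 0))
    for r in range(len(template) - ph + 1):
        for c in range(len(template[0]) - pw + 1):
            hits = sum(patch[i][j] == template[r + i][c + j]
                       for i in range(ph) for j in range(pw))
            conf = hits / (ph * pw)
            if conf > best[0]:
                best = (conf, (r, c))
    return best


def regions_consistent(first, patch_a, second, patch_b, max_gap, max_overlap):
    """Sub-step c: adjacent, or gap below max_gap, or overlap below max_overlap."""
    (r1, c1), (r2, c2) = first, second
    rows = min(r1 + len(patch_a), r2 + len(patch_b)) - max(r1, r2)
    cols = min(c1 + len(patch_a[0]), c2 + len(patch_b[0])) - max(c1, c2)
    if rows > 0 and cols > 0:              # the two regions overlap
        return rows * cols < max_overlap
    gap = max(0, -rows) + max(0, -cols)    # 0 means the regions touch
    return gap == 0 or gap < max_gap


def recognise(sub_a, sub_b, templates, max_gap=1, max_overlap=2, max_retries=2):
    """Return the digit label, or None when recognition fails."""
    remaining = dict(templates)
    for _ in range(max_retries + 1):       # sub-step d runs at most max_retries times
        if not remaining:
            break
        # Sub-step a: best region per template, then the best template overall.
        scored = {label: best_window(sub_a, t) for label, t in remaining.items()}
        label = max(scored, key=lambda k: scored[k][0])
        first = scored[label][1]
        # Sub-step b: locate the adjacent sub-image inside the candidate only.
        _, second = best_window(sub_b, remaining[label])
        # Sub-step c: accept when the two regions are geometrically consistent.
        if regions_consistent(first, sub_a, second, sub_b, max_gap, max_overlap):
            return label
        # Sub-step d: drop the candidate and try again.
        del remaining[label]
    return None


TEMPLATES = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 1, 0], [1, 0, 0]],
}
top_half = [[1, 1, 1], [0, 0, 1]]
bottom_half = [[0, 1, 0], [1, 0, 0]]
label = recognise(top_half, bottom_half, TEMPLATES)
```

Because the adjacent sub-image is searched only inside the current candidate, a wrong candidate tends to place the second region somewhere inconsistent with the first, which is what triggers sub-step d.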
Handwritten amount information varies greatly from person to person. If, during extraction of the amount's image feature information, the image area of the value to be identified were decomposed into several sub-image areas, the value information pre-trained by the convolutional neural network were correspondingly decomposed into several pieces of sub-value information, and confidence were determined between each sub-image area and the corresponding sub-value information, that method might not work for handwriting. In this scheme, therefore, the pre-trained value information is not simply decomposed into pieces of sub-value information. Instead, each sub-image area is matched by traversing the pre-trained value information to find the best match, which first ensures the best possible match for the handwritten amount. The sub-image area adjacent to it is then matched against the pre-trained value information in the same way. The first region and the second region may not be exactly adjacent and may overlap, but as long as the distance between them is less than the set error value, or the area of their overlap is less than the set area value, the match is considered successful.
The above-described embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention in any way, and other variations and modifications may be made without departing from the spirit of the invention as set forth in the claims.

Claims (7)

1. An automatic identification method for financial voucher information, characterized in that an image analysis module extracts image feature information from the image information of a paper financial voucher, the image analysis module then analyzes the image feature information to obtain data information, and the data information is stored electronically;
the image feature information comprises: category information of the paper financial voucher, amount information of the paper financial voucher, and verification information of the paper financial voucher;
in identifying the category information of the paper financial voucher, the image analysis module determines the category of the paper financial voucher according to the different category information corresponding to different types of paper financial vouchers;
in identifying the amount information of the paper financial voucher, the image analysis module completes image feature information extraction through the following steps:
Step 1: the image analysis module obtains the category of the paper financial voucher and determines, according to that category, the range and type of amount information to be extracted, the type of amount information to be extracted being Arabic numeral information;
Step 2: feature extraction is performed on the amount information within the extraction range, and the image information of the amount information is then converted into value information, specifically: the image area of each value in the amount information is determined, the value within an image area being a value to be identified; any image area is recognized by a convolutional neural network, and the value to be identified in the image area is compared with the value information pre-trained by the convolutional neural network to determine a confidence; if the confidence between the value to be identified and one piece of value information is greater than or equal to a set threshold, and the confidences between the value to be identified and the other value information are less than the set threshold, the value to be identified is determined to be the value information whose confidence is greater than or equal to the set threshold; if the confidences between the value to be identified and all value information are less than the set threshold, an error correction analysis step is executed so that specific value information is determined for the value to be identified;
Step 3: when all values to be identified have been recognized, extraction of the image feature information of the amount information is complete;
in identifying the verification information of the paper financial voucher, the image analysis module restores the verification information according to a preset mapping rule to verify authenticity;
the error correction analysis step specifically comprises:
decomposing the image area of the value to be identified into several sub-image areas, the value information pre-trained by the convolutional neural network being correspondingly decomposed into several pieces of sub-value information; recognizing any sub-image area by the convolutional neural network, and comparing the sub-value to be identified in that sub-image area with the pre-trained sub-value information to determine a sub-confidence; if the confidence between the sub-value to be identified in the sub-image area and one piece of sub-value information is greater than or equal to a set sub-confidence threshold, and the confidences between the sub-value to be identified in the sub-image area and the other sub-value information are less than the set sub-confidence threshold, determining the value information corresponding to that sub-value information to be the value information of the value to be identified.

2. The automatic identification method for financial voucher information according to claim 1, characterized in that the error correction analysis step further comprises:
if the confidences between the sub-value to be identified in the sub-image area and at least two pieces of sub-value information are greater than or equal to the set sub-confidence threshold, taking the value information corresponding to those pieces of sub-value information as candidate value information, and performing confidence analysis on the other sub-image areas of the value to be identified; if, in any other sub-image area, the confidence between the sub-value to be identified and one piece of sub-value information is greater than or equal to the set sub-confidence threshold, the confidences between the sub-value to be identified in that sub-image area and the other sub-value information are less than the set sub-confidence threshold, and the value information corresponding to that sub-value information is the same as one piece of the candidate value information, then the value information corresponding to that sub-value information is the value information of the value to be identified.

3. The automatic identification method for financial voucher information according to claim 1, characterized in that the sub-image areas are all equal in size and divide the image area into equal horizontal or equal vertical parts; the error correction analysis sub-steps are specifically:
sub-step a: for a single sub-image area, searching each piece of value information for the first region that is equal in size to the single sub-image area and has the highest confidence, recording the first region and its confidence, determining the highest confidence among the confidences corresponding to all value information, the value information corresponding to the highest confidence being candidate value information;
sub-step b: selecting a sub-image area adjacent to the single sub-image area, searching the candidate value information for the second region that is equal in size to the adjacent sub-image area and has the highest confidence, and recording the second region;
sub-step c: comparing the first region and the second region; if the first region and the second region are adjacent, or the distance between the first region and the second region is less than a set error value, or the area of the overlap between the first region and the second region is less than a set area value, judging the candidate value information to be the final value information, the value information of the value to be identified being determined as the identification value corresponding to the final value information; otherwise, jumping to sub-step d;
sub-step d: deleting the value information corresponding to the highest confidence, and repeating sub-steps a to c until the identification value corresponding to the value to be identified is determined.

4. The automatic identification method for financial voucher information according to claim 3, characterized in that sub-step d is executed at most twice; if the final value information has not been determined after sub-step d has been executed twice, recognition of the value to be identified is judged to have failed.

5. The automatic identification method for financial voucher information according to any one of claims 1 to 4, characterized in that the amount information further comprises Chinese capital numeral information, which is recognized by OCR; after recognition, the value corresponding to the Chinese capital numeral information is compared with the value corresponding to the Arabic numeral information; if the two values are equal, the Arabic numeral information and the Chinese capital numeral information are judged to have been recognized successfully; if the two values are not equal, a correction step is executed to ensure that the amount information is recognized successfully.

6. The automatic identification method for financial voucher information according to claim 5, characterized in that the correction step is specifically: searching the Chinese capital numeral information for possible matching results whose confidence is greater than a set Chinese capital numeral threshold, and searching the Arabic numeral information for possible matching results whose confidence is greater than a set Arabic numeral threshold; if the corresponding values in the possible matching results are equal, correcting the unequal values to the equal value.

7. The automatic identification method for financial voucher information according to claim 1, characterized by further comprising a method for optimizing the convolutional neural network: in the initial convolutional neural network, each value corresponds to an initial image; after the convolutional neural network has been trained several times, each value corresponds to several images; all images of any one value are compared with the initial image for similarity, and the images whose similarity is less than or equal to a set threshold are retained as candidate images; all candidate images are then compared pairwise for similarity; if the similarity between any two candidate images is less than a set value, the one of the two with the lower similarity to the initial image is retained and the one with the higher similarity is discarded; if the similarity between any two candidate images is greater than or equal to the set value, both candidate images are retained; all retained candidate images and the initial image are kept as the images corresponding to that value.
CN202110640739.9A 2021-06-09 2021-06-09 Automatic identification method for financial voucher information Active CN113095307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110640739.9A CN113095307B (en) 2021-06-09 2021-06-09 Automatic identification method for financial voucher information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110640739.9A CN113095307B (en) 2021-06-09 2021-06-09 Automatic identification method for financial voucher information

Publications (2)

Publication Number Publication Date
CN113095307A true CN113095307A (en) 2021-07-09
CN113095307B CN113095307B (en) 2021-08-20

Family

ID=76664509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110640739.9A Active CN113095307B (en) 2021-06-09 2021-06-09 Automatic identification method for financial voucher information

Country Status (1)

Country Link
CN (1) CN113095307B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114239081A (en) * 2022-02-25 2022-03-25 华中科技大学同济医学院附属协和医院 A business voucher processing method, system and electronic device
CN114639173A (en) * 2022-05-18 2022-06-17 国网浙江省电力有限公司 OCR technology-based intelligent auditing method and device for checking and certifying materials
CN115204150A (en) * 2022-09-06 2022-10-18 北京国电通网络技术有限公司 Information verification method and device, electronic equipment and computer readable medium
CN116310423A (en) * 2023-04-14 2023-06-23 中国工商银行股份有限公司 Image recognition method, device, electronic device and readable storage medium
CN116701711A (en) * 2023-04-18 2023-09-05 深圳金融科技研究院(中国人民银行金融科技研究院) Credential data storage method, device, equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678612A (en) * 2015-12-30 2016-06-15 远光软件股份有限公司 Mobile terminal original certificate electronic intelligent filling system and method
US9633268B1 (en) * 2015-12-18 2017-04-25 Beijing University Of Posts And Telecommunications Method and device for gait recognition
CN107133571A (en) * 2017-04-11 2017-09-05 上海众开信息科技有限公司 A kind of system and method that paper invoice is automatically generated to financial statement
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN111612966A (en) * 2020-05-21 2020-09-01 广东乐佳印刷有限公司 Bill certificate anti-counterfeiting detection method and device based on image recognition
CN112395996A (en) * 2020-11-19 2021-02-23 深圳供电局有限公司 Financial bill OCR recognition and image processing method, system and readable storage medium
CN112598062A (en) * 2020-12-24 2021-04-02 清华大学 Image identification method and device
CN112734623A (en) * 2021-04-01 2021-04-30 国网浙江省电力有限公司 Financial certificate anti-counterfeiting encryption and reading identification method in standard cost accounting

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9633268B1 (en) * 2015-12-18 2017-04-25 Beijing University Of Posts And Telecommunications Method and device for gait recognition
CN105678612A (en) * 2015-12-30 2016-06-15 远光软件股份有限公司 Mobile terminal original certificate electronic intelligent filling system and method
CN107133571A (en) * 2017-04-11 2017-09-05 上海众开信息科技有限公司 A kind of system and method that paper invoice is automatically generated to financial statement
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN111612966A (en) * 2020-05-21 2020-09-01 广东乐佳印刷有限公司 Bill certificate anti-counterfeiting detection method and device based on image recognition
CN112395996A (en) * 2020-11-19 2021-02-23 深圳供电局有限公司 Financial bill OCR recognition and image processing method, system and readable storage medium
CN112598062A (en) * 2020-12-24 2021-04-02 清华大学 Image identification method and device
CN112734623A (en) * 2021-04-01 2021-04-30 国网浙江省电力有限公司 Financial certificate anti-counterfeiting encryption and reading identification method in standard cost accounting

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114239081A (en) * 2022-02-25 2022-03-25 华中科技大学同济医学院附属协和医院 A business voucher processing method, system and electronic device
CN114639173A (en) * 2022-05-18 2022-06-17 国网浙江省电力有限公司 OCR technology-based intelligent auditing method and device for checking and certifying materials
CN114639173B (en) * 2022-05-18 2022-08-09 国网浙江省电力有限公司 OCR technology-based intelligent auditing method and device for checking and certifying materials
CN115204150A (en) * 2022-09-06 2022-10-18 北京国电通网络技术有限公司 Information verification method and device, electronic equipment and computer readable medium
CN116310423A (en) * 2023-04-14 2023-06-23 中国工商银行股份有限公司 Image recognition method, device, electronic device and readable storage medium
CN116701711A (en) * 2023-04-18 2023-09-05 深圳金融科技研究院(中国人民银行金融科技研究院) Credential data storage method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113095307B (en) 2021-08-20

Similar Documents

Publication Publication Date Title
CN113095307A (en) Automatic identification method for financial voucher information
KR101446376B1 (en) Identification and verification of an unknown document according to an eigen image process
CN112651289B (en) An intelligent recognition and verification system and method for general value-added tax invoices
KR101515256B1 (en) Document verification using dynamic document identification framework
CN109658584B (en) Bill information identification method and device
CN101510252B (en) Area extraction method, character recognition method, and character recognition device
JP3842006B2 (en) Form classification device, form classification method, and computer-readable recording medium storing a program for causing a computer to execute these methods
CN114299528A (en) Information extraction and structuring method for scanned document
JP4796169B2 (en) How to determine the orientation of a digital image
CN114511866B (en) Data auditing method, device, system, processor and machine-readable storage medium
KR19980023917A (en) Pattern recognition apparatus and method
CN111861731A (en) Post-credit check system and method based on OCR
CN113255642A (en) Medical bill information integration method for injury claims
CN113657377B (en) A structured recognition method for machine-printed bill images
CN118279923B (en) Picture character recognition method, system and storage medium based on deep learning training
CN116798061A (en) A bill audit and identification method, device, terminal and storage medium
CN120340054A (en) Document recognition method, system, device and medium based on multimodal large model
CN115147852B (en) Ancient book identification method, ancient book identification device, storage medium and equipment
CN113903118A (en) Invoice checking method based on invoice acquisition system
JP2003109007A (en) Device, method and program for classifying slip form and image collating device
CN119131800A (en) A mixed text image detection and recognition method and device
JP3558493B2 (en) Paper alignment device, paper alignment method, and computer-readable recording medium recording paper alignment program
CN111626281A (en) Recognition method and system of Chinese annotation information in paper image map based on adaptive learning
CN113837119B (en) A method and device for identifying easily confused characters based on grayscale images
CN117292388A (en) Method, device, equipment and storage medium for detecting entrusted book content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant