CN109948617A

CN109948617A - An Invoice Image Positioning Method

Info

Publication number: CN109948617A
Application number: CN201910246868.2A
Authority: CN
Inventors: 桂冠; 孟洋; 孙颖异; 李懋阳; 邵蕾; 熊健; 杨洁
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2019-06-28

Abstract

The invention discloses an invoice image positioning method in the technical field of computer vision and image processing. It addresses the problem that existing techniques cannot accurately locate invoice images captured by a mobile terminal from different tilt angles, and therefore cannot reliably extract useful information from them. The method includes the following steps: loading the image of the invoice to be located into a trained deep learning model; performing tilt correction on the image according to the Hough transform algorithm; and locating the image according to the optimal weight values generated by training.

Description

An Invoice Image Positioning Method

Technical Field

The invention relates to an invoice image positioning method and belongs to the technical field of computer vision and image processing.

Background

With the rapid development of science and technology, a large number of artificial intelligence products are gradually taking over complex, tedious, and repetitive work from people. Manually sorting and reimbursing invoices is extremely complicated and tedious: human oversight easily introduces errors during reimbursement, and paper invoices are easily damaged, which can make them impossible to reimburse. For invoice images captured by a mobile terminal from different tilt angles, the prior art cannot locate the invoice accurately enough to extract useful information.

Summary of the Invention

The purpose of the present invention is to provide an invoice image positioning method that overcomes the above defects of the prior art, or at least one of them.

To achieve this purpose, the present invention provides an invoice image positioning method comprising the following steps:

Load the image of the invoice to be located into the trained deep learning model;

Perform tilt correction on the image of the invoice to be located according to the Hough transform algorithm;

Locate the image of the invoice to be located according to the optimal weight values generated by training.

Further, generating the optimal weight values by training includes the following steps:

Collect invoice images to build a dataset;

Label the invoice images in the dataset; the labelled invoice fields are the buyer, the seller, the goods details, and the total price and tax;

Input the invoice images of the dataset and the files generated by labelling into the feature extraction network, and train to generate the optimal weight values.

Further, the tool used to label the invoice images is the labelImg tool.

Further, the Hough transform algorithm is implemented in Python, including the following steps:

Call the HoughTransform() function in OpenCV to extract the image of the invoice to be located;

Call the HoughLines() function to extract the straight lines in the image of the invoice to be located.

Further, the formula of the Hough transform algorithm is as follows:

y = mx + b

where (x, y) are the coordinates, in the rectangular coordinate system, of any point on a straight line in the invoice image, m is the slope of that line, and b is its intercept.

Further, the Hesse normal form used by the Hough transform algorithm to extract straight lines is as follows:

r_l = x' cos(θ_l) + y' sin(θ_l)

where (x', y') are the coordinates, in the polar coordinate system, of any point on the extracted straight line l in the invoice image; r_l is the distance from the origin of the polar coordinate system to the line l; θ_l is the angle between the X' axis of the polar coordinate system and the perpendicular to the line l; cos(θ_l) is the cosine of θ_l; and sin(θ_l) is the sine of θ_l.
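To make the two line equations concrete, the following is a minimal pure-Python sketch (no OpenCV) of Hough voting with the Hesse normal form r = x cos θ + y sin θ. The function names `slope_intercept_to_hesse` and `hough_lines`, the one-degree angle step, the bin width, and the vote threshold are illustrative assumptions rather than details given in the patent. On a real image, OpenCV's `cv2.HoughLines` performs this kind of voting on a binary edge map.

```python
import math
from collections import Counter


def slope_intercept_to_hesse(m, b):
    """Convert y = m*x + b to Hesse normal form (r, theta), with r >= 0."""
    # Rewrite the line as (-m)*x + 1*y = b and normalise the normal vector.
    norm = math.hypot(m, 1.0)
    a, c, r = -m / norm, 1.0 / norm, b / norm
    if r < 0:  # flip the normal so that the distance is non-negative
        a, c, r = -a, -c, -r
    theta = math.atan2(c, a) % (2 * math.pi)
    return r, theta


def hough_lines(points, theta_steps=180, r_step=1.0, min_votes=3):
    """Vote in (r, theta) space and return [(r, theta_rad, votes)], best first.

    Here theta runs over [0, pi), so r may be negative (a common convention).
    """
    votes = Counter()
    for x, y in points:
        for k in range(theta_steps):
            theta = math.pi * k / theta_steps
            r = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(r / r_step), k)] += 1
    return [
        (r_bin * r_step, math.pi * k / theta_steps, n)
        for (r_bin, k), n in votes.most_common()
        if n >= min_votes
    ]


# Twenty edge points on the horizontal line y = 5.
edge_points = [(x, 5) for x in range(0, 100, 5)]
best_r, best_theta, best_votes = hough_lines(edge_points)[0]
print(best_r, math.degrees(best_theta), best_votes)
```

The slope-intercept form cannot represent a vertical line with a finite m, which is the usual reason for voting in (r, θ) space instead of (m, b) space.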

Further, the deep learning model is the YOLOv3 deep learning model, and the feature extraction network is the Darknet53 feature extraction network in the YOLOv3 deep learning model.

Further, the tilt correction includes the following steps:

Embed the Hough transform algorithm into the test script file of the YOLOv3 deep learning model;

Run the test script file to perform tilt correction on the image of the invoice to be located.

Further, locating the image of the invoice to be located includes the following steps:

Load the optimal weight values into the test script file of the YOLOv3 deep learning model;

Run the test script file to locate the image of the invoice to be located.

Further, the invoices include ordinary VAT invoices from the provinces of China, and the invoice images include invoice images captured by a mobile terminal.

Compared with the prior art, the present invention has the following beneficial effect: it can perform tilt correction and positioning on invoice images captured by a mobile terminal from different angles, with good accuracy and robustness.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of an invoice image positioning method provided by a specific embodiment of the present invention;

FIG. 2 is a schematic diagram of the Hesse normal form parameters of an invoice image positioning method provided by a specific embodiment of the present invention.

Detailed Description

The present invention is further described below with reference to the accompanying drawings.

The specific embodiment of the present invention is carried out on the Linux platform.

FIG. 1 is a schematic flowchart of an invoice image positioning method provided by a specific embodiment of the present invention, which includes the following steps:

Collect invoice images taken with mobile phones as dataset A; the invoices are ordinary VAT invoices from the provinces of China.

Use the labelImg tool to label the invoice images in dataset A and generate the corresponding xml file B; the four labelled invoice fields are the buyer, the seller, the goods details, and the total price and tax.

Modify the algorithm in the YOLOv3 deep learning model, and use the YOLOv3 deep learning model to convert xml file B into txt file C.
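The xml-to-txt conversion step can be sketched as follows. This is a hedged illustration, assuming that xml file B uses the Pascal VOC layout that labelImg writes and that txt file C uses the Darknet label layout (`class_id x_center y_center width height`, each value normalised to the image size). The patent states neither format explicitly, and the class names in `CLASSES` and the function name `voc_xml_to_yolo_lines` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical label strings for the four labelled fields; the patent does
# not say which strings were actually used in labelImg.
CLASSES = ["buyer", "seller", "goods_details", "total_price_and_tax"]


def voc_xml_to_yolo_lines(xml_text, classes=CLASSES):
    """Convert one Pascal VOC annotation into Darknet-style label lines."""
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        class_id = classes.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Box centre and size, each normalised to the range [0, 1].
        cx = (xmin + xmax) / 2.0 / img_w
        cy = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


# A made-up annotation with a single "seller" box on a 1000 x 500 image.
SAMPLE_XML = """
<annotation>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>seller</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
""".strip()

print(voc_xml_to_yolo_lines(SAMPLE_XML))
# ['1 0.200000 0.200000 0.200000 0.200000']
```

Writing one such line per labelled box into a txt file named after the image gives the training script one label file per invoice image.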

Input the invoice images of dataset A and txt file C into the YOLOv3 deep learning model, and use the Darknet53 feature extraction network in the YOLOv3 deep learning model to train on the input data and generate the corresponding weight values.

Implement in Python a Hough transform algorithm that automatically calculates the tilt angle of straight lines and corrects the tilt of the invoice image, including the following steps:

Call the HoughTransform() function in OpenCV to extract the image of the invoice to be located;

The formula of the Hough transform algorithm is as follows:

y = mx + b

FIG. 2 is a schematic diagram of the Hesse normal form parameters of an invoice image positioning method provided by a specific embodiment of the present invention. The Hesse normal form used by the Hough transform algorithm to extract straight lines is as follows:

r_l = x' cos(θ_l) + y' sin(θ_l)
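The patent does not spell out how the tilt angle is derived from the extracted lines. One plausible approach, sketched below under that assumption, is to treat each line's normal angle θ_l as 90° plus the tilt of the line and take the median over near-horizontal lines such as invoice table borders. The function names `estimate_skew_degrees` and `rotate_point` and the 45° cut-off are hypothetical.

```python
import math
from statistics import median


def estimate_skew_degrees(thetas_rad, max_abs_skew=45.0):
    """Median tilt (degrees) of near-horizontal lines, given normal angles."""
    deviations = []
    for theta in thetas_rad:
        # A perfectly horizontal line has a normal angle of 90 degrees.
        deviation = math.degrees(theta) - 90.0
        if abs(deviation) < max_abs_skew:  # ignore near-vertical lines
            deviations.append(deviation)
    return median(deviations) if deviations else 0.0


def rotate_point(x, y, angle_deg, cx=0.0, cy=0.0):
    """Rotate (x, y) about (cx, cy) by angle_deg using the standard 2D rotation."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (
        cx + dx * math.cos(a) - dy * math.sin(a),
        cy + dx * math.sin(a) + dy * math.cos(a),
    )


# Normal angles of three table borders tilted by about 3 to 5 degrees,
# plus one near-vertical border that should be ignored.
thetas = [math.radians(d) for d in (93.0, 94.0, 95.0, 2.0)]
skew = estimate_skew_degrees(thetas)

# Rotating by -skew maps a direction tilted by `skew` back onto the x axis.
tilted = (math.cos(math.radians(skew)), math.sin(math.radians(skew)))
print(skew, rotate_point(*tilted, -skew))
```

For a real image, the same angle could be passed to OpenCV's `cv2.getRotationMatrix2D` and `cv2.warpAffine`. Because image coordinates have the y axis pointing down, the sign of the rotation should be checked against a known tilted sample.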

Embed the Hough transform algorithm into the test script file of the YOLOv3 deep learning model.

Load the weight values into the test script file of the YOLOv3 deep learning model.

Run the test script file to perform positioning and tilt correction on the invoice image, captured by a mobile terminal, that is to be located.

The above is only a preferred embodiment of the present invention. It should be pointed out that those of ordinary skill in the art can make further improvements and modifications without departing from the technical principle of the present invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims

1. An invoice image positioning method, characterized in that it comprises the following steps:

Load the image of the invoice to be located into the trained deep learning model;

Perform tilt correction on the image of the invoice to be located according to the Hough transform algorithm;

Locate the image of the invoice to be located according to the optimal weight values generated by training.

2. The invoice image positioning method according to claim 1, wherein generating the optimal weight values by training comprises the following steps:

Collect invoice images to build a dataset;

Label the invoice images in the dataset; the labelled invoice fields are the buyer, the seller, the goods details, and the total price and tax;

Input the invoice images of the dataset and the files generated by labelling into the feature extraction network, and train to generate the optimal weight values.

3. The invoice image positioning method according to claim 2, characterized in that the tool used to label the invoice images is the labelImg tool.

4. The invoice image positioning method according to claim 1, characterized in that the Hough transform algorithm is implemented in Python, comprising the following steps:

Call the HoughTransform() function in OpenCV to extract the image of the invoice to be located;

Call the HoughLines() function to extract the straight lines in the image of the invoice to be located.

5. The invoice image positioning method according to claim 1, characterized in that the formula of the Hough transform algorithm is as follows:

y = mx + b

where (x, y) are the coordinates, in the rectangular coordinate system, of any point on a straight line in the invoice image, m is the slope of that line, and b is its intercept.

6. The invoice image positioning method according to claim 1, characterized in that the Hesse normal form used by the Hough transform algorithm to extract straight lines is as follows:

r_l = x' cos(θ_l) + y' sin(θ_l)

where (x', y') are the coordinates, in the polar coordinate system, of any point on the extracted straight line l in the invoice image; r_l is the distance from the origin of the polar coordinate system to the line l; θ_l is the angle between the X' axis of the polar coordinate system and the perpendicular to the line l; cos(θ_l) is the cosine of θ_l; and sin(θ_l) is the sine of θ_l.

7. The invoice image positioning method according to claim 2, wherein the deep learning model is a YOLOv3 deep learning model, and the feature extraction network is a Darknet53 feature extraction network in the YOLOv3 deep learning model.

8. The invoice image positioning method according to claim 7, wherein the tilt correction comprises the following steps:

Embed the Hough transform algorithm into the test script file of the YOLOv3 deep learning model;

Run the test script file to perform tilt correction on the image of the invoice to be located.

9. The invoice image positioning method according to claim 7, wherein locating the image of the invoice to be located comprises the following steps:

Load the optimal weight values into the test script file of the YOLOv3 deep learning model;

Run the test script file to locate the image of the invoice to be located.

10. The invoice image positioning method according to any one of claims 1 to 9, wherein the invoices include ordinary VAT invoices from the provinces of China, and the invoice images include invoice images captured by a mobile terminal.