CN102592124A - Geometric correction method and device for text images, and binocular stereo vision system
Abstract
The invention discloses a geometric correction method and device for text images and a binocular stereo vision system, belonging to the field of image processing. The method includes: building a binocular stereo vision system, obtaining the internal parameters of each of its two lenses and their mutual external parameters, and computing the projection matrix of each lens; extracting features from the images captured by the two lenses, performing stereo matching on the extracted features, and purifying the resulting matching points; computing the three-dimensional coordinates of the corresponding points in the text image from the matching points and determining the trend direction of the text in the image; and fitting a text trend curve along that direction, discretely unrolling the curve and reconstructing the image to obtain the corrected text image. The invention is unaffected by the specific content and layout of the text, is particularly suitable for geometric distortion correction of text with complex layouts, can process all kinds of text images, and greatly broadens the applicable range of geometric distortion correction.
Description
Technical Field
The invention belongs to the field of image processing and relates to an image processing method and device, and in particular to a geometric correction method and device for text images and to a binocular stereo vision system.
Background Art
There are two main ways to acquire text images: scanning the text with a scanner, or photographing it with a camera. When a scanner is used, the book usually has to be taken apart or flattened before scanning, so this approach is generally not adopted. If the book is placed directly on a scanner or photographed directly with a camera, factors such as the spine and the thickness of the book inevitably cause the document itself to bend and distort. This both spoils the appearance of the image and interferes with subsequent processing in OCR applications such as layout analysis and line extraction, so geometric correction of the captured text image becomes essential.
Current methods for geometric correction of text images fall into two broad categories. The first places known reference points on the image and performs geometric correction from the mapping between these reference points before and after distortion. A typical implementation is: place a template on the text to be photographed, capture the template image, extract the known reference points from it, establish a mapping to the reference points of the original template, and finally correct the image according to this mapping. The second category corrects the image directly by analyzing characteristics of the text image itself; such methods cannot handle general, non-specific distorted images, because the correction relies on particular features extracted from the given class of images.
Text acquisition serves two main purposes: preservation of material, typically for valuable documents, and OCR recognition. In general, during image acquisition the book pages are photographed as squarely (horizontally and vertically aligned) as possible, which benefits subsequent processing.
For the first category of methods, attaching a template to the document for scanning or photographing is inconvenient both for preservation and for OCR. Google addressed page curl with a system of two cameras and infrared light based on a known maze-like pattern template: two infrared cameras capture infrared images from different angles, the images are combined with known stereo imaging techniques to obtain the 3D shape of the pattern, and since the pattern lies on the page surface, its 3D shape corresponds to the 3D surface of the page, which can then be flattened. This removes the inconvenience of placing a template during shooting, but the system is complex and expensive.
The second category mainly uses the direction of text lines as reference information for correction. Such methods are strongly affected by the content and layout of the text; for complex layouts containing text, tables, decorative borders and illustrations, it is difficult to obtain good correction results. Chinese patent CN1804861A solves the geometric distortion problem for a certain class of text images by run-length blackening the image, dividing the run-length map into segments, and letting the regular part of each segment drive the correction of the irregular part. However, such methods usually require preprocessing such as layout analysis and binarization, whose results directly affect the quality of the distortion correction, and they rely mainly on text-line information, so they are helpless when the text mixes horizontal and vertical lines.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a geometric correction method and device for text images and a binocular stereo vision system, which reconstruct the text image by combining binocular stereo vision with curve fitting. The approach gives good correction results for complex images, effectively improves the visual quality of the image, and benefits subsequent OCR processing and recognition.
The invention discloses a geometric correction method for text images, comprising the following steps:
Step 1: build a binocular stereo vision system; after system calibration, obtain the internal parameters of each of the two lenses and their mutual external parameters, and compute the projection matrix of each lens;
Step 2: extract features from the images captured by the two lenses of the calibrated binocular stereo vision system, perform stereo matching on the extracted image features, and purify the resulting matching points;
Step 3: from the purified matching points and the internal parameters, external parameters and projection matrices obtained by calibration, compute the three-dimensional coordinates of the corresponding points in the text image, and thereby determine the trend direction of the text in the image;
Step 4: fit a text trend curve along the obtained trend direction, discretely unroll the curve, and reconstruct the image to obtain the corrected text image.
In Step 1, the binocular stereo vision system comprises two identical lenses, a first lens and a second lens; the first lens photographs the text image to be corrected, and the field of view of the second lens covers the text image photographed by the first lens.
In Step 2, feature extraction is performed with the scale-invariant feature transform (SIFT) method.
When the matching points obtained in Step 2 are purified, the angles of the lines connecting the matched points are collected into a histogram: 180° is divided into n intervals, the number of connecting lines whose angle falls into each interval is counted, and the interval containing the most points is taken as the valid angle range for a first purification. A further selection is then made from the average distance between the remaining matched points: if

|d − d̄| / d̄ < T

where T is a set threshold, d is the distance of the current matched pair and d̄ is the average distance between matched pairs, the current pair is kept as a purified matching point.
The number of intervals n is 36, and the threshold T is set to 0.1.
Determining the trend direction of the text in the text image in Step 3 comprises the following steps:
1. divide the matching points into sub-bands in both the horizontal and the vertical direction according to a set sub-band width;
2. for each sub-band, compute the dispersion of the sub-band in the corresponding direction from the coordinates of its matching points in that direction;
3. in each of the two directions take the sub-band with the largest product of dispersion and number of matching points, compute for each of the two sub-bands the degree to which the distance from the text surface to the lens changes along that direction, and take the direction with the larger degree of change as the text trend direction.
When the text trend curve is fitted in Step 4 according to the obtained trend direction, the X-direction or Y-direction coordinates of the corresponding three-dimensional points of the text image are selected according to the trend direction and fitted against the Z-direction coordinates to obtain the text trend curve.
When the text trend curve is discretely unrolled in Step 4, the intersection of the horizontal line with the text trend curve is taken as the base point, and the curve length of each discrete point relative to the base point is computed.
During reconstruction, backward mapping is performed from the target positions obtained by discretely unrolling the text trend curve: the position in the original image of each coordinate point of the text image on the unrolled plane is computed, and the image is rebuilt with bilinear interpolation to obtain the geometrically corrected text image.
When the text trend curve is unrolled, the discrete step of the curve is determined from the three-dimensional coordinates of the matching points and their corresponding image coordinates, and, starting from the base point, the curve length of each discrete point relative to the base point is computed.
The invention also discloses a binocular stereo vision system comprising a first lens and a second lens fixed side by side on a horizontal bracket; the first lens photographs the text image to be corrected, and the field of view of the second lens covers the text image photographed by the first lens.
The invention also discloses a geometric correction device for text images, comprising the following modules:
a building module, which builds the binocular stereo vision system, obtains after system calibration the internal parameters of each of the two lenses and their mutual external parameters, and computes the projection matrix of each lens;
an extraction module, which extracts features from the images captured by the two lenses of the calibrated binocular stereo vision system, performs stereo matching on the extracted image features, and purifies the resulting matching points;
a trend module, which computes the three-dimensional coordinates of the corresponding points in the text image from the purified matching points and the internal parameters, external parameters and projection matrices obtained by calibration, and thereby determines the trend direction of the text in the image;
a reconstruction module, which fits the text trend curve along the obtained trend direction, discretely unrolls the curve and reconstructs the image, obtaining the corrected text image.
The disclosed geometric correction method, device and binocular stereo vision system make full use of the three-dimensional information of the text to predict, fit and unroll the text trend. No template image is needed; the captured text image is processed directly with only two ordinary cameras, so the operation is convenient and inexpensive and is not limited by the degree of bending or by whether the bending is horizontal or vertical. Furthermore, because the invention works on features extracted from the text image, it is unaffected by the specific content and layout of the text, is especially suitable for geometric distortion correction of text with complex layouts, can process all kinds of text images, and greatly broadens the applicable range of geometric distortion correction.
Description of the Drawings
Fig. 1 is a flowchart of the geometric correction method for text images of the present invention;
Fig. 2 is a schematic diagram of the binocular stereo vision system used in the method;
Fig. 3 shows the three-dimensional point cloud obtained by the method;
Fig. 4a shows a distorted image containing mixed text and formulas;
Fig. 4b is the front view of the distorted image containing mixed text and formulas;
Fig. 5 is a flowchart of the text trend direction determination in the method;
Fig. 6 is a schematic diagram of the sub-bands used in the method;
Fig. 7a is a mirrored view of the front view of the distorted image;
Fig. 7b shows the curve fitting result for the front view of the distorted image and the base point for unrolling the curve;
Fig. 8 is a schematic diagram of curve unrolling and reconstruction in the method;
Fig. 9 shows the correction result for the distorted image containing mixed text and formulas;
Fig. 10a is a schematic diagram of a plain-text image before correction;
Fig. 10b is a schematic diagram of the plain-text image after correction;
Fig. 11a is a schematic diagram of an image mixing horizontal and vertical text before correction;
Fig. 11b is a schematic diagram of the image mixing horizontal and vertical text after correction;
Fig. 12a is a schematic diagram of an image mixing text and pictures before correction;
Fig. 12b is a schematic diagram of the image mixing text and pictures after correction;
Fig. 13a is a schematic diagram of an image containing many formulas before correction;
Fig. 13b is a schematic diagram of the image containing many formulas after correction;
Fig. 14a is a schematic diagram of a vertically written handwritten text image before correction;
Fig. 14b is a schematic diagram of the vertically written handwritten text image after correction.
Detailed Description of the Embodiments
To explain the technical solution and content of the invention more clearly, the invention is described in further detail below with reference to the accompanying drawings.
According to the characteristics of text distortion, the distortion takes two main forms: one is caused by the shooting angle of the lens, such as perspective distortion; the other is caused by changes in the geometric structure of the text itself due to the presence of the spine and variations in thickness. Because of the spine and the varying thickness, the surface structure of the text changes; in the text image this appears mainly as bending deformation of the text lines. This deformation is usually not random but follows a certain regularity: after the page is photographed, the bending of the resulting text image is continuous and consistent from top to bottom (or from left to right). Based on this consistency, the present invention uses curve fitting to characterize the trend of the text and then unrolls the curve to eliminate the geometric distortion of the text image.
The invention uses computer stereo vision: the text is photographed through the lenses of two cameras, features are extracted from the two images and matched, the three-dimensional coordinates of the corresponding points on the text are reconstructed from the stereo matches, the bending trend of the text is determined and a curve is fitted from these coordinates, and the fitted curve is unrolled to reconstruct the text image, thereby correcting the geometric distortion.
The flowchart of the geometric correction method disclosed by the invention is shown in Fig. 1; the specific processing steps are as follows:
Step 1: build a binocular stereo vision system; after system calibration obtain the internal parameters of each of the two lenses and their mutual external parameters, and compute the projection matrix of each lens.
An embodiment of the binocular stereo vision system built by the invention is shown in Fig. 2. The system comprises two identical lenses, a first lens and a second lens; the first lens photographs the text image to be corrected, and the field of view of the second lens covers the text image photographed by the first lens. The two lenses are placed side by side on a horizontal bracket C and each photographs the text to be corrected; the text to be corrected should lie as completely as possible within the field of view of both lenses.
After the positions of the two lenses are fixed, the two lenses are calibrated to obtain their internal parameters and their relative pose, and from these the projection matrices of the two lenses, in preparation for the later computation of the three-dimensional coordinates of object points.
System calibration establishes the relationship between pixel positions in the images captured by the two lenses and the positions of scene points. Based on the lens model, the model parameters of the lens are solved from the image coordinates and the world coordinates of known feature points, which determines the correspondence between the image coordinate system of the camera and the reference coordinate system of objects in three-dimensional space. Only when the lens has been correctly calibrated can the actual position of an object in three-dimensional space be computed from its two-dimensional coordinates in the image plane.
The camera lenses used in the invention satisfy the pinhole camera model. The lenses are calibrated with Zhang Zhengyou's calibration method, which solves the internal and external parameters of the camera from several images of a planar template. The coordinate systems commonly used in this embodiment are as follows:
(1) Image coordinate system (u, v): defined on the digital image captured by the lens, with the origin at the upper-left corner of the image plane; the coordinates of each pixel are its column and row indices in the pixel array, so (u, v) are image coordinates measured in pixels.
(2) Imaging plane coordinate system (x, y): expresses image coordinates in physical units, with its origin at the principal point and with the x and y axes parallel to the u and v axes. If the origin has coordinates (u0, v0) in the (u, v) system and each pixel has physical size dx and dy along the x and y axes, then the coordinates of any pixel in the two systems are related by

u = x/dx + u0,  v = y/dy + v0,

where 1/dx and 1/dy are the scale factors of the two axes.
(3) Camera coordinate system (Xc, Yc, Zc): the point Oc is the optical center of the lens, the Xc and Yc axes are parallel to the x and y axes of the image, and the Zc axis is the optical axis of the lens, perpendicular to the image plane. The intersection of the optical axis with the image plane is the principal point, and the rectangular coordinate system formed by Oc and the Xc, Yc, Zc axes is the camera coordinate system. The distance from the optical center to the image plane is the effective focal length f of the lens. For a space point P, the relation between its image-plane coordinates and its camera coordinates after imaging through the lens is

x = f Xc / Zc,  y = f Yc / Zc,

that is, the relation between the imaging plane coordinate system and the camera coordinate system can be written as

Zc [x, y, 1]^T = [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]] [Xc, Yc, Zc, 1]^T.

(4) World coordinate system (Xw, Yw, Zw): the absolute coordinates of the objective world. The relation between the world coordinate system and the camera coordinate system can be described by a rotation matrix R and a translation vector t. For a point P in space with homogeneous coordinates Xc in the camera coordinate system and Xw in the world coordinate system,

Xc = [[R, t], [0^T, 1]] Xw,

where R is a 3×3 orthogonal rotation matrix, t is a 3×1 translation vector, and the combined matrix is 4×4.
The parameters of the translation and rotation that take an object point from the world coordinate system to the camera coordinate system are called the external parameters. There are six of them: the roll, pitch and yaw angles that express R in Euler angles, and the three components of the translation vector t; the rotation matrix R can then be written as the product of the three elementary rotations about the coordinate axes.
The transformation from the world coordinate system to the image coordinate system is therefore

s [u, v, 1]^T = K [R, t] [Xw, Yw, Zw, 1]^T,

where K is the internal parameter matrix defined below and s is a scale factor.
Zhang's planar calibration method avoids the drawbacks of traditional calibration methods, which require special equipment and cumbersome operation, and offers high accuracy and good robustness. In this embodiment its main steps are as follows (a code sketch is given after the list):
a: print a checkerboard pattern and attach it to a plane as the calibration board; the board has 20×20 squares with a side length of 0.067 m;
b: move the calibration board and take 15 images (no fewer than 3) with each lens from different angles;
c: detect all corner points in each image, using the center point of every four checkerboard corners in place of the image corner point;
d: ignoring radial distortion, use the orthogonality of the rotation matrix to obtain the five internal parameters and the external parameters of the lens by solving linear equations;
e: optimize the internal and external parameters by minimizing the back-projection error.
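As a rough, non-authoritative illustration of steps a through e, the sketch below uses OpenCV's chessboard detection and calibration routines, which implement a Zhang-style closed-form solution followed by reprojection-error minimization. The folder name, the assumption that a 20×20-square board exposes 19×19 inner corners, and the use of sub-pixel corner refinement instead of the corner-center substitution of step c are assumptions of the example, not the patent's exact procedure.

```python
import glob

import cv2
import numpy as np

# Assumptions: calibration photos for one lens live in "calib_lens1/",
# the board has 20x20 squares of 0.067 m, and OpenCV sees its 19x19 inner corners.
PATTERN = (19, 19)
SQUARE = 0.067  # metres

# Planar board coordinates (Z = 0) for every inner corner.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points, image_size = [], [], None
for path in sorted(glob.glob("calib_lens1/*.png")):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    image_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        continue
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_points.append(objp)
    img_points.append(corners)

# Closed-form intrinsics/extrinsics followed by reprojection-error refinement (step e).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("RMS reprojection error:", rms)
print("intrinsic matrix K:\n", K)
```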
The internal and external parameters are computed from the detected corner centers as follows. Any target point on the two-dimensional calibration board is related to its pixel coordinates through the following pinhole camera model between their homogeneous coordinates:

s m̃ = K [R, t] M̃,

where M̃ = [Xw, Yw, Zw, 1]^T are the homogeneous world coordinates of the point, s is a scale factor associated with the homogeneous world coordinates, and R and t are the rotation matrix and translation vector from the world coordinate system to the camera coordinate system, i.e. the external parameters of the lens. K is the internal parameter matrix of the lens, of the form

K = [[fx, γ, u0], [0, fy, v0], [0, 0, 1]],

where (u0, v0) are the pixel coordinates of the principal point (the center of the image plane), fx and fy are the scale factors of the effective focal length along the u and v axes of the pixel coordinate system, and γ is the skew factor, describing the angle between the two principal axes of the pixel coordinate system; ideally the two axes are orthogonal and γ = 0. For convenience in solving the calibration, the principal point position and the effective focal length can be separated and represented by independent matrices, both of which are invertible.
During calibration the origin of the world coordinate system is chosen on the plane of the calibration board with the Zw axis perpendicular to this plane, so the world coordinates of a target point on the board simplify to [Xw, Yw, 0, 1]^T. Substituting and rearranging gives

s m̃ = H [Xw, Yw, 1]^T,

where [Xw, Yw, 1]^T are the homogeneous coordinates of the point on the calibration plane and H is the 2D homography matrix, determined entirely by the internal and external parameter matrices of the lens:

H = K [r1, r2, t],

where r1 and r2 are the first two column vectors of the rotation matrix R. Hence, for any target point on the calibration board the homography H is an invariant. The objective function minimizes the residual between the actual image coordinates m_i and the image coordinates computed from the above relation:

min Σ_i ‖m_i − m̂(K, R, t, M_i)‖².

Once the 2D homography H has been solved, the internal and external parameter matrices of the lens can be calibrated, and accurate estimates of them are established by stepwise optimization.
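As an aside, the 2D homography H between the calibration plane and the image can be estimated directly from the detected corner centers by a least-squares fit, which is what the residual minimization above amounts to. The sketch below simply delegates to OpenCV; the point arrays are assumed to be in matching order.

```python
import cv2
import numpy as np

def estimate_homography(board_xy, corners_uv):
    """board_xy: Nx2 board-plane points (Z = 0); corners_uv: Nx2 pixel points.
    Returns the 3x3 homography H with s * [u, v, 1]^T = H [X, Y, 1]^T."""
    H, _ = cv2.findHomography(np.asarray(board_xy, np.float32),
                              np.asarray(corners_uv, np.float32), method=0)
    return H
```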
In this embodiment, the internal and external parameters of the first and second lenses are obtained by lens calibration, and the stereo calibration of the Jean-Yves Bouguet camera calibration toolbox is then used: importing the internal parameters and calibration results of the two lenses yields the external parameters of the second lens relative to the first lens in the binocular system, namely the rotation and translation of the second lens relative to the first, where R is a 3×3 matrix and T is a 3×1 matrix.
In this embodiment, stereo calibration yields the internal parameter matrices of the first and second lenses and the external parameter matrix of the second lens relative to the first lens.
From these the projection matrices M1 and M2 of the two lenses A and B are obtained; both M1 and M2 are 3×4 matrices.
This step yields the projection matrix M1 of the first lens and the projection matrix M2 of the second lens; from this point on, the positions, focal lengths and other settings of the first and second lenses are fixed and must not be changed.
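A minimal sketch of how the two 3×4 projection matrices can be assembled once the intrinsic matrices K1, K2 and the relative pose (R, T) from stereo calibration are known. Taking the first lens as the reference frame, so that M1 = K1·[I | 0] and M2 = K2·[R | T], is an assumption consistent with the description rather than a quotation of the patent's values.

```python
import numpy as np

def projection_matrices(K1, K2, R, T):
    """Build the 3x4 projection matrices with lens 1 as the reference frame."""
    M1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])      # M1 = K1 [I | 0]
    M2 = K2 @ np.hstack([R, np.asarray(T).reshape(3, 1)])    # M2 = K2 [R | T]
    return M1, M2
```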
Step 2: extract features from the images captured by the two lenses of the calibrated binocular stereo vision system, and perform stereo matching on the extracted image features.
Two text images IA and IB are captured at the two different positions by the binocular stereo vision system, where IA is the distorted image to be corrected. The scale-invariant feature transform (SIFT) method is used for feature extraction and initial matching of the two images.
SIFT features are local image features that are invariant to scaling, rotation and changes in brightness, and remain stable to a certain degree under viewpoint changes, affine transformations and noise.
The SIFT method consists of two main parts, feature extraction and feature matching, implemented as follows (a code sketch is given after the list):
1) Construction of the Gaussian scale space and the difference-of-Gaussian scale space: in this embodiment the two lens images show almost no change of scale, so the Gaussian scale space is built by convolving the input image with Gaussian functions at only 4 scales, and 3 difference-of-Gaussian kernels convolved with the image generate the difference-of-Gaussian scale space (DOG space);
2) Detection of scale-space extrema: a total of 3 DOG scales are generated; a candidate point at the second scale is compared with its 8 neighbors at the same scale and with the 9×2 points at the corresponding positions of the first and third scales, 8 + 18 = 26 points in all; if the candidate is the maximum or minimum among these 26 points, it is a local extremum;
3) Accurate localization of extrema: a three-dimensional quadratic function is fitted to determine the position and scale of each local extremum precisely, while extrema with low contrast and unstable edge response points are removed; in this embodiment the contrast threshold is set to 0.06, extrema whose contrast is below the threshold are discarded, and the rest are retained;
4) Assignment of the feature point orientation: the gradient orientation distribution of the pixels in the neighborhood of each feature point is used to assign it an orientation, and the coordinates are rotated to that orientation so that the operator is rotation invariant; in this embodiment 360° is divided into 10 intervals for the orientation histogram;
5) Generation of the feature descriptor: the coordinate axes are first rotated to the orientation of the feature point to ensure rotation invariance, and the descriptor is then generated around the feature point. In this embodiment a 16×16 window centered on the feature point is taken with Gaussian weighting, so that pixels closer to the feature point contribute more gradient orientation information; an 8-bin gradient orientation histogram is computed on each 4×4 block and the accumulated value of each orientation forms a seed point, giving 4×4 = 16 seed points with 8 orientation values each, i.e. a 4×4×8 = 128-dimensional descriptor for the feature point;
6) Feature matching: SIFT features are extracted from both IA and IB, giving two sets of feature points. For each feature point in IA, the two feature points in IB with the smallest Euclidean distances are found; if the ratio of the nearest distance to the second-nearest distance is below a threshold, the pair is accepted as a match, otherwise it is rejected; in this embodiment the threshold is 0.2.
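The sketch below reproduces the SIFT extraction and ratio-test matching described above using OpenCV; the contrast threshold 0.06 and the ratio threshold 0.2 come from the text, while everything else (grayscale inputs, brute-force matcher) is an implementation choice of the example.

```python
import cv2

def sift_match(img_a, img_b, ratio=0.2):
    """Return matched keypoint coordinate pairs (pt_in_IA, pt_in_IB) passing the ratio test."""
    sift = cv2.SIFT_create(contrastThreshold=0.06)
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_a, des_b, k=2)          # two nearest neighbours in IB

    pairs = []
    for best, second in knn:
        if best.distance / second.distance < ratio:    # nearest / second-nearest
            pairs.append((kp_a[best.queryIdx].pt, kp_b[best.trainIdx].pt))
    return pairs
```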
The initial stereo matching points of the two text images have now been found, but because the content of the text (the characters, for example) is highly self-similar, incorrect matches are unavoidable regardless of the features and matching algorithm used, and the initial matches must be further screened and purified.
The invention removes mismatches from the initial matches statistically. The line connecting each matched pair makes a certain angle with the horizontal line through the image origin O, and the connecting lines of correctly matched pairs point in roughly the same direction; for an incorrect match the angle of the connecting line differs noticeably.
The matched points of the left and right images are connected, and the connecting-line angles, which span a 180° range, are collected into a histogram: 180° is divided into a number of intervals (36 intervals of 5° each in this embodiment), the number of connecting lines whose angle falls into each interval is counted, and the interval containing the most points is taken as the angle range of correct matches; obvious mismatches outside this range are removed.
After this angle-based rejection there may still remain mismatches whose angle differs little from the correct one, and these would adversely affect the subsequent three-dimensional point localization and curve fitting, so the matching accuracy must be high enough. For correctly matched pairs, not only are the connecting-line angles roughly the same, but the distances between the points of each pair should also not differ much; therefore the invention further constrains the distance between the points of the matches that survived the angle-based purification.
When the resulting matches are purified by distance, the criterion is set from the average distance between matched pairs: if

|d − d̄| / d̄ < T

where T is a set threshold, d is the distance of the current matched pair and d̄ is the average distance between matched pairs, the current pair is kept as a purified match. In this embodiment the threshold T is 0.1. In the matching and purification results, the connecting-line angles of the correct matches are essentially identical and their distances essentially equal, so the combined angle-plus-distance constraint effectively removes wrong matches and improves the matching accuracy.
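A sketch of the two-stage purification under the stated assumptions: connecting-line angles are folded into a 180° range and binned at 5°, then the survivors are filtered by the relative deviation of their connecting-line length from the mean with T = 0.1. The exact distance criterion is only partly legible in the source, so the relative-deviation form used here is an assumption.

```python
import numpy as np

def purify_matches(pairs, n_bins=36, T=0.1):
    """pairs: list of ((x1, y1), (x2, y2)) matched pixel coordinates in IA and IB."""
    pts_a = np.array([p[0] for p in pairs], float)
    pts_b = np.array([p[1] for p in pairs], float)
    d = pts_b - pts_a

    # Stage 1: keep matches whose connecting-line angle falls in the most
    # populated 5-degree bin (angles folded into [0, 180) degrees).
    ang = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 180.0
    hist, edges = np.histogram(ang, bins=n_bins, range=(0.0, 180.0))
    lo, hi = edges[hist.argmax()], edges[hist.argmax() + 1]
    keep = (ang >= lo) & (ang < hi)

    # Stage 2: relative deviation of each surviving match's length from the mean.
    length = np.linalg.norm(d[keep], axis=1)
    ok = np.abs(length - length.mean()) / length.mean() < T
    surviving = np.flatnonzero(keep)[ok]
    return [pairs[i] for i in surviving]
```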
Step 3: from the purified matching points and the internal parameters, external parameters and projection matrices obtained by calibration, compute the three-dimensional coordinates of the corresponding points in the text image, and thereby determine the trend direction of the text.
The lens obtains a two-dimensional image of a three-dimensional object by perspective transformation, and the points of the image correspond in a definite way to points on the actual object. Binocular stereo three-dimensional reconstruction works as follows: from the two images obtained by photographing a point in space with two lenses from different orientations, the position of the actual space point is deduced from this correspondence. With the lens calibration completed in Step 1 and the stereo matching obtained in Step 2, the three-dimensional coordinates corresponding to each matched pair can be computed.
Under the pinhole camera model, let the image points of an arbitrary space point P on the two lenses C1 and C2 be p1 = (u1, v1) and p2 = (u2, v2); these corresponding points have been obtained by the stereo matching of Step 2, and with the projection matrices M1 and M2 of the two lenses obtained in Step 1 we have

s1 [u1, v1, 1]^T = M1 [X, Y, Z, 1]^T,  s2 [u2, v2, 1]^T = M2 [X, Y, Z, 1]^T,

where (X, Y, Z) are the space coordinates of P and s1, s2 are scale factors.
From these equations the three-dimensional coordinates of the corresponding points extracted from the text image are obtained, yielding the discrete three-dimensional point cloud shown in Fig. 3. From the three-dimensional coordinates of the corresponding points, surface fitting with polynomial least squares would give the actual three-dimensional surface model of the text in space.
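A sketch of recovering (X, Y, Z) from one matched pair and the two projection matrices by linear (DLT) triangulation, i.e. solving the over-determined system implied by the two projection equations in a least-squares sense; OpenCV's cv2.triangulatePoints does the same for whole point sets.

```python
import numpy as np

def triangulate(M1, M2, p1, p2):
    """M1, M2: 3x4 projection matrices; p1, p2: (u, v) pixel coordinates of one match."""
    A = np.vstack([
        p1[0] * M1[2] - M1[0],     # u1 * m3^T - m1^T  (from s1 p1 = M1 P)
        p1[1] * M1[2] - M1[1],
        p2[0] * M2[2] - M2[0],
        p2[1] * M2[2] - M2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    P = vt[-1]                     # right singular vector of the smallest singular value
    return P[:3] / P[3]            # homogeneous -> Euclidean (X, Y, Z)
```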
Unrolling or projecting a fitted surface is one way to remove the geometric distortion of text, but given the regularity and particular character of the deformation, curve fitting simplifies the problem: it turns a surface problem in three-dimensional space into a curve problem in a two-dimensional space and avoids the uncertain deformation and distortion that surface fitting and unrolling would introduce.
By choosing two particular coordinate axes for the curve fitting, the curve describing the text as seen from a particular direction is obtained. For the distorted text of the embodiment shown in Fig. 4a, whose front view is shown in Fig. 4b, the front view of the text is a curve; since the distance from the text surface to the camera lens is observed to vary along the X direction, the X and Z coordinates are chosen for fitting in this embodiment.
When the images are captured the book is usually opened in one of two ways, left-right or top-bottom, so the deformation of the text may be either horizontal or vertical. To determine the two coordinates used for curve fitting, the trend direction of the text must first be determined.
When the trend direction of the text in the image is determined, note that the spine and the thickness of the book change the height of the text surface, so the captured image is bent; this bending is directly related to the distance from the text to the lens in the space coordinate system. To fit the text trend curve it therefore suffices to decide, in the three-dimensional coordinate system, whether the Z coordinate is related to the X coordinate or to the Y coordinate: for a horizontal trend Z is related to X, and for a vertical trend Z is related to Y. The flowchart of the trend direction determination is shown in Fig. 5 and consists of the following steps:
Step 301: divide the matching points into sub-bands in both the horizontal and the vertical direction according to a set sub-band width. The sub-band width is chosen according to the size of the image; in this embodiment it is 200 pixels, i.e. every 200 pixels form one sub-band. The width and the height of the image are each divided into several intervals of the sub-band width, and each interval is one sub-band.
Step 302: for each sub-band, compute the dispersion of the sub-band in the corresponding direction from the coordinates of its matching points in that direction. For a sub-band to be useful for trend determination, its matching points should be spread as uniformly as possible over the whole sub-band so that they fully reflect the trend of the text. Taking the horizontally divided sub-bands as an example, the coordinates of a matching point in image IA are (u, v), where u is the row and v the column of the pixel. For a horizontal sub-band, matching points whose u values differ by no more than 200 pixels fall into the same sub-band, and the trend of the text is best reflected when the v coordinates of the matching points are spread fairly uniformly over the sub-band. First the number of matching points in each sub-band is counted, and then the dispersion D of the corresponding v coordinates within the sub-band is computed as

D = (1/n) Σ_{i=1..n} (v_i − v̄)²,

where v_i is the coordinate of each point, v̄ is the mean of the coordinates, and n is the number of matching points in the sub-band. Likewise, for a vertical sub-band the matching points whose v values differ by no more than 200 pixels fall into the same sub-band, and the dispersion of the u coordinates within the sub-band is computed.
Step 303: in each of the two directions take the sub-band with the largest product of dispersion and number of matching points, compute for the matching points of each of the two sub-bands the degree to which the distance from the text surface to the lens changes in that direction, and take the direction with the larger degree of change as the text trend direction.
A sub-band used for trend determination should have as large a dispersion as possible, i.e. its points should be as spread out as possible, but it should also not contain too few points; a compromise between the two is therefore taken by using the product of the dispersion and the number of points as the criterion and selecting, among all sub-bands, the one with the largest product as the sub-band for the subsequent trend determination. The sub-bands located in the horizontal and vertical directions for the embodiment of Fig. 4a are shown in Fig. 6: the sub-band in the horizontal frame is the located horizontal sub-band and the sub-band in the vertical frame is the located vertical sub-band.
Once the sub-bands of the two directions are obtained, the degree of change of the Z coordinate along each direction is computed: with the data points of the horizontal sub-band, the degree of change of the three-dimensional coordinates along the X direction; with the data points of the vertical sub-band, the degree of change along the Y direction. The coordinate of the direction that changes more strongly is taken, together with the Z coordinate, as the two variables of the fit.
The horizontal sub-band reflects the degree of change of the distance along the X direction, denoted VarX; the vertical sub-band reflects the degree of change along the Y direction, denoted VarY. The first matching point of the located sub-band is taken as the reference point. Taking the horizontal sub-band as an example, let sumX be the accumulated change, relative to the reference point, of the distance from the text to the lens over the matching points of the band, sumV the accumulated change of their v coordinates relative to the reference point, and n the number of matching points in the band; the horizontal degree of change is then

VarX = sumX / sumV,

and the vertical degree of change VarY is obtained in the same way from the vertical sub-band.
If VarX > VarY, the Z coordinate changes along the X direction and X and Z are taken as the two variables of the curve fit; otherwise it changes along the Y direction and Y and Z are taken.
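The sketch below is a reconstruction of the sub-band based decision under the assumptions already noted: 200-pixel sub-bands, dispersion taken as the variance of the in-band coordinates, and the degree of change taken as the accumulated change of the depth Z divided by the accumulated change of the corresponding image coordinate within the best sub-band. The formulas in the source are only partly legible, so treat this as an interpretation rather than the patent's exact computation.

```python
import numpy as np

def trend_direction(uv, xyz, band=200):
    """uv: Nx2 image coordinates (u = row, v = column) in IA; xyz: Nx3 reconstructed points.
    Returns 'horizontal' (Z varies with X) or 'vertical' (Z varies with Y)."""

    def best_band(band_key, spread_key):
        # Group points into sub-bands along band_key and score each sub-band by
        # (variance of spread_key) * (number of points); return the best group.
        bins = (band_key // band).astype(int)
        best, best_score = None, -1.0
        for b in np.unique(bins):
            members = np.flatnonzero(bins == b)
            if len(members) < 2:
                continue
            score = spread_key[members].var() * len(members)
            if score > best_score:
                best, best_score = members, score
        return best

    u, v = uv[:, 0], uv[:, 1]
    horiz = best_band(u, v)   # horizontal sub-band: similar u, spread along v
    vert = best_band(v, u)    # vertical sub-band: similar v, spread along u

    def degree_of_change(members, img_coord):
        # Accumulated change of depth Z relative to the first (reference) point,
        # normalised by the accumulated change of the image coordinate.
        ref_z, ref_img = xyz[members[0], 2], img_coord[members[0]]
        num = np.abs(xyz[members, 2] - ref_z).sum()
        den = np.abs(img_coord[members] - ref_img).sum() + 1e-9
        return num / den

    var_x = degree_of_change(horiz, v)   # depth change across the horizontal sub-band
    var_y = degree_of_change(vert, u)    # depth change across the vertical sub-band
    return "horizontal" if var_x > var_y else "vertical"
```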
In the embodiment of Fig. 4a the change along the X direction is the stronger one, so X and Z are used as the two variables of the curve fit and the flag variable is set to flag = 1.
Step 4: from the obtained trend direction take the two variables for curve fitting, fit the text trend curve, discretely unroll the curve and reconstruct the text image, thereby achieving the geometric correction of the text image.
With the three-dimensional point coordinates and the text trend direction obtained in Step 3, the curve can be fitted: according to the trend direction, the X-direction or Y-direction coordinates of the corresponding three-dimensional points of the text image are selected and fitted against the Z-direction coordinates to obtain the text trend curve. Step 3 has already produced the three-dimensional point cloud; if regular, effective information reflecting the structure and bending of the text can be extracted from this discrete point cloud, the text can be flattened according to this regularity and the geometric distortion corrected.
In the embodiment of Fig. 4a, flag = 1, i.e. the text trend is the variation of the Z coordinate along X, and curve fitting gives the function f(x) describing the text trend curve. Polynomial fitting is used,

f(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_1 x + a_0,

where n is the order of the fit and a_0, …, a_n are the coefficients of the trend curve function; once the polynomial coefficients have been found, the text trend curve is determined. The trend curve of the embodiment of Fig. 4a, mirror-flipped, is shown in Fig. 7a, and the curve of the fitted function f(x) in Fig. 7b. Comparing Fig. 7b with the front-view curve of Fig. 7a shows that the fitted curve reflects the trend of the book page, i.e. the way the distance from the text surface to the lens varies along X.
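For horizontally trending text, fitting Z as a polynomial in X is a one-liner with NumPy; the order used below is only an example, since the text does not fix n.

```python
import numpy as np

def fit_trend_curve(x, z, order=4):
    """Least-squares polynomial fit z = f(x); returns f as a callable poly1d."""
    return np.poly1d(np.polyfit(x, z, order))

# Usage: f = fit_trend_curve(X_coords, Z_coords); f(0.0) gives the fitted depth at X = 0.
```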
文本图像的几何畸变校正事实上就是为了消除由于文本厚度变化、书脊存在所造成的文本离相机镜头的距离发生变化,进而进一步引起的畸变,只要消除这种距离的变化就可以达到消除畸变的目的。 The geometric distortion correction of the text image is actually to eliminate the distance between the text and the camera lens caused by the change of the thickness of the text and the presence of the spine, and further cause distortion. As long as the change of this distance is eliminated, the purpose of eliminating distortion can be achieved .
To obtain a flattened text image, the method unfolds the curve reflecting the text trend into a straight line, i.e. flattens the curved book page into a plane. Before the curve is unfolded, a base point for the unfolding must be determined: the intersection of a reference line with the text trend curve serves as the base point, and the curve length of every discrete point is computed relative to it. The present invention fixes this base point in a pre-specified way, designating the intersection of the lens optical axis with the text surface as the unfolding base point. For the embodiment shown in Figure 4a the text trends horizontally, i.e. Z varies along X, so the base point is the point of the curve at that intersection, shown in Figure 7b as the crossing of the vertical line with the curve; for vertically trending text, i.e. Z varying along Y, the corresponding point of the Y-trend curve is the unfolding base point.
Curve fitting yields the coefficients of the function expression describing the trend curve; in practice the fitted curve is handled as a set of discrete points. Unfolding the curve means computing, for every point on the curve, the curve length relative to the base point; the length is obtained with the idea of integration, the curve being discretized and the length corresponding to each discrete point accumulated.
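As a bridging note (standard calculus, not quoted from the patent), the arc length of the trend curve Z = f(X) measured from the base point X_b is

\[
s(X) \;=\; \int_{X_b}^{X} \sqrt{1 + f'(t)^2}\,\mathrm{d}t,
\]

which is what the discretization described below evaluates numerically.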
The discrete step of the curve is determined from the three-dimensional coordinates of the matching points and their corresponding image coordinates, and, starting from the base point, the curve length of each discrete point relative to the base point is computed. For the embodiment shown in Figure 4a, the fitted text trend curve gives Z as a function of X. From the three-dimensional coordinates of the matching points, the overall range of the X coordinate and the range of the corresponding image coordinate v are obtained; quantizing the spatial X coordinate to the change per pixel yields the step size. To improve the accuracy of the unfolding, this step can be reduced by a certain factor before the curve is discretely unfolded. In the embodiment of the present invention, such a reduced discrete step is used and, starting from the base point, the curve length of each discrete point at successive multiples of the step from the base point is computed.
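A minimal sketch of this discrete unfolding, assuming a polynomial trend curve, a step derived from the X range per pixel of the image coordinate v, and an illustrative refinement factor; the function name, parameter names and the factor value are not taken from the patent.

```python
import numpy as np

def unfold_curve(coeffs, x_base, x_min, x_max, v_range, refine=10):
    """Return sample positions xs and arc lengths s(xs) measured from x_base.

    coeffs  -- polynomial coefficients of the trend curve Z = f(X), highest power first
    v_range -- span (in pixels) of the corresponding image coordinate v, used to
               derive the per-pixel step; refine shrinks the step for accuracy
    """
    f = np.poly1d(coeffs)
    df = f.deriv()
    step = (x_max - x_min) / v_range / refine        # reduced discrete step
    xs = np.arange(x_min, x_max, step)
    ds = np.sqrt(1.0 + df(xs) ** 2) * step           # local arc-length elements
    s = np.cumsum(ds) - ds[0]                        # cumulative length from x_min
    s -= np.interp(x_base, xs, s)                    # re-reference so s(x_base) = 0
    return xs, s
```

Samples with negative s lie on one side of the base point and samples with positive s on the other, which is exactly the straightened position of each curve sample once the page is flattened.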
After the discrete unfolding of the curve is obtained, backward mapping is performed from the target positions given by the unfolded text trend curve: for each coordinate point of the text image on the unfolding plane, its position in the original image is computed and the pixel value is reconstructed by bilinear interpolation, yielding the geometrically corrected text image. Backward mapping avoids holes and overlaps in the reconstructed image, and bilinear interpolation avoids the noise and loss of image quality introduced by non-integer mapping. The reconstruction process is shown in Figure 8: for a matching point in the target image, the plane on which the unfolded text surface lies is known, so the three-dimensional coordinates of that point on the unfolding plane can be obtained; the discrete unfolding of the curve then gives the coordinates of this three-dimensional point on the original curved surface; combining these with the projection matrix M1 of the first lens yields the coordinates in the original image corresponding to the point in the target image, and the distortion-corrected image is reconstructed by bilinear interpolation. The correction result for the embodiment of Figure 4a is shown in Figure 9; as can be seen, the distorted text lines are essentially straightened.
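A minimal sketch of the reconstruction step, assuming the backward mapping from a target pixel on the unfolded plane to fractional coordinates in the original image (through the discrete arc lengths and the projection matrix M1) is supplied as a callable; the function and parameter names are illustrative, not from the patent.

```python
import numpy as np

def rebuild(src, target_shape, target_to_source):
    """Backward-map every target pixel into the original image and
    resample it by bilinear interpolation."""
    h, w = target_shape
    out = np.zeros((h, w), dtype=src.dtype)
    for v in range(h):
        for u in range(w):
            x, y = target_to_source(u, v)                 # backward mapping
            if not (0 <= x < src.shape[1] - 1 and 0 <= y < src.shape[0] - 1):
                continue                                  # outside the original image
            x0, y0 = int(x), int(y)
            dx, dy = x - x0, y - y0                       # bilinear weights
            out[v, u] = ((1 - dx) * (1 - dy) * src[y0, x0]
                         + dx * (1 - dy) * src[y0, x0 + 1]
                         + (1 - dx) * dy * src[y0 + 1, x0]
                         + dx * dy * src[y0 + 1, x0 + 1])
    return out
```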
The embodiments above mainly describe images captured with the camera roughly level and upright, and the three-dimensional information of the text is obtained by processing captured color or grayscale text images.
In other embodiments, Figures 10a, 11a, 12a, 13a and 14a are processed with the method of the present invention, giving the results shown in Figures 10b, 11b, 12b, 13b and 14b.
As Figures 10a and 10b show, the method corrects a plain-text image with severe bending distortion well: the curved text lines are straightened.
As Figures 11a and 11b show, the method also corrects complex layouts that mix horizontal and vertical text well; the correction process depends only on the text trend, not on whether the text is set horizontally or vertically.
As Figures 12a and 12b show, the method likewise gives good results on text images containing figures or multiple columns: as long as the content provides suitable and sufficient features, both text and pictures are corrected well.
As Figures 13a and 13b show, this page contains little text, few features and many formulas, a case in which typical content-based correction methods may fail; the present method still corrects it well.
As Figures 14a and 14b show, this is vertically trending handwritten text; the correction is not affected by the specific content and again gives a good result. The upper part of the image lies outside the overlap of the two lenses and trends vertically, so text outside the overlap region is not unfolded.
The invention also discloses a binocular stereoscopic vision system comprising a horizontal support, a first lens and a second lens, the first and second lenses being fixed side by side on the support; the first lens captures the text image to be corrected, and the field of view of the second lens covers the text image captured by the first lens.
The invention also discloses a geometric correction device for text images comprising the following modules (a structural sketch follows the list):
Set-up module: builds the binocular stereoscopic vision system, calibrates it to obtain the internal parameters of each of the two lenses and their mutual external parameters, and computes the projection matrix of each lens.
Extraction module: extracts features separately from the images captured by the two lenses of the calibrated system, performs stereo matching on the extracted features, and purifies the resulting matching points.
Trend module: computes the three-dimensional coordinates of the corresponding points in the text image from the purified matching points and the internal parameters, external parameters and projection matrices obtained by calibration, and from these determines the text trend direction in the text image.
Reconstruction module: fits the text trend curve according to the obtained trend direction, discretely unfolds it, and reconstructs the image to obtain the corrected text image.
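A structural sketch of how the four modules could be arranged in code; the class and method names are placeholders chosen for illustration, not an API defined in the patent, and the bodies are deliberately left unimplemented.

```python
class TextImageCorrector:
    """Skeleton mirroring the set-up, extraction, trend and reconstruction modules."""

    def setup(self):
        """Calibrate the two lenses; store intrinsics, extrinsics and the
        projection matrices of both lenses."""
        raise NotImplementedError

    def extract_and_match(self, img_left, img_right):
        """Extract features from both views, stereo-match them and purify
        the matching points."""
        raise NotImplementedError

    def trend(self, matches):
        """Triangulate 3-D points from the matches and the calibration data,
        then decide whether the text varies mainly along X or along Y."""
        raise NotImplementedError

    def rebuild(self, img_left, points3d, direction):
        """Fit the trend curve, unfold it discretely and reconstruct the
        corrected image by backward mapping and bilinear interpolation."""
        raise NotImplementedError
```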
The geometric correction method, device and binocular stereoscopic vision system for text images disclosed by the present invention need no template image: the captured text image is processed directly, only two ordinary cameras are required, and the system is easy to operate and inexpensive. The invention can equally be used with a template attached to the text page: by processing the template image accordingly, the trend curve is obtained and the image to be corrected is unfolded along that curve, which also achieves distortion correction. Moreover, the geometric correction method does not depend on the direction of the text lines, so both horizontally and vertically set text can be corrected, and the correction process has no direct relation to the form of the image content: whether the content is pictures, formulas, tables or text, it suffices that the text image contains a relatively large amount of feature information.
The present invention makes full use of the three-dimensional information of the text obtained from features, combines curve fitting with curve unfolding, and reconstructs the image by backward mapping and bilinear interpolation, removing the geometric distortion and yielding a corrected text image. No preprocessing such as layout analysis or binarization is required, and no template image is needed to establish correspondences; the method is unaffected by the form of the text content or the degree of distortion and therefore has a wide range of application.
These embodiments show that the text-image geometric-distortion correction technique of the present invention can handle text distortion of various contents and layouts and noticeably improves image quality. In OCR, the correction allows text that was previously difficult to segment into layout regions and lines to be processed smoothly; the whole text image is corrected uniformly, avoiding the adverse effect that distorted pictures, borders, formulas and charts would otherwise have on OCR.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN2011100070119A CN102592124B (en) | 2011-01-13 | 2011-01-13 | Text image geometric correction method, device and binocular stereo vision system
Publications (2)

Publication Number | Publication Date
---|---
CN102592124A true | 2012-07-18
CN102592124B | 2013-11-27

Family
ID=46480737

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN2011100070119A Expired - Fee Related CN102592124B (en) | Text image geometric correction method, device and binocular stereo vision system | 2011-01-13 | 2011-01-13

Country Status (1)

Country | Link
---|---
CN (1) | CN102592124B (en)
Also Published As
Publication number | Publication date
---|---
CN102592124B (en) | 2013-11-27
Legal Events
Date | Code | Title | Description
---|---|---|---
 | C06 | Publication | 
 | PB01 | Publication | 
 | C10 | Entry into substantive examination | 
 | SE01 | Entry into force of request for substantive examination | 
 | C14 | Grant of patent or utility model | 
 | GR01 | Patent grant | 
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20131127