CN101315664A - Text Image Preprocessing Method for Text Recognition - Google Patents

Info

Publication number
CN101315664A
Authority
CN
China
Prior art keywords: text, image, area, correction, foreground
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008100584515A
Other languages
Chinese (zh)
Inventor
邵玉斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CNA2008100584515A
Publication of CN101315664A
Legal status: Pending

Landscapes

  • Character Input (AREA)

Abstract

The invention is a text image preprocessing method for text recognition. It comprises two steps: geometric correction of the text image and dynamic-threshold binarization. The geometric correction step corrects the geometric distortion of the text region in a camera-captured text image and yields a rectangular region; the region may be identified automatically or specified manually. The dynamic-threshold binarization step adaptively separates the text foreground from a background of uneven brightness in a camera-captured text image. The method is characterized in that it rectifies the image to a rectangle according to the identified text region, and that it can be combined with any image blurring algorithm: the blurred image is compared against the source image, which serves as the reference image, so that the text foreground is separated from the unevenly lit background. The resulting image can be used in fields such as computer text recognition, machine vision and machine understanding. The invention adapts well to varying environments, and its algorithm is stable and reliable.

Description

Text Image Preprocessing Method for Text Recognition

Technical Field

The invention relates to the field of image processing, and specifically to a text image preprocessing method for text recognition.

Background

In applications such as text recognition, computer vision and machine understanding, separating and recognizing text in images of complex scenes is one of the main technical difficulties. Machine understanding of text builds on text recognition results, and the better the quality of the source image, the higher the recognition rate. The purpose of text image preprocessing is to give the recognition algorithm a binarized image with little geometric distortion and cleanly separated foreground text. Conventional preprocessing identifies the text region in the image, corrects that region by translation, rotation and scaling, and then binarizes the corrected result with a single global fixed threshold.

At present, geometric correction methods for text images concentrate on skew correction: the text is rotated to horizontal or vertical according to the text direction and the blank space between lines. Skew correction, however, works only for a tilted rectangular text region. In an actual photograph, perspective and nonlinear lens distortion turn an originally rectangular text region into an irregular quadrilateral or a curved-edge quadrilateral, and no convenient geometric correction method exists for these cases. For binarizing text images whose background brightness varies, a global fixed threshold does not give good results, while block-wise local thresholds introduce block boundary artifacts. In the field of text recognition preprocessing, there is currently no practical software that can effectively perform geometric correction and text foreground extraction on camera-captured text images.

Summary of the Invention

The object of the invention is to provide a convenient and practical text image preprocessing method for text recognition and machine understanding. With this method, the text region in a camera-captured text image is identified and restored to a rectangular region, and an adaptive-threshold binarization separates the text foreground from a background of varying brightness, giving the text recognition method a good source image.

The main technical problem is solved by the following technical solution:

Region correction is applied to the geometric distortion of the text region in a camera-captured text image, yielding a rectangular region; and the text foreground is adaptively separated from the unevenly lit background of the camera-captured text image.

The region correction proceeds as follows. First, the text region of the geometrically distorted image is identified automatically or specified manually, giving the boundary of the text region. Then, using the fact that laid-out text inherently occupies a rectangular region, the identified or specified text region is geometrically corrected and restored to a rectangle. For any pixel in the text image, its corresponding position in the target rectangular region is computed from its position relative to the boundary of the specified text region, which ties each pixel of the corrected image to its original position in the source image. The color or brightness value of every pixel in the corrected image is determined, with that source position as reference, by any image interpolation algorithm. The specific steps are:

1) Perform text region identification on the image whose text region was originally rectangular, and obtain the text boundary;

2) Taking the region identification result as the reference, compute the positional relationship of the source image relative to that reference;

3) Determine the color or brightness value of each corresponding pixel of the corrected image by any image interpolation algorithm, giving the geometric correction result;

4) Steps 1 to 3 may be repeated on the result of step 3 to obtain a better geometric correction through iteration.

Text foreground extraction is applied either to the camera-captured text image or to the output of the geometric distortion correction described above. The text foreground is adaptively separated from the unevenly lit background by dynamic-threshold binarization. The local background brightness is obtained by an operation between each pixel and its neighboring pixels, for which any image blurring method may be used (other operations are not excluded). A weighted value of the local background brightness serves as the binarization threshold for that neighborhood and separates its text foreground. According to the separation result, the text foreground is removed from the source image and the local background brightness is computed again, again with any image blurring method or otherwise; its weighted value then serves as a new dynamic threshold for binarizing the source image, giving a more accurate separation. This process may be iterated several times. The specific steps are:

1) Coarsely separate the text foreground with a fixed threshold;

2) Using any image blurring method, compute the local background brightness of the coarsely separated background region;

3) Apply a weighting to the local background brightness and use the result as the dynamic threshold to separate the text foreground of the original image;

4) Using the result of step 3, repeat steps 2 and 3 to obtain a more accurate separation.

Compared with the prior art, the method adapts well to the photographic environment, and its algorithm is simple, stable and reliable. An iterative procedure is also provided to improve the quality of the resulting image, which widens the range of application. In practice, the preprocessing results of this method are clearly better than those of existing text recognition preprocessing methods. The invention may be implemented in software, as the image preprocessing part or module of computer text recognition software, or in hardware or on a digital signal processing chip, as a function of embedded systems such as digital cameras, video cameras and robot vision systems.

Specific implementations of the invention are given in detail by the following embodiment and its drawings.

Description of the Drawings

Figure 1 is a schematic diagram of the text region of a linearly distorted text image that is to be corrected (the corresponding undistorted text region is a rectangle).

Here P11, P1n, Pm1 and Pmn are the four vertices of the irregular quadrilateral region, and P1k, P21, Pin and Pij are pixels at various positions in the image region.

Figure 2 is a schematic diagram of the corrected region for the text image of Figure 1.

Here P11', P1n', Pm1' and Pmn' are the corrected positions of the four vertices P11, P1n, Pm1 and Pmn in Figure 1; P1k', P21', Pin' and Pij' correspond to the points P1k, P21, Pin and Pij in Figure 1.

Figure 3 is a schematic diagram of the text region of a nonlinearly distorted text image that is to be corrected (the corresponding undistorted text region is a rectangle).

Here Pij denotes a pixel in the distorted region.

Figure 4 is a schematic diagram of the text region after correction in the vertical direction.

The pixel Pij' corresponds to the pixel Pij in Figure 3.

Figure 5 is a schematic diagram of the text region after the region of Figure 4 has been further corrected in the horizontal direction.

The pixel Pij'' corresponds to the pixel Pij' in Figure 4, and therefore to the pixel Pij in Figure 3.

Figure 6 is a one-dimensional illustration of the dynamic-threshold binarization calculation.

Here curve f represents the two-dimensional image f(x,y); curve g represents the blurred image g(x,y) obtained by an operation between neighboring pixels; curve t represents the image t(x,y) obtained by shifting g(x,y) by the weighting offset D.

Detailed Description

The text image preprocessing method of the invention is described in further detail below with reference to the embodiment shown in the drawings.

Embodiment

1. Correction of an irregular quadrilateral under linear geometric distortion

As shown in Figure 1, the text region of the image to be corrected is linearly distorted: the text, which originally occupied a rectangle, now occupies an irregular quadrilateral. Because the text region differs clearly from its surroundings at the region border, the region can be identified automatically, or set by hand after visual inspection.

To correct this distorted region into the undistorted rectangular region shown in Figure 2, the method of the invention is applied as follows:

Divide the edges of the irregular quadrilateral into pixels in the horizontal and vertical directions. For example, count the pixels traversed by segment P11P1n and call the count N1, and count the pixels traversed by segment Pm1Pmn and call the count N2; the number of horizontal divisions n is then

n = (N1 + N2)/2

Divide segments P11Pm1 and P1nPmn vertically in the same way, and let m be the number of vertical divisions.
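
The division counts above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper names `pixels_on_segment` and `division_count` are hypothetical, the pixel count of a segment is assumed to be that of a Bresenham-style digital line (the longer axis extent plus one), and rounding the average down is an assumption, since the text does not say how a fractional value is handled.

```python
def pixels_on_segment(p, q):
    """Pixels traversed by a digital line from p to q, both (x, y) integers.

    Assumption: a Bresenham-style line visits max(|dx|, |dy|) + 1 pixels.
    """
    (x0, y0), (x1, y1) = p, q
    return max(abs(x1 - x0), abs(y1 - y0)) + 1


def division_count(edge_a, edge_b):
    """Average pixel count of two opposite edges: (N1 + N2) / 2, rounded down."""
    n1 = pixels_on_segment(*edge_a)
    n2 = pixels_on_segment(*edge_b)
    return (n1 + n2) // 2


# Hypothetical corner coordinates for the top edge P11-P1n and bottom edge Pm1-Pmn.
top_edge = ((10, 12), (110, 20))
bottom_edge = ((5, 200), (125, 190))
n = division_count(top_edge, bottom_edge)
```

The same function gives m when it is called with the left edge P11-Pm1 and the right edge P1n-Pmn.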

Find the coordinates of n evenly spaced pixel positions (P11, P12, ..., P1n) on segment P11P1n, and obtain the values at these positions with any interpolation method.

Then map these points to the n points (P11', P12', ..., P1n') on segment P11'P1n' in Figure 2.

Process the n pixels on the next segment, P21P2n, in the same way.

Thus, at the i-th division, the values of the n pixels (Pij, j = 1, ..., n) on segment Pi1Pin are assigned to the pixel positions (Pij', j = 1, ..., n) on segment Pi1'Pin' in Figure 2.

This continues up to segment Pm1Pmn.

The above process may be repeated to form an iteration.
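
One pass of this row-by-row procedure might look like the sketch below. It is written under stated assumptions: the image is a list of rows of brightness values, points are (x, y) pairs, bilinear interpolation stands in for "any interpolation method", m and n are at least 2, and the function names are hypothetical. Note that evenly dividing the edges, as the text describes, gives a bilinear mapping of the quadrilateral rather than a projective (homography) one.

```python
import math


def lerp(p, q, t):
    """Point at fraction t of the way from p to q."""
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)


def sample_bilinear(image, x, y):
    """Bilinear interpolation of image (list of rows) at real-valued (x, y)."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def rectify_quadrilateral(image, p11, p1n, pm1, pmn, m, n):
    """Map the quadrilateral P11, P1n, Pm1, Pmn onto an m-by-n rectangle."""
    corrected = []
    for i in range(m):
        s = i / (m - 1)
        left = lerp(p11, pm1, s)    # Pi1 on the left edge
        right = lerp(p1n, pmn, s)   # Pin on the right edge
        row = []
        for j in range(n):
            x, y = lerp(left, right, j / (n - 1))  # Pij on segment Pi1-Pin
            row.append(sample_bilinear(image, x, y))
        corrected.append(row)
    return corrected


# Synthetic test image whose brightness is x + 10*y, so sampled values are easy to check.
source = [[x + 10 * y for x in range(8)] for y in range(8)]
result = rectify_quadrilateral(source, (1, 1), (6, 2), (2, 6), (5, 5), m=3, n=3)
```

Each output pixel Pij' is filled by looking up its source position Pij, which avoids holes in the corrected image.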

2. Correction of an irregular quadrilateral under nonlinear geometric distortion

As shown in Figure 3, the text region of the image to be corrected is nonlinearly distorted: the text, which originally occupied a rectangle, now occupies an irregular region bounded by four curves. Because the text region differs clearly from its surroundings at the region border, the region can be identified automatically, or set by hand after visual inspection. Because the curves are smooth, region identification or manual setting only needs to fix a few points on each curve; an approximating curve is then computed by an interpolation method such as spline interpolation.

To correct the distorted region bounded by these curves into the undistorted rectangular region shown in Figure 5, the method of the invention is applied as follows:

One correction iteration has two stages. The first stage is vertical correction: the boundary curves running in the vertical direction are corrected into two vertical sides of equal length, while distortion remains in the horizontal direction, as shown in Figure 4. The second stage applies horizontal correction to the result in Figure 4 and yields the rectangular region shown in Figure 5.

Both stages use the same correction algorithm. Taking the first-stage vertical correction as an example: after the vertical computation resolution m is set, the two curved sides running in the vertical direction are each divided into m segments, and corresponding division points on the two sides are joined by straight lines; each joining line is then divided into n segments according to the chosen horizontal resolution n. This gives m*n grid points in total; the point Pij in Figure 3 stands for any one of them.

By its index (i, j), the point Pij maps directly into the region of Figure 4, giving the point Pij'. This continues until i = m and j = n.

The second-stage horizontal correction is completed in the same way, giving the result in Figure 5. In the end, the point Pij of the image to be corrected is mapped to the point Pij'' in Figure 5.

The above process may be repeated to form an iteration.
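
The grid construction of the first (vertical) stage can be sketched as below. Several things here are assumptions rather than details from the text: each boundary curve is given as a polyline of (x, y) points (for instance, points sampled from a fitted spline), the curves are divided evenly by arc length, and m and n are treated as numbers of grid points (at least 2) so that the grid has m*n entries. The function names are hypothetical.

```python
import math


def resample_polyline(points, count):
    """Return `count` points spaced evenly by arc length along a polyline."""
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cumulative[-1]
    resampled = []
    seg = 0
    for k in range(count):
        target = total * k / (count - 1)
        # Advance to the polyline segment that contains the target length.
        while seg < len(points) - 2 and cumulative[seg + 1] < target:
            seg += 1
        span = cumulative[seg + 1] - cumulative[seg]
        t = 0.0 if span == 0 else (target - cumulative[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled


def stage_one_grid(left_curve, right_curve, m, n):
    """Grid of source positions Pij: m points per side curve, n per joining line."""
    left = resample_polyline(left_curve, m)
    right = resample_polyline(right_curve, m)
    grid = []
    for (lx, ly), (rx, ry) in zip(left, right):
        row = [(lx + (rx - lx) * j / (n - 1), ly + (ry - ly) * j / (n - 1))
               for j in range(n)]
        grid.append(row)
    return grid


# Hypothetical bowed left and right boundaries of a text region.
left_side = [(0, 0), (1, 2), (0, 4)]
right_side = [(8, 0), (9, 2), (8, 4)]
grid = stage_one_grid(left_side, right_side, m=3, n=5)
```

Sampling the source image at `grid[i][j]` and writing the value to row i, column j of the output gives the Figure 4 result; the second stage would repeat the construction with the top and bottom curves of that intermediate image.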

3. Dynamic-threshold binarization for separating the text

As shown in Figure 6, a one-dimensional curve illustrates the brightness variation of a two-dimensional image. Let the parts of the curve with large values (the peaks) be the text in the image, and the parts with smaller values (the troughs) be the background. Note that the background values vary, meaning that the background brightness changes from region to region. With f(x,y) denoting the source image, the idea of dynamic-threshold binarization is to construct a threshold that follows the local background brightness of the image, so that the text peaks are separated from the background. To this end, the method of the invention is:

Apply a blurring operation of any form, that is, a low-pass filter, to the source image f(x,y). The resulting blurred image g(x,y) represents the brightness of local regions of the source image. The invention uses the weighted value of the blurred image g(x,y),

t(x,y) = g(x,y) + D

as the dynamic threshold and binarizes the source image f(x,y), that is,

H(x,y) = 255 when f(x,y) > t(x,y)

H(x,y) = 0 when f(x,y) < t(x,y)
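
A single pass of this thresholding might be sketched as follows. The box (mean) blur is only one possible choice for the "any form" of blurring the text allows, the parameter names `radius` and `offset` (the latter playing the role of D) are hypothetical, and pixels with f(x,y) equal to t(x,y), a case the text leaves undefined, are assumed to map to 0.

```python
def box_blur(image, radius):
    """Mean of each pixel's (2*radius + 1)-square neighborhood, clipped at the borders."""
    h, w = len(image), len(image[0])
    blurred = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += image[yy][xx]
                    count += 1
            blurred[y][x] = total / count
    return blurred


def dynamic_threshold(image, radius, offset):
    """H = 255 where f > g + D, else 0, with g the blurred image and D the offset."""
    g = box_blur(image, radius)
    return [[255 if image[y][x] > g[y][x] + offset else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]


# Background brightens from left to right; two brighter "text" pixels sit on top of it.
f = [[20 * x for x in range(10)] for _ in range(5)]
f[2][1] += 60
f[2][8] += 60
H = dynamic_threshold(f, radius=1, offset=15)
```

In this example no single global threshold separates both text pixels, because the text pixel on the dark side (80) is darker than the background on the bright side (up to 180), while the threshold that follows the blurred image picks out exactly the two text pixels.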

An iterative procedure can further improve the separation of the text:

According to the separation result, remove the text (the part where H(x,y) = 255) from the source image f(x,y), that is, replace the text pixels with the brightness of the adjacent background, giving the background image f'(x,y). Applying the blurring method above to f'(x,y) gives a new blurred image g'(x,y), which after weighting serves as the new dynamic threshold for binarizing the source image.
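
One refinement step could be sketched as below. As a simplifying assumption, the two operations of replacing text pixels with adjacent background brightness and then blurring are fused into a single masked mean that skips text pixels inside the averaging window; the function and parameter names are hypothetical.

```python
def background_brightness(image, text_mask, radius):
    """Mean of the non-text pixels around each position (text pixels are skipped)."""
    h, w = len(image), len(image[0])
    estimate = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if text_mask[yy][xx] == 0:
                        total += image[yy][xx]
                        count += 1
            # Assumption: if the window holds only text, fall back to the pixel itself.
            estimate[y][x] = total / count if count else float(image[y][x])
    return estimate


def refine_binarization(image, text_mask, radius, offset):
    """Re-threshold the source image against the text-free background estimate."""
    g_new = background_brightness(image, text_mask, radius)
    return [[255 if image[y][x] > g_new[y][x] + offset else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]


# Source image and a first-pass result marking two text pixels.
f = [[20 * x for x in range(10)] for _ in range(5)]
f[2][1] += 60
f[2][8] += 60
first_pass = [[0] * 10 for _ in range(5)]
first_pass[2][1] = 255
first_pass[2][8] = 255

g_new = background_brightness(f, first_pass, radius=1)
second_pass = refine_binarization(f, first_pass, radius=1, offset=15)
```

Because the text pixels no longer raise the local mean, the background estimate under each text pixel matches the surrounding gradient (20 and 160 in this example), which is the effect the refinement aims for.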

4. Iteration stopping condition

In both methods above (geometric correction and dynamic thresholding), iteration stops when the mean square of the difference between the results of two consecutive iterations falls below a set threshold, or when the number of iterations exceeds the specified maximum.
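
The stopping rule can be sketched as a small driver loop. The names `iterate_until_stable`, `step`, `tolerance` and `max_iterations` are hypothetical, images are assumed to be equally sized lists of rows, and treating the initial image as the result of "iteration 0" for the first comparison is an assumption.

```python
def mean_square_difference(a, b):
    """Mean of squared pixel differences between two equally sized images."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += (va - vb) ** 2
            count += 1
    return total / count


def iterate_until_stable(step, initial, tolerance, max_iterations):
    """Apply `step` repeatedly; stop on a small change or at the iteration cap.

    Returns the last result and the number of iterations performed.
    """
    previous = initial
    for iteration in range(1, max_iterations + 1):
        current = step(previous)
        if mean_square_difference(current, previous) < tolerance:
            return current, iteration
        previous = current
    return previous, max_iterations


def halve(image):
    """Toy step that halves every value, so successive changes shrink."""
    return [[v / 2 for v in row] for row in image]


stable, used = iterate_until_stable(halve, [[8.0, 4.0]], tolerance=1.0, max_iterations=10)
capped, capped_used = iterate_until_stable(halve, [[8.0, 4.0]], tolerance=1.0, max_iterations=2)
```

In practice `step` would be one geometric correction pass or one threshold refinement pass.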

The above is only a preferred embodiment of the invention and does not limit the invention in any form. Any simple modification, equivalent change or refinement made to the above embodiment in accordance with the technical essence of the invention still falls within the scope of the technical solution of the invention.

Claims (4)

1. A text image preprocessing method for text recognition, characterized in that: region correction is applied to the geometric distortion of the text region in a camera-captured text image, yielding a rectangular region; and the text foreground is adaptively separated from the unevenly lit background of the camera-captured text image.

2. The text image preprocessing method for text recognition according to claim 1, characterized in that the region correction of the geometric distortion of the text region in the camera-captured text image proceeds as follows: first, the text region of the geometrically distorted image is identified automatically or specified manually, giving the boundary of the text region; then, using the fact that laid-out text inherently occupies a rectangular region, the identified or specified text region is geometrically corrected and restored to a rectangle; for any pixel in the text image, its corresponding position in the target rectangular region is computed from its position relative to the boundary of the specified text region, which ties each pixel of the corrected image to its original position in the source image. The color or brightness value of every pixel in the corrected image is determined, with that source position as reference, by any image interpolation algorithm. The specific steps are:

1) Perform text region identification on the image whose text region was originally rectangular, and obtain the text boundary;

2) Taking the region identification result as the reference, compute the positional relationship of the source image relative to that reference;

3) Determine the color or brightness value of each corresponding pixel of the corrected image by any image interpolation algorithm, giving the geometric correction result;

4) Steps 1 to 3 may be repeated on the result of step 3 to obtain a better geometric correction through iteration.

3. The text image preprocessing method for text recognition according to claim 1, characterized in that: text foreground extraction is applied to the camera-captured text image or to the result image produced by the text region geometric distortion correction method of claim 1, and the text foreground is adaptively separated from the unevenly lit background by dynamic-threshold binarization: the local background brightness is obtained by an operation between each pixel and its neighboring pixels, for which any image blurring method may be used (other operations are not excluded); a weighted value of the local background brightness serves as the binarization threshold for that neighborhood and separates its text foreground; according to the separation result, the text foreground is removed from the source image and the local background brightness is computed again, again with any image blurring method or otherwise; its weighted value then serves as a new dynamic threshold for binarizing the source image, giving a more accurate separation; this process may be iterated several times. The specific steps are:

1) Coarsely separate the text foreground with a fixed threshold;

2) Using any image blurring method, compute the local background brightness of the coarsely separated background region;

3) Apply a weighting to the local background brightness and use the result as the dynamic threshold to separate the text foreground of the original image;

4) Using the result of step 3, repeat steps 2 and 3 to obtain a more accurate separation.

4. The text image preprocessing method for text recognition according to claim 1, characterized in that the geometric correction method includes both automatic region identification and manual region specification.
CNA2008100584515A 2008-05-27 2008-05-27 Text Image Preprocessing Method for Text Recognition Pending CN101315664A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008100584515A CN101315664A (en) 2008-05-27 2008-05-27 Text Image Preprocessing Method for Text Recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2008100584515A CN101315664A (en) 2008-05-27 2008-05-27 Text Image Preprocessing Method for Text Recognition

Publications (1)

Publication Number Publication Date
CN101315664A true CN101315664A (en) 2008-12-03

Family

ID=40106671

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008100584515A Pending CN101315664A (en) 2008-05-27 2008-05-27 Text Image Preprocessing Method for Text Recognition

Country Status (1)

Country Link
CN (1) CN101315664A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20081203