CN104866850B - A kind of optimization method of text image binaryzation - Google Patents
- Publication number
- CN104866850B (application CN201510257271.XA)
- Authority
- CN
- China
- Prior art keywords
- pixels
- pixel
- image
- row
- binary image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an optimization method for text image binarization, with the following technical effects. (1) It provides a way to optimize an existing binarization result. Existing binarization algorithms differ in behavior and accuracy across text images with different types of degradation; the invention performs a secondary optimization on top of an existing binary image, retaining the strengths of the underlying binarization algorithm while further improving its accuracy. (2) It proposes classifying all pixels in a region, row, or column according to the percentage of pixels of a chosen feature class within that region, row, or column. This is not limited to the present invention: in many other situations that require fine-grained classification, once preliminary classification information is available, the same approach can be used for a secondary classification.
Description
Technical field
The invention relates to a binarization optimization method for text images and belongs to the field of image processing.
Background
The volume of paper documents keeps growing over time. They take up more and more storage space and are inconvenient to search, so they need to be digitized and stored for easier dissemination, management, and use. Digitizing a text image involves steps such as character segmentation and recognition, and the image must be binarized before these steps are performed. The accuracy of binarization directly determines whether the subsequent analysis and recognition steps succeed, so binarization plays a crucial role. Binarizing a text image means converting a grayscale image of the text into a binary image containing only black and white, that is, dividing the image into foreground and background. Many text images come from old documents, for which image degradation is hard to avoid. Degradation has many causes, such as how the image was acquired, the environment in which the document was stored, and how long it was stored. Severe degradation makes the foreground and background of a text image highly similar and hard to tell apart, so binarizing text images accurately has long been a difficult problem.
Binarization algorithms have advanced considerably over the past decade or so. Historical text images, however, are usually severely degraded and of poor quality. Degradation takes many forms, such as illumination changes, stains, creases, and ink bleeding through from the reverse side, and existing binarization algorithms behave differently and reach different accuracy on each type. This work therefore aims to improve the adaptability of existing binarization algorithms to the various types of degraded text by performing a secondary optimization on the binary image that an existing algorithm produces, further improving binarization accuracy.
Summary of the invention
The purpose of the invention is to provide an optimization method for text image binarization.
The technical solution of the invention is as follows.
An optimization method for text image binarization, comprising the following steps:
Step 1:
The binary image obtained by applying a binarization algorithm to the original text image is used as the initial binary image.
The k-means algorithm is applied to the original text image, clustering the pixel values of all pixels in the image into k classes. This yields an image labeled with k pixel sets $\{I_1, I_2, \dots, I_k\}$, in which every pixel is assigned to one of the classes 1 to k. The mean $A_i$ of all pixel values in each set $I_i$ is computed, and $I_{min}$ denotes the pixel set with the smallest mean pixel value.
Step 2:
The R. M. Haralick connected region detection algorithm is used to label every separate, closed connected region in the initial binary image. A connected region is a maximal connected subset of the image, in which any two adjacent pixels $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ satisfy:
$1 \le (x_1 - x_2)^2 + (y_1 - y_2)^2 \le 2 \quad (1)$
This yields an image of the same size as the initial binary image, labeled with connected regions 1 to m.
Step 3:
First, from the connected regions 1 to m labeled in step 2, take any region labeled j ($1 \le j \le m$). Let $S_j$ be the total number of pixels in region j and let $m_j$ be the number of $I_{min}$-class pixels it contains. The pixels in region j are reclassified by the following rule, where $Q_j(x)$ denotes all pixels in region j: [formula missing in the source text]
After the pixels of every connected region 1 to m have been classified by this rule, some regions wrongly classified as foreground are removed, giving the binary image $B_2$.
Next, connected regions are detected again on $B_2$, giving a map of regions labeled 1 to n. Take any region labeled j ($1 \le j \le n$). Every image is a two-dimensional pixel matrix in which each pixel has its own row and column indices. Suppose the pixels of region j span row indices p to q. For row f ($p \le f \le q$) of region j, count the number of $I_{min}$-class pixels $m_{jf}$ and the total number of pixels $S_{jf}$, and reclassify the pixels of row f by the following rule, where $Q_{jf}(x)$ denotes all pixels of row f in the region: [formula missing in the source text]
While each row of region j is judged with this rule, the class assigned to the row is recorded in an array $a_{jf}$: if the row is classified as background, 0 is stored for that row; otherwise 1 is stored. This is expressed by the following formula: [formula missing in the source text]
Each entry of $a_{jf}$ corresponds to one row in p to q, so once the pixels of a connected region have been classified row by row, the array $a_{jf}$ serves as an additional condition: every row whose entry is 0 and lies between two entries equal to 1 is relabeled as foreground. All rows of the connected regions 1 to n are processed in this way. Finally, connected regions are detected once more and the columns of each resulting region are classified by the same method, giving the intermediate binary image.
Step 4:
A fusion rule proposed by Su is used: for the initial and intermediate binary images, pixels classified as foreground in both or as background in both are considered correctly classified, and the remaining pixels, on which the two images disagree, are classified as undetermined, as shown in the following formula: [formula (5) missing in the source text]
K(x) denotes a pixel of the original image and $N_i(x)$ the value of the pixel at the position of K(x) in the initial and intermediate binary images. Undetermined pixels are classified by the following formula: [formula (6) missing in the source text]
In formula (6), J(x) denotes an undetermined pixel; Con(x) and I(x) are the contrast value and pixel value of J(x); $Con_F$ and $I_F$ are the mean contrast and mean gray value of the foreground pixels in a local window centered on J(x); $Con_B$ and $I_B$ are the mean contrast and mean gray value of the background pixels in that window. [formula (7) missing in the source text]
In formula (7), (x, y) are the row and column coordinates of a pixel in the original image, Con(x, y) is the contrast value of the pixel, I(x, y) is the gray value of the original image at (x, y), and $f_{max}(x, y)$ is the maximum pixel value in a 10×10 pixel window centered on (x, y). ε is a small positive constant that prevents the denominator from being zero.
Let $B_3$ be the intermediate binary image and $B_1$ the initial binary image. $B_1$ is chosen as the initial iteration image and combined with $B_3$ using formula (5). For each undetermined pixel that results, a 3×3 window centered on it is taken; when the window contains at least one foreground or background pixel, the undetermined pixel is classified by the condition for J(x) in formula (6), and processing moves on to the next undetermined pixel. After all undetermined pixels have been classified, the resulting image is compared with the iteration image $B_1$. If the two differ, the result becomes the initial image of the second iteration and is again combined with $B_3$. These steps are repeated until the image is essentially unchanged, giving the final optimized binary image.
The invention has the following technical effects. (1) It provides a way to optimize an existing binarization result. Existing binarization algorithms differ in behavior and accuracy across text images with different types of degradation; the invention performs a secondary optimization on top of an existing binary image, retaining the strengths of the underlying binarization algorithm while further improving its accuracy. (2) It proposes classifying all pixels in a region, row, or column according to the percentage of pixels of a chosen feature class within that region, row, or column. This is not limited to the present invention: in many other situations that require fine-grained classification, once preliminary classification information is available, the same approach can be used for a secondary classification.
Brief description of the drawings
Figure 1 is the detailed flow chart of the secondary optimization method for text image binarization.
Figure 2 is an original text image.
Figure 3 is the binarization of the original text image in Figure 2 by the Otsu algorithm.
Figure 4 is the binary image after optimization by the invention.
Figure 5 is the ground-truth binary image of the original text image.
Figure 6 is an original text image.
Figure 7 is the binarization of the original text image in Figure 6 by the Lelore algorithm.
Figure 8 is the binary image of that text image after optimization by the invention.
Figure 9 is the ground-truth binary image of the original text image.
Detailed description
As shown in Figure 1, an existing binarization algorithm produces the initial binary image, on which connected regions are detected. The original text image is clustered with the k-means algorithm and the class of pixels with the smallest mean pixel value is marked. Using this class as the criterion, the pixels of each connected region are reclassified, first region-wide and then locally, to optimize the binary image. The steps are carried out as follows.
Step 1: The binary image obtained by applying an existing binarization algorithm to the original text image is used as the initial binary image. In parallel, the original text image is clustered with the k-means algorithm and the class of pixels with the smallest mean pixel value is marked.
The k-means algorithm is applied to the original text image, clustering the pixel values of all pixels in the image into k classes. This yields an image labeled with k pixel sets $\{I_1, I_2, \dots, I_k\}$, in which every pixel is assigned to one of the classes 1 to k. The mean $A_i$ of all pixel values in each set $I_i$ is computed, and $I_{min}$ denotes the pixel set with the smallest mean pixel value.
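Step 1 can be sketched in plain Python as follows. This is a minimal illustration, not the patent's MATLAB implementation: the function names `kmeans_1d` and `darkest_class_mask`, the evenly spaced initial centres, and the iteration cap are all assumptions made for the example.

```python
def kmeans_1d(values, k, iters=50):
    """Cluster scalar pixel values into k classes with plain Lloyd iterations."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres spread evenly over the value range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def darkest_class_mask(gray, k=3):
    """Return a boolean mask that is True where a pixel belongs to I_min."""
    width = len(gray[0])
    flat = [v for row in gray for v in row]
    labels, _ = kmeans_1d(flat, k)
    sums, counts = [0.0] * k, [0] * k
    for v, lab in zip(flat, labels):
        sums[lab] += v
        counts[lab] += 1
    # A_i: mean pixel value of each non-empty class; I_min has the smallest mean.
    means = {c: sums[c] / counts[c] for c in range(k) if counts[c]}
    i_min = min(means, key=means.get)
    return [[labels[r * width + c] == i_min for c in range(width)]
            for r in range(len(gray))]
```

Calling `darkest_class_mask(gray, k=3)` on a grayscale image stored as a list of rows marks the darkest cluster, which the later steps treat as the most text-like class.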
Step 2: Connected regions in the initial binary image are labeled with a connected region detection algorithm, giving a label map with m distinct connected regions.
Connected region detection labels every separate, closed connected region in a binary image. A connected region is a maximal connected subset of the image: between any two pixels of a connected set there is a path consisting entirely of elements of the set, and all pixels in the set are foreground pixels. There are two criteria for deciding whether two pixels are connected. If two pixels $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ satisfy:
$1 \le (x_1 - x_2)^2 + (y_1 - y_2)^2 \le 2 \quad (1)$
then $P_1$ and $P_2$ are eight-adjacent, meaning that each pixel has 8 neighbors. If two pixels $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ satisfy:
$(x_1 - x_2)^2 + (y_1 - y_2)^2 = 1 \quad (2)$
then $P_1$ and $P_2$ are four-adjacent, meaning that each pixel has 4 neighbors. Eight-connectivity is generally used because it matches human visual perception better. Connected region labeling algorithms have matured over decades of development, so any fast and accurate one will do. Here the classic connected region detection algorithm of R. M. Haralick is applied to the initial binary image, giving an image of the same size as the initial binary image labeled with m distinct connected regions.
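The eight-adjacency criterion of formula (1) can be illustrated with a small labeling routine. The sketch below uses a breadth-first flood fill, which is a substitute chosen for brevity and is not Haralick's algorithm; the name `label_components` and the convention that 1 marks foreground are assumptions.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected foreground regions; returns (label map, region count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]  # 0 means background / unlabeled
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            # Offsets with 1 <= dy^2 + dx^2 <= 2 are the 8 neighbours.
                            if dy == 0 and dx == 0:
                                continue
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count
```

Restricting the offsets to those with `dy * dy + dx * dx == 1` would give the four-adjacency of formula (2) instead.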
Step 3: First, all pixels in each separate connected region from step 2 are reclassified, region by region, as background or foreground. Next, connected regions are detected again, and within each region every row of pixels is classified as background or foreground; as the rows of a region are classified, an array is built that serves as an additional condition for classifying the rows of that region. Finally, connected regions are detected once more and every column of each region is processed in the same way as the rows, giving the intermediate binary image.
First, from the connected regions 1 to m labeled in step 2, take any region labeled j ($1 \le j \le m$). Let $S_j$ be the total number of pixels in region j and let $m_j$ be the number of $I_{min}$-class pixels it contains. The pixels in region j are reclassified by the following rule, where $Q_j(x)$ denotes all pixels in region j: [formula missing in the source text]
After the pixels of every connected region 1 to m have been classified by this rule, some regions wrongly classified as foreground are removed, giving the binary image $B_2$.
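The region-level reclassification can be sketched as follows. The rule's formula is missing from this text, so the comparison of the ratio m_j / S_j against a threshold is an assumption, inferred from the stated use of a percentage criterion and from the value T1 = 0.2 given in the embodiment; the direction and strictness of the inequality are likewise assumed, and `reclassify_regions` is a hypothetical name.

```python
def reclassify_regions(labels, count, imin_mask, t1=0.2):
    """Keep a region as foreground only if enough of its pixels are I_min-class.

    labels    -- label map with 0 for background and 1..count for regions
    imin_mask -- boolean map, True where the pixel belongs to I_min
    t1        -- assumed threshold on the ratio m_j / S_j
    """
    h, w = len(labels), len(labels[0])
    size = [0] * (count + 1)  # S_j: pixels in region j
    dark = [0] * (count + 1)  # m_j: I_min-class pixels in region j
    for r in range(h):
        for c in range(w):
            j = labels[r][c]
            if j:
                size[j] += 1
                if imin_mask[r][c]:
                    dark[j] += 1
    # Assumed rule: region j stays foreground when m_j / S_j >= T1.
    keep = [size[j] > 0 and dark[j] >= t1 * size[j] for j in range(count + 1)]
    keep[0] = False
    return [[1 if keep[labels[r][c]] else 0 for c in range(w)] for r in range(h)]
```

The returned image plays the role of B2: regions with too few dark pixels are turned into background as a whole, while kept regions are left untouched.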
Next, connected regions are detected again on $B_2$, giving a map of regions labeled 1 to n. Take any region labeled j ($1 \le j \le n$). Every image is a two-dimensional pixel matrix in which each pixel has its own row and column indices. Suppose the pixels of region j span row indices p to q. For row f ($p \le f \le q$) of region j, count the number of $I_{min}$-class pixels $m_{jf}$ and the total number of pixels $S_{jf}$, and reclassify the pixels of row f by the following rule, where $Q_{jf}(x)$ denotes all pixels of row f in the region: [formula missing in the source text]
While each row of region j is judged with this rule, the class assigned to the row is recorded in an array $a_{jf}$: if the row is classified as background, 0 is stored for that row; otherwise 1 is stored. This is expressed by the following formula: [formula missing in the source text]
Each entry of $a_{jf}$ corresponds to one row in p to q, so once the pixels of a connected region have been classified row by row, the array $a_{jf}$ serves as an additional condition: every row whose entry is 0 and lies between two entries equal to 1 is relabeled as foreground. All rows of the connected regions 1 to n are processed in this way. Finally, connected regions are detected once more and the columns of each resulting region are classified by the same method, giving the intermediate binary image.
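The row-level pass and the role of the flag array can be sketched as below for a single region. As with the region rule, the row formula is missing from this text, so the test `m_jf / S_jf >= T2` is an assumption based on the percentage criterion and on T2 = 0.2 from the embodiment; `row_flags` and its arguments are hypothetical names.

```python
def row_flags(dark_counts, totals, t2=0.2):
    """Build the flag array a_jf for rows p..q of one connected region.

    dark_counts -- m_jf for each row: number of I_min-class pixels
    totals      -- S_jf for each row: number of region pixels
    t2          -- assumed threshold on the ratio m_jf / S_jf
    """
    # Assumed row rule: 1 (foreground) when m_jf / S_jf >= T2, else 0.
    flags = [1 if s > 0 and m >= t2 * s else 0
             for m, s in zip(dark_counts, totals)]
    # Additional condition: rows flagged 0 that lie between two rows
    # flagged 1 are relabeled as foreground.
    ones = [i for i, a in enumerate(flags) if a == 1]
    if ones:
        for i in range(ones[0], ones[-1] + 1):
            flags[i] = 1
    return flags
```

For a region whose rows yield raw flags `[0, 1, 0, 0, 1, 0]`, the two interior zeros are filled, so only the first and last rows end up as background. The column pass would apply the same function to per-column counts.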
Step 4: In the intermediate binary image, the edges of some characters may be neither smooth nor complete. Fusing the intermediate binary image with the initial binary image makes the character edges more complete and smoother and gives the final optimized binary image.
To make the character edges in the intermediate binary image more complete and smoother, a fusion rule proposed by Su is used: for two binary images, pixels classified as foreground in both or as background in both are considered correctly classified, and the remaining pixels, on which the two images disagree, are classified as undetermined, as shown in the following formula: [formula (6) missing in the source text]
K(x) denotes a pixel of the original image and $B_i(x)$ the value of the pixel at the position of K(x) in the initial and intermediate binary images. Undetermined pixels are classified by formula (7): [formula (7) missing in the source text]
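The fusion rule as described in words can be sketched as follows. The constants `FOREGROUND`, `BACKGROUND`, and `UNDETERMINED` and the function name `fuse` are assumptions made for the example; the formula itself is not available in this text, so the code follows the prose description only.

```python
FOREGROUND, BACKGROUND, UNDETERMINED = 1, 0, -1


def fuse(b1, b3):
    """Combine two binary images pixel by pixel.

    Pixels on which both images agree keep their common class;
    pixels on which they disagree are marked UNDETERMINED.
    """
    return [[a if a == b else UNDETERMINED for a, b in zip(row1, row3)]
            for row1, row3 in zip(b1, b3)]
```

For example, `fuse([[1, 0], [1, 0]], [[1, 0], [0, 1]])` keeps the first row as it is and marks both pixels of the second row as undetermined.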
In formula (8) [missing in the source text], (x, y) are the row and column coordinates of a pixel in the original image, Con(x, y) is the contrast value of the pixel, I(x, y) is the gray value of the original image at (x, y), and $f_{max}(x, y)$ is the maximum pixel value in a 10×10 pixel window centered on (x, y). ε is a small positive constant that prevents the denominator from being zero.
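A contrast map built from the quantities named above might look like the sketch below. The formula is missing from this text, so the expression `(f_max - I) / (f_max + eps)` is only an assumed form that uses the listed symbols and has ε in the denominator; the placement of an even-sized window around the pixel and the name `contrast_map` are also assumptions.

```python
def contrast_map(gray, window=10, eps=1e-6):
    """Per-pixel contrast from the local maximum in a window around the pixel.

    Assumed form: Con(x, y) = (f_max(x, y) - I(x, y)) / (f_max(x, y) + eps).
    """
    h, w = len(gray), len(gray[0])
    # An even window has no exact centre; assume 5 pixels before, 4 after.
    before, after = window // 2, window - window // 2 - 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rows = range(max(0, y - before), min(h, y + after + 1))
            cols = range(max(0, x - before), min(w, x + after + 1))
            f_max = max(gray[r][c] for r in rows for c in cols)
            out[y][x] = (f_max - gray[y][x]) / (f_max + eps)
    return out
```

Under this assumed form, pixels much darker than the brightest pixel nearby get a contrast close to 1, and pixels as bright as the local maximum get 0.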
J(x) denotes an undetermined pixel in formula (7); Con(x) and I(x) are the contrast value and pixel value of J(x). $Con_F$ and $I_F$ are the mean contrast and mean gray value of the foreground pixels in a local window centered on J(x); $Con_B$ and $I_B$ are the mean contrast and mean gray value of the background pixels in that window. Let $B_3$ be the intermediate binary image and $B_1$ the initial binary image. $B_1$ is chosen as the initial iteration image and combined with $B_3$ using formula (6). For each undetermined pixel that results, a 3×3 window centered on it is taken; when the window contains at least one foreground or background pixel, the undetermined pixel is classified by the condition for J(x) in formula (7), and processing moves on to the next undetermined pixel. After all undetermined pixels have been classified, the resulting image is compared with the iteration image $B_1$. If the two differ, the result becomes the initial image of the second iteration and is again combined with $B_3$. These steps are repeated until the image is essentially unchanged, giving the final optimized binary image.
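Classifying one undetermined pixel from its 3×3 neighbourhood can be sketched as below. The decision formula is missing from this text, so the comparison used here (the pixel joins whichever class has local means of gray value and contrast closer to its own, measured by summed absolute differences) is an assumption, as is the handling of windows that contain only one decided class. The name `classify_undetermined` and the value -1 for undetermined pixels are hypothetical.

```python
def classify_undetermined(fused, gray, con, y, x):
    """Return 1 (foreground), 0 (background), or None if nothing is decided nearby.

    fused -- map with 1 = foreground, 0 = background, -1 = undetermined
    gray  -- gray values I(x); con -- contrast values Con(x)
    """
    h, w = len(fused), len(fused[0])
    fg, bg = [], []
    for r in range(max(0, y - 1), min(h, y + 2)):
        for c in range(max(0, x - 1), min(w, x + 2)):
            if fused[r][c] == 1:
                fg.append((gray[r][c], con[r][c]))
            elif fused[r][c] == 0:
                bg.append((gray[r][c], con[r][c]))
    if not fg and not bg:
        return None  # no decided pixel in the window yet
    if not fg or not bg:
        return 1 if fg else 0  # assumption: follow the only class present

    def distance(samples):
        i_mean = sum(s[0] for s in samples) / len(samples)  # I_F or I_B
        c_mean = sum(s[1] for s in samples) / len(samples)  # Con_F or Con_B
        return abs(gray[y][x] - i_mean) + abs(con[y][x] - c_mean)

    return 1 if distance(fg) < distance(bg) else 0
```

An outer loop would apply this to every undetermined pixel, compare the result with the previous iteration image, and stop when nothing changes or an iteration cap is reached.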
Building on an existing binarization result, the invention improves the adaptability of a binarization algorithm to various types of images, is applicable to a variety of existing binarization algorithms, and effectively improves binarization accuracy, bringing a further improvement in both binarization metrics and visual quality.
Specific embodiment
The text images come from the image library provided by the DIBCO competition, as shown in Figures 2 and 6, because that library supplies ground-truth binary images against which the accuracy of a binarization algorithm can be measured. The algorithms compared are the Otsu method and the Lelore algorithm. Otsu, whose result is shown in Figure 3, is a classic binarization algorithm that laid the groundwork for later binarization methods and is routinely used as a baseline in binarization papers. The 2013 Lelore algorithm achieved very good results in the DIBCO competition. Choosing these two as the subjects of optimization therefore illustrates the effect of the invention well.
This embodiment was carried out on a 64-bit PC running Windows 7, using MATLAB 2013 as the programming environment. The text image in Figure 2 (text image 1) is a PNG image of 1078×2477 pixels. The text image in Figure 6 (text image 2) is a PNG image of 2278×870 pixels. In this embodiment T1 and T2 are both set to 0.2, the maximum number of iterations in step 4 is 20, and the number of classes k in the k-means clustering is 3. First, text image 1 in Figure 2 is binarized with the Otsu algorithm, giving the Otsu binary image in Figure 3. That image is then optimized with the four steps described above and the parameters listed, giving the optimized binary image in Figure 4. Likewise, the text image in Figure 6 is binarized with the Lelore algorithm, giving the Lelore binary image in Figure 7, which is then optimized in the same way, giving the optimized binary image in Figure 8.
Figure 3 is the Otsu binarization of the text image in Figure 2, Figure 4 is the optimized binary image, and Figure 5 is the ground-truth binary image provided by DIBCO. The Otsu result shows that, as a global-threshold method, the Otsu algorithm is very sensitive to illumination changes, and that for degraded text images with even a slightly complex background a large amount of background is retained as text. After the secondary optimization, much of the background noise is removed and binarization accuracy improves substantially.
Figure 7 is the Lelore binarization of the text image in Figure 6, Figure 8 is the optimized binary image, and Figure 9 is the ground-truth binary image provided by DIBCO. The Lelore result shows that the algorithm already binarizes the characters very accurately, but it is overly sensitive to background pixels that closely resemble characters and has difficulty telling them apart, so a large number of non-character areas are kept as foreground. In the optimized binary image these pixels outside the characters have been removed while the characters themselves are unaffected, so binarization accuracy improves to some extent and the visual quality improves considerably.
These two experiments show that the invention does improve binarization results: it removes wrongly classified foreground pixels without affecting the characters themselves, raising binarization accuracy while also improving the visual quality of the binary image. It also works with different algorithms, showing good adaptability.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510257271.XA CN104866850B (en) | 2015-05-13 | 2015-05-13 | A kind of optimization method of text image binaryzation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104866850A | 2015-08-26 |
CN104866850B | 2018-11-02 |
Family
ID=53912671
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201510257271.XA | Active | CN104866850B (en) | 2015-05-13 | 2015-05-13 | A kind of optimization method of text image binaryzation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104866850B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107437256B (en) * | 2016-05-27 | 2020-06-02 | 合肥美亚光电技术股份有限公司 | Automatic dividing method and system applied to tire X-ray image channel |
CN106097375B (en) * | 2016-06-27 | 2018-11-23 | 湖南大学 | A kind of the folding line detection method and device of scan image |
CN106886987B (en) * | 2017-03-23 | 2019-05-24 | 重庆大学 | A kind of train license plate binary image interfusion method |
CN108021648B (en) * | 2017-11-30 | 2020-09-04 | 广东小天才科技有限公司 | Method, device and intelligent terminal for searching questions |
CN114627146B (en) * | 2022-03-15 | 2024-09-24 | 平安科技(深圳)有限公司 | Image processing method, device, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5915037A (en) * | 1994-03-31 | 1999-06-22 | Licentia Patent-Verwaltungs Gmbh | Method and device for the binarization of pixel data |
CN1991865A (en) * | 2005-12-29 | 2007-07-04 | 佳能株式会社 | Device, method, program and media for extracting text from document image having complex background |
CN101149790A (en) * | 2007-11-14 | 2008-03-26 | 哈尔滨工程大学 | Chinese printed formula recognition method |
CN104573685A (en) * | 2015-01-29 | 2015-04-29 | 中南大学 | Natural scene text detecting method based on extraction of linear structures |
Non-Patent Citations (2)
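Title |
---|
"Binarization method for color text images based on graph-theoretic clustering and binary texture analysis"; Li Xiangfeng (李向丰) et al.; Journal of Image and Graphics (《中国图像图形学报》); 2004-03-31; Vol. 9, No. 3; pp. 290-296 * |
"Text image binarization based on histogram analysis and the OTSU algorithm"; Wu Dan (吴丹) et al.; Computer and Modernization (《计算机与现代化》); 2013-07-31; No. 7; pp. 117-224 * |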
Also Published As
Publication number | Publication date |
---|---|
CN104866850A (en) | 2015-08-26 |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
EXSB | Decision made by SIPO to initiate substantive examination |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |