CN107133929B - Low-quality document image binarization method based on background estimation and energy minimization - Google Patents
- Publication number
- CN107133929B (application number CN201710289747.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- pixel
- background
- edge
- document image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
- G06T5/94—Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20028—Bilateral filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30176—Document
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Geometry (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a low-quality document image binarization method based on background estimation and energy minimization. The method first performs grayscale preprocessing on a color document image, then denoises the image with bilateral filtering, estimates the image background, performs background subtraction and image enhancement, constructs an energy function, constructs a network graph, and finally minimizes the energy function with a graph cut algorithm based on augmenting paths. The invention significantly improves the binarization of document images with complex backgrounds, and is applicable to document images with writing in multiple colors, gradually fading strokes, ink bleed, stained or textured pages, uneven illumination, low contrast and similar degradations.
Description
Technical Field
The invention belongs to the technical fields of digital image processing, pattern recognition and machine learning, and in particular relates to a low-quality document image binarization method based on background estimation and energy minimization.
Background
Document analysis and recognition (DAR) technology is widely used in the digitization of ancient books, layout analysis and text recognition, video subtitle extraction, text information retrieval and other fields. It mainly comprises image acquisition, binarization, skew correction, and character segmentation and recognition. Image binarization is a key preprocessing step: it converts a grayscale image into a binary image, thereby separating the character foreground from the document background. The quality of the binarization algorithm directly affects the performance of the entire DAR system, so many researchers have studied it in recent years and proposed many algorithms. However, owing to factors such as poor image contrast, ink bleed, page stains and uneven illumination, binarizing low-quality document images remains a challenge.
Binarization algorithms can be roughly divided into global thresholding and local thresholding methods. Global thresholding uses a single threshold to divide the document image into two classes, characters (foreground) and background. For example, the Otsu algorithm uses the grayscale histogram of the image to select an optimal threshold that maximizes the between-class variance of the foreground and background pixels after thresholding. Global thresholding segments well when foreground and background differ strongly, that is, when the histogram is clearly bimodal, but on low-quality document images it loses some or even all of the foreground detail.
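The Otsu selection rule described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the implementation used in the patent; the function name `otsu_threshold` and the flat-list input format are assumptions made for the example.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes the between-class variance.

    `pixels` is a flat iterable of integers in 0..255 (assumed input format).
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # pixel count of the dark class
    sum0 = 0    # sum of gray levels in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no split at this level
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

On a clearly bimodal sample the returned threshold falls between the two modes; on a degraded page with overlapping modes the same single threshold is what causes the loss of foreground detail noted above.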
Local thresholding (also called adaptive thresholding) slides a window over the document image so that different thresholds are set in different parts of the image. Algorithms such as Niblack, Sauvola and Wolf use the gray-level mean and variance within a pixel's neighborhood to construct a threshold surface; their performance depends on the sliding-window size and the thickness of the character strokes. The window size must be adjusted dynamically for document images of different quality to obtain the best thresholding result, and when image contrast is low these methods produce many noise points or misclassifications.
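As an illustration of how a local method builds its threshold from the neighborhood mean and variance, the following sketch applies the Sauvola rule T = m(1 + k(s/R - 1)), where m and s are the local mean and standard deviation. The function name, the list-of-rows image format, the tiny default window, and the defaults k = 0.5 and R = 128 (values commonly quoted for Sauvola's method) are assumptions for this example, not parameters taken from the patent.

```python
import math

def sauvola_binarize(img, window=3, k=0.5, R=128.0):
    """Binarize `img` (list of rows of gray values) with a Sauvola-style threshold.

    Returns an image of the same size with 0 for text and 255 for background.
    Windows are clipped at the image border.
    """
    h, w = len(img), len(img[0])
    half = window // 2
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - half), min(h, y + half + 1))
                    for xx in range(max(0, x - half), min(w, x + half + 1))]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            # Local threshold from the neighborhood mean and standard deviation.
            threshold = mean * (1.0 + k * (math.sqrt(var) / R - 1.0))
            out[y][x] = 0 if img[y][x] <= threshold else 255
    return out
```

The dependence on `window` is visible directly: a window much smaller than the stroke width sees almost no variance inside a stroke, while a very large window behaves like a global threshold.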
In addition, researchers in China and abroad have proposed many more sophisticated algorithms, such as local contrast methods, background estimation with stroke edge detection, Laplacian energy methods and convolutional neural network methods. However, none of these methods handles binarization well for document images with complex backgrounds involving low contrast, ink bleed, gradually varying illumination, smudges and textures.
Summary of the Invention
To solve the above technical problems, the invention proposes a low-quality document image binarization method based on background estimation and energy minimization. It significantly improves the binarization of document images with complex backgrounds and is applicable to document images with writing in multiple colors, gradually fading strokes, ink bleed, stained or textured pages, uneven illumination, low contrast and similar degradations.
The technical solution adopted by the invention is a low-quality document image binarization method based on background estimation and energy minimization, characterized in that it comprises the following steps:
Step 1: Perform grayscale preprocessing on the color document image;
Step 2: Denoise the image with bilateral filtering;
Step 3: Estimate the image background, comprising the following sub-steps:
Step 3.1: Apply the stroke width transform to the image produced by step 2;
Step 3.2: Compute the simulated distance and the imaging height;
Step 3.3: Apply two morphological closing operations to the image produced by step 2 to weaken the dark features in the document image;
Step 3.4: Combining the results of step 3.2 and step 3.3, downsample and then upsample the image;
Step 4: Background subtraction and image enhancement, comprising the following sub-steps:
Step 4.1: Background subtraction;
Compute the absolute difference between the bilateral-filtered image from step 2 and the estimated background image from step 3. Pixels whose gray value in the difference image is zero are high-confidence background pixels, and their gray value is set to 255;
Step 4.2: Histogram equalization;
Invert the non-zero pixels of the background-subtracted image to obtain their gray values, then apply histogram equalization to the whole image to increase the contrast between foreground and background;
Step 5: Construct the energy function;
Step 6: Construct the network graph;
Step 7: Minimize the energy function with a graph cut algorithm based on augmenting paths.
Compared with existing algorithms, the invention has the following notable advantages:
(1) The invention uses the minimum-mean method to convert the color document image to grayscale. The resulting grayscale image is color-independent: the contrast between foreground and background pixels is increased, while the gray-level variance among foreground pixels is reduced;
(2) The invention uses a nonlinear bilateral filter for image noise reduction. Because the filter considers both the spatial proximity and the gray-level similarity of pixels, it denoises while preserving edges;
(3) The invention uses the stroke width transform to estimate the stroke width in the document image. Its advantage is that stroke features are essentially unique to text (although interference from some degradation factors cannot be ruled out and must be removed by later operations), so the approach applies to text in different languages;
(4) Based on a visual acuity test model, the invention estimates the image background with morphological closing operations and applies histogram equalization to the background-subtracted image, which effectively suppresses the influence of degradation factors and enhances the local contrast of the image;
(5) The invention binarizes the document image with a max-flow/min-cut combinatorial optimization algorithm. This graph cut algorithm is general, practical and fast (close to real-time performance), and is suitable for low-quality document images with many types of degradation.
Description of Drawings
Fig. 1 is a flowchart of an embodiment of the invention;
Fig. 2 is a schematic diagram of the angular resolution of the visual acuity test model in an embodiment of the invention.
Detailed Description
To help those of ordinary skill in the art understand and implement the invention, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
The main idea of the invention is as follows. As the target image moves farther from the observer, less and less of its detail (stroke) information can be observed, whereas the perceived gray level and depth of the background are not affected by distance. The approximate background of an image can therefore be estimated by simulating observation from a distance. An energy function is then constructed for the image with the estimated background removed, and a graph cut algorithm is used to binarize the image.
Referring to Fig. 1, the low-quality document image binarization method based on background estimation and energy minimization provided by the invention comprises the following steps:
Step 1: Minimum-mean grayscale conversion;
The invention uses the minimum-mean method to convert the color document image f(x, y) to grayscale. The specific calculation formula is:
[equation not reproduced in the extracted text]
where $f_i(x,y)$ are the R, G and B color component images, and $f_{gray}(x,y)$ is the converted grayscale image.
The resulting grayscale image is color-independent: in the grayscale image the contrast between foreground and background pixels is large, while the gray-level differences among foreground pixels are small.
Step 2: Bilateral-filter denoising;
The invention uses a nonlinear bilateral filter for image noise reduction. The output pixel value $g(i,j)$ is a weighted combination of the pixel values $f(k,l)$ in the neighborhood $S$:

$$g(i,j) = \frac{\sum_{(k,l)\in S} f(k,l)\, w(i,j,k,l)}{\sum_{(k,l)\in S} w(i,j,k,l)}$$

where the weight $w(i,j,k,l)$ is the product of the domain kernel $d(i,j,k,l)$ and the range kernel $r(i,j,k,l)$, that is,

$$w(i,j,k,l) = \exp\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2} - \frac{\left(f(i,j)-f(k,l)\right)^2}{2\sigma_r^2}\right)$$

and $\sigma_d^2$ and $\sigma_r^2$ denote the Gaussian distance variance and the Gaussian gray-level variance, respectively.
Because the bilateral filter considers both the spatial proximity and the gray-level similarity of pixels, it denoises while preserving edges.
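The edge-preserving behavior can be checked with a brute-force sketch of the bilateral weighting described above: each neighbor is weighted by a Gaussian of its spatial distance multiplied by a Gaussian of its gray-level difference, and the weights are normalized. The function name, the square window given by `radius`, and the default values of `sigma_d` and `sigma_r` are assumptions for illustration; the patent does not state its parameter values in this text.

```python
import math

def bilateral_filter(img, radius=1, sigma_d=1.0, sigma_r=25.0):
    """Brute-force bilateral filter on a list-of-rows grayscale image.

    sigma_d controls the spatial (domain) kernel and sigma_r the gray-level
    (range) kernel. The window is clipped at the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = 0.0
            den = 0.0
            for k in range(max(0, i - radius), min(h, i + radius + 1)):
                for l in range(max(0, j - radius), min(w, j + radius + 1)):
                    domain = ((i - k) ** 2 + (j - l) ** 2) / (2.0 * sigma_d ** 2)
                    rng = (img[i][j] - img[k][l]) ** 2 / (2.0 * sigma_r ** 2)
                    weight = math.exp(-domain - rng)
                    num += weight * img[k][l]
                    den += weight
            out[i][j] = num / den   # den >= 1 because the center weight is 1
    return out
```

Across a sharp step, the range kernel drives the weights of pixels on the far side of the edge toward zero, so the pixel next to the edge keeps essentially its own side's gray level instead of being blurred toward the other side.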
Step 3: Image background estimation;
Step 3.1 Stroke width transform (SWT): The Canny operator is applied to the bilateral-filtered grayscale image for edge detection. For each edge pixel p, the corresponding edge pixel q is searched for along the gradient direction of p. The Euclidean distance ||p-q|| between the two points is taken as the stroke width estimate of every pixel on the path [p, q], unless that pixel has already been assigned a smaller width. The stroke width of the image, SWE, is the mathematical expectation of the stroke width estimates over all non-zero pixels:

$$\mathrm{SWE} = \frac{1}{n}\sum_{s(x,y)\neq 0} s(x,y)$$

where n is the total number of non-zero pixels in the SWT output image s(x, y).
Step 3.2 Compute the simulated distance and imaging height: In the visual acuity test model, the smallest image the human eye can perceive is the one subtended by its minimum resolving angle (1′), as shown in Fig. 2. Because the contrast of a low-quality document image is usually lower than that of the binary image on an eye chart, the minimum visual angle for the corresponding target is usually larger than that of the acuity test. Moreover, the thicker the strokes, the farther the observation distance needed before stroke detail can no longer be perceived. The invention therefore assumes a resolving angle of 3′ for the stroke width of the document image and determines the simulated observation distance $d_0$ from the stroke width estimated in step 3.1:
$d_0 = \mathrm{SWE} \times \cot\theta$,
where θ is the observation resolving angle, here 3′.
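To make the scale of the simulated distance concrete, the sketch below averages the non-zero entries of a stroke width map to obtain SWE and then evaluates SWE × cot θ for θ = 3′. The function names and the toy stroke width map are hypothetical; the result is in the same unit as the stroke width (pixels here).

```python
import math

def mean_stroke_width(swt):
    """Mean of the non-zero entries of a stroke width map (list of rows)."""
    nonzero = [v for row in swt for v in row if v > 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0.0

def simulated_distance(swe, angle_arcmin=3.0):
    """Observation distance at which a stroke of width `swe` subtends `angle_arcmin`."""
    theta = math.radians(angle_arcmin / 60.0)   # arcminutes -> degrees -> radians
    return swe / math.tan(theta)                # SWE * cot(theta)

swt_map = [[0, 4, 4, 0],
           [0, 6, 6, 0]]          # toy SWT output, 0 = not on a stroke
swe = mean_stroke_width(swt_map)  # 5.0 for this toy map
d0 = simulated_distance(swe)      # roughly 1146 stroke widths
```

Because cot 3′ is about 1146, the simulated viewing distance is on the order of a thousand stroke widths, which is why the image must be shrunk so strongly in the later resampling step.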
Because the lens of the human eye behaves like a convex lens, the lens imaging law and the focal length equation give the imaging height $h_i$ on the retina when the eye is at distance $d_0$ from the target image. The specific calculation formula is:
[equation not reproduced in the extracted text]
where f is the distance between the lens of the eye and the retina, that is, the focal length of the lens (about 17 mm), and $h_0$ is the original height of the target image.
Step 3.3 Morphological closing: Two morphological closing operations weaken the dark features (character strokes) in the document image. Both use a circular structuring element. The diameter of the first structuring element is set to the stroke width of the image, and the diameter of the second is 12 pixels larger than the stroke width.
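The effect of the two closings on dark strokes can be sketched as follows. Grayscale closing is a dilation (local maximum) followed by an erosion (local minimum) with the same structuring element, so dark features narrower than the element are filled with the surrounding background level. The helper names, the discretization of the disk and the border handling (out-of-image neighbors are ignored) are assumptions for this example.

```python
def disk_offsets(diameter):
    """Pixel offsets (dy, dx) inside a disk of the given diameter."""
    r = diameter / 2.0
    reach = int(r)
    return [(dy, dx)
            for dy in range(-reach, reach + 1)
            for dx in range(-reach, reach + 1)
            if dy * dy + dx * dx <= r * r]

def _morph(img, offsets, pick):
    """Apply `pick` (max for dilation, min for erosion) over each neighborhood."""
    h, w = len(img), len(img[0])
    return [[pick(img[y + dy][x + dx] for dy, dx in offsets
                  if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]

def closing(img, diameter):
    """Grayscale closing: dilation followed by erosion with a disk."""
    offsets = disk_offsets(diameter)
    return _morph(_morph(img, offsets, max), offsets, min)

def suppress_strokes(img, stroke_width):
    """Two closings: disk diameter = stroke width, then stroke width + 12."""
    return closing(closing(img, stroke_width), stroke_width + 12)
```

In practice a library routine with an elliptical kernel would replace these loops; the pure-Python version is only meant to show what the operation does to a thin dark stroke.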
Step 3.4 Image downsampling and upsampling: At distance $d_0$ from the target image, the observed image height is $h_i$. The image obtained after morphological closing is therefore scaled down to height $h_i$ by bilinear downsampling, and then restored to its original size by bilinear interpolation. The resulting image is the estimated background image. The aspect ratio of the image is kept unchanged during scaling.
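The shrink-and-restore step can be sketched with a small bilinear resampler. The function names are hypothetical, the pixel-center sampling convention is one common choice rather than something the patent specifies, and `target_h` stands for the imaging height obtained in step 3.2, passed in as a plain number.

```python
def resize_bilinear(img, new_h, new_w):
    """Resize a list-of-rows grayscale image with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output pixel center back to source coordinates, clamped to the image.
        sy = min(max((y + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

def estimate_background(closed, target_h):
    """Shrink to `target_h` rows (aspect ratio kept), then restore the original size."""
    h, w = len(closed), len(closed[0])
    target_h = max(1, min(h, int(round(target_h))))
    target_w = max(1, int(round(w * target_h / h)))
    small = resize_bilinear(closed, target_h, target_w)
    return resize_bilinear(small, h, w)
```

The round trip through a very small image discards everything except slow gray-level variation, which is the property the background estimate relies on.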
Step 4: Background subtraction and image enhancement;
Step 4.1 Background subtraction: Compute the absolute difference between the bilateral-filtered image and the estimated background image. Pixels whose gray value in the difference image is zero are high-confidence background pixels, and their gray value is set to 255 (white).
Step 4.2 Histogram equalization: Invert the non-zero pixels of the background-subtracted image to obtain their gray values, then apply histogram equalization to the whole image to increase the contrast between foreground and background.
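Steps 4.1 and 4.2 can be sketched as below. Reading "invert" as 255 minus the difference value is an interpretation made for this example, as are the function names and the rounding of filtered values to integers; the equalization uses the usual cumulative-histogram mapping.

```python
def subtract_background(filtered, background):
    """Absolute difference to the background; zero -> 255, non-zero -> 255 - diff."""
    out = []
    for f_row, b_row in zip(filtered, background):
        row = []
        for f, b in zip(f_row, b_row):
            diff = abs(int(round(f)) - int(round(b)))
            # Zero difference marks high-confidence background (white);
            # other pixels are inverted so strong differences become dark.
            row.append(255 if diff == 0 else 255 - diff)
        out.append(row)
    return out

def equalize_histogram(img):
    """Histogram equalization of an integer image with values in 0..255."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for level in range(256):
        running += hist[level]
        cdf[level] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:            # single gray level: nothing to stretch
        return [row[:] for row in img]
    lut = [max(0, round(255 * (c - cdf_min) / (total - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in img]
```

After these two functions, text pixels that differed only moderately from the background are pushed toward the dark end of the range, which is the contrast gain the step is after.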
Step 5: Construct the energy function;
The specific form of the Laplacian energy function is:
[equation not reproduced in the extracted text]
where the data term represents the cost of assigning a given label to a pixel, for example the cost of assigning label 0 (or 1) to pixel $p_{ij}$, and the boundary term represents the cost of a discontinuity between adjacent pixels, that is, the cost incurred when two adjacent pixels are assigned different labels.
The Laplacian of an image responds where the gray level changes abruptly. When the Laplacian value at a pixel is positive, the pixel generally lies in a valley of the gray-level surface (dark); conversely, when the Laplacian value at a pixel is negative, the pixel lies on a peak of the gray-level surface (bright). The invention therefore defines the data term of the Laplacian energy function as:
[equation not reproduced in the extracted text]
where the Laplacian symbol in the formula denotes the Laplacian value at pixel $p_{ij}$;
The boundary term is divided into a horizontal boundary term and a vertical boundary term. The invention uses the Canny edge detection operator to determine the boundary term: pixels near an edge are more likely to be discontinuous, so the discontinuity cost between pixels lying on either side of an edge can be set directly to zero. The boundary terms are expressed as:
[equation not reproduced in the extracted text]
where $E_{ij}$ is the edge detection result at pixel $p_{ij}$, $I_{ij}$ is the gray value at pixel $p_{ij}$, and c is an arbitrary constant (> 0).
Step 6: Construct the network graph;
Each pixel $p_{ij}$ of the image forms an intermediate node of the network graph, and two terminal nodes s and t are added. An edge connecting two intermediate nodes is called an n-link, and its weight is determined by the boundary term of the energy function. An edge connecting an intermediate node to a terminal node is called a t-link, and its weight is determined by the data term of the energy function. Accordingly, the edges $(p_{ij}, s)$ and $(p_{ij}, t)$ carry data-term weights, and the edges $(p_{ij}, p_{i+1,j})$ and $(p_{ij}, p_{i,j+1})$ carry boundary-term weights [the individual weight expressions are not reproduced in the extracted text].
Step 7: Minimize the energy function with a graph cut algorithm based on augmenting paths;
Two search trees S and T are built on the network graph, rooted at the source s and the sink t respectively. The nodes of the search trees are of two kinds, active and passive. An active node can grow its tree by acquiring free nodes through non-saturated edges; the acquired nodes become active nodes.
Step 7.1 Growth stage: The two trees keep growing until an active node of one tree meets an active node of the other, at which point a path from the source to the sink has been found;
Step 7.2 Augmentation stage: The path found in step 7.1 is augmented. Augmentation saturates at least one edge; the child node attached to a saturated edge becomes an orphan node, and the trees S and T are split into several subtrees;
Step 7.3 Adoption stage: A new parent is sought for each orphan node. If no valid parent exists, the orphan becomes a free node. This continues until all orphan nodes have been processed.
The three stages are repeated until the two trees can no longer grow and are separated by saturated edges. This yields the minimum cut of the graph, which corresponds to the minimum of the energy function, and thereby the final binarization of the image.
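The link between a minimum cut and a labeling can be illustrated on a tiny graph. The sketch below does not implement the two-search-tree procedure described above; it substitutes the simpler Edmonds-Karp method (breadth-first augmenting paths), which computes the same maximum flow and minimum cut. The node names, capacities and the function name `min_cut` are hypothetical values chosen for the example.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (flow value, nodes on the source side of the cut).

    `capacity` maps directed edges (u, v) to non-negative capacities.
    """
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)        # reverse edge for undoing flow

    def reachable():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parent)
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)   # bottleneck capacity
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

# Three pixels a, b, c in a row with hypothetical t-link and n-link weights.
caps = {("s", "a"): 9, ("a", "t"): 1,
        ("s", "b"): 4, ("b", "t"): 5,
        ("s", "c"): 1, ("c", "t"): 9,
        ("a", "b"): 2, ("b", "a"): 2,
        ("b", "c"): 2, ("c", "b"): 2}
flow, source_side = min_cut(caps, "s", "t")
```

In this toy graph the cut separates pixel a (source side) from pixels b and c (sink side); reading the side of the cut as the pixel label is what turns the minimum cut into a binary image.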
It should be understood that the parts not described in detail in this specification belong to the prior art.
It should be understood that the above description of the preferred embodiments is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Guided by the invention, those of ordinary skill in the art may make substitutions or modifications without departing from the scope protected by the claims, and all of these fall within the protection scope of the invention. The scope of protection claimed shall be determined by the appended claims.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710289747.7A CN107133929B (en) | 2017-04-27 | 2017-04-27 | Low-quality document image binarization method based on background estimation and energy minimization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710289747.7A CN107133929B (en) | 2017-04-27 | 2017-04-27 | Low-quality document image binarization method based on background estimation and energy minimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107133929A CN107133929A (en) | 2017-09-05 |
CN107133929B CN107133929B (en) | 2019-06-11 |
Family
ID=59716294
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710289747.7A Expired - Fee Related CN107133929B (en) | 2017-04-27 | 2017-04-27 | Low-quality document image binarization method based on background estimation and energy minimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107133929B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107705300A (en) * | 2017-09-28 | 2018-02-16 | 成都大熊智能科技有限责任公司 | A kind of method that blank page detection is realized based on morphological transformation |
CN108830186B (en) * | 2018-05-28 | 2021-12-03 | 腾讯科技(深圳)有限公司 | Text image content extraction method, device, equipment and storage medium |
CN109918482A (en) * | 2019-03-14 | 2019-06-21 | 西安航空学院 | An Evaluation and Analysis System for College Students' Innovation and Entrepreneurship Plans |
US11100611B2 (en) * | 2019-03-29 | 2021-08-24 | GE Precision Healthcare LLC | Systems and methods for background noise reduction in magnetic resonance images |
CN111292342A (en) * | 2020-02-17 | 2020-06-16 | 深圳前海微众银行股份有限公司 | Method, device and equipment for cutting text in image and readable storage medium |
CN111681175A (en) * | 2020-05-09 | 2020-09-18 | 浙江大学 | A Preprocessing Method for Scanning Grayscale Document Image |
CN111583157B (en) * | 2020-05-13 | 2023-06-02 | 杭州睿琪软件有限公司 | Image processing method, system and computer readable storage medium |
CN112488107A (en) * | 2020-12-04 | 2021-03-12 | 北京华录新媒信息技术有限公司 | Video subtitle processing method and processing device |
CN112836541B (en) * | 2021-02-03 | 2022-06-03 | 华中师范大学 | Automatic acquisition and identification method and device for 32-bit bar code of cigarette |
CN112837329B (en) * | 2021-03-01 | 2022-07-19 | 西北民族大学 | A method and system for image binarization of Tibetan ancient book documents |
CN113129246A (en) * | 2021-04-19 | 2021-07-16 | 厦门喵宝科技有限公司 | Document picture processing method and device and electronic equipment |
CN114283156B (en) * | 2021-12-02 | 2024-03-05 | 珠海移科智能科技有限公司 | Method and device for removing document image color and handwriting |
CN117436058B (en) * | 2023-10-10 | 2024-09-03 | 国网湖北省电力有限公司 | A power information security protection system |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295648A (en) * | 2016-07-29 | 2017-01-04 | 湖北工业大学 | A kind of low quality file and picture binary coding method based on multi-optical spectrum imaging technology |
-
2017
- 2017-04-27 CN CN201710289747.7A patent/CN107133929B/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295648A (en) * | 2016-07-29 | 2017-01-04 | 湖北工业大学 | A kind of low quality file and picture binary coding method based on multi-optical spectrum imaging technology |
Non-Patent Citations (3)
Title |
---|
A Laplacian Energy for Document Binarization;Nicholas R. Howe等;《2011 International Conference on Document Analysis and Recognition》;20111231;第6-10页 * |
Parameter tuning for document image binarization using a racing;Rafael G. Mesquita等;《Expert Systems with Applications》;20151231;第2593-2603页 * |
低质量文档图像二值化算法研究;熊炜等;《计算机应用与软件》;20160731;第33卷(第7期);第204-208页 * |
Also Published As
Publication number | Publication date |
---|---|
CN107133929A (en) | 2017-09-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107133929B (en) | Low-quality document image binarization method based on background estimation and energy minimization | |
CN108898610B (en) | An object contour extraction method based on mask-RCNN | |
CN112819772B (en) | High-precision rapid pattern detection and recognition method | |
CN108090888B (en) | Fusion detection method of infrared image and visible light image based on visual attention model | |
Pei et al. | Removing rain and snow in a single image using saturation and visibility features | |
CN101710425B (en) | Self-adaptive pre-segmentation method based on gray scale and gradient of image and gray scale statistic histogram | |
CN116823686B (en) | Night infrared and visible light image fusion method based on image enhancement | |
CN107527332A (en) | Enhancement Method is kept based on the low-light (level) image color for improving Retinex | |
CN108765325A (en) | Small unmanned aerial vehicle blurred image restoration method | |
CN101783012A (en) | Automatic image defogging method based on dark primary colour | |
Bouzos et al. | Conditional random field model for robust multi-focus image fusion | |
CN109035274A (en) | File and picture binary coding method based on background estimating Yu U-shaped convolutional neural networks | |
CN104463814B (en) | Image enhancement method based on local texture directionality | |
CN109509163B (en) | A method and system for multi-focus image fusion based on FGF | |
Shah et al. | An iterative approach for shadow removal in document images | |
CN114331886A (en) | Image deblurring method based on depth features | |
CN107798670A (en) | A kind of dark primary prior image defogging method using image wave filter | |
CN105184761A (en) | Image rain removing method based on wavelet analysis and system | |
CN115689960A (en) | A fusion method of infrared and visible light images based on adaptive illumination in nighttime scenes | |
CN110782447A (en) | Multi-moving ship target detection method in geostationary orbit satellite optical remote sensing images | |
CN111598788B (en) | Single image defogging method based on quadtree decomposition and non-local prior | |
Gasparyan et al. | Iterative Retinex-based decomposition framework for low light visibility restoration | |
CN105913391B (en) | A kind of defogging method can be changed Morphological Reconstruction based on shape | |
Anantrasirichai et al. | Mitigating the effects of atmospheric distortion using DT-CWT fusion | |
Zhang et al. | Dehazing with improved heterogeneous atmosphere light estimation and a nonlinear color attenuation prior model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190611 |