
CN107133929B - Low-quality document image binarization method based on background estimation and energy minimization - Google Patents

Low-quality document image binarization method based on background estimation and energy minimization

Info

Publication number
CN107133929B
Authority
CN
China
Prior art keywords
image
pixel
background
edge
document image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710289747.7A
Other languages
Chinese (zh)
Other versions
CN107133929A (en)
Inventor
熊炜
徐晶晶
李敏
熊子婕
王改华
刘敏
赵楠
王鑫睿
冯川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Original Assignee
Hubei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology
Priority to CN201710289747.7A
Publication of CN107133929A
Application granted
Publication of CN107133929B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G06T5/94 Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • G06T5/70 Denoising; Smoothing
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30176 Document

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a low-quality document image binarization method based on background estimation and energy minimization. The method first converts the color document image to grayscale, denoises it with bilateral filtering, estimates the image background, performs background subtraction and image enhancement, constructs an energy function and a network graph, and finally minimizes the energy function with an augmenting-path graph cut algorithm. The invention significantly improves the binarization of document images with complex backgrounds, and it is applicable to documents with multi-color writing, fading strokes, ink bleed-through, stained or textured pages, uneven illumination, and low contrast.

Description

Low-quality document image binarization method based on background estimation and energy minimization

Technical Field

The invention belongs to the technical fields of digital image processing, pattern recognition, and machine learning, and in particular relates to a low-quality document image binarization method based on background estimation and energy minimization.

Background Art

Document analysis and recognition (DAR) technology is widely used in the digitization of ancient books, layout analysis and text recognition, video caption extraction, text information retrieval, and other fields. It mainly comprises image acquisition, binarization, skew correction, character segmentation, and recognition. Image binarization is a key preprocessing step: it converts a grayscale image into a binary image, separating the character foreground from the document background. The quality of the binarization algorithm directly affects the performance of the whole DAR system, so in recent years many researchers have studied it and proposed many algorithms. However, owing to poor image contrast, ink bleed-through, page stains, uneven illumination, and similar factors, binarizing low-quality document images remains a challenge.

Binarization algorithms can be roughly divided into global and local thresholding methods. Global thresholding uses a single threshold to divide the document image into two classes, characters (foreground) and background. For example, the Otsu algorithm uses the gray-level histogram of the image to select an optimal threshold that maximizes the between-class variance of the foreground and background pixels after thresholding. Global thresholding segments well when foreground and background differ strongly, that is, when the histogram is clearly bimodal, but on low-quality document images it loses some or even all of the foreground detail.
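As an illustration of the global thresholding idea described above, the following is a minimal sketch of Otsu's threshold selection in plain Python. It is not part of the patented method; the function name `otsu_threshold` and the toy sample are assumptions made for this example.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes the between-class variance.

    pixels: iterable of integer gray levels in 0..255.
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1/total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy bimodal sample: dark "ink" near 30, bright "paper" near 220.
sample = [28, 30, 32, 31, 29] + [218, 220, 222, 221, 219, 220]
t = otsu_threshold(sample)
binary = [0 if p <= t else 255 for p in sample]
```

On a clearly bimodal sample like this one the threshold falls between the two modes; the degraded documents the patent targets are exactly the cases where no single such threshold exists.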

Local thresholding (also called adaptive thresholding) convolves a sliding window with the document image so that different thresholds are set in different parts of the image. Algorithms such as Niblack, Sauvola, and Wolf use the gray-level mean and variance within a pixel's neighborhood to construct a threshold surface; their performance depends on the sliding-window size and the thickness of the character strokes. The window size must be adjusted dynamically for document images of different quality to obtain the best thresholding result, and when image contrast is low, many noise points or misclassifications result.

In addition, researchers in China and abroad have proposed many more complex algorithms, such as local contrast methods, background estimation with stroke edge detection, Laplacian energy methods, and convolutional neural network methods. However, none of these methods handles binarization well for complex document backgrounds with low contrast, ink bleed-through, gradual illumination changes, smudges, and textures.

Summary of the Invention

To solve the above technical problems, the invention proposes a low-quality document image binarization method based on background estimation and energy minimization. It significantly improves the binarization of document images with complex backgrounds and is applicable to documents with multi-color writing, fading strokes, ink bleed-through, stained or textured pages, uneven illumination, and low contrast.

The technical solution adopted by the invention is a low-quality document image binarization method based on background estimation and energy minimization, characterized by comprising the following steps:

Step 1: Convert the color document image to grayscale;

Step 2: Denoise the image with bilateral filtering;

Step 3: Estimate the image background, comprising the following sub-steps:

Step 3.1: Apply the stroke width transform to the image produced by Step 2;

Step 3.2: Compute the simulated distance and imaging height;

Step 3.3: Apply two morphological closing operations to the image produced by Step 2 to weaken the dark features in the document image;

Step 3.4: Combine the results of Steps 3.2 and 3.3 to downsample and then upsample the image;

Step 4: Background subtraction and image enhancement, comprising the following sub-steps:

Step 4.1: Background subtraction;

Compute the absolute difference between the bilaterally filtered image from Step 2 and the estimated background image from Step 3. Pixels whose value in the difference image is zero are high-confidence background pixels, and their gray value is set to 255;

Step 4.2: Histogram equalization;

Invert each non-zero pixel of the background-subtracted image to obtain its gray value, then apply histogram equalization to the whole image to increase the contrast between foreground and background;

Step 5: Construct the energy function;

Step 6: Construct the network graph;

Step 7: Minimize the energy function with an augmenting-path graph cut algorithm.

Compared with existing algorithms, the invention has the following notable advantages:

(1) The invention converts the color document image to grayscale with the minimum-mean method. The resulting grayscale image is color-independent: the contrast between foreground and background pixels increases, while the gray-level variance among foreground pixels decreases;

(2) The invention denoises the image with a nonlinear bilateral filter. Because the filter accounts for both spatial proximity and gray-level similarity, it removes noise while preserving edges;

(3) The invention estimates the stroke width in the document image with the stroke width transform. Its advantage is that stroke features are essentially unique to text (interference from some degradation factors cannot be ruled out and must be removed by subsequent operations), so the method applies to text in different languages;

(4) Based on the visual acuity test model, the invention estimates the image background with morphological closing and applies histogram equalization to the background-subtracted image, which effectively suppresses degradation factors and enhances the local contrast of the image;

(5) The invention binarizes the document image with a max-flow/min-cut combinatorial optimization algorithm. The graph cut algorithm is general, practical, and fast (close to real-time performance), and it suits low-quality document images with many types of degradation.

Brief Description of the Drawings

Figure 1 is a flowchart of an embodiment of the invention;

Figure 2 is a schematic diagram of the angular resolution in the visual acuity test model of an embodiment of the invention.

Detailed Description

To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawings and embodiments. The embodiments described here serve only to illustrate and explain the invention and do not limit it.

The main idea of the invention is as follows. As the target image moves farther from the observer, less and less detail (stroke) information can be observed, but the perceived background gray level and depth are unaffected by distance. The approximate background can therefore be estimated by simulating long-distance viewing of the image. An energy function is then constructed for the image with the estimated background removed, and a graph cut algorithm binarizes the image.

Referring to Figure 1, the low-quality document image binarization method based on background estimation and energy minimization provided by the invention comprises the following steps:

Step 1: Minimum-mean grayscale conversion;

The invention converts the color document image $f(x,y)$ to grayscale with the minimum-mean method. The specific formula is:

[Formula not reproduced in the source text.]

where $f_i(x,y)$ are the R, G, and B color component images and $f_{gray}(x,y)$ is the resulting grayscale image.

The resulting grayscale image is color-independent: the contrast between foreground and background pixels is high, while the gray-level differences among foreground pixels are small.

Step 2: Bilateral filtering for denoising;

The invention denoises the image with a nonlinear bilateral filter. The output pixel value $g(i,j)$ is a weighted combination of the pixel values $f(k,l)$ in the neighborhood $S$:

$$g(i,j) = \frac{\sum_{(k,l)\in S} f(k,l)\, w(i,j,k,l)}{\sum_{(k,l)\in S} w(i,j,k,l)}$$

where the weight $w(i,j,k,l)$ is the product of the domain kernel $d(i,j,k,l) = \exp\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2}\right)$ and the range kernel $r(i,j,k,l) = \exp\left(-\frac{|f(i,j)-f(k,l)|^2}{2\sigma_r^2}\right)$, that is, $w(i,j,k,l) = d(i,j,k,l)\, r(i,j,k,l)$; $\sigma_d^2$ and $\sigma_r^2$ are the Gaussian distance variance and the Gaussian gray-level variance.

Because the bilateral filter accounts for both the spatial proximity and the gray-level similarity of the image, it removes noise while preserving edges.
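The weighted combination described above can be sketched directly in plain Python. This is a minimal, unoptimized illustration rather than the patent's implementation: it assumes the standard Gaussian forms for the domain and range kernels, a square window of hypothetical half-width `radius` as the neighborhood S, and illustrative parameter values.

```python
import math


def bilateral_filter(img, radius, sigma_d, sigma_r):
    """Bilateral filter for a grayscale image given as a list of rows.

    radius:  half-width of the square neighborhood S (assumed shape).
    sigma_d: standard deviation of the spatial (domain) Gaussian.
    sigma_r: standard deviation of the gray-level (range) Gaussian.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = den = 0.0
            for k in range(max(0, i - radius), min(h, i + radius + 1)):
                for l in range(max(0, j - radius), min(w, j + radius + 1)):
                    # Domain kernel: penalizes spatial distance.
                    d = math.exp(-((i - k) ** 2 + (j - l) ** 2)
                                 / (2 * sigma_d ** 2))
                    # Range kernel: penalizes gray-level difference.
                    r = math.exp(-((img[i][j] - img[k][l]) ** 2)
                                 / (2 * sigma_r ** 2))
                    num += img[k][l] * d * r
                    den += d * r
            out[i][j] = num / den  # den >= 1 because the centre pixel counts
    return out


# A noisy step edge: dark columns on the left, bright columns on the right.
img = [[50, 52, 48, 200, 198, 202] for _ in range(4)]
smoothed = bilateral_filter(img, radius=2, sigma_d=2.0, sigma_r=20.0)
```

In the example, the small fluctuations on each side of the step are averaged out, while the range kernel keeps pixels across the step from mixing, which is the edge-preserving behaviour the text refers to.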

Step 3: Image background estimation;

Step 3.1 Stroke width transform (SWT): The Canny operator is applied to the bilaterally filtered grayscale image to detect edges. For each edge pixel $p$, the corresponding edge pixel $q$ is found by searching along the gradient direction of $p$. The Euclidean distance $\|p-q\|$ is assigned as the stroke width estimate to every pixel on the path $[p,q]$, unless the pixel has already been assigned a smaller width. The stroke width of the image, SWE, is the mathematical expectation of the stroke width estimates over all non-zero pixels:

$$\mathrm{SWE} = \frac{1}{n}\sum_{s(x,y)\neq 0} s(x,y)$$

where $n$ is the total number of non-zero pixels in the stroke width transform output image $s(x,y)$.
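The last part of this step, averaging the non-zero entries of the stroke width transform output, can be sketched as follows. The stroke width transform itself (Canny edges plus ray tracing along gradients) is not implemented here; the map `swt` is a hand-made stand-in, and the function name is an assumption for this example.

```python
def mean_stroke_width(swt):
    """Mean of the non-zero entries of a stroke width map s(x, y).

    swt: list of rows; 0 marks pixels that lie on no stroke.
    """
    widths = [v for row in swt for v in row if v != 0]
    if not widths:
        raise ValueError("stroke width map contains no stroke pixels")
    return sum(widths) / len(widths)


# Hand-made map: one stroke of width 3 and one of width 5.
swt = [
    [0, 3, 3, 0],
    [0, 3, 3, 0],
    [5, 5, 5, 5],
]
swe = mean_stroke_width(swt)  # (4 * 3 + 4 * 5) / 8
```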

Step 3.2 Compute the simulated distance and imaging height: In the visual acuity test model, the smallest image the human eye can perceive corresponds to the minimum angle of resolution (1′ of arc), as shown in Figure 2. Because the contrast of a low-quality document image is usually lower than that of the binary images on an eye chart, the minimum visual angle for the target is usually larger than that of the acuity test. Moreover, the thicker the strokes, the greater the viewing distance needed before stroke detail can no longer be perceived. The invention therefore assumes a resolution angle of 3′ for the stroke width of the document image and determines the simulated viewing distance $d_0$ from the stroke width estimated in Step 3.1:

$$d_0 = \mathrm{SWE} \times \cot\theta,$$

where $\theta$ is the viewing resolution angle, here 3′.
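A short worked example of the distance formula, as a sketch: the function name and the default argument are assumptions, and the result is in the same unit as the stroke width (pixels, if SWE is measured in pixels).

```python
import math


def simulated_distance(swe, theta_arcmin=3.0):
    """Simulated viewing distance d0 = SWE * cot(theta).

    swe:          estimated stroke width.
    theta_arcmin: resolution angle in minutes of arc (3' in the text).
    """
    theta = math.radians(theta_arcmin / 60.0)  # arcminutes -> radians
    return swe / math.tan(theta)               # cot(theta) = 1 / tan(theta)


# cot(3') is roughly 1146, so a 4-pixel stroke gives d0 of about 4584 pixels.
d0 = simulated_distance(4.0)
```

Because the angle is so small, the distance is over a thousand times the stroke width, which is why the subsequent rescaling shrinks the image strongly.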

Because the lens of the human eye behaves like a convex lens, the image height $h_i$ on the retina at distance $d_0$ from the target image follows from the lens imaging law and the focal length equation:

[Formula not reproduced in the source text.]

where $f$ is the distance between the lens of the eye and the retina, that is, the focal length of the lens (about 17 mm), and $h_0$ is the original height of the target image.

Step 3.3 Morphological closing: Two morphological closing operations weaken the dark features (character strokes) in the document image; both use a circular structuring element. The diameter of the first structuring element is set to the stroke width of the image, and the diameter of the second is 12 pixels larger than the stroke width.
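The two closing operations can be sketched in plain Python as grayscale dilation (neighborhood maximum) followed by erosion (neighborhood minimum) with a disk-shaped structuring element. The helper names, the discretization of the disk, and the border handling (out-of-image positions are ignored) are assumptions of this sketch, not details given in the text.

```python
def disk_offsets(diameter):
    """Pixel offsets (dy, dx) covered by a disk of the given diameter."""
    r = diameter / 2.0
    k = int(r)
    return [(dy, dx) for dy in range(-k, k + 1) for dx in range(-k, k + 1)
            if dy * dy + dx * dx <= r * r]


def _morph(img, offsets, pick):
    """Apply max (dilation) or min (erosion) over the structuring element."""
    h, w = len(img), len(img[0])
    return [[pick(img[y + dy][x + dx] for dy, dx in offsets
                  if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


def closing(img, diameter):
    """Grayscale closing: dilation followed by erosion."""
    offs = disk_offsets(diameter)
    return _morph(_morph(img, offs, max), offs, min)


def estimate_background_by_closing(img, stroke_width):
    """First closing at the stroke width, second 12 pixels larger."""
    return closing(closing(img, stroke_width), stroke_width + 12)


# Bright page (200) with one dark vertical stroke (40) three pixels wide.
page = [[40 if 9 <= x <= 11 else 200 for x in range(21)] for _ in range(9)]
background = estimate_background_by_closing(page, stroke_width=3)
```

In this toy case the first closing leaves the 3-pixel stroke in place, and the larger second element fills it with the surrounding page brightness, so `background` no longer contains the stroke.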

Step 3.4 Image downsampling and upsampling: At distance $d_0$ from the target image, the observed image height is $h_i$. The image produced by the morphological closing is therefore scaled down to height $h_i$ by bilinear downsampling, and the scaled image is then restored to its original size by bilinear interpolation. The resulting image is the estimated background image. The aspect ratio of the image is kept unchanged during scaling.
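The down-then-up rescaling can be sketched with a small bilinear resampler. The target height `h_i` is taken as an input in pixels, since this sketch does not assume any particular formula for it; the function names and the pixel-centre sampling convention are likewise assumptions.

```python
def resize_bilinear(img, new_h, new_w):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output pixel centre back into input coordinates.
        sy = min(max((y + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def blur_by_rescaling(img, h_i):
    """Shrink to height h_i (aspect ratio kept), then restore the size."""
    h, w = len(img), len(img[0])
    small_h = max(1, min(h, round(h_i)))
    small_w = max(1, round(w * small_h / h))
    small = resize_bilinear(img, small_h, small_w)
    return resize_bilinear(small, h, w)


# 8 x 8 page of brightness 200 with one residual dark pixel.
img = [[200] * 8 for _ in range(8)]
img[4][4] = 0
estimated = blur_by_rescaling(img, h_i=2)
```

Shrinking discards fine detail and the upsampling step spreads what remains smoothly, so the output keeps only the slowly varying background.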

Step 4: Background subtraction and image enhancement;

Step 4.1 Background subtraction: Compute the absolute difference between the bilaterally filtered image and the estimated background image. Pixels whose value in the difference image is zero are high-confidence background pixels, and their gray value is set to 255 (white).

Step 4.2 Histogram equalization: Invert each non-zero pixel of the background-subtracted image to obtain its gray value, then apply histogram equalization to the whole image to increase the contrast between foreground and background.
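Steps 4.1 and 4.2 can be sketched together as follows. The sketch reads "invert the non-zero pixels" as mapping a non-zero difference v to 255 - v, and uses the common cumulative-histogram form of equalization; both readings, and the function names, are assumptions rather than details confirmed by the text.

```python
def subtract_background(filtered, background):
    """Absolute difference; zero-difference pixels become 255, others 255 - diff."""
    return [[255 if f == b else 255 - abs(f - b)
             for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(filtered, background)]


def equalize(img):
    """Histogram equalization of an 8-bit grayscale image (list of rows)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:          # constant image: nothing to stretch
        return [row[:] for row in img]
    lut = [max(0, round((cdf[v] - cdf_min) * 255 / (total - cdf_min)))
           for v in range(256)]
    return [[lut[v] for v in row] for row in img]


filtered = [[200, 200, 120, 200],
            [198, 90, 95, 200]]
background = [[200, 200, 200, 200],
              [200, 200, 200, 200]]
enhanced = equalize(subtract_background(filtered, background))
```

After these two steps, pixels that matched the background are white and the pixel that differed most from it is black, with the remaining values spread across the full range.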

Step 5: Construct the energy function;

The specific form of the Laplacian energy function is:

[Formula not reproduced in the source text.]

Here the data term represents the cost of assigning a label to a pixel, that is, the cost of assigning label 0 or label 1 to pixel $p_{ij}$. The boundary term represents the cost of a discontinuity between adjacent pixels, that is, the cost of assigning different labels to two adjacent pixels.

The Laplacian of an image responds where the gray level changes abruptly. When the Laplacian at a pixel is positive, the pixel generally lies in a valley of the gray-level surface (dark); conversely, when it is negative, the pixel lies at a peak (bright). The invention therefore defines the data term of the Laplacian energy function as:

[Formula not reproduced in the source text.]

where $\nabla^2 I_{ij}$ is the Laplacian value at pixel $p_{ij}$.

The boundary term is divided into a horizontal boundary term and a vertical boundary term. The invention uses the Canny edge detector to determine the boundary terms: pixels near an edge are more likely to be discontinuous, so the discontinuity cost between pixels on opposite sides of an edge can be set directly to zero. Specifically:

[Formula not reproduced in the source text.]

where $E_{ij}$ is the edge detection result at pixel $p_{ij}$, $I_{ij}$ is the gray value at pixel $p_{ij}$, and $c$ is an arbitrary constant ($c > 0$).

Step 6: Construct the network graph;

Each pixel $p_{ij}$ of the image is an intermediate node of the network graph, and two terminal nodes $s$ and $t$ are added. An edge connecting two intermediate nodes is called an n-link, and its weight is determined by the boundary term of the energy function. An edge connecting an intermediate node to a terminal node is called a t-link, and its weight is determined by the data term of the energy function. The edges $(p_{ij}, s)$ and $(p_{ij}, t)$ therefore take their weights from the data term, and the edges $(p_{ij}, p_{i+1,j})$ and $(p_{ij}, p_{i,j+1})$ take theirs from the boundary terms.

Step 7: Minimize the energy function with an augmenting-path graph cut algorithm;

Two search trees S and T are built on the network graph, rooted at the source $s$ and the sink $t$ respectively. The nodes of the search trees are divided into two classes, active nodes and passive nodes. An active node can grow its tree by acquiring free nodes through non-saturated edges, turning them into active nodes.

Step 7.1 Growth stage: Both trees keep growing until their active nodes meet, which yields a path from the source to the sink;

Step 7.2 Augmentation stage: The path found in Step 7.1 is augmented. Augmentation saturates at least one edge; the child nodes attached through such an edge become orphans, and the trees S and T split into several subtrees;

Step 7.3 Adoption stage: A new parent is sought for each orphan. If no valid parent exists, the orphan becomes a free node. This continues until all orphans have been processed.

The three stages are repeated until the two trees can no longer grow and are separated by saturated edges. This yields the minimum cut of the graph, that is, the minimum of the energy function, and thus the final binarization of the image.
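To illustrate how a minimum cut yields a binary labeling, the following sketch labels a single row of pixels. It deliberately swaps in a simpler technique: breadth-first shortest augmenting paths (the Edmonds-Karp scheme) instead of the two-tree growth, augmentation, and adoption search described above. Both compute a maximum flow, so they reach the same minimum energy. The cost values, the function name, and the convention that the cost of ending on one side is placed on the link to the opposite terminal are assumptions of this example.

```python
from collections import deque


def min_cut_labels(cost_src, cost_snk, smooth):
    """Label a 1-D row of pixels by an s-t minimum cut.

    cost_src[p]: cost paid if pixel p ends up on the source side.
    cost_snk[p]: cost paid if pixel p ends up on the sink side.
    smooth[p]:   cost paid if pixels p and p+1 end up on different sides.
    Returns a list of booleans, True meaning "source side".
    """
    n = len(cost_src)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]      # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0) + c_uv
        cap[v][u] = cap[v].get(u, 0) + c_vu

    for p in range(n):
        add_edge(s, p, cost_snk[p], 0)        # t-link cut if p is on the sink side
        add_edge(p, t, cost_src[p], 0)        # t-link cut if p is on the source side
    for p in range(n - 1):
        add_edge(p, p + 1, smooth[p], smooth[p])   # n-link

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if t not in parent:
            break                              # no augmenting path left
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
    return [p in parent for p in range(n)]


# Pixel 2 mildly prefers the sink side; its neighbours strongly prefer the source.
cost_src = [1, 1, 6, 1, 1]
cost_snk = [9, 9, 5, 9, 9]
independent = min_cut_labels(cost_src, cost_snk, [0, 0, 0, 0])
smoothed = min_cut_labels(cost_src, cost_snk, [3, 3, 3, 3])
```

With zero boundary cost each pixel simply takes its cheaper label, so pixel 2 flips; with a boundary cost of 3, isolating pixel 2 would cost more than it saves, and the whole row receives one label.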

It should be understood that the parts not described in detail in this specification belong to the prior art.

It should be understood that the above description of the preferred embodiments is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Those of ordinary skill in the art, guided by the invention and without departing from the scope protected by its claims, may make substitutions or modifications, all of which fall within the scope of protection of the invention. The scope of protection claimed for the invention is defined by the appended claims.

Claims (7)

1. A low-quality document image binarization method based on background estimation and energy minimization, characterized by comprising the following steps:
Step 1: Convert the color document image to grayscale;
Step 2: Denoise the image with bilateral filtering;
Step 3: Estimate the image background, comprising the following sub-steps:
Step 3.1: Apply the stroke width transform to the image produced by Step 2;
Step 3.2: Compute the simulated distance and imaging height;
Step 3.3: Apply two morphological closing operations to the image produced by Step 2 to weaken the dark features in the document image;
Step 3.4: Combine the results of Steps 3.2 and 3.3 to downsample and then upsample the image;
Step 4: Background subtraction and image enhancement, comprising the following sub-steps:
Step 4.1: Background subtraction; compute the absolute difference between the bilaterally filtered image from Step 2 and the estimated background image from Step 3; pixels whose value in the difference image is zero are high-confidence background pixels, and their gray value is set to 255;
Step 4.2: Histogram equalization; invert each non-zero pixel of the background-subtracted image to obtain its gray value, then apply histogram equalization to the whole image to increase the contrast between foreground and background;
Step 5: Construct the energy function; the specific form of the Laplacian energy function is:
[Formula not reproduced in the source text.]
where the data term represents the cost of assigning a label to a pixel, that is, the cost of assigning label 0 or label 1 to pixel $p_{ij}$; $\nabla^2 I_{ij}$ is the Laplacian value at pixel $p_{ij}$; the boundary term represents the cost of a discontinuity between adjacent pixels, that is, the cost of assigning different labels to two adjacent pixels; the boundary term is divided into a horizontal boundary term and a vertical boundary term; $E_{ij}$ is the edge detection result at pixel $p_{ij}$, $I_{ij}$ is the gray value at pixel $p_{ij}$, and $c$ is an arbitrary constant with $c > 0$;
Step 6: Construct the network graph; each pixel $p_{ij}$ of the image is an intermediate node of the network graph, and two terminal nodes $s$ and $t$ are added; an edge connecting two intermediate nodes is called an n-link, and its weight is determined by the boundary term of the energy function; an edge connecting an intermediate node to a terminal node is called a t-link, and its weight is determined by the data term of the energy function; the edges $(p_{ij}, s)$ and $(p_{ij}, t)$ take their weights from the data term, and the edges $(p_{ij}, p_{i+1,j})$ and $(p_{ij}, p_{i,j+1})$ take theirs from the boundary terms;
Step 7: Minimize the energy function with an augmenting-path graph cut algorithm; two search trees S and T are built on the network graph, rooted at the source $s$ and the sink $t$ respectively; the nodes of the search trees are divided into two classes, active nodes and passive nodes; an active node can grow its tree by acquiring free nodes through non-saturated edges, turning them into active nodes;
Step 7.1: Growth stage; both trees keep growing until their active nodes meet, which yields a path from the source to the sink;
Step 7.2: Augmentation stage; the path found in Step 7.1 is augmented; augmentation saturates at least one edge, the child nodes attached through such an edge become orphans, and the trees S and T split into several subtrees;
Step 7.3: Adoption stage; a new parent is sought for each orphan; if no valid parent exists, the orphan becomes a free node; this continues until all orphans have been processed;
Step 7.4: Repeat the three stages above until the two trees can no longer grow and are separated by saturated edges, which yields the minimum cut of the graph, that is, the minimum of the energy function, and thus the final binarization of the image.

2. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 1 the color document image $f(x,y)$ is converted to grayscale with the minimum-mean method, the conversion formula being:
[Formula not reproduced in the source text.]
where $f_i(x,y)$ are the R, G, and B color component images and $f_{gray}(x,y)$ is the resulting grayscale image.

3. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 2 the image is denoised with a nonlinear bilateral filter whose output pixel value $g(i,j)$ is a weighted combination of the pixel values $f(k,l)$ in the neighborhood $S$:
$$g(i,j) = \frac{\sum_{(k,l)\in S} f(k,l)\, w(i,j,k,l)}{\sum_{(k,l)\in S} w(i,j,k,l)}$$
where the weight $w(i,j,k,l)$ is the product of the domain kernel $d(i,j,k,l) = \exp\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2}\right)$ and the range kernel $r(i,j,k,l) = \exp\left(-\frac{|f(i,j)-f(k,l)|^2}{2\sigma_r^2}\right)$, that is, $w(i,j,k,l) = d(i,j,k,l)\, r(i,j,k,l)$; $\sigma_d^2$ and $\sigma_r^2$ are the Gaussian distance variance and the Gaussian gray-level variance.

4. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 3.1 the Canny operator is applied to the bilaterally filtered grayscale image to detect edges; for each edge pixel $p$, the corresponding edge pixel $q$ is found by searching along the gradient direction of $p$; the Euclidean distance $\|p-q\|$ is assigned as the stroke width estimate to every pixel on the path $[p,q]$, unless the pixel has already been assigned a smaller width; the stroke width of the image, SWE, is the mathematical expectation of the stroke width estimates over all non-zero pixels:
$$\mathrm{SWE} = \frac{1}{n}\sum_{s(x,y)\neq 0} s(x,y)$$
where $n$ is the total number of non-zero pixels in the stroke width transform output image $s(x,y)$.

5. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 3.2 the simulated viewing distance $d_0$ is determined from the stroke width SWE estimated in Step 3.1:
$$d_0 = \mathrm{SWE} \times \cot\theta,$$
where $\theta$ is the viewing resolution angle;
from the lens imaging law and the focal length equation, the image height $h_i$ on the retina at distance $d_0$ from the target image is obtained:
[Formula not reproduced in the source text.]
where $f$ is the distance between the lens of the eye and the retina, that is, the focal length of the lens, and $h_0$ is the original height of the target image.

6. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 3.3 both closing operations use a circular structuring element; the diameter of the first structuring element is set to the stroke width of the image, and the diameter of the second is 12 pixels larger than the stroke width.

7. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that in Step 3.4, the image height observed at distance $d_0$ from the target image being $h_i$, the image produced by the morphological closing is scaled down to height $h_i$ by bilinear downsampling; the scaled image is then restored to its original size by bilinear interpolation, and the resulting image is the estimated background image; the aspect ratio of the image is kept unchanged during scaling.
The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 3.4, the observed image height is h i when the distance target image is d 0 , Therefore, the image after the morphological closing operation is scaled to the height h i by bilinear downsampling; then the scaled image is restored to the original size by bilinear interpolation, and the obtained image is the estimated background image ; keep the image aspect ratio unchanged when doing image scaling.
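As a rough illustration of claims 5 and 7, the following Python sketch computes the simulated observation distance d_0 = SWE × cot θ and performs the shrink-then-restore step with bilinear interpolation. It is only a sketch under stated assumptions, not the patented implementation: the function names are hypothetical, the image is a plain list of lists of gray values, corner-aligned sampling is just one possible bilinear convention, and h_i is passed in as a parameter because its formula is not reproduced in this text.

```python
import math


def observation_distance(swe, theta_rad):
    """d_0 = SWE * cot(theta), with theta given in radians."""
    return swe / math.tan(theta_rad)


def bilinear_resize(img, new_h, new_w):
    """Resize a 2-D list of gray values with bilinear interpolation.

    Assumption: corner-aligned sampling (first and last pixels map onto
    each other); other conventions give slightly different results.
    """
    old_h, old_w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a fractional source row.
        sy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def estimate_background(closed_img, h_i):
    """Shrink to height h_i (aspect ratio kept), then restore the original size."""
    h, w = len(closed_img), len(closed_img[0])
    small_h = max(1, min(h, round(h_i)))
    small_w = max(1, round(w * small_h / h))  # keep the width/height ratio
    small = bilinear_resize(closed_img, small_h, small_w)
    return bilinear_resize(small, h, w)
```

The round trip through a smaller image acts as a smoothing step: fine detail that survives the closing operations is averaged away, while slowly varying background shading is kept. A perfectly uniform image passes through unchanged.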
CN201710289747.7A 2017-04-27 2017-04-27 Low-quality document image binarization method based on background estimation and energy minimization Expired - Fee Related CN107133929B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710289747.7A CN107133929B (en) 2017-04-27 2017-04-27 Low-quality document image binarization method based on background estimation and energy minimization

Publications (2)

Publication Number Publication Date
CN107133929A CN107133929A (en) 2017-09-05
CN107133929B true CN107133929B (en) 2019-06-11

Family

ID=59716294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710289747.7A Expired - Fee Related CN107133929B (en) 2017-04-27 2017-04-27 Low-quality document image binarization method based on background estimation and energy minimization

Country Status (1)

Country Link
CN (1) CN107133929B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107705300A (en) * 2017-09-28 2018-02-16 成都大熊智能科技有限责任公司 A kind of method that blank page detection is realized based on morphological transformation
CN108830186B (en) * 2018-05-28 2021-12-03 腾讯科技(深圳)有限公司 Text image content extraction method, device, equipment and storage medium
CN109918482A (en) * 2019-03-14 2019-06-21 西安航空学院 An Evaluation and Analysis System for College Students' Innovation and Entrepreneurship Plans
US11100611B2 (en) * 2019-03-29 2021-08-24 GE Precision Healthcare LLC Systems and methods for background noise reduction in magnetic resonance images
CN111292342A (en) * 2020-02-17 2020-06-16 深圳前海微众银行股份有限公司 Method, device and equipment for cutting text in image and readable storage medium
CN111681175A (en) * 2020-05-09 2020-09-18 浙江大学 A Preprocessing Method for Scanning Grayscale Document Image
CN111583157B (en) * 2020-05-13 2023-06-02 杭州睿琪软件有限公司 Image processing method, system and computer readable storage medium
CN112488107A (en) * 2020-12-04 2021-03-12 北京华录新媒信息技术有限公司 Video subtitle processing method and processing device
CN112836541B (en) * 2021-02-03 2022-06-03 华中师范大学 Automatic acquisition and identification method and device for 32-bit bar code of cigarette
CN112837329B (en) * 2021-03-01 2022-07-19 西北民族大学 A method and system for image binarization of Tibetan ancient book documents
CN113129246A (en) * 2021-04-19 2021-07-16 厦门喵宝科技有限公司 Document picture processing method and device and electronic equipment
CN114283156B (en) * 2021-12-02 2024-03-05 珠海移科智能科技有限公司 Method and device for removing document image color and handwriting
CN117436058B (en) * 2023-10-10 2024-09-03 国网湖北省电力有限公司 A power information security protection system

Citations (1)

Publication number Priority date Publication date Assignee Title
CN106295648A (en) * 2016-07-29 2017-01-04 湖北工业大学 A low-quality document image binarization method based on multispectral imaging technology

Non-Patent Citations (3)

Title
A Laplacian Energy for Document Binarization; Nicholas R. Howe et al.; 2011 International Conference on Document Analysis and Recognition; 20111231; pp. 6-10 *
Parameter tuning for document image binarization using a racing; Rafael G. Mesquita et al.; Expert Systems with Applications; 20151231; pp. 2593-2603 *
低质量文档图像二值化算法研究 (Research on binarization algorithms for low-quality document images); Xiong Wei et al.; 《计算机应用与软件》 (Computer Applications and Software); 20160731; Vol. 33, No. 7; pp. 204-208 *

Also Published As

Publication number Publication date
CN107133929A (en) 2017-09-05

Similar Documents

Publication Publication Date Title
CN107133929B (en) Low-quality document image binarization method based on background estimation and energy minimization
CN108898610B (en) An object contour extraction method based on mask-RCNN
CN112819772B (en) High-precision rapid pattern detection and recognition method
CN108090888B (en) Fusion detection method of infrared image and visible light image based on visual attention model
Pei et al. Removing rain and snow in a single image using saturation and visibility features
CN101710425B (en) Self-adaptive pre-segmentation method based on gray scale and gradient of image and gray scale statistic histogram
CN116823686B (en) Night infrared and visible light image fusion method based on image enhancement
CN107527332A (en) Enhancement Method is kept based on the low-light (level) image color for improving Retinex
CN108765325A (en) Small unmanned aerial vehicle blurred image restoration method
CN101783012A (en) Automatic image defogging method based on dark primary colour
Bouzos et al. Conditional random field model for robust multi-focus image fusion
CN109035274A (en) File and picture binary coding method based on background estimating Yu U-shaped convolutional neural networks
CN104463814B (en) Image enhancement method based on local texture directionality
CN109509163B (en) A method and system for multi-focus image fusion based on FGF
Shah et al. An iterative approach for shadow removal in document images
CN114331886A (en) Image deblurring method based on depth features
CN107798670A (en) A kind of dark primary prior image defogging method using image wave filter
CN105184761A (en) Image rain removing method based on wavelet analysis and system
CN115689960A (en) A fusion method of infrared and visible light images based on adaptive illumination in nighttime scenes
CN110782447A (en) Multi-moving ship target detection method in geostationary orbit satellite optical remote sensing images
CN111598788B (en) Single image defogging method based on quadtree decomposition and non-local prior
Gasparyan et al. Iterative Retinex-based decomposition framework for low light visibility restoration
CN105913391B (en) A kind of defogging method can be changed Morphological Reconstruction based on shape
Anantrasirichai et al. Mitigating the effects of atmospheric distortion using DT-CWT fusion
Zhang et al. Dehazing with improved heterogeneous atmosphere light estimation and a nonlinear color attenuation prior model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20190611)