CN104182966A - Automatic splicing method of regular shredded paper - Google Patents
- Publication number: CN104182966A
- Application number: CN201410340616.3A
- Authority: CN (China)
- Prior art keywords: fragments, matching, fragment, row, shredded paper
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Image Analysis (AREA); Crushing And Pulverization Processes (AREA)
Abstract
The invention belongs to the field of image processing and relates to a method for automatically splicing regularly shaped shredded paper. The technical solution consists of six steps: (1) prepare and preprocess the image data set; (2) classify the shredded paper by language (Chinese or English) and by sidedness (single- or double-sided); (3) extract local-region features from each image, such as the positions and gray values of the fragment's boundary pixels and the upper (lower) boundary height; for English fragments the extraction range is widened, and the additional features are the line height, the horizontal position of each text line, and the line spacing; (4) reclassify the fragments according to the feature values extracted in step (3); (5) match the fragments by local (pairwise) matching, row matching, and column matching; (6) restore the image from the matched fragments. The method can splice large numbers of shredded paper fragments more accurately.
Description
Technical Field
The invention belongs to the application field of image processing technology and specifically relates to a method for automatically splicing regular shredded paper.
Background
Shredded paper splicing is an important research branch of digital image processing. It spatially matches and aligns a set of fragments that share overlapping content and stitches them seamlessly into a complete, wide-field image.
Automatic splicing and restoration of shredded paper has important applications in forensic evidence recovery, restoration of historical documents, and military intelligence gathering. In recent years, following the publication of the German project to reconstruct the Stasi files, research on recovering shredded documents has attracted wide attention.
The key task in splicing shredded paper is fragment matching. Traditional methods for torn documents mostly extract contour curves from the edge shapes of the fragments and splice them algorithmically. With the widespread use of paper shredders, however, more and more splicing problems involve fragments whose edge shapes are all roughly the same, so edge-shape splicing no longer applies. For regularly shaped fragments, matching instead relies on the text content at the fragment edges: image registration determines the parameters of the fragment boundaries, the fragments are matched, and seamless splicing is finally achieved. In practice, however, the more fragments there are to splice, the more fragment edges carry similar text information and the more similar they are. The resolution of scanned digital images is also limited, so a certain number of wrong joins occur during splicing. The goal of an ideal splicing technique is "zero errors". Most existing shredded-paper splicing methods target irregular shapes; methods that can be applied effectively to large, wide-format documents cut into regular pieces are relatively rare.
The technical key to improving the quality of automatic splicing is acquiring the text or image information on the fragments with high quality. In general, the less information a fragment carries, the greater the chance of a splicing error or of failing to splice at all. Obtaining a high-quality wide-format result from automatic splicing of fragment images has therefore remained technically difficult in this field.
Summary of the Invention
The object of the invention is to provide a method for automatically splicing regular shredded paper that can splice a large number of fragments more accurately.
The invention is realized through the following technical solution, which mainly comprises six steps:
1. Preparing and preprocessing the image data set:
1.1 Number the fragments from left to right and top to bottom as 1, 2, 3, ..., n. If front and back must be distinguished, number the front sides a1, a2, a3, ..., an and the back sides b1, b2, b3, ..., bn;
1.2 Digitize the images: taking the pixel as the smallest unit, extract the gray value and position of every pixel and build a function matrix;
1.3 Threshold the image: a pixel with gray value "0" is a black point, a pixel with gray value "255" is a white point, and a pixel with a value between "0" and "255" is a gray point;
1.4 Denoise: because the original information is a continuous analog signal, the digitized image should likewise be a set of discrete points that follows a continuous trend. Where points of one color completely surround a point of a different color, the surrounded point is assigned the color of its neighbors;
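Steps 1.3 and 1.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a list of rows of 8-bit gray values, "completely surround" is read as all eight neighbors sharing one class, and pixels on the image border are left unchanged. The names `classify` and `denoise` are hypothetical.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def classify(value):
    """Map an 8-bit gray value to the three classes of step 1.3."""
    if value == 0:
        return BLACK
    if value == 255:
        return WHITE
    return GRAY

def denoise(classes):
    """Assimilate a pixel whose 8 neighbours all share one different class.

    Pixels on the image border are left unchanged (an assumption; the
    text does not say how they are handled).
    """
    rows, cols = len(classes), len(classes[0])
    out = [row[:] for row in classes]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neighbours = {
                classes[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            }
            # Exactly one neighbouring class, and the centre differs from it.
            if len(neighbours) == 1 and classes[r][c] not in neighbours:
                out[r][c] = next(iter(neighbours))
    return out

# A single black pixel surrounded by white is treated as noise.
image = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]
classes = [[classify(v) for v in row] for row in image]
cleaned = denoise(classes)
```

The filter reads from the input grid and writes to a copy, so the result does not depend on the order in which pixels are visited.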
2. Classify the shredded paper as a whole by language and sidedness into four cases: Chinese single-sided, Chinese double-sided, English single-sided, and English double-sided;
3. Extract the features of the local regions of each image. These include the positions and gray values of the fragment's boundary pixels and the upper (lower) boundary height. For English fragments the extraction range is widened, and the additional features are the line height, the horizontal position of each text line, and the line spacing;
The features are extracted as follows:
i) Positions and gray values of the outermost pixels of the fragment:
Define the leftmost (rightmost) column of pixels of the fragment as its left (right) boundary and the topmost (bottommost) row of pixels as its upper (lower) boundary, and extract the position and gray value of every boundary pixel;
ii) Upper (lower) boundary height:
Depending on whether the upper and lower boundaries of a fragment are completely white, boundary heights fall into two classes: white boundary height and black boundary height. They are determined as follows:
Set up a coordinate system with the bottom edge of the fragment as the x-axis, the left edge of the fragment (perpendicular to the x-axis, pointing upward) as the y-axis, and their intersection as the origin, and project every pixel of the image onto the y-axis, as shown in Figure 1. The projection of a black or gray point counts as one valid projection and increments the projection count by 1; the projection of a white point is invalid and leaves the count unchanged. Record the projection count $f(h)$ at the projection point that lies $h$ pixels from the origin.
When the projection count $f(h)$ is less than 1/10 of the total number of pixels $n$ in that row, the gray value $g(h)$ of point $h$ on the y-axis is recorded as "0"; when $f(h)$ is greater than or equal to 1/10 of $n$, $g(h)$ is recorded as "255".
On the projection axis, scan downward from the upper boundary of the fragment until a point of a different color appears. The length of this run is the upper boundary height; the lower boundary height is obtained in the same way.
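The projection in step ii) amounts to a per-row count of non-white pixels followed by a run-length scan from the top. The sketch below assumes the fragment is a list of rows of gray values with the top row first (the text measures $h$ from the bottom, which only reverses the index). To stay neutral about the "0"/"255" labels, it stores a boolean per row instead of $g(h)$. All function names are hypothetical.

```python
def projection_counts(image):
    """f: number of non-white (black or gray) pixels in each row, top row first."""
    return [sum(1 for v in row if v != 255) for row in image]

def marked_rows(image, fraction=0.1):
    """True where the projection count reaches `fraction` of the row width."""
    return [count >= fraction * len(row)
            for count, row in zip(projection_counts(image), image)]

def top_boundary_height(image):
    """Length of the run of rows, from the top, that share the first row's mark."""
    marks = marked_rows(image)
    height = 0
    for mark in marks:
        if mark != marks[0]:
            break
        height += 1
    return height, ("black" if marks[0] else "white")

# Two blank rows, two rows containing strokes, one blank row.
fragment = [
    [255] * 10,
    [255] * 10,
    [255, 0, 0, 255, 128, 255, 255, 0, 255, 255],
    [255, 0, 255, 255, 255, 255, 255, 0, 255, 255],
    [255] * 10,
]
counts = projection_counts(fragment)
top = top_boundary_height(fragment)
```

The lower boundary height follows from the same function applied to the rows in reverse order, for example `top_boundary_height(fragment[::-1])`.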
iii) Line height of English fragments:
English letters have roughly the same height and occupy roughly the same vertical extent within a line. After the projection described in step ii), each interval whose gray value is "1" is therefore an effective letter interval, and the height of that interval is defined as the line height;
iv) Horizontal position of text lines on English fragments:
After the projection described in step ii), the distances from the upper and lower limits of a line's effective projection interval to the top of the fragment are called the horizontal position of that line of letters; they fix where the line sits on the fragment;
v) Line spacing:
The vertical distance between two horizontal positions is taken as the line spacing;
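Given a per-row marker that says whether a row belongs to a text line, the three English-specific features reduce to finding runs of marked rows. The sketch below assumes such a list of booleans as input, indexed from the top of the fragment. Measuring spacing between the upper limits of consecutive lines is one possible reading of "distance between two horizontal positions", not something the text fixes. The function names are hypothetical.

```python
def text_intervals(marks):
    """Return (top, bottom) row indices, inclusive, of each run of True values."""
    intervals, start = [], None
    for i, mark in enumerate(marks):
        if mark and start is None:
            start = i
        elif not mark and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(marks) - 1))
    return intervals

def line_features(marks):
    """Horizontal positions, line heights, and line spacings of a fragment."""
    intervals = text_intervals(marks)
    heights = [bottom - top + 1 for top, bottom in intervals]
    # Spacing between the upper limits of consecutive lines (an assumption).
    spacings = [b[0] - a[0] for a, b in zip(intervals, intervals[1:])]
    return intervals, heights, spacings

# Two text lines of three rows each, separated by blank rows.
marks = [False, True, True, True, False, False, True, True, True, False]
intervals, heights, spacings = line_features(marks)
```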
4. Classify the fragments according to the feature set extracted in step 3:
The specific steps are as follows:
i) According to whether the fragment edges carry text-stroke information, divide the fragments into three classes: top/bottom border fragments, left/right border fragments, and middle fragments;
ii) Using the line-spacing feature, subdivide each of the three classes; fragments with the same line spacing fall into one class;
iii) Using the upper (lower) boundary height, classify the three fragment sets formed in step i); fragments whose upper (lower) boundary heights are equal or close are assigned to the same fragment set.
The categories must satisfy the following conditions:
(1) The number of fragments in each category must be equal to or slightly less than the number of vertical cuts made to the sheet;
(2) A category whose height is separated from the other heights does not form an independent category if it contains fewer than 1/5 of the number of fragments per category;
(3) Categories whose heights are consecutive are merged into one category;
(4) The final number of categories equals the number of horizontal cuts made to the sheet;
(5) If the category still cannot be determined, the bottom boundary height is used in the same way as an auxiliary criterion.
iv) Using the horizontal position, subdivide each fragment set formed in step ii); fragments at the same horizontal position fall into one class;
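The height-based grouping of step iii) can be illustrated with a small sketch. It covers only the initial grouping by equal height and condition (3), merging categories whose heights are consecutive; the size-based conditions (1), (2), and (4) are not implemented. The fragment ids and heights are made-up sample data, and the function names are hypothetical.

```python
from collections import defaultdict

def group_by_height(heights):
    """heights: {fragment_id: upper boundary height}. Returns {height: [ids]}."""
    groups = defaultdict(list)
    for fragment_id, height in heights.items():
        groups[height].append(fragment_id)
    return dict(groups)

def merge_consecutive(groups):
    """Merge categories whose heights are consecutive integers (condition 3)."""
    merged, current, previous = [], [], None
    for height in sorted(groups):
        # A gap of more than one pixel starts a new category.
        if previous is not None and height - previous > 1:
            merged.append(current)
            current = []
        current = current + groups[height]
        previous = height
    if current:
        merged.append(current)
    return merged

# Hypothetical heights: fragments 1, 2, 3, 7 lie near 12-14 px, the rest near 40 px.
heights = {1: 12, 2: 13, 3: 12, 4: 40, 5: 41, 6: 40, 7: 14}
rows = merge_consecutive(group_by_height(heights))
```

With these sample values the seven fragments collapse into two candidate rows, which is the kind of reduction the conditions above aim for.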
5. Matching the fragments:
5.1 Local matching, that is, matching between two fragments. Left-right matching is taken as the example:
i. Let $X_{ij}$ be the gray value of the pixel in row $j$ on the right boundary of fragment $i$, and let $Y_{i'j}$ be the gray value of the pixel in row $j$ on the left boundary of fragment $i'$ ($i \neq i'$). Whether two fragments match depends on how well $X_{ij}$ and $Y_{i'j}$ agree. Using the feature set extracted in step 3 and taking the right-boundary feature as the reference, the criteria are:
If $X_{ij}$ is white, it is normal for $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ to be any mix of gray, white, and black, provided they are not all black; the row can be matched;
If $X_{ij}$ is gray, any colors of $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ are normal; the row can be matched;
If $X_{ij}$ is black, it is normal for $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ not to be all white; the row can be matched;
All other cases are abnormal and cannot be matched.
The relationship between $X_{ij}$ and $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ is shown in Figure 2,
where $X_{ij}$ is the gray value of the pixel in row $j$ of the rightmost column of strip $i$;
$Y_{i'j}$ is the gray value of the pixel in row $j$ of the leftmost column of strip $i'$;
The boundary tracking algorithm proceeds as follows:
(1) Select fragments $i$ and $i'$;
(2) Assume that fragments $i$ and $i'$ match;
(3) Read the gray value of pixel $X_{ij}$ in row $j$ of the right boundary of fragment $i$;
(4) Scan the pixels $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ in rows $j-1$, $j$, $j+1$ of the left boundary of fragment $i'$ and determine whether they are all white;
(5) If they are all white and the row range is exceeded, set $j = j + 1$ and return to (3);
(6) If they are not all white, set $j = j + 1$, read the next row, and determine whether $X_{ij}$ is white;
(7) If it is white, return to (5);
(8) If it is not white, determine whether $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ are all white;
(9) If they are all white, return to (5);
(10) If they are not all white, set $j = j + 1$, read the next row, and determine the color of $X_{ij}$;
(11) If it is white, return to (5);
(12) If it is gray, return to (5);
(13) If it is black, determine whether $Y_{i'(j-1)}$, $Y_{i'j}$, $Y_{i'(j+1)}$ are all white;
(14) If they are not all white, return to (5);
(15) If they are all white, the matching of fragments $i$ and $i'$ ends: the fragments do not match;
(16) If $j + 1$ exceeds the row range, the matching of fragments $i$ and $i'$ ends: the fragments match.
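The net effect of the three criteria and the scan above can be sketched as a row-by-row test that stops at the first abnormal row. This is a simplified reading, not a transcription of the flow chart in Figure 3. It assumes each boundary is given as a list of color classes, top row first, and that neighbors $j-1$ and $j+1$ falling outside the fragment are simply ignored, which the text does not specify. The names `row_is_normal` and `fragments_match` are hypothetical.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def row_is_normal(x, neighbours):
    """Apply the three criteria to one right-boundary pixel and its neighbours."""
    if x == WHITE:
        return not all(y == BLACK for y in neighbours)
    if x == GRAY:
        return True
    return not all(y == WHITE for y in neighbours)  # x is black

def fragments_match(right_edge, left_edge):
    """Scan rows top to bottom and stop at the first abnormal row.

    Neighbour rows outside the fragment are ignored (an assumption).
    """
    n = len(right_edge)
    for j in range(n):
        neighbours = [left_edge[k] for k in (j - 1, j, j + 1) if 0 <= k < n]
        if not row_is_normal(right_edge[j], neighbours):
            return False
    return True

a_right = [WHITE, BLACK, GRAY, WHITE]
b_left = [WHITE, GRAY, BLACK, WHITE]   # strokes continue near row 1
c_left = [WHITE, WHITE, WHITE, WHITE]  # blank edge facing a black pixel
ok = fragments_match(a_right, b_left)
bad = fragments_match(a_right, c_left)
```

Comparing each pixel with three neighboring rows rather than one tolerates a vertical offset of one pixel between the two scanned edges.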
ii. From the criteria in step i, the mathematical model of the image matching index is determined, specifically:
where:
$S_{ii'}$: the matching index between fragment $i$ and fragment $i'$;
$N$: the total number of pixels along the vertical height of the fragment;
$X_{ij}$: the gray value of the pixel in row $j$ of the right boundary of fragment $i$;
$T_{i'}(X_{ij})$: the matching index between the right-boundary feature of row $j$ of fragment $i$ and the left-boundary feature of the corresponding rows of strip $i'$;
This matching index is expressed as:
where $T_2(X_{ij}) = 0$.
Two fragments are considered matchable if and only if $S_{ii'}$ is 0. If it is nonzero they cannot be matched, and the larger the value, the worse the match.
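The formulas for $S_{ii'}$ and $T_{i'}(X_{ij})$ are not reproduced in this text. One form that would be consistent with the definitions above (a sum over the $N$ rows that is zero only when every row is normal, with the gray case contributing nothing) is sketched below. The subscripts 1 and 3 for the white and black cases and the 0/1 penalty values are assumptions, not taken from the source.

```latex
S_{ii'} = \sum_{j=1}^{N} T_{i'}(X_{ij}), \qquad
T_{i'}(X_{ij}) =
\begin{cases}
T_1(X_{ij}), & X_{ij} \text{ white} \\
T_2(X_{ij}) = 0, & X_{ij} \text{ gray} \\
T_3(X_{ij}), & X_{ij} \text{ black}
\end{cases}

T_1(X_{ij}) =
\begin{cases}
1, & Y_{i'(j-1)},\ Y_{i'j},\ Y_{i'(j+1)} \text{ all black} \\
0, & \text{otherwise}
\end{cases}
\qquad
T_3(X_{ij}) =
\begin{cases}
1, & Y_{i'(j-1)},\ Y_{i'j},\ Y_{i'(j+1)} \text{ all white} \\
0, & \text{otherwise}
\end{cases}
```

Under this reading $S_{ii'}$ counts the abnormal rows, so it is 0 exactly when every row satisfies the criteria of step i and grows as the fit gets worse.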
5.2 Step 5.1 completes local matching between pairs of fragments. The fragments that satisfy the matching condition in step 5.1 form small fragment sets, on which row matching and column matching are then performed.
i. Row matching proceeds as follows:
i) Take one fragment as the reference. If local matching of two fragments succeeds, merge them into a single fragment and put it into a new fragment set; if local matching fails, keep the reference fragment and continue local matching. Fragments in the original set that cannot be locally matched with any other are all put into the new fragment set;
ii) Repeat the above steps on the new fragment set until all fragments are spliced into complete fragment rows.
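The merge-and-repeat procedure for row matching can be sketched as a loop that joins chains of fragments until no further pair fits. In this sketch each fragment is reduced to a pair of toy edge labels, and `exact_join` is a hypothetical stand-in for the local matching test of step 5.1; a real implementation would compare boundary pixel columns instead.

```python
def exact_join(right_edge, left_edge):
    """Stand-in for the local matching test of step 5.1 (hypothetical)."""
    return right_edge == left_edge

def assemble_rows(fragments, can_join=exact_join):
    """fragments: {id: (left_edge, right_edge)}. Returns a list of id chains."""
    chains = [[fid] for fid in fragments]
    merged = True
    while merged:
        merged = False
        for a in chains:
            for b in chains:
                if a is b:
                    continue
                # Right edge of the last piece of a against left edge of the first of b.
                right = fragments[a[-1]][1]
                left = fragments[b[0]][0]
                if can_join(right, left):
                    chains.remove(a)
                    chains.remove(b)
                    chains.append(a + b)
                    merged = True
                    break
            if merged:
                break  # restart the scan on the updated chain list
    return chains

# Toy edges: q ends in "m", p starts with "m" and ends in "n", r starts with "n".
fragments = {
    "p": ("m", "n"),
    "q": ("start", "m"),
    "r": ("n", "end"),
}
rows = assemble_rows(fragments)
```

Fragments that never match anything remain as single-element chains, which mirrors the rule that unmatched pieces are carried over into the new fragment set.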
ii. From the above process, the mathematical model of the row matching index is determined, specifically:
Objective function:
$W = \min \sum S_{ii'}$
Constraints:
where $a$ is the number of $S_{ii'}$ terms;
the minimum value of $W$ is 0;
$M$ is the number of vertical cuts made to the sheet.
iii. The fragment sets that pass row matching form fragment rows. The matrix of fragment rows is transposed, and column matching is then performed in the same way;
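One way to read the transposition step is at the pixel level: after transposing a spliced row strip, its lower boundary becomes a right boundary and its upper boundary a left boundary, so the left-right matching of step 5.1 can be reused for stacking strips vertically. The sketch below shows only this edge correspondence on a tiny made-up strip; the helper names are hypothetical.

```python
def transpose(image):
    """Swap rows and columns of a rectangular 2-D list."""
    return [list(column) for column in zip(*image)]

def right_edge(image):
    """Rightmost pixel of every row."""
    return [row[-1] for row in image]

def left_edge(image):
    """Leftmost pixel of every row."""
    return [row[0] for row in image]

strip = [
    [255, 255, 255, 255],
    [255, 0, 128, 255],
    [0, 0, 255, 128],
]
turned = transpose(strip)
```

After the call, `right_edge(turned)` equals the bottom row of `strip` and `left_edge(turned)` equals its top row.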
6. Restore the image from the fragments matched in step 5.
The beneficial effects of the invention are:
It can splice a fairly large number of fragments in one pass, and it proposes corresponding optimizations for the matching and splicing workflow, mainly in the following respects:
(1) For Chinese fragments, features are computed only on the boundaries, and the mathematical model is built on those boundary features. When the image sample database is updated, the range of the database that must be scanned is therefore greatly reduced, which saves time when there are many fragments to splice.
(2) The boundary tracking algorithm designed in the invention ensures uniqueness during fragment matching, which further improves the effectiveness and operability of the invention.
Brief Description of the Drawings
Figure 1 is the text projection diagram of a fragment in an embodiment of the invention;
Figure 2 is the local matching diagram in an embodiment of the invention;
Figure 3 is the flow chart of the boundary tracking algorithm in an embodiment of the invention.
Detailed Description of the Embodiments
The execution of the invention is illustrated briefly below, taking single-sided Chinese text as the example. The example uses 209 fragment images, obtained by cutting one A4 sheet with 10 horizontal cuts and 18 vertical cuts. The steps are as follows:
(1) Preprocessing
(a) Prepare and preprocess the image data set, including digitization, denoising, and binarization;
(b) Arrange the fragments in a matrix of 11 rows and 19 columns, numbered from 1 to 209 from left to right and top to bottom;
(2) Extract the feature set of the fragment matrix. The features include the positions and gray values of the outermost pixels of each fragment and the upper boundary height;
(3) Classify the fragments using the feature values:
① According to whether the fragment edges carry text-stroke information, divide the fragments into three classes: 11 left-border and 11 right-border fragments, 19 top-border and 19 bottom-border fragments, and 149 middle fragments;
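The counts in this example follow from the cut numbers, which a short calculation makes explicit. The sketch below is plain grid arithmetic; the function name and dictionary keys are hypothetical. It also shows that 149 equals the total minus the four border counts, in which each corner piece is counted in two border classes, whereas the number of pieces touching no edge of the sheet is 153. Which of the two the source intends by "middle fragments" is not stated.

```python
def grid_counts(horizontal_cuts, vertical_cuts):
    """Piece counts for a sheet cut into a regular grid."""
    rows = horizontal_cuts + 1
    cols = vertical_cuts + 1
    return {
        "rows": rows,
        "cols": cols,
        "total": rows * cols,
        "left_or_right_each": rows,
        "top_or_bottom_each": cols,
        # Pieces that touch no edge of the sheet.
        "interior": (rows - 2) * (cols - 2),
        # Total minus the four border counts, corners counted twice.
        "total_minus_borders": rows * cols - 2 * rows - 2 * cols,
    }

counts = grid_counts(horizontal_cuts=10, vertical_cuts=18)
```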
② Classify the fragments according to the extracted upper boundary height:
In general, the white upper boundary heights, or the black upper boundary heights, of fragments in the same row are roughly equal. Compute the boundary height of every fragment, put fragments with the same boundary height into one category, and count the fragments in each category. The results are shown in Table 1.
Table 1. Number of fragments with the same upper boundary height
Table 1 shows 43 groups of white-top and black-top heights in total, whereas the image was cut into only 11 rows, so the categories must be processed further.
The categories must satisfy the following conditions:
(1) The number of fragments in each category must be equal to or slightly less than 19;
(2) A category whose height is separated from the other heights does not form an independent category if it contains fewer than 10 fragments (for example, height 3);
(3) Categories whose heights are consecutive are merged into one category;
(4) The final number of categories is 11;
(5) If the category still cannot be determined, the bottom boundary height is used as an auxiliary criterion.
The classification after further processing is shown in Table 2:
Table 2. Classification by upper boundary height and the corresponding number of fragments
(4) Match the fragments by row and by column using the boundary tracking algorithm;
(5) Display the spliced image:
Table 3. Fragment numbers after splicing
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410340616.3A CN104182966B (en) | 2014-07-16 | 2014-07-16 | A kind of regular shredded paper method for automatically split-jointing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410340616.3A CN104182966B (en) | 2014-07-16 | 2014-07-16 | A kind of regular shredded paper method for automatically split-jointing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104182966A true CN104182966A (en) | 2014-12-03 |
CN104182966B CN104182966B (en) | 2017-03-29 |
Family
ID=51963984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410340616.3A Expired - Fee Related CN104182966B (en) | 2014-07-16 | 2014-07-16 | A kind of regular shredded paper method for automatically split-jointing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104182966B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN104182966B (en) | 2017-03-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20170329 Termination date: 20170716 |