CN118657803A - A method, device and system for extracting image frame lines of drilling histogram

Info

Publication number: CN118657803A
Application number: CN202411124145.2A
Authority: CN
Inventors: 邓吉秋; 郭亦伟; 郭志勇; 邱蓝
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2024-08-16
Filing date: 2024-08-16
Publication date: 2024-09-17
Anticipated expiration: 2044-08-16
Also published as: CN118657803B

Abstract

The present invention relates to the technical field of frame line extraction and optimization for drilling column charts, and specifically discloses a method, device and system for extracting image frame lines of a drilling column chart. The method includes the following steps: step 100, inputting a drilling column chart in an image format; step 200, preprocessing the input image; step 300, detecting frame lines and merging and refining the detected line clusters; step 400, post-processing the merged and refined lines; step 500, outputting a vector file of the horizontal and vertical frame lines of the table in the drilling column chart image, and ending. To meet the need for extracting the table frame lines of drilling column charts, the present invention preprocesses the input image to remove noise elements other than the frame lines, which improves detection accuracy. It also sets topological constraints based on the organizational rules of the table frame lines and, as post-processing, merges, refines and corrects the multiple line clusters in the detection results, improving the accuracy and completeness of the frame line detection results.

Description

A method, device and system for extracting image frame lines of drilling histogram

Technical Field

The invention relates to the technical field of frame line extraction and optimization for drilling column charts, in particular to a method for extracting image frame lines of a drilling column chart, and also to a system for extracting image frame lines of a drilling column chart.

Background Art

At present, most drilling column charts are still archived as images, including scanned images of paper documents and exported graphics images. Because drilling column charts from different sources differ in presentation and in how their information is organized, automatic information extraction from them currently relies on a combination of image understanding and digital image processing. The key step is the accurate and complete extraction of the table frame lines.

In practice, however, unclear printing or scanning, inaccurate line detection algorithms, and image noise readily lead to missing or inaccurate frame line detections. In addition, because frame lines in the image are several pixels wide and have jagged (aliased) edges, a single line often appears in the detection results as a cluster of multiple lines, which does not meet expectations. The frame line extraction method therefore needs to be optimized to improve the completeness and accuracy of the frame line detection results.

Summary of the Invention

In view of this, the present invention provides a method, device and system for extracting image frame lines of a drilling column chart, so as to improve the completeness and accuracy of frame line detection results.

To achieve the above object, the basic scheme of the present invention provides a method for extracting image frame lines of a drilling column chart, comprising the following steps:

Step 100, inputting a drilling column chart in an image format;

Step 200, preprocessing the input image;

Step 300, detecting frame lines, and merging and refining the detected line clusters;

Step 400, post-processing the merged and refined lines;

Step 500, outputting a vector file of the horizontal and vertical frame lines of the table in the drilling column chart image, and ending.

In one possible design, step 200 specifically includes:

Step 201, reading the drilling column chart image matrix Image, and obtaining the height H and width W of the matrix Image;

Step 202, using OpenCV to convert the input image Image to grayscale and binarize it with an adaptive threshold, outputting a binary image Binary_img;

Step 203, setting adaptively sized rectangular kernels according to the height H and width W: a tall rectangle Kernel1 of height H/20 and width 1, and a wide rectangle Kernel2 of height 1 and width W/20; applying erosion followed by dilation to Binary_img with OpenCV using Kernel1 and Kernel2 respectively, to obtain the vertical frame line image Vertical_img (the tall kernel preserves vertical strokes) and the horizontal frame line image Horizontal_img (the wide kernel preserves horizontal strokes);

Step 204, projecting the vertical frame line image Vertical_img vertically to obtain the set of pixel line widths Line_width={w1,w2,w3,…,wn}, and calculating its arithmetic mean w_mean; image preprocessing is then complete, and the process proceeds to step 300.
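The effect of steps 203 and 204 can be illustrated without OpenCV. The sketch below is a pure-Python stand-in, not the patented implementation: it relies on the fact that, away from image borders, binary erosion followed by dilation with a 1-pixel-thick line kernel of length k keeps exactly the runs of foreground pixels that are at least k long. The helper names (`open_1d`, `open_rows`, `open_cols`, `line_widths`), the toy image and the kernel length `k = 4` are assumptions made for illustration; the source uses H/20 and W/20 on real images via OpenCV.

```python
from statistics import mean

def open_1d(values, k):
    """1-D binary opening: keep only runs of 1s that are at least k long."""
    out = [0] * len(values)
    start = None
    for i, v in enumerate(list(values) + [0]):  # sentinel 0 closes a trailing run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= k:
                out[start:i] = [1] * (i - start)
            start = None
    return out

def open_rows(img, k):
    """Opening with a wide (1 x k) kernel: keeps horizontal strokes."""
    return [open_1d(row, k) for row in img]

def open_cols(img, k):
    """Opening with a tall (k x 1) kernel: keeps vertical strokes."""
    cols = [open_1d(col, k) for col in zip(*img)]
    return [list(row) for row in zip(*cols)]

def line_widths(vertical_img):
    """Vertical projection: widths of runs of columns containing line pixels."""
    occupied = [1 if any(col) else 0 for col in zip(*vertical_img)]
    widths, run = [], 0
    for v in occupied + [0]:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    return widths

# Toy 8 x 10 binary image (assumed data): one horizontal line, one vertical
# line that is 2 pixels wide, and two isolated "text" specks.
img = [[0] * 10 for _ in range(8)]
img[1] = [1] * 10                 # horizontal frame line
for r in range(8):
    img[r][3] = img[r][4] = 1     # vertical frame line, 2 px wide
img[5][7] = img[6][8] = 1         # noise specks

k = 4                             # assumed kernel length for this toy image
horizontal_img = open_rows(img, k)
vertical_img = open_cols(img, k)
w_mean = mean(line_widths(vertical_img))
```

In this toy case the specks disappear from both outputs, `horizontal_img` keeps only the full-width row, `vertical_img` keeps only the two adjacent columns, and the projected line width is 2 pixels.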

In one possible design, in step 300, line detection and the merging and refinement of line clusters are carried out in two parts, one for horizontal frame lines and one for vertical frame lines; the horizontal frame lines are extracted and processed first.

In one possible design, step 300 includes:

Step 301, reading the horizontal frame line image Horizontal_img and detecting the frame lines in it with OpenCV's LSD line detection algorithm, to obtain a preliminary detection result, the horizontal frame line cluster Horline_set, in which each line is expressed and stored as the coordinates of its two endpoints;

Step 302, traversing all horizontal lines in Horline_set and sorting them in ascending order of height y; proceeding to step 303 to cluster the lines of Horline_set that lie at the same height, based on topological adjacency or connection;

Step 303, if there are still unclassified lines in Horline_set, going to step 3031; otherwise going to step 304;

Step 3031, traversing the sorted Horline_set: initializing all_hor_class, which stores the line clusters of all heights, and hor_class, the line cluster of a single height; adding hor_class to all_hor_class; adding the first line line1=[(x1,y1),(x2,y1)] to hor_class and setting the marker y=y1; then deleting line1 from Horline_set and going to step 3032;

Step 3032, continuing the traversal and reading the next line line2=[(x3,y2),(x4,y2)]; if the difference between the height y2 of line2 and the marker y satisfies |y-y2|≤1, adding line2 to hor_class, deleting line2 from Horline_set, updating the marker y=y2, and going to step 303; otherwise going to step 3031;

Step 304, traversing the set all_hor_class of all clustered line clusters; if any line cluster has not yet been traversed, going to step 3041; once all horizontal line clusters have been processed, going to step 305;

Step 3041, reading a line cluster hor_class and initializing the set of x-coordinate ranges xrange_set, the set of y coordinates y_set, and the set of merged x-coordinate ranges xrange_merge_set; going to step 3042;

Step 3042, traversing all lines in hor_class, denoting each line as linei=[(xi1,yi),(xi2,yi)], where i is the index of the line in the set, i=1, 2, 3, …, n; adding yi to y_set and [xi1,xi2] to xrange_set; deleting the traversed hor_class and going to step 3043;

Step 3043, reading y_set and calculating the arithmetic mean of all its elements, denoted y_mean; going to step 3044;

Step 3044, traversing xrange_set, initializing the x-coordinate range x_range=[x11,x12] and adding it to xrange_merge_set; going to step 3045;

Step 3045, continuing the traversal to obtain the next x-coordinate range x_range=[x21,x22]; if x21∈[x11,x12] and x22>x12, updating x12=x22; if x22<x12, making no update; deleting the ranges that have been traversed; if any x-coordinate range has not yet been traversed, going to step 3044, otherwise going to step 3046;

Step 3046, traversing the merged x-coordinate ranges in xrange_merge_set, each element being xrange_mergei=[xi1,xi2], outputting the line [(xi1,y_mean),(xi2,y_mean)] and adding it to the set mergeline_set; going to step 304;
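The horizontal pass of steps 302 to 3046 can be sketched as two small functions: one that chains lines whose heights differ by at most 1 pixel from the previously read line, and one that unions overlapping or touching x ranges within a cluster and places the result at the mean height. This is a minimal sketch under stated assumptions, not the patented implementation: the function names and sample coordinates are hypothetical, and the x ranges are sorted by their start before merging, which the source does not state explicitly.

```python
from statistics import mean

def cluster_by_height(lines, tol=1):
    """Steps 302-3032: sort by y, then chain lines whose y is within tol
    of the previously read line (the marker y is updated on every line)."""
    clusters, marker = [], None
    for line in sorted(lines, key=lambda l: l[0][1]):
        y = line[0][1]
        if marker is None or abs(y - marker) > tol:
            clusters.append([])          # start a new hor_class
        clusters[-1].append(line)
        marker = y
    return clusters

def merge_cluster(cluster):
    """Steps 3041-3046: mean height plus union of connected x ranges."""
    y_mean = mean(l[0][1] for l in cluster)
    # Assumption: ranges are processed in ascending order of their start.
    ranges = sorted((min(l[0][0], l[1][0]), max(l[0][0], l[1][0])) for l in cluster)
    merged = [list(ranges[0])]
    for x_start, x_end in ranges[1:]:
        if x_start <= merged[-1][1]:                 # x21 lies in [x11, x12]
            merged[-1][1] = max(merged[-1][1], x_end)
        else:                                        # disjoint: new merged range
            merged.append([x_start, x_end])
    return [((x1, y_mean), (x2, y_mean)) for x1, x2 in merged]

# Hypothetical LSD output: two frame lines, each detected as several segments.
Horline_set = [
    ((0, 10), (50, 10)), ((40, 11), (100, 11)),
    ((120, 10), (200, 10)), ((150, 11), (210, 11)),
    ((0, 40), (200, 40)), ((0, 41), (200, 41)),
]
clusters = cluster_by_height(Horline_set)
mergeline_set = [seg for c in clusters for seg in merge_cluster(c)]
```

Note that segments at the same height that neither overlap nor touch (here the gap between x=100 and x=120) stay separate, which reflects the adjacency-based rule rather than a distance threshold.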

Step 305, processing the vertical frame lines: reading the vertical frame line image Vertical_img and detecting the frame lines in it with OpenCV's LSD line detection algorithm, to obtain a preliminary detection result, the vertical frame line cluster Verline_set, in which each line is expressed and stored as the coordinates of its two endpoints; then going to step 306;

Step 306, traversing all vertical lines in Verline_set and sorting them in ascending order of horizontal position x; proceeding to step 307 to cluster the lines of Verline_set that lie at the same horizontal position, based on topological adjacency or connection;

Step 307, if there are still unclassified lines in Verline_set, going to step 3071; otherwise going to step 308;

Step 3071, traversing the sorted Verline_set: initializing all_ver_class, which stores all line clusters, and the line cluster ver_class; adding ver_class to all_ver_class; adding the first line line1=[(x1,y1),(x1,y2)] to ver_class and setting the marker x=x1; then deleting line1 from Verline_set and going to step 3072;

Step 3072, continuing the traversal and reading the next line line2=[(x2,y3),(x2,y4)]; if the difference between the horizontal position x2 of line2 and the marker x satisfies |x-x2|≤1, adding line2 to ver_class, deleting line2 from Verline_set, updating the marker x=x2, and going to step 307; otherwise going to step 3071;

Step 308, traversing the set all_ver_class of all clustered line clusters; if any line cluster has not yet been traversed, going to step 3081; once all line clusters have been processed, going to step 400;

Step 3081, reading a line cluster ver_class and initializing the set of y-coordinate ranges yrange_set, the set of x coordinates x_set, and the set of merged y-coordinate ranges yrange_merge_set; going to step 3082;

Step 3082, traversing all lines in ver_class, denoting each line as linei=[(xi,yi1),(xi,yi2)], where i is the index of the line in the set, i=1, 2, 3, …, n; adding xi to x_set and [yi1,yi2] to yrange_set; deleting the traversed ver_class and going to step 3083;

Step 3083, reading x_set and calculating the arithmetic mean of all its elements, denoted x_mean; going to step 3084;

Step 3084, traversing yrange_set, initializing the y-coordinate range y_range=[y11,y12] and adding it to yrange_merge_set; going to step 3085;

Step 3085, continuing the traversal to obtain the next y-coordinate range y_range=[y21,y22]; if y21∈[y11,y12] and y22>y12, updating y12=y22; if y22<y12, making no update; deleting the ranges that have been traversed; if any y-coordinate range has not yet been traversed, going to step 3084, otherwise going to step 3086;

Step 3086, traversing the merged y-coordinate ranges in yrange_merge_set, each element being yrange_mergei=[yi1,yi2], outputting the line [(x_mean,yi1),(x_mean,yi2)] and adding it to the set mergeline_set; going to step 308.
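Because the vertical pass mirrors the horizontal one with the roles of x and y exchanged, both can be expressed by a single routine parameterized on which coordinate is fixed. The following is a sketch of that idea applied to vertical lines, assuming hypothetical names (`merge_lines`, the `vertical` flag) and sample data; as in the source, the clustering tolerance is 1 pixel, and sorting the ranges before merging is an added assumption.

```python
from statistics import mean

def merge_lines(lines, vertical, tol=1):
    """Cluster axis-aligned segments by their fixed coordinate, then union
    connected ranges along the free coordinate (steps 306-3086 when
    vertical=True)."""
    fixed, free = (0, 1) if vertical else (1, 0)   # coordinate indices
    items = sorted(
        (p[fixed], min(p[free], q[free]), max(p[free], q[free])) for p, q in lines
    )
    clusters, marker = [], None
    for pos, lo, hi in items:
        if marker is None or abs(pos - marker) > tol:
            clusters.append([])                    # start a new cluster
        clusters[-1].append((pos, lo, hi))
        marker = pos
    merged = []
    for cluster in clusters:
        pos_mean = mean(pos for pos, _, _ in cluster)
        spans = []
        for _, lo, hi in sorted(cluster, key=lambda item: item[1]):
            if spans and lo <= spans[-1][1]:       # overlaps or touches
                spans[-1][1] = max(spans[-1][1], hi)
            else:
                spans.append([lo, hi])
        for lo, hi in spans:
            a, b = [0, 0], [0, 0]
            a[fixed] = b[fixed] = pos_mean
            a[free], b[free] = lo, hi
            merged.append((tuple(a), tuple(b)))
    return merged

# Hypothetical LSD output for two vertical frame lines.
Verline_set = [
    ((30, 0), (30, 60)), ((31, 50), (31, 120)),
    ((30, 200), (30, 260)), ((31, 210), (31, 270)),
    ((90, 0), (90, 260)), ((91, 0), (91, 100)),
]
vertical_merged = merge_lines(Verline_set, vertical=True)
```

Calling the same function with `vertical=False` on horizontal segments would perform the corresponding horizontal merge.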

In one possible design, step 400 includes:

Step 401, reading the sets of horizontal and vertical frame lines obtained after merging and refinement, Hor_set and Ver_set; going to step 402;

Step 402, initializing the set of ordinates of horizontal frame lines Hory_set and the set of abscissas of vertical frame lines Verx_set; traversing Hor_set and adding the ordinate y of each line to Hory_set, and traversing Ver_set and adding the abscissa x of each line to Verx_set, both sorted in ascending order, giving Hory_set={y1,y2,…,yn} and Verx_set={x1,x2,…,xn}; going to step 403;

Step 403, traversing all horizontal frame lines in Hor_set and reading each line hor_linei=[(xi1,yi),(xi2,yi)], where i=1, 2, 3, …, n, to obtain the horizontal coordinates xi1 and xi2 of its two endpoints; if xi1∈Verx_set, making no adjustment, and likewise for xi2; otherwise going to step 404;

Step 404, calculating the absolute differences between xi1 and every x coordinate in Verx_set, and obtaining the minimum absolute difference D_min and the corresponding x coordinate xmin; comparing D_min with the set threshold t=3*w_mean: if D_min≤t, setting xi1=xmin, otherwise setting xi1=xmin+1; after all horizontal frame lines in Hor_set have been traversed and adjusted, going to step 405;

Step 405, traversing all vertical frame lines in Ver_set and reading each line ver_linei=[(xi,yi1),(xi,yi2)] to obtain its vertical range [yi1,yi2]; if yi1∈Hory_set, making no adjustment, and likewise for yi2; otherwise going to step 406;

Step 406, calculating the absolute differences between yi1 and every y coordinate in Hory_set, and obtaining the minimum absolute difference D_min and the corresponding y coordinate ymin; comparing D_min with the set threshold t=3*w_mean: if D_min≤t, setting yi1=ymin, otherwise setting yi1=ymin+1; after all vertical frame lines in Ver_set have been traversed and adjusted, going to step 407;

Step 407, obtaining the adjusted horizontal frame lines Hor_set and vertical frame lines Ver_set, sorted in ascending order of height y and horizontal position x respectively; going to step 408;

Step 408, traversing all lines in Ver_set and classifying them by the x coordinate of the vertical line to obtain Ver_x_class, in which all lines within a class share the same x coordinate; going to step 409;

Step 409, traversing every class Ver_class in Ver_x_class; if the class contains only one line, performing no operation; otherwise going to step 410;

Step 410, traversing all lines in the class, recording the x coordinate x0 of the lines of the current class and the y coordinates of the upper and lower endpoints of all lines, to obtain the set of y coordinates y_class_set={y1,y2,y3,…,yn}; going to step 411;

Step 411, taking the maximum y_max and minimum y_min of y_class_set to obtain the line line=[(x0,y_min),(x0,y_max)]; deleting all lines of Ver_class from Ver_set and adding line, until all classes have been traversed.
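The post-processing of step 400 amounts to two operations: snapping line endpoints onto the coordinates of the perpendicular frame lines when they are within t=3*w_mean (steps 402 to 406), and collapsing all vertical lines that share an x coordinate into one line spanning their extreme y values (steps 408 to 411). The sketch below illustrates both under stated assumptions. The function names and sample data are hypothetical, and one branch deliberately differs from the source: when D_min exceeds t, the source sets the endpoint to xmin+1 (or ymin+1), whereas this sketch simply leaves the endpoint unchanged.

```python
def snap(value, grid, t):
    """Snap one endpoint coordinate to the nearest grid coordinate if it is
    within t; already-aligned values are returned as they are."""
    if value in grid:
        return value
    nearest = min(grid, key=lambda g: abs(g - value))
    d_min = abs(nearest - value)
    # Assumption: beyond the threshold the endpoint is left unchanged
    # (the source's xmin+1 / ymin+1 branch is not reproduced here).
    return nearest if d_min <= t else value

def snap_endpoints(hor_set, ver_set, w_mean):
    """Steps 402-406 with threshold t = 3 * w_mean."""
    t = 3 * w_mean
    verx = sorted({p[0] for p, _ in ver_set})      # Verx_set
    hory = sorted({p[1] for p, _ in hor_set})      # Hory_set
    hor = [((snap(x1, verx, t), y), (snap(x2, verx, t), y))
           for (x1, y), (x2, _) in hor_set]
    ver = [((x, snap(y1, hory, t)), (x, snap(y2, hory, t)))
           for (x, y1), (_, y2) in ver_set]
    return hor, ver

def merge_same_x(ver_set):
    """Steps 408-411: one line per x coordinate, from y_min to y_max."""
    classes = {}
    for (x, y1), (_, y2) in ver_set:
        classes.setdefault(x, []).extend([y1, y2])
    return [((x, min(ys)), (x, max(ys))) for x, ys in sorted(classes.items())]

# Hypothetical merged lines: endpoints slightly off the grid, and a left
# border detected as two pieces with a gap.
Hor_set = [((3, 10), (198, 10)), ((0, 50), (200, 50))]
Ver_set = [((0, 10), (0, 30)), ((0, 33), (0, 48)), ((200, 12), (200, 50))]

hor, ver = snap_endpoints(Hor_set, Ver_set, w_mean=2)
ver_merged = merge_same_x(ver)
```

In this example the horizontal endpoints at x=3 and x=198 snap to the borders at x=0 and x=200, and the two pieces of the left border are replaced by one line from y=10 to y=50, which also closes the gap between them.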

In one possible design, the image format in step 100 is jpg, png or tif.

In one possible design, in step 500, the output is a vector file of the horizontal and vertical frame lines of the table in the drilling column chart image, stored as the coordinates of the two end nodes of each line in the coordinate system of an Esri Shapefile. [The storage format is given as a figure in the original publication.]

The present invention also provides a device for extracting image frame lines of a drilling column chart, comprising a memory, a control processor, and a computer program stored in the memory and executable on the control processor; the control processor executes the program to implement the method for extracting image frame lines of a drilling column chart described above.

The present invention also provides a control system comprising the device for extracting image frame lines of a drilling column chart described above.

The present invention also provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to implement the method for extracting image frame lines of a drilling column chart described above.

To meet the need for extracting the table frame lines of drilling column charts, the present invention preprocesses the input image to remove noise elements other than the frame lines, which improves detection accuracy. It also sets topological constraints based on the organizational rules of the table frame lines and, as post-processing, merges, refines and corrects the multiple line clusters in the detection results, improving the accuracy and completeness of the frame line detection results.

Compared with the prior art, the present invention has the following beneficial effects:

(1) The present invention can eliminate the influence of high-density text and noise on the LSD algorithm during line detection on a drilling column chart image.

(2) When merging and refining the detected line clusters, the present invention does not require thresholds to be set and adjusted manually; instead, it merges and refines the line clusters belonging to the same line based on the topological adjacency or connection between lines, which is efficient and broadly applicable.

(3) The present invention designs topological constraint rules based on the organizational rules of table frame lines, and corrects and supplements incompletely or inaccurately detected frame lines, improving the completeness and accuracy of the results.

BRIEF DESCRIPTION OF THE DRAWINGS

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 shows a logic block diagram of a method, device and system for extracting image frame lines of a drilling column chart according to an embodiment of the present application.

DETAILED DESCRIPTION

To further illustrate the embodiments, the present invention is provided with drawings. These drawings form part of the disclosure and are mainly used to illustrate the embodiments; together with the relevant description in the specification, they explain the operating principles of the embodiments. With reference to them, a person of ordinary skill in the art should be able to understand other possible implementations and the advantages of the present invention. The components in the figures are not drawn to scale, and similar reference symbols are generally used to denote similar components.

The existing technologies mainly include the following:

(1) Commonly used line detection algorithms include transform-domain algorithms, such as the Hough transform and RANSAC, and image-domain algorithms, such as the Line Segment Detector (LSD) and CannyLines.

(2) Line clusters produced by a line detection algorithm are commonly merged and refined with distance-based methods, that is, lines within a distance threshold tolerance are merged and refined.

The disadvantages of the above prior art are as follows:

(1) Owing to the line detection algorithm itself, unclear lines in the original image, and the multi-pixel width and jagged edges of frame lines in the image, the detection results are prone to missing or inaccurate line information. In addition, a single frame line in the actual image appears as a cluster of multiple lines in the detection results. Both effects make frame line detection inaccurate or impair subsequent table structure recognition.

(2) Image-domain line detection algorithms are suitable for complex images but are easily affected by image noise and by high-density text in tables; transform-domain algorithms are robust to noise, but their accuracy and completeness tend to drop on complex image extraction tasks.

(3) Distance-based merging of line clusters requires, on the one hand, repeated adjustment of the distance threshold for different images; on the other hand, because of jagged edges, the two ends of the same frame line can deviate considerably, so the merged result may be offset too far from the actual line and be inaccurate.

At present, frame line extraction from drilling column chart images is easily affected by image quality and by the behaviour of detection algorithms on complex detection tasks and noise, so the final results are inaccurate and incomplete. The commonly used distance-based methods for merging and refining the detected line clusters depend on threshold adjustment and are inefficient. For frame line detection in drilling column charts, the present invention, after image preprocessing (tilt correction, grayscale conversion and binarization), applies erosion followed by dilation to the image to eliminate the influence of noise and high-density text on frame line detection; clusters the detected lines that belong to the same frame line based on their topological adjacency and connection, and outputs the merged and refined result; and finally establishes topological constraint rules from the organizational rules of the frame lines of drilling column charts and corrects the frame lines to obtain the final result.

For the task of detecting complex table frame lines in drilling column chart images, the present invention builds on the image-domain LSD detection algorithm and optimizes detection through preprocessing and post-processing. Erosion and dilation of the input image remove noise and text and weaken the influence of noise; the line clusters are clustered, merged and refined based on topological adjacency and connection, achieving accurate line extraction; finally, inaccurate and missing frame lines in the detection results are corrected using the organizational rules of the table frame lines of drilling column charts.

For extracting complex tables from drilling column charts, the image-domain LSD algorithm used by the present invention gives more complete and accurate detection results than transform-domain algorithms, but it is less robust to noise. The present invention therefore first preprocesses the input image: an adaptive threshold (the kernel size) is set from the image size and erosion followed by dilation is applied, removing noise, text, patterns and the like so that they do not affect frame line detection. The LSD algorithm is then applied to the processed frame line images to obtain the detection results, which are merged, refined and corrected: line clusters belonging to frame lines at the same height or horizontal position are clustered according to the topological adjacency or connection within the cluster and are then merged and refined. Finally, topological constraints are established from the organizational rules of table frame lines, and frame lines that were only partially detected are corrected and supplemented.

如图1所示，本发明实施例公开了一种钻孔柱状图的图像框线提取方法，包括以下步骤：As shown in FIG1 , an embodiment of the present invention discloses a method for extracting image frame lines of a drilling histogram, comprising the following steps:

步骤100，输入图像格式钻孔柱状图，其中图像格式包括jpg或png或tif；Step 100, input the drilling histogram in image format, wherein the image format includes jpg, png or tif;

步骤200，输入图像预处理；Step 200, input image preprocessing;

步骤500，输出为图像钻孔柱状图中的表格框线中水平与竖直框线的矢量文件，以Esri Shapefile文件坐标系下线两端节点坐标存储，存储形式如下：Step 500, output is a vector file of the horizontal and vertical frame lines in the table frame line in the image drilling histogram, which is stored in the coordinates of the nodes at both ends of the line in the Esri Shapefile file coordinate system, and the storage format is as follows:

，结束。 ,Finish.

其中，步骤200具体包括，Wherein, step 200 specifically includes:

步骤202，利用OpenCV对输入图像Image进行灰度化与自适应阈值二值化操作，输出二值化图像Binary_img；Step 202, using OpenCV to perform grayscale and adaptive threshold binarization operations on the input image Image, and output a binary image Binary_img;

步骤203，根据高H和宽W分别设置自适应大小矩形卷积核，纵高矩形Kernel1，尺寸为高H/20，宽1，其中1表示1个像素；横宽矩形Kernel2，尺寸为高1，其中1表示1个像素，宽W/20；分别利用Kernel1和Kernel2对Binary_img利用OpenCV进行先腐蚀后膨胀，得到水平框线图像Horizontal_img和竖直框线图像Vertical_img；Step 203, according to the height H and width W, respectively set the adaptive size rectangular convolution kernel, the height rectangle Kernel1, the size is height H/20, width 1, where 1 represents 1 pixel; the width rectangle Kernel2, the size is height 1, where 1 represents 1 pixel, width W/20; respectively use Kernel1 and Kernel2 to corrode Binary_img using OpenCV first and then dilate to obtain the horizontal frame line image Horizontal_img and the vertical frame line image Vertical_img;

其中，步骤300具体为，直线检测与直线簇合并与细化过程分水平框线与竖直框线两部分进行，对水平框线进行提取与处理。Specifically, step 300 includes performing the straight line detection and straight line cluster merging and thinning process in two parts: horizontal frame lines and vertical frame lines, and extracting and processing the horizontal frame lines.

Step 300 includes:

Step 3081, read the line cluster class ver_class, and initialize the y coordinate range set yrange_set, the x coordinate set x_set, and the merged line y coordinate set yrange_merge_set; proceed to step 3082;
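By way of non-limiting example, the merging of one vertical line cluster into refined lines (steps 3081 to 3086) may be sketched in Python as follows. The function name merge_vertical_cluster and the sample data are assumptions for illustration, and the sketch sorts the y ranges by their start before merging, which the steps do not state explicitly.

```python
def merge_vertical_cluster(ver_class):
    """Collapse one cluster of near-collinear vertical segments into merged lines."""
    x_set = [seg[0][0] for seg in ver_class]
    # Normalise each range so that the smaller y comes first, then sort (assumption).
    yrange_set = sorted(
        (min(seg[0][1], seg[1][1]), max(seg[0][1], seg[1][1])) for seg in ver_class
    )
    x_mean = sum(x_set) / len(x_set)
    yrange_merge_set = []
    for y_start, y_end in yrange_set:
        if yrange_merge_set and y_start <= yrange_merge_set[-1][1]:
            # The range starts inside the current merged range: extend it if longer.
            yrange_merge_set[-1][1] = max(yrange_merge_set[-1][1], y_end)
        else:
            yrange_merge_set.append([y_start, y_end])
    return [((x_mean, y1), (x_mean, y2)) for y1, y2 in yrange_merge_set]


# Each line is ((x, y1), (x, y2)); values are made up for the example.
ver_class = [((100, 0), (100, 60)), ((101, 50), (101, 120)), ((99, 200), (99, 260))]
merged = merge_vertical_cluster(ver_class)
```

The overlapping ranges [0, 60] and [50, 120] become one line, while the disjoint range [200, 260] is kept as a separate line at the same mean x position.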

Step 400 includes:

Although the above methods are illustrated and described as a series of actions for simplicity of explanation, it should be understood and appreciated that these methods are not limited by the order of the actions, because according to one or more embodiments, some actions may occur in a different order and/or concurrently with other actions, whether illustrated and described herein or not illustrated and described herein but understandable to those skilled in the art.
Those skilled in the art will further appreciate that the various illustrative logic blocks, modules, circuits, and algorithm steps described in conjunction with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of the two. To illustrate this interchangeability of hardware and software clearly, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and the design constraints imposed on the overall system. Those skilled in the art may implement the described functionality in different ways for each specific application, but such implementation decisions should not be interpreted as a departure from the scope of the present invention. The various illustrative logic blocks, modules, and circuits described in conjunction with the embodiments disclosed herein may be implemented or executed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in an alternative, the processor may be any conventional processor, a battery compartment control board, a micro-battery compartment control board, or a state machine. The processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in cooperation with a DSP core, or any other such configuration. 
The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to a processor so that the processor can read and write information from/to the storage medium. In an alternative, the storage medium can be integrated into the processor. The processor and the storage medium can reside in an ASIC. The ASIC can reside in a user terminal. In an alternative, the processor and the storage medium can reside in a user terminal as discrete components. In one or more exemplary embodiments, the functions described can be implemented in hardware, software, firmware, or any combination thereof. If implemented as a computer program product in software, each function can be stored on or transmitted by a computer-readable medium as one or more instructions or codes. Computer-readable media include both computer storage media and communication media, including any media that facilitates the transfer of a computer program from one place to another. Storage media can be any available medium that can be accessed by a computer. As an example and not a limitation, such a computer-readable medium may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, disk storage or other magnetic storage device, or any other medium that can be used to carry or store a desired program code in the form of an instruction or data structure and can be accessed by a computer. Any connection is also properly referred to as a computer-readable medium. 
For example, if the software is transmitted from a website, a central computer, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc as used herein include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The above is only a preferred embodiment of the present invention and does not limit the present invention in any form. Although the present invention has been disclosed above by way of a preferred embodiment, that embodiment is not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications that yield equivalent embodiments. Any simple modification, equivalent change or refinement made to the above embodiments in accordance with the technical essence of the present invention, without departing from the content of its technical solution, remains within the scope of the technical solution of the present invention.

Claims

1. A method for extracting image frame lines of a drilling histogram, characterized in that it comprises the following steps:

Step 100, inputting a drilling histogram in an image format;

Step 200, preprocessing the input image;

Step 300, detecting frame lines and merging and refining line clusters;

Step 400, post-processing the merged and refined result lines;

Step 500, outputting a vector file of the horizontal and vertical frame lines of the table frame in the image drilling histogram, and ending.

2. The method for extracting image frame lines of a drilling histogram according to claim 1, characterized in that step 200 specifically comprises:

Step 201, read the drilling histogram image matrix Image, and obtain the height H and width W of the matrix Image;

Step 202, using OpenCV to perform grayscale conversion and adaptive threshold binarization operations on the input image Image, and output a binary image Binary_img;

Step 203, according to the height H and width W, set adaptively sized rectangular convolution kernels: a tall rectangle Kernel1 with height H/20 and width 1, and a wide rectangle Kernel2 with height 1 and width W/20; using OpenCV, erode and then dilate Binary_img with Kernel1 and Kernel2 to obtain the horizontal frame line image Horizontal_img and the vertical frame line image Vertical_img;

Step 204, vertically project the vertical frame line image Vertical_img to obtain a pixel line width set Line_width={w1,w2,w3,…,wn}, and calculate the arithmetic mean w_mean of the set; the input image preprocessing is then complete, and the process proceeds to step 300.
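By way of non-limiting example, and not as part of the claim language, the line width estimate of step 204 may be sketched in Python as follows. The sketch assumes that the vertical projection marks each image column containing any foreground pixel and that each run of consecutive marked columns is one line width; the function name mean_line_width and the sample image are hypothetical.

```python
def mean_line_width(vertical_img):
    """Estimate w_mean from a column-wise projection of a binary vertical-line image."""
    n_cols = len(vertical_img[0])
    # A column is occupied if any pixel in it is foreground (non-zero).
    occupied = [any(row[c] for row in vertical_img) for c in range(n_cols)]
    line_width = []
    run = 0
    for flag in occupied + [False]:  # the sentinel closes a run at the right edge
        if flag:
            run += 1
        elif run:
            line_width.append(run)
            run = 0
    w_mean = sum(line_width) / len(line_width) if line_width else 0.0
    return line_width, w_mean


# Four identical rows containing one 2 px wide and one 4 px wide vertical line.
vertical_img = [[0, 255, 255, 0, 0, 255, 255, 255, 255, 0]] * 4
line_width, w_mean = mean_line_width(vertical_img)
```

The resulting w_mean is the quantity later used to form the snapping threshold t=3*w_mean in steps 404 and 406.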

3. The method for extracting image frame lines of a drilling histogram according to claim 1 or 2, characterized in that, in step 300, the line detection and the line cluster merging and refinement are carried out in two parts, one for horizontal frame lines and one for vertical frame lines, and the horizontal frame lines are extracted and processed.

4. The method for extracting image frame lines of a drilling histogram according to claim 3, characterized in that step 300 comprises:

Step 301, read the horizontal frame line image Horizontal_img, and use OpenCV's LSD line detection algorithm to detect the frame lines in Horizontal_img, obtaining a preliminary detection result, the horizontal frame line cluster Horline_set, in which each line is expressed and stored as the coordinates of its two end points;

Step 302, traverse all horizontal lines in Horline_set and sort them in ascending order of height y; proceed to step 303 to cluster the lines of Horline_set that lie at the same height, based on topological adjacency or connection;

Step 303, if there are still unclassified lines in Horline_set, go to step 3031; otherwise go to step 304;

Step 3031, traverse the sorted Horline_set; initialize all_hor_class, which stores the line clusters of all heights, and hor_class, the line cluster of a single height; add hor_class to all_hor_class; add the first line line1=[(x1,y1),(x2,y1)] to hor_class and set the mark y=y1; then delete line1 from Horline_set and proceed to step 3032;

Step 3032, continue traversing the lines in the set and read the next line line2=[(x3,y2),(x4,y2)]; if the height y2 of line2 differs from the mark y by at most 1, that is |y-y2|≤1, add line2 to hor_class, delete line2 from Horline_set, update the mark y=y2, and go to step 303; otherwise go to step 3031;

Step 304, traverse all_hor_class, the set of all clustered line clusters; while there are line clusters that have not been traversed, go to step 3041; once all horizontal line clusters have been processed, go to step 305;

Step 3041, read the line cluster class hor_class, initialize the x coordinate range set xrange_set, the y coordinate set y_set, and the merged line x coordinate set xrange_merge_set; proceed to step 3042;

Step 3042, traverse all lines in hor_class, record linei=[(xi1,yi),(xi2,yi)], i is the index of the line in the set, i=1, 2, 3, ..., n, add yi to y_set, add [xi1, xi2] to xrange_set, delete the traversed hor_class, and go to step 3043;

Step 3043, read y_set, calculate the arithmetic mean of all elements in y_set, record it as y_mean, and go to step 3044;

Step 3044, traverse xrange_set, initialize the x coordinate range of the line x_range=[x11,x12], add it to xrange_merge_set, and proceed to step 3045;

Step 3045, continue traversing to obtain the next x coordinate range x_range=[x21,x22]. If x21∈[x11,x12] and x22>x12, update x12=x22. If x22<x12, do not update. Delete the lines that have been traversed. If there are still x coordinate ranges that have not been traversed, go to step 3044. Otherwise, go to step 3046.

Step 3046, traverse the merged x-coordinate range xrange_merge_set, the internal element xrange_mergei=[xi1,xi2], output the line [(xi1,y_mean),(xi2,y_mean)], and add it to the set mergeline_set; go to step 304;

Step 305, process the vertical frame line detection result: read the vertical frame line image Vertical_img, and use OpenCV's LSD line detection algorithm to detect the frame lines in Vertical_img, obtaining a preliminary detection result, the vertical frame line cluster Verline_set, in which each line is expressed and stored as the coordinates of its two end points; then proceed to step 306;

Step 306, traverse all vertical lines in Verline_set and sort them in ascending order of horizontal position x; proceed to step 307 to cluster the lines of Verline_set that lie at the same horizontal position, based on topological adjacency or connection;

Step 307, if there are still unclassified lines in Verline_set, go to step 3071; otherwise go to step 308;

Step 3071, traverse the sorted Verline_set; initialize all_ver_class, which stores all line cluster classes, and the line cluster class ver_class; add ver_class to all_ver_class; add the first line line1=[(x1,y1),(x1,y2)] to ver_class and set the mark x=x1; then delete line1 from Verline_set and proceed to step 3072;

Step 3072, continue traversing the lines in the set and read the next line line2=[(x2,y3),(x2,y4)]; if the horizontal position x2 of line2 differs from the mark x by at most 1, that is |x-x2|≤1, add line2 to ver_class, delete line2 from Verline_set, update the mark x=x2, and go to step 307; otherwise go to step 3071;

Step 308, traverse all_ver_class, the set of all clustered line clusters; while there are line clusters that have not been traversed, go to step 3081; once all line clusters have been processed, go to step 400;

Step 3081, read the line cluster class ver_class, and initialize the y coordinate range set yrange_set, the x coordinate set x_set, and the merged line y coordinate set yrange_merge_set; then proceed to step 3082;

Step 3082, traverse all lines in ver_class, record linei=[(xi,yi1),(xi,yi2)], where i is the index of the line in the set, i=1, 2, 3, ..., n, add xi to x_set, add [yi1, yi2] to yrange_set, delete the traversed ver_class, and go to step 3083;

Step 3083, read x_set, calculate the arithmetic mean of all elements in x_set, record it as x_mean, and go to step 3084;

Step 3084, traverse yrange_set, initialize the y coordinate range of the line y_range=[y11,y12], add it to yrange_merge_set, and proceed to step 3085;

Step 3085, continue traversing to obtain the y coordinate range y_range=[y21,y22]. If y21∈[y11,y12] and y22>y12, update y12=y22. If y22<y12, do not update. Delete the lines that have been traversed. If there are still y coordinate ranges that have not been traversed, go to step 3084. Otherwise, go to step 3086.

Step 3086, traverse the merged y-coordinate range yrange_merge_set, the internal element yrange_mergei=[yi1,yi2], output the straight line [(x_mean,yi1),(x_mean,yi2)], and add it to the set mergeline_set; go to step 308.

5. The method for extracting image frame lines of a drilling histogram according to claim 4, wherein step 400 comprises:

Step 401, read the horizontal and vertical frame line sets Hor_set and Ver_set obtained after merging and thinning, and proceed to step 402;

Step 402, initialize the horizontal frame line ordinate set Hory_set, the vertical frame line abscissa set Verx_set, traverse the ordinate y of the straight line in Hor_set and add it to Hory_set, traverse the abscissa x of the straight line in Ver_set and add it to Verx_set, and arrange them in ascending order to obtain Hory_set = {y1, y2, ..., yn} and Verx_set = {x1, x2, ..., xn}; proceed to step 403;

Step 403, traverse all horizontal frame lines in Hor_set and read each line hor_linei=[(xi1,yi),(xi2,yi)], where i=1, 2, 3, ..., n, obtaining the horizontal coordinates xi1 and xi2 of its end points; if xi1∈Verx_set, no adjustment is made, and likewise for xi2; otherwise proceed to step 404;

Step 404, calculate the absolute value of the difference between xi1 and each x coordinate in Verx_set, obtain the minimum absolute difference D_min and its corresponding x coordinate xmin, and compare D_min with the set threshold t=3*w_mean. If D_min≤t, set xi1=xmin, otherwise set xi1=xmin+1; once all horizontal frame lines in Hor_set have been traversed and adjusted, proceed to step 405;

Step 405, traverse all vertical frame lines in Ver_set, read the straight line ver_linei=[(xi,yi1),(xi,yi2)], and obtain the vertical range of the straight line [yi1,yi2]. If yi1∈Hory_set, no adjustment is made, and the same is true for yi2. Otherwise, go to step 406;

Step 406, calculate the absolute value of the difference between yi1 and each y coordinate in Hory_set, obtain the minimum absolute difference D_min and its corresponding y coordinate ymin, and compare D_min with the set threshold t=3*w_mean. If D_min≤t, set yi1=ymin, otherwise set yi1=ymin+1; once all vertical frame lines in Ver_set have been traversed and adjusted, proceed to step 407;

Step 407, obtain the adjusted horizontal frame line Hor_set and vertical frame line Ver_set, arrange them in ascending order according to the height y coordinate and the horizontal position x coordinate, and proceed to step 408;

Step 408, traverse all the lines in Ver_set and classify them according to the x-coordinates of the vertical lines to obtain Ver_x_class, where the x-coordinates of the lines in each class are the same, and then proceed to step 409;

Step 409, traverse all classes Ver_class in Ver_x_class, if the class length is equal to 1, no operation is performed, otherwise go to step 410;

Step 410, traverse all the lines in the class, record the x coordinate x0 of the current class line, and the y coordinates of the upper and lower endpoints of all the lines, obtain the y coordinate set y_class_set={y1,y2,y3,…,yn}, and go to step 411;

Step 411, take the maximum value y_max and the minimum value y_min in y_class_set to obtain the line line=[(x0,y_min),(x0,y_max)], delete from Ver_set all lines belonging to Ver_class, and add line to Ver_set; repeat until all classes have been traversed.
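By way of non-limiting example, and not as part of the claim language, the endpoint adjustment of steps 403 and 404 and the merging of vertical lines that share an x coordinate in steps 408 to 411 may be sketched in Python as follows. The function names snap, snap_horizontal and merge_same_x and the sample data are hypothetical; the branch for D_min greater than t follows the wording of step 404 literally.

```python
def snap(value, anchors, t):
    """Adjust one endpoint coordinate against the anchor coordinates (steps 404/406)."""
    if value in anchors:
        return value  # already on a frame line: no adjustment
    nearest = min(anchors, key=lambda a: abs(a - value))
    d_min = abs(nearest - value)
    # Within the threshold: snap onto the anchor; otherwise anchor + 1, as the step is worded.
    return nearest if d_min <= t else nearest + 1


def snap_horizontal(hor_set, verx_set, w_mean):
    """Steps 403-404: align both end points of every horizontal line."""
    t = 3 * w_mean
    return [
        ((snap(x1, verx_set, t), y), (snap(x2, verx_set, t), y))
        for (x1, y), (x2, _) in hor_set
    ]


def merge_same_x(ver_set):
    """Steps 408-411: replace vertical lines sharing one x by a single spanning line."""
    by_x = {}
    for (x, y1), (_, y2) in ver_set:
        by_x.setdefault(x, []).extend([y1, y2])
    return [((x, min(ys)), (x, max(ys))) for x, ys in sorted(by_x.items())]


# Made-up example: one horizontal line whose ends fall 2 px short of the verticals.
hor_set = [((12, 50), (198, 50))]
verx_set = [10, 100, 200]
snapped = snap_horizontal(hor_set, verx_set, w_mean=2)
merged = merge_same_x([((10, 0), (10, 40)), ((10, 60), (10, 90)), ((100, 0), (100, 90))])
```

The same snap helper can be applied to the y coordinates of vertical lines against Hory_set for steps 405 and 406.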

6. The method for extracting image frame lines of a drilling histogram according to any one of claims 1, 2, 4 or 5, characterized in that the image format in step 100 includes jpg, png or tif.

7. The method for extracting image frame lines of a drilling histogram according to any one of claims 1, 2, 4 or 5, characterized in that in step 500, the output is a vector file of the horizontal and vertical frame lines of the table frame in the image drilling histogram, stored as the coordinates of the nodes at both ends of each line in the Esri Shapefile coordinate system, in the following storage format:

_.

8. A device for extracting image frame lines of a drilling histogram, characterized in that it comprises a memory, a control processor, and a computer program stored in the memory and executable on the control processor, wherein the control processor executes the program to implement the method for extracting image frame lines of a drilling histogram according to any one of claims 1 to 7.

9. A control system, characterized by comprising the device for extracting image frame lines of a drilling histogram according to claim 8.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to implement the method for extracting image frame lines of a drilling histogram according to any one of claims 1 to 7.