CN114120340A - Text image structured processing method and device - Google Patents

Info

Publication number
CN114120340A
Authority
CN
China
Prior art keywords
text
text box
item
box
boxes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111230230.3A
Other languages
Chinese (zh)
Other versions
CN114120340B (en)
Inventor
王亚领
马文伟
付晓
刘设伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd and Taikang Online Property Insurance Co Ltd
Priority to CN202111230230.3A
Publication of CN114120340A
Application granted
Publication of CN114120340B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08: Insurance
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The invention provides a structured processing method and device for text images. The method includes: determining the text boxes in a text image and the item text boxes among all the text boxes; determining the header text box and the attribute name text boxes among all the text boxes; according to the orientation relationship among the header text box, the item text boxes and the attribute name text boxes, determining from all the text boxes the attribute value text boxes that correspond to both the item text boxes and the attribute name text boxes, and determining the multi-line print item text boxes; and merging the multi-line print item text boxes when establishing the structured relationship of the text image. Alongside the structured output, the invention uses this orientation relationship to identify the multi-line print item text boxes among all the item text boxes and merge them, which solves the multi-line printing problem in the structured output of text images. In addition, the whole process can be carried out automatically by machine algorithms, which reduces labor cost.

Description

Text image structured processing method and device
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a text image structured processing method and device, computer equipment and a computer readable storage medium.
Background
In the medical field, Optical Character Recognition (OCR) of invoice images is mainly used to output the item name detail fields of a medical invoice in a standard structured format, using the recognized character content and the character coordinate information of the image.
In the prior art, a client uploads a plurality of medical invoices that contain a large amount of text information, and when a claim settlement operator performs a claim settlement operation, all medical item names and the corresponding attribute items on the expense list must be entered accurately and in full.
However, because the layout of a medical invoice is relatively complex, an item name that is too long is printed across multiple lines, which makes standard structured output of the data difficult to achieve. In addition, the current scheme relies heavily on manual work, so the labor cost is high.
Disclosure of Invention
In view of the above, the present invention provides a text image structured processing method, a text image structured processing device, a computer device, and a computer readable storage medium. They address two problems of the current scheme: an item name that is too long is printed across multiple lines, which makes standard structured output of the data difficult, and the high degree of manual involvement makes the labor cost high.
According to a first aspect of the present invention, there is provided a method for structured processing of text images, including:
determining text boxes in the text image and item text boxes in all the text boxes;
determining a header text box and an attribute name text box in all text boxes;
determining attribute value text boxes respectively corresponding to the item text box and the attribute name text box from all the text boxes according to the orientation relation among the header text box, the item text box and the attribute name text box, and determining multi-line printing item text boxes in all the item text boxes;
and merging the multi-line print item text box with the text boxes of the adjacent lines when establishing the structured relationship of the text image according to the correspondence among the item text box, the attribute name text box and the attribute value text box.
According to a second aspect of the present invention, there is provided a text image structuring processing device, which may include:
the recognition module is used for determining text boxes in the text image and item text boxes in all the text boxes;
the first determining module is used for determining a header text box and an attribute name text box in all the text boxes;
a second determining module, configured to determine, according to orientation relationships among the header text box, the item text box, and the attribute name text box, attribute value text boxes respectively corresponding to the item text box and the attribute name text box from all the text boxes, and determine multi-line print item text boxes from all the item text boxes;
and the merging module is used for merging the multi-line print item text box with the text boxes of the adjacent lines when the structured relationship of the text image is established according to the correspondence among the item text box, the attribute name text box and the attribute value text box.
In a third aspect, an embodiment of the present invention provides a computer device, where the computer device includes:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the steps included in the text image structuring processing method according to the first aspect according to the obtained program instructions.
In a fourth aspect, the embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the method for processing a text image in a structured manner according to the first aspect.
Aiming at the prior art, the invention has the following advantages:
The invention provides a text image structured processing method. Using the orientation relationship among the header text box, the item text box and the attribute name text box in the preliminary OCR recognition result of the text image, the method determines and outputs the structured correspondence among the item text box, the attribute name text box and the attribute value text box. At the same time, it uses the same orientation relationship to identify the multi-line print item text boxes among all the item text boxes and merge them, which solves the multi-line printing problem in the structured output of the text image and improves the quality of that output.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart illustrating steps of a method for processing a text image in a structured manner according to an embodiment of the present invention;
FIG. 2 is a text image provided by an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps of another method for processing a text image in a structured manner according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a local area of a text image according to an embodiment of the present invention;
FIG. 5 is a block diagram of a device for processing a text image in a structured manner according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 is a flowchart of steps of a method for processing a text image in a structured manner, where as shown in fig. 1, the method may include:
Step 101, determining text boxes in the text image and item text boxes in all the text boxes.
In the embodiment of the present invention, the text image may be an image containing text content, such as a medical invoice image uploaded by a client, a scanned certificate image, and the like.
In practical applications, the text content of a text image follows a structured format. Specifically, each piece of text content lies in its own text box, and the positions of the text boxes are constrained by the structured format. For example, the text image contains item text boxes that represent item names, attribute value text boxes that represent attribute values of those items, and so on. Different types of text boxes occupy different positions, and positional constraints such as row and column relationships hold between them; for instance, the attribute value text box corresponding to an item text box lies in the region laterally adjacent to that item text box. All of these constraints need to be considered in the structured processing of the text image.
Further, in addition to text boxes for a specific field (such as medical items), a text image also contains text boxes related to other contents. For example, a medical invoice also contains text boxes for other information, such as the payee, the rechecker, and the like.
Step 102, determining a header text box and an attribute name text box in all the text boxes.
In the text image structuring process of the embodiment of the present invention, the more important text boxes further include a header text box and an attribute name text box. Refer to fig. 2, which shows a text image provided by the embodiment of the present invention in which a plurality of text boxes 10 are identified. The header text box 11 represents the item information at the header position; for example, the content of the header text box 11 may be "item name". The plurality of item text boxes 12 identified in step 101 represent specific item content, which may include: "the following is the list item [guggu] movement (lactulose oral solution)", "nonpareil bran (wheat cellulose particles)", "mucosal anti-infection treatment 4 (active silver ion antibacterial solution (silverton))", and the like. The item text box 12 may be a subordinate text box of the header text box 11 and may be arranged in a region longitudinally adjacent to the header text box 11. The attribute name text box 13 represents an attribute name, with content such as "quantity/unit", "amount (yuan)", and the like.
Further, the header text box and the attribute name text box may be obtained by matching preset keywords. For example, the header text box may be matched by the keyword "item name", and the attribute name text box may be matched by the keywords "quantity" and "amount".
Step 103, according to the orientation relationship among the header text box, the item text box and the attribute name text box, determining attribute value text boxes respectively corresponding to the item text box and the attribute name text box from all the text boxes, and determining multi-line printing item text boxes from all the item text boxes.
Specifically, referring to fig. 2, current OCR recognition suffers from a multi-line printing problem for the item text box 12: a complete item text box 12 is erroneously recognized as several multi-line print item text boxes 121. For example, the complete item text box 12 "the following is the list item [guggu] movement (lactulose oral solution)" is erroneously recognized as multi-line print item text boxes 121 that include "the following is the list item" and "solution". This erroneous recognition greatly affects the accuracy of the structured output of the text image.
In the embodiment of the invention, the orientation relationship among the header text box, the item text box and the attribute name text box is used both to produce the structured output of the text image and to solve the multi-line printing problem among the item text boxes. The structured output of the text image is based on determining the attribute value text box corresponding to each item text box, where the attribute value text box represents the specific value of that item for a certain attribute name. For example, if the item text box is "treatment fee", the attribute value text box corresponding to it under the attribute name "amount" may contain a sum of money, such as 60.
Referring to fig. 2, the following orientation characteristics can be seen: the item text box 12 and its corresponding attribute value text boxes 14 are arranged in a horizontal row; the item text boxes 12 and the header text box 11 are arranged in a vertical column; the header text box 11 and the attribute name text boxes 13 are arranged in a horizontal row, with the header text box 11 at the head of that row; and each attribute name text box 13 and its attribute value text boxes 14 are arranged in a vertical column. Based on these orientation relationships, a horizontal straight line 21 may first be drawn starting from the item text box 12 (the slope of the straight line 21 is calculated from the slope of the text in the text image), and each text box 10 that overlaps the horizontal straight line 21 is determined to be an attribute value text box 14 corresponding to that item text box 12. Traversing all the item text boxes 12 in this manner and determining the attribute value text boxes 14 corresponding to each item text box 12 in turn yields the preliminary structured output of the text image.
In addition, since the attribute name text box 13 and the attribute value text boxes 14 are arranged in a vertical column, a downward vertical straight line 22 can be drawn from the attribute name text box 13 (the slope of the straight line 22 is calculated from the slope of the text in the text image), so that the text boxes framed by the straight line 22 can be taken as the attribute value text boxes 14 under the attribute name text box 13.
Further, referring to fig. 2, it can be seen that the horizontal straight line 23, which is set starting from the multi-line print item text box 121, does not overlap any of the attribute value text boxes 14, and therefore the multi-line print item text boxes 121 in all of the item text boxes 12 can be determined based on this orientation relationship.
Step 104, merging the multi-line print item text box with the text boxes of the adjacent lines when establishing the structured relationship of the text image according to the correspondence among the item text box, the attribute name text box and the attribute value text box.
In the embodiment of the invention, after OCR recognition is performed on the text image and the structured correspondence among the item text box, the attribute name text box and the attribute value text box is determined, the multi-line print item text box can be merged with the text boxes of the lines adjacent above and below it, which solves the multi-line printing problem in the structured output and improves its quality.
For example, referring to fig. 2, the erroneously recognized multi-line print item text boxes 121 may be "the following is the list item" and "solution". Merging them with the vertically adjacent text box "[guggu] movement (lactulose oral" yields the correct and complete item text box: "the following is the list item [guggu] movement (lactulose oral solution)".
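The merge in step 104 can be sketched in a few lines of Python. The source does not state how a fragment chooses which neighbouring line to join, so this sketch assumes a nearest-line rule: each multi-line print item text box is attached to the vertically closest item text box that does have attribute values. The function name `merge_multiline_items`, the tuple layout and the `sep` parameter are hypothetical, introduced only for illustration.

```python
from typing import List, Tuple

# (center_y, text, is_fragment); is_fragment marks a multi-line print item text box.
# This tuple layout is an assumption made for the sketch, not the patent's data model.
ItemBox = Tuple[float, str, bool]


def merge_multiline_items(items: List[ItemBox], sep: str = " ") -> List[str]:
    """Merge fragments into the nearest complete item line (assumed rule)."""
    ordered = sorted(items, key=lambda it: it[0])       # top to bottom
    anchors = [it for it in ordered if not it[2]]       # lines that own attribute values
    if not anchors:
        return [text for _, text, _ in ordered]
    groups: List[List[ItemBox]] = [[a] for a in anchors]
    for frag in ordered:
        if frag[2]:
            # Attach the fragment to the anchor with the closest vertical center.
            nearest = min(range(len(anchors)),
                          key=lambda i: abs(anchors[i][0] - frag[0]))
            groups[nearest].append(frag)
    # Inside each group, restore reading order before joining the texts.
    return [sep.join(text for _, text, _ in sorted(g, key=lambda it: it[0]))
            for g in groups]


rows = [
    (10.0, "the following is the list item", True),
    (20.0, "[guggu] movement (lactulose oral", False),
    (30.0, "solution)", True),
    (50.0, "nonpareil bran (wheat cellulose particles)", False),
]
merged = merge_multiline_items(rows)
```

For Chinese invoice text, an empty `sep` would probably be the more natural choice, since the fragments are pieces of one unspaced string.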
In summary, the structured processing method for a text image provided by the embodiments of the present invention uses the orientation relationship among the header text box, the item text box and the attribute name text box in the preliminary OCR recognition result of the text image. While determining and outputting the structured correspondence among the item text box, the attribute name text box and the attribute value text box, the method further uses this orientation relationship to identify the multi-line print item text boxes among all the item text boxes and merge them. This solves the multi-line printing problem in the structured output of the text image and improves the quality of that output. In addition, the whole process can be carried out automatically by a machine algorithm, which reduces manual involvement and labor cost.
Fig. 3 is a flowchart of steps of another method for processing a text image in a structured manner, according to an embodiment of the present invention, as shown in fig. 3, the method may include:
Step 201, determining a text box in the text image and text content contained in the text box.
In the embodiment of the invention, the text boxes in the text image can be detected by a text box detection model, which outputs a set box_set of text boxes. Each text box in the set comprises 8 values $[x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3]$, which are the coordinates of the 4 vertices of the text box: upper left, upper right, lower left and lower right.
Step 202, inputting the text content of the text box into a text classification model to obtain the item text content with the type of an item name, and determining the item text box corresponding to the item text content.
In this step, after the text boxes are identified, the region of each text box may be fed into an item name text classification model. The model first recognizes the text content in the text boxes to obtain a text content set info_set, then performs semantic classification on the text content to obtain the set pro_info_set of item text content whose classification result is an item name. Finally, the text boxes corresponding to the item text content are determined to be item text boxes, which gives the item text box set pro_box_set.
Step 203, matching the preset keywords with the text content of the text box, and determining the header text box and the attribute name text box.
In this step, the text content of the text box may be matched with the preset header keyword to determine the header text box, and the text content of the text box may be matched with the preset attribute name keywords to determine the attribute name text box.
For example, referring to fig. 2, the header text box 11 may be matched by the keyword "item name", and the attribute name text box 13 may be matched by the keywords "quantity" and "amount".
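As a rough illustration of step 203, the keyword matching can be written as a substring test over the recognized text of each box. The keyword tuples reuse the examples given above; the function name `match_boxes` and the use of case-insensitive substring matching are assumptions, since the source does not say how the match is performed.

```python
from typing import List, Tuple

# Preset keywords taken from the examples in the text; real lists would be configured.
HEADER_KEYWORDS = ("item name",)
ATTRIBUTE_KEYWORDS = ("quantity", "amount")


def match_boxes(texts: List[str]) -> Tuple[List[int], List[int]]:
    """Return (header box indices, attribute name box indices) by keyword match."""
    headers: List[int] = []
    attributes: List[int] = []
    for index, text in enumerate(texts):
        normalized = text.strip().lower()
        if any(keyword in normalized for keyword in HEADER_KEYWORDS):
            headers.append(index)
        elif any(keyword in normalized for keyword in ATTRIBUTE_KEYWORDS):
            attributes.append(index)
    return headers, attributes


box_texts = ["Item name", "Quantity/unit", "Amount (yuan)", "treatment fee", "60"]
header_ids, attribute_ids = match_boxes(box_texts)
```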
Step 204, according to the orientation relation among the header text box, the item text box and the attribute name text box, determining attribute value text boxes respectively corresponding to the item text box and the attribute name text box from all the text boxes, and determining multi-line printing item text boxes in all the item text boxes.
This step may specifically refer to step 103, which is not described herein again.
Optionally, in order to determine the attribute value text box corresponding to the attribute name text box, step 204 may specifically include:
Substep 2041, determining a longitudinal slope from the horizontal slope of a first straight line formed from the header text box to the attribute name text box.
Optionally, the first straight line is a straight line formed from the center point of the header text box to the center point of the attribute name text box.
In the embodiment of the present invention, in order to determine the attribute value text box corresponding to the attribute name text box, refer to fig. 4, which shows a schematic diagram of a local region of a text image according to an embodiment of the present invention. A first straight line 31 may first be formed from the center point of the header text box to the center point of the attribute name text box. The first straight line 31 reflects the horizontal slope k of the text in the whole text image, and the longitudinal slope k' of the text, perpendicular to it, can then be obtained from the horizontal slope as k' = -1/k.
Specifically, since each text box in the set box_set contains the 8 vertex coordinate values $[x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3]$, the center point of the header text box and the center point of the attribute name text box can be obtained. The x and y coordinates of a center point are calculated as follows:
$$x_{center} = (x_0 + x_1 + x_2 + x_3)/4$$
$$y_{center} = (y_0 + y_1 + y_2 + y_3)/4$$
The calculation result includes the center point coordinates $(x_{cc}, y_{cc})$ of the header text box and the center point coordinates $(x_{pc}, y_{pc})$ of the attribute name text box.
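The center point formula translates directly into code. This is a minimal sketch that assumes a text box is passed as a flat sequence of 8 numbers in the order described for box_set; the function name `box_center` is hypothetical.

```python
from typing import Sequence, Tuple


def box_center(box: Sequence[float]) -> Tuple[float, float]:
    """Average the four vertices [x0, y0, x1, y1, x2, y2, x3, y3] of a text box."""
    x0, y0, x1, y1, x2, y2, x3, y3 = box
    return ((x0 + x1 + x2 + x3) / 4.0, (y0 + y1 + y2 + y3) / 4.0)


# The average does not depend on the order in which the vertices are listed.
center = box_center([0, 0, 4, 0, 4, 2, 0, 2])
```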
The first straight line 31 is calculated as follows:
$$(y - y_{pc})(x_{cc} - x_{pc}) - (x - x_{pc})(y_{cc} - y_{pc}) = 0;$$
the horizontal slope k is calculated as follows:
$$k = \frac{y_{cc} - y_{pc}}{x_{cc} - x_{pc}}$$
the longitudinal slope k' is calculated as follows:
$$k' = -\frac{1}{k}$$
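The two slopes can be computed as below. The sketch takes the longitudinal slope as the negative reciprocal of the horizontal slope, that is, the direction perpendicular to the first straight line. Returning `math.inf` for a perfectly level image (k = 0) is a convention chosen here so that callers can treat the longitudinal line as vertical; the source does not discuss that case. Both function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def horizontal_slope(header_center: Point, attr_center: Point) -> float:
    """Slope k of the first straight line through the two center points."""
    (x_cc, y_cc), (x_pc, y_pc) = header_center, attr_center
    if x_cc == x_pc:
        raise ValueError("header and attribute name centers share the same x")
    return (y_cc - y_pc) / (x_cc - x_pc)


def longitudinal_slope(k: float) -> float:
    """Slope k' of the direction perpendicular to a line of slope k."""
    if k == 0:
        return math.inf  # assumed convention: a level image gives a vertical line
    return -1.0 / k


k = horizontal_slope((0.0, 0.0), (10.0, 1.0))
k_vertical = longitudinal_slope(k)
```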
Substep 2042, setting longitudinal second straight lines on both sides of the attribute name text box according to the longitudinal slope.
In this step, referring to fig. 4, since the attribute name text box 13 and the attribute value text boxes 14 are arranged in a vertical column, the boundary points on the two sides of the attribute name text box 13 may be taken as $(x_3, y_3)$ and $(x_2, y_2)$, and two downward longitudinal second straight lines 32 are drawn from these boundary points, one from each; the slope of the second straight lines 32 is the longitudinal slope k'.
Substep 2043, determining the text box overlapped with the second straight line as the attribute value text box corresponding to the attribute name text box.
In this step, referring to fig. 4, a text box selected by the second straight line 32 may be regarded as the attribute value text box 14 under the attribute name text box 13, in accordance with the existence of a constraint relationship arranged in a vertical column between the attribute name text box 13 and the attribute value text box 14. The attribute value text box 14 is used to characterize the specific attribute value of the item text box 12 for a certain attribute name.
Specifically, for each text box, whether the second straight line overlaps with the text box may be determined as follows:
a. The intersection point of two straight lines is calculated from the system of equations:
$$\begin{cases} y = h(x) \\ y = f(x) \end{cases}$$
where h(x) and f(x) represent the equations of the two straight lines, and the intersection point (x, y) is obtained by solving this system of two linear equations in two unknowns.
b. The straight line equation of the bottom edge of the text box is calculated:
$$(y - y_0)(x_1 - x_0) - (x - x_0)(y_1 - y_0) = 0;$$
c. Using the calculation in a and the straight line equation of the bottom edge of the text box, the intersection points $(x_{tl}, y_{tl})$ and $(x_{tr}, y_{tr})$ of the bottom edge of the text box with the two second straight lines are calculated respectively;
d. Whether the text box is an attribute value text box under the attribute name text box is determined according to the following conditions:
if $x_3 < x_{tr} < x_2$ is satisfied, the text box is an attribute value text box under the attribute name text box;
if $x_3 < x_{tl} < x_2$ is satisfied, the text box is an attribute value text box under the attribute name text box;
if it satisfies
Figure BDA0003315318660000092
The text box is an attribute value text box that is under the attribute name text box.
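Steps a to d can be sketched as follows, with several assumptions stated up front. Lines are stored in the general form a·x + b·y = c so that vertical lines need no special slope handling. The bounds $x_3$ and $x_2$ in the conditions are read as belonging to the candidate text box. The third condition survives only as a formula image in the source, so the "box lies between the two lines" test here is a guess and is marked as such in the code. Image coordinates are assumed to grow downward, and the vertex order in the sample data (clockwise from upper left) is also an assumption. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # coefficients (a, b, c) of a*x + b*y = c


def line_through(p: Point, q: Point) -> Line:
    """Line through p and q, same form as (y - y0)(x1 - x0) - (x - x0)(y1 - y0) = 0."""
    a, b = q[1] - p[1], p[0] - q[0]
    return (a, b, a * p[0] + b * p[1])


def line_with_slope(p: Point, k: float) -> Line:
    """Line through p with slope k; an infinite k means a vertical line."""
    if math.isinf(k):
        return (1.0, 0.0, p[0])
    return (k, -1.0, k * p[0] - p[1])


def intersect(l1: Line, l2: Line) -> Optional[Point]:
    """Solve the 2x2 linear system by Cramer's rule; None for parallel lines."""
    det = l1[0] * l2[1] - l2[0] * l1[1]
    if det == 0:
        return None
    return ((l1[2] * l2[1] - l2[2] * l1[1]) / det,
            (l1[0] * l2[2] - l2[0] * l1[2]) / det)


def is_under_attribute_name(name_box: Sequence[float], box: Sequence[float],
                            k_vertical: float) -> bool:
    """Check whether `box` is hit or enclosed by the two second straight lines."""
    starts = [(name_box[6], name_box[7]), (name_box[4], name_box[5])]  # (x3,y3), (x2,y2)
    edge = line_through((box[0], box[1]), (box[2], box[3]))  # vertices 0 and 1, as in step b
    hits = [intersect(line_with_slope(p, k_vertical), edge) for p in starts]
    if any(h is None for h in hits):
        return False
    # Only downward intersections count (assumes y grows downward in the image).
    if any(h[1] < p[1] for h, p in zip(hits, starts)):
        return False
    x_tl, x_tr = hits[0][0], hits[1][0]
    low, high = sorted((box[6], box[4]))  # candidate box's x3 and x2 (assumed reading)
    crossed = low < x_tl < high or low < x_tr < high
    # Guess at the lost third condition: the box sits entirely between the two lines.
    enclosed = min(x_tl, x_tr) <= low and high <= max(x_tl, x_tr)
    return crossed or enclosed


name_box = [100, 0, 140, 0, 140, 20, 100, 20]
inside = [105, 40, 135, 40, 135, 60, 105, 60]
straddling = [120, 40, 200, 40, 200, 60, 120, 60]
elsewhere = [300, 40, 340, 40, 340, 60, 300, 60]
```

With a level image (`k_vertical = math.inf`), `inside` and `straddling` are accepted and `elsewhere` is rejected.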
Optionally, in order to determine the multi-line print item text boxes among the item text boxes, step 204 may further include:
Substep 2044, constructing a horizontal third straight line with the item text box as a starting point according to the horizontal slope.
In this step, referring to fig. 4, in order to determine a multi-line print item text box 121 among the item text boxes 12, a horizontal third straight line 33 may be constructed starting from the item text box 12 according to the horizontal slope. The starting point of the third straight line 33 may specifically be the center point of the item text box 12.
Substep 2045, in the case where the third straight line does not overlap with any attribute value text box, determining that the item text box to which the third straight line corresponds is a multi-line print item text box.
Referring to fig. 4, it can be seen that the third straight line 33 in the lateral direction, which is set starting from the multi-line print item text box 121, does not overlap any of the attribute value text boxes 14, and therefore the multi-line print item text box 121 in all of the item text boxes 12 can be determined from this orientation relationship.
For example, for the correct and complete item text box "the following is the list item [guggu] movement (lactulose oral solution)", this determination identifies the multi-line print item text boxes 121 "the following is the list item" and "solution".
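Substeps 2044 and 2045 can be illustrated with the sketch below. It draws the third straight line from the center of each item text box with the horizontal slope k and reports the item text boxes whose line touches no attribute value text box. The overlap test uses each value box's axis-aligned bounding rectangle, which is a simplification chosen for brevity rather than the source's exact test, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def box_center(box: Sequence[float]) -> Tuple[float, float]:
    """Center of a text box given as [x0, y0, x1, y1, x2, y2, x3, y3]."""
    return (sum(box[0::2]) / 4.0, sum(box[1::2]) / 4.0)


def line_hits_box(start: Tuple[float, float], k: float, box: Sequence[float]) -> bool:
    """True if the line through `start` with slope k crosses the box's bounding rectangle."""
    xs, ys = box[0::2], box[1::2]
    # Height of the line at the rectangle's left and right sides.
    y_left = start[1] + k * (min(xs) - start[0])
    y_right = start[1] + k * (max(xs) - start[0])
    # The line crosses the rectangle iff its y range over that span meets the box's y range.
    return min(y_left, y_right) <= max(ys) and max(y_left, y_right) >= min(ys)


def find_multiline_items(item_boxes: List[Sequence[float]],
                         value_boxes: List[Sequence[float]], k: float) -> List[int]:
    """Indices of item text boxes whose third straight line meets no value box."""
    return [i for i, item in enumerate(item_boxes)
            if not any(line_hits_box(box_center(item), k, v) for v in value_boxes)]


items = [
    [0, 0, 50, 0, 50, 10, 0, 10],    # "the following is the list item"
    [0, 20, 50, 20, 50, 30, 0, 30],  # line that carries the attribute values
    [0, 40, 50, 40, 50, 50, 0, 50],  # "solution)"
]
values = [[100, 20, 120, 20, 120, 30, 100, 30]]
fragments = find_multiline_items(items, values, k=0.0)
```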
Optionally, in order to determine the attribute value text box corresponding to the item text box, step 204 may further include:
Substep 2046, determining the attribute value text box overlapping the third straight line as the attribute value text box corresponding to the item text box to which the third straight line corresponds.
With further reference to fig. 4, since the orientation characteristic exists between the item text box 12 and the corresponding attribute value text box 14, which are arranged in a horizontal line, the third horizontal straight line 33 may be first set, with the item text box 12 as a starting point, so that the text box 10 overlapped with the third horizontal straight line 33 is determined as the attribute value text box 14 corresponding to the item text box 12, and by traversing all the item text boxes 12 in the above manner, and sequentially determining the attribute value text box 14 corresponding to each item text box 12, a preliminary structured output of the text image may be obtained. Finally, the structured output of sub-steps 2041 to 2045 is combined to obtain the complete structured output of the text image.
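One possible way to combine the row result (third straight line) and the column result (second straight lines) into a structured record is sketched below. The source does not specify the output data structure, so the dictionary layout, the function name `build_records` and the parameter names are hypothetical. The value of a cell is taken to be the text box that appears in both the row hits of an item and the column hits of an attribute name. The quantity "1" in the sample data is made up for illustration; the amount 60 for "treatment fee" follows the example given earlier.

```python
from typing import Dict, List, Optional, Set


def build_records(item_texts: List[str], attr_names: List[str],
                  box_texts: Dict[int, str],
                  row_hits: List[Set[int]],
                  col_hits: List[Set[int]]) -> List[Dict[str, Optional[str]]]:
    """Pair each item with the value box found in both its row and each column."""
    records: List[Dict[str, Optional[str]]] = []
    for item, row in zip(item_texts, row_hits):
        record: Dict[str, Optional[str]] = {"item": item}
        for name, column in zip(attr_names, col_hits):
            common = sorted(row & column)  # box ids on this row and in this column
            record[name] = box_texts[common[0]] if common else None
        records.append(record)
    return records


records = build_records(
    item_texts=["treatment fee"],
    attr_names=["quantity", "amount"],
    box_texts={5: "1", 6: "60"},
    row_hits=[{5, 6}],        # boxes overlapped by the item's third straight line
    col_hits=[{5}, {6}],      # boxes under each attribute name text box
)
```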
Step 205, merging the multi-line printed item text box and the text boxes of the adjacent lines when establishing the structural relationship of the text image according to the corresponding relationship of the item text box, the attribute name text box and the attribute value text box.
This step may specifically refer to step 104, which is not described herein again.
In summary, the structured processing method for a text image provided by the embodiments of the present invention uses the orientation relationship among the header text box, the item text box and the attribute name text box in the preliminary OCR recognition result of the text image. While determining and outputting the structured correspondence among the item text box, the attribute name text box and the attribute value text box, the method further uses this orientation relationship to identify the multi-line print item text boxes among all the item text boxes and merge them. This solves the multi-line printing problem in the structured output of the text image and improves the quality of that output. In addition, the whole process can be carried out automatically by a machine algorithm, which reduces manual involvement and labor cost.
Fig. 5 is a block diagram of an apparatus for structured processing of text images according to an embodiment of the present invention, and as shown in fig. 5, the apparatus may include:
a recognition module 301, configured to determine text boxes in the text image and item text boxes in all text boxes;
a first determining module 302, configured to determine a header text box and an attribute name text box of all text boxes;
a second determining module 303, configured to determine, according to the orientation relationship among the header text box, the item text box, and the attribute name text box, attribute value text boxes respectively corresponding to the item text box and the attribute name text box from all the text boxes, and determine multi-line print item text boxes from all the item text boxes;
a merging module 304, configured to merge the multi-line print item text boxes with the text boxes in adjacent lines when the structured relationship of the text image is established according to the correspondence among the item text box, the attribute name text box, and the attribute value text box.
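The merging performed by module 304 can be illustrated with a minimal sketch. The source says only that a multi-line print item text box is merged with the text boxes of adjacent lines; merging it into the row directly above, the `Row` type, and the `sep` parameter are assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Row:
    """Hypothetical row of the structured output, listed top to bottom."""
    item: str
    values: List[str] = field(default_factory=list)
    multiline: bool = False  # True if flagged as a multi-line print item


def merge_multiline_rows(rows: List[Row], sep: str = " ") -> List[Row]:
    """Fold each flagged row into the row above it (an assumed merge direction)."""
    merged: List[Row] = []
    for row in rows:
        if row.multiline and merged:
            previous = merged[-1]
            # Keep the attribute values of the row above; extend only the item text.
            merged[-1] = Row(previous.item + sep + row.item, previous.values, False)
        else:
            merged.append(row)
    return merged
```

For text without word spacing, such as Chinese item names, `sep=""` would be the natural choice.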
Optionally, the recognition module 301 includes:
a first determining submodule, configured to determine the text boxes in the text image and the text content contained in each text box;
a classification submodule, configured to input the text content of the text boxes into a text classification model, obtain the item text content whose type is item name, and determine the item text box corresponding to the item text content.
Optionally, the first determining module 302 includes:
a second determining submodule, configured to determine the header text box and the attribute name text box by matching preset keywords against the text content of the text boxes.
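A minimal sketch of such keyword matching is given below. The keyword sets, the case-insensitive substring rule, and the function name are all hypothetical; the source states only that preset keywords are matched against the text content.

```python
from typing import List, Tuple

# Hypothetical preset keywords; a real deployment would define its own lists.
HEADER_KEYWORDS = {"item name"}
ATTRIBUTE_KEYWORDS = {"quantity", "unit price", "amount"}


def find_header_and_attribute_boxes(texts: List[str]) -> Tuple[List[int], List[int]]:
    """Return the indices of header boxes and of attribute name boxes."""
    headers: List[int] = []
    attributes: List[int] = []
    for index, text in enumerate(texts):
        normalized = text.strip().lower()
        # Substring matching is an assumption; exact matching would also fit the text.
        if any(keyword in normalized for keyword in HEADER_KEYWORDS):
            headers.append(index)
        elif any(keyword in normalized for keyword in ATTRIBUTE_KEYWORDS):
            attributes.append(index)
    return headers, attributes
```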
Optionally, the second determining module 303 includes:
a third determining submodule, configured to determine the longitudinal slope according to the horizontal slope of a first straight line formed from the header text box to the attribute name text box;
a fourth determining submodule, configured to set longitudinal second straight lines on the two sides of the attribute name text box, respectively, according to the longitudinal slope;
a fifth determining sub-module, configured to determine an item box overlapped with the second straight line as an attribute value text box corresponding to the attribute name text box.
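The three submodules above can be sketched as follows, under stated assumptions. The sketch works with direction vectors rather than slopes so that an unrotated image (horizontal slope 0, longitudinal slope undefined) needs no special case; that choice, the axis-aligned `Box` type, and anchoring the two second straight lines at the midpoints of the left and right edges of the attribute name box are assumptions, not details taken from the source.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Box(NamedTuple):
    """Hypothetical axis-aligned text box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def center(box: Box) -> Point:
    return ((box.x0 + box.x1) / 2, (box.y0 + box.y1) / 2)


def line_hits_box(point: Point, direction: Point, box: Box) -> bool:
    """True if the infinite line through `point` along `direction` touches `box`."""
    px, py = point
    dx, dy = direction
    # The sign of the cross product tells on which side of the line a corner lies.
    sides = [
        dx * (cy - py) - dy * (cx - px)
        for cx in (box.x0, box.x1)
        for cy in (box.y0, box.y1)
    ]
    return min(sides) <= 0 <= max(sides)


def values_for_attribute(header: Box, attr: Box, boxes: List[Box]) -> List[str]:
    """Texts of the boxes overlapped by either second straight line."""
    hx, hy = center(header)
    ax, ay = center(attr)
    horizontal = (ax - hx, ay - hy)                  # first straight line
    longitudinal = (-horizontal[1], horizontal[0])   # perpendicular to it
    mid_y = (attr.y0 + attr.y1) / 2
    anchors = [(attr.x0, mid_y), (attr.x1, mid_y)]   # one second line per side
    return [
        b.text
        for b in boxes
        if b not in (header, attr)
        and any(line_hits_box(p, longitudinal, b) for p in anchors)
    ]
```

Under this literal reading of "overlapped", a box that sits strictly between the two lines without touching either is not matched; this section of the source does not say how that case is treated.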
Optionally, the second determining module 303 includes:
a sixth determining submodule, configured to construct a horizontal third straight line with the item text box as a starting point according to the horizontal slope;
a seventh determining sub-module for determining an item text box corresponding to the third straight line as the multi-line print item text box, in a case where the third straight line does not overlap with the attribute value text box.
Optionally, the second determining module 303 includes:
an eighth determining sub-module, configured to determine the attribute-value text box that overlaps the third straight line as the attribute-value text box corresponding to the item text box corresponding to the third straight line.
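The multi-line check made by the sixth and seventh submodules can be sketched as below. The rectangle format, the choice of the midpoint of the item box's right edge as the starting point, and the function names are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), an assumed format


def ray_crosses(start: Tuple[float, float], slope: float, rect: Rect) -> bool:
    """True if the line drawn rightwards from `start` passes through `rect`."""
    sx, sy = start
    x0, y0, x1, y1 = rect
    if x1 <= sx:
        return False  # the box lies entirely to the left of the starting point
    xa = max(x0, sx)
    # Heights of the line where it enters and leaves the box's horizontal span.
    ya = sy + slope * (xa - sx)
    yb = sy + slope * (x1 - sx)
    return min(ya, yb) <= y1 and max(ya, yb) >= y0


def find_multiline_items(
    items: Dict[str, Rect], value_rects: List[Rect], slope: float
) -> List[str]:
    """Names of item boxes whose third straight line meets no attribute value box."""
    flagged: List[str] = []
    for name, (x0, y0, x1, y1) in items.items():
        start = (x1, (y0 + y1) / 2)  # assumed starting point: right-edge midpoint
        if not any(ray_crosses(start, slope, rect) for rect in value_rects):
            flagged.append(name)
    return flagged
```

An item box flagged here is a continuation line of an item name that wrapped onto a second printed line, which is why no attribute value sits on its row.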
Optionally, the first straight line is a straight line formed from the center point of the header text box to the center point of the attribute name text box.
In summary, the structured processing apparatus for text images according to the embodiments of the present invention uses the orientation relationship among the header text box, the item text box, and the attribute name text box in the preliminary OCR recognition result of the text image. While determining and outputting the structured correspondence among the item text box, the attribute name text box, and the attribute value text box, the apparatus also uses this orientation relationship to determine the multi-line print item text boxes among all the item text boxes and merge them, thereby solving the problem of items printed across multiple lines in the structured output of text images and improving the quality of that output.
Because the above device embodiment is basically similar to the method embodiment, its description is relatively brief; for the relevant points, refer to the corresponding description of the method embodiment.
Preferably, an embodiment of the present invention further provides a computer device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above embodiments of the structured processing method for text images and can achieve the same technical effect; to avoid repetition, details are not described here again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the above-mentioned text image structuring processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As those skilled in the art will readily appreciate, the above embodiments may be combined in any manner, and any such combination is an embodiment of the present invention; for reasons of space, the combinations are not described one by one herein.
The structured processing methods of text images provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The structure required to construct a system incorporating aspects of the present invention will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the structured processing method of text images according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.

Claims (10)

1. A structured processing method for a text image, characterized in that the method comprises:
determining the text boxes in the text image, and the item text boxes among all the text boxes;
determining the header text box and the attribute name text boxes among all the text boxes;
determining, from all the text boxes and according to the orientation relationship among the header text box, the item text box, and the attribute name text box, the attribute value text boxes respectively corresponding to the item text box and the attribute name text box, and determining the multi-line print item text boxes among all the item text boxes;
when establishing the structured relationship of the text image according to the correspondence among the item text box, the attribute name text box, and the attribute value text box, merging the multi-line print item text box with the text boxes of adjacent lines.

2. The method according to claim 1, characterized in that determining the text boxes in the text image, and the item text boxes among all the text boxes, comprises:
determining the text boxes in the text image and the text content contained in the text boxes;
inputting the text content of the text boxes into a text classification model to obtain the item text content whose type is item name, and determining the item text box corresponding to the item text content.

3. The method according to claim 1, characterized in that determining the header text box and the attribute name text boxes among all the text boxes comprises:
determining the header text box and the attribute name text box by matching preset keywords against the text content of the text boxes.

4. The method according to claim 1, characterized in that determining, from all the text boxes and according to the orientation relationship among the header text box, the item text box, and the attribute name text box, the attribute value text box corresponding to the attribute name text box comprises:
determining a longitudinal slope according to the horizontal slope of a first straight line formed from the header text box to the attribute name text box;
setting longitudinal second straight lines on the two sides of the attribute name text box, respectively, according to the longitudinal slope;
determining an item box that overlaps the second straight line as the attribute value text box corresponding to the attribute name text box.

5. The method according to claim 4, characterized in that determining, according to the orientation relationship among the header text box, the item text box, and the attribute name text box, the multi-line print item text boxes among all the item text boxes comprises:
constructing a horizontal third straight line with the item text box as a starting point, according to the horizontal slope;
in a case where the third straight line does not overlap an attribute value text box, determining the item text box corresponding to the third straight line as the multi-line print item text box.

6. The method according to claim 5, characterized in that determining, according to the orientation relationship among the header text box, the item text box, and the attribute name text box, the attribute value text box corresponding to the item text box comprises:
determining the attribute value text box that overlaps the third straight line as the attribute value text box corresponding to the item text box that corresponds to the third straight line.

7. The method according to claim 4, characterized in that the first straight line is a straight line formed from the center point of the header text box to the center point of the attribute name text box.

8. A structured processing apparatus for a text image, characterized in that the apparatus comprises:
a recognition module, configured to determine the text boxes in the text image, and the item text boxes among all the text boxes;
a first determining module, configured to determine the header text box and the attribute name text boxes among all the text boxes;
a second determining module, configured to determine, from all the text boxes and according to the orientation relationship among the header text box, the item text box, and the attribute name text box, the attribute value text boxes respectively corresponding to the item text box and the attribute name text box, and to determine the multi-line print item text boxes among all the item text boxes;
a merging module, configured to merge the multi-line print item text box with the text boxes of adjacent lines when the structured relationship of the text image is established according to the correspondence among the item text box, the attribute name text box, and the attribute value text box.

9. A computer device, characterized in that the computer device comprises:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and, according to the obtained program instructions, execute the steps of the structured processing method for text images according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the structured processing method for text images according to any one of claims 1 to 7.
CN202111230230.3A 2021-10-21 2021-10-21 A text image structured processing method and device Active CN114120340B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111230230.3A CN114120340B (en) 2021-10-21 2021-10-21 A text image structured processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111230230.3A CN114120340B (en) 2021-10-21 2021-10-21 A text image structured processing method and device

Publications (2)

Publication Number Publication Date
CN114120340A true CN114120340A (en) 2022-03-01
CN114120340B CN114120340B (en) 2024-11-12

Family

ID=80376563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111230230.3A Active CN114120340B (en) 2021-10-21 2021-10-21 A text image structured processing method and device

Country Status (1)

Country Link
CN (1) CN114120340B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965695B1 (en) * 2016-12-30 2018-05-08 Konica Minolta Laboratory U.S.A., Inc. Document image binarization method based on content type separation
US20190384972A1 (en) * 2018-06-18 2019-12-19 Sap Se Systems and methods for extracting data from an image
CN111626250A (en) * 2020-06-02 2020-09-04 泰康保险集团股份有限公司 Line dividing method and device for text image, computer equipment and readable storage medium
CN111652176A (en) * 2020-06-11 2020-09-11 商汤国际私人有限公司 Information extraction method, device, equipment and storage medium
CN111985306A (en) * 2020-07-06 2020-11-24 北京欧应信息技术有限公司 OCR (optical character recognition) and information extraction method applied to documents in medical field
CN113139537A (en) * 2021-05-13 2021-07-20 上海肇观电子科技有限公司 Image processing method, electronic circuit, visual impairment assisting apparatus, and medium
CN113239227A (en) * 2021-06-02 2021-08-10 泰康保险集团股份有限公司 Image data structuring method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN114120340B (en) 2024-11-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant