
CN114692707B - A method, device, computer storage medium and terminal for text angle classification - Google Patents


Publication number
CN114692707B
CN114692707B CN202011566516.4A CN202011566516A CN114692707B CN 114692707 B CN114692707 B CN 114692707B CN 202011566516 A CN202011566516 A CN 202011566516A CN 114692707 B CN114692707 B CN 114692707B
Authority
CN
China
Prior art keywords
anchor
sub
anchor combination
angle
angle classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011566516.4A
Other languages
Chinese (zh)
Other versions
CN114692707A (en
Inventor
黄达一
熊龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Original Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Office Software Inc, Zhuhai Kingsoft Office Software Co Ltd
Priority to CN202011566516.4A
Publication of CN114692707A
Application granted
Publication of CN114692707B
Current legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256 Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed herein are a method, a device, a computer storage medium and a terminal for text angle classification. In the embodiments of the invention, anchor combinations are used to determine angle-classification probability values for the sub-images cut from the original image; the probability values are then processed with preset weighting information to determine the angle classification of the text. Because the angle classification is determined from sub-images that keep the resolution of the original image, the accuracy of text angle classification is improved.

Description

Text angle classification method, device, computer storage medium and terminal
Technical Field
The present disclosure relates to, but is not limited to, neural network technology, and in particular, to a method, apparatus, computer storage medium, and terminal for text angle classification.
Background
Currently, most classification models can only perform inference on a compressed version of the whole image. This approach is severely limited in application scenarios that require high definition and well-preserved edge features.
A text angle classification algorithm is a classification model that classifies the rotation angle of text. Because compression reduces the definition of the image and degrades the edge features of the text, such an algorithm cannot classify the rotation angle of a text image accurately. For example, it cannot reliably assign the image to one of the four angles 0, 90, 180 and 270 degrees, or to angle classes with a smaller step.
How to improve the accuracy of text angle classification algorithms is therefore a problem to be solved.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiments of the invention provide a method, a device, a computer storage medium and a terminal for text angle classification, which can improve the accuracy of text angle classification.
An embodiment of the invention provides a method for text angle classification, comprising:
training the anchor combinations obtained by segmenting the original image, to obtain a probability value of the angle classification for each sub-image contained in each anchor combination;
processing the obtained probability values of the angle classification of the sub-images contained in each anchor combination with preset weighting information, to obtain an output value of each anchor combination for the angle classification; and
determining the angle classification of the text from the obtained output values for the angle classification.
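The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the name `classify_text_angle`, the per-sub-image weights, and the final decision rule (summing the anchor outputs per class and taking the largest) are hypothetical. The source only states that the classification is determined from the output values.

```python
from typing import List, Sequence

ANGLES = (0, 90, 180, 270)  # example angle classes named in the Background


def classify_text_angle(
    anchor_probs: Sequence[Sequence[Sequence[float]]],
    sub_weights: Sequence[Sequence[float]],
) -> int:
    """Return the angle class with the largest accumulated output.

    anchor_probs[a][s][c]: probability that sub-image s of anchor
    combination a belongs to angle class c (step 1, from the model).
    sub_weights[a][s]: preset weight of that sub-image (assumed given).
    """
    totals: List[float] = [0.0] * len(ANGLES)
    for probs, weights in zip(anchor_probs, sub_weights):
        # Step 2: weighted output of one anchor combination, per class.
        for sub_prob, w in zip(probs, weights):
            for c, p in enumerate(sub_prob):
                totals[c] += w * p
    # Step 3 (assumed rule): pick the class with the largest total.
    best = max(range(len(ANGLES)), key=lambda c: totals[c])
    return ANGLES[best]
```

With equal weights the function reduces to a plain vote over all sub-images; unequal weights let text-rich sub-images dominate the decision.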
In one illustrative example, training the anchor combinations obtained by segmenting the original image includes:
packing the anchor combination according to the preset input size of the training model; and
inputting the packed anchor combination into the training model for training;
wherein, when the resolution of the original image is smaller than the input size, the sub-images in the anchor combination whose size is larger than one half of the original image size and smaller than the original image size are packed according to the preset input size of the training model; and when the resolution of the original image is greater than or equal to the input size, the sub-images in the anchor combination whose size is larger than the input size and smaller than twice the input size are packed according to the preset input size of the training model.
In one illustrative example, the weighting information includes a weighting coefficient that is positively correlated with the amount of text information contained in the anchor combination.
In one illustrative example, the weighting information includes:
a first weighting coefficient for each sub-image contained in the anchor combination; and/or,
a second weighting coefficient for each anchor combination, which is inversely related to the anchor point distance, the anchor point distance being the distance between the anchor point of the anchor combination and the center of the original image.
In one illustrative example, processing the obtained probability values of the angle classification of the sub-images contained in each anchor combination with preset weighting information includes:
calculating a probability weighted sum for each sub-image from the obtained probability values of the angle classification and the first weighting coefficient, and accumulating the calculated probability weighted sums of the sub-images by angle class to obtain the output value of each anchor combination for the angle classification; or,
where the weighting information includes the second weighting coefficient, calculating, for each anchor combination and each angle class, the accumulated sum of the probability values of that angle class over all sub-images contained in the anchor combination, thereby obtaining the probability accumulated sum of each anchor combination, and combining the probability accumulated sum of each anchor combination with the second weighting coefficient to obtain the output value of each anchor combination for the angle classification; or,
calculating a probability weighted sum for each sub-image from the obtained probability values of the angle classification and the first weighting coefficient, accumulating the calculated probability weighted sums of the sub-images by angle class to obtain the probability weighted sum of each anchor combination, and obtaining the output value of each anchor combination for the angle classification from the probability weighted sum of each anchor combination and the second weighting coefficient.
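The three alternatives can be written compactly under assumed notation, which is not taken from the source: $p_{a,i}(c)$ is the probability that sub-image $i$ of anchor combination $a$ belongs to angle class $c$, $w_{a,i}$ is the first weighting coefficient, $v_a$ is the second weighting coefficient, and $O_a(c)$ is the output value of anchor combination $a$ for class $c$. Combining with the second coefficient is taken here to mean multiplication, as the detailed description states.

```latex
\begin{align}
O_a(c) &= \sum_{i} w_{a,i}\, p_{a,i}(c)
  && \text{(first coefficient only)} \\
O_a(c) &= v_a \sum_{i} p_{a,i}(c)
  && \text{(second coefficient only)} \\
O_a(c) &= v_a \sum_{i} w_{a,i}\, p_{a,i}(c)
  && \text{(both coefficients)}
\end{align}
```

The first two forms are special cases of the third, with $v_a = 1$ and $w_{a,i} = 1$ respectively.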
In another aspect, an embodiment of the invention further provides a computer storage medium storing a computer program which, when executed by a processor, implements the above method for text angle classification.
In yet another aspect, an embodiment of the invention further provides a terminal comprising a memory and a processor, the memory storing a computer program, wherein:
the processor is configured to execute the computer program in the memory; and
the computer program, when executed by the processor, implements the method for text angle classification described above.
In still another aspect, an embodiment of the invention further provides a text angle classification device comprising a training unit, a calculation unit and a determining unit, wherein:
the training unit is configured to train the anchor combinations obtained by segmenting the original image, to obtain a probability value of the angle classification for each sub-image contained in each anchor combination;
the calculation unit is configured to process the obtained probability values of the angle classification of the sub-images contained in each anchor combination with preset weighting information, to obtain an output value of each anchor combination for the angle classification; and
the determining unit is configured to determine the angle classification of the text from the obtained output values for the angle classification.
In one illustrative example, the training unit is configured to:
pack the anchor combination according to the preset input size of the training model; and
input the packed anchor combination into the training model for training;
wherein, when the resolution of the original image is smaller than the input size, the sub-images in the anchor combination whose size is larger than one half of the original image size and smaller than the original image size are packed according to the preset input size of the training model; and when the resolution of the original image is greater than or equal to the input size, the sub-images in the anchor combination whose size is larger than the input size and smaller than twice the input size are packed according to the preset input size of the training model.
In one illustrative example, the weighting information includes a weighting coefficient that is positively correlated with the amount of text information contained in the anchor combination.
In one illustrative example, the weighting information includes:
a first weighting coefficient for each sub-image contained in the anchor combination; and/or,
a second weighting coefficient for each anchor combination, which is inversely related to the anchor point distance, the anchor point distance being the distance between the anchor point of the anchor combination and the center of the original image.
In one illustrative example, the calculation unit is configured to:
calculate a probability weighted sum for each sub-image from the obtained probability values of the angle classification and the first weighting coefficient, and accumulate the calculated probability weighted sums of the sub-images by angle class to obtain the output value of each anchor combination for the angle classification; or,
where the weighting information includes the second weighting coefficient, calculate, for each anchor combination and each angle class, the accumulated sum of the probability values of that angle class over all sub-images contained in the anchor combination, thereby obtaining the probability accumulated sum of each anchor combination, and combine the probability accumulated sum of each anchor combination with the second weighting coefficient to obtain the output value of each anchor combination for the angle classification; or,
calculate a probability weighted sum for each sub-image from the obtained probability values of the angle classification and the first weighting coefficient, accumulate the calculated probability weighted sums of the sub-images by angle class to obtain the probability weighted sum of each anchor combination, and obtain the output value of each anchor combination for the angle classification from the probability weighted sum of each anchor combination and the second weighting coefficient.
In the embodiments of the invention, anchor combinations are used to determine the probability values of the angle classification of the segmented sub-images, and the determined probability values are processed with preset weighting information to determine the angle classification of the text. Because the angle classification is determined from sub-images that keep the resolution of the original image, the accuracy of text angle classification is improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification. They illustrate the application and do not limit it.
FIG. 1 is a flowchart of a method for text angle classification according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an anchor combination according to an embodiment of the invention;
FIG. 3 is a block diagram of a text angle classification device according to an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail hereinafter with reference to the accompanying drawings. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be arbitrarily combined with each other.
The steps illustrated in the flowchart of the figures may be performed in a computer system, such as a set of computer-executable instructions. Also, while a logical order is depicted in the flowchart, in some cases, the steps depicted or described may be performed in a different order than presented herein.
FIG. 1 is a flowchart of a method for text angle classification according to an embodiment of the invention. As shown in FIG. 1, the method includes:
Step 101, training the anchor combinations obtained by segmenting the original image, to obtain a probability value of the angle classification for each sub-image contained in each anchor combination, wherein the original image includes a document image whose angle is to be detected.
By segmenting the original image, the embodiment of the invention avoids the loss of definition that otherwise affects text angle classification.
In one illustrative example, training the anchor combinations obtained by segmenting the original image to obtain the probability value of the angle classification of each sub-image contained in each anchor combination includes:
packing the anchor combination according to the preset input size of the training model;
inputting the packed anchor combination into the training model for training; and
obtaining the probability value of the angle classification of each sub-image contained in each anchor combination;
wherein, when the resolution of the original image is smaller than the input size, the sub-images in the anchor combination whose size is larger than one half of the original image size and smaller than the original image size are packed; and when the resolution of the original image is greater than or equal to the input size, the sub-images in the anchor combination whose size is larger than the input size and smaller than twice the input size are packed.
The packing of the sub-images in an anchor combination is illustrated by the following example. Suppose the input size of the neural network is 448 x 448 and the anchor combination in step 101 is formed from 7 sub-images. The 7 sub-images are then stacked into an array of 7 x 448 x 448 x 3, where 3 is the number of red, green and blue channels of each sub-image. In addition, because the size of the sub-images is set according to the resolution of the original image and the input size of the training model, the definition of the sub-images input to the training model is preserved.
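The packing step can be sketched as below. This is only an illustration under assumptions: the source says that sub-images are packed to the model input size but does not say how they are resized, so nearest-neighbour sampling is a stand-in, and the names `resize_nearest` and `pack_anchor_combination` are hypothetical. Nested lists keep the sketch dependency-free; an array library would normally be used.

```python
from typing import List

Image = List[List[List[int]]]  # rows x columns x 3 channel values


def resize_nearest(img: Image, size: int) -> Image:
    """Resize an image to size x size by nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [
        [list(img[r * h // size][c * w // size]) for c in range(size)]
        for r in range(size)
    ]


def pack_anchor_combination(
    sub_images: List[Image], input_size: int = 448
) -> List[Image]:
    """Stack K sub-images into a K x input_size x input_size x 3 batch."""
    return [resize_nearest(img, input_size) for img in sub_images]
```

For 7 sub-images and the default input size, the result has the shape 7 x 448 x 448 x 3 used in the example.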
In one illustrative example, step 101 of the embodiment of the invention further includes:
taking a preset anchor point as the center point, segmenting the original image according to scale ratio information to obtain a plurality of sub-images, the obtained sub-images forming the anchor combination of that preset anchor point.
In the embodiment of the invention, an anchor point is the center point of the sub-images contained in an anchor combination obtained by segmentation. The original image is segmented according to the determined anchor point and the scale ratio information of the anchor combination to obtain the anchor combination. The anchor points can be determined by a person skilled in the art through analysis of the application scenario, and different anchor combinations have different anchor points. The scale ratio information of each anchor combination is likewise determined by a person skilled in the art through analysis of the application scenario. Each sub-image cut out according to the anchor point and the scale ratio information is a part of the original image. After a person skilled in the art has set the minimum unit size of the segmentation, the sub-images are cut out according to the anchor point and the scale ratio information.
For example, assume that the original image size is 22400 pixels and the lower-left corner is taken as the origin of the coordinate axes, so that the abscissa of each pixel runs from 0 to 22400 from left to right and the ordinate runs from 0 to 22400 from bottom to top. The anchor point of one anchor combination is set at 448 pixels, the scale ratio information (aspect ratios) of the anchor combination is 1:2, 1:3, 3:1 and 2:1, and the minimum unit size of the segmentation is 224 pixels. The center point of every sub-image contained in the anchor combination obtained by segmenting the original image is then at 448 pixels, and the sizes of the sub-images obtained are 224 pixels, 672 pixels, 224 pixels and 448 pixels. The minimum unit size can be set by a person skilled in the art according to the application scenario.
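One plausible reading of this segmentation is sketched below. Several points are assumptions rather than statements from the source: each ratio is read as width:height, each ratio term is multiplied by the minimum unit size to get the side lengths, the anchor is taken as the 2-D point (448, 448), and the name `anchor_boxes` is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, bottom, right, top) in pixels


def anchor_boxes(
    anchor: Tuple[int, int],
    ratios: List[Tuple[int, int]],
    min_unit: int,
) -> List[Box]:
    """Return one box per (width, height) ratio, centred on the anchor."""
    cx, cy = anchor
    boxes = []
    for rw, rh in ratios:
        # Assumed rule: side length = ratio term * minimum unit size.
        width, height = rw * min_unit, rh * min_unit
        left, bottom = cx - width // 2, cy - height // 2
        boxes.append((left, bottom, left + width, bottom + height))
    return boxes


boxes = anchor_boxes((448, 448), [(1, 2), (1, 3), (3, 1), (2, 1)], 224)
```

All four boxes share the same center, so each sub-image sees the same text at a different aspect ratio and extent.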
In one illustrative example, the anchor points of the embodiment of the invention may also be obtained as follows:
acquiring an anchor region, the anchor region being a region formed by the positions in the original image page where the density of text information is high. The amount of text information can be obtained by a statistical information-amount calculation, or can be calculated from the number of characters in a preset text region. For example, the size of the text region used for counting characters is set, the number of characters contained in each of one or more regions of that size in the original image page is counted, and the number of characters is taken as the amount of text information. As an illustration, the point at the upper-left corner of the original image page is taken as the preset starting point, and regions of the text-region size whose upper-left vertex is the starting point are acquired one after another in a predetermined order, until the regions of the text-region size corresponding to all points of the original image page have been acquired. The predetermined order may be left to right row by row, or top to bottom column by column, and the preset starting point may also be the center point or another position of a region of the text-region size. The one or more regions of the text-region size in the original image page may also be obtained in other ways, which are not limited here.
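A minimal sketch of the character-count measure follows. It assumes that character positions are already available (for example from a text detector, which the source does not specify) and that the regions tile the page without overlap; the name `count_chars_per_region` is hypothetical.

```python
from typing import Dict, List, Tuple


def count_chars_per_region(
    char_positions: List[Tuple[int, int]],
    page_size: Tuple[int, int],
    region_size: Tuple[int, int],
) -> Dict[Tuple[int, int], int]:
    """Map each region's top-left corner to its character count."""
    page_w, page_h = page_size
    region_w, region_h = region_size
    counts: Dict[Tuple[int, int], int] = {}
    # Scan row by row, left to right, starting from the top-left corner.
    for top in range(0, page_h, region_h):
        for left in range(0, page_w, region_w):
            counts[(left, top)] = sum(
                1
                for x, y in char_positions
                if left <= x < left + region_w and top <= y < top + region_h
            )
    return counts
```

The returned counts serve as the amount of text information of each region.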
The information density threshold may be determined by a person skilled in the art through analysis of the number of characters contained in the original image page, or may be set from the statistical average of the number of characters contained in regions of the text-region size at different positions. For example, if the average number of characters counted at several positions of the original image page with the preset text-region size is 30, the information density threshold may be set to about 30, for example a value between 27 and 33. A text region whose amount of information exceeds the information density threshold is determined to be a position of high information density.
In one illustrative example, a region whose height is 3/4 of the original page height and whose width is 3/4 of the original page width may be selected as the anchor region, where the vertical direction corresponds to the page height and the horizontal direction to the page width. Specifically, with the center point of the original image page as the origin, the region extends equally to the left and to the right so that its total width is 3/4 of the page width, and equally upward and downward so that its total height is 3/4 of the page height.
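Both ideas in this passage can be sketched briefly. The sketch makes two assumptions: the threshold is set exactly to the mean count (the source allows any value near the mean), and the function names `dense_regions` and `central_anchor_region` are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def dense_regions(counts: Dict[Tuple[int, int], int]) -> List[Tuple[int, int]]:
    """Return the regions whose character count exceeds the mean count."""
    threshold = mean(counts.values())
    return [pos for pos, n in counts.items() if n > threshold]


def central_anchor_region(
    page_w: float, page_h: float, fraction: float = 0.75
) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of a centred region covering
    `fraction` of the page width and of the page height."""
    half_w, half_h = page_w * fraction / 2, page_h * fraction / 2
    cx, cy = page_w / 2, page_h / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For an 800 x 400 page, the central region spans x from 100 to 700 and y from 50 to 350.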
According to the required number of anchor points, M points in the anchor region are selected as anchor points, where M is a positive integer. The anchor points may be selected at random within the anchor region, selected according to received external instructions, or selected by preset selection rules. When the positions of the anchor points are selected at random, the embodiment of the invention can assign a different weight to each anchor point according to its position in the page. For example, the closer an anchor point is to the center of the page, the higher its weight; this makes the result more accurate, but the time consumed increases correspondingly. In one illustrative example, the scale ratio information may be set according to the application scenario of the text angle classification.
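Random selection, one of the three options named above, can be sketched as follows. Uniform sampling over integer pixel positions and the name `sample_anchor_points` are assumptions; the source does not fix a distribution.

```python
import random
from typing import List, Optional, Tuple


def sample_anchor_points(
    region: Tuple[int, int, int, int],
    m: int,
    seed: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Pick m anchor points uniformly at random inside the region
    (left, top, right, bottom); the bounds are inclusive."""
    left, top, right, bottom = region
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    return [
        (rng.randint(left, right), rng.randint(top, bottom))
        for _ in range(m)
    ]
```

Passing a seed is useful when the same anchor points must be reused across runs, for example to compare weighting schemes.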
The image to be detected is an image whose angle is to be determined and classified.
In one illustrative example, each anchor combination may contain three or more sub-images, and the combination of three or more sub-images with the preset weighting information may be derived from principles of probability theory.
In one illustrative example, before the anchor combinations obtained by segmenting the original image are trained in step 101, the method of the embodiment of the invention further includes:
extracting features from one or more original images to obtain the features of each original image; and
mapping the obtained features of each original image to the label space of the sample for training, to obtain the training model.
In one illustrative example, extracting features from one or more original images includes:
extracting the features of each original image with a deep convolutional neural network. The extracted features are the output of the last layer of the deep convolutional neural network after the original image has been processed by the network.
In one illustrative example, the sample of the embodiment of the invention includes training images that carry the result of the angle classification, that is, images whose angle classification has already been determined. The number of training images contained in the sample can be set according to the required accuracy of angle classification: in theory, the larger the number of samples, the higher the accuracy. In addition, the smaller the step between angle classes, the more original-image size ratios the sample should contain. The label space of the sample represents the set of training images.
In one illustrative example, mapping the obtained features of each original image to the label space of the sample for training may include: inputting the extracted features, in the form of a feature map, into the fully connected layer of the deep convolutional neural network, and obtaining through the calculation of the fully connected layer a 1 x N matrix, where N is a positive integer denoting the number of angle classes (for example, if the angle classes are 0, 90, 180 and 270 degrees, N equals 4). The 1 x N matrix represents the probability values of the original image belonging to each angle class.
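How a fully connected layer can yield a 1 x N row of probabilities is sketched below. The softmax normalisation is an assumption, since the source only says that the layer output represents probability values, and the weights, bias and function names are hypothetical placeholders for learned parameters.

```python
import math
from typing import List


def fully_connected(
    features: List[float], weights: List[List[float]], bias: List[float]
) -> List[float]:
    """Compute one score per angle class: weights[c] . features + bias[c]."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(scores: List[float]) -> List[float]:
    """Turn scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

With N = 4, the output row holds one probability for each of 0, 90, 180 and 270 degrees.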
It should be noted that the training model of the embodiment of the invention may also be obtained with existing model training methods in the related art.
Step 102, processing the obtained probability values of the angle classification of the sub-images contained in each anchor combination with preset weighting information, to obtain an output value of each anchor combination for the angle classification.
In one illustrative example, the weighting information of the embodiment of the invention includes a weighting coefficient that is positively correlated with the amount of text information contained in the anchor combination.
In one illustrative example, the weighting information of the embodiment of the invention includes:
a first weighting coefficient for each sub-image contained in the anchor combination, which is positively correlated with the amount of text information contained in the sub-image; and/or,
a second weighting coefficient for each anchor combination, which is inversely related to the anchor point distance, the anchor point distance being the distance between the anchor point of the anchor combination and the center of the original image.
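One possible form of the second weighting coefficient is sketched below. The source only requires that the coefficient decrease as the anchor point distance grows; the specific formula 1 / (1 + d / half-diagonal) and the name `second_weight` are assumptions chosen for illustration.

```python
import math
from typing import Tuple


def second_weight(
    anchor: Tuple[float, float], image_size: Tuple[float, float]
) -> float:
    """Return a weight in (0, 1] that decreases as the anchor point
    moves away from the image centre (1.0 exactly at the centre)."""
    width, height = image_size
    distance = math.hypot(anchor[0] - width / 2, anchor[1] - height / 2)
    half_diagonal = math.hypot(width / 2, height / 2)
    # Normalising by the half-diagonal makes the weight scale-free.
    return 1.0 / (1.0 + distance / half_diagonal)
```

Under this formula an anchor at the image center gets weight 1.0 and an anchor at a corner gets 0.5.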
In an exemplary embodiment, processing the obtained probability values of the angle classification of each sub-graph contained in each anchor combination through preset weighting information includes:
in the case that the weighting information includes the first weighting coefficient: for each anchor combination, calculating a probability weighted value of each sub-graph from its obtained probability values of the angle classification and its first weighting coefficient, and accumulating the probability weighted values of the sub-graphs per angle classification to obtain the output value of the anchor combination for the angle classification; or,
in the case that the weighting information includes the second weighting coefficient: for each anchor combination, accumulating, for each angle classification, the probability values of that angle classification over all sub-graphs contained in the anchor combination to obtain the probability cumulative sum of the anchor combination, and multiplying the probability cumulative sum of each anchor combination by the corresponding second weighting coefficient to obtain the output value of the anchor combination for the angle classification; or,
in the case that the weighting information includes both the first weighting coefficient and the second weighting coefficient: for each anchor combination, calculating a probability weighted value of each sub-graph from its obtained probability values of the angle classification and its first weighting coefficient, accumulating the probability weighted values of the sub-graphs per angle classification to obtain the probability weighted sum of the anchor combination, and multiplying the probability weighted sum of each anchor combination by the corresponding second weighting coefficient to obtain the output value of the anchor combination for the angle classification.
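The three cases above can be sketched with one function in which an absent coefficient is treated as 1. This is an illustrative reading rather than the patented implementation; the function name `anchor_output` and its argument layout are assumptions.

```python
def anchor_output(sub_probs, first_weights=None, second_weight=None):
    """Output value (one number per angle class) for a single anchor combination.

    sub_probs     : list of per-sub-graph probability vectors, one entry per angle class
    first_weights : optional list with one first weighting coefficient per sub-graph
    second_weight : optional second weighting coefficient of the anchor combination
    A coefficient that is not supplied is treated as 1.0 (assumption).
    """
    if not sub_probs:
        raise ValueError("an anchor combination needs at least one sub-graph")
    if first_weights is None:
        first_weights = [1.0] * len(sub_probs)
    if len(first_weights) != len(sub_probs):
        raise ValueError("need one first weighting coefficient per sub-graph")
    if second_weight is None:
        second_weight = 1.0

    num_classes = len(sub_probs[0])
    totals = [0.0] * num_classes
    # Accumulate the weighted probabilities per angle class.
    for probs, w1 in zip(sub_probs, first_weights):
        for c in range(num_classes):
            totals[c] += w1 * probs[c]
    # Scale the accumulated sum by the second weighting coefficient.
    return [second_weight * t for t in totals]
```

Calling it with only `first_weights`, only `second_weight`, or both corresponds to the first, second and third case respectively.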
In an exemplary embodiment, the first weighting coefficient may be set by a person skilled in the art according to the amount of text information (generally, the number of characters) contained in the sub-graph; in general, the more text information the sub-graph contains, the larger the first weighting coefficient. With the first weighting coefficient set in this way, a sub-graph that contains more text carries a larger share in the text angle classification, which can improve the accuracy of the text angle classification.
In one illustrative example, the calculation of the probability weighted sum for each anchor combination in an embodiment of the present invention includes:
for each anchor combination, calculating the probability weighted sum by:
for each angle classification, multiplying the obtained probability value of each sub-graph for that angle classification by the first weighting coefficient of the sub-graph to obtain the probability weighted value of each sub-graph for each angle classification;
accumulating, per angle classification, the probability weighted values of all sub-graphs in the anchor combination to obtain the probability weighted sum.
Note that when the weighting information does not include the first weighting coefficient, the first weighting coefficient may be set to 1.
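The calculation described above can be written compactly as follows. The notation is introduced here for illustration and does not appear in the source: a indexes anchor combinations, i indexes the sub-graphs of a, c indexes angle classifications, p_i[c] is the probability value of sub-graph i for classification c, w^{(1)}_i is the first weighting coefficient and w^{(2)}_a is the second. Treating an unused second weighting coefficient as 1, in the same way as the first, is an assumption.

```latex
S_a[c] = \sum_{i \in a} w^{(1)}_i \, p_i[c],
\qquad
O_a[c] = w^{(2)}_a \, S_a[c]
```

Here S_a is the probability weighted sum of anchor combination a and O_a is its output value for angle classification.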
In an illustrative example, the second weighting coefficient may be set by a person skilled in the art based on the anchor point of the anchor combination; in general, the closer the anchor point of the anchor combination is to the center of the original image, the larger the second weighting coefficient. Sub-graphs segmented around an anchor point near the center of the original image and sub-graphs segmented around an anchor point near its edge differ in the amount of text information they contain: when the anchor point is close to the center, the sub-graphs often contain more text information. Setting the second weighting coefficient therefore increases the weight, in the text angle classification, of anchor combinations whose anchor points are close to the center of the original image, which improves the accuracy of the text angle classification.
Step 103: determining the angle classification of the text from the obtained output values for angle classification.
After the output values for angle classification are obtained, the embodiment of the invention can select the angle classification with the largest value as the angle classification of the text by comparing the values corresponding to the angle classifications in the output values.
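A minimal sketch of this selection step follows. The source does not state how the output values of several anchor combinations are merged before the comparison, so summing them element by element is an assumption made here, as are the names `classify_angle` and `ANGLES`.

```python
ANGLES = (0, 90, 180, 270)  # angle classifications in degrees


def classify_angle(outputs, angles=ANGLES):
    """Pick the angle classification with the largest value.

    outputs : list of output vectors, one per anchor combination,
              each with one value per angle classification.
    Vectors from several anchor combinations are summed element-wise
    before the comparison (assumption, not stated in the source).
    """
    if not outputs:
        raise ValueError("at least one output vector is required")
    totals = [sum(values) for values in zip(*outputs)]
    best = max(range(len(totals)), key=totals.__getitem__)
    return angles[best]
```

With a single anchor combination the function reduces to a plain argmax over its output vector.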
According to the embodiment of the invention, the probability value of the angle classification of the segmented subgraph is determined through the anchor combination, and the determined probability value of the angle classification is processed based on preset weighting information to determine the angle classification of the text. The angle classification is determined based on the subgraph maintaining the original image resolution, so that the accuracy of the text angle classification is improved.
FIG. 2 is a schematic diagram of an anchor combination according to an embodiment of the present invention. As shown in FIG. 2, after a minimum unit size is set, the original image is segmented according to the anchor point and the scale proportion information to obtain an anchor combination containing three sub-graphs. The training model gives the classification probability values for the angle classifications 0 degrees, 90 degrees, 180 degrees and 270 degrees as [0.9, 0, 0, 0] for the first sub-graph, [0.8, 0.1, 0, 0] for the second sub-graph and [0.88, 0.1, 0.1, 0] for the third sub-graph. With first weighting coefficients of 0.9, 1.0 and 1.0 for the three sub-graphs, the probability weighted sum of the anchor combination is [0.9, 0, 0, 0] × 0.9 + [0.8, 0.1, 0, 0] × 1.0 + [0.88, 0.1, 0.1, 0] × 1.0 = [2.49, 0.2, 0.1, 0]. When more than one anchor combination is included, the probability weighted sum of each anchor combination is obtained by the same calculation; a second weighting coefficient is set for each anchor combination according to the distance between its anchor point and the center of the original image, and the output value for angle classification is obtained from the probability weighted sum and the second weighting coefficient of each anchor combination. [The numerical values given in the source for the remaining cases of this example are illegible.]
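The legible part of the FIG. 2 example can be checked numerically. The sketch below uses the three probability vectors and the weights 0.9, 1.0 and 1.0 exactly as they appear in the summation in the text; the second sub-graph is taken as [0.8, 0.1, 0, 0], the form used in that summation. The helper name `weighted_sum` is hypothetical.

```python
def weighted_sum(sub_probs, weights):
    """Per-class sum of each sub-graph's probabilities times its weight."""
    num_classes = len(sub_probs[0])
    return [
        sum(w * probs[c] for probs, w in zip(sub_probs, weights))
        for c in range(num_classes)
    ]


# Probabilities for 0, 90, 180 and 270 degrees, one row per sub-graph.
sub_probs = [
    [0.9, 0.0, 0.0, 0.0],
    [0.8, 0.1, 0.0, 0.0],
    [0.88, 0.1, 0.1, 0.0],
]
first_weights = [0.9, 1.0, 1.0]

result = [round(v, 2) for v in weighted_sum(sub_probs, first_weights)]
print(result)  # [2.49, 0.2, 0.1, 0.0]
```

The largest entry is the first one, so this anchor combination on its own points to the 0 degree classification.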
The embodiment of the invention improves the accuracy and stability of angle classification and is suitable for platforms including mobile phones, tablets, notebooks, embedded systems and the like. The text angle classification method of the embodiment may also be implemented with other open-source neural networks. In addition, the embodiment of the invention can be applied in fields that involve text angle classification, such as medical imaging and fine industrialization.
The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the method for classifying the text angles is realized when the computer program is executed by a processor.
The embodiment of the invention also provides a terminal comprising a memory and a processor, the memory storing a computer program, wherein:
the processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements the method of text angle classification as described above.
Fig. 3 is a block diagram of a text angle classification apparatus according to an embodiment of the present invention. As shown in Fig. 3, the apparatus includes a training unit, a calculation unit and a determining unit, wherein,
the training unit is used for training the anchor combinations obtained by segmenting the original image to obtain the probability value of the angle classification of each sub-graph contained in each anchor combination;
the calculation unit is used for processing the obtained probability value of the angle classification of each sub-graph contained in each anchor combination through preset weighting information to obtain an output value of each anchor combination for the angle classification;
the determining unit is used for determining the angle classification of the text from the obtained output values for the angle classification.
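The division into three units can be sketched as one class with a method per unit. This is only an illustration of the data flow: the class name `TextAngleDevice`, the injected `model` callable and the element-wise merge of several anchor combinations in the last step are assumptions, and both weighting coefficients are passed in explicitly.

```python
class TextAngleDevice:
    """Illustrative wiring of the training, calculation and determining units."""

    def __init__(self, model, angles=(0, 90, 180, 270)):
        # `model` is any callable mapping one sub-graph to a probability
        # vector with one entry per angle classification (hypothetical).
        self.model = model
        self.angles = angles

    def training_unit(self, anchor_combinations):
        """Probability vectors for every sub-graph of every anchor combination."""
        return [[self.model(sub) for sub in combo] for combo in anchor_combinations]

    def calculation_unit(self, probs, first_weights, second_weights):
        """Output vector per anchor combination from the weighted probabilities."""
        outputs = []
        for combo_probs, w1s, w2 in zip(probs, first_weights, second_weights):
            totals = [0.0] * len(self.angles)
            for sub_probs, w1 in zip(combo_probs, w1s):
                for c, value in enumerate(sub_probs):
                    totals[c] += w1 * value
            outputs.append([w2 * t for t in totals])
        return outputs

    def determining_unit(self, outputs):
        """Angle with the largest value after summing over anchor combinations."""
        totals = [sum(values) for values in zip(*outputs)]
        best = max(range(len(totals)), key=totals.__getitem__)
        return self.angles[best]
```

A stub such as `model=lambda sub: sub` lets the flow be exercised with precomputed probability vectors in place of real sub-graphs.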
According to the embodiment of the invention, the probability value of the angle classification of the segmented subgraph is determined through the anchor combination, and the determined probability value of the angle classification is processed based on preset weighting information to determine the angle classification of the text. The angle classification is determined based on the subgraph maintaining the original image resolution, so that the accuracy of the text angle classification is improved.
In one illustrative example, the training unit is configured to:
pack the anchor combination according to the preset input size of the training model;
input the packed anchor combination into the training model for training;
obtain the probability value of the angle classification of each sub-graph contained in each anchor combination;
wherein, when the resolution of the original image is smaller than the input size, packing the anchor combination according to the preset input size of the training model comprises obtaining the sub-graphs in the anchor combination whose size is larger than half of the original image size and smaller than the original image size; and when the resolution of the original image is larger than or equal to the input size, packing the anchor combination comprises obtaining the sub-graphs in the anchor combination whose size is larger than the input size and smaller than twice the input size.
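The size rule for choosing sub-graphs during packing can be sketched as a simple filter. The source does not say how "size" is measured, so this sketch assumes a single scalar per image (for example the longer side in pixels); the function name and arguments are hypothetical.

```python
def select_sub_images(sub_sizes, original_size, input_size):
    """Return the sub-graph sizes that satisfy the packing rule.

    All sizes are scalars in the same unit (assumption: e.g. longer side
    in pixels). Both bounds are treated as strict, following the wording
    "larger than" and "smaller than".
    """
    if original_size < input_size:
        # Small original: keep sub-graphs between half and full original size.
        low, high = original_size / 2.0, original_size
    else:
        # Large original: keep sub-graphs between one and two input sizes.
        low, high = input_size, 2 * input_size
    return [s for s in sub_sizes if low < s < high]
```

For an original image of size 200 and an input size of 224, only sub-graphs strictly between 100 and 200 are kept; for an original image of size 1000, the window becomes 224 to 448.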
In one illustrative example, the weighting information of the embodiment of the present invention includes a weighting coefficient that is positively correlated with the amount of information of text contained in the anchor combination.
According to the embodiment of the invention, the weighting coefficient is positively correlated with the information quantity of the text contained in the anchor combination, so that the accuracy of text angle classification can be further improved.
In one illustrative example, the weighting information includes:
a first weighting coefficient for each sub-graph contained in the anchor combination, which is positively correlated with the amount of text information contained in that sub-graph; and/or,
a second weighting coefficient for each anchor combination, which is negatively correlated with the anchor point distance, where the anchor point distance is the distance between the anchor point of the anchor combination and the center of the original image.
In one illustrative example, the calculation unit is configured to:
in the case that the weighting information includes the first weighting coefficient: for each anchor combination, calculate a probability weighted value of each sub-graph from its obtained probability values of the angle classification and its first weighting coefficient, and accumulate the probability weighted values of the sub-graphs per angle classification to obtain the output value of the anchor combination for the angle classification; or,
in the case that the weighting information includes the second weighting coefficient: for each anchor combination, accumulate, for each angle classification, the probability values of that angle classification over all sub-graphs contained in the anchor combination to obtain the probability cumulative sum of the anchor combination, and multiply the probability cumulative sum of each anchor combination by the corresponding second weighting coefficient to obtain the output value of the anchor combination for the angle classification; or,
in the case that the weighting information includes both the first weighting coefficient and the second weighting coefficient: for each anchor combination, calculate a probability weighted value of each sub-graph from its obtained probability values of the angle classification and its first weighting coefficient, accumulate the probability weighted values of the sub-graphs per angle classification to obtain the probability weighted sum of the anchor combination, and multiply the probability weighted sum of each anchor combination by the corresponding second weighting coefficient to obtain the output value of the anchor combination for the angle classification.
"One of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components, for example, one physical component may have a plurality of functions, or one function or step may be cooperatively performed by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. "

Claims (6)

1. A method of text angle classification, comprising:
Training the anchor combinations obtained by the original graph segmentation to obtain the probability value of the angle classification of each sub-graph contained by each anchor combination;
Processing the obtained probability value of the angle classification of each subgraph contained in each anchor combination through preset weighting information to obtain an output value of each anchor combination for the angle classification;
determining the angle classification of the text through the obtained output value for angle classification;
the weighting information comprises a first weighting coefficient of each sub-graph contained in the anchor combination, wherein the first weighting coefficient is positively correlated with the information quantity of the text contained in the sub-graph, and/or a second weighting coefficient of each anchor combination, which is negatively correlated with the anchor point distance by taking the distance between the anchor point of the anchor combination and the center of the original graph as the anchor point distance;
the probability value of the angle classification of each sub-graph contained in each obtained anchor combination is processed through preset weighting information to obtain an output value of each anchor combination for the angle classification, and the method comprises the following steps:
Calculating a probability weighted sum of each sub-graph based on the obtained probability value of the angle classification and the first weighting coefficient, accumulating the calculated probability weighted sum of each sub-graph according to the angle classification to obtain the output value for the angle classification of each anchor combination, or,
in the case that the weighting information comprises the second weighting coefficient: for each anchor combination, accumulating, for each angle classification, the probability values of that angle classification over all sub-graphs contained in the anchor combination to obtain the probability accumulated sum of the anchor combination, and obtaining the output value of each anchor combination for the angle classification according to the calculated probability accumulated sum of the anchor combination and the second weighting coefficient; or,
And when the weighting information comprises the first weighting coefficient and the second weighting coefficient, calculating a probability weighted sum of each sub-graph according to the obtained probability value of the angle classification and the first weighting coefficient, accumulating the calculated probability weighted sum of each sub-graph according to the angle classification to obtain a probability weighted sum of the anchor combination, and obtaining the output value of the anchor combination for the angle classification according to the calculated probability weighted sum of each anchor combination and the second weighting coefficient.
2. The method of claim 1, wherein training the anchor combinations obtained from the original image segmentation comprises:
packing the anchor combination according to the preset input size of the training model;
inputting the packed anchor combination into the training model for training;
wherein, when the resolution of the original image is smaller than the input size, packing the anchor combination according to the preset input size of the training model comprises obtaining the sub-graphs in the anchor combination whose size is larger than one half of the original image size and smaller than the original image size; and when the resolution of the original image is larger than or equal to the input size, packing the anchor combination according to the preset input size of the training model comprises obtaining the sub-graphs in the anchor combination whose size is larger than the input size and smaller than twice the input size.
3. A computer storage medium having stored therein a computer program which, when executed by a processor, implements the method of text angle classification according to any of claims 1-2.
4. A terminal comprises a memory and a processor, wherein the memory stores a computer program,
The processor is configured to execute the computer program in the memory;
The computer program, when executed by the processor, implements a method of text angle classification as claimed in any one of claims 1-2.
5. A text angle classification device is characterized by comprising a training unit, a calculating unit and a determining unit, wherein,
The training unit is used for training the anchor combinations obtained by dividing the original image to obtain the probability value of the angle classification of each subgraph contained in each anchor combination;
the calculation unit is used for processing the obtained probability value of the angle classification of each sub-graph contained in each anchor combination through preset weighting information to obtain an output value of each anchor combination for the angle classification;
The determining unit is configured to determine an angle classification of the text by the obtained output value for the angle classification;
the weighting information comprises a first weighting coefficient of each sub-graph contained in the anchor combination, wherein the first weighting coefficient is positively correlated with the information quantity of the text contained in the sub-graph, and/or a second weighting coefficient of each anchor combination, which is negatively correlated with the anchor point distance by taking the distance between the anchor point of the anchor combination and the center of the original graph as the anchor point distance;
the probability value of the angle classification of each sub-graph contained in each obtained anchor combination is processed through preset weighting information to obtain an output value of each anchor combination for the angle classification, and the method comprises the following steps:
Calculating a probability weighted sum of each sub-graph based on the obtained probability value of the angle classification and the first weighting coefficient, accumulating the calculated probability weighted sum of each sub-graph according to the angle classification to obtain the output value for the angle classification of each anchor combination, or,
in the case that the weighting information comprises the second weighting coefficient: for each anchor combination, accumulating, for each angle classification, the probability values of that angle classification over all sub-graphs contained in the anchor combination to obtain the probability accumulated sum of the anchor combination, and obtaining the output value of each anchor combination for the angle classification according to the calculated probability accumulated sum of the anchor combination and the second weighting coefficient; or,
And when the weighting information comprises the first weighting coefficient and the second weighting coefficient, calculating a probability weighted sum of each sub-graph according to the obtained probability value of the angle classification and the first weighting coefficient, accumulating the calculated probability weighted sum of each sub-graph according to the angle classification to obtain a probability weighted sum of the anchor combination, and obtaining the output value of the anchor combination for the angle classification according to the calculated probability weighted sum of each anchor combination and the second weighting coefficient.
6. The apparatus of claim 5, wherein the training unit is configured to:
pack the anchor combination according to the preset input size of the training model;
input the packed anchor combination into the training model for training;
wherein, when the resolution of the original image is smaller than the input size, packing the anchor combination according to the preset input size of the training model comprises obtaining the sub-graphs in the anchor combination whose size is larger than one half of the original image size and smaller than the original image size; and when the resolution of the original image is larger than or equal to the input size, packing the anchor combination according to the preset input size of the training model comprises obtaining the sub-graphs in the anchor combination whose size is larger than the input size and smaller than twice the input size.
CN202011566516.4A 2020-12-25 2020-12-25 A method, device, computer storage medium and terminal for text angle classification Active CN114692707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011566516.4A CN114692707B (en) 2020-12-25 2020-12-25 A method, device, computer storage medium and terminal for text angle classification

Publications (2)

Publication Number Publication Date
CN114692707A CN114692707A (en) 2022-07-01
CN114692707B true CN114692707B (en) 2024-12-13

Family

ID=82130433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011566516.4A Active CN114692707B (en) 2020-12-25 2020-12-25 A method, device, computer storage medium and terminal for text angle classification

Country Status (1)

Country Link
CN (1) CN114692707B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378249A (en) * 2019-06-27 2019-10-25 腾讯科技(深圳)有限公司 The recognition methods of text image tilt angle, device and equipment
CN111967449A (en) * 2020-10-20 2020-11-20 北京易真学思教育科技有限公司 Text detection method, electronic device and computer readable medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3787377B2 (en) * 1995-08-31 2006-06-21 キヤノン株式会社 Document orientation determination method and apparatus, and character recognition method and apparatus
US7136813B2 (en) * 2001-09-25 2006-11-14 Intel Corporation Probabalistic networks for detecting signal content
JP4532419B2 (en) * 2006-02-22 2010-08-25 富士フイルム株式会社 Feature point detection method, apparatus, and program
KR102746047B1 (en) * 2015-05-21 2024-12-26 코닌클리케 필립스 엔.브이. Method and device for determining depth map for an image
CN105354307B (en) * 2015-11-06 2021-01-15 腾讯科技(深圳)有限公司 Image content identification method and device
CN111695377B (en) * 2019-03-13 2023-09-29 杭州海康威视数字技术股份有限公司 Text detection method and device and computer equipment
CN112016341A (en) * 2019-05-28 2020-12-01 珠海金山办公软件有限公司 A kind of text picture correction method and device
CN111931877B (en) * 2020-10-12 2021-01-05 腾讯科技(深圳)有限公司 Target detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant