Stroke extraction method for handwritten Chinese characters
Technical Field
The invention relates to character processing technology, and in particular to a stroke extraction method for handwritten Chinese characters.
Background
The invention of writing was an important driving force in the development of human civilization. Since the advent of the information age, how to process characters has been a key topic of research in related disciplines, for example recognition of online and offline handwritten characters, identity verification based on writing style, automatic processing of documents, and intelligent education. With the rapid development of deep learning in particular, various character processing technologies have been widely applied. However, because a deep neural network is a black box, the resulting models are hard to interpret and difficult to relate intuitively to prior knowledge, which hinders deeper application.
Chinese characters are a cultural treasure of the Chinese nation, carry the cultural spirit of successive generations of Chinese people, and have a distinctive glyph structure. According to the brush rules of Chinese calligraphy, namely the 'Eight Principles of Yong', every Chinese character can be decomposed into eight basic strokes: dot, horizontal, vertical, hook, rising, left-falling, short left-falling and right-falling. Basic strokes of different types and numbers are arranged in the character space according to a certain character-forming logic, producing the simple and elegant square Chinese characters. This process is clear and intuitive to human perception and easy for people to accept and understand.
Existing stroke extraction methods are mainly skeleton-based or contour-based. In the skeleton-based method, the skeleton of the character is obtained first, usually by a thinning algorithm. The intersection points in the skeleton are then detected, and the skeleton is split into skeleton line segments at those points. Finally, the split segments are merged according to certain rules to restore the skeleton of each single stroke, from which the stroke itself is restored. The main problems to be solved in this process include handling strokes of different widths, burrs and bifurcations of the skeleton, and stroke deformation. A schematic flow of this type of method is shown in FIG. 1.
In the contour-based method, the contour of the character is obtained first, usually by an edge detection algorithm. The direction attribute of the points on the contour line is then detected, and the corresponding stroke inflection points are found on the contour. Inflection points belonging to the same stroke intersection are grouped together, the exact position of the intersection within the strokes is determined, and the character is split into disjoint contour fragments at the intersections. Finally, the contour fragments are merged according to certain rules to obtain the contour of each single stroke, and the contour is filled to obtain the stroke. In this process, finding and judging inflection points and determining their relative positions are the main concerns. A schematic flow of this type of method is shown in FIG. 2.
The two methods have a similar overall flow. First, high-level information (the skeleton or the contour) is extracted from the original character image. This information is then used to find the intersection points and split the character into stroke segments. Finally, the stroke segments are merged according to certain rules to obtain the final strokes. The technical focus is how to describe the high-level information effectively and how to define the rules for merging stroke segments.
As the related literature shows, the prior art is mainly aimed at printed characters or fairly regular handwriting and works well on such characters. However, general handwriting contains connected strokes, deformation and similar problems, which cause large stroke extraction errors and seriously affect the performance of downstream tasks. Moreover, each of the above methods uses only one type of structural information, either the contour or the skeleton.
Disclosure of Invention
The invention mainly aims to provide a stroke extraction method for handwritten Chinese characters.
The invention adopts the following technical scheme: a stroke extraction method for handwritten Chinese characters, comprising the following steps:
preprocessing a character image;
calculating the direction point angle features of the pixels;
calculating the direction point angle features of the inflection point groups;
extracting strokes based on the stroke skeleton and contour.
Further, the character image preprocessing includes:
denoting the original image as I, binarizing the input original image, and denoting the binarized character image as Ibw;
extracting the edges of the characters in the binary image Ibw using the Canny algorithm, and denoting the edge image as Iedge;
extracting the skeleton of the character in the binary image Ibw using the Rosenfeld algorithm, and denoting the skeleton image as Isk.
Still further, the direction point angle feature calculation of the pixel includes:
taking a square area of neighborhood radius N centered on the current pixel as the region to be calculated, localRect, and retaining only the pixels in the region that are connected to the current pixel;
wherein the value of N depends on the size of the character image, and N is typically 3 for a character image of about 100 x 100;
deleting the points of localRect within a radius of N-1, the remaining points being the direction points of the current pixel, and, if several direction points are adjacent to one another, retaining only the one farthest from the current pixel;
taking the current pixel as the coordinate origin, calculating the included angle θ between the x-axis and the line connecting the current pixel to a direction point, θ being given by formula (1):
θ = atan2(x - x0, y - y0)    (1)
wherein (x0, y0) are the coordinates of the current pixel, (x, y) are the coordinates of the current direction point, θ takes values in [-180, 180), and the value of θ is the first-order direction point angle feature of the current pixel;
if a pixel has two or more direction points, the relative angle difference δ between any two of them can be calculated, δ being given by formula (2):
wherein the subscripts a and b denote any two direction points of the same pixel and θ is the angle of the corresponding direction point; because δ is a relative quantity, it is normalized to the angle range [0, 180). If a pixel has m direction points, m(m-1)/2 values of δ can be calculated, and these values form the second-order direction point angle feature of the current pixel.
Still further, the direction point angle feature calculation of the inflection point group includes:
Inflection point and skeleton line segment extraction
calculating the first-order and second-order direction point angle features of all pixels on the skeleton image Isk, and deleting the pixels that have no first-order direction point to obtain a new Isk;
extracting all inflection points in Isk, a pixel being judged to be an inflection point if it meets either of two conditions: it has 2 direction points and its second-order direction point angle is smaller than 145, or it has more than 2 direction points; all inflection points form the inflection point map Isk-i of the skeleton, and each connected region of Isk-i is one inflection point group;
subtracting Isk-i from Isk to obtain the skeleton line segment map Isk-l, in which every pixel has fewer than 3 direction points and a second-order direction point angle greater than 135;
direction point angle feature calculation of inflection point group
calculating the distances between all pixels in the inflection point group and its center pixel, the center pixel being the pixel of the group closest to its geometric center, and constructing a square region to be calculated, localRect, with the farthest distance as the radius N;
deleting the points of localRect within a radius of N-1, the remaining points being the direction points of the current inflection point group, as shown in FIG. 9(b), and, if several direction points are adjacent to one another, retaining only the one farthest from the center of the current inflection point group;
for each direction point, calculating the included angle θ of the line connecting the direction point to the center of the current inflection point group, and searching for a skeleton line segment connected to the direction point; if no skeleton line segment is connected to the direction point, taking the value of θ as the first-order direction point angle feature of the current inflection point group; if a skeleton line segment is connected to the direction point, taking the direction angle φ of that skeleton line segment as the first-order direction point angle feature of the current inflection point group; because φ takes values in [-90, 90), it must be normalized to [-180, 180) according to the value of θ, the normalization being given by formula (3):
calculating the relative angle difference δ between any two direction points in the inflection point group according to formula (2); if an inflection point group has m direction points, m(m-1)/2 values of δ can be calculated, and these values form the second-order direction point angle feature of the current inflection point group.
Still further, the stroke extraction based on the stroke skeleton and outline includes:
for every inflection point group with more than three direction points, finding the two direction points DPa and DPb with the largest second-order direction point angle, constructing a new inflection point group from the original center point, DPa and DPb, and removing these two direction points from the original inflection point group;
for an inflection point group with three direction points, if the variance of its three second-order direction point angles is small, judging the group to be a Y-shaped intersection and splitting it into three new inflection point groups, each containing only one direction point of the original group; otherwise judging it to be a T-shaped intersection and splitting it into two new inflection point groups, one containing the two direction points with the largest second-order direction point angle and the other containing only the remaining direction point;
examining the second-order direction point angle feature of each inflection point group: if it is greater than 135, merging the skeleton line segments corresponding to the two direction points of the group; otherwise leaving the relationship between the skeleton line segments unchanged;
searching Iedge for the nearest edge parallel to the stroke skeleton, filling the blank area between the edge and the stroke skeleton, and filling the corresponding skeleton intersection area to obtain the split strokes.
The invention has the advantages that:
The method of the invention converts offline handwritten Chinese characters into their corresponding strokes, so that downstream tasks such as handwritten Chinese character recognition and writer identity verification can be carried out on the basis of strokes. This greatly improves the interpretability of the whole processing flow and makes it easier to study and understand Chinese character information processing in depth.
The method converts offline handwritten Chinese characters into their corresponding strokes more accurately, thereby improving the accuracy of downstream tasks such as handwritten Chinese character recognition and writer identity verification.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application.
FIG. 1 is a flow chart of a skeleton-based method;
FIG. 2 is a flow chart of a contour-based method;
FIG. 3 is a sample view of a portion of a text image that needs to be processed;
FIG. 4 is a flow chart of the method of the present invention;
FIG. 5 shows the region to be calculated and the direction points of the current pixel;
FIG. 6 is a diagram with the current pixel as the origin of coordinates;
FIG. 7 shows the first-order and second-order direction point angle features of all pixels on the skeleton image Isk (where (a) is the new Isk obtained, (b) is the inflection point groups, and (c) is the skeleton line segment map Isk-l);
FIG. 8 shows the calculated direction angle of each skeleton line segment;
FIG. 9 shows the square region to be calculated, constructed with the farthest distance as the radius N (where (a) is the square region to be calculated localRect, and (b) is the direction points of the current inflection point group);
FIG. 10 is the stroke skeleton diagram obtained after all inflection point groups have been processed;
FIG. 11 is a drawing of a resulting split stroke.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The object processed by the invention is an image of a single Chinese character. Earlier document image processing stages such as denoising and enhancement, layout analysis, line and character segmentation, and normalization are outside the scope of this description. Samples of the character images to be processed are shown in FIG. 3.
The invention is a computer program algorithm comprising four steps: image preprocessing, calculation of the direction point angle features of pixels, calculation of the direction point angle features of inflection point groups, and stroke extraction. The flow of the steps is shown in FIG. 4.
First, character image preprocessing
1.1 The original image is denoted I. The input original image is binarized, and the binarized character image is denoted Ibw.
1.2 The edges of the characters in the binary image Ibw are extracted using the Canny algorithm and the edge image is denoted Iedge.
1.3 The skeleton of the character in the binary image Ibw is extracted using the Rosenfeld algorithm and the skeleton image is denoted Isk.
Binarization, edge extraction and skeleton extraction of a single character image are basic methods in digital image processing, and implementations are provided in many image processing libraries, so the technical details of these processes are not described separately here.
Second, direction point angle feature of a pixel
In order to describe character structure information more accurately, the invention proposes a descriptive feature named the direction point angle feature. For a single pixel on the skeleton image Isk, the feature is calculated as follows:
2.1 With the current pixel as the center, a square area of neighborhood radius N is taken as the region to be calculated, localRect, and only the pixels in the region that are connected to the current pixel are retained, as shown in FIG. 5(a). The value of N depends on the size of the character image; N is typically 3 for a character image of about 100 x 100.
2.2 The points of localRect within a radius of N-1 are deleted, and the remaining points are the direction points of the current pixel, as shown in FIG. 5(b). If several direction points are adjacent to one another, only the one farthest from the current pixel is retained.
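Steps 2.1 and 2.2 can be sketched as follows. This is only an illustrative reading, not the reference implementation: the function name `direction_points`, the representation of the skeleton as a set of (x, y) tuples, the use of 8-connectivity, and the interpretation of "radius" as the Chebyshev distance of the square window are all assumptions.

```python
from collections import deque

def direction_points(skeleton, center, n=3):
    """Return the direction points of `center` on a skeleton given as a set of (x, y)."""
    cx, cy = center
    # Step 2.1: skeleton pixels inside the square window localRect of radius n.
    window = {(x, y) for (x, y) in skeleton
              if max(abs(x - cx), abs(y - cy)) <= n}
    # Keep only pixels 8-connected to the current pixel inside the window.
    connected, queue = {center}, deque([center])
    while queue:
        x, y = queue.popleft()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                p = (x + dx, y + dy)
                if p in window and p not in connected:
                    connected.add(p)
                    queue.append(p)
    # Step 2.2: delete everything within radius n-1; only the outer ring remains.
    ring = {p for p in connected if max(abs(p[0] - cx), abs(p[1] - cy)) == n}
    # Merge mutually adjacent ring pixels, keeping the one farthest from the center.
    result, seen = [], set()
    for start in sorted(ring):
        if start in seen:
            continue
        group, stack = [], [start]
        seen.add(start)
        while stack:
            x, y = stack.pop()
            group.append((x, y))
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    q = (x + dx, y + dy)
                    if q in ring and q not in seen:
                        seen.add(q)
                        stack.append(q)
        result.append(max(group, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2))
    return sorted(result)

# A horizontal skeleton line: an interior pixel has two direction points,
# an endpoint has one.
line = {(x, 5) for x in range(11)}
interior = direction_points(line, (5, 5), n=3)
endpoint = direction_points(line, (0, 5), n=3)
```

With this reading, the number of returned points distinguishes an endpoint (one point) from a pixel inside a stroke (two points) and from an intersection (three or more).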
2.3 Taking the current pixel as the coordinate origin, the included angle θ between the x-axis and the line connecting the current pixel to a direction point is calculated by formula (1):
θ = atan2(x - x0, y - y0)    (1)
where (x0, y0) are the coordinates of the current pixel, (x, y) are the coordinates of the current direction point, and θ takes values in [-180, 180). The value of θ is the first-order direction point angle feature of the current pixel. One pixel may have several direction points. FIG. 5 illustrates five cases with 1, 2, 3 or 4 direction points, corresponding to the five situations that occur most frequently in structural information: endpoint, no intersection, Y-intersection, T-intersection and X-intersection.
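A minimal sketch of the first-order angle in step 2.3 is given below. It assumes that formula (1) lists the x offset first (as spreadsheet ATAN2 functions do), so that θ is the angle to the x-axis; Python's `math.atan2` takes the y offset first, hence the swapped argument order. The function name `first_order_angle` is hypothetical, and the wrap of 180 to -180 is one way to obtain the stated range [-180, 180).

```python
import math

def first_order_angle(center, point):
    """Angle in degrees between the x-axis and the line center -> point."""
    x0, y0 = center
    x, y = point
    # math.atan2 expects (dy, dx); image coordinates are used as-is (y grows downward).
    theta = math.degrees(math.atan2(y - y0, x - x0))
    # math.atan2 covers (-180, 180]; move 180 to -180 to match [-180, 180).
    if theta >= 180.0:
        theta -= 360.0
    return theta

right = first_order_angle((5, 5), (8, 5))  # direction point to the right
left = first_order_angle((5, 5), (2, 5))   # direction point to the left
down = first_order_angle((5, 5), (5, 8))   # larger y, i.e. lower in the image
```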
2.4 If a pixel has two or more direction points, the relative angle difference δ between any two of them can be calculated, as shown in FIG. 6. δ is given by formula (2):
where the subscripts a and b denote any two direction points of the same pixel and θ is the angle of the corresponding direction point. Because δ is a relative quantity, it is normalized to the angle range [0, 180). If a pixel has m direction points, m(m-1)/2 values of δ can be calculated, and these values form the second-order direction point angle feature of the current pixel.
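The pairwise calculation in step 2.4 might look like the sketch below. Formula (2) is not reproduced in the text, so the folding used here (absolute difference, with values above 180 replaced by 360 minus the value) is an assumption; it yields values from 0 to 180 inclusive, where 180 means two exactly opposite directions, and how the original formula treats that boundary is not known. The function names are hypothetical.

```python
from itertools import combinations

def angle_difference(theta_a, theta_b):
    """Assumed reading of formula (2): smallest angle between two directions, in degrees."""
    delta = abs(theta_a - theta_b) % 360.0
    return 360.0 - delta if delta > 180.0 else delta

def second_order_angles(thetas):
    """All pairwise differences: m direction points give m(m-1)/2 values."""
    return [angle_difference(a, b) for a, b in combinations(thetas, 2)]

straight = second_order_angles([0.0, -180.0])           # pixel inside a straight stroke
wrapped = second_order_angles([-170.0, 170.0])          # directions on either side of the -180/180 seam
crossing = second_order_angles([0.0, 90.0, -90.0, -180.0])  # four direction points
```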
Third, direction point angle feature of an inflection point group
The direction point angle feature of a skeleton pixel expresses the trend in which the skeleton extends and is a local descriptive feature within a neighborhood of a certain range. The character skeleton can be further split on the basis of this feature, as follows:
3.1 Inflection point and skeleton line segment extraction
3.1.1 The first-order and second-order direction point angle features of all pixels on the skeleton image Isk are calculated, and pixels with no first-order direction point (isolated noise) are deleted to obtain a new Isk, as shown in FIG. 7(a).
3.1.2 All inflection points in Isk are extracted. A pixel is judged to be an inflection point if it meets either of two conditions: it has 2 direction points and its second-order direction point angle is smaller than 145, or it has more than 2 direction points. All inflection points form the inflection point map Isk-i of the skeleton. The connected regions of this map are found, and each connected region is one inflection point group, as shown in FIG. 7(b).
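The inflection test and grouping of step 3.1.2 can be illustrated as follows. The input format (a dictionary from pixel to its list of first-order angles), the function names, the use of 8-connectivity for connected regions, and the folded angle difference standing in for formula (2) are assumptions made for the sketch; the threshold of 145 is taken from the text.

```python
def angle_difference(theta_a, theta_b):
    # Assumed stand-in for formula (2): smallest angle between two directions.
    delta = abs(theta_a - theta_b) % 360.0
    return 360.0 - delta if delta > 180.0 else delta

def is_inflection(thetas, threshold=145.0):
    """Either more than 2 direction points, or exactly 2 with an angle below the threshold."""
    if len(thetas) > 2:
        return True
    return len(thetas) == 2 and angle_difference(thetas[0], thetas[1]) < threshold

def inflection_groups(features):
    """features maps (x, y) -> first-order angles; returns 8-connected groups as sets."""
    points = {p for p, thetas in features.items() if is_inflection(thetas)}
    groups, seen = [], set()
    for start in sorted(points):
        if start in seen:
            continue
        group, stack = set(), [start]
        seen.add(start)
        while stack:
            x, y = stack.pop()
            group.add((x, y))
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    q = (x + dx, y + dy)
                    if q in points and q not in seen:
                        seen.add(q)
                        stack.append(q)
        groups.append(group)
    return groups

features = {
    (0, 0): [0.0],                  # endpoint
    (1, 0): [0.0, -180.0],          # straight run
    (2, 0): [-180.0, 90.0],         # sharp corner
    (2, 1): [-90.0, -135.0, 90.0],  # three direction points
    (2, 2): [-90.0, 90.0],          # straight run
    (5, 5): [0.0, 45.0, 90.0],      # separate intersection
}
groups = inflection_groups(features)
```

The pixels not selected here are what remains in the skeleton line segment map after the inflection point map is subtracted.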
3.1.3 Isk-i is subtracted from Isk to obtain the skeleton line segment map Isk-l, as shown in FIG. 7(c), in which every pixel has fewer than 3 direction points and a second-order direction point angle greater than 135. The connected regions of Isk-l are found, and each connected region is one skeleton line segment. The direction angle φ of each skeleton line segment is calculated as the angle between the long axis of the segment's region and the x-axis, as shown in FIG. 8.
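One common way to obtain the long-axis angle of a connected region is from its second-order central moments, sketched below. The text does not say how φ is computed, so this technique, the function name `segment_direction`, and the choice of a y-up sign convention (a segment running right and down in the image gets a negative angle) are assumptions.

```python
import math

def segment_direction(pixels):
    """Long-axis angle in degrees, in [-90, 90), of a list of (x, y) pixels."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    # Second-order central moments of the pixel set.
    mu20 = sum((x - mx) ** 2 for x, _ in pixels)
    mu02 = sum((y - my) ** 2 for _, y in pixels)
    mu11 = sum((x - mx) * (y - my) for x, y in pixels)
    # Principal-axis orientation; mu11 is negated to express the angle with y pointing up.
    phi = 0.5 * math.degrees(math.atan2(-2.0 * mu11, mu20 - mu02))
    # Fold the upper boundary so the result lies in [-90, 90).
    return phi - 180.0 if phi >= 90.0 else phi

horizontal = segment_direction([(0, 0), (1, 0), (2, 0)])
falling = segment_direction([(0, 0), (1, 1), (2, 2)])  # right and down in the image
rising = segment_direction([(0, 2), (1, 1), (2, 0)])   # right and up in the image
```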
3.2 Direction point angle feature calculation of inflection point groups
3.2.1 For each inflection point group, its geometric center is calculated, and the pixel of the group closest to the geometric center is taken as the center of the group. The distances between all pixels in the group and the center pixel are calculated, and a square region to be calculated, localRect, is constructed with the farthest distance as the radius N, as shown in FIG. 9(a).
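Step 3.2.1 can be sketched as below. The function name is hypothetical, and measuring the "farthest distance" as a Chebyshev distance (so that the square window just covers the group) is an assumption; the text does not name the distance measure.

```python
def group_center_and_radius(group):
    """Return (center pixel, radius N) for an inflection point group given as a set of (x, y)."""
    gx = sum(x for x, _ in group) / len(group)
    gy = sum(y for _, y in group) / len(group)
    # Center: the group pixel closest to the geometric center (ties broken by sort order).
    center = min(sorted(group), key=lambda p: (p[0] - gx) ** 2 + (p[1] - gy) ** 2)
    # Radius: distance from the center pixel to the farthest group pixel.
    radius = max(max(abs(x - center[0]), abs(y - center[1])) for x, y in group)
    return center, radius

center, radius = group_center_and_radius({(2, 0), (2, 1), (2, 2), (3, 1)})
```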
3.2.2 The points of localRect within a radius of N-1 are deleted, and the remaining points are the direction points of the current inflection point group, as shown in FIG. 9(b). If several direction points are adjacent to one another, only the one farthest from the center of the current inflection point group is retained.
3.2.3 For each direction point, the included angle θ of the line connecting the direction point to the center of the current inflection point group is calculated (in the same way as in 2.3), and a skeleton line segment connected to the direction point is searched for. If no skeleton line segment is connected to the direction point, the value of θ is taken as the first-order direction point angle feature of the current inflection point group; if a skeleton line segment is connected to the direction point, the direction angle φ of that segment is taken as the first-order direction point angle feature. Because φ takes values in [-90, 90), it must also be normalized to [-180, 180) according to the value of θ. The normalization is given by formula (3):
Note that the y-axis points downward in the image coordinate system but upward in the ordinary coordinate system, so the angle θ calculated from pixel coordinates and the angle φ calculated from the shape of the connected region have opposite signs in the y direction. A negative sign must therefore be applied to φ during normalization.
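Formula (3) is not reproduced in the text, so the following is only one plausible reading of the normalization: φ is negated to move it into image coordinates, and of the two opposite directions along the segment axis, the one closer to θ is chosen. The function names and this selection rule are assumptions.

```python
def angle_difference(theta_a, theta_b):
    # Smallest angle between two directions, in degrees.
    delta = abs(theta_a - theta_b) % 360.0
    return 360.0 - delta if delta > 180.0 else delta

def wrap(angle):
    """Map any angle in degrees into [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def normalize_phi(phi, theta):
    """Extend phi from [-90, 90) to [-180, 180), guided by theta (assumed rule)."""
    forward = wrap(-phi)           # negative sign: y-up angle to image coordinates
    backward = wrap(-phi + 180.0)  # the opposite direction along the same axis
    if angle_difference(forward, theta) <= angle_difference(backward, theta):
        return forward
    return backward

# A segment with phi = -45 can leave the group toward lower right or upper left;
# theta decides which of the two directions is meant.
lower_right = normalize_phi(-45.0, 40.0)
upper_left = normalize_phi(-45.0, -130.0)
```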
3.2.4 The relative angle difference δ between any two direction points in the inflection point group is calculated by formula (2). If an inflection point group has m direction points, m(m-1)/2 values of δ can be calculated, and these values form the second-order direction point angle feature of the current inflection point group.
Fourth, stroke extraction based on stroke skeleton and contour
The direction point angle feature of an inflection point group reflects the relative positions of the skeleton line segments meeting at the inflection point and is a global structural descriptive feature. Skeleton line segments can be merged on the basis of this feature and strokes extracted accordingly. The specific steps are as follows:
4.1 For every inflection point group with more than three direction points, the two direction points DPa and DPb with the largest second-order direction point angle are found, a new inflection point group is constructed from the original center point, DPa and DPb, and these two direction points are removed from the original inflection point group. This step is executed iteratively until the number of direction points of the inflection point group is less than or equal to 2. After this step, all inflection point groups with many direction points have been split, and every inflection point group has exactly one second-order direction point angle feature.
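A sketch of step 4.1 follows, operating only on the first-order angles of a group's direction points. The function name, the list-of-angles representation, and the folded angle difference standing in for formula (2) are assumptions; the loop follows the text literally, applying only to groups with more than three direction points and repeating until at most two remain.

```python
from itertools import combinations

def angle_difference(theta_a, theta_b):
    # Assumed stand-in for formula (2): smallest angle between two directions.
    delta = abs(theta_a - theta_b) % 360.0
    return 360.0 - delta if delta > 180.0 else delta

def split_multi_direction_group(thetas):
    """Split a group with more than three direction points into smaller groups."""
    remaining = list(thetas)
    if len(remaining) <= 3:
        return [remaining]  # step 4.1 does not apply
    groups = []
    while len(remaining) > 2:
        # Pair of direction points with the largest second-order angle.
        a, b = max(combinations(range(len(remaining)), 2),
                   key=lambda ij: angle_difference(remaining[ij[0]], remaining[ij[1]]))
        groups.append([remaining[a], remaining[b]])
        remaining = [t for k, t in enumerate(remaining) if k not in (a, b)]
    if remaining:
        groups.append(remaining)
    return groups

# An X-shaped crossing: each pair of opposite directions becomes its own group.
crossing = split_multi_direction_group([0.0, 85.0, -180.0, -95.0])
```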
4.2 For an inflection point group with three direction points, if the variance of its three second-order direction point angles is small, the group is judged to be a Y-shaped intersection and is split into three new inflection point groups, each containing only one direction point of the original group. Otherwise the group is judged to be a T-shaped intersection and is split into two new inflection point groups, one containing the two direction points with the largest second-order direction point angle and the other containing only the remaining direction point.
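The Y/T decision of step 4.2 can be illustrated as follows. The text only says the variance should be "small", so the numeric `variance_threshold` is a hypothetical parameter, as are the function names and the folded angle difference standing in for formula (2).

```python
from itertools import combinations
from statistics import pvariance

def angle_difference(theta_a, theta_b):
    # Assumed stand-in for formula (2): smallest angle between two directions.
    delta = abs(theta_a - theta_b) % 360.0
    return 360.0 - delta if delta > 180.0 else delta

def split_three_way_group(thetas, variance_threshold=400.0):
    """Classify a three-direction-point group as 'Y' or 'T' and split it."""
    pairs = list(combinations(range(3), 2))
    deltas = [angle_difference(thetas[i], thetas[j]) for i, j in pairs]
    if pvariance(deltas) < variance_threshold:
        # Y-shaped: three roughly equal angles, every branch becomes its own group.
        return "Y", [[t] for t in thetas]
    # T-shaped: the pair with the largest angle stays together, the third branch is separate.
    i, j = pairs[deltas.index(max(deltas))]
    k = 3 - i - j
    return "T", [[thetas[i], thetas[j]], [thetas[k]]]

y_case = split_three_way_group([90.0, -30.0, -150.0])  # three branches 120 degrees apart
t_case = split_three_way_group([0.0, -180.0, 90.0])    # a straight stroke met by a third branch
```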
4.3 The second-order direction point angle feature of each inflection point group is examined. If it is greater than 135, the skeleton line segments corresponding to the two direction points of the group are merged; otherwise the relationship between the skeleton line segments is left unchanged. This step is executed iteratively until all inflection point groups have been processed, yielding the stroke skeleton, as shown in FIG. 10.
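The merging in step 4.3 amounts to joining skeleton line segments across every inflection point group whose angle exceeds 135. One way to sketch this is with a union-find structure, which is a choice made here rather than something the text prescribes; the function name and the `(segment_a, segment_b, delta)` junction tuples are likewise hypothetical.

```python
def merge_segments(num_segments, junctions, threshold=135.0):
    """Group segment ids into strokes; junctions are (segment_a, segment_b, delta)."""
    parent = list(range(num_segments))

    def find(i):
        # Follow parent links to the representative, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for seg_a, seg_b, delta in junctions:
        if delta > threshold:  # nearly straight continuation: same stroke
            parent[find(seg_a)] = find(seg_b)

    strokes = {}
    for seg in range(num_segments):
        strokes.setdefault(find(seg), []).append(seg)
    return sorted(strokes.values())

# Segments 0-1 and 2-3 continue each other; the 90 degree corner between 1 and 2 does not merge.
strokes = merge_segments(4, [(0, 1, 170.0), (1, 2, 90.0), (2, 3, 150.0)])
```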
4.4 Iedge is searched for the nearest edge parallel to the stroke skeleton, the blank area between the edge and the stroke skeleton is filled, and the corresponding skeleton intersection area is filled, yielding the split strokes, as shown in FIG. 11.
In view of the shortcomings of the prior art, the invention mainly solves two technical problems:
1. How to describe the structural information of characters more effectively.
2. How to extract the strokes of handwritten Chinese characters accurately by fusing different types of character structure information.
The image binarization, edge detection and skeletonization algorithms can be replaced by other algorithms of the same kind, selected and combined according to the characteristics of the actual input image.
The stroke extraction step can be adapted to the specific downstream task. For example, if the downstream task needs only horizontal strokes, the first-order and second-order direction point angles of the inflection point groups can be combined for screening, so that only strokes meeting the horizontal characteristic are finally retained.
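As a simplified illustration of such task-specific screening, the sketch below keeps only strokes whose direction angle is close to horizontal. It filters on a single direction angle per stroke rather than on the full set of inflection point group features, and the function names, the `(name, angle)` pairs and the 15 degree tolerance are all hypothetical.

```python
def is_horizontal(angle, tolerance=15.0):
    """True if a direction angle in degrees lies within `tolerance` of the x-axis."""
    folded = abs(angle) % 180.0  # opposite directions count as the same axis
    return min(folded, 180.0 - folded) <= tolerance

def keep_horizontal(strokes, tolerance=15.0):
    """strokes is a list of (name, direction angle) pairs; returns the names kept."""
    return [name for name, angle in strokes if is_horizontal(angle, tolerance)]

kept = keep_horizontal([("s1", 3.0), ("s2", -176.0), ("s3", 88.0), ("s4", -180.0)])
```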
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.