
CN109376739B - Marshalling mode determining method and device

Info

Publication number: CN109376739B (granted publication of application CN201811219185.XA; published as application CN109376739A)
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 罗熹之
Assignee (original and current): Beijing QIYI Century Science and Technology Co Ltd
Legal status: Active (application granted)
Prior art keywords: characters, image, pixel row, pixel, gradient

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G06V30/153: Segmentation of character regions using recognition of characters or words
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks

Abstract

The embodiment of the invention provides a grouping mode determining method and device, relating to the technical field of image recognition. The method comprises: determining an image area containing characters in an image to be recognized as a first image area; performing morphological gradient calculation on the first image area to obtain a first gradient map; inputting each pixel row of the first gradient map into a first detection model to detect the number of characters to which the pixel points in that row belong, and obtaining a detection result for each pixel row; obtaining the number of characters in the image to be recognized based on the detection results; and determining the grouping mode of the characters in the image to be recognized based on the obtained number of characters. When the scheme provided by the embodiment of the invention is applied, the grouping mode can be determined more efficiently and the manual workload of determining the grouping mode is reduced.

Description

Marshalling mode determining method and device
Technical Field
The invention relates to the technical field of image recognition, in particular to a grouping mode determining method and device.
Background
When characters such as digits and letters contained in an image are recognized, the image area containing the characters is usually determined first, the regions occupied by the individual characters are then located one by one within that area, and the characters are finally recognized one by one. To locate the individual character regions, the grouping mode of the characters in the image area is generally determined first, and the region of each character is then derived from the area based on that grouping mode. Here, the grouping mode describes how many characters are arranged consecutively in each group and how the groups are separated within the character string.
Taking a bank card image as an example, in the prior art, when identifying the card number from the image, the area where the card number is located is determined first, the grouping mode of the card number is then determined, the region of each digit of the card number is located one by one within that area based on the grouping mode, and each digit is finally recognized to obtain the card number recognition result.
In the process of implementing the invention, the inventor found that the prior art has at least the following problems:
for China UnionPay bank cards, the card number may have 16, 18, or 19 digits. Card numbers with different digit counts are grouped differently on the card face, and even card numbers with the same digit count may be grouped differently. Because the positions of the digits differ between grouping modes, the region determined for each digit of the card number also differs. In the prior art, the grouping mode of a bank card number is generally determined manually. Although this can determine the grouping mode accurately, the efficiency is low and the manual workload is large.
Disclosure of Invention
The embodiment of the invention aims to provide a grouping mode determining method and device, so as to improve the efficiency of determining the grouping mode and reduce the manual workload of determining it. The specific technical scheme is as follows:
The embodiment of the invention provides a grouping mode determining method, which comprises the following steps:
determining an image area containing characters in an image to be recognized as a first image area;
performing morphological gradient calculation on the first image area to obtain a first gradient map;
inputting each pixel row of the first gradient map into a first detection model respectively to detect the number of characters to which the pixel points in each pixel row belong, and obtaining a detection result corresponding to each pixel row, wherein the first detection model is: a neural network model for detecting the number of characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a first sample gradient map and the labeled number of characters to which the pixel points in each pixel row belong, and the first sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a first sample image;
obtaining the number of characters in the image to be recognized based on the obtained detection result;
and determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
In an implementation manner of the present invention, the detection result corresponding to each pixel row includes: for each candidate character count, the probability that the number of characters to which the pixel points in the pixel row belong equals that count;
the obtaining the number of characters in the image to be recognized based on the obtained detection result comprises:
for each candidate character count, calculating the sum of the probabilities, over all pixel rows of the first gradient map, that the number of characters to which the pixel points in the row belong equals that count;
and determining the character count corresponding to the maximum sum as the number of characters in the image to be recognized.
In an implementation manner of the present invention, the first detection model is obtained by training in the following manner:
acquiring a first sample image;
performing morphological gradient calculation on the first sample image to obtain a first sample gradient map;
obtaining the labeled number of characters to which the pixel points in each pixel row of the first sample gradient map belong, according to the number of characters to which the pixel points of the corresponding pixel row in the first sample image belong;
and training a preset neural network model with each pixel row in the first sample gradient map and the labeled number corresponding to each pixel row, to obtain a neural network model for detecting the number of characters to which the pixel points of a pixel row belong, as the first detection model.
In an implementation manner of the present invention, determining the grouping mode of the characters in the image to be recognized based on the obtained number of characters includes:
determining, corresponding to the obtained number of characters, a second detection model used for detecting the character grouping mode in the image, wherein the second detection model is: a neural network model for detecting the grouping mode of the characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong, and the second sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a second sample image;
inputting each pixel row of the first gradient map into the second detection model respectively to detect the grouping mode of the characters to which the pixel points in each pixel row belong, and obtaining, for each preset grouping mode, the probability that the grouping mode of the characters to which the pixel points in each pixel row belong is that grouping mode;
for each preset grouping mode, calculating the sum of the probabilities, over all pixel rows of the first gradient map, that the grouping mode of the characters to which the pixel points in the row belong is that grouping mode;
and determining the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
In an implementation manner of the present invention, the second detection model is obtained by training in the following manner:
acquiring a second sample image;
performing morphological gradient calculation on the second sample image to obtain a second sample gradient map;
obtaining the labeled grouping mode of the characters to which the pixel points in each pixel row of the second sample gradient map belong, according to the grouping mode of the characters to which the pixel points of the corresponding pixel row in the second sample image belong;
and training a preset neural network model with each pixel row in the second sample gradient map and the labeled grouping mode corresponding to each pixel row, to obtain a neural network model for detecting the grouping mode of the characters to which the pixel points of a pixel row belong, as the second detection model.
An embodiment of the present invention further provides a grouping method determining apparatus, including:
the area determining module is used for determining an image area containing characters in the image to be recognized as a first image area;
the image obtaining module is used for carrying out morphological gradient calculation on the first image area to obtain a first gradient image;
a result obtaining module, configured to input each pixel row of the first gradient map into a first detection model respectively to detect the number of characters to which the pixel points in each pixel row belong, and obtain a detection result corresponding to each pixel row, where the first detection model is: a neural network model for detecting the number of characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a first sample gradient map and the labeled number of characters to which the pixel points in each pixel row belong, and the first sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a first sample image;
the quantity obtaining module is used for obtaining the quantity of the characters in the image to be recognized based on the obtained detection result;
and the mode determining module is used for determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
In an implementation manner of the present invention, the detection result corresponding to each pixel row includes: for each candidate character count, the probability that the number of characters to which the pixel points in the pixel row belong equals that count;
the quantity obtaining module comprises:
the first sum-value calculation sub-module is used for calculating, for each candidate character count, the sum of the probabilities, over all pixel rows of the first gradient map, that the number of characters to which the pixel points in the row belong equals that count;
and the quantity determining sub-module is used for determining the character count corresponding to the maximum sum as the number of characters in the image to be recognized.
In an implementation manner of the present invention, the result obtaining module includes the following sub-modules, configured to train to obtain the first detection model:
the image acquisition sub-module is used for acquiring a first sample image;
the gradient map obtaining sub-module is used for carrying out morphological gradient calculation on the first sample image to obtain a first sample gradient map;
the quantity obtaining submodule is used for obtaining the labeling quantity of the characters to which the pixel points in each pixel row in the first sample gradient image belong according to the quantity of the characters to which the pixel points in the pixel row corresponding to the first sample image belong in each pixel row in the first sample gradient image;
and the model obtaining submodule is used for training a preset neural network model by adopting each pixel row in the first sample gradient map and the corresponding labeled quantity of each pixel row to obtain the neural network model for detecting the quantity of characters to which the pixel points of each pixel row in the image belong, and the neural network model is used as the first detection model.
In an implementation manner of the present invention, the manner determining module includes:
a model determining sub-module, configured to determine, corresponding to the obtained number of characters, a second detection model used for detecting the character grouping mode in the image, where the second detection model is: a neural network model for detecting the grouping mode of the characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong, and the second sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a second sample image;
the probability obtaining sub-module is used for inputting each pixel row of the first gradient map into the second detection model respectively to detect the grouping mode of the characters to which the pixel points in each pixel row belong, and obtaining, for each preset grouping mode, the probability that the grouping mode of the characters to which the pixel points in each pixel row belong is that grouping mode;
the second sum-value calculation sub-module is used for calculating, for each preset grouping mode, the sum of the probabilities, over all pixel rows of the first gradient map, that the grouping mode of the characters to which the pixel points in the row belong is that grouping mode;
and the mode determining sub-module is used for determining the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
In an implementation manner of the present invention, the probability obtaining sub-module includes the following units, which are used to train and obtain the second detection model:
an image acquisition unit for acquiring a second sample image;
the gradient map obtaining unit is used for performing morphological gradient calculation on the second sample image to obtain a second sample gradient map;
a mode obtaining unit, configured to obtain, according to a grouping mode of characters to which pixel points belong in a pixel row corresponding to the second sample image for each pixel row in the second sample gradient image, a labeling grouping mode of characters to which pixel points belong in each pixel row in the second sample gradient image;
and the model obtaining unit is used for training a preset neural network model by adopting each pixel row in the second sample gradient image and the corresponding labeling grouping mode of each pixel row to obtain the neural network model for detecting the grouping mode of the characters to which the pixel points of each pixel row in the image belong, and the neural network model is used as the second detection model.
The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
a memory for storing a computer program;
and a processor, configured to implement the steps of any of the above grouping mode determining methods when executing the program stored in the memory.
In yet another aspect of the present invention, the present invention further provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute the steps of any one of the above grouping manner determining methods.
In another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any one of the above-mentioned grouping manner determination methods.
The grouping mode determining method and device provided by the embodiment of the invention can first perform morphological gradient calculation on the image area of the image to be recognized that contains characters, input the pixel rows of the resulting gradient map into a neural network model obtained by pre-training, obtain the number of characters in the image to be recognized based on the output of the model, and then determine the grouping mode of the characters based on that number. In the scheme provided by the embodiment of the invention, the grouping mode of the characters is therefore not determined manually but recognized by a device, which improves the efficiency of determining the grouping mode and reduces the manual workload.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flow chart of a grouping manner determining method according to an embodiment of the present invention;
fig. 2 is another schematic flow chart of a grouping manner determining method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a first detection model training method according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a second detection model training method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a grouping manner determining apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
The embodiment of the invention provides a grouping mode determining method and a grouping mode determining device, and concepts related to the embodiment of the invention are explained first.
Morphological gradient calculation refers to processing an image to obtain a difference image as follows:
the image is subjected to the morphological operations of dilation and erosion respectively, and the eroded image is then subtracted from the dilated image to obtain the difference image.
For the dilation and erosion operations, a 3 × 3 kernel may be selected as the structuring element.
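As a concrete illustration of the calculation described above, the following minimal Python/OpenCV sketch computes a morphological gradient with a 3 × 3 kernel; the function name and the choice of OpenCV are illustrative, not part of the patent.

```python
import cv2
import numpy as np

def morphological_gradient(gray: np.ndarray) -> np.ndarray:
    """Dilate and erode the grayscale image with a 3x3 kernel, then subtract
    the eroded image from the dilated image to obtain the difference image."""
    kernel = np.ones((3, 3), np.uint8)      # 3x3 structuring element
    dilated = cv2.dilate(gray, kernel)      # morphological dilation
    eroded = cv2.erode(gray, kernel)        # morphological erosion
    return cv2.subtract(dilated, eroded)    # gradient map (difference image)
```

OpenCV's `cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, kernel)` computes the same difference in a single call.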
A neural network model: a complex network system formed by a large number of widely interconnected simple processing units.
Grouping mode: the number of characters arranged consecutively in each group, and the way in which the non-consecutive groups of characters are separated.
Taking a bank card number as an example, if the card number contains 16 characters, the grouping mode may be 4-4-4-4: every 4 characters are arranged consecutively, and the consecutive character strings are separated by the width of one character, for example: 6200 0000 0000 0000. If the card number contains 19 characters, the grouping mode may be 6-13: a group of 6 characters is followed by a group of 13 characters, with the two groups separated by the width of one character, for example: 620000 0000000000000.
The grouping mode determining method provided by the embodiment of the present invention is described in detail below with reference to specific embodiments.
Referring to fig. 1, fig. 1 is a flowchart of a grouping manner determining method according to an embodiment of the present invention, including the following steps:
step S101, determining an image area containing characters in the image to be recognized as a first image area.
The image area containing the characters includes the character portion and the character-free portion in the same line as the characters. The area may be determined with a binarization-based horizontal projection algorithm, whose steps may include: binarizing the bank card image to obtain a black-and-white binary image; counting, for each pixel row of the binary image, the distribution of white and black pixel points; and determining the image area from the statistics.
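A rough sketch of the binarization-based horizontal projection just described; Otsu thresholding and the 5% white-pixel criterion are illustrative assumptions rather than values given in the patent.

```python
import cv2
import numpy as np

def find_character_region(gray_card: np.ndarray, min_white_ratio: float = 0.05):
    """Binarize the card image, count white pixels per pixel row, and return the
    top and bottom row indices of the band whose white-pixel ratio is high enough."""
    _, binary = cv2.threshold(gray_card, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    white_per_row = (binary == 255).sum(axis=1)        # horizontal projection
    candidate_rows = np.flatnonzero(white_per_row > min_white_ratio * binary.shape[1])
    if candidate_rows.size == 0:
        return None                                    # no character-like rows found
    return int(candidate_rows[0]), int(candidate_rows[-1])
```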
If the entire image to be recognized is an image area containing characters, the whole image to be recognized is the first image area. Of course, the image area containing characters may also be only a part of the image to be recognized, in which case the first image area is a partial area of the image to be recognized.
For example, the image to be recognized may be a bank card image, and the image area containing the characters may be a card number area of the bank card in the image.
And S102, performing morphological gradient calculation on the first image area to obtain a first gradient map.
The image to be recognized can be a grayscale image or a color image. If it is a grayscale image, morphological gradient calculation can be performed directly on the first image area to obtain the first gradient map; if it is a color image, a grayscale image of the first image area can be obtained first, and morphological gradient calculation is then performed on that grayscale image to obtain the first gradient map.
The embodiment of the present invention is described only by way of example of obtaining the first gradient map, and the present invention is not limited thereto.
Step S103, inputting each pixel row of the first gradient map into the first detection model to detect the number of characters to which the pixel point in each pixel row belongs, and obtaining a detection result corresponding to each pixel row.
The first detection model is: and the neural network model is obtained by training a preset neural network model in advance by using the labeled number of characters to which the pixel points belong in each pixel line and each pixel line in the first sample gradient map and is used for detecting the number of characters to which the pixel points belong in the pixel line.
The first sample gradient plot is: and performing morphological gradient calculation on the first sample image to obtain a gradient map.
Since the first gradient map is obtained by performing morphological gradient calculation on the first image area, the pixels in the first gradient map correspond to the pixels in the first image area. Because the first image area contains characters formed by pixel points, the character to which a pixel point in a pixel row of the first gradient map belongs is the character to which the corresponding pixel point in the first image area belongs.
The pixel rows input to the first detection model may consist of a first preset number of pixel points, where the first preset number may be, for example, 240 or 300. If the pixel rows of the first gradient map contain more pixel points than the first preset number, the first gradient map may be scaled down so that its width equals the first preset number of pixel points; if they contain fewer pixel points than the first preset number, the pixel rows may be padded with pixel points whose pixel values represent gradient values smaller than a preset threshold. For example, if the colors from white to black in the first gradient map represent gradient values from large to small, pixel rows narrower than the first preset number may be padded with black pixel points up to the first preset number of pixel points.
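The width normalization described above could look like the following sketch; the 240-pixel target width is one of the example values in the text, and padding with zeros assumes black pixels represent the smallest gradient values.

```python
import cv2
import numpy as np

def normalize_width(gradient_map: np.ndarray, target_width: int = 240) -> np.ndarray:
    """Shrink the gradient map if it is wider than the target width; otherwise pad
    every pixel row on the right with black (zero-gradient) pixel points."""
    height, width = gradient_map.shape[:2]
    if width > target_width:
        return cv2.resize(gradient_map, (target_width, height))
    if width < target_width:
        pad = np.zeros((height, target_width - width), dtype=gradient_map.dtype)
        return np.hstack([gradient_map, pad])
    return gradient_map
```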
The detection result may be a specific character count, for example: "18"; it may be a second preset number of candidate character counts together with the probability of each count, for example: "17:0.10, 18:0.70, 19:0.20"; or it may be the probabilities that the character count equals one or more preset counts, for example: "16:0.05, 18:0.75, 19:0.20".
And step S104, acquiring the number of characters in the image to be recognized based on the obtained detection result.
If the detection result is a specific character count, the character count corresponding to each pixel row is obtained directly from the detection result, and the counts from the detection results of the pixel rows are combined into the number of characters in the image to be recognized; for example, the character count that occurs most frequently among the detection results may be taken as the number of characters in the image to be recognized.
If the detection result is a second preset number of candidate character counts and the probability of each, the detection results can be aggregated: for each candidate count, the probabilities of that count over all detection results are summed, and the count with the largest summed probability is taken as the number of characters in the image to be recognized.
If the detection result is the probabilities of one or more preset character counts, the probabilities corresponding to each preset count can likewise be summed over all detection results, and the count with the largest sum is taken as the number of characters in the image to be recognized.
And step S105, determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
In some application scenarios, the grouping manner of the characters is fixed, so that for an image of such an application scenario, after the number of characters is determined, the grouping manner of the characters in the image can be determined according to the setting of the characters contained in the image.
For example: the image to be recognized is an image of a China UnionPay bank card, and the image area containing characters is the card number area. When the card number is determined to contain 16 digits, the grouping mode can be determined directly as 4-4-4-4 according to the UnionPay card number rules: every 4 digits are arranged consecutively, and the consecutive digit strings are separated by the width of one digit. When the card number contains 18 digits, the grouping mode is 6-6-6 according to the same rules: every 6 digits are arranged consecutively, and the consecutive digit strings are separated by the width of one digit.
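For the fixed-grouping scenarios above, a simple lookup from character count to grouping mode is enough; the dictionary below is an illustrative sketch, with 19-digit card numbers deliberately left to the second detection model described later.

```python
# Grouping modes fully determined by the digit count (UnionPay example).
FIXED_GROUPINGS = {16: "4-4-4-4", 18: "6-6-6"}

def grouping_from_count(char_count: int):
    """Return the grouping mode if the count fixes it, else None (use the second model)."""
    return FIXED_GROUPINGS.get(char_count)
```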
For the case that there are multiple grouping modes with the same number of characters, after the number of characters is determined, it is further necessary to determine the grouping mode based on the number of characters, and how to determine the grouping mode based on the number of characters is described in detail in the following embodiments, which is not repeated herein.
The grouping manner determining method provided in this embodiment may perform morphological gradient calculation on an image region including characters of an image to be recognized, input pixel rows of the obtained image into a neural network model obtained through pre-training, obtain the number of characters of the characters in the image to be recognized based on the output of the neural network model, and then determine the grouping manner of the characters based on the number of the characters. Therefore, in the scheme provided by the embodiment, the grouping mode of the characters is not determined manually, but is identified by using equipment, so that the determination efficiency of the grouping mode can be improved, and the workload of workers in determining the grouping mode is reduced.
In an implementation manner of the present invention, the detection result corresponding to each pixel row may include, for each candidate character count, the probability that the number of characters to which the pixel points in the row belong equals that count. In that case, when the number of characters in the image to be recognized is obtained based on the detection results in step S104, the sum of these probabilities over all pixel rows of the first gradient map may be calculated for each candidate count, and the count corresponding to the maximum sum is then determined as the number of characters in the image to be recognized.
The probability may take a value between 0 and 1, and the candidate character counts may be preset or produced by the detection of the first detection model.
If two or more different character counts share the maximum sum, either the smallest or the largest of them may be determined as the number of characters in the image to be recognized; alternatively, the probabilities that contribute to the tied sums may be compared, and the count corresponding to the largest individual probability is determined as the number of characters in the image to be recognized.
In this implementation manner, the probabilities of each character count are summed over the pixel rows of the first gradient map, and the count with the largest sum is determined as the number of characters. Determining the number of characters in this way integrates the detection results of multiple pixel rows into a result for the whole first gradient map, while requiring little computation.
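The summation described in this implementation can be sketched as follows; the array shapes and candidate counts are illustrative.

```python
import numpy as np

def count_characters(row_probs: np.ndarray, candidate_counts) -> int:
    """row_probs[i, j] is the probability that the pixel points of row i belong to
    candidate_counts[j] characters; sum over rows and take the count with the
    largest summed probability."""
    summed = row_probs.sum(axis=0)
    return candidate_counts[int(np.argmax(summed))]

# Example with three pixel rows and candidate counts 16, 18 and 19:
rows = np.array([[0.05, 0.75, 0.20],
                 [0.10, 0.70, 0.20],
                 [0.05, 0.80, 0.15]])
print(count_characters(rows, [16, 18, 19]))   # -> 18
```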
Fig. 2 is another schematic flow chart of the grouping manner determining method according to the embodiment of the present invention, which may specifically include the following steps:
step S201, determining an image area containing characters in the image to be recognized as a first image area.
Step S202, performing morphological gradient calculation on the first image area to obtain a first gradient map.
Step S203, inputting each pixel row of the first gradient map into the first detection model to detect the number of characters to which the pixel point in each pixel row belongs, and obtaining a detection result corresponding to each pixel row.
And step S204, acquiring the number of characters in the image to be recognized based on the obtained detection result.
The steps S201 to S204 are the same as S101 to S104 in the embodiment shown in fig. 1, and are not described again here.
Step S205, determining a second detection model for detecting the character grouping manner in the image corresponding to the obtained number of characters.
The second detection model is: a neural network model for detecting the grouping mode of the characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong.
The second sample gradient map is: and performing morphological gradient calculation on the second sample image to obtain a gradient map.
Different character counts may correspond to different second detection models, and each second detection model may be trained with sample images whose number of characters equals the specific preset count.
Step S206, inputting each pixel row of the first gradient map into the second detection model to detect the grouping manner of the characters to which the pixel points belong in each pixel row, and obtaining the probability that the grouping manner of the characters to which the pixel points belong in each pixel row is the preset grouping manner.
The pixel rows input to the second detection model may likewise consist of a first preset number of pixel points, where the first preset number may be, for example, 240 or 300. If the pixel rows of the first gradient map contain more pixel points than the first preset number, the first gradient map may be scaled down so that its width equals the first preset number of pixel points; if they contain fewer pixel points than the first preset number, the pixel rows may be padded with pixel points whose pixel values represent gradient values smaller than a preset threshold. For example, if the colors from white to black in the first gradient map represent gradient values from large to small, pixel rows narrower than the first preset number may be padded with black pixel points up to the first preset number of pixel points.
The value of the probability may be a value between 0 and 1.
Step S207, calculating, for each grouping manner, a probability and a value that the grouping manner of the character to which the pixel point belongs in each pixel row of the first gradient map is a preset grouping manner.
And step S208, determining the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
If two or more different grouping modes share the maximum sum, the probabilities that contribute to the tied sums may be compared, and the grouping mode corresponding to the largest individual probability is determined as the grouping mode of the characters in the image to be recognized.
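Steps S205 to S208 can be sketched in the same way; the `predict` interface of the second detection model and the list of preset grouping modes are assumptions made only for illustration.

```python
import numpy as np

PRESET_MODES_19 = ["19", "6-13", "4-4-4-7", "4-4-4-4-3"]   # preset grouping modes

def pick_grouping_mode(second_model, gradient_rows: np.ndarray) -> str:
    """gradient_rows: (num_rows, row_width). The assumed second_model.predict returns,
    per row, one probability for each preset grouping mode; the mode whose probabilities
    sum to the largest value over all rows is returned."""
    per_row_probs = second_model.predict(gradient_rows)     # (num_rows, num_modes)
    summed = per_row_probs.sum(axis=0)
    return PRESET_MODES_19[int(np.argmax(summed))]
```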
In the scheme provided by this embodiment, the second detection model corresponding to the character count is determined first, the pixel rows of the first gradient map are then input into this pre-trained neural network model to obtain, for each row, the probability of each preset grouping mode, the probabilities are summed per grouping mode, and the preset grouping mode corresponding to the maximum sum is determined as the grouping mode. The first gradient map, whose character count has already been determined, is thus detected by a neural network model trained on a large number of samples. Because the training samples reflect the differences between the preset grouping modes, the model can effectively distinguish them, handle the case where the same character count admits different grouping modes, and determine the grouping mode accurately.
Fig. 3 is a schematic flow chart of a first detection model training method according to an embodiment of the present invention, where the first detection model is applied to obtain a detection result related to the number of characters in an image to be recognized, and the training method specifically includes the following steps:
step S301, a first sample image is obtained.
The first sample image may be a grayscale image or a color image. The first sample image is an image containing characters.
And step S302, performing morphological gradient calculation on the first sample image to obtain a first sample gradient map.
The first sample image may be processed to obtain the first sample gradient map in the same manner as the step S102 of processing the image to be recognized to obtain the first gradient map.
Step S303, obtaining the labeled number of the characters to which the pixel points in each pixel row in the first sample gradient image belong according to the number of the characters to which the pixel points in the pixel row corresponding to the first sample image belong in each pixel row in the first sample gradient image.
In the first sample gradient map, the labeled number for every pixel row of a gradient map equals the number of characters in the first sample image from which that gradient map was obtained by morphological gradient calculation. For example, if the first sample image is an image of a bank card number area, the labeled number is 16 for a first sample image containing a 16-digit card number, and 18 for a first sample image containing an 18-digit card number.
Step S304, training a preset neural network model by adopting each pixel row in the first sample gradient map and the labeled quantity corresponding to each pixel row to obtain the neural network model for detecting the character quantity of the pixel point of each pixel row in the image, and taking the neural network model as a first detection model.
When the preset neural network model is trained with the pixel rows of the first sample gradient map and their labeled numbers, each pixel row of the first sample gradient map may be input into the preset neural network model, which detects the number of characters to which the pixel points of each row belong; the detected number is then compared with the labeled number of that row, and the model parameters are adjusted according to the comparison, so that the numbers detected by the adjusted model approach the labeled numbers of the rows.
In an implementation manner of the present invention, the preset neural network model may be a model constructed with CAFFE (Convolutional Architecture for Fast Feature Embedding).
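The patent constructs its models with CAFFE; the sketch below uses PyTorch only as a stand-in to make the row-classifier training loop concrete. The 1-D convolutional architecture, the 240-pixel row width and all hyperparameters are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class RowCountClassifier(nn.Module):
    """Assumed stand-in model: maps one 240-pixel gradient row to logits over
    the candidate character counts (e.g. 16, 18, 19)."""
    def __init__(self, num_counts: int = 3, row_width: int = 240):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Linear(32 * (row_width // 4), num_counts)

    def forward(self, x):                     # x: (batch, 1, row_width)
        return self.classifier(self.features(x).flatten(1))

def train_first_model(model, loader, epochs: int = 10):
    """loader yields (rows, labels): float rows of shape (batch, 1, 240) and integer
    labels holding the index of the annotated character count for each row."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for rows, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(rows), labels)   # compare prediction with labeled count
            loss.backward()
            optimizer.step()
    return model
```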
In order to detect the number of characters to which pixel points belong in pixel rows of the first gradient map, the embodiment of the invention provides a training method of a neural network model, which can input the pixel rows of the first gradient map into the trained model to obtain a detection result corresponding to each pixel row. The method has the advantages that the neural network model trained by the sample image marked with the number of characters is used for detection, and the number of characters to which pixel points belong in pixel rows of the first gradient image can be effectively identified.
Fig. 4 is a schematic flow chart of a second detection model training method according to an embodiment of the present invention, and the second detection model is applied to obtain a probability that a grouping manner of characters to which pixel points belong in each pixel row of the first gradient map is a preset grouping manner, where the training method specifically includes the following steps:
and step S401, acquiring a second sample image.
The second sample image may be a grayscale image or a color image. The second sample image contains characters, and the number of characters contained in each second sample image is fixed. Since the second detection model corresponds to a specific character count, the number of characters contained in each second sample image is the count corresponding to that second detection model. For example: when recognizing bank card numbers, in order to obtain a second detection model for detecting the grouping mode of 19-digit card numbers, images of bank card number areas are selected as second sample images, and each of them contains a 19-digit card number.
And S402, performing morphological gradient calculation on the second sample image to obtain a second sample gradient image.
The second sample image may be processed to obtain the second sample gradient map in the same manner as the step S102 of processing the image to be recognized to obtain the first gradient map.
Step S403, obtaining a labeling grouping manner of the character to which the pixel point in each pixel line in the second sample gradient image belongs according to the grouping manner of the character to which the pixel point in the pixel line corresponding to the second sample image belongs in each pixel line in the second sample gradient image.
In the second sample gradient map, the labeled grouping mode for the pixel rows of a gradient map is the grouping mode of the characters in the second sample image from which that gradient map was obtained by morphological gradient calculation. For example: the second sample images are images of bank card number areas, and each contains a 19-digit card number. Then, for second sample images in which all 19 digits are arranged consecutively, the grouping mode label may be recorded as 0; for second sample images grouped as 6-13, the label may be recorded as 1; for second sample images grouped as 4-4-4-7, the label may be recorded as 2; and for second sample images grouped as 4-4-4-4-3, the label may be recorded as 3. In the training process, the numbers 0, 1, 2 and 3 can be used as the grouping mode labels.
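A minimal sketch of the label assignment just described for 19-digit card numbers; the string keys naming the grouping modes are illustrative.

```python
# Grouping-mode labels used when training the second detection model for
# 19-digit card numbers, as enumerated above.
GROUPING_LABELS_19 = {
    "19":        0,   # all 19 digits arranged consecutively
    "6-13":      1,
    "4-4-4-7":   2,
    "4-4-4-4-3": 3,
}
```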
Step S404, training a preset neural network model by using each pixel row in the second sample gradient map and the labeling grouping manner corresponding to each pixel row, to obtain a neural network model for detecting the grouping manner of the character to which the pixel point of each pixel row in the image belongs, and using the neural network model as a second detection model.
In an implementation manner of the present invention, the preset neural network model may be a model constructed by using CAFFE.
In order to detect a grouping mode of characters to which pixel points belong in pixel rows of a first gradient map, an embodiment of the invention provides a training method of a neural network model, which can input the pixel rows of the first gradient map into a trained model to obtain a probability that the grouping mode of the characters to which the pixel points belong in each pixel row is a preset grouping mode. The neural network model trained by the sample image marked with the grouping mode is used for detection, so that different grouping modes of characters to which pixel points belong in pixel rows of the first gradient map can be effectively distinguished.
Based on the same inventive concept, corresponding to the grouping mode determining method provided in the above embodiments of the present invention, an embodiment of the present invention further provides a grouping mode determining apparatus, whose structure is schematically shown in fig. 5 and which specifically includes:
the region determining module 501 is configured to determine an image region including characters in the image to be recognized as a first image region;
a map obtaining module 502, configured to perform morphological gradient calculation on the first image region to obtain a first gradient map;
a result obtaining module 503, configured to input each pixel row of the first gradient map into a first detection model respectively to detect the number of characters to which the pixel points in each pixel row belong, and obtain a detection result corresponding to each pixel row, where the first detection model is: a neural network model for detecting the number of characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a first sample gradient map and the labeled number of characters to which the pixel points in each pixel row belong, and the first sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a first sample image;
a number obtaining module 504, configured to obtain, based on the obtained detection result, a number of characters in the image to be recognized;
a mode determining module 505, configured to determine a grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
The grouping mode determining apparatus provided in this embodiment may perform morphological gradient calculation on an image region including characters of an image to be recognized, input pixel rows of the obtained image into a neural network model obtained through pre-training, obtain the number of characters of the characters in the image to be recognized based on the output of the neural network model, and then determine the grouping mode of the characters based on the number of the characters. Therefore, in the scheme provided by the embodiment, the grouping mode of the characters is not determined manually, but is identified by using equipment, so that the determination efficiency of the grouping mode can be improved, and the workload of workers in determining the grouping mode is reduced.
In an implementation manner of the present invention, the detection result corresponding to each pixel row includes:
for each candidate character count, the probability that the number of characters to which the pixel points in the pixel row belong equals that count;
the quantity obtaining module 504 includes:
the first sum-value calculation sub-module is used for calculating, for each candidate character count, the sum of the probabilities, over all pixel rows of the first gradient map, that the number of characters to which the pixel points in the row belong equals that count;
and the quantity determining sub-module is used for determining the character count corresponding to the maximum sum as the number of characters in the image to be recognized.
In this implementation manner, the probabilities of each character count are summed over the pixel rows of the first gradient map, and the count with the largest sum is determined as the number of characters. Determining the number of characters in this way integrates the detection results of multiple pixel rows into a result for the whole first gradient map in a simple and clear manner, while requiring little computation.
In an implementation manner of the present invention, the result obtaining module 503 includes the following sub-modules, configured to train and obtain the first detection model:
the image acquisition sub-module is used for acquiring a first sample image;
the gradient map obtaining sub-module is used for carrying out morphological gradient calculation on the first sample image to obtain a first sample gradient map;
the quantity obtaining submodule is used for obtaining the labeling quantity of the characters to which the pixel points in each pixel row in the first sample gradient image belong according to the quantity of the characters to which the pixel points in the pixel row corresponding to the first sample image belong in each pixel row in the first sample gradient image;
and the model obtaining submodule is used for training a preset neural network model by adopting each pixel row in the first sample gradient map and the corresponding labeled quantity of each pixel row to obtain the neural network model for detecting the quantity of characters to which the pixel points of each pixel row in the image belong, and the neural network model is used as the first detection model.
In order to detect the number of characters to which pixel points belong in pixel rows of the first gradient map, the embodiment of the invention provides a training method of a neural network model, which can input the pixel rows of the first gradient map into the trained model to obtain a detection result corresponding to each pixel row. The method has the advantages that the neural network model trained by the sample image marked with the number of characters is used for detection, and the number of characters to which pixel points belong in pixel rows of the first gradient image can be effectively identified.
In an implementation manner of the present invention, the manner determining module 505 includes:
a model determining sub-module, configured to determine, corresponding to the obtained number of characters, a second detection model used for detecting the character grouping mode in the image, where the second detection model is: a neural network model for detecting the grouping mode of the characters to which the pixel points in a pixel row belong, obtained by training a preset neural network model in advance with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong, and the second sample gradient map is: a gradient map obtained by performing morphological gradient calculation on a second sample image;
the probability obtaining sub-module is used for inputting each pixel row of the first gradient map into the second detection model respectively to detect the grouping mode of the characters to which the pixel points in each pixel row belong, and obtaining, for each preset grouping mode, the probability that the grouping mode of the characters to which the pixel points in each pixel row belong is that grouping mode;
the second sum-value calculation sub-module is used for calculating, for each preset grouping mode, the sum of the probabilities, over all pixel rows of the first gradient map, that the grouping mode of the characters to which the pixel points in the row belong is that grouping mode;
and the mode determining sub-module is used for determining the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
In this implementation manner, the second detection model corresponding to the character count is determined first, the pixel rows of the first gradient map are then input into this pre-trained neural network model to obtain, for each row, the probability of each preset grouping mode, the probabilities are summed per grouping mode, and the preset grouping mode corresponding to the maximum sum is determined as the grouping mode. The first gradient map, whose character count has already been determined, is thus detected by a neural network model trained on a large number of samples. Because the training samples reflect the differences between the preset grouping modes, the model can effectively distinguish them, handle the case where the same character count admits different grouping modes, and determine the grouping mode accurately.
In an implementation manner of the present invention, the probability obtaining sub-module includes the following units, which are used to train and obtain the second detection model:
an image acquisition unit for acquiring a second sample image;
the gradient map obtaining unit is used for performing morphological gradient calculation on the second sample image to obtain a second sample gradient map;
a mode obtaining unit, configured to obtain, for each pixel row in the second sample gradient map, the labeled grouping mode of the characters to which its pixel points belong, according to the grouping mode of the characters to which the pixel points in the corresponding pixel row of the second sample image belong;
and a model obtaining unit, configured to train a preset neural network model with each pixel row in the second sample gradient map and the labeled grouping mode corresponding to each pixel row, to obtain a neural network model for detecting the grouping mode of the characters to which the pixel points of a pixel row in an image belong, as the second detection model.
To detect the grouping mode of the characters to which the pixel points in each pixel row of the first gradient map belong, an embodiment of the invention provides a method for training a neural network model; the pixel rows of the first gradient map can then be input into the trained model to obtain, for each pixel row, the probability that the grouping mode of the characters to which its pixel points belong is each preset grouping mode. Because detection is performed with a neural network model trained on sample images labeled with grouping modes, the different grouping modes of the characters to which the pixel points in the pixel rows of the first gradient map belong can be distinguished effectively.
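The labeling scheme described by these units can be sketched as follows, under assumed names: every pixel row of a sample gradient map inherits the sample-level grouping-mode label, and a small off-the-shelf classifier is fitted on the labeled rows. scikit-learn's MLPClassifier is used here only as a stand-in for the preset neural network model; any row-level classifier with a probability output would serve the same role.

```python
import cv2
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_row_dataset(samples, width=256):
    """samples: iterable of (grayscale sample image, grouping-mode label)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    rows, labels = [], []
    for image, mode_label in samples:
        grad = cv2.morphologyEx(image, cv2.MORPH_GRADIENT, kernel)  # second sample gradient map
        grad = cv2.resize(grad, (width, grad.shape[0]))
        for row in grad.astype(np.float32) / 255.0:                 # every pixel row inherits
            rows.append(row)                                        # the sample-level label
            labels.append(mode_label)
    return np.array(rows), np.array(labels)

def train_second_detection_model(samples):
    x, y = build_row_dataset(samples)
    model = MLPClassifier(hidden_layer_sizes=(128,), max_iter=200)  # stand-in row classifier
    model.fit(x, y)
    return model  # model.predict_proba(rows) yields per-row grouping-mode probabilities
```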
Based on the same inventive concept as the grouping mode determining method provided by the above embodiments of the present invention, an embodiment of the present invention further provides an electronic device. As shown in fig. 6, the electronic device includes a processor 601, a communication interface 602, a memory 603, and a communication bus 604, wherein the processor 601, the communication interface 602, and the memory 603 communicate with one another through the communication bus 604;
a memory 603 for storing a computer program;
and the processor 601 is configured to implement the steps of any of the grouping mode determining methods in the above embodiments when executing the program stored in the memory 603.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The electronic device for determining grouping modes provided in this embodiment may perform morphological gradient calculation on an image region of the image to be recognized that contains characters, input the pixel rows of the obtained gradient map into a neural network model obtained through pre-training, obtain the number of characters in the image to be recognized based on the output of the neural network model, and then determine the grouping mode of the characters based on that number. In the scheme provided by this embodiment, the grouping mode of the characters is therefore not determined manually but recognized by a device, which improves the determination efficiency of the grouping mode and reduces the workload of workers in determining the grouping mode.
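Tying both stages together, the sketch below shows one possible end-to-end flow on such a device: compute the gradient map, score its pixel rows for character counts, pick the count with the maximum probability sum, then pick the grouping mode with the count-specific model in the same way. The model objects, their predict interfaces, and the parameter names are assumptions rather than the patented implementation.

```python
import cv2
import numpy as np

def determine_grouping(region_bgr, first_model, models_by_count, counts, mode_names, width=256):
    """One possible end-to-end flow: gradient map -> character count -> grouping mode."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    gradient = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, kernel)
    rows = cv2.resize(gradient, (width, gradient.shape[0])).astype(np.float32) / 255.0
    count_probs = first_model.predict(rows)                       # per-row count probabilities
    char_count = counts[int(np.argmax(count_probs.sum(axis=0)))]  # count with the maximum sum
    mode_probs = models_by_count[char_count].predict(rows)        # per-row grouping probabilities
    mode = mode_names[int(np.argmax(mode_probs.sum(axis=0)))]     # mode with the maximum sum
    return char_count, mode
```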
In another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to perform the steps of any of the above grouping mode determining methods.
In a further embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the above grouping mode determining methods.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) connection. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in this specification are described in a related manner; the same and similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, for the apparatus, electronic device, computer-readable storage medium, and computer program product embodiments, the description is relatively brief because they are substantially similar to the method embodiments; for relevant points, reference may be made to the corresponding parts of the description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (11)

1. A grouping mode determining method, the method comprising:
determining an image area containing characters in an image to be recognized as a first image area;
performing morphological gradient calculation on the first image area to obtain a first gradient map;
inputting each pixel row of the first gradient map into a first detection model to detect the number of characters to which the pixel points in each pixel row belong, and obtaining a detection result corresponding to each pixel row, wherein the first detection model is a neural network model obtained by pre-training a preset neural network model with each pixel row in a first sample gradient map and the labeled number of characters to which the pixel points in each pixel row belong, and is used for detecting the number of characters to which the pixel points in a pixel row belong; and the first sample gradient map is a gradient map obtained by performing morphological gradient calculation on a first sample image;
obtaining the number of characters in the image to be recognized based on the obtained detection result;
and determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
2. The method of claim 1, wherein the detection result corresponding to each pixel row comprises: for each candidate number of characters, a probability that the number of characters to which the pixel points in the pixel row belong is that number;
the obtaining the number of characters in the image to be recognized based on the obtained detection results comprises:
for each candidate number of characters, calculating the sum of the probabilities that the number of characters to which the pixel points in each pixel row of the first gradient map belong is that number;
and determining the number of the characters corresponding to the maximum sum as the number of the characters in the image to be recognized.
3. The method of claim 1, wherein the first detection model is trained by:
acquiring a first sample image;
performing morphological gradient calculation on the first sample image to obtain a first sample gradient map;
obtaining, for each pixel row in the first sample gradient map, the labeled number of characters to which its pixel points belong, according to the number of characters to which the pixel points in the corresponding pixel row of the first sample image belong;
and training a preset neural network model with each pixel row in the first sample gradient map and the labeled number corresponding to each pixel row, to obtain a neural network model for detecting the number of characters to which the pixel points of a pixel row in an image belong, as the first detection model.
4. The method according to any one of claims 1-3, wherein the determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters comprises:
determining, according to the obtained number of characters, a second detection model for detecting the grouping mode of characters in an image, wherein the second detection model is a neural network model obtained by pre-training a preset neural network model with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong, and is used for detecting the grouping mode of the characters to which the pixel points in a pixel row belong; and the second sample gradient map is a gradient map obtained by performing morphological gradient calculation on a second sample image;
inputting each pixel row of the first gradient map into the second detection model to detect the grouping mode of the characters to which the pixel points in each pixel row belong, and obtaining the probability that the grouping mode of the characters to which the pixel points in each pixel row belong is each preset grouping mode;
for each preset grouping mode, calculating the sum of the probabilities that the grouping mode of the characters to which the pixel points in each pixel row of the first gradient map belong is that preset grouping mode;
and determining the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
5. The method of claim 4, wherein the second detection model is trained by:
acquiring a second sample image;
performing morphological gradient calculation on the second sample image to obtain a second sample gradient map;
obtaining, for each pixel row in the second sample gradient map, the labeled grouping mode of the characters to which its pixel points belong, according to the grouping mode of the characters to which the pixel points in the corresponding pixel row of the second sample image belong;
and training a preset neural network model with each pixel row in the second sample gradient map and the labeled grouping mode corresponding to each pixel row, to obtain a neural network model for detecting the grouping mode of the characters to which the pixel points of a pixel row in an image belong, as the second detection model.
6. A grouping mode determining apparatus, comprising:
the area determining module is used for determining an image area containing characters in the image to be recognized as a first image area;
the image obtaining module is used for carrying out morphological gradient calculation on the first image area to obtain a first gradient image;
a result obtaining module, configured to input each pixel row of the first gradient map into a first detection model to detect the number of characters to which the pixel points in each pixel row belong, and obtain a detection result corresponding to each pixel row, where the first detection model is a neural network model obtained by pre-training a preset neural network model with each pixel row in a first sample gradient map and the labeled number of characters to which the pixel points in each pixel row belong, and is used for detecting the number of characters to which the pixel points in a pixel row belong; and the first sample gradient map is a gradient map obtained by performing morphological gradient calculation on a first sample image;
the quantity obtaining module is used for obtaining the quantity of the characters in the image to be recognized based on the obtained detection result;
and the mode determining module is used for determining the grouping mode of the characters in the image to be recognized based on the obtained number of the characters.
7. The apparatus of claim 6, wherein the detection result corresponding to each pixel row comprises: for each candidate number of characters, a probability that the number of characters to which the pixel points in the pixel row belong is that number;
the quantity obtaining module comprises:
a first sum calculating sub-module, configured to calculate, for each candidate number of characters, the sum of the probabilities that the number of characters to which the pixel points in each pixel row of the first gradient map belong is that number;
and the quantity determining submodule is used for determining the quantity of the characters corresponding to the maximum sum value as the quantity of the characters in the image to be recognized.
8. The apparatus of claim 6, wherein the result obtaining module comprises the following sub-modules, which are used to train the first detection model:
the image acquisition sub-module is used for acquiring a first sample image;
the gradient map obtaining sub-module is used for carrying out morphological gradient calculation on the first sample image to obtain a first sample gradient map;
a quantity obtaining sub-module, configured to obtain, for each pixel row in the first sample gradient map, the labeled number of characters to which its pixel points belong, according to the number of characters to which the pixel points in the corresponding pixel row of the first sample image belong;
and a model obtaining sub-module, configured to train a preset neural network model with each pixel row in the first sample gradient map and the labeled number corresponding to each pixel row, to obtain a neural network model for detecting the number of characters to which the pixel points of a pixel row in an image belong, as the first detection model.
9. The apparatus according to any one of claims 6-8, wherein the mode determining module comprises:
a model determining sub-module, configured to determine, according to the obtained number of characters, a second detection model for detecting the grouping mode of characters in an image, where the second detection model is a neural network model obtained by pre-training a preset neural network model with each pixel row in a second sample gradient map and the labeled grouping mode of the characters to which the pixel points in each pixel row belong, and is used for detecting the grouping mode of the characters to which the pixel points in a pixel row belong; and the second sample gradient map is a gradient map obtained by performing morphological gradient calculation on a second sample image;
a probability obtaining sub-module, configured to input each pixel row of the first gradient map into the second detection model to detect the grouping mode of the characters to which the pixel points in each pixel row belong, and obtain the probability that the grouping mode of the characters to which the pixel points in each pixel row belong is each preset grouping mode;
a second sum calculating sub-module, configured to calculate, for each preset grouping mode, the sum of the probabilities that the grouping mode of the characters to which the pixel points in each pixel row of the first gradient map belong is that preset grouping mode;
and a mode determining sub-module, configured to determine the grouping mode corresponding to the maximum sum as the grouping mode of the characters in the image to be recognized.
10. The apparatus of claim 9, wherein the probability obtaining sub-module comprises the following units for training the second detection model:
an image acquisition unit for acquiring a second sample image;
the gradient map obtaining unit is used for performing morphological gradient calculation on the second sample image to obtain a second sample gradient map;
a mode obtaining unit, configured to obtain, for each pixel row in the second sample gradient map, the labeled grouping mode of the characters to which its pixel points belong, according to the grouping mode of the characters to which the pixel points in the corresponding pixel row of the second sample image belong;
and a model obtaining unit, configured to train a preset neural network model with each pixel row in the second sample gradient map and the labeled grouping mode corresponding to each pixel row, to obtain a neural network model for detecting the grouping mode of the characters to which the pixel points of a pixel row in an image belong, as the second detection model.
11. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method of any one of claims 1 to 5 when executing a program stored in the memory.
CN201811219185.XA 2018-10-19 2018-10-19 Marshalling mode determining method and device Active CN109376739B (en)
