CN109726715A

CN109726715A - Method for serialized recognition of text images and structured data output

Info

Publication number: CN109726715A
Application number: CN201811614263.6A
Authority: CN
Inventors: 雷钧; 林路; 林康; 王慜骊; 安通鉴
Original assignee: SUNYARD SYSTEM ENGINEERING Co Ltd
Current assignee: SUNYARD SYSTEM ENGINEERING Co Ltd
Priority date: 2018-12-27
Filing date: 2018-12-27
Publication date: 2019-05-07

Abstract

The present invention discloses a method for serialized recognition of text images and structured data output. The method comprises: obtaining a plurality of text image blocks; extracting text image features from each text image block with a deep fully convolutional neural network, so that each text image block is represented as a feature vector; processing the feature vectors with a deep neural network and outputting a probability distribution over the character set; using a connectionist temporal classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text; and correcting errors in the computer-readable text with a language model to obtain and output structured data. The method of the invention has high recognition accuracy, good robustness, and a high recognition rate.

Description

A method for serialized recognition of text images and structured data output

Technical field

The present invention relates to the technical field of image recognition in computer software, and in particular to a method for serialized recognition of text images and structured data output.

Background art

OCR-based text region detection, localization, and recognition technology for the financial field refers to using devices such as computers, together with OCR (optical character recognition) technology, to automatically extract and recognize the useful information in paper documents and to process it accordingly. It is one of the key technologies for paperless, automated computer processing in banks. In financial-industry OCR, text usually appears as sequences rather than as isolated characters. Traditional document OCR technology has weak resistance to interference, cannot recognize pictures with complex backgrounds, and is inefficient at outputting "row text information" and "column text information".

Summary of the invention

In view of the deficiencies of the prior art, the present invention provides a method for serialized recognition of text images and structured data output. The specific technical solution is as follows:

A method for serialized recognition of text images and structured data output, characterized in that the method comprises the following steps:

S1: obtain a plurality of text image blocks;

S2: extract text image features from each text image block with a deep fully convolutional neural network, so that each text image block is represented as a feature vector;

S3: process the feature vectors with a deep neural network and output a probability distribution over the character set;

S4: using a connectionist temporal classification (CTC) layer as the transcription layer, apply a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set and output computer-readable text;

S5: correct errors in the computer-readable text with a language model to obtain and output structured data.

Further, S2 is specifically:

A picture of arbitrary size is taken as input and a response map of corresponding size is output. Each position in the response map corresponds to one receptive field in the original image, and the convolutional response map shared by the fully convolutional neural network serves as the feature vector.
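The relationship between input size, response map size, and receptive field can be illustrated with a short calculation. This is only a sketch: the layer stack `LAYERS` (two 3x3 convolutions, each followed by 2x2 pooling) is a hypothetical example and is not specified by the patent.

```python
def conv_out(size, kernel, stride, pad):
    """Output length of one convolution/pooling layer along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def response_map(height, width, layers):
    """Return (out_height, out_width, receptive_field, step) for a layer stack.

    layers: list of (kernel, stride, pad) tuples applied to both axes.
    receptive_field: side length, in input pixels, seen by one output position.
    step: distance in input pixels between neighbouring output positions.
    """
    receptive_field, step = 1, 1
    for kernel, stride, pad in layers:
        height = conv_out(height, kernel, stride, pad)
        width = conv_out(width, kernel, stride, pad)
        receptive_field += (kernel - 1) * step
        step *= stride
    return height, width, receptive_field, step


# Hypothetical stack (assumption): conv3x3 -> pool2x2 -> conv3x3 -> pool2x2.
LAYERS = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]

for h, w in [(32, 256), (32, 512)]:
    print((h, w), "->", response_map(h, w, LAYERS))
```

Because no layer has a fixed input size, a wider picture simply produces a wider response map, while the receptive field of each position stays the same.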

Further, the deep neural network is a two-layer recurrent neural network.

Further, S5 is specifically:

S5.1: build a corpus and use it to train word vectors and a language model;

S5.2: feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.

The beneficial effects of the present invention are as follows:

(1) Label sequences can be learned directly, with no additional annotation required;

(2) Features are extracted directly from the original image pixels, with no need for image preprocessing operations such as binarization, character segmentation, or character localization;

(3) Label prediction is performed with a recurrent neural network, which can directly output the predicted character sequence;

(4) The length of the recognition result is not limited, and because the loss is computed with CTC, the positions of characters within the string are not constrained either;

(5) Compared with a traditional fully connected layer, the recurrent neural network uses fewer network weights and consumes less storage space while giving good recognition accuracy and robustness; in addition, decoding the fully convolutional recurrent network with beam search that embeds a language model further improves the recognition rate.

Brief description of the drawings

Fig. 1 is a flow diagram of the method for serialized recognition of text images and structured data output of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the drawing and a preferred embodiment, from which its objects and effects will become clearer. It should be understood that the specific embodiment described here only serves to explain the present invention and is not intended to limit it.

As shown in Fig. 1, a method for serialized recognition of text images and structured data output comprises the following steps:

S1: obtain a plurality of text image blocks;

S2: extract text image features from each text image block with a deep fully convolutional neural network (the deep neural network being a two-layer recurrent neural network), so that each text image block is represented as a feature vector. Specifically, a picture of arbitrary size is taken as input and a response map of corresponding size is output; each position in the response map corresponds to one receptive field in the original image, and the convolutional response map shared by the fully convolutional neural network serves as the feature vector;

S4: using a connectionist temporal classification layer (Connectionist Temporal Classification, hereinafter CTC) as the transcription layer, apply a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set and output computer-readable text. CTC is a probability function that converts prediction results into a label sequence. For time-series problems in which the alignment between input features and output labels is uncertain, it can optimize the model parameters and the alignment (segmentation) boundaries simultaneously, automatically and end to end.
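As an illustration of how a per-column prediction is turned into a label sequence, the following minimal sketch applies the standard CTC collapsing rule (merge consecutive repeats, then drop the special character) to the most probable class of each column. The alphabet, the probabilities, and the choice of greedy best-path decoding are assumptions made for the example; the patent does not state which decoding rule the transcription layer uses at this stage.

```python
BLANK = "-"  # special character for "no label", as in the description


def ctc_collapse(path):
    """Merge consecutive repeated symbols, then remove the blank symbol."""
    out = []
    prev = None
    for ch in path:
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)


def greedy_decode(probs, alphabet):
    """Pick the most probable symbol in each column, then collapse the path."""
    path = []
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        path.append(alphabet[best])
    return ctc_collapse(path)


# Illustrative values only (assumption): 5 columns over the alphabet "-123".
ALPHABET = "-123"
PROBS = [
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
    [0.90, 0.04, 0.03, 0.03],
    [0.10, 0.10, 0.70, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
print(greedy_decode(PROBS, ALPHABET))
```

Note that a blank between two identical symbols keeps them separate, which is how a label such as "11" remains representable.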

In this example the picture is 32 x 256, so it can be cut into at most 256 columns; that is, the input feature sequence has a maximum length of 256, while the maximum length of the output label is set to 18. Such a setting can be optimized with the CTC model.

Regarding the CTC model, suppose a 32 x 256 picture carries the digit-string label "123". The picture is cut by columns (CTC optimizes the segmentation model), and each resulting block is then classified, giving the probability that the block is each digit or the special character (a block that cannot be recognized is labeled with the special character "-"). This yields, for the input feature sequence (the picture), the class probability distribution of each independently modeled unit (each block produced by the cut), including the "-" node. From these distributions the probability P(123) of the label sequence "123" is computed. Here the probability of "123" is defined as the sum of the probabilities of all its subsequences, where the subsequences contain "-" and consecutive repetitions of "1", "2", and "3".
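The sum over all subsequences described above can be computed with the standard CTC forward dynamic program rather than by enumeration. The sketch below is a plain-Python illustration on made-up probabilities (5 columns instead of 256, class 0 standing for "-"); it also includes a brute-force enumeration to show that both give the same P(123). The numeric values and function names are assumptions for the example.

```python
from itertools import product

BLANK = 0  # index of the special character "-"


def collapse(path):
    """Merge consecutive repeats and drop blanks, returning a tuple of classes."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def ctc_label_prob(probs, label):
    """P(label) via the CTC forward algorithm; probs[t][c] is the column-t distribution."""
    ext = [BLANK]
    for s in label:
        ext += [s, BLANK]  # label interleaved with blanks: - 1 - 2 - 3 -
    n = len(ext)
    alpha = [0.0] * n
    alpha[0] = probs[0][BLANK]
    if n > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * n
        for i in range(n):
            total = alpha[i]
            if i >= 1:
                total += alpha[i - 1]
            # Skipping a blank is allowed only between two different labels.
            if i >= 2 and ext[i] != BLANK and ext[i] != ext[i - 2]:
                total += alpha[i - 2]
            new[i] = total * probs[t][ext[i]]
        alpha = new
    return alpha[-1] + (alpha[-2] if n > 1 else 0.0)


def brute_force_prob(probs, label):
    """Same quantity by enumerating every path; feasible only for tiny inputs."""
    total = 0.0
    for path in product(range(len(probs[0])), repeat=len(probs)):
        if collapse(path) == tuple(label):
            p = 1.0
            for t, s in enumerate(path):
                p *= probs[t][s]
            total += p
    return total


# Illustrative distributions (assumption): classes are "-", "1", "2", "3".
PROBS = [
    [0.1, 0.7, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.4, 0.1, 0.2, 0.3],
    [0.1, 0.1, 0.1, 0.7],
]
print(ctc_label_prob(PROBS, [1, 2, 3]), brute_force_prob(PROBS, [1, 2, 3]))
```

The forward pass costs on the order of T x L operations for T columns and label length L, whereas enumeration grows exponentially in T, which is why a dynamic program is needed for 256 columns.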

S5: correct errors in the computer-readable text with a language model to obtain and output structured data, specifically:

Build a corpus and use it to train word vectors and a language model;

Feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.
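The idea of combining beam search with a language model can be sketched as follows. This is a simplified stand-in, not the patented procedure: it assumes the recognizer supplies a few candidate characters with probabilities for each position, it uses a character bigram model with add-one smoothing in place of the trained word vectors and language model, and the corpus, candidates, and function names are all hypothetical.

```python
import math
from collections import Counter

START = "^"  # hypothetical start-of-line marker


def train_bigram(corpus, alphabet):
    """Return logp(prev, ch) for an add-one-smoothed character bigram model."""
    pair_counts, ctx_counts = Counter(), Counter()
    for line in corpus:
        prev = START
        for ch in line:
            pair_counts[(prev, ch)] += 1
            ctx_counts[prev] += 1
            prev = ch
    vocab = len(alphabet)

    def logp(prev, ch):
        return math.log((pair_counts[(prev, ch)] + 1) / (ctx_counts[prev] + vocab))

    return logp


def beam_search(candidates, logp, beam_width=3, lm_weight=1.0):
    """Keep the beam_width best partial texts, scoring recognizer and LM jointly."""
    beams = [("", 0.0)]
    for options in candidates:
        expanded = []
        for text, score in beams:
            prev = text[-1] if text else START
            for ch, p in options.items():
                new_score = score + math.log(p) + lm_weight * logp(prev, ch)
                expanded.append((text + ch, new_score))
        expanded.sort(key=lambda beam: beam[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]


# Toy corpus and recognizer output (assumptions made for illustration).
CORPUS = ["total", "total", "total amount"]
ALPHABET = sorted(set("".join(CORPUS)) | set("01f"))
CANDIDATES = [
    {"t": 0.90, "f": 0.10},
    {"0": 0.60, "o": 0.40},   # recognizer confuses "o" with "0"
    {"t": 0.90, "l": 0.10},
    {"a": 0.95, "o": 0.05},
    {"1": 0.55, "l": 0.45},   # recognizer confuses "l" with "1"
]
LOGP = train_bigram(CORPUS, ALPHABET)
print(beam_search(CANDIDATES, LOGP, lm_weight=0.0))  # recognizer scores only
print(beam_search(CANDIDATES, LOGP, lm_weight=1.0))  # with language model
```

With the language model weight set to zero the search reproduces the recognizer's own best guess; with the weight enabled, character pairs that are frequent in the corpus outweigh the visually similar but implausible candidates.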

The image data generated for routine training are close to ideal and noise-free, so accuracy easily reaches 100%, whereas pictures from the actual production environment may contain line-segment or scattered-point noise. Noise can therefore be added deliberately when generating the training set, which improves the training effect of the model.
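A minimal sketch of such noise injection is shown below, assuming a grayscale picture stored as a list of pixel rows with 255 for background and 0 for ink. The function name, the parameters, and the restriction to horizontal line segments are assumptions for illustration; the patent does not specify how the noise is generated.

```python
import random


def add_noise(image, n_points=20, n_lines=2, ink=0, seed=None):
    """Return a copy of image with random dots and short horizontal segments.

    image: list of rows, each a list of grayscale values (0..255).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    noisy = [row[:] for row in image]  # leave the clean picture untouched

    # Scattered-point noise: isolated dark pixels.
    for _ in range(n_points):
        noisy[rng.randrange(height)][rng.randrange(width)] = ink

    # Line-segment noise: short horizontal strokes, clipped at the right edge.
    for _ in range(n_lines):
        y = rng.randrange(height)
        x0 = rng.randrange(width)
        length = rng.randint(1, max(1, width // 4))
        for x in range(x0, min(width, x0 + length)):
            noisy[y][x] = ink
    return noisy


clean = [[255] * 256 for _ in range(32)]  # blank 32 x 256 picture
noisy = add_noise(clean, seed=0)
changed = sum(v != 255 for row in noisy for v in row)
print("pixels changed:", changed)
```

Passing a fixed `seed` makes the augmentation reproducible, which helps when comparing models trained on the same synthetic set.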

Those of ordinary skill in the art will understand that the foregoing is only a preferred embodiment of the invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing example, those skilled in the art may still modify the technical solutions recorded in the foregoing examples or make equivalent replacements of some of their technical features. Any modification or equivalent replacement made within the spirit and principle of the invention shall fall within its scope of protection.

Claims

1. A method for serialized recognition of text images and structured data output, characterized in that the method comprises the following steps:

S1: obtain a plurality of text image blocks;

S2: extract text image features from each text image block with a deep fully convolutional neural network, so that each text image block is represented as a feature vector;

S3: process the feature vectors with a deep neural network and output a probability distribution over the character set;

S4: using a connectionist temporal classification layer as the transcription layer, apply a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set and output computer-readable text;

2. The method according to claim 1, wherein S2 is specifically:

A picture of arbitrary size is taken as input and a response map of corresponding size is output; each position in the response map corresponds to one receptive field in the original image, and the convolutional response map shared by the fully convolutional neural network serves as the feature vector.

3. The method according to claim 1, wherein the deep neural network is a two-layer recurrent neural network.

4. The method according to claim 1, wherein S5 is specifically:

S5.1: build a corpus and use it to train word vectors and a language model;

S5.2: feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.