
CN109726715A - Method for sequence recognition of text images and structured data output - Google Patents


Info

Publication number
CN109726715A
Authority
CN
China
Prior art keywords
character image
text
exports
structural data
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811614263.6A
Other languages
Chinese (zh)
Inventor
雷钧
林路
林康
王慜骊
安通鉴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SUNYARD SYSTEM ENGINEERING Co Ltd
Original Assignee
SUNYARD SYSTEM ENGINEERING Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SUNYARD SYSTEM ENGINEERING Co Ltd
Priority to CN201811614263.6A
Publication of CN109726715A
Legal status: Pending

Landscapes

  • Character Discrimination (AREA)

Abstract

The present invention discloses a method for sequence recognition of text images and structured data output. The method comprises: obtaining multiple text image blocks; extracting text image features from each text image block with a deep fully convolutional neural network, so that each block is represented as a feature vector; processing the feature vectors with a deep neural network and outputting a probability distribution over the character set; using a connectionist temporal classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text; and correcting errors in the computer-readable text with a language model to obtain and output structured data. The method of the invention has high recognition accuracy, good robustness, and a high recognition rate.

Description

Method for sequence recognition of text images and structured data output
Technical field
The present invention relates to the field of image recognition technology in computer software, and in particular to a method for sequence recognition of text images and structured data output.
Background art
In the financial field, OCR-based text region detection, localization, and recognition refers to using equipment such as computers together with OCR (optical character recognition) technology to automatically extract and recognize the useful information in paper documents and to process it accordingly. It is one of the key technologies for paperless, automated computer processing in banks. In financial-industry OCR, text usually appears as sequences rather than as isolated characters. Traditional document OCR has weak resistance to interference, cannot recognize pictures with complex backgrounds, and is inefficient at outputting "row text information" and "column text information".
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a method for sequence recognition of text images and structured data output. The specific technical solution is as follows:
A method for sequence recognition of text images and structured data output, characterized in that the method comprises the following steps:
S1: obtaining multiple text image blocks;
S2: extracting text image features from each text image block with a deep fully convolutional neural network, so that each text image block is represented as a feature vector;
S3: processing the feature vectors with a deep neural network and outputting a probability distribution over the character set;
S4: using a connectionist temporal classification layer as the transcription layer, applying a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set, and outputting computer-readable text;
S5: correcting errors in the computer-readable text with a language model to obtain and output structured data.
Further, S2 is specifically:
A picture of arbitrary size is taken as input and a response map of corresponding size is output. Each position in the response map corresponds to one receptive field of the original image, and the fully convolutional neural network shares the convolutional response map, which serves as the feature vector.
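As an illustration of the fully convolutional property described in S2, the following minimal NumPy sketch shows a response map whose width scales with the input width, and how its columns can be read as a sequence of feature vectors. The single convolution layer, the filter bank, the stride, and the helper names `conv2d_valid` and `map_to_sequence` are assumptions made for illustration; the patent does not specify the network layout.

```python
import numpy as np

def conv2d_valid(image, kernels, stride=1):
    """Naive 'valid' 2-D cross-correlation followed by ReLU.

    image:   (H, W) grayscale array
    kernels: (C, kH, kW) filter bank (hypothetical, random in the example)
    returns: (C, H_out, W_out) response map
    """
    C, kH, kW = kernels.shape
    H, W = image.shape
    H_out = (H - kH) // stride + 1
    W_out = (W - kW) // stride + 1
    out = np.zeros((C, H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            # Each output position only sees this patch: its receptive field.
            patch = image[i * stride:i * stride + kH, j * stride:j * stride + kW]
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def map_to_sequence(response):
    """(C, H_out, W_out) -> (W_out, C * H_out): one feature vector per column."""
    C, H_out, W_out = response.shape
    return response.transpose(2, 0, 1).reshape(W_out, C * H_out)

rng = np.random.default_rng(0)
kernels = rng.standard_normal((4, 32, 4))      # 4 filters spanning the full height
wide = map_to_sequence(conv2d_valid(rng.random((32, 256)), kernels, stride=4))
narrow = map_to_sequence(conv2d_valid(rng.random((32, 128)), kernels, stride=4))
print(wide.shape, narrow.shape)
```

The same filters are applied to both inputs, so no fixed input width is baked into the weights; only the length of the resulting feature sequence changes.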
Further, the deep neural network is a two-layer recurrent neural network.
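A two-layer recurrent network that turns a feature sequence into per-step probability distributions over the character set might look like the sketch below. The plain tanh cell, the layer sizes, the random initialization, and all function names are assumptions for illustration; the patent states only that the network has two recurrent layers and does not name the cell type.

```python
import numpy as np

def rnn_layer(x, W_in, W_rec, b):
    """Simple tanh recurrent layer: (T, D_in) -> (T, D_hidden)."""
    T = x.shape[0]
    h = np.zeros(W_rec.shape[0])
    out = np.zeros((T, W_rec.shape[0]))
    for t in range(T):
        h = np.tanh(x[t] @ W_in + h @ W_rec + b)
        out[t] = h
    return out

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_params(d_in, d_hidden, n_classes, seed=0):
    """Hypothetical random initialization; a real model would be trained."""
    rng = np.random.default_rng(seed)

    def layer(d_from):
        return (rng.standard_normal((d_from, d_hidden)) * 0.1,
                rng.standard_normal((d_hidden, d_hidden)) * 0.1,
                np.zeros(d_hidden))

    return {"layer1": layer(d_in),
            "layer2": layer(d_hidden),
            "output": (rng.standard_normal((d_hidden, n_classes)) * 0.1,
                       np.zeros(n_classes))}

def two_layer_rnn(features, params):
    """Stack two recurrent layers, then project each step onto the character set."""
    h1 = rnn_layer(features, *params["layer1"])
    h2 = rnn_layer(h1, *params["layer2"])
    W_out, b_out = params["output"]
    return softmax(h2 @ W_out + b_out)

# 64 feature vectors of size 4; 11 classes = 10 digits plus one blank symbol.
probs = two_layer_rnn(np.ones((64, 4)), init_params(4, 8, 11))
print(probs.shape)
```

Each row of `probs` is one probability distribution over the character set, which is the form of input a CTC transcription layer expects.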
Further, S5 is specifically:
S5.1: building a corpus and using it to train word vectors and a language model;
S5.2: feeding the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and outputting the corrected text.
The beneficial effects of the present invention are as follows:
(1) Label sequences can be learned directly, with no other annotation required;
(2) Features are extracted directly from the original image pixels, with no need for image preprocessing operations such as binarization, character segmentation, or character localization;
(3) Labels are predicted with a recurrent neural network, which can directly output the predicted character sequence;
(4) The length of the recognition result is not restricted, and because the loss is computed with CTC, the positions of characters within the string are not restricted either;
(5) By using a recurrent neural network, the method needs fewer network weights and less storage than a traditional fully connected layer and has good recognition accuracy and robustness; in addition, decoding the fully convolutional recurrent network with beam search and an embedded language model further improves the recognition rate.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the invention for sequence recognition of text images and structured data output.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawing and preferred embodiments, from which its objects and effects will become clearer. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
As shown in Fig. 1, a method for sequence recognition of text images and structured data output comprises the following steps:
S1: obtaining multiple text image blocks;
S2: extracting text image features from each text image block with a deep fully convolutional neural network (the deep neural network used in S3 is a two-layer recurrent neural network), so that each text image block is represented as a feature vector; specifically, a picture of arbitrary size is taken as input and a response map of corresponding size is output, each position in the response map corresponds to one receptive field of the original image, and the fully convolutional neural network shares the convolutional response map, which serves as the feature vector;
S3: processing the feature vectors with a deep neural network and outputting a probability distribution over the character set;
S4: using a connectionist temporal classification layer (Connectionist Temporal Classification, hereinafter CTC) as the transcription layer, applying a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set, and outputting computer-readable text. CTC is a probability function that converts prediction results into label sequences. For time-series problems in which the alignment between input features and output labels is uncertain, it can automatically optimize, end to end, both the model parameters and the alignment (segmentation) boundaries at the same time.
In the example, a picture of size 32x256 can be split into at most 256 columns, so the input feature sequence has a maximum length of 256, while the maximum length of the output label is set to 18; the alignment between the two is optimized with the CTC model.
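The many-to-one relation between a long column sequence and a short label can be illustrated with the standard CTC collapse rule: merge consecutive repeated symbols, then drop the blank. The sketch below is a minimal illustration under stated assumptions: the blank is written "-", the function names are hypothetical, and greedy (best-symbol-per-column) decoding is used for simplicity rather than being the decoding method claimed in the patent.

```python
from itertools import groupby

BLANK = "-"  # special symbol for columns that carry no character

def ctc_collapse(path):
    """Merge consecutive repeats, then remove blanks."""
    return "".join(ch for ch, _ in groupby(path) if ch != BLANK)

def greedy_decode(probs, charset):
    """Pick the most probable symbol in each column, then collapse.

    probs:   list of per-column probability lists, one entry per symbol
    charset: string whose i-th character is the symbol for index i
    """
    path = [charset[max(range(len(p)), key=p.__getitem__)] for p in probs]
    return ctc_collapse(path)

# Eleven columns collapse to a three-character label.
print(ctc_collapse("--11-2--33-"))
# A blank between two identical symbols keeps them as separate characters.
print(ctc_collapse("1-1"))
```

Because repeats and blanks are absorbed by the collapse step, any number of columns up to 256 can map to a label no longer than the configured maximum of 18, as long as there are enough columns to place a blank between identical neighbouring characters.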
Regarding the CTC model: suppose a 32x256 picture whose digit-string label is "123". The picture is split by columns (CTC optimizes the segmentation model), and each resulting block is then recognized, giving the probability that the block is each digit or the special character (a block that cannot be recognized is labeled with the special character "-"). This yields, for the input feature sequence (the picture), the class probability distribution of each independently modeled unit (each split-off block), including the "-" node. Based on these distributions, the probability P(123) that the label sequence is "123" is computed. It is defined as the sum of the probabilities of all sub-sequences that reduce to "123", where a sub-sequence may contain "-" and consecutive repetitions of "1", "2", and "3".
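The sum over all sub-sequences that reduce to "123" can be computed with the standard CTC forward recursion instead of by enumeration. The sketch below is a minimal NumPy illustration, not the patent's implementation: the class indices (0 for "-", 1 to 3 for the digits), the five-column toy input, and the function names are assumptions, and probabilities are multiplied directly rather than in log space, which would underflow for sequences as long as 256 columns.

```python
import numpy as np
from itertools import groupby, product

def ctc_label_probability(probs, label, blank=0):
    """Sum the probabilities of all alignments that collapse to `label`.

    probs: (T, K) per-column distributions; label: list of class indices.
    """
    T = probs.shape[0]
    ext = [blank]
    for c in label:
        ext += [c, blank]                  # blank-interleaved label: - l1 - l2 - ...
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]        # stay on the same symbol
            if s >= 1:
                total += alpha[t - 1, s - 1]   # advance by one position
            # Skipping the blank is allowed only between two different characters.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]
            alpha[t, s] = total * probs[t, ext[s]]
    result = alpha[T - 1, S - 1]           # path ends on the trailing blank ...
    if S > 1:
        result += alpha[T - 1, S - 2]      # ... or on the last character
    return result

def brute_force(probs, label, blank=0):
    """Reference: enumerate every path and keep those that collapse to `label`."""
    T, K = probs.shape
    total = 0.0
    for path in product(range(K), repeat=T):
        collapsed = [k for k, _ in groupby(path) if k != blank]
        if collapsed == list(label):
            total += np.prod([probs[t, k] for t, k in enumerate(path)])
    return total

rng = np.random.default_rng(0)
logits = rng.standard_normal((5, 4))       # 5 columns; classes "-", "1", "2", "3"
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
p_forward = ctc_label_probability(probs, [1, 2, 3])
p_brute = brute_force(probs, [1, 2, 3])
print(p_forward, p_brute)
```

The recursion costs on the order of T times the label length, whereas enumeration grows exponentially with the number of columns; this is why a dynamic programming formulation is needed for 256-column inputs.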
S5: correcting errors in the computer-readable text with a language model to obtain and output structured data, specifically:
building a corpus and using it to train word vectors and a language model;
feeding the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and outputting the corrected text.
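One possible reading of this correction step is sketched below: a character bigram model is trained on a small corpus, and a beam search combines the recognizer's per-position candidate probabilities with the language model score. Everything specific here is an assumption for illustration: the bigram model with add-alpha smoothing, the toy corpus, the candidate lists, the score weighting, and the function names. The patent does not state the form of its language model, and the word vectors it mentions are not modeled in this sketch.

```python
import math
from collections import defaultdict

START = "^"  # start-of-line marker (hypothetical convention)

def train_bigram(corpus, alpha=1.0):
    """Add-alpha smoothed character bigram model; returns log P(ch | prev)."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = {START}
    for line in corpus:
        prev = START
        for ch in line:
            counts[prev][ch] += 1
            vocab.add(ch)
            prev = ch
    vocab_size = len(vocab)

    def logp(prev, ch):
        row = counts.get(prev, {})
        return math.log((row.get(ch, 0) + alpha) /
                        (sum(row.values()) + alpha * vocab_size))

    return logp

def beam_search_correct(candidates, logp, beam_width=3, lm_weight=1.0):
    """candidates: for each position, a list of (char, recognizer_probability)."""
    beams = [("", 0.0)]
    for options in candidates:
        expanded = []
        for text, score in beams:
            prev = text[-1] if text else START
            for ch, p in options:
                new_score = score + math.log(p) + lm_weight * logp(prev, ch)
                expanded.append((text + ch, new_score))
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]      # keep only the best partial hypotheses
    return beams[0][0]

corpus = ["total amount", "amount due", "total due"]
# The recognizer slightly prefers the digit "0" over the letter "o" at position 3.
candidates = [[("a", 1.0)], [("m", 1.0)], [("0", 0.55), ("o", 0.45)],
              [("u", 1.0)], [("n", 1.0)], [("t", 1.0)]]
corrected = beam_search_correct(candidates, train_bigram(corpus))
print(corrected)
```

In this toy case the language model outweighs the recognizer's small preference for "0", because "mo" and "ou" occur in the corpus while "m0" and "0u" do not. The `lm_weight` parameter controls how strongly the corpus statistics can override the recognizer.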
Image data generated for routine training belongs to a relatively ideal, noise-free setting, in which accuracy easily reaches 100%. Pictures from the actual production environment may contain line segments or scattered point noise, so some noise can be added when generating the training set in order to improve the training effect of the model.
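A minimal sketch of this kind of augmentation, assuming 8-bit grayscale images stored as NumPy arrays, is given below. The noise fraction, the number of line segments, and the function names are illustrative assumptions; the patent does not specify how the noise is generated.

```python
import numpy as np

def add_point_noise(image, fraction, rng):
    """Set a random fraction of pixels to black or white (scattered point noise)."""
    noisy = image.copy()
    mask = rng.random(image.shape) < fraction
    values = np.array([0, 255], dtype=image.dtype)
    noisy[mask] = rng.choice(values, size=int(mask.sum()))
    return noisy

def add_line_noise(image, n_lines, rng, value=0):
    """Draw random straight line segments, one pixel wide."""
    noisy = image.copy()
    h, w = image.shape
    for _ in range(n_lines):
        r0, r1 = rng.integers(0, h, size=2)
        c0, c1 = rng.integers(0, w, size=2)
        steps = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
        # Sample enough points along the segment to leave no gaps.
        rows = np.linspace(r0, r1, steps).round().astype(int)
        cols = np.linspace(c0, c1, steps).round().astype(int)
        noisy[rows, cols] = value
    return noisy

rng = np.random.default_rng(0)
clean = np.full((32, 256), 255, dtype=np.uint8)   # blank white 32x256 sample
noisy = add_line_noise(add_point_noise(clean, 0.02, rng), 3, rng)
print(int((noisy != clean).sum()), "pixels changed")
```

Both helpers return a copy, so the clean synthetic image can be reused to generate several differently corrupted training samples.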
Those of ordinary skill in the art will understand that the above are only preferred embodiments of the invention and are not intended to limit it. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art can still modify the technical solutions described in the foregoing examples or make equivalent replacements of some of their technical features. Any modification or equivalent replacement made within the spirit and principle of the invention shall fall within its scope of protection.

Claims (4)

1. A method for sequence recognition of text images and structured data output, characterized in that the method comprises the following steps:
S1: obtaining multiple text image blocks;
S2: extracting text image features from each text image block with a deep fully convolutional neural network, so that each text image block is represented as a feature vector;
S3: processing the feature vectors with a deep neural network and outputting a probability distribution over the character set;
S4: using a connectionist temporal classification layer as the transcription layer, applying a dynamic programming algorithm with forward computation and backward gradient propagation to the probability distribution over the character set, and outputting computer-readable text;
S5: correcting errors in the computer-readable text with a language model to obtain and output structured data.
2. The method according to claim 1, characterized in that S2 is specifically:
a picture of arbitrary size is taken as input and a response map of corresponding size is output; each position in the response map corresponds to one receptive field of the original image, and the fully convolutional neural network shares the convolutional response map, which serves as the feature vector.
3. The method according to claim 1, characterized in that the deep neural network is a two-layer recurrent neural network.
4. The method according to claim 1, characterized in that S5 is specifically:
S5.1: building a corpus and using it to train word vectors and a language model;
S5.2: feeding the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and outputting the corrected text.
CN201811614263.6A 2018-12-27 2018-12-27 Method for sequence recognition of text images and structured data output Pending CN109726715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811614263.6A CN109726715A (en) 2018-12-27 2018-12-27 Method for sequence recognition of text images and structured data output

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811614263.6A CN109726715A (en) 2018-12-27 2018-12-27 Method for sequence recognition of text images and structured data output

Publications (1)

Publication Number Publication Date
CN109726715A (en) 2019-05-07

Family

ID=66296477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811614263.6A Pending CN109726715A (en) 2018-12-27 2018-12-27 Method for sequence recognition of text images and structured data output

Country Status (1)

Country Link
CN (1) CN109726715A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570456A (en) * 2016-10-13 2017-04-19 华南理工大学 Handwritten Chinese character recognition method based on full-convolution recursive network
CN107145853A (en) * 2017-04-28 2017-09-08 深圳市唯特视科技有限公司 A kind of scene image text suggesting method based on full convolutional network
WO2018089762A1 (en) * 2016-11-11 2018-05-17 Ebay Inc. Online personal assistant with image text localization
CN108090400A (en) * 2016-11-23 2018-05-29 中移(杭州)信息技术有限公司 A kind of method and apparatus of image text identification

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399486A (en) * 2019-07-02 2019-11-01 精硕科技(北京)股份有限公司 A kind of classification method, device and equipment, storage medium
CN110837838A (en) * 2019-11-06 2020-02-25 创新奇智(重庆)科技有限公司 End-to-end frame number identification system and method based on deep learning
CN110942004A (en) * 2019-11-20 2020-03-31 深圳追一科技有限公司 Handwriting recognition method and device based on neural network model and electronic equipment
CN111062397A (en) * 2019-12-18 2020-04-24 厦门商集网络科技有限责任公司 Intelligent bill processing system
CN111353397A (en) * 2020-02-22 2020-06-30 郑州铁路职业技术学院 Structured sharing system of Chinese blackboard writing in online classroom based on big data and OCR
CN111353397B (en) * 2020-02-22 2021-01-01 郑州铁路职业技术学院 Structured sharing system of Chinese blackboard writing in online classroom based on big data and OCR
CN112508023A (en) * 2020-10-27 2021-03-16 重庆大学 Deep learning-based end-to-end identification method for code-spraying characters of parts
CN114757840A (en) * 2022-03-24 2022-07-15 北京字跳网络技术有限公司 Image processing method, apparatus, readable medium and electronic device

Similar Documents

Publication Publication Date Title
CN109726715A (en) Method for sequence recognition of text images and structured data output
Xiang et al. Fabric image retrieval system using hierarchical search based on deep convolutional neural network
CN112733866B (en) Network construction method for improving text description correctness of controllable image
CN110609891A (en) A Visual Dialogue Generation Method Based on Context-Aware Graph Neural Network
CN111738169A (en) A Handwritten Formula Recognition Method Based on End-to-End Network Model
CN111460824B (en) Unmarked named entity identification method based on anti-migration learning
CN104680178B (en) Image classification method based on transfer learning multi attractor cellular automaton
CN109657039A (en) A kind of track record information extraction method based on the double-deck BiLSTM-CRF
CN110415309B (en) Method to realize automatic generation of fingerprint images based on generative adversarial network
CN112084913B (en) End-to-end human body detection and attribute identification method
Bosco A genetic algorithm for image segmentation
CN113420552B (en) Biomedical multi-event extraction method based on reinforcement learning
CN111159332A (en) Text multi-intention identification method based on bert
Uehara et al. Visual question generation for class acquisition of unknown objects
CN104881639A (en) Method of detection, division, and expression recognition of human face based on layered TDP model
CN117746078B (en) Object detection method and system based on user-defined category
CN109740151A (en) Public security notes name entity recognition method based on iteration expansion convolutional neural networks
CN108920446A (en) A kind of processing method of Engineering document
CN109992770A (en) A Lao Named Entity Recognition Method Based on Combinatorial Neural Network
CN116450829A (en) Medical text classification method, device, equipment and medium
CN113919358A (en) Named entity identification method and system based on active learning
CN105678244A (en) Approximate video retrieval method based on improvement of editing distance
CN111539417B (en) Text recognition training optimization method based on deep neural network
CN117095433A (en) Sketch face recognition method and device
Surapaneni et al. Exploring themes and bias in art using machine learning image analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190507