CN109726715A - Method for serialized recognition of text images and output of structured data
- Publication number: CN109726715A
- Application number: CN201811614263.6A
- Authority: CN (China)
- Prior art keywords: text image, text, output, structured data, character
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Character Discrimination (AREA)
Abstract
The present invention discloses a method for serialized recognition of text images and output of structured data. The method comprises: obtaining multiple text image blocks; extracting text image features from each text image block using a fully convolutional deep neural network, so that each text image block is expressed as a feature vector; processing the feature vectors using a deep neural network and outputting a probability distribution over the character set; using a Connectionist Temporal Classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm based on forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text; and correcting errors in the computer-readable text using a language model to obtain and output structured data. The method of the invention offers high recognition accuracy, good robustness, and a high recognition rate.
Description
Technical field
The present invention relates to the technical field of image recognition in computer software, and in particular to a method for serialized recognition of text images and output of structured data.
Background art
OCR-based text region detection, localization, and recognition technology for the financial field refers to using devices such as computers, together with OCR (optical character recognition) technology, to automatically extract and recognize the useful information in paper materials and process it accordingly. It is one of the key technologies for paperless, automated computer processing in banks. In financial-industry OCR, text usually appears as sequences rather than as isolated characters. Traditional document OCR technology has weak resistance to interference, cannot recognize pictures with complex backgrounds, and is inefficient when outputting "row text information" and "column text information".
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a method for serialized recognition of text images and output of structured data. The specific technical solution is as follows:
A method for serialized recognition of text images and output of structured data, characterized in that the method comprises the following steps:
S1: obtain multiple text image blocks;
S2: extract text image features from each text image block using a fully convolutional deep neural network, so that each text image block is expressed as a feature vector;
S3: process the feature vectors using a deep neural network and output a probability distribution over the character set;
S4: use a Connectionist Temporal Classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm based on forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text;
S5: correct errors in the computer-readable text using a language model, then obtain and output the structured data.
Further, S2 is specifically:
A picture of arbitrary size is taken as input and a response map of corresponding size is output. Each position in the response map corresponds to one receptive field of the original picture, and the fully convolutional neural network shares its convolutional response maps, which serve as the feature vectors.
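To make the relation between a response-map position and its receptive field concrete, the following minimal sketch computes the receptive-field width and the overall stride of a stack of convolution and pooling layers. The layer list is a hypothetical example and is not taken from the patent, which does not specify the network architecture.

```python
def receptive_field(layers):
    """Return (receptive_field_size, total_stride) along one image axis.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf, jump = 1, 1  # one input pixel sees itself; neighbours are 1 pixel apart
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the window by (k - 1) jumps
        jump *= stride             # spacing between neighbouring output positions
    return rf, jump


# Hypothetical stack: 3x3 conv, 2x2 max-pool, 3x3 conv, 2x2 max-pool.
layers = [(3, 1), (2, 2), (3, 1), (2, 2)]
rf, stride = receptive_field(layers)
print(rf, stride)  # 10 4
```

Under these assumed layers, each position of the response map summarizes a 10-pixel-wide window of the original picture, and neighbouring positions are 4 pixels apart, which is why the response map can be read as a left-to-right sequence of feature vectors.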
Further, the deep neural network is a two-layer recurrent neural network.
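As an illustration of how a two-layer recurrent network can turn a sequence of feature vectors into one probability distribution per position, here is a small pure-Python sketch. The plain tanh cell, the random untrained weights, and all function names are assumptions made for illustration; the patent does not state the cell type or the layer sizes.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_layer(inputs, w_in, w_rec):
    """Plain tanh recurrent layer: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    hidden = [0.0] * len(w_rec)
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        outputs.append(hidden)
    return outputs


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def two_layer_rnn(features, hidden_size, n_classes, seed=0):
    """Stack two recurrent layers, then project each step onto the character set."""
    rng = random.Random(seed)
    dim = len(features[0])
    h1 = rnn_layer(features, random_matrix(hidden_size, dim, rng),
                   random_matrix(hidden_size, hidden_size, rng))
    h2 = rnn_layer(h1, random_matrix(hidden_size, hidden_size, rng),
                   random_matrix(hidden_size, hidden_size, rng))
    w_out = random_matrix(n_classes, hidden_size, rng)
    return [softmax(matvec(w_out, h)) for h in h2]


# Five hypothetical 3-dimensional feature vectors, 11 classes (10 digits + blank).
features = [[0.1 * i, 0.2, -0.3] for i in range(5)]
distributions = two_layer_rnn(features, hidden_size=4, n_classes=11)
print(len(distributions), len(distributions[0]))  # 5 11
```

Because the weights are random, the distributions carry no meaning here; the sketch only shows the data flow from feature sequence to per-position distributions over the character set.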
Further, S5 is specifically:
S5.1: build a corpus and use it to train word vectors and a language model;
S5.2: feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.
The beneficial effects of the present invention are as follows:
(1) sequence labels can be learned directly, without any other annotation;
(2) features are extracted directly from the original image pixels, with no need for image preprocessing operations such as binarization, character segmentation, or character localization;
(3) label prediction is performed by a recurrent neural network, which directly outputs the predicted character sequence;
(4) the length of the recognition result is not limited, and because the loss is computed with CTC, the position of a character within the character string is not limited either;
(5) a recurrent neural network is used, which needs fewer network weights and less storage space than a traditional fully connected layer while giving good recognition accuracy and robustness; in addition, decoding the fully convolutional recurrent network with beam search and an embedded language model further increases the recognition rate.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the invention for serialized recognition of text images and output of structured data.
Specific embodiment
The present invention is described in detail below with reference to the drawing and a preferred embodiment, so that its objects and effects become clearer. It should be understood that the specific embodiment described here is only used to explain the present invention and is not intended to limit it.
As shown in Fig. 1, a method for serialized recognition of text images and output of structured data comprises the following steps:
S1: obtain multiple text image blocks;
S2: extract text image features from each text image block using a fully convolutional deep neural network, so that each text image block is expressed as a feature vector. Specifically, a picture of arbitrary size is taken as input and a response map of corresponding size is output; each position in the response map corresponds to one receptive field of the original picture, and the fully convolutional neural network shares its convolutional response maps, which serve as the feature vectors;
S3: process the feature vectors using a deep neural network (here, a two-layer recurrent neural network) and output a probability distribution over the character set;
S4: use a Connectionist Temporal Classification layer (hereinafter CTC) as the transcription layer, which applies a dynamic programming algorithm based on forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text. CTC is a probabilistic function that converts prediction results into a label sequence. It is intended for time-series problems in which the alignment between input features and output labels is uncertain, and it can optimize the model parameters and the boundaries of the aligned segmentation simultaneously, automatically and end to end.
In the example, a picture of size 32×256 can be cut into at most 256 columns, so the input feature sequence has a maximum length of 256, while the maximum length of the output label is set to 18. This can be optimized with the CTC model.
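The gap between a long column sequence and a short label can be illustrated with best-path decoding: take the most probable symbol in each column, merge consecutive repeats, and drop the special character "-". This is a simplified stand-in chosen for illustration, not the dynamic programming and beam search procedure the method describes; the function name and the toy inputs are assumptions.

```python
BLANK = "-"


def greedy_ctc_decode(column_probs, charset):
    """Best-path decoding of per-column distributions over `charset`.

    column_probs: one probability list per column, aligned with `charset`.
    """
    # Most probable symbol in every column.
    best = [charset[max(range(len(p)), key=p.__getitem__)] for p in column_probs]
    decoded = []
    previous = None
    for symbol in best:
        # Keep a symbol only when it starts a new run and is not the blank.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


charset = "-123"


def one_hot(ch):
    return [1.0 if c == ch else 0.0 for c in charset]


# Seven columns whose best path is "-11-233" collapse to the label "123".
columns = [one_hot(ch) for ch in "-11-233"]
print(greedy_ctc_decode(columns, charset))  # 123
```

A blank between two identical symbols keeps them apart, so the path "1-1" decodes to "11" rather than "1".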
As for the CTC model, take a 32×256 picture whose digit-string label is "123". The picture is cut into columns (CTC optimizes the segmentation), and each resulting block is then recognized to obtain the probability that the block is each digit or the special character (a block that cannot be recognized is labeled with the special character "-"). This yields, for the input feature sequence (the picture), a class probability distribution for each mutually independent modeling unit (each block produced by the cutting), including the "-" node. From these distributions the probability P(123) of the label sequence "123" is computed. Here P(123) is defined as the sum of the probabilities of all sub-sequences that map to "123"; these sub-sequences contain "-" and consecutive repetitions of "1", "2", and "3".
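The sum over all sub-sequences that map to "123" does not need to be enumerated; the standard CTC forward recursion computes it by dynamic programming. The sketch below is a minimal version under the independence assumption stated above; the function name, the toy character set, and the uniform distributions are assumptions for illustration only.

```python
def ctc_label_probability(column_probs, charset, label, blank="-"):
    """Probability that the columns emit any path collapsing to `label`.

    column_probs: one probability list per column, aligned with `charset`.
    """
    index = {c: i for i, c in enumerate(charset)}
    # Extended label with blanks around every character, e.g. "-1-2-3-".
    ext = [blank]
    for ch in label:
        ext += [ch, blank]
    size = len(ext)

    # alpha[s]: probability of all path prefixes that end in state s.
    alpha = [0.0] * size
    alpha[0] = column_probs[0][index[blank]]
    if size > 1:
        alpha[1] = column_probs[0][index[ext[1]]]

    for probs in column_probs[1:]:
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                 # stay in the same state
            if s >= 1:
                total += alpha[s - 1]        # advance by one state
            # Skip the blank between two different characters.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[index[ext[s]]]
        alpha = new

    # Valid paths end on the last character or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


charset = "-123"
uniform = [[0.25] * 4 for _ in range(4)]  # 4 columns, uniform distributions
# 7 of the 4**4 = 256 equally likely paths collapse to "123".
print(ctc_label_probability(uniform, charset, "123"))  # 0.02734375
```

In training, the negative logarithm of this probability is the CTC loss, and the same recursion run in reverse supplies the quantities needed for gradient propagation.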
S5: correct errors in the computer-readable text using a language model, then obtain and output the structured data. Specifically:
build a corpus and use it to train word vectors and a language model;
feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.
Image data generated for routine training comes from a relatively ideal, noise-free environment, so accuracy easily reaches 100%. Pictures from a real production environment, however, may contain line segments or discrete point noise. Some noise can therefore be added when generating the training set to improve the training and testing of the model.
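A minimal sketch of such augmentation is shown below, assuming a grayscale picture stored as a list of pixel rows with values from 0 to 255. The function name, the number of points and lines, and the choice of dark horizontal segments are all assumptions; the patent does not say how the noise is generated.

```python
import random


def add_noise(image, n_points=20, n_lines=2, seed=None):
    """Return a copy of `image` with dark point noise and horizontal line segments."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    noisy = [row[:] for row in image]  # leave the clean picture untouched

    # Discrete point noise: single dark pixels at random positions.
    for _ in range(n_points):
        noisy[rng.randrange(height)][rng.randrange(width)] = 0

    # Line-segment noise: dark horizontal strokes of random length.
    for _ in range(n_lines):
        length = rng.randint(width // 8, width // 2)
        y = rng.randrange(height)
        x0 = rng.randrange(width - length + 1)
        for x in range(x0, x0 + length):
            noisy[y][x] = 0
    return noisy


clean = [[255] * 256 for _ in range(32)]  # blank white 32x256 picture
noisy = add_noise(clean, seed=1)
print(sum(v == 0 for row in noisy for v in row) > 0)  # True
```

Passing a fixed `seed` makes the augmentation reproducible, which helps when comparing models trained with and without noise.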
Those of ordinary skill in the art will understand that the foregoing is only a preferred embodiment of the invention and is not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing example, those skilled in the art can still modify the technical solutions described in the foregoing examples or replace some of their technical features with equivalents. Any modification or equivalent replacement made within the spirit and principle of the invention shall fall within the scope of protection of the invention.
Claims (4)
1. A method for serialized recognition of text images and output of structured data, characterized in that the method comprises the following steps:
S1: obtain multiple text image blocks;
S2: extract text image features from each text image block using a fully convolutional deep neural network, so that each text image block is expressed as a feature vector;
S3: process the feature vectors using a deep neural network and output a probability distribution over the character set;
S4: use a Connectionist Temporal Classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm based on forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text;
S5: correct errors in the computer-readable text using a language model, then obtain and output the structured data.
2. The method according to claim 1, characterized in that S2 is specifically:
a picture of arbitrary size is taken as input and a response map of corresponding size is output; each position in the response map corresponds to one receptive field of the original picture, and the fully convolutional neural network shares its convolutional response maps, which serve as the feature vectors.
3. The method according to claim 1, characterized in that the deep neural network is a two-layer recurrent neural network.
4. The method according to claim 1, characterized in that S5 is specifically:
S5.1: build a corpus and use it to train word vectors and a language model;
S5.2: feed the computer-readable text obtained in S4 into the trained language model, with beam search embedded in the language model, and output the corrected text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811614263.6A CN109726715A (en) | 2018-12-27 | 2018-12-27 | A kind of character image serializing identification, structural data output method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109726715A true CN109726715A (en) | 2019-05-07 |
Family
ID=66296477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811614263.6A Pending CN109726715A (en) | 2018-12-27 | 2018-12-27 | A kind of character image serializing identification, structural data output method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109726715A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106570456A (en) * | 2016-10-13 | 2017-04-19 | 华南理工大学 | Handwritten Chinese character recognition method based on full-convolution recursive network |
CN107145853A (en) * | 2017-04-28 | 2017-09-08 | 深圳市唯特视科技有限公司 | A kind of scene image text suggesting method based on full convolutional network |
WO2018089762A1 (en) * | 2016-11-11 | 2018-05-17 | Ebay Inc. | Online personal assistant with image text localization |
CN108090400A (en) * | 2016-11-23 | 2018-05-29 | 中移(杭州)信息技术有限公司 | A kind of method and apparatus of image text identification |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399486A (en) * | 2019-07-02 | 2019-11-01 | 精硕科技(北京)股份有限公司 | A kind of classification method, device and equipment, storage medium |
CN110837838A (en) * | 2019-11-06 | 2020-02-25 | 创新奇智(重庆)科技有限公司 | End-to-end frame number identification system and method based on deep learning |
CN110942004A (en) * | 2019-11-20 | 2020-03-31 | 深圳追一科技有限公司 | Handwriting recognition method and device based on neural network model and electronic equipment |
CN111062397A (en) * | 2019-12-18 | 2020-04-24 | 厦门商集网络科技有限责任公司 | Intelligent bill processing system |
CN111353397A (en) * | 2020-02-22 | 2020-06-30 | 郑州铁路职业技术学院 | Structured sharing system of Chinese blackboard writing in online classroom based on big data and OCR |
CN111353397B (en) * | 2020-02-22 | 2021-01-01 | 郑州铁路职业技术学院 | Structured sharing system of Chinese blackboard writing in online classroom based on big data and OCR |
CN112508023A (en) * | 2020-10-27 | 2021-03-16 | 重庆大学 | Deep learning-based end-to-end identification method for code-spraying characters of parts |
CN114757840A (en) * | 2022-03-24 | 2022-07-15 | 北京字跳网络技术有限公司 | Image processing method, apparatus, readable medium and electronic device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190507 |