KR950001553A - How to Separate Individual Characters in English Strings - Google Patents

How to Separate Individual Characters in English Strings

Info

Publication number
KR950001553A
KR950001553A KR1019930010895A KR930010895A KR950001553A KR 950001553 A KR950001553 A KR 950001553A KR 1019930010895 A KR1019930010895 A KR 1019930010895A KR 930010895 A KR930010895 A KR 930010895A KR 950001553 A KR950001553 A KR 950001553A
Authority
KR
South Korea
Prior art keywords
character
individual
string
characters
individual characters
Prior art date
Application number
KR1019930010895A
Other languages
Korean (ko)
Other versions
KR100286709B1 (en)
Inventor
노희호
Original Assignee
이헌조
주식회사 금성사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 이헌조, 주식회사 금성사
Priority to KR1019930010895A
Publication of KR950001553A
Application granted
Publication of KR100286709B1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The present invention relates to a method for separating individual characters in an English character string. When characters touch one another, the method cuts out the individual characters accurately without intervention by the character recognition means, so that documents are recognized faster. As the feature quantity for character segmentation it uses a hole feature, which is simple to extract, so that touching characters can be cut apart with little computation.

The invention is achieved by a scanning process that generates binary image data from an input document; a region segmentation and extraction process that separates picture regions from text regions in the binary image and extracts character strings from the text regions; an individual character segmentation process that cuts individual characters out of the extracted character strings; a recognition process that recognizes the segmented individual characters; and a post-processing process that corrects the portions misrecognized during the recognition process.
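The five processes listed above can be sketched as a minimal pipeline. This is only an illustrative Python sketch under stated assumptions, not the method of the patent: the function names (`scan`, `extract_strings`, `segment_characters`, `recognize_document`), the fixed threshold, and the use of blank rows and blank columns as cut points are all assumptions made for the example. Picture/text separation and the handling of touching characters are omitted, and the recognizer and post-processor are hypothetical callables supplied by the caller.

```python
from typing import Callable, List

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def scan(gray: List[List[int]], threshold: int = 128) -> Image:
    """Binarize a grayscale page: pixels darker than the threshold become black."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def extract_strings(binary: Image) -> List[Image]:
    """Split the page into text lines at rows that contain no black pixels.

    Picture/text separation is not modeled; the page is assumed to be all text.
    """
    lines, current = [], []
    for row in binary:
        if any(row):
            current.append(row)
        elif current:
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines


def segment_characters(line: Image) -> List[Image]:
    """Cut a text line at columns that contain no black pixels."""
    width = len(line[0])
    chars, start = [], None
    for x in range(width + 1):
        has_ink = x < width and any(row[x] for row in line)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            chars.append([row[start:x] for row in line])
            start = None
    return chars


def recognize_document(
    gray: List[List[int]],
    recognize: Callable[[Image], str],
    postprocess: Callable[[str], str],
) -> List[str]:
    """Run scan, string extraction, segmentation, recognition, post-processing."""
    results = []
    for line in extract_strings(scan(gray)):
        text = "".join(recognize(ch) for ch in segment_characters(line))
        results.append(postprocess(text))
    return results
```

Blank-column cutting of this kind fails exactly when characters touch, which is the case the claimed hole feature is meant to address.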

Description

How to Separate Individual Characters in English Strings

This publication discloses only the essential parts of the application, so the full text is not included.

FIG. 2 is a block diagram of the individual character separation system according to the present invention, FIG. 3 is a flowchart of document image recognition according to FIG. 2, and FIG. 4 is a detailed flowchart of the individual character segmentation process of FIG. 3.

Claims (3)

1. A method for separating individual characters in an English character string, comprising: a scanning process of generating binary image data from an input document; a region segmentation and extraction process of separating picture regions from text regions in the binary image and extracting character strings from the text regions; an individual character segmentation process of cutting individual characters out of the extracted character strings; a recognition process of recognizing the segmented individual characters; and a post-processing process of correcting the portions misrecognized during the recognition process.

2. The method of claim 1, wherein the individual character segmentation process comprises: an individual character segmentation step of cutting individual characters out of a character string that was separated in the region segmentation and string extraction process and is input through a string input step; a character classification step of identifying lowercase letters among the segmented individual characters; a contour tracing step of tracing the contour of the touching characters; a character discrimination step of identifying uppercase letters among the touching characters; a character cutting step of cutting apart, by means of a hole feature, the individual characters distinguished in the character discrimination step; and a character margin detection step of obtaining the height and the margin of each cut individual character and passing them to the recognition process.

3. The method of claim 2, wherein the hole feature determines the presence or absence of a hole from the positions of the black pixels in the vertical direction of the target block and the lengths of the black-pixel runs.

※ Note: This disclosure is based on the content of the application as originally filed.
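Claim 3 states only that the hole feature is derived from the vertical positions of black pixels and the lengths of black-pixel runs; the exact decision rule is not disclosed in this publication. The Python sketch below is therefore one possible heuristic, not the patented rule. It assumes that a hole is a white gap between two black runs in a column that starts next to a column whose black run covers it, persists across adjacent columns, and is finally sealed by a covering black run on the right. The names `vertical_runs`, `gaps`, `covers`, `overlaps`, `has_hole`, and `to_block` are hypothetical, and irregular shapes may fool the heuristic.

```python
from typing import List, Tuple

Block = List[List[int]]  # 1 = black pixel, 0 = white pixel
Run = Tuple[int, int]    # (top_row, length) of a vertical black run
Gap = Tuple[int, int]    # (top_row, bottom_row_exclusive) of a white gap


def to_block(rows: List[str]) -> Block:
    """Convert rows of '#' (black) and '.' (white) into a pixel block."""
    return [[1 if c == "#" else 0 for c in row] for row in rows]


def vertical_runs(block: Block, x: int) -> List[Run]:
    """Return (top_row, length) for each vertical black run in column x."""
    runs, top = [], None
    height = len(block)
    for y in range(height + 1):
        black = y < height and block[y][x] == 1
        if black and top is None:
            top = y
        elif not black and top is not None:
            runs.append((top, y - top))
            top = None
    return runs


def gaps(runs: List[Run]) -> List[Gap]:
    """White intervals enclosed between consecutive black runs of one column."""
    return [(t1 + l1, t2) for (t1, l1), (t2, _) in zip(runs, runs[1:])]


def covers(runs: List[Run], gap: Gap) -> bool:
    """True if a single black run spans every row of the gap."""
    return any(t <= gap[0] and t + l >= gap[1] for t, l in runs)


def overlaps(a: Gap, b: Gap) -> bool:
    """True if two row intervals share at least one row."""
    return a[0] < b[1] and b[0] < a[1]


def has_hole(block: Block) -> bool:
    """Heuristic hole test based only on vertical black-run positions and lengths."""
    open_gaps: List[Gap] = []   # gaps already closed on their left side
    prev_runs: List[Run] = []
    for x in range(len(block[0])):
        runs = vertical_runs(block, x)
        # A tracked gap is sealed when a black run in this column covers it.
        if any(covers(runs, g) for g in open_gaps):
            return True
        next_open = []
        for g in gaps(runs):
            started = covers(prev_runs, g)
            continued = any(overlaps(g, og) for og in open_gaps)
            if started or continued:
                next_open.append(g)
        open_gaps, prev_runs = next_open, runs
    return False
```

With this rule a closed shape such as "o" reports a hole while an open shape such as "c" does not, which is the kind of distinction a cutting step could use to avoid slicing through a closed loop.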
KR1019930010895A 1993-06-15 1993-06-15 How to Separate Individual Characters in English Strings KR100286709B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019930010895A KR100286709B1 (en) 1993-06-15 1993-06-15 How to Separate Individual Characters in English Strings

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019930010895A KR100286709B1 (en) 1993-06-15 1993-06-15 How to Separate Individual Characters in English Strings

Publications (2)

Publication Number Publication Date
KR950001553A (en) 1995-01-03
KR100286709B1 (en) 2001-04-16

Family

ID=37514849

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019930010895A KR100286709B1 (en) 1993-06-15 1993-06-15 How to Separate Individual Characters in English Strings

Country Status (1)

Country Link
KR (1) KR100286709B1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110087620A (en) * 2010-01-26 2011-08-03 광주과학기술원 Layout-based print media page recognition method

Also Published As

Publication number Publication date
KR100286709B1 (en) 2001-04-16

Similar Documents

Publication Publication Date Title
KR890017630A (en) Character recognition device and method
Kavallieratou et al. An integrated system for handwritten document image processing
Lam et al. Reading newspaper text
KR950001553A (en) How to Separate Individual Characters in English Strings
Lehal et al. Text segmentation of machine-printed Gurmukhi script
KR0186172B1 (en) Character recognition apparatus
KR960011775A (en) How to separate contact character of character recognition device
JP3457094B2 (en) Character recognition device and character recognition method
KR100480024B1 (en) Collection Recognition Method Using Stroke Thickness Information
KR970002740A (en) Contact Character Separation and Feature Extraction Method of Character Recognition Device
KR960002072A (en) Contact Character Separation Method of English Recognition System
JPH0797390B2 (en) Character recognition device
KR960042444A (en) Recognition Processing Device for Automobile License Plate Using Multi-Mask Technique and Its Method
KR930014166A (en) Individual Character Cutting Method of Document Recognition Device
Bodduluri et al. A novel way of identifying telugu, tamil and english scripts by priority check using discerning features
JPS60110089A (en) Character recognizer
JPH0578068B2 (en)
Ariki et al. Extraction and Recognition of Open Captions Superimposed on TV News Articles
JPH06231306A (en) Character recognition device
Green et al. Layout analysis of book pages
JP3116452B2 (en) English character recognition device
Kore et al. TEXT AND AUDIO TRANSLATION OF TEXT FROM SIGNBOARD IMAGES-REVIEW
KR920001384A (en) How to separate individual characters
Aparna et al. Bilingual (Tamil–Roman) Text Recognition on Windows
Gootla Integration of Telugu dictionary into Tesseract OCR

Legal Events

Date Code Title Description
PA0109 Patent application
    Patent event code: PA01091R01D
    Comment text: Patent Application
    Patent event date: 19930615
PG1501 Laying open of application
A201 Request for examination
PA0201 Request for examination
    Patent event code: PA02012R01D
    Patent event date: 19980317
    Comment text: Request for Examination of Application
    Patent event code: PA02011R01I
    Patent event date: 19930615
    Comment text: Patent Application
E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection
    Comment text: Notification of reason for refusal
    Patent event date: 20000330
    Patent event code: PE09021S01D
E701 Decision to grant or registration of patent right
PE0701 Decision of registration
    Patent event code: PE07011S01D
    Comment text: Decision to Grant Registration
    Patent event date: 20001227
GRNT Written decision to grant
PR0701 Registration of establishment
    Comment text: Registration of Establishment
    Patent event date: 20010116
    Patent event code: PR07011E01D
PR1002 Payment of registration fee
    Payment date: 20010117
    End annual number: 3
    Start annual number: 1
PG1601 Publication of registration
PR1001 Payment of annual fee
    Payment date: 20031229
    Start annual number: 4
    End annual number: 4
PR1001 Payment of annual fee
    Payment date: 20041221
    Start annual number: 5
    End annual number: 5
PR1001 Payment of annual fee
    Payment date: 20051201
    Start annual number: 6
    End annual number: 6
FPAY Annual fee payment
    Payment date: 20061220
    Year of fee payment: 7
PR1001 Payment of annual fee
    Payment date: 20061220
    Start annual number: 7
    End annual number: 7
LAPS Lapse due to unpaid annual fee
PC1903 Unpaid annual fee
    Termination category: Default of registration fee
    Termination date: 20081210