KR101566013B1

KR101566013B1 - Method and system for improvement of the pronunciation accuracy and expressiveness during language beginners reading an e-book aloud

Info

Publication number: KR101566013B1
Application number: KR1020130070493A
Authority: KR
Inventors: 신대진
Original assignee: 주식회사 이드웨어
Priority date: 2013-06-19
Filing date: 2013-06-19
Publication date: 2015-11-05
Anticipated expiration: 2033-06-19
Also published as: KR20140147378A

Abstract

The present invention aims to improve the accuracy and expressiveness of a language beginner's utterances when reading an e-book aloud. More specifically, it relates to a method and system in which a speech recognition module is linked to an e-book implementation module so that the beginner's sentence utterances are recognized and the speech signal of each recognized utterance is analyzed. A response signal tells the beginner how accurate the utterance was, which encourages repeated, focused attempts until a set utterance accuracy is reached, so that sentence-speaking practice and improved expressiveness follow naturally.

Description

Method and system for providing improved utterance accuracy and expressiveness through reading an e-book aloud

The present invention aims to improve the accuracy and expressiveness of a language beginner's utterances when reading an e-book aloud. It relates to a method and system for providing improved utterance accuracy and expressiveness through reading an e-book aloud, in which a speech recognition module is linked to an e-book implementation module so that the beginner's sentence utterances are recognized and the speech signal of each recognized utterance is analyzed. A response signal tells the beginner how accurate the utterance was, which encourages repeated, focused attempts until a set utterance accuracy is reached, so that sentence-speaking practice and improved expressiveness follow naturally.

Digital devices capable of running e-books, such as dedicated e-book readers, smartphones, and tablet computers, have recently become widespread, and the number of users who read e-books on such devices is steadily increasing.

From the standpoint of a language beginner, reading an e-book on a digital device can serve both as reading to grasp the meaning of the text and as practice in expressing the language. E-books to date, however, offer no means of measuring and evaluating a beginner's sentence pronunciation, so repeated reading lets the learner pick up the meaning of sentences but not proper intonation and pronunciation.

E-books that recognize a language beginner's sentence pronunciation and simply react to it have been proposed. They provide no way, however, to extract usage statistics on sentence utterances (which sentences in the e-book were read, which were not, which were pronounced properly, and which were not) or to analyze in detail and present data that helps language learning, such as the beginner's utterance accuracy. E-book reading has therefore not been effective as language learning, in the sense of acquiring both the meaning of the language and the ability to express it.

Korean Patent Application Publication No. 2012-0042298

The present invention was devised to solve the problems of the prior art described above. Its object is to provide a method and system for providing improved utterance accuracy and expressiveness through reading an e-book aloud, which apply a read-aloud speech recognition module to an e-book implementation module in order to measure and evaluate a language beginner's utterances of e-book sentences or words, and which can analyze statistical data that helps language learning, such as the beginner's utterance accuracy.

To solve the above problem, the system of the present invention for providing improved utterance accuracy and expressiveness through reading an e-book aloud comprises: e-book content that includes e-book text rendered in the e-book, an element indicating that an utterance is in progress while the user utters the text, an element indicating that utterance recognition is complete, and an element indicating the level of utterance recognition accuracy; an e-book implementation module installed on a digital device that can input and output audio and video signals and has data storage and processing functions, the module rendering the e-book content on the digital device; and a read-aloud speech recognition module installed on the digital device which, working with the e-book implementation module, receives the speech signal produced when the user utters the text and uses it to analyze and evaluate the user's utterance accuracy.

The method of the present invention for providing improved utterance accuracy and expressiveness through reading an e-book aloud comprises: an input step of receiving the utterance speech signal of a sentence or word of the e-book uttered by the user; an analysis and determination step in which the read-aloud speech recognition module analyzes and determines the utterance accuracy of the utterance speech signal; a storage step of passing to the control unit whether a response is to be given according to the utterance accuracy, and storing the utterance accuracy data in the statistics database; a command step in which the control unit calls and outputs the content corresponding to the utterance accuracy from the content database and commands output of a completed-utterance sentence display; and a display step in which the e-book implementation module displays the uttered-sentence display, the utterance accuracy display, and the utterance-related statistics display on the digital device.

As described above, the method and system of the present invention measure and evaluate sentence or word utterances simply through the language beginner's reading of an e-book aloud, which helps improve utterance accuracy. Through repeated utterance aimed at improving that accuracy, they help improve the beginner's utterance accuracy and expressiveness in terms of the effectiveness of language learning, that is, acquiring both the meaning of the language and the ability to express it.

Furthermore, by providing utterance statistics extracted from the beginner's accumulated utterance data, the invention lets the beginner see how his or her current sentence utterance accuracy is improving. This leads to continuous, focused utterance practice and thus naturally encourages sentence-speaking practice and improved expressiveness.

FIG. 1 is a schematic diagram showing the configuration of a system for providing improved utterance accuracy and expressiveness through reading an e-book aloud according to the present invention.
FIG. 2 is a schematic diagram showing the configuration of the read-aloud speech recognition module, which is one component of the present invention.
FIG. 3 is a diagram showing an embodiment of an e-book implemented on a digital device by the present invention.
FIG. 4 is a schematic flowchart illustrating a method for providing improved utterance accuracy and expressiveness through reading an e-book aloud according to the present invention.

Embodiments of the present invention are described below in more detail with reference to the accompanying drawings.

FIG. 1 is a schematic diagram showing the configuration of a system for providing improved utterance accuracy and expressiveness through reading an e-book aloud according to the present invention, and FIG. 2 is a schematic diagram showing the configuration of the read-aloud speech recognition module, which is one component of the present invention.

As shown in FIG. 1, the present invention comprises a voice input unit 110, a read-aloud speech recognition module 120, a control unit 130, an e-book implementation module 140, a statistics database 150, a content database 160, and an output unit 170.

The voice input unit 110 refers to an input module, such as a microphone, that is provided in the digital device on which the system of the invention can be implemented and that allows voice and sound to be input. It receives the utterance speech signal produced when a language beginner (hereinafter "the user") reads the e-book aloud.

The digital device may be any of a dedicated e-book reader, a smartphone, or a tablet PC on which an e-book application can run through the system of the invention, but is not limited to these.

The read-aloud speech recognition module 120 receives the utterance speech signal from the voice input unit 110, analyzes it to calculate the utterance accuracy, and determines the level of that accuracy.

The read-aloud speech recognition module 120 also recognizes the user's sentence utterance for the e-book and processes the recognized utterance speech signal to determine whether the uttered sentence is contained in the e-book and, if so, which part of which sentence it is.

The read-aloud speech recognition module 120 identifies the sentence by comparing the utterance against the e-book content stored in advance in the content database 160, which is described below.

Specifically, referring to FIG. 2, the read-aloud speech recognition module 120 comprises a voice input processing unit 121, an endpoint detection unit 122, a phoneme separation unit 123, a per-phoneme scoring unit 124, a confidence score determination unit 125, an endpoint re-detection unit 125a, a rejection processing unit 125b, a quantitative evaluation unit 125c, and an utterance score determination unit 126.

The voice input processing unit 121 receives the user's utterance speech signal from the voice input unit 110. It converts the utterance speech signal data input over a certain period into data in speech feature vector form and sends it to the endpoint detection unit 122.

That is, the voice input processing unit 121 converts the speech signal data of the user's utterance into speech feature vector data, which is the form used by the submodules inside the read-aloud speech recognition module 120, and sends the converted data to the endpoint detection unit 122.

Here, speech feature vector data means the data obtained by dividing the input speech signal data at fixed intervals and converting each interval. The interval is a unit of time such as 10 msec, 16 msec, or 20 msec; the data is cut into pieces of that length, and the speech feature vector values contained in each piece are extracted. For example, if an 8 kHz signal is processed at 10 msec intervals, each 10 msec interval contains 80 data samples, and these 80 samples are extracted as one speech feature vector.
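The framing arithmetic can be illustrated with a minimal Python sketch. The function names are hypothetical, and because the specification does not say which features are extracted from each interval, the mean squared amplitude below is only a placeholder feature chosen for illustration.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], sample_rate_hz: int,
                      frame_ms: int) -> List[List[float]]:
    """Split a signal into consecutive, non-overlapping frames of frame_ms milliseconds."""
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # A trailing partial frame is dropped so that every frame has the same length.
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len]) for i in range(n_frames)]


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame (placeholder feature, not from the patent)."""
    return sum(x * x for x in frame) / len(frame)


# 0.1 s of silence sampled at 8 kHz, cut into 10 msec frames.
frames = split_into_frames([0.0] * 800, sample_rate_hz=8000, frame_ms=10)
print(len(frames), len(frames[0]))  # 10 80
```

With an 8 kHz signal and a 10 msec interval, each frame holds 8000 × 0.010 = 80 samples, which matches the example in the text.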

The endpoint detection unit 122 extracts the start point and end point, that is, the valid part, of the utterance speech signal that the voice input processing unit 121 converted into speech feature vector data. It sends the speech feature vector data of the valid part between the start and end points to the phoneme separation unit 123.

In other words, the endpoint detection unit 122 detects the start and end points of the speech feature vector data sent by the voice input processing unit 121; an EPD (end-point detection) module is one example. It sends the speech feature vector data of the valid part between the detected start and end points to the phoneme separation unit 123.
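The specification does not state how the EPD module locates the start and end points. As one possible illustration only, the sketch below applies a simple energy threshold to per-frame energies; the function name, the threshold, and the use of energy at all are assumptions rather than the patented method.

```python
from typing import List, Optional, Tuple


def detect_endpoints(frame_energies: List[float],
                     threshold: float) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the first and last frame whose energy
    reaches the threshold, or None if no frame does (hypothetical criterion)."""
    active = [i for i, energy in enumerate(frame_energies) if energy >= threshold]
    if not active:
        return None  # nothing that looks like speech
    return active[0], active[-1]


energies = [0.0, 0.1, 0.9, 0.8, 0.05, 0.7, 0.0]
print(detect_endpoints(energies, threshold=0.5))  # (2, 5)
```

The quiet frame at index 4 stays inside the valid span because only the outermost active frames define the endpoints, so short pauses within a sentence are kept.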

The endpoint detection unit 122 also re-detects the start and end points. To do so, it takes the speech feature vector data that the endpoint re-detection unit 125a, described below, sends back for re-detection and joins it with the speech feature vector data arriving from the voice input processing unit 121.

The phoneme separation unit 123 compares the values of the valid speech feature vector data between the start and end points detected by the endpoint detection unit 122 with a reference phoneme model. For each interval judged to be a phoneme, it records the start address and end address of the speech feature vector data in that interval, and it records the number of such intervals. As a result, the speech feature vector data is divided into phoneme units (a phoneme being the smallest phonetic unit). The speech feature vector data separated by phoneme is sent to the per-phoneme scoring unit 124.
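The bookkeeping described here (a start address, an end address, and a count of intervals) can be sketched as follows. The sketch assumes that the comparison with the reference phoneme model has already produced one label per frame; that labelling step, the `PhonemeSegment` type, and the function name are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class PhonemeSegment:
    label: str
    start: int  # index of the first frame in the segment
    end: int    # index of the last frame in the segment (inclusive)


def segment_by_label(frame_labels: List[str]) -> List[PhonemeSegment]:
    """Group consecutive frames with the same label into segments."""
    segments: List[PhonemeSegment] = []
    index = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append(PhonemeSegment(label, index, index + length - 1))
        index += length
    return segments


segments = segment_by_label(["h", "h", "e", "e", "e", "l", "o"])
print(len(segments))  # 4
print(segments[1])    # PhonemeSegment(label='e', start=2, end=4)
```

The length of the returned list corresponds to the recorded number of phoneme intervals.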

The per-phoneme scoring unit 124 compares the speech feature vector data separated into phoneme units by the phoneme separation unit 123 with a reference phoneme model for scoring and calculates a score. It then sends the speech feature vector data and the matching per-phoneme score data to the confidence score determination unit 125.

The confidence score determination unit 125 receives the speech feature vector data and the matching per-phoneme score data from the per-phoneme scoring unit 124. If the average of the per-phoneme score data falls within the confidence range of the reference phoneme model, it sends the speech feature vector data to the quantitative evaluation unit 125c.

If the average of the matched per-phoneme score data falls short of the confidence range of the reference phoneme model, the confidence score determination unit 125 sends the data back to the endpoint detection unit 122 described above.

If the average of the matched per-phoneme score data falls markedly short of the confidence range of the reference phoneme model, the confidence score determination unit 125 sends the data to the rejection processing unit 125b.

In other words, data within the confidence range is judged to be a properly separated phoneme, data lacking confidence is judged to be an improperly separated phoneme, and data markedly lacking confidence is judged not to be a phoneme at all.

The confidence range used by the confidence score determination unit 125 can be derived from statistics such as the average of the per-phoneme scores.
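The three-way decision made by the confidence score determination unit 125 can be sketched as a comparison of the average per-phoneme score against two thresholds. The specification gives no numeric limits, so `accept_min` and `reject_below` are hypothetical parameters, as are the enum and function names.

```python
from enum import Enum
from statistics import mean
from typing import List


class Route(Enum):
    EVALUATE = "quantitative evaluation unit 125c"
    REDETECT = "endpoint re-detection unit 125a"
    REJECT = "rejection processing unit 125b"


def route_by_confidence(phoneme_scores: List[float], accept_min: float,
                        reject_below: float) -> Route:
    """Route data by the average per-phoneme score (thresholds are assumptions)."""
    if not phoneme_scores:
        raise ValueError("at least one phoneme score is required")
    if reject_below > accept_min:
        raise ValueError("reject_below must not exceed accept_min")
    average = mean(phoneme_scores)
    if average >= accept_min:
        return Route.EVALUATE  # within the confidence range
    if average < reject_below:
        return Route.REJECT    # markedly lacking: judged not to be a phoneme
    return Route.REDETECT      # lacking: endpoints are detected again


print(route_by_confidence([0.9, 0.8], accept_min=0.7, reject_below=0.3).name)  # EVALUATE
print(route_by_confidence([0.5, 0.4], accept_min=0.7, reject_below=0.3).name)  # REDETECT
print(route_by_confidence([0.1, 0.2], accept_min=0.7, reject_below=0.3).name)  # REJECT
```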

The endpoint re-detection unit 125a handles speech feature vector data that the confidence score determination unit 125 found lacking in confidence (judged to be improperly separated phonemes). So that the start and end points can be detected again, it removes the data recorded by the phoneme separation unit 123 and the per-phoneme scoring unit 124 and sends the speech feature vector data to the endpoint detection unit 122.

The rejection processing unit 125b ends processing of speech feature vector data that the confidence score determination unit 125 found markedly lacking in confidence (judged not to be a phoneme).

The quantitative evaluation unit 125c compares the speech feature vector data within the confidence range, sent by the confidence score determination unit 125, with an utterance evaluation model built from statistical data of utterance evaluations made by experts. From this comparison it calculates a quantitative utterance accuracy value for the uttered text.

Finally, the utterance score determination unit 126 converts the speech feature vector data and quantitative utterance accuracy value sent by the quantitative evaluation unit 125c into an utterance score defined according to the criteria of the application in use.
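How an application defines its utterance score is left open in the specification. As a sketch under that assumption, the example below maps a 0 to 100 accuracy value onto a count of cutoffs reached (for instance, a star rating); the cutoff values and the function name are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def to_utterance_score(accuracy: float, cutoffs: Sequence[float]) -> int:
    """Return how many cutoffs the accuracy meets or exceeds.

    cutoffs must be sorted in ascending order; the values are application-specific.
    """
    return bisect_right(cutoffs, accuracy)


star_cutoffs = [60.0, 80.0, 95.0]  # hypothetical thresholds for 1, 2 and 3 stars
print(to_utterance_score(59.0, star_cutoffs))  # 0
print(to_utterance_score(85.0, star_cutoffs))  # 2
print(to_utterance_score(95.0, star_cutoffs))  # 3
```

Keeping the cutoffs as data rather than code lets each application supply its own criteria without changing the conversion step.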

The utterance accuracy values, utterance scores, and the like calculated in this way are stored in the statistics database 150 by the processing of the read-aloud speech recognition module 120. Under the control of the control unit 130, they are also passed to the e-book implementation module 140 and shown in the e-book, so that the user can check the accuracy of his or her utterance.

The e-book implementation module 140 is a kind of application software (app) that renders an e-book on the digital device. It displays on the digital device the e-book text and the basic images that make up the e-book, both of which are stored in the content database 160.

Linked by the control unit 130 to the read-aloud speech recognition module 120, the e-book implementation module 140 makes the digital device respond in various ways as the user reads the e-book aloud.

That is, the e-book implementation module 140 calls the content corresponding to the speech recognition result from the content database 160. The content database stores various content made up of elements that indicate that an utterance is in progress while the user is uttering text, elements that indicate that utterance recognition is complete, and elements that indicate whether the utterance accuracy is at or above, or below, a certain level. The module then renders the e-book so that the elements corresponding to the utterance are reflected.

This call function of the e-book implementation module 140 is in practice carried out by the control unit 130, which links the components.

The statistics database 150 stores statistical data such as the sentences of the e-book text for which utterance recognition has been completed and their number, the utterance accuracy of each such sentence as determined by the read-aloud speech recognition module 120, and the date and time at which recognition of each sentence was completed. At the request of the control unit 130, it retrieves and outputs the corresponding data.

The statistics database 150 may also contain accumulated data on the user's pronunciation history. Accumulated data here means the cumulative results of the read-aloud speech recognition module 120's analysis of the user's utterances. It can be shown on the digital device through the e-book implementation module 140, and from it the improvement in the user's learning ability, including utterance accuracy, can be inferred.

In particular, the statistics database 150 stores statistical data not only for the user but for all existing users who have read the e-book aloud. It can therefore provide the average utterance accuracy, by number of recognized sentences, across all users who read the e-book, as well as statistics for a top group within a specific range of utterance accuracy by number of recognized sentences. It can also display the user's own utterance accuracy statistics by number of recognized sentences.

Based on the statistical data of all users, the control unit 130 calculates each statistic over time periods including daily, weekly, and monthly, stores these periodic statistics back in the statistics database 150, and has them displayed to the user in the e-book.
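The periodic aggregation can be sketched as grouping stored records by a period key and averaging each group. The record layout (a completion timestamp and an accuracy value), the use of ISO week numbers for the weekly period, and all names below are assumptions made for illustration; the specification does not define a schema.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple

# (recognition completion time, utterance accuracy): a hypothetical record layout
Record = Tuple[datetime, float]


def period_key(timestamp: datetime, period: str) -> str:
    """Build the grouping key for a daily, weekly (ISO week) or monthly period."""
    if period == "daily":
        return timestamp.strftime("%Y-%m-%d")
    if period == "weekly":
        year, week, _ = timestamp.isocalendar()
        return f"{year}-W{week:02d}"
    if period == "monthly":
        return timestamp.strftime("%Y-%m")
    raise ValueError(f"unknown period: {period}")


def average_accuracy(records: List[Record], period: str) -> Dict[str, float]:
    """Average the utterance accuracy of the records within each period."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for timestamp, accuracy in records:
        buckets[period_key(timestamp, period)].append(accuracy)
    return {key: mean(values) for key, values in buckets.items()}


records = [
    (datetime(2013, 6, 19, 10, 0), 80.0),
    (datetime(2013, 6, 19, 11, 0), 90.0),
    (datetime(2013, 6, 20, 9, 0), 70.0),
]
print(average_accuracy(records, "daily"))    # {'2013-06-19': 85.0, '2013-06-20': 70.0}
print(average_accuracy(records, "monthly"))  # {'2013-06': 80.0}
```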

The user can thus easily check not only the statistics on his or her own pronunciation accuracy but also those of other users, and can compare his or her level of pronunciation accuracy with theirs.

The content database 160 stores the text and basic images that make up the e-book and the elements shown on the digital device, for example elements that indicate that an utterance is in progress while the user is uttering text, elements that indicate that utterance recognition is complete, and elements that indicate whether the utterance accuracy is at or above, or below, a certain level. The content database 160 also stores content such as video and sound effects that provide visual and auditory effects while the user reads the e-book aloud.

The content database 160 is likewise linked to the e-book implementation module 140. At the request of the control unit 130, it retrieves and outputs the corresponding content, which is then shown together with the rendered e-book.

The output unit 170 outputs the sound effects, images, video, and the like of the digital device, and refers to output modules such as the device's speaker and display. It provides one or a combination of a sound effect, an image, and a video corresponding to the speech signal input through the microphone of the voice input unit 110; the output data is data extracted from the content database 160.

The control unit 130 controls the linkage between the read-aloud speech recognition module 120 and the e-book implementation module 140, which are the elements that implement the invention. It also controls the linkage between the e-book implementation module 140 and the statistics database 150 and between the e-book implementation module 140 and the content database 160.

In addition to linking the components, the control unit 130 calls and outputs the content corresponding to the utterance accuracy from the content database 160 and issues commands to output the completed-utterance sentence display, the utterance accuracy display, and the utterance-related statistics display, so that these are rendered in the e-book, together with other content, through the e-book implementation module 140.

The linking function of the control unit 130 may be triggered by a request resulting from the user's operation, or by a preset automatic signal generated by data processing between the components.

FIG. 3 shows an embodiment of an e-book implemented on a digital device by the present invention.

As shown in FIG. 3, in the e-book rendered by the e-book implementation module 140, once the user begins uttering a sentence of the e-book and the utterance proceeds, the text awaiting recognition 311 changes in appearance, up to the part already recognized, into recognized text 322.

An indicator 323 marks the end point of the recognized part so that the user can see the position of the recognized text; the language beginner is thus made aware of the part already uttered.
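The display behaviour for the recognized text 322, the remaining text 311, and the indicator 323 can be sketched as a split of the sentence at the last recognized word. Tracking progress as a word count is an assumption; the specification does not say at what granularity the recognized position is kept, and the function name is hypothetical.

```python
from typing import List, Tuple


def split_at_recognized(words: List[str],
                        recognized_count: int) -> Tuple[List[str], List[str]]:
    """Split a sentence into (recognized words, words still awaiting recognition).

    The boundary between the two lists is where an indicator would be drawn.
    """
    count = max(0, min(recognized_count, len(words)))  # clamp to a valid range
    return words[:count], words[count:]


done, pending = split_at_recognized(["The", "cat", "sat", "down"], 2)
print(done)     # ['The', 'cat']
print(pending)  # ['sat', 'down']
```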

In particular, the output unit 170 described above outputs through the speaker, as images, and so on. For example, as shown in FIG. 3, response outputs tied to the text in progress, such as a response image linked to the utterance recognition score 324 and a response sound linked to the utterance recognition score 325, let the language beginner perceive, in an easier and more engaging way, that the utterance accuracy has reached a certain score or higher. Reference numeral 321 in FIG. 3, not otherwise described here, is an utterance input volume indicator.

In addition, a history bar 326 showing the utterance recognition status may be implemented at the bottom of the electronic book, so that the language beginner can see the flow of utterance accuracy over all or part of the electronic book currently being read aloud.

As described above, according to the present invention, while uttering the electronic book text rendered by the electronic book implementation module 140, the user can recognize his or her own text utterance accuracy in real time through the system and method flow of the present invention. Although not shown in the drawings, at the user's request the electronic book implementation module 140 provides, in conjunction with the statistics database 150, an e-book utterance statistics page through which the user can recognize the relative position of his or her utterance accuracy (compared with the accumulated data of all existing users) and its absolute position (the user's own cumulative number of recognition-completed sentences and average utterance accuracy).

FIG. 4 is a schematic flowchart illustrating a method for improving utterance accuracy and expressiveness through reading an electronic book aloud according to the present invention. The method is described below with reference to FIG. 4.

First, an utterance voice signal of a sentence or word of the electronic book uttered by the user is received. When the user reads a sentence of the electronic book aloud through the digital device, the voice input unit 110 receives the user's utterance voice signal and transmits it to the read-aloud speech recognition module 120 (S100).

Next, the read-aloud speech recognition module 120, having received the utterance voice signal, processes it to analyze and determine the utterance accuracy. This analysis and determination is carried out by the configuration of the read-aloud speech recognition module 120 described above, so a detailed description is omitted here (S200).

The utterance accuracy data resulting from the user's e-book utterances is stored in the statistics database 150 and accumulated over time, so that it can serve as basic information for providing the user's utterance statistics and letting the user recognize how sentence utterance accuracy is improving (S300).

In addition, the control unit 130 calls content corresponding to the utterance accuracy from the content database 160 and issues a command to output an indication of the uttered sentence, an indication of the utterance accuracy, and utterance-related statistics, so that these are rendered in the electronic book together with other content through the electronic book implementation module 140 (S400).

Thereafter, in response to the command of the control unit 130, the electronic book implementation module 140 renders the electronic book together with the uttered-sentence indication, the utterance accuracy indication, and the utterance-related statistics (S500).
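
The flow of steps S100 to S500 can be pictured with the following minimal sketch. It is an illustration only, not the implementation of the invention: the class `ReadingSession`, the callable `score_fn` (standing in for the read-aloud speech recognition module 120), the in-memory list `stats_db` (standing in for the statistics database 150), the 0 to 100 score scale, the threshold of 70, and the reaction labels are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReadingSession:
    # Hypothetical stand-in for module 120: maps (audio, sentence) to an accuracy score.
    score_fn: Callable[[bytes, str], float]
    # Hypothetical stand-in for the statistics database 150.
    stats_db: List[Dict] = field(default_factory=list)

    def handle_utterance(self, audio: bytes, sentence: str,
                         threshold: float = 70.0) -> Dict:
        # S100-S200: receive the voice signal and determine the utterance accuracy.
        accuracy = self.score_fn(audio, sentence)
        # S300: store the accuracy so that statistics can accumulate.
        self.stats_db.append({"sentence": sentence, "accuracy": accuracy})
        # S400: choose the reaction content that matches the accuracy (assumed labels).
        reaction = "praise" if accuracy >= threshold else "retry"
        # S500: return what the e-book view would render.
        return {"sentence": sentence, "accuracy": accuracy, "reaction": reaction}


session = ReadingSession(score_fn=lambda audio, sentence: 85.0)
result = session.handle_utterance(b"", "Hello")
```

Passing the scorer in as a callable keeps the sketch independent of any particular recognizer, which the description deliberately leaves to module 120.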

Here, the display process on the electronic book, including the uttered-sentence indication, the utterance accuracy indication, and the utterance-related statistics, may display sentences or words that have already been recognized, sentences or words that have not yet been recognized, sentences or words recognized at or above a certain utterance accuracy, and sentences or words recognized below that utterance accuracy.

This is achieved by the interworking of the read-aloud speech recognition module 120 and the electronic book implementation module 140 described above, and is presented on the electronic book by calling and outputting the various elements pre-stored in the content database 160.
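
As a rough illustration of the display categories just described, the sketch below maps a sentence to one of three display states. The names `DisplayState`, `Sentence`, and `display_state`, the 0 to 100 score scale, and the default threshold of 70 are assumptions made for this example; the specification only speaks of "a certain utterance accuracy".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DisplayState(Enum):
    NOT_YET_RECOGNIZED = "not_yet_recognized"
    RECOGNIZED_AT_OR_ABOVE = "recognized_at_or_above_threshold"
    RECOGNIZED_BELOW = "recognized_below_threshold"


@dataclass
class Sentence:
    text: str
    accuracy: Optional[float] = None  # None means the sentence is not yet recognized


def display_state(sentence: Sentence, threshold: float = 70.0) -> DisplayState:
    """Pick the display state for one sentence (threshold is an assumed value)."""
    if sentence.accuracy is None:
        return DisplayState.NOT_YET_RECOGNIZED
    if sentence.accuracy >= threshold:
        return DisplayState.RECOGNIZED_AT_OR_ABOVE
    return DisplayState.RECOGNIZED_BELOW
```

Any sentence whose state is not `NOT_YET_RECOGNIZED` also belongs to the "already recognized" category, so the four displays in the text reduce to these three mutually exclusive states.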

In particular, when displaying a sentence or word that has already been recognized, the phoneme-level uttered sentence recognized by the read-aloud speech recognition module 120 and scored by quantitative evaluation is compared with the sentence in the content database 160, and the phoneme-level characters of the recognized sentence are output in a form different from the characters not yet recognized. Through this function of the read-aloud speech recognition module 120, the user can recognize the utterance position within the sentence or word he or she has uttered.
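
The idea of showing recognized characters in a different form, with an indicator at the end of the recognized portion, might be sketched as follows. This is a simplified illustration: it assumes the recognizer has already reported how many characters of the stored sentence were matched (`recognized_chars`), and it uses square brackets and a vertical bar as plain-text stand-ins for the changed text form and the indicator. The function names are hypothetical.

```python
from typing import Tuple


def split_at_recognized(sentence: str, recognized_chars: int) -> Tuple[str, str]:
    """Split the stored sentence into recognized and not-yet-recognized parts."""
    # Clamp so that out-of-range counts cannot break the split.
    n = max(0, min(recognized_chars, len(sentence)))
    return sentence[:n], sentence[n:]


def render(sentence: str, recognized_chars: int) -> str:
    """Bracket the recognized part and place '|' where the indicator would sit."""
    done, pending = split_at_recognized(sentence, recognized_chars)
    return f"[{done}]|{pending}"


print(render("The cat sat", 7))  # prints "[The cat]| sat"
```

A real e-book view would change the font color or weight instead of inserting brackets; the split point is the only information the renderer needs.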

According to an embodiment of the present invention, the method may further include a step of displaying the utterance statistics of the user together with those of other existing users when the user utters a sentence or word contained in the e-book story.

In this statistics display step, the statistics are calculated from the user's statistical data and the existing users' statistical data pre-stored in the statistics database 150 described above. The step may display the average utterance accuracy, by number of recognition-completed sentences, of all existing users of the e-book, as well as the statistics of the top group within a specific range of utterance accuracy, by number of recognition-completed sentences, among all existing users of the e-book.
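
One possible reading of these two statistics is sketched below. The specification does not say how the "top group within a specific range" is selected, so the sketch assumes it means the top fraction of users ranked by their own average accuracy; `top_fraction`, the function names, and the sample data are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def overall_average(user_scores: Dict[str, List[float]]) -> float:
    """Average accuracy over every recognition-completed sentence of every user."""
    all_scores = [s for scores in user_scores.values() for s in scores]
    return mean(all_scores) if all_scores else 0.0


def top_group_average(user_scores: Dict[str, List[float]],
                      top_fraction: float = 0.1) -> float:
    """Average of the per-user averages within the assumed top fraction of users."""
    per_user = sorted((mean(s) for s in user_scores.values() if s), reverse=True)
    if not per_user:
        return 0.0
    k = max(1, int(len(per_user) * top_fraction))  # always keep at least one user
    return mean(per_user[:k])


users = {"a": [80.0, 90.0], "b": [60.0], "c": [100.0, 100.0]}
```

With these sample scores, `overall_average(users)` is 86.0 and `top_group_average(users, 0.34)` is 100.0, since only user "c" falls in the top third.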

In addition, utterance accuracy statistics by the user's number of recognition-completed sentences are displayed, and at the user's request these statistics are displayed over time periods including daily, weekly, and monthly, providing a variety of statistical data.
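
Grouping stored accuracy records by day, week, or month could look like the sketch below. The record layout (a completion date paired with an accuracy value), the use of ISO week numbering, and the function names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple


def period_key(day: date, period: str) -> str:
    """Build a grouping key for the requested period."""
    if period == "day":
        return day.isoformat()
    if period == "week":
        iso_year, iso_week, _ = day.isocalendar()  # ISO 8601 week numbering
        return f"{iso_year}-W{iso_week:02d}"
    if period == "month":
        return f"{day.year}-{day.month:02d}"
    raise ValueError(f"unknown period: {period}")


def average_by_period(records: List[Tuple[date, float]],
                      period: str) -> Dict[str, float]:
    """Average utterance accuracy per day, week, or month."""
    buckets = defaultdict(list)
    for day, accuracy in records:
        buckets[period_key(day, period)].append(accuracy)
    return {key: mean(values) for key, values in buckets.items()}


records = [(date(2013, 6, 17), 80.0), (date(2013, 6, 19), 90.0),
           (date(2013, 7, 1), 70.0)]
monthly = average_by_period(records, "month")
```

For the sample records, `monthly` maps "2013-06" to 85.0 and "2013-07" to 70.0.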

Here, in displaying the utterance accuracy statistics by the user's number of recognition-completed sentences, the user's own accumulated utterance accuracy at or above a certain value, average utterance accuracy, and cumulative number of recognition-completed sentences are displayed, based on the data on the utterance accuracy of the uttered sentences recognized by the read-aloud speech recognition module 120 and scored by quantitative evaluation and the data on the number of recognition-completed sentences.
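
The three per-user figures named above can be derived from the stored accuracy values roughly as follows. The sketch interprets "accumulated utterance accuracy at or above a certain value" as a count of sentences that met an assumed threshold; that interpretation, the threshold of 70, and the names `UserSummary` and `summarize` are not taken from the specification.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class UserSummary:
    completed_sentences: int    # cumulative number of recognition-completed sentences
    sentences_at_or_above: int  # how many of them met the assumed threshold
    average_accuracy: float     # mean accuracy over all completed sentences


def summarize(accuracies: List[float], threshold: float = 70.0) -> UserSummary:
    """Summarize one user's accumulated accuracy data."""
    return UserSummary(
        completed_sentences=len(accuracies),
        sentences_at_or_above=sum(1 for a in accuracies if a >= threshold),
        average_accuracy=mean(accuracies) if accuracies else 0.0,
    )


summary = summarize([60.0, 80.0, 100.0])
```

Here `summary` reports 3 completed sentences, 2 at or above the threshold, and an average accuracy of 80.0.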

Accordingly, the user can easily recognize his or her current speaking ability while practicing speaking by reading the electronic book aloud.

From the foregoing, those skilled in the art will appreciate that various changes and modifications are possible without departing from the technical spirit of the present invention. Therefore, the technical scope of the present invention should not be limited to the detailed description of the specification but should be defined by the claims.

110: voice input unit 120: read-aloud speech recognition module
130: control unit 140: electronic book implementation module
150: statistics database 160: content database
170: output unit 121: voice input processing unit
122: endpoint detection unit 123: phoneme separation unit
124: phoneme-by-phoneme scoring unit 125: reliability score determination unit
125a: endpoint re-detection unit 125b: rejection processing unit
125c: quantitative evaluation unit 126: utterance score determination unit

Claims

A system for improving utterance accuracy and expressiveness through reading an e-book aloud, comprising:
Electronic book content including electronic book text rendered on an electronic book, an element that lets the user recognize that he or she is speaking when uttering the text, an element that lets the user recognize that utterance recognition is complete, and an element that lets the user recognize the level of utterance recognition accuracy;
An electronic book implementation module installed in a digital device that can input and output voice and video signals and can store and process data, the module rendering the electronic book content on the digital device; and
A read-aloud speech recognition module installed in the digital device and interworking with the electronic book implementation module, which receives a voice signal when the user utters the text and analyzes and evaluates the user's utterance accuracy through voice recognition.

The system according to claim 1, further comprising:
A statistics database that stores and outputs statistical data, such as the utterance accuracy of sentences recognized by the read-aloud speech recognition module, the date and time at which recognition of each sentence was completed, and accumulated data on the user's utterance history; and
A content database that stores content including the text and images constituting the electronic book, together with the elements rendered on the digital device that let the user recognize that he or she is speaking when uttering the text, that utterance recognition is complete, and the level of utterance recognition accuracy.

The system according to claim 1,
Wherein the read-aloud speech recognition module includes:
A voice input processing unit that takes the utterance voice signal transmitted from the voice input unit at predetermined intervals and extracts and converts it into data in the form of voice feature vectors;
An endpoint detection unit that detects the start point and end point of the voice feature vectors transmitted from the voice input processing unit;
A phoneme separation unit that, from the valid voice feature vector data between the start point and end point transmitted from the endpoint detection unit, separates the voice feature vector data determined to be phonemes;
A phoneme-by-phoneme scoring unit that calculates a score by comparing the voice feature vector data separated into phoneme units by the phoneme separation unit with a phoneme model serving as the scoring reference;
A reliability score determination unit that determines whether the average of the per-phoneme score data matched to the voice feature vector data transmitted from the phoneme-by-phoneme scoring unit falls within the reliability range of a predetermined reference phoneme model;
A quantitative evaluation unit that compares the voice feature vector data within the reliability range transmitted from the reliability score determination unit with an utterance evaluation model that uses statistical values produced by utterance evaluation experts, and calculates a quantitative utterance accuracy value for the uttered text; and
An utterance score determination unit that converts the voice feature vector data and the quantitative utterance accuracy value transmitted from the quantitative evaluation unit into an utterance score conforming to the criteria of the application in use.

The system of claim 3,
Wherein the voice input processing unit
Divides the utterance voice signal data into intervals of 10 msec to 20 msec and separates the feature vector values contained in each interval by utterance time.

The system of claim 3,
Wherein the read-aloud speech recognition module further includes:
An endpoint re-detection unit that receives from the reliability score determination unit the voice feature vector data judged to lack reliability, removes from it the data recorded by the phoneme separation unit and the phoneme-by-phoneme scoring unit, and transmits the resulting voice feature vector data to the endpoint detection unit; and
A rejection processing unit that receives the voice feature vector data determined by the reliability score determination unit not to be a phoneme and terminates processing of that data.

An input step of receiving an utterance voice signal of a sentence or word of an electronic book uttered by the user;
An analysis and determination step in which the read-aloud speech recognition module analyzes and determines the utterance accuracy of the utterance voice signal;
A storing step of transmitting to the control unit whether a reaction is warranted by the utterance accuracy and storing the utterance accuracy data in the statistics database;
A command step in which the control unit calls content corresponding to the utterance accuracy from a content database and commands output of an indication of the uttered sentence; and
A display step in which the electronic book implementation module renders the uttered-sentence indication, the utterance accuracy indication, and the utterance-related statistics on the digital device.

The method according to claim 6,
Wherein the display step includes:
A step of displaying sentences or words that have already been recognized;
A step of displaying sentences or words that have not yet been recognized;
A step of displaying sentences or words recognized at or above a certain utterance accuracy; and
A step of displaying sentences or words recognized below the certain utterance accuracy.

8. The method of claim 7,
Wherein the step of displaying sentences or words that have already been recognized
Compares the phoneme-level uttered sentence recognized by the read-aloud speech recognition module and scored by quantitative evaluation with the sentence in the content database, and outputs the phoneme-level characters of the recognized sentence in a form different from the characters not yet recognized, so that the user recognizes the utterance position within the sentence or word he or she has uttered.

The method according to claim 6,
Further comprising a statistics display step of displaying the utterance statistics of the user together with those of other existing users when the user utters a sentence or word contained in the e-book story,
Wherein the statistics display step includes:
A step of displaying the average utterance accuracy, by number of recognition-completed sentences, of all existing users of the e-book;
A step of displaying the statistics of the top group within a specific range of utterance accuracy, by number of recognition-completed sentences, among all existing users of the e-book;
A step of displaying the utterance accuracy statistics by the user's number of recognition-completed sentences; and
A step of displaying each statistic over time periods including daily, weekly, and monthly.

10. The method of claim 9,
Wherein the step of displaying the utterance accuracy statistics by the user's number of recognition-completed sentences
Displays, from the data on the utterance accuracy of the uttered sentences recognized by the read-aloud speech recognition module and scored by quantitative evaluation and the data on the number of recognition-completed sentences, the user's own accumulated utterance accuracy at or above a certain value, average utterance accuracy, and cumulative number of recognition-completed sentences, so that the user can recognize his or her own speaking ability while practicing speaking by reading the electronic book aloud.