CN101304457A

CN101304457A - Method and apparatus for implementing automatic spoken language training based on voice telephone

Info

Publication number: CN101304457A
Application number: CNA200710097430XA
Authority: CN
Inventors: 许罗迈
Original assignee: Individual
Current assignee: Individual
Priority date: 2007-05-10
Filing date: 2007-05-10
Publication date: 2008-11-12

Abstract

The invention relates to a device for implementing automatic spoken language training based on voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module, a language processing module and a dialogue knowledge base. The method for implementing automatic spoken language training based on voice telephone comprises the following steps: a telephone user dials in, activating the telephone receiving module; the dialogue control module opens a dialogue script and enters the initial dialogue state; the current dialogue stage is determined, and the expected response content required for that stage is extracted; the speech recognition rule for the expected response content of the current dialogue stage is compiled in preparation for recognizing the telephone user's speech; the speech recognition module is started and, at the same time, the machine speech recording playback module is started to play the extracted machine speech recording; the speech recognition module performs speech recognition and passes the recognition result to the language processing module; and the language processing module compares the recognition result with the obtained expected response content.

Description

A method and apparatus for implementing automatic spoken language training based on voice telephone

Technical field

The invention belongs to the field of methods and apparatus for automatic spoken language training, and specifically relates to a method and apparatus for implementing automatic spoken language training based on voice telephone.

Technical background

Speech recognition technology has many applications in the voice telephone field, concentrated mainly on intelligent voice switchboards for internal extensions, where the dialogue scenario is fixed and the spoken content is simple. IBM's Phonepass system applies voice telephone and speech recognition technology to spoken English testing; the test mainly takes the form of reading words and sentences aloud and answering multiple-choice questions orally. The various widely used automated telephone inquiry systems all rely on keypad interaction rather than spoken interaction.

As the recognition accuracy and speed of speech recognition systems improve, and as their hardware requirements in specific environments continue to fall, embedding microchip-based speech recognition technology in consumer electronics products is increasingly becoming a reality. As consumer electronics products become more powerful and combine more and more functions, speech recognition technology lets consumers use these products more conveniently and more intuitively. Moreover, using these products need no longer involve a series of key presses and prompt tones; instead, the consumer can talk to the product directly.

Because these applications of speech recognition technology in the voice telephone field lack spoken interactive learning, they cannot meet people's needs for oral training.

Summary of the invention

To overcome the deficiencies of the prior art, the object of the present invention is to provide a method and apparatus for implementing automatic spoken language training based on voice telephone, meeting people's need to carry out interactive oral training over a voice telephone.

To achieve the above object, the invention provides a method for implementing automatic spoken language training based on voice telephone, comprising the following steps:

1) the telephone user dials in, activating the telephone receiving module;

2) the dialogue control module opens the dialogue script and enters the initial dialogue state;

3) the current dialogue stage is determined, and the expected response content required for that stage is extracted;

4) the speech recognition rule for the expected response content of the current dialogue stage is compiled, in preparation for recognizing the telephone user's speech;

5) the speech recognition module is started and, at the same time, the machine speech recording playback module is started to play the extracted machine speech recording;

6) the speech recognition module performs speech recognition and passes the recognition result to the language processing module. The language processing module compares the recognition result with the obtained expected response content.

To achieve the above object, the present invention also provides a device for implementing automatic spoken language training based on voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module and a dialogue knowledge base, characterized in that:

The telephone receiving module receives the telephone user's call and sends a connection signal to the dialogue control module.

The dialogue control module receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to judge the recognized speech, and gives the feedback for the current round of dialogue.

The speech recognition module listens to the telephone user's speech and sends the recognition result to the language processing module.

The language processing module receives the recognition result, compares it with the dialogue knowledge retrieved from the dialogue knowledge base, and outputs the comparison result.

The dialogue knowledge base stores the dialogue knowledge and all information that needs to be retained.

The present invention has obvious advantages and positive effects. First, it uses expert system technology and a simplified dialogue script control technique to organize the language knowledge that the human-machine dialogue is expected to require, so that dialogue scripts based on text and real human recordings can easily be written and their execution controlled. Second, it uses a self-developed fuzzy word-and-sentence comparison technique to compare the content the speaker is expected to say with the content actually recognized; if the comparison result reaches a preset threshold, feedback is given according to a predetermined positive feedback scheme, otherwise according to a predetermined negative feedback scheme.

Description of drawings

Fig. 1 is a system configuration diagram of the present invention;

Fig. 2 is a flow chart of the method for implementing automatic spoken language training based on voice telephone according to the present invention;

Fig. 3 is a block diagram of the overall process of moderately expanding the speech recognition grammar according to the present invention;

Fig. 4 is a workflow diagram of the language processing module according to the present invention;

Fig. 5 is a flow chart of the dialogue script representation and script control method according to the present invention;

Fig. 6 shows the dialogue script compilation format according to the present invention.

Embodiment

Specific embodiments of the present invention are described below with reference to the accompanying drawings.

Fig. 1 is a system configuration diagram of the present invention. Referring to Fig. 1, the device for implementing automatic spoken language training based on voice telephone according to the present invention comprises the following modules:

Computer: hosts the modules of the device and controls the operation of each module.

Telephone receiving module: receives the telephone user's call and sends a connection signal to the dialogue control module.

Dialogue control module: receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to judge the recognized speech, and gives the feedback for the current round of dialogue.

Speech recognition module: listens to the telephone user's speech and sends the recognition result to the language processing module.

Language processing module: receives the recognition result, compares it with the dialogue knowledge retrieved from the dialogue knowledge base, and outputs the comparison result.

Dialogue knowledge base: stores the dialogue knowledge and all information that needs to be retained.

Fig. 2 is a flow chart of the method for implementing automatic spoken language training based on voice telephone according to the present invention. The automatic spoken language training method of the present invention is described in detail below with reference to Fig. 2.

First, in step 210, the telephone user dials in, activating the telephone receiving module;

In step 220, the dialogue control module opens the dialogue script and enters the initial dialogue state;

In step 230, the current dialogue stage is determined, and the various items of expected response content required for that stage are extracted, including the machine speech recording, the expected telephone user response content, the speech recognition rule for the expected response content, the machine feedback for a correct response and the machine feedback for an incorrect response;

In step 240, the speech recognition rule for the expected response content of the current dialogue stage is compiled, in preparation for recognizing the telephone user's speech;

In step 250, the speech recognition module is started and, at the same time, the machine speech recording playback module is started to play the machine speech recording extracted in step 230;

In step 260, the speech recognition module activates the language processing module and passes the recognition result to it. The language processing module compares the recognition result with the expected response content obtained in step 230. If the comparison result reaches the preset threshold, the machine speech recording playback module is started to play the machine feedback recording for a correct response obtained in step 230; otherwise, the machine feedback recording for an incorrect response obtained in step 230 is played. Control is then returned to the script control module, the dialogue stage is updated, and a new round of dialogue begins at step 230.
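One round of steps 230 to 260 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `StageContent`, `similarity` and `run_round`, the word-overlap comparison and the default threshold of 0.8 are all hypothetical, the recognizer and the playback module are stand-in callables, and the prompt is played before listening rather than simultaneously.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StageContent:
    """Items extracted for one dialogue stage in step 230."""
    machine_prompt: str          # machine speech recording to play
    expected_response: str       # expected telephone user response
    recognition_rule: List[str]  # phrases the recognizer should listen for
    feedback_correct: str        # recording played for a correct response
    feedback_incorrect: str      # recording played for an incorrect response


def similarity(recognized: str, expected: str) -> float:
    """Hypothetical comparison: share of expected words that were recognized."""
    expected_words = expected.lower().split()
    recognized_words = set(recognized.lower().split())
    if not expected_words:
        return 0.0
    hits = sum(1 for word in expected_words if word in recognized_words)
    return hits / len(expected_words)


def run_round(
    stage: StageContent,
    recognize: Callable[[List[str]], str],
    play: Callable[[str], None],
    threshold: float = 0.8,  # assumed value; the source only says "preset"
) -> bool:
    """Run one dialogue round and report whether the response passed."""
    # Step 240: "compile" the rule (here it is only normalised).
    grammar = [phrase.lower() for phrase in stage.recognition_rule]
    # Step 250: play the machine prompt and listen to the user.
    play(stage.machine_prompt)
    recognized = recognize(grammar)
    # Step 260: compare with the expected response and choose the feedback.
    passed = similarity(recognized, stage.expected_response) >= threshold
    play(stage.feedback_correct if passed else stage.feedback_incorrect)
    return passed
```

In a test harness, `play` can simply append to a list and `recognize` can return a canned string, which makes the choice between the two feedback recordings easy to check.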

Fig. 3 is a block diagram of the overall process of moderately expanding the speech recognition grammar according to the present invention. 310 denotes the speech recognition rules actually needed for the expected user answer content. 320 denotes the slightly enlarged speech recognition rules compiled by the system. 330 denotes the compiled speech rules, which are the sum of 310 and 320. Adding some redundancy improves the discrimination of speech recognition: erroneous user input can also be recognized, which improves accuracy.
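The relationship between 310, 320 and 330 can be illustrated with a small sketch. It assumes, purely for illustration, that a recognition rule is a set of whole phrases; the function names and the three result labels are hypothetical and do not come from the source.

```python
from typing import Iterable, Set


def normalise(phrases: Iterable[str]) -> Set[str]:
    """Lower-case and trim phrases so they can be compared as set members."""
    return {phrase.strip().lower() for phrase in phrases}


def compile_expanded_grammar(
    expected_phrases: Iterable[str],
    redundant_phrases: Iterable[str],
) -> Set[str]:
    """Return rule set 330 as the union of 310 and 320."""
    expected = normalise(expected_phrases)    # 310: rules the answer needs
    redundant = normalise(redundant_phrases)  # 320: slight enlargement
    return expected | redundant               # 330: compiled rules


def classify(recognized: str, expected_phrases: Iterable[str], grammar: Set[str]) -> str:
    """Label an utterance as expected, a recognized error, or out of grammar."""
    text = recognized.strip().lower()
    if text in normalise(expected_phrases):
        return "expected"
    if text in grammar:
        return "recognized_error"  # covered only by the redundant rules
    return "out_of_grammar"
```

For example, with the expected phrase "He goes to school" and the redundant phrase "He go to school", the learner's ungrammatical answer is still recognized and can be reported as an error instead of being rejected as noise.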

Fig. 4 is a workflow diagram of the language processing module according to the present invention. The workflow of the language processing module is described in detail below with reference to Fig. 4.

First, in step 410, the speech recognition module obtains the telephone user's speech;

In step 420, the expected user answer is obtained;

In step 430, the two are analysed for any discrepancies;

In step 450, the comparison score is calculated; it is a relative value based on the sentence length and the percentage of errors.
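The source does not give the scoring formula, so the following is only one plausible reading of "a relative value based on the sentence length and the percentage of errors". It assumes that discrepancies are counted as a word-level edit distance and that the score is one minus the error count divided by the length of the expected sentence; both function names are hypothetical.

```python
from typing import List


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Minimum number of word insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def comparison_score(recognized: str, expected: str) -> float:
    """Score in [0, 1]: 1 minus errors relative to expected sentence length."""
    recognized_words = recognized.lower().split()
    expected_words = expected.lower().split()
    if not expected_words:
        return 0.0
    errors = word_edit_distance(recognized_words, expected_words)
    return max(0.0, 1.0 - errors / len(expected_words))
```

Under this assumption, one wrong word in a four-word sentence gives a score of 0.75, which would then be compared with the preset threshold.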

Fig. 5 is a flow chart of the dialogue script representation and script control method according to the present invention. The dialogue script representation and script control method are described in detail below with reference to Fig. 5.

First, in step 500, the script information for the current round of dialogue is selected from the script library 510 according to the current entry flag X, and dialogue processing begins;

In step 520, the current round of dialogue is processed according to the flow of the method for implementing automatic spoken language training based on voice telephone according to the present invention;

In step 530, the entry flag for the next round of dialogue is updated with the exit flag of the current round's dialogue script, and step 500 is then executed again to process the new round of dialogue.

Fig. 6 shows the dialogue script compilation format of the present invention. 710 is the entry flag of each round of dialogue. 720 is the exit flag of each round of dialogue.
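The flag-driven control loop of steps 500 to 530 can be sketched as below. This is an illustrative reading rather than the actual script format: the record fields, the names `ScriptRecord`, `find_round` and `run_script`, and the `max_rounds` guard are assumptions, and the processing of each round (step 520) is reduced to collecting the prompt text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScriptRecord:
    """One round of dialogue, stored as a record in the script library."""
    entry_flag: str  # 710: flag under which this round is entered
    exit_flag: str   # 720: flag handed on to the next round
    prompt: str      # placeholder for the round's content


def find_round(script_library: List[ScriptRecord], current_flag: str) -> Optional[ScriptRecord]:
    """Step 500: pick the first record whose entry flag matches the current flag."""
    for record in script_library:
        if record.entry_flag == current_flag:
            return record
    return None


def run_script(script_library: List[ScriptRecord], start_flag: str, max_rounds: int = 10) -> List[str]:
    """Follow entry and exit flags from round to round and return the prompts used."""
    prompts: List[str] = []
    flag = start_flag
    for _ in range(max_rounds):  # guard against scripts that loop forever
        record = find_round(script_library, flag)
        if record is None:       # no matching record: the dialogue ends
            break
        prompts.append(record.prompt)  # step 520: process this round
        flag = record.exit_flag        # step 530: update the entry flag
    return prompts
```

Because the next round is found purely by matching flags, a script author can reorder or add records without changing the control program, which appears to be the simplification the description aims for.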

The above are only preferred embodiments of the present invention and are not intended to limit it; for a person skilled in the art, the present invention may be subject to various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.

Claims

1. A method for implementing automatic spoken language training based on voice telephone, the method comprising the following steps:

2. The method for implementing automatic spoken language training based on voice telephone according to claim 1, wherein the dialogue script control in step 1 uses database records to represent the content of each stage of the script; each record is assigned an entry flag value and an exit flag value, and the control program need only find any one of the records whose entry flag value matches the exit flag value of the current record to obtain the script for the next round of dialogue.

3. The method for implementing automatic spoken language training based on voice telephone according to claim 1, wherein the expected response content required for the corresponding stage and extracted in step 3 comprises: the machine speech recording, the expected telephone user response content, the speech recognition rule for the expected response content, the machine feedback for a correct response and the machine feedback for an incorrect response.

4. A device for implementing automatic spoken language training based on voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module and a dialogue knowledge base, characterized in that:

The dialogue control module receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to judge the recognized speech, and gives the feedback for the current round of dialogue.

5. The device for implementing automatic spoken language training based on voice telephone according to claim 4, wherein, to improve speech recognition accuracy, the speech recognition module adds some redundancy to improve the discrimination of speech recognition, so that erroneous user input can also be recognized, thereby improving accuracy.

6. The device for implementing automatic spoken language training based on voice telephone according to claim 4, wherein the language processing module compares the speech recognition result with the obtained expected response content.