CN101304457A - Method and apparatus for implementing automatic spoken language training based on voice telephone - Google Patents
- Publication number
- CN101304457A
- Authority
- CN
- China
- Prior art keywords
- dialogue
- module
- speech recognition
- response content
- language processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Telephonic Communication Services (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to an apparatus for automatic spoken-language practice based on a voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module, a language processing module and a dialogue knowledge base. The method for automatic spoken-language practice based on the voice telephone comprises the following steps: a telephone user places a call, which activates the telephone receiving module; the dialogue control module opens a dialogue script and enters the initial dialogue state; the current dialogue stage is determined, and the expected response content required for that stage is extracted; the speech recognition rule for the expected response content of the current dialogue stage is compiled in preparation for recognizing the telephone user's speech; the speech recognition module is started and, at the same time, the machine-speech playback module is started to play the extracted machine-speech recording; the speech recognition module performs speech recognition and passes the result to the language processing module; and the language processing module compares the speech recognition result with the extracted expected response content.
Description
Technical field
The invention relates to methods and apparatus for automatic spoken-language training, and specifically to a method and apparatus for automatic spoken-language training based on a voice telephone.
Technical background
Speech recognition technology has many applications in the voice telephony field, mainly in intelligent voice switchboards for internal extensions, where the dialogue scenario is fixed and the spoken content is simple. The Phonepass system of IBM applies voice telephony and speech recognition technology to spoken English testing; the test consists mainly of reading words and sentences aloud and answering multiple-choice questions orally. The automated telephone inquiry systems now in wide use all rely on keypad interaction rather than spoken interaction.
As the recognition accuracy and speed of speech recognition systems improve, and as their hardware requirements in specific environments fall, embedding microchip-based speech recognition technology in consumer electronics products is increasingly becoming a reality. As consumer electronics products become more complex and more powerful, speech recognition technology lets consumers use them more conveniently and more intuitively. Moreover, using these products no longer involves a series of key presses and prompt tones; instead, the consumer converses directly with the product.
However, because these applications of speech recognition technology in the voice telephony field lack spoken interactive learning, they cannot meet people's needs for spoken-language training.
Summary of the invention
To overcome the deficiencies of the prior art, the object of the present invention is to provide a method and apparatus for automatic spoken-language training based on a voice telephone, meeting the demand for interactive spoken-language training over a voice telephone.
To achieve this object, the invention provides a method for automatic spoken-language training based on a voice telephone, comprising the following steps:
1) a telephone user places a call, which activates the telephone receiving module;
2) the dialogue control module opens a dialogue script and enters the initial dialogue state;
3) the current dialogue stage is determined, and the expected response content required for that stage is extracted;
4) the speech recognition rule for the expected response content of the current dialogue stage is compiled, in preparation for recognizing the telephone user's speech;
5) the speech recognition module is started and, at the same time, the machine-speech playback module is started to play the extracted machine-speech recording;
6) the speech recognition module performs speech recognition and passes the result to the language processing module. The language processing module compares the speech recognition result with the extracted expected response content.
To achieve this object, the invention also provides an apparatus for automatic spoken-language training based on a voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module and a dialogue knowledge base, characterized in that:
The telephone receiving module receives the telephone user's call and sends a connection signal to the dialogue control module.
The dialogue control module receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to evaluate the recognized speech, and gives the feedback for the current round of dialogue.
The speech recognition module listens to the telephone user's speech and sends the speech recognition result to the language processing module.
The language processing module receives the speech recognition result, compares it with the dialogue knowledge retrieved from the dialogue knowledge base, and gives the comparison result.
The dialogue knowledge base stores the dialogue knowledge and all information that needs to be retained.
The present invention has clear advantages and beneficial effects. First, it uses expert-system technology and a simplified dialogue-script control technique to organize the language knowledge required for the expected human-machine dialogue, so that dialogue scripts based on text and human voice recordings can be written easily and their execution controlled. Second, it uses an in-house fuzzy word-and-sentence comparison technique to compare the content the speaker is expected to say with the content actually recognized; if the comparison result reaches a preset threshold, feedback is given according to a predetermined positive feedback scheme, and otherwise according to a predetermined negative feedback scheme.
Description of drawings
Fig. 1 is a system configuration diagram of the present invention;
Fig. 2 is a flow chart of the method for automatic spoken-language training based on a voice telephone according to the present invention;
Fig. 3 is a block diagram of the process of moderately expanding the scope of the speech recognition grammar according to the present invention;
Fig. 4 is a workflow diagram of the language processing module according to the present invention;
Fig. 5 is a flow chart of the dialogue script representation and script control method according to the present invention;
Fig. 6 shows the compiled form of a dialogue script according to the present invention.
Embodiment
Specific embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a system configuration diagram of the present invention. Referring to Fig. 1, the apparatus for automatic spoken-language training based on a voice telephone according to the present invention comprises the following modules:
A computer, on which the modules of the apparatus are installed and which controls the operation of each module.
A telephone receiving module, which receives the telephone user's call and sends a connection signal to the dialogue control module.
A dialogue control module, which receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to evaluate the recognized speech, and gives the feedback for the current round of dialogue.
A speech recognition module, which listens to the telephone user's speech and sends the speech recognition result to the language processing module.
A language processing module, which receives the speech recognition result, compares it with the dialogue knowledge retrieved from the dialogue knowledge base, and gives the comparison result.
A dialogue knowledge base, which stores the dialogue knowledge and all information that needs to be retained.
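As a rough illustration of how these modules might be wired together, the following Python sketch models the dialogue knowledge base, the language processing module and the dialogue control module as small classes. The class names, method names, stage-keyed storage layout and exact-match comparison are all assumptions made for illustration; the patent does not prescribe an implementation, and speech recognition is stubbed out as a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DialogueKnowledgeBase:
    """Stores dialogue knowledge; keying by stage name is an assumption."""
    expected: Dict[str, str] = field(default_factory=dict)

    def retrieve(self, stage: str) -> str:
        return self.expected[stage]


class LanguageProcessingModule:
    """Compares recognized text with the retrieved dialogue knowledge."""

    def compare(self, recognized: str, expected: str) -> bool:
        # Placeholder comparison: case- and whitespace-insensitive equality.
        return recognized.strip().lower() == expected.strip().lower()


@dataclass
class DialogueControlModule:
    knowledge: DialogueKnowledgeBase
    recognize: Callable[[], str]  # stand-in for the speech recognition module
    language: LanguageProcessingModule

    def on_call_connected(self, stage: str) -> str:
        """Run one round after the telephone receiving module signals a call."""
        expected = self.knowledge.retrieve(stage)
        heard = self.recognize()
        ok = self.language.compare(heard, expected)
        return "positive feedback" if ok else "negative feedback"
```

Passing the recognizer in as a callable keeps the sketch independent of any particular speech engine or telephony card.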
Fig. 2 is a flow chart of the method for automatic spoken-language training based on a voice telephone according to the present invention. The method is described in detail below with reference to Fig. 2.
First, in step 210, a telephone user places a call, which activates the telephone receiving module;
In step 220, the dialogue control module opens a dialogue script and enters the initial dialogue state;
In step 230, the current dialogue stage is determined, and the expected response content required for that stage is extracted, including the machine-speech recording, the expected telephone-user response content, the speech recognition rule for the expected response content, the machine feedback for a correct reply and the machine feedback for an incorrect reply;
In step 240, the speech recognition rule for the expected response content of the current dialogue stage is compiled, in preparation for recognizing the telephone user's speech;
In step 250, the speech recognition module is started and, at the same time, the machine-speech playback module is started to play the machine-speech recording extracted in step 230;
In step 260, the speech recognition module activates the language processing module and passes the speech recognition result to it. The language processing module compares the speech recognition result with the expected response content obtained in step 230. If the comparison result reaches the preset threshold, the machine-speech playback module is started and plays the correct-reply feedback recording obtained in step 230; otherwise it plays the incorrect-reply feedback recording obtained in step 230. Control then returns to the dialogue control module, the dialogue stage is updated, and a new round of dialogue begins at step 230.
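The round described in steps 230 to 260 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `StageContent` field names, the `run_round` function, the 0.8 threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all illustrative choices, not the patent's own comparison technique, and audio playback and recognition are passed in as hypothetical callables.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List


@dataclass
class StageContent:
    """The five items extracted for a dialogue stage in step 230."""
    prompt_recording: str        # machine-speech recording
    expected_reply: str          # expected telephone-user response
    recognition_rule: List[str]  # rule for recognizing the expected response
    feedback_correct: str        # recording played when the reply is correct
    feedback_incorrect: str      # recording played when the reply is incorrect


def run_round(stage: StageContent,
              recognize: Callable[[List[str]], str],
              play: Callable[[str], None],
              threshold: float = 0.8) -> bool:
    """Play the prompt, recognize the reply, compare it, and play feedback."""
    play(stage.prompt_recording)                # step 250
    heard = recognize(stage.recognition_rule)   # steps 250 and 260
    # Assumed similarity measure: character-level ratio from difflib.
    score = SequenceMatcher(None, heard.lower(),
                            stage.expected_reply.lower()).ratio()
    passed = score >= threshold                 # step 260 threshold check
    play(stage.feedback_correct if passed else stage.feedback_incorrect)
    return passed
```

The boolean result lets the caller decide how to update the dialogue stage before starting the next round.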
Fig. 3 is a block diagram of the process of moderately expanding the scope of the speech recognition grammar according to the present invention. Block 310 represents the speech recognition rule actually required for the expected user answers. Block 320 is the slightly enlarged set of speech recognition rules compiled by the system. Block 330 is the compiled speech recognition rule, which is the sum of 310 and 320. Adding some redundancy raises the recognition rate: erroneous user input can also be recognized, which improves accuracy.
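One way to picture the expansion in Fig. 3 is as a set union: the expected answers (310) plus a slightly enlarged set of likely wrong variants (320) give the compiled rule (330). The Python sketch below assumes, purely for illustration, that the enlarged set is produced by substituting commonly confused words one at a time; the function name `compile_rule` and the `confusions` table are hypothetical, and a real recognizer would use its own grammar format.

```python
from typing import Dict, Iterable, Set


def compile_rule(expected: Iterable[str],
                 confusions: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the expected sentences plus single-word error variants."""
    core = {sentence.lower() for sentence in expected}       # block 310
    extra: Set[str] = set()                                   # block 320
    for sentence in core:
        words = sentence.split()
        for i, word in enumerate(words):
            for alt in confusions.get(word, ()):
                # Swap in one likely learner error at position i.
                extra.add(" ".join(words[:i] + [alt] + words[i + 1:]))
    return core | extra                                       # block 330
```

Because the error variants are in the rule, the recognizer can report what the learner actually said instead of forcing it onto the nearest correct sentence.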
Fig. 4 is a workflow diagram of the language processing module according to the present invention. The workflow of the language processing module is described in detail below with reference to Fig. 4.
First, in step 410, the speech recognition module obtains the telephone user's utterance;
In step 420, the expected user answer is obtained;
In step 430, the two are analyzed for discrepancies of any kind;
In step 450, the comparison score is calculated; it is a relative value based on the sentence length and the error percentage.
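The text does not give the scoring formula, only that the score is relative to sentence length and error percentage. A minimal Python sketch of one such score, assuming word-level edit distance as the error count and `1 - errors / length` as the relative value, might look like this; the function names and the formula itself are assumptions, not the patent's algorithm.

```python
from typing import List


def word_errors(expected: List[str], heard: List[str]) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    prev = list(range(len(heard) + 1))
    for i, exp_word in enumerate(expected, start=1):
        curr = [i]
        for j, heard_word in enumerate(heard, start=1):
            cost = 0 if exp_word == heard_word else 1
            curr.append(min(prev[j] + 1,          # word missing from reply
                            curr[j - 1] + 1,      # extra word in reply
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def comparison_score(expected: str, heard: str) -> float:
    """Score in [0, 1]: one minus the error fraction of the expected length."""
    exp_words = expected.lower().split()
    heard_words = heard.lower().split()
    if not exp_words:
        return 1.0 if not heard_words else 0.0
    errors = word_errors(exp_words, heard_words)
    return max(0.0, 1.0 - errors / len(exp_words))
```

Under these assumptions, one wrong word in a four-word sentence yields a score of 0.75, which can then be checked against the preset threshold.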
Fig. 5 is a flow chart of the dialogue script representation and script control method according to the present invention. The dialogue script representation and script control method are described in detail below with reference to Fig. 5.
First, in step 500, the script information for the current round of dialogue is selected from the script library 510 according to the current entry flag X and used for dialogue processing;
In step 520, this round of dialogue is processed according to the flow of the method for automatic spoken-language training based on a voice telephone according to the present invention;
In step 530, the exit flag of the current round's dialogue script is used to update the entry flag of the next round of dialogue, and execution loops back to step 500 for a new round of dialogue processing.
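The control loop of steps 500 to 530 can be sketched in Python as below. The in-memory dictionary standing in for the script library, the `exit_flag` key, the `run_script` name and the `max_rounds` guard against cyclic scripts are all assumptions introduced for illustration.

```python
from typing import Callable, Dict, List

ScriptLibrary = Dict[str, Dict[str, str]]  # entry flag -> script record


def run_script(library: ScriptLibrary,
               start_flag: str,
               handle_round: Callable[[Dict[str, str]], object],
               max_rounds: int = 100) -> List[str]:
    """Follow entry and exit flags through the script; return flags visited."""
    visited: List[str] = []
    flag = start_flag
    for _ in range(max_rounds):
        record = library.get(flag)   # step 500: select by current entry flag
        if record is None:
            break                    # no matching record: dialogue ends
        visited.append(flag)
        handle_round(record)         # step 520: process this round
        flag = record["exit_flag"]   # step 530: exit flag becomes next entry
    return visited
```

A record whose exit flag matches no entry flag ends the dialogue, so the script author controls termination through the flag values alone.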
Fig. 6 shows the compiled form of a dialogue script according to the present invention. 710 is the entry flag of each round of dialogue; 720 is the exit flag of each round of dialogue.
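Since each round of the script carries an entry flag and an exit flag, the compiled script can be held as database records and the next round found with a single lookup. The Python sketch below uses the standard `sqlite3` module; the table name, column names and sample rows are hypothetical and are not taken from Fig. 6.

```python
import sqlite3
from typing import Optional, Tuple


def build_script_db() -> sqlite3.Connection:
    """Create an in-memory script table with illustrative sample records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE script (entry_flag TEXT, exit_flag TEXT, prompt TEXT)"
    )
    conn.executemany(
        "INSERT INTO script VALUES (?, ?, ?)",
        [
            ("A", "B", "Hello, what is your name?"),
            ("B", "C", "How are you today?"),
        ],
    )
    return conn


def next_round(conn: sqlite3.Connection,
               current_exit_flag: str) -> Optional[Tuple[str, str, str]]:
    """Find a record whose entry flag matches the current exit flag."""
    return conn.execute(
        "SELECT entry_flag, exit_flag, prompt FROM script "
        "WHERE entry_flag = ? LIMIT 1",
        (current_exit_flag,),
    ).fetchone()
```

A `None` result means no record has a matching entry flag, which the caller can treat as the end of the dialogue.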
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various changes and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of its claims.
Claims (6)
1. A method for automatic spoken-language training based on a voice telephone, comprising the following steps:
1) a telephone user places a call, which activates the telephone receiving module;
2) the dialogue control module opens a dialogue script and enters the initial dialogue state;
3) the current dialogue stage is determined, and the expected response content required for that stage is extracted;
4) the speech recognition rule for the expected response content of the current dialogue stage is compiled, in preparation for recognizing the telephone user's speech;
5) the speech recognition module is started and, at the same time, the machine-speech playback module is started to play the extracted machine-speech recording;
6) the speech recognition module performs speech recognition and passes the result to the language processing module. The language processing module compares the speech recognition result with the extracted expected response content.
2. The method for automatic spoken-language training based on a voice telephone according to claim 1, wherein the dialogue script control in step 1 uses database records to represent the content of each stage of the script; each record is given an entry flag value and an exit flag value, and the control program need only find any record whose entry flag value matches the exit flag value of the current record to have found the script for the next round of dialogue.
3. The method for automatic spoken-language training based on a voice telephone according to claim 1, wherein the expected response content extracted for the stage in step 3 comprises: the machine-speech recording, the expected telephone-user response content, the speech recognition rule for the expected response content, the machine feedback for a correct reply and the machine feedback for an incorrect reply.
4. An apparatus for automatic spoken-language training based on a voice telephone, comprising a computer, a telephone receiving module, a dialogue control module, a speech recognition module and a dialogue knowledge base, characterized in that:
The telephone receiving module receives the telephone user's call and sends a connection signal to the dialogue control module.
The dialogue control module receives the signal from the telephone receiving module, retrieves the appropriate dialogue knowledge from the dialogue knowledge base, calls the speech recognition module to listen to the telephone user's speech, calls the language processing module to evaluate the recognized speech, and gives the feedback for the current round of dialogue.
The speech recognition module listens to the telephone user's speech and sends the speech recognition result to the language processing module.
The language processing module receives the speech recognition result, compares it with the dialogue knowledge retrieved from the dialogue knowledge base, and gives the comparison result.
The dialogue knowledge base stores the dialogue knowledge and all information that needs to be retained.
5. The apparatus for automatic spoken-language training based on a voice telephone according to claim 4, wherein the speech recognition module raises the recognition rate by adding some redundancy, so that erroneous user input can also be recognized, thereby improving speech recognition accuracy.
6. The apparatus for automatic spoken-language training based on a voice telephone according to claim 4, wherein the language processing module compares the speech recognition result with the extracted expected response content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA200710097430XA CN101304457A (en) | 2007-05-10 | 2007-05-10 | Method and apparatus for implementing automatic spoken language training based on voice telephone |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101304457A (en) | 2008-11-12 |
Family
ID=40114152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA200710097430XA (Pending, CN101304457A) | Method and apparatus for implementing automatic spoken language training based on voice telephone | 2007-05-10 | 2007-05-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101304457A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101739852B (en) * | 2008-11-13 | 2011-11-09 | 许罗迈 | Speech recognition-based method and device for realizing automatic oral interpretation training |
CN105304082A (en) * | 2015-09-08 | 2016-02-03 | 北京云知声信息技术有限公司 | Voice output method and voice output device |
CN110473522A (en) * | 2019-08-23 | 2019-11-19 | 百可录(北京)科技有限公司 | A kind of method of the short sound bite of Accurate Analysis |
-
2007
- 2007-05-10 CN CNA200710097430XA patent/CN101304457A/en active Pending
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101739852B (en) * | 2008-11-13 | 2011-11-09 | 许罗迈 | Speech recognition-based method and device for realizing automatic oral interpretation training |
CN105304082A (en) * | 2015-09-08 | 2016-02-03 | 北京云知声信息技术有限公司 | Voice output method and voice output device |
CN105304082B (en) * | 2015-09-08 | 2018-12-28 | 北京云知声信息技术有限公司 | A kind of speech output method and device |
CN110473522A (en) * | 2019-08-23 | 2019-11-19 | 百可录(北京)科技有限公司 | A kind of method of the short sound bite of Accurate Analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10347244B2 (en) | Dialogue system incorporating unique speech to text conversion method for meaningful dialogue response | |
WO2022057712A1 (en) | Electronic device and semantic parsing method therefor, medium, and human-machine dialog system | |
JP6394709B2 (en) | SPEAKER IDENTIFYING DEVICE AND FEATURE REGISTRATION METHOD FOR REGISTERED SPEECH | |
US6839667B2 (en) | Method of speech recognition by presenting N-best word candidates | |
US8478592B2 (en) | Enhancing media playback with speech recognition | |
CN107481720B (en) | Explicit voiceprint recognition method and device | |
CN111833853B (en) | Voice processing method and device, electronic equipment and computer readable storage medium | |
CN101211559B (en) | Method and device for splitting voice | |
CN109637537B (en) | Method for automatically acquiring annotated data to optimize user-defined awakening model | |
CN110689877A (en) | Voice end point detection method and device | |
US20050033575A1 (en) | Operating method for an automated language recognizer intended for the speaker-independent language recognition of words in different languages and automated language recognizer | |
CN111261162B (en) | Speech recognition method, speech recognition apparatus, and storage medium | |
JPS603699A (en) | Adaptive automatically dispersing voice recognition | |
CN107077843A (en) | Session control and dialog control method | |
CN113284502A (en) | Intelligent customer service voice interaction method and system | |
CN104882141A (en) | Serial port voice control projection system based on time delay neural network and hidden Markov model | |
CN114171000A (en) | An audio recognition method based on acoustic model and language model | |
CN103680505A (en) | Voice recognition method and voice recognition system | |
CN111914078B (en) | Data processing method and device | |
CN112734604A (en) | Device for providing multi-mode intelligent case report and record generation method thereof | |
CN111402893A (en) | Voice recognition model determining method, voice recognition method and device and electronic equipment | |
US20010056345A1 (en) | Method and system for speech recognition of the alphabet | |
CN109448717B (en) | A phonetic word spelling recognition method, device and storage medium | |
WO2014033855A1 (en) | Speech search device, computer-readable storage medium, and audio search method | |
CN101304457A (en) | Method and apparatus for implementing automatic spoken language training based on voice telephone |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Open date: 20081112 |