
CN109346058B - A system for expanding speech acoustic features - Google Patents


Info

Publication number
CN109346058B
CN109346058B
Authority
CN
China
Prior art keywords
voice
speech
sound
video
submodule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811443497.9A
Other languages
Chinese (zh)
Other versions
CN109346058A (en)
Inventor
Cheng Bing (程冰)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN201811443497.9A
Publication of CN109346058A
Application granted
Publication of CN109346058B
Legal status: Active


Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 — Speech synthesis; Text to speech systems
    • G10L13/02 — Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 — Voice editing, e.g. manipulating the voice of the synthesiser
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 — Speech synthesis; Text to speech systems
    • G10L13/02 — Methods for producing synthetic speech; Speech synthesisers
    • G10L13/027 — Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/24 — Speech recognition using non-acoustical features
    • G10L15/25 — Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 — Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 — Changing voice quality, e.g. pitch or formants
    • G10L21/007 — Changing voice quality, e.g. pitch or formants characterised by the process used
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/57 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The application belongs to the technical field of sound processing, and particularly relates to a speech acoustic feature expansion system. In language learning, the acoustic features of speech need to be expanded so that corpus suited to brain perception can be produced to stimulate the learner's brain. The application provides a speech acoustic feature expansion system comprising a voice acquisition unit, a voice processing unit, and a video editing unit, the voice acquisition unit being connected with the voice processing unit; the voice acquisition unit is used for acquiring natural voice; the voice processing unit is used for expanding the spectral features of the natural voice to different degrees so as to produce corpus; the video editing unit is used for editing the voice video together with the processed voice to synthesize video clips. The speech acoustic feature expansion system can produce corpus better suited to brain perception, thereby helping learners form speech categories in the brain closer to those of native speakers.

Description

Voice acoustic feature expansion system
Technical Field
The application belongs to the technical field of sound processing, and particularly relates to a voice acoustic feature expansion system.
Background
With the rapid development of related fields such as bioengineering, computer science, statistical data processing, and brain imaging, brain science research has combined the strengths of these disciplines to explore anew how brain development interacts with the language learning environment. Studies have shown that infants gradually lose sensitivity to non-native speech sounds after 12 months of age, which creates an obstacle to later foreign language learning. A person habitually approaches a new language from the speech perception of the native language, so foreign sounds similar to native pronunciations are picked up faster, while sounds absent from the native language are harder to acquire. Yet when learning sounds similar to the native language, a learner is also more easily influenced by the native language and thus develops an accent. For example, the brain of an American and the brain of a Chinese speaker may perceive the same English speech differently.
Because the learner is insensitive to non-native speech sounds, he or she cannot fully receive the language information by ear in the first place, and therefore finds it difficult to pronounce the sounds correctly. At the same time, for every phoneme a learner acquires, a speech category for that phoneme must be established in the brain. Such a speech category is not a single point but a collection. Because the language environment a foreign language learner is exposed to cannot compare with that of a native learner, the speech categories established in their brains differ greatly.
In language learning, the acoustic features of natural speech are therefore expanded to produce corpus suited to brain perception, stimulating the learner's nervous system, which has lost sensitivity to non-native speech, to reopen and receive speech information comprehensively, thereby helping the learner form speech categories in the brain closer to those of native speakers.
Disclosure of Invention
1. Technical problem to be solved
Based on the problem that, in language learning, the acoustic features of natural speech must be expanded to produce corpus suited to brain perception, the application aims to stimulate the learner's nervous system, which has lost sensitivity to non-native speech, to reopen and receive speech information comprehensively, helping the learner form speech categories in the brain closer to those of native speakers.
2. Technical solution
In order to achieve the above object, the present application provides a speech acoustic feature expansion system comprising a voice acquisition unit, the voice acquisition unit being connected with a voice processing unit, and the voice processing unit being connected with a video editing unit;
the voice acquisition unit is used for acquiring natural voice;
the voice processing unit is used for expanding the spectral features in the natural voice to different degrees so as to produce corpus;
the video editing unit is used for editing the voice video together with the processed voice to synthesize different video clips.
Optionally, the voice processing unit comprises a MATLAB-based sound processing module.
Optionally, the MATLAB-based sound processing module includes a formant frequency difference expansion sub-module, a pitch synchronous splicing sub-module, a frequency separation sub-module, a bandwidth separation sub-module, and a gap separation sub-module.
Optionally, the MATLAB-based sound processing module includes a sound analysis sub-module and a sound synthesis sub-module.
Optionally, the video editing unit includes a format processing module and a frame rate processing module.
Optionally, the speech processing unit is configured to expand the spectral features in the speech to 3 different degrees, namely 300%, 208%, and 144%, so as to produce corpus.
3. Advantageous effects
Compared with the prior art, the speech acoustic feature expansion system provided by the application has the following beneficial effects:
In the speech acoustic feature expansion system provided by the application, a voice acquisition unit, a voice processing unit, and a video editing unit are connected, and the spectral features of natural voice are expanded to produce video. By simulating the acoustic characteristics of the speech an infant is exposed to when learning language, corpus suited to brain perception is produced to stimulate the learner's brain, so that a brain whose sensitivity to foreign speech has declined can clearly perceive the physical acoustic features of the speech, establish speech categories in the brain similar to those of the native language, and thus improve pronunciation accuracy.
Drawings
FIG. 1 is a schematic diagram of a speech acoustic feature augmentation system of the present application;
In the figure: 1 — voice acquisition unit; 2 — voice processing unit; 3 — video editing unit; 4 — MATLAB-based sound processing module; 5 — formant frequency difference expansion sub-module; 6 — pitch synchronous splicing sub-module; 7 — frequency separation sub-module; 8 — bandwidth separation sub-module; 9 — gap separation sub-module; 10 — sound analysis sub-module; 11 — sound synthesis sub-module; 12 — format processing module; 13 — frame rate processing module.
Detailed Description
Hereinafter, specific embodiments of the present application will be described in detail with reference to the accompanying drawings, and according to these detailed descriptions, those skilled in the art can clearly understand the present application and can practice the present application. Features from various embodiments may be combined to obtain new implementations, or substituted for certain features from certain embodiments to obtain further preferred implementations, without departing from the principles of the application.
In speech directed at an infant, the vibration frequency of the vocal cords and the resonance frequencies of the oral cavity, laryngeal cavity, and nasal cavity are exaggerated, and the gaps between the formants characteristic of vowels are artificially widened. This exaggeration not only lets the infant easily discern phonetic units, but also lets it perceive the key phonetic elements that distinguish word meanings in the native language. A mother's voice when speaking to her child shows great flexibility and variability, and this variation helps the infant establish effective acoustic patterns for classifying speech, i.e. a native speech category for each phoneme in the brain. Brain science research has found that the process by which infants learn native speech has the following characteristics: 1) infants have the opportunity to hear many different people speak; 2) they have the opportunity to see the mouth shapes of different people pronouncing; 3) the sound of a mother speaking to her infant exaggerates the vibration frequency of the vocal cords and the resonance frequencies of the oral, laryngeal, and nasal cavities. These three elements help the infant develop the ability to distinguish speech differences and build comprehensive native speech categories.
Corpus here means language material: the raw content of linguistic study and the basic unit from which a corpus database is built.
Motherese (infant-directed speech, or "baby talk") is the register adults, especially mothers, use when speaking to infants. Its content and form (words, intonation, speed, etc.) are adapted to the child's language and cognitive abilities, taking the infant's comprehension into account. Studies have shown that motherese has physical acoustic characteristics that are expanded relative to normal speech.
Referring to fig. 1, the application provides a speech acoustic feature expansion system comprising a voice acquisition unit 1, the voice acquisition unit 1 being connected with a voice processing unit 2, and the voice processing unit 2 being connected with a video editing unit 3;
the voice acquisition unit 1 is used for acquiring natural voice;
the voice processing unit 2 is used for expanding the spectral features in the natural voice to different degrees so as to produce corpus;
the video editing unit 3 is used for editing the voice video together with the processed voice to synthesize different video clips.
Optionally, the speech processing unit 2 comprises a MATLAB-based sound processing module 4.
Optionally, the MATLAB-based sound processing module 4 includes a formant frequency difference expansion sub-module 5, a pitch synchronous splicing sub-module 6, a frequency separation sub-module 7, a bandwidth separation sub-module 8, and a gap separation sub-module 9.
Optionally, the MATLAB-based sound processing module 4 comprises a sound analysis sub-module 10 and a sound synthesis sub-module 11. The sound analysis sub-module 10 analyzes the acquired sound, and then synthesizes a new sound by the sound synthesis sub-module 11.
Optionally, the video editing unit 3 includes a format processing module 12 and a frame rate processing module 13.
Optionally, the voice processing unit 2 is configured to expand the spectral features in the speech to 3 different degrees, namely 300%, 208%, and 144%, so as to produce corpus.
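As a rough numerical illustration (not the patent's actual MATLAB implementation), "expanding to 300%, 208%, 144%" can be read as scaling the acoustic difference between a contrasting pair symmetrically about its midpoint. The F3 values used below for English /r/ and /l/ are hypothetical placeholders:

```python
def expand_formant_pair(f_a, f_b, factor):
    """Scale the difference between two formant frequencies (Hz)
    symmetrically about their midpoint by `factor` (3.0 = 300%)."""
    mid = (f_a + f_b) / 2.0
    half = (f_b - f_a) / 2.0
    return (mid - half * factor, mid + half * factor)

# Hypothetical F3 values (Hz) for English /r/ and /l/
levels = {"original": 1.0, "144%": 1.44, "208%": 2.08, "300%": 3.0}
corpus = {name: expand_formant_pair(1600.0, 2600.0, k)
          for name, k in levels.items()}
# The four levels together form the graded training corpus:
# the larger the factor, the easier the contrast is to perceive.
```

The same midpoint-scaling idea applies to any scalar acoustic parameter (formant frequency, bandwidth, or duration) whose difference between two sounds is to be enlarged.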
Examples
Expanding the features of the target speech is important for distinguishing its acoustic elements. For each pair of speech sounds to be trained, the physical parameters of the natural sound to manipulate must be determined from the acoustic features that distinguish the two sounds.
The natural sound recording is obtained by the voice acquisition unit 1 and transmitted to the voice processing unit 2. The MATLAB sound processing module 4 expands the spectral features of the sound to 3 different degrees, namely 300%, 208%, and 144%, which together with the original sound are made into a four-level training corpus. For example, for the English /r/–/l/ pair, the 3 parameters are the F3 separation frequency, the F3 bandwidth, and the F3 transition time. During synthesis, the formant frequency difference expansion sub-module 5 enlarges the formant frequency difference between /r/ and /l/ and reduces the F3 bandwidth. The expansion of the temporal characteristics of /r/–/l/ is then applied through the pitch synchronous splicing sub-module 6 using a time-warping technique. For another example, for English vowel pairs such as /i/–/ɪ/, the frequency separation sub-module 7, bandwidth separation sub-module 8, and gap separation sub-module 9 separate the frequencies and bandwidths of F1 and F2 and adjust the gap between F1 and F2.
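The time-warping step above is only named, not specified; a naive overlap-add (OLA) stretch conveys the idea of lengthening a sound without resampling it. The frame and hop sizes here are arbitrary assumptions, and a true pitch synchronous (PSOLA) system would place frames at pitch marks rather than at a fixed hop:

```python
import numpy as np

def ola_stretch(x, factor, frame=512, hop=128):
    """Naive overlap-add time stretch: frames are read every `hop`
    samples and written every `hop * factor` samples, lengthening
    the signal by roughly `factor`."""
    win = np.hanning(frame)
    out_hop = int(round(hop * factor))
    n_frames = max(1, (len(x) - frame) // hop + 1)
    out = np.zeros(out_hop * (n_frames - 1) + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = x[i * hop : i * hop + frame] * win
        out[i * out_hop : i * out_hop + frame] += seg
        norm[i * out_hop : i * out_hop + frame] += win
    norm[norm < 1e-8] = 1.0      # avoid division by zero at the edges
    return out / norm

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)  # 1-second test tone
y = ola_stretch(x, 3.0)          # roughly 300% of the original duration
```

Note that plain OLA introduces phase artifacts on periodic signals; pitch-synchronous frame placement is precisely what removes them.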
The "LPC Analysis and Synthesis of Speech" sub-module of the MATLAB sound processing module 4 is used in production. LPC refers to Linear Prediction Coding. The sub-module includes the sound analysis sub-module 10 and the sound synthesis sub-module 11 and can analyze and synthesize new sounds. (See: DSP System Toolbox™ functionality available for operation at the MATLAB command line.)
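The MATLAB sub-module itself is not reproduced in the patent. The core of LPC analysis and synthesis can, however, be sketched independently; the following illustrative Python uses the textbook autocorrelation/Levinson-Durbin method, not the toolbox's code:

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients a (with a[0] == 1) via autocorrelation
    followed by Levinson-Durbin recursion."""
    n = len(x)
    r = np.array([np.dot(x[:n - i], x[i:]) for i in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[1:i][::-1])
        k = -acc / err                      # reflection coefficient
        a[1:i] = a[1:i] + k * a[1:i][::-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def analyze(x, a):
    """Inverse filtering (analysis): residual e[n] = sum_j a[j]*x[n-j]."""
    return np.convolve(x, a)[:len(x)]

def synthesize(e, a):
    """All-pole resynthesis: x[n] = e[n] - sum_{j>=1} a[j]*x[n-j]."""
    y = np.zeros(len(e))
    p = len(a) - 1
    for n in range(len(e)):
        acc = e[n]
        for j in range(1, min(p, n) + 1):
            acc -= a[j] * y[n - j]
        y[n] = acc
    return y

# Fit an AR(2) test signal and resynthesize it from its residual
rng = np.random.default_rng(0)
w = rng.standard_normal(4000)
x = np.zeros(4000)
for n in range(4000):
    x[n] = 0.9 * x[n - 1] - 0.5 * x[n - 2] + w[n]
a, err = lpc(x, 2)
recon = synthesize(analyze(x, a), a)
```

Modifying the spectral envelope (e.g. shifting formants) then amounts to altering the coefficients `a` between analysis and synthesis while keeping the residual, which is the idea the sound analysis and sound synthesis sub-modules exploit.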
After sound processing is finished, Final Cut Pro 7, which includes the format processing module 12 and the frame rate processing module 13, is used; different formats and frame rates can be mixed and matched in the timeline. The video of the sound is processed by synchronizing different versions of slow-motion video with time-stretched audio tracks, then edited together with the processed sound to synthesize different video clips, which serve as corpus for further production of training software.
In the speech acoustic feature expansion system provided by the application, a voice acquisition unit, a voice processing unit, and a video editing unit are connected, and the spectral features of the speech are expanded to produce video. By simulating the acoustic characteristics of the speech an infant is exposed to when learning language, corpus suited to brain perception is produced to stimulate the learner's brain, so that a brain whose sensitivity to foreign speech has declined can clearly perceive, by ear, the physical acoustic features of the speech, establish speech categories in the brain similar to those of the native language, and thus improve pronunciation accuracy.
Although the application has been described with reference to specific embodiments, those skilled in the art will appreciate that many modifications are possible in the construction and detail of the application disclosed within the spirit and scope thereof. The scope of the application is to be determined by the appended claims, and it is intended that the claims cover all modifications that are within the literal meaning or range of equivalents of the technical features of the claims.

Claims (1)

1. A speech acoustic feature expansion system, characterized in that it comprises a speech acquisition unit, the speech acquisition unit being connected with a speech processing unit, and the speech processing unit being connected with a video editing unit;
the speech acquisition unit is used for acquiring natural speech;
the speech processing unit is used for expanding the spectral features in natural speech to different degrees so as to produce corpus;
the video editing unit is used for editing the speech video together with the processed speech to synthesize different video clips;
natural recordings obtained by the speech acquisition unit are transmitted to the speech processing unit, where the MATLAB sound processing module expands the spectral features of the sound to three different degrees, namely 300%, 208%, and 144%, which together with the original sound are made into four levels of training corpus; for English consonant pairs, during synthesis the formant frequency difference expansion sub-module enlarges the formant frequency difference of the pair and reduces the F3 bandwidth, while the expansion of the temporal characteristics uses a time-warping technique applied through the pitch synchronous splicing sub-module; for English vowel pairs, the frequency separation sub-module, the bandwidth separation sub-module, and the gap separation sub-module separate the frequencies and bandwidths of F1 and F2 and adjust the gap between F1 and F2;
the "LPC Analysis and Synthesis of Speech" sub-module of the MATLAB sound processing module is used in production; LPC refers to Linear Prediction Coding; it includes a sound analysis sub-module and a sound synthesis sub-module and can analyze and synthesize new sounds;
after sound processing is finished, Final Cut Pro 7, including a format processing module and a frame rate processing module, is used to mix and match different formats and frame rates in the timeline; the video of the sound is processed by synchronizing different versions of slow-motion video with time-stretched audio tracks, and then edited together with the processed sound to synthesize different video clips, serving as corpus for further production of training software.
CN201811443497.9A 2018-11-29 2018-11-29 A system for expanding speech acoustic features Active CN109346058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811443497.9A CN109346058B (en) 2018-11-29 2018-11-29 A system for expanding speech acoustic features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811443497.9A CN109346058B (en) 2018-11-29 2018-11-29 A system for expanding speech acoustic features

Publications (2)

Publication Number Publication Date
CN109346058A CN109346058A (en) 2019-02-15
CN109346058B true CN109346058B (en) 2024-06-28

Family

ID=65319541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811443497.9A Active CN109346058B (en) 2018-11-29 2018-11-29 A system for expanding speech acoustic features

Country Status (1)

Country Link
CN (1) CN109346058B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1669074A (en) * 2002-10-31 2005-09-14 富士通株式会社 voice enhancement device
CN109378015A (en) * 2018-11-29 2019-02-22 西安交通大学 A kind of language learning system and method
CN209388698U (en) * 2018-11-29 2019-09-13 西安交通大学 A kind of speech acoustics feature expansion system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0493980A (en) * 1990-08-06 1992-03-26 Takeshige Fujitani Language learning system
GB9714001D0 (en) * 1997-07-02 1997-09-10 Simoco Europ Limited Method and apparatus for speech enhancement in a speech communication system
US20020128839A1 (en) * 2001-01-12 2002-09-12 Ulf Lindgren Speech bandwidth extension
KR100427243B1 (en) * 2002-06-10 2004-04-14 휴먼씽크(주) Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof
CN1564245A (en) * 2004-04-20 2005-01-12 上海上悦通讯技术有限公司 Stunt method and device for baby's crying
US7676362B2 (en) * 2004-12-31 2010-03-09 Motorola, Inc. Method and apparatus for enhancing loudness of a speech signal
US20070168187A1 (en) * 2006-01-13 2007-07-19 Samuel Fletcher Real time voice analysis and method for providing speech therapy
US9117455B2 (en) * 2011-07-29 2015-08-25 Dts Llc Adaptive voice intelligibility processor
CN105023574B (en) * 2014-04-30 2018-06-15 科大讯飞股份有限公司 A kind of method and system for realizing synthesis speech enhan-cement
CN105982641A (en) * 2015-01-30 2016-10-05 上海泰亿格康复医疗科技股份有限公司 Speech and language hypoacousie multi-parameter diagnosis and rehabilitation apparatus and cloud rehabilitation system
CN106710604A (en) * 2016-12-07 2017-05-24 天津大学 Formant enhancement apparatus and method for improving speech intelligibility


Also Published As

Publication number Publication date
CN109346058A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
Nakata et al. Effect of cochlear implants on children’s perception and production of speech prosody
Bent et al. The influence of talker and foreign-accent variability on spoken word identification
CN104081453A (en) System and method for acoustic transformation
Zhang et al. Adjustment of cue weighting in speech by speakers and listeners: Evidence from amplitude and duration modifications of Mandarin Chinese tone
KR20150076126A (en) System and method on education supporting of pronunciation including dynamically changing pronunciation supporting means
Taimi et al. Children Learning a Non-native Vowel–The Effect of a Two-day Production Training.
Athari et al. Vocal imitation between mothers and infants
CN109378015B (en) A phonetic learning system and method
Lin et al. End-to-end articulatory modeling for dysarthric articulatory attribute detection
Kabakoff et al. Training a non-native vowel contrast with a distributional learning paradigm results in improved perception and production
Hongwei et al. An investigation of tone perception and production in German learners of Mandarin
CN109346058B (en) A system for expanding speech acoustic features
CN209388698U (en) A kind of speech acoustics feature expansion system
Yang et al. The development of tonal duration in Mandarin-speaking children
CN209388701U (en) A kind of language learning system
Feng et al. The ability to use contextual cues to achieve phonological constancy emerges by 14 months.
Wong Mothers do not enhance tonal contrasts in child-directed speech: Perceptual and acoustic evidence from child-directed Mandarin lexical tones
Georgopoulos An investigation of audio-visual speech recognition as applied to multimedia speech therapy applications
Hennecke Audio-visual speech recognition: preprocessing, learning and sensory integration
Escudero et al. Have four-year-olds mastered vowel reduction in English? An acoustic analysis of bilingual and monolingual child storytelling
David Ears for computers
TWI806703B (en) Auxiliary method and system for voice correction
Sairanen Deep learning text-to-speech synthesis with Flowtron and WaveGlow
JP2011232775A (en) Pronunciation learning device and pronunciation learning program
Lee et al. A short-term longitudinal study of vocal development in young children with simultaneous bilateral cochlear implants

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant