
CN103928023A - Voice scoring method and system - Google Patents

Voice scoring method and system

Info

Publication number
CN103928023A
CN103928023A (application CN201410178813.XA); granted publication CN103928023B
Authority
CN
China
Prior art keywords
speech
voice
scoring
examination paper
marked
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410178813.XA
Other languages
Chinese (zh)
Other versions
CN103928023B (en)
Inventor
李心广
李苏梅
何智明
陈泽群
李婷婷
陈广豪
马晓纯
王晓杰
陈嘉华
徐集优
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Foreign Studies
Original Assignee
Guangdong University of Foreign Studies
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Foreign Studies
Priority claimed from CN201410178813.XA
Publication of CN103928023A
Application granted; publication of CN103928023B
Legal status: Expired - Fee Related


Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a voice scoring method, comprising the steps of: S1, recording the examinee's examination speech; S2, preprocessing the examination speech to obtain an examination speech corpus; S3, extracting feature parameters from the examination speech corpus; S4, matching the feature parameters against a standard speech template with a speech recognition method based on a hybrid HMM/ANN model, recognizing the content of the examination speech, and giving a preliminary score; S5, if the preliminary score is below a threshold, taking it as the final score, and otherwise scoring the sub-metrics of accuracy, fluency, speech rate, rhythm, stress and intonation; S6, combining these scores to compute the final score of the examination speech. The invention also discloses a voice scoring system. By using a hybrid-model speech recognition method, recognition is more accurate, and spoken-examination recordings stored as files after the examinee records them can be scored objectively against graded evaluation criteria.

Description

A voice scoring method and system

Technical Field

The present invention relates to speech recognition and evaluation technology, and in particular to a voice scoring method and system.

Background Art

From an application standpoint, speech recognition technology is usually divided into two categories: speaker-dependent recognition and speaker-independent recognition. Speaker-dependent recognition is tuned to one particular person; put simply, it recognizes only that person's voice and does not generalize to a wider group. Speaker-independent recognition is the opposite: it can meet the recognition needs of different people and suits a broad population.

The IBM speech research group currently leads in large-vocabulary speech recognition. Bell Laboratories at AT&T also began a series of experiments on speaker-independent recognition, whose results established methods for building standard templates for speaker-independent speech recognition.

Major advances made during this period include:

(1) Hidden Markov Model (HMM) technology matured and was continuously refined, becoming the mainstream approach to speech recognition;

(2) In continuous speech recognition, beyond the acoustic information itself, various kinds of linguistic knowledge, such as morphology, syntax, semantics and dialogue context, were increasingly used to aid recognition and understanding; at the same time, language models based on statistical probability emerged in speech recognition research;

(3) Research on applying artificial neural networks to speech recognition arose. Most of this work used multilayer perceptron networks trained with the backpropagation (BP) algorithm; there were also feedforward networks, which are structurally simple, easy to implement and free of feedback signals, and feedback networks, whose neurons feed back to one another and whose stability is closely tied to associative memory. Because artificial neural networks can separate complex classification boundaries, they are clearly well suited to pattern discrimination.

In addition, continuous-speech dictation technology for personal use matured steadily, the most representative systems being IBM's ViaVoice and Dragon's Dragon Dictate. These systems adapt to the speaker: new users need not train on the full vocabulary, and the recognition rate keeps improving with use.

On the development of speech recognition in China: Beijing hosts research institutions and universities such as the Institute of Acoustics and the Institute of Automation of the Chinese Academy of Sciences, Tsinghua University and Northern Jiaotong University; Harbin Institute of Technology, the University of Science and Technology of China and Sichuan University have also been active. Many domestic speech recognition systems have now been developed, each with its own strengths. In isolated-word large-vocabulary recognition, the most representative is the THED-919 real-time speaker-dependent recognition and understanding system developed jointly by the Department of Electronic Engineering of Tsinghua University and China Electronic Devices Corporation. In continuous speech recognition, the Computer Center of Sichuan University implemented a domain-restricted speaker-dependent continuous English-Chinese speech translation demonstration system on a microcomputer. In speaker-independent recognition, the Department of Computer Science and Technology of Tsinghua University developed a voice-controlled telephone directory system that was put into practical use.

In addition, iFLYTEK, China's largest provider of intelligent speech technology, released the world's first mobile-Internet intelligent voice interaction platform, the "iFLYTEK Voice Cloud", in 2010, marking the arrival of the era of mobile-Internet voice dictation.

iFLYTEK has accumulated long-term research in intelligent speech and holds internationally leading results in Chinese speech synthesis, speech recognition, speech evaluation and other technologies. Speech synthesis and speech recognition are the two key technologies needed to realize human-machine voice communication and build a speech system that can both listen and speak. Automatic speech recognition (ASR) aims to let computers "understand" human speech and extract the textual information it carries. Speech evaluation, also known as Computer Assisted Language Learning (CALL) technology, is a research frontier of intelligent speech processing in which a machine automatically scores pronunciation, detects errors and gives corrective guidance. Voiceprint recognition, also known as speaker recognition, extracts from the speech signal features that characterize the speaker's identity, such as fundamental-frequency features reflecting the glottal opening and closing rate and spectral features reflecting the size and shape of the oral cavity and the length of the vocal tract, and then identifies the speaker. Natural language has been indispensable to people's life, work and study for thousands of years, and the computer is one of the greatest inventions of the 20th century; how to use computers to process and even understand natural language, giving them human-like listening, speaking, reading and writing abilities, has long been a research focus actively pursued by institutions at home and abroad.

Summary of the Invention

The technical problem addressed by the present invention is to provide a voice scoring method and system that can mark and score examination papers quickly and accurately, grading examinees against objective criteria. The invention combines the strengths of existing objective models of pronunciation quality to obtain better-performing speech recognition and training models and a more accurate spoken-language scoring scheme, and it can objectively score spoken examination papers stored as files through a multi-metric evaluation system. The invention is more stable and more efficient, lays a foundation for putting the research results into practice, and helps achieve the goal of fully automatic marking of large-scale oral English tests.

To solve the above technical problem, the present invention provides a voice scoring method, comprising the steps of:

S1. Record the examinee's examination speech;

S2. Preprocess the examination speech to obtain an examination speech corpus;

S3. Extract feature parameters from the examination speech corpus;

S4. Use a speech recognition method based on a hybrid HMM/ANN model to match the feature parameters of the examination speech corpus against a standard speech template, recognize the content of the examination speech, and give a preliminary score;

S5. If the preliminary score is below a preset threshold, take it as the final score of the examination speech and mark the paper as a problem paper; if the preliminary score is above the threshold, score the examination speech on the sub-metrics of accuracy, fluency, speech rate, rhythm, stress and intonation;

S6. Compute the final score of the examination speech as a weighted combination of the sub-metric scores.
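The threshold gate of step S5 and the weighted combination of step S6 can be sketched as follows; the 60-point threshold and the per-metric weights used below are illustrative assumptions, since the patent does not fix their values.

```python
def final_score(preliminary, sub_scores, weights, threshold=60.0):
    """Step S5/S6 sketch: below the threshold, the preliminary score is
    final and the paper is flagged as a problem paper; above it, the
    sub-metric scores are combined by a normalised weighted sum.
    The 60-point threshold and the weights are illustrative."""
    if preliminary < threshold:
        return preliminary, True          # (score, flagged as problem paper)
    total_w = sum(weights.values())
    combined = sum(sub_scores[k] * w for k, w in weights.items()) / total_w
    return combined, False
```

With all six sub-scores equal to 80 and weights summing to 1, the combined score is simply 80.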

Further, step S1 is preceded by a step S0, which specifically comprises the steps of:

S01. Record experts' standard speech;

S02. Preprocess the standard speech to obtain a standard speech corpus;

S03. Extract feature parameters from the standard speech corpus;

S04. Train a model on the feature parameters of the standard speech corpus to obtain the standard speech template.

Further, the specific steps of the hybrid HMM/ANN speech recognition method in step S4 are:

S41. Build an HMM of the feature parameters of the examination speech corpus and obtain the cumulative probabilities of all states of the HMM;

S42. Feed the cumulative state probabilities to an ANN classifier as input features, and output the recognition result;

S43. Match the recognition result against the standard speech template to recognize the content of the examination speech.
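A toy-sized numeric sketch of steps S41 and S42, assuming a discrete-observation HMM: the forward algorithm accumulates the probability of ending in each state, and that vector becomes the input feature of a stand-in "ANN" (here a single softmax layer with hypothetical weights; a real system would use a trained multilayer network).

```python
import math

def forward_state_scores(obs, pi, A, B):
    """Forward pass of a discrete HMM (step S41): returns the cumulative
    probability alpha_T(i) of ending in each state after the whole
    observation sequence.  pi: initial probabilities, A[i][j]: transition
    probabilities, B[i][o]: emission probabilities."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return alpha

def ann_classify(features, W):
    """Toy stand-in for the ANN classifier (step S42): a single softmax
    layer over the HMM state scores.  W is a hypothetical weight matrix,
    one row per output class."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in W]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]
```

The class with the highest softmax output would then be matched against the standard speech template in step S43.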

Further, the preprocessing in step S2 specifically comprises pre-emphasis, framing, windowing, noise reduction, endpoint detection and word segmentation, where the noise reduction uses a blank speech segment as the noise base value to denoise the subsequent speech.

Further, the word segmentation specifically comprises the steps of:

S21. Extract the MFCC parameters of each phoneme in the speech and build an HMM for the corresponding phoneme;

S22. Roughly segment the speech to obtain effective speech segments;

S23. Recognize the words in the speech segments using the phoneme HMMs, thereby turning the speech into a set of words.

Further, the feature extraction in step S3 specifically extracts MFCC feature parameters: the preprocessed corpus undergoes fast Fourier transform, triangular-window filtering, taking the logarithm and a discrete cosine transform to obtain the MFCC feature parameters.

Further, the specific steps of the accuracy scoring in step S5 are:

Warp the utterance to be scored to roughly the length of the standard utterance by interpolation and decimation; use short-time energy as the feature to extract intensity curves of the utterance to be scored and of the standard utterance; score by how well the two intensity curves fit each other.
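A sketch of this accuracy measure: a linear-interpolation resampler stands in for the interpolation/decimation warping, short-time energy gives the intensity curve, and an inverse mean-absolute-difference is assumed as the fitting score, since the exact fitting formula is not specified.

```python
def short_time_energy(signal, frame=4):
    """Per-frame short-time energy, the intensity feature used here."""
    return [sum(x * x for x in signal[i:i + frame])
            for i in range(0, len(signal) - frame + 1, frame)]

def resample(curve, n):
    """Length-normalise a curve to n points by linear interpolation,
    standing in for the interpolation/decimation warping."""
    if n == 1:
        return [curve[0]]
    out = []
    for k in range(n):
        pos = k * (len(curve) - 1) / (n - 1)
        i = int(pos)
        frac = pos - i
        j = min(i + 1, len(curve) - 1)
        out.append(curve[i] * (1 - frac) + curve[j] * frac)
    return out

def curve_fit_score(test_curve, ref_curve):
    """Score the fit of the two intensity curves; a normalised inverse
    mean absolute difference is an assumed fitting measure."""
    t = resample(test_curve, len(ref_curve))
    diff = sum(abs(a - b) for a, b in zip(t, ref_curve)) / len(ref_curve)
    peak = max(max(t), max(ref_curve), 1e-9)
    return max(0.0, 1.0 - diff / peak)
```

Identical curves score 1.0; the score decreases as the curves diverge.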

Further, the specific steps of the fluency scoring in step S5 are:

Cut the speech to be scored into a first and a second half and segment each half into words to obtain the effective speech segments; divide the length of the effective speech in each half by the total length of the speech to be scored, and compare each ratio with its threshold; if both ratios exceed their thresholds, the speech is judged fluent, otherwise not fluent.

The specific steps of the speech-rate scoring are: compute the proportion of the speech to be scored that is voiced, and score the speech rate according to that proportion.

The specific steps of the rhythm scoring are: compute the rhythm of the speech to be scored with an improved dPVI parameter formula.
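The patent does not give its "improved" formula, so the classic raw dPVI (durational Pairwise Variability Index) that it builds on is shown: the mean absolute difference between successive unit durations, scaled by 100.

```python
def dpvi(durations):
    """Classic raw dPVI over a sequence of unit durations (e.g. vocalic
    intervals), in the same units as the input; the patent's 'improved'
    variant is not specified, so the standard definition is shown."""
    m = len(durations)
    if m < 2:
        return 0.0
    return 100.0 * sum(abs(durations[k] - durations[k + 1])
                       for k in range(m - 1)) / (m - 1)
```

Perfectly even durations give 0; larger alternation between long and short units gives a larger index.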

The specific steps of the stress scoring are: on the basis of the warped intensity curve, divide the speech into stress units using a double threshold (a stress threshold and a non-stress threshold) together with the duration of stressed vowels, then pattern-match the utterance to be scored against the standard utterance with the DTW algorithm to score the stress.
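The pattern-matching step can be sketched with the classic DTW recurrence over two one-dimensional feature sequences (for example, per-unit intensity values); a real implementation would run it on the stress-unit features described above.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature
    sequences, used to match stress patterns between the utterance to
    be scored and the standard utterance."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

DTW tolerates timing differences: a repeated value in one sequence still matches at zero cost.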

The specific steps of the intonation scoring are: extract the formants of the speech to be scored and of the standard speech, and score intonation by how well the formant trend of the speech to be scored fits that of the standard speech.

The present invention also provides a voice scoring system, comprising:

a voice recording module for recording the examinee's examination speech;

a preprocessing module for preprocessing the examination speech to obtain an examination speech corpus;

a feature extraction module for extracting the feature parameters of the examination speech corpus;

a speech recognition module for matching the feature parameters of the examination speech corpus against a standard speech template with the hybrid HMM/ANN speech recognition method, recognizing the content of the examination speech and giving a preliminary score;

a speech scoring module for scoring accuracy, fluency, speech rate, rhythm, stress and intonation for papers whose preliminary score exceeds the set threshold; and

a comprehensive scoring module for combining the accuracy, fluency, speech rate, rhythm, stress and intonation scores into the final score of such papers.

Implementing the present invention has the following beneficial effects:

1. Practical noise reduction and word segmentation are added to the preprocessing module, yielding a higher-quality speech corpus;

2. The speech recognition method based on the hybrid HMM/ANN model performs better and recognizes more accurately;

3. The multi-metric analysis of speech rate, rhythm, stress and intonation makes the scoring criteria more diverse and the results more objective than the original read-aloud scoring;

4. Through the dual analysis of accuracy and fluency, objective scoring of non-read-aloud questions such as translation, question-and-answer and retelling is achieved on top of the original read-aloud scoring, establishing a sound and complete voice scoring method and system that can mark papers quickly and accurately and grade examinees against objective criteria;

5. The invention is more stable and more efficient, highly practical and widely applicable; applied to the marking of oral examinations, it greatly shortens marking time, improves processing efficiency and makes marking more objective.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in their description are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow diagram of the voice scoring method provided by an embodiment of the present invention;

Fig. 2 is a flow diagram of the specific steps of step S0;

Fig. 3 is a flow diagram of the specific steps of the preprocessing in Fig. 1;

Fig. 4 is a flow diagram of the specific steps of the word segmentation in Fig. 3;

Fig. 5 is a flow diagram of the specific steps of MFCC feature extraction;

Fig. 6 is a flow diagram of the specific steps of the speech recognition method based on the hybrid HMM/ANN model;

Fig. 7 is a structural diagram of the voice scoring system provided by an embodiment of the present invention.

Detailed Description of the Embodiments

The technical solutions of the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, embodiments of the invention; all other embodiments obtained by those of ordinary skill in the art without creative effort based on these embodiments fall within the scope of protection of the invention.

An embodiment of the present invention provides a voice scoring method which, as shown in Fig. 1, comprises the steps of:

S1. Record the examinee's examination speech;

S2. Preprocess the examination speech to obtain an examination speech corpus;

S3. Extract feature parameters from the examination speech corpus;

S4. Use a speech recognition method based on a hybrid model of Hidden Markov Models (HMM) and Artificial Neural Networks (ANN) to match the feature parameters of the examination speech corpus against a standard speech template, recognize the content of the examination speech, and give a preliminary score;

S5. If the preliminary score is below a preset threshold, take it as the final score of the examination speech and mark the paper as a problem paper; if the preliminary score is above the threshold, score the examination speech on the sub-metrics of accuracy, fluency, speech rate, rhythm, stress and intonation;

S6. Compute the final score of the examination speech as a weighted combination of the sub-metric scores.

Further, step S1 is preceded by a step S0 which, as shown in Fig. 2, specifically comprises the steps of:

S01. Record experts' standard speech;

The standard speech is recorded by a number of professionals in a controlled environment, and its content corresponds to that of the oral English examination;

S02. Preprocess the standard speech to obtain a standard speech corpus;

S03. Extract feature parameters from the standard speech corpus;

S04. Train a model on the feature parameters of the standard speech corpus to obtain the standard speech template.

Here, training the standard speech model means deriving, from a large number of known patterns and according to given criteria, the model parameters that characterize the essence of those patterns, i.e. the standard speech template. Concretely, training iteratively adjusts the parameters of the system template built from the initial data (including the state-transition probabilities and the variances, means and weights of the Gaussian mixture models) so that the system's performance keeps approaching an optimal state. Since professionals' standard speech differs to some extent from examinees' speech, and since the invention scores natural persons, the corpus will be expanded from specific professionals to ordinary people and from controlled to ordinary environments, and will include the voices of speakers of different genders, ages and accents.

Each step is described in detail below.

1. Preprocessing

As shown in Fig. 3, the preprocessing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection and word segmentation. Its purpose is to remove the influence of the human vocal organs and of the recording equipment on the quality of the speech signal, providing high-quality parameters for feature extraction and thereby improving the quality of speech processing.

The noise reduction uses a blank speech segment as the noise base value to denoise the subsequent speech: studies show that examinees are usually silent for a short while at the start of a recording, yet that initial segment is not blank but contains noise. Extracting the audio of that segment as the noise base value therefore allows the rest of the recording to be denoised, while also removing the noise interference of silent segments.
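One way to realize this idea is a simple noise gate: estimate the noise base value from the leading silent segment, then zero any frame whose energy does not rise clearly above it. The segment length, frame size and margin below are illustrative assumptions; the patent only states that the blank leading segment supplies the noise base value.

```python
import math

def denoise(signal, lead=160, frame=80, margin=2.0):
    """Noise-gate sketch: the examinee is usually silent at the start of
    the recording, so the leading `lead` samples supply the noise base
    value (RMS).  Frames whose RMS does not exceed the base value times
    a safety margin are zeroed; all parameters are illustrative."""
    noise_rms = math.sqrt(sum(x * x for x in signal[:lead]) / max(lead, 1))
    out = []
    for i in range(0, len(signal), frame):
        chunk = signal[i:i + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        out.extend(chunk if rms > margin * noise_rms else [0.0] * len(chunk))
    return out
```

Frames at the noise floor come out silenced; clearly voiced frames pass through unchanged.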

Word segmentation means cutting an utterance into individual words or phrases so that the computer can "understand" what the examinee says by recognizing them one by one, preparing for the later analysis of bonus and penalty factors and the final automatic scoring. As shown in Fig. 4, the word segmentation specifically comprises the steps of:

S21. Extract the Mel Frequency Cepstrum Coefficient (MFCC) parameters of each phoneme in the speech and build an HMM for the corresponding phoneme;

S22. Roughly segment the speech to obtain effective speech segments;

Rough segmentation has two purposes: reducing the amount of computation, and with it the time spent on word segmentation, and increasing segmentation accuracy. The rough pass uses the double-threshold method to cut away obviously blank stretches, but with deliberately low thresholds so as to retain the effective speech segments;

S23. Recognize the words in the speech segments using the phoneme HMMs, thereby turning the speech into a set of words.

This word segmentation method offers high recognition and accuracy rates with small error: 1) the number of recognition templates is fixed, which gives the HMM very high accuracy and removes the need to set a threshold on the output probability, greatly improving the recognition rate; 2) segmentation yields each word's pronunciation, which assists keyword matching and so reduces word-matching errors.
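The double-threshold rough segmentation of step S22 can be sketched on frame energies: a segment opens when energy crosses the low threshold, is confirmed once it reaches the high threshold, and closes when energy falls back below the low threshold. The threshold values in the example are illustrative.

```python
def coarse_segments(energies, low, high):
    """Double-threshold rough segmentation (step S22) over per-frame
    energies, returning (start, end) frame index pairs.  Deliberately
    low thresholds keep all effective speech."""
    segs, start, confirmed = [], None, False
    for i, e in enumerate(energies):
        if start is None:
            if e >= low:
                start, confirmed = i, e >= high
        else:
            if e >= high:
                confirmed = True
            if e < low:
                if confirmed:          # only high-confirmed runs are kept
                    segs.append((start, i))
                start, confirmed = None, False
    if start is not None and confirmed:
        segs.append((start, len(energies)))
    return segs
```

Brief low-energy blips that never reach the high threshold are discarded as noise rather than kept as speech.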

2、提取参数特征2. Extract parameter features

The feature extraction in step S3 specifically extracts MFCC parameters. As shown in Figure 5, the preprocessed corpus undergoes a fast Fourier transform, triangular-window filtering, a logarithm, and a discrete cosine transform to obtain the MFCC parameters. MFCC parameters are chosen because they model the auditory characteristics of the human ear: the spectrum is mapped to a nonlinear spectrum on the mel-frequency scale and then transformed to the cepstral domain. The mapping makes no prior assumptions; it simulates human hearing mathematically, using a bank of triangular filters that overlap densely in the low-frequency region to capture the spectral information of the speech. In addition, MFCC parameters are robust against noise and spectral distortion, which further improves the recognition performance of the system.
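The FFT → triangular filtering → logarithm → DCT pipeline can be sketched in plain Python. The sampling rate, filter count, and coefficient count below are illustrative assumptions, and a naive DFT stands in for the FFT to keep the sketch self-contained.

```python
import cmath
import math

def dft(frame):
    """Naive DFT magnitude spectrum (fine for a short illustrative frame)."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) for k in range(N // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_pts = [low + i * (high - low) / (n_filters + 1) for i in range(n_filters + 2)]
    bin_pts = [int((n_bins - 1) * mel_to_hz(m) / (sample_rate / 2.0)) for m in mel_pts]
    banks = []
    for i in range(1, n_filters + 1):
        l, c, r = bin_pts[i - 1], bin_pts[i], bin_pts[i + 1]
        fb = [0.0] * n_bins
        for b in range(l, c):
            fb[b] = (b - l) / max(c - l, 1)   # rising edge of the triangle
        for b in range(c, r):
            fb[b] = (r - b) / max(r - c, 1)   # falling edge of the triangle
        banks.append(fb)
    return banks

def mfcc(frame, sample_rate=8000, n_filters=12, n_coeffs=8):
    """DFT -> triangular (mel) filtering -> log -> DCT, as in step S3."""
    spec = dft(frame)
    energies = [max(sum(w * s for w, s in zip(fb, spec)), 1e-10)
                for fb in mel_filterbank(n_filters, len(spec), sample_rate)]
    logs = [math.log(e) for e in energies]
    return [sum(logs[m] * math.cos(math.pi * k * (m + 0.5) / n_filters)
                for m in range(n_filters)) for k in range(n_coeffs)]
```

A production front end would add pre-emphasis, windowing, and a real FFT, but the chain of operations is the one the paragraph names.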

3. Speech content recognition

Step S4 uses a speech recognition method based on a hybrid HMM and ANN model. The HMM approach alone requires prior statistical knowledge of the speech signal, has weak classification and decision-making ability, is structurally complex, and needs large numbers of training samples and heavy computation. The ANN, while it holds certain advantages in decision-making ability, describes dynamic temporal signals poorly, and purely neural-network-based recognition suffers from overly long training and recognition times. To overcome these respective shortcomings, the present invention organically combines the HMM, with its strong temporal modeling ability, and the ANN, with its strong classification ability, further improving the robustness and accuracy of speech recognition. This approach not only overcomes the overlap between pattern classes that the HMM alone struggles to resolve, improving the recognition of easily confused words, but also removes the ANN's limitation to fixed-length input patterns, eliminating complex normalization operations. Specifically, as shown in Figure 6, the speech recognition method based on the hybrid HMM/ANN model in step S4 comprises the steps:

S41. Build HMM models from the feature parameters of the examination-paper speech corpus, and obtain the cumulative probabilities of all states in the HMM models;

S42. Feed all the state cumulative probabilities as input features to an ANN classifier (specifically a self-organizing neural network), which outputs the recognition result;

S43. Match the recognition result against the standard speech templates to identify the content of the examination-paper speech.
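Steps S41 and S42 can be illustrated with a toy discrete-observation HMM whose forward (cumulative) state probabilities are handed to a classifier. A nearest-profile rule stands in here for the self-organizing neural network, whose internals the patent does not specify; all model parameters are made-up examples.

```python
import math

def forward_state_probs(obs, start_p, trans_p, emit_p):
    """Forward algorithm for a discrete-observation HMM: returns the final
    per-state cumulative probabilities alpha_T(i), the features of step S41."""
    n = len(start_p)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * trans_p[j][i] for j in range(n)) * emit_p[i][o]
                 for i in range(n)]
    return alpha

def classify(alpha, centroids):
    """Stand-in for the ANN classifier of step S42: assign the label whose
    stored probability profile is closest (Euclidean) to the state vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda label: dist(alpha, centroids[label]))
```

The point of the hybrid is visible even in the toy: the HMM handles the time axis and compresses an arbitrary-length observation sequence into a fixed-length state-probability vector, which is exactly the fixed-length input a neural classifier can handle.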

4. Speech evaluation

In practice, some candidates cannot complete the oral test properly within the allotted time, so their recorded answers contain long silences or are unrecognizable; such recordings are marked as problem papers. Problem papers include blank recordings and various unrecognizable ones, such as recordings in languages other than English or recordings with excessive noise. The purpose of step S4 is therefore not only to recognize what the candidate read, but also to detect problem papers and assign them a low score according to the actual situation; for such recordings there is no need to score accuracy, fluency, speech rate, rhythm, stress, or intonation. Further speech evaluation is performed only when the preliminary score exceeds a preset threshold.
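The gating between the preliminary score and the sub-indicator scoring (steps S5 and S6) can be sketched as follows. The threshold value and the equal default weights are assumptions for illustration; the patent leaves both to the exam's configuration.

```python
def final_score(preliminary, threshold=30.0, sub_scores=None, weights=None):
    """Step S5/S6 gate: a problem paper (preliminary score below threshold)
    keeps its preliminary score as the final score; otherwise the final
    score is a weighted combination of the sub-indicator scores."""
    if preliminary < threshold or not sub_scores:
        return preliminary  # problem paper: blank / unrecognizable audio
    if weights is None:     # equal weights assumed unless specified
        weights = {k: 1.0 / len(sub_scores) for k in sub_scores}
    return sum(sub_scores[k] * weights[k] for k in sub_scores)
```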

(1) The accuracy scoring in step S5 comprises: warping the speech sentence to be scored, by decimation and interpolation, to a length close to that of the standard speech sentence; extracting the intensity curves of the sentence to be scored and of the standard sentence, using short-time energy as the feature; and scoring by how well the two intensity curves fit each other.

The intensity curve of a sentence reflects how the speech signal changes over time. The loudness of stressed syllables shows up as energy in the time domain: stressed syllables have high speech energy. However, different speakers, or the same speaker at different times, utter the same sentence with unequal durations and intensities; directly template-matching the intensity curve of the sentence to be scored against that of the standard sentence would therefore compromise the objectivity of the evaluation. The present invention accordingly modifies the prior art into an intensity-curve extraction method anchored on the standard sentence: when the sentence to be scored is shorter than the standard sentence, interpolation lengthens it; when it is longer, decimation shortens it; finally, the intensity curve of the sentence to be scored is amplitude-normalized against the strongest point of the standard sentence's intensity curve.
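The duration warping and peak normalization described above can be sketched as below; function names and the use of linear interpolation are assumptions, since the patent only names interpolation/decimation without specifying the scheme.

```python
def resample_curve(curve, target_len):
    """Linear interpolation (when lengthening) / decimation (when
    shortening) to bring a curve to the reference length."""
    if len(curve) == target_len:
        return list(curve)
    out = []
    for i in range(target_len):
        pos = i * (len(curve) - 1) / max(target_len - 1, 1)
        lo = int(pos)
        hi = min(lo + 1, len(curve) - 1)
        frac = pos - lo
        out.append(curve[lo] * (1 - frac) + curve[hi] * frac)
    return out

def regularize_intensity(candidate, reference):
    """Length-normalize, then scale so the candidate's peak matches the
    strongest point of the reference intensity curve."""
    resized = resample_curve(candidate, len(reference))
    peak = max(resized) or 1.0          # guard against an all-zero curve
    scale = max(reference) / peak
    return [v * scale for v in resized]
```

After this step the two curves share both a time axis and an amplitude scale, so their fit can be compared point by point.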

(2) The fluency scoring comprises: cutting the speech to be scored into a front half and a back half, and applying word segmentation to each half to obtain the valid speech segments; dividing the length of each half's valid speech segments by the total length of the speech to be scored, and comparing each quotient with its corresponding threshold. If both exceed their thresholds, the speech is judged fluent; otherwise it is judged not fluent;

Sentence-level fluency is obtained by computing how smoothly the sentence is delivered and, using the standard speech, computing a prosody score for the pronunciation; the two are fused into a sentence fluency diagnostic model. The same sentence-level method can also be applied to passage-level fluency scoring. Because it accounts for how smoothly the speaker delivers the sentence, it correlates better with human ratings than traditional methods, and can therefore be applied in the speech scoring system.
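A minimal sketch of the fluency judgement: each half's voiced share is divided by the total length, as the step describes. The per-frame voiced flags are assumed to come from the word-segmentation stage, and the threshold values are illustrative assumptions.

```python
def fluency_judgement(frames_voiced, thr_front=0.25, thr_back=0.25):
    """frames_voiced: per-frame booleans from word segmentation.
    The voiced length of each half is divided by the TOTAL length and
    compared against its threshold; fluent only if both halves pass."""
    total = max(len(frames_voiced), 1)
    half = total // 2
    front = sum(frames_voiced[:half]) / total
    back = sum(frames_voiced[half:]) / total
    return front > thr_front and back > thr_back
```

Checking the two halves separately catches an answer that starts fluently and then trails off, which a single whole-utterance ratio would miss.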

(3) The speech-rate scoring comprises: computing the proportion of the speech to be scored that is actual pronunciation relative to its total duration, and scoring the speech rate from that proportion.
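One possible mapping from that proportion to a score is sketched below; the ideal ratio and the linear penalty are pure assumptions, since the patent does not disclose how the ratio is converted into marks.

```python
def speech_rate_score(voiced_frames, total_frames, ideal=0.7, full_marks=10.0):
    """Score speech rate from the share of the recording that is actual
    pronunciation; deviation from an assumed ideal ratio is penalized
    linearly, clamped at zero."""
    ratio = voiced_frames / max(total_frames, 1)
    return max(0.0, full_marks * (1.0 - abs(ratio - ideal) / ideal))
```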

(4) The rhythm scoring comprises: computing the rhythm of the speech to be scored with an improved distinct Pairwise Variability Index (dPVI) parameter formula. Based on the variability of speech-unit durations, the dPVI compares the syllable-unit segment durations of the standard sentence with those of the sentence to be scored, and the resulting parameter serves as the basis for objective evaluation and feedback guidance.

dPVI = 100 × ( Σ_{k=1}^{m−1} |d_{1k} − d_{2k}| + |d_{1t} − d_{2t}| ) / Len_Std

where d is the duration of a speech-unit segment of the sentence (e.g., d_k is the duration of the k-th segment), m = min(number of units in the standard sentence, number of units in the sentence to be scored), and Len_Std is the duration of the standard sentence. Because the sentence to be scored has already been warped to roughly the duration of the standard sentence before the PVI calculation, Len_Std alone can serve as the normalizing unit.
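The dPVI formula translates directly into code. Unit durations are assumed to be given (e.g. in frames), the final |d_{1t} − d_{2t}| term is taken here as the difference of the last units of each sentence, and `len_std` is the standard sentence length.

```python
def dpvi(durations_std, durations_cand, len_std):
    """Improved dPVI from the formula above: sum the pairwise duration
    differences over the first m-1 units (m = the smaller unit count),
    add the difference of the final units, and normalize by the
    standard sentence length."""
    m = min(len(durations_std), len(durations_cand))
    pair_sum = sum(abs(durations_std[k] - durations_cand[k]) for k in range(m - 1))
    last = abs(durations_std[-1] - durations_cand[-1])
    return 100.0 * (pair_sum + last) / len_std
```

A value near zero means the candidate's syllable timing tracks the standard sentence closely; larger values indicate rhythm deviation.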

(5) The stress scoring comprises: on the basis of the regularized intensity curve, dividing out stress units by a double threshold formed of a stress threshold and a non-stress threshold, together with the duration of stressed vowels, and pattern-matching the sentence to be scored against the standard sentence with a Dynamic Time Warping (DTW) algorithm to score the stress.

Stress is the emphasized sound within a word, phrase, or sentence. The basic principle of the DTW algorithm is dynamic time warping: aligning the otherwise mismatched time axes of the test template and the reference template. Similarity is computed with the conventional Euclidean distance; with reference template R and test template T, the smaller the distance D[T, R], the higher the similarity. The drawback of the traditional DTW algorithm is that during template matching all frames carry equal weight and every template must be matched, so the computation is heavy and grows especially quickly as the number of templates increases. The present invention therefore uses an improved DTW algorithm for matching the sentence to be scored against the standard sentence, remedying the shortcomings of the traditional algorithm: frames are weighted according to their importance, which greatly reduces the computation and makes the result more accurate.
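Classical DTW with per-frame Euclidean distance can be sketched as below. The optional per-frame `weight` argument gestures at the frame-weighted improvement mentioned above; the exact weighting scheme is not disclosed, so uniform weights reproduce the traditional algorithm.

```python
import math

def dtw_distance(test, ref, weight=None):
    """Dynamic time warping between two sequences of feature vectors.
    D[i][j] accumulates the cheapest alignment cost of the first i test
    frames against the first j reference frames."""
    n, m = len(test), len(ref)
    if weight is None:
        weight = [1.0] * n  # uniform weights: classical DTW
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = weight[i - 1] * math.sqrt(
                sum((a - b) ** 2 for a, b in zip(test[i - 1], ref[j - 1])))
            D[i][j] = d + min(D[i - 1][j],      # insertion
                              D[i][j - 1],      # deletion
                              D[i - 1][j - 1])  # match
    return D[n][m]
```

Identical sequences align at zero cost, and a time-stretched copy also reaches zero because the warp path absorbs the extra frames.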

(6) The intonation scoring comprises: extracting the formants of the speech to be scored and of the standard speech, and scoring the intonation by how well the trend of the candidate's formants fits the trend of the standard speech's formants.

Intonation is an important mark of expressive ability in spoken English communication; it reflects the overall momentum of the speaker's delivery, heard as the stress, pacing, rise, and fall of the voice.

In digital speech signal processing, the formants of the speech signal are a very important performance parameter. The formants referred to here are the regions of the sound spectrum where energy is relatively concentrated; they not only determine timbre but also reflect the physical characteristics of the vocal tract (the resonant cavity). As sound passes through the resonant cavity, the cavity acts as a filter, redistributing the energy of different frequencies: some frequencies are reinforced by the cavity's resonance while others are attenuated, and the reinforced frequencies appear as dark bands on a time-frequency spectrogram. Because the energy is unevenly distributed and the strong regions rise like mountain peaks, they are called formants (resonance peaks). Formants characterize the resonance behavior of the vocal tract, represent the most direct source of articulation information, and are exploited in human speech perception, making them a very important feature parameter in speech signal processing. A formant is one of the set of resonance frequencies produced when quasi-periodic pulse excitation enters the vocal tract. Formant parameters comprise the formant frequency and its bandwidth, and they are important for distinguishing different finals. Since formant information is contained in the spectral envelope, the key to formant-parameter extraction is estimating the spectral envelope of natural speech; the maxima of the envelope are generally taken to be the formants.
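A toy illustration of envelope-peak formant picking and trend fitting. Real systems estimate the spectral envelope (e.g. via LPC or cepstral smoothing) before peak-picking; here the raw magnitude spectrum of a clean synthetic frame stands in for the envelope, and Pearson correlation is an assumed proxy for the "fitting degree" of two formant trends.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitude spectrum of one frame."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) for k in range(N // 2 + 1)]

def formant_bins(frame, n_formants=2):
    """Local maxima of the spectrum, strongest first, returned in
    ascending bin order as formant candidates."""
    spec = magnitude_spectrum(frame)
    peaks = [k for k in range(1, len(spec) - 1)
             if spec[k] > spec[k - 1] and spec[k] >= spec[k + 1]]
    peaks.sort(key=lambda k: -spec[k])
    return sorted(peaks[:n_formants])

def trend_fit(track_a, track_b):
    """Pearson correlation of two equal-length formant tracks, used as a
    stand-in for the fitting degree of their variation trends."""
    n = len(track_a)
    ma, mb = sum(track_a) / n, sum(track_b) / n
    cov = sum((a - ma) * (b - mb) for a, b in zip(track_a, track_b))
    va = math.sqrt(sum((a - ma) ** 2 for a in track_a))
    vb = math.sqrt(sum((b - mb) ** 2 for b in track_b))
    return cov / (va * vb) if va and vb else 0.0
```

Tracking `formant_bins` frame by frame yields a formant trajectory for each recording; `trend_fit` close to 1 then means the candidate's pitch contour rises and falls with the standard speech.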

The present invention also provides a speech scoring system, shown in Figure 7, comprising:

a voice recording module 101 for recording the candidate's examination-paper speech;

a preprocessing module 102 for preprocessing the candidate's examination-paper speech to obtain an examination-paper speech corpus;

a feature extraction module 103 for extracting the feature parameters of the examination-paper speech corpus;

a speech recognition module 104 for matching the feature parameters of the examination-paper speech corpus against the standard speech templates with the speech recognition method based on the hybrid HMM/ANN model, recognizing the content of the examination-paper speech, and giving a preliminary score;

a speech evaluation module 105 for performing accuracy, fluency, speech-rate, rhythm, stress, and intonation scoring on examination-paper speech whose preliminary score exceeds the set threshold; and

a comprehensive scoring module 106 for combining the accuracy, fluency, speech-rate, rhythm, stress, and intonation scores into the final score of examination-paper speech whose preliminary score exceeds the set threshold.

The speech scoring system and the speech scoring method correspond to each other, so the specific processing steps of each module can be found in the steps of the method and are not repeated here.

Implementing the present invention yields the following beneficial effects:

(1) practical noise reduction and word segmentation are added to the preprocessing module, yielding a higher-quality speech corpus;

(2) the speech recognition method based on the hybrid HMM/ANN model performs better and recognizes more accurately;

(3) the multi-indicator analysis of speech rate, rhythm, stress, and intonation is more diversified than the scoring indicators of the original read-aloud questions, making the results more objective;

(4) through the dual analysis of accuracy and fluency, objective scoring is extended from read-aloud questions alone to non-read-aloud questions such as translation, question-and-answer, and retelling questions, establishing a sound and complete speech scoring method and system that can mark papers quickly and accurately and score candidates against objective criteria;

(5) the present invention is more stable and more efficient, is highly practical with a wide range of applications, and can be applied to the marking of oral examinations, substantially and effectively shortening marking time while improving both the efficiency of the system and the objectivity of the marking.

The above discloses only a preferred embodiment of the present invention, which of course cannot limit the scope of its rights; equivalent changes made according to the claims of the present invention therefore still fall within the scope of the present invention.

Claims (10)

1. A speech scoring method, characterized by comprising the steps of:
S1. recording the candidate's examination-paper speech;
S2. preprocessing the candidate's examination-paper speech to obtain an examination-paper speech corpus;
S3. extracting the feature parameters of the examination-paper speech corpus;
S4. matching the feature parameters of the examination-paper speech corpus against a standard speech template using a speech recognition method based on a hybrid HMM and ANN model, recognizing the content of the examination-paper speech, and giving a preliminary score;
S5. if the preliminary score is below a preset threshold, taking the preliminary score as the final score of the examination-paper speech and marking the speech as a problem paper; if the preliminary score is above the preset threshold, scoring the examination-paper speech on the sub-indicators of accuracy, fluency, speech rate, rhythm, stress, and intonation;
S6. weighting the sub-indicator scores to obtain the final score of the examination-paper speech.
2. The speech scoring method of claim 1, characterized in that step S1 is preceded by a step S0, which specifically comprises the steps of:
S01. recording an expert's standard speech;
S02. preprocessing the standard speech to obtain a standard speech corpus;
S03. extracting the feature parameters of the standard speech corpus;
S04. performing model training on the feature parameters of the standard speech corpus to obtain the standard speech template.
3. The speech scoring method of claim 1, characterized in that the speech recognition method based on the hybrid HMM and ANN model in step S4 comprises the steps of:
S41. building an HMM model from the feature parameters of the examination-paper speech corpus and obtaining the cumulative probabilities of all states in the HMM model;
S42. processing all the state cumulative probabilities as input features of an ANN classifier, thereby outputting a recognition result;
S43. matching the recognition result against the standard speech template, thereby identifying the content of the examination-paper speech.
4. The speech scoring method of claim 1, characterized in that the preprocessing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection, and word segmentation, wherein the noise reduction specifically uses a blank speech segment as the noise baseline for denoising the subsequent speech.
5. The speech scoring method of claim 4, characterized in that the word segmentation specifically comprises the steps of:
S21. extracting the MFCC parameters of each phoneme in the speech and building an HMM model for the corresponding phoneme;
S22. coarsely segmenting the speech to obtain valid speech segments;
S23. recognizing the words of the speech segments from the phoneme HMM models, thereby converting the speech into a set of words.
6. The speech scoring method of claim 1, characterized in that the feature extraction in step S3 specifically extracts MFCC feature parameters, by applying a fast Fourier transform, triangular-window filtering, a logarithm, and a discrete cosine transform to the preprocessed corpus to obtain the MFCC feature parameters.
7. The speech scoring method of claim 1, characterized in that the accuracy scoring in step S5 comprises the steps of:
warping the speech sentence to be scored, by decimation and interpolation, to a degree close to the standard speech sentence; extracting the intensity curves of the sentence to be scored and of the standard sentence using short-time energy as the feature; and scoring by comparing how well the two intensity curves fit.
8. The speech scoring method of claim 1, characterized in that the fluency scoring in step S5 comprises the steps of:
cutting the speech to be scored into a front part and a back part, and applying word segmentation to each part to obtain valid speech segments; dividing the length of each part's valid speech segments by the total length of the speech to be scored, and comparing each quotient with its corresponding threshold; if both exceed their thresholds, the speech is judged fluent; otherwise it is judged not fluent.
9. The speech scoring method of claim 1, characterized in that, in step S5,
the speech-rate scoring comprises: computing the proportion of the speech to be scored that is pronunciation relative to its total duration, and scoring the speech rate from that proportion;
the rhythm scoring comprises: computing the rhythm of the speech to be scored with an improved dPVI parameter formula;
the stress scoring comprises: on the basis of the regularized intensity curve, dividing out stress units by a double threshold formed of a stress threshold and a non-stress threshold together with the stressed-vowel duration, and pattern-matching the speech sentence to be scored against the standard sentence with a DTW algorithm to score the stress;
the intonation scoring comprises: extracting the formants of the speech to be scored and of the standard speech, and scoring the intonation by how well the trend of the candidate's formants fits the trend of the standard speech's formants.
10. A speech scoring system, characterized by comprising:
a voice recording module for recording the candidate's examination-paper speech;
a preprocessing module for preprocessing the candidate's examination-paper speech to obtain an examination-paper speech corpus;
a feature parameter extraction module for extracting the feature parameters of the examination-paper speech corpus;
a speech recognition module for matching the feature parameters of the examination-paper speech corpus against a standard speech template using the speech recognition method based on the hybrid HMM and ANN model, recognizing the content of the examination-paper speech, giving a preliminary score, and marking whether the speech is a problem paper;
a speech evaluation module for performing accuracy, fluency, speech-rate, rhythm, stress, and intonation scoring on non-problem examination-paper speech whose preliminary score exceeds the preset threshold; and
a comprehensive scoring module for combining the accuracy, fluency, speech-rate, rhythm, stress, and intonation scores into the final score of examination-paper speech whose preliminary score exceeds the set threshold.
CN201410178813.XA 2014-04-29 2014-04-29 A kind of speech assessment method and system Expired - Fee Related CN103928023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410178813.XA CN103928023B (en) 2014-04-29 2014-04-29 A kind of speech assessment method and system


Publications (2)

Publication Number Publication Date
CN103928023A true CN103928023A (en) 2014-07-16
CN103928023B CN103928023B (en) 2017-04-05

Family

ID=51146222


Country Status (1)

Country Link
CN (1) CN103928023B (en)

Cited By (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361896A (en) * 2014-12-04 2015-02-18 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104361895A (en) * 2014-12-04 2015-02-18 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104464423A (en) * 2014-12-19 2015-03-25 科大讯飞股份有限公司 Calibration optimization method and system for speaking test evaluation
CN104485105A (en) * 2014-12-31 2015-04-01 中国科学院深圳先进技术研究院 Electronic medical record generating method and electronic medical record system
CN104505103A (en) * 2014-12-04 2015-04-08 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104732352A (en) * 2015-04-02 2015-06-24 张可 Method for question bank quality evaluation
CN104732977A (en) * 2015-03-09 2015-06-24 广东外语外贸大学 On-line spoken language pronunciation quality evaluation method and system
CN104810017A (en) * 2015-04-08 2015-07-29 广东外语外贸大学 Semantic analysis-based oral language evaluating method and system
CN105608960A (en) * 2016-01-27 2016-05-25 广东外语外贸大学 Spoken language formative teaching method and system based on multi-parameter analysis
CN105632488A (en) * 2016-02-23 2016-06-01 深圳市海云天教育测评有限公司 Voice evaluation method and device
CN105654785A (en) * 2016-03-18 2016-06-08 上海语知义信息技术有限公司 Personalized spoken foreign language learning system and method
CN105681920A (en) * 2015-12-30 2016-06-15 深圳市鹰硕音频科技有限公司 Network teaching method and system with voice recognition function
CN105825852A (en) * 2016-05-23 2016-08-03 渤海大学 Oral English reading test scoring method
CN105989839A (en) * 2015-06-03 2016-10-05 乐视致新电子科技(天津)有限公司 Speech recognition method and speech recognition device
CN106531182A (en) * 2016-12-16 2017-03-22 上海斐讯数据通信技术有限公司 Language learning system
CN106548673A (en) * 2016-10-25 2017-03-29 合肥东上多媒体科技有限公司 A kind of Teaching Management Method based on intelligent Matching
CN106652622A (en) * 2017-02-07 2017-05-10 广东小天才科技有限公司 Text training method and device
CN106710348A (en) * 2016-12-20 2017-05-24 江苏前景信息科技有限公司 Civil air defense interactive experience method and system
CN106971711A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of adaptive method for recognizing sound-groove and system
CN107221318A (en) * 2017-05-12 2017-09-29 广东外语外贸大学 Oral English Practice pronunciation methods of marking and system
CN107230171A (en) * 2017-05-31 2017-10-03 中南大学 A method and system for evaluating students' career orientation
CN107239897A (en) * 2017-05-31 2017-10-10 中南大学 Method and system for testing personality occupation type
CN107274738A (en) * 2017-06-23 2017-10-20 广东外语外贸大学 Chinese-English translation teaching points-scoring system based on mobile Internet
CN107293286A (en) * 2017-05-27 2017-10-24 华南理工大学 A kind of speech samples collection method that game is dubbed based on network
CN107292496A (en) * 2017-05-31 2017-10-24 中南大学 A work value recognition system and method
CN107578778A (en) * 2017-08-16 2018-01-12 南京高讯信息科技有限公司 A kind of method of spoken scoring
CN107785011A (en) * 2017-09-15 2018-03-09 北京理工大学 Word speed estimates training, word speed method of estimation, device, equipment and the medium of model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102354495A (en) * 2011-08-31 2012-02-15 中国科学院自动化研究所 Testing method and system of semi-opened spoken language examination questions
CN102800314A (en) * 2012-07-17 2012-11-28 广东外语外贸大学 English sentence recognizing and evaluating system with feedback guidance and method of system
CN103559894A (en) * 2013-11-08 2014-02-05 安徽科大讯飞信息科技股份有限公司 Method and system for evaluating spoken language
CN103617799A (en) * 2013-11-28 2014-03-05 广东外语外贸大学 Method for detecting English statement pronunciation quality suitable for mobile device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Meng Ping: "Design and Implementation of an Automatic Pronunciation Assessment System", China Master's Theses Full-text Database, Information Science and Technology Series *
Zhang Wenzhong et al.: "A Quantitative Study of the Development of Second-Language Oral Fluency", Modern Foreign Languages (Quarterly) *
Li Xinguang et al.: "An Objective Evaluation System for English Sentences Considering Stress and Prosody", Computer Engineering and Applications *
Li Jingjiao et al.: "A Hybrid Model Combining HMM and Self-Organizing Neural Networks for Speech Recognition", Journal of Northeastern University *

Cited By (80)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104505103B (en) * 2014-12-04 2018-07-03 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104361895A (en) * 2014-12-04 2015-02-18 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104505103A (en) * 2014-12-04 2015-04-08 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104361896A (en) * 2014-12-04 2015-02-18 上海流利说信息技术有限公司 Voice quality evaluation equipment, method and system
CN104361895B (en) * 2014-12-04 2018-12-18 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104361896B (en) * 2014-12-04 2018-04-13 上海流利说信息技术有限公司 Voice quality assessment equipment, method and system
CN104464423A (en) * 2014-12-19 2015-03-25 科大讯飞股份有限公司 Calibration optimization method and system for speaking test evaluation
CN104485105A (en) * 2014-12-31 2015-04-01 中国科学院深圳先进技术研究院 Electronic medical record generating method and electronic medical record system
CN104485105B (en) * 2014-12-31 2018-04-13 中国科学院深圳先进技术研究院 Electronic medical record generation method and electronic medical record system
CN104732977A (en) * 2015-03-09 2015-06-24 广东外语外贸大学 On-line spoken language pronunciation quality evaluation method and system
CN104732977B (en) * 2015-03-09 2018-05-11 广东外语外贸大学 Online spoken language pronunciation quality evaluation method and system
CN104732352A (en) * 2015-04-02 2015-06-24 张可 Method for question bank quality evaluation
CN104810017B (en) * 2015-04-08 2018-07-17 广东外语外贸大学 Oral evaluation method and system based on semantic analysis
CN104810017A (en) * 2015-04-08 2015-07-29 广东外语外贸大学 Semantic analysis-based oral language evaluating method and system
CN105989839B (en) * 2015-06-03 2019-12-13 乐融致新电子科技(天津)有限公司 Speech recognition method and device
CN105989839A (en) * 2015-06-03 2016-10-05 乐视致新电子科技(天津)有限公司 Speech recognition method and speech recognition device
CN105681920A (en) * 2015-12-30 2016-06-15 深圳市鹰硕音频科技有限公司 Network teaching method and system with voice recognition function
CN105681920B (en) * 2015-12-30 2017-03-15 深圳市鹰硕音频科技有限公司 Network teaching method and system with speech recognition function
CN106971711A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Adaptive voiceprint recognition method and system
CN105608960A (en) * 2016-01-27 2016-05-25 广东外语外贸大学 Spoken language formative teaching method and system based on multi-parameter analysis
CN105632488A (en) * 2016-02-23 2016-06-01 深圳市海云天教育测评有限公司 Voice evaluation method and device
CN105654785A (en) * 2016-03-18 2016-06-08 上海语知义信息技术有限公司 Personalized spoken foreign language learning system and method
CN105825852A (en) * 2016-05-23 2016-08-03 渤海大学 Oral English reading test scoring method
CN106548673A (en) * 2016-10-25 2017-03-29 合肥东上多媒体科技有限公司 Teaching management method based on intelligent matching
CN106531182A (en) * 2016-12-16 2017-03-22 上海斐讯数据通信技术有限公司 Language learning system
CN106710348A (en) * 2016-12-20 2017-05-24 江苏前景信息科技有限公司 Civil air defense interactive experience method and system
CN106652622A (en) * 2017-02-07 2017-05-10 广东小天才科技有限公司 Text training method and device
CN107221318B (en) * 2017-05-12 2020-03-31 广东外语外贸大学 English spoken language pronunciation scoring method and system
CN107221318A (en) * 2017-05-12 2017-09-29 广东外语外贸大学 Spoken English pronunciation scoring method and system
CN107293286A (en) * 2017-05-27 2017-10-24 华南理工大学 Speech sample collection method based on online dubbing games
CN107230171A (en) * 2017-05-31 2017-10-03 中南大学 A method and system for evaluating students' career orientation
CN107292496A (en) * 2017-05-31 2017-10-24 中南大学 A work value recognition system and method
CN107239897A (en) * 2017-05-31 2017-10-10 中南大学 Method and system for testing personality occupation type
CN107274738A (en) * 2017-06-23 2017-10-20 广东外语外贸大学 Chinese-English translation teaching scoring system based on the mobile Internet
US11726844B2 (en) 2017-06-26 2023-08-15 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
US11537843B2 (en) 2017-06-29 2022-12-27 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
CN109214616A (en) * 2017-06-29 2019-01-15 上海寒武纪信息科技有限公司 Information processing apparatus, system and method
CN107578778A (en) * 2017-08-16 2018-01-12 南京高讯信息科技有限公司 Spoken language scoring method
US11656910B2 (en) 2017-08-21 2023-05-23 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
CN107785011B (en) * 2017-09-15 2020-07-03 北京理工大学 Training of speech rate estimation model, speech rate estimation method, device, equipment and medium
CN107785011A (en) * 2017-09-15 2018-03-09 北京理工大学 Speech rate estimation model training, speech rate estimation method, device, equipment and medium
WO2019075828A1 (en) * 2017-10-20 2019-04-25 深圳市鹰硕音频科技有限公司 Voice evaluation method and apparatus
CN109727608A (en) * 2017-10-25 2019-05-07 香港中文大学深圳研究院 Pathological voice assessment method based on Chinese speech
CN109727608B (en) * 2017-10-25 2020-07-24 香港中文大学深圳研究院 Chinese speech-based pathological voice evaluation system
CN107818797A (en) * 2017-12-07 2018-03-20 苏州科达科技股份有限公司 Voice quality assessment method, apparatus and system
CN108428382A (en) * 2018-02-14 2018-08-21 广东外语外贸大学 Spoken language repetition scoring method and system
CN108429932A (en) * 2018-04-25 2018-08-21 北京比特智学科技有限公司 Video processing method and device
US11687467B2 (en) 2018-04-28 2023-06-27 Shanghai Cambricon Information Technology Co., Ltd Data sharing system and data sharing method therefor
CN108831503A (en) * 2018-06-07 2018-11-16 深圳习习网络科技有限公司 Method and device for spoken language evaluation
CN109036429A (en) * 2018-07-25 2018-12-18 浪潮电子信息产业股份有限公司 Voice matching score query method and system based on cloud service
CN108986786A (en) * 2018-07-27 2018-12-11 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Interactive voice equipment ranking method, system, computer equipment and storage medium
CN109147823A (en) * 2018-10-31 2019-01-04 河南职业技术学院 Spoken English assessment method and spoken English assessment device
CN109493658A (en) * 2019-01-08 2019-03-19 上海健坤教育科技有限公司 Situational human-computer dialogue-based interactive spoken language learning method
CN111640452A (en) * 2019-03-01 2020-09-08 北京搜狗科技发展有限公司 Data processing method and device and data processing device
CN111640452B (en) * 2019-03-01 2024-05-07 北京搜狗科技发展有限公司 Data processing method and device for data processing
CN109979484A (en) * 2019-04-03 2019-07-05 北京儒博科技有限公司 Pronounce error-detecting method, device, electronic equipment and storage medium
CN109979484B (en) * 2019-04-03 2021-06-08 北京儒博科技有限公司 Pronunciation error detection method and device, electronic equipment and storage medium
CN110135492A (en) * 2019-05-13 2019-08-16 山东大学 Equipment fault diagnosis and anomaly detection method and system based on multi-Gaussian model
CN110211607A (en) * 2019-07-04 2019-09-06 山东中医药高等专科学校 English learning system based on a sensor network
CN110600052A (en) * 2019-08-19 2019-12-20 天闻数媒科技(北京)有限公司 Voice evaluation method and device
CN111358428A (en) * 2020-01-20 2020-07-03 书丸子(北京)科技有限公司 Observation capability test evaluation method and device
CN111294468A (en) * 2020-02-07 2020-06-16 普强时代(珠海横琴)信息技术有限公司 Tone quality detection and analysis system for customer service center calls
WO2021196475A1 (en) * 2020-04-01 2021-10-07 深圳壹账通智能科技有限公司 Intelligent language fluency recognition method and apparatus, computer device, and storage medium
CN111696524A (en) * 2020-04-21 2020-09-22 厦门快商通科技股份有限公司 Character-overlapping voice recognition method and system
CN111696524B (en) * 2020-04-21 2023-02-14 厦门快商通科技股份有限公司 Character-overlapping voice recognition method and system
CN111583961A (en) * 2020-05-07 2020-08-25 北京一起教育信息咨询有限责任公司 Stress evaluation method and device and electronic equipment
CN111612324B (en) * 2020-05-15 2021-02-19 深圳看齐信息有限公司 Multi-dimensional assessment method based on oral English examination
CN111612324A (en) * 2020-05-15 2020-09-01 深圳看齐信息有限公司 Multi-dimensional assessment method based on oral English examination
CN111599234A (en) * 2020-05-19 2020-08-28 黑龙江工业学院 Automatic English spoken language scoring system based on voice recognition
CN111612352A (en) * 2020-05-22 2020-09-01 北京易华录信息技术股份有限公司 Student expression ability assessment method and device
CN111816169A (en) * 2020-07-23 2020-10-23 苏州思必驰信息科技有限公司 Method and device for training Chinese and English hybrid speech recognition model
CN112349300A (en) * 2020-11-06 2021-02-09 北京乐学帮网络技术有限公司 Voice evaluation method and device
CN112634692A (en) * 2020-12-15 2021-04-09 成都职业技术学院 Emergency evacuation simulation training system for crew cabins
CN112750465B (en) * 2020-12-29 2024-04-30 昆山杜克大学 Cloud language ability evaluation system and wearable recording terminal
CN112750465A (en) * 2020-12-29 2021-05-04 昆山杜克大学 Cloud language ability evaluation system and wearable recording terminal
CN113035238A (en) * 2021-05-20 2021-06-25 北京世纪好未来教育科技有限公司 Audio evaluation method, device, electronic equipment and medium
CN113571043A (en) * 2021-07-27 2021-10-29 广州欢城文化传媒有限公司 Dialect imitation ability evaluation method and device, electronic equipment and storage medium
CN113571043B (en) * 2021-07-27 2024-06-04 广州欢城文化传媒有限公司 Dialect imitation ability evaluation method and device, electronic equipment and storage medium
CN113807813A (en) * 2021-09-14 2021-12-17 广东德诚科教有限公司 Grading system and method based on man-machine conversation examination
CN114519358A (en) * 2022-02-17 2022-05-20 科大讯飞股份有限公司 Translation quality evaluation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN103928023B (en) 2017-04-05

Similar Documents

Publication Publication Date Title
CN103928023B (en) 2017-04-05 Voice scoring method and system
CN102800314B (en) English sentence recognizing and evaluating system with feedback guidance and method
Deshwal et al. Feature extraction methods in language identification: a survey
CN101246685B (en) Pronunciation Quality Evaluation Method in Computer Aided Language Learning System
CN101751919B (en) A method for automatic detection of accent in spoken Chinese
Mouaz et al. Speech recognition of Moroccan dialect using hidden Markov models
CN104050965A (en) English phonetic pronunciation quality evaluation system with emotion recognition function and method thereof
CN102063899A (en) Method for voice conversion under unparallel text condition
CN112002348B (en) A method and system for recognizing patient's voice anger emotion
Zhang et al. Using computer speech recognition technology to evaluate spoken English.
Razak et al. Quranic verse recitation recognition module for support in j-QAF learning: A review
US20010010039A1 (en) Method and apparatus for mandarin chinese speech recognition by using initial/final phoneme similarity vector
CN109300339A (en) Spoken English practice method and system
Kanabur et al. An extensive review of feature extraction techniques, challenges and trends in automatic speech recognition
Goyal et al. A comparison of Laryngeal effect in the dialects of Punjabi language
TWI467566B (en) Polyglot speech synthesis method
Sinha et al. Empirical analysis of linguistic and paralinguistic information for automatic dialect classification
Karjigi et al. Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling
CN117012230A (en) Evaluation model for singing pronunciation and articulation
Sharma et al. Soft-Computational Techniques and Spectro-Temporal Features for Telephonic Speech Recognition: an overview and review of current state of the art
CN112767961B (en) Accent correction method based on cloud computing
Yang et al. Landmark-based pronunciation error identification on Chinese learning
Li et al. English sentence pronunciation evaluation using rhythm and intonation
Wang A machine learning assessment system for spoken English based on linear predictive coding
Rao et al. Robust features for automatic text-independent speaker recognition using Gaussian mixture model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170405

Termination date: 20200429

CF01 Termination of patent right due to non-payment of annual fee