CN103810997B - A kind of method and apparatus for determining voice identification result confidence level - Google Patents
- Publication number
- CN103810997B
- Authority
- CN
- China
- Prior art keywords
- arc
- confidence
- word
- arcs
- degree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The present invention provides a method and apparatus for determining the confidence of speech recognition results. The method includes: determining the confidence of each arc in the word graph (lattice) obtained by decoding, and determining the optimal path in the word graph; for each arc $A_i$ on the optimal path, determining in the word graph the set T of arcs that compete with $A_i$; when determining the confidence of the word represented by $A_i$, identifying from T an arc $A_j$ such that either $A_j$ represents the same word as $A_i$, or $A_j$ combined with the arcs connected to it forms the same word as $A_i$; and determining the confidence of the word represented by $A_i$ by combining the confidences of $A_i$ and $A_j$, and where applicable also the confidences of the arcs connected to $A_j$. Because the invention takes the composition of compound words into account when determining the confidence of a speech recognition result, the confidence reflects the actual situation more accurately.
Description
【Technical Field】
The present invention relates to the field of speech recognition in computer application technology, and in particular to a method and apparatus for determining the confidence of speech recognition results.
【Background Art】
In speech recognition, confidence expresses the likelihood that a recognition result is correct: the larger the value, the more likely the result is correct. Confidence is an important basis for speech recognition, and the method used to determine it directly affects recognition accuracy.
The confidence of a speech recognition result is determined mainly by processing the word graph (lattice) produced by decoding. The word graph is a representation of speech recognition results that has come into common use in recent years. It represents multiple decoded candidate results on a directed acyclic graph, which saves storage space while retaining the information of multiple candidates. In a word graph, arcs represent words and nodes represent the connections between words, and every word belongs to a path from the start node to the end node. Each arc can be represented by a five-tuple $\{W, A_w, L_w, S_w, E_w\}$, where $W$ is the word corresponding to the arc, $A_w$ is the acoustic score of producing word $W$, $L_w$ is the language score of producing word $W$, $S_w$ is the start time of word $W$, and $E_w$ is the end time of word $W$. FIG. 1 shows an example of a word graph, in which <s> and </s> denote the path start symbol and the path end symbol, respectively.
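The five-tuple above maps naturally onto a small record type. The following is a minimal sketch, not part of the patent: the class name `Arc`, the field names, and the `duration` helper are illustrative assumptions, and the numeric values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One lattice arc, mirroring the five-tuple {W, A_w, L_w, S_w, E_w}."""
    word: str        # W: the word the arc represents
    acoustic: float  # A_w: acoustic score of producing the word
    language: float  # L_w: language score of producing the word
    start: float     # S_w: start time of the word
    end: float       # E_w: end time of the word

    @property
    def duration(self) -> float:
        # Hypothetical convenience helper, not defined in the source text.
        return self.end - self.start


# Invented example values, for illustration only.
example = Arc(word="中国", acoustic=-120.5, language=-3.2, start=0.0, end=0.4)
```

Making the record immutable (`frozen=True`) lets arcs be used as dictionary keys or set members, which is convenient when collecting sets of competing arcs.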
When the confidence of a speech recognition result is determined in the prior art, the confidence of a word is determined along the optimal path. As shown in FIG. 1, for "中国人民" (Chinese people) the optimal path is the arc "中国人民", so the confidence of that word is the confidence of the arc "中国人民". In Chinese, however, a word may be composed of two other words, forming a so-called compound word; "中国人民", for example, is composed of the words "中国" (China) and "人民" (people). The existing way of determining confidence ignores the composition of compound words, so the confidence of the recognition result does not reflect the actual situation. Because that confidence may influence the subsequent adaptive adjustment of the acoustic model and the language model, it also affects the accuracy of recognition results.
【Summary of the Invention】
In view of this, the present invention provides a method and apparatus for determining the confidence of speech recognition results, so as to improve the accuracy of that confidence.
The specific technical solution is as follows:
A method for determining the confidence of speech recognition results, the method comprising:
S1. Determining the confidence of each arc in the word graph obtained by decoding, and determining the optimal path in the word graph;
S2. For each arc $A_i$ on the optimal path, determining in the word graph the set T of arcs that compete with $A_i$;
S3. When determining the confidence of the word represented by $A_i$, identifying from the set T an arc $A_j$ such that either $A_j$ represents the same word as $A_i$, or $A_j$ combined with the arcs connected to it forms the same word as $A_i$; and determining the confidence of the word represented by $A_i$ by combining the confidences of $A_i$ and $A_j$, and where applicable also the confidences of the arcs connected to $A_j$.
According to a preferred embodiment of the present invention, in step S1 the confidence of each arc equals the sum of the scores of all paths passing through that arc divided by the sum of the scores of all paths in the word graph.
According to a preferred embodiment of the present invention, whether two arcs compete is determined in step S2 in one of the following ways:
If the two arcs overlap in duration, they are determined to compete; or
If the two arcs overlap in duration and the pronunciation similarity of the words they represent meets a preset requirement, they are determined to compete.
According to a preferred embodiment of the present invention, step S3 specifically comprises:
S31. Initializing the confidence of the word represented by $A_i$ to the confidence of arc $A_i$;
S32. Selecting from the set T an arc that has not yet been selected;
S33. Judging whether the selected arc represents the same word as $A_i$; if so, updating the confidence of the word represented by $A_i$ to its current value plus the confidence of the selected arc, and proceeding to step S35; otherwise, proceeding to step S34;
S34. Judging whether the selected arc combined with the arcs connected to it represents the same word as $A_i$; if so, updating the confidence of the word represented by $A_i$ by combining its current value with the confidences of the arcs in the combination, and proceeding to step S35; otherwise, proceeding directly to step S35;
S35. Judging whether any unselected arc remains in the set T; if so, returning to step S32; otherwise, ending the confidence determination for the word represented by $A_i$.
According to a preferred embodiment of the present invention, the update in step S34 is specifically:
Updating the confidence of the word represented by $A_i$ to its current value plus the minimum of the confidences of the arcs in the arc combination.
An apparatus for determining the confidence of speech recognition results, the apparatus comprising:
An initial determination unit, configured to determine the confidence of each arc in the word graph obtained by decoding, and to determine the optimal path in the word graph;
A set determination unit, configured to determine, for each arc $A_i$ on the optimal path, the set T of arcs in the word graph that compete with $A_i$;
A confidence determination unit, configured to, when determining the confidence of the word represented by $A_i$, identify from the set T an arc $A_j$ such that either $A_j$ represents the same word as $A_i$, or $A_j$ combined with the arcs connected to it forms the same word as $A_i$; and to determine the confidence of the word represented by $A_i$ by combining the confidences of $A_i$ and $A_j$, and where applicable also the confidences of the arcs connected to $A_j$.
According to a preferred embodiment of the present invention, the initial determination unit determines the confidence of each arc as the sum of the scores of all paths passing through that arc divided by the sum of the scores of all paths in the word graph.
According to a preferred embodiment of the present invention, the set determination unit determines whether two arcs compete in one of the following ways:
If the two arcs overlap in duration, they are determined to compete; or
If the two arcs overlap in duration and the pronunciation similarity of the words they represent meets a preset requirement, they are determined to compete.
According to a preferred embodiment of the present invention, the confidence determination unit specifically comprises:
An initialization subunit, configured to initialize the confidence of the word represented by $A_i$ to the confidence of arc $A_i$, and to trigger the arc selection subunit;
An arc selection subunit, configured to, once triggered, select from the set T an arc that has not yet been selected;
A first update subunit, configured to judge whether the arc selected by the arc selection subunit represents the same word as $A_i$; if so, to update the confidence of the word represented by $A_i$ to its current value plus the confidence of the selected arc and trigger the judgment subunit; otherwise, to trigger the second update subunit;
A second update subunit, configured to judge whether the arc selected by the arc selection subunit, combined with the arcs connected to it, represents the same word as $A_i$; if so, to update the confidence of the word represented by $A_i$ by combining its current value with the confidences of the arcs in the combination and trigger the judgment subunit; otherwise, to trigger the judgment subunit directly;
A judgment subunit, configured to judge whether any unselected arc remains in the set T; if so, to trigger the arc selection subunit; otherwise, to end the confidence determination for the word represented by $A_i$.
According to a preferred embodiment of the present invention, when the second update subunit updates the confidence of the word represented by $A_i$, it specifically updates that confidence to its current value plus the minimum of the confidences of the arcs in the arc combination.
As the above technical solution shows, the invention takes the composition of compound words into account when determining the confidence of a speech recognition result: where several words combine to form one word, the confidence of that combination is also counted toward the confidence of the word, so that the confidence reflects the actual situation more accurately.
【Brief Description of the Drawings】
FIG. 1 is an example of a word graph;
FIG. 2 is a flowchart of the method provided by Embodiment 1 of the present invention;
FIG. 3 is a flowchart of the specific implementation of step 204 in FIG. 2, provided by Embodiment 1 of the present invention;
FIG. 4 is a structural diagram of the apparatus for determining the confidence of speech recognition results provided by Embodiment 2 of the present invention;
FIG. 5 is a structural diagram of the confidence determination unit provided by Embodiment 2 of the present invention.
【Detailed Description】
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the drawings and specific embodiments.
Embodiment 1
FIG. 2 is a flowchart of the method provided by Embodiment 1 of the present invention. As shown in FIG. 2, the method may specifically include the following steps:
Step 201: Determine the confidence of each arc in the word graph obtained by decoding.
In this step, the confidence of each arc equals the sum of the scores of all paths passing through that arc divided by the sum of the scores of all paths in the word graph, where the score of a path is the sum of its acoustic score and its language score.
Taking the word graph in FIG. 1 as an example again, suppose the path "人民大学" (Renmin University) has a score of 5, the path "中国" (China) - "人民" (the people) has a score of 3, and the path "中国" - "人们" (people) has a score of 2. Then:
The confidence of the arc "人民大学" is $\frac{5}{5+3+2} = 0.5$
The confidence of the arc "中国" is $\frac{3+2}{5+3+2} = 0.5$
The confidence of the arc "人民" is $\frac{3}{5+3+2} = 0.3$
The confidence of the arc "人们" is $\frac{2}{5+3+2} = 0.2$
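The definition in step 201 can be sketched by enumerating paths explicitly. This is only an illustration under several assumptions: arcs are identified by their word label, path scores are non-negative, and the three paths from the example are the only paths in the graph. The function name `arc_confidences` is hypothetical. Real decoders typically avoid enumerating paths and use a forward-backward pass over the lattice instead.

```python
from collections import defaultdict


def arc_confidences(path_scores):
    """Confidence of each arc = (sum of scores of paths through the arc)
    divided by (sum of scores of all paths).

    path_scores maps a path, given as a tuple of arc labels, to its score.
    """
    total = sum(path_scores.values())
    through = defaultdict(float)
    for path, score in path_scores.items():
        for arc in set(path):  # count each path once per arc it contains
            through[arc] += score
    return {arc: s / total for arc, s in through.items()}


# Path scores taken from the example in the text.
paths = {
    ("人民大学",): 5.0,
    ("中国", "人民"): 3.0,
    ("中国", "人们"): 2.0,
}
conf = arc_confidences(paths)
for arc, value in conf.items():
    print(arc, round(value, 3))
```

Under these assumptions the arc "中国" collects the scores of both paths that pass through it, which is why its confidence exceeds that of either "人民" or "人们" alone.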
Step 202: Determine the optimal path in the word graph.
The optimal path in the word graph is the path with the highest score among all paths.
Steps 201 and 202 are prior art and are not described further here. They may be executed simultaneously or one after the other in either order; the order given above is only one embodiment.
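Because the word graph is a directed acyclic graph, the highest-scoring path can be found without enumerating every path. The sketch below assumes that a path's score is the sum of per-arc scores and that the graph is given as a list of `(src, dst, word, score)` edges; the function name `best_path`, the node numbering, and the per-arc scores are all hypothetical, chosen so that the three path totals match the example values 5, 3, and 2.

```python
from functools import lru_cache


def best_path(edges, start, end):
    """Return (score, [words]) for the highest-scoring start-to-end path,
    or None if the end node cannot be reached."""
    out = {}
    for src, dst, word, score in edges:
        out.setdefault(src, []).append((dst, word, score))

    @lru_cache(maxsize=None)
    def solve(node):
        # Best (score, words) from this node to the end node.
        if node == end:
            return (0.0, ())
        best = None
        for dst, word, score in out.get(node, []):
            sub = solve(dst)
            if sub is None:
                continue  # dead end: this arc cannot reach the end node
            cand = (score + sub[0], (word,) + sub[1])
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    result = solve(start)
    return None if result is None else (result[0], list(result[1]))


# Hypothetical lattice: node 0 is the start node, node 2 the end node.
edges = [
    (0, 2, "人民大学", 5.0),
    (0, 1, "中国", 1.0),
    (1, 2, "人民", 2.0),
    (1, 2, "人们", 1.0),
]
result = best_path(edges, 0, 2)
```

Memoizing `solve` means each node is evaluated once, so the cost grows with the number of arcs rather than the number of paths.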
Step 203: For each arc $A_i$ on the optimal path, determine in the word graph the set T of arcs that compete with $A_i$.
Whether two arcs compete can be determined from timing: if the two arcs overlap in duration, they are determined to compete. For example, given two arcs $A_1 = \{W_1, A_{w1}, L_{w1}, S_{w1}, E_{w1}\}$ and $A_2 = \{W_2, A_{w2}, L_{w2}, S_{w2}, E_{w2}\}$, arc $A_2$ is considered to compete with arc $A_1$ if $S_{w2} \le (S_{w1} + E_{w1})/2 < E_{w2}$, that is, if $A_2$ spans the temporal midpoint of $A_1$.
To describe the competitive relationship more accurately, it may additionally be required that the pronunciation similarity of the words represented by the two arcs meets a preset requirement before the arcs are determined to compete. Pronunciation similarity can be measured by the edit distance between syllables, or by the Euclidean distance of the acoustic model or language model.
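The two competition tests can be sketched as follows. The midpoint condition is taken from the text; everything else is an assumption made for illustration: arcs are plain dicts with `start`, `end`, and `syllables` keys, similarity is measured by Levenshtein distance over syllable sequences, and the "preset requirement" is modelled as a hypothetical upper bound `max_dist` on that distance.

```python
def spans_midpoint(a, b):
    """True if arc b covers the temporal midpoint of arc a:
    S_b <= (S_a + E_a) / 2 < E_b."""
    mid = (a["start"] + a["end"]) / 2
    return b["start"] <= mid < b["end"]


def edit_distance(x, y):
    """Levenshtein distance between two syllable sequences."""
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, 1):
        cur = [i]
        for j, yj in enumerate(y, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (xi != yj)))  # substitution
        prev = cur
    return prev[-1]


def competes(a, b, max_dist=None):
    """Does arc b compete with arc a? With max_dist=None only timing is
    checked; otherwise the syllable edit distance must not exceed max_dist
    (a hypothetical form of the preset similarity requirement)."""
    if not spans_midpoint(a, b):
        return False
    if max_dist is None:
        return True
    return edit_distance(a["syllables"], b["syllables"]) <= max_dist


# Invented timings and pinyin syllables, for illustration only.
whole = {"start": 0.0, "end": 1.0, "syllables": ["zhong", "guo", "ren", "min"]}
first = {"start": 0.0, "end": 0.6, "syllables": ["zhong", "guo"]}
second = {"start": 0.6, "end": 1.0, "syllables": ["ren", "min"]}
```

With these invented timings, `first` spans the midpoint of `whole` and so competes with it, while `second` starts after the midpoint and does not. This shows that the midpoint test is asymmetric and stricter than a plain overlap check.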
Step 204: When determining the confidence of the word represented by each arc $A_i$ on the optimal path, identify from the set T of arcs competing with $A_i$ an arc $A_j$ such that either $A_j$ represents the same word as $A_i$, or $A_j$ combined with the arcs connected to it forms the same word as $A_i$; then determine the confidence of the word represented by $A_i$ by combining the confidences of $A_i$ and $A_j$, and where applicable also the confidences of the arcs connected to $A_j$.
Specifically, the flow shown in FIG. 3 can be executed for each arc $A_i$ on the optimal path to obtain the confidence of the word represented by each arc. As shown in FIG. 3, the flow includes the following steps:
Step 301: Initialize the confidence of the word represented by $A_i$ to the confidence of arc $A_i$.
Step 302: Select an arc that has not yet been selected from the set of arcs competing with $A_i$.
Step 303: Judge whether the selected arc represents the same word as $A_i$; if so, go to step 304; otherwise, go to step 305.
Step 304: Set the confidence of the word represented by $A_i$ to its current value plus the confidence of the selected arc, then go to step 307.
Step 305: Judge whether the selected arc combined with the arcs connected to it represents the same word as $A_i$; if so, go to step 306; otherwise, go to step 307.
Here the selected arc can be extended forward or backward to combine with the arcs connected to it.
Step 306: Update the confidence of the word represented by $A_i$ to its current value plus the minimum of the confidences of the arcs in the arc combination, then go to step 307.
For example, in the word graph shown in FIG. 1, the arc "中国" (China) competes with the arc "中国人民" (Chinese people). Extending the arc "中国" backward to the arc "人民" (people) connected to it yields an arc combination that also represents "中国人民", so the smaller of the confidences of the arcs "中国" and "人民" is added to the confidence of the arc "中国人民", and the result is taken as the confidence of the word "中国人民".
Adding the minimum confidence of the arcs in the combination to the word's current confidence is the preferred implementation in this embodiment. Other approaches are possible, such as adding the average confidence of the arcs in the combination, but they are less accurate than adding the minimum.
Step 307: Judge whether any unselected arc remains in the set of arcs competing with $A_i$; if so, return to step 302; otherwise, end the confidence determination for the word represented by $A_i$.
In this way the confidence of the word represented by each arc on the optimal path is obtained. This confidence accounts for the case where the component words of a compound word are decoded as separate words, so it more accurately reflects the likelihood that the word is the best recognition result.
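Steps 301 to 307 can be sketched as a single loop over the competing arcs. This is a simplified illustration, not the patented implementation: the lattice is a hypothetical dict of arc records with `word`, `conf`, and optional `prev`/`next` lists of connected arc ids; combinations are limited to the selected arc plus one directly connected arc; and words are compared by string concatenation. The confidence values are invented. The text does not say how to avoid counting the same combination twice when several of its arcs are in the competing set, and this sketch does not address that either.

```python
def word_confidence(arcs, best_id, competitor_ids):
    """Confidence of the word on arc best_id, following steps 301-307."""
    target = arcs[best_id]["word"]
    conf = arcs[best_id]["conf"]              # step 301: start from the arc's own confidence
    for cid in competitor_ids:                # steps 302 and 307: visit each competitor once
        arc = arcs[cid]
        if arc["word"] == target:             # step 303: same word?
            conf += arc["conf"]               # step 304
            continue
        # Step 305: extend the selected arc by one connected arc on either side.
        combos = [(pid, cid) for pid in arc.get("prev", [])]
        combos += [(cid, nid) for nid in arc.get("next", [])]
        for left, right in combos:
            if arcs[left]["word"] + arcs[right]["word"] == target:
                # Step 306: add the minimum confidence within the combination.
                conf += min(arcs[left]["conf"], arcs[right]["conf"])
                break                         # at most one update per selected arc
    return conf


# Hypothetical lattice with invented confidences.
lattice = {
    "a": {"word": "中国人民", "conf": 0.5},
    "b": {"word": "中国", "conf": 0.5, "next": ["c", "d"]},
    "c": {"word": "人民", "conf": 0.3, "prev": ["b"]},
    "d": {"word": "人们", "conf": 0.2, "prev": ["b"]},
}
compound_conf = word_confidence(lattice, "a", ["b"])
```

In this example the arc "中国" is the only competitor of "中国人民". Combined with "人民" it spells the same word, so the smaller of the two confidences (0.3) is added to the initial 0.5, giving approximately 0.8. Replacing `min` with an average would implement the alternative the text mentions as less accurate.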
The method provided by the present invention has been described in detail above. The apparatus provided by the present invention is described in detail below with reference to Embodiment 2.
Embodiment 2
FIG. 4 is a structural diagram of the apparatus for determining the confidence of speech recognition results provided by Embodiment 2 of the present invention. As shown in FIG. 4, the apparatus may include an initial determination unit 400, a set determination unit 410, and a confidence determination unit 420.
First, the initial determination unit 400 determines the confidence of each arc in the word graph obtained by decoding, and determines the optimal path in the word graph. Specifically, the confidence of each arc equals the sum of the scores of all paths passing through that arc divided by the sum of the scores of all paths in the word graph, where the score of a path is the sum of its acoustic score and its language score. The optimal path in the word graph is the path with the highest score among all paths.
Next, for each arc $A_i$ on the optimal path, the set determination unit 410 determines in the word graph the set T of arcs that compete with $A_i$.
When determining whether two arcs compete, the set determination unit 410 may proceed as follows: if the two arcs overlap in duration, they are determined to compete; or, if the two arcs overlap in duration and the pronunciation similarity of the words they represent meets a preset requirement, they are determined to compete. Pronunciation similarity can be measured by the edit distance between syllables, or by the Euclidean distance of the acoustic model or language model.
Finally, when determining the confidence of the word represented by $A_i$, the confidence determination unit 420 identifies from the set T of arcs competing with $A_i$ an arc $A_j$ such that either $A_j$ represents the same word as $A_i$, or $A_j$ combined with the arcs connected to it forms the same word as $A_i$; it then determines the confidence of the word represented by $A_i$ by combining the confidences of $A_i$ and $A_j$, and where applicable also the confidences of the arcs connected to $A_j$.
The specific structure of the confidence determination unit 420 is described in detail below. As shown in FIG. 5, the confidence determination unit 420 may specifically include an initialization subunit 421, an arc selection subunit 422, a first update subunit 423, a second update subunit 424, and a judgment subunit 425.
The initialization subunit 421 is configured to initialize the confidence of the word represented by $A_i$ to the confidence of arc $A_i$, and then to trigger the arc selection subunit 422.
The arc selection subunit 422 is configured to, once triggered, select from the set T of arcs competing with $A_i$ an arc that has not yet been selected.
The first update subunit 423 is configured to judge whether the arc selected by the arc selection subunit 422 represents the same word as $A_i$; if so, it updates the confidence of the word represented by $A_i$ to its current value plus the confidence of the selected arc and triggers the judgment subunit 425; otherwise, it triggers the second update subunit 424.
The second update subunit 424 is configured to judge whether the arc selected by the arc selection subunit 422, combined with the arcs connected to it, represents the same word as $A_i$; if so, it updates the confidence of the word represented by $A_i$ by combining its current value with the confidences of the arcs in the combination and triggers the judgment subunit 425; otherwise, it triggers the judgment subunit 425 directly.
Preferably, when the second update subunit 424 updates the confidence of the word represented by $A_i$, it may update that confidence to its current value plus the minimum of the confidences of the arcs in the arc combination.
The judgment subunit 425 is configured to judge whether any unselected arc remains in the set T; if so, it triggers the arc selection subunit 422; otherwise, it ends the confidence determination for the word represented by $A_i$.
Once the confidence of the word represented by each arc on the optimal path has been determined with the above method and apparatus, it can be used in applications including but not limited to the following:
1) If the confidence of a word on the optimal path is below a preset confidence threshold, the recognition result determined from the optimal path contains an inaccurate result. To spare the user the poor experience of a recognition error, the system can decline to return the recognition result and can further prompt the user to speak again.
2) The determined word confidences can be applied to unsupervised adaptation in speech recognition, that is, used in the subsequent adaptive adjustment of the acoustic model and the language model, which makes speech recognition more accurate.
3) The confidences can be used to correct recognition results: if the confidence of a word is below a preset confidence threshold, the recognition of that word is likely wrong, which provides a basis for error correction.
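The first and third applications both reduce to comparing word confidences against a threshold. A minimal sketch follows, assuming the recognition result is a list of `(word, confidence)` pairs; the function names, the threshold value 0.5, and the sample confidences are hypothetical, since the text only speaks of a preset threshold.

```python
def low_confidence_words(word_confs, threshold=0.5):
    """Words whose confidence falls below the threshold: candidates for
    error correction (application 3)."""
    return [word for word, conf in word_confs if conf < threshold]


def accept_result(word_confs, threshold=0.5):
    """Application 1: return the result only if no word is below the
    threshold; otherwise the caller might prompt the user to speak again."""
    return not low_confidence_words(word_confs, threshold)


# Invented confidences for two recognition results.
good = [("中国人民", 0.8), ("大学", 0.7)]
doubtful = [("中国人民", 0.8), ("人们", 0.2)]
```

Here `accept_result(good)` holds, while `doubtful` is rejected and `low_confidence_words(doubtful)` singles out "人们" as the word to re-examine.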
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210459131.7A CN103810997B (en) | 2012-11-14 | 2012-11-14 | A kind of method and apparatus for determining voice identification result confidence level |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210459131.7A CN103810997B (en) | 2012-11-14 | 2012-11-14 | A kind of method and apparatus for determining voice identification result confidence level |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103810997A CN103810997A (en) | 2014-05-21 |
CN103810997B true CN103810997B (en) | 2018-04-03 |
Family
ID=50707676
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210459131.7A Active CN103810997B (en) | 2012-11-14 | 2012-11-14 | A kind of method and apparatus for determining voice identification result confidence level |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103810997B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106157969B (en) * | 2015-03-24 | 2020-04-03 | 阿里巴巴集团控股有限公司 | Method and device for screening voice recognition results |
CN109243460A (en) * | 2018-08-15 | 2019-01-18 | 浙江讯飞智能科技有限公司 | A method of automatically generating news or interrogation record based on the local dialect |
CN111341305B (en) * | 2020-03-05 | 2023-09-26 | 苏宁云计算有限公司 | Audio data labeling method, device and system |
CN113223500B (en) * | 2021-04-12 | 2022-02-25 | 北京百度网讯科技有限公司 | Speech recognition method, method for training speech recognition model and corresponding device |
CN114255754A (en) * | 2021-12-27 | 2022-03-29 | 贝壳找房网(北京)信息技术有限公司 | Speech recognition method, electronic device, program product, and storage medium |
CN115862600B (en) * | 2023-01-10 | 2023-09-12 | 广州小鹏汽车科技有限公司 | Voice recognition method and device and vehicle |
CN117688319B (en) * | 2023-11-10 | 2024-05-07 | 山东恒云信息科技有限公司 | Method for analyzing database structure by using AI |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7092883B1 (en) * | 2002-03-29 | 2006-08-15 | At&T | Generating confidence scores from word lattices |
CN101118745A (en) * | 2006-08-04 | 2008-02-06 | 中国科学院声学研究所 | A Fast Calculation Method of Confidence Degree in Speech Recognition System |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE457510T1 (en) * | 2005-12-08 | 2010-02-15 | Nuance Comm Austria Gmbh | LANGUAGE RECOGNITION SYSTEM WITH HUGE VOCABULARY |
US7890325B2 (en) * | 2006-03-16 | 2011-02-15 | Microsoft Corporation | Subword unit posterior probability for measuring confidence |
- 2012-11-14: CN application CN201210459131.7A, patent CN103810997B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN103810997A (en) | 2014-05-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN103810997B (en) | A kind of method and apparatus for determining voice identification result confidence level | |
CN107123417B (en) | Customized voice awakening optimization method and system based on discriminant training | |
US12183326B2 (en) | Speech recognition error correction method, related devices, and readable storage medium | |
US9858917B1 (en) | Adapting enhanced acoustic models | |
US8914288B2 (en) | System and method for advanced turn-taking for interactive spoken dialog systems | |
US9589564B2 (en) | Multiple speech locale-specific hotword classifiers for selection of a speech locale | |
JP4876134B2 (en) | Speaker authentication | |
JP6464650B2 (en) | Audio processing apparatus, audio processing method, and program | |
JP5598331B2 (en) | Language model creation device | |
CN110910885B (en) | Voice wake-up method and device based on decoding network | |
WO2017166650A1 (en) | Voice recognition method and device | |
US20230121683A1 (en) | Text output method and system, storage medium, and electronic device | |
WO2021212817A1 (en) | Method and apparatus for correcting voice dialogue | |
CN105468582B (en) | A kind of method and device for correcting of the numeric string based on man-machine interaction | |
KR20210154849A (en) | 2-pass end-to-end speech recognition | |
WO2023070803A1 (en) | Speech recognition method and apparatus, device, and storage medium | |
CN102436816A (en) | Voice data decoding method and device | |
CN103680500B (en) | A kind of method and apparatus of speech recognition | |
CN114139524B (en) | Method and device for predicting story text and electronic equipment | |
WO2014036827A1 (en) | Text correcting method and user equipment | |
CN117351963A (en) | Methods, devices, equipment and readable media for speech recognition | |
CN116153294A (en) | Speech recognition method, device, system, equipment and medium | |
US11263852B2 (en) | Method, electronic device, and computer readable storage medium for creating a vote | |
CN105632500A (en) | Voice recognition apparatus and method of controlling the same | |
CN110491394A (en) | Wake up the acquisition methods and device of corpus |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |