
CN119785771B - Word inserting method and device for decoding network, electronic equipment and storage medium - Google Patents

Word inserting method and device for decoding network, electronic equipment and storage medium

Info

Publication number
CN119785771B
CN119785771B (application CN202411940606.3A)
Authority
CN
China
Prior art keywords
node, word, network, sub, candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411940606.3A
Other languages
Chinese (zh)
Other versions
CN119785771A (en)
Inventor
张伟生
陆梦寒
费大勇
熊世富
高建清
刘聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202411940606.3A priority Critical patent/CN119785771B/en
Publication of CN119785771A publication Critical patent/CN119785771A/en
Application granted granted Critical
Publication of CN119785771B publication Critical patent/CN119785771B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention relates to the field of computer technology and provides a word insertion method and apparatus for a decoding network, an electronic device, and a storage medium. The method includes: determining a slot to be inserted and the candidate words corresponding to that slot; and, when the slot to be inserted includes a plurality of repeated identical slots, multiplexing the candidate words corresponding to those identical slots into a single candidate word node and connecting that node to the endpoints of the identical slots to obtain the decoding network after word insertion. Because the candidate words corresponding to the repeated identical slots are multiplexed into one candidate word node, each candidate word needs to be constructed only once. Compared with the prior art, in which one insertion is performed per slot and the candidate words are therefore constructed repeatedly, this reduces both the time cost of word insertion and the newly added memory occupation.

Description

Word inserting method and device for decoding network, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a word inserting method and apparatus for a decoding network, an electronic device, and a storage medium.
Background
With the advent of the mobile internet era, speech recognition, as an important interface for human-machine interaction, has received increasing attention from companies and manufacturers; in embedded applications in particular, voice interaction has become an essential function.
End-to-end speech recognition converts an audio sequence directly into a text sequence, and decoding strategies such as greedy decoding and beam decoding are usually required when composing the text. Because end-to-end speech recognition converts audio directly into text without a dedicated language model, the recognition result is hard to control, and recognition is poor for personalized content such as the proper nouns in a contact list. A command word network is therefore introduced into the decoding process.
Generating the command word network requires a compilation process, which is tedious and time-consuming. To reduce the time spent loading and compiling resources, the general body content is normally compiled offline, so that at use time only the compiled network needs to be loaded. The user's personalized content, however, must be supported by a word insertion function; that is, a word insertion method for the command word network is needed.
Disclosure of Invention
The invention provides a word insertion method and apparatus for a decoding network, an electronic device, and a storage medium, to overcome the defects of slow word insertion and large memory occupation of decoding networks in the prior art.
The invention provides a word insertion method for a decoding network, comprising:
determining a slot to be inserted and a candidate word corresponding to the slot to be inserted;
multiplexing the candidate words corresponding to a plurality of identical slots into the same candidate word node when the slot to be inserted includes the plurality of identical slots that occur repeatedly;
and connecting the candidate word node with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
According to the word insertion method for a decoding network provided by the invention, connecting the candidate word node with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion comprises:
combining the candidate words into a sub-network and taking the sub-network as the candidate word node when there are a plurality of candidate words corresponding to the slot to be inserted;
adding a common head node and a common tail node to the sub-network;
and connecting the common head node and the common tail node respectively with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
According to the word insertion method for a decoding network provided by the invention, connecting the common head node and the common tail node respectively with the endpoints of the identical slots to obtain the decoding network after word insertion comprises:
connecting the common head node and the common tail node with each candidate word in the sub-network through real arcs, and connecting the common head node and the common tail node with the endpoints of the identical slots through empty arcs, to obtain the decoding network after word insertion.
The word insertion method for a decoding network provided by the invention further comprises:
determining the common head node of the sub-network connected to the slot to be unloaded;
traversing each node starting from the common head node until an empty arc is found among the outgoing arcs of a node, and taking that node as the common tail node of the sub-network;
and deleting all arcs and nodes visited during the traversal and releasing their memory, so as to unload the sub-network.
According to the word insertion method for a decoding network provided by the invention, determining the common head node of the sub-network connected to the slot to be unloaded comprises:
determining the left endpoint of the slot to be unloaded;
and finding the empty arc among the outgoing arcs of the left endpoint, and taking the first node reached by the empty arc as the common head node of the sub-network.
The word insertion method for a decoding network provided by the invention further comprises:
merging nodes of the sub-network in the case that two different nodes with the same outgoing arcs exist in the sub-network.
According to the word insertion method for a decoding network provided by the invention, merging nodes of the sub-network when two different nodes with the same outgoing arcs exist in the sub-network comprises:
merging the nodes that have the same outgoing arcs as the current termination node when such nodes exist, and updating the outgoing-arc information of the nodes that point to the current termination node;
and updating the current termination node of the sub-network based on the updated outgoing-arc information, and repeating the node merging and outgoing-arc updating until all nodes of the sub-network have been traversed.
The invention also provides a word inserting device of the decoding network, which comprises:
a determining unit, configured to determine a slot to be inserted and the candidate words corresponding to the slot to be inserted;
a multiplexing unit, configured to multiplex candidate words corresponding to a plurality of identical slots into a same candidate word node when the slot to be inserted includes the plurality of identical slots that repeatedly occur;
and the word inserting unit is used for connecting the candidate word nodes with the endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
The invention also provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements any of the above word insertion methods for a decoding network.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above word insertion methods for a decoding network.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements any of the above word insertion methods for a decoding network.
According to the word insertion method and apparatus, electronic device, and storage medium for a decoding network described above, when the slot to be inserted includes a plurality of repeated identical slots, the candidate words corresponding to the repeated identical slots are multiplexed into the same candidate word node, i.e. each candidate word needs to be constructed only once. Compared with the prior art, in which one insertion is required per slot and the candidate words are therefore constructed repeatedly, this reduces the time cost of word insertion and, at the same time, the newly added memory occupation.
Drawings
To illustrate the invention or the technical solutions of the prior art more clearly, the drawings used in the embodiments or in the description of the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a command word network in a related art location in an end-to-end recognition system.
Fig. 2 is a schematic diagram of a command word network precompilation and use process in the related art.
Fig. 3 is a schematic diagram of a related art word insertion method.
Fig. 4 is a second schematic diagram of a related art word insertion method.
Fig. 5 is a schematic flow chart of a word inserting method of a decoding network according to the present invention.
Fig. 6 is a schematic diagram of candidate word multiplexing provided by the present invention.
Fig. 7 is a schematic diagram of a subnetwork provided by the present invention.
Fig. 8 is a schematic diagram of personalized candidate words under word-level modeling provided by the present invention.
Fig. 9 is a schematic diagram of a subnetwork with head-to-tail nodes provided by the present invention.
Fig. 10 is a schematic diagram of a sub-network connected with empty arcs provided by the present invention.
Fig. 11 is a schematic flow chart of a sub-network offloading method provided by the present invention.
Fig. 12 is a schematic diagram of subnetwork merging provided by the present invention.
Fig. 13 is a schematic structural diagram of a word inserting device of a decoding network according to the present invention.
Fig. 14 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
End-to-end speech recognition is a framework distinguished from traditional speech recognition. Traditional speech recognition generally consists of two parts: an acoustic model and a language model. The acoustic model converts an audio sequence into a phoneme sequence, such as the familiar Chinese pinyin or English phonetic symbols, or multi-phoneme units such as diphones and triphones. The language model converts these phoneme sequences into text sequences. The two parts need not be coupled and can be trained independently. The disadvantage of traditional speech recognition is that the model training process is cumbersome and the recognition quality is jointly determined by both parts, so improving a single model does not necessarily improve the overall result; it has therefore gradually been replaced by end-to-end models.
End-to-end speech recognition converts an audio sequence directly into a text sequence, and decoding strategies such as greedy decoding and beam decoding are usually required when composing the text. Because end-to-end speech recognition converts audio directly into text without a dedicated language model, the recognition result is hard to control, and recognition is poor for personalized content such as the proper nouns in a contact list. At this point a command word network is introduced during the decoding process.
Fig. 1 is a schematic diagram of the location of the command word network in an end-to-end recognition system in the related art. As shown in Fig. 1, the command word network is a finite state transducer (FST) whose input is the decoding state and whose output is the recognition result. During beam decoding, the Viterbi algorithm is used to find the highest-scoring decoding path in the network, constraining the beam decoding result toward the expected recognition result.
Generating the command word network requires a compilation process that is tedious and time-consuming. To reduce the time spent loading and compiling resources, the general body content is normally compiled offline, so that at use time only the compiled network needs to be loaded. The user's personalized content, in turn, must be supported by a word insertion function.
Through word insertion, the user can upload personalized content into the network. For example, when the user uses a phone function, the contacts in his phone book do not exist in the original, offline-compiled network; the user must upload them, and the word insertion function inserts them into the original network. Thus, for each user, a personalized command word network can be realized by providing a universal original network and letting the user upload and insert personalized content.
Fig. 2 is a schematic diagram of the pre-compilation and use of a command word network in the related art. As shown in Fig. 2, in the pre-compilation stage a written command word file is compiled into a command word network usable by a program and converted into a binary file for storage. The command word file consists of written sentence patterns, slots, and candidate words; a sentence pattern generally contains several candidate slots, each slot generally contains several candidate words, and many command words can be formed from combinations of sentence patterns, slots, and candidate words. At use time, the command word network is loaded from the binary file; the user can customize the candidate words of a slot, i.e. the user's personalized file, and these candidate words are expanded into a new command word network by word insertion.
Fig. 3 is a schematic diagram of a related-art word insertion method. As shown in Fig. 3, the existing word insertion method for a command word network connects the front and rear endpoints of each slot, recorded when the network was pre-compiled, to the new personalized candidate word with newly created arcs; if the user has several personalized candidate words, the above operation is repeated, thereby realizing dynamic word insertion into the command word network. This related-art method suffers from slow word insertion and large memory occupation.
The word insertion is slow because the prior-art scheme must insert each personalized candidate word into every slot it belongs to, one by one. If there were one and only one slot to insert into in the command word network, a single insertion as described above would suffice. In reality, however, the command word network is generally complex, and the same slot may occur repeatedly within a sentence pattern. Fig. 4 is a second schematic diagram of the related-art word insertion method. As shown in Fig. 4, the sentence pattern contains multiple slots to be inserted, i.e. two occurrences of slot B; to insert the personalized candidate word A at every repeated location, a separate insertion must be performed at each one, greatly increasing the time cost of word insertion.
The memory occupation is large because, when there are several identical slots to be inserted, each slot must receive its own copy of every candidate word, so the command word network gains M×N new candidate words and at least 2×M×N arcs, where M is the number of occurrences of the slot and N is the number of candidate words to insert into it. With many candidate words, the memory occupied by the command word network grows sharply.
To address the slow insertion speed and large memory occupation of existing word insertion methods, an embodiment of the invention provides a word insertion method for a decoding network. In this method, when the slot to be inserted includes a plurality of identical, repeatedly occurring slots, i.e. the same slot appears repeatedly in a sentence pattern, the candidate words corresponding to those identical slots are multiplexed into the same candidate word node, and that node is then connected to the endpoints of the identical slots to obtain the decoding network after word insertion.
Compared with the prior art, in which one insertion must be performed per slot and the candidate words are therefore constructed repeatedly, this embodiment multiplexes the candidate words corresponding to the repeated identical slots into one candidate word node, so each candidate word needs to be constructed only once; this reduces the time cost of word insertion and the newly added memory occupation.
The embodiment of the invention can be applied to any scenario requiring word insertion into a decoding network. The execution subject of the method may be an electronic device such as a terminal device, a computer, a server or server cluster, or specially designed decoding-network word insertion equipment, or a word insertion apparatus provided in the electronic device; the apparatus may be implemented in software, hardware, or a combination of the two.
In the description of the embodiments of the present invention, the meaning of "plurality" is two or more, unless explicitly defined otherwise.
Fig. 5 is a schematic flow chart of a word inserting method of a decoding network according to the present invention, as shown in fig. 5, the method includes the following steps:
Step 510: determining a slot to be inserted and a candidate word corresponding to the slot to be inserted;
Step 520: multiplexing the candidate words corresponding to a plurality of identical slots into the same candidate word node when the slot to be inserted includes the plurality of identical slots that occur repeatedly;
Step 530: connecting the candidate word node with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
Specifically, the decoding network refers to the network used to decode audio features in the speech recognition process, and may include a command word network. Each slot may correspond to one or more candidate words, which is not specifically limited in this embodiment of the invention.
Referring to Fig. 4, where the slot to be inserted in the sentence pattern includes a plurality of identical, repeatedly occurring slots (slot B), when the personalized candidate word A needs to be inserted into slot B, the prior art must perform an insertion at every occurrence so that all occurrences of slot B receive the word; two nodes for candidate word A must then be constructed, and A is inserted into each slot B in turn.
Fig. 6 is a schematic diagram of candidate word multiplexing provided by the present invention. Referring to Fig. 6, this embodiment of the invention multiplexes the candidate word corresponding to the identical slots, i.e. the personalized candidate word A, into the same candidate word node. During insertion, only one candidate word node is constructed in the network, and it is connected to the front and rear endpoints of all occurrences of slot B, yielding the decoding network after word insertion.
Thus, for a slot B that occurs M times in the original personalized network, the personalized candidate word A previously had to be constructed M times but now needs to be constructed only once, so the additional memory occupied by a newly inserted candidate word is reduced to 1/M.
In addition, multiplexing the candidate words corresponding to identical slots into the same candidate word node also reduces the time cost of word insertion.
According to the method provided by this embodiment of the invention, when the slot to be inserted includes a plurality of repeated identical slots, the candidate words corresponding to the repeated identical slots are multiplexed into the same candidate word node, i.e. each candidate word needs to be constructed only once. Compared with the prior art, in which one insertion is required per slot and the candidate words are therefore constructed repeatedly, this reduces the time cost of word insertion and the newly added memory occupation.
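The multiplexing step can be sketched in a few lines of Python. This is an illustrative sketch, not the patented implementation: the adjacency-list representation, integer node ids, and the `<eps>` label on the return arcs are assumptions made here for clarity.

```python
# Sketch: insert one shared candidate-word node for every occurrence of a slot.
# The network is a dict mapping node id -> list of (label, destination) arcs.

def insert_shared_word(network, slot_endpoints, word):
    """Build `word` once and wire it to every (left, right) endpoint pair.

    slot_endpoints: [(left_node, right_node), ...], one pair per occurrence
    of the repeated slot. Returns the id of the shared candidate word node.
    """
    shared = max(network) + 1            # the single multiplexed word node
    network[shared] = []
    for left, right in slot_endpoints:   # only 2 new arcs per occurrence
        network[left].append((word, shared))      # arc into the shared node
        network[shared].append(("<eps>", right))  # arc back out to the slot
    return shared
```

For a slot occurring M times, the word node is built once rather than M times; only the 2×M connecting arcs scale with M.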
Based on any of the above embodiments, connecting the candidate word node with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion, i.e. step 530, specifically comprises:
Step 531: combining the candidate words into a sub-network and taking the sub-network as the candidate word node when there are a plurality of candidate words corresponding to the slot to be inserted;
Step 532: adding a common head node and a common tail node to the sub-network;
Step 533: connecting the common head node and the common tail node respectively with the endpoints of the identical slots to obtain the decoding network after word insertion.
Specifically, it can be seen from the above embodiment that the number of arcs connecting the slot and the personalized candidate word is unchanged: for personalized candidate word A, 2×M new arcs must still be added. When there are several candidate words for the slot to be inserted, say N of them, 2×M×N arcs must be added, which is still a considerable memory overhead.
In order to solve the problem, the present embodiment combines the plurality of candidate words into one sub-network, and uses the sub-network as a candidate word node. Fig. 7 is a schematic diagram of a sub-network provided in the present invention, and as shown in fig. 7, the candidate words include a personalized candidate word 1, a personalized candidate word 2, and a personalized candidate word 3.
At this point every personalized candidate word in the sub-network still has its own individual incoming and outgoing arcs, which carry word information specific to that candidate word and therefore cannot be reused. Fig. 8 is a schematic diagram of personalized candidate words under word-level modeling provided by the present invention. As shown in Fig. 8, taking a word-level modeling unit as an example, if the two personalized candidate words "Zhang San" and "Li Si" are added, different word information is stored on their arcs, so the two candidate words cannot share them.
In this embodiment, a common head node and a common tail node are added to the sub-network. Fig. 9 is a schematic diagram of the sub-network with head and tail nodes; as shown in Fig. 9, the common head node and common tail node are connected respectively to the endpoints of the identical slots. Without them, each candidate word would need 2 arcs per slot to connect to the slot's front and rear nodes; after the common head and tail nodes are introduced, only the head and tail nodes need 2 arcs per slot, i.e. 2×M new arcs for M slots. Connecting all N candidate words to the common head and tail nodes requires another 2×N new arcs.
Therefore, this insertion scheme adds 2×M+2×N new arcs in total; compared with the 2×M×N new arcs required by the existing scheme, the number of new arcs is greatly reduced, and so is the memory occupied by the network.
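The saving is easy to check numerically; the concrete values of M and N below are illustrative only.

```python
# Arc-count comparison: M occurrences of the slot, N candidate words.
M, N = 5, 100
arcs_prior = 2 * M * N       # prior art: one full insertion per occurrence
arcs_shared = 2 * M + 2 * N  # common head/tail nodes shared across occurrences
print(arcs_prior, arcs_shared)  # 1000 vs 210 new arcs
```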
According to the method provided by this embodiment of the invention, when the slot to be inserted has a plurality of candidate words, the candidate words are combined into one sub-network, and a common head node and a common tail node are added to the sub-network; this greatly reduces the number of newly created arcs while keeping the sub-network reusable, further reducing the memory occupied by the network.
Based on any of the above embodiments, connecting the common head node and the common tail node respectively with the endpoints of the plurality of identical slots to obtain the decoding network after word insertion, i.e. step 533, specifically comprises:
connecting the common head node and the common tail node with each candidate word in the sub-network through real arcs, and connecting the common head node and the common tail node with the endpoints of the plurality of identical slots through empty arcs, to obtain the decoding network after word insertion.
Specifically, the inventors found that the candidate-word sub-network with common head and tail nodes described above is not topologically equivalent to inserting the candidate word directly: for a single candidate word, the topological distance between the candidate word and the left and right nodes of the slot changes from 1 arc to 2 arcs, and more than one candidate word is connected to the common head and tail nodes. In finite state transducers (FST) there is the concept of an empty arc (epsilon), i.e. an arc whose input and output are both empty, which realizes an unconditional jump. Fig. 10 is a schematic diagram of a sub-network connected with empty arcs. As shown in Fig. 10, the slot is connected to the common head and tail nodes with empty arcs, i.e. a candidate word is separated from the slot endpoints by one empty arc plus one real arc, which is topologically equivalent to a single real arc. The decoding network obtained with this new insertion scheme therefore keeps the topology unchanged after empty arcs are introduced. The network can thus be constructed with 2×M empty arcs and 2×N real arcs.
According to the method provided by this embodiment of the invention, introducing empty arcs keeps the topology unchanged while greatly reducing the number of newly created arcs, further reducing the memory occupied by the network.
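The construction with common head/tail nodes and epsilon connections can be sketched as follows. The adjacency-list representation, the `<eps>` label, and the choice to place the word label on both real arcs are assumptions of this sketch, not details fixed by the text.

```python
# Sketch: combine N candidate words into one sub-network behind a common
# head/tail pair, then attach it to all M slot occurrences with empty arcs.

def build_subnetwork(network, slot_endpoints, words):
    head = max(network) + 1
    tail = head + 1
    network[head], network[tail] = [], []
    for w in words:                           # 2*N real arcs in total
        node = max(network) + 1
        network[node] = [(w, tail)]           # real arc out to the common tail
        network[head].append((w, node))       # real arc in from the common head
    for left, right in slot_endpoints:        # 2*M empty arcs in total
        network[left].append(("<eps>", head))
        network[tail].append(("<eps>", right))
    return head, tail
```

Note that only the slot-to-sub-network connections use `<eps>`; keeping the arcs inside the sub-network real is what later lets the unloading traversal tell the two apart.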
Based on any of the above embodiments, consider that in the prior art the candidate word to be inserted is connected to the front and rear endpoints of the slot with arcs; once connected, the candidate word becomes part of the network and can no longer be separated from the command word network, so only new candidate words can be added to the existing network. When a user needs to delete a personalized candidate word inserted into some slot, the network must be restored from the original binary file and every candidate word that should not be deleted must be reinserted, which greatly reduces the flexibility of the command word network.
To address this difficulty of unloading from command word networks, Fig. 11 is a schematic flow chart of the sub-network unloading method provided by the invention. As shown in Fig. 11, the method further comprises:
Step 1110: determining the common head node of the sub-network connected to the slot to be unloaded;
Step 1120: traversing each node starting from the common head node until an empty arc is found among the outgoing arcs of a traversed node, and taking that node as the common tail node of the sub-network;
Step 1130: deleting all arcs and nodes visited during the traversal and releasing their memory, so as to unload the sub-network.
Specifically, the decoding network after word insertion constructed in the above manner contains two kinds of arcs: empty arcs and real arcs. The original network and the added personalized candidate word sub-networks can therefore be distinguished simply by distinguishing empty arcs from real arcs.
First, the public head node of the sub-network connected to the slot to be unloaded is determined, which includes:
determining the left endpoints of the slot to be unloaded. All left and right endpoints of the slot can be found from the information recorded during pre-compilation. Starting from each left endpoint, the empty arcs among its out-arcs are searched, and the first node connected by an empty arc is taken as the public head node of the sub-network.
Then, the nodes are traversed starting from the public head node until an empty arc is found among the out-arcs of a node, and that node is taken as the public tail node of the sub-network.
Finally, all arcs and nodes encountered during the traversal are deleted and the memory is released, thereby unloading the sub-network.
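Continuing with the same hypothetical representation (out-arcs stored per node as (destination, label) pairs), the three unloading steps can be sketched as follows; the function and node ids are illustrative assumptions, not the patent's implementation:

```python
# Hypothetical sketch of sub-network unloading.  Starting from the slot's
# left endpoint, the empty arc leads to the public head node; the traversal
# then deletes nodes until it reaches the node whose out-arcs contain an
# empty arc (the public tail node), which is deleted as well.
EPS = "<eps>"

def unload_subnetwork(out_arcs, left_endpoint):
    """out_arcs: dict node -> list of (dst, label); modified in place."""
    # step 1110: the public head node is the target of the empty arc
    head = next(dst for dst, lbl in out_arcs[left_endpoint] if lbl == EPS)
    out_arcs[left_endpoint] = [(d, l) for d, l in out_arcs[left_endpoint]
                               if l != EPS]       # detach the slot endpoint
    # steps 1120/1130: traverse forward, deleting arcs and nodes as we go
    frontier = [head]
    while frontier:
        node = frontier.pop()
        arcs = out_arcs.pop(node, [])             # delete the node and its arcs
        if any(lbl == EPS for _, lbl in arcs):
            continue                              # public tail node: stop here
        frontier.extend(dst for dst, _ in arcs)

# tiny example: slot endpoints 0 (left) and 9 (right), head 100, tail 101,
# candidate word nodes 102 and 103
net = {0: [(100, EPS)], 100: [(102, "hi"), (103, "yo")],
       102: [(101, "hi")], 103: [(101, "yo")], 101: [(9, EPS)]}
unload_subnetwork(net, 0)
assert net == {0: []}   # the personalized sub-network is gone; the slot's
                        # right endpoint is never touched
```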
According to the method provided by the embodiment of the invention, the original network and the added personalized candidate word sub-networks can be distinguished simply by telling empty arcs from real arcs, so that a sub-network can be unloaded flexibly, which greatly reduces the difficulty of unloading and improves the flexibility of the command word network.
Based on any of the above embodiments, to further reduce the memory, the method further includes:
in the case where two different nodes with the same out-arcs exist in the sub-network, merging the sub-network.
Specifically, when two different nodes in the sub-network have identical out-arcs, they can be regarded topologically as the same node, so the two nodes can be merged, which reduces the number of nodes in the sub-network and further reduces the memory.
Fig. 12 is a schematic diagram of sub-network merging provided by the invention. As shown in fig. 12 (a), node 3 and node 5 have the same out-arc d, so node 3 and node 5 can be regarded topologically as the same node. The sub-network obtained after merging the two is shown in fig. 12 (b), in which node 2 and node 4 now have the same out-arc c and can in turn be regarded as the same node. The sub-network obtained after merging them is shown in fig. 12 (c).
According to the method provided by the embodiment of the invention, merging the sub-network whenever two different nodes with the same out-arcs exist further reduces the number of nodes in the sub-network and thus the memory.
Based on any of the above embodiments, in the case where two different nodes with the same out-arcs exist in the sub-network, merging the sub-network includes:
in the case where nodes with the same out-arcs exist among the current termination nodes of the sub-network, merging the nodes with the same out-arcs, and updating the out-arc information of the pointing nodes that point to the current termination nodes;
updating the current termination nodes of the sub-network based on the updated out-arc information, and repeating the node merging and out-arc updating until all nodes of the sub-network have been traversed.
Specifically, sub-network merging can be realized by regarding the sub-network as a Directed Acyclic Graph (DAG) and, following the reverse topological order of the DAG, repeatedly finding the nodes with out-degree 0; the current termination nodes are exactly the nodes with out-degree 0 in the current sub-network. If nodes with the same out-arcs exist among the current termination nodes, those nodes are merged.
Next, the out-arc information of all nodes with arcs pointing to a current termination node is updated; a pointing node here is a node that points to a current termination node. After the out-arc information is updated, the current termination nodes are updated: the updated set no longer includes the previous termination nodes, i.e., the new current termination nodes are the nodes whose out-degree becomes 0 once the previous out-degree-0 nodes are removed.
After the updated current termination nodes are obtained, the node merging operation is repeated until all nodes have been visited.
In some embodiments, the specific algorithm design for the subnetwork merge is as follows:
1. Traverse the node vector, find all nodes with out-degree 0, and add them to a queue.
2. Dequeue the nodes one by one; for each dequeued node, compute a hash value to decide whether it can be merged; meanwhile, for each arc pointing to the node, decrement the out-degree of the arc's source node by one, and record the id of any source node whose out-degree drops to zero.
3. Merge the nodes that can currently be merged.
4. Add the nodes recorded in step 2 to the queue.
5. Repeat steps 2-4 until the queue is empty.
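Under the same assumptions (the sub-network viewed as a DAG, out-arcs stored per node), steps 1-5 above can be sketched as follows; a sorted tuple of (label, representative) pairs stands in for the hash value of step 2:

```python
# Sketch of reverse-topological-order merging; the dict-of-lists graph
# representation is an assumption for this example.
from collections import defaultdict, deque

def merge_equivalent_nodes(out_arcs):
    """out_arcs: dict node -> list of (label, dst).
    Returns a mapping node -> representative node; nodes whose out-arcs are
    identical (up to earlier merges) share one representative."""
    nodes = set(out_arcs)
    for arcs in out_arcs.values():
        nodes.update(dst for _, dst in arcs)
    preds = defaultdict(list)
    outdeg = {n: len(out_arcs.get(n, [])) for n in nodes}
    for u, arcs in out_arcs.items():
        for _, v in arcs:
            preds[v].append(u)
    rep = {}                                         # node -> representative
    by_signature = {}                                # out-arc signature -> node
    queue = deque(n for n in nodes if outdeg[n] == 0)    # step 1
    while queue:
        u = queue.popleft()                              # step 2: dequeue
        sig = tuple(sorted((lbl, rep[v]) for lbl, v in out_arcs.get(u, [])))
        rep[u] = by_signature.setdefault(sig, u)         # step 3: merge if seen
        for p in preds[u]:                               # step 2: decrement
            outdeg[p] -= 1
            if outdeg[p] == 0:
                queue.append(p)                          # step 4: enqueue
    return rep                                           # loop is step 5

# the example of fig. 12: 1-a->2-c->3-d->6 and 1-b->4-c->5-d->6
arcs = {1: [("a", 2), ("b", 4)], 2: [("c", 3)], 4: [("c", 5)],
        3: [("d", 6)], 5: [("d", 6)]}
rep = merge_equivalent_nodes(arcs)
assert rep[5] == rep[3]   # nodes 3 and 5 merged first
assert rep[4] == rep[2]   # then nodes 2 and 4 become mergeable
```

Because each node's signature is computed only after all of its successors have been processed, the merge of nodes 3 and 5 automatically makes nodes 2 and 4 mergeable on the next pass, exactly as in fig. 12.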
According to the method provided by the embodiment of the invention, the sub-network is regarded as a directed acyclic graph, and sub-network merging is carried out according to the reverse topological order of the directed acyclic graph, so that the efficiency and the accuracy of sub-network merging can be improved.
Based on any one of the above embodiments, a word inserting method for decoding a network is provided, including:
1. Personalized candidate word multiplexing: in the case where the slot to be inserted includes a plurality of repeatedly occurring identical slots, multiplexing the candidate words corresponding to the identical slots into the same candidate word node, and connecting the candidate word node to the endpoints of the identical slots to obtain the decoding network after word insertion.
2. Building a personalized candidate word sub-network: in the case where the slot to be inserted has a plurality of corresponding candidate words, combining the candidate words into a sub-network used as the candidate word node, adding a public head node and a public tail node to the sub-network, and connecting the public head node and the public tail node respectively to the endpoints of the identical slots to obtain the decoding network after word insertion.
3. Introducing empty arcs: connecting the public head node and the public tail node to each candidate word in the sub-network via real arcs, and connecting the public head node and the public tail node to the endpoints of the identical slots via empty arcs, to obtain the decoding network after word insertion.
4. Unloading a sub-network: determining the public head node of the sub-network connected to the slot to be unloaded; traversing the nodes from the public head node until an empty arc is found among the out-arcs of a node, and taking that node as the public tail node of the sub-network; deleting all arcs and nodes encountered during the traversal and releasing the memory to unload the sub-network.
5. Merging sub-networks: in the case where nodes with the same out-arcs exist among the current termination nodes of the sub-network, merging those nodes and updating the out-arc information of the pointing nodes that point to the current termination nodes; updating the current termination nodes based on the updated out-arc information, and repeating node merging and out-arc updating until all nodes of the sub-network have been traversed.
According to the method provided by the embodiment of the invention, word insertion changes from inserting entries word by word into building a sub-network once and then inserting the sub-network, so that entries no longer need to be inserted repeatedly, which greatly reduces the time cost of word insertion. The sub-network can be reused across repeated slots, so the number of newly added nodes is reduced to 1/M of that required by the prior art, and the number of newly added arcs is reduced from 2*M*N to 2*(M+N), where M is the number of slots and N is the number of entries. The introduction of sub-networks and empty arcs, together with the corresponding unloading algorithm, solves the prior-art problem that personalized command words are difficult to unload from the command word network.
The word insertion device of the decoding network provided by the invention is described below; the device described below and the word insertion method described above may be referred to in correspondence with each other.
Based on any of the above embodiments, fig. 13 is a schematic structural diagram of a word inserting device of a decoding network according to the present invention, as shown in fig. 13, the device includes:
A determining unit 1310, configured to determine a slot to be inserted and a candidate word corresponding to the slot;
a multiplexing unit 1320, configured to multiplex candidate words corresponding to a plurality of identical slots into a same candidate word node when the slot to be inserted includes the plurality of identical slots that repeatedly occur;
And the word inserting unit 1330 is configured to connect the candidate word node with the endpoints of the multiple identical slots, so as to obtain a decoding network after word insertion.
According to the device provided by the embodiment of the invention, when the slot to be inserted includes a plurality of repeatedly occurring identical slots, the candidate words corresponding to the identical slots are multiplexed into the same candidate word node, i.e., the candidate words only need to be built once. Compared with the prior art, in which the candidate words must be inserted once for each slot and are therefore built repeatedly, this reduces the time cost of word insertion and reduces the newly added memory occupation.
Based on any of the above embodiments, the word insertion unit is specifically configured to:
in the case where the slot to be inserted has a plurality of corresponding candidate words, combining the candidate words into a sub-network and using the sub-network as the candidate word node;
adding a public head node and a public tail node to the sub-network;
and respectively connecting the public head node and the public tail node with the endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
Based on any of the above embodiments, the word insertion unit is specifically configured to:
connecting the public head node and the public tail node to each candidate word in the sub-network via real arcs, and connecting the public head node and the public tail node to the endpoints of the plurality of identical slots via empty arcs, to obtain the decoding network after word insertion.
Based on any of the above embodiments, the apparatus further comprises an unloading unit, specifically configured to:
determining the public head node of the sub-network connected to the slot to be unloaded;
traversing the nodes starting from the public head node until an empty arc is found among the out-arcs of a node, and taking that node as the public tail node of the sub-network;
deleting all arcs and nodes encountered during the traversal, and releasing the memory to unload the sub-network.
Based on any of the above embodiments, the unloading unit is further configured to:
determining the left endpoint of the slot to be unloaded;
finding an empty arc among the out-arcs of the left endpoint, and taking the first node connected by the empty arc as the public head node of the sub-network.
Based on any of the above embodiments, the apparatus further includes a merging unit, specifically configured to:
merging the sub-network in the case where two different nodes with the same out-arcs exist in the sub-network.
Based on any of the above embodiments, the merging unit is further configured to:
in the case where nodes with the same out-arcs exist among the current termination nodes of the sub-network, merging the nodes with the same out-arcs, and updating the out-arc information of the pointing nodes that point to the current termination nodes;
updating the current termination nodes of the sub-network based on the updated out-arc information, and repeating the node merging and out-arc updating until all nodes of the sub-network have been traversed.
Fig. 14 illustrates a physical schematic diagram of an electronic device, as shown in fig. 14, which may include a processor 1410, a communication interface (Communications Interface) 1420, a memory 1430, and a communication bus 1440, wherein the processor 1410, the communication interface 1420, and the memory 1430 communicate with each other via the communication bus 1440. The processor 1410 may invoke logic instructions in the memory 1430 to perform a word insertion method of the decoding network, the method including determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes the identical slots that repeatedly appear, and connecting the candidate word node with endpoints of the identical slots to obtain the decoding network after word insertion.
In addition, the logic instructions in the memory 1430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a standalone product. Based on this understanding, the technical solution of the present invention, in essence or the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
In another aspect, the present invention further provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, and when the computer program is executed by a processor, the computer is capable of executing a word insertion method of a decoding network provided by the above methods, where the method includes determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes a plurality of identical slots that repeatedly occur, and connecting the candidate word node with endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
In still another aspect, the present invention further provides a non-transitory computer readable storage medium, on which a computer program is stored, where the computer program is implemented when executed by a processor to perform the word insertion method of the decoding network provided by the above methods, where the method includes determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes the identical slots that repeatedly occur, and connecting the candidate word node with endpoints of the identical slots to obtain the decoding network after word insertion.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention, and not for limiting the same, and although the present invention has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present invention.

Claims (10)

1. A word insertion method for a decoding network, comprising:
determining a slot to be inserted and its corresponding candidate words;
in the case where the slot to be inserted includes a plurality of repeatedly occurring identical slots, multiplexing the candidate words corresponding to the plurality of identical slots into the same candidate word node; and
connecting the candidate word node to the endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
2. The word insertion method for a decoding network according to claim 1, wherein connecting the candidate word node to the endpoints of the plurality of identical slots to obtain a decoding network after word insertion comprises:
in the case where the slot to be inserted has a plurality of corresponding candidate words, combining the candidate words into a sub-network and using the sub-network as the candidate word node;
adding a public head node and a public tail node to the sub-network; and
connecting the public head node and the public tail node to the endpoints of the plurality of identical slots, respectively, to obtain a decoding network after word insertion.
3. The word insertion method for a decoding network according to claim 2, wherein connecting the public head node and the public tail node to the endpoints of the plurality of identical slots comprises:
connecting the public head node and the public tail node to each candidate word in the sub-network via real arcs, and connecting the public head node and the public tail node to the endpoints of the plurality of identical slots via empty arcs, to obtain a decoding network after word insertion.
4. The word insertion method for a decoding network according to claim 3, further comprising:
determining the public head node of the sub-network connected to a slot to be unloaded;
traversing the nodes starting from the public head node until an empty arc is found among the out-arcs of a node, and taking that node as the public tail node of the sub-network; and
deleting all arcs and nodes encountered during the traversal, and releasing the memory to unload the sub-network.
5. The word insertion method for a decoding network according to claim 4, wherein determining the public head node of the sub-network connected to the slot to be unloaded comprises:
determining the left endpoint of the slot to be unloaded; and
finding an empty arc among the out-arcs of the left endpoint, and taking the first node connected by the empty arc as the public head node of the sub-network.
6. The word insertion method for a decoding network according to any one of claims 2 to 5, further comprising:
in the case where two different nodes with the same out-arcs exist in the sub-network, merging the sub-network.
7. The word insertion method for a decoding network according to claim 6, wherein merging the sub-network comprises:
in the case where nodes with the same out-arcs exist among the current termination nodes of the sub-network, merging the nodes with the same out-arcs, and updating the out-arc information of the pointing nodes that point to the current termination nodes; and
updating the current termination nodes of the sub-network based on the updated out-arc information, and repeating the node merging and out-arc updating until all nodes of the sub-network have been traversed.
8. A word insertion apparatus for a decoding network, comprising:
a determining unit configured to determine a slot to be inserted and its corresponding candidate words;
a multiplexing unit configured to, in the case where the slot to be inserted includes a plurality of repeatedly occurring identical slots, multiplex the candidate words corresponding to the plurality of identical slots into the same candidate word node; and
a word insertion unit configured to connect the candidate word node to the endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the word insertion method for a decoding network according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the word insertion method for a decoding network according to any one of claims 1 to 7.
CN202411940606.3A 2024-12-26 2024-12-26 Word inserting method and device for decoding network, electronic equipment and storage medium Active CN119785771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411940606.3A CN119785771B (en) 2024-12-26 2024-12-26 Word inserting method and device for decoding network, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN119785771A CN119785771A (en) 2025-04-08
CN119785771B true CN119785771B (en) 2025-10-17

Family

ID=95231699

Country Status (1)

Country Link
CN (1) CN119785771B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110322884A (en) * 2019-07-09 2019-10-11 科大讯飞股份有限公司 A kind of slotting word method, apparatus, equipment and the storage medium of decoding network
CN111477217A (en) * 2020-04-08 2020-07-31 北京声智科技有限公司 Command word recognition method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111355781B (en) * 2020-02-18 2021-06-08 腾讯科技(深圳)有限公司 Voice information communication management method, device and storage medium
US12524923B2 (en) * 2020-10-06 2026-01-13 Beijing Xiaomi Mobile Software Co., Ltd. Method of encoding and decoding, encoder, decoder
US11893345B2 (en) * 2021-04-06 2024-02-06 Adobe, Inc. Inducing rich interaction structures between words for document-level event argument extraction
CN113920999B (en) * 2021-10-29 2025-09-05 中国科学技术大学 Speech recognition method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN119785771A (en) 2025-04-08

Similar Documents

Publication Publication Date Title
US6668243B1 (en) Network and language models for use in a speech recognition system
US8589163B2 (en) Adapting language models with a bit mask for a subset of related words
JP2013065188A (en) Automaton determining method, automaton determining device and automaton determining program
Caseiro et al. A specialized on-the-fly algorithm for lexicon and language model composition
CN114155836B (en) Speech recognition method, related device and readable storage medium
WO2021243605A1 (en) Method and device for generating dna storage coding/decoding rule, and method and device for dna storage coding/decoding
CN118333172A (en) Large language model reasoning acceleration method and related device
US20090240500A1 (en) Speech recognition apparatus and method
CN118155638A (en) Speech generation and understanding system, method and electronic equipment based on large language model
CN119785771B (en) Word inserting method and device for decoding network, electronic equipment and storage medium
JP2841404B2 (en) Continuous speech recognition device
CN110322884B (en) Word insertion method, device, equipment and storage medium of decoding network
CN110534115A (en) Recognition methods, device, system and the storage medium of multi-party speech mixing voice
CN118214921A (en) Video generation method, device, electronic device and readable storage medium
CN108334491B (en) Text analysis method and device, computing equipment and storage medium
CN112527235A (en) Voice playing method, device, equipment and storage medium
CN112819513A (en) Text chain generation method, device, equipment and medium
CN114220444B (en) Voice decoding method, device, electronic equipment and storage medium
JP4405542B2 (en) Apparatus, method and program for clustering phoneme models
CN114968950B (en) Task processing methods, devices, electronic equipment, and media
Eide Automatic modeling of pronunciation variations
CN113836917B (en) Text word segmentation processing method and device, equipment and medium thereof
CN117454886A (en) Text generation method, device, electronic device and storage medium
US7617089B2 (en) Method and apparatus for compiling two-level morphology rules
CN114326576B (en) Operation control method and device, electronic device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant