CN119785771B - Word inserting method and device for decoding network, electronic equipment and storage medium - Google Patents
Abstract
The invention relates to the field of computer technology and provides a word insertion method and apparatus for a decoding network, an electronic device, and a storage medium. The method comprises: determining a slot to be inserted and the candidate words corresponding to that slot; and, when the slot to be inserted includes a plurality of repeated identical slots, multiplexing the candidate words corresponding to those identical slots into a single candidate word node and connecting that node to the endpoints of each occurrence, thereby obtaining the decoding network after word insertion. Because the candidate words for the repeated identical slots are multiplexed into one candidate word node, each candidate word needs to be built only once. Compared with the prior art, in which a candidate word must be inserted separately for each slot and is therefore built repeatedly, this reduces both the time cost of word insertion and the newly added memory footprint.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a word inserting method and apparatus for a decoding network, an electronic device, and a storage medium.
Background
With the advent of the mobile internet era, speech recognition, as an important interface for human-machine interaction, has received increasing attention from companies and manufacturers; in the field of embedded applications in particular, voice interaction has become an essential function.
End-to-end speech recognition converts an audio sequence directly into a text sequence, and decoding strategies such as greedy decoding or beam decoding are usually required when composing the text. Because end-to-end speech recognition converts audio directly into text without a dedicated language model, the recognition result is hard to control, and recognition of personalized content, such as the proper nouns in a contact list, is poor. A command word network is therefore introduced into the decoding process.
Generating a command word network requires a compilation process that is tedious and time-consuming. To reduce the time spent loading and compiling resources, the general body content is normally compiled offline, so that at use time only the compiled network needs to be loaded. A user's personalized content, however, must be realized through a word insertion function; that is, a word insertion method for the command word network must be provided.
Disclosure of Invention
The invention provides a word insertion method and apparatus for a decoding network, an electronic device, and a storage medium, which address the defects of slow word insertion and large memory occupation of decoding networks in the prior art.
The invention provides a word insertion method for a decoding network, comprising:
determining a slot to be inserted and the candidate words corresponding to the slot to be inserted;
multiplexing the candidate words corresponding to a plurality of identical slots into the same candidate word node when the slot to be inserted includes a plurality of repeated identical slots;
and connecting the candidate word node to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
According to the word insertion method for a decoding network provided by the invention, connecting the candidate word node to the endpoints of the identical slots to obtain the decoding network after word insertion comprises:
when the number of candidate words corresponding to the slot to be inserted is plural, combining the candidate words into a sub-network and using the sub-network as the candidate word node;
adding a common head node and a common tail node to the sub-network;
and connecting the common head node and the common tail node respectively to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
According to the word insertion method for a decoding network provided by the invention, connecting the common head node and the common tail node respectively to the endpoints of the identical slots to obtain the decoding network after word insertion comprises:
connecting the common head node and the common tail node to each candidate word in the sub-network through real arcs, and connecting the common head node and the common tail node to the endpoints of the identical slots through empty arcs, to obtain the decoding network after word insertion.
The word insertion method for a decoding network provided by the invention further comprises:
determining the common head node of the sub-network connected to the slot to be unloaded;
traversing each node starting from the common head node until a traversed node has an empty arc among its outgoing arcs, and taking that node as the common tail node of the sub-network;
and deleting all arcs and nodes encountered during the traversal and releasing their memory, so as to unload the sub-network.
According to the word insertion method for a decoding network provided by the invention, determining the common head node of the sub-network connected to the slot to be unloaded comprises:
determining the left endpoint of the slot to be unloaded;
and searching the outgoing arcs of the left endpoint for an empty arc, and taking the first node connected by the empty arc as the common head node of the sub-network.
The word insertion method for a decoding network provided by the invention further comprises:
merging the sub-network when two different nodes with the same outgoing arcs exist in the sub-network.
According to the word insertion method for a decoding network provided by the invention, merging the sub-network when two different nodes with the same outgoing arcs exist in the sub-network comprises:
when nodes with the same outgoing arcs exist among the current termination nodes of the sub-network, merging those nodes and updating the outgoing-arc information of the nodes that point to the current termination nodes;
and updating the current termination nodes of the sub-network based on the updated outgoing-arc information, and repeating the node merging and outgoing-arc updating until all nodes of the sub-network have been traversed.
The invention also provides a word insertion apparatus for a decoding network, comprising:
a determining unit, configured to determine a slot to be inserted and the candidate words corresponding to the slot to be inserted;
a multiplexing unit, configured to multiplex the candidate words corresponding to a plurality of identical slots into the same candidate word node when the slot to be inserted includes a plurality of repeated identical slots;
and a word insertion unit, configured to connect the candidate word node to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
The invention also provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the word insertion method for a decoding network described in any of the above.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the word insertion method for a decoding network described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the word insertion method for a decoding network described in any of the above.
With the word insertion method and apparatus, electronic device, and storage medium for a decoding network described above, when the slots to be inserted include a plurality of repeated identical slots, the candidate words corresponding to those repeated identical slots are multiplexed into the same candidate word node; that is, each candidate word needs to be built only once. Compared with the prior art, in which one candidate word insertion is required for each slot so that the same candidate word is built repeatedly, this reduces the time cost of word insertion and the newly added memory footprint.
Drawings
In order to more clearly illustrate the technical solutions of the invention or of the prior art, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; other drawings can be derived from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic diagram of the location of a command word network in an end-to-end recognition system in the related art.
Fig. 2 is a schematic diagram of a command word network precompilation and use process in the related art.
Fig. 3 is a first schematic diagram of a related art word insertion method.
Fig. 4 is a second schematic diagram of a related art word insertion method.
Fig. 5 is a schematic flow chart of a word inserting method of a decoding network according to the present invention.
Fig. 6 is a schematic diagram of candidate word multiplexing provided by the present invention.
Fig. 7 is a schematic diagram of a subnetwork provided by the present invention.
Fig. 8 is a schematic diagram of personalized candidate words under word-level modeling provided by the present invention.
Fig. 9 is a schematic diagram of a sub-network with head and tail nodes provided by the present invention.
Fig. 10 is a schematic diagram of a sub-network connected using empty arcs provided by the present invention.
Fig. 11 is a schematic flow chart of a sub-network offloading method provided by the present invention.
Fig. 12 is a schematic diagram of subnetwork merging provided by the present invention.
Fig. 13 is a schematic structural diagram of a word inserting device of a decoding network according to the present invention.
Fig. 14 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
End-to-end speech recognition is a framework distinguished from traditional speech recognition. Traditional speech recognition generally consists of two parts, an acoustic model and a language model. The acoustic model is responsible for converting an audio sequence into a phoneme sequence, such as the common Chinese pinyin or English phonetic symbols, or tuples of multiple phonemes such as diphones and triphones. The language model is responsible for converting these phoneme sequences into text sequences. The two parts need not be coupled and can be trained independently. The disadvantage of traditional speech recognition is that the model training process is cumbersome and the recognition quality is jointly determined by both parts, so improving a single model does not necessarily improve the overall result; it has therefore gradually been replaced by end-to-end models.
End-to-end speech recognition converts an audio sequence directly into a text sequence, and decoding strategies such as greedy decoding or beam decoding are usually required when composing the text. Because end-to-end speech recognition converts audio directly into text without a dedicated language model, the recognition result is hard to control, and recognition of personalized content such as the proper nouns in a contact list is poor. A command word network is therefore introduced during the decoding process.
Fig. 1 is a schematic diagram of the location of a command word network in an end-to-end recognition system in the related art. As shown in Fig. 1, the command word network is a finite state transducer (FST) whose input is the decoding state and whose output is the recognition result. During beam decoding, the Viterbi algorithm is used to find the decoding path with the highest score in the network, constraining the beam decoding result to lean toward the expected recognition result.
Generating a command word network requires a compilation process that is tedious and time-consuming. To reduce the time spent loading and compiling resources, the general body content is normally compiled offline, so that at use time only the compiled network needs to be loaded. The user's personalized content, however, must be realized through the word insertion function.
Through word insertion, a user can upload personalized content into the network. For example, when the user uses the phone function, the contacts in the user's phone book do not exist in the original, offline-compiled network; the user therefore uploads the contacts, and they are inserted into the original network through the word insertion function. Thus, by providing a universal original network for every user and letting each user upload and insert personalized content, a personalized command word network can be realized.
Fig. 2 is a schematic diagram of the command word network precompilation and use process in the related art. As shown in Fig. 2, in the precompilation stage a written command word file is compiled into a command word network usable by a program and converted into a binary file for storage. The command word file consists of written sentence patterns, slots, and candidate words; a sentence pattern generally contains several candidate slots, each slot generally contains several candidate words, and many command words can be composed from the permutations and combinations of sentence patterns, slots, and candidate words. During use, the command word network is loaded from the binary file; the user can customize the candidate words of a slot (the user's personalized file) and expand them into a new command word network by word insertion.
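The expansion of sentence patterns, slots, and candidate words into concrete command words can be illustrated with a short sketch. All pattern names, slot names, and candidate words below are hypothetical examples, not taken from the patent:

```python
from itertools import product

# Hypothetical minimal command word file: one sentence pattern with two
# slots, each holding a few candidate words.
pattern = "<action> <contact>"
slots = {
    "<action>": ["call", "text"],
    "<contact>": ["mom", "dad", "office"],
}

def expand(pattern: str, slots: dict) -> list:
    """Expand a sentence pattern into all command words by substituting
    every combination of slot candidates."""
    names = [tok for tok in pattern.split() if tok in slots]
    commands = []
    for combo in product(*(slots[n] for n in names)):
        cmd = pattern
        for name, word in zip(names, combo):
            cmd = cmd.replace(name, word, 1)
        commands.append(cmd)
    return commands

words = expand(pattern, slots)
print(len(words))  # 2 actions x 3 contacts -> 6 command words
print(words[0])    # "call mom"
```

This permutation-and-combination growth is why even a small file of patterns and slots can yield a large command word network.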
Fig. 3 is a first schematic diagram of a related art word insertion method. As shown in Fig. 3, the existing word insertion method for a command word network connects the front and rear endpoints of a slot, recorded when the command word network was precompiled, to a new personalized candidate word by adding new arcs; if the user has several personalized candidate words, the above process is repeated, thereby realizing dynamic word insertion into the command word network. This related art word insertion method suffers from slow insertion and large memory occupation.
The reason word insertion is slow is that the prior art inserts the personalized candidate words into the target slots one by one. If there is one and only one slot to be inserted in the command word network, then only a single insertion as described above is required. In reality, however, command word networks are generally complex, and the same slot may appear repeatedly in a sentence pattern. Fig. 4 is a second schematic diagram of a related art word insertion method. As shown in Fig. 4, a sentence pattern contains several slots to be inserted, here two occurrences of slot B; to insert the personalized candidate word A at every repeated occurrence, word insertion must be performed at each occurrence, which greatly increases the time cost of word insertion.
The reason the memory footprint is large is that when there are several identical slots to be inserted, each occurrence must have the candidate words inserted, so the command word network grows by M×N new candidate words and at least 2×M×N arcs, where M is the number of occurrences of the slot to be inserted and N is the number of candidate words to insert into it. With a large number of candidate words, the memory occupied by the command word network increases greatly.
To address the slow insertion speed and large memory footprint of existing word insertion methods, an embodiment of the invention provides a word insertion method for a decoding network. In this method, when the slot to be inserted includes a plurality of repeated identical slots, that is, when the same slot appears repeatedly in a sentence pattern, the candidate words corresponding to the repeated identical slots are multiplexed into the same candidate word node, and that candidate word node is then connected to the endpoints of each identical slot to obtain the decoding network after word insertion.
Compared with the prior art, in which one candidate word insertion is required for each slot so that the same candidate word must be built repeatedly, this embodiment multiplexes the candidate words corresponding to the repeated identical slots into a single candidate word node, so each candidate word needs to be built only once; this reduces the time cost of word insertion and the newly added memory footprint.
The embodiment of the invention can be applied to any scenario requiring word insertion into a decoding network. The method may be executed by an electronic device such as a terminal device, a computer, a server cluster, or a specially designed decoding-network word insertion device, or by a word insertion apparatus provided in the electronic device, which may be implemented in software, hardware, or a combination of the two.
In the description of the embodiments of the present invention, the meaning of "plurality" is two or more, unless explicitly defined otherwise.
Fig. 5 is a schematic flow chart of a word inserting method of a decoding network according to the present invention, as shown in fig. 5, the method includes the following steps:
Step 510, determining a slot to be inserted and a candidate word corresponding to the slot to be inserted;
step 520, in the case that the slot to be inserted includes a plurality of identical slots that repeatedly appear, multiplexing the candidate words corresponding to the identical slots into the same candidate word node;
and step 530, connecting the same candidate word node to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion.
Specifically, the decoding network refers to the network used to decode audio features during speech recognition, and may include a command word network. The number of candidate words corresponding to each slot may be one or more; this embodiment of the invention is not specifically limited in this respect.
Referring to Fig. 4, when the slot to be inserted in the sentence pattern includes a plurality of repeated identical slots (slot B) and the personalized candidate word A needs to be inserted into slot B, the prior art must, in order for every repeated occurrence of slot B to receive the personalized candidate word A, perform an insertion at each occurrence; two nodes for the personalized candidate word A must then be constructed, and the word is inserted into each occurrence of slot B in turn.
Fig. 6 is a schematic diagram of candidate word multiplexing provided by the present invention. Referring to Fig. 6, the embodiment of the invention multiplexes the candidate words corresponding to the identical slots, namely the personalized candidate word A, into the same candidate word node. During insertion, only one candidate word node is constructed in the network, and it is connected to the front and rear endpoints of every repeated occurrence of slot B, yielding the decoding network after word insertion.
Then, for a slot B that appears M times in the original network, the personalized candidate word A, which previously had to be constructed M times, now needs to be constructed only once, so the newly added memory occupied by the inserted candidate word is reduced to 1/M.
In addition, multiplexing the candidate words corresponding to the identical slots into the same candidate word node also reduces the time cost of word insertion.
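The multiplexing step described above can be sketched in a few lines. The (src, dst, label) arc list and the node ids below are illustrative assumptions for the sketch, not the patent's actual data structures:

```python
# Toy sketch of candidate word multiplexing, assuming the decoding network
# is stored as a list of (src, dst, label) arcs between integer node ids.

def insert_multiplexed(arcs, word_node, slot_endpoints, word):
    """Build the candidate word as ONE shared node and wire it to the
    endpoints of every occurrence of the slot, instead of rebuilding the
    word once per occurrence as in the prior art."""
    for left, right in slot_endpoints:
        arcs.append((left, word_node, word))   # arc into the shared node
        arcs.append((word_node, right, word))  # arc back out to the slot
    return word_node

arcs = []
# slot B occurs three times, with endpoint pairs (0,1), (2,3), (4,5)
endpoints = [(0, 1), (2, 3), (4, 5)]
insert_multiplexed(arcs, word_node=6, slot_endpoints=endpoints, word="A")
print(len(arcs))  # 6 arcs, but only one candidate word node was built
```

The arc count per occurrence is unchanged; the saving here is that the candidate word node (and whatever it carries) is constructed once rather than M times.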
According to the method provided by the embodiment of the invention, when the slot to be inserted includes a plurality of repeated identical slots, the candidate words corresponding to the repeated identical slots are multiplexed into the same candidate word node, so each candidate word needs to be built only once. Compared with the prior art, in which one candidate word insertion is required for each slot and the same candidate word is built repeatedly, this reduces the time cost of word insertion and the newly added memory footprint.
Based on any of the above embodiments, connecting the candidate word node to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion, that is, step 530, specifically includes:
step 531, when the number of candidate words corresponding to the slot to be inserted is plural, combining the candidate words into a sub-network and using the sub-network as the candidate word node;
step 532, adding a common head node and a common tail node to the sub-network;
and step 533, connecting the common head node and the common tail node respectively to the endpoints of the identical slots to obtain the decoding network after word insertion.
Specifically, it can be seen from the above embodiment that the number of arcs connecting the slot and the personalized candidate word is unchanged: for the personalized candidate word A, 2×M new arcs are still required. When the number of candidate words corresponding to the slot to be inserted is plural, say N, 2×M×N arcs must be added, which is not a small memory overhead.
To solve this problem, the present embodiment combines the plurality of candidate words into one sub-network and uses the sub-network as the candidate word node. Fig. 7 is a schematic diagram of a sub-network provided by the present invention; as shown in Fig. 7, the candidate words include personalized candidate word 1, personalized candidate word 2, and personalized candidate word 3.
At this point, every personalized candidate word in the sub-network has its own individual incoming and outgoing arcs, which carry word-specific information of that candidate word and therefore cannot be reused. Fig. 8 is a schematic diagram of personalized candidate words under word-level modeling provided by the present invention. As shown in Fig. 8, taking a word-level modeling unit as an example, if the two personalized candidate words "Zhang San" and "Li Si" are added, different word information is stored on their arcs, so the arcs of the two personalized candidate words cannot be multiplexed.
In this embodiment, a common head node and a common tail node are added to the sub-network. Fig. 9 is a schematic diagram of a sub-network with head and tail nodes. As shown in Fig. 9, the common head node and common tail node are connected respectively to the endpoints of the plurality of identical slots. Previously, each candidate word needed 2 arcs to connect to the front and rear endpoints of each slot; after the common head and tail nodes are introduced, only the head and tail nodes need 2 arcs per slot, i.e., 2×M new arcs for M slots. Connecting all N candidate words to the common head and tail nodes requires 2×N new arcs.
Therefore, this word insertion scheme adds 2×M+2×N new arcs in total; compared with the 2×M×N new arcs required by the existing scheme, the number of new arcs is greatly reduced, and the memory occupied by the network is reduced accordingly.
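The arc counts above can be checked with simple arithmetic; the example sizes below are assumed for illustration, not taken from the patent:

```python
def new_arcs_prior(m: int, n: int) -> int:
    # Prior scheme: each of the N candidate words is wired to each of the
    # M slot occurrences with 2 arcs.
    return 2 * m * n

def new_arcs_shared(m: int, n: int) -> int:
    # Common head/tail scheme: 2 arcs per slot occurrence to the common
    # head and tail nodes, plus 2 arcs per candidate word inside the
    # sub-network.
    return 2 * m + 2 * n

# Assumed example sizes: a slot repeated 10 times, 500 candidate words.
m, n = 10, 500
print(new_arcs_prior(m, n))   # 10000
print(new_arcs_shared(m, n))  # 1020
```

The multiplicative 2×M×N cost becomes an additive 2×M+2×N cost, which is where the memory saving comes from as either M or N grows.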
According to the method provided by the embodiment of the invention, when the number of candidate words corresponding to the slot to be inserted is plural, the candidate words are combined into one sub-network to which a common head node and a common tail node are added; this ensures the sub-network is multiplexed while greatly reducing the number of newly built arcs, thereby further reducing the memory occupied by the network.
Based on any of the above embodiments, connecting the common head node and the common tail node respectively to the endpoints of the plurality of identical slots to obtain the decoding network after word insertion, that is, step 533, specifically includes:
connecting the common head node and the common tail node to each candidate word in the sub-network through real arcs, and connecting the common head node and the common tail node to the endpoints of the plurality of identical slots through empty arcs, to obtain the decoding network after word insertion.
Specifically, the inventors found that the candidate word sub-network with common head and tail nodes described above has a problem: it is not topologically equivalent to inserting the candidate words directly. For one candidate word, the topological distance between the candidate word and the left and right nodes of the slot changes from 1 arc to 2 arcs, and more than one candidate word is connected to the common head and tail nodes. In a finite state transducer (FST) there is the concept of an empty arc (epsilon), i.e., an arc whose input and output are both empty, which realizes an unconditional jump. Fig. 10 is a schematic diagram of a sub-network connected using empty arcs provided by the present invention. As shown in Fig. 10, the slot is connected to the common head and tail nodes using empty arcs; that is, a candidate word is separated from the endpoints of the slot by one empty arc and one real arc, which together are topologically equivalent to a single real arc. Therefore, the decoding network obtained by the new word insertion scheme with empty arcs keeps the topological structure unchanged. The network can thus be constructed with 2×M empty arcs and 2×N real arcs.
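The construction with a common head node, a common tail node, real arcs, and empty arcs can be sketched as follows. The (src, dst, label) arc list, the node ids, and the candidate names are illustrative assumptions, not the patent's exact representation:

```python
EPS = ""  # empty (epsilon) arc label: an unconditional jump in an FST

def attach_subnetwork(arcs, next_node, slot_endpoints, words):
    """Sketch: merge the candidate words into a sub-network with a common
    head node and a common tail node (real arcs), then hook the sub-network
    to every slot occurrence via empty arcs."""
    head, tail = next_node, next_node + 1
    node = next_node + 2
    for w in words:                       # 2 x N real arcs
        arcs.append((head, node, w))
        arcs.append((node, tail, w))
        node += 1
    for left, right in slot_endpoints:    # 2 x M empty arcs
        arcs.append((left, head, EPS))
        arcs.append((tail, right, EPS))
    return head, tail

arcs = []
attach_subnetwork(arcs, 10, [(0, 1), (2, 3)], ["Zhang San", "Li Si", "Wang Wu"])
empty = sum(1 for _, _, lab in arcs if lab == EPS)
print(empty, len(arcs) - empty)  # 4 empty arcs (2xM), 6 real arcs (2xN)
```

A path through the sub-network crosses one empty arc, one or two real arcs, and one empty arc; since empty arcs consume nothing, this is topologically equivalent to inserting the word directly, as the text explains.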
According to the method provided by the embodiment of the invention, introducing empty arcs keeps the topological structure unchanged while greatly reducing the number of newly built arcs, thereby further reducing the memory occupied by the network.
Based on any of the above embodiments, consider that in the prior art the candidate word to be inserted is connected to the front and rear endpoints of the target slot by arcs; once connected, the inserted candidate word becomes part of the network and can no longer be separated from the command word network, so new candidate words can only keep being added to the existing network. When a user needs to delete a personalized candidate word inserted into some slot, the network must be restored from the original binary file and all other candidate words that should not be deleted must be reinserted, which greatly reduces the flexibility of the command word network.
To address this difficulty of unloading content from the command word network, Fig. 11 is a schematic flow chart of the sub-network unloading method provided by the present invention. As shown in Fig. 11, the method further comprises:
step 1110, determining the common head node of the sub-network connected to the slot to be unloaded;
step 1120, traversing each node starting from the common head node until a traversed node has an empty arc among its outgoing arcs, and taking that node as the common tail node of the sub-network;
and step 1130, deleting all arcs and nodes encountered during the traversal and releasing their memory, so as to unload the sub-network.
Specifically, the decoding network after word insertion constructed in the above manner contains two kinds of arcs: empty arcs and real arcs. Therefore, the original network and the added personalized candidate word sub-network can be distinguished simply by distinguishing empty arcs from real arcs.
First, the common head node of the sub-network connected to the slot to be unloaded is determined:
the left endpoint of the slot to be unloaded is determined. All left and right endpoints of the slot to be unloaded can be found from the information recorded during precompilation. Starting from each left endpoint, all empty arcs among its outgoing arcs are found, and the first node reached by each empty arc is taken as the common head node of the sub-network.
Then, each node is traversed starting from the common head node until a traversed node has an empty arc among its outgoing arcs; that node is taken as the common tail node of the sub-network.
Finally, all arcs and nodes encountered during the traversal are deleted and their memory is released, so as to unload the sub-network.
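The unloading steps above can be sketched as follows. The (src, dst, label) arc list, the node ids, and the candidate name are illustrative assumptions, not the patent's actual representation:

```python
EPS = ""  # empty (epsilon) arc label

def offload_subnetwork(arcs, slot_left_endpoints):
    """Sketch of sub-network unloading: from each left endpoint of the
    slot, follow an empty arc to the common head node, traverse the
    sub-network until a node whose outgoing arcs include an empty arc (the
    common tail node), and drop every node and arc visited."""
    out = {}
    for src, dst, lab in arcs:
        out.setdefault(src, []).append((dst, lab))

    doomed, seen = set(), set()
    for left in slot_left_endpoints:
        for head, lab in out.get(left, []):
            if lab != EPS:
                continue  # real arcs belong to the original network
            doomed.add((left, head, EPS))
            stack = [head]
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                outgoing = out.get(node, [])
                if any(l == EPS for _, l in outgoing):
                    # common tail node: drop its empty exit arcs and stop
                    doomed.update((node, d, l) for d, l in outgoing if l == EPS)
                    continue
                for dst, lab2 in outgoing:
                    doomed.add((node, dst, lab2))
                    stack.append(dst)
    return [a for a in arcs if a not in doomed]

# One inserted word "Zhang San" (head 10, word node 11, tail 12) plus a
# precompiled real arc that must survive:
arcs = [(0, 10, EPS), (10, 11, "Zhang San"), (11, 12, "Zhang San"),
        (12, 1, EPS), (0, 2, "fixed")]
print(offload_subnetwork(arcs, [0]))  # [(0, 2, 'fixed')]
```

Because only inserted sub-networks are reached through empty arcs, the precompiled original network (the "fixed" arc here) is never touched by the traversal.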
According to the method provided by the embodiment of the invention, the original network and the added personalized candidate word sub-network can be distinguished simply by distinguishing empty arcs from real arcs, so the sub-network can be unloaded flexibly; this greatly reduces the difficulty of unloading the sub-network and improves flexibility.
Based on any of the above embodiments, to further reduce memory usage, the method further includes:
in the case where two different nodes with the same outgoing arcs exist in the sub-network, merging the sub-network.
Specifically, when two different nodes in the sub-network have the same outgoing arcs, the two nodes are topologically indistinguishable and can be regarded as the same node; combining them reduces the number of nodes in the sub-network and thus the memory usage.
Fig. 12 is a schematic diagram of sub-network merging provided by the invention. As shown in fig. 12 (a), node 3 and node 5 have the same outgoing arc d, so they can be regarded topologically as the same node. The sub-network obtained after combining the two nodes is shown in fig. 12 (b), in which node 2 and node 4 now have the same outgoing arc c and can in turn be regarded as the same node. The sub-network obtained after combining these two nodes is shown in fig. 12 (c).
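The merging criterion of fig. 12 can be illustrated with outgoing-arc signatures (a sketch; the node numbering follows the figure description, and `signature` is an assumed helper, not the patent's implementation):

```python
# Two nodes are mergeable when their sets of outgoing (destination, label)
# arcs are identical.
def signature(out_arcs):
    return frozenset(out_arcs)

# Fig. 12 (a): node 3 and node 5 both carry arc d to the same successor.
node3, node5 = {(6, "d")}, {(6, "d")}
mergeable_35 = signature(node3) == signature(node5)

# Fig. 12 (b): after merging 3 and 5 into one node, node 2 and node 4 both
# carry arc c to the merged node, so the merge cascades upward.
node2, node4 = {(3, "c")}, {(3, "c")}
mergeable_24 = signature(node2) == signature(node4)
```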
According to the method provided by the embodiment of the invention, in the case where two different nodes with the same outgoing arcs exist in the sub-network, the sub-network is merged, so that the number of nodes in the sub-network can be further reduced, further reducing memory usage.
Based on any of the above embodiments, in the case where two different nodes with the same outgoing arcs exist in the sub-network, merging the sub-network includes:
under the condition that nodes with the same outgoing arcs are present among the current termination nodes of the sub-network, merging those nodes, and updating the outgoing-arc information of the pointing nodes whose arcs point to the current termination nodes;
and updating the current termination nodes of the sub-network based on the updated outgoing-arc information, and repeating the node merging and outgoing-arc updating until all nodes of the sub-network have been traversed.
Specifically, sub-network merging can be realized by regarding the sub-network as a directed acyclic graph (DAG) and, following the reverse topological order of the DAG, repeatedly finding the nodes with an out-degree of 0; the current termination nodes are the nodes with an out-degree of 0 in the current sub-network. If nodes with the same outgoing arcs are present among the current termination nodes, those nodes are combined.
Next, the outgoing-arc information of all nodes with arcs pointing to the current termination nodes is updated; these pointing nodes are the nodes whose arcs point to the current termination nodes. After the outgoing-arc information is updated, the current termination nodes themselves are updated: the updated set does not include the previous termination nodes, i.e. the new current termination nodes are the nodes whose out-degree becomes 0 once the previous out-degree-0 nodes are removed.
After the updated current termination nodes are obtained, the node merging operation is repeated until every node has been visited.
In some embodiments, the specific algorithm design for sub-network merging is as follows:
1. Traverse the node vector, find all nodes with an out-degree of 0, and add them to a queue.
2. Dequeue all the nodes and compute a hash value for each to judge whether any of them can be combined; at the same time, for every arc pointing to these nodes, subtract one from the out-degree of the arc's source node, and record the id of any source node whose out-degree drops to zero.
3. Merge the nodes found to be mergeable.
4. Add the nodes recorded in step 2 to the queue.
5. Repeat steps 2-4 until the queue is empty.
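The five steps above can be sketched as follows, using a `frozenset` of outgoing arcs in place of the hash value; the representation and function names are assumptions, not the patent's implementation:

```python
from collections import defaultdict, deque

def merge_subnetwork(out_arcs):
    """Merge equivalent nodes of a DAG in reverse topological order.

    out_arcs: dict node_id -> set of (dest_id, label) arcs.
    Returns the merged network as the same kind of dict.
    """
    out_deg = {n: len(arcs) for n, arcs in out_arcs.items()}
    preds = defaultdict(list)  # dest id -> source ids, one entry per arc
    for n, arcs in out_arcs.items():
        for dest, _ in arcs:
            preds[dest].append(n)
    remap = {n: n for n in out_arcs}  # node id -> its merged representative

    def signature(n):
        # Hashable key standing in for the "hash value" of step 2.
        return frozenset((remap[d], lab) for d, lab in out_arcs[n])

    # Step 1: queue every node with an out-degree of 0.
    queue = deque(n for n, d in out_deg.items() if d == 0)
    while queue:
        batch, queue = list(queue), deque()
        groups = defaultdict(list)
        for n in batch:
            # Step 2: group by signature; decrement predecessors' out-degree.
            groups[signature(n)].append(n)
            for p in preds[n]:
                out_deg[p] -= 1
                if out_deg[p] == 0:
                    queue.append(p)  # step 4: queue newly exposed nodes
        # Step 3: merge each group into a single representative node.
        for members in groups.values():
            rep = min(members)
            for n in members:
                remap[n] = rep
    # Steps 2-4 repeat until the queue is empty (step 5); rebuild the network.
    return {n: {(remap[d], lab) for d, lab in arcs}
            for n, arcs in out_arcs.items() if remap[n] == n}

# Sub-network of fig. 12: 3 and 5 share arc d, then 2 and 4 share arc c.
net = {1: {(2, "a"), (4, "b")}, 2: {(3, "c")}, 4: {(5, "c")},
      3: {(6, "d")}, 5: {(6, "d")}, 6: set()}
merged = merge_subnetwork(net)
```

Processing in reverse topological order guarantees that when a node's signature is computed, all of its successors have already been resolved to their representatives, so cascading merges such as the fig. 12 example fall out naturally.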
According to the method provided by the embodiment of the invention, the sub-network is regarded as a directed acyclic graph, and sub-network merging is carried out according to the reverse topological order of the directed acyclic graph, so that the efficiency and the accuracy of sub-network merging can be improved.
Based on any one of the above embodiments, a word inserting method for a decoding network is provided, including:
1. Personalized candidate word multiplexing. In the case where the slots to be inserted include a plurality of repeated identical slots, the candidate words corresponding to the identical slots are multiplexed into the same candidate word node, and the candidate word node is connected to the endpoints of the identical slots to obtain the decoding network after word insertion.
2. Personalized candidate word sub-network construction. In the case where a slot to be inserted has a plurality of candidate words, the candidate words are combined into a sub-network that serves as the candidate word node; a public head node and a public tail node are added to the sub-network and connected respectively to the endpoints of the identical slots to obtain the decoding network after word insertion.
3. Empty arc introduction. The public head node and the public tail node are connected to each candidate word in the sub-network through real arcs, and to the endpoints of the identical slots through empty arcs, to obtain the decoding network after word insertion.
4. Sub-network offloading. The public head node of the sub-network connected to the slot to be offloaded is determined; the nodes are traversed from the public head node until a node with an empty arc among its outgoing arcs is reached, and that node is taken as the public tail node of the sub-network; all arcs and nodes covered in the traversal are deleted and the memory is released, offloading the sub-network.
5. Sub-network merging. In the case where nodes with the same outgoing arcs are present among the current termination nodes of the sub-network, those nodes are merged and the outgoing-arc information of the pointing nodes is updated; the current termination nodes are then updated based on the updated outgoing-arc information, and the node merging and outgoing-arc updating are repeated until all nodes of the sub-network have been traversed.
According to the method provided by the embodiment of the invention, word insertion is changed from inserting entries word by word into building a sub-network once and inserting the sub-network, so that the insertion work is not repeated and the time cost of word insertion can be greatly reduced. The sub-network can be multiplexed across the repeated slots: the number of newly added nodes is reduced to 1/M of that required by the prior art, and the number of newly added arcs is reduced from 2×M×N to 2×(M+N), where M is the number of slots and N is the number of entries. By introducing the sub-network and the empty arcs, together with the corresponding offloading algorithm, the problem in the prior art that personalized command words are difficult to offload from the command word network can be solved.
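The claimed savings can be checked with a small calculation; the values of M and N below are illustrative:

```python
# Illustrative sizes: M repeated slots, N candidate entries per slot.
M, N = 4, 1000

# Prior art: every slot inserts every entry separately, with one arc into
# and one arc out of each entry node.
nodes_prior = M * N
arcs_prior = 2 * M * N

# This method: one shared sub-network (N entry nodes, each with a real arc
# in and out) plus one empty arc in and one empty arc out per slot.
nodes_new = N               # 1/M of the prior art
arcs_new = 2 * N + 2 * M    # i.e. 2*(M+N)
```

For these illustrative values, the arc count drops from 8000 to 2008, and the node count from 4000 to 1000.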
The word inserting device of the decoding network provided by the invention is described below; the word inserting device described below and the word inserting method of the decoding network described above may be referred to in correspondence with each other.
Based on any of the above embodiments, fig. 13 is a schematic structural diagram of a word inserting device of a decoding network according to the present invention, as shown in fig. 13, the device includes:
A determining unit 1310, configured to determine a slot to be inserted and a candidate word corresponding to the slot;
a multiplexing unit 1320, configured to multiplex candidate words corresponding to a plurality of identical slots into a same candidate word node when the slot to be inserted includes the plurality of identical slots that repeatedly occur;
And the word inserting unit 1330 is configured to connect the candidate word node with the endpoints of the multiple identical slots, so as to obtain a decoding network after word insertion.
According to the device provided by the embodiment of the invention, in the case where the slots to be inserted include a plurality of repeated identical slots, the candidate words corresponding to the repeated identical slots are multiplexed into the same candidate word node, i.e. the candidate words need to be built only once. Compared with the prior art, in which the candidate words must be inserted once for each slot and are therefore built repeatedly, this reduces the time cost of word insertion and the amount of newly added memory.
Based on any of the above embodiments, the word insertion unit is specifically configured to:
under the condition that a slot to be inserted corresponds to a plurality of candidate words, combining the candidate words into a sub-network, and taking the sub-network as the candidate word node;
adding a public head node and a public tail node to the sub-network;
and respectively connecting the public head node and the public tail node with the endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
Based on any of the above embodiments, the word insertion unit is specifically configured to:
and connecting the public head node and the public tail node to each candidate word in the sub-network through real arcs, and to the endpoints of the plurality of identical slots through empty arcs, to obtain the decoding network after word insertion.
Based on any of the above embodiments, the apparatus further comprises an unloading unit, specifically configured to:
determining a public head node of a sub-network connected with the to-be-offloaded slot;
traversing each node from the public head node until a node with an empty arc among its outgoing arcs is reached, and taking that node as the public tail node of the sub-network;
and deleting all arcs and nodes in the traversal process, and releasing the memory to unload the sub-network.
Based on any of the above embodiments, the unloading unit is further configured to:
determining the left endpoint of the slot to be offloaded;
and searching for an empty arc among the outgoing arcs of the left endpoint, and taking the node connected by the empty arc as the public head node of the sub-network.
Based on any of the above embodiments, the apparatus further includes a merging unit, specifically configured to:
merging the sub-network in the case where two different nodes with the same outgoing arcs exist in the sub-network.
Based on any of the above embodiments, the merging unit is further configured to:
under the condition that nodes with the same outgoing arcs are present among the current termination nodes of the sub-network, merging those nodes, and updating the outgoing-arc information of the pointing nodes whose arcs point to the current termination nodes;
and updating the current termination nodes of the sub-network based on the updated outgoing-arc information, and repeating the node merging and outgoing-arc updating until all nodes of the sub-network have been traversed.
Fig. 14 illustrates a physical schematic diagram of an electronic device, as shown in fig. 14, which may include a processor 1410, a communication interface (Communications Interface) 1420, a memory 1430, and a communication bus 1440, wherein the processor 1410, the communication interface 1420, and the memory 1430 communicate with each other via the communication bus 1440. The processor 1410 may invoke logic instructions in the memory 1430 to perform a word insertion method of the decoding network, the method including determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes the identical slots that repeatedly appear, and connecting the candidate word node with endpoints of the identical slots to obtain the decoding network after word insertion.
In addition, the logic instructions in the memory 1430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially, or in the part contributing to the prior art, or in part, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, the present invention further provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, and when the computer program is executed by a processor, the computer is capable of executing a word insertion method of a decoding network provided by the above methods, where the method includes determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes a plurality of identical slots that repeatedly occur, and connecting the candidate word node with endpoints of the plurality of identical slots to obtain a decoding network after word insertion.
In still another aspect, the present invention further provides a non-transitory computer readable storage medium, on which a computer program is stored, where the computer program is implemented when executed by a processor to perform the word insertion method of the decoding network provided by the above methods, where the method includes determining a slot to be inserted and a candidate word corresponding to the slot, multiplexing the candidate word corresponding to a plurality of identical slots into a same candidate word node if the slot to be inserted includes the identical slots that repeatedly occur, and connecting the candidate word node with endpoints of the identical slots to obtain the decoding network after word insertion.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention, and not for limiting the same, and although the present invention has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411940606.3A CN119785771B (en) | 2024-12-26 | 2024-12-26 | Word inserting method and device for decoding network, electronic equipment and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN119785771A CN119785771A (en) | 2025-04-08 |
| CN119785771B true CN119785771B (en) | 2025-10-17 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110322884A (en) * | 2019-07-09 | 2019-10-11 | 科大讯飞股份有限公司 | A kind of slotting word method, apparatus, equipment and the storage medium of decoding network |
| CN111477217A (en) * | 2020-04-08 | 2020-07-31 | 北京声智科技有限公司 | Command word recognition method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |