
CN114121001A - Voice control method and device and electronic equipment - Google Patents

Voice control method and device and electronic equipment Download PDF

Info

Publication number
CN114121001A
Authority
CN
China
Prior art keywords
content
component
instruction
predicate
instruction text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111342166.8A
Other languages
Chinese (zh)
Inventor
曾理
张晓帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Hangzhou Douku Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Douku Software Technology Co Ltd filed Critical Hangzhou Douku Software Technology Co Ltd
Priority to CN202111342166.8A priority Critical patent/CN114121001A/en
Publication of CN114121001A publication Critical patent/CN114121001A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/08 Speech classification or search
    • G10L 15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the application discloses a voice control method and device and electronic equipment. The method comprises the following steps: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text; performing grammatical component analysis on the first instruction text to obtain the grammatical components of the first instruction text; replacing the content corresponding to the grammatical components in the first instruction text with the corresponding standard content based on the grammatical components to obtain a second instruction text; generating a control instruction based on the second instruction text; and executing the control instruction. In this way, after the grammatical components of the instruction text are analyzed, the content in the instruction text can be replaced with the corresponding standard content based on those grammatical components, so that the electronic equipment can more accurately determine the control intention of the user from the replaced standard content, improving the accuracy of the control process.

Description

Voice control method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a voice control method and apparatus, and an electronic device.
Background
The combination of artificial intelligence technology and a virtual personal assistant (voice assistant) enables an electronic device to receive voice commands issued by a user through the auditory modality and complete the corresponding interactive tasks. However, because of users' personal habits, the voice commands they trigger are diverse, which makes it difficult for the electronic device to accurately determine the user's actual control intention, so the accuracy of the control process needs to be improved.
Disclosure of Invention
In view of the foregoing, the present application provides a voice control method, an apparatus and an electronic device to address the foregoing problems.
In a first aspect, the present application provides a method for voice control, the method comprising: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text; analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text; replacing the content corresponding to the grammar component in the first instruction text with the corresponding standard content based on the grammar component to obtain a second instruction text; generating a control instruction based on the second instruction text; and executing the control instruction.
In a second aspect, the present application provides a voice-controlled apparatus, the apparatus comprising: the instruction conversion unit is used for responding to the received voice instruction and converting the voice instruction into a corresponding first instruction text; the syntactic component analysis unit is used for carrying out syntactic component analysis on the first instruction text to obtain syntactic components of the first instruction text; the instruction processing unit is used for replacing the content corresponding to the grammar component in the first instruction text with the corresponding standard content based on the grammar component to obtain a second instruction text; the instruction generating unit is used for generating a control instruction based on the second instruction text; and the control unit is used for executing the control instruction.
In a third aspect, the present application provides an electronic device comprising one or more processors and a memory; one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods described above.
In a fourth aspect, the present application provides a computer-readable storage medium having a program code stored therein, wherein the program code performs the above method when running.
According to the voice control method, the voice control device and the electronic equipment, after a voice instruction is received, the voice instruction is converted into a corresponding first instruction text, grammatical component analysis is carried out on the first instruction text to obtain grammatical components of the first instruction text, then based on the grammatical components, the content corresponding to the grammatical components in the first instruction text is replaced by corresponding standard content to obtain a second instruction text, finally, a control instruction is generated based on the second instruction text, and the control instruction is executed. Therefore, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram illustrating an application scenario of a speech control method according to an embodiment of the present application;
fig. 2 is a schematic diagram illustrating an application scenario of another speech control method proposed in an embodiment of the present application;
fig. 3 is a flowchart illustrating a voice control method according to an embodiment of the present application;
FIG. 4 is a flow chart illustrating a voice control method according to another embodiment of the present application;
FIG. 5 is a flow chart illustrating a voice control method according to yet another embodiment of the present application;
FIG. 6 is a flow chart illustrating a voice control method according to another embodiment of the present application;
FIG. 7 is a flow chart illustrating a voice control method according to another embodiment of the present application;
fig. 8 is a block diagram illustrating a structure of a voice control apparatus according to an embodiment of the present application;
fig. 9 shows a block diagram of an electronic device proposed in the present application;
fig. 10 is a storage unit according to an embodiment of the present application, configured to store or carry program code for implementing a voice control method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The popularization of intelligent terminal equipment brings various conveniences to life. The combination of artificial intelligence technology and virtual personal assistant (voice assistant) can make the electronic device receive the voice command issued by the user through the hearing modality and complete the corresponding interactive task.
However, because of users' personal habits, the voice commands they trigger are diverse, which makes it difficult for the electronic device to accurately determine the user's actual control intention, so the accuracy of the control process needs to be improved. For example, in the process of playing a video, if a user wants to control the video to stop playing by voice, the user may issue a voice command such as "stop" or "do not play". For another example, if the user wants to control, by voice, the start of playing a video named "XXX", the voice command issued by the user may be "open XXX", "play XXX", "I want to see XXX", "put XXX", etc. Therefore, the content of the voice commands issued by the user may vary widely, so the electronic device may be unable to accurately determine the user's actual control intention.
Therefore, the inventor provides a voice control method, a voice control device and an electronic device in the present application, after receiving a voice instruction, the voice instruction is converted into a corresponding first instruction text, a grammatical component of the first instruction text is obtained by analyzing the grammatical component of the first instruction text, a content corresponding to the grammatical component in the first instruction text is replaced with a corresponding standard content based on the grammatical component to obtain a second instruction text, and finally a control instruction is generated based on the second instruction text and the control instruction is executed. Therefore, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved.
The following first introduces an application scenario related to the embodiment of the present application.
In the embodiment of the application, the provided voice control method can be executed by the electronic equipment. In this manner, all steps in the voice control method provided by the embodiment of the present application may be performed by the electronic device. For example, as shown in fig. 1, a voice acquisition device of the electronic device 100 may acquire a voice instruction and transmit the acquired voice instruction to a processor, so that the processor may acquire a first instruction text corresponding to the voice instruction and perform syntactic component analysis on the first instruction text to obtain the syntactic components of the first instruction text; replace the content corresponding to the syntactic components in the first instruction text with the corresponding standard content based on the syntactic components to obtain a second instruction text; and then generate and execute a control instruction.
Moreover, the voice control method provided by the embodiment of the application can also be executed by a server. Correspondingly, in this manner, the electronic device may collect the voice instruction and send it to the server; the server then executes the voice control method provided in the embodiment of the present application to generate the control instruction and returns the control instruction to the electronic device, and the electronic device executes it. In addition, the method can be executed by the electronic device and the server in cooperation. In this manner, some steps in the voice control method provided by the embodiment of the present application are performed by the electronic device, and the other steps are performed by the server.
For example, as shown in fig. 2, the electronic device 100 may perform a voice control method including: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text; and analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text. Then, the server 200 replaces the content corresponding to the grammar component in the first instruction text with the corresponding standard content based on the grammar component to obtain a second instruction text; and generating a control instruction based on the second instruction text. Then, the control command is returned to the electronic device 100, and the electronic device 100 is triggered to execute the control command.
It should be noted that, in this manner executed by the electronic device and the server cooperatively, the steps executed by the electronic device and the server respectively are not limited to the manner described in the above example, and in practical applications, the steps executed by the electronic device and the server respectively may be dynamically adjusted according to actual situations.
Embodiments of the present application will be described with reference to the accompanying drawings.
Referring to fig. 3, a voice control method provided in the present application includes:
s110: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text.
In the embodiment of the application, the user can express his or her control intention through voice. Correspondingly, the electronic device can treat the voice issued by the user as a voice instruction. After receiving the voice instruction, the electronic device may convert it into the corresponding text content based on a preconfigured automatic speech recognition (ASR) mode, thereby obtaining the first instruction text. For example, if the received voice instruction is "open album", the first instruction text obtained after converting the voice instruction includes "open album".
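The conversion step above can be sketched as follows. This is a minimal illustration of step S110, not the patent's actual implementation: the `recognizer` callable stands in for any real ASR engine, and its name and signature are assumptions made here for clarity.

```python
def voice_to_first_instruction_text(audio_bytes, recognizer):
    """Convert a received voice instruction into its first instruction text.

    `recognizer` is any callable that maps raw audio to a transcript,
    standing in for a preconfigured ASR engine.
    """
    if not audio_bytes:
        raise ValueError("no voice instruction received")
    # A real implementation would hand the audio to an ASR model here.
    return recognizer(audio_bytes).strip()
```

For example, with a recognizer that transcribes the audio as " open album ", the function returns the first instruction text "open album".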
S120: and analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text.
In the embodiment of the present application, a grammatical component refers to a component such as the subject, predicate, object, or adverbial contained in a piece of text. As one mode, the electronic device may obtain the relationships between the language units (words, named entities) in the first instruction text by using a Dependency Parsing (DP) technique from natural language processing, and obtain the component of each language unit through those relationships, thereby obtaining the grammatical components of the first instruction text. For example, the syntactic components in a sentence can be obtained by dependency parsing. The specific labeled relations may be as exemplified in the following table:
Relation type | Tag | Description | Example
Subject-verb relation | SBV | subject-verb | I send her a bunch of flowers (I ← send)
Verb-object relation | VOB | direct object, verb-object | I send her a bunch of flowers (send → flowers)
Indirect-object relation | IOB | indirect object | I send her a bunch of flowers (send → her)
Fronting-object relation | FOB | fronting object | What book does he read (book ← read)
Double relation | DBL | double | He asks me to eat (asks → me)
Attribute relation | ATT | attribute | Red apple (red ← apple)
Adverbial relation | ADV | adverbial | Very beautiful (very ← beautiful)
Complement relation | CMP | complement | Done the job (do → done)
Coordinate relation | COO | coordinate | Mountain and sea (mountain → sea)
Preposition-object relation | POB | preposition-object | In the trade area (in → area)
Left adjunct relation | LAD | left adjunct | Mountain and sea (and ← sea)
Right adjunct relation | RAD | right adjunct | Kids (kid → plural suffix)
Independent structure | IS | independent structure | Two clauses that are structurally independent of each other
Head relation | HED | head | Refers to the core of the whole sentence
Through dependency syntax analysis, the subject-verb relation can be obtained, yielding the subject and the predicate; the object and the like can be obtained through the verb-object relation and the predicate. The dependency parsing tool used by the electronic device in the embodiment of the present application may be any of several options, such as HanLP, StanfordNLP, DDParser, and LTP, and may be selected according to actual requirements.
S130: and replacing the content corresponding to the grammar component in the first instruction text with the corresponding standard content based on the grammar component to obtain a second instruction text.
It should be noted that multiple grammatical components may be present in the first instruction text, and the content replacement mode can differ for different grammatical components. In the embodiment of the present application, performing content replacement on the first instruction text based on the grammatical components may be understood as performing content replacement on the first instruction text based on the content replacement mode corresponding to each grammatical component. For example, if a predicate element and an object element are included in the first instruction text, the content corresponding to the predicate element is replaced based on the content replacement mode corresponding to the predicate element, and the content corresponding to the object element is replaced based on the content replacement mode corresponding to the object element. For another example, if a predicate element, an object element, and an adverbial element are included in the first instruction text, the content corresponding to each of these elements is replaced based on the content replacement mode corresponding to that element.
Illustratively, the following table shows content substitution for a predicate element:
[Table image: generalized predicates and their corresponding standard predicates]
The generalized predicate refers to the content of a predicate element in the first instruction text that needs to be replaced, and the standard predicate corresponding to the generalized predicate denotes the standard content after replacement.
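Such a generalized-predicate table can be realized as a simple mapping from generalized predicates to standard predicates. Because the patent's own table is only available as an image, the concrete entries below are illustrative assumptions:

```python
# Hypothetical generalized-predicate -> standard-predicate table
# (the patent's actual entries are in an unrendered figure).
STANDARD_PREDICATES = {
    "put": "play",
    "open": "play",
    "start": "play",
    "stop": "pause",
    "halt": "pause",
}

def replace_predicate(instruction_words, predicate_word):
    """Replace the predicate content with its standard predicate, if known.

    Words that are not the predicate, and predicates with no table entry,
    pass through unchanged.
    """
    standard = STANDARD_PREDICATES.get(predicate_word, predicate_word)
    return [standard if w == predicate_word else w for w in instruction_words]
```

Under this assumed table, the instruction ["put", "yulochun"] becomes ["play", "yulochun"] after predicate replacement.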
For another example, the following table shows the content substitution mode corresponding to an adverbial component:
[Table image: generalized adverbials and their corresponding standard adverbials]
after content replacement is performed on the first instruction text based on the grammar component, a second instruction text can be obtained. In the embodiment of the present application, the second instruction text is an instruction text for generating a control instruction. For example, the first instruction text obtained after converting the voice instruction into text may be "help me put yulochun", and then the second instruction text obtained after performing content replacement in the manner shown in the embodiment of the present application may be "click yulochun".
S140: and generating a control instruction based on the second instruction text.
After the second instruction text is obtained, semantic recognition can be performed on it based on a preconfigured mode to obtain a triple, and a control instruction is then generated based on the triple. Optionally, the intention, the control object, and the object auxiliary information in the text may be extracted based on Natural Language Understanding (NLU) and integrated into a triple of the form {action, object, information}, where action represents the intention (or control purpose), object represents the control object, and information represents the object auxiliary information. For example, if the second instruction text is "play Chen Qing Ling", natural language understanding yields the user intention "play", the control object "Chen Qing Ling", and null object auxiliary information, and the triple is recorded as: {play, Chen Qing Ling, Φ}. For another example, if the second instruction text is "help me search for the Antique Bureau", the intention is "search", the control object is "the Antique Bureau", the object auxiliary information is null, and the triple is: {search, the Antique Bureau, Φ}.
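The shape of the {action, object, information} triple described above can be illustrated with a minimal rule-based sketch. A production system would use an NLU model here; this token-based splitter and its action vocabulary are assumptions made only to show the triple's structure:

```python
# Hypothetical action vocabulary; a real NLU module would classify intent.
KNOWN_ACTIONS = {"play", "search", "open", "pause"}

def text_to_triple(words):
    """Build an (action, object, information) triple from tokenized text.

    action: the first token recognized as an intention
    object: the first remaining token, taken as the control object
    information: any further tokens, taken as object auxiliary info (or None)
    """
    action = next((w for w in words if w in KNOWN_ACTIONS), None)
    rest = [w for w in words if w != action]
    obj = rest[0] if rest else None
    info = " ".join(rest[1:]) if len(rest) > 1 else None
    return (action, obj, info)
```

For the tokenized instruction ["play", "videoA"], this yields ("play", "videoA", None), mirroring the {play, ..., Φ} triple above with null auxiliary information.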
S150: and executing the control instruction.
It should be noted that the purpose of the user triggering a voice instruction is generally to operate an operation object in the current user interface. The operation object may be the current user interface displayed by the electronic device itself, or a control in the displayed interface. For example, if the operation object is the displayed current user interface, the interface may be controlled to scroll by a voice instruction, triggered to exit by a voice instruction, or switched to another interface by a voice instruction. If the operation object is a control in the interface, the control can be clicked through a voice instruction.
The control instruction generated based on the second instruction text can be understood as an instruction that can be recognized and executed by the electronic device, and the control instruction functions to operate an operation object that the user desires to operate. Correspondingly, when the electronic device executes the control command generated based on the second command text, the electronic device operates the operation object desired by the user.
In the voice control method provided in this embodiment, after a voice instruction is received, the voice instruction is converted into a corresponding first instruction text, syntactic component analysis is performed on the first instruction text to obtain syntactic components of the first instruction text, and then, based on the syntactic components, content corresponding to the syntactic components in the first instruction text is replaced with corresponding standard content to obtain a second instruction text, and finally, a control instruction is generated based on the second instruction text, and the control instruction is executed. Therefore, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved.
Referring to fig. 4, a voice control method provided in the present application includes:
s210: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text.
S220: and analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text.
S230: and if the grammar component represents that the first instruction text comprises a predicate component, replacing the content corresponding to the predicate component with the corresponding standard predicate content.
S240: and if the grammar component represents that the first instruction text also comprises a non-predicate component, replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content.
As one mode, the replacing the content corresponding to the non-predicate element with the corresponding standard non-predicate content includes: if the non-predicate elements comprise an adverbial element, replacing the content corresponding to the adverbial element with the corresponding standard adverbial content to obtain the standard non-predicate content; if the non-predicate elements comprise an object element, replacing the content corresponding to the object element with the corresponding standard object content to obtain the standard non-predicate content; or, if the non-predicate elements comprise both an object element and an adverbial element, replacing the content corresponding to the object element with the corresponding standard object content, replacing the content corresponding to the adverbial element with the corresponding standard adverbial content, and obtaining the standard non-predicate content based on the standard object content and the standard adverbial content.
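The branching just described, where the components present decide which replacement tables apply, can be sketched as follows. The replacement dictionaries are placeholders standing in for the patent's object and adverbial mapping tables, which are not shown in text form:

```python
# Hypothetical mapping tables; the patent's actual tables are in figures.
STANDARD_OBJECTS = {"photos": "album"}
STANDARD_ADVERBIALS = {"quickly": "immediately"}

def replace_non_predicates(components):
    """Standardize whichever non-predicate components are present.

    `components` maps component names ("object", "adverbial") to their
    content. Each present component is replaced via its own table;
    unknown content passes through unchanged.
    """
    result = {}
    if "object" in components:
        obj = components["object"]
        result["object"] = STANDARD_OBJECTS.get(obj, obj)
    if "adverbial" in components:
        adv = components["adverbial"]
        result["adverbial"] = STANDARD_ADVERBIALS.get(adv, adv)
    return result
```

When both an object element and an adverbial element are present, both tables are applied and the results together form the standard non-predicate content, matching the third branch above.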
Similarly, objects with noun structures have many synonymous and near-synonymous expressions and need to be generalized to obtain the standard object content. The following table shows the content substitution mode corresponding to an object component:
[Table image: generalized objects and their corresponding standard objects]
the electronic device may also determine corresponding standard object content based on the similarity for the object components. It should be noted that, in the embodiments of the present application, the corresponding standard content may be determined based on the correspondence between the generalized content and the standard content. For example, the object component is replaced with a content according to the correspondence (for example, the table) between the generalized content corresponding to the object component and the standard object content. In one case, the content corresponding to the object component in the first instruction text may not yet exist in the correspondence relationship, and thus the content cannot be replaced with the corresponding standard object content. In this case, the electronic device may obtain the description information of the control included in the current user interface, and then perform similarity detection on the description information of the control and the content corresponding to the object component, and use the description information with the highest similarity as the standard object content.
In the embodiment of the present application, there may be a plurality of ways to calculate the similarity.
As one similarity calculation method, the electronic device may compute, based on the longest common subsequence, a first reference similarity between the description information of each control in the current user interface and the content corresponding to the object component; compute, based on the edit distance, a second reference similarity between the description information of each control and the content corresponding to the object component; and add the first reference similarity and the second reference similarity of each control to obtain the similarity between the description information of that control and the content corresponding to the object component.
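A concrete sketch of this combined similarity is given below. The patent only says the two reference similarities are added; the normalization used here (dividing by the longer string's length) is an assumption made so the two scores are on comparable scales.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def best_matching_control(object_content, control_descriptions):
    """Pick the control description most similar to the object content.

    Adds an LCS-based first reference similarity and an edit-distance-based
    second reference similarity per control, then takes the maximum.
    """
    def score(desc):
        denom = max(len(desc), len(object_content), 1)
        first = lcs_length(desc, object_content) / denom
        second = 1.0 - edit_distance(desc, object_content) / denom
        return first + second
    return max(control_descriptions, key=score)
```

For example, given the object content "album" and control descriptions "settings", "camera", and "album view", the combined score selects "album view".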
As another similarity calculation method, the electronic device may directly obtain a text vector of description information of a control in the current user interface based on the trained neural network model, obtain a text vector of content corresponding to an object component, and determine the similarity based on a distance between the text vectors.
Furthermore, in the embodiment of the present application, there may be multiple ways of obtaining the description information of the control in the current user interface.
Optionally, the electronic device may identify the current user interface through at least one of the following recognition manners to obtain the controls included in the current user interface and the description information corresponding to each control: identifying the current user interface based on code parsing; identifying the current user interface based on image-text recognition (e.g., optical character recognition); and identifying the current user interface based on an icon classification model.
As a mode, the electronic device may identify the current user interface by combining the three modes to obtain a control in the current user interface and description information corresponding to the control. For example, the electronic device may first identify the current user interface based on a code parsing manner, and if the identification is successful, may directly obtain a control included in the current user interface and description information corresponding to the control. If the identification is not successful, the electronic equipment can identify the current user interface by adopting an image-text identification mode or an icon classification model so as to obtain a control included in the current user interface and description information corresponding to the control.
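The cascaded recognition strategy above amounts to an ordered fallback chain; the following sketch is illustrative only (the recognizer interface and its return convention are assumptions):

```python
from typing import Callable, Dict, List, Optional

# Each recognizer returns a mapping of control id -> description information,
# or None when that recognition path fails for the given interface.
Recognizer = Callable[[object], Optional[Dict[str, str]]]


def recognize_controls(ui, recognizers: List[Recognizer]) -> Dict[str, str]:
    """Try each recognition path in order and return the first successful result.

    The order mirrors the text above: code parsing first, then image-text
    (OCR) recognition or an icon-classification model as fallbacks.
    """
    for recognize in recognizers:
        result = recognize(ui)
        if result is not None:  # this recognition path succeeded
            return result
    return {}
```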
As still another alternative, the replacing of the content corresponding to the non-predicate component with the corresponding standard non-predicate content includes: if the non-predicate components include neither a subject component nor an object component, acquiring a current task scene; and replacing the content corresponding to the non-predicate component based on the task scene to obtain standard non-predicate content corresponding to the task scene.
It should be noted that, when speaking, a user may phrase a command loosely due to speech habits, and a loosely phrased voice instruction may not allow the electronic device to accurately determine the user's control intention. For example, if the content corresponding to the voice instruction is "next one", it may mean either the next item or downloading one item. In an audio playback scenario, "next one" likely means the next item, for example, playing the next song; in a software download scenario, it likely means downloading one item, for example, downloading an application.
In order to determine the user's real intention more accurately, as one manner, after the first instruction text is obtained, the first instruction text is updated according to the task scene corresponding to the current user interface to obtain standard non-predicate content corresponding to that task scene, where the current user interface is the interface displayed when the voice instruction was acquired. For example, after obtaining a first instruction text whose content is "next one", if the task scene corresponding to the current user interface is determined to be an audio playback scene, the electronic device may update "next one" so that the updated standard non-predicate content is "the next song". If the task scene corresponding to the current user interface is determined to be an application download scene, "next one" may be updated so that the updated standard non-predicate content is "download an application".
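A minimal sketch of this scene-dependent replacement; the phrases and scene names are invented for illustration and do not come from the application:

```python
# Hypothetical mapping from (ambiguous phrase, task scene) to standard
# non-predicate content; every entry here is an illustrative placeholder.
SCENE_DICTIONARY = {
    ("next one", "audio_playback"): "the next song",
    ("next one", "app_download"): "download an application",
}


def disambiguate(phrase: str, task_scene: str) -> str:
    """Replace an ambiguous non-predicate phrase using the current task scene;
    phrases without a scene-specific entry pass through unchanged."""
    return SCENE_DICTIONARY.get((phrase, task_scene), phrase)
```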
As an embodiment, if the syntax element indicates that the first instruction text further includes a non-predicate element, before replacing the content corresponding to the non-predicate element with the corresponding standard non-predicate content, the method further includes:
S231: And if the grammar component represents that only the predicate component is included in the first instruction text, detecting whether the content corresponding to the predicate component represents that the overall operation is performed on the current user interface.
S232: and if the representation is to perform integral operation on the current user interface, generating a control instruction for executing the integral operation.
S233: and if the representation is not the integral operation of the current user interface and the representation is the operation of the control in the current user interface, generating a control instruction corresponding to the control.
It should be noted that, if only a predicate component exists in the first instruction text, that component may refer to a control or to an operation on the current user interface. If the user had expressed the content of such a first instruction text as a complete sentence, the lone predicate component would actually be the object component of that complete sentence. For example, if the first instruction text is "pause", the corresponding complete sentence should be "click the pause button"; that is, the lone predicate component "pause" in the first instruction text is actually part of the object component "pause button" of the corresponding complete sentence. Since such a first instruction text may not constitute a complete sentence, sentence-component analysis may fail, and the electronic device may optionally determine that the first instruction text includes only a predicate component based on part of speech.
As described above, when only a predicate component is present in the instruction text, the user may have one of two intentions. One possibility is referring to a control, such as "play", "pause", or "share"; the other is performing an overall operation on the page, such as swiping left, going back, or exiting. In the control case, the first instruction text may be converted into the corresponding complete sentence, a triple may then be generated based on the complete sentence, and a corresponding control instruction may be generated based on the triple. For example, if the first instruction text is "pause", the converted complete sentence is "click the pause button", and the triple may be generated based on "click the pause button".
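The two-branch decision for a predicate-only instruction can be sketched as follows; the operation list, sentence template, and triple layout are illustrative assumptions, not the application's actual data structures:

```python
# Whole-page operations (illustrative entries only).
PAGE_OPERATIONS = {"back", "exit", "swipe left"}


def handle_predicate_only(predicate: str):
    """Decide between a whole-page operation and a control operation.

    A lone predicate such as "pause" is treated as the object of an implied
    "click ... button" sentence, from which a (verb, object, sentence) triple
    is built.
    """
    if predicate in PAGE_OPERATIONS:
        return ("page_op", predicate, None)
    sentence = f"click {predicate} button"  # completed sentence
    return ("click", f"{predicate} button", sentence)
```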
S250: and obtaining a second instruction text based on the standard predicate content and the standard non-predicate content.
In the embodiment of the application, the process of obtaining the second instruction text may be understood as replacing nonstandard content in the first instruction text with standard content, without changing the user intention corresponding to the first instruction text. The obtained second instruction text may be understood as text that the electronic device can directly use for generating the control instruction. That is, in the embodiment of the present application, the electronic device cannot directly generate a control instruction from nonstandard content, but it can generate the corresponding control instruction after the nonstandard content (for example, the generalized content shown in the table in this document) is replaced with the corresponding standard content. Optionally, the electronic device may associate standard content with corresponding executable program code, so the electronic device may be unable to determine which program code corresponds to nonstandard content. For example, the standard content "click" may correspond to program code related to a click event. Even if the user intends a click, when the expressed content is nonstandard content such as "open" or "view", it cannot be directly associated with the program code, and the control instruction cannot be accurately generated.
After the replacement of the content corresponding to the predicate component and the content corresponding to the non-predicate component is completed, the second instruction text may be obtained from the standard predicate content and the standard non-predicate content. For example, suppose the first instruction text is "help me press the rightmost button above". The grammatical components obtained after analysis may indicate that the first instruction text includes a predicate component and a state component, where the content corresponding to the predicate component is "press" and the content corresponding to the state component is "rightmost above". The content "press" corresponding to the predicate component is replaced with the corresponding standard predicate content "click", the content "rightmost above" corresponding to the state component is replaced with the corresponding standard state content "upper right corner", and the second instruction text obtained from the standard predicate content and the standard non-predicate content is "click the upper right corner".
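A hedged sketch of assembling the second instruction text from standardized components; the dictionary entries echo the example in the text but are otherwise invented placeholders:

```python
# Illustrative replacement dictionaries; entries are placeholders.
PREDICATE_DICT = {"press": "click", "open": "click", "view": "click"}
STATE_DICT = {"rightmost upper side": "upper right corner"}


def build_second_text(components: dict) -> str:
    """Assemble the second instruction text from standardized components.

    `components` maps grammar-component names ("predicate", "state", "object")
    to their raw contents; contents without a dictionary entry pass through.
    """
    predicate = PREDICATE_DICT.get(components.get("predicate", ""), components.get("predicate", ""))
    state = STATE_DICT.get(components.get("state", ""), components.get("state", ""))
    obj = components.get("object", "")
    return " ".join(part for part in (predicate, state, obj) if part)
```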
S260: and generating a control instruction based on the second instruction text.
S270: and executing the control instruction.
Next, a flow according to an embodiment of the present application will be described with reference to fig. 5. As shown in fig. 5, after acquiring the voice instruction, the electronic device may perform voice recognition to obtain a first instruction text, and then parse the first instruction text. After obtaining the syntax components, it may further determine whether a predicate is present in the first instruction text. If a predicate is present, the content corresponding to the predicate component is generalized; the electronic device can generalize this content through the dictionary, so that it is replaced with standard predicate content. Further, the electronic device may determine whether only the content corresponding to the predicate component is present; if not, it may remove the content corresponding to the predicate component from the first instruction text and then perform non-predicate generalization on the remaining content corresponding to the non-predicate components. If only the content corresponding to the predicate component is present, the corresponding control instruction can be generated directly based on that content. When it is determined that no predicate component exists in the first instruction text, the first instruction text is directly subjected to non-predicate generalization.
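The fig. 5 flow can be outlined as a small driver function; the parser and generalizer interfaces are placeholders supplied by the caller, not APIs defined by the application:

```python
def process_instruction(first_text: str, parse, generalize_predicate, generalize_non_predicate):
    """Sketch of the fig. 5 flow: parse the first instruction text, then
    generalize predicate and/or non-predicate contents depending on which
    grammar components are present."""
    components = parse(first_text)  # e.g. {"predicate": "...", "object": "..."}
    if "predicate" in components:
        # Predicate present: generalize it via the dictionary.
        components["predicate"] = generalize_predicate(components["predicate"])
        rest = {k: v for k, v in components.items() if k != "predicate"}
        if rest:
            # Non-predicate components remain: generalize them too.
            components.update(generalize_non_predicate(rest))
    else:
        # No predicate: non-predicate generalization directly.
        components.update(generalize_non_predicate(components))
    return components
```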
According to the voice control method provided by the embodiment, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved. In addition, in this embodiment, different content replacement methods may be used according to whether a predicate element is included in a syntax element or not, and whether only a predicate element is included when a predicate element is included, so that flexibility and diversity of content replacement of syntax elements are improved, and finer granularity and more accurate content replacement are facilitated. In addition, the present embodiment can combine the application type and the state information (task scenario) to jointly decide and understand the instruction, and can solve the problem that the same instruction expresses different intentions in different contexts.
Referring to fig. 6, a voice control method provided in the present application includes:
S310: In response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text.
S320: and analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text.
S330: and if the grammar component represents that the first instruction text does not comprise a predicate component and comprises a non-predicate component, replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content.
As one mode, if the non-predicate component includes a state component, the content corresponding to the state component is replaced with the corresponding standard state content to obtain standard non-predicate content; if the non-predicate component includes an object component, the content corresponding to the object component is replaced with the corresponding standard object content to obtain standard non-predicate content; or, if the non-predicate component includes both an object component and a state component, the content corresponding to the object component is replaced with the corresponding standard object content, the content corresponding to the state component is replaced with the corresponding standard state content, and standard non-predicate content is obtained based on the standard object content and the standard state content.
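The three branches can be sketched as follows; the dictionary entries are illustrative placeholders, not content from the application:

```python
# Illustrative standard-content dictionaries.
STD_STATE = {"rightmost upper side": "upper right corner"}
STD_OBJECT = {"the button above": "top button"}


def replace_non_predicate(parts: dict) -> str:
    """Apply the three branches above: state only, object only, or both."""
    state = STD_STATE.get(parts["state"], parts["state"]) if "state" in parts else None
    obj = STD_OBJECT.get(parts["object"], parts["object"]) if "object" in parts else None
    if state and obj:
        # Both present: combine the standard state and object contents.
        return f"{state} {obj}"
    return state or obj or ""
```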
S340: and obtaining a second instruction text based on the default predicate content and the standard non-predicate content.
S350: and generating a control instruction based on the second instruction text.
S360: and executing the control instruction.
According to the voice control method provided by the embodiment, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved. In addition, in this embodiment, when it is detected that there is no predicate in the instruction text, targeted replacement can be performed on the non-predicate elements, thereby improving flexibility and diversity of syntax element content replacement.
Referring to fig. 7, a voice control method provided in the present application includes:
S410: In response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text.
S420: and analyzing grammatical components of the first instruction text to obtain grammatical components of the first instruction text.
S430: and if the grammar component comprises a tone word component, removing the content corresponding to the tone word component in the first instruction text to obtain the instruction text from which the tone word is removed.
S440: and replacing the content corresponding to the entity grammar component in the instruction text without the tone words with the corresponding standard content to obtain a second instruction text, wherein the entity grammar component is the component left after the tone word component is removed from the grammar component.
S450: and generating a control instruction based on the second instruction text.
S460: and executing the control instruction.
As one way, the replacing, based on the grammar component, the content corresponding to the grammar component in the first instruction text with the corresponding standard content to obtain a second instruction text includes: and replacing the content corresponding to the grammar component in the first instruction text with corresponding standard content based on the grammar component and a dictionary to obtain a second instruction text, wherein the dictionary records a content replacement relation. Optionally, the method further includes: acquiring a current position; and acquiring a dictionary corresponding to the current position.
It should be noted that the dictionary may store a correspondence between the generalized content and the standard content corresponding to each grammar component. Moreover, the accent or language expression habits of users in different regions are different, so that corresponding dictionaries can be established for different regions, and the dictionaries in different regions can be different, so that the method can better adapt to the users in different regions, and further improve the accuracy of obtaining the actual control intention of the users. Furthermore, the electronic device can update the dictionary, so that the latest expression habit of the user can be better adapted.
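A minimal sketch of selecting a region-specific replacement dictionary; the region names and entries are invented for illustration:

```python
# Per-region replacement dictionaries (illustrative entries only).
REGION_DICTS = {
    "region_a": {"press": "click"},
    "region_b": {"tap": "click", "press": "click"},
}
DEFAULT_DICT = {"press": "click"}


def dictionary_for(location: str) -> dict:
    """Pick the replacement dictionary matching the user's current location,
    falling back to a default dictionary for unknown regions."""
    return REGION_DICTS.get(location, DEFAULT_DICT)
```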
According to the voice control method provided by this embodiment, after the grammatical components of the instruction text are analyzed, the content in the instruction text can be replaced with corresponding standard content based on those grammatical components, so that the electronic device can more accurately determine the user's control intention from the replaced standard content, improving the accuracy of the control process. In addition, in this embodiment, the tone words in the instruction text may be removed first, which helps improve the accuracy of the generated control instruction. Moreover, in this embodiment, constructing the dictionary addresses the diversity of Chinese language expression, mainly nonstandard voice commands such as special sentence patterns (elliptical sentences, inverted sentences, and rhetorical questions), synonyms and near-synonyms, different names for the same entity in different regions, and users referring to an entity by an abbreviation or alternative name. Because a dictionary is used, no large-scale pre-trained model is needed and the method does not impose an excessive computational load, so it can be deployed directly on end-side devices. Meanwhile, for object entities that are updated in real time (the content in the dictionary), configuration can be performed by extending the dictionary from time to time, without repeatedly retraining and fine-tuning a model on large corpora, which shortens the iteration cycle.
Referring to fig. 8, the present application provides a voice control apparatus, where the apparatus 500 includes:
an instruction converting unit 510, configured to, in response to receiving a voice instruction, convert the voice instruction into a corresponding first instruction text;
a syntactic component analyzing unit 520, configured to perform syntactic component analysis on the first instruction text to obtain a syntactic component of the first instruction text;
the instruction processing unit 530 is configured to replace, based on the syntax component, content in the first instruction text corresponding to the syntax component with corresponding standard content to obtain a second instruction text;
an instruction generating unit 540, configured to generate a control instruction based on the second instruction text;
a control unit 550, configured to execute the control instruction.
As one mode, the instruction processing unit 530 is specifically configured to, if the syntax component indicates that a predicate component is included in the first instruction text, replace a content corresponding to the predicate component with a corresponding standard predicate content; if the grammar component represents that the first instruction text also comprises a non-predicate component, replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content; and obtaining a second instruction text based on the standard predicate content and the standard non-predicate content.
Optionally, the instruction processing unit 530 is specifically configured to: if the non-predicate component includes a state component, replace the content corresponding to the state component with the corresponding standard state content to obtain standard non-predicate content; if the non-predicate component includes an object component, replace the content corresponding to the object component with the corresponding standard object content to obtain standard non-predicate content; or, if the non-predicate component includes both an object component and a state component, replace the content corresponding to the object component with the corresponding standard object content, replace the content corresponding to the state component with the corresponding standard state content, and obtain standard non-predicate content based on the standard object content and the standard state content.
Optionally, the instruction processing unit 530 is specifically configured to: if the non-predicate component includes neither a subject component nor an object component, obtain a current task scene; and replace the content corresponding to the non-predicate component based on the task scene to obtain standard non-predicate content corresponding to the task scene.
As an embodiment, the instruction generating unit 540 is specifically configured to, before replacing the content corresponding to the non-predicate element with the corresponding standard non-predicate content if the syntax element indicates that the first instruction text further includes a non-predicate element, detect whether the content corresponding to the predicate element is indicated as performing an overall operation on the current user interface if the syntax element indicates that only a predicate element is included in the first instruction text; if the representation is to carry out integral operation on the current user interface, generating a control instruction for executing the integral operation; and if the representation is not the integral operation of the current user interface and the representation is the operation of the control in the current user interface, generating a control instruction corresponding to the control.
As an embodiment, the instruction processing unit 530 is specifically configured to, if the syntax element indicates that no predicate element is included in the first instruction text and a non-predicate element is included in the first instruction text, replace a content corresponding to the non-predicate element with a corresponding standard non-predicate content; and obtaining a second instruction text based on the default predicate content and the standard non-predicate content.
As a mode, the instruction processing unit 530 is specifically configured to, if the grammar component includes a tone word component, remove content corresponding to the tone word component in the first instruction text, to obtain an instruction text from which the tone word is removed; and replacing the content corresponding to the entity grammar component in the instruction text without the tone words with the corresponding standard content to obtain a second instruction text, wherein the entity grammar component is the component left after the tone word component is removed from the grammar component.
As one manner, the instruction processing unit 530 is specifically configured to replace, based on the syntax component and a dictionary, a content in the first instruction text corresponding to the syntax component with a corresponding standard content to obtain a second instruction text, where a content replacement relationship is recorded in the dictionary. Optionally, the instruction processing unit 530 is further specifically configured to obtain the current location; and acquiring a dictionary corresponding to the current position.
In the voice control device provided in this embodiment, after a voice instruction is received, the voice instruction is converted into a corresponding first instruction text, syntactic component analysis is performed on the first instruction text to obtain syntactic components of the first instruction text, and then, based on the syntactic components, content corresponding to the syntactic components in the first instruction text is replaced with corresponding standard content to obtain a second instruction text, and finally, a control instruction is generated based on the second instruction text, and the control instruction is executed. Therefore, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In several embodiments provided herein, the coupling of modules to each other may be electrical. In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
An electronic device provided by the present application will be described below with reference to fig. 9.
Referring to fig. 9, based on the voice control method and apparatus, an electronic device 1000 capable of executing the voice control method is further provided in the embodiment of the present application. The electronic device 1000 includes one or more processors 102 (only one shown), a memory 104, a camera 106, and an audio capture device 108 coupled to each other. The memory 104 stores programs that can execute the content of the foregoing embodiments, and the processor 102 can execute the programs stored in the memory 104.
The processor 102 may include one or more processing cores. The processor 102 interfaces with various components throughout the electronic device 1000 using various interfaces and circuitry, and performs various functions of the electronic device 1000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 104 and invoking data stored in the memory 104. Alternatively, the processor 102 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 102 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU is used for rendering and drawing display content; and the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 102, but may instead be implemented by a communication chip. In one approach, the processor 102 may be a neural network chip, for example, an embedded neural-network processing unit (NPU).
The memory 104 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 104 may be used to store instructions, programs, code sets, or instruction sets. The memory 104 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing various method embodiments described herein, and the like.
Furthermore, the electronic device 1000 may further include a network module 110 and a sensor module 112 in addition to the aforementioned components.
The network module 110 is used for implementing information interaction between the electronic device 1000 and other devices, for example, transmitting device control instructions, manipulation request instructions, status information acquisition instructions, and the like. When the electronic device 1000 is embodied as a different device, the corresponding network module 110 may be different.
The sensor module 112 may include at least one sensor. Specifically, the sensor module 112 may include, but is not limited to: level sensors, light sensors, motion sensors, pressure sensors, infrared heat sensors, distance sensors, acceleration sensors, and other sensors.
Among other things, the pressure sensor may detect the pressure generated by pressing on the electronic device 1000. That is, the pressure sensor detects pressure generated by contact or pressing between the user and the electronic device, for example, contact or pressing between the user's ear and the mobile terminal. Thus, the pressure sensor may be used to determine whether contact or pressure has occurred between the user and the electronic device 1000, as well as the magnitude of the pressure.
The acceleration sensor may detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when stationary, and may be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of the electronic device 1000, and related functions (such as pedometer and tapping) for vibration recognition. In addition, the electronic device 1000 may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer and a thermometer, which are not described herein again.
The audio capture device 108 is used for capturing audio signals. Optionally, the audio capture device 108 may include a plurality of audio capture elements, and each audio capture element may be a microphone.
As one mode, the network module of the electronic device 1000 is a radio frequency module, and the radio frequency module is configured to receive and transmit electromagnetic waves, and implement interconversion between the electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices. The radio frequency module may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. For example, the radio frequency module may interact with an external device through transmitted or received electromagnetic waves. For example, the radio frequency module may send instructions to the target device.
Referring to fig. 10, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable medium 800 has stored therein a program code that can be called by a processor to execute the method described in the above-described method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 800 includes a non-volatile computer-readable storage medium. The computer readable storage medium 800 has storage space for program code 810 to perform any of the method steps of the method described above. The program code can be read from or written to one or more computer program products. The program code 810 may be compressed, for example, in a suitable form.
In summary, according to the voice control method, the voice control device and the electronic device provided by the application, after a voice instruction is received, the voice instruction is converted into a corresponding first instruction text, grammatical component analysis is performed on the first instruction text to obtain grammatical components of the first instruction text, then based on the grammatical components, content corresponding to the grammatical components in the first instruction text is replaced by corresponding standard content to obtain a second instruction text, and finally, a control instruction is generated based on the second instruction text and executed. Therefore, after the grammatical component of the instruction text is analyzed, the content in the instruction text can be replaced by the corresponding standard content based on the grammatical component of the instruction text, so that the electronic equipment can more accurately determine the control intention of the user based on the replaced standard content, and the accuracy of the control process is improved.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications and replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (13)

1. A voice control method, wherein the method comprises: in response to receiving a voice instruction, converting the voice instruction into a corresponding first instruction text; performing grammatical component analysis on the first instruction text to obtain grammatical components of the first instruction text; replacing, based on the grammatical components, content corresponding to the grammatical components in the first instruction text with corresponding standard content to obtain a second instruction text; generating a control instruction based on the second instruction text; and executing the control instruction.

2. The method according to claim 1, wherein replacing, based on the grammatical components, the content corresponding to the grammatical components in the first instruction text with the corresponding standard content to obtain the second instruction text comprises: if the grammatical components indicate that the first instruction text includes a predicate component, replacing content corresponding to the predicate component with corresponding standard predicate content; if the grammatical components indicate that the first instruction text further includes a non-predicate component, replacing content corresponding to the non-predicate component with corresponding standard non-predicate content; and obtaining the second instruction text based on the standard predicate content and the standard non-predicate content.

3. The method according to claim 2, wherein replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content comprises: if the non-predicate component includes an adverbial component, replacing content corresponding to the adverbial component with corresponding standard adverbial content to obtain the standard non-predicate content; if the non-predicate component includes an object component, replacing content corresponding to the object component with corresponding standard object content to obtain the standard non-predicate content; or, if the non-predicate component includes both an object component and an adverbial component, replacing content corresponding to the object component with corresponding standard object content, replacing content corresponding to the adverbial component with corresponding standard adverbial content, and obtaining the standard non-predicate content based on the standard object content and the standard adverbial content.

4. The method according to claim 2, wherein replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content comprises: if the non-predicate component includes neither an adverbial component nor an object component, obtaining a current task scenario; and replacing the content corresponding to the non-predicate component based on the task scenario to obtain standard non-predicate content corresponding to the task scenario.

5. The method according to claim 2, wherein, before replacing the content corresponding to the non-predicate component with the corresponding standard non-predicate content if the grammatical components indicate that the first instruction text further includes a non-predicate component, the method further comprises: if the grammatical components indicate that the first instruction text includes only a predicate component, detecting whether content corresponding to the predicate component represents an overall operation on a current user interface; if it represents an overall operation on the current user interface, generating a control instruction for performing the overall operation; and if it does not represent an overall operation on the current user interface but represents an operation on a control in the current user interface, generating a control instruction corresponding to the control.

6. The method according to claim 1, wherein replacing, based on the grammatical components, the content corresponding to the grammatical components in the first instruction text with the corresponding standard content to obtain the second instruction text comprises: if the grammatical components indicate that the first instruction text includes no predicate component and includes a non-predicate component, replacing content corresponding to the non-predicate component with corresponding standard non-predicate content; and obtaining the second instruction text based on default predicate content and the standard non-predicate content.

7. The method according to claim 1, wherein replacing, based on the grammatical components, the content corresponding to the grammatical components in the first instruction text with the corresponding standard content to obtain the second instruction text comprises: if the grammatical components include a modal particle component, removing content corresponding to the modal particle component from the first instruction text to obtain an instruction text with the modal particle removed; and replacing content corresponding to substantive grammatical components in the instruction text with the modal particle removed with corresponding standard content to obtain the second instruction text, wherein the substantive grammatical components are the components remaining after the modal particle component is removed from the grammatical components.

8. The method according to claim 1, wherein replacing, based on the grammatical components, the content corresponding to the grammatical components in the first instruction text with the corresponding standard content to obtain the second instruction text comprises: replacing, based on the grammatical components and a dictionary, the content corresponding to the grammatical components in the first instruction text with the corresponding standard content to obtain the second instruction text, wherein content-replacement relationships are recorded in the dictionary.

9. The method according to claim 8, wherein the method further comprises: obtaining a current location; and obtaining a dictionary corresponding to the current location.

10. A voice control apparatus, wherein the apparatus comprises: an instruction conversion unit configured to convert a voice instruction into a corresponding first instruction text in response to receiving the voice instruction; a grammatical component analysis unit configured to perform grammatical component analysis on the first instruction text to obtain grammatical components of the first instruction text; an instruction processing unit configured to replace, based on the grammatical components, content corresponding to the grammatical components in the first instruction text with corresponding standard content to obtain a second instruction text; an instruction generation unit configured to generate a control instruction based on the second instruction text; and a control unit configured to execute the control instruction.

11. An electronic device, comprising one or more processors and a memory, wherein one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method according to any one of claims 1-9.

12. A computer-readable storage medium, wherein program code is stored in the computer-readable storage medium, and the method according to any one of claims 1-9 is performed when the program code is run.

13. A computer program product, comprising a computer program/instructions, wherein, when the computer program/instructions are executed by a processor, the steps of the method according to any one of claims 1-9 are implemented.
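Claims 8 and 9 can be illustrated with a short sketch: a dictionary recording content-replacement relationships is selected according to the current location, so regional phrasings converge on the same standardized instruction text. The region names, dictionary entries, and function names below are invented for illustration and are not part of the claims.

```python
# Hypothetical per-location replacement dictionaries (claims 8-9).
# Each dictionary records content-replacement relationships:
# recognized content -> standard content.
DICTIONARIES = {
    "region_a": {"turn down": "decrease"},
    "region_b": {"dial back": "decrease"},
}
DEFAULT_DICTIONARY = {"lower": "decrease"}

def get_dictionary(current_location):
    """Obtain the dictionary corresponding to the current location."""
    return DICTIONARIES.get(current_location, DEFAULT_DICTIONARY)

def replace_with_standard(first_text, current_location):
    """Produce the second instruction text using the location's dictionary."""
    dictionary = get_dictionary(current_location)
    for content, standard in dictionary.items():
        first_text = first_text.replace(content, standard)
    return first_text

# Different regional phrasings map to the same standardized text.
print(replace_with_standard("dial back the volume", "region_b"))
print(replace_with_standard("turn down the volume", "region_a"))
```

Selecting the dictionary by location keeps each table small and lets the same downstream intent matching serve users whose colloquial phrasings differ by region.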
CN202111342166.8A 2021-11-12 2021-11-12 Voice control method and device and electronic equipment Pending CN114121001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111342166.8A CN114121001A (en) 2021-11-12 2021-11-12 Voice control method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111342166.8A CN114121001A (en) 2021-11-12 2021-11-12 Voice control method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114121001A true CN114121001A (en) 2022-03-01

Family

ID=80379011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111342166.8A Pending CN114121001A (en) 2021-11-12 2021-11-12 Voice control method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114121001A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09146951A (en) * 1995-11-27 1997-06-06 Ishikura Hiroshi System and method for language analysis
CN109741737A (en) * 2018-05-14 2019-05-10 北京字节跳动网络技术有限公司 A kind of method and device of voice control
CN110070861A (en) * 2018-01-22 2019-07-30 丰田自动车株式会社 Information processing unit and information processing method
CN110675870A (en) * 2019-08-30 2020-01-10 深圳绿米联创科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN110767232A (en) * 2019-09-29 2020-02-07 深圳和而泰家居在线网络科技有限公司 Speech recognition control method and device, computer equipment and computer storage medium
CN110866090A (en) * 2019-11-14 2020-03-06 百度在线网络技术(北京)有限公司 Method, apparatus, electronic device and computer storage medium for voice interaction
CN112839261A (en) * 2021-01-14 2021-05-25 海信电子科技(深圳)有限公司 Method for improving voice instruction matching degree and display equipment

Similar Documents

Publication Publication Date Title
KR102380494B1 (en) Image processing apparatus and method
US11024300B2 (en) Electronic device and control method therefor
CN104735468B (en) A kind of method and system that image is synthesized to new video based on semantic analysis
WO2023082703A1 (en) Voice control method and apparatus, electronic device, and readable storage medium
JP2019046468A (en) Interface smart interactive control method, apparatus, system and program
WO2017084185A1 (en) Intelligent terminal control method and system based on semantic analysis, and intelligent terminal
CN110765294B (en) Image searching method and device, terminal equipment and storage medium
CN109543021B (en) Intelligent robot-oriented story data processing method and system
CN110781329A (en) Image searching method and device, terminal equipment and storage medium
KR20190115405A (en) Search method and electronic device using the method
KR20220102522A (en) Electronic device and method for generating summary image of electronic device
WO2023103918A1 (en) Speech control method and apparatus, and electronic device and storage medium
CN109725798A (en) The switching method and relevant apparatus of Autonomous role
CN113205569B (en) Image drawing method and device, computer readable medium and electronic device
CN118916443A (en) Information retrieval method and device and electronic equipment
CN116561294A (en) Sign language video generation method and device, computer equipment and storage medium
CN114049890A (en) Voice control method and device and electronic equipment
US20210224310A1 (en) Electronic device and story generation method thereof
CN109241331B (en) Intelligent robot-oriented story data processing method
US20240320519A1 (en) Systems and methods for providing a digital human in a virtual environment
CN114121001A (en) Voice control method and device and electronic equipment
US20220270604A1 (en) Electronic device and operation method thereof
CN118644751A (en) A model training method, model application method and related device
CN116052709A (en) Sign language generation method, device, electronic device and storage medium
CN110795581B (en) Image searching method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20250415
Address after: Changan town in Guangdong province Dongguan 523860 usha Beach Road No. 18
Applicant after: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS Corp.,Ltd.
Country or region after: China
Address before: 311100 room 1001, building 9, Xixi bafangcheng, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province
Applicant before: Hangzhou douku Software Technology Co.,Ltd.
Country or region before: China