CN112700780B - Voice processing method and system based on multiple devices - Google Patents
Voice processing method and system based on multiple devices
- Publication number
- CN112700780B CN112700780B CN202011501007.3A CN202011501007A CN112700780B CN 112700780 B CN112700780 B CN 112700780B CN 202011501007 A CN202011501007 A CN 202011501007A CN 112700780 B CN112700780 B CN 112700780B
- Authority
- CN
- China
- Prior art keywords
- voice recognition
- state information
- running state
- voice
- recognition result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/34—Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention relates to the technical field of voice processing and discloses a voice processing method and system based on multiple devices. The method comprises the steps of: obtaining voice instruction information and extracting corresponding pulse code modulation data from it; obtaining the device identifiers of a plurality of pre-associated intelligent devices, and obtaining the running state information of each intelligent device according to the device identifiers; performing voice recognition processing on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result; and selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to it. Because the voice instruction information is processed in combination with the cloud server according to the running state information of the intelligent devices, and the corresponding target intelligent device is then selected to respond according to the voice recognition result, the intelligent devices and the cloud work cooperatively, which improves the response speed of online voice processing and the voice interaction experience of users.
Description
Technical Field
The present invention relates to the field of speech processing technologies, and in particular, to a speech processing method and system based on multiple devices.
Background
The intellectualization of internet of things (Internet of Things, IoT) devices requires substantial technical support, and online speech processing, as an interface for man-machine interaction, plays a significant role in it. In the prior art, voice instruction information is recognized locally by the intelligent device, a corresponding control instruction is generated based on the recognized text, and the corresponding operation is then executed based on that control instruction to respond to the voice instruction information. Because of the limitation of local storage space, a locally stored voice recognition database cannot recognize as wide a variety of voice instruction information as cloud analysis can, so both the accuracy and the response speed of voice interaction fall short: voice processing is slowed down, and the user's voice interaction experience suffers. Therefore, how to improve the response speed of online voice processing so as to improve the voice interaction experience of the user has become a problem to be solved urgently.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide a voice processing method and a voice processing system based on multiple devices, which aim to solve the technical problem of how to improve the response speed of online voice processing so as to improve the voice interaction experience of users.
To achieve the above object, the present invention provides a multi-device-based voice processing method, the method comprising the steps of:
Acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
acquiring device identifiers of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifiers;
Performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
And selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
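The four claimed steps can be sketched end to end as follows. This is an illustrative Python outline only; every function, device name, threshold, and recognition result is a hypothetical stand-in, not part of the claims.

```python
# Illustrative outline of steps S10-S40; all names are hypothetical.

def extract_pcm(audio_samples, levels=256):
    """S10: quantize analog samples in [-1, 1] to 8-bit PCM codes."""
    return [min(levels - 1, int((s + 1.0) / 2.0 * levels)) for s in audio_samples]

def get_running_state(device_ids):
    """S20: query each pre-associated device's running state by identifier."""
    # Fixed readings stand in for real per-device state queries.
    return {d: {"occupancy": 0.30, "idle": 0.70} for d in device_ids}

def cloud_recognize(pcm, states):
    """S30: upload the PCM data for cloud recognition when no device is busy."""
    if all(s["occupancy"] <= 0.80 and s["idle"] >= 0.10 for s in states.values()):
        return {"text": "turn on the air conditioner"}  # stand-in cloud result
    return None

def select_target(result, device_ids):
    """S40: pick the device whose identifier appears in the recognized text."""
    for d in device_ids:
        if d in result["text"]:
            return d
    return None

devices = ["television", "humidifier", "air conditioner"]
pcm = extract_pcm([0.0, 0.5, -0.5])
states = get_running_state(devices)
result = cloud_recognize(pcm, states)
target = select_target(result, devices)
```

Each stub corresponds to one claimed step; a real implementation would replace the fixed readings and the stand-in cloud result with actual device queries and a cloud recognition call.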
Preferably, the step of performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result specifically includes:
judging whether the running state information accords with a working state condition or not;
and uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
Preferably, the running state information comprises a processor occupancy rate and/or a processor idle rate;
correspondingly, the step of judging whether the running state information accords with the working state condition specifically comprises the following steps:
Extracting the processor occupancy rate from the running state information;
detecting whether the occupancy rate of the processor is larger than a preset occupancy rate or not, and judging whether the running state information accords with a working state condition or not according to a detection result;
And/or;
extracting the processor idle rate from the running state information;
And detecting whether the idle rate of the processor is smaller than a preset idle rate, and judging whether the running state information accords with the working state condition according to a detection result.
Preferably, when the running state information does not meet the working state condition, the step of uploading the pulse modulation data to a cloud server, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result specifically includes:
And uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state condition, so that the cloud server obtains a corresponding cloud space characteristic value, and performing voice recognition processing on the pulse modulation data when detecting that the cloud space characteristic value accords with a preset storage condition to obtain a voice recognition result.
Preferably, the step of selecting a target smart device from the plurality of smart devices according to the voice recognition result to respond to the voice recognition result specifically includes:
selecting corresponding target intelligent equipment from the plurality of intelligent equipment according to the voice recognition result;
and storing the voice recognition result into a memory of the target intelligent device so that the target intelligent device responds to the voice recognition result.
In addition, to achieve the above object, the present invention also proposes a multi-device-based speech processing system, which is characterized in that the system includes:
The data acquisition module is used for acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
the state acquisition module is used for acquiring equipment identifiers of a plurality of intelligent equipment which are associated in advance and acquiring running state information of each intelligent equipment according to the equipment identifiers;
The voice recognition module is used for carrying out voice recognition processing on the pulse modulation data through the cloud server according to the running state information to obtain a voice recognition result;
And the voice response module is used for selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
Preferably, the voice recognition module is further configured to determine whether the running state information meets a working state condition;
And the voice recognition module is further used for uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
Preferably, the running state information comprises a processor occupancy rate and/or a processor idle rate;
The voice recognition module is further used for extracting the processor occupancy rate from the running state information;
the voice recognition module is also used for detecting whether the occupancy rate of the processor is larger than a preset occupancy rate and judging whether the running state information accords with a working state condition according to a detection result;
the voice recognition module is further used for extracting the idle rate of the processor from the running state information;
The voice recognition module is also used for detecting whether the idle rate of the processor is smaller than a preset idle rate and judging whether the running state information accords with the working state condition according to a detection result.
Preferably, the voice recognition module is further configured to upload the pulse modulation data to a cloud server when the running state information does not meet the working state condition, so that the cloud server obtains a corresponding cloud space feature value, and perform voice recognition processing on the pulse modulation data when detecting that the cloud space feature value meets a preset storage condition, to obtain a voice recognition result.
Preferably, the voice response module is further configured to select a corresponding target smart device from the plurality of smart devices according to the voice recognition result;
the voice response module is further configured to store the voice recognition result in a memory of the target intelligent device, so that the target intelligent device responds to the voice recognition result.
According to the method, voice instruction information is obtained and the corresponding pulse code modulation data is extracted from it; the device identifiers of a plurality of pre-associated intelligent devices are obtained, and the running state information of each intelligent device is obtained according to the device identifiers; voice recognition processing is performed on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result; and a target intelligent device is selected from the plurality of intelligent devices according to the voice recognition result to respond to it. This differs from the prior art, in which voice instruction information is recognized locally by the intelligent device, a corresponding control instruction is generated based on the recognized text, and the corresponding operation is executed based on that control instruction, so that both the accuracy and the response speed of voice interaction are limited, voice processing is slowed down, and the user's voice interaction experience suffers. The present method instead obtains the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices, performs voice recognition processing on the pulse code modulation data through the cloud server according to the running state information to obtain a voice recognition result, and then selects the corresponding target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to it. The intelligent devices and the cloud thereby work cooperatively, which improves the response speed of online voice processing and, in turn, the user's voice interaction experience.
Drawings
FIG. 1 is a flowchart of a first embodiment of a multi-device based speech processing method of the present invention;
FIG. 2 is a schematic diagram of the architecture of a smart device of a hardware runtime environment according to an embodiment of the present invention;
FIG. 3 is a flowchart of a second embodiment of a multi-device based speech processing method according to the present invention;
Fig. 4 is a block diagram of a first embodiment of a multi-device based speech processing system of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a first embodiment of a speech processing method based on multiple devices.
In this embodiment, the multi-device-based voice processing method includes the following steps:
S10, acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
It is easy to understand that the execution body of this embodiment is an intelligent control end, which can be understood as an interaction medium between a user and a plurality of intelligent devices. Referring to fig. 2, fig. 2 is a schematic structural diagram of the intelligent device in a hardware operation environment according to an embodiment of the present invention. As shown in fig. 2, the smart device may include a processor 1001, such as a central processing unit (Central Processing Unit, CPU) or a micro control unit (Microcontroller Unit, MCU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (WI-FI) interface). The memory 1005 may be a high-speed random access memory (Random Access Memory, RAM) or a stable non-volatile memory (Non-Volatile Memory, NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the structure shown in fig. 2 is not limiting of the smart device and may include more or fewer components than shown, or may combine certain components, or may be arranged in different components.
As shown in fig. 2, an operating system, a data storage module, a network communication module, a user interface module, and a multi-device-based voice processing program may be included in the memory 1005 as one type of storage medium.
In the intelligent device shown in fig. 2, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 may be arranged either in the intelligent device or in the intelligent control terminal. When they are arranged in the intelligent device, the intelligent control terminal calls the multi-device-based voice processing program stored in the memory 1005 through the processor 1001 in the intelligent device and executes the multi-device-based voice processing method provided by the embodiment of the invention; when they are arranged in the intelligent control terminal, the intelligent control terminal directly calls the multi-device-based voice processing program stored in the memory 1005 through its own processor 1001 and executes the multi-device-based voice processing method provided by the embodiment of the invention.
It should be noted that, when the intelligent control terminal obtains the voice instruction information, the corresponding pulse code modulation (Pulse Code Modulation, PCM) data may be extracted from it. The voice instruction information may be spoken by a user or generated through an input unit (such as a keyboard). In a specific implementation, the voice instruction information may first be converted into an analog signal that is continuous in time and value, which is then converted into a digital signal that is discrete in time and value before being transmitted in a channel. This can be understood as sampling the analog signal corresponding to the voice instruction information, then quantizing and encoding the amplitude of each sample to obtain the pulse code modulation data. The pulse code modulation data is then saved in the memory 00, which is used to store the pulse code modulation data.
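The sample/quantize/encode chain described above can be made concrete with a minimal sketch. The sine tone, sampling rate, and 16-bit depth are arbitrary assumptions used only to illustrate the quantization step; they are not prescribed by the text.

```python
import math

def pcm_encode(duration_s=0.01, rate=8000, freq=440.0, bits=16):
    """Sample an analog waveform (here a 440 Hz sine), then quantize each
    sample's amplitude to a signed integer code: the sample/quantize/encode
    chain applied to the voice instruction's analog signal."""
    full_scale = 2 ** (bits - 1) - 1          # 32767 for 16-bit PCM
    n = int(duration_s * rate)                # number of discrete-time samples
    samples = (math.sin(2 * math.pi * freq * t / rate) for t in range(n))
    return [int(round(s * full_scale)) for s in samples]

codes = pcm_encode()                          # 16-bit PCM codes for the tone
```

In practice the resulting codes, rather than a synthetic tone, would be the PCM data saved to the memory 00.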
Step S20, acquiring device identifiers of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifiers;
It is easy to understand that, when the intelligent control end obtains the pulse code modulation data corresponding to the voice instruction information, the intelligent control end can also obtain the device identifiers of a plurality of intelligent devices which are associated in advance and are running, and then control the intelligent device to start the state detection function so as to obtain the running state information of the intelligent device. The device identifier may be used to indicate the type of device, such as a television, humidifier, air conditioner, etc. The operating state information may be used to represent the state of a processor (e.g., a micro control unit) of the smart device, including but not limited to processor occupancy and processor idle rate.
In a specific implementation, when the intelligent device is controlled to start the state detection function, a corresponding interface identifier may be set to 1 and the identifier result saved in the memory 01. The intelligent control end then accesses the memory 01, obtains the identifier result of 1, obtains the device identifiers of the plurality of intelligent devices, and stores them respectively in the memory 11, the memory 22 and the memory 33; it sets the corresponding interface identifier to 1 and saves that identifier result in the memory 02. It then accesses the memory 02, obtains the identifier result of 1, accesses each memory in which a device identifier is stored, reads the device identifier, and queries the running state information of the corresponding intelligent device according to that device identifier.
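The flag-and-memory handshake described above can be mimicked with a dictionary standing in for the numbered memories. The memory names (01, 11, 22, 33, 02) follow the text; everything else, including the fixed state readings, is assumed purely for illustration.

```python
# Dictionary standing in for the numbered memories used in the text.
memory = {}

def start_state_detection(device_ids):
    """Device side: raise the interface flag and publish the identifiers."""
    memory["01"] = 1                          # flag: state detection started
    for slot, dev in zip(("11", "22", "33"), device_ids):
        memory[slot] = dev                    # one device identifier per slot
    memory["02"] = 1                          # flag: identifiers are ready

def query_running_state():
    """Control-end side: check the flag, then read state per identifier."""
    if memory.get("02") != 1:
        return {}
    # Hypothetical fixed readings keyed by the stored identifiers.
    return {memory[s]: {"occupancy": 0.25}
            for s in ("11", "22", "33") if s in memory}

start_state_detection(["television", "humidifier", "air conditioner"])
states = query_running_state()
```

A real system would replace the fixed occupancy reading with an actual query to each device's processor.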
Step S30, performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
It is easy to understand that after the operation state information of the intelligent device is obtained, in order to improve the voice interaction experience of the user, when the pulse modulation data is uploaded to the cloud server, the cloud server is controlled to perform voice recognition processing on the pulse modulation data according to the operation state information, and a voice recognition result is obtained. The speech recognition result includes, but is not limited to, characteristic information related to voice such as voiceprint characteristics, speech speed, frequency, duration, emotion and the like, and text information obtained by semantic recognition in the speech recognition process, and the text information can be understood as characteristic information related to text including, but not limited to, text length, specific characters, ease of semantic parsing and the like.
In a specific implementation, in order to increase the response speed of online voice processing, only semantic recognition may be performed on the pulse modulation data during voice recognition processing, yielding a voice recognition result containing only text information; a target intelligent device is then selected from the plurality of intelligent devices according to the text information, and the text information is stored in the memory of the target intelligent device so that the target intelligent device responds to the voice recognition result.
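This speed-oriented path, semantic recognition only followed by dispatch of the text to the target device's memory, might look like the following sketch. The cloud recognition call is a stub returning a fixed string, and the device memory is modelled as a plain dictionary.

```python
def semantic_recognize(pcm):
    """Stand-in for the cloud's semantic-recognition-only step."""
    return "television volume up"

def fast_path(pcm, devices):
    """Run semantic recognition only, keep just the text, and place it in
    the matching device's memory (modelled here as a dict)."""
    text = semantic_recognize(pcm)
    target = next((d for d in devices if d in text), None)
    inbox = {}
    if target is not None:
        inbox[target] = text                  # 'memory of the target device'
    return inbox

box = fast_path([1, 2, 3], ["television", "humidifier"])
```

Skipping voiceprint, emotion, and other feature extraction is what buys the latency reduction; only the text needed for dispatch is produced.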
Step S40, selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
It should be noted that after the voice recognition result is obtained, the device identifier (such as the device name) may be extracted from the text information in the voice recognition result, then, the corresponding target smart device is selected from the plurality of smart devices according to the device identifier, and the voice recognition result is stored in the memory of the target smart device, so that the target smart device responds to the voice recognition result.
In a specific implementation, in order to improve the voice interaction experience of the user, feature information such as voiceprint features may be extracted from the voice recognition result to determine the initiator of the voice instruction information; the corresponding device preference settings are then obtained according to the initiator's identity, and the response is further optimized according to those settings. The device preference settings record, in a preset instruction database, the initiator's usage settings for each intelligent device. For example, if the target intelligent device is determined to be the air conditioner according to the voice recognition result, and the initiator's usage setting for the air conditioner is 26 °C, then when the voice instruction "turn on the air conditioner" is received from that initiator, the air conditioner is started according to the initiator's identity and its start-up temperature is set to 26 °C.
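The preference-based optimization can be sketched as a lookup keyed by the initiator's identity. Resolving the voiceprint to an identity is stubbed out as a plain field, and the 26 °C air-conditioner preference mirrors the example in the text; all other names are hypothetical.

```python
def respond_with_preference(result, preferences):
    """Build the response action, then apply the initiator's stored
    preference for the target device (e.g. a start-up temperature)."""
    user = result["voiceprint_identity"]      # stub for voiceprint resolution
    device = result["target_device"]
    action = {"device": device, "command": "power_on"}
    pref = preferences.get(user, {}).get(device)
    if pref is not None:
        action.update(pref)                   # preference overrides defaults
    return action

prefs = {"alice": {"air conditioner": {"temperature_c": 26}}}
result = {"voiceprint_identity": "alice", "target_device": "air conditioner"}
action = respond_with_preference(result, prefs)
```

With no stored preference the action falls back to the bare power-on command, so the lookup is a pure refinement of the response.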
It should be understood that the foregoing is illustrative only and is not limiting, and that in specific applications, those skilled in the art may set the invention as desired, and the invention is not limited thereto.
In this embodiment, voice instruction information is obtained and the corresponding pulse code modulation data is extracted from it; the device identifiers of a plurality of pre-associated intelligent devices are obtained, and the running state information of each intelligent device is obtained according to the device identifiers; voice recognition processing is performed on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result; and a target intelligent device is selected from the plurality of intelligent devices according to the voice recognition result to respond to it. In the prior art, voice instruction information is recognized locally by the intelligent device, a corresponding control instruction is generated based on the recognized text, and the corresponding operation is then executed based on that control instruction to respond to the voice instruction information, so that both the accuracy and the response speed of voice interaction are limited, voice processing is slowed down, and the user's voice interaction experience suffers. This embodiment instead obtains the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices, performs voice recognition processing on the pulse code modulation data through the cloud server according to the running state information to obtain a voice recognition result, and then selects the corresponding target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to it. The intelligent devices and the cloud thereby work cooperatively, which improves the response speed of online voice processing and, in turn, the user's voice interaction experience.
Referring to fig. 3, fig. 3 is a flow chart of a second embodiment of a multi-device-based speech processing method according to the present invention.
Based on the first embodiment, in this embodiment, the step S30 includes:
Step S301, judging whether the running state information accords with a working state condition;
It should be noted that the running state information may be used to represent a state of a processor (e.g., a micro control unit) of the smart device, including, but not limited to, a processor occupancy rate and a processor idle rate. In a specific implementation, the processor occupancy rate may be extracted from the running state information, and then whether the running state information meets the working state condition may be determined according to whether the processor occupancy rate is greater than a preset occupancy rate, where the preset occupancy rate may be set according to actual requirements, for example, 80%, which is not limited in this embodiment. And/or extracting the processor idle rate from the running state information, judging whether the running state information accords with the working state condition according to whether the processor idle rate is smaller than a preset idle rate, wherein the preset idle rate can be set according to actual requirements, such as 10%, and the embodiment is not limited to the above.
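The occupancy/idle check just described can be expressed as a small predicate. The 80% and 10% thresholds are the examples given above; treating a missing reading as "not busy" is an added assumption.

```python
PRESET_OCCUPANCY = 0.80   # example preset occupancy rate from the text
PRESET_IDLE = 0.10        # example preset idle rate from the text

def meets_working_state(state):
    """True when the device processor is busy: occupancy rate above the
    preset occupancy and/or idle rate below the preset idle rate. A
    missing reading is treated as 'not busy' (an added assumption)."""
    busy_by_occupancy = state.get("occupancy", 0.0) > PRESET_OCCUPANCY
    busy_by_idle = state.get("idle", 1.0) < PRESET_IDLE
    return busy_by_occupancy or busy_by_idle
```

Under step S302, uploading to the cloud server would then happen only for devices where this predicate returns False.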
In a specific implementation, when the running state information of an intelligent device (such as a television) is queried, if the processor occupancy rate of the television is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the television is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is saved to the memory 02; otherwise the corresponding interface identifier is set to 2 and the identifier result is saved to the memory 03. The intelligent control terminal accesses the memory 03 to obtain the identifier result of 2, then accesses the memory 00 to obtain the PCM data, controls the television to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and saves the identifier result in the memory 04.
In another implementation, when the running state information of an intelligent device (such as a humidifier) is queried, if the processor occupancy rate of the humidifier is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the humidifier is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is saved to the memory 05; otherwise the corresponding interface identifier is set to 2 and the identifier result is saved to the memory 05. The intelligent control end accesses the memory 05 to obtain the identifier result of 2, then accesses the memory 00 to obtain the PCM data, controls the humidifier to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and saves the identifier result in the memory 06.
In another implementation, when the running state information of an intelligent device (such as an air conditioner) is queried, if the processor occupancy rate of the air conditioner is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the air conditioner is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is saved to the memory 07; otherwise the corresponding interface identifier is set to 2 and the identifier result is saved to the memory 07. The intelligent control end accesses the memory 07 to obtain the identifier result of 2, then accesses the memory 00 to obtain the PCM data, controls the air conditioner to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and saves the identifier result in the memory 08.
The above operations are repeated to traverse the other intelligent devices, obtain the running state information of each intelligent device, and judge whether each device's running state information meets the working state condition.
Step S302, uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state condition, so that the cloud server carries out voice recognition processing on the pulse modulation data to obtain a voice recognition result.
It is easy to understand that when the running state information does not meet the working state condition (for example, the processor occupancy rate is less than or equal to 80% and the processor idle rate is greater than or equal to 10%, that is, the processor is not in a busy state), the pulse modulation data can be uploaded to the cloud server, so that the cloud server obtains the corresponding cloud space feature value and, when it detects that the cloud space feature value meets the preset storage condition, performs voice recognition processing on the pulse modulation data to obtain a voice recognition result; the corresponding target intelligent device is then selected from the plurality of intelligent devices according to the voice recognition result to respond to it. The cloud space feature value can be understood as a value representing the usage of the storage space corresponding to each intelligent device in the cloud server, including but not limited to the cloud space occupancy rate and the cloud space idle rate. Accordingly, the preset storage condition may be set to determine whether the cloud space occupancy rate is greater than a preset cloud occupancy rate, which may be set according to actual requirements, such as 60% (this embodiment is not limited thereto), or to determine whether the cloud space idle rate is less than a preset cloud idle rate, which may likewise be set according to actual requirements, such as 40% (this embodiment is not limited thereto).
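One plausible reading of the preset storage condition, following the per-device examples below (recognize when the cloud-space idle rate is above 60%, defer when it is below 40%), is the following sketch. The text does not specify behaviour in the 40%-60% band; this sketch defers there as well, which is an added assumption.

```python
CLOUD_IDLE_HIGH = 0.60    # recognize when the idle rate is above this
CLOUD_IDLE_LOW = 0.40     # defer when the idle rate is below this

def storage_condition(cloud_idle_rate):
    """Map a device's cloud-space idle rate to an action. The 40%-60%
    band is unspecified in the text; this sketch defers in it too."""
    if cloud_idle_rate > CLOUD_IDLE_HIGH:
        return "recognize"
    return "defer"
```

Deferred devices would set their interface identifier and wait, as in the per-device implementations described next.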
In a specific implementation, when processing the running state information of an intelligent device such as a television, the intelligent control end accesses memory 04 and obtains an identification result of 1, then controls the cloud server to obtain the PCM data corresponding to the television so that the cloud server obtains the cloud space feature value corresponding to the television, such as the cloud space idle rate. If the cloud idle rate corresponding to the television is greater than 60%, the intelligent control end controls the cloud server to perform semantic identification on the PCM data corresponding to the television and stores the obtained text information in memory 44 so that the television executes the corresponding operation in response to the text information; if the cloud idle rate corresponding to the television is less than 40%, it sets the corresponding interface identifier to 1 and stores the identification result in memory 08.
In another implementation, when processing the running state information of an intelligent device such as a humidifier, the intelligent control end accesses memory 06 and obtains an identification result of 1, then controls the cloud server to obtain the PCM data corresponding to the humidifier so that the cloud server obtains the cloud space feature value corresponding to the humidifier, such as the cloud space idle rate. If the cloud idle rate corresponding to the humidifier is greater than 60%, the intelligent control end controls the cloud server to perform semantic identification on the PCM data corresponding to the humidifier and stores the obtained text information in memory 55 so that the humidifier executes the corresponding operation in response to the text information; if the cloud idle rate corresponding to the humidifier is less than 40%, it sets the corresponding interface identifier to 1 and stores the identification result in memory 09.
In another implementation, when processing the running state information of an intelligent device such as an air conditioner, the intelligent control end accesses memory 08 and obtains an identification result of 1, then controls the cloud server to obtain the PCM data corresponding to the air conditioner so that the cloud server obtains the cloud space feature value corresponding to the air conditioner, such as the cloud space idle rate. If the cloud idle rate corresponding to the air conditioner is greater than 60%, the intelligent control end controls the cloud server to perform semantic identification on the PCM data corresponding to the air conditioner and stores the obtained text information in memory 66 so that the air conditioner executes the corresponding operation in response to the text information; if the cloud idle rate corresponding to the air conditioner is less than 40%, it sets the corresponding interface identifier to 1 and stores the identification result in memory 10.
Based on the above implementations, the intelligent control end can acquire the voice recognition result by accessing memory 44, memory 55, or memory 66; further, the above operation can be repeated to traverse the other intelligent devices and acquire the voice recognition result of each intelligent device.
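The traversal above can be sketched as a loop over per-device memory pairs: a flag memory holding the identification result and a text memory holding the recognition text. The memory ids (04/44, 06/55, 08/66) follow this embodiment's examples; the dictionary-based memory model is an illustrative assumption, not the patent's actual storage layout.

```python
# Illustrative sketch: map each device's flag memory to its text memory
# (ids taken from the television / humidifier / air-conditioner examples).
FLAG_TO_TEXT_MEMORY = {"04": "44", "06": "55", "08": "66"}

def collect_results(memories: dict) -> dict:
    """Traverse the devices; gather recognition text for flagged devices."""
    results = {}
    for flag_mem, text_mem in FLAG_TO_TEXT_MEMORY.items():
        if memories.get(flag_mem) == 1:       # identification result is 1
            text = memories.get(text_mem)     # semantic-identification text
            if text is not None:
                results[flag_mem] = text
    return results
```

A device whose flag memory holds 1 but whose text memory is still empty (cloud busy) is simply skipped on this pass and picked up on a later traversal.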
It should be understood that the foregoing is illustrative only and is not limiting, and that in specific applications, those skilled in the art may set the invention as desired, and the invention is not limited thereto.
In this embodiment, it is determined whether the running state information meets the working state condition, and when it does not, the pulse modulation data is uploaded to a cloud server so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result. Unlike the prior art, which performs voice recognition only through a locally stored voice recognition database, this embodiment controls the cloud server to perform voice recognition on the pulse modulation data of each intelligent device according to that device's running state information, such as the processor occupancy rate and processor idle rate, thereby improving voice recognition efficiency and accuracy and, in turn, the response speed of online voice processing and the user's voice interaction experience.
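The dispatch rule summarized in this embodiment can be sketched as a single decision function: upload the PCM data to the cloud only when the device's running state does not meet the working state condition (i.e. the processor is not busy). The 80% occupancy and 10% idle thresholds are the example values given earlier; the function name is an assumption for illustration.

```python
# Minimal sketch of the working-state check: a device is "busy" when its
# processor occupancy exceeds the preset occupancy OR its idle rate falls
# below the preset idle rate; only non-busy devices are routed to the cloud.
def should_upload(occupancy: float, idle: float,
                  preset_occupancy: float = 0.80,
                  preset_idle: float = 0.10) -> bool:
    """Return True when the PCM data should be uploaded to the cloud server."""
    busy = occupancy > preset_occupancy or idle < preset_idle
    return not busy
```

So a device at 50% occupancy with 20% idle capacity is uploaded, while one at 90% occupancy is handled otherwise.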
Referring to fig. 4, fig. 4 is a block diagram illustrating a first embodiment of a multi-device based speech processing system according to the present invention.
As shown in fig. 4, a multi-device-based speech processing system according to an embodiment of the present invention includes:
the data acquisition module 10 is used for acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
The state acquisition module 20 is configured to acquire device identifiers of a plurality of intelligent devices associated in advance, and acquire operation state information of each intelligent device according to the device identifiers;
The voice recognition module 30 is configured to perform voice recognition processing on the pulse modulation data through a cloud server according to the running state information, so as to obtain a voice recognition result;
and the voice response module 40 is configured to select a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
In this embodiment, voice instruction information is obtained and the corresponding pulse code modulation data is extracted from it; device identifiers of a plurality of pre-associated intelligent devices are obtained, and the running state information of each intelligent device is obtained according to the device identifiers; voice recognition processing is performed on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result; and a target intelligent device is selected from the plurality of intelligent devices according to the voice recognition result to respond to it. In the prior art, the intelligent device first recognizes the voice instruction information locally, then generates a corresponding control instruction based on the recognized text, and then executes the corresponding operation based on the control instruction to respond to the voice instruction information; this limits the accuracy and response speed of voice interaction, slows down voice processing, and degrades the user's voice interaction experience. In contrast, this embodiment obtains the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices, performs voice recognition processing on the pulse code modulation data through the cloud server according to the running state information to obtain a voice recognition result, and then selects a corresponding target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to it, thereby realizing collaborative work between the intelligent devices and the cloud, improving the response speed of online voice processing, and further improving the user's voice interaction experience.
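The four modules of fig. 4 can be arranged schematically as a small pipeline. All class and method names, and the `CloudStub`, are illustrative assumptions; the patent describes the modules only functionally.

```python
# Schematic sketch of the fig. 4 pipeline: data acquisition (10), state
# acquisition (20), cloud voice recognition (30), voice response (40).
class CloudStub:
    """Stand-in for the cloud server; returns a canned recognition text."""
    def recognize(self, pcm: bytes) -> str:
        return "turn on the television"

class SpeechPipeline:
    def __init__(self, cloud):
        self.cloud = cloud

    def acquire_pcm(self, voice_instruction: bytes) -> bytes:
        return voice_instruction                   # data acquisition module 10

    def acquire_states(self, device_ids):          # state acquisition module 20
        return {d: {"occupancy": 0.3, "idle": 0.5} for d in device_ids}

    def recognize(self, pcm: bytes, states: dict) -> str:
        return self.cloud.recognize(pcm)           # voice recognition module 30

    def respond(self, result: str, device_ids):    # voice response module 40
        return next(d for d in device_ids if d in result)
```

Running the pipeline on a sample instruction selects the device named in the recognition result as the target.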
Based on the above-mentioned first embodiment of the multi-device based speech processing system of the present invention, a second embodiment of the multi-device based speech processing system of the present invention is presented.
In this embodiment, the voice recognition module 30 is further configured to determine whether the running state information meets a working state condition;
The voice recognition module 30 is further configured to upload the pulse modulation data to a cloud server when the running state information does not meet the working state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
The running state information comprises a processor occupancy rate and/or a processor idle rate;
The voice recognition module 30 is further configured to extract the processor occupancy rate from the running state information;
the voice recognition module 30 is further configured to detect whether the processor occupancy rate is greater than a preset occupancy rate, and determine whether the running state information meets a working state condition according to a detection result;
The speech recognition module 30 is further configured to extract the processor idle rate from the running state information;
The voice recognition module 30 is further configured to detect whether the idle rate of the processor is less than a preset idle rate, and determine whether the running state information meets a working state condition according to a detection result.
The voice recognition module 30 is further configured to upload the pulse modulation data to a cloud server when the running state information does not meet the working state condition, so that the cloud server obtains a corresponding cloud space feature value, and perform voice recognition processing on the pulse modulation data when detecting that the cloud space feature value meets a preset storage condition, so as to obtain a voice recognition result.
The voice response module 40 is further configured to select a corresponding target smart device from the plurality of smart devices according to the voice recognition result;
the voice response module 40 is further configured to store the voice recognition result in the memory of the target smart device, so that the target smart device responds to the voice recognition result.
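The response step handled by module 40 can be sketched as selecting the target device and writing the recognition result into that device's memory. The keyword-matching selection below is an assumption for illustration; the patent only states that a target device is selected "according to" the voice recognition result.

```python
# Hedged sketch of voice response module 40: pick the device mentioned in the
# recognition result and store the result in its memory so it can respond.
def respond(result: str, device_memories: dict):
    for device, memory in device_memories.items():
        if device in result:          # naive keyword match (assumption)
            memory.append(result)     # store result in the device's memory
            return device
    return None                       # no target device matched
```

The device then reads its memory and executes the corresponding operation.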
Other embodiments or specific implementations of the multi-device-based speech processing system of the present invention may refer to the above method embodiments, and are not described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is preferred. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., read-only memory/random-access memory, magnetic disk, or optical disk) and comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods of the embodiments of the present invention.
The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structure or equivalent process based on the disclosure herein, employed directly or indirectly in other related technical fields, is likewise covered.
Claims (8)
1. A multi-device based speech processing method, the method comprising the steps of:
Acquiring voice instruction information and extracting corresponding pulse modulation data from the voice instruction information;
acquiring device identifiers of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifiers;
Performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
Selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result;
the step of performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result specifically comprises the following steps:
judging whether each piece of running state information accords with a working state condition or not, wherein the running state information comprises a processor occupancy rate and/or a processor idle rate;
and uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state conditions, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
2. The method of claim 1, wherein the step of determining whether each of the running state information meets the working state condition comprises:
Extracting the processor occupancy rate from each piece of running state information;
detecting whether the occupancy rate of the processor is larger than a preset occupancy rate or not, and judging whether the running state information accords with a working state condition or not according to a detection result;
And/or;
Extracting the processor idle rate from each piece of running state information;
And detecting whether the idle rate of the processor is smaller than a preset idle rate, and judging whether the running state information accords with the working state condition according to a detection result.
3. The method of claim 1, wherein when each of the running state information does not meet the working state condition, the step of uploading the pulse modulation data to a cloud server to enable the cloud server to perform voice recognition processing on the pulse modulation data to obtain a voice recognition result specifically comprises:
uploading the pulse modulation data to a cloud server when each piece of running state information does not accord with the working state conditions, so that the cloud server obtains a corresponding cloud space characteristic value, and performing voice recognition processing on the pulse modulation data when the cloud space characteristic value is detected to accord with preset storage conditions, so as to obtain a voice recognition result;
the cloud space characteristic value is a characteristic value representing the use condition of the storage space corresponding to each intelligent device in the cloud server, and includes, but is not limited to, the occupancy rate of the cloud space and the idle rate of the cloud space.
4. The method of claim 1, wherein the step of selecting a target smart device from the plurality of smart devices based on the speech recognition result to respond to the speech recognition result comprises:
selecting corresponding target intelligent equipment from the plurality of intelligent equipment according to the voice recognition result;
and storing the voice recognition result into a memory of the target intelligent device so that the target intelligent device responds to the voice recognition result.
5. A multi-device based speech processing system, the system comprising:
the data acquisition module is used for acquiring voice instruction information and extracting corresponding pulse modulation data from the voice instruction information;
the state acquisition module is used for acquiring equipment identifiers of a plurality of intelligent equipment which are associated in advance and acquiring running state information of each intelligent equipment according to the equipment identifiers;
The voice recognition module is used for carrying out voice recognition processing on the pulse modulation data through the cloud server according to the running state information to obtain a voice recognition result;
The voice response module is used for selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result;
the voice recognition module is further configured to:
judging whether each piece of running state information accords with a working state condition or not, wherein the running state information comprises a processor occupancy rate and/or a processor idle rate;
and uploading the pulse modulation data to a cloud server when the running state information does not accord with the working state conditions, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
6. The system of claim 5, wherein the speech recognition module is further configured to extract the processor occupancy from each of the operational state information;
the voice recognition module is also used for detecting whether the occupancy rate of the processor is larger than a preset occupancy rate and judging whether the running state information accords with a working state condition according to a detection result;
the voice recognition module is further used for extracting the idle rate of the processor from each piece of running state information;
The voice recognition module is also used for detecting whether the idle rate of the processor is smaller than a preset idle rate and judging whether the running state information accords with the working state condition according to a detection result.
7. The system of claim 5, wherein the voice recognition module is further configured to upload the pulse modulation data to a cloud server when each of the running state information does not meet the working state condition, so that the cloud server obtains a corresponding cloud space feature value, and perform voice recognition processing on the pulse modulation data when the cloud space feature value is detected to meet a preset storage condition, so as to obtain a voice recognition result;
the cloud space characteristic value is a characteristic value representing the use condition of the storage space corresponding to each intelligent device in the cloud server, and includes, but is not limited to, the occupancy rate of the cloud space and the idle rate of the cloud space.
8. The system of claim 5, wherein the voice response module is further configured to select a corresponding target smart device from the plurality of smart devices based on the voice recognition result;
the voice response module is further configured to store the voice recognition result in a memory of the target intelligent device, so that the target intelligent device responds to the voice recognition result.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011501007.3A CN112700780B (en) | 2020-12-17 | 2020-12-17 | Voice processing method and system based on multiple devices |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011501007.3A CN112700780B (en) | 2020-12-17 | 2020-12-17 | Voice processing method and system based on multiple devices |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112700780A CN112700780A (en) | 2021-04-23 |
| CN112700780B true CN112700780B (en) | 2025-04-08 |
Family
ID=75508897
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202011501007.3A Active CN112700780B (en) | 2020-12-17 | 2020-12-17 | Voice processing method and system based on multiple devices |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112700780B (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113759869B (en) * | 2021-08-16 | 2024-04-02 | 深圳Tcl新技术有限公司 | Intelligent household appliance testing method and device |
| CN114244879A (en) * | 2021-12-15 | 2022-03-25 | 北京声智科技有限公司 | An industrial control system, industrial control method and electronic device |
| CN118609564B (en) * | 2024-06-22 | 2025-08-12 | 箭牌家居集团股份有限公司 | Voice control method, server, intelligent home system and computer readable storage medium |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109920417A (en) * | 2019-02-18 | 2019-06-21 | 广州视源电子科技股份有限公司 | Voice processing method, device, equipment and storage medium |
| CN111404998A (en) * | 2020-02-27 | 2020-07-10 | 北京三快在线科技有限公司 | Voice interaction method, first electronic device and readable storage medium |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103971687B (en) * | 2013-02-01 | 2016-06-29 | 腾讯科技(深圳)有限公司 | Implementation of load balancing in a kind of speech recognition system and device |
| US10997251B2 (en) * | 2018-10-15 | 2021-05-04 | Bao Tran | Smart device |
| KR102280690B1 (en) * | 2019-08-15 | 2021-07-22 | 엘지전자 주식회사 | Intelligent voice outputting method, apparatus, and intelligent computing device |
| CN111583928A (en) * | 2020-05-09 | 2020-08-25 | 宁波奥克斯电气股份有限公司 | Equipment control method and related device |
| CN111862972B (en) * | 2020-07-08 | 2023-11-14 | 北京梧桐车联科技有限责任公司 | Voice interaction service method, device, equipment and storage medium |
| CN111817936A (en) * | 2020-08-12 | 2020-10-23 | 深圳市欧瑞博科技股份有限公司 | Control method and device of intelligent household equipment, electronic equipment and storage medium |
- 2020-12-17: CN application CN202011501007.3A granted as patent CN112700780B/en (status: Active)
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109920417A (en) * | 2019-02-18 | 2019-06-21 | 广州视源电子科技股份有限公司 | Voice processing method, device, equipment and storage medium |
| CN111404998A (en) * | 2020-02-27 | 2020-07-10 | 北京三快在线科技有限公司 | Voice interaction method, first electronic device and readable storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112700780A (en) | 2021-04-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112700780B (en) | Voice processing method and system based on multiple devices | |
| US10971156B2 (en) | Method, interaction device, server, and system for speech recognition | |
| CN104346127B (en) | Implementation method, device and the terminal of phonetic entry | |
| CN106504743B (en) | Voice interaction output method for intelligent robot and robot | |
| CN105702253A (en) | Voice awakening method and device | |
| CN112151034B (en) | Voice control method and device of equipment, electronic equipment and storage medium | |
| CN111897601B (en) | Application startup method, device, terminal device and storage medium | |
| CN114708856A (en) | Voice processing method and related equipment thereof | |
| CN113158692A (en) | Multi-intention processing method, system, equipment and storage medium based on semantic recognition | |
| CN115840806A (en) | Method and related device for acquiring plot information based on natural language interaction | |
| CN110808031A (en) | Voice recognition method and device and computer equipment | |
| CN111627431A (en) | Voice recognition method, device, terminal and storage medium | |
| CN105529025B (en) | Voice operation input method and electronic equipment | |
| CN105162836A (en) | Method for executing speech communication, server and intelligent terminal equipment | |
| CN117334196A (en) | Control method, device, equipment and storage medium | |
| CN112836548A (en) | A document operation method, apparatus, device and storage medium | |
| CN106371905B (en) | Application program operation method and device and server | |
| EP3059731A1 (en) | Method and apparatus for automatically sending multimedia file, mobile terminal, and storage medium | |
| CN110660393B (en) | Voice interaction method, device, equipment and storage medium | |
| CN118942454A (en) | Terminal device and voice command response method | |
| CN111464644B (en) | Data transmission method and electronic equipment | |
| WO2023082891A1 (en) | Control method and apparatus for voice air conditioner, voice air conditioner, and storage medium | |
| CN108735214A (en) | The sound control method and device of equipment | |
| CN115662422A (en) | Voice interaction method and device, electronic equipment and readable storage medium | |
| CN114822529A (en) | Method and device for voice control of air conditioner, air conditioner and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |