CN115116441B

CN115116441B - Method, device and equipment for waking up voice recognition function

Info

Publication number: CN115116441B
Application number: CN202210735039.2A
Authority: CN
Inventors: 袁瑾; 肖踞雄; 朱凌; 王娜
Original assignee: Nanjing Dayu Semiconductor Co ltd
Current assignee: Nanjing Dayu Semiconductor Co ltd
Priority date: 2022-06-27
Filing date: 2022-06-27
Publication date: 2024-10-22
Anticipated expiration: 2042-06-27
Also published as: CN115116441A

Abstract

The application provides a wake-up method, device and equipment for a voice recognition function, and relates to the field of voice detection. The method comprises: obtaining an activity detection result of a voice signal, wherein the activity detection result comprises a plurality of interrupt signals generated by detecting the voice signal a plurality of times; counting the plurality of interrupt signals; judging, according to the statistical result, whether the activity detection result meets a preset effective interrupt condition; and, if the activity detection result meets the preset effective interrupt condition, determining that the voice signal is in an active state and waking up the voice recognition function to recognize the collected voice signal. Because the interrupt signals are evaluated to confirm that the voice signal is active before the voice recognition function is awakened, part of the misrecognized information is filtered out, which reduces the false recognition rate of the VAD module and the power consumption of voice recognition.

Description

Method, device and equipment for waking up voice recognition function

Technical Field

The present invention relates to the field of voice detection, and in particular, to a method, an apparatus, and a device for waking up a voice recognition function.

Background

With the rapid development of Bluetooth headsets, their excellent user experience is favored by more and more people, and as a result more functions, such as voice recognition, are being integrated into the headsets.

However, since the compact body of a Bluetooth headset can only accommodate a small battery, stringent requirements are placed on power consumption during development. The voice recognition function requires a large amount of computation, so power consumption becomes a difficult problem when the voice recognition function runs on the headset. To address this problem, a VAD (Voice Activity Detection) module is generally added at the front end of voice recognition, so that the voice recognition module normally stays in a standby state and is only switched on to start working after the VAD module detects active voice information, thereby achieving low power consumption.

Since the VAD module is always running, it must also consume little power. Low power consumption means that it cannot take on complex computation, and in order not to reduce the overall recognition rate the VAD module must be highly sensitive. As a result, invalid voice is also detected, which gives the VAD module a relatively high false recognition rate.

Disclosure of Invention

The invention aims to provide a wake-up method, device and equipment for voice recognition function, aiming at the defects in the prior art, so as to solve the problems of high false recognition rate of a VAD module and the like in the prior art.

In order to achieve the above purpose, the technical scheme adopted by the embodiment of the application is as follows:

In a first aspect, an embodiment of the present application provides a method for waking up a voice recognition function, where the method includes:

acquiring an activity detection result of a voice signal, wherein the activity detection result comprises a plurality of interrupt signals generated by detecting the voice signal a plurality of times;

counting the plurality of interrupt signals;

judging whether the activity detection result meets a preset effective interrupt condition according to the statistical result;

if the activity detection result meets the preset effective interrupt condition, determining that the voice signal is in an active state, and waking up a voice recognition function to recognize the collected voice signal.

Optionally, the counting the plurality of interrupt signals includes:

counting the number of interrupt signals within a preset duration;

and the judging whether the activity detection result meets a preset effective interrupt condition according to the statistical result includes:

judging whether the activity detection result meets the preset effective interrupt condition according to the number of interrupt signals within the preset duration.

Optionally, the judging, according to the number of interrupt signals within the preset duration, whether the activity detection result meets the preset effective interrupt condition includes:

judging whether the number of interrupt signals within the preset duration reaches a preset interrupt number threshold;

if the number of interrupt signals within the preset duration reaches the preset interrupt number threshold, determining that the activity detection result meets the preset effective interrupt condition;

if the number of interrupt signals within the preset duration does not reach the preset interrupt number threshold, determining that the activity detection result does not meet the preset effective interrupt condition.

Optionally, the judging whether the number of interrupt signals within the preset duration reaches a preset interrupt number threshold includes:

determining the preset interrupt number threshold according to a preset application scenario.

Optionally, the counting the plurality of interrupt signals includes:

determining, among the plurality of interrupt signals, at least two consecutive interrupt signals for which the time difference between adjacent interrupt signals is within a preset duration range;

determining a continuous interrupt duration according to the duration of the at least two consecutive interrupt signals;

and judging whether the activity detection result meets the preset effective interrupt condition according to the continuous interrupt duration.

Optionally, the judging, according to the continuous interrupt duration, whether the activity detection result meets the preset effective interrupt condition includes:

judging whether the continuous interrupt duration reaches a preset interrupt duration threshold;

if the continuous interrupt duration reaches the preset interrupt duration threshold, determining that the activity detection result meets the preset effective interrupt condition;

if the continuous interrupt duration does not reach the preset interrupt duration threshold, determining that the activity detection result does not meet the preset effective interrupt condition.

Optionally, the judging whether the continuous interrupt duration reaches a preset interrupt duration threshold includes:

determining the preset interrupt duration threshold according to a preset application scenario.

Optionally, the method further comprises:

if the activity detection result does not meet the preset effective interrupt condition, determining that the interrupt signals in the activity detection result are false-detection interrupt signals, and clearing the interrupt signals in the activity detection result.

In a second aspect, an embodiment of the present application provides a wake-up device for a speech recognition function, the device including:

an acquisition module, used for acquiring an activity detection result of a voice signal, wherein the activity detection result comprises a plurality of interrupt signals generated by detecting the voice signal a plurality of times;

a statistics module, used for counting the plurality of interrupt signals;

a judging module, used for judging whether the activity detection result meets a preset effective interrupt condition according to the statistical result;

and a determining module, used for determining that the voice signal is in an active state and waking up a voice recognition function to recognize the collected voice signal if the activity detection result meets the preset effective interrupt condition.

In a third aspect, an embodiment of the present application provides a speech processing apparatus, including a processor and a storage medium, wherein the processor is communicatively connected to the storage medium through a bus, the storage medium stores program instructions executable by the processor, and the processor calls the program instructions stored in the storage medium to execute the steps of the wake-up method for a voice recognition function according to any implementation of the first aspect.

Compared with the prior art, the application has the following beneficial effects:

The application provides a wake-up method, device and equipment for a voice recognition function. The method comprises: obtaining an activity detection result of a voice signal, wherein the activity detection result comprises a plurality of interrupt signals generated by detecting the voice signal a plurality of times; counting the plurality of interrupt signals; judging, according to the statistical result, whether the activity detection result meets a preset effective interrupt condition; and, if the activity detection result meets the preset effective interrupt condition, determining that the voice signal is in an active state and waking up the voice recognition function to recognize the collected voice signal. Because the interrupt signals are evaluated to confirm that the voice signal is active before the voice recognition function is awakened, part of the misrecognized information is filtered out, which reduces the false recognition rate of the VAD module and the power consumption of voice recognition.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a wake-up method for a voice recognition function according to an embodiment of the present application;

Fig. 2 is a flowchart of a method for counting and judging interrupt signals according to an embodiment of the present application;

Fig. 3 is a flowchart of a method for determining whether a preset effective interrupt condition is satisfied according to an embodiment of the present application;

Fig. 4 is a flowchart of another method for counting and judging interrupt signals according to an embodiment of the present application;

Fig. 5 is a flowchart of another method for determining whether a preset effective interrupt condition is satisfied according to an embodiment of the present application;

Fig. 6 is a schematic diagram of a wake-up device for a voice recognition function according to an embodiment of the present application;

Fig. 7 is a schematic diagram of a voice processing device according to an embodiment of the present application.

Reference numerals: 601-acquisition module, 602-statistics module, 603-judging module, 604-determining module, 701-processor, 702-storage medium.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.

Furthermore, the terms "first," "second," and the like, if any, are used merely for distinguishing between descriptions and not for indicating or implying a relative importance.

It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.

In the Bluetooth headset, a VAD module is added at the front end of the voice recognition module, so that the voice recognition module is in a standby state in a normal state, and the voice recognition module is opened to start working after the VAD module detects active voice information, thereby realizing low power consumption. In order to reduce the transmission of the detected ineffective voice to the voice recognition module by the VAD module, the application provides a wake-up method, a device and equipment of the voice recognition function so as to reduce the false recognition rate of the VAD module. The following describes a wake-up method of a voice recognition function provided by the present application through a specific embodiment.

Fig. 1 is a flowchart of a wake-up method for a voice recognition function according to an embodiment of the present application. As shown in Fig. 1, the method may be executed by a chip with computing capability, for example a system on chip (SoC), and the method includes:

S101, acquiring an activity detection result of a voice signal.

The activity detection result includes a plurality of interrupt signals generated by detecting the voice signal a plurality of times.

When the VAD module detects that an active voice signal exists, an interrupt pulse signal is generated and reported to the system chip. When the system chip receives the interrupt signal reported by the VAD module, the interrupt signal can be further processed.

S102, counting a plurality of interrupt signals.

After acquiring the activity detection result of the voice signal, the system chip can count the plurality of interrupt signals in it. Counting starts when the system chip receives the first interrupt signal and covers the interrupt signals received within a preset statistical duration. Because each interrupt signal is generated by the presence of a voice signal, counting the interrupt signals amounts to counting occurrences of the voice signal. A statistical result for the interrupt signals is thereby obtained.

S103, judging whether the activity detection result meets a preset effective interrupt condition according to the statistical result.

The statistics of the interrupt signal may be compared with a preset effective interrupt condition to determine whether the activity detection result meets the preset effective interrupt condition. The statistics of the interrupt signal characterizes the statistics of the voice data, so that whether the detected voice is in an active state can be determined according to the comparison result of the statistics of the interrupt signal and the preset effective interrupt condition. For example, a continuously emitted sound is in an active state, whereas a sudden transient noise is not in an active state.

S104, if the activity detection result meets the preset effective interrupt condition, determining that the voice signal is in an active state, and waking up a voice recognition function to recognize the collected voice signal.

If the activity detection result meets the preset effective interrupt condition, the detected voice can be determined to be valid voice, and the voice signal is determined to be in a continuously active state. A voice signal in a continuously active state is valid voice and can be recognized. Thus, once the voice signal is determined to be active, the voice recognition function can be awakened to recognize the collected voice signal. This avoids the case in which the detected voice signal is not actually active, so that no voice signal can be detected immediately after the voice recognition function is awakened, which would cause the function to be awakened frequently and increase power consumption. Because the active state of the voice signal is confirmed before the voice recognition function is awakened, the power consumption of voice recognition is reduced.

In summary, the wake-up method for a voice recognition function provided by the embodiment of the present application comprises: acquiring an activity detection result of a voice signal, wherein the activity detection result comprises a plurality of interrupt signals generated by detecting the voice signal a plurality of times; counting the plurality of interrupt signals; judging, according to the statistical result, whether the activity detection result meets a preset effective interrupt condition; and, if the activity detection result meets the preset effective interrupt condition, determining that the voice signal is in an active state and waking up the voice recognition function to recognize the collected voice signal. Because the interrupt signals are evaluated to confirm that the voice signal is active before the voice recognition function is awakened, part of the misrecognized information is filtered out, which reduces the false recognition rate of the VAD module and the power consumption of voice recognition.
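The flow of S101 to S104 can be illustrated with a minimal sketch. It is not the implementation of the application: the function name `handle_activity_result` and its parameters are hypothetical, and the concrete statistic and condition are passed in as callables because the text leaves them open at this point.

```python
from typing import Callable, Sequence


def handle_activity_result(
    interrupt_times_ms: Sequence[float],
    count_statistic: Callable[[Sequence[float]], float],
    meets_condition: Callable[[float], bool],
    wake_recognizer: Callable[[], None],
) -> bool:
    """Run S102 to S104 on an activity detection result obtained in S101.

    All names are hypothetical; the text does not define an API.
    """
    statistic = count_statistic(interrupt_times_ms)  # S102: count the interrupts
    if meets_condition(statistic):                   # S103: check the condition
        wake_recognizer()                            # S104: voice is active, wake up
        return True
    return False                                     # recognizer stays in standby
```

For example, passing `len` as the statistic and `lambda n: n >= 3` as the condition wakes the recognizer only when at least three interrupt signals were reported.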

Fig. 2 is a flowchart of a method for counting and judging interrupt signals according to an embodiment of the present application. As shown in Fig. 2, counting the plurality of interrupt signals in S102 includes:

S201, counting the number of interrupt signals within a preset duration.

The number of interrupt signals within the preset statistical duration is counted. Since an interrupt signal is issued whenever a voice signal is detected, the number of interrupt signals within the preset duration can represent the amount of voice activity within that duration.

It should be noted that the VAD module cannot buffer data. If the statistical duration is too long, too much voice data is lost, which hampers subsequent voice recognition; if it is too short, too few false interrupt signals are filtered out, so the voice recognition function is awakened frequently and power is wasted. The statistical duration therefore needs to be chosen according to the capabilities of the voice recognition algorithm employed.

In S103, according to the statistical result, determining whether the activity detection result meets a preset effective interrupt condition includes:

S202, judging whether the activity detection result meets a preset effective interrupt condition according to the number of interrupt signals in a preset duration.

The number of interrupt signals within the preset duration may be compared with a preset effective interrupt condition to determine whether the activity detection result meets the preset effective interrupt condition. The number of interrupt signals in the preset duration characterizes the activity of the voice signal in the preset duration, so that whether the detected voice is in an active state can be determined according to the comparison result of the number of interrupt signals in the preset duration and the preset effective interrupt condition.

To sum up, in this embodiment, the number of interrupt signals in the preset duration is counted; and judging whether the activity detection result meets the preset effective interrupt condition according to the number of interrupt signals in the preset duration. Therefore, by counting the number of interrupt signals in the preset duration, whether the activity detection result meets the preset effective interrupt condition can be accurately judged.

Fig. 3 is a flowchart of a method for determining whether a preset valid interrupt condition is satisfied according to an embodiment of the present application. As shown in fig. 3, in S202, determining whether the activity detection result meets the preset effective interrupt condition according to the number of interrupt signals in the preset duration includes:

S301, judging whether the number of interrupt signals in a preset duration reaches a preset interrupt number threshold.

If the number of the interrupt signals in the preset duration is greater than or equal to the preset interrupt number threshold, the number of the interrupt signals in the preset duration reaches the preset interrupt number threshold; if the number of the interrupt signals in the preset duration is smaller than the preset interrupt number threshold, the number of the interrupt signals in the preset duration does not reach the preset interrupt number threshold.

S302, if the number of interrupt signals in the preset duration reaches a preset interrupt number threshold, determining that the activity detection result meets a preset effective interrupt condition.

If the number of the interrupt signals in the preset duration reaches the preset interrupt number threshold, the number of the interrupt signals in the preset duration is enough, and the activity detection result is determined to meet the preset effective interrupt condition.

S303, if the number of interrupt signals in the preset duration does not reach the preset interrupt number threshold, determining that the activity detection result does not meet the preset effective interrupt condition.

If the number of interrupt signals within the preset duration does not reach the preset interrupt number threshold, the number of interrupt signals within the preset duration is small, which suggests that the voice detected within the preset duration may not be in an active state, and the preset effective interrupt condition is not met.

To sum up, in this embodiment, it is determined whether the number of interrupt signals within the preset duration reaches a preset interrupt number threshold; if the number of the interrupt signals in the preset duration reaches a preset interrupt number threshold, determining that the activity detection result meets a preset effective interrupt condition; if the number of the interrupt signals in the preset duration does not reach the preset interrupt number threshold, determining that the activity detection result does not meet the preset effective interrupt condition. Therefore, whether the preset effective interrupt condition is met or not is judged more accurately by comparing the interrupt signal quantity in the preset duration with the preset interrupt quantity threshold value.
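The count-based check of S201 and S301 to S303 can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: interrupt signals are assumed to be available as millisecond timestamps, the window is assumed to be the half-open interval starting at the first interrupt, and the function names are hypothetical.

```python
from typing import Sequence


def count_interrupts(timestamps_ms: Sequence[float], window_ms: float) -> int:
    """S201: count interrupts in [first, first + window_ms).

    Counting starts at the first interrupt received (assumed to be the
    earliest timestamp in the sequence).
    """
    if not timestamps_ms:
        return 0
    start = min(timestamps_ms)
    return sum(1 for t in timestamps_ms if t - start < window_ms)


def meets_count_condition(
    timestamps_ms: Sequence[float], window_ms: float, count_threshold: int
) -> bool:
    """S301 to S303: the condition is met when the count reaches (>=) the threshold."""
    return count_interrupts(timestamps_ms, window_ms) >= count_threshold
```

With timestamps `[0, 10, 20, 60]` and a 50 ms window, three interrupts fall inside the window, so a threshold of 3 is reached and a threshold of 4 is not.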

With continued reference to Fig. 3, determining in S301 whether the number of interrupt signals within the preset duration reaches the preset interrupt number threshold includes:

determining the preset interrupt number threshold according to a preset application scenario.

The specific preset interrupt number threshold may be set according to different preset application scenarios, which is not limited herein.

Fig. 4 is a flowchart of another method for counting and judging interrupt signals according to an embodiment of the present application. As shown in Fig. 4, counting the plurality of interrupt signals in S102 includes:

S401, determining at least two consecutive interrupt signals for which the time difference between adjacent interrupt signals is within a preset duration range.

Voice in the active state is in practice likely to be a series of continuous voice signals, and the corresponding interrupt signals are likewise a run of consecutive interrupt signals. Therefore, to judge whether the voice signal is valid, runs of consecutive interrupt signals are identified within the preset statistical duration; that is, among the plurality of interrupt signals, at least two consecutive interrupt signals are determined for which the time difference between adjacent interrupt signals is within the preset duration range.

S402, determining continuous interrupt duration according to duration time of at least two continuous interrupt signals.

Within such a run of at least two consecutive interrupt signals, every pair of adjacent interrupt signals is issued in succession, so the time intervals between adjacent interrupt signals are equal. The duration spanned by the run can therefore be determined, which gives the continuous interrupt duration.

Further, several such runs of at least two consecutive interrupt signals may occur within the preset duration range. In that case the duration of each run may be determined, and the maximum of these durations is taken as the continuous interrupt duration.

S403, judging whether the activity detection result meets the preset effective interrupt condition according to the continuous interrupt duration.

The continuous interrupt duration of the interrupt signal within the preset duration may be compared with the preset effective interrupt condition to determine whether the activity detection result satisfies the preset effective interrupt condition. The continuous interruption time in the preset time characterizes the activity of the voice signal in the preset time, so that whether the detected voice is in an active state can be determined according to the comparison result of the continuous interruption time in the preset time and the preset effective interruption condition.

In summary, in this embodiment, determining at least two consecutive interrupt signals, where a time difference between adjacent interrupt signals in the plurality of interrupt signals is within a preset duration range; determining continuous interrupt duration according to the duration of at least two continuous interrupt signals; and judging whether the activity detection result meets the preset effective interrupt condition according to the continuous interrupt duration. Therefore, by counting the continuous interruption time length, whether the activity detection result meets the preset effective interruption condition can be accurately judged.
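The run detection of S401 and S402 can be sketched as follows. The sketch rests on assumptions the text does not fix: interrupt signals are represented as ascending millisecond timestamps, "within a preset duration range" is read as an adjacent gap of at most `max_gap_ms`, and the duration of a run is measured from its first to its last interrupt. The function name is hypothetical.

```python
from typing import Optional, Sequence


def continuous_interrupt_duration(
    timestamps_ms: Sequence[float], max_gap_ms: float
) -> float:
    """Return the longest span (ms) of a run of at least two interrupts
    whose adjacent time differences are all <= max_gap_ms.

    Returns 0.0 when no such run exists.
    """
    longest = 0.0
    run_start: Optional[float] = None
    prev: Optional[float] = None
    for t in timestamps_ms:
        if prev is not None and t - prev <= max_gap_ms:
            # Still inside the current run: extend it and keep the maximum.
            longest = max(longest, t - run_start)
        else:
            # Gap too large (or first interrupt): a new run starts here.
            run_start = t
        prev = t
    return longest
```

When several runs occur, the function keeps the maximum, matching the rule that the longest duration is taken as the continuous interrupt duration.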

Fig. 5 is a flowchart of another method for determining whether a preset valid interrupt condition is satisfied according to an embodiment of the present application. As shown in fig. 5, in S403, determining whether the activity detection result satisfies the preset valid interrupt condition according to the continuous interrupt duration includes:

S501, judging whether the continuous interrupt duration reaches a preset interrupt duration threshold.

If the continuous interruption time length is greater than or equal to the preset interruption time length threshold value, the continuous interruption time length reaches the preset interruption time length threshold value; if the continuous interruption time is smaller than the preset interruption time threshold, the continuous interruption time does not reach the preset interruption time threshold.

S502, if the continuous interruption time length reaches a preset interruption time length threshold value, determining that the activity detection result meets a preset effective interruption condition.

If the continuous interruption time length reaches the preset interruption time length threshold value, the interruption time length in the preset time length is enough, and the activity detection result is determined to meet the preset effective interruption condition.

S503, if the continuous interrupt duration does not reach the preset interrupt duration threshold, determining that the activity detection result does not meet the preset effective interrupt condition.

If the continuous interrupt duration does not reach the preset interrupt duration threshold, the interrupt duration within the preset duration is short, which suggests that the voice detected within the preset duration may not be in an active state, and the preset effective interrupt condition is not met.

To sum up, in this embodiment, it is determined whether the continuous interrupt duration reaches the preset interrupt duration threshold; if the continuous interrupt duration reaches the preset interrupt duration threshold, it is determined that the activity detection result meets the preset effective interrupt condition; if the continuous interrupt duration does not reach the preset interrupt duration threshold, it is determined that the activity detection result does not meet the preset effective interrupt condition. Therefore, comparing the continuous interrupt duration with the preset interrupt duration threshold allows a more accurate judgment of whether the preset effective interrupt condition is met.

With continued reference to Fig. 5, determining in S501 whether the continuous interrupt duration reaches the preset interrupt duration threshold includes:

determining the preset interrupt duration threshold according to a preset application scenario.

The specific preset interrupt duration threshold may be set according to different preset application scenarios, which is not limited herein. For example, the preset statistical duration may be 50 ms and the preset interrupt duration threshold may be 20 ms.
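The duration check of S501 to S503, with scenario-dependent thresholds, can be sketched as follows. The class `ScenarioThresholds` and the function name are hypothetical; the default values are the 50 ms and 20 ms examples given in the text, and the continuous interrupt duration is assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioThresholds:
    """Per-scenario settings (hypothetical container).

    Defaults are the example values from the text; other scenarios
    would supply their own values.
    """
    stat_window_ms: float = 50.0         # preset statistical duration
    duration_threshold_ms: float = 20.0  # preset interrupt duration threshold


def meets_duration_condition(continuous_ms: float, cfg: ScenarioThresholds) -> bool:
    """S501 to S503: the condition is met when the continuous interrupt
    duration reaches (>=) the preset interrupt duration threshold."""
    return continuous_ms >= cfg.duration_threshold_ms
```

Keeping the thresholds in one immutable object per scenario is only one possible design; it makes the "set according to the application scenario" step an explicit input to the check.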

On the basis of any one of the embodiments, the wake-up method for a voice recognition function provided by the present application further includes:

If the activity detection result does not meet the preset effective interrupt condition, the interrupt signals in the activity detection result are determined to be false-detection interrupt signals. In that case no subsequent voice recognition is needed, and the interrupt signals in the activity detection result are cleared to reduce memory pressure.
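The clearing step can be sketched as follows. The class `InterruptLog` and its methods are hypothetical, and storing interrupt signals as a list of timestamps is an assumption made only for illustration.

```python
from typing import Callable, List, Sequence


class InterruptLog:
    """Hypothetical holder for the interrupt signals of one activity detection result."""

    def __init__(self) -> None:
        self._timestamps_ms: List[float] = []

    def record(self, timestamp_ms: float) -> None:
        """Store one interrupt reported by the VAD module."""
        self._timestamps_ms.append(timestamp_ms)

    def __len__(self) -> int:
        return len(self._timestamps_ms)

    def evaluate(self, is_valid: Callable[[Sequence[float]], bool]) -> bool:
        """Return True if the recognizer should be woken.

        When the condition is not met, the stored interrupts are treated
        as false detections and cleared to free memory.
        """
        if is_valid(self._timestamps_ms):
            return True
        self._timestamps_ms.clear()
        return False
```

The sketch only clears on a failed check; what happens to the stored interrupts after a successful wake-up is not specified in the text and is left untouched here.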

The following describes the wake-up device for a voice recognition function, the storage medium, and so on provided by the present application for carrying out the method. For their specific implementation and technical effects, refer to the description above, which is not repeated below.

Fig. 6 is a schematic diagram of a wake-up device for a voice recognition function according to an embodiment of the present application, where the device includes:

The acquiring module 601 is configured to acquire an activity detection result of a voice signal, where the activity detection result includes: a plurality of interrupt signals generated by the voice signal are detected a plurality of times.

The statistics module 602 is configured to perform statistics on a plurality of interrupt signals.

The judging module 603 is configured to judge whether the activity detection result meets a preset valid interrupt condition according to the statistical result.

The determining module 604 is configured to determine that the voice signal is in an active state if the activity detection result meets the preset effective interrupt condition, and wake up the voice recognition function to recognize the collected voice signal.

Further, the statistics module 602 is specifically configured to count the number of interrupt signals within a preset duration.

Further, the judging module 603 is specifically configured to judge, according to the number of interrupt signals within the preset duration, whether the activity detection result meets the preset valid interrupt condition.

Further, the judging module 603 is specifically configured to: judge whether the number of interrupt signals within the preset duration reaches a preset interrupt number threshold; if the number of interrupt signals within the preset duration reaches the preset interrupt number threshold, determine that the activity detection result meets the preset valid interrupt condition; and if the number of interrupt signals within the preset duration does not reach the preset interrupt number threshold, determine that the activity detection result does not meet the preset valid interrupt condition.
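The count-based judgment described above can be sketched as follows. The sketch assumes that each interrupt signal is recorded as an arrival timestamp in milliseconds and that the preset duration is a window ending at the current time; the function names and this representation are assumptions, not details given in the application.

```python
from typing import Iterable


def count_in_window(timestamps_ms: Iterable[float], now_ms: float,
                    window_ms: float) -> int:
    """Count interrupts whose timestamp lies in (now_ms - window_ms, now_ms]."""
    return sum(1 for t in timestamps_ms if now_ms - window_ms < t <= now_ms)


def meets_count_condition(timestamps_ms: Iterable[float], now_ms: float,
                          window_ms: float, count_threshold: int) -> bool:
    """True if the number of interrupts in the window reaches the threshold."""
    return count_in_window(timestamps_ms, now_ms, window_ms) >= count_threshold
```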

Further, the judging module 603 is specifically configured to determine the preset interrupt number threshold according to a preset application scenario.

Further, the statistics module 602 is specifically configured to: determine at least two continuous interrupt signals for which the time difference between adjacent interrupt signals among the plurality of interrupt signals is within a preset duration range; and determine a continuous interrupt duration according to the duration of the at least two continuous interrupt signals.

Further, the judging module 603 is specifically configured to judge, according to the continuous interrupt duration, whether the activity detection result meets the preset valid interrupt condition.

Further, the judging module 603 is specifically configured to: judge whether the continuous interrupt duration reaches a preset interrupt duration threshold; if the continuous interrupt duration reaches the preset interrupt duration threshold, determine that the activity detection result meets the preset valid interrupt condition; and if the continuous interrupt duration does not reach the preset interrupt duration threshold, determine that the activity detection result does not meet the preset valid interrupt condition.
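The duration-based judgment can be sketched in a similar way. Two assumptions are made here that the application does not spell out: the time difference between adjacent interrupt signals is measured between their arrival timestamps, and the continuous interrupt duration of a run is the span from its first to its last timestamp. The function names and the `max_gap_ms` parameter are hypothetical.

```python
from typing import Sequence


def longest_continuous_duration(timestamps_ms: Sequence[float],
                                max_gap_ms: float) -> float:
    """Longest span covered by a run of at least two interrupts whose
    adjacent timestamps differ by no more than max_gap_ms.

    timestamps_ms must be sorted in ascending order. Returns 0 when no
    two adjacent interrupts are close enough to form a run.
    """
    best = 0
    run_start = None
    prev = None
    for t in timestamps_ms:
        if prev is not None and t - prev <= max_gap_ms:
            best = max(best, t - run_start)  # extend the current run
        else:
            run_start = t                    # start a new run
        prev = t
    return best


def meets_duration_condition(timestamps_ms: Sequence[float], max_gap_ms: float,
                             duration_threshold_ms: float) -> bool:
    """True if the continuous interrupt duration reaches the threshold."""
    return longest_continuous_duration(timestamps_ms, max_gap_ms) >= duration_threshold_ms
```

An isolated interrupt signal yields a duration of 0 in this sketch, so a single spurious trigger cannot satisfy the condition on its own.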

Further, the judging module 603 is specifically configured to determine the preset interrupt duration threshold according to a preset application scenario.

Further, the determining module 604 is further configured to determine that the interrupt signal in the activity detection result is a false detection interrupt signal if the activity detection result does not meet the preset valid interrupt condition, and clear the interrupt signal in the activity detection result.

The above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more application-specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field-programmable gate arrays (FPGA). For another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke the program code. For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).

Fig. 7 is a schematic diagram of a speech processing device according to an embodiment of the present application, where the speech processing device may be a device with a computing function.

The voice processing apparatus includes: a processor 701, and a storage medium 702. The processor 701 and the storage medium 702 are connected by a bus.

The storage medium 702 is used to store a program, and the processor 701 calls the program stored in the storage medium 702 to execute the above-described method embodiment. The specific implementation manner and the technical effect are similar, and are not repeated here.

Optionally, the present invention also provides a program product, such as a computer readable storage medium, comprising a program which, when executed by a processor, performs the above-described method embodiments.

In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.

The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform some of the steps of the methods according to the embodiments of the invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Claims

1. A method for waking up a speech recognition function, the method comprising:

Counting the plurality of interrupt signals;

If the activity detection result meets the preset effective interruption condition, determining that the voice signal is in an active state, and waking up a voice recognition function to recognize the collected voice signal;

The counting the plurality of interrupt signals includes:

Determining at least two continuous interrupt signals of which the time difference between adjacent interrupt signals in the interrupt signals is within a preset duration range;

2. The method of claim 1, wherein said counting said plurality of interrupt signals comprises:

Counting the number of interrupt signals in a preset time period;

3. The method according to claim 2, wherein the determining whether the activity detection result meets the preset valid interrupt condition according to the number of interrupt signals in the preset duration includes:

4. The method of claim 3, wherein said determining whether the number of interrupt signals within the predetermined time period reaches a predetermined interrupt number threshold comprises:

5. The method according to claim 1, wherein the determining whether the activity detection result meets the preset valid interrupt condition according to the continuous interrupt duration includes:

6. The method of claim 5, wherein the determining whether the continuous break duration reaches a preset break duration threshold comprises:

7. The method according to any one of claims 1-6, further comprising:

8. A wake-up device for a speech recognition function, the device comprising:

The statistics module is used for counting the plurality of interrupt signals;

the determining module is used for determining that the voice signal is in an active state and waking up a voice recognition function to recognize the collected voice signal if the activity detection result meets the preset effective interrupt condition;

the statistics module is specifically configured to determine at least two continuous interrupt signals, where a time difference between adjacent interrupt signals in the plurality of interrupt signals is within a preset duration range; determining continuous interrupt duration according to the duration of the continuous at least two interrupt signals;

the judging module is specifically configured to judge whether the activity detection result meets the preset effective interrupt condition according to the continuous interrupt duration.

9. A speech processing apparatus, comprising: the processor is in communication connection with the storage medium through a bus, the storage medium stores program instructions executable by the processor, and the processor calls the program instructions stored in the storage medium to execute the steps of the wake-up method of the voice recognition function according to any one of claims 1 to 7.