
CN113316074A - Howling detection method and device and electronic equipment - Google Patents


Info

Publication number
CN113316074A
Authority
CN
China
Prior art keywords
frequency point
frequency
power spectrum
spectrum peak
howling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110511757.7A
Other languages
Chinese (zh)
Other versions
CN113316074B (en
Inventor
巴莉芳
康力
叶顺舟
何陈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisoc Chongqing Technology Co Ltd
Original Assignee
Unisoc Chongqing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisoc Chongqing Technology Co Ltd
Priority to CN202110511757.7A
Publication of CN113316074A
Application granted
Publication of CN113316074B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • H04R29/001 Monitoring arrangements; Testing arrangements for loudspeakers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephone Function (AREA)

Abstract

The application discloses a howling detection method, a howling detection apparatus, and an electronic device. In the method, the electronic device acquires a first frequency point set and a second frequency point set centered on the frequency point at which the power spectrum peak is located, determines a first reference frequency point from the first power ratios between every two adjacent frequency points in the first frequency point set, and determines a second reference frequency point from the second power ratios between every two adjacent frequency points in the second frequency point set. It then determines, from the number of frequency points between the first reference frequency point and the second reference frequency point, whether a single tone or howling exists at the frequency point at which the power spectrum peak is located. The method distinguishes frequency points containing a single tone from frequency points containing howling, which improves the accuracy of howling detection.

Description

Howling detection method and device and electronic equipment
Technical Field
The present application relates to the field of audio signal processing technologies, and in particular, to a howling detection method, an apparatus, and an electronic device.
Background
Currently, howling may arise in the audio signal captured by an audio device because of factors such as how the device captures sound. For example, in a public address system, the audio signal received by a microphone is amplified by a power amplifier and output by a loudspeaker, and the output signal may reach the microphone again through reflection and/or refraction, forming a positive feedback loop. According to the Nyquist stability criterion, sustained oscillation may then arise at some frequency points, so that the power of the audio signal in the public address system keeps increasing, the system becomes unstable, and acoustic feedback howling may occur. As a result, the audio device emits unpleasant sounds, the audio signal quality is low, and the user experience suffers. Furthermore, severe feedback howling may damage audio equipment, for example by burning out a power amplifier and/or the mid-range and high-frequency drivers of a loudspeaker.
Therefore, the electronic device needs to perform howling detection on the acquired audio signal, so as to perform howling suppression on the audio signal based on the howling detection result, thereby improving the quality of the audio signal. However, the electronic device is prone to erroneously detect howling, which results in low accuracy of howling detection, and therefore how to improve the accuracy of howling detection is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides a howling detection method. The method can distinguish frequency points containing a single tone from frequency points containing howling, and can improve the accuracy of howling detection.
In a first aspect, an embodiment of the present application provides a howling detection method, where the howling detection method includes:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining whether the frequency point of the power spectrum peak has howling or single tone according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
Based on the method described in the first aspect, the electronic device may obtain a first reference frequency point and a second reference frequency point around the frequency point at which the power spectrum peak is located, and then determine, from the number of frequency points between the two reference frequency points, whether howling or a single tone exists at that frequency point. Because this count is taken into account during howling detection, frequency points containing a single tone can be accurately distinguished from frequency points containing howling: a frequency point containing a single tone is not falsely detected as containing howling, and frequency points containing howling are accurately identified among the frequency points at which power spectrum peaks are located, which improves the accuracy of howling detection.
With reference to the first aspect, in some possible embodiments, determining that one of howling and single tone exists in a frequency point where a power spectrum peak is located according to the number of frequency points between a first reference frequency point and a second reference frequency point includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determining that the frequency point at which the power spectrum peak value is located has the single tone.
With reference to the first aspect, in some possible embodiments, determining that one of howling and single tone exists in a frequency point where a power spectrum peak is located according to the number of frequency points between a first reference frequency point and a second reference frequency point includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is greater than a preset number threshold, determining that the frequency point where the power spectrum peak value is located has howling.
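As an illustration only, the steps of the first aspect and the threshold rule above can be sketched in Python. The text here does not say how each power ratio is oriented, which frequency point of a pair counts as the reference frequency point, or how wide the two sets are, so the sketch assumes the following: each ratio is the power of the frequency point farther from the peak divided by the power of its neighbor nearer the peak; the reference frequency point is the farther point of the pair with the smallest ratio; and the count excludes the two reference points themselves. The names `reference_bin`, `classify_peak`, `half_width`, and `count_threshold`, and their default values, are hypothetical.

```python
def reference_bin(power: list[float], peak: int, half_width: int, step: int) -> int:
    """Search one side of the peak (step=-1: lower set, step=+1: higher set).

    Assumption: ratio = power[farther bin] / power[nearer bin]; the farther
    bin of the pair with the smallest ratio is returned as the reference.
    """
    best_bin, best_ratio = peak, float("inf")
    inner = peak
    for _ in range(half_width):
        outer = inner + step
        if outer < 0 or outer >= len(power):
            break  # ran off the edge of the spectrum
        # Guard against division by zero for empty bins.
        ratio = power[outer] / max(power[inner], 1e-12)
        if ratio < best_ratio:
            best_bin, best_ratio = outer, ratio
        inner = outer
    return best_bin


def classify_peak(power: list[float], peak: int,
                  half_width: int = 4, count_threshold: int = 2) -> str:
    """Label the peak bin as 'single tone' or 'howling' from the bin count."""
    left = reference_bin(power, peak, half_width, -1)
    right = reference_bin(power, peak, half_width, +1)
    # Assumption: "between" excludes the two reference bins themselves.
    between = max(right - left - 1, 0)
    return "single tone" if between <= count_threshold else "howling"
```

With made-up powers, a peak that falls away within one bin, such as `[1, 1, 1, 1, 100, 1, 1, 1, 1]` with the peak at index 4, is labeled a single tone under these assumptions, while a peak with wider shoulders, such as `[1, 1, 60, 80, 100, 80, 60, 1, 1]`, is labeled howling.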
With reference to the first aspect, in some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
With reference to the first aspect, in some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
With reference to the first aspect, in some possible embodiments, the method further comprises:
acquiring the energy sum of a first frequency band taking the frequency point where the peak value of the power spectrum is located as the center; the frequency difference between the frequency point in the first frequency band and the frequency point at which the peak value of the power spectrum is located is smaller than a preset frequency value;
and determining, according to the sum of the energy of the first frequency band and the sum of the energy of the whole frame of the signal to be detected, that one of howling, an unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
With reference to the first aspect, in some possible embodiments, determining that one of howling, unvoiced consonant, and noise exists at a frequency point where a power spectrum peak is located according to a sum of energies of the first frequency band and a sum of energies of an entire frame of the signal to be detected includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is smaller than or equal to a preset ratio, determining that the frequency point where the power spectrum peak value is located has noise.
With reference to the first aspect, in some possible embodiments, determining that one of howling, unvoiced consonant, and noise exists at a frequency point where a power spectrum peak is located according to a sum of energies of the first frequency band and a sum of energies of an entire frame of the signal to be detected includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is less than or equal to a first energy threshold, determining that the frequency point where the power spectrum peak value is located has the unvoiced consonant.
With reference to the first aspect, in some possible embodiments, determining that one of howling, unvoiced consonant, and noise exists at a frequency point where a power spectrum peak is located according to a sum of energies of the first frequency band and a sum of energies of an entire frame of the signal to be detected includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is greater than a first energy threshold, determining that the frequency point where the power spectrum peak value is located has howling.
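The three energy-based rules above form a single decision chain, which the following Python sketch makes explicit. It is a sketch under stated assumptions: energy is taken as the sum of per-bin power values, the "preset frequency value" is expressed as a distance in bins, and the names `classify_by_energy`, `max_bin_distance`, `ratio_threshold`, and `energy_threshold` are hypothetical. The text gives no values for the thresholds.

```python
def classify_by_energy(power: list[float], peak: int, max_bin_distance: int,
                       ratio_threshold: float, energy_threshold: float) -> str:
    """Label the peak bin as 'noise', 'unvoiced consonant', or 'howling'."""
    # First frequency band: bins whose distance to the peak is strictly
    # smaller than the preset value (here measured in bins, an assumption).
    band_energy = sum(p for k, p in enumerate(power)
                      if abs(k - peak) < max_bin_distance)
    frame_energy = sum(power)

    # Rule 1: band-to-frame energy ratio at or below the preset ratio.
    # An all-zero frame is also treated as noise (an assumption).
    if frame_energy <= 0 or band_energy / frame_energy <= ratio_threshold:
        return "noise"
    # Rule 2: ratio above the preset ratio, band energy at or below the threshold.
    if band_energy <= energy_threshold:
        return "unvoiced consonant"
    # Rule 3: ratio above the preset ratio and band energy above the threshold.
    return "howling"
```

For example, with `max_bin_distance=2`, `ratio_threshold=0.5`, and `energy_threshold=10`, a flat spectrum of ten equal bins yields "noise", while a spectrum whose energy is concentrated around the peak yields "unvoiced consonant" or "howling" depending on how large that concentrated energy is.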
In a second aspect, an embodiment of the present application provides a howling detection apparatus, where the howling detection apparatus includes:
the acquisition unit is used for acquiring the frequency point of the power spectrum peak value of the signal to be detected;
the acquisition unit is also used for acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
the determining unit is used for acquiring a first power ratio between every two adjacent frequency points in the first frequency point set and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
the determining unit is further configured to determine whether a howling or a single tone exists at a frequency point where the power spectrum peak is located according to the number of frequency points between the first reference frequency point and the second reference frequency point.
With reference to the second aspect, in some possible embodiments, the determining unit is configured to determine, according to the number of frequency points between the first reference frequency point and the second reference frequency point, that one of howling and single tone exists at a frequency point where a power spectrum peak is located, and includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determining that the frequency point at which the power spectrum peak value is located has the single tone.
With reference to the second aspect, in some possible embodiments, the determining unit is configured to determine, according to the number of frequency points between the first reference frequency point and the second reference frequency point, that one of howling and single tone exists at a frequency point where a power spectrum peak is located, and includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is greater than a preset number threshold, determining that the frequency point where the power spectrum peak value is located has howling.
With reference to the second aspect, in some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
With reference to the second aspect, in some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
With reference to the second aspect, in some possible embodiments, the howling detection apparatus further includes:
the acquisition unit is also used for acquiring the energy sum of the first frequency band taking the frequency point where the power spectrum peak value is located as the center; the frequency difference between the frequency point in the first frequency band and the frequency point at which the peak value of the power spectrum is located is smaller than a preset frequency value;
the determining unit is further configured to determine that one of howling, unvoiced consonant and noise exists at a frequency point where the power spectrum peak is located according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected.
With reference to the second aspect, in some possible embodiments, the determining unit is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of the whole frame of the signal to be detected, that one of a howling, an unvoiced consonant, or a noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is smaller than or equal to a preset ratio, determining that the frequency point where the power spectrum peak value is located has noise.
With reference to the second aspect, in some possible embodiments, the determining unit is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of the whole frame of the signal to be detected, that one of a howling, an unvoiced consonant, or a noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is less than or equal to a first energy threshold, determining that the frequency point where the power spectrum peak value is located has the unvoiced consonant.
With reference to the second aspect, in some possible embodiments, the determining unit is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of the whole frame of the signal to be detected, that one of a howling, an unvoiced consonant, or a noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is greater than a first energy threshold, determining that the frequency point where the power spectrum peak value is located has howling.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the processor is connected to the memory, where the memory is used to store a program code, and the processor is used to call the program code to execute the howling detection method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a chip, where the chip is configured to obtain a frequency point where a power spectrum peak of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining whether the frequency point of the power spectrum peak has howling or single tone according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
In a fifth aspect, an embodiment of the present application provides a module device, where the module device includes a processor and a communication interface, the processor is connected to the communication interface, the communication interface is used for receiving and sending signals, and the processor is used for:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining whether the frequency point of the power spectrum peak has howling or single tone according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
In a sixth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the howling detection method of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic processing flow diagram of an audio signal in an electronic device according to an embodiment of the present disclosure;
Fig. 2 is a flowchart illustrating a howling detection method according to an embodiment of the present application;
Fig. 3a is a schematic diagram of a discrete power spectrum of a signal to be detected according to an embodiment of the present application;
Fig. 3b is a schematic diagram of a discrete power spectrum of another signal to be detected according to an embodiment of the present application;
Fig. 4 is a flowchart illustrating another howling detection method provided in an embodiment of the present application;
Fig. 5 is a flowchart illustrating a further howling detection method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a howling detection apparatus according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To ensure the quality of the audio signal, the electronic device may detect howling in the audio signal and then suppress the howling based on the detection result, thereby obtaining a cleaner audio signal. Fig. 1 is a schematic diagram of the processing flow of an audio signal in an electronic device. As shown in Fig. 1, the electronic device includes a signal analysis unit, a signal processing unit, and a signal synthesis unit.
The signal analysis unit is used for framing and windowing the signal and performing a Fourier transform on it. Since the electronic device performs howling detection on a frequency domain signal, the time domain signal of the audio signal must be Fourier transformed to obtain the corresponding frequency domain signal. An audio signal is only approximately stationary over short intervals, so the frequency domain signal obtained from a short time domain segment reflects the spectrum at that moment more faithfully than one obtained from a long segment. Therefore, to improve the accuracy of howling detection, the longer input signal (a time domain signal) may first be framed and windowed to obtain a plurality of shorter frame signals (also time domain signals), and the Fourier transform is then performed on each frame signal to obtain a frequency domain signal. The Fourier transform may be, for example, a fast Fourier transform (FFT).
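A minimal Python sketch of this analysis stage follows. It assumes a periodic Hann window, a hop of half the frame length, and a naive discrete Fourier transform standing in for the FFT; none of these choices is specified in the text, and the function names are hypothetical.

```python
import cmath
import math


def hann(n: int) -> list[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def split_frames(x: list[float], frame_len: int, hop: int) -> list[list[float]]:
    """Cut x into overlapping frames; a trailing partial frame is dropped."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]


def dft(frame: list[float]) -> list[complex]:
    """Naive O(N^2) DFT; a practical implementation would use an FFT."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def analyze(x: list[float], frame_len: int = 64, hop: int = 32) -> list[list[complex]]:
    """Frame, window, and transform x; returns one spectrum per frame."""
    window = hann(frame_len)
    return [
        dft([sample * w for sample, w in zip(frame, window)])
        for frame in split_frames(x, frame_len, hop)
    ]
```

Each returned spectrum is the frequency domain signal of one frame and is what the later howling detection steps would operate on.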
The signal processing unit may be configured to perform howling detection and howling suppression on the signal. The electronic device may perform howling detection on the frequency domain signal of each frame signal, and perform howling suppression on the frequency domain signal based on the howling detection result.
The signal synthesis unit is used for performing an inverse Fourier transform and window synthesis on the signal. The preceding operations yield the processed frequency domain signal of each frame signal. Specifically, an inverse Fourier transform is performed on the processed frequency domain signal of each frame to obtain the processed frame signal, i.e., a time domain signal, and window synthesis is then performed on the processed frame signals of multiple frames to obtain the output signal, i.e., the processed audio signal.
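The synthesis stage can be sketched in the same spirit. The text does not define "window synthesis"; the sketch below assumes it means plain overlap-add of the inverse-transformed frames, and uses a naive inverse DFT in place of an inverse FFT. The function names are hypothetical.

```python
import cmath
import math


def idft(spec: list[complex]) -> list[float]:
    """Naive inverse DFT; returns the real part of each time domain sample."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def overlap_add(frames: list[list[float]], hop: int) -> list[float]:
    """Sum equally long frames into one signal, each shifted by hop samples."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        for t, value in enumerate(frame):
            out[i * hop + t] += value
    return out
```

If the analysis stage used a periodic Hann window with a hop of half the frame length, the shifted windows sum to one, so overlap-add of unmodified frames reproduces the input everywhere except the first and last half frame.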
As can be seen from the above, the electronic device may perform howling detection. However, in the howling detection process, a tone is often erroneously detected as howling, which results in low accuracy of howling detection and further results in low quality of an audio signal obtained by howling suppression based on the howling detection result. Therefore, how to improve the accuracy of howling detection is an urgent problem to be solved.
Based on this, embodiments of the present application provide a howling detection method, an apparatus, and an electronic device. The howling detection method may be applied to the signal processing unit in the electronic device shown in Fig. 1. In the method, the electronic device acquires a first frequency point set and a second frequency point set centered on the frequency point at which the power spectrum peak is located, determines a first reference frequency point from the first power ratios between every two adjacent frequency points in the first frequency point set, and determines a second reference frequency point from the second power ratios between every two adjacent frequency points in the second frequency point set, so that it can determine, from the number of frequency points between the first reference frequency point and the second reference frequency point, whether a single tone or howling exists at the frequency point at which the power spectrum peak is located. The embodiments of the present application can thereby distinguish frequency points containing a single tone from frequency points containing howling, which improves the accuracy of howling detection.
The electronic device mentioned in the present application may be any device having a function of processing an audio signal, and may include, but is not limited to, a public address system, a voice communication terminal (such as a smart speaker, a smart phone, an intercom, a vehicle-mounted terminal), a desktop computer, and the like, which is not limited in this application.
Based on the above description, the following detailed description proposes a howling detection method according to an embodiment of the present application. The howling detection method may be performed by the above-mentioned electronic device. Referring to fig. 2, a flow chart of a howling detection method is shown. As shown in fig. 2, the howling detection method may include S201 to S204:
S201: Acquire the frequency point where the power spectrum peak of the signal to be detected is located.
When howling exists in a frequency point, the power of the frequency point is high, so that howling may exist in the frequency point where the power spectrum peak value is located. Therefore, the electronic device may first acquire the frequency point where the power spectrum peak in the signal to be detected is located, so as to subsequently further determine whether there is howling at the frequency point where the power spectrum peak is located.
The signal to be detected may be the aforementioned frequency domain signal. The power spectrum peak refers to the maximum power value in the power spectrum corresponding to the signal to be detected. In one embodiment, the electronic device may calculate the power spectrum corresponding to the signal to be detected, and then obtain the power spectrum peak and the frequency point where it is located by using a peak detection method. For example, if the signal to be detected is obtained from the frame signal through FFT, the electronic device may calculate the power of each frequency point in the frequency domain signal to obtain the power spectrum corresponding to the signal to be detected, and then take the frequency point corresponding to the maximum power value as the frequency point where the power spectrum peak is located. In another embodiment, the electronic device may generate the power spectrum corresponding to the signal to be detected, and then search for the frequency point where the power spectrum peak is located by using the peak finding (findpeaks) function provided by MATLAB. For convenience of description, the embodiments of the present application take a signal to be detected obtained through FFT as an example.
It should be understood that the electronic device may obtain one or more frequency points where a power spectrum peak is located from the signal to be detected. The embodiment of the present application does not limit this.
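The first embodiment above (compute the power of each frequency point, then take the maximum) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a direct DFT stands in for the FFT so that the example needs only the standard library, and the function names and the 32-sample test frame are hypothetical.

```python
import cmath
import math

def power_spectrum(frame):
    """Return |X[k]|^2 for k = 0..N/2 using a direct DFT (O(N^2), illustration only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def peak_bin(power):
    """Index of the frequency point with the maximum power."""
    return max(range(len(power)), key=lambda k: power[k])

# Hypothetical frame: a sinusoid that falls exactly on frequency point 5.
N = 32
frame = [math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
power = power_spectrum(frame)
print(peak_bin(power))  # 5
```

A production implementation would normally use an FFT routine and may return several local peaks rather than only the global maximum.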
S202: Acquire a first frequency point set and a second frequency point set centered on the frequency point where the power spectrum peak is located, wherein the frequency corresponding to each frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak, and the frequency corresponding to each frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak.
Specifically, when a single tone exists in a frequency point, the power of the frequency point is also high, so that the frequency point at which the power spectrum peak value exists may also have the single tone. Therefore, in order to avoid false detection of a frequency point with a tone as a frequency point with howling, it is necessary to distinguish between a frequency point with a tone and a frequency point with howling. The electronic device may analyze a first set of frequency points and a second set of frequency points centered around a frequency point at which a power spectrum peak is located.
To better illustrate the first frequency point set and the second frequency point set centered on the frequency point where the power spectrum peak is located, the following description is made with reference to the accompanying drawings. Fig. 3a shows a schematic diagram of a discrete power spectrum of a signal to be detected. The signal to be detected can be obtained based on FFT, the discrete power spectrum includes 8 frequency points, and each frequency point corresponds to one frequency. Assume that the frequencies corresponding to the 8 frequency points in the discrete power spectrum are 1 kilohertz (kHz), 2kHz, 3kHz, 4kHz, 5kHz, 6kHz, 7kHz and 8kHz respectively, and that the power of the frequency point with the frequency of 1kHz is 0.1 milliwatt (mW), the power of the frequency point with the frequency of 2kHz is 0.2mW, the power of the frequency point with the frequency of 3kHz is 0.2mW, the power of the frequency point with the frequency of 4kHz is 0.2mW, the power of the frequency point with the frequency of 5kHz is 10mW, the power of the frequency point with the frequency of 6kHz is 0.2mW, the power of the frequency point with the frequency of 7kHz is 0.1mW, and the power of the frequency point with the frequency of 8kHz is 0.1mW.
As shown in fig. 3a, the frequency point where the power spectrum peak obtained by the electronic device is located may be the fifth frequency point, whose frequency is 5kHz and whose power is 10mW. In fig. 3a, the first frequency point set centered on the fifth frequency point may include the frequency points in the region 310: the frequency points at 1kHz, 2kHz, 3kHz, 4kHz and 5kHz. The second frequency point set centered on the fifth frequency point may include the frequency points in the region 320: the frequency points at 5kHz, 6kHz, 7kHz and 8kHz.
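The split described for fig. 3a can be reproduced with a short Python sketch. The list names and the function `split_around_peak` are hypothetical, and the sketch assumes, as in the example, that each set extends from the peak to the edge of the spectrum.

```python
# Example values from fig. 3a: frequency (kHz) and power (mW) of the 8 frequency points.
FREQS_KHZ = [1, 2, 3, 4, 5, 6, 7, 8]
POWER_MW = [0.1, 0.2, 0.2, 0.2, 10, 0.2, 0.1, 0.1]

def split_around_peak(power):
    """Return the peak index and the index sets on either side of it (peak included in both)."""
    peak = max(range(len(power)), key=lambda k: power[k])
    first_set = list(range(0, peak + 1))         # frequencies <= peak frequency
    second_set = list(range(peak, len(power)))   # frequencies >= peak frequency
    return peak, first_set, second_set

peak, first_set, second_set = split_around_peak(POWER_MW)
print([FREQS_KHZ[k] for k in first_set])   # [1, 2, 3, 4, 5]
print([FREQS_KHZ[k] for k in second_set])  # [5, 6, 7, 8]
```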
S203: Acquire a first power ratio between every two adjacent frequency points in the first frequency point set, and determine the first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; and acquire a second power ratio between every two adjacent frequency points in the second frequency point set, and determine the second reference frequency point corresponding to the minimum second power ratio in the second frequency point set.
Specifically, the electronic device may sequentially obtain the first power ratio between every two adjacent frequency points in the first frequency point set, and determine the first reference frequency point corresponding to the minimum first power ratio in the first frequency point set. The first power ratio may refer to the ratio of the power of the frequency point with the smaller frequency to the power of the frequency point with the larger frequency in two adjacent frequency points, and the first reference frequency point may be either of the two adjacent frequency points corresponding to the minimum first power ratio. Still taking the example shown in fig. 3a, the electronic device may sequentially obtain the first power ratio between every two adjacent frequency points in the first frequency point set (the frequency points at 1kHz, 2kHz, 3kHz, 4kHz and 5kHz): the ratio of the power of the frequency point at 4kHz to the power of the frequency point at 5kHz gives a first power ratio of 0.02; the ratio of the power of the frequency point at 3kHz to the power of the frequency point at 4kHz gives a first power ratio of 1; the ratio of the power of the frequency point at 2kHz to the power of the frequency point at 3kHz gives a first power ratio of 1; and the ratio of the power of the frequency point at 1kHz to the power of the frequency point at 2kHz gives a first power ratio of 0.5.
The electronic device may determine that the minimum first power ratio is 0.02, where the minimum first power ratio is calculated from a frequency point with a frequency of 4kHz and a frequency point with a frequency of 5kHz, and therefore the first reference frequency point corresponding to the minimum first power ratio may be a frequency point with a frequency of 4kHz or a frequency point with a frequency of 5 kHz.
Specifically, for each frequency point in the second frequency point set, the electronic device may sequentially obtain a second power ratio between every two adjacent frequency points in the second frequency point set, and determine a second reference frequency point corresponding to a minimum second power ratio in the second frequency point set. The second power ratio may refer to a ratio of power of a frequency point with a higher frequency to power of a frequency point with a lower frequency in two adjacent frequency points, and the second reference frequency point may include any one of the two adjacent frequency points corresponding to the minimum second power ratio. Still taking the example shown in fig. 3a, the electronic device may sequentially obtain a second power ratio between every two adjacent frequency points in the second frequency point set (including the frequency point with the frequency of 5kHz, the frequency point with the frequency of 6kHz, the frequency point with the frequency of 7kHz, and the frequency point with the frequency of 8 kHz), that is, obtain a ratio of the power of the frequency point with the frequency of 6kHz to the power of the frequency point with the frequency of 5kHz to obtain the second power ratio of 0.02; obtaining the ratio of the power of the frequency point with the frequency of 7kHz to the power of the frequency point with the frequency of 6kHz to obtain a second power ratio of 0.5; and obtaining the ratio of the power of the frequency point with the frequency of 8kHz and the power of the frequency point with the frequency of 7kHz to obtain a second power ratio of 1. 
The electronic device may determine that the minimum second power ratio is 0.02, where the minimum second power ratio is calculated from a frequency point with a frequency of 5kHz and a frequency point with a frequency of 6kHz, and therefore the second reference frequency point corresponding to the minimum second power ratio may be a frequency point with a frequency of 5kHz or a frequency point with a frequency of 6 kHz.
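The ratio computation of S203 can be checked against the fig. 3a values with the following Python sketch. The function names are hypothetical. Since either frequency point of the minimum-ratio pair may serve as the reference, the sketch assumes the one farther from the peak is chosen (4kHz and 6kHz here). It also assumes non-zero powers and a peak that is not at either end of the spectrum.

```python
POWER_MW = [0.1, 0.2, 0.2, 0.2, 10, 0.2, 0.1, 0.1]  # fig. 3a example
PEAK = 4                                            # index of the 5kHz frequency point

def first_reference(power, peak):
    """Minimum of P[k] / P[k+1] over adjacent pairs at or below the peak."""
    ratios = [power[k] / power[k + 1] for k in range(peak)]
    pair = min(range(len(ratios)), key=lambda k: ratios[k])
    return pair, ratios                # lower-frequency point of the pair (assumed choice)

def second_reference(power, peak):
    """Minimum of P[k+1] / P[k] over adjacent pairs at or above the peak."""
    ratios = [power[k + 1] / power[k] for k in range(peak, len(power) - 1)]
    j = min(range(len(ratios)), key=lambda i: ratios[i])
    return peak + j + 1, ratios        # higher-frequency point of the pair (assumed choice)

first_ref, first_ratios = first_reference(POWER_MW, PEAK)
second_ref, second_ratios = second_reference(POWER_MW, PEAK)
print(first_ref, second_ref)  # 3 5, i.e. the 4kHz and 6kHz frequency points
```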
A single tone, also called a single frequency, is a signal with only one constant frequency. In the discrete power spectrum, only the frequency point with the single tone has a large power, and the power of the other frequency points is approximately zero. As a result, the first reference frequency point is close to the frequency point where the power spectrum peak is located, that is, the number of frequency points between the first reference frequency point and the frequency point where the power spectrum peak is located is small. As shown in fig. 3a, the first reference frequency point is the frequency point with the frequency of 4kHz, which is 1 frequency point away from the frequency point where the power spectrum peak is located (the frequency point with the frequency of 5kHz). Likewise, the second reference frequency point is close to the frequency point where the power spectrum peak is located, that is, the number of frequency points between the second reference frequency point and the frequency point where the power spectrum peak is located is small. In fig. 3a, the second reference frequency point is the frequency point with the frequency of 6kHz, which is also 1 frequency point away from the frequency point where the power spectrum peak is located.
Howling usually spans a plurality of frequency points, so a plurality of frequency points with large power exist in the discrete power spectrum, as shown in fig. 3b. In this case, the first reference frequency point is far from the frequency point where the power spectrum peak is located, that is, the number of frequency points between the first reference frequency point and the frequency point where the power spectrum peak is located is large. In fig. 3b, the first reference frequency point is the frequency point with the frequency of 2kHz, which is 3 frequency points away from the frequency point where the power spectrum peak is located (the frequency point with the frequency of 5kHz). The second reference frequency point is likewise far from the frequency point where the power spectrum peak is located, that is, the number of frequency points between the second reference frequency point and the frequency point where the power spectrum peak is located is large. In fig. 3b, the second reference frequency point is the frequency point with the frequency of 8kHz, which is also 3 frequency points away from the frequency point where the power spectrum peak is located.
According to the distinguishing characteristics of the power spectrum, the electronic equipment can distinguish whether the frequency point of the power spectrum peak has single tone or the frequency point of the power spectrum peak has howling according to the first reference frequency point and the second reference frequency point.
S204: Determine whether howling or a single tone exists at the frequency point where the power spectrum peak is located according to the number of frequency points between the first reference frequency point and the second reference frequency point.
In one embodiment, as can be seen from the foregoing, the number of frequency points between the first reference frequency point and the frequency point where the power spectrum peak value exists when a single tone exists at the frequency point where the power spectrum peak value exists is different from the number of frequency points between the first reference frequency point and the frequency point where the power spectrum peak value exists when a howling exists at the frequency point where the power spectrum peak value exists, so that the electronic device can determine that one of the howling and the single tone exists at the frequency point where the power spectrum peak value exists according to the number of frequency points between the first reference frequency point and the frequency point where the power spectrum peak value exists. Specifically, if the number of frequency points between a first reference frequency point and a frequency point where a power spectrum peak value is located is less than or equal to a preset first number threshold, determining that a single tone exists at the frequency point where the power spectrum peak value is located; and if the number of the frequency points between the first reference frequency point and the frequency point where the power spectrum peak value is located is larger than a preset first number threshold, determining that the frequency point where the power spectrum peak value is located has howling.
In another embodiment, the number of frequency points between the second reference frequency point and the frequency point at which the power spectrum peak value exists when a single tone exists at the frequency point at which the power spectrum peak value exists is different from the number of frequency points between the second reference frequency point and the frequency point at which the power spectrum peak value exists when a howling exists at the frequency point at which the power spectrum peak value exists, so that the electronic device can determine that one of the howling and the single tone exists at the frequency point at which the power spectrum peak value exists according to the number of frequency points between the second reference frequency point and the frequency point at which the power spectrum peak value exists. Specifically, if the number of frequency points between the second reference frequency point and the frequency point where the power spectrum peak value is located is less than or equal to a preset second number threshold, determining that single tones exist at the frequency point where the power spectrum peak value is located; and if the number of the frequency points between the second reference frequency point and the frequency point where the power spectrum peak value is located is larger than a preset second number threshold, determining that the frequency point where the power spectrum peak value is located has howling.
In another embodiment, the number of frequency points between the first reference frequency point and the second reference frequency point when the frequency point of the power spectrum peak has the single tone is different from the number of frequency points between the first reference frequency point and the second reference frequency point when the frequency point of the power spectrum peak has the howling, so that the electronic device can determine that the frequency point of the power spectrum peak has the howling or the single tone according to the number of frequency points between the first reference frequency point and the second reference frequency point. Specifically, if the number of frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset third number threshold, it is determined that a single tone exists at the frequency point where the power spectrum peak value is located, and the electronic device performs single tone protection, where the single tone protection may refer to not processing the signal to be detected, for example, not performing howling suppression processing on the signal to be detected. And if the number of the frequency points between the first reference frequency point and the second reference frequency point is greater than a preset third number threshold, determining that the frequency point at which the power spectrum peak value is located has howling.
The number thresholds, namely the first number threshold, the second number threshold and the third number threshold, may be set according to service requirements. For example, the first number threshold used when the first reference frequency point is the frequency point with the smaller frequency in the two adjacent frequency points corresponding to the minimum first power ratio may differ from the first number threshold used when the first reference frequency point is the frequency point with the larger frequency in those two adjacent frequency points. In the example shown in fig. 3a, the first number threshold used when the first reference frequency point is the frequency point with the frequency of 4kHz may differ from the first number threshold used when the first reference frequency point is the frequency point with the frequency of 5kHz.
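The threshold comparison of S204 can be sketched in Python as follows. This is a sketch under stated assumptions: the count of frequency points between the two reference frequency points is taken as their index difference, the threshold value of 2 is hypothetical, and the function name `classify_peak` is not from the original.

```python
def classify_peak(first_ref, second_ref, threshold):
    """Compare the span between the two reference frequency points with a number threshold."""
    span = second_ref - first_ref  # assumed measure of "number of frequency points between"
    return "single tone" if span <= threshold else "howling"

THIRD_NUMBER_THRESHOLD = 2  # hypothetical value; set according to service requirements

# Fig. 3a: reference frequency points at 4kHz and 6kHz (indices 3 and 5).
print(classify_peak(3, 5, THIRD_NUMBER_THRESHOLD))  # single tone
# Fig. 3b: reference frequency points at 2kHz and 8kHz (indices 1 and 7).
print(classify_peak(1, 7, THIRD_NUMBER_THRESHOLD))  # howling
```

When the result is "single tone", the single tone protection described above applies and howling suppression is skipped for that frame.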
In the embodiment of the application, the electronic device may obtain a first frequency point set and a second frequency point set which take a frequency point where a power spectrum peak is located as a center, then determine a first reference frequency point according to a first power ratio between every two frequency points in the first frequency point set, determine a second reference frequency point according to a second power ratio between every two frequency points in the second frequency point set, and finally determine that one of howling and single tones exists at the frequency point where the power spectrum peak is located according to the number of the frequency points between the first reference frequency point and the second reference frequency point. Because the number of the frequency points between the first reference frequency point and the second reference frequency point in the frequency points with the power spectrum peak values of the single tones is small, and the number of the frequency points between the first reference frequency point and the second reference frequency point in the frequency points with the power spectrum peak values of the howling is large, the electronic equipment can distinguish the frequency points with the single tones from the frequency points with the howling according to the number of the frequency points between the first reference frequency point and the second reference frequency point, the frequency points with the power spectrum peak values of the single tones cannot be mistakenly detected as the frequency points with the power spectrum peak values of the howling, the frequency points with the howling can be accurately detected from the frequency points with the power spectrum peak values, and the accuracy of howling detection is improved.
As can be seen from the above description of the method embodiment shown in fig. 2, in the howling detection method shown in fig. 2, a first reference frequency point is used to represent a frequency point corresponding to maximum power change in a first frequency point set, and a second reference frequency point is used to represent a frequency point corresponding to maximum power change in a second frequency point set. In another embodiment, the electronic device may further use a third reference frequency point to represent a frequency point corresponding to maximum power change in the first frequency point set, where the third reference frequency point may be a frequency point corresponding to a maximum first power difference between every two adjacent frequency points in the first frequency point set. Similarly, the electronic device may use a fourth reference frequency point to represent a frequency point corresponding to the maximum power change in the second frequency point set, and the fourth reference frequency point may be a frequency point corresponding to the maximum second power difference between every two adjacent frequency points in the second frequency point set. Based on this, the embodiment of the present application proposes another howling detection method; referring to fig. 4, a flow chart of another howling detection method is shown. As shown in fig. 4, the howling detection method may include S401 to S404:
S401: Acquire the frequency point where the power spectrum peak of the signal to be detected is located.
S402: Acquire a first frequency point set and a second frequency point set centered on the frequency point where the power spectrum peak is located, wherein the frequency corresponding to each frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak, and the frequency corresponding to each frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak.
The specific implementation of S401-S402 can refer to the detailed description of the related embodiment in fig. 2, which is not described herein again.
S403: Acquire a first power difference between every two adjacent frequency points in the first frequency point set, and determine the third reference frequency point corresponding to the maximum first power difference in the first frequency point set; and acquire a second power difference between every two adjacent frequency points in the second frequency point set, and determine the fourth reference frequency point corresponding to the maximum second power difference in the second frequency point set.
Similar to S203, the electronic device may sequentially obtain the first power difference between every two adjacent frequency points in the first frequency point set, and determine the third reference frequency point corresponding to the maximum first power difference in the first frequency point set. The first power difference may be the power of the frequency point with the larger frequency minus the power of the frequency point with the smaller frequency in two adjacent frequency points, and the third reference frequency point may be either of the two adjacent frequency points corresponding to the maximum first power difference. Still taking the example shown in fig. 3a, the electronic device may sequentially obtain the first power difference between every two adjacent frequency points in the first frequency point set (the frequency points at 1kHz, 2kHz, 3kHz, 4kHz and 5kHz): the power of the frequency point at 5kHz minus the power of the frequency point at 4kHz gives a first power difference of 9.8; the power of the frequency point at 4kHz minus the power of the frequency point at 3kHz gives a first power difference of 0; the power of the frequency point at 3kHz minus the power of the frequency point at 2kHz gives a first power difference of 0; and the power of the frequency point at 2kHz minus the power of the frequency point at 1kHz gives a first power difference of 0.1.
The electronic device may determine that the maximum first power difference is 9.8, where the maximum first power difference is calculated from a frequency point with a frequency of 5kHz and a frequency point with a frequency of 4kHz, and therefore the third reference frequency point corresponding to the maximum first power difference may be a frequency point with a frequency of 5kHz or a frequency point with a frequency of 4 kHz.
Specifically, the electronic device may sequentially obtain the second power difference between every two adjacent frequency points in the second frequency point set, and determine the fourth reference frequency point corresponding to the maximum second power difference in the second frequency point set. The second power difference may be the power of the frequency point with the smaller frequency minus the power of the frequency point with the larger frequency in two adjacent frequency points, and the fourth reference frequency point may be either of the two adjacent frequency points corresponding to the maximum second power difference. Still taking the example shown in fig. 3a, the electronic device may sequentially obtain the second power difference between every two adjacent frequency points in the second frequency point set (the frequency points at 5kHz, 6kHz, 7kHz and 8kHz): the power of the frequency point at 5kHz minus the power of the frequency point at 6kHz gives a second power difference of 9.8; the power of the frequency point at 6kHz minus the power of the frequency point at 7kHz gives a second power difference of 0.1; and the power of the frequency point at 7kHz minus the power of the frequency point at 8kHz gives a second power difference of 0.
The electronic device may determine that the maximum second power difference is 9.8, where the maximum second power difference is calculated from a frequency point with a frequency of 5kHz and a frequency point with a frequency of 6kHz, and therefore the fourth reference frequency point corresponding to the maximum second power difference may be a frequency point with a frequency of 5kHz or a frequency point with a frequency of 6 kHz.
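The power difference variant of S403 can be checked against the same fig. 3a values with the following Python sketch. The function names are hypothetical, and, as an assumption, the frequency point of the maximum-difference pair that is farther from the peak is taken as the reference frequency point. The peak is assumed not to lie at either end of the spectrum.

```python
POWER_MW = [0.1, 0.2, 0.2, 0.2, 10, 0.2, 0.1, 0.1]  # fig. 3a example
PEAK = 4                                            # index of the 5kHz frequency point

def third_reference(power, peak):
    """Maximum of P[k+1] - P[k] over adjacent pairs at or below the peak."""
    diffs = [power[k + 1] - power[k] for k in range(peak)]
    pair = max(range(len(diffs)), key=lambda k: diffs[k])
    return pair, diffs                 # lower-frequency point of the pair (assumed choice)

def fourth_reference(power, peak):
    """Maximum of P[k] - P[k+1] over adjacent pairs at or above the peak."""
    diffs = [power[k] - power[k + 1] for k in range(peak, len(power) - 1)]
    j = max(range(len(diffs)), key=lambda i: diffs[i])
    return peak + j + 1, diffs         # higher-frequency point of the pair (assumed choice)

third_ref, first_diffs = third_reference(POWER_MW, PEAK)
fourth_ref, second_diffs = fourth_reference(POWER_MW, PEAK)
print(third_ref, fourth_ref)  # 3 5, i.e. the 4kHz and 6kHz frequency points
```

For this spectrum the difference-based reference frequency points coincide with the ratio-based ones, so the same threshold comparison yields the same decision.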
Therein, a single tone is also called a single frequency, i.e. a signal with only one constant frequency. In the discrete power spectrum, only the frequency points with single tone have larger power, and the power of other frequency points is approximately equal to zero. That is to say, the third reference frequency point centered on the frequency point where the power spectrum peak is located is closer to the frequency point where the power spectrum peak is located, that is, the number of frequency points between the third reference frequency point and the frequency point where the power spectrum peak is located is smaller, and the fourth reference frequency point centered on the frequency point where the power spectrum peak is located is also closer to the frequency point where the power spectrum peak is located, that is, the number of frequency points between the fourth reference frequency point and the frequency point where the power spectrum peak is located is smaller.
The howling usually includes a plurality of frequency points, so that a plurality of frequency points with higher power exist in the discrete power spectrum, a third reference frequency point with the frequency point of the power spectrum peak as a center is farther from the frequency point of the power spectrum peak, that is, the number of frequency points between the third reference frequency point and the frequency point of the power spectrum peak is larger, and a fourth reference frequency point with the frequency point of the power spectrum peak as a center is farther from the frequency point of the power spectrum peak, that is, the number of frequency points between the fourth reference frequency point and the frequency point of the power spectrum peak is larger.
According to the distinguishing characteristics of the power spectrum, the electronic equipment can distinguish whether the frequency point of the power spectrum peak has single tone or the frequency point of the power spectrum peak has howling according to the third reference frequency point and the fourth reference frequency point.
S404: Determine whether howling or a single tone exists at the frequency point where the power spectrum peak is located according to the number of frequency points between the third reference frequency point and the fourth reference frequency point.
It should be noted that determining, according to the number of frequency points between the third reference frequency point and the fourth reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located is similar to the determination based on the number of frequency points between the first reference frequency point and the second reference frequency point. For the specific implementation of S404, refer to the detailed description of S204 in fig. 2, which is not repeated here.
In this embodiment of the application, the electronic device may obtain a first frequency point set and a second frequency point set centered on the frequency point where the power spectrum peak is located, determine a third reference frequency point according to a first power difference between every two frequency points in the first frequency point set, determine a fourth reference frequency point according to a second power difference between every two frequency points in the second frequency point set, and finally determine, according to the number of frequency points between the third reference frequency point and the fourth reference frequency point, that one of howling and a single tone exists at the frequency point where the power spectrum peak is located. For a power spectrum peak caused by a single tone, the number of frequency points between the third reference frequency point and the fourth reference frequency point is small; for a power spectrum peak caused by howling, this number is large. The electronic device can therefore distinguish frequency points with a single tone from frequency points with howling by this number, so that a single-tone peak is not falsely detected as howling, frequency points with howling are accurately identified among the frequency points where power spectrum peaks are located, and the accuracy of howling detection is improved.
In an example, as can be seen from the above description of the embodiments of the method shown in fig. 2 or fig. 4, the howling detection method shown in fig. 2 or fig. 4 can distinguish between a frequency point where there is a single tone and a frequency point where there is howling, so as to improve the accuracy of howling detection. However, since howling does not easily occur in the low frequency band, in order to further improve the accuracy of howling detection, the embodiment of the present application may further include S11.
S11: Compare the frequency of the frequency point where the power spectrum peak is located with a preset frequency threshold.
The electronic device can compare the frequency of the frequency point where the power spectrum peak is located with a preset frequency threshold. When this frequency is greater than the preset frequency threshold, howling may exist at the frequency point where the power spectrum peak is located, and the step of acquiring the first frequency point set and the second frequency point set centered on the frequency point where the power spectrum peak is located is triggered. When this frequency is less than or equal to the preset frequency threshold, it is determined that no howling exists at the frequency point where the power spectrum peak is located.
The signal to be detected may include a speech frame signal and a non-speech frame signal. Optionally, the frequency threshold of the speech frame signal and the frequency threshold of the non-speech frame signal may be the same, both being a target frequency threshold. After acquiring the frequency point where the power spectrum peak in the signal to be detected is located, the electronic device can directly compare the frequency of that frequency point with the target frequency threshold. If the frequency is less than or equal to the target frequency threshold, it is determined that no howling exists at the frequency point where the power spectrum peak is located. If the frequency is greater than the target frequency threshold, the step of acquiring the first frequency point set and the second frequency point set centered on the frequency point where the power spectrum peak is located is triggered.
Optionally, in order to perform howling detection more accurately, the electronic device may set different frequency thresholds for the speech frame signal and the non-speech frame signal. In this case the electronic device first acquires the signal type of the signal to be detected, and then compares the frequency of the frequency point where the power spectrum peak is located with the frequency threshold of that signal type. Specifically, when the signal to be detected is a speech frame signal, the electronic device may compare the frequency of the frequency point where the power spectrum peak is located with the frequency threshold of the speech frame signal. If the frequency is less than or equal to the frequency threshold of the speech frame signal, it is determined that no howling exists at the frequency point where the power spectrum peak is located. If the frequency is greater than the frequency threshold of the speech frame signal, the step of acquiring the first frequency point set and the second frequency point set centered on the frequency point where the power spectrum peak is located is triggered. When the signal to be detected is a non-speech frame signal, the electronic device may compare the frequency of the frequency point where the power spectrum peak is located with the frequency threshold of the non-speech frame signal. If the frequency is less than or equal to the frequency threshold of the non-speech frame signal, it is determined that no howling exists at the frequency point where the power spectrum peak is located. If the frequency is greater than the frequency threshold of the non-speech frame signal, the step of acquiring the first frequency point set and the second frequency point set centered on the frequency point where the power spectrum peak is located is triggered.
The electronic device can determine the signal type of the signal to be detected according to the sum of the energies in the low frequency band of the signal to be detected, because this sum is related only to the signal in the low frequency band. That is, when the signal to be detected is a speech frame signal, the sum of the energies in the low frequency band is related to the speech signal in the low frequency band; when the signal to be detected is a non-speech frame signal, the sum of the energies in the low frequency band is related to the noise signal in the low frequency band. Thus, whether the signal to be detected is a speech frame signal or a non-speech frame signal can be determined based on the sum of the energies in the low frequency band. Specifically, the electronic device may compare the sum of the energies in the low frequency band with a second energy threshold, determine that the signal to be detected is a speech frame signal when the sum is greater than the second energy threshold, and determine that the signal to be detected is a non-speech frame signal when the sum is less than or equal to the second energy threshold. The frequency of each frequency point in the low frequency band is less than a preset first frequency.
After acquiring the frequency points where power spectrum peaks in the signal to be detected are located, the electronic device first screens them against the frequency threshold, keeps those whose frequency is greater than the frequency threshold, and only then acquires the first frequency point set and the second frequency point set centered on each remaining frequency point. Because howling rarely occurs in the low frequency band, the electronic device determines that no howling exists at a frequency point where a power spectrum peak is located when its frequency is less than or equal to the frequency threshold; howling may exist at that frequency point only when its frequency is greater than the frequency threshold. The electronic device therefore does not falsely detect a frequency point in the low frequency band as a frequency point with howling, which further improves the accuracy of howling detection.
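The screening described in S11 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary of per-type thresholds, and all numeric values are assumptions chosen for the example, and the power spectrum is assumed to be available as a plain list aligned with a list of bin frequencies in Hz.

```python
from typing import Dict, Sequence


def classify_frame(power: Sequence[float], freqs: Sequence[float],
                   first_frequency: float,
                   second_energy_threshold: float) -> str:
    """Label a frame "speech" or "non_speech" from its low-band energy sum.

    The low band holds the bins whose frequency is below first_frequency.
    """
    low_band_energy = sum(p for p, f in zip(power, freqs)
                          if f < first_frequency)
    if low_band_energy > second_energy_threshold:
        return "speech"
    return "non_speech"


def passes_frequency_gate(peak_freq: float, frame_type: str,
                          frequency_thresholds: Dict[str, float]) -> bool:
    """Return True when the peak may be howling and needs further checks.

    A peak at or below the threshold for its frame type is treated as
    containing no howling.
    """
    return peak_freq > frequency_thresholds[frame_type]


# Hypothetical values, chosen only to exercise the two functions.
power = [5.0, 4.0, 0.5, 0.2]
freqs = [250.0, 500.0, 2000.0, 4000.0]
thresholds = {"speech": 1500.0, "non_speech": 800.0}

frame_type = classify_frame(power, freqs, first_frequency=1000.0,
                            second_energy_threshold=3.0)
needs_check = passes_frequency_gate(2000.0, frame_type, thresholds)
```

Using the same value for both entries of `thresholds` reproduces the variant in which the speech and non-speech frame signals share one target frequency threshold.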
When the signal to be detected is a non-speech frame signal, the power at frequency points with unvoiced consonants and the power at frequency points with noise may be very high. To avoid falsely detecting a frequency point with an unvoiced consonant or a frequency point with noise as a frequency point with howling, the electronic device needs to distinguish frequency points with unvoiced consonants from frequency points with howling, and frequency points with noise from frequency points with howling. Based on this, please refer to fig. 5, which shows a flow chart of still another howling detection method. As shown in fig. 5, the howling detection method includes S501-S502:
S501: Acquire the sum of the energies of a first frequency band centered on the frequency point where the power spectrum peak is located; the frequency difference between any frequency point in the first frequency band and the frequency point where the power spectrum peak is located is smaller than a preset frequency value.
The first frequency band may be a fixed frequency band centered on the frequency point where the power spectrum peak is located, in which the absolute value of the difference between the frequency corresponding to any frequency point in the fixed frequency band and the frequency corresponding to the frequency point where the power spectrum peak is located is smaller than the preset frequency value.
Wherein the sum of the energies of the first frequency band can be calculated by the following expression:
E=∑F(f)
where E represents the sum of the energies in the first frequency band, f represents the frequency corresponding to a frequency point in the first frequency band, F(f) represents the power at frequency f, and the summation runs over all frequency points in the first frequency band.
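As a rough sketch of the expression above, the band energy can be accumulated over the bins whose frequency lies within the preset frequency value of the peak. The function name, argument names, and sample numbers below are assumptions made for illustration.

```python
from typing import Sequence


def band_energy(power: Sequence[float], freqs: Sequence[float],
                peak_index: int, preset_frequency_value: float) -> float:
    """Sum F(f) over bins with |f - f_peak| < preset_frequency_value."""
    peak_freq = freqs[peak_index]
    return sum(p for p, f in zip(power, freqs)
               if abs(f - peak_freq) < preset_frequency_value)


# Hypothetical spectrum with a peak at 200 Hz (index 2).
freqs = [0.0, 100.0, 200.0, 300.0, 400.0]
power = [1.0, 2.0, 10.0, 3.0, 1.0]
E = band_energy(power, freqs, peak_index=2, preset_frequency_value=150.0)
```

The comparison is strict, matching the statement that the frequency difference is smaller than the preset frequency value.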
S502: Determine, according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected, that one of howling, an unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
The unvoiced consonants are mainly concentrated in the high frequency band, so that the signal type containing the unvoiced consonants is a non-speech frame signal.
Specifically, the electronic device may first determine whether noise exists at the frequency point where the power spectrum peak is located according to the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected, and then determine whether an unvoiced consonant or howling exists at that frequency point according to the sum of the energies of the first frequency band.
Noise is about equally likely to appear at every frequency point, so the energy difference between frequency points is small and the sum of the energies of the first frequency band contains only a small part of the energy of the whole non-speech frame. Unvoiced consonants and howling do not have this feature: for them, the sum of the energies of the first frequency band contains most of the energy of the whole non-speech frame. Therefore, whether noise exists at the frequency point where the power spectrum peak is located can be judged according to the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected. Specifically, the electronic device may calculate this ratio and compare it with a preset ratio. If the ratio is less than or equal to the preset ratio, the electronic device determines that noise exists at the frequency point where the power spectrum peak is located and performs noise protection. Noise protection may mean that the non-speech frame signal is not processed, for example, howling suppression is not performed on the non-speech frame signal. If the ratio is greater than the preset ratio, the electronic device proceeds to judge, according to the sum of the energies of the first frequency band, whether an unvoiced consonant or howling exists at the frequency point where the power spectrum peak is located.
The power at a frequency point with an unvoiced consonant is usually smaller than the power at a frequency point with howling, and the powers of the frequency points within the first frequency band are comparable to one another. Consequently, when an unvoiced consonant exists at the frequency point where the power spectrum peak is located, the sum of the energies of the first frequency band is smaller; when howling exists at that frequency point, the sum of the energies of the first frequency band is larger. Therefore, the electronic device may compare the sum of the energies of the first frequency band with a first energy threshold. If the sum is less than or equal to the first energy threshold, the electronic device determines that an unvoiced consonant exists at the frequency point where the power spectrum peak is located and performs unvoiced consonant protection. Here, unvoiced consonant protection may mean that the non-speech frame signal is not processed, for example, howling suppression is not performed on the non-speech frame signal. If the sum is greater than the first energy threshold, the electronic device determines that howling exists at the frequency point where the power spectrum peak is located.
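The two-stage decision of S502 can be sketched as below. The function name, the returned labels, and the threshold values are illustrative assumptions. The ratio test is written as a multiplication so that an all-zero frame does not cause a division by zero, which is equivalent to the ratio comparison whenever the frame energy is positive.

```python
def classify_peak(band_energy_sum: float, frame_energy_sum: float,
                  preset_ratio: float, first_energy_threshold: float) -> str:
    """Classify a non-speech-frame peak as noise, unvoiced consonant or howling."""
    # Stage 1: noise spreads its energy over the frame, so the band around
    # the peak holds only a small share of the whole-frame energy.
    if band_energy_sum <= preset_ratio * frame_energy_sum:
        return "noise"
    # Stage 2: an unvoiced consonant carries less band energy than howling.
    if band_energy_sum <= first_energy_threshold:
        return "unvoiced_consonant"
    return "howling"


# Hypothetical thresholds, for illustration only.
label = classify_peak(band_energy_sum=80.0, frame_energy_sum=100.0,
                      preset_ratio=0.3, first_energy_threshold=50.0)
```

A caller would skip howling suppression for the `"noise"` and `"unvoiced_consonant"` results, which corresponds to the noise protection and unvoiced consonant protection described above.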
It is to be understood that the howling detection method described in fig. 5 can be combined with either of the embodiments of fig. 2 or fig. 4 when the signal to be detected is a non-speech frame signal. For example, in combination with the howling detection method in fig. 2, frequency points with single tones and frequency points with howling may be distinguished according to the number of frequency points between the first reference frequency point and the second reference frequency point, and frequency points with unvoiced consonants, frequency points with noise, and frequency points with howling may be distinguished according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected. For another example, in combination with the howling detection method in fig. 4, frequency points with single tones and frequency points with howling may be distinguished according to the number of frequency points between the third reference frequency point and the fourth reference frequency point, and frequency points with unvoiced consonants, frequency points with noise, and frequency points with howling may be distinguished according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected. Other combinations are not described in detail herein.
In an example, the frequency point of the howling existing in the signal to be detected can be detected by the howling detection method shown in fig. 2, fig. 4, or fig. 5. When it is detected that at least one frequency point with howling exists in the signal to be detected, the electronic device can determine that the howling exists in the signal to be detected. At this time, the electronic device may perform howling suppression on the signal to be detected. Therefore, embodiments of the present application may further include S21 and S22:
S21: Acquire the maximum amplitude value in the low frequency band of the signal to be detected.
Howling rarely occurs in the low frequency band, so no howling exists at the frequency point where the maximum amplitude value in the low frequency band is located. Therefore, the electronic device can use the maximum amplitude value in the low frequency band as a reference value in the howling suppression process.
Similar to S11, the signal to be detected may include a speech frame signal and a non-speech frame signal, and in one embodiment, the frequency threshold of the speech frame signal and the frequency threshold of the non-speech frame signal may be the same, both being the target frequency threshold. The electronic equipment can determine each frequency point in the low frequency band according to the target frequency threshold, then obtain the amplitude corresponding to the frequency of each frequency point in the low frequency band, and determine the maximum value of the amplitude.
In another embodiment, the frequency threshold of the speech frame signal and the frequency threshold of the non-speech frame signal may be different. The electronic device needs to acquire a signal type of a signal to be detected, determine each frequency point of a low frequency band according to a frequency threshold of the signal type, acquire an amplitude corresponding to the frequency of each frequency point in the low frequency band, and determine a maximum amplitude value. For example, when the signal to be detected is a speech frame signal, the electronic device may use a frequency point having a frequency less than or equal to a frequency threshold of the speech frame signal as a frequency point in a low frequency band, then obtain an amplitude corresponding to the frequency of each frequency point in the low frequency band, and determine a maximum amplitude value. For another example, when the signal to be detected is a non-language frame signal, the electronic device may use a frequency point having a frequency less than or equal to a frequency threshold of the non-language frame signal as a frequency point in a low frequency band, then obtain an amplitude corresponding to the frequency of each frequency point in the low frequency band, and determine a maximum amplitude value.
S22: Perform howling suppression on the signal to be detected according to the maximum amplitude value.
In an embodiment, the frequency points where howling exists in the signal to be detected have been detected by the howling detection method shown in fig. 2, fig. 4, or fig. 5. The electronic device can therefore suppress the howling directly with the maximum amplitude value, that is, replace the amplitude at each frequency point where a power spectrum peak with howling is located with the maximum amplitude value.
In another embodiment, because howling detection may be inaccurate, not all frequency points with howling in the signal to be detected may be detected. To suppress the howling more thoroughly, when it is determined that howling exists in the signal to be detected, every frequency point in the signal to be detected may be processed with the maximum amplitude value. Specifically, for any frequency point in the signal to be detected, if its amplitude is greater than the maximum amplitude value, the amplitude is replaced with the maximum amplitude value; if its amplitude is less than or equal to the maximum amplitude value, the amplitude is kept.
In this way, the electronic device suppresses howling at the frequency points where howling exists, and the signal after howling suppression is cleaner.
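Both suppression variants of S21 and S22 can be sketched in one function. The function name, the optional `howling_bins` argument, and the sample amplitudes are assumptions made for this illustration; the low frequency band is taken as the bins at or below the frequency threshold, as in the embodiments above.

```python
from typing import List, Optional, Sequence


def suppress_howling(amplitude: Sequence[float], freqs: Sequence[float],
                     frequency_threshold: float,
                     howling_bins: Optional[Sequence[int]] = None) -> List[float]:
    """Limit amplitudes using the maximum amplitude of the low band.

    If howling_bins is given, only those bins are overwritten with the
    reference value; otherwise every bin above the reference is clipped.
    """
    low_band = [a for a, f in zip(amplitude, freqs)
                if f <= frequency_threshold]
    if not low_band:  # no low-band bins: leave the frame unchanged
        return list(amplitude)
    reference = max(low_band)
    if howling_bins is None:
        return [min(a, reference) for a in amplitude]
    out = list(amplitude)
    for k in howling_bins:
        out[k] = reference
    return out


# Hypothetical frame: bin 3 was detected as howling, bin 4 was missed.
amplitude = [1.0, 3.0, 2.0, 9.0, 5.0]
freqs = [100.0, 200.0, 300.0, 2000.0, 3000.0]
targeted = suppress_howling(amplitude, freqs, 300.0, howling_bins=[3])
global_clip = suppress_howling(amplitude, freqs, 300.0)
```

The targeted variant leaves the undetected bin 4 untouched, while the global variant also clips it, which reflects the trade-off between the two embodiments.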
Referring to fig. 6, fig. 6 is a schematic structural diagram of a howling detection apparatus according to an embodiment of the present application. The apparatus may be an electronic device, an apparatus in the electronic device, or an apparatus that can be used together with the electronic device. The howling detection apparatus shown in fig. 6 may include an obtaining unit 601 and a determining unit 602. Wherein:
an obtaining unit 601, configured to obtain a frequency point where a power spectrum peak of a signal to be detected is located;
the obtaining unit 601 is further configured to obtain a first frequency point set and a second frequency point set that take the frequency point where the power spectrum peak is located as a center, where frequencies corresponding to the frequency points in the first frequency point set are less than or equal to frequencies corresponding to the power spectrum peak, and frequencies corresponding to the frequency points in the second frequency point set are greater than or equal to frequencies corresponding to the power spectrum peak;
a determining unit 602, configured to obtain a first power ratio between every two adjacent frequency points in a first frequency point set, and determine a first reference frequency point corresponding to a minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
the determining unit 602 is further configured to determine, according to the number of frequency points between the first reference frequency point and the second reference frequency point, that one of howling and single tone exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, the determining unit 602 is configured to determine that one of howling and single tone exists in a frequency point where a power spectrum peak is located according to the number of frequency points between a first reference frequency point and a second reference frequency point, and includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determining that the frequency point at which the power spectrum peak value is located has the single tone.
In some possible embodiments, the determining unit 602 is configured to determine that one of howling and single tone exists in a frequency point where a power spectrum peak is located according to the number of frequency points between a first reference frequency point and a second reference frequency point, and includes:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is greater than a preset number threshold, determining that the frequency point where the power spectrum peak value is located has howling.
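The behaviour of the determining unit 602 described above can be sketched as follows. Several details are assumptions rather than statements of the source: each power ratio is taken as the power of the bin farther from the peak divided by the power of its neighbour nearer the peak, the reference frequency point is taken as the farther bin of the pair with the smallest ratio, the count excludes the two reference frequency points themselves, and `half_width` (the size of each frequency point set) and `count_threshold` are hypothetical parameters.

```python
from typing import Sequence


def reference_bin(power: Sequence[float], peak: int, step: int,
                  half_width: int, eps: float = 1e-12) -> int:
    """Walk away from the peak (step=-1 left, +1 right) and return the
    outer bin of the adjacent pair with the smallest power ratio."""
    best_ratio = None
    best_bin = peak
    inner = peak
    for _ in range(half_width):
        outer = inner + step
        if outer < 0 or outer >= len(power):
            break
        ratio = power[outer] / max(power[inner], eps)
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_bin = ratio, outer
        inner = outer
    return best_bin


def classify_tone_or_howling(power: Sequence[float], peak: int,
                             half_width: int, count_threshold: int) -> str:
    """Return "howling" if more than count_threshold bins lie between the
    two reference bins, otherwise "single_tone"."""
    left = reference_bin(power, peak, -1, half_width)
    right = reference_bin(power, peak, +1, half_width)
    between = max(right - left - 1, 0)
    return "howling" if between > count_threshold else "single_tone"


# Hypothetical spectra: a narrow single-tone peak and a wider howling peak.
tone = [0.1, 0.1, 0.1, 100.0, 0.1, 0.1, 0.1]
howl = [0.1, 0.1, 50.0, 80.0, 100.0, 80.0, 50.0, 0.1, 0.1]
tone_label = classify_tone_or_howling(tone, peak=3, half_width=3, count_threshold=2)
howl_label = classify_tone_or_howling(howl, peak=4, half_width=4, count_threshold=2)
```

Under these assumptions the narrow peak yields one bin between the reference bins and the wide peak yields five, so the same count threshold separates the two cases.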
In some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
In some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
In some possible embodiments, the howling detection apparatus further includes:
the obtaining unit 601 is further configured to obtain a sum of energies of a first frequency band centered on a frequency point where a power spectrum peak is located; the frequency difference between the frequency point in the first frequency band and the frequency point at which the peak value of the power spectrum is located is smaller than a preset frequency value;
the determining unit 602 is further configured to determine, according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, the determining unit 602 is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of an entire frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is smaller than or equal to a preset ratio, determining that the frequency point where the power spectrum peak value is located has noise.
In some possible embodiments, the determining unit 602 is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of an entire frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is less than or equal to a first energy threshold, determining that the frequency point where the power spectrum peak value is located has the unvoiced consonant.
In some possible embodiments, the determining unit 602 is configured to determine, according to the sum of energies of the first frequency band and the sum of energies of an entire frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at a frequency point where a power spectrum peak is located, and includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is greater than a first energy threshold, determining that the frequency point where the power spectrum peak value is located has howling.
The howling detection device may be, for example: a chip, or a modular device. Each unit included in each apparatus and product described in the above embodiments may be a software unit, a hardware unit, or a part of the software unit and a part of the hardware unit. For example, for each device or product applied to or integrated into a chip, each unit included in the device or product may be implemented by hardware such as a circuit, or at least a part of the units may be implemented by a software program running on a processor integrated within the chip, and the rest (if any) part of the units may be implemented by hardware such as a circuit; for each device and product applied to or integrated in the module device, each unit included in the device and product may be implemented in a hardware manner such as a circuit, and different units may be located in the same component (e.g., a chip, a circuit unit, etc.) or different components of the module device, or at least a part of the units may be implemented in a software program running on a processor integrated in the module device, and the rest (if any) part of the units may be implemented in a hardware manner such as a circuit; for each device and product applied to or integrated in an electronic device, each unit included in the device and product may be implemented by hardware such as a circuit, different units may be located in the same component (e.g., a chip, a circuit unit, etc.) or different components in the electronic device, or at least some units may be implemented by a software program running on a processor integrated in the electronic device, and the rest (if any) of the units may be implemented by hardware such as a circuit.
For the relevant content of this embodiment, refer to the relevant content of the above method embodiments, which is not described in detail herein. This embodiment and the method embodiments described above are based on the same concept, and the technical effects they bring are also the same.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device includes a processor 701 and a memory 702, and the processor 701 and the memory 702 are connected by one or more communication buses 703.
The Processor 701 may be a Central Processing Unit (CPU), and may also be other general purpose processors, Digital Signal Processors (DSP), Application Specific Integrated Circuits (ASIC), Field-Programmable Gate arrays (FPGA) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and so on. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The processor 701 is configured to support the electronic device to perform a corresponding function of the electronic device in the aforementioned howling detection method.
The memory 702 may include read-only memory and random access memory, and provides computer programs and data to the processor 701. A portion of the memory 702 may also include non-volatile random access memory. When the processor 701 calls the computer program, it is configured to:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining whether the frequency point of the power spectrum peak has howling or single tone according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
In some possible embodiments, when determining, according to the number of frequency points between the first reference frequency point and the second reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located, the processor 701 is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determine that a single tone exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making the same determination, the processor 701 is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is greater than the preset number threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
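The two embodiments above amount to a single threshold comparison, which might be written as follows. This is only a sketch: the function name and the string labels are hypothetical, and the value of the preset number threshold is not given in this passage, so it is left as a parameter.

```python
def classify_by_point_count(count_between, count_threshold):
    """Label a spectral peak from the number of frequency points that lie
    between the first and second reference frequency points."""
    if count_between < 0:
        raise ValueError("count_between must be non-negative")
    # At or below the preset number threshold: single tone.
    if count_between <= count_threshold:
        return "single_tone"
    # Above the preset number threshold: howling.
    return "howling"
```

Note that the boundary case, where the count equals the threshold, is classified as a single tone, matching the "less than or equal to" wording above.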
In some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
In some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
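The frequency condition in the two embodiments above can be illustrated with a small Python sketch. The conversion from a frequency point index to hertz is the standard DFT relation; everything else is an assumption made for illustration: the parameter names, the idea that a frame arrives already labelled as speech or non-speech, and any threshold values passed in (this passage does not state them).

```python
def peak_frequency_hz(peak_bin, sample_rate_hz, fft_size):
    """Frequency in Hz of DFT bin `peak_bin` for an `fft_size`-point transform."""
    return peak_bin * sample_rate_hz / fft_size


def passes_frequency_gate(peak_bin, sample_rate_hz, fft_size, is_speech_frame,
                          speech_threshold_hz, non_speech_threshold_hz):
    """True if the peak frequency exceeds the threshold for this frame type.

    Both thresholds are hypothetical parameters, not values from the source.
    """
    threshold = speech_threshold_hz if is_speech_frame else non_speech_threshold_hz
    return peak_frequency_hz(peak_bin, sample_rate_hz, fft_size) > threshold
```

For example, bin 64 of a 512-point transform at a 16 kHz sample rate corresponds to 2000 Hz, so it passes a hypothetical 1500 Hz gate but not a hypothetical 2500 Hz gate.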
In some possible embodiments, the processor 701 is further configured to:
acquire the energy sum of a first frequency band centered on the frequency point where the power spectrum peak is located, where the frequency difference between each frequency point in the first frequency band and the frequency point where the power spectrum peak is located is smaller than a preset frequency value;
and determine, according to the energy sum of the first frequency band and the energy sum of the whole frame of the signal to be detected, whether howling, an unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor 701 is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is less than or equal to a preset ratio, determine that noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor 701 is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is less than or equal to a first energy threshold, determine that an unvoiced consonant exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor 701 is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is greater than the first energy threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
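The band-energy test described above can be sketched in Python as follows. The sketch assumes that the "energy sum" is the sum of power spectrum values, that `bin_hz` is the frequency spacing between adjacent frequency points, and that a frame with zero total energy is rejected; none of these details, nor any threshold values, are stated in this passage, and all names are hypothetical.

```python
def classify_by_band_energy(power, peak, bin_hz, max_offset_hz,
                            ratio_threshold, energy_threshold):
    """Label a spectral peak as noise, unvoiced consonant, or howling.

    power            -- power spectrum of the whole frame
    peak             -- index of the frequency point holding the peak
    bin_hz           -- frequency spacing between adjacent points (assumption)
    max_offset_hz    -- preset frequency value bounding the first frequency band
    ratio_threshold  -- preset ratio
    energy_threshold -- first energy threshold
    """
    frame_energy = sum(power)
    if frame_energy <= 0:
        raise ValueError("frame energy must be positive")

    # First frequency band: points whose frequency differs from the peak
    # frequency by less than the preset frequency value.
    band_energy = sum(p for k, p in enumerate(power)
                      if abs(k - peak) * bin_hz < max_offset_hz)

    if band_energy / frame_energy <= ratio_threshold:
        return "noise"               # energy is spread across the frame
    if band_energy <= energy_threshold:
        return "unvoiced_consonant"  # concentrated but weak
    return "howling"                 # concentrated and strong
```

The ratio test comes first, so the energy threshold only separates unvoiced consonants from howling once the energy is known to be concentrated around the peak.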
For the relevant content of this embodiment, refer to the relevant content of the foregoing method embodiment; details are not repeated here. This embodiment and the foregoing method embodiments are based on the same concept and bring the same technical effects.
An embodiment of the present application further provides a chip that can perform the relevant steps of the electronic device in the foregoing method embodiments. The chip is used for:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining, according to the number of frequency points between the first reference frequency point and the second reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located.
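The first operation presupposes a power spectrum of the frame. As a rough illustration of where such a spectrum could come from, the Python sketch below takes the squared magnitude of a discrete Fourier transform and then locates the peak. It makes several assumptions not drawn from this passage: no window function is applied, only non-negative frequencies are kept, and a naive O(N²) DFT is used purely to keep the example self-contained (a practical system would more likely use an FFT routine).

```python
import cmath
import math


def power_spectrum(frame):
    """Squared-magnitude DFT of a real frame, for bins 0 .. len(frame) // 2."""
    n = len(frame)
    if n == 0:
        raise ValueError("frame is empty")
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def peak_bin(power):
    """Index of the frequency point where the power spectrum peak is located."""
    return max(range(len(power)), key=power.__getitem__)
```

As a quick check, a 32-sample sine wave with exactly five cycles in the frame, `[math.sin(2 * math.pi * 5 * t / 32) for t in range(32)]`, has its peak at bin 5.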
In some possible embodiments, when determining, according to the number of frequency points between the first reference frequency point and the second reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located, the chip is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determine that a single tone exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making the same determination, the chip is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is greater than the preset number threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
In some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
In some possible embodiments, the chip is further configured to:
acquire the energy sum of a first frequency band centered on the frequency point where the power spectrum peak is located, where the frequency difference between each frequency point in the first frequency band and the frequency point where the power spectrum peak is located is smaller than a preset frequency value;
and determine, according to the energy sum of the first frequency band and the energy sum of the whole frame of the signal to be detected, whether howling, an unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the chip is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is less than or equal to a preset ratio, determine that noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the chip is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is less than or equal to a first energy threshold, determine that an unvoiced consonant exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the chip is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is greater than the first energy threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
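The three cases above can be written compactly as a piecewise rule. The notation is introduced here only for illustration and is not from the source: $P(k)$ is the power at frequency point $k$, $f_k$ its frequency, $f_p$ the frequency of the power spectrum peak, $\Delta f$ the preset frequency value, $\gamma$ the preset ratio, and $T_E$ the first energy threshold. It is also assumed that an "energy sum" is a sum of power spectrum values.

```latex
E_{\text{band}} = \sum_{k \,:\, |f_k - f_p| < \Delta f} P(k), \qquad
E_{\text{frame}} = \sum_{k} P(k), \qquad
r = \frac{E_{\text{band}}}{E_{\text{frame}}}

\text{result} =
\begin{cases}
\text{noise}, & r \le \gamma \\
\text{unvoiced consonant}, & r > \gamma \ \text{and}\ E_{\text{band}} \le T_E \\
\text{howling}, & r > \gamma \ \text{and}\ E_{\text{band}} > T_E
\end{cases}
```

The three conditions are mutually exclusive and cover every frame with $E_{\text{frame}} > 0$, so exactly one label is assigned.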
For the relevant content of this embodiment, refer to the relevant content of the foregoing method embodiment; details are not repeated here. This embodiment and the foregoing method embodiments are based on the same concept and bring the same technical effects.
An embodiment of the present application further provides a module device. The module device includes a processor and a communication interface; the processor is connected to the communication interface, the communication interface is used for receiving and transmitting signals, and the processor is used for:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as the center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in a first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in a second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining, according to the number of frequency points between the first reference frequency point and the second reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when determining, according to the number of frequency points between the first reference frequency point and the second reference frequency point, whether howling or a single tone exists at the frequency point where the power spectrum peak is located, the processor is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determine that a single tone exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making the same determination, the processor is configured to:
if the number of frequency points between the first reference frequency point and the second reference frequency point is greater than the preset number threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
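One way to formalize the reference frequency points and the resulting decision is given below. The notation and several details are assumptions made for illustration, since this passage does not fix them: $P(k)$ is the power at frequency point $k$, $k_p$ is the peak, $M$ is a hypothetical number of points taken on each side of the peak, each ratio divides the power of the point farther from the peak by that of its neighbour nearer the peak, and the count excludes the two reference points themselves. $N_{\text{th}}$ denotes the preset number threshold.

```latex
k_1 = \arg\min_{k_p - M \,\le\, k \,\le\, k_p - 1} \frac{P(k)}{P(k+1)}, \qquad
k_2 = \arg\min_{k_p + 1 \,\le\, k \,\le\, k_p + M} \frac{P(k)}{P(k-1)}

N = k_2 - k_1 - 1, \qquad
\text{result} =
\begin{cases}
\text{single tone}, & N \le N_{\text{th}} \\
\text{howling}, & N > N_{\text{th}}
\end{cases}
```

Under these assumptions $k_1$ and $k_2$ mark the steepest drops in power on either side of the peak, so $N$ measures the width of the peak in frequency points.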
In some possible embodiments, the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
In some possible embodiments, the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
In some possible embodiments, the processor is further configured to:
acquire the energy sum of a first frequency band centered on the frequency point where the power spectrum peak is located, where the frequency difference between each frequency point in the first frequency band and the frequency point where the power spectrum peak is located is smaller than a preset frequency value;
and determine, according to the energy sum of the first frequency band and the energy sum of the whole frame of the signal to be detected, whether howling, an unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is less than or equal to a preset ratio, determine that noise exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is less than or equal to a first energy threshold, determine that an unvoiced consonant exists at the frequency point where the power spectrum peak is located.
In some possible embodiments, when making this determination, the processor is configured to:
if the ratio of the energy sum of the first frequency band to the energy sum of the whole frame of the signal to be detected is greater than the preset ratio and the energy sum of the first frequency band is greater than the first energy threshold, determine that howling exists at the frequency point where the power spectrum peak is located.
For the relevant content of this embodiment, refer to the relevant content of the foregoing method embodiment; details are not repeated here. This embodiment and the foregoing method embodiments are based on the same concept and bring the same technical effects.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the howling detection method described in the embodiments of the present application; details are not repeated here.
The computer readable storage medium may be an internal storage unit of the electronic device of any of the foregoing embodiments, such as a hard disk or a memory of the device. The computer readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk provided on the device, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the electronic device. The computer-readable storage medium is used for storing a computer program and other programs and data required by the electronic device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
It will be understood by those skilled in the art that all or part of the processes in the methods of the embodiments described above may be implemented by a computer program, which may be stored in a readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; equivalent variations and modifications made to the present application therefore remain within the scope covered by the present application.

Claims (14)

1. A howling detection method, comprising:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as a center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in the first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in the second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining one of howling and single tone at the frequency point where the power spectrum peak value is located according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
2. The method of claim 1, wherein the determining that one of howling and single-tone exists in the frequency point where the power spectrum peak is located according to the number of frequency points between the first reference frequency point and the second reference frequency point comprises:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is less than or equal to a preset number threshold, determining that the frequency point at which the power spectrum peak value is located has the single tone.
3. The method of claim 1, wherein the determining that one of howling and single-tone exists in the frequency point where the power spectrum peak is located according to the number of frequency points between the first reference frequency point and the second reference frequency point comprises:
and if the number of the frequency points between the first reference frequency point and the second reference frequency point is greater than a preset number threshold, determining that the howling exists at the frequency point where the power spectrum peak value is located.
4. The method of claim 1, wherein the signal to be detected is a speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the speech frame signal.
5. The method of claim 1, wherein the signal to be detected is a non-speech frame signal, and the frequency corresponding to the peak of the power spectrum is greater than the frequency threshold of the non-speech frame signal.
6. The method of claim 5, wherein the method further comprises:
acquiring the energy sum of a first frequency band taking the frequency point where the peak value of the power spectrum is located as the center; the frequency difference between the frequency point in the first frequency band and the frequency point at which the peak value of the power spectrum is located is smaller than a preset frequency value;
and determining one of howling, unvoiced consonant or noise at the frequency point where the power spectrum peak is located according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected.
7. The method as claimed in claim 6, wherein the determining, according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is smaller than or equal to a preset ratio, determining that the noise exists at the frequency point where the power spectrum peak value is located.
8. The method as claimed in claim 6, wherein the determining, according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio, and the sum of the energies of the first frequency band is less than or equal to a first energy threshold, determining that the unvoiced consonant is present at the frequency point where the power spectrum peak is located.
9. The method as claimed in claim 6, wherein the determining, according to the sum of the energies of the first frequency band and the sum of the energies of the whole frame of the signal to be detected, that one of howling, unvoiced consonant, or noise exists at the frequency point where the power spectrum peak is located includes:
and if the ratio of the sum of the energies of the first frequency band to the sum of the energies of the whole frame of the signal to be detected is greater than a preset ratio and the sum of the energies of the first frequency band is greater than a first energy threshold, determining that the howling exists at the frequency point where the power spectrum peak value is located.
10. A howling detection apparatus, characterized in that the apparatus comprises:
an acquiring unit, configured to acquire the frequency point where the power spectrum peak value of a signal to be detected is located;
the acquiring unit is further configured to acquire a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak is located as a center, where frequencies corresponding to the frequency points in the first frequency point set are less than or equal to frequencies corresponding to the power spectrum peak, and frequencies corresponding to the frequency points in the second frequency point set are greater than or equal to frequencies corresponding to the power spectrum peak;
a determining unit, configured to obtain a first power ratio between every two adjacent frequency points in the first frequency point set, and determine a first reference frequency point corresponding to a minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in the second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
the determining unit is further configured to determine, according to the number of frequency points between the first reference frequency point and the second reference frequency point, that one of howling and single tone exists at the frequency point where the power spectrum peak is located.
11. An electronic device, comprising a processor and a memory, the processor being connected to the memory, wherein the memory is configured to store program code, and wherein the processor is configured to invoke the program code to perform the howling detection method according to any one of claims 1 to 9.
12. A chip, characterized in that,
the chip is used for acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as a center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in the first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in the second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining one of howling and single tone at the frequency point where the power spectrum peak value is located according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
13. A module device, comprising a processor and a communication interface, the processor being coupled to the communication interface, the communication interface being configured to transmit and receive signals, and the processor being configured for:
acquiring a frequency point where a power spectrum peak value of a signal to be detected is located;
acquiring a first frequency point set and a second frequency point set which take the frequency point where the power spectrum peak value is located as a center, wherein the frequency corresponding to the frequency point in the first frequency point set is less than or equal to the frequency corresponding to the power spectrum peak value, and the frequency corresponding to the frequency point in the second frequency point set is greater than or equal to the frequency corresponding to the power spectrum peak value;
acquiring a first power ratio between every two adjacent frequency points in the first frequency point set, and determining a first reference frequency point corresponding to the minimum first power ratio in the first frequency point set; acquiring a second power ratio between every two adjacent frequency points in the second frequency point set, and determining a second reference frequency point corresponding to the minimum second power ratio in the second frequency point set;
and determining one of howling and single tone at the frequency point where the power spectrum peak value is located according to the number of the frequency points between the first reference frequency point and the second reference frequency point.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, implements the howling detection method according to any one of the preceding claims 1 to 9.
CN202110511757.7A 2021-05-11 2021-05-11 Howling detection method and device and electronic equipment Active CN113316074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110511757.7A CN113316074B (en) 2021-05-11 2021-05-11 Howling detection method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110511757.7A CN113316074B (en) 2021-05-11 2021-05-11 Howling detection method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113316074A true CN113316074A (en) 2021-08-27
CN113316074B CN113316074B (en) 2022-07-05

Family

ID=77372865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110511757.7A Active CN113316074B (en) 2021-05-11 2021-05-11 Howling detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113316074B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0599450A2 (en) * 1992-11-25 1994-06-01 Matsushita Electric Industrial Co., Ltd. Sound amplifying apparatus with automatic howl-suppressing function
EP0843502A1 (en) * 1996-11-13 1998-05-20 Yamaha Corporation Howling detection and prevention circuit and a loudspeaker system employing the same
US20040170283A1 (en) * 2002-03-12 2004-09-02 Yasuhiro Terada Howling control device and howling control method
JP2006203459A (en) * 2005-01-19 2006-08-03 Sony Corp Howling eliminating device
EP3011757A1 (en) * 2013-06-19 2016-04-27 Creative Technology Ltd. Acoustic feedback canceller
CN107708048A (en) * 2017-09-05 2018-02-16 腾讯科技(深圳)有限公司 Howling detection method and device, storage medium and electronic device
CN109218957A (en) * 2018-10-23 2019-01-15 北京达佳互联信息技术有限公司 Howling detection method, device, electronic equipment and storage medium
CN110213694A (en) * 2019-04-16 2019-09-06 浙江大华技术股份有限公司 Audio device, howling processing method thereof, and computer storage medium
CN110536215A (en) * 2019-09-09 2019-12-03 普联技术有限公司 Audio signal processing method and apparatus, computing device and storage medium
CN111402911A (en) * 2019-12-23 2020-07-10 佛山慧明电子科技有限公司 Howling detection and suppression method
CN111800725A (en) * 2020-05-29 2020-10-20 展讯通信(上海)有限公司 Howling detection method and device, storage medium and computer equipment
CN112562717A (en) * 2020-12-01 2021-03-26 广州华多网络科技有限公司 Howling detection method, howling detection device, storage medium and computer equipment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
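FREVISION优选: "Apple iOS howling frequency point test software: howling suppression by the notch filter method", 《BLOG.CSDN.NET/WEIXIN_34307029/ARITICLE/DETAILS/111918658》 *
张涛 et al.: "A howling detection method with low false-alarm probability", 《Journal of Xidian University》 *
梁民 et al.: "Research and prospects of acoustic feedback control technology", 《Digital Technology and Application》 *
郝国莉: "Research on acoustic feedback howling suppression algorithms", 《China Master's Theses Full-text Database, Information Science and Technology》 *
高卫东: "Design and implementation of an acoustic feedback suppression algorithm based on SHARC DSP", 《China Master's Theses Full-text Database, Information Science and Technology》 *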

Also Published As

Publication number Publication date
CN113316074B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
US9666183B2 (en) Deep neural net based filter prediction for audio event classification and extraction
US20200227071A1 (en) Analysing speech signals
US10339956B2 (en) Method and apparatus for detecting audio signal according to frequency domain energy
CN111383646B (en) Voice signal transformation method, device, equipment and storage medium
CN105118522B (en) Noise detection method and device
US9749741B1 (en) Systems and methods for reducing intermodulation distortion
CN109218957A (en) Howling detection method, device, electronic equipment and storage medium
US9888330B1 (en) Detecting signal processing component failure using one or more delay estimators
US20180350378A1 (en) Detecting and reducing feedback
CN108234793B (en) Communication method, communication device, electronic equipment and storage medium
US20150325252A1 (en) Method and device for eliminating noise, and mobile terminal
TW201903755A (en) Electronic device capable of adjusting output sound and method of adjusting output sound
CN113010139A (en) Screen projection method and device and electronic equipment
CN113316075B (en) Howling detection method and device and electronic equipment
US20210201928A1 (en) Integrated speech enhancement for voice trigger application
CN111477246B (en) Voice processing method and device and intelligent terminal
JP6268916B2 (en) Abnormal conversation detection apparatus, abnormal conversation detection method, and abnormal conversation detection computer program
CN104282303A (en) Method and electronic device for speech recognition using voiceprint recognition
US9351072B2 (en) Multi-band harmonic discrimination for feedback suppression
CN113316074B (en) Howling detection method and device and electronic equipment
US10964307B2 (en) Method for adjusting voice frequency and sound playing device thereof
CN115376501B (en) Voice enhancement method and device, storage medium and electronic equipment
US20150279373A1 (en) Voice response apparatus, method for voice processing, and recording medium having program stored thereon
CN107154996B (en) Incoming call interception method and device, storage medium and terminal
CN111429890B (en) Weak voice enhancement method, voice recognition method and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant