CN118250389A - Echo cancellation method, device, electronic equipment, vehicle-mounted system and storage medium - Google Patents
- Publication number
- CN118250389A (application CN202311738192.1A)
- Authority
- CN
- China
- Prior art keywords
- signal
- processing
- echo cancellation
- echo
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/178—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
- G10K11/1785—Methods, e.g. algorithms; Devices
- G10K11/17853—Methods, e.g. algorithms; Devices of the filter
- G10K11/17854—Methods, e.g. algorithms; Devices of the filter the filter being an adaptive filter
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/58—Anti-side-tone circuits
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
- H04M1/6033—Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
- H04M1/6041—Portable telephones adapted for handsfree use
- H04M1/6075—Portable telephones adapted for handsfree use adapted for handsfree use in a vehicle
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K2210/00—Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
- G10K2210/30—Means
- G10K2210/301—Computational
- G10K2210/3054—Stepsize variation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Abstract
The application provides an echo cancellation method, an echo cancellation device, an electronic device, a vehicle-mounted system, and a storage medium. The echo cancellation method comprises the following steps: obtaining a first loopback signal and a near-end voice signal acquired by a microphone, wherein the first loopback signal is a signal obtained by power-amplifying a received far-end voice signal; performing, based on the first loopback signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal; performing, based on the first loopback signal, echo cancellation processing on the first processed signal in frequency-domain subbands to obtain a second processed signal; and sending the second processed signal to the far end. The method and the device improve the effect of echo cancellation.
Description
Technical Field
The present application relates to the field of signal processing technologies, and in particular, to an echo cancellation method, an echo cancellation device, an electronic device, a vehicle-mounted system, and a storage medium.
Background
In a voice call scenario, particularly a scenario in which a voice call is performed based on a vehicle-mounted voice system, the voice call is often affected by an echo, so that the voice call quality is not guaranteed.
Currently, echo cancellation may be performed on voice signals to improve call quality. The main approach is to process the voice signal linearly through a filter. Because residual echo may remain in the linearly processed voice, nonlinear processing may then be applied to it.
However, this approach cannot cancel all the echo, and the far end can still receive the residual echo which is not completely cancelled, thereby reducing the call quality of the voice call.
Disclosure of Invention
An object of the present application is to provide an echo cancellation method in which, after a near-end voice signal has undergone linear echo cancellation processing and nonlinear echo cancellation processing, the resulting signal undergoes further echo cancellation processing in frequency-domain subbands based on the far-end voice signal. This further processing does not depend on the strength of the echo remaining after the linear and nonlinear processing, so the processed voice signal is cleaner and less residual echo reaches the far end, which improves the echo cancellation effect and, in turn, the quality of the voice call. In addition, performing the echo cancellation processing in frequency-domain subbands significantly reduces the amount of computation, making the method lightweight.
Another object of the present application is to provide an echo cancellation method in which echo estimation on a third processed signal yields an amplitude envelope signal of a first echo signal, so the first echo signal itself need not be estimated and the amount of computation is further reduced. Echo suppression processing performed on the third processed signal according to this amplitude envelope signal can still thoroughly remove residual echo and yield a clean voice signal, which improves the echo cancellation effect and the quality of the voice call.
The technical scheme of the application is realized as follows:
In a first aspect, the present application provides an echo cancellation method, which may include: obtaining a first loopback signal and a near-end voice signal acquired by a microphone, wherein the first loopback signal is a signal obtained by power-amplifying a received far-end voice signal; performing, based on the first loopback signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal; performing, based on the first loopback signal, echo cancellation processing on the first processed signal in frequency-domain subbands to obtain a second processed signal; and sending the second processed signal to the far end.
In some possible embodiments, performing, based on the first loopback signal, echo cancellation processing on the first processed signal in frequency-domain subbands to obtain a second processed signal includes: performing frequency-domain power spectrum division on the first loopback signal and the first processed signal to obtain a plurality of frequency-domain subbands; for each frequency-domain subband, performing delay alignment on the first loopback signal and the first processed signal to obtain a second loopback signal and a third processed signal; for each frequency-domain subband, performing echo estimation on the third processed signal through a first filtering algorithm based on the second loopback signal to obtain an amplitude envelope signal of a first echo signal; and performing echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal to obtain the second processed signal.
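As an illustrative sketch only, the subband division and delay alignment steps described above could be written as follows. The application does not specify an implementation at this point, so the function names `band_powers`, `estimate_delay` and `align`, the use of contiguous equal-width bands, and the normalized cross-correlation criterion used to pick the delay are all assumptions made for illustration. The inputs are assumed to be per-frame power values of the power-amplified far-end reference and of the signal left after linear and nonlinear processing.

```python
import math


def band_powers(power_spectrum, n_bands):
    """Group FFT-bin powers into n_bands contiguous subbands (assumed layout)."""
    n = len(power_spectrum)
    edges = [i * n // n_bands for i in range(n_bands + 1)]
    return [sum(power_spectrum[edges[i]:edges[i + 1]]) for i in range(n_bands)]


def estimate_delay(ref_env, sig_env, max_lag):
    """Return the lag (in frames) by which sig_env trails ref_env.

    Uses normalized cross-correlation over the overlapping frames, so a
    shorter overlap at larger lags is not penalized.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(ref_env), len(sig_env) - lag)
        if n <= 0:
            break
        ref_seg = ref_env[:n]
        sig_seg = sig_env[lag:lag + n]
        dot = sum(r * s for r, s in zip(ref_seg, sig_seg))
        denom = math.sqrt(sum(r * r for r in ref_seg) * sum(s * s for s in sig_seg))
        score = dot / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(ref_env, sig_env, lag):
    """Trim both sequences so that frame t of each refers to the same sound."""
    n = min(len(ref_env), len(sig_env) - lag)
    return ref_env[:n], sig_env[lag:lag + n]
```

In such a sketch, `estimate_delay` and `align` would be run once per subband on the subband power sequences produced by `band_powers`, yielding the aligned pair of signals used by the later estimation step.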
In some possible embodiments, performing echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal to obtain the second processed signal includes: obtaining an energy spectrum of the first echo signal according to the amplitude envelope signal of the first echo signal; calculating a Wiener filter coefficient according to the energy spectrum of the first echo signal and an energy spectrum of the third processed signal; and performing Wiener filtering on the third processed signal based on the Wiener filter coefficient to obtain the second processed signal.
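The Wiener filtering step can be illustrated with a short sketch. The application does not give the coefficient formula here, so the gain rule below (one minus the ratio of estimated echo energy to signal energy, limited to a spectral floor) is a commonly used Wiener-type form adopted as an assumption. The names `wiener_gains` and `apply_gains` and the parameters `floor` and `eps` are hypothetical.

```python
def wiener_gains(echo_env, sig_power, floor=0.05, eps=1e-12):
    """Per-subband suppression gains from an echo amplitude envelope.

    echo_env  : estimated echo amplitude per subband
    sig_power : energy of the signal to be cleaned, per subband
    floor     : assumed lower limit on the gain, to avoid over-suppression
    """
    gains = []
    for amp, power in zip(echo_env, sig_power):
        echo_power = amp * amp                  # energy from amplitude envelope
        gain = 1.0 - echo_power / (power + eps)  # assumed Wiener-type rule
        gains.append(min(1.0, max(floor, gain)))
    return gains


def apply_gains(subband_values, gains):
    """Scale each subband value by its gain."""
    return [g * v for g, v in zip(gains, subband_values)]
```

A subband with no estimated echo keeps a gain of 1, while a subband whose energy is explained entirely by the echo estimate is attenuated down to the floor.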
In some possible implementations, the first filtering algorithm is a normalized least mean square (NLMS) adaptive filtering algorithm.
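A minimal sketch of NLMS applied to amplitude envelopes within one subband is given below, assuming the two envelopes have already been delay-aligned. The function name `nlms_envelope`, the tap count, the step size `mu` and the regularization term `eps` are hypothetical choices, not values taken from the application. Because the filter works on slowly varying envelopes rather than on waveform samples, a few taps per subband may suffice, which is consistent with the reduced computation described above.

```python
def nlms_envelope(ref_env, sig_env, taps=4, mu=0.5, eps=1e-9):
    """Estimate the echo amplitude envelope in one subband with NLMS.

    ref_env : aligned amplitude envelope of the far-end reference
    sig_env : aligned amplitude envelope of the signal to be cleaned
    Returns (estimated echo envelope, final filter weights).
    """
    w = [0.0] * taps
    echo_env = []
    for t in range(len(sig_env)):
        # Most recent `taps` reference frames, newest first, zero-padded.
        x = [ref_env[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        pred = sum(wk * xk for wk, xk in zip(w, x))
        err = sig_env[t] - pred
        norm = sum(xk * xk for xk in x) + eps
        # Normalized LMS update: step scaled by the input energy.
        w = [wk + mu * err * xk / norm for wk, xk in zip(w, x)]
        echo_env.append(max(0.0, pred))  # an amplitude cannot be negative
    return echo_env, w
```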
In some possible embodiments, performing, based on the first loopback signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal includes: performing linear echo cancellation processing on the near-end voice signal through a second filtering algorithm according to the first loopback signal to obtain a second echo signal; calculating a difference signal obtained by subtracting the second echo signal from the near-end voice signal; and performing nonlinear echo cancellation processing on the difference signal to obtain the first processed signal.
In some possible implementations, the second filtering algorithm is a variable step NLMS algorithm.
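The linear stage can be sketched as a time-domain adaptive filter that predicts the echo from the far-end reference and subtracts it from the microphone signal. The application does not state how the step size is varied, so the rule below (a step size that grows with the smoothed error power and stays between `mu_min` and `mu_max`) is only one hypothetical choice. The function name `vss_nlms_aec` and all parameter values are likewise assumptions.

```python
def vss_nlms_aec(ref, mic, taps=64, mu_min=0.1, mu_max=1.0,
                 c=1e-3, smooth=0.9, eps=1e-9):
    """Linear echo cancellation with a variable-step NLMS filter (sketch).

    ref : far-end reference samples
    mic : near-end microphone samples
    Returns (difference signal, final filter weights).
    """
    w = [0.0] * taps
    err_pow = 0.0
    residual = []
    for t in range(len(mic)):
        # Most recent `taps` reference samples, newest first, zero-padded.
        x = [ref[t - k] if t - k >= 0 else 0.0 for k in range(taps)]
        echo_hat = sum(wk * xk for wk, xk in zip(w, x))  # estimated echo
        e = mic[t] - echo_hat                             # difference signal
        # Assumed rule: larger smoothed error power gives a larger step.
        err_pow = smooth * err_pow + (1.0 - smooth) * e * e
        mu = mu_min + (mu_max - mu_min) * err_pow / (err_pow + c)
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        residual.append(e)
    return residual, w
```

The returned difference signal corresponds to the input of the nonlinear echo cancellation processing. A practical system would typically also freeze or slow adaptation during double talk, which this sketch does not attempt.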
In a second aspect, the present application provides an echo cancellation device, which may comprise: an acquisition module, configured to obtain a first loopback signal and a near-end voice signal acquired by a microphone, wherein the first loopback signal is a signal obtained by power-amplifying a received far-end voice signal; a first processing module, configured to perform, based on the first loopback signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal; a second processing module, configured to perform, based on the first loopback signal, echo cancellation processing on the first processed signal in frequency-domain subbands to obtain a second processed signal; and a transmitting module, configured to send the second processed signal to the far end.
In some possible embodiments, the second processing module may include: a delay alignment module, configured to perform frequency-domain power spectrum division on the first loopback signal and the first processed signal to obtain a plurality of frequency-domain subbands and, for each frequency-domain subband, to perform delay alignment on the first loopback signal and the first processed signal to obtain a second loopback signal and a third processed signal; an envelope estimation module, configured to perform, for each frequency-domain subband, echo estimation on the third processed signal through a first filtering algorithm based on the second loopback signal to obtain an amplitude envelope signal of a first echo signal; and an echo suppression module, configured to perform echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal to obtain the second processed signal.
In some possible implementations, the echo suppression module is configured to: obtain an energy spectrum of the first echo signal according to the amplitude envelope signal of the first echo signal; calculate a Wiener filter coefficient according to the energy spectrum of the first echo signal and an energy spectrum of the third processed signal; and perform Wiener filtering on the third processed signal based on the Wiener filter coefficient to obtain the second processed signal.
In some possible implementations, the first filtering algorithm is an NLMS algorithm.
In some possible embodiments, the first processing module includes: a linear echo cancellation module, configured to perform linear echo cancellation processing on the near-end voice signal through a second filtering algorithm according to the first loopback signal to obtain a second echo signal, and to calculate a difference signal obtained by subtracting the second echo signal from the near-end voice signal; and a nonlinear echo cancellation module, configured to perform nonlinear echo cancellation processing on the difference signal to obtain a first processed signal.
In some possible implementations, the second filtering algorithm is a variable step NLMS algorithm.
In a third aspect, the present application provides an electronic device comprising: at least one processor; and a memory coupled to the at least one processor, the memory containing instructions stored therein, which when executed by the at least one processor, cause the at least one processor to perform the echo cancellation method according to any one of the first aspect and possible implementations thereof.
In a fourth aspect, the present application provides an in-vehicle system comprising: a speaker, a microphone, and an electronic device; the loudspeaker and the microphone are connected with the electronic equipment; wherein, the loudspeaker is configured to play the far-end voice signal; a microphone configured to collect a near-end speech signal; an electronic device configured to perform the echo cancellation method according to any one of the first aspect and its possible implementation forms.
In a fifth aspect, the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, is capable of implementing the echo cancellation method according to any one of the first aspect and its possible implementation manners.
In a sixth aspect, the present application provides a computer program comprising computer-readable program instructions which, when run in a computer device, cause a processor in the computer device to carry out some or all of the steps of the above method.
In a seventh aspect, the present application provides a computer program product comprising a computer program or instructions which, when executed by a processor, is capable of implementing the echo cancellation method as in any one of the first aspect and its possible implementation manners.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application, as claimed.
Drawings
Fig. 1 is a schematic diagram of an application scenario of an echo cancellation method in an embodiment of the present application;
Fig. 2 is a schematic flow chart of an implementation of an echo cancellation method according to an embodiment of the present application;
Fig. 3 is a flowchart of another implementation of the echo cancellation method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an echo cancellation device according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solution of the present application is further elaborated below with reference to the accompanying drawings and embodiments. The described embodiments should not be construed as limiting the application, and all other embodiments obtained by one skilled in the art without inventive effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict. The term "first/second/third" is merely to distinguish similar objects and does not represent a particular ordering of objects, it being understood that the "first/second/third" may be interchanged with a particular order or precedence, as allowed, to enable embodiments of the application described herein to be implemented in other than those illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing the application only and is not intended to be limiting of the application.
In a voice call scenario, particularly a scenario in which a voice call is performed based on a vehicle-mounted voice system, the voice call is often affected by an echo, so that the voice call quality is not guaranteed.
In a voice call, after the far-end speaker's voice is transmitted to the near end, it is played by the loudspeaker of the near-end device and then either picked up by the near-end microphone after acoustic reflection in the reverberant sound field, or coupled back through the 2-wire/4-wire conversion of the electronic circuit. Either way, the voice is transmitted back to the far end, so the far-end speaker hears what he or she has just said. This phenomenon is called the echo phenomenon, and the sound heard by the far-end speaker is called an echo.
In general, echoes are classified by cause into acoustic echoes and line echoes. An acoustic echo arises when sound played by the loudspeaker, in hands-free mode or after power amplification, is picked up by the near-end microphone. A line echo arises from the 2-wire/4-wire coupling of the electronic circuit.
In one approach, the echo cancellation algorithm processes the voice signal linearly through a filter. Because residual echo may remain in the linearly processed voice, nonlinear processing is then applied to it to achieve echo cancellation. However, this approach cannot eliminate all echoes. When the near-end loudspeaker is in hands-free or speakerphone mode, echo cancellation works well at low or medium volume, but in a car or another reverberant space the echo signal becomes stronger at high or very high volume, so the echo is not cancelled thoroughly and the quality of the voice call is reduced.
In addition, the vehicle and its surroundings are complex and subject to large disturbances, such as the various in-vehicle prompt tones during reversing or steering and the various noises inside the vehicle. These may prevent the echo cancellation algorithm from converging, which causes echo leakage: the far end can still hear its own voice, the quality of the voice call is reduced, and the requirements of vehicle-mounted echo cancellation cannot be fully met.
In another approach, the echo cancellation algorithm detects, by computing correlations, whether the far-end and/or near-end signals contain voice, and then either uses filters to cancel the echo step by step and segment by segment or changes the update rate of the filter, so as to optimize echo cancellation.
However, differences in the filter processing weights may impair subjective listening quality, for example the smoothness and noisiness of the voice call, and thus degrade voice quality. Furthermore, because the environment of an in-vehicle voice call is complex and subject to large disturbances, the echo cancellation algorithm may fail to converge, which causes echo leakage: the far end can still hear its own voice, the quality of the voice call is reduced, and the requirements of vehicle-mounted echo cancellation cannot be fully met.
In order to solve the above problems, embodiments of the present application provide an echo cancellation method, an echo cancellation device, an electronic device, a vehicle-mounted system, and a storage medium, so as to improve the effect of echo cancellation and further improve the voice call quality.
As shown in fig. 1, fig. 1 is a schematic diagram of an application scenario of an echo cancellation method according to an embodiment of the present application. The application scenario may include a communication system 101 and a communication system 102, wherein the communication system 101 includes an electronic device 101-1, a speaker 101-2, and a microphone 101-3, and the communication system 102 includes an electronic device 102-1, a speaker 102-2, and a microphone 102-3.
In some embodiments, communication system 101 is a near-end communication system and communication system 102 is a far-end communication system. In this case, the electronic device 101-1 obtains the first loopback signal and the near-end voice signal. The first loopback signal is the far-end voice signal after power amplification, and the far-end voice signal is the voice signal that the electronic device 102-1 obtains by processing the sound collected by the microphone 102-3 and sends to the electronic device 101-1; the near-end voice signal is obtained by the electronic device 101-1 processing the sound collected by the microphone 101-3. The electronic device 101-1 performs, based on the first loopback signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal; the electronic device 101-1 performs, based on the first loopback signal, echo cancellation processing on the first processed signal in frequency-domain subbands to obtain a second processed signal; and the electronic device 101-1 sends the second processed signal to the electronic device 102-1.
In some embodiments, the electronic device 101-1 and the electronic device 102-1 may each be, but are not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, or a smart television. The speaker 101-2 and the speaker 102-2 may be hardware devices or software modules that can convert voice signals into analog signals and play them. Illustratively, the speaker 101-2 and the speaker 102-2 include, but are not limited to, audio playback devices such as loudspeakers and sound boxes. The microphone 101-3 and the microphone 102-3 may be hardware devices or software modules that can collect sound and convert the collected sound into a corresponding voice signal. Illustratively, the microphone 101-3 and the microphone 102-3 may also be audio acquisition devices such as wireless microphones.
In some embodiments, the speaker 101-2 and the microphone 101-3 are connected to the electronic device 101-1. By way of example, the speaker 101-2 and the microphone 101-3 may be provided on the electronic device 101-1 and communicate with it via a system bus, a communication interface, or the like. Alternatively, the speaker 101-2 and the microphone 101-3 may be independent of the electronic device 101-1 and communicate with it via a wired or wireless interface, such as Bluetooth, Wi-Fi, or universal serial bus (USB).
In some embodiments, the communication system 101 may be an in-vehicle system and the electronic device 101-1 may be an in-vehicle terminal located on a vehicle. In this scenario, the speaker 101-2 and microphone 101-3 may be directly connected to an in-vehicle terminal, such as an in-vehicle speaker, an in-vehicle speaker box, or the like. In another embodiment, the speaker 101-2 and the microphone 101-3 may also be connected through another terminal connected to the in-vehicle terminal. By way of example, the other terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, a smart television, or the like. The speaker 101-2 may be a speaker on another terminal or may be a speaker connected to another terminal. The microphone 101-3 may be a microphone on another terminal or may be a microphone connected to another terminal.
In some embodiments, for the scenario where the communication system 101 is an in-vehicle system, an off-vehicle sound generating system may be disposed on a vehicle where the communication system 101 is located, so as to support implementing voice communication for an off-vehicle user.
In some embodiments, the off-board sound production system described above may include a panel sound production module. The panel sounding module can be a piezoelectric sounding module, an electromagnetic induction sounding module or a traditional loudspeaker. In one embodiment, the piezoelectric sounding module may include a piezoelectric ceramic speaker. The piezoceramic speaker may include an electrode pad configured to receive an excitation voltage from the drive circuit. The electrode sheet may be a pair of positive and negative electrode sheets. The piezoceramic speaker may further comprise a piezoceramic configured to elongate or contract laterally or longitudinally under the influence of an excitation voltage received through the electrode pads. The excitation voltage may be a high frequency square wave of alternating polarity, so that the piezoelectric ceramic will mechanically deform, i.e. elongate or contract, under the influence of the alternating polarity square wave. Furthermore, the piezoelectric ceramic may be a transversely or longitudinally polarized piezoelectric ceramic, so that a transverse or longitudinal mechanical deformation is generated under the influence of an excitation voltage. The piezoelectric ceramic speaker may further include a vibration plate which is attached to the piezoelectric ceramic and generates vibration as the piezoelectric ceramic expands or contracts. In this way, the piezoelectric ceramic speaker converts an input excitation voltage into vibration, thereby emitting voice. The vibration plate can be attached to devices such as a door panel and the like to drive the devices such as the door panel and the like to vibrate. The larger the area of the device such as a door panel driven to vibrate by the piezoelectric ceramic loudspeaker is, the stronger the low-frequency response of the vibration is, and the weaker the high-frequency response is. 
The piezoelectric ceramic speaker may further include a vibration pad between the vibration plate and the door panel or similar device, with which the frequency response of the vibration generated by the vibration plate can be adjusted. For example, where the low-frequency response of the vibration needs to be enhanced, the area of the vibration pad may be increased; where the low-frequency response of the vibration needs to be attenuated, the area of the vibration pad may be reduced. Because the panel sounding module emits sound outwards, the sound field of the vehicle is wider.
In some embodiments, the panel sound emitting module may be provided to at least one of a door panel, a hood, a trunk lid, a roof cover, a floor pan, a rear mirror, a front bumper, and a rear bumper of the vehicle.
In some embodiments, the speaker 101-2 may be a panel sound module.
In some embodiments, the communication system 102 may be an in-vehicle system and the electronic device 102-1 may be an in-vehicle terminal located on a vehicle. In this scenario, the speaker 102-2 and microphone 102-3 may be directly connected to an in-vehicle terminal, such as an in-vehicle speaker, an in-vehicle speaker box, and the like. In another embodiment, the speaker 102-2 and the microphone 102-3 may also be connected through another terminal connected to the in-vehicle terminal. By way of example, the other terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, a smart television, or the like. The speaker 102-2 may be a speaker on another terminal or may be a speaker connected to another terminal. Microphone 102-3 may be a microphone on another terminal or may be a microphone connected to another terminal.
In some embodiments, for the scenario where the communication system 102 is an in-vehicle system, the vehicle on which the communication system 102 is located may be provided with the off-vehicle sound generating system, so as to support the implementation of voice communication for the off-vehicle user.
In some embodiments, the speaker 102-2 may be a panel sounding module.
The echo cancellation method provided by the embodiment of the application is described below in connection with the exemplary application scenarios described above.
In the embodiment of the present application, the communication system 101 is taken as a near-end communication system, and the communication system 102 is taken as a far-end communication system as an example.
Fig. 2 is a schematic flow chart of an implementation of the echo cancellation method according to the embodiment of the application. The echo cancellation method provided by the embodiment of the application can be applied to the electronic device 101-1 in the communication system 101. The echo cancellation method may include steps S201 to S205.
In step S201, a first stope signal is obtained.
The first stope signal is a signal obtained by power-amplifying the far-end voice signal received by the electronic device 101-1.
It will be appreciated that during a voice call between a far-end user and a near-end user, the electronic device 102-1 captures the far-end user's voice via the microphone 102-3 and processes it into a far-end voice signal. The electronic device 102-1 then transmits the far-end voice signal to the electronic device 101-1. On receiving the far-end voice signal, the electronic device 101-1 amplifies it with a power amplifier (PA) and outputs the amplified far-end voice signal to the speaker 101-2 for playback, so that the near-end user can hear the far-end user. While outputting the amplified far-end voice signal to the speaker 101-2, the electronic device 101-1 may sample it to obtain the first stope signal. That is, the first stope signal is sampled after PA amplification and before playback by the speaker 101-2.
In some embodiments, the first stope signal may be understood as a reference signal used when performing echo cancellation.
In step S202, a near-end speech signal is acquired by a microphone.
It will be appreciated that after the speaker 101-2 plays the far-end voice signal, the near-end user responds to the far-end user. At this time, the microphone 101-3 may pick up both the near-end user's voice and the echo, that is, the far-end voice played by the speaker 101-2 after acoustic reflection in the reverberant field. The microphone 101-3 processes the collected sound to obtain the near-end speech signal.
In some embodiments, where the communication system 101 is an in-vehicle system, the near-end user may be an occupant of the vehicle or an off-vehicle user located outside the vehicle.
In some embodiments, because convolution in the time domain is equivalent to multiplication in the frequency domain, and time-domain convolution is computationally expensive, the first stope signal and the near-end speech signal may be transformed to the frequency domain for processing after they are obtained. Thus, the echo cancellation method in the embodiment of the present application is performed in the frequency domain.
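By way of illustration only, the equivalence between time-domain convolution and frequency-domain multiplication can be checked numerically as sketched below. The function names, the sample reference signal, and the two-tap echo path are hypothetical and chosen for the example. A naive discrete Fourier transform is used so that the sketch is self-contained; a practical implementation would presumably use a fast Fourier transform, which is where the computational saving comes from.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def linear_convolution(x, h):
    """Direct time-domain convolution of x with h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def convolution_via_frequency_domain(x, h):
    """Zero-pad both inputs, multiply their spectra bin by bin, transform back."""
    n = len(x) + len(h) - 1
    x_spec = dft(list(x) + [0.0] * (n - len(x)))
    h_spec = dft(list(h) + [0.0] * (n - len(h)))
    product = [a * b for a, b in zip(x_spec, h_spec)]
    return [value.real for value in idft(product)]


# Hypothetical reference samples and a hypothetical two-tap echo path.
reference = [1.0, 0.5, -0.25, 0.0]
echo_path = [0.6, 0.3]
time_result = linear_convolution(reference, echo_path)
freq_result = convolution_via_frequency_domain(reference, echo_path)
```

Both routes give the same echo samples up to rounding error, which is why the filtering can be carried out bin by bin once the signals are in the frequency domain.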
In step S203, linear echo cancellation processing and nonlinear echo cancellation processing are performed on the near-end speech signal based on the first stope signal to obtain a first processed signal.
It can be understood that, after the electronic device 101-1 obtains the first stope signal and the near-end speech signal through step S201 and step S202, the electronic device 101-1 sequentially performs the linear echo cancellation process and the nonlinear echo cancellation process on the near-end speech signal according to the first stope signal to obtain the first processed signal.
In some embodiments, the electronic device 101-1 performs linear echo cancellation processing on the near-end speech signal using a second filtering algorithm based on the first stope signal to obtain a second echo signal. Then, the electronic device 101-1 calculates a difference signal by subtracting the second echo signal from the near-end speech signal, and performs nonlinear echo cancellation processing on the difference signal to obtain the first processed signal.
In some embodiments, when the electronic device 101-1 performs linear echo cancellation processing on the near-end speech signal, the electronic device 101-1 may perform echo estimation using a second filtering algorithm to obtain an estimated second echo signal. The electronic device 101-1 then subtracts the second echo signal from the near-end speech signal to obtain a difference signal.
By way of example, the difference signal may satisfy the following expression (1):
$$e(n) = d(n) - \hat{y}(n) \tag{1}$$
where $e(n)$ is the difference signal, $d(n)$ is the first stope signal, $\hat{y}(n)$ is the estimated second echo signal, and $n$ denotes time $n$, with $n$ a positive number.
In some embodiments, the second filtering algorithm may be an adaptive filtering algorithm, such as a least mean squares (LMS) algorithm, a normalized least mean squares (NLMS) algorithm, a variable-step NLMS algorithm, or a recursive least squares (RLS) algorithm.
The second filtering algorithm may be a variable-step NLMS algorithm, for example. The second filtering algorithm may satisfy the following expressions (2) to (4):
$$\hat{y}(n) = \sum_{i=0}^{N-1} W(i)\,X(n-i) = W^{T} X_n \tag{2}$$
where $\hat{y}(n)$ is the estimated second echo signal, $X(n-i)$ is the near-end speech signal at time $n-i$, and $X_n$ is the set of near-end speech signals over the $N$ moments ending at time $n$. $W(i)$ is the filter weight at time $n-i$, $W$ is the set of filter weights over the same $N$ moments, and $W^{T}$ is the transpose of $W$. $N$ is a positive integer that may represent the order of the filter, and $i$ indexes the moments from 0 to $N-1$.
Further, in expression (2), W may be updated according to the following expression (3):
where $W_{n+1}$ is the filter weight at iteration $n+1$, $W_n$ is the filter weight at iteration $n$, $\mu$ is the control factor, $p(n)$ is the step factor at iteration $n$, and $c$ is a very small constant.
Further, in expression (3), p (n) may be updated according to the following expression (4):
where $p(n+1)$ is the step factor at iteration $n+1$, $p(n)$ is the step factor at iteration $n$, $\beta$ is the smoothing factor of the step size, and $X_n^2$ is the frequency-domain energy of $X_n$.
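By way of illustration only, a variable-step NLMS filter of the kind named above can be sketched as follows. The sketch assumes the common update form $W_{n+1} = W_n + \mu\, e(n) X_n / (p(n) + c)$ with a smoothed energy $p(n+1) = \beta\, p(n) + (1-\beta) X_n^2$. This form, the function name `variable_step_nlms`, the parameter values, and the synthetic echo path are assumptions made for the example rather than details taken from the embodiments. The sketch works on time-domain samples for brevity and uses generic names for the filter input `x` and the desired signal `d`.

```python
import random


def variable_step_nlms(x, d, order=4, mu=0.1, beta=0.9, c=1e-8):
    """Adapt an FIR filter so that W^T X_n tracks d(n); returns (errors, weights)."""
    w = [0.0] * order
    p = 0.0  # smoothed input energy, used as the step normaliser
    errors = []
    for n in range(len(x)):
        # X_n: the current sample and the order-1 previous ones (zero before the start).
        x_n = [x[n - i] if n - i >= 0 else 0.0 for i in range(order)]
        y_hat = sum(wi * xi for wi, xi in zip(w, x_n))  # estimated echo
        e = d[n] - y_hat                                # difference signal
        energy = sum(xi * xi for xi in x_n)
        # Assumption: the smoothed energy is refreshed before the weight update,
        # so the normaliser is never smaller than (1 - beta) times the current energy.
        p = beta * p + (1.0 - beta) * energy
        w = [wi + mu * e * xi / (p + c) for wi, xi in zip(w, x_n)]
        errors.append(e)
    return errors, w


# Synthetic check with a hypothetical echo path and a seeded random input.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
echo_path = [0.5, -0.3, 0.2, 0.0]
d = [sum(echo_path[i] * x[n - i] for i in range(len(echo_path)) if n - i >= 0)
     for n in range(len(x))]
errors, w = variable_step_nlms(x, d)
```

With `mu` no larger than `2 * (1 - beta)`, the effective normalised step in this sketch stays below 2, which is the usual stability bound for NLMS.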
In some embodiments, when the electronic device 101-1 performs the nonlinear echo cancellation processing on the near-end speech signal, the electronic device 101-1 may perform echo suppression on the difference signal using a third filtering algorithm to obtain the first processed signal.
In some embodiments, the third filtering algorithm may be a nonlinear filtering algorithm, such as Wiener filtering.
In some embodiments, echo may still remain in the first processed signal because of the complex and harsh acoustic conditions in and around the vehicle, the performance limits of the filter, and the like.
In step S204, echo cancellation processing is performed on the first processed signal on the frequency-domain subbands based on the first stope signal to obtain a second processed signal.
It will be appreciated that, after obtaining the first processed signal through step S203, the electronic device 101-1 may divide the first stope signal and the first processed signal into a plurality of frequency-domain subbands and perform echo cancellation processing on each frequency-domain subband. The electronic device 101-1 then combines the signals processed on the frequency-domain subbands to obtain the second processed signal, which is a cleaner speech signal.
In some embodiments, compared to the echo cancellation processing using the full frequency band, the calculation amount can be greatly reduced by performing the echo cancellation processing on the frequency domain subband in step S204, and the processed effect is not much different from the full frequency band processing effect.
In some embodiments, the electronic device 101-1 may convert the second processed signal from a frequency domain signal to a time domain signal after obtaining the second processed signal.
In step S205, the second processed signal is transmitted to the far end.
It will be appreciated that the electronic device 101-1, after deriving the second processed signal, may transmit the second processed signal to the electronic device 102-1 in the communication system 102. Then, the electronic device 102-1 may output the second processed signal to the speaker 102-2 for playback.
In some embodiments, the speaker 102-2 also needs to convert the second processed signal from a digital signal to an analog signal before playing the second processed signal.
In some embodiments, while the speaker 102-2 plays the second processed signal, the microphone 102-3 may also pick up the far-end user's voice together with the sound of the second processed signal played by the speaker 102-2. In this case, the communication system 102 acts as the near-end communication system, and the electronic device 102-1 may perform steps S201 to S204.
In some embodiments, Fig. 3 is a schematic flow chart of another implementation of the echo cancellation method in the embodiment of the present application. The method may be performed by the electronic device 101-1 described above. Based on Fig. 2, step S204 in Fig. 2 may be replaced with steps S301 to S304, which are described below in connection with the steps shown in Fig. 3.
In step S301, frequency-domain power spectrum division is performed on the first stope signal and the first processed signal to obtain a plurality of frequency-domain subbands.
It can be understood that, after obtaining the first processed signal through steps S201 to S203, the electronic device 101-1 calculates the frequency-domain power spectra of the first stope signal and the first processed signal, and divides these power spectra to obtain a plurality of frequency-domain subbands. Illustratively, the frequency-domain power spectra of the first stope signal and the first processed signal may be divided into 32 frequency-domain subbands. Of course, the frequency-domain power spectra may also be divided into other numbers of frequency-domain subbands, such as 16, 64, or 256 frequency-domain subbands, which is not particularly limited in the embodiment of the present application.
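By way of illustration only, dividing a frequency-domain power spectrum into a configurable number of subbands can be sketched as follows. The embodiments do not state how the subband edges are chosen; equal-width subbands, the function name `split_into_subbands`, and the synthetic 256-bin spectrum are assumptions made for the example.

```python
def split_into_subbands(power_spectrum, num_subbands=32):
    """Sum per-bin power into num_subbands contiguous, near-equal-width subbands."""
    n = len(power_spectrum)
    if not 0 < num_subbands <= n:
        raise ValueError("need 1 <= num_subbands <= number of frequency bins")
    # Integer edges guarantee that every bin belongs to exactly one subband.
    edges = [b * n // num_subbands for b in range(num_subbands + 1)]
    return [sum(power_spectrum[edges[b]:edges[b + 1]]) for b in range(num_subbands)]


# Synthetic 256-bin power spectrum (hypothetical values).
power_spectrum = [float(k % 4) for k in range(256)]
subbands = split_into_subbands(power_spectrum, 32)
```

The same function can be applied to the power spectrum of the reference signal and to that of the processed signal, so that the later steps compare the two signals subband by subband.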
In step S302, for each frequency domain subband, the first stope signal is time-delay aligned with the first processed signal to obtain a second stope signal and a third processed signal.
It will be appreciated that the electronic device 101-1 time-aligns the first stope signal and the first processed signal on each frequency-domain subband and combines the results to obtain the second stope signal and the third processed signal. Here, the second stope signal represents the first stope signal, and the third processed signal represents the first processed signal.
In some embodiments, the electronic device 101-1 may perform delay estimation based on the cross-correlation of the first stope signal and the first processed signal on each frequency-domain subband. For example, the electronic device 101-1 may use the Bastiaan algorithm to compute, on each frequency-domain subband, a binary spectrum of the frequency-domain power spectrum of the first stope signal and a binary spectrum of the frequency-domain power spectrum of the first processed signal, and then match and align the two according to the similarity of the binary spectra (for example, the number of 1 bits after a bitwise exclusive OR). Estimating the delay with the Bastiaan algorithm reduces the amount of computation.
In some embodiments, when the frequency-domain power spectra of the first stope signal and the first processed signal are divided into 32 frequency-domain subbands, 1 bit may be used to represent the power value on each frequency-domain subband, so that the whole frequency-domain power spectrum can be represented by 32 bits of data, which further reduces the computational load. It will be appreciated that the more frequency-domain subbands the frequency-domain power spectrum is divided into, the more closely the second stope signal approximates the first stope signal and the more closely the third processed signal approximates the first processed signal.
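By way of illustration only, delay estimation on binary spectra can be sketched as follows. Each frame's 32 subband powers are reduced to one 32-bit word, and candidate delays are scored by the number of 1 bits left after a bitwise exclusive OR. The fixed per-subband threshold, the averaging of the score over frames, the function names, and the synthetic frames are assumptions made for the example; the embodiments do not specify these details.

```python
import random


def binary_spectrum(band_powers, thresholds):
    """Pack one bit per subband: 1 if the subband power exceeds its threshold."""
    word = 0
    for b, (power, threshold) in enumerate(zip(band_powers, thresholds)):
        if power > threshold:
            word |= 1 << b
    return word


def hamming_distance(a, b):
    """Number of 1 bits after a bitwise exclusive OR of two binary spectra."""
    return bin(a ^ b).count("1")


def estimate_delay(ref_words, proc_words, max_delay):
    """Return the frame delay whose aligned binary spectra differ in the fewest bits."""
    best_delay, best_cost = 0, float("inf")
    for delay in range(max_delay + 1):
        pairs = list(zip(ref_words, proc_words[delay:]))
        if not pairs:
            break
        # Average over the compared frames so that longer delays are not favoured.
        cost = sum(hamming_distance(r, p) for r, p in pairs) / len(pairs)
        if cost < best_cost:
            best_delay, best_cost = delay, cost
    return best_delay


# Synthetic frames: the processed signal is the reference delayed by 3 frames.
rng = random.Random(1)
ref_frames = [[rng.random() for _ in range(32)] for _ in range(60)]
thresholds = [0.5] * 32
ref_words = [binary_spectrum(frame, thresholds) for frame in ref_frames]
proc_words = [0, 0, 0] + ref_words[:-3]
delay = estimate_delay(ref_words, proc_words, max_delay=10)
```

Because each comparison is a single exclusive OR and a bit count on one machine word, the search over candidate delays is far cheaper than a full cross-correlation of the spectra.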
In step S303, for each frequency-domain subband, echo estimation is performed on the third processed signal by the first filtering algorithm based on the second stope signal, so as to obtain an amplitude envelope signal of the first echo signal.
It will be appreciated that, after obtaining the frequency domain power spectrum of the first stope signal and the binary spectrum of the frequency domain power spectrum of the first processed signal, that is, the second stope signal and the third processed signal, the electronic device 101-1 may further perform echo estimation on the third processed signal on each frequency domain subband based on the second stope signal, so as to obtain an amplitude envelope signal of the first echo signal on each frequency domain subband. The electronic device 101-1 then integrates these amplitude envelope signals, so that an amplitude envelope signal of the first echo signal can be obtained.
In some embodiments, the electronic device 101-1 may perform echo estimation on the third processed signal using the first filtering algorithm based on the second stope signal to obtain an estimated amplitude envelope signal of the first echo signal.
In some embodiments, the first filtering algorithm may be an adaptive filtering algorithm, such as a least mean squares (LMS) algorithm, a normalized least mean squares (NLMS) algorithm, a variable-step NLMS algorithm, or a recursive least squares (RLS) algorithm.
In some embodiments, the first filtering algorithm may be the same as or different from the second filtering algorithm. The first filtering algorithm may be an NLMS algorithm, for example. At this time, the first filtering algorithm satisfies the above expression (2), but the input signal of the first filtering algorithm is different from the input signal of the second filtering algorithm.
In some embodiments, in the first filtering algorithm, $X(n-i)$ in expression (2) is the third processed signal at time $n-i$, and $X_n$ is the set of third processed signals over the $N$ moments ending at time $n$.
In some embodiments, in the first filtering algorithm, W in expression (2) may be updated according to expression (6).
where $p$ is the step factor, $e(n)$ satisfies expression (1) above with $d(n)$ being the second stope signal, $\hat{y}(n)$ is the estimated first echo signal, and $n$ denotes time $n$, with $n$ a positive number.
In step S304, echo suppression processing is performed on the third processed signal according to the amplitude envelope signal of the first echo signal, so as to obtain the second processed signal.
It will be appreciated that the electronic device 101-1 may perform the echo suppression processing on the third processed signal again after obtaining the amplitude envelope signal of the first echo signal, so as to obtain the second processed signal. At this time, the second processed signal is considered as a clean speech signal in which there is no residual echo.
In some embodiments, in step S304, the electronic device 101-1 may obtain an energy spectrum of the first echo signal from the amplitude envelope signal of the first echo signal. Then, the electronic device 101-1 calculates Wiener filter coefficients from the energy spectrum of the first echo signal and the energy spectrum of the third processed signal, and performs Wiener filtering on the third processed signal based on the Wiener filter coefficients to obtain the second processed signal. Here, the Wiener filter coefficients can also be understood as a Wiener filter.
Illustratively, the Wiener filter coefficients may be calculated according to the following expression (7):
$$H(\omega_k) = \frac{P_{xx}(\omega_k)}{P_{xx}(\omega_k) + P_{nn}(\omega_k)} \tag{7}$$
where $\omega_k$ is the digital angular frequency at the $k$-th frequency bin, $H(\omega_k)$ is the Wiener filter coefficient, $P_{xx}(\omega_k)$ is the energy spectrum of the second processed signal, and $P_{nn}(\omega_k)$ is the energy spectrum of the first echo signal.
In some embodiments, $P_{xx}(\omega_k)$ is the difference between the energy spectrum of the third processed signal and the energy spectrum of the first echo signal.
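By way of illustration only, the Wiener-style suppression described for step S304 can be sketched as follows, assuming the standard gain $H(\omega_k) = P_{xx}(\omega_k) / (P_{xx}(\omega_k) + P_{nn}(\omega_k))$ with $P_{xx}$ taken as the energy of the third processed signal minus the echo energy. Squaring the amplitude envelope to obtain the echo energy, clamping negative differences to zero, the small constant `eps`, and the function names are assumptions made for the example.

```python
def wiener_gains(processed_energy, echo_envelope, eps=1e-12):
    """Per-bin gain H = P_xx / (P_xx + P_nn) from processed energy and echo envelope."""
    gains = []
    for p_yy, amplitude in zip(processed_energy, echo_envelope):
        p_nn = amplitude * amplitude      # assumed: echo energy = squared envelope
        p_xx = max(p_yy - p_nn, 0.0)      # assumed: clamp so the energy stays non-negative
        gains.append(p_xx / (p_xx + p_nn + eps))  # eps avoids division by zero
    return gains


def apply_gains(spectrum, gains):
    """Scale each frequency bin of the processed signal by its gain."""
    return [g * s for g, s in zip(gains, spectrum)]


# Hypothetical per-bin energies of the processed signal and echo amplitude envelope.
processed_energy = [4.0, 1.0, 2.0, 0.0]
echo_envelope = [1.0, 1.0, 2.0, 0.0]
gains = wiener_gains(processed_energy, echo_envelope)
suppressed = apply_gains([2.0, 1.0, 1.5, 0.0], gains)
```

Bins in which the estimated echo accounts for all of the energy receive a gain of zero, while bins dominated by near-end speech pass almost unchanged.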
At this point, the electronic device 101-1 has completed the echo cancellation processing of the near-end speech signal.
In the embodiment of the application, after the linear echo cancellation processing and the nonlinear echo cancellation processing are performed on the near-end speech signal, echo cancellation processing is further performed on the resulting signal on the frequency-domain subbands based on the far-end voice signal, regardless of the strength of the echo remaining after the linear and nonlinear echo cancellation processing. The processed speech signal is therefore cleaner, the residual echo received at the far end is reduced, the echo cancellation effect is improved, and the quality of the voice call is further improved.
Based on the same inventive concept, the embodiment of the present application also provides an echo cancellation device, which may be disposed in the electronic device 101-1. The modules included in the echo cancellation device, and the sub-modules included in those modules, may be implemented by a processor in the electronic device 101-1; of course, they may also be implemented by specific logic circuits. In an implementation, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.
In some embodiments, fig. 4 is a schematic structural diagram of an echo cancellation device according to an embodiment of the present application, as shown by the solid lines in fig. 4. The echo cancellation device 400 may include: an obtaining module 401, configured to obtain a first stope signal and a near-end speech signal collected by a microphone, where the first stope signal is a signal obtained by power-amplifying a received far-end voice signal; a first processing module 402, configured to perform linear echo cancellation processing and nonlinear echo cancellation processing on the near-end speech signal based on the first stope signal, so as to obtain a first processed signal; a second processing module 403, configured to perform echo cancellation processing on the first processed signal on the frequency-domain subbands based on the first stope signal, so as to obtain a second processed signal; and a transmitting module, configured to transmit the second processed signal to the far end.
In some embodiments, as shown by the dashed line in fig. 4, the second processing module 403 may include: a time delay alignment module 4031, configured to divide the frequency-domain power spectra of the first stope signal and the first processed signal to obtain a plurality of frequency-domain subbands, and, for each frequency-domain subband, to perform time delay alignment on the first stope signal and the first processed signal to obtain a second stope signal and a third processed signal; an envelope estimation module 4032, configured to perform, for each frequency-domain subband, echo estimation on the third processed signal by using a first filtering algorithm based on the second stope signal, so as to obtain an amplitude envelope signal of the first echo signal; and an echo suppression module 4033, configured to perform echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal, so as to obtain the second processed signal.
In some possible implementations, as shown by the dashed line in fig. 4, the echo suppression module 4033 is configured to: obtain the energy spectrum of the first echo signal according to the amplitude envelope signal of the first echo signal; calculate Wiener filter coefficients according to the energy spectrum of the first echo signal and the energy spectrum of the third processed signal; and perform Wiener filtering on the third processed signal based on the Wiener filter coefficients to obtain the second processed signal.
In some possible implementations, the first filtering algorithm is an NLMS algorithm.
In some possible implementations, as shown by the dashed lines in fig. 4, the first processing module 402 includes: a linear echo cancellation module 4021, configured to perform linear echo cancellation processing on the near-end speech signal according to the first stope signal through a second filtering algorithm, so as to obtain a second echo signal, and to calculate a difference signal by subtracting the second echo signal from the near-end speech signal; and a nonlinear echo cancellation module 4022, configured to perform nonlinear echo cancellation processing on the difference signal to obtain the first processed signal.
In some possible implementations, the second filtering algorithm is a variable step NLMS algorithm.
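By way of illustration only, the way the modules of the echo cancellation device 400 pass signals to one another can be sketched as follows. The class name `EchoCancellationDevice`, its field names, and the stand-in stage functions are hypothetical; each stage is a placeholder callable rather than an implementation of the corresponding module.

```python
from dataclasses import dataclass
from typing import Callable, List

Signal = List[float]


@dataclass
class EchoCancellationDevice:
    """Wires placeholder stages in the order: linear, nonlinear, subband, transmit."""
    linear_stage: Callable[[Signal, Signal], Signal]   # stands in for module 4021
    nonlinear_stage: Callable[[Signal], Signal]        # stands in for module 4022
    subband_stage: Callable[[Signal, Signal], Signal]  # stands in for module 403
    transmit: Callable[[Signal], None]                 # stands in for the transmitting module

    def process(self, stope: Signal, near_end: Signal) -> Signal:
        difference = self.linear_stage(stope, near_end)
        first_processed = self.nonlinear_stage(difference)
        second_processed = self.subband_stage(stope, first_processed)
        self.transmit(second_processed)
        return second_processed


# Toy stand-ins: a fixed 0.5 echo gain, a small-signal gate, and a pass-through.
sent = []
device = EchoCancellationDevice(
    linear_stage=lambda ref, mic: [m - 0.5 * r for r, m in zip(ref, mic)],
    nonlinear_stage=lambda sig: [s if abs(s) > 0.05 else 0.0 for s in sig],
    subband_stage=lambda ref, sig: list(sig),
    transmit=sent.append,
)
result = device.process([1.0, 0.0, -1.0], [0.7, 0.01, -0.5])
```

Keeping each stage behind a callable mirrors the statement that the modules may be realised by a processor or by specific logic circuits, since any one stage can be swapped without changing the others.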
It should be noted that the above description of the embodiment of the echo cancellation device 400 is similar to the description of the method embodiment described above, and has similar advantageous effects as the method embodiment. In some embodiments, the functions or modules included in the apparatus provided by the embodiments of the present application may be used to perform the methods described in the foregoing method embodiments, and for technical details not disclosed in the embodiments of the echo cancellation device 400 in the present application, please refer to the description of the method embodiments of the present application for understanding.
It should be noted that, in the embodiment of the present application, if the echo cancellation method is implemented in the form of a software functional module and sold or used as a separate product, it may also be stored in a computer readable storage medium. Based on such understanding, the technical solution of the embodiments of the present application, in essence or in the part contributing to the related art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the application are not limited to any specific hardware, software, or firmware, or any combination of hardware, software, and firmware.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the application. As shown in Fig. 5, the electronic device 500 may be the electronic device 101-1 described in the above embodiments. The electronic device 500 may include a processor 501 that may perform various suitable steps and processes in accordance with computer program instructions stored in a read-only memory (ROM) 502 or loaded from a memory 508 into a random access memory (RAM) 503. Various programs and data required for the operation of the electronic device 500 may also be stored in the RAM 503. The processor 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in electronic device 500 are connected to I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a memory 508, such as a magnetic disk, optical disk, etc.; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 501 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 501 may perform the various methods and processes described above, such as the echo cancellation methods described above. For example, in some embodiments, the echo cancellation methods described above may be implemented as a computer software program stored on a machine-readable medium, such as the memory 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the processor 501, one or more of the steps of the echo cancellation method described above may be performed. Alternatively, in other embodiments, the processor 501 may be configured to perform one or more of the steps of the echo cancellation method described above in any other suitable manner (e.g., by means of firmware).
It is further noted that the present application may include methods, apparatus, systems, and/or computer program products. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing various aspects of the present application.
The embodiment of the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, is capable of implementing some or all of the steps of the above method. The computer readable storage medium may be transitory or non-transitory.
Embodiments of the present application provide a computer program comprising computer readable program instructions which, when run in a computer device, cause a processor in the computer device to carry out some or all of the steps of the above method.
Embodiments of the present application provide a computer program product comprising a computer program or instructions which, when executed by a processor, is capable of carrying out some or all of the steps of the above method. The computer program product may be realized in particular by means of hardware, software or a combination thereof. In some embodiments, the computer program product is embodied as a computer storage medium, and in other embodiments, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to respective computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for carrying out operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the C language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized with state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present application.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to exemplary embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor in a voice interaction device, or to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus, to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein comprises an article of manufacture including instructions which implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of devices, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
The above description is only illustrative of the exemplary embodiments of the application and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the application is not limited to the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the technical concept. One example is a technical solution in which the above features are interchanged with technical features of similar function disclosed in, but not limited to, the present application.
Claims (10)
1. An echo cancellation method, the method comprising:
obtaining a first stoping signal and a near-end voice signal acquired by a microphone, wherein the first stoping signal is a signal obtained by power amplification of a received far-end voice signal;
performing, based on the first stoping signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal;
performing, based on the first stoping signal, echo cancellation processing on the first processed signal on frequency domain subbands to obtain a second processed signal; and
sending the second processed signal to the far end.
2. The method of claim 1, wherein the performing, based on the first stoping signal, echo cancellation processing on the first processed signal on frequency domain subbands to obtain a second processed signal comprises:
performing frequency domain power spectrum division on the first stoping signal and the first processed signal to obtain a plurality of frequency domain subbands;
for each frequency domain subband, performing time delay alignment on the first stoping signal and the first processed signal to obtain a second stoping signal and a third processed signal;
for each frequency domain subband, performing echo estimation on the third processed signal through a first filtering algorithm based on the second stoping signal to obtain an amplitude envelope signal of a first echo signal; and
performing echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal to obtain the second processed signal.
3. The method of claim 2, wherein the performing echo suppression processing on the third processed signal according to the amplitude envelope signal of the first echo signal to obtain the second processed signal comprises:
obtaining an energy spectrum of the first echo signal according to the amplitude envelope signal of the first echo signal;
calculating a Wiener filter coefficient according to the energy spectrum of the first echo signal and an energy spectrum of the third processed signal; and
performing Wiener filtering on the third processed signal based on the Wiener filter coefficient to obtain the second processed signal.
4. The method of claim 2, wherein the first filtering algorithm is a normalized least mean square (NLMS) adaptive filtering algorithm.
5. The method of claim 1, wherein the performing, based on the first stoping signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal comprises:
performing, according to the first stoping signal, linear echo cancellation processing on the near-end voice signal through a second filtering algorithm to obtain a second echo signal;
calculating a difference signal by subtracting the second echo signal from the near-end voice signal; and
performing nonlinear echo cancellation processing on the difference signal to obtain the first processed signal.
6. The method of claim 5, wherein the second filtering algorithm is a variable-step NLMS algorithm.
7. An echo cancellation device, the device comprising:
an acquisition module configured to obtain a first stoping signal and a near-end voice signal acquired by a microphone, wherein the first stoping signal is a signal obtained by power amplification of a received far-end voice signal;
a first processing module configured to perform, based on the first stoping signal, linear echo cancellation processing and nonlinear echo cancellation processing on the near-end voice signal to obtain a first processed signal;
a second processing module configured to perform, based on the first stoping signal, echo cancellation processing on the first processed signal on frequency domain subbands to obtain a second processed signal; and
a sending module configured to send the second processed signal to a far end.
8. An electronic device, comprising:
at least one processor; and
a memory coupled to the at least one processor, the memory storing instructions which, when executed by the at least one processor, cause the at least one processor to perform the echo cancellation method of any one of claims 1 to 6.
9. A vehicle-mounted system, comprising: a loudspeaker, a microphone, and an electronic device, the loudspeaker and the microphone being connected to the electronic device, wherein:
the loudspeaker is configured to play a far-end voice signal;
the microphone is configured to collect a near-end voice signal; and
the electronic device is configured to perform the echo cancellation method of any one of claims 1 to 6.
10. A storage medium having stored thereon a computer program which, when executed by a processor, is capable of implementing the echo cancellation method according to any one of claims 1 to 6.
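As a non-limiting illustration of the processing recited in claims 3 to 5, the following Python sketch shows a fixed-step NLMS filter that produces an echo estimate and a difference signal, together with a Wiener-type suppression gain computed from an echo energy and a signal energy. The function names, the tap count, the step size, the gain form max(1 - P_echo / P_signal, floor), and the two-tap echo path used in the demonstration are assumptions chosen for illustration and are not taken from the application. The variable-step rule of claim 6 and the subband division and delay alignment of claim 2 are not shown.

```python
import random


def nlms_cancel(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Fixed-step NLMS sketch (hypothetical helper, not the application's code).

    reference: samples of the stoping (loudspeaker reference) signal
    mic:       samples of the near-end microphone signal
    Returns (echo_estimate, residual), where residual = mic - echo_estimate.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference sample first
    echo_estimate, residual = [], []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # estimated echo
        e = d - y                                         # difference signal
        norm = sum(h * h for h in history) + eps          # input energy
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        echo_estimate.append(y)
        residual.append(e)
    return echo_estimate, residual


def wiener_gain(echo_power, signal_power, floor=0.0):
    """Wiener-type gain for one subband (assumed form, see lead-in)."""
    if signal_power <= 0.0:
        return floor
    return max(1.0 - echo_power / signal_power, floor)


def demo():
    """Cancel a synthetic echo with no near-end speech present."""
    rng = random.Random(0)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    # Hypothetical echo path: direct sound plus one delayed reflection.
    mic = [
        0.5 * reference[n] + (0.25 * reference[n - 1] if n > 0 else 0.0)
        for n in range(len(reference))
    ]
    _, residual = nlms_cancel(reference, mic)
    mic_energy = sum(v * v for v in mic[-200:])
    residual_energy = sum(v * v for v in residual[-200:])
    return mic_energy, residual_energy
```

Calling `demo()` returns the microphone energy and the residual energy over the last 200 samples. In this noise-free setting the residual energy should be a very small fraction of the microphone energy once the filter has converged. In a subband arrangement, `wiener_gain` would be evaluated once per subband and the result multiplied onto that subband of the signal being suppressed.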
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311738192.1A CN118250389A (en) | 2023-12-15 | 2023-12-15 | Echo cancellation method, device, electronic equipment, vehicle-mounted system and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311738192.1A CN118250389A (en) | 2023-12-15 | 2023-12-15 | Echo cancellation method, device, electronic equipment, vehicle-mounted system and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118250389A (en) | 2024-06-25 |
Family
ID=91554333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311738192.1A Pending CN118250389A (en) | 2023-12-15 | 2023-12-15 | Echo cancellation method, device, electronic equipment, vehicle-mounted system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118250389A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118762721A (en) * | 2024-07-17 | 2024-10-11 | 北京拓灵新声科技有限公司 | Voice recognition equipment |
2023-12-15: CN application CN202311738192.1A filed (publication CN118250389A); status: active, Pending
Similar Documents
Publication | Title |
---|---|
US11297178B2 (en) | Method, apparatus, and computer-readable media utilizing residual echo estimate information to derive secondary echo reduction parameters |
EP3295681B1 (en) | Acoustic echo cancelling system and method |
EP2987316B1 (en) | Echo cancellation |
JP6150988B2 (en) | Audio device including means for denoising audio signals by fractional delay filtering, especially for "hands free" telephone systems |
JP5284475B2 (en) | Method for determining updated filter coefficients of an adaptive filter adapted by an LMS algorithm with pre-whitening |
JP4283212B2 (en) | Noise removal apparatus, noise removal program, and noise removal method |
US8812309B2 (en) | Methods and apparatus for suppressing ambient noise using multiple audio signals |
TWI682672B (en) | Echo cancellation system and method with reduced residual echo |
CN108376548B (en) | Echo cancellation method and system based on microphone array |
CN111768796B (en) | Acoustic echo cancellation and dereverberation method and device |
CN107017004A (en) | Noise suppressing method, audio processing chip, processing module and bluetooth equipment |
JP2018528717A (en) | Adaptive block matrix with pre-whitening for adaptive beamforming |
JP3507020B2 (en) | Echo suppression method, echo suppression device, and echo suppression program storage medium |
CN112863532A (en) | Echo suppressing device, echo suppressing method, and storage medium |
JP4544993B2 (en) | Echo processing apparatus for single-channel or multi-channel communication system |
WO2019239977A1 (en) | Echo suppression device, echo suppression method, and echo suppression program |
CN118250389A (en) | Echo cancellation method, device, electronic equipment, vehicle-mounted system and storage medium |
CN109215672B (en) | Method, device and equipment for processing sound information |
US8804981B2 (en) | Processing audio signals |
CN112997249B (en) | Voice processing method, device, storage medium and electronic equipment |
JP3381112B2 (en) | Echo canceler |
JP2003309493A (en) | Method, device and program for reducing echo |
JP3403549B2 (en) | Echo canceller |
CN114333872A (en) | Autoregressive-based residual echo suppression |
WO2022195955A1 (en) | Echo suppressing device, echo suppressing method, and echo suppressing program |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
CB02 | Change of applicant information | Country or region after: China; Address after: Room 3701, No. 866 East Changzhi Road, Hongkou District, Shanghai, 200080; Applicant after: Botai vehicle networking technology (Shanghai) Co.,Ltd.; Address before: 201821 room 208, building 4, No. 1411, Yecheng Road, Jiading Industrial Zone, Jiading District, Shanghai; Applicant before: Botai vehicle networking technology (Shanghai) Co.,Ltd.; Country or region before: China |
SE01 | Entry into force of request for substantive examination | |