CN113270082A - Vehicle-mounted KTV control method and device and vehicle-mounted intelligent networking terminal

Info

Publication number: CN113270082A
Application number: CN202010095631.1A
Authority: CN
Inventors: 梁华文; 詹灯辉; 郑淳允; 马煜森; 韩斌; 罗学权
Original assignee: Guangzhou Automobile Group Co Ltd
Current assignee: Guangzhou Automobile Group Co Ltd
Priority date: 2020-02-14
Filing date: 2020-02-14
Publication date: 2021-08-17

Abstract

The invention discloses a vehicle-mounted KTV control method and device, and a vehicle-mounted intelligent networking terminal. The method includes: performing human voice suppression processing on an initial audio signal to obtain a first audio signal, wherein the human voice suppression includes human voice elimination and human voice weakening, and the initial audio signal is a multimedia audio signal; picking up an in-vehicle ambient audio signal and performing echo cancellation and human voice localization processing to obtain at least one initial human voice signal, wherein the ambient audio signal includes the first audio signal, a human voice signal and a noise signal; and performing sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal. The invention uses the existing software and hardware resources of the vehicle-mounted intelligent networking terminal, such as a microphone, a multimedia player, a DSP audio processor and a speaker, so that karaoke can be realized in the vehicle without an additional karaoke microphone system or a mobile terminal. Several occupants can also sing karaoke at the same time, which greatly improves the in-vehicle experience.

Description

Vehicle-mounted KTV control method and device and vehicle-mounted intelligent networking terminal

Technical Field

The invention relates to the technical field of vehicle-mounted entertainment, in particular to a vehicle-mounted KTV control method and device and a vehicle-mounted intelligent networking terminal.

Background

At present, a vehicle-mounted KTV system usually depends on third-party software and hardware. For example, the voice is picked up by an additional karaoke microphone system, and the accompaniment signal is output by a mobile terminal serving as the accompaniment device, so in-vehicle karaoke is realized by adding hardware. However, if the karaoke microphone system or the mobile terminal is not available in the vehicle, karaoke cannot be realized because of this dependence on additional hardware.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a vehicle-mounted KTV control method and device and a vehicle-mounted intelligent networking terminal, which can solve the problem that karaoke cannot be realized in a vehicle when the additional hardware equipment is absent.

The embodiment of the invention provides a vehicle-mounted KTV control method, which comprises the following steps:

carrying out voice suppression processing on the initial audio signal to obtain a first audio signal; wherein the voice suppression comprises voice elimination and voice weakening, and the initial audio signal is a multimedia audio signal;

picking up an environment audio signal in the car and carrying out echo cancellation and voice positioning processing to obtain at least one initial voice signal; wherein the ambient audio signal comprises a first audio signal, a human voice signal, and a noise signal;

carrying out sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal;

and outputting the target audio signal.

An embodiment of the present invention further provides a vehicle-mounted KTV control device, including:

the first audio signal acquisition module is used for carrying out voice suppression processing on the initial audio signal to obtain a first audio signal; wherein the voice suppression comprises voice elimination and voice weakening, and the initial audio signal is a multimedia audio signal;

the initial human voice signal acquisition module is used for picking up an environmental audio signal in a vehicle and carrying out echo cancellation and human voice positioning processing to obtain at least one initial human voice signal; wherein the ambient audio signal comprises a first audio signal, a human voice signal, and a noise signal;

the target audio signal acquisition module is used for carrying out sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal;

and the target audio signal output module is used for outputting the target audio signal.

The embodiment of the present invention further provides a vehicle-mounted intelligent networking terminal, including:

one or more processors;

a memory coupled to the processor for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the in-vehicle KTV control method as described in the above embodiments.

In this embodiment, according to a preset KTV control strategy, voice suppression processing is first carried out on the initial audio signal to obtain the first audio signal used as accompaniment. When the user wants to sing karaoke in the vehicle, the in-vehicle environment audio signal is picked up and subjected to echo cancellation and voice positioning processing to obtain at least one initial voice signal. Sound modulation is then carried out on the first audio signal and the initial voice signal to obtain the mixed target audio signal, and finally the target audio signal is output. The invention therefore uses the existing software and hardware resources of the vehicle-mounted intelligent networking terminal, such as a microphone, a multimedia player, a DSP audio processor and a loudspeaker, and needs no additional karaoke microphone system or mobile terminal. It realizes high-quality karaoke in the vehicle, allows several occupants to sing at the same time, and greatly improves the in-vehicle experience.

Drawings

In order to illustrate the technical solution of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a vehicle-mounted KTV control method according to an embodiment of the present invention;

fig. 2 is another schematic flow chart of a vehicle-mounted KTV control method according to an embodiment of the present invention;

fig. 3 is a schematic flowchart of a vehicle-mounted KTV control method according to another embodiment of the present invention;

fig. 4 is a schematic flowchart of a vehicle-mounted KTV control method according to another embodiment of the present invention;

fig. 5 is a schematic flowchart of a vehicle-mounted KTV control method according to another embodiment of the present invention;

fig. 6 is a schematic flowchart of a vehicle-mounted KTV control method according to another embodiment of the present invention;

fig. 7 is a schematic structural diagram of a vehicle-mounted KTV control device according to an embodiment of the present invention;

fig. 8 is a schematic structural diagram of a vehicle-mounted KTV control device according to another embodiment of the present invention;

fig. 9 is a schematic structural diagram of a vehicle-mounted KTV control device according to another embodiment of the present invention;

fig. 10 is a schematic structural diagram of a vehicle-mounted KTV control device according to another embodiment of the present invention;

fig. 11 is a block diagram of a structure of a vehicle-mounted intelligent networking terminal according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be understood that the step numbers used herein are for convenience of description only and are not intended as limitations on the order in which the steps are performed.

It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.

Referring to fig. 1, an embodiment of the present invention provides a vehicle-mounted KTV control method, including:

S10, carrying out voice suppression processing on the initial audio signal to obtain a first audio signal. The voice suppression comprises voice elimination and voice weakening, and the initial audio signal is a vehicle-mounted multimedia audio signal.

Referring to fig. 2, the initial audio signal V10 is an unprocessed multimedia audio signal in which the human voice ratio is greater than the accompaniment ratio. The first audio signal V11 is the multimedia audio signal obtained by subjecting the initial audio signal V10 to voice suppression processing; its accompaniment ratio is greater than its voice ratio, which makes it suitable as the accompaniment for karaoke.

The voice suppression processing includes voice elimination processing and voice weakening processing. Voice elimination reduces the voice proportion of the initial audio signal to zero. Voice weakening reduces the voice proportion of the initial audio signal to below the accompaniment proportion, so that the user's voice blends with the accompaniment during karaoke.
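The source does not say which algorithm performs the voice suppression. As a minimal sketch only, the difference between elimination and weakening can be illustrated with centre-channel attenuation on a stereo track, under the assumption that the vocal is mixed equally into both channels. The function name `suppress_vocals` and the parameter `ratio` are hypothetical and stand in for the "voice suppression ratio" mentioned later in the description.

```python
def suppress_vocals(left, right, ratio=1.0):
    """Attenuate the centre-panned (mid) component of a stereo signal.

    Assumption (not from the source): the vocal sits in the centre of
    the stereo image. ratio=1.0 removes the mid component entirely
    (voice elimination); 0 < ratio < 1 only reduces it (voice weakening).
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)  # content common to both channels
        out_left.append(l - ratio * mid)
        out_right.append(r - ratio * mid)
    return out_left, out_right
```

With `ratio=1.0` a signal that is identical in both channels disappears completely, while content panned to one side survives at reduced level. A production system would more likely use a trained source-separation model; this sketch only shows the role of the ratio parameter.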

In one embodiment, the initial audio signal V10 may be an audio file stored on the vehicle-mounted intelligent networking terminal or an audio file obtained in real time through wireless communication or the like. The first audio signal V11 can be stored temporarily or long term on the vehicle-mounted intelligent networking terminal to provide accompaniment when the user sings karaoke.

In one embodiment, the vehicle-mounted intelligent networking terminal comprises a multimedia system, such as a multimedia player and a display, wherein the multimedia player is used for playing audio files, and the display is used for displaying multimedia files.

S20, picking up the environment audio signal in the vehicle and carrying out echo cancellation and voice positioning processing to obtain at least one initial voice signal. The ambient audio signal comprises a first audio signal, a human voice signal and a noise signal.

An existing karaoke system needs the support of a mobile terminal and an additional microphone system. A receiver of the mobile terminal collects the voice, the voice is sent to the microphone system through Bluetooth or NFC, the microphone system processes the voice and the music of the mobile terminal to generate a third audio signal, and the third audio signal is sent to the vehicle head unit by FM to realize karaoke. However, when FM channels conflict, such a system cannot support a chorus of two or more people. This embodiment does not send audio to the head unit by FM, so no FM channel conflict arises and several people can sing karaoke at the same time.

Specifically, when sound is picked up, the accompaniment played in the vehicle and noise are picked up in addition to the user's voice. Echo cancellation is therefore performed on the in-vehicle environment audio signal to cancel noise and echo, and human voice localization processing then determines the sound source, for example by distinguishing whether the sound comes from the driver's seat, the front passenger seat or the rear row. In this way a relatively pure initial human voice signal is obtained.

In one embodiment, the vehicle-mounted intelligent networking terminal further comprises a microphone, and when the vehicle enters the karaoke mode, the microphone starts to pick up the in-vehicle environment audio signal. Since a vehicle usually has several microphone channels, the microphones operate simultaneously. Taking 2 to 4 channels as an example, each microphone serves as one input.

When several people in the vehicle sing karaoke at the same time, the initial human voice signal of each singer can be obtained by processing the in-vehicle environment audio signal. Referring to fig. 2, assuming that 4 occupants sing at the same time, step S20 outputs 4 initial human voice signals, namely V20, V21, V22 and V23.

Referring to fig. 3, in one embodiment, the step S20 includes the following sub-steps:

S21, picking up the environment audio signal in the vehicle.

S22, performing echo cancellation processing on the environment audio signal according to a preset echo cancellation algorithm. The echo cancellation algorithms include an acoustic echo suppression (AES) algorithm and an acoustic echo cancellation (AEC) algorithm.

The echo suppression algorithm is an earlier echo control algorithm. Echo suppression is a non-linear form of echo control: a simple comparator compares the level of the sound about to be played by the loudspeaker with the level of the sound currently picked up by the microphone. If the former is above a certain threshold, it is allowed to pass to the loudspeaker and the microphone is turned off, so that the microphone does not pick up the sound played by the loudspeaker and cause an echo at the far end. If the level picked up by the microphone is above a certain threshold, the loudspeaker is disabled to achieve echo cancellation.
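The comparator behaviour described above can be sketched as a per-frame gate. This is an illustrative sketch rather than the patented implementation: the function names, the RMS level measure, the threshold value and the choice to give the loudspeaker priority when both levels exceed the threshold are all assumptions.

```python
def rms(frame):
    """Root-mean-square level of one frame of samples."""
    if not frame:
        return 0.0
    return (sum(x * x for x in frame) / len(frame)) ** 0.5


def echo_suppress(far_frame, mic_frame, threshold=0.05):
    """Return (speaker_frame, mic_out_frame) after half-duplex gating.

    Assumption: when both levels exceed the threshold, the loudspeaker
    path wins; the source does not specify this case.
    """
    if rms(far_frame) > threshold:
        # Loudspeaker is active: play it and mute the microphone path.
        return list(far_frame), [0.0] * len(mic_frame)
    if rms(mic_frame) > threshold:
        # Only the local talker is active: pass the mic, mute the speaker.
        return [0.0] * len(far_frame), list(mic_frame)
    # Both paths are quiet: nothing needs to be gated.
    return list(far_frame), list(mic_frame)
```

Because one path is always muted while the other is active, this scheme cannot pass a singer and the accompaniment at the same time, which is one reason an adaptive canceller is usually preferred for karaoke.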

The acoustic echo cancellation (AEC) algorithm is based on the correlation between the loudspeaker signal and the multipath echoes it generates. It builds a speech model of the far-end signal s(n), uses it to estimate the echo e'(n), and continuously modifies the filter coefficients so that the estimate e'(n) more closely approximates the true echo e(n). The echo estimate is then subtracted from the microphone input signal to eliminate the echo.
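The source does not name the adaptive filter used for AEC. One widely used choice is the normalized least-mean-squares (NLMS) filter, sketched below under that assumption. Here `far` plays the role of s(n), `echo_est` the role of e'(n), and the returned residual is the microphone signal with the estimated echo subtracted. The function name, tap count and step size `mu` are illustrative.

```python
def nlms_echo_cancel(far, mic, taps=4, mu=0.5, eps=1e-8):
    """Cancel the echo of `far` contained in `mic` with an NLMS filter.

    Returns (residual, weights). NLMS is an assumed choice; the source
    only states that filter coefficients are continuously adapted.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for s, d in zip(far, mic):
        history = [s] + history[:-1]
        # Echo estimate e'(n) from the current filter coefficients.
        echo_est = sum(w * x for w, x in zip(weights, history))
        err = d - echo_est  # microphone signal minus estimated echo
        norm = sum(x * x for x in history) + eps
        # Move the coefficients so the estimate approaches the true echo.
        weights = [w + mu * err * x / norm for w, x in zip(weights, history)]
        residual.append(err)
    return residual, weights
```

When the microphone contains only echo, the residual decays towards zero as the weights converge to the echo path. In a karaoke setting the singer's voice remains in the residual, so real systems add double-talk handling, which this sketch omits.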

S23, carrying out voice positioning processing on the environmental audio signal after the echo cancellation according to a preset sound source positioning algorithm to obtain at least one initial voice signal. The sound source positioning algorithms include a time-delay estimation (TDE) algorithm and a generalized cross-correlation (GCC) time-delay estimation algorithm.

In sound source localization, microphones are arranged in space as an array with a known geometry to receive the sound field of the source, and the azimuth of the source is determined by detecting or calculating the time delay of the signal measured at each microphone. In this example, the positions of the multiple microphones may be determined.

The time-delay estimation algorithm uses the time delays caused by the different propagation distances from the source to the individual sensors or sensor arrays to perform combined direction finding and ranging of a target. If the time delay between the arrival of the sound wave at each array element can be estimated accurately, the target position parameters can be calculated from the geometry of the microphone array.
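As a rough illustration of this idea, the sketch below estimates the delay between two microphone signals by brute-force cross-correlation and converts it to an arrival angle for a two-microphone pair. The function names, the far-field assumption and the default speed of sound of 343 m/s are assumptions added for the example and do not come from the source.

```python
import math


def estimate_delay(sig_a, sig_b, max_lag):
    """Lag in samples by which sig_b is delayed relative to sig_a.

    The lag with the largest cross-correlation is returned; a negative
    value means sig_b leads sig_a.
    """
    n = min(len(sig_a), len(sig_b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field angle of arrival in degrees, measured from broadside."""
    x = lag / sample_rate * speed_of_sound / mic_spacing
    x = max(-1.0, min(1.0, x))  # clamp against measurement error
    return math.degrees(math.asin(x))
```

With more than two microphones, the pairwise delays can be combined with the array geometry to decide, for example, which seat a voice comes from.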

The GCC method is a conventional TDE method. Since signals from the same sound source are correlated, the time delay can be estimated by calculating the correlation function between the signals received by different microphones. In practical environments, however, noise and reverberation weaken the maximum peak of the correlation function and sometimes produce several peaks, which makes the true peak difficult to detect. The GCC method weights the signals in the power-spectrum domain to emphasize the correlated signal components and suppress the noise-corrupted components, so that the peak of the correlation function at the true time delay becomes more prominent.
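In its standard textbook form, the weighting described above can be written as follows. The symbols are assumptions introduced here: X_1(f) and X_2(f) are the Fourier transforms of two microphone signals, \psi(f) is the weighting function, and \hat{\tau} is the estimated delay. The phase transform (PHAT) shown last is one common weighting; the source does not say which weighting is used.

```latex
R_{12}(\tau) = \int_{-\infty}^{\infty} \psi(f)\, X_1(f)\, X_2^{*}(f)\, e^{j 2\pi f \tau}\, df,
\qquad
\hat{\tau} = \arg\max_{\tau} R_{12}(\tau),
\qquad
\psi_{\mathrm{PHAT}}(f) = \frac{1}{\left| X_1(f)\, X_2^{*}(f) \right|}
```

With \psi(f) = 1 the expression reduces to the ordinary cross-correlation, whereas the PHAT weighting keeps only the phase of the cross-power spectrum, which tends to sharpen the peak in reverberant conditions.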

S30, carrying out sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal.

Mixing (audio mixing) is a step in music production in which sound from multiple sources is integrated into a stereo or mono soundtrack. The original sound signals may come from different instruments, voices or orchestras, recorded at a live performance or in a recording studio. During mixing, the frequency, dynamics, tone quality, positioning, reverberation and sound field of each original signal are adjusted individually to optimize each track, and the tracks are then superimposed into the final product. In this embodiment, the first audio signal and the initial human voice signal are subjected to sound modulation to obtain a mixed target audio signal.

The mixing device can be a synthesizer (sound module), an audio processor (signal processor), a mixing console, or mixing software capable of complex mixing operations. In this example, a preset mixing algorithm is used: when the first audio signal and the initial human voice signal are input to the mixing device or mixing software, mixing is performed automatically and the mixed target audio signal is output. The mixing steps include, but are not limited to, track alignment, vocal correction, noise reduction, excitation, de-essing, equalization, and compression.

In one embodiment, the vehicle-mounted intelligent networking terminal further comprises a DSP audio processor, and the DSP audio processor is configured to perform sound modulation on the first audio signal and the initial human voice signal to obtain the target audio signal.

Referring to fig. 4, in one embodiment, the step S30 includes the following sub-steps:

S31, performing gain adjustment processing on the first audio signal and the initial human voice signal to obtain a gain-adjusted first audio signal and a gain-adjusted initial human voice signal.

The first audio signal and the initial human voice signal need gain adjustment so that they are amplified and the output audio signal meets the requirements of karaoke.

Continuing with fig. 2 and taking 4 initial voice signals as an example, gain adjustment is applied to the initial voice signals V20, V21, V22 and V23 and to the first audio signal V11, generating the gain-adjusted initial voice signals V200, V210, V220 and V230 and the gain-adjusted first audio signal V110, respectively.

S32, performing sound mixing processing on the gain-adjusted first audio signal and the gain-adjusted initial human voice signal to obtain a target audio signal.
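Sub-steps S31 and S32 can be sketched as a gain stage followed by a sample-wise sum. This is a minimal illustration only: the function names, the use of decibel gains and the hard clipping limit are assumptions, and the source does not specify how overload is handled.

```python
def apply_gain(signal, gain_db):
    """S31: scale a signal by a gain given in decibels."""
    factor = 10.0 ** (gain_db / 20.0)
    return [x * factor for x in signal]


def mix(tracks, limit=1.0):
    """S32: sum the tracks sample by sample and hard-clip the result.

    Hard clipping to [-limit, limit] is an assumed safeguard.
    """
    length = min(len(t) for t in tracks)
    mixed = []
    for i in range(length):
        total = sum(t[i] for t in tracks)
        mixed.append(max(-limit, min(limit, total)))
    return mixed


def karaoke_mix(accompaniment, vocals, accompaniment_gain_db, vocal_gain_db):
    """Apply gains to V11 and to each vocal signal, then mix them."""
    gained = [apply_gain(accompaniment, accompaniment_gain_db)]
    gained += [apply_gain(v, vocal_gain_db) for v in vocals]
    return mix(gained)
```

For the four-singer example, `karaoke_mix(v11, [v20, v21, v22, v23], -3.0, 0.0)` would lower the accompaniment by 3 dB before summing it with the four voices; the variable names and gain values here are hypothetical.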

S40, outputting the target audio signal.

An existing karaoke system needs the support of a mobile terminal and an additional microphone system. A receiver of the mobile terminal collects the voice, the voice is sent to the microphone system through Bluetooth or NFC, the microphone system processes the voice and the music of the mobile terminal to generate a third audio signal, and the third audio signal is sent to the vehicle head unit by FM to realize karaoke. However, the voice is transmitted by Bluetooth or the like and then again by FM, so it is severely distorted. The initial voice signal in this embodiment does not undergo such redundant modulation, which ensures a high-quality karaoke effect.

In one embodiment, the vehicle-mounted intelligent networking terminal further comprises a power amplifier and a loudspeaker. The power amplifier amplifies the output target audio signal, and the loudspeaker converts the amplified target audio signal into sound, so that the user hears the mixed target audio signal and the in-vehicle karaoke function is realized.

Referring to fig. 5, in an embodiment, step S30 further includes:

S33, performing sound quality optimization processing on the target audio signal.

The sound quality optimization processing includes adjusting the volume, sound effect, timbre and the like of the input audio signal. Applying it to the target audio signal refines the signal so that the output sounds better and provides a good karaoke experience.

In one embodiment, the vehicle-mounted intelligent networking terminal further includes a DSP audio adjuster, which performs sound quality optimization on the target audio signal so that the output target audio signal sounds better and provides a good karaoke experience.

Continuing with fig. 2 and taking 4 people singing karaoke in the vehicle at the same time as an example, the gain-adjusted initial human voice signals V200, V210, V220 and V230 and the gain-adjusted first audio signal V110 are input to the DSP audio adjuster. The DSP audio adjuster modulates the sound effect, timbre and the like of these signals to generate the mixed audio signal V300, and the power amplifier amplifies V300 to generate the target audio signal V400, which directly drives the loudspeaker.

Referring to fig. 6, in one embodiment, before the step S20 of picking up the audio signal of the environment in the vehicle, the method further includes:

S60, judging whether the karaoke mode is triggered.

If yes, triggering the function of picking up the environment audio signal in the vehicle according to the Karaoke mode.

The karaoke mode comprises parameters such as the human voice suppression ratio, gain adjustment and sound effect modulation. The user can trigger the karaoke mode as needed through the display of the vehicle-mounted intelligent networking terminal, and can also change the system's default karaoke mode there. When the vehicle-mounted intelligent networking terminal detects that the karaoke mode is triggered, the function of S20 is triggered according to the current karaoke mode; when it detects that the karaoke mode is not triggered, the function of S20 is not triggered. The user can thus choose whether to sing karaoke in the vehicle and can set a karaoke mode that suits personal singing habits, which makes for a good in-vehicle karaoke experience.
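The gating in S60 and the mode parameters can be sketched as a small state holder. All class, field and method names below are hypothetical, and the default values are placeholders; the source only lists the kinds of parameters a karaoke mode contains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KaraokeMode:
    """Parameters of a karaoke mode (hypothetical names and defaults)."""
    vocal_suppression_ratio: float = 1.0
    vocal_gain_db: float = 0.0
    accompaniment_gain_db: float = 0.0
    sound_effect: str = "none"


class KtvController:
    """Tracks whether the karaoke mode is triggered (step S60)."""

    def __init__(self, default_mode: Optional[KaraokeMode] = None):
        self.default_mode = default_mode if default_mode is not None else KaraokeMode()
        self.active_mode: Optional[KaraokeMode] = None

    def trigger(self, mode: Optional[KaraokeMode] = None) -> None:
        """Enter the karaoke mode, using the default mode if none is given."""
        self.active_mode = mode if mode is not None else self.default_mode

    def exit(self) -> None:
        """Leave the karaoke mode."""
        self.active_mode = None

    def should_pick_up(self) -> bool:
        """Pickup (S20) runs only while the karaoke mode is triggered."""
        return self.active_mode is not None
```

A pickup loop would check `should_pick_up()` before starting S20 and read the suppression ratio and gains from `active_mode`.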

In one embodiment, a KTV control application is also preinstalled on the vehicle-mounted intelligent networking terminal as a desktop application. It provides the user with the interaction for entering and exiting the karaoke mode and for setting the human voice suppression ratio, the gain adjustment and the sound effect of the audio.

In the prior art, the karaoke system relies on a mobile terminal, so the lyrics can only be displayed on the mobile terminal and cannot be synchronized to the screen of the vehicle-mounted multimedia host. In contrast, the KTV control application in this embodiment synchronizes the multimedia player and the display, so that lyrics are shown in sync on the vehicle display and the in-vehicle karaoke experience is further improved.

Referring to fig. 7, an embodiment of the present invention provides a vehicle-mounted KTV control apparatus, which includes a first audio signal obtaining module 20, an initial human voice signal obtaining module 21, a target audio signal obtaining module 22, and a target audio signal output module 23.

The first audio signal obtaining module 20 is configured to perform voice suppression processing on the initial audio signal to obtain a first audio signal. The voice suppression comprises voice elimination and voice weakening, and the initial audio signal is a multimedia audio signal.

The initial human voice signal acquisition module 21 is configured to pick up an environmental audio signal in the vehicle and perform echo cancellation and human voice positioning processing to obtain at least one initial human voice signal. Wherein the ambient audio signal comprises a first audio signal, a human voice signal and a noise signal.

Referring to fig. 8, in one embodiment, the initial human voice signal obtaining module 21 includes an environmental audio signal pickup module 211, an echo cancellation processing module 212, and a human voice positioning processing module 213.

The environment audio signal pickup module 211 is used for picking up an environment audio signal in the vehicle.

The echo cancellation processing module 212 is configured to perform echo cancellation processing on the environmental audio signal according to a preset echo cancellation algorithm.

The echo cancellation algorithms include an acoustic echo suppression (AES) algorithm and an acoustic echo cancellation (AEC) algorithm.

The voice positioning processing module 213 is configured to perform voice positioning processing on the environmental audio signal after the echo cancellation according to a preset sound source positioning algorithm to obtain at least one initial voice signal.

The sound source positioning algorithm includes a Time-Delay Estimation (TDE) algorithm and a generalized cross-correlation (GCC) Time-Delay Estimation algorithm.

The target audio signal obtaining module 22 is configured to perform sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal.

Referring to fig. 9, in one embodiment, the target audio signal obtaining module 22 includes a gain adjustment processing module 221 and a mixing processing module 222.

The gain adjustment processing module 221 is configured to perform gain adjustment processing on the first audio signal and the initial human voice signal to obtain a gain-adjusted first audio signal and a gain-adjusted initial human voice signal.

The audio mixing processing module 222 is configured to perform audio mixing processing on the first audio signal after the gain and the initial human voice signal after the gain to obtain a target audio signal.

The target audio signal output module 23 is configured to output a target audio signal.

Referring to fig. 10, in one embodiment, the vehicle-mounted KTV control apparatus further includes a determining module 24, configured to determine whether the karaoke mode is triggered.

The karaoke mode comprises parameters such as the human voice suppression ratio, gain adjustment and sound effect modulation. The user can trigger the karaoke mode as needed through the display of the vehicle-mounted intelligent networking terminal, and can also change the system's default karaoke mode there. When the vehicle-mounted intelligent networking terminal detects that the karaoke mode is triggered, the function of step S20 is triggered according to the current karaoke mode; when it detects that the karaoke mode is not triggered, the function of step S20 is not triggered. The user can thus choose whether to sing karaoke in the vehicle and can set a karaoke mode that suits personal singing habits, which makes for a good in-vehicle karaoke experience.

Referring to fig. 11, an embodiment of the present invention provides a vehicle-mounted intelligent networking terminal. As shown in fig. 11, the vehicle-mounted intelligent networking terminal may include one or more processors and a memory. The memory is coupled to the processor and stores one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the vehicle-mounted KTV control method according to any one of the above embodiments and achieve technical effects consistent with the method.

The processor is used for controlling the overall operation of the vehicle-mounted intelligent network terminal so as to complete all or part of the steps of the vehicle-mounted KTV control method. The memory is used for storing various types of data to support the operation of the vehicle-mounted intelligent networking terminal, and the data can comprise instructions of any application program or method used for operating the vehicle-mounted intelligent networking terminal and application program related data. The Memory may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk, or optical disk.

Preferably, the in-vehicle intelligent networking terminal may further include one or more of a multimedia component, an input/output (I/O) interface, and a communication component.

The multimedia component may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory or transmitted through the communication component. The audio component also includes at least one speaker for outputting audio signals. The I/O interface provides an interface between the processor and other interface modules, such as a keyboard, a mouse or buttons. The buttons may be virtual buttons or physical buttons. The communication component is used for wired or wireless communication between the vehicle-mounted intelligent networking terminal and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them, so the communication component may include a Wi-Fi module, a Bluetooth module and an NFC module.

In an exemplary embodiment, the vehicle-mounted intelligent networking terminal may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, and is configured to perform the vehicle-mounted KTV control method and achieve technical effects consistent with the above method.

In another exemplary embodiment, there is also provided a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the in-vehicle KTV control method described above. For example, the computer readable storage medium may be the memory including the program instructions, and the program instructions may be executed by a processor of the vehicle-mounted intelligent networking terminal to implement the vehicle-mounted KTV control method, and achieve the technical effects consistent with the method.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A vehicle-mounted KTV control method, characterized by comprising:

performing vocal suppression processing on an initial audio signal to obtain a first audio signal, wherein the vocal suppression includes vocal elimination and vocal weakening, and the initial audio signal is a multimedia audio signal;

picking up an in-vehicle ambient audio signal and performing echo cancellation and human voice localization processing to obtain at least one initial human voice signal, wherein the ambient audio signal includes the first audio signal, a human voice signal and a noise signal;

performing sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal; and

outputting the target audio signal.
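
By way of non-limiting illustration, the vocal suppression step could be sketched as follows. The claim does not state how suppression is performed; this sketch assumes a stereo source with centre-panned vocals and uses mid/side attenuation, which is a swapped-in, commonly used technique rather than the method of the invention. The function name `suppress_vocals` and the parameter `strength` are hypothetical.

```python
from typing import List, Tuple


def suppress_vocals(left: List[float], right: List[float],
                    strength: float = 1.0) -> Tuple[List[float], List[float]]:
    """Attenuate the centre-panned (mid) component of a stereo signal.

    Assumption: vocals are mixed equally into both channels.
    strength = 1.0 removes the mid component (vocal elimination);
    0.0 < strength < 1.0 only reduces it (vocal weakening).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0.0, 1.0]")
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0   # content common to both channels
        side = (l - r) / 2.0  # content that differs between channels
        kept_mid = (1.0 - strength) * mid
        out_left.append(kept_mid + side)
        out_right.append(kept_mid - side)
    return out_left, out_right
```

In such a sketch, a single parameter covers both variants named in the claim: full removal of the common component corresponds to vocal elimination, and partial removal to vocal weakening. Instruments that are also centre-panned would be attenuated as well, which is a known limitation of this approach.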

2. The vehicle-mounted KTV control method according to claim 1, characterized in that picking up the in-vehicle ambient audio signal and performing echo cancellation and human voice localization processing to obtain at least one initial human voice signal comprises:

picking up the in-vehicle ambient audio signal;

performing echo cancellation processing on the ambient audio signal according to a preset echo cancellation algorithm, wherein the echo cancellation algorithm includes an echo cancellation algorithm and an acoustic echo cancellation algorithm; and

performing human voice localization processing on the echo-cancelled ambient audio signal according to a preset sound source localization algorithm to obtain at least one initial human voice signal, wherein the sound source localization algorithm includes a time delay estimation algorithm and a generalized cross-correlation time delay estimation algorithm.
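
By way of non-limiting illustration, time delay estimation between two microphones could be sketched as follows. The sketch uses plain time-domain cross-correlation, which corresponds to generalized cross-correlation with unit weighting; a practical implementation would more likely work in the frequency domain with a weighting such as PHAT. The function names `cross_correlation` and `estimate_delay` and the parameter `max_lag` are hypothetical and do not appear in the original.

```python
from typing import Sequence


def cross_correlation(a: Sequence[float], b: Sequence[float], lag: int) -> float:
    """Sum of a[n] * b[n + lag] over all n where both indices are valid."""
    total = 0.0
    for n, x in enumerate(a):
        m = n + lag
        if 0 <= m < len(b):
            total += x * b[m]
    return total


def estimate_delay(mic_a: Sequence[float], mic_b: Sequence[float],
                   max_lag: int) -> int:
    """Return the lag (in samples) at which mic_b best matches mic_a.

    A positive result means the sound reached mic_a first, i.e. the
    source is assumed to be closer to mic_a than to mic_b.
    """
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(mic_a, mic_b, lag))
```

Under these assumptions, the sign and size of the estimated delay for each microphone pair could be used to attribute a voice to a seat position, so that several singers in the cabin are separated into individual initial human voice signals.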

3. The vehicle-mounted KTV control method according to claim 1, characterized in that performing sound modulation on the first audio signal and the initial human voice signal to obtain the target audio signal comprises:

performing gain adjustment processing on the first audio signal and the initial human voice signal to obtain a gain-adjusted first audio signal and a gain-adjusted initial human voice signal; and

performing mixing processing on the gain-adjusted first audio signal and the gain-adjusted initial human voice signal to obtain the target audio signal.
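
By way of non-limiting illustration, the gain adjustment and mixing steps could be sketched as follows. The sketch assumes floating-point samples normalized to the range [-1.0, 1.0], gains expressed in decibels, and hard clipping after summation; none of these choices is stated in the original, and the function names `apply_gain` and `mix` are hypothetical.

```python
from typing import List, Sequence


def apply_gain(samples: Sequence[float], gain_db: float) -> List[float]:
    """Scale samples by a gain given in decibels (0 dB leaves them unchanged)."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


def mix(accompaniment: Sequence[float],
        voices: Sequence[Sequence[float]]) -> List[float]:
    """Sum the accompaniment with every voice signal, sample by sample.

    The result is clipped to [-1.0, 1.0] so that several loud singers
    cannot drive the output beyond full scale.
    """
    mixed = []
    for n, sample in enumerate(accompaniment):
        total = sample + sum(v[n] for v in voices if n < len(v))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed
```

For example, `mix(apply_gain(first_audio, -6.0), [apply_gain(voice, 3.0)])` would lower the accompaniment and raise one voice before combining them into the target audio signal; the specific gain values here are illustrative only.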

4. The vehicle-mounted KTV control method according to claim 1, characterized in that, before picking up the in-vehicle ambient audio signal, the method further comprises:

determining whether a karaoke mode is triggered; and

if so, triggering the function of picking up the in-vehicle ambient audio signal according to the karaoke mode.

5. A vehicle-mounted KTV control device, characterized by comprising:

a first audio signal acquisition module, configured to perform vocal suppression processing on an initial audio signal to obtain a first audio signal, wherein the vocal suppression includes vocal elimination and vocal weakening, and the initial audio signal is a multimedia audio signal;

an initial human voice signal acquisition module, configured to pick up an in-vehicle ambient audio signal and perform echo cancellation and human voice localization processing to obtain at least one initial human voice signal, wherein the ambient audio signal includes the first audio signal, a human voice signal and a noise signal;

a target audio signal acquisition module, configured to perform sound modulation on the first audio signal and the initial human voice signal to obtain a target audio signal; and

a target audio signal output module, configured to output the target audio signal.

6. The vehicle-mounted KTV control device according to claim 5, wherein the initial human voice signal acquisition module comprises:

an ambient audio signal pickup module, configured to pick up the in-vehicle ambient audio signal;

an echo cancellation processing module, configured to perform echo cancellation processing on the ambient audio signal according to a preset echo cancellation algorithm, wherein the echo cancellation algorithm includes an echo cancellation algorithm and an acoustic echo cancellation algorithm; and

a human voice localization processing module, configured to perform human voice localization processing on the echo-cancelled ambient audio signal according to a preset sound source localization algorithm to obtain at least one initial human voice signal, wherein the sound source localization algorithm includes a time delay estimation algorithm and a generalized cross-correlation time delay estimation algorithm.
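
By way of non-limiting illustration, an echo cancellation processing module could be sketched as follows. The original does not name a specific adaptive algorithm; this sketch assumes a normalized least-mean-squares (NLMS) filter, a commonly used choice for acoustic echo cancellation, with the signal sent to the speakers available as a reference. The function name `nlms_echo_cancel` and the parameters `taps`, `mu` and `eps` are hypothetical.

```python
from typing import List, Sequence


def nlms_echo_cancel(mic: Sequence[float], reference: Sequence[float],
                     taps: int = 4, mu: float = 0.5,
                     eps: float = 1e-8) -> List[float]:
    """Subtract an adaptively estimated echo of `reference` from `mic`.

    mic:       samples picked up in the cabin (speaker echo plus voice and noise)
    reference: samples sent to the speakers (the accompaniment)
    Returns the residual signal, from which the speaker echo is largely removed.
    """
    weights = [0.0] * taps   # current estimate of the speaker-to-mic echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, reference):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        norm = sum(h * h for h in history) + eps
        # NLMS update: step size normalized by the reference signal energy.
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual
```

In a cabin the echo path is far longer than four taps, so a practical filter would need many more coefficients and some form of double-talk handling while occupants are singing; both are omitted from this sketch.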

7. The vehicle-mounted KTV control device according to claim 5, wherein the target audio signal acquisition module comprises:

a gain adjustment processing module, configured to perform gain adjustment processing on the first audio signal and the initial human voice signal to obtain a gain-adjusted first audio signal and a gain-adjusted initial human voice signal; and

a sound mixing processing module, configured to perform mixing processing on the gain-adjusted first audio signal and the gain-adjusted initial human voice signal to obtain the target audio signal.

8. The vehicle-mounted KTV control device according to claim 5, characterized by further comprising:

a judgment module, configured to judge whether a karaoke mode is triggered;

9. A vehicle-mounted intelligent networking terminal, characterized by comprising:

one or more processors; and

a memory, coupled to the one or more processors, for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the vehicle-mounted KTV control method according to any one of claims 1 to 4.