CN118262738A - Sound effect space adaptation method and system - Google Patents
- Publication number
- CN118262738A (application CN202410339606.1A)
- Authority
- CN
- China
- Prior art keywords
- audio
- multimedia terminal
- damage
- propagation
- played
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L21/0316 — Speech enhancement, e.g. noise reduction or echo cancellation, by changing the amplitude
- G10L19/0212 — Speech or audio analysis-synthesis using spectral analysis, using orthogonal transformation
- G10L19/022 — Blocking, i.e. grouping of samples in time; choice of analysis windows; overlap factoring
- G10L19/26 — Pre-filtering or post-filtering
- G10L21/01 — Changing voice quality; correction of time axis
- G10L21/0208 — Noise filtering
- G10L21/0232 — Noise filtering with processing in the frequency domain
- G10L25/18 — Extracted parameters being spectral information of each sub-band
- G10L25/24 — Extracted parameters being the cepstrum
- G10L25/45 — Characterised by the type of analysis window
Abstract
The invention discloses a sound effect space adaptation method and system, belonging to the technical field of sound effect adjustment. The method comprises the following steps: recording, from the listening position, the original audio played by the multimedia terminal to obtain recorded audio; calculating the propagation loss from the multimedia terminal's placement position to the listening position using the original audio and the recorded audio; and preprocessing the audio to be played by the multimedia terminal based on the propagation loss. The invention derives an inverse propagation loss from the propagation loss and loads it onto the audio before the multimedia terminal plays it, so that when the played audio, shaped by the propagation loss, finally reaches the listening position, the propagation loss and the inverse propagation loss cancel each other and the user hears the factory-preset listening effect.
Description
Technical Field
The invention belongs to the technical field of sound effect adjustment, and particularly relates to a sound effect space adaptation method and system.
Background
Before a multimedia terminal leaves the factory, a corresponding sound effect mode is tuned for it, so that every user receives a product with the same acoustic parameters.
However, each user's usage scenario is different, and the actual listening effect often does not match the effect intended by factory tuning. The sound played by the multimedia terminal reaches the listener's ears through multiple reflections, so listeners in different environments ultimately hear different results. Factors influencing the listening effect include the size and shape of the room, reflections off objects in the room, the sound absorption of the walls, the listener's position relative to the multimedia terminal, dust adhering to the surface of the terminal's loudspeaker, and so on. To give users in different environments the best listening experience, a method is needed to eliminate the influence of the environment on the sound effect.
Disclosure of Invention
In view of this, the present invention provides a method and a system for spatial adaptation of sound effects, which can eliminate the influence of the environment on the sound effects.
In order to solve the above technical problem, the present invention adopts a sound effect space adaptation method applied to the sound effect adjustment of a multimedia terminal, comprising:
recording, from a listening position in the usage scenario of the multimedia terminal, the original audio Audio_S played by the terminal to obtain a recorded audio Audio_T;
calculating the propagation loss from the multimedia terminal's placement position to the listening position using the original audio Audio_S and the recorded audio Audio_T;
and preprocessing the audio played by the multimedia terminal based on the propagation loss.
As an improvement, the method of recording the original audio Audio_S played by the multimedia terminal from the listening position to obtain the recorded audio Audio_T includes:
recording the audio through an intelligent terminal and uploading it; or
recording the audio through a detachable accessory of the multimedia terminal that has a recording function, and uploading it.
As a further improvement, when the audio is recorded through an intelligent terminal, acquiring the device loss of the intelligent terminal includes:
placing the intelligent terminal close to the multimedia terminal and recording the original audio Audio_S1 played by the terminal to obtain a recorded audio Audio_T1, so as to eliminate the propagation loss;
acquiring the intermediate loss S1 from the original audio Audio_S1 to the recorded audio Audio_T1;
and removing the device loss of the multimedia terminal from the intermediate loss S1 to obtain the device loss of the intelligent terminal.
As another further improvement, after the recorded audio Audio_T is obtained, it is filtered; the filter window used in the filtering covers a threshold number of audio periods.
As an improvement, the method further comprises: aligning the recorded audio Audio_T with the original audio Audio_S so that the recorded audio Audio_T has the same start point and end point as the original audio Audio_S.
As an improvement, the method of calculating the propagation loss from the multimedia terminal's placement position to the listening position using the original audio Audio_S and the recorded audio Audio_T comprises:
converting the original audio Audio_S and the recorded audio Audio_T into frequency domain representations respectively;
acquiring the propagation loss from the frequency domain representation of the original audio Audio_S, the frequency domain representation of the recorded audio Audio_T, and the fixed loss; the fixed loss includes the device loss of the multimedia terminal and the device loss of the recording device.
As an improvement, one method of preprocessing the audio played by the multimedia terminal based on the propagation loss comprises:
deriving the inverse propagation loss from the propagation loss;
converting the inverse propagation loss into a time domain representation;
and convolving the time domain representation of the inverse propagation loss with the time domain representation of the audio to be played to obtain the preprocessed audio.
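The time-domain variant above can be sketched as follows. This is a minimal illustration assuming NumPy; the function name and the unit-impulse inverse response are hypothetical placeholders, not the patent's actual implementation:

```python
import numpy as np

def preprocess_time_domain(audio, inv_impulse):
    """Convolve the audio to be played with the time-domain representation of
    the inverse propagation loss, then trim back to the original length."""
    return np.convolve(audio, inv_impulse)[:len(audio)]

# Illustrative check: if the inverse response were an ideal unit impulse,
# preprocessing would leave the audio unchanged.
audio = np.array([0.1, 0.4, -0.2, 0.3])
identity = np.array([1.0])
out = preprocess_time_domain(audio, identity)
```

In practice `inv_impulse` would be the inverse propagation loss converted to the time domain in step S2512.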
As an improvement, another method of preprocessing the audio played by the multimedia terminal based on the propagation loss comprises:
deriving the inverse propagation loss from the propagation loss;
converting the audio to be played from a time domain representation to a frequency domain representation;
and multiplying the frequency domain representation of the audio to be played by the frequency domain representation of the inverse propagation loss, then converting the product back into a time domain representation to obtain the preprocessed audio.
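The frequency-domain variant can be sketched in the same spirit — a NumPy sketch under the assumption that the inverse propagation response is given as one linear gain per frequency bin (the function and variable names are illustrative):

```python
import numpy as np

def preprocess_freq_domain(audio, inv_response):
    """FFT the audio to be played, multiply by the frequency-domain inverse
    propagation response, and transform back to the time domain."""
    X = np.fft.rfft(audio)
    Xp = X * inv_response                 # one complex/linear gain per bin
    return np.fft.irfft(Xp, n=len(audio))

# Illustrative check: an all-ones inverse response changes nothing.
audio = np.sin(2 * np.pi * np.arange(64) / 8)
flat = np.ones(33)                        # rfft of a 64-sample signal has 33 bins
out = preprocess_freq_domain(audio, flat)
```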
As an improvement, when there are two or more multimedia terminals, the propagation loss from each multimedia terminal to the listening position is acquired separately,
and the audio played by each multimedia terminal is preprocessed based on its propagation loss.
The invention also provides a sound effect space adaptation system, comprising:
an audio acquisition module, configured to record, from the listening position, the original audio Audio_S played by the multimedia terminal to obtain a recorded audio Audio_T;
a propagation loss acquisition module, configured to calculate the propagation loss from the multimedia terminal's placement position to the listening position using the original audio Audio_S and the recorded audio Audio_T;
and an audio preprocessing module, configured to preprocess the audio played by the multimedia terminal based on the propagation loss.
The invention has the advantages that:
In the invention, in the usage scenario of the multimedia terminal, the original audio played by the terminal is recorded from a listening position to obtain recorded audio; the propagation loss from the terminal's placement position to the listening position is then calculated using the original audio and the recorded audio; finally, the audio played by the multimedia terminal is preprocessed based on the propagation loss.
The propagation loss is mainly caused by the size and shape of the room, reflections off objects in the room, the sound absorption of the walls, the listener's position relative to the projector/television, dust adhering to the surface of the television/projector loudspeaker, and so on. These conditions differ between usage scenarios, so the propagation loss also differs in each scenario. By eliminating the propagation loss, the invention lets users in different scenarios obtain the factory-tuned listening effect. The principle is to derive an inverse propagation loss from the propagation loss and load it onto the audio before the multimedia terminal plays it, so that when the played audio, shaped by the propagation loss, finally reaches the listening position, the propagation loss and the inverse propagation loss cancel each other and the user hears the factory-preset listening effect.
Drawings
Fig. 1 is a flowchart of an audio space adaptation method according to an embodiment of the present invention.
Fig. 2 is a flowchart of an audio space adaptation method according to another embodiment of the present invention.
Fig. 3 is a schematic diagram of audio recording according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of audio propagation according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of an embodiment of the present invention.
Detailed Description
In order to make the technical scheme of the present invention better understood by those skilled in the art, the present invention will be further described in detail with reference to the following specific embodiments.
Example 1
In order to eliminate the influence of the environment on the sound effect played by a multimedia terminal, as shown in fig. 1, an embodiment of the present invention provides a sound effect space adaptation method applied to the sound effect adjustment of a multimedia terminal, comprising the steps of:
S11, recording, from a listening position in the usage scenario of the multimedia terminal, the original audio Audio_S played by the terminal to obtain a recorded audio Audio_T;
S12, calculating the propagation loss from the multimedia terminal's placement position to the listening position using the original audio Audio_S and the recorded audio Audio_T;
S13, preprocessing the audio played by the multimedia terminal based on the propagation loss.
In this embodiment, the audio to be played is preprocessed based on the propagation loss, so that the influence of the propagation loss on the sound quality is eliminated when the audio is played.
Example 2
As shown in fig. 2, an embodiment of the present invention further provides a sound effect space adaptation method, with the following specific steps:
S21, recording, from a listening position in the usage scenario of the multimedia terminal, the original audio Audio_S played by the terminal to obtain a recorded audio Audio_T.
Once the multimedia terminal's placement position and the user's listening position are determined, the terminal can run the sound effect space adaptation method provided by the invention during initialization, so as to eliminate the environmental influence along the path from the placement position to the listening position. Of course, the method can also be started by the user on demand: for example, when the terminal's placement position or the listening position changes, the user can launch the sound effect space adaptation from the terminal's function menu.
The original audio played by the multimedia terminal should cover the commonly used frequency bands, for example 20 Hz to 20 kHz, as fully as possible, so that recorded audio is obtained for each frequency band and the propagation loss of each band can be compensated in the subsequent steps.
In this embodiment, as shown in fig. 3, the audio is recorded and uploaded through a detachable accessory of the multimedia terminal that has a recording function, such as a remote controller: existing remote controllers generally have a voice function and communicate with the terminal's host via Bluetooth. After the recorded audio Audio_T is obtained, it can be uploaded directly to the multimedia terminal for the subsequent steps. Of course, uploading to a cloud server over a network for processing is not precluded.
Alternatively, the audio can be recorded and uploaded through an intelligent terminal, i.e. a smart device such as a mobile phone or tablet with recording and transmission functions. After the recorded audio Audio_T is obtained, it is uploaded to a cloud server through an app and then delivered by the cloud server to the multimedia terminal. Direct upload to the multimedia terminal via Wi-Fi, a Bluetooth connection, and the like is of course not excluded.
Other methods of obtaining the recorded audio Audio_T are likewise not excluded in the embodiments of the present invention, as long as the recorded audio Audio_T can be obtained and the subsequent steps executed.
S22, filtering the recorded audio Audio_T.
After the recorded audio Audio_T is obtained, its noise needs to be reduced, so as to limit the influence of the noise signal on the subsequent calculations.
Noise reduction can be achieved by, for example, mean filtering, median filtering, or adaptive filtering. This implementation uses mean filtering, specifically the formula
y[n] = (1/N) * Σ_{i=0}^{N-1} x[n-i],
where N is the filter window size and x[n-i] is the i-th sample of the input signal within the window.
To affect the original signal as little as possible, the size of the filter window N in the present invention varies with the frequency of the audio signal: the higher the frequency, the smaller the window, and conversely, the lower the frequency, the larger the window. That is, each window should preferably span a threshold fraction of an audio period, e.g. 1/8 to 1/4 of a period, so that detail is not lost because one window spans too many periods, nor efficiency wasted because the windows are too dense.
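The frequency-adaptive mean filter described above can be sketched as follows. This is a minimal NumPy sketch; the helper names and the choice of 1/4 period as the default window fraction are illustrative assumptions:

```python
import numpy as np

def mean_filter(x, n):
    """Moving-average filter: y[k] = (1/n) * sum of the n most recent samples."""
    kernel = np.ones(n) / n
    return np.convolve(x, kernel, mode="same")

def adaptive_window(dominant_freq_hz, sample_rate_hz, fraction=0.25):
    """Window spanning a fraction of one period of the dominant frequency:
    higher frequency -> shorter period -> smaller window."""
    period_samples = sample_rate_hz / dominant_freq_hz
    return max(1, int(period_samples * fraction))

# Example: a 1 kHz tone at 48 kHz has 48 samples per period; 1/4 period = 12 samples.
n = adaptive_window(1000, 48000)
noisy = np.sin(2 * np.pi * 1000 * np.arange(480) / 48000) + 0.1 * np.random.randn(480)
smoothed = mean_filter(noisy, n)
```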
S23, aligning the recorded audio Audio_T with the original audio Audio_S so that the recorded audio Audio_T has the same start point and end point as the original audio Audio_S.
To obtain the expected difference between the recorded audio Audio_T and the original audio Audio_S, the two must share the same start and end points. In an actual recording, however, there may be leading silence before the original audio Audio_S starts playing, or the recording may start later than playback, so that the start and end points of the original audio Audio_S and the recorded audio Audio_T differ in the time domain; alignment is therefore required.
In this embodiment, the alignment is performed using the cross-correlation formula
C(m) = E[ R(t) · L(t+m) ],
where E denotes the mathematical expectation, t is time, R and L are the discrete signals of the original audio Audio_S and the recorded audio Audio_T respectively, and C(m) is the correlation coefficient of R and L at offset m. The formula measures the degree of correlation at each delay value m; the larger the correlation coefficient, the more similar the two signals. The index i of the maximum of C(m) is taken, and L is cut starting from point i with the length of the R signal, giving L(i).
After this operation, the recorded audio Audio_T and the original audio Audio_S have the same start and end points in the time domain, which facilitates the subsequent steps.
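The alignment step can be sketched with NumPy's built-in correlation — a minimal illustration, assuming the recording starts at or after playback (the function name is hypothetical):

```python
import numpy as np

def align(original, recorded):
    """Find the offset m maximizing C(m) = E[R(t) * L(t+m)] via cross-correlation,
    then cut the recording to the original's length from that point."""
    corr = np.correlate(recorded, original, mode="full")
    lag = int(np.argmax(corr)) - (len(original) - 1)
    lag = max(lag, 0)                  # assume recording cannot precede playback
    return recorded[lag:lag + len(original)]

# Example: the recording is the original with 5 samples of leading silence.
rng = np.random.default_rng(0)
orig = rng.standard_normal(100)
rec = np.concatenate([np.zeros(5), orig, np.zeros(3)])
aligned = align(orig, rec)
```

After alignment, `aligned` and `orig` share the same start and end points, as the step requires.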
S24, calculating the propagation loss from the multimedia terminal's placement position to the listening position using the original audio Audio_S and the recorded audio Audio_T.
As shown in fig. 4, the loss from the original audio Audio_S to the recorded audio Audio_T actually consists of a fixed loss and a propagation loss.
The fixed loss is the loss introduced by the devices; in this implementation it specifically includes the device loss of the multimedia terminal and the device loss of the recording device (the intelligent terminal or the detachable accessory with a recording function). More specifically, the multimedia terminal's loss comes from the tuning of its equalizer and from the loudspeaker's response at each audio frequency point.
The propagation loss is mainly caused by the size and shape of the room, reflections off objects in the room, the sound absorption of the walls, the listener's position relative to the projector/television, dust adhering to the surface of the television/projector loudspeaker, and so on. These conditions differ between usage scenarios, so the propagation loss also differs in each scenario. By eliminating the propagation loss, the invention lets users in different scenarios obtain the factory-tuned listening effect.
To eliminate the propagation loss, the propagation loss of the usage scenario must first be acquired. In this implementation, the specific steps are:
S241, converting the original audio Audio_S and the recorded audio Audio_T into frequency domain representations respectively.
Both the original audio Audio_S and the recorded audio Audio_T are initially time domain representations.
The time domain describes how a signal changes over time, i.e. a waveform displayed on a time axis, showing the timing information contained in the signal. The audio signal for playback is a time domain representation, and the audio signal obtained by recording is likewise a time domain representation.
The frequency domain is the coordinate system used to describe a signal's characteristics in terms of frequency. It focuses on the distribution of the signal's energy across frequency points. Frequency domain signals are typically shown as a spectrogram, containing information such as the signal's frequency content and power spectral density.
In this embodiment, both the original audio Audio_S and the recorded audio Audio_T are converted into frequency domain representations before their difference is taken.
In this embodiment, the conversion of the audio from the time domain to the frequency domain is achieved by the fast Fourier transform, specifically using the formula
X(k) = Σ_{n=0}^{N-1} x(n) · e^{-j·2πkn/N},
which converts the original audio Audio_S and the recorded audio Audio_T from time domain to frequency domain representations. Here X(k) is the spectral representation in the spectrum analysis, describing the intensity and distribution of the audio signal in the frequency domain, and k denotes the discrete frequency points; x(n) is the time domain representation of the audio signal, describing its variation over time, and n denotes the discrete time points; e is the base of the natural logarithm, N is the number of signal samples, π is pi, and j is the imaginary unit. Both the original audio Audio_S and the recorded audio Audio_T are converted using this formula.
The above fast Fourier transform yields the frequency domain representation P(k) of the original audio Audio_S and the frequency domain representation C(k) of the recorded audio Audio_T. The frequency domain representation records the loudness of the sound signal at each frequency point, in dB.
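The conversion to a per-frequency loudness spectrum in dB can be sketched as follows — a minimal NumPy sketch, where the function name, the dB floor, and the 440 Hz test tone are illustrative assumptions:

```python
import numpy as np

def spectrum_db(x):
    """Frequency-domain representation X(k) = sum_n x(n)*e^{-j2πkn/N},
    reported as loudness in dB at each frequency point."""
    X = np.fft.rfft(x)                              # FFT implementation of the DFT
    mag = np.abs(X)
    return 20 * np.log10(np.maximum(mag, 1e-12))    # floor avoids log(0)

fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)                  # 440 Hz test tone, 1 second
P = spectrum_db(tone)
# With exactly one second of signal, bin k corresponds to k Hz.
peak_bin = int(np.argmax(P))
```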
S242, acquiring the propagation loss from the frequency domain representation of the original audio Audio_S, the frequency domain representation of the recorded audio Audio_T, and the fixed loss; the fixed loss includes the device loss of the multimedia terminal and the device loss of the recording device.
As described above, the loss from the original audio Audio_S to the recorded audio Audio_T consists of a fixed loss and a propagation loss. That is, the frequency domain representation C(k) = P(k) + fixed loss S(k) + propagation loss W(k), where the fixed loss S(k) = T(k) + N(k) + M(k): T(k) is the tuning loss of the equalizer and N(k) the loss of the loudspeaker response at each audio frequency point (together the device loss of the multimedia terminal), and M(k) is the device loss of the recording device. Hence:
C(k) = P(k) + T(k) + N(k) + W(k) + M(k);
which rearranges to the propagation loss
W(k) = C(k) - P(k) - T(k) - N(k) - M(k).
The propagation loss from the multimedia terminal's placement position to the listening position is obtained through this formula.
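Since all quantities are dB values per frequency bin, the rearrangement above is plain element-wise arithmetic. A minimal sketch — the three-bin arrays are illustrative placeholders; in practice T, N and M would come from factory calibration of the equalizer, loudspeaker and recording device:

```python
import numpy as np

# All arrays hold dB values per frequency bin k (illustrative numbers only).
C = np.array([62.0, 58.0, 55.0])   # recorded audio Audio_T
P = np.array([60.0, 60.0, 60.0])   # original audio Audio_S
T = np.array([1.0, 0.0, -1.0])     # equalizer tuning loss
N = np.array([0.5, -0.5, 0.0])     # loudspeaker response loss
M = np.array([0.0, 0.5, -0.5])     # recording-device loss

W = C - P - T - N - M              # propagation loss per bin
Y = -W                             # inverse propagation loss used for pre-compensation
```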
S25, preprocessing the audio played by the multimedia terminal based on the propagation damage.
The principle for eliminating the influence of the propagation gain in this embodiment is to obtain the counter-propagation gain through the propagation gain, and load the counter-propagation gain onto a certain audio of the multimedia terminal in advance before playing the audio, so that the propagation gain and the counter-propagation gain cancel each other when the audio finally reaches the listening position through the influence of the propagation gain during playing, and the user obtains the listening effect preset in factory.
In the ideal case, the curves of the propagation gain and the inverse propagation gain are symmetric about the X axis; when the two are superimposed, their peaks and troughs sum to zero, thereby eliminating the propagation gain.
After the propagation gain from the placement position of the multimedia terminal to the listening position is obtained, this embodiment of the invention provides two methods for preprocessing the audio played by the multimedia terminal so as to eliminate the influence of the propagation gain on the listening effect.
The first method includes:
S2511, acquiring the inverse propagation gain from the propagation gain.
The propagation gain obtained in step S24 is negated, i.e. Y(k) = -W(k) = P(k)+T(k)+N(k)+M(k)-C(k), where Y(k) is the inverse propagation gain, W(k) is the propagation gain, P(k) is the frequency domain representation of the original Audio audio_S, T(k) is the tuning gain of the equalizer, N(k) is the loudspeaker response gain at each audio frequency point, M(k) is the device gain of the recording device, and C(k) is the frequency domain representation of the recorded Audio audio_T.
S2512, converting the inverse propagation gain into a time domain representation.
In this embodiment, the inverse propagation gain Y(k) is converted from the frequency domain representation to the time domain representation by the inverse fast Fourier transform, specifically using the formula:

x(n) = (1/N) · Σ_{k=0}^{N-1} X(k) · e^{j2πkn/N};

wherein X(k) is the spectral representation in the spectral analysis, representing the intensity and distribution of the sound signal in the frequency domain, and k represents the discrete frequency points; x(n) is the time domain representation of the sound signal, representing the variation of the sound over time, and n represents the discrete time points; e is the base of the natural logarithm, N is the number of signal sample points, π is the circular constant, and j is the imaginary unit.
Through the above operations, the frequency domain representation Y(k) of the inverse propagation gain can be converted into the time domain representation y(n).
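The inverse transform above can be sketched directly from the formula (illustrative Python, using a naive DFT pair rather than an FFT; the round-trip signal is hypothetical):

```python
import cmath
import math

def dft(x):
    """Forward DFT (helper for the round-trip check)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT: x(n) = (1/N) * sum_k X(k) * e^{j*2*pi*k*n/N}."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# Round trip: transforming and inverse-transforming recovers the signal.
signal = [0.0, 1.0, 0.0, -1.0]
recovered = idft(dft(signal))
```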
S2513, convolving the inverse propagation gain time domain representation with the time domain representation of the audio to be played to obtain the preprocessed audio.
Specifically, the formula

y(n) = Σ_{i=0}^{N-1} x(i) · h(n − i)

is used to convolve the time domain representation of the inverse propagation gain with the audio to be played, wherein y(n) is the discretely convolved audio output signal, x(i) is the i-th discrete sample of the inverse propagation gain time domain representation, h(n − i) is the (n − i)-th discrete sample of the time domain representation of the audio to be played, and N is the total number of discrete samples.
In this step, a convolution operation is performed on the inverse propagation gain time domain representation and the time domain representation of the audio to be played, superimposing the two to form the preprocessed audio. When the preprocessed audio is played by the multimedia terminal and reaches the listening position, the propagation gain incurred along the way and the inverse propagation gain embedded in the preprocessed audio cancel each other, so that the user obtains the factory-set listening effect at the listening position.
This method of acquiring the preprocessed audio has fewer steps, but the convolution operation requires more computing power, so it is suitable for cases with a smaller periodic data volume.
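The discrete convolution of the first method can be sketched as follows (illustrative; the identity check uses a unit impulse, the degenerate case of an inverse gain that corrects nothing):

```python
def convolve(x, h):
    """Full linear convolution: y(n) = sum_i x(i) * h(n - i)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for i in range(len(x)):
            if 0 <= n - i < len(h):
                y[n] += x[i] * h[n - i]
    return y

# Convolving with a unit impulse leaves the audio unchanged.
audio = [0.5, -0.25, 0.125]
assert convolve([1.0], audio) == audio
```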
The second method includes:
S2521, acquiring the inverse propagation gain from the propagation gain.
Similarly, the propagation gain obtained in step S24 is negated to obtain the inverse propagation gain, i.e. Y(k) = -W(k) = P(k)+T(k)+N(k)+M(k)-C(k), where Y(k) is the inverse propagation gain, W(k) is the propagation gain, P(k) is the frequency domain representation of the original Audio audio_S, T(k) is the tuning gain of the equalizer, N(k) is the loudspeaker response gain at each audio frequency point, M(k) is the device gain of the recording device, and C(k) is the frequency domain representation of the recorded Audio audio_T.
S2522, converting the audio to be played from a time domain representation to a frequency domain representation.
In this embodiment, the conversion of the audio from the time domain to the frequency domain is implemented by the fast Fourier transform, specifically using the formula:

X(k) = Σ_{n=0}^{N-1} x(n) · e^{-j2πkn/N};

wherein X(k) is the spectral representation in the spectral analysis, representing the intensity and distribution of the audio signal in the frequency domain, and k represents the discrete frequency points; x(n) is the time domain representation of the audio signal, representing its variation over time, and n represents the discrete time points; e is the base of the natural logarithm, N is the number of signal sample points, π is the circular constant, and j is the imaginary unit.
Through the above operations, the time domain representation z(n) of the audio to be played can be converted into the frequency domain representation Z(k).
S2523, multiplying the frequency domain representation of the audio to be played by the inverse propagation gain frequency domain representation and converting the result into a time domain representation to obtain the preprocessed audio.
In this step, the frequency domain representation of the audio to be played is multiplied by the inverse propagation gain frequency domain representation, i.e. Z(k) × Y(k), to obtain the frequency domain representation F(k) of the preprocessed audio, which is then transformed into a time domain representation by the inverse fast Fourier transform for playing by the multimedia terminal.
Similarly, the audio carrying the inverse propagation gain reaches the listening position after being played by the multimedia terminal; the propagation gain and the inverse propagation gain cancel each other through superposition, so that the user obtains the factory-set listening effect at the listening position.
This method of acquiring the preprocessed audio has more steps, but the frequency domain multiplication requires less computing power, so it is preferable in cases with a larger periodic data volume.
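The second method can be sketched end to end as follows (illustrative, not the patent's implementation; note that plain per-bin multiplication corresponds to circular convolution, so a real implementation would zero-pad both signals first):

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def preprocess(audio, inverse_gain_ir):
    """Multiply Z(k) by Y(k) per bin, then transform back to the time domain."""
    F = [z * y for z, y in zip(dft(audio), dft(inverse_gain_ir))]
    return [f.real for f in idft(F)]

# An impulse response of [1, 0, 0, 0] has a flat spectrum Y(k) = 1,
# so the audio passes through unchanged.
audio = [0.5, -0.25, 0.125, 0.0]
out = preprocess(audio, [1.0, 0.0, 0.0, 0.0])
```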
Example 3
In the step of acquiring the environmental influence by recording audio, it is first necessary to determine the gain of the devices themselves, including the device gain of the multimedia terminal, of the detachable accessory with a recording function, and of the intelligent terminal. The multimedia terminal and its accessories can be calibrated at the factory, but intelligent terminals generally belong to users and vary in type and model, so their device gains differ. Therefore, the user needs to calibrate the device gain of the intelligent terminal himself. The specific method includes the following steps:
S31, recording, with the intelligent terminal placed close to the multimedia terminal, the original Audio audio_S1 played by the multimedia terminal to obtain the recorded Audio audio_T1, so as to eliminate the propagation gain.
As described in embodiment 2, from the audio being played by the multimedia terminal to its being recorded by the intelligent terminal there are a fixed gain and a propagation gain, wherein the fixed gain includes the device gain of the multimedia terminal and the device gain of the intelligent terminal. Within the fixed gain, the device gain of the multimedia terminal is known, so the device gain of the intelligent terminal can be obtained as long as the remaining unknown, the propagation gain, is eliminated.
In this implementation, the intelligent terminal is placed close to the multimedia terminal when recording audio in order to eliminate the propagation gain. As long as the two devices are close enough and nothing blocks the path between them, the propagation gain of the audio between them can be considered close to 0, which is accurate enough for the purposes of the invention.
S32, obtaining an intermediate gain S1 from the original Audio audio_S1 to the recorded Audio audio_T1.
As in embodiment 2, after the original Audio audio_S1 and the recorded Audio audio_T1 are obtained, noise reduction may be performed by means such as mean filtering, and the original Audio audio_S1 and the recorded Audio audio_T1 may be aligned so that they have the same start point and end point.
Then the original Audio audio_S1 and the recorded Audio audio_T1 are converted from the time domain representation into the frequency domain representation by the fast Fourier transform.
Finally, the intermediate gain S1 is obtained as the difference between the frequency domain representation of the recorded Audio audio_T1 and that of the original Audio audio_S1, i.e. intermediate gain S1 = frequency domain representation of audio_T1 − frequency domain representation of audio_S1.
For the specific methods of mean filtering, alignment and fast Fourier transform described above, refer to embodiment 2; they will not be described in detail in this embodiment.
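As a sketch of the mean filtering referred to above (illustrative; the window size and sample values are hypothetical, and claim 4 only requires the window to span a threshold number of audio periods):

```python
def mean_filter(x, window=3):
    """Moving-average noise reduction over a sliding window, shrunk at the
    edges so every output sample averages only real input samples."""
    half = window // 2
    out = []
    for n in range(len(x)):
        lo, hi = max(0, n - half), min(len(x), n + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

smoothed = mean_filter([0.0, 3.0, 0.0, 3.0, 0.0])
```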
S33, removing the device gain of the multimedia terminal from the intermediate gain S1 to obtain the device gain of the intelligent terminal.
After the intermediate gain S1 is obtained, the device gain of the intelligent terminal can be obtained by removing the device gain of the multimedia terminal. At this point both the intermediate gain S1 and the device gain of the multimedia terminal are known, so device gain of the intelligent terminal = intermediate gain S1 − device gain of the multimedia terminal.
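In dB terms this removal is again a per-bin subtraction; the sketch below uses made-up values:

```python
# Per-bin gains in dB; the values are illustrative.
intermediate_S1 = [-2.0, -3.5, -5.0]  # measured near-field gain S1(k)
terminal_gain   = [-1.5, -2.0, -3.0]  # factory-calibrated multimedia terminal gain

# Device gain of the intelligent terminal = S1 - device gain of the terminal.
smart_terminal_gain = [s - t for s, t in zip(intermediate_S1, terminal_gain)]
```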
Example 4
The sound effect space adaptation methods provided in embodiments 1 and 2 mainly eliminate the propagation gain for a single multimedia terminal; the sound effect space adaptation method of this embodiment mainly eliminates the propagation gains of a plurality of multimedia terminals.
In the case of two or more multimedia terminals, the propagation gain from each multimedia terminal to the listening position needs to be acquired separately. For the specific acquisition method, refer to embodiment 2; it will not be described in detail in this embodiment.
After the propagation gain from each multimedia terminal to the listening position is acquired, the audio played by each multimedia terminal is preprocessed based on its own propagation gain.
For example, suppose 4 networked speakers are sounding together, placed in four directions in a room. When sound effect space adaptation is performed, the propagation gain from each speaker to the listening position is obtained by the method provided in embodiment 2, and then the audio to be played by each speaker is preprocessed with that speaker's own propagation gain according to the method of embodiment 2, thereby eliminating the propagation gains of all 4 speakers.
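The per-speaker correction loop can be sketched as follows (illustrative; the speaker names, 2-bin gain values, and the simplified dB-addition stand-in for the embodiment-2 preprocessing are all assumptions):

```python
def preprocess(audio_db, inverse_gain_db):
    """Stand-in for the embodiment-2 preprocessing: in dB terms, loading
    the inverse gain is a per-bin addition."""
    return [a + g for a, g in zip(audio_db, inverse_gain_db)]

# One measured propagation gain W(k) per speaker (hypothetical 2-bin values).
speaker_gains = {
    "front": [-1.0, -2.0],
    "back":  [-0.5, -1.5],
    "left":  [-2.0, -1.0],
    "right": [-1.0, -1.0],
}
audio_db = [0.0, 0.0]
# Each speaker's feed is corrected with its own inverse gain Y(k) = -W(k).
feeds = {name: preprocess(audio_db, [-w for w in gains])
         for name, gains in speaker_gains.items()}
```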
Example 5
As shown in fig. 5, the present invention further provides a sound effect space adaptation system, which specifically includes:
The Audio acquisition module is used for recording the original Audio audio_S played by the multimedia terminal from the listening position to obtain the recorded Audio audio_T.
The function of this module is to obtain, from the user's listening position, a recording of the original Audio audio_S played by the multimedia terminal, namely the recorded Audio audio_T.
In this embodiment, two acquisition means are provided: first, recording and uploading the audio through a detachable accessory of the multimedia terminal with a recording function, such as a remote controller; second, recording and uploading the audio through an intelligent terminal such as a mobile phone or a tablet computer.
The specific method for obtaining the recorded audio_t is referred to step S21 in embodiment 2, and will not be described herein.
The propagation gain acquisition module calculates the propagation gain from the placement position of the multimedia terminal to the listening position using the original Audio audio_S and the recorded Audio audio_T.
The function of this module is to acquire the propagation gain from the placement position of the multimedia terminal to the listening position, so that the audio to be played can subsequently be preprocessed with this propagation gain, thereby eliminating it.
In this implementation, the specific method for acquiring the propagation gain includes:
First, the original Audio audio_s and the recorded Audio audio_t are respectively converted into frequency domain representations.
Then the propagation gain is acquired according to the frequency domain representation of the original Audio audio_S, the frequency domain representation of the recorded Audio audio_T, and the fixed gain; the fixed gain includes the device gain of the multimedia terminal and the device gain of the recording device.
For the specific method of acquiring the propagation gain, refer to step S24 in embodiment 2; it will not be repeated here.
The audio preprocessing module preprocesses the audio played by the multimedia terminal based on the propagation gain.
The function of this module is to preprocess the audio played by the multimedia terminal so that the influence of the propagation gain on the listening effect is eliminated when the audio is played.
In this embodiment, two preprocessing means are provided:
The first: acquiring the inverse propagation gain from the propagation gain; converting the inverse propagation gain into a time domain representation; and convolving the inverse propagation gain time domain representation with the time domain representation of the audio to be played to obtain the preprocessed audio.
The second: acquiring the inverse propagation gain from the propagation gain; converting the audio to be played from a time domain representation into a frequency domain representation; and multiplying the frequency domain representation of the audio to be played by the inverse propagation gain frequency domain representation, then converting the result into a time domain representation to obtain the preprocessed audio.
The specific method for preprocessing the audio is referred to step S25 in embodiment 2, and will not be described herein.
Through the above three modules, the audio to be played is preprocessed based on the propagation gain, so that the influence of the propagation gain on the audio quality is eliminated when the audio is played.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that the above-mentioned preferred embodiment should not be construed as limiting the invention, and the scope of the invention should be defined by the appended claims. It will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the spirit and scope of the invention, and such modifications and adaptations are intended to be comprehended within the scope of the invention.
Claims (10)
1. A sound effect space adaptation method, applied to sound effect adjustment of a multimedia terminal, characterized by comprising the following steps:
recording an original Audio audio_S played by the multimedia terminal from a listening position in a use scene of the multimedia terminal to obtain a recorded Audio audio_T;
calculating a propagation gain from the multimedia terminal placement position to the listening position using the original Audio audio_S and the recorded Audio audio_T;
and preprocessing the audio played by the multimedia terminal based on the propagation gain.
2. The sound effect space adaptation method according to claim 1, wherein the method for recording the original Audio audio_S played by the multimedia terminal from the listening position to obtain the recorded Audio audio_T comprises:
recording the audio through an intelligent terminal and uploading it; or
recording the audio through a detachable accessory of the multimedia terminal with a recording function and uploading it.
3. The sound effect space adaptation method according to claim 2, characterized in that:
in the case that the audio is recorded through the intelligent terminal, acquiring the device gain of the intelligent terminal comprises the following steps:
recording the original Audio audio_S1 played by the multimedia terminal with the intelligent terminal placed close to the multimedia terminal to obtain a recorded Audio audio_T1, so as to eliminate the propagation gain;
acquiring an intermediate gain S1 from the original Audio audio_S1 to the recorded Audio audio_T1;
and removing the device gain of the multimedia terminal from the intermediate gain S1 to obtain the device gain of the intelligent terminal.
4. The sound effect space adaptation method according to claim 1, characterized in that: after the recorded Audio audio_T is obtained, the recorded Audio audio_T is filtered; the filter window during the filtering process spans a threshold number of audio periods.
5. The sound effect space adaptation method according to claim 1, further comprising: aligning the recorded Audio audio_T with the original Audio audio_S so that the recorded Audio audio_T has the same start point and end point as the original Audio audio_S.
6. The sound effect space adaptation method according to claim 1, wherein the method for calculating the propagation gain from the multimedia terminal placement position to the listening position using the original Audio audio_S and the recorded Audio audio_T comprises:
converting the original Audio audio_S and the recorded Audio audio_T into frequency domain representations respectively;
and acquiring the propagation gain according to the frequency domain representation of the original Audio audio_S, the frequency domain representation of the recorded Audio audio_T and a fixed gain, wherein the fixed gain includes the device gain of the multimedia terminal and the device gain of the recording device.
7. The sound effect space adaptation method according to claim 1, wherein the method for preprocessing the audio played by the multimedia terminal based on the propagation gain comprises:
acquiring an inverse propagation gain from the propagation gain;
converting the inverse propagation gain into a time domain representation;
and convolving the inverse propagation gain time domain representation with the time domain representation of the audio to be played to obtain the preprocessed audio.
8. The sound effect space adaptation method according to claim 1, wherein the method for preprocessing the audio played by the multimedia terminal based on the propagation gain comprises:
acquiring an inverse propagation gain from the propagation gain;
converting the audio to be played from a time domain representation into a frequency domain representation;
and multiplying the frequency domain representation of the audio to be played by the inverse propagation gain frequency domain representation, then converting the result into a time domain representation to obtain the preprocessed audio.
9. The sound effect space adaptation method according to claim 1, characterized in that:
in the case of two or more multimedia terminals, the propagation gain from each multimedia terminal to the listening position is acquired respectively;
and the audio played by each multimedia terminal is preprocessed based on its propagation gain.
10. A sound effect space adaptation system, characterized by comprising:
an Audio acquisition module, configured to record the original Audio audio_S played by the multimedia terminal from the listening position to obtain a recorded Audio audio_T;
a propagation gain acquisition module, configured to calculate the propagation gain from the placement position of the multimedia terminal to the listening position using the original Audio audio_S and the recorded Audio audio_T;
and an audio preprocessing module, configured to preprocess the audio played by the multimedia terminal based on the propagation gain.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410339606.1A CN118262738A (en) | 2024-03-22 | 2024-03-22 | Sound effect space adaptation method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118262738A true CN118262738A (en) | 2024-06-28 |
Family
ID=91610476
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |