CN106057220B

CN106057220B - High-frequency extension method of audio signal and audio player

Info

Publication number: CN106057220B
Application number: CN201610340304.1A
Authority: CN
Inventors: 刘智成; 胡以璇
Original assignee: TCL Corp
Current assignee: TCL Corp
Priority date: 2016-05-19
Filing date: 2016-05-19
Publication date: 2020-01-03
Anticipated expiration: 2036-05-19
Also published as: CN106057220A

Abstract

The invention discloses a high-frequency extension method for an audio signal and an audio player, which extend the high-frequency content of an audio signal whose high-frequency part is missing and thereby meet users' demand for high-quality audio. An embodiment of the invention provides a high-frequency extension method for an audio signal, including: upsampling an original first audio signal to obtain a second audio signal; acquiring the low-band spectrum of the second audio signal and estimating the high-band spectrum of the second audio signal from the low-band spectrum to obtain a high-band spectral envelope; copying the low-band spectrum into the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal; and performing energy adjustment on the third audio signal to obtain a fourth audio signal.

Description

High-frequency extension method for an audio signal, and audio player

Technical field

The invention relates to the technical field of audio processing, and in particular to a high-frequency extension method for an audio signal and an audio player.

Background

The human ear can hear sound at frequencies from roughly 20 Hz to 20 kHz. Most people are not very sensitive to the high-frequency part of an audio signal; although the ear cannot hear sound above 20 kHz, some people can still perceive it. When audio files are stored and transmitted, the high-frequency components to which the ear is less sensitive are often removed to save space and improve transmission efficiency. This sacrifices some sound quality but greatly increases the compression ratio of the audio file, a trade-off that suited market demand. For example, most of the once-ubiquitous MP3 files drop the content above 16 kHz. The sound quality is somewhat lower, but this does not prevent ordinary listeners from enjoying the audio, and the files occupy less storage. MP3 thus spread along with the growth of the Internet and largely satisfied ordinary listeners' demand for music.

As living standards rise, listeners demand ever higher audio quality and increasingly prefer high-resolution, high-fidelity music with complete frequency content. Such high-quality audio files are often expensive, however, so ordinary users have a strong need for a reasonable way to extend the many free audio files that lack high-frequency components and thereby improve playback. Audio players in the prior art can only play such files as they are and cannot perform high-frequency extension on them.

Summary of the invention

Embodiments of the present invention provide a high-frequency extension method for an audio signal and an audio player, which extend the high-frequency content of an audio signal whose high-frequency part is missing and meet users' demand for high-quality audio.

To solve the above technical problem, the embodiments of the present invention provide the following technical solutions:

In a first aspect, an embodiment of the present invention provides a high-frequency extension method for an audio signal, including:

upsampling an original first audio signal to obtain a second audio signal;

acquiring the low-band spectrum of the second audio signal, and estimating the high-band spectrum of the second audio signal from the low-band spectrum to obtain a high-band spectral envelope;

copying the low-band spectrum into the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal;

performing energy adjustment on the third audio signal to obtain a fourth audio signal.

In a second aspect, an embodiment of the present invention further provides an audio player, including:

an upsampling module, configured to upsample an original first audio signal to obtain a second audio signal;

a high-frequency estimation module, configured to acquire the low-band spectrum of the second audio signal and to estimate the high-band spectrum of the second audio signal from the low-band spectrum to obtain a high-band spectral envelope;

a spectrum copying module, configured to copy the low-band spectrum into the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal;

an energy adjustment module, configured to perform energy adjustment on the third audio signal to obtain a fourth audio signal.

As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:

In the embodiments of the present invention, the original first audio signal is first upsampled to obtain the second audio signal. The low-band spectrum of the second audio signal is then acquired, and the high-band spectrum of the second audio signal is estimated from it to obtain the high-band spectral envelope. Next, the low-band spectrum is copied into the high band of the second audio signal according to the high-band spectral envelope to obtain the third audio signal. Finally, energy adjustment is performed on the third audio signal to obtain the fourth audio signal. In other words, after the first audio signal has been upsampled as the original signal, the high-band spectral envelope of the second audio signal is estimated, the low-band spectrum is copied into the high band of the second audio signal, and energy adjustment then yields the fourth audio signal. The fourth audio signal carries high-band spectral information, so playback quality improves when the audio player plays it.

Brief description of the drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can derive other drawings from them.

FIG. 1 is a schematic flow block diagram of a high-frequency extension method for an audio signal according to an embodiment of the present invention;

FIG. 2 is a schematic flow block diagram of another high-frequency extension method for an audio signal according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an application scenario of upsampling the first audio signal in an embodiment of the present invention;

FIG. 4 is a schematic flow block diagram of another high-frequency extension method for an audio signal according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the spectral envelope of the second audio signal according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a spectrum fitting process according to an embodiment of the present invention;

FIG. 7-a is a schematic structural diagram of an audio player according to an embodiment of the present invention;

FIG. 7-b is a schematic structural diagram of an upsampling module according to an embodiment of the present invention;

FIG. 7-c is a schematic structural diagram of another audio player according to an embodiment of the present invention;

FIG. 7-d is a schematic structural diagram of a high-frequency estimation module according to an embodiment of the present invention;

FIG. 7-e is a schematic structural diagram of another audio player according to an embodiment of the present invention.

Detailed description

To make the objects, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention fall within the protection scope of the present invention.

The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. Terms used in this way are interchangeable where appropriate; they merely distinguish objects with the same attributes when the embodiments of the present invention are described. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, system, product, or device comprising a series of units is not necessarily limited to those units but may include other units that are not expressly listed or that are inherent to such a process, method, product, or device.

Each is described in detail below.

One embodiment of the high-frequency extension method of the present invention can be applied where an audio player extends an audio signal whose high-frequency part is missing. Referring to FIG. 1, the high-frequency extension method for an audio signal provided by one embodiment of the present invention may include the following steps:

101. Upsample the original first audio signal to obtain a second audio signal.

In this embodiment of the present invention, the first audio signal is the original audio file input to the audio player, and it may be an audio signal whose high-frequency part is missing. If the audio player played the first audio signal directly, playback quality would be poor because the frequency content is incomplete; the high-frequency part can therefore be extended with the method provided by this embodiment. The first audio signal is a time-domain signal. Upsampling it interpolates additional values into the first audio signal and yields the second audio signal, which is extended in the time domain relative to the original first audio signal.

102. Acquire the low-band spectrum of the second audio signal, and estimate the high-band spectrum of the second audio signal from the low-band spectrum to obtain a high-band spectral envelope.

In this embodiment of the present invention, after the first audio signal has been converted into the second audio signal, the low-band spectrum is first extracted from the second audio signal. The low-band spectrum refers to the spectral components already present in the second audio signal; it carries spectral energy information, and the content carried by the second audio signal can be played from it. In general, the natural logarithm of the absolute amplitude of the low-band spectrum decreases linearly from low to high frequency. The high-band spectrum of the second audio signal can therefore be estimated from the low-band spectrum, producing the high-band spectral envelope of the second audio signal. This envelope is the curve of the natural logarithm of the absolute amplitude of the missing high-frequency part of the second audio signal as it varies over time. Estimating it determines the approximate region occupied by the high-frequency content missing from the second audio signal.
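The linear decay of the log-amplitude described in step 102 suggests a simple estimator. The following minimal sketch is an illustration, not the procedure claimed in the patent: it fits a least-squares line to ln|X(k)| over the low-band bins and extrapolates that line into the high band. The function names `fit_log_envelope` and `extrapolate_envelope`, the `eps` guard, and the synthetic data are all assumptions made for this example.

```python
import math


def fit_log_envelope(low_band):
    """Least-squares line through the points (k, ln|X(k)|) of the low band.

    Returns (slope, intercept). Hypothetical helper, not named in the source.
    """
    eps = 1e-12  # assumption: small guard so that empty bins do not hit log(0)
    ys = [math.log(abs(c) + eps) for c in low_band]
    n = len(ys)
    mean_k = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((k - mean_k) ** 2 for k in range(n))
    sxy = sum((k - mean_k) * (y - mean_y) for k, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_k
    return slope, intercept


def extrapolate_envelope(slope, intercept, start, stop):
    """Predicted ln|X(k)| for the high-band bins start..stop-1."""
    return [intercept + slope * k for k in range(start, stop)]


# Synthetic low band whose log-magnitude falls linearly with the bin index.
low = [math.exp(2.0 - 0.01 * k) for k in range(100)]
slope, intercept = fit_log_envelope(low)
high_env = extrapolate_envelope(slope, intercept, 100, 200)
```

On real music the fit would be noisier than on this synthetic input, so smoothing the log-magnitudes before fitting may be worthwhile.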

103. Copy the low-band spectrum into the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal.

In this embodiment of the present invention, after the high-band spectral envelope has been generated from the second audio signal, the low-band spectrum can be copied into the high band of the second audio signal according to that envelope. The second audio signal whose high band has been filled with spectral information is referred to as the third audio signal. Filling the high band of the second audio signal with the low-band spectrum effectively extends the missing high-frequency part, so that the third audio signal contains both the low-frequency and the high-frequency part and can deliver high-quality sound when played.

104. Perform energy adjustment on the third audio signal to obtain a fourth audio signal.

In this embodiment of the present invention, after the third audio signal has been obtained, its energy can also be adjusted. The spectrum filling of the preceding step places the low-band spectrum in the high band of the third audio signal, and the low-band spectrum usually has relatively high energy. If this high-energy spectrum were kept in the high band, the third audio signal would sound harsh on playback and be uncomfortable for the listener. The energy of the high-frequency part of the third audio signal therefore needs to be reduced to suit typical listeners. The energy adjustment can be performed in several ways, for example by directly reducing the energy of the high band of the third audio signal according to a preset ratio, or by setting an energy adjustment coefficient and adjusting the third audio signal according to that coefficient, thereby generating the fourth audio signal. Both the fourth and the third audio signal carry high-band spectral information. When the audio player plays the fourth audio signal, playback quality improves and the user can enjoy high-quality audio. In this embodiment, the coding technique on the audio player side extends the high-frequency content of an audio signal whose high-frequency part is missing, so the user can enjoy high-quality audio without paying high fees, simply by using the audio player provided by this embodiment, which makes that audio player highly competitive.
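As a rough illustration of the "preset ratio" variant of step 104, the sketch below scales every coefficient at or above a cutoff bin by a fixed gain. The names `attenuate_high_band` and `band_energy`, the cutoff index, and the gain of 0.5 are assumptions chosen for the example; the source does not specify a value.

```python
def attenuate_high_band(coeffs, cutoff_bin, gain):
    """Scale every coefficient at or above cutoff_bin by gain (0 < gain <= 1)."""
    if not 0.0 < gain <= 1.0:
        raise ValueError("gain must lie in (0, 1]")
    return [c if k < cutoff_bin else c * gain for k, c in enumerate(coeffs)]


def band_energy(coeffs, start, stop):
    """Sum of squared coefficients over bins start..stop-1."""
    return sum(c * c for c in coeffs[start:stop])


# Toy spectrum: 4 original low-band bins followed by 4 copied high-band bins.
spectrum = [1.0] * 8
adjusted = attenuate_high_band(spectrum, cutoff_bin=4, gain=0.5)
```

Because energy is the square of amplitude, an amplitude gain of 0.5 cuts the high-band energy to one quarter; the low band is left untouched.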

As the detailed description of the foregoing embodiment shows, the original first audio signal is first upsampled to obtain the second audio signal. The low-band spectrum of the second audio signal is then acquired, and the high-band spectrum of the second audio signal is estimated from it to obtain the high-band spectral envelope. Next, the low-band spectrum is copied into the high band of the second audio signal according to the high-band spectral envelope to obtain the third audio signal. Finally, energy adjustment is performed on the third audio signal to obtain the fourth audio signal. In other words, after the first audio signal has been upsampled as the original signal, the high-band spectral envelope of the second audio signal is estimated, the low-band spectrum is copied into the high band of the second audio signal, and energy adjustment then yields the fourth audio signal. The fourth audio signal carries high-band spectral information, so playback quality improves when the audio player plays it.

The foregoing embodiment introduced one high-frequency extension method for an audio signal provided by the present invention. Another embodiment is described next. Referring to FIG. 2, the high-frequency extension method for an audio signal provided by one embodiment of the present invention may include the following steps:

201. Divide the first audio signal into frames to obtain multiple frames of the first audio signal, where every two adjacent frames overlap.

In some embodiments of the present invention, the original first audio signal is a time-domain signal and is first divided into frames. For example, every N samples of the first audio signal can form one frame, which yields multiple frames of the first audio signal. Every two adjacent frames overlap, and the length of the overlap region between two adjacent frames is assumed to be 2L.
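The framing in step 201 can be sketched as follows, assuming a frame length of N samples and an overlap of 2L between neighbours, so that consecutive frames start N - 2L samples apart. The function name `split_frames` and the decision to drop a trailing partial frame are assumptions made for this illustration.

```python
def split_frames(signal, frame_len, half_overlap):
    """Split signal into frames of frame_len samples.

    Adjacent frames share 2 * half_overlap samples, so the hop between
    frame starts is frame_len - 2 * half_overlap.
    """
    hop = frame_len - 2 * half_overlap
    if hop <= 0:
        raise ValueError("the overlap must be shorter than the frame")
    frames = []
    start = 0
    # Assumption: a trailing partial frame is dropped rather than padded.
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames


# N = 8 and L = 2: frames start at samples 0, 4, 8, 12 and overlap by 4.
frames = split_frames(list(range(20)), frame_len=8, half_overlap=2)
```

A real player would also need a policy for the last partial frame, for example zero-padding it to N samples.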

202. Perform discrete cosine transform (DCT) processing on each frame of the first audio signal to obtain the DCT-processed first audio signal.

In some embodiments of the present invention, the first audio signal of each frame is converted to the DCT domain in turn. For example, the DCT of each frame can be computed with MATLAB, using the following formula:

$$X(k) = \omega(k) \sum_{n=1}^{N} x(n) \cos\left(\frac{\pi (2n-1)(k-1)}{2N}\right), \quad k = 1, 2, \ldots, N$$

where ω(k) denotes the transform factor, X(k) denotes the k-th DCT-processed value of the first audio signal, x(n) denotes the first audio signal before DCT processing, and N denotes the length of each frame of the first audio signal.

In some embodiments of the present invention, the transform factor ω(k) can be calculated by the following formula:

$$\omega(k) = \begin{cases} \dfrac{1}{\sqrt{N}}, & k = 1 \\ \sqrt{\dfrac{2}{N}}, & 2 \le k \le N \end{cases}$$

That is, for the first DCT-processed value of the first audio signal (k = 1), the transform factor ω(k) takes the value 1/√N, and for all other values of k it takes the value √(2/N).

Without limitation, the transform factor may also take other values, as long as it can be used to compute X(k).
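For readers without MATLAB, the transform of step 202 can be sketched in plain Python. The sketch assumes the orthonormal DCT-II convention of MATLAB's `dct`, which the text refers to, rewritten with 0-based indices so that the term (2n-1)(k-1) becomes (2n+1)k. The function name `dct_ortho` is an assumption, and the direct O(N²) sum is for illustration only.

```python
import math


def dct_ortho(x):
    """Orthonormal DCT-II of one frame, using 0-based indices.

    X[k] = w(k) * sum_n x[n] * cos(pi * (2n + 1) * k / (2N)),
    with w(0) = sqrt(1/N) and w(k) = sqrt(2/N) for k >= 1.
    """
    size = len(x)
    out = []
    for k in range(size):
        weight = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
        acc = sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * size))
            for n in range(size)
        )
        out.append(weight * acc)
    return out


frame = [1.0, 2.0, 3.0, 4.0]
coeffs = dct_ortho(frame)
```

With these weights the transform is orthonormal, so the sum of squares of `coeffs` equals that of `frame`. A production implementation would use a fast DCT rather than the direct sum.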

203. According to the sampling rate conversion ratio, append zeros to the end of each DCT-processed frame of the first audio signal to obtain the zero-padded first audio signal, where the sampling rate conversion ratio is the ratio of the target sampling rate to the original sampling rate.

In some embodiments of the present invention, after the DCT-processed first audio signal of each frame has been obtained, the number of zeros to append to the end of each frame can be calculated from a preconfigured sampling rate conversion ratio, which yields the zero-padded first audio signal. Specifically, the sampling rate conversion ratio can be the ratio of the target sampling rate to the original sampling rate, written I/D, where I denotes the target sampling rate and D denotes the original sampling rate. For example, when an audio file sampled at 48 kHz is upsampled to 192 kHz, the original sampling rate is 48 kHz and the target sampling rate is 192 kHz.

In some embodiments of the present invention, appending zeros to the end of each DCT-processed frame extends the first audio signal in the DCT domain. A DCT of N sample points produces N spectral lines. If the sampling rate conversion ratio is I/D, appending N₁-N zeros after X(k) gives the upsampled spectrum Y(k):

$$Y(k) = \begin{cases} X(k), & 1 \le k \le N \\ 0, & N < k \le N_1 \end{cases}$$

where N₁/N = I/D. For example, when audio sampled at 44.1 kHz is upsampled to 192 kHz, N₁/N = 192/44.1 = 640/147 ≈ 4.35374.
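The zero-padding of step 203 can be sketched as below, using exact rational arithmetic so that a frame length incompatible with the ratio I/D is rejected instead of being rounded. The function name `zero_pad_spectrum` and the choice to pass the rates in Hz are assumptions made for the example.

```python
from fractions import Fraction


def zero_pad_spectrum(coeffs, target_rate, source_rate):
    """Append N1 - N zeros to N DCT coefficients, where N1 / N = I / D."""
    n = len(coeffs)
    n1 = Fraction(target_rate, source_rate) * n
    if n1.denominator != 1:
        raise ValueError("N * I / D must be an integer")
    return list(coeffs) + [0.0] * (int(n1) - n)


# 44.1 kHz to 192 kHz reduces to 640/147, so N = 147 gives N1 = 640.
padded = zero_pad_spectrum([1.0] * 147, 192000, 44100)
```

`Fraction(192000, 44100)` reduces to 640/147, which matches the ratio given in the text.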

204. Perform inverse discrete cosine transform (IDCT) processing on the zero-padded first audio signal to obtain the IDCT-processed first audio signal.

In some embodiments of the present invention, because DCT processing was applied to the first audio signal in step 202, IDCT processing must be applied to the first audio signal after the zeros have been appended, which yields the IDCT-processed first audio signal.

For example, let the zero-padded first audio signal be Y(k). IDCT processing can be performed with MATLAB: the IDCT function is applied to Y(k) and the result is multiplied by a scaling factor, which yields the upsampled time-domain signal y(n).
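Steps 202 to 204 for a single frame can be sketched as follows. Assuming the orthonormal DCT convention of MATLAB's `dct` and `idct`, the amplitude-preserving scaling factor works out to sqrt(N₁/N); that value is derived here under this assumption and is not quoted from the source. The names `dct_ortho`, `idct_ortho`, and `upsample_frame` are likewise assumptions.

```python
import math


def _weight(k, size):
    """Orthonormal DCT weight: sqrt(1/N) for k = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)


def dct_ortho(x):
    """Orthonormal DCT-II (0-based indices), direct O(N^2) evaluation."""
    size = len(x)
    return [
        _weight(k, size)
        * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * size))
              for n in range(size))
        for k in range(size)
    ]


def idct_ortho(spectrum):
    """Inverse of dct_ortho (orthonormal DCT-III)."""
    size = len(spectrum)
    return [
        sum(_weight(k, size) * spectrum[k]
            * math.cos(math.pi * (2 * n + 1) * k / (2 * size))
            for k in range(size))
        for n in range(size)
    ]


def upsample_frame(x, new_len):
    """DCT, zero-pad to new_len coefficients, IDCT, then rescale."""
    old_len = len(x)
    if new_len < old_len:
        raise ValueError("new_len must be at least len(x)")
    padded = dct_ortho(x) + [0.0] * (new_len - old_len)
    scale = math.sqrt(new_len / old_len)  # assumed amplitude-preserving factor
    return [scale * v for v in idct_ortho(padded)]


# One DCT basis function sampled at 8 points, upsampled by 4 to 32 points.
frame = [math.cos(math.pi * (2 * n + 1) / 16) for n in range(8)]
upsampled = upsample_frame(frame, 32)
```

For this input the output is the same cosine sampled on the finer grid, which is a quick way to check that the scaling restores the original amplitude.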

205. Join the IDCT-processed first audio signals of all frames end to end to obtain the second audio signal.

In some embodiments of the present invention, once the IDCT-processed first audio signal of each frame has been computed as described above, the IDCT-processed first audio signals of all frames are available. Adjacent frames are then spliced end to end, and joining all of the IDCT-processed frames yields the second audio signal.

It should be noted that the foregoing steps describe in detail one way of upsampling the original first audio signal to obtain the second audio signal. Without limitation, embodiments of the present invention may also use other upsampling methods to generate the second audio signal. High-frequency extension begins with audio upsampling, for example from 48 kHz to 192 kHz, and this upsampling could be implemented in the time domain with a finite impulse response (FIR) filter. Compared with the foregoing embodiment, however, that approach consumes more hardware resources, whereas the zero-padding in the DCT domain used in the example above saves hardware resources. In addition, frame overlapping is used when a long audio sequence is upsampled in the frequency domain. The following example shows how to choose a frame length and an overlap length such that the error in the total sequence length is reduced to zero.

In some embodiments of the present invention, after step 204 has performed IDCT processing on the zero-padded first audio signal and obtained the IDCT-processed first audio signal, the high-frequency extension method provided by the embodiment may further include the following step:

A1. Cut off the head and the tail of the IDCT-processed first audio signal to obtain the first audio signal with its head and tail removed.

When the IDCT-processed first audio signal is a very long sequence, its head and tail can be cut off so that the overlap regions do not compress or stretch the upsampled time-domain signal. This yields the first audio signal with its head and tail removed.

Further, in some embodiments of the present invention, the head and the tail cut from the IDCT-processed first audio signal in step A1 both have length L₁, where L₁ is calculated by the following formula:

$$L_1 = \frac{I}{D} \times L$$

Here L₁ is a positive integer, 2L is the length of the overlap region between two adjacent frames of the first audio signal, I denotes the target sampling rate, and D denotes the original sampling rate. FIG. 3 shows an application scenario of upsampling the first audio signal in an embodiment of the present invention. As an example, the first audio signal is the original audio sequence, and frames P, P+1, and P+2 each contain N points. The head and tail of each frame overlap the adjacent frames by 2L. After upsampling, each frame of the first audio signal contains N₁ points, and the discarded region at the head and at the tail of each frame has length L₁. The L₁ samples at each end of every frame y(n) are removed, where L₁ = I/D × L, and the frames are then joined end to end to form the final upsampled long-sequence signal.

It should be noted that the discard length L₁ and the overlap 2L must not be fractional, that is, I×L must be an integer multiple of D. For example, when audio sampled at 44.1 kHz is up-sampled to 192 kHz, the up-sampling ratio is I/D = 192/44.1 = 640/147, so N and L must be multiples of 147 to ensure that N₁ and L₁ are integers; a frame length N = 2352 with L = 147 is one suitable pair of values. If fractional values occur, the up-sampled time-domain signal is slightly compressed or stretched, and the effect grows with the sequence length.
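The integrality condition above can be checked with exact rational arithmetic. The following is a minimal sketch; the function name `upsampled_lengths` and the choice to express the sampling rates in Hz are illustrative assumptions, not part of the described method.

```python
from fractions import Fraction


def upsampled_lengths(n, l, target_rate, source_rate):
    """Return (N1, L1) for frame length n and half-overlap l.

    Raises ValueError when either value would be fractional, which is the
    case the text warns about (compression or stretching after splicing).
    """
    ratio = Fraction(target_rate, source_rate)  # I/D reduced to lowest terms
    n1 = ratio * n
    l1 = ratio * l
    if n1.denominator != 1 or l1.denominator != 1:
        raise ValueError("N and L must make N*I/D and L*I/D integers")
    return int(n1), int(l1)


# Example from the text: 44.1 kHz -> 192 kHz, N = 2352, L = 147.
print(Fraction(192000, 44100))                      # 640/147
print(upsampled_lengths(2352, 147, 192000, 44100))  # (10240, 640)
```

With these values each up-sampled frame has N₁ = 10240 points, of which L₁ = 640 are discarded at each end.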

In the implementation scenario in which step A1 is performed, step 205, in which the IDCT-processed first audio signals of all frames are connected head to tail to obtain the second audio signal, is specifically:

B1. Connect the first audio signals of all frames, with their heads and tails trimmed, head to tail to obtain the second audio signal.

That is, if the head and tail of the first audio signal have been trimmed, the splicing is performed on the newly formed head and tail of each frame, and the second audio signal is generated once the first audio signals of adjacent frames have been spliced.
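Steps A1 and B1 can be sketched as follows, assuming each IDCT-processed frame is available as a Python list of samples; the function name `trim_and_splice` is hypothetical.

```python
def trim_and_splice(frames, l1):
    """Drop l1 samples from the head and tail of every frame, then concatenate."""
    out = []
    for frame in frames:
        if len(frame) <= 2 * l1:
            raise ValueError("frame is too short for the requested trim length")
        out.extend(frame[l1:len(frame) - l1])
    return out


# Two toy frames of 6 samples each, trimmed by 2 samples at both ends.
print(trim_and_splice([[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]], 2))  # [2, 3, 8, 9]
```

Each frame contributes N₁ - 2L₁ samples to the result, which matches the up-sampled frame hop when the original frames overlap by 2L.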

206. Acquire the low-band spectrum of the second audio signal, and estimate the high-band spectrum of the second audio signal according to the low-band spectrum to obtain a high-band spectral envelope.

207. Copy the low-band spectrum to the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal.

208. Perform energy adjustment on the third audio signal to obtain a fourth audio signal.

In some embodiments of the present invention, steps 206 to 208 may be performed after step 205. Steps 206 to 208 are implemented similarly to steps 102 to 104 in the foregoing embodiment, and details are not repeated here.

As the foregoing detailed description shows, in this embodiment the first audio signal is DCT-processed as the original signal and zeros are then appended, which extends the first audio signal; after IDCT processing the frames are spliced head to tail to obtain the second audio signal. The whole up-sampling process occupies few hardware resources, and the error in the total sequence length of the resulting second audio signal can be reduced to 0. The high-band spectral envelope of the second audio signal is estimated, the low-band spectrum is copied into the high band of the second audio signal, and energy adjustment is then performed to obtain the fourth audio signal. The fourth audio signal carries high-band spectral information, so the playback quality improves when an audio player plays it.

The foregoing embodiment describes one high-frequency extension method for an audio signal provided by the present invention. Another embodiment is described next. Referring to FIG. 4, a high-frequency extension method for an audio signal provided by an embodiment of the present invention may include the following steps:

401. Up-sample the original first audio signal to obtain a second audio signal.

In some embodiments of the present invention, step 401 is implemented similarly to step 101 in the foregoing embodiment, and details are not repeated here.

402. Divide the second audio signal into frames to obtain multiple frames of the second audio signal.

In some embodiments of the present invention, after the second audio signal is obtained it is divided into frames, for example into multiple frames, and the generation of the high-band spectral envelope described in steps 403 to 405 below is performed on the second audio signal of each frame.

403. Perform modified discrete cosine transform (MDCT) processing on the second audio signal of each frame to obtain an MDCT-processed second audio signal.

In some embodiments of the present invention, the second audio signal of each frame is converted to the MDCT frequency domain to obtain the MDCT-processed second audio signal. The MDCT converts the signal from the time domain to the frequency domain, so the MDCT-processed second audio signal is a frequency-domain signal containing multiple spectral components.

404. Use the random sample consensus (RANSAC) algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-band spectrum of the MDCT-processed second audio signal, to obtain a low-band spectral envelope, where the low-band spectrum is the spectral segment below the effective cutoff frequency of the MDCT-processed second audio signal.

In some embodiments of the present invention, after the MDCT-processed second audio signal is obtained, the RANSAC algorithm may be used to fit a straight line to the natural logarithm of the absolute amplitude of its low-band spectrum, to obtain the low-band spectral envelope. The low-band spectrum can be determined by calculating the effective cutoff frequency of the second audio signal: the spectral segment below the effective cutoff frequency of the MDCT-processed second audio signal constitutes the low-band spectrum.

405. Estimate the high-band spectrum of the MDCT-processed second audio signal according to the straight-line equation corresponding to the low-band spectral envelope, to obtain the high-band spectral envelope.

In some embodiments of the present invention, after the straight-line equation corresponding to the low-band spectral envelope has been calculated, the high-band spectrum of the MDCT-processed second audio signal can be estimated from that equation, which generates the high-band spectral envelope of the second audio signal. The high-band spectral envelope may include the high-frequency components of the second audio signal that need to be extended.

For example, audio coding and compression schemes may be based on the MDCT. FIG. 5 is a schematic diagram of the spectral envelope of the second audio signal provided by an embodiment of the present invention. In the MDCT frequency domain, the envelope of ln|X[n]| decreases roughly linearly, and the high-frequency part missing from the second audio signal is taken to follow the same rule. Specifically, the second audio signal obtained after up-sampling (a time-domain signal) is first divided into frames, and each frame is converted to the MDCT frequency domain to obtain X[n]. Zeros in X[n] may be replaced with a small positive value, 10^-9, so that the corresponding ln|X[n]| can be computed in subsequent steps. The effective cutoff frequency is then obtained, and the ln|X[n]| values of some spectral lines below the cutoff frequency are selected for straight-line fitting to give the low-band spectral envelope. In FIG. 5, the vertical dashed lines are the spectral lines of the extended high-band spectrum at multiple frequency points, the value of each spectral line being the natural logarithm of the absolute amplitude at its frequency point, and the oblique dashed line is the spectral envelope obtained by straight-line fitting. To fit the line, the spectral lines of the ln|X[n]| curve below the effective cutoff frequency are first divided into several segments, the maximum of each segment is found, and the line is fitted to the points corresponding to these maxima. Let the fitted line be y = a*x + b, where x is the spectral line index, y is the natural logarithm of the absolute value of the spectral line, a is the slope, and b is the intercept. In most cases the fitted slope is negative, that is, a < 0. The RANSAC algorithm may be used when fitting the line to the segment maxima, which handles outliers effectively. FIG. 6 is a schematic diagram of a spectrum fitting process provided by an embodiment of the present invention. In FIG. 6, points marked "*" are the maxima within the spectral groups, points marked with circles are the group boundary points, and the dash-dotted line is the line fitted by the RANSAC method.
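The segment-maximum selection and RANSAC line fit described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the iteration count `n_iter`, the inlier tolerance `tol`, the fixed random seed, and the final least-squares refit on the inliers are all assumptions.

```python
import random


def segment_maxima(log_mag, k_c, n_segments):
    """Split lines 0..k_c-1 into segments and return (index, value) of each maximum."""
    size = k_c // n_segments
    points = []
    for s in range(n_segments):
        lo = s * size
        hi = k_c if s == n_segments - 1 else lo + size
        k = max(range(lo, hi), key=lambda i: log_mag[i])
        points.append((k, log_mag[k]))
    return points


def least_squares(points):
    """Ordinary least-squares line y = a*x + b through the given points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n


def ransac_line(points, n_iter=200, tol=0.5, seed=0):
    """Fit y = a*x + b with RANSAC, then refit on the largest inlier set."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best):
            best = inliers
    return least_squares(best)


# Synthetic maxima on y = -0.5*x + 3 with one outlier at x = 5.
pts = [(x, -0.5 * x + 3.0) for x in range(10)]
pts[5] = (5, 20.0)
a, b = ransac_line(pts)
print(round(a, 6), round(b, 6))
```

On this synthetic data the outlier is excluded from the consensus set, so the recovered slope and intercept match the underlying line, whereas a plain least-squares fit over all ten points would be pulled upward by the outlier.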

In some embodiments of the present invention, after step 404 uses the RANSAC algorithm to fit a straight line to the low-band spectrum of the MDCT-processed second audio signal and obtains the low-band spectral envelope, the high-frequency extension method provided by the embodiment may further include the following steps:

C1. Determine a correction parameter for the straight-line equation according to a preset minimum value point in the spectral coordinate system and the point of the effective cutoff frequency in the spectral coordinate system;

C2. Adjust the parameters of the straight-line equation corresponding to the low-band spectral envelope according to the correction parameter.

In the above embodiment, when the slope of the straight-line equation corresponding to the low-band spectral envelope is positive, the parameters of the equation may be corrected. The correction parameter is first calculated as in step C1: it is determined by two points, the minimum value point in the spectral coordinate system and the point of the effective cutoff frequency in the spectral coordinate system, and the slope of the line through these two points is used as the correction parameter. For example, suppose the fitted line is y = a*x + b. In most cases the fitted slope is negative (a < 0), but in some frames a > 0, and a and b then need to be corrected as follows: a minimum value point (kNum, ln(2^(-MBits))) is set at the far right of the spectral coordinate system, the correction parameter determined by this point and the effective cutoff frequency point (k_c, ln|X[k_c]|) is computed, and this correction parameter replaces the original parameter a. Here kNum is the total number of spectral lines and MBits is the sample bit depth, typically 16 or 24.
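A sketch of steps C1 and C2 follows. The text specifies only how the slope a is replaced; recomputing the intercept b so that the corrected line passes through the effective cutoff point is an assumption made here for illustration, as is the function name `correct_line`.

```python
import math


def correct_line(a, b, k_c, log_mag_kc, k_num, mbits):
    """Replace a positive fitted slope by the slope through two reference points.

    The reference points are the cutoff point (k_c, ln|X[k_c]|) and the
    minimum value point (k_num, ln(2**-mbits)).
    """
    if a <= 0:
        return a, b  # the fitted line already decreases; keep it
    floor = math.log(2.0 ** -mbits)
    a_new = (floor - log_mag_kc) / (k_num - k_c)
    # Assumption: choose b so that the corrected line passes through the cutoff point.
    b_new = log_mag_kc - a_new * k_c
    return a_new, b_new


a_new, b_new = correct_line(0.1, -5.0, k_c=100, log_mag_kc=-3.0, k_num=1024, mbits=16)
print(a_new < 0)
```

Under this assumption the corrected line reaches ln(2^(-MBits)) exactly at spectral line kNum.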

Further, in some embodiments of the present invention, the effective cutoff frequency of the MDCT-processed second audio signal may be obtained as follows:

D1. Read T pre-buffered frames of the second audio signal, and obtain, for each of the T frames, the spectral lines of the second audio signal corresponding to multiple frequency points, where T is a natural number;

D2. For each of the T frames, determine the effective cutoff frequency of the frame in the same way as for the first frame, as follows: search from the last spectral line of the first frame toward lower frequencies, and take the frequency corresponding to the first spectral line as the effective cutoff frequency of the first frame, where the first spectral line is the first line encountered, among the spectral lines of the second audio signal of the first frame, whose natural logarithm of absolute amplitude is greater than a preset threshold, and the first frame is any one of the T frames;

D3. After the effective cutoff frequency of each of the T frames has been obtained, take the maximum of the effective cutoff frequencies of the T frames as the global cutoff frequency;

D4. Search the spectral lines of the MDCT-processed second audio signal from the global cutoff frequency toward lower frequencies, and take the frequency point of the first spectral line whose natural logarithm of absolute amplitude is greater than another preset threshold as the effective cutoff frequency of the MDCT-processed second audio signal.

The second audio signal of T frames is buffered in advance. For example, with T = 25, the 25 buffered frames of the second audio signal are read first, and MDCT processing yields, for each of the 25 frames, the spectral lines corresponding to multiple frequency points. As shown in FIG. 5, for one frame of the second audio signal every vertical line (dashed or solid) is a spectral line, and an effective cutoff frequency can be determined for each of the 25 frames. In step D2, taking the first frame as an example and referring to FIG. 5, the spectral lines of the first frame are searched from the last one toward lower frequencies; the first line found whose natural logarithm of absolute amplitude exceeds a preset threshold is the first spectral line, and its frequency point is the effective cutoff frequency of the first frame. The maximum of the effective cutoff frequencies of the 25 frames is then selected as the global cutoff frequency. The preset threshold may be calculated as ln(2^(-MBits)) + q, where q is an adjustable parameter and MBits is the sample bit depth; its specific value may also be determined according to the application scenario, which is not described further here. In step D4, for example, the effective cutoff frequency of the MDCT-processed second audio signal is computed frame by frame, starting from the first of all frames: the search runs from the global cutoff frequency toward lower frequencies, and the frequency point of the first spectral line whose natural logarithm of absolute amplitude exceeds another preset threshold is taken as the effective cutoff frequency k_c of that frame. This other preset threshold may be calculated as ln(2^(-MBits)) + p, where p is an adjustable parameter. The purpose is to avoid searching every frame from its last spectral line, which saves search time and improves speed.
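Steps D1 to D4 can be sketched as below. The sketch works with spectral line indices rather than frequencies in Hz and returns index 0 when no line exceeds the threshold; both choices, and all function names, are assumptions for illustration.

```python
import math


def last_line_above(log_mag, threshold, start=None):
    """Search from index `start` (default: last line) toward index 0.

    Returns the first index whose ln|X[k]| exceeds the threshold, or 0 if none does.
    """
    if start is None:
        start = len(log_mag) - 1
    for k in range(start, -1, -1):
        if log_mag[k] > threshold:
            return k
    return 0


def global_cutoff(frames_log_mag, mbits, q):
    """D2 and D3: per-frame cutoff over the buffered frames, then the maximum."""
    threshold = math.log(2.0 ** -mbits) + q
    return max(last_line_above(f, threshold) for f in frames_log_mag)


def frame_cutoff(log_mag, k_global, mbits, p):
    """D4: search one frame starting at the global cutoff instead of the last line."""
    threshold = math.log(2.0 ** -mbits) + p
    return last_line_above(log_mag, threshold, start=k_global)


buffered = [[-1.0, -2.0, -3.0, -20.0, -20.0],
            [-1.0, -2.0, -3.0, -4.0, -20.0]]
k_g = global_cutoff(buffered, mbits=16, q=2.0)
print(k_g, frame_cutoff(buffered[0], k_g, mbits=16, p=2.0))  # 3 2
```

For 16-bit samples ln(2^(-16)) is about -11.09, so with q = p = 2 the thresholds are about -9.09; the global cutoff index is 3 and the first frame's own cutoff index is 2.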

It should be noted that the foregoing steps describe in detail one way of acquiring the low-band spectrum of the second audio signal and estimating its high-band spectrum from the low-band spectrum to obtain the high-band spectral envelope; without limitation, embodiments of the present invention may generate the high-band spectral envelope of the second audio signal in other ways. For example, the original audio signal is up-sampled to obtain the audio sequence x[n], which is converted to the MDCT frequency domain to obtain X[n]; ln|X[n]| is then computed, the spectral envelope below the effective cutoff frequency k_c is divided into groups, and a straight line is fitted by the least squares method. In the foregoing embodiment, however, the RANSAC algorithm is used to fit a line to the maximum point of each group to obtain the line parameters, and the missing high-frequency part is then extended according to these parameters. Because frequency-domain waveforms vary widely, least squares fitting is easily affected by outliers, which can cause a large deviation in the fitted line and poor sound quality after high-frequency extension, whereas the RANSAC algorithm copes effectively with outlier interference and improves the sound quality.

406. Copy the low-band spectrum to the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal.

407. Perform energy adjustment on the third audio signal to obtain a fourth audio signal.

In some embodiments of the present invention, steps 406 and 407 may be performed after step 405. Steps 406 and 407 are implemented similarly to steps 103 and 104 in the foregoing embodiment, and details are not repeated here.

In some embodiments of the present invention, step 406, in which the low-band spectrum is copied to the high band of the second audio signal according to the high-band spectral envelope to obtain the third audio signal, may specifically include the following steps:

E1. Divide the low-band spectrum into multiple spectral line segments according to the effective cutoff frequency of the second audio signal;

E2. Copy the multiple spectral line segments in sequence to the high band of the second audio signal to obtain the third audio signal.

The low-band spectrum needs to be copied when performing high-frequency extension. In this embodiment the band is copied in a cyclic mirror manner, so that the junctions of the copied bands are relatively smooth, which helps improve sound quality. For example, the low-band spectrum is copied by mirroring, with the mirror axes located successively at spectral lines k_c + n*U (U is the number of spectral lines copied, and n = 0, 1, 2, 3, … is the index of the symmetry axis). Spectral lines k_c+1, k_c+2, …, k_c+U are copied from lines k_c-1, k_c-2, …, k_c-U respectively; lines k_c+U+1, k_c+U+2, …, k_c+2U are then copied from lines k_c+(U-1), k_c+(U-2), …, k_c+1, k_c; and so on. The purpose of mirror copying is to avoid, as far as possible, jumps between the spectral lines on the two sides of each junction. Without limitation, in other embodiments a segment of spectrum may instead be selected from the low band and copied repeatedly into the high-frequency region by translation until it is filled; however, if the spectral lines at the two ends of the copied band differ greatly, jumps occur at the junctions of the copied bands and degrade the sound quality. Steps E1 to E2 avoid this by copying in a cyclic mirror manner.
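The cyclic mirror copy can be sketched as follows, following the index pattern given above (mirror axes at k_c + n*U). The function name `mirror_copy` is hypothetical, and the spectrum is assumed to be a list in which index k_c is the effective cutoff line and k_c - U is at least 0.

```python
def mirror_copy(spectrum, k_c, u):
    """Fill lines above k_c by mirroring about the axes k_c, k_c + u, k_c + 2u, ..."""
    if k_c - u < 0:
        raise ValueError("need at least u spectral lines below the cutoff")
    out = list(spectrum)
    axis = k_c
    while axis < len(out) - 1:
        for j in range(1, u + 1):
            dst = axis + j
            if dst >= len(out):
                break
            out[dst] = out[axis - j]  # line axis+j mirrors line axis-j
        axis += u
    return out


# Cutoff at index 4, U = 3; the zeros above the cutoff are the missing high band.
print(mirror_copy([0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0], k_c=4, u=3))
# [0, 1, 2, 3, 4, 3, 2, 1, 2, 3, 4, 3]
```

In the output the values on either side of every axis are neighbours in the original low band, so no jump appears at the junctions, which is the property the text attributes to mirror copying.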

In some embodiments of the present invention, step 407, in which energy adjustment is performed on the third audio signal to obtain the fourth audio signal, may specifically include the following steps:

F1. Divide the third audio signal into S spectral line segments according to the effective cutoff frequency and the signal termination frequency of the third audio signal, where each spectral line segment includes w spectral lines, and S and w are natural numbers;

F2. Adjust the energy of each spectral line in the S spectral line segments as follows:

X′[n] = X[n] × α_i,

n = k_c + i×w ~ k_c + (i+1)×w - 1, i = 0 ~ S-1,

where X′[n] denotes the fourth audio signal, X[n] denotes the third audio signal, α_i denotes the energy adjustment coefficient, E_i denotes the energy of each segment of spectral lines before adjustment, P_i denotes the pseudo energy filled into the high band of the second audio signal, k_c denotes the effective cutoff frequency of the third audio signal, and a and b denote the parameters of the straight-line equation of the third audio signal.

Specifically, after the spectral lines have been copied as described above, the range from the effective cutoff frequency k_c to the signal termination frequency end may be divided into S spectral line segments of w spectral lines each, and energy adjustment is performed on each segment separately. The pre-adjustment energy E_i and the pseudo energy P_i of each segment are calculated first, the energy adjustment coefficient α_i of each segment is then calculated, and finally each spectral line in a segment is multiplied by the corresponding coefficient to obtain the adjusted spectral line value. It should be noted that this is only one feasible implementation of energy adjustment. Without limitation, the energy adjustment coefficient in embodiments of the present invention may be obtained in other ways, for example by further correcting the above coefficient with a correction factor. In addition, the fourth audio signal X′[n] is not limited to the above example; the third audio signal may also be modified according to a fixed energy adjustment value, set in detail according to the application scenario, which is not limited here.
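The per-segment scaling X′[n] = X[n] × α_i can be sketched as below. The expressions for E_i, P_i and α_i are not given in this text, so the definitions used here are assumptions chosen only to illustrate the flow: E_i is taken as the sum of squared spectral values in segment i, P_i as the sum of exp(2(a·n + b)) over the same lines (the energy implied by the fitted envelope), and α_i as sqrt(P_i / E_i). The function name `adjust_energy` is hypothetical.

```python
import math


def adjust_energy(spectrum, k_c, s, w, a, b):
    """Scale each of the s segments of w lines starting at k_c by its own alpha_i."""
    out = list(spectrum)
    for i in range(s):
        lo = k_c + i * w
        hi = min(lo + w, len(out))
        # Assumed definitions (not taken from the source text):
        e_i = sum(v * v for v in out[lo:hi])                            # energy before adjustment
        p_i = sum(math.exp(2.0 * (a * n + b)) for n in range(lo, hi))   # envelope ("pseudo") energy
        alpha_i = math.sqrt(p_i / e_i) if e_i > 0 else 0.0
        for n in range(lo, hi):
            out[n] *= alpha_i
    return out


x3 = [1.0, 1.0, 1.0, 1.0, 0.5, -0.5, 0.25, 0.25]
x4 = adjust_energy(x3, k_c=4, s=2, w=2, a=-0.1, b=0.0)
print(x4[:4] == x3[:4])  # lines below the cutoff are left unchanged
```

Under these assumed definitions the energy of each adjusted segment equals the energy of the fitted envelope over that segment, while the sign of every copied spectral line is preserved.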

It should be noted that, in some embodiments of the present invention, MDCT processing is performed on the second audio signal of each frame in step 403 to obtain the MDCT-processed second audio signal, so the fourth audio signal obtained after the energy adjustment in step 407 may be a frequency-domain signal. The fourth audio signal may then be further processed by the inverse modified discrete cosine transform (IMDCT), and the IMDCT-processed fourth audio signals of the individual frames are combined to obtain a complete time-domain fourth audio signal. This fourth audio signal can be output to the user for audio file playback, giving the user a high-quality audio experience.

As the foregoing detailed description shows, in this embodiment the RANSAC algorithm may be used when estimating the high-band spectral envelope of the second audio signal, which copes effectively with outlier interference and improves the sound quality. The low-band spectrum is copied into the high band of the second audio signal, and energy adjustment is then performed to obtain the fourth audio signal. The fourth audio signal carries high-band spectral information, so the playback quality improves when an audio player plays it.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention certain steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.

To facilitate better implementation of the above solutions of the embodiments of the present invention, related apparatus for implementing the above solutions is also provided below.

Referring to FIG. 7-a, an audio player 700 provided by an embodiment of the present invention may include an up-sampling module 701, a high-frequency estimation module 702, a spectrum copy module 703, and an energy adjustment module 704, wherein:

the up-sampling module 701 is configured to up-sample an original first audio signal to obtain a second audio signal;

the high-frequency estimation module 702 is configured to acquire a low-band spectrum of the second audio signal, and estimate a high-band spectrum of the second audio signal according to the low-band spectrum to obtain a high-band spectral envelope;

the spectrum copy module 703 is configured to copy the low-band spectrum to the high band of the second audio signal according to the high-band spectral envelope to obtain a third audio signal;

the energy adjustment module 704 is configured to perform energy adjustment on the third audio signal to obtain a fourth audio signal.

In some embodiments of the present invention, as shown in FIG. 7-b, the upsampling module 701 includes:

a first framing module 7011, configured to divide the first audio signal into frames to obtain multiple frames of the first audio signal, where the first audio signals of every two adjacent frames overlap;

a DCT processing module 7012, configured to perform discrete cosine transform (DCT) processing on the first audio signal of each frame separately to obtain a DCT-processed first audio signal;

an extension module 7013, configured to append zeros to the tail end of each frame of the DCT-processed first audio signal according to a sampling rate conversion ratio to obtain a zero-padded first audio signal, where the sampling rate conversion ratio is the ratio of the target sampling rate to the original sampling rate;

an IDCT processing module 7014, configured to perform inverse discrete cosine transform (IDCT) processing on the zero-padded first audio signal to obtain an IDCT-processed first audio signal;

a splicing module 7015, configured to join the IDCT-processed first audio signals of all frames head to tail to obtain the second audio signal.
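The DCT, zero-padding, and IDCT steps performed by modules 7012 to 7014 can be sketched for a single frame as follows. This is a minimal pure-Python illustration, not the patented implementation: the function names are hypothetical, the transforms use the orthonormal DCT-II/DCT-III pair, and the gain factor sqrt(M/N) that keeps the amplitude unchanged is an assumption, since the text does not state a normalization.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a list of samples (direct O(N^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def upsample_frame(frame, target_rate, original_rate):
    """Upsample one frame by appending zeros to its DCT coefficients."""
    n = len(frame)
    m = n * target_rate // original_rate      # frame length after conversion
    if m < n:
        raise ValueError("this sketch only handles upsampling")
    coeffs = dct2(frame) + [0.0] * (m - n)    # zeros appended at the tail end
    gain = math.sqrt(m / n)                   # assumed amplitude-preserving scale
    return idct2([gain * c for c in coeffs])
```

For example, `upsample_frame(frame, 96000, 48000)` doubles the frame length. Because the appended coefficients are zero, the output contains no energy above the original band, which is why the later high-frequency estimation and copy steps are needed.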

In some embodiments of the present invention, as shown in FIG. 7-c, the audio player 700 may further include the following module:

a cutting module 705, configured to, after the IDCT processing module performs IDCT processing on the zero-padded first audio signal to obtain the IDCT-processed first audio signal, cut off the head end and the tail end of the IDCT-processed first audio signal to obtain a first audio signal with its head and tail ends removed.

In this case, the splicing module 7015 is specifically configured to join the first audio signals of all frames, with their head and tail ends removed, head to tail to obtain the second audio signal.

In some embodiments of the present invention, the head end and the tail end cut off from the IDCT-processed first audio signal each have length L₁, and the value of L₁ is calculated by the following formula:

where L₁ is a positive integer, the overlapping region between the first audio signals of two adjacent frames has length 2L, I denotes the target sampling rate, and D denotes the original sampling rate.
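The framing, cutting, and splicing steps can be illustrated with the short sketch below. The function names are hypothetical, and the trim length is passed in as a parameter because the formula for L₁ is not reproduced in this text. The example uses a conversion ratio of 1, where trimming L samples from each end of frames that overlap by 2L leaves contiguous, non-overlapping pieces.

```python
def split_frames(signal, frame_len, half_overlap):
    """Cut `signal` into frames of `frame_len` samples; neighbours share 2*half_overlap samples."""
    hop = frame_len - 2 * half_overlap
    if hop <= 0:
        raise ValueError("frame_len must exceed the overlap")
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def trim_and_splice(frames, trim):
    """Drop `trim` samples from the head and tail of each frame, then join head to tail."""
    out = []
    for f in frames:
        out.extend(f[trim:len(f) - trim])
    return out
```

With `signal = list(range(20))`, `frame_len = 8`, and `half_overlap = 2`, the spliced result is the original signal without its first and last two samples. Discarding the frame edges is a plausible way to suppress boundary artifacts of the per-frame transform, although the text does not state that motivation explicitly.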

In some embodiments of the present invention, as shown in FIG. 7-d, the high-frequency estimation module 702 includes:

a second framing module 7021, configured to divide the second audio signal into frames to obtain multiple frames of the second audio signal;

an MDCT processing module 7022, configured to perform modified discrete cosine transform (MDCT) processing on the second audio signal of each frame to obtain an MDCT-processed second audio signal;

a straight-line fitting module 7023, configured to use a random sample consensus (RANSAC) algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-frequency spectrum of the MDCT-processed second audio signal to obtain a low-frequency spectrum envelope, where the low-frequency spectrum includes the spectrum segment below the effective cutoff frequency of the MDCT-processed second audio signal;

an envelope generating module 7024, configured to estimate the high-frequency spectrum of the MDCT-processed second audio signal according to the line equation corresponding to the low-frequency spectrum envelope to obtain the high-frequency spectrum envelope.
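The line fit and extrapolation performed by modules 7023 and 7024 can be sketched as below. This is a generic RANSAC line fit under stated assumptions, not the patent's exact procedure: the inlier tolerance, iteration count, fixed seed, and final least-squares refit are illustrative choices, and all function names are hypothetical. Each point is a pair (k, ln|X[k]|) for a spectral line k below the effective cutoff.

```python
import math
import random

def least_squares_line(points):
    """Ordinary least-squares fit y = a*k + b over (k, y) pairs."""
    n = len(points)
    mean_k = sum(k for k, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_k = sum((k - mean_k) ** 2 for k, _ in points)
    a = sum((k - mean_k) * (y - mean_y) for k, y in points) / var_k
    return a, mean_y - a * mean_k

def ransac_line(points, tol=0.1, iterations=200, seed=0):
    """Keep the two-point line with the most inliers, then refit on those inliers."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        (k1, y1), (k2, y2) = rng.sample(points, 2)
        if k1 == k2:
            continue
        a = (y2 - y1) / (k2 - k1)
        b = y1 - a * k1
        inliers = [(k, y) for k, y in points if abs(y - (a * k + b)) <= tol]
        if len(inliers) > len(best):
            best = inliers
    return least_squares_line(best)

def high_band_envelope(a, b, k_c, n_lines):
    """Extend the fitted line from line k_c up to n_lines and undo the logarithm."""
    return [math.exp(a * k + b) for k in range(k_c, n_lines)]
```

Spectral lines with unusually small or large magnitude act as outliers in the log domain. An ordinary least-squares fit would be pulled toward them, whereas the consensus step ignores them when choosing the line.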

In some embodiments of the present invention, the effective cutoff frequency of the MDCT-processed second audio signal is obtained as follows:

reading T pre-buffered frames of the second audio signal, and acquiring the spectral lines, corresponding to multiple frequency points, of the second audio signal of each of the T frames, where T is a natural number;

determining the effective cutoff frequency of each of the T frames in the way described here for a first frame, the first frame being any one of the T frames: searching backward (toward lower frequencies) from the last spectral line of the first frame, and taking the frequency of a first spectral line as the effective cutoff frequency of the first frame, where the first spectral line is the first line encountered, among the spectral lines of the second audio signal of the first frame, whose natural logarithm of the absolute amplitude is greater than a preset threshold;

after the effective cutoff frequency of each of the T frames has been obtained, taking the maximum of the T effective cutoff frequencies as a global cutoff frequency;

searching the spectral lines of the MDCT-processed second audio signal backward from the global cutoff frequency, and taking the frequency point of the first spectral line whose natural logarithm of the absolute amplitude is greater than another preset threshold as the effective cutoff frequency of the MDCT-processed second audio signal.
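The two-stage cutoff search described above can be sketched as follows. The function names, the magnitude floor that avoids taking the logarithm of zero, and the fallback of returning index 0 when no line exceeds the threshold are assumptions made for this illustration. Cutoffs are expressed as spectral-line indices rather than frequencies in hertz.

```python
import math

def log_mag(x, floor=1e-12):
    """Natural log of |x|, with a small floor so that silent lines do not raise an error."""
    return math.log(max(abs(x), floor))

def frame_cutoff(spectrum, threshold, start=None):
    """Search from `start` (default: last line) toward line 0 for the first
    line whose log-magnitude exceeds `threshold`; return its index."""
    if start is None:
        start = len(spectrum) - 1
    for k in range(start, -1, -1):
        if log_mag(spectrum[k]) > threshold:
            return k
    return 0  # assumed fallback when nothing exceeds the threshold

def effective_cutoff(buffered_frames, current, thr_global, thr_local):
    """Global cutoff = maximum per-frame cutoff over the buffered frames;
    the effective cutoff is then searched in `current` from that point down."""
    global_cut = max(frame_cutoff(f, thr_global) for f in buffered_frames)
    return frame_cutoff(current, thr_local, start=global_cut)
```

Taking the maximum over the T buffered frames keeps a quiet frame from producing a misleadingly low starting point, and the second search then refines the cutoff for the signal actually being processed.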

In some embodiments of the present invention, as shown in FIG. 7-e, the audio player 700 may, in addition to the modules shown in FIG. 7-a, further include the following module:

a correction module 706, configured to, after the straight-line fitting module uses the RANSAC algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-frequency spectrum of the MDCT-processed second audio signal to obtain the low-frequency spectrum envelope, determine correction parameters of the line equation according to a minimum-value point in a preset spectral coordinate system and an effective-value point of the effective cutoff frequency in that coordinate system, and adjust the parameters of the line equation corresponding to the low-frequency spectrum envelope according to the correction parameters.

In some embodiments of the present invention, the spectrum copy module 703 is specifically configured to divide the low-frequency spectrum into multiple spectral line segments according to the effective cutoff frequency of the second audio signal, and to copy the multiple spectral line segments in sequence to the high-frequency band of the second audio signal to obtain the third audio signal.
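One possible reading of the segment-wise copy is sketched below. The text does not say how the segment length is chosen or what happens when the high band is longer than the low band, so the fixed `seg_len` and the cyclic reuse of segments are assumptions, as is the function name. Shaping the copied lines to follow the estimated envelope is left out of this sketch.

```python
def copy_low_to_high(spectrum, k_c, seg_len):
    """Fill lines k_c and above with segments of the low band [0, k_c), taken in order."""
    if k_c <= 0 or seg_len <= 0:
        raise ValueError("k_c and seg_len must be positive")
    # Split the low band into consecutive segments of seg_len lines.
    segments = [spectrum[s:min(s + seg_len, k_c)] for s in range(0, k_c, seg_len)]
    out = list(spectrum[:k_c])
    i = 0
    while len(out) < len(spectrum):
        out.extend(segments[i % len(segments)])  # reuse segments cyclically
        i += 1
    return out[:len(spectrum)]
```

For instance, with ten spectral lines, a cutoff at line 4, and two-line segments, the six empty high-band lines receive the low-band values in the order of segment 1, segment 2, segment 1.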

In some embodiments of the present invention, the energy adjustment module 704 is specifically configured to divide the third audio signal into S spectral line segments according to the effective cutoff frequency and the signal termination frequency of the third audio signal, where each spectral line segment includes w spectral lines, and S and w are natural numbers. Each spectral line in the S spectral line segments is energy-adjusted as follows:

X′[n] = X[n] × α_i,

where X′[n] denotes the fourth audio signal, X[n] denotes the third audio signal, α_i denotes the energy adjustment coefficient, E_i denotes the energy of each spectral line segment before adjustment, P_i denotes the pseudo-energy filled into the high-frequency band of the second audio signal, k_c denotes the effective cutoff frequency of the third audio signal, and a and b denote the line-equation parameters of the third audio signal.
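The per-segment scaling can be sketched as follows, using the index range given in claim 8 (segment i covers lines k_c + i×w through k_c + (i+1)×w − 1). The expressions that derive α_i from E_i, P_i, a, and b are not reproduced in this text, so the coefficients are simply passed in. Treating E_i as the sum of squared line amplitudes is an assumption, and both function names are hypothetical.

```python
def adjust_energy(spectrum, k_c, w, alphas):
    """Scale each of the S = len(alphas) segments above k_c by its own coefficient."""
    out = list(spectrum)
    for i, alpha in enumerate(alphas):
        for n in range(k_c + i * w, k_c + (i + 1) * w):
            out[n] = spectrum[n] * alpha   # X'[n] = X[n] * alpha_i
    return out

def segment_energy(spectrum, k_c, w, i):
    """Sum of squared amplitudes of segment i (assumed meaning of E_i)."""
    return sum(x * x for x in spectrum[k_c + i * w:k_c + (i + 1) * w])
```

Lines below k_c are returned unchanged, so only the copied high band is affected by the adjustment.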

It should be noted that the information exchange and execution processes among the modules/units of the above apparatus are based on the same concept as the method embodiments of the present invention and bring the same technical effects. For details, refer to the description in the foregoing method embodiments of the present invention, which is not repeated here.

It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the present invention, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.

From the description of the above embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software plus the necessary general-purpose hardware, and may of course also be implemented by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can readily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function may take many forms, such as analog circuits, digital circuits, or dedicated circuits. For the present invention, however, a software implementation is in most cases the preferred embodiment. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.

In summary, the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the technical solutions described in the above embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A high-frequency extension method for an audio signal, characterized by comprising:

upsampling an original first audio signal to obtain a second audio signal;

acquiring a low-frequency spectrum of the second audio signal, and estimating a high-frequency spectrum of the second audio signal according to the low-frequency spectrum to obtain a high-frequency spectrum envelope;

copying the low-frequency spectrum to the high-frequency band of the second audio signal according to the high-frequency spectrum envelope to obtain a third audio signal;

performing energy adjustment on the third audio signal to reduce the energy of the high-frequency part of the third audio signal, to obtain a fourth audio signal;

wherein the acquiring a low-frequency spectrum of the second audio signal, and estimating a high-frequency spectrum of the second audio signal according to the low-frequency spectrum to obtain a high-frequency spectrum envelope, comprises:

framing the second audio signal to obtain multiple frames of the second audio signal;

performing modified discrete cosine transform (MDCT) processing on the second audio signal of each frame to obtain an MDCT-processed second audio signal;

using a random sample consensus (RANSAC) algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-frequency spectrum of the MDCT-processed second audio signal to obtain a low-frequency spectrum envelope, wherein the low-frequency spectrum comprises the spectrum segment below the effective cutoff frequency of the MDCT-processed second audio signal;

estimating the high-frequency spectrum of the MDCT-processed second audio signal according to the line equation corresponding to the low-frequency spectrum envelope to obtain the high-frequency spectrum envelope.

2. The method according to claim 1, wherein the upsampling an original first audio signal to obtain a second audio signal comprises:

framing the first audio signal to obtain multiple frames of the first audio signal, wherein the first audio signals of every two adjacent frames overlap;

performing discrete cosine transform (DCT) processing on the first audio signal of each frame separately to obtain a DCT-processed first audio signal;

appending zeros to the tail end of each frame of the DCT-processed first audio signal according to a sampling rate conversion ratio to obtain a zero-padded first audio signal, wherein the sampling rate conversion ratio is the ratio of the target sampling rate to the original sampling rate;

performing inverse discrete cosine transform (IDCT) processing on the zero-padded first audio signal to obtain an IDCT-processed first audio signal;

joining the IDCT-processed first audio signals of all frames head to tail to obtain the second audio signal.

3. The method according to claim 2, wherein after the performing IDCT processing on the zero-padded first audio signal to obtain an IDCT-processed first audio signal, the method further comprises:

cutting off the head end and the tail end of the IDCT-processed first audio signal to obtain a first audio signal with its head and tail ends removed;

and wherein the joining the IDCT-processed first audio signals of all frames head to tail to obtain the second audio signal is specifically:

joining the first audio signals of all frames, with their head and tail ends removed, head to tail to obtain the second audio signal.

4. The method according to claim 3, wherein the head end and the tail end cut off from the IDCT-processed first audio signal each have length L₁, and the value of L₁ is calculated by the following formula:

wherein L₁ is a positive integer, the overlapping region between the first audio signals of two adjacent frames has length 2L, I denotes the target sampling rate, and D denotes the original sampling rate.

5. The method according to claim 1, wherein the effective cutoff frequency of the MDCT-processed second audio signal is obtained as follows:

reading T pre-buffered frames of the second audio signal, and acquiring the spectral lines, corresponding to multiple frequency points, of the second audio signal of each of the T frames, wherein T is a natural number;

determining the effective cutoff frequency of each of the T frames in the way described here for a first frame, the first frame being any one of the T frames: searching backward (toward lower frequencies) from the last spectral line of the first frame, and taking the frequency of a first spectral line as the effective cutoff frequency of the first frame, wherein the first spectral line is the first line encountered, among the spectral lines of the second audio signal of the first frame, whose natural logarithm of the absolute amplitude is greater than a preset threshold;

after the effective cutoff frequency of each of the T frames has been obtained, taking the maximum of the T effective cutoff frequencies as a global cutoff frequency;

searching the spectral lines of the MDCT-processed second audio signal backward from the global cutoff frequency, and taking the frequency point of the first spectral line whose natural logarithm of the absolute amplitude is greater than another preset threshold as the effective cutoff frequency of the MDCT-processed second audio signal.

6. The method according to claim 1, wherein after the RANSAC algorithm is used to fit a straight line to the low-frequency spectrum of the MDCT-processed second audio signal to obtain the low-frequency spectrum envelope, the method further comprises:

determining correction parameters of the line equation according to a minimum-value point in a preset spectral coordinate system and an effective-value point of the effective cutoff frequency in the spectral coordinate system;

adjusting the parameters of the line equation corresponding to the low-frequency spectrum envelope according to the correction parameters.

7. The method according to any one of claims 1 to 6, wherein the copying the low-frequency spectrum to the high-frequency band of the second audio signal according to the high-frequency spectrum envelope to obtain a third audio signal comprises:

dividing the low-frequency spectrum into multiple spectral line segments according to the effective cutoff frequency of the second audio signal;

copying the multiple spectral line segments in sequence to the high-frequency band of the second audio signal to obtain the third audio signal.

8. The method according to any one of claims 1 to 6, wherein the performing energy adjustment on the third audio signal to obtain a fourth audio signal comprises:

dividing the third audio signal into S spectral line segments according to the effective cutoff frequency and the signal termination frequency of the third audio signal, wherein each spectral line segment includes w spectral lines, and S and w are natural numbers;

adjusting the energy of each spectral line in the S spectral line segments as follows:

X′[n] = X[n] × α_i,

n = k_c + i×w to k_c + (i+1)×w − 1, i = 0 to S − 1,

wherein X′[n] denotes the fourth audio signal, X[n] denotes the third audio signal, α_i denotes the energy adjustment coefficient, E_i denotes the energy of each spectral line segment before adjustment, P_i denotes the pseudo-energy filled into the high-frequency band of the second audio signal, k_c denotes the effective cutoff frequency of the third audio signal, and a and b denote the line-equation parameters of the third audio signal.

9. An audio player, characterized by comprising:

an upsampling module, configured to upsample an original first audio signal to obtain a second audio signal;

a high-frequency estimation module, configured to acquire a low-frequency spectrum of the second audio signal, and to estimate a high-frequency spectrum of the second audio signal according to the low-frequency spectrum to obtain a high-frequency spectrum envelope;

a spectrum copy module, configured to copy the low-frequency spectrum to the high-frequency band of the second audio signal according to the high-frequency spectrum envelope to obtain a third audio signal;

an energy adjustment module, configured to perform energy adjustment on the third audio signal to reduce the energy of the high-frequency part of the third audio signal, to obtain a fourth audio signal;

wherein the high-frequency estimation module comprises:

a second framing module, configured to frame the second audio signal to obtain multiple frames of the second audio signal;

an MDCT processing module, configured to perform modified discrete cosine transform (MDCT) processing on the second audio signal of each frame to obtain an MDCT-processed second audio signal;

a straight-line fitting module, configured to use a random sample consensus (RANSAC) algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-frequency spectrum of the MDCT-processed second audio signal to obtain a low-frequency spectrum envelope, wherein the low-frequency spectrum comprises the spectrum segment below the effective cutoff frequency of the MDCT-processed second audio signal;

an envelope generating module, configured to estimate the high-frequency spectrum of the MDCT-processed second audio signal according to the line equation corresponding to the low-frequency spectrum envelope to obtain the high-frequency spectrum envelope.

10. The audio player according to claim 9, wherein the upsampling module comprises:

a first framing module, configured to frame the first audio signal to obtain multiple frames of the first audio signal, wherein the first audio signals of every two adjacent frames overlap;

a DCT processing module, configured to perform discrete cosine transform (DCT) processing on the first audio signal of each frame separately to obtain a DCT-processed first audio signal;

an extension module, configured to append zeros to the tail end of each frame of the DCT-processed first audio signal according to a sampling rate conversion ratio to obtain a zero-padded first audio signal, wherein the sampling rate conversion ratio is the ratio of the target sampling rate to the original sampling rate;

an IDCT processing module, configured to perform inverse discrete cosine transform (IDCT) processing on the zero-padded first audio signal to obtain an IDCT-processed first audio signal;

a splicing module, configured to join the IDCT-processed first audio signals of all frames head to tail to obtain the second audio signal.

11. The audio player according to claim 10, wherein the audio player further comprises:

a cutting module, configured to, after the IDCT processing module performs IDCT processing on the zero-padded first audio signal to obtain the IDCT-processed first audio signal, cut off the head end and the tail end of the IDCT-processed first audio signal to obtain a first audio signal with its head and tail ends removed;

wherein the splicing module is specifically configured to join the first audio signals of all frames, with their head and tail ends removed, head to tail to obtain the second audio signal.

12. The audio player according to claim 9, wherein the audio player further comprises:

a correction module, configured to, after the straight-line fitting module uses the RANSAC algorithm to fit a straight line to the natural logarithm of the absolute amplitude of the low-frequency spectrum of the MDCT-processed second audio signal to obtain the low-frequency spectrum envelope, determine correction parameters of the line equation according to a minimum-value point in a preset spectral coordinate system and an effective-value point of the effective cutoff frequency in the spectral coordinate system, and adjust the parameters of the line equation corresponding to the low-frequency spectrum envelope according to the correction parameters.