CN117730367A - Grouping methods, encoders, decoders, and storage media - Google Patents

Grouping methods, encoders, decoders, and storage media

Info

Publication number
CN117730367A
Authority
CN
China
Prior art keywords
channel
channels
grouping
channel group
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202380012056.6A
Other languages
Chinese (zh)
Inventor
王宾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Publication of CN117730367A

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present disclosure relates to a grouping method, an apparatus, and a storage medium. The grouping method is performed by an encoder and comprises: grouping a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer. The embodiment solves the problem that a single channel may be left ungrouped when channels are grouped. By forming a channel group that includes three channels, every channel is assigned to a group, which ensures the accuracy of channel grouping, reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.

Description

Grouping methods, encoders, decoders, and storage media

Technical field

The present disclosure relates to the field of multimedia technology, and in particular to a grouping method, an encoder, a decoder, and a storage medium.

Background

With the rapid development of multimedia technology, audio is used in many fields. To improve the sense of space and direction of audio, audio can be three-dimensionally encoded and decoded so that the audio heard by the user is indistinguishable from the audio heard in the actual environment.

Summary

The present disclosure solves the problem that a single channel may be left ungrouped when channels are grouped. By forming a channel group that includes three channels, it ensures that every channel has a corresponding group, which ensures the accuracy of channel grouping, reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.

The embodiments of the present disclosure provide a grouping method, an apparatus, and a storage medium.

According to a first aspect of the embodiments of the present disclosure, a grouping method is provided. The method is performed by an encoder and includes:

grouping a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

According to a second aspect of the embodiments of the present disclosure, a grouping method is provided. The method is performed by a decoder and includes:

decoding first information to obtain channel information of at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

According to a third aspect of the embodiments of the present disclosure, a grouping method is provided. The method includes:

grouping, by an encoder, a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer;

decoding, by a decoder, first information to obtain channel information of the at least one channel group.

According to a fourth aspect of the embodiments of the present disclosure, a codec device is provided, including:

a processing module configured to group a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

According to a fifth aspect of the embodiments of the present disclosure, a codec device is provided, including:

a processing module configured to decode first information to obtain channel information of at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

According to a sixth aspect of the embodiments of the present disclosure, a codec device is provided, including:

one or more processors;

wherein the codec device is configured to perform the method of any implementation of the first aspect.

According to a seventh aspect of the embodiments of the present disclosure, a codec device is provided, including:

one or more processors;

wherein the codec device is configured to perform the method of any implementation of the second aspect.

According to an eighth aspect of the embodiments of the present disclosure, a codec system is provided, including:

an encoder and a decoder, wherein the encoder is configured to implement the grouping method of the first aspect, and the decoder is configured to implement the grouping method of the second aspect.

According to a ninth aspect of the embodiments of the present disclosure, a storage medium is provided. The storage medium stores instructions that, when run on a communication device, cause the communication device to perform the method of any implementation of the first aspect or the second aspect.

Brief description of the drawings

The drawings described here provide a further understanding of the embodiments of the present disclosure and constitute a part of the present disclosure. The illustrative embodiments of the present disclosure and their descriptions are used to explain the embodiments of the present disclosure and do not unduly limit them. In the drawings:

Figure 1 is a schematic architecture diagram of a codec system according to an embodiment of the present disclosure;

Figure 2A is an interaction diagram of a grouping method according to an embodiment of the present disclosure;

Figure 2B is an interaction diagram of a grouping method according to an embodiment of the present disclosure;

Figure 2C is an interaction diagram of a grouping method according to an embodiment of the present disclosure;

Figure 2D is an interaction diagram of a grouping method according to an embodiment of the present disclosure;

Figure 2E is an interaction diagram of a grouping method according to an embodiment of the present disclosure;

Figure 3A is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 3B is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 4A is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 4B is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 5 is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 6 is a schematic flowchart of a grouping method according to an embodiment of the present disclosure;

Figure 7A is a schematic structural diagram of a codec device according to an embodiment of the present disclosure;

Figure 7B is a schematic structural diagram of a codec device according to an embodiment of the present disclosure;

Figure 8A is a schematic structural diagram of a communication device according to an embodiment of the present disclosure;

Figure 8B is a schematic structural diagram of a chip according to an embodiment of the present disclosure.

Detailed description

The present disclosure provides a grouping method, an apparatus, and a storage medium.

According to a first aspect of the embodiments of the present disclosure, a grouping method is provided. The method is performed by an encoder and includes:

grouping a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

The above embodiment solves the problem that a single channel may be left ungrouped when channels are grouped. By forming a channel group that includes three channels, it ensures that every channel has a corresponding group, which ensures the accuracy of channel grouping, reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.
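The constraint on the grouping result can be illustrated with a short check. This is a minimal sketch rather than part of the disclosed method: the function name `is_valid_grouping` and the representation of a group as a tuple of channel identifiers are assumptions made for illustration.

```python
def is_valid_grouping(channel_ids, groups):
    """Return True if `groups` has exactly one three-channel group (N = 1),
    any number of two-channel groups (M >= 0), and covers every channel once."""
    sizes = [len(g) for g in groups]
    # Exactly one group of three; every other group must have two channels.
    if sizes.count(3) != 1 or any(n not in (2, 3) for n in sizes):
        return False
    flat = [ch for g in groups for ch in g]
    # No channel may appear twice, and no channel may be left ungrouped.
    return len(flat) == len(set(flat)) and set(flat) == set(channel_ids)
```

For example, five channels split into one group of three and one group of two satisfy the constraint, while four channels split into two pairs do not, because no three-channel group is present.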

With reference to some embodiments of the first aspect, in some embodiments, grouping the plurality of channels to obtain at least one channel group includes:

obtaining the similarity between any two of the plurality of channels;

grouping the plurality of channels based on the similarity between any two channels to obtain the at least one channel group.

In the above embodiment, channels are grouped according to the similarity between every two channels, which ensures that the similarity of the channels in each channel group meets the requirement and ensures the accuracy of channel grouping. This reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.
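The pairwise similarity step can be sketched as follows. The passage above does not specify the similarity measure, so the absolute normalized correlation at zero lag used here is an assumption, as are the names `similarity` and `pairwise_similarity` and the dictionary-of-sample-lists input.

```python
import math
from itertools import combinations


def similarity(x, y):
    """Assumed measure: absolute normalized correlation at zero lag, in [0, 1]."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return abs(num) / den if den > 0 else 0.0


def pairwise_similarity(channels):
    """Map each unordered pair of channel ids to the similarity of its signals.

    `channels` is a dict from channel id to a list of samples."""
    return {
        (i, j): similarity(channels[i], channels[j])
        for i, j in combinations(sorted(channels), 2)
    }
```

Any other symmetric measure, for example one computed per frequency band, could be substituted without changing the grouping logic that consumes the resulting table.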

With reference to some embodiments of the first aspect, grouping the plurality of channels based on the similarity between any two channels to obtain the at least one channel group includes:

among all pairs of the plurality of channels, determining the two channels with the greatest similarity as a candidate channel group;

among the remaining channels, excluding the two channels just selected, determining the two channels with the greatest similarity as another candidate channel group, and repeating until one independent channel remains or no channel remains;

determining the at least one channel group based on the obtained candidate channel groups and/or the independent channel.

In the above embodiment, channels are assigned to channel groups in descending order of similarity, which ensures that the channels in each channel group are highly correlated and ensures the accuracy of channel grouping. This reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.
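The greedy pairing steps above can be sketched as follows. The function name `greedy_candidate_groups` and the similarity table keyed by channel-id pairs are assumptions for illustration; the sketch also assumes the table contains an entry for every unordered pair of channels.

```python
def greedy_candidate_groups(channel_ids, sim):
    """Repeatedly pair the two most similar remaining channels.

    `sim` maps (a, b) tuples to a similarity value and is assumed to cover
    every unordered pair. Returns (candidate_pairs, independent_channel),
    where independent_channel is None when no channel is left over."""
    remaining = set(channel_ids)
    candidates = []
    while len(remaining) >= 2:
        # Consider only pairs whose two channels are both still ungrouped.
        best = max(
            (p for p in sim if p[0] in remaining and p[1] in remaining),
            key=lambda p: sim[p],
        )
        candidates.append(best)
        remaining -= set(best)
    independent = next(iter(remaining)) if remaining else None
    return candidates, independent
```

With an odd number of channels the loop ends with exactly one independent channel; with an even number it ends with none, matching the stopping condition stated above.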

With reference to some embodiments of the first aspect, determining the at least one channel group based on the obtained candidate channel groups and/or the independent channel includes:

obtaining the similarity between the independent channel and each channel in each candidate channel group;

when the similarity between the independent channel and each channel in a first candidate channel group is greater than a similarity threshold, determining the independent channel together with the channels of the first candidate channel group as one first channel group;

determining each remaining candidate channel group as one second channel group.

In the above embodiment, the remaining single channel is also assigned to a channel group, so that no channel is left ungrouped. This reduces redundancy between channels, ensures the accuracy of channel encoding, and in turn ensures the accuracy of audio encoding.
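Merging the independent channel into a candidate group can be sketched as follows. Two behaviors here are assumptions not stated in the passage above: the sketch picks the first candidate group, in list order, whose channels all exceed the threshold, and it returns `None` for the three-channel group when no candidate qualifies. The name `finalize_groups` is likewise hypothetical.

```python
def finalize_groups(candidates, independent, sim, threshold):
    """Return (first_group, second_groups).

    first_group is a 3-tuple of channel ids or None; second_groups is the
    list of remaining two-channel candidate groups."""
    def s(a, b):
        # The similarity table may store a pair in either order.
        return sim.get((a, b), sim.get((b, a), 0.0))

    groups = [tuple(c) for c in candidates]
    if independent is None:
        return None, groups
    for idx, group in enumerate(groups):
        # The independent channel must be similar enough to every member.
        if all(s(independent, ch) > threshold for ch in group):
            return group + (independent,), groups[:idx] + groups[idx + 1:]
    return None, groups  # assumption: no candidate qualified
```

A stricter variant could choose the qualifying candidate with the highest combined similarity instead of the first one found.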

With reference to some embodiments of the first aspect, in some embodiments, grouping the plurality of channels to obtain at least one channel group includes:

grouping the plurality of channels to directly obtain the N first channel groups and the M second channel groups.

With reference to some embodiments of the first aspect, in some embodiments, grouping the plurality of channels to directly obtain the N first channel groups and the M second channel groups includes:

performing a global search over the plurality of channels and grouping the plurality of channels according to the similarity between any two channels, to obtain the N first channel groups and the M second channel groups.
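One way to read the global search is as an exhaustive enumeration of every split into one three-channel group plus two-channel groups. The sketch below follows that reading; the scoring rule (sum over groups of the mean pairwise similarity inside each group) is an assumption, since the passage above does not define the search objective, and the names `pairings` and `global_search` are hypothetical.

```python
from itertools import combinations


def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        others = rest[:k] + rest[k + 1:]
        for tail in pairings(others):
            yield [(first, partner)] + tail


def global_search(channel_ids, sim):
    """Return (score, triple, pairs) for the best split, or None if the
    channel count cannot be split into one triple plus pairs."""
    def s(a, b):
        return sim.get((a, b), sim.get((b, a), 0.0))

    def score(group):
        # Assumed objective: mean pairwise similarity inside the group.
        pairs = list(combinations(group, 2))
        return sum(s(a, b) for a, b in pairs) / len(pairs)

    best = None
    for triple in combinations(channel_ids, 3):
        rest = [c for c in channel_ids if c not in triple]
        if len(rest) % 2:
            continue  # leftover channels cannot all be paired
        for pairs in pairings(rest):
            total = score(triple) + sum(score(p) for p in pairs)
            if best is None or total > best[0]:
                best = (total, triple, pairs)
    return best
```

The enumeration grows quickly with the channel count, so a practical encoder would likely prune or bound the search; the sketch only illustrates the idea on small layouts.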

With reference to some embodiments of the first aspect, in some embodiments, grouping the plurality of channels to obtain at least one channel group includes:

grouping the plurality of channels to obtain a plurality of candidate channel groups, each including two channels, and at least one independent channel;

assigning one of the at least one independent channel to one of the plurality of candidate channel groups to obtain the N first channel groups and the M second channel groups.

With reference to some embodiments of the first aspect, in some embodiments, the method further includes:

downmixing the audio signals in each channel group to obtain a downmixed audio signal;

encoding the downmixed audio signal to obtain an audio stream.

In the above embodiment, the audio signals of each channel group are downmixed and encoded to obtain an audio stream, which ensures that the encoded audio stream can be transmitted normally and that the audio stream is accurate.
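The downmix step can be illustrated with a passive downmix that averages the channels of a group sample by sample. This is an assumed downmix rule chosen for simplicity; the passage above does not specify how the downmix is computed, and the subsequent encoding step is not shown. The name `downmix` is hypothetical.

```python
def downmix(group_signals):
    """Average the signals of one channel group sample by sample.

    `group_signals` is a list of equal-length sample lists, one per channel
    in the group (two or three channels)."""
    n = len(group_signals)
    return [sum(samples) / n for samples in zip(*group_signals)]
```

The same function handles both two-channel and three-channel groups, because the averaging does not depend on the group size.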

With reference to some embodiments of the first aspect, in some embodiments, the method further includes:

sending first information, wherein the first information indicates the channel information of the plurality of channel groups.

In the above embodiment, the first information conveys the channel information of each channel group, which ensures the reliability of transmitting the channel groups.

With reference to some embodiments of the first aspect, in some embodiments, the channel information includes at least one of the following:

the number of channel groups that include two channels;

the channel identifiers of the channels in a channel group that includes two channels;

the energy parameters of a channel group that includes two channels;

the number of channel groups that include three channels;

the channel identifiers of the channels in a channel group that includes three channels;

the energy parameters of a channel group that includes three channels;

wherein the energy parameters are used to adjust the energy of the channels in a channel group.

In the above embodiment, the channel information covers both two-channel and three-channel channel groups, which ensures that the channel information is comprehensive.
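The channel information listed above can be modeled as a small data structure. This is a sketch of one possible in-memory representation, not a bitstream format: the class and field names (`GroupInfo`, `ChannelInfo`, `channel_ids`, `energy_params`) and the choice of one energy parameter per channel are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GroupInfo:
    channel_ids: Tuple[int, ...]       # identifiers of the two or three channels
    energy_params: Tuple[float, ...]   # assumed: one energy parameter per channel


@dataclass
class ChannelInfo:
    groups: List[GroupInfo] = field(default_factory=list)

    def count(self, size):
        """Number of channel groups that contain `size` channels."""
        return sum(1 for g in self.groups if len(g.channel_ids) == size)
```

Here the group counts are derived from the stored groups rather than kept as separate fields, which avoids the counts and the group list drifting out of sync.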

In a second aspect, an embodiment of the present disclosure provides a grouping method. The method is performed by a decoder and includes:

decoding first information to obtain channel information of at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer.

With reference to some embodiments of the second aspect, in some embodiments, the channel information includes at least one of the following:

the number of channel groups that include two channels;

the channel identifiers of the channels in a channel group that includes two channels;

the energy parameters of a channel group that includes two channels;

the number of channel groups that include three channels;

the channel identifiers of the channels in a channel group that includes three channels;

the energy parameters of a channel group that includes three channels;

wherein the energy parameters are used to adjust the energy of the channels in a channel group.

With reference to some embodiments of the second aspect, in some embodiments, the method further includes:

when it is determined that the channel groups do not include a channel group of three channels, performing two-channel upmixing on the audio stream.

With reference to some embodiments of the second aspect, in some embodiments, the method further includes:

when it is determined that the channel groups include a channel group of three channels, performing three-channel upmixing on the audio stream.
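The decoder-side choice between the two upmix modes can be sketched as a simple dispatch on the decoded group sizes. The function name `select_upmix` and the string labels are hypothetical, and the upmix itself is not shown.

```python
def select_upmix(group_sizes):
    """Pick the upmix mode from the sizes of the decoded channel groups.

    Returns "three_channel" when any group has three channels, otherwise
    "two_channel". Sizes other than 2 or 3 are rejected."""
    if any(n not in (2, 3) for n in group_sizes):
        raise ValueError("channel groups must contain two or three channels")
    return "three_channel" if 3 in group_sizes else "two_channel"
```

For example, decoded groups of sizes `[3, 2]` select three-channel upmixing, while `[2, 2]` select two-channel upmixing.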

In a third aspect, an embodiment of the present disclosure provides a grouping method. The method includes:

grouping, by an encoder, a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups, each first channel group including three channels, and M second channel groups, each second channel group including two channels, where N is 1 and M is a non-negative integer;

decoding, by a decoder, first information to obtain the channel information of the at least one channel group, wherein at least one of the plurality of channel groups includes three channels.

In a fourth aspect, an embodiment of the present disclosure provides a codec device that includes at least one of a transceiver module and a processing module, wherein the encoder is configured to perform the optional implementations of the first aspect and the third aspect.

In a fifth aspect, an embodiment of the present disclosure provides a codec device that includes at least one of a transceiver module and a processing module, wherein the access network device is configured to perform the optional implementations of the second aspect and the third aspect.

In a sixth aspect, an embodiment of the present disclosure provides a codec device, including:

one or more processors;

wherein the codec device is configured to perform the method of any one of the first aspect and the third aspect.

In a seventh aspect, an embodiment of the present disclosure provides a codec device, including:

one or more processors;

wherein the codec device is configured to perform the method of any one of the second aspect and the third aspect.

In an eighth aspect, an embodiment of the present disclosure provides a storage medium. The storage medium stores first information that, when run on a communication device, causes the communication device to perform the method of any one of the first aspect, the second aspect, and the third aspect.

In a ninth aspect, an embodiment of the present disclosure provides a program product that, when executed by a communication device, causes the communication device to perform the method of any one of the first aspect, the second aspect, and the third aspect.

In a tenth aspect, an embodiment of the present disclosure provides a computer program that, when run on a communication device, causes the communication device to perform the method of any one of the first aspect, the second aspect, and the third aspect.

In an eleventh aspect, an embodiment of the present disclosure provides a chip or chip system. The chip or chip system includes a processing circuit configured to perform the method of any one of the first aspect, the second aspect, and the third aspect.

It can be understood that the above encoder, storage medium, program product, computer program, and chip or chip system are all used to perform the methods provided in the embodiments of the present disclosure. For their beneficial effects, refer to the beneficial effects of the corresponding methods, which are not repeated here.

The embodiments of the present disclosure provide a grouping method, an apparatus, and a storage medium. In some embodiments, terms such as "grouping method" and "information grouping method" are interchangeable; terms such as "codec device", "information processing device", and "indicating device" are interchangeable; and terms such as "information processing system" and "codec system" are interchangeable.

The embodiments of the present disclosure are not exhaustive. They illustrate only some embodiments and do not specifically limit the scope of protection of the present disclosure. Where there is no contradiction, each step in an embodiment can be implemented as an independent embodiment, and the steps can be combined arbitrarily. For example, a solution obtained by removing some steps from an embodiment can also be implemented as an independent embodiment, and the order of the steps in an embodiment can be exchanged arbitrarily. In addition, the optional implementations in an embodiment can be combined arbitrarily. Furthermore, the embodiments can be combined arbitrarily with one another: for example, some or all steps of different embodiments can be combined arbitrarily, and an embodiment can be combined arbitrarily with the optional implementations of other embodiments.

In the embodiments of the present disclosure, unless otherwise specified or logically conflicting, the terms and/or descriptions in different embodiments are consistent and can be cross-referenced, and the technical features in different embodiments can be combined according to their inherent logical relationships to form new embodiments.

The terms used in the embodiments of the present disclosure are for the purpose of describing specific embodiments only and are not intended to limit the present disclosure.

In the embodiments of the present disclosure, unless otherwise specified, elements expressed in the singular form, such as "a", "an", "the", "the above", "said", "the aforesaid", and "this", can mean "one and only one" or can mean "one or more", "at least one", and so on. For example, where articles such as "a", "an", and "the" in English are used in translation, the noun after the article can be understood as singular or as plural.

在本公开实施例中,“多个”是指两个或两个以上。In the embodiment of the present disclosure, "plurality" refers to two or more.

在一些实施例中,“至少一者(至少一项、至少一个)(at least one of)”、“一个或多个(one or more)”、“多个(a plurality of)”、“多个(multiple)等术语可以相互替换。In some embodiments, "at least one of", "one or more", "a plurality of", "a plurality of" Terms such as multiple can be used interchangeably.

在一些实施例中,“A、B中的至少一者”、“A和/或B”、“在一情况下A,在另一情况下B”、“响应于一情况A,响应于另一情况B”等记载方式,根据情况可以包括以下技术方案:在一些实施例中A(与B无关地执行A);在一些实施例中B(与A无关地执行B);在一些实施例中从A和B中选择执行(A和B被选择性执行);在一些实施例中A和B(A和B都被执行)。当有A、B、C等更多分支时也类似上述。In some embodiments, "at least one of A, B", "A and/or B", "A in one situation, B in another situation", "in response to one situation A, in response to another situation A" "A situation B" and other recording methods may include the following technical solutions depending on the situation: in some embodiments A (execute A independently of B); in some embodiments B (execute B independently of A); in some embodiments Select execution from A and B (A and B are selectively executed); in some embodiments, A and B (both A and B are executed). It is similar to the above when there are more branches such as A, B, C, etc.

在一些实施例中,“A或B”等记载方式,根据情况可以包括以下技术方案:在一些实施例中A(与B无关地执行A);在一些实施例中B(与A无关地执行B);在一些实施例中从A和B中选择执行(A和B被选择性执行)。当有A、B、C等更多分支时也类似上述。In some embodiments, descriptions such as "A or B" may include the following technical solutions depending on the situation: in some embodiments A (execute A independently of B); in some embodiments B (execute independently of A) B); in some embodiments select execution from A and B (A and B are selectively executed). It is similar to the above when there are more branches such as A, B, C, etc.

Prefixes such as "first" and "second" in the embodiments of the present disclosure are used only to distinguish different described objects. They do not limit the position, order, priority, quantity, or content of the described objects. For what is stated about a described object, refer to the context of the claims or embodiments; the use of a prefix should not introduce an unnecessary limitation. For example, if the described object is a "field", the ordinals in "first field" and "second field" do not limit the position or order of the fields; "first" and "second" do not limit whether the fields they modify are in the same message, nor the order of the "first field" and the "second field". As another example, if the described object is a "level", the ordinals in "first level" and "second level" do not limit the priority between the levels. As another example, the number of described objects is not limited by the ordinal and can be one or more; taking "first device" as an example, the number of "devices" can be one or more. In addition, objects modified by different prefixes may be the same or different. For example, if the described object is a "device", the "first device" and the "second device" may be the same device or different devices, and their types may be the same or different. As another example, if the described object is "information", the "first information" and the "second information" may be the same information or different information, and their contents may be the same or different.

In some embodiments, "includes A", "contains A", "is used to indicate A", and "carries A" can be interpreted as directly carrying A or as indirectly indicating A.

In some embodiments, terms such as "time/frequency" and "time-frequency domain" refer to the time domain and/or the frequency domain.

In some embodiments, terms such as "in response to...", "in response to determining...", "in the case of...", "at the time of...", "when...", and "if..." can be used interchangeably.

In some embodiments, terms such as "greater than", "greater than or equal to", "not less than", "more than", "more than or equal to", "not fewer than", "higher than", "higher than or equal to", "not lower than", and "above" can be used interchangeably, and terms such as "less than", "less than or equal to", "not greater than", "fewer than", "fewer than or equal to", "not more than", "lower than", "lower than or equal to", "not higher than", and "below" can be used interchangeably.

In some embodiments, apparatuses and devices can be interpreted as physical or as virtual. Their names are not limited to those recorded in the embodiments, and in some cases they can also be understood as "equipment", "device", "circuit", "network element", "node", "function", "unit", "section", "system", "network", "chip", "chip system", "entity", "subject", and so on.

In some embodiments, "network" may be interpreted as the devices included in the network, for example, access network devices and core network devices.

In some embodiments, an "access network device (AN device)" may also be called a "radio access network device (RAN device)", "base station (BS)", "radio base station", or "fixed station". In some embodiments it may also be understood as a "node", "access point", "transmission point (TP)", "reception point (RP)", "transmission/reception point (TRP)", "panel", "antenna panel", "antenna array", "cell", "macro cell", "small cell", "femto cell", "pico cell", "sector", "cell group", "serving cell", "carrier", "component carrier", "bandwidth part (BWP)", and so on.

In some embodiments, an "encoder (terminal)" or "encoder device (terminal device)" may be referred to as "user equipment (UE)", "user terminal", "mobile station (MS)", "mobile terminal (MT)", subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, and so on.

In some embodiments, data, information, and the like may be obtained in compliance with the laws and regulations of the country in which the acquisition takes place.

In some embodiments, data, information, and the like may be obtained after the user's consent has been given.

In addition, each element, each row, or each column of a table in the embodiments of the present disclosure can be implemented as an independent embodiment, and any combination of elements, rows, and columns can also be implemented as an independent embodiment.

Figure 1 is a schematic architectural diagram of a codec system according to an embodiment of the present disclosure. As shown in Figure 1, the method provided by the embodiments of the present disclosure can be applied to a codec system 100, which may include an encoder 101 and a decoder 102. It should be noted that the codec system 100 may also include other devices; the present disclosure does not limit the devices included in the codec system 100.

In some embodiments, both the encoder 101 and the decoder 102 are provided in a terminal. In some embodiments, the terminal may be any of various devices, for example at least one of the following, without being limited thereto: a mobile phone, a wearable device, an Internet of Things device, a car with communication functions, a smart car, a tablet computer (Pad), a computer with wireless transceiver functions, a virtual reality (VR) encoder device, an augmented reality (AR) encoder device, a wireless encoder device in industrial control, a wireless encoder device in self-driving, a wireless encoder device in remote medical surgery, a wireless encoder device in a smart grid, a wireless encoder device in transportation safety, a wireless encoder device in a smart city, and a wireless encoder device in a smart home.

It can be understood that the codec system described in the embodiments of the present disclosure is intended to illustrate the technical solutions of the embodiments more clearly and does not limit them. Those of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions proposed in the embodiments of the present disclosure are equally applicable to similar technical problems.

The following embodiments of the present disclosure can be applied to the codec system 100 shown in Figure 1, or to some of its entities, but are not limited thereto. The entities shown in Figure 1 are examples. The codec system may include all or some of the entities in Figure 1, and may also include entities other than those in Figure 1. The number and form of the entities are arbitrary, and each entity may be physical or virtual. The connection relationships between the entities are also examples: the entities may or may not be connected, and a connection may take any form, direct or indirect, wired or wireless.

The embodiments of the present disclosure can be applied to Long Term Evolution (LTE), LTE-Advanced (LTE-A), LTE-Beyond (LTE-B), SUPER 3G, IMT-Advanced, the 4th generation mobile communication system (4G), the 5th generation mobile communication system (5G), 5G new radio (NR), Future Radio Access (FRA), New-Radio Access Technology (RAT), New Radio (NR), New radio access (NX), Future generation radio access (FX), Global System for Mobile communications (GSM (registered trademark)), CDMA2000, Ultra Mobile Broadband (UMB), IEEE 802.11 (Wi-Fi (registered trademark)), IEEE 802.16 (WiMAX (registered trademark)), IEEE 802.20, Ultra-WideBand (UWB), Bluetooth (registered trademark), Public Land Mobile Network (PLMN) networks, Device-to-Device (D2D) systems, Machine to Machine (M2M) systems, Internet of Things (IoT) systems, Vehicle-to-Everything (V2X), systems using other grouping methods, next-generation systems extended from any of these, and so on. A combination of multiple systems (for example, LTE or LTE-A combined with 5G) can also be used.

In some embodiments, the present disclosure is used for three-dimensional audio coding, which is one of the key technologies of immersive audio. Compared with traditional sound, three-dimensional sound adds a sense of space and direction, so that listeners hear a reproduction of the sounds they would hear in the real world. This meets the demand for a faithfully restored, highly immersive listening experience, and can also support personalized selection and interaction.

In some embodiments, reproducing the sense of space and direction of sound can technically rely on a channel-based approach, an object-based approach, a sound-field-based approach, or a combination of these three. Channel-based audio is a set of interrelated channels; common formats include 5.1, 7.1, 5.1.4, and 7.1.4. Each format corresponds to a speaker layout, and the best playback is obtained with the corresponding layout.

In some embodiments, object-based audio is a collection of mono audio elements and their corresponding metadata. The metadata represents information such as the position, intensity, and size of each object. During playback, each object is mapped to one or more speakers, or binaurally rendered for headphone playback, according to the metadata, to achieve the desired spatial audio effect.

In some embodiments, sound-field-based audio is a 3D sound field modeling format defined on the surface of a sphere. The principle is that sound propagates as a pressure wave, and for a sound scene at a given time, each point is represented by several pressure functions. If the pressure value at every point in the space is known, the sound in the space can be reconstructed. The pressure at each point is related to the pressure at its neighboring points. To take full advantage of scene-based audio production, the coefficients must be obtained accurately and the coding quality of the sound field spatial coefficients must be improved. The captured sound field signal is called Higher Order Ambisonics (HOA). HOA system performance improves as the HOA order increases, but the number of HOA signals increases as well.

Figure 2 is an interaction diagram of a grouping method according to an embodiment of the present disclosure. As shown in Figure 2, the embodiments of the present disclosure relate to a grouping method, which includes:

Step S2101: the encoder groups a plurality of channels to obtain at least one channel group.

In some embodiments, channels are mutually independent audio signals that are captured or played back at different spatial positions when sound is recorded or played. A channel is an element of the audio, and different types of audio have different numbers of channels. For example, an HOA3 signal has 16 channels, and an MC22.2 signal has 24 channels.
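The two channel counts quoted above can be checked with a short sketch. It assumes that "HOA3" denotes a third-order Ambisonics signal, for which a full-sphere representation of order n has (n + 1)^2 component signals, and that a layout label such as "22.2" or "7.1.4" lists speaker counts per field. The function names are illustrative and do not come from the disclosure.

```python
def hoa_channel_count(order: int) -> int:
    """Number of component signals in a full-sphere Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def layout_channel_count(layout: str) -> int:
    """Sum the fields of a layout label such as "5.1", "7.1.4" or "22.2"."""
    return sum(int(part) for part in layout.split("."))


if __name__ == "__main__":
    print(hoa_channel_count(3))          # third-order HOA
    print(layout_channel_count("22.2"))  # 22 main channels plus 2 LFE channels
```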

In some embodiments, the audio signal includes audio data and side information.

In some embodiments, a channel group is a group that includes at least two channels. The encoder encodes the channels in units of channel groups to obtain an audio stream.

In some embodiments, the name of the channel group is not limited. It may, for example, also be called a channel set or a channel group category.

In some embodiments, the at least one channel group contains N first channel groups, each first channel group including three channels, where N is 1.

In some embodiments, the at least one channel group contains M second channel groups, each second channel group including two channels, where M is a non-negative integer.

In some embodiments, N may also be 0. That is, grouping the plurality of channels yields no channel group that includes three channels, in other words zero first channel groups.

Optionally, M should also be no greater than half the number of channels. That is, if the number of channels being grouped is P, where P is a positive integer, then M is greater than or equal to 0 and less than or equal to P/2.

In some embodiments, before the channels are grouped, the signals carried by the channels are preprocessed; the channels are grouped after the preprocessing is complete.

Optionally, the preprocessing of the signals carried by the channels includes at least one of the following: transient detection, window type decision, time-frequency transform, frequency-domain noise shaping, temporal noise shaping, and bandwidth extension coding.

In some embodiments, grouping the plurality of channels to obtain at least one channel group includes: obtaining the similarity between any two channels among the plurality of channels, and grouping the plurality of channels based on the similarity between any two channels to obtain the at least one channel group.

In the embodiments of the present disclosure, the encoder can obtain the similarity between every two channels and then group the plurality of channels according to how these pairwise similarities compare, to obtain at least one channel group.

In some embodiments, grouping the plurality of channels based on the similarity between any two channels to obtain at least one channel group includes: among the plurality of channels, determining the two channels with the greatest similarity as one candidate channel group; among the channels other than those two, determining the two channels with the greatest similarity as another candidate channel group; repeating this until one independent channel remains or no channel remains; and determining the at least one channel group based on the resulting candidate channel groups and/or the independent channel.

For example, suppose the plurality of channels are channel 1, channel 2, channel 3, channel 4, and channel 5. The similarity between every two channels is obtained. If channel 1 and channel 2 have the highest similarity, they form one candidate channel group. Channel 1 and channel 2 are then excluded, and the two most similar channels among channel 3, channel 4, and channel 5 form another candidate channel group. If these are channel 3 and channel 4, channel 5 remains, and channel 5 is the independent channel.
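The pairing procedure just described can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the disclosure does not specify how similarity is computed, so the sketch assumes a precomputed table `sim` keyed by unordered channel pairs, and the function name `greedy_pairs` and the similarity values are hypothetical. It also omits the similarity threshold that is applied elsewhere in the description.

```python
from itertools import combinations


def greedy_pairs(channels, sim):
    """Repeatedly pair the two most similar remaining channels.

    sim maps frozenset({a, b}) to the similarity of channels a and b.
    Returns (pairs, leftover): the candidate channel groups, and the
    independent channel if one remains (otherwise None).
    """
    remaining = list(channels)
    pairs = []
    while len(remaining) >= 2:
        # Pick the most similar pair among the channels not yet grouped.
        a, b = max(combinations(remaining, 2), key=lambda p: sim[frozenset(p)])
        pairs.append((a, b))
        remaining.remove(a)
        remaining.remove(b)
    leftover = remaining[0] if remaining else None
    return pairs, leftover


if __name__ == "__main__":
    # Hypothetical similarities for five channels; only two pairs are strongly similar.
    sim = {frozenset(p): 0.1 for p in combinations(range(1, 6), 2)}
    sim[frozenset((1, 2))] = 0.9
    sim[frozenset((3, 4))] = 0.8
    print(greedy_pairs([1, 2, 3, 4, 5], sim))
```

With these assumed values, channels 1 and 2 are paired first, then channels 3 and 4, and channel 5 is left as the independent channel, matching the example in the text.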

In some embodiments, determining the at least one channel group based on the resulting candidate channel groups and/or the independent channel includes: obtaining the similarity between the independent channel and each channel in each candidate channel group; when the similarity between the independent channel and each channel in a first candidate channel group is greater than a similarity threshold, determining the independent channel together with the channels of the first candidate channel group as one first channel group; and determining each remaining candidate channel group as one second channel group.

In the embodiments of the present disclosure, when the similarity between the independent channel and each channel in the first candidate channel group is greater than the similarity threshold, the independent channel is similar to both channels of that group. The independent channel can then be assigned to the first candidate channel group, which now includes three channels and is therefore a first channel group. Each remaining candidate channel group still includes two channels and is therefore a second channel group.
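The threshold test described above can be sketched as follows. The sketch assumes the same hypothetical pairwise table `sim` keyed by unordered channel pairs, and it simply takes the first candidate group that passes the test; how to choose among several passing groups is a separate rule. What happens to the independent channel when no group passes is not stated in this passage, so the sketch leaves it ungrouped (N = 0) as an assumption.

```python
def attach_independent(pairs, independent, sim, threshold):
    """Try to merge the independent channel into one two-channel candidate group.

    Returns (first_groups, second_groups): a list holding at most one
    three-channel group, and the list of remaining two-channel groups.
    """
    for i, (a, b) in enumerate(pairs):
        similar_to_a = sim[frozenset((independent, a))] > threshold
        similar_to_b = sim[frozenset((independent, b))] > threshold
        if similar_to_a and similar_to_b:
            first_groups = [(a, b, independent)]
            second_groups = pairs[:i] + pairs[i + 1:]
            return first_groups, second_groups
    # No candidate group qualifies: assumed here to mean N = 0.
    return [], list(pairs)


if __name__ == "__main__":
    # Hypothetical similarities between channel 5 and channels 1 to 4.
    sim = {
        frozenset((5, 1)): 0.2, frozenset((5, 2)): 0.9,
        frozenset((5, 3)): 0.8, frozenset((5, 4)): 0.7,
    }
    print(attach_independent([(1, 2), (3, 4)], 5, sim, threshold=0.5))
```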

In some embodiments, when the plurality of channels are grouped based on the similarity between any two channels and it is determined that an odd number of channels satisfy the grouping condition, the similarity between the independent channel and each channel in each candidate channel group is obtained. When the similarity between the independent channel and each channel in a first candidate channel group is greater than the similarity threshold, the independent channel and the channels of the first candidate channel group are determined as one first channel group, and each remaining candidate channel group is determined as one second channel group.

It should be noted that the above embodiments are described for the case in which an independent channel is obtained. In another embodiment, among the plurality of channels, the two channels with the greatest similarity are determined as one candidate channel group; among the channels other than those two, the two channels with the greatest similarity are determined as another candidate channel group; this is repeated until no channel remains; and the at least one channel group is determined based on the resulting candidate channel groups.

It should be noted that if the similarity between any two channels among the plurality of channels is not greater than the similarity threshold, the channels are not grouped.

It should be noted that, in the embodiments of the present disclosure, the independent channel may have a similarity greater than the similarity threshold with both channels of more than one candidate channel group. In this case, an additional decision is needed to determine which candidate channel group the independent channel is assigned to.

In some embodiments, for each candidate channel group, the sum of the similarities between the independent channel and the two channels of that group is obtained. The largest of these sums is selected, and the independent channel is assigned to the candidate channel group corresponding to the largest sum.

For example, suppose the independent channel is channel 5, the first candidate channel group includes channel 1 and channel 2, and the second candidate channel group includes channel 3 and channel 4. If the sum of the similarities of channel 5 with channel 1 and channel 2 is 1.8, and the sum of the similarities of channel 5 with channel 3 and channel 4 is 1.9, channel 5 is assigned to the second candidate channel group.

In some embodiments, among the similarities between the independent channel and the two channels of each candidate channel group, the candidate channel group that contains the largest similarity is determined, and the independent channel is assigned to that candidate channel group.

For example, suppose the independent channel is channel 5, the first candidate channel group includes channel 1 and channel 2, and the second candidate channel group includes channel 3 and channel 4. If the similarity of channel 5 is 0.7 with channel 1, 0.9 with channel 2, 0.8 with channel 3, and 0.6 with channel 4, then the largest similarity (0.9) belongs to the first candidate channel group, and channel 5 is assigned to the first candidate channel group.

It should be noted that, in the above embodiments, if the sum or the maximum of the similarities between the independent channel and the channels is the same for at least two candidate channel groups, one of these candidate channel groups is selected at random, and the independent channel is assigned to the selected group.
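The two selection rules and the random tie-break can be sketched together. This is an illustrative sketch under stated assumptions: `pairs` is assumed to hold only the candidate groups that already passed the threshold test, `sim` is a hypothetical table keyed by unordered channel pairs, and the function name and the `rule` parameter are not from the disclosure.

```python
import random


def pick_group(pairs, independent, sim, rule="sum", rng=random):
    """Choose the candidate group that the independent channel joins.

    rule="sum" scores a group by the sum of the two similarities;
    rule="max" scores it by the larger of the two. Ties are broken at random.
    """
    if rule not in ("sum", "max"):
        raise ValueError("rule must be 'sum' or 'max'")
    score = sum if rule == "sum" else max
    scores = [
        score(sim[frozenset((independent, c))] for c in pair) for pair in pairs
    ]
    best = max(scores)
    tied = [pair for pair, s in zip(pairs, scores) if s == best]
    return rng.choice(tied)


if __name__ == "__main__":
    # Similarities of channel 5 with channels 1 to 4, as in the example in the text.
    sim = {
        frozenset((5, 1)): 0.7, frozenset((5, 2)): 0.9,
        frozenset((5, 3)): 0.8, frozenset((5, 4)): 0.6,
    }
    pairs = [(1, 2), (3, 4)]
    print(pick_group(pairs, 5, sim, rule="sum"))
    print(pick_group(pairs, 5, sim, rule="max"))
```

With the example values, both rules select the group containing channel 1 and channel 2; the rules can disagree for other similarity values, which is why they are kept as separate options here.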

It should be noted that grouping in the embodiments of the present disclosure covers two grouping schemes. Each scheme is described below.

In some embodiments, the plurality of channels are grouped to directly obtain N first channel groups and M second channel groups. Optionally, this means that when the plurality of channels are grouped, the N first channel groups and M second channel groups are output directly, and no other grouping result is output in between.

Optionally, grouping the plurality of channels to directly obtain N first channel groups and M second channel groups includes: performing a global search over the plurality of channels and grouping them according to the similarity between any two channels, to obtain the N first channel groups and the M second channel groups.

In one possible implementation, for every set of three channels among the plurality of channels, the similarity between any two of the three channels is obtained. If the similarity between any two of the three channels is greater than the similarity threshold, the three channels are placed in one candidate channel group. If only one candidate channel group results, it is determined as the first channel group. If multiple candidate channel groups result, the first channel group is determined from among them according to the similarities within each candidate channel group.

Optionally, if multiple candidate channel groups result, determining the first channel group from among them according to the similarities within each candidate channel group includes: for each candidate channel group, obtaining the sum of the similarities between every two channels in the group; selecting the largest of these sums; and determining the candidate channel group corresponding to the largest sum as the first channel group.

For example, suppose the first candidate channel group includes channel 1, channel 2, and channel 3, and the second candidate channel group includes channel 4, channel 5, and channel 6. If the sum of the similarities between every two channels among channel 1, channel 2, and channel 3 is 2.7, and the corresponding sum for channel 4, channel 5, and channel 6 is 2.6, the first candidate channel group is determined as the first channel group.
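The triple search with the largest-sum rule can be sketched as an exhaustive scan over all three-channel combinations. This is a sketch under assumptions: `sim` is a hypothetical pairwise table keyed by unordered channel pairs, the individual similarity values are invented so that the two sums match the example (2.7 and 2.6), and the function name is illustrative. When two triples have equal sums, this sketch keeps the first one found.

```python
from itertools import combinations


def best_triple(channels, sim, threshold):
    """Return the three-channel group with the largest pairwise-similarity sum.

    A triple qualifies only if all three of its pairwise similarities
    exceed the threshold. Returns None when no triple qualifies.
    """
    best, best_score = None, float("-inf")
    for triple in combinations(channels, 3):
        scores = [sim[frozenset(p)] for p in combinations(triple, 2)]
        if all(s > threshold for s in scores) and sum(scores) > best_score:
            best, best_score = triple, sum(scores)
    return best


if __name__ == "__main__":
    # Hypothetical values: pairs inside (1, 2, 3) sum to 2.7, inside (4, 5, 6) to 2.6.
    sim = {frozenset(p): 0.1 for p in combinations(range(1, 7), 2)}
    for p in combinations((1, 2, 3), 2):
        sim[frozenset(p)] = 0.9
    sim[frozenset((4, 5))] = 0.9
    sim[frozenset((4, 6))] = 0.9
    sim[frozenset((5, 6))] = 0.8
    print(best_triple(range(1, 7), sim, threshold=0.5))
```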

Optionally, if multiple candidate channel groups result, determining the first channel group from among them according to the similarities within each candidate channel group includes: among the similarities between the independent channel and every two channels in each candidate channel group, determining the candidate channel group that contains the largest similarity, and determining that candidate channel group as the first channel group.

In some embodiments, the process of grouping the plurality of channels is similar to the similarity-based grouping process in the above embodiments and is not repeated here.

In some embodiments, the plurality of channels are grouped to obtain multiple candidate channel groups, each including two channels, and at least one independent channel. One of the at least one independent channel is then assigned to one of the candidate channel groups, yielding N first channel groups and M second channel groups.

It should be noted that the above embodiments are described in terms of grouping the plurality of channels. In another embodiment, after the plurality of channels are grouped, channel information is also generated for each channel group.

In some embodiments, the channel information includes at least one of the following:

(1)包括两个声道的声道分组的个数;(1) The number of channel groups including two channels;

在一些实施例中,包括两个声道的声道分组的个数也可以称为第二声道分组数量,或者,第二声道分组数,或者,声道组对数量,或者,分组数量等,本公开实施例不做限定。In some embodiments, the number of channel groups including two channels may also be referred to as the number of second channel groups, or the number of second channel groups, or the number of channel group pairs, or the number of groups. etc., the embodiments of this disclosure are not limiting.

在一些实施例中,包括两个声道的声道分组的个数用于表示当前帧的第二声道分组对数量。In some embodiments, the number of channel groupings including two channels is used to represent the number of second channel grouping pairs of the current frame.

(2)包括两个声道的声道分组中包括的声道的声道标识;(2) Channel identifiers of channels included in a channel grouping that includes two channels;

在一些实施例中,包括两个声道的声道分组中包括的声道的声道标识用于表示声道对的索引,可解析得到当前声道对中的两个声道的索引值。In some embodiments, the channel identifiers of the channels included in the channel group including two channels are used to represent the index of the channel pair, and the index values of the two channels in the current channel pair can be parsed to obtain.

(3)包括两个声道的声道分组的能量参数;(3) Energy parameters of channel groupings including two channels;

(4)包括三个声道的声道分组的个数;(4) The number of channel groups including three channels;

在一些实施例中,包括三个声道的声道分组的个数也可以称为第一声道分组数量,或者,第一声道分组数,或者,声道组对数量,或者,分组数量等,本公开实施例不做限定。In some embodiments, the number of channel groups including three channels may also be referred to as the number of first channel groups, or the number of first channel groups, or the number of channel group pairs, or the number of groups. etc., the embodiments of this disclosure are not limiting.

在一些实施例中,包括三个声道的声道分组的个数用于表示当前帧的第一声道分组对数量。In some embodiments, the number of channel groupings including three channels is used to represent the number of first channel grouping pairs of the current frame.

(5)包括三个声道的声道分组中包括的声道的声道标识;(5) Channel identifiers of the channels included in the channel grouping including three channels;

在一些实施例中,包括三个声道的声道分组中包括的声道的声道标识用于表示声道对的索引,可解析得到当前声道对中的三个声道的索引值。In some embodiments, the channel identifiers of the channels included in the channel grouping including three channels are used to represent the index of the channel pair, and the index values of the three channels in the current channel pair can be parsed to obtain.

(6)包括三个声道的声道分组的能量参数;(6) Energy parameters of channel groupings including three channels;

其中,能量参数用于对声道分组中声道的能量调整。Among them, the energy parameter is used to adjust the energy of the channels in the channel grouping.

In some embodiments, the energy parameter is the quantization index of the inter-channel amplitude difference (ILD) parameter between the first channel and the second channel of the current channel pair, and is used for inter-channel energy/amplitude adjustment.
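The six items above can be pictured as one per-frame record. The following is a minimal sketch, assuming integer channel indices and one integer energy parameter per group; the class name `ChannelInfo`, all field names, and the example values are hypothetical and are not defined by the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChannelInfo:
    """Per-frame channel information (all field names are hypothetical)."""
    pair_count: int = 0                                                            # item (1)
    pair_channel_ids: List[Tuple[int, int]] = field(default_factory=list)          # item (2)
    pair_energy_params: List[int] = field(default_factory=list)                    # item (3)
    triple_count: int = 0                                                          # item (4)
    triple_channel_ids: List[Tuple[int, int, int]] = field(default_factory=list)   # item (5)
    triple_energy_params: List[int] = field(default_factory=list)                  # item (6)

    def is_consistent(self) -> bool:
        """Counts must match the identifier lists and no channel may be reused."""
        used = [c for g in self.pair_channel_ids + self.triple_channel_ids for c in g]
        return (self.pair_count == len(self.pair_channel_ids)
                and self.triple_count == len(self.triple_channel_ids)
                and len(used) == len(set(used)))


# Assumed channel order for illustration: L=0, R=1, C=2, LS=3, RS=4.
info = ChannelInfo(
    pair_count=1, pair_channel_ids=[(3, 4)], pair_energy_params=[7],
    triple_count=1, triple_channel_ids=[(0, 1, 2)], triple_energy_params=[5],
)
```

Because the channel information only needs to include at least one of the items, every field here defaults to empty.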

In some embodiments, the multiple channels are grouped to directly obtain N first channel groups and M second channel groups, and channel information is generated for the first channel groups and the second channel groups.

In some embodiments, the multiple channels are grouped to obtain multiple candidate channel groups, each including two channels, and at least one independent channel, and channel information is generated for the candidate channel groups and the at least one independent channel. One of the at least one independent channel is then assigned to one of the candidate channel groups, yielding N first channel groups and M second channel groups, and the channel information of the candidate channel groups and the at least one independent channel is rewritten to obtain the channel information of the first channel groups and the second channel groups.

Optionally, the multiple channels are grouped to obtain M+1 candidate channel groups, each including two channels, and at least one independent channel, and channel information is generated for the candidate channel groups and the at least one independent channel. One of the at least one independent channel is then assigned to one of the candidate channel groups, yielding N first channel groups and M second channel groups.

It should be noted that grouping multiple channels to obtain a first channel group may be referred to as 3-channel sum-difference coding, and grouping multiple channels to obtain a second channel group may be referred to as 2-channel sum-difference coding.

It should be noted that the operations in the embodiments of the present disclosure are performed by a decision module or by another module; the embodiments of the present disclosure do not limit this.

Step S2102: the encoder downmixes the audio signals in each channel group to obtain downmixed audio signals.

In some embodiments, downmixing means mixing the channels of a group using an orthonormal matrix to obtain the mixed channels of that group.

Optionally, the orthonormal matrix is preset; the embodiments of the present disclosure do not limit the orthonormal matrix.

For example, for a channel group that includes two channels, the orthonormal matrix used is $M^{ms}_{2ch}$, where the first row is the sum vector and the second row is the difference vector.

For a channel group that includes three channels, the orthonormal matrix used is $M^{ms}_{3ch}$, where the first row is the sum vector and the second and third rows are the difference vectors.

In some embodiments, $M^{ms}_{3ch} I^{ms}_{3ch} = O^{ms}_{3ch}$ and $M^{ms}_{2ch} I^{ms}_{2ch} = O^{ms}_{2ch}$. Here $I^{ms}_{3ch}$ is a three-element column vector holding the audio data of a channel group that includes three channels, and $I^{ms}_{2ch}$ is a two-element column vector holding the audio data of a channel group that includes two channels.
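The downmix relation can be illustrated numerically. The entries of the two matrices are not reproduced in this text, so the sketch below assumes one common choice of orthonormal sum/difference matrices; `M_2CH`, `M_3CH`, and the helper names are hypothetical, and the actual preset matrices may differ.

```python
import math

R2, R3, R6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Assumed matrices: first row is the sum vector, remaining rows are difference vectors.
M_2CH = [[1 / R2,  1 / R2],
         [1 / R2, -1 / R2]]
M_3CH = [[1 / R3,  1 / R3,  1 / R3],
         [1 / R2, -1 / R2,  0.0],
         [1 / R6,  1 / R6, -2 / R6]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def is_orthonormal(matrix, tol=1e-12):
    """Rows must have unit length and be mutually orthogonal."""
    for i, row_i in enumerate(matrix):
        for j, row_j in enumerate(matrix):
            dot = sum(a * b for a, b in zip(row_i, row_j))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


def downmix(group_samples):
    """Compute O = M I for one sample (or frequency bin) of a 2- or 3-channel group."""
    matrix = {2: M_2CH, 3: M_3CH}[len(group_samples)]
    return matvec(matrix, group_samples)


out3 = downmix([1.0, 1.0, 1.0])   # identical channels: energy goes to the sum row
out2 = downmix([0.5, -0.5])       # opposite channels: energy goes to the difference row
```

Identical inputs leave the difference outputs at zero, which is the redundancy reduction that sum-difference coding aims for.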

It should be noted that the steps in the embodiments of the present disclosure are performed by a downmix module or in another manner; the embodiments of the present disclosure do not limit this.

Step S2103: the encoder encodes the downmixed audio signals to obtain an audio stream.

In some embodiments, encoding includes bit allocation, quantization, entropy coding, and bitstream multiplexing.

The scheme carried out by the above steps is illustrated below by way of example.

For example, referring to FIG. 2B, the decision module (step S2101) is executed to decide which sum-difference coding method, or which combination of the two, to use. The channels are the L, R, C, LS, and RS channels, and the similarity between any two of them is shown in Table 1.

Table 1

      L     R     C     LS    RS
L     --    0.85  0.74  0.43  0.36
R     --    --    0.78  0.37  0.32
C     --    --    --    0.39  0.31
LS    --    --    --    --    0.66
RS    --    --    --    --    --

The downmix threshold is 0.5. The decision module determines which sum-difference coding method to use by means of an algorithm.

Here, 3CH M/S refers to the module in which three channels are assigned to a first channel group, and 2CH M/S refers to the module in which two channels are assigned to a second channel group.

1. Maximum-similarity iteration selects the first channel pair, the L and R channels (maximum similarity cox_L_R=0.85), and the second channel pair, the LS and RS channels (maximum similarity among the remaining channels not yet downmixed, cox_LS_RS=0.66).

2. The similarity is computed between every channel that has not been downmixed (here the C channel) and the two channels of the first channel pair and of the second channel pair.

The similarity between the C channel and the L channel, cox_C_L=0.74, is greater than the downmix threshold, and the similarity between the C channel and the R channel, cox_C_R=0.78, is greater than the downmix threshold.

3. The 3CH M/S and 2CH M/S decision results are output. In this embodiment, the 3CH M/S decision outputs the L, R, and C channels, and the 2CH M/S decision outputs the LS and RS channels.
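Steps 1 to 3 can be sketched in code using the values of Table 1. The text leaves the decision algorithm open, so this is only one possible reading: it assumes that a pair is formed only when its similarity exceeds the downmix threshold and that a leftover channel joins a pair only when its similarity with both members exceeds the threshold. The function names are hypothetical.

```python
from itertools import combinations

THRESHOLD = 0.5
SIM = {frozenset(k): v for k, v in {
    ("L", "R"): 0.85, ("L", "C"): 0.74, ("L", "LS"): 0.43, ("L", "RS"): 0.36,
    ("R", "C"): 0.78, ("R", "LS"): 0.37, ("R", "RS"): 0.32,
    ("C", "LS"): 0.39, ("C", "RS"): 0.31, ("LS", "RS"): 0.66,
}.items()}


def greedy_pairs(channels, sim, threshold):
    """Step 1: repeatedly pair the two most similar channels still unpaired."""
    remaining, pairs = list(channels), []
    while len(remaining) >= 2:
        best = max(combinations(remaining, 2), key=lambda p: sim[frozenset(p)])
        if sim[frozenset(best)] <= threshold:
            break
        pairs.append(best)
        remaining = [c for c in remaining if c not in best]
    return pairs, remaining


def attach_leftover(pairs, leftovers, sim, threshold):
    """Steps 2-3: extend one pair into a three-channel group when possible."""
    best = None  # (largest similarity, index of the pair, leftover channel)
    for ch in leftovers:
        for idx, pair in enumerate(pairs):
            sims = [sim[frozenset((ch, member))] for member in pair]
            if all(s > threshold for s in sims) and (best is None or max(sims) > best[0]):
                best = (max(sims), idx, ch)
    if best is None:
        return None, pairs
    _, idx, ch = best
    return pairs[idx] + (ch,), pairs[:idx] + pairs[idx + 1:]


pairs, leftovers = greedy_pairs(["L", "R", "C", "LS", "RS"], SIM, THRESHOLD)
triple, remaining_pairs = attach_leftover(pairs, leftovers, SIM, THRESHOLD)
```

Under these assumptions the C channel joins the L/R pair, because 0.74 and 0.78 both exceed 0.5, while its similarities with LS and RS (0.39 and 0.31) do not.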

Subsequently, the downmix module is executed.

$M^{ms}_{2ch} I^{ms}_{2ch} = O^{ms}_{2ch}$

$M^{ms}_{3ch} I^{ms}_{3ch} = O^{ms}_{3ch}$

Here, the two-channel downmix matrix $M^{ms}_{2ch}$ is a 2x2 orthonormal matrix whose first row is the sum vector and whose second row is the difference vector. The three-channel downmix matrix $M^{ms}_{3ch}$ is a 3x3 orthonormal matrix whose first row is the sum vector and whose second and third rows are the difference vectors.

$I^{ms}_{2ch}$ is a two-element column vector and $I^{ms}_{3ch}$ is a three-element column vector. The vectors contain preprocessed audio data, one time-domain sample or one frequency bin at a time.

Channel information is generated for each channel group.

As another example, referring to FIG. 2C, the two-channel pairing decision is executed to obtain the two-channel pairing side information, which includes the number of pairs and the channel pair indices.

The channels are the L, R, C, LS, and RS channels, the similarity between any two of them is shown in Table 1, and the downmix threshold is 0.5.

The two-channel pairing decision uses maximum-similarity iteration to select the first channel pair, the L and R channels (maximum similarity cox_L_R=0.85), and the second channel pair, the LS and RS channels (maximum similarity among the remaining channels not yet downmixed, cox_LS_RS=0.66).

The resulting number of second channel groups is 2: the first contains the L and R channels, and the second contains the LS and RS channels.

Next, the three-channel pairing decision is executed. If a three-channel group is produced, the above channel grouping result is revised.

First, the similarity is computed between every channel that has not been downmixed (here the C channel) and the two channels of the first channel pair and of the second channel pair.

The similarity between the C channel and the L channel, cox_C_L=0.74, is greater than the downmix threshold, and the similarity between the C channel and the R channel, cox_C_R=0.78, is greater than the downmix threshold.

The 3CH M/S and 2CH M/S decision results are output. In this embodiment, the 3CH M/S decision outputs the L, R, and C channels. After revision, the 2CH M/S decision outputs one channel group, namely the LS and RS channels.

Channel information is generated for the two-channel and three-channel groups.

The two-channel and three-channel downmix modules are executed.

$M^{ms}_{2ch} I^{ms}_{2ch} = O^{ms}_{2ch}$

$M^{ms}_{3ch} I^{ms}_{3ch} = O^{ms}_{3ch}$

Here, the two-channel downmix matrix $M^{ms}_{2ch}$ is a 2x2 orthonormal matrix whose first row is the sum vector and whose second row is the difference vector. The three-channel downmix matrix $M^{ms}_{3ch}$ is a 3x3 orthonormal matrix whose first row is the sum vector and whose second and third rows are the difference vectors.

The above channel information is rewritten to obtain new channel information.
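The rewriting step can be sketched as follows for the FIG. 2C example. The dictionary layout and the function name `rewrite_channel_info` are hypothetical; the disclosure does not specify how the side information is stored.

```python
def rewrite_channel_info(info, triple):
    """Return new side information after one pair was extended into a three-channel group.

    Pairs whose channels are all contained in the three-channel group are dropped.
    """
    kept = [pair for pair in info["pairs"] if not set(pair) <= set(triple)]
    return {
        "pair_count": len(kept),
        "pairs": kept,
        "triple_count": 1,
        "triples": [tuple(triple)],
    }


# Side information produced by the two-channel pairing decision.
initial = {"pair_count": 2, "pairs": [("L", "R"), ("LS", "RS")]}
# Result of the three-channel pairing decision: C joins the L/R pair.
rewritten = rewrite_channel_info(initial, ("L", "R", "C"))
```

After rewriting, the number of two-channel groups drops from 2 to 1, consistent with the revised 2CH M/S decision result described above.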

In some embodiments, the encoder obtains the audio stream by performing the above steps. Referring to FIG. 2D, after the channel signals are acquired, they are preprocessed; the preprocessed channel signals then undergo inter-channel pairing downmix, followed by bit allocation, quantization, and entropy coding, and finally bitstream multiplexing, to obtain the audio stream.

Preprocessing includes transient detection and window type decision, time-frequency transform, frequency-domain noise shaping, time-domain noise shaping, bandwidth extension coding, and similar processes. Inter-channel pairing downmix includes three-channel pairing downmix of the L, R, and C channels to obtain the M1, S11, and S12 channels, and two-channel pairing downmix of the LS and RS channels to obtain the M2 and S2 channels; the LFE channel is not processed.

Step S2104: the encoder sends the first information and the audio stream.

In some embodiments, the decoder receives the first information and the audio stream.

In some embodiments, the encoder may send the first information and the audio stream separately. For example, the encoder sends the first information first and then the audio stream, or sends the audio stream first and then the first information. In some embodiments, the encoder may send the first information and the audio stream at the same time.

In some embodiments, the channel information includes at least one of the following:

the number of channel groups that include two channels;

the channel identifiers of the channels in a channel group that includes two channels;

the energy parameter of a channel group that includes two channels;

the number of channel groups that include three channels;

the channel identifiers of the channels in a channel group that includes three channels;

the energy parameter of a channel group that includes three channels;

where the energy parameter is used to adjust the energy of the channels in the channel group.

Step S2105: the decoder decodes the first information to obtain the channel information of at least one channel group.

Step S2106: when the decoder determines that the channel groups do not include a channel group of three channels, it performs two-channel upmixing on the audio stream; when it determines that the channel groups include a channel group of three channels, it performs three-channel upmixing on the audio stream.

In some embodiments, the decoder determines whether the encoder has performed two-channel downmixing and three-channel downmixing. If the encoder performed two-channel downmixing, the decoder performs two-channel upmixing. If the encoder performed three-channel downmixing, the decoder performs three-channel upmixing.
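The decoder-side choice between two-channel and three-channel upmixing can be sketched as below. Because the downmix matrix is orthonormal, its inverse is its transpose, so upmixing can multiply by the transposed matrix. The matrix entries, `M_2CH`, `M_3CH`, and the helper names are assumptions made for illustration; the text does not give the actual matrices.

```python
import math

R2, R3, R6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Assumed orthonormal downmix matrices (sum row first, then difference rows).
M_2CH = [[1 / R2,  1 / R2],
         [1 / R2, -1 / R2]]
M_3CH = [[1 / R3,  1 / R3,  1 / R3],
         [1 / R2, -1 / R2,  0.0],
         [1 / R6,  1 / R6, -2 / R6]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def upmix(downmixed):
    """Pick two- or three-channel upmixing from the group size and invert the downmix."""
    matrix = {2: M_2CH, 3: M_3CH}[len(downmixed)]
    return matvec(transpose(matrix), downmixed)


# Round trip for one sample of a three-channel group and one of a two-channel group.
original3 = [0.3, -0.2, 0.7]
restored3 = upmix(matvec(M_3CH, original3))
original2 = [0.1, 0.4]
restored2 = upmix(matvec(M_2CH, original2))
```

In a real decoder the group size would come from the decoded channel information rather than from the length of the input, and quantization would make the round trip only approximate.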

Optionally, audio post-processing includes, but is not limited to, general decoding stages such as inverse time-frequency transform, inverse time-domain noise shaping, inverse frequency-domain noise shaping, and inverse bandwidth extension, as well as decoding that targets particular signal types, such as multi-channel decoding, HOA channel decoding, and object metadata decoding.

In some embodiments, the decoder obtains each channel signal by performing the above steps. Referring to FIG. 2E, after the audio stream is acquired, it undergoes bitstream demultiplexing, followed by bit allocation, inverse quantization, and entropy decoding, inter-channel upmixing, and post-processing, to obtain the decoded channel signals.

Post-processing includes bandwidth extension decoding, inverse time-domain noise shaping, inverse frequency-domain noise shaping, inverse time-frequency transform, and similar processes. Inter-channel pairing upmix includes three-channel pairing upmix of the M1, S11, and S12 channels to obtain the L, R, and C channels, and two-channel pairing upmix of the M2 and S2 channels to obtain the LS and RS channels; the LFE channel is not processed.

In some embodiments, the names of information and the like are not limited to the names given in the embodiments; terms such as "information", "message", "signal", "signaling", "report", "configuration", "indication", "instruction", "command", "channel", "parameter", "domain", "field", "symbol", "code element (symbol)", "codebook", "codeword", "codepoint", "bit", "data", "program", and "chip" are interchangeable.

In some embodiments, terms such as "uplink", "uplink link", and "physical uplink" are interchangeable; terms such as "downlink", "downlink link", and "physical downlink" are interchangeable; and terms such as "side", "sidelink", "side communication", "sidelink communication", "direct connection", "direct link", "direct communication", and "direct link communication" are interchangeable.

In some embodiments, "acquire", "obtain", "get", "receive", "transmit", "bidirectional transmission", and "send and/or receive" are interchangeable, and may be interpreted as receiving from another entity, obtaining from a protocol, obtaining from a higher layer, obtaining through the entity's own processing, autonomous implementation, and so on.

In some embodiments, terms such as "send", "emit", "report", "deliver", "transmit", "bidirectional transmission", and "send and/or receive" are interchangeable.

In some embodiments, terms such as "moment", "time point", "time", and "time position" are interchangeable, and terms such as "duration", "period", "time window", "window", and "time" are interchangeable.

In some embodiments, terms such as "certain", "predetermined", "preset", "set", "indicated", "a certain", "any", and "first" are interchangeable. "Certain A", "predetermined A", "preset A", "set A", "indicated A", "a certain A", "any A", and "first A" may be interpreted as an A specified in advance in a protocol or the like, as an A obtained through setting, configuration, or indication, or as a specific A, a certain A, any A, or the first A, but are not limited to these.

The grouping method of the embodiments of the present disclosure may include at least one of steps S2101 to S2106. For example, each of steps S2101, S2102, S2103, S2104, S2105, and S2106 may be implemented as an independent embodiment. Steps S2101 and S2102 together may be implemented as an independent embodiment, as may steps S2103 and S2104 together, and steps S2105 and S2106 together. Steps S2101, S2102, S2103, and S2104 together may be implemented as an independent embodiment, as may steps S2101, S2102, S2105, and S2106 together, and steps S2103, S2104, S2105, and S2106 together, but the method is not limited to these combinations.

In some embodiments, step S2101 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, step S2102 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, step S2103 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, step S2104 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, step S2105 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, step S2106 is optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, steps S2101 and S2102 are optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, steps S2103 and S2104 are optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, steps S2105 and S2106 are optional, and one or more of these steps may be omitted or replaced in different embodiments.

In some embodiments, reference may be made to other optional implementations described before or after the part of the description corresponding to FIG. 2.

FIG. 3A is a schematic flowchart of a grouping method according to an embodiment of the present disclosure, applied to an encoder. As shown in FIG. 3A, the embodiment of the present disclosure relates to a grouping method that includes:

Step S3101: the encoder groups multiple channels to obtain at least one channel group.

For optional implementations of step S3101, see the optional implementations of step S2101 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

Step S3102: the encoder downmixes the audio signals in each channel group to obtain downmixed audio signals.

For optional implementations of step S3102, see the optional implementations of step S2102 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

Step S3103: the encoder encodes the downmixed audio signals to obtain an audio stream.

For optional implementations of step S3103, see the optional implementations of step S2103 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

Step S3104: the encoder sends the first information and the audio stream.

For optional implementations of step S3104, see the optional implementations of step S2104 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

The grouping method of the embodiments of the present disclosure may include at least one of steps S3101 to S3104. For example, each of steps S3101, S3102, S3103, and S3104 may be implemented as an independent embodiment, or at least two of the steps may be combined, but the method is not limited to this.

In some embodiments, step S3101 is optional, step S3102 is optional, step S3103 is optional, and step S3104 is optional; one or more of these steps may be omitted or replaced in different embodiments, but the method is not limited to this.

FIG. 3B is a schematic flowchart of a grouping method according to an embodiment of the present disclosure, applied to an encoder. As shown in FIG. 3B, the embodiment of the present disclosure relates to a grouping method that includes:

Step S3201: the encoder groups multiple channels to obtain at least one channel group.

For optional implementations of step S3201, see step S2101 in FIG. 2, step S3101 in FIG. 3A, and other related parts of the embodiments of FIG. 2 and FIG. 3A; they are not repeated here.

FIG. 4A is a schematic flowchart of a grouping method according to an embodiment of the present disclosure, applied to a decoder. As shown in FIG. 4A, the embodiment of the present disclosure relates to a grouping method that includes:

Step S4101: the decoder decodes the first information to obtain the channel information of at least one channel group.

For optional implementations of step S4101, see step S2105 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

Step S4102: when the decoder determines that the channel groups do not include a channel group of three channels, it performs two-channel upmixing on the audio stream; when it determines that the channel groups include a channel group of three channels, it performs three-channel upmixing on the audio stream.

For optional implementations of step S4102, see step S2106 in FIG. 2 and other related parts of the embodiments of FIG. 2; they are not repeated here.

The grouping method of the embodiments of the present disclosure may include at least one of steps S4101 and S4102. For example, step S4101 may be implemented as an independent embodiment, step S4102 may be implemented as an independent embodiment, or the two steps may be combined, but the method is not limited to this.

In some embodiments, step S4101 is optional and step S4102 is optional; one or more of these steps may be omitted or replaced in different embodiments, but the method is not limited to this.

图4B是根据本公开实施例示出的分组方法的流程示意图,应用于解码器,如图4B所示,本公开实施例涉及分组方法,上述方法包括:Figure 4B is a schematic flowchart of a grouping method according to an embodiment of the present disclosure, applied to a decoder. As shown in Figure 4B, the embodiment of the present disclosure relates to a grouping method, and the above method includes:

步骤S4201,解码器对第一信息进行解码,得到至少一个声道分组的声道信息。Step S4201: The decoder decodes the first information to obtain channel information of at least one channel group.

步骤S4201的可选实现方式可以参见图2的步骤S2105及图2所涉及的实施例中其他关联部分,此处不再赘述。For optional implementation methods of step S4201, please refer to step S2105 in Figure 2 and other related parts in the embodiment involved in Figure 2, which will not be described again here.

In some embodiments, the at least one channel group includes N first channel groups and M second channel groups, where each first channel group includes three channels, each second channel group includes two channels, N is 1, and M is a non-negative integer.

In some embodiments, the channel information includes at least one of the following:

the number of channel groups that include two channels;

the channel identifiers of the channels in a channel group that includes two channels;

the energy parameters of a channel group that includes two channels;

the number of channel groups that include three channels;

the channel identifiers of the channels in a channel group that includes three channels;

the energy parameters of a channel group that includes three channels;

where the energy parameter is used to adjust the energy of the channels in a channel group.
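As an illustration only, the channel information listed above could be held in a structure like the following Python sketch. The class and field names (`ChannelGroupInfo`, `ChannelInfo`, `channel_ids`, `energy_param`) are hypothetical and are not taken from the disclosure, which does not define a bitstream syntax in this passage.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChannelGroupInfo:
    """Side information for one channel group (hypothetical field names)."""
    channel_ids: Tuple[int, ...]  # identifiers of the channels in this group
    energy_param: float = 1.0     # used to adjust the energy of the group's channels


@dataclass
class ChannelInfo:
    """Channel information recovered from the first information (assumed layout)."""
    groups: List[ChannelGroupInfo] = field(default_factory=list)

    def count(self, size: int) -> int:
        """Return the number of channel groups that contain exactly `size` channels."""
        return sum(1 for g in self.groups if len(g.channel_ids) == size)


# Example: one three-channel group and one two-channel group (N = 1, M = 1).
info = ChannelInfo([
    ChannelGroupInfo(channel_ids=(0, 1, 2), energy_param=0.8),
    ChannelGroupInfo(channel_ids=(3, 4)),
])
```

Here `info.count(3)` gives the number of three-channel groups and `info.count(2)` the number of two-channel groups, which correspond to the two counts named in the list.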

In some embodiments, the method further includes:

when it is determined that the channel groups do not include a channel group of three channels, performing two-channel upmixing on the audio stream.

In some embodiments, the method further includes:

when it is determined that the channel groups include a channel group of three channels, performing three-channel upmixing on the audio stream.
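The two conditions above amount to a simple dispatch on the decoded group sizes. The following Python sketch shows that decision only; the function name `select_upmix` and the string labels are assumptions made for illustration, and the upmixing itself is not modeled.

```python
from typing import Sequence


def select_upmix(group_sizes: Sequence[int]) -> str:
    """Pick the upmix path from the sizes of the decoded channel groups.

    Returns "three-channel" if any group has three channels, otherwise
    "two-channel". The labels are illustrative, not part of the disclosure.
    """
    if any(size == 3 for size in group_sizes):
        return "three-channel"
    return "two-channel"
```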

Figure 5 is a schematic flowchart of a grouping method according to an embodiment of the present disclosure. As shown in Figure 5, the method includes:

Step S5101: The encoder groups multiple channels to obtain at least one channel group.

In some embodiments, the at least one channel group includes N first channel groups and M second channel groups, where each first channel group includes three channels, each second channel group includes two channels, N is 1, and M is a non-negative integer.

Step S5102: The decoder decodes the first information to obtain channel information of the at least one channel group.

For optional implementations of step S5101, refer to step S2101 in Figure 2, step S3101 in Figure 3A, and the other related parts of the embodiments shown in Figure 2 and Figure 4A; details are not repeated here.

For optional implementations of step S5102, refer to step S2105 in Figure 2, step S4101 in Figure 4A, and the other related parts of the embodiments shown in Figure 2 and Figure 3A; details are not repeated here.

In some embodiments, the above method may include the methods of the embodiments described above for the codec system side, the encoder side, the decoder side, and so on; details are not repeated here.
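One way the grouping of step S5101 could be realized is the similarity-based procedure recited in claims 2 to 4: repeatedly pair the two most similar ungrouped channels, then attach a leftover independent channel to a candidate pair if it is sufficiently similar to both of its channels. The Python sketch below follows that outline under stated assumptions: the function name `group_channels`, the caller-supplied `similarity` function, and the choice to test candidate pairs in the order they were formed are all illustrative and not specified by the disclosure.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple


def group_channels(
    channels: Sequence[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> Tuple[List[Tuple[str, ...]], Optional[str]]:
    """Greedy grouping; returns (groups, leftover independent channel or None)."""
    remaining = list(channels)
    groups: List[Tuple[str, ...]] = []
    # Repeatedly take the most similar pair among the channels not yet grouped.
    while len(remaining) >= 2:
        a, b = max(combinations(remaining, 2), key=lambda p: similarity(*p))
        groups.append((a, b))
        remaining.remove(a)
        remaining.remove(b)
    if not remaining:
        return groups, None
    lone = remaining[0]
    # Attach the independent channel to the first candidate pair (assumed order)
    # whose channels are both more similar to it than the threshold.
    for i, pair in enumerate(groups):
        if all(similarity(lone, ch) > threshold for ch in pair):
            groups[i] = pair + (lone,)
            return groups, None
    return groups, lone
```

For five channels L, R, C, LS, and RS in which L/R and LS/RS are the most similar pairs and C exceeds the threshold against both L and R, the sketch yields one three-channel group (L, R, C) and one two-channel group (LS, RS), that is, N = 1 and M = 1.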

Figure 6 is a schematic flowchart of a grouping method according to an embodiment of the present disclosure. As shown in Figure 6, the method includes:

Step S6101: The encoding end encodes multiple channels using a combination of two-channel sum-difference coding and three-channel sum-difference coding.

After completing audio preprocessing, the encoding end enters the inter-channel pairing downmix module. Audio preprocessing includes, but is not limited to, general coding stages such as transient analysis, time-frequency transform, temporal noise shaping, frequency-domain noise shaping, and bandwidth extension, as well as processing targeted at particular signal characteristics, such as multichannel coding, HOA channel coding, and object metadata coding.

The inter-channel pairing downmix module introduces three-channel sum-difference coding and uses it in combination with the two-channel sum-difference coding framework. The module includes a decision module and a downmix module.

The decision module determines which sum-difference coding method, or which combination of methods, to use. The decision criterion is the inter-channel correlation, which is compared with a pairing threshold. The decision result is that the three channels L, R, and C undergo three-channel sum-difference coding, LS and RS undergo two-channel sum-difference coding, and the LFE channel is not processed.
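The comparison made by the decision module can be sketched as follows. The text only states that an inter-channel correlation is compared with a pairing threshold; the specific measure used here (the absolute normalized cross-correlation of two time-aligned frames) and the names `normalized_correlation` and `should_pair` are assumptions for illustration.

```python
import math
from typing import Sequence


def normalized_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute normalized cross-correlation of two equal-length frames, in [0, 1]."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    # A silent frame has no defined correlation; treat it as uncorrelated.
    return abs(num) / den if den > 0.0 else 0.0


def should_pair(x: Sequence[float], y: Sequence[float], pairing_threshold: float) -> bool:
    """Return True when the two channels are correlated enough to be grouped."""
    return normalized_correlation(x, y) > pairing_threshold
```

Taking the absolute value treats strongly anti-correlated channels as candidates for pairing as well; whether the disclosed decision does the same is not stated in this passage.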

The downmix module applies three-channel sum-difference coding to the L, R, and C channels and two-channel sum-difference coding to the LS and RS channels.

After the inter-channel pairing downmix module, the channels produced by three-channel sum-difference downmixing (M1, S11, and S12), the channels produced by two-channel sum-difference downmixing (M2 and S2), and the channel that was not downmixed (LFE) all pass through the bit allocation, quantization, and entropy coding module and are then multiplexed into the encoded bitstream E.
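To make the channel names M1, S11, S12, M2, and S2 concrete, the Python sketch below pairs each downmix with an inverse. The two-channel transform is the conventional mid/side form. The three-channel transform is purely an assumption chosen because it is invertible (M1 is the mean of the three channels, S11 and S12 are the deviations of the first two channels from that mean); this passage does not give the disclosure's actual formulas, which may differ, and the energy parameters are not modeled.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def downmix2(a: Frame, b: Frame) -> Tuple[List[float], List[float]]:
    """Two-channel sum-difference: M2 = (a + b) / 2, S2 = (a - b) / 2."""
    m = [(x + y) / 2 for x, y in zip(a, b)]
    s = [(x - y) / 2 for x, y in zip(a, b)]
    return m, s


def upmix2(m: Frame, s: Frame) -> Tuple[List[float], List[float]]:
    """Inverse of downmix2: a = M2 + S2, b = M2 - S2."""
    return [x + y for x, y in zip(m, s)], [x - y for x, y in zip(m, s)]


def downmix3(l: Frame, r: Frame, c: Frame) -> Tuple[List[float], List[float], List[float]]:
    """ASSUMED three-channel form: M1 = (l + r + c) / 3, S11 = l - M1, S12 = r - M1."""
    m1 = [(x + y + z) / 3 for x, y, z in zip(l, r, c)]
    s11 = [x - m for x, m in zip(l, m1)]
    s12 = [y - m for y, m in zip(r, m1)]
    return m1, s11, s12


def upmix3(m1: Frame, s11: Frame, s12: Frame) -> Tuple[List[float], List[float], List[float]]:
    """Inverse of the assumed downmix3: c = 3*M1 - l - r = M1 - S11 - S12."""
    l = [m + a for m, a in zip(m1, s11)]
    r = [m + b for m, b in zip(m1, s12)]
    c = [m - a - b for m, a, b in zip(m1, s11, s12)]
    return l, r, c
```

In this sketch the LFE channel would simply bypass both transforms, matching the statement that it is not downmixed.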

In the embodiments of the present disclosure, some or all of the steps and their optional implementations may be combined arbitrarily with some or all of the steps in other embodiments, and may also be combined arbitrarily with the optional implementations of other embodiments.

Embodiments of the present disclosure also provide devices for implementing any of the above methods. For example, a device is provided that includes units or modules for implementing the steps performed by the encoder in any of the above methods. As another example, a further device is provided that includes units or modules for implementing the steps performed by the decoder in any of the above methods.

It should be understood that the division of the above device into units or modules is only a division of logical functions; in an actual implementation, the units or modules may be fully or partially integrated into one physical entity or may be physically separate. In addition, the units or modules in the device may be implemented in the form of software invoked by a processor. For example, the device includes a processor connected to a memory that stores instructions, and the processor invokes the instructions stored in the memory to implement any of the above methods or the functions of the units or modules of the device, where the processor is, for example, a general-purpose processor such as a central processing unit (CPU) or a microprocessor, and the memory is a memory inside or outside the device. Alternatively, the units or modules in the device may be implemented in the form of hardware circuits, and the functions of some or all of the units or modules may be implemented through the design of the hardware circuits, which may be understood as one or more processors. For example, in one implementation, the hardware circuit is an application-specific integrated circuit (ASIC), and the functions of some or all of the above units or modules are implemented through the design of the logical relationships among the components in the circuit. As another example, in another implementation, the hardware circuit may be implemented by a programmable logic device (PLD); taking a field programmable gate array (FPGA) as an example, it may include a large number of logic gate circuits whose interconnections are configured through a configuration file, thereby implementing the functions of some or all of the above units or modules. All units or modules of the above device may be implemented entirely in the form of software invoked by a processor, entirely in the form of hardware circuits, or partly in the form of software invoked by a processor with the remainder implemented in the form of hardware circuits.

In the embodiments of the present disclosure, a processor is a circuit with signal processing capability. In one implementation, the processor may be a circuit capable of reading and executing instructions, such as a central processing unit (CPU), a microprocessor, a graphics processing unit (GPU) (which may be understood as a kind of microprocessor), or a digital signal processor (DSP). In another implementation, the processor may implement certain functions through the logical relationships of a hardware circuit, where those logical relationships are fixed or reconfigurable; for example, the processor is a hardware circuit implemented as an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), such as an FPGA. In a reconfigurable hardware circuit, the process in which the processor loads a configuration file to configure the hardware circuit may be understood as the process in which the processor loads instructions to implement the functions of some or all of the above units or modules. In addition, the processor may be a hardware circuit designed for artificial intelligence, which may be understood as an ASIC, such as a neural network processing unit (NPU), a tensor processing unit (TPU), or a deep learning processing unit (DPU).

Figure 7A is a schematic structural diagram of a codec device according to an embodiment of the present disclosure. As shown in Figure 7A, the codec device 7100 may include at least one of a transceiver module 7101, a processing module 7102, and the like. In some embodiments, the processing module 7102 is configured to group multiple channels to obtain at least one channel group, where the at least one channel group includes N first channel groups and M second channel groups, each first channel group includes three channels, each second channel group includes two channels, N is 1, and M is a non-negative integer. Optionally, the transceiver module 7101 is configured to perform at least one of the communication steps, such as sending and/or receiving, performed by the encoder in any of the above methods (for example, step S2101, but not limited thereto); details are not repeated here. Optionally, the processing module is configured to perform at least one of the other steps performed by the encoder in any of the above methods; details are not repeated here.

Optionally, the processing module 7102 is configured to perform at least one of the communication steps, such as processing, performed by the encoder in any of the above methods; details are not repeated here.

Figure 7B is a schematic structural diagram of a codec device according to an embodiment of the present disclosure. As shown in Figure 7B, the codec device 7200 may include at least one of a transceiver module 7201, a processing module 7202, and the like. In some embodiments, the processing module 7202 is configured to decode the first information to obtain channel information of at least one channel group, where the at least one channel group includes N first channel groups and M second channel groups, each first channel group includes three channels, each second channel group includes two channels, N is 1, and M is a non-negative integer. Optionally, the transceiver module is configured to perform at least one of the communication steps, such as sending and/or receiving, performed by the decoder in any of the above methods (for example, step S2102, but not limited thereto); details are not repeated here.

Optionally, the processing module 7202 is configured to perform at least one of the communication steps, such as processing, performed by the decoder in any of the above methods; details are not repeated here.

In some embodiments, the transceiver module may include a sending module and/or a receiving module, which may be separate or integrated together. Optionally, the terms transceiver module and transceiver may be used interchangeably.

In some embodiments, the processing module may be a single module or may include multiple sub-modules. Optionally, the sub-modules respectively perform all or some of the steps that the processing module is required to perform. Optionally, the terms processing module and processor may be used interchangeably.

Figure 8A is a schematic structural diagram of a communication device 8100 according to an embodiment of the present disclosure. The communication device 8100 may be a decoder (for example, an access network device or a core network device), an encoder (for example, user equipment), a chip, chip system, or processor that supports the decoder in implementing any of the above methods, or a chip, chip system, or processor that supports the encoder in implementing any of the above methods. The communication device 8100 may be used to implement the methods described in the above method embodiments; for details, refer to the descriptions in those embodiments.

As shown in Figure 8A, the communication device 8100 includes one or more processors 8101. The processor 8101 may be a general-purpose processor or a special-purpose processor, for example a baseband processor or a central processing unit. The baseband processor may be used to process communication protocols and communication data, and the central processing unit may be used to control the codec device (for example, a base station, a baseband chip, an encoder device, an encoder device chip, a DU, or a CU), execute programs, and process program data. The communication device 8100 is configured to perform any of the above methods.

In some embodiments, the communication device 8100 further includes one or more memories 8102 for storing instructions. Optionally, all or part of the memory 8102 may be located outside the communication device 8100.

In some embodiments, the communication device 8100 further includes one or more transceivers 8103. When the communication device 8100 includes one or more transceivers 8103, the transceiver 8103 performs at least one of the communication steps, such as sending and/or receiving, in the above methods (for example, step S2101, step S2102, step S2103, and step S2104, but not limited thereto).

In some embodiments, the transceiver may include a receiver and/or a transmitter, which may be separate or integrated together. Optionally, the terms transceiver, transceiver unit, transceiver machine, and transceiver circuit may be used interchangeably; the terms transmitter, transmitting unit, transmitting machine, and transmitting circuit may be used interchangeably; and the terms receiver, receiving unit, receiving machine, and receiving circuit may be used interchangeably.

In some embodiments, the communication device 8100 may include one or more interface circuits 8104. Optionally, the interface circuit 8104 is connected to the memory 8102 and may be used to receive signals from, and send signals to, the memory 8102 or other devices. For example, the interface circuit 8104 may read instructions stored in the memory 8102 and send them to the processor 8101.

The communication device 8100 described in the above embodiments may be a decoder or an encoder, but the scope of the communication device 8100 described in the present disclosure is not limited thereto, and its structure is not limited by Figure 8A. The communication device may be a stand-alone device or part of a larger device. For example, the communication device may be: (1) a stand-alone integrated circuit (IC), a chip, or a chip system or subsystem; (2) a set of one or more ICs, which optionally also includes storage components for storing data and programs; (3) an ASIC, such as a modem; (4) a module that can be embedded in other devices; (5) a receiver, an encoder device, a smart encoder device, a cellular phone, a wireless device, a handset, a mobile unit, a vehicle-mounted device, a decoder, a cloud device, an artificial intelligence device, or the like; (6) others.

Figure 8B is a schematic structural diagram of a chip 8200 according to an embodiment of the present disclosure. Where the communication device 8100 is a chip or a chip system, refer to the schematic structural diagram of the chip 8200 shown in Figure 8B, but the disclosure is not limited thereto.

The chip 8200 includes one or more processors 8201 and is configured to perform any of the above methods.

In some embodiments, the chip 8200 further includes one or more interface circuits 8202. Optionally, the interface circuit 8202 is connected to the memory 8203 and may be used to receive signals from, and send signals to, the memory 8203 or other devices. For example, the interface circuit 8202 may read instructions stored in the memory 8203 and send them to the processor 8201.

In some embodiments, the interface circuit 8202 performs at least one of the communication steps, such as sending and/or receiving, in the above methods, and the processor 8201 performs at least one of the other steps.

In some embodiments, the terms interface circuit, interface, transceiver pin, and transceiver may be used interchangeably.

In some embodiments, the chip 8200 further includes one or more memories 8203 for storing instructions. Optionally, all or part of the memory 8203 may be located outside the chip 8200.

The present disclosure also provides a storage medium storing instructions that, when run on the communication device 8100, cause the communication device 8100 to perform any of the above methods. Optionally, the storage medium is an electronic storage medium. Optionally, the storage medium is a computer-readable storage medium, but it is not limited thereto and may also be a storage medium readable by other devices. Optionally, the storage medium may be a non-transitory storage medium, but it is not limited thereto and may also be a transitory storage medium.

The present disclosure also provides a program product that, when executed by the communication device 8100, causes the communication device 8100 to perform any of the above methods. Optionally, the program product is a computer program product.

The present disclosure also provides a computer program that, when run on a computer, causes the computer to perform any of the above methods.

Claims (21)

1. A grouping method, performed by an encoder, the method comprising:
grouping a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups and M second channel groups, each first channel group comprises three channels, each second channel group comprises two channels, N is 1, and M is a non-negative integer.
2. The method of claim 1, wherein grouping the plurality of channels to obtain at least one channel group comprises:
acquiring the similarity between any two channels of the plurality of channels;
grouping the plurality of channels based on the similarity between any two channels to obtain the at least one channel group.
3. The method of claim 2, wherein grouping the plurality of channels based on the similarity between any two channels to obtain the at least one channel group comprises:
among the plurality of channels, determining the two channels with the greatest similarity as a candidate channel group;
among the channels other than the two channels with the greatest similarity, again determining the two channels with the greatest similarity as a candidate channel group, and repeating until one independent channel or no channel remains;
determining the at least one channel group based on the resulting candidate channel groups and/or the independent channel.
4. The method according to claim 3, wherein determining the at least one channel group based on the resulting candidate channel groups and/or the independent channel comprises:
obtaining the similarity between the independent channel and each channel in each candidate channel group;
when the similarity between the independent channel and each channel in a first candidate channel group is greater than a similarity threshold, determining the independent channel and the channels in the first candidate channel group as one first channel group;
determining each remaining candidate channel group as one second channel group.
5. The method of claim 1, wherein grouping the plurality of channels to obtain at least one channel group comprises:
grouping the plurality of channels to directly obtain the N first channel groups and the M second channel groups.
6. The method of claim 5, wherein grouping the plurality of channels to directly obtain the N first channel groups and the M second channel groups comprises:
performing a global search over the plurality of channels, and grouping the plurality of channels according to the similarity between any two channels to obtain the N first channel groups and the M second channel groups.
7. The method of claim 1, wherein grouping the plurality of channels to obtain at least one channel group comprises:
grouping the plurality of channels to obtain a plurality of candidate channel groups comprising two channels and at least one independent channel;
dividing one independent channel in the at least one independent channel into one candidate channel group in the plurality of candidate channel groups to obtain the N first channel groups and the M second channel groups.
8. The method according to any one of claims 1 to 7, further comprising:
down-mixing the audio signals in each channel group to obtain the down-mixed audio signals;
encoding the down-mixed audio signals to obtain an audio stream.
9. The method according to any one of claims 1 to 8, further comprising:
transmitting first information, wherein the first information is used for indicating channel information of the plurality of channel groups.
10. The method of claim 9, wherein the channel information includes at least one of:
the number of channel groups comprising two channels;
channel identifications of channels included in a channel group including two channels;
energy parameters of a channel group comprising two channels;
the number of channel groups including three channels;
channel identifications of channels included in a channel group including three channels;
energy parameters of a channel group comprising three channels;
wherein the energy parameter is used for adjusting the energy of the channels in the channel group.
11. A grouping method, performed by a decoder, the method comprising:
decoding first information to obtain channel information of at least one channel group, wherein the at least one channel group includes N first channel groups and M second channel groups, each first channel group comprises three channels, each second channel group comprises two channels, N is 1, and M is a non-negative integer.
12. The method of claim 11, wherein the channel information comprises at least one of:
the number of channel groups comprising two channels;
channel identifications of channels included in a channel group including two channels;
energy parameters of a channel group comprising two channels;
the number of channel groups including three channels;
channel identifications of channels included in a channel group including three channels;
energy parameters of a channel group comprising three channels;
wherein the energy parameter is used for adjusting the energy of the channels in the channel group.
13. The method according to claim 11 or 12, characterized in that the method further comprises:
upon determining that the channel groups do not include a channel group of three channels, performing two-channel upmixing on the audio stream.
14. The method according to claim 11 or 12, characterized in that the method further comprises:
upon determining that the channel groups include a channel group of three channels, performing three-channel upmixing on the audio stream.
15. A grouping method, the method comprising:
grouping, by an encoder, a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups and M second channel groups, each first channel group comprises three channels, each second channel group comprises two channels, N is 1, and M is a non-negative integer;
decoding, by a decoder, first information to obtain channel information of the at least one channel group, wherein at least one channel group of the plurality of channel groups comprises three channels.
16. A codec apparatus, the codec apparatus comprising:
a processing module configured to group a plurality of channels to obtain at least one channel group, wherein the at least one channel group includes N first channel groups and M second channel groups, each first channel group comprises three channels, each second channel group comprises two channels, N is 1, and M is a non-negative integer.
17. A codec apparatus, the codec apparatus comprising:
a processing module configured to decode first information to obtain channel information of at least one channel group, wherein the at least one channel group includes N first channel groups and M second channel groups, each first channel group comprises three channels, each second channel group comprises two channels, N is 1, and M is a non-negative integer.
18. A codec apparatus, the codec apparatus comprising:
one or more processors;
wherein the processor is configured to perform the grouping method of any one of claims 1 to 10.
19. A codec apparatus, the codec apparatus comprising:
one or more processors;
wherein the processor is configured to perform the grouping method of any one of claims 11 to 14.
20. A codec system comprising an encoder configured to implement the grouping method of any one of claims 1 to 10 and a decoder configured to implement the grouping method of any one of claims 11 to 14.
21. A storage medium storing instructions that, when executed on a communication device, cause the communication device to perform the grouping method of any one of claims 1 to 10 or perform the grouping method of any one of claims 11 to 14.
CN202380012056.6A 2023-10-31 2023-10-31 Grouping methods, encoders, decoders, and storage media Pending CN117730367A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2023128800 2023-10-31

Publications (1)

Publication Number Publication Date
CN117730367A true CN117730367A (en) 2024-03-19

Family

ID=90203912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202380012056.6A Pending CN117730367A (en) 2023-10-31 2023-10-31 Grouping methods, encoders, decoders, and storage media

Country Status (1)

Country Link
CN (1) CN117730367A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708868A (en) * 2006-01-20 2012-10-03 微软公司 Complex-transform channel coding with extended-band frequency coding
US20150149187A1 (en) * 2012-08-03 2015-05-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Decoder and method for multi-instance spatial-audio-object-coding employing a parametric concept for multichannel downmix/upmix cases
US20210134304A1 (en) * 2012-09-12 2021-05-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for providing enhanced guided downmix capabilities for 3d audio
CN103400582A (en) * 2013-08-13 2013-11-20 武汉大学 Encoding and decoding method and system for multi-channel three-dimensional voice frequency
CN104240712A (en) * 2014-09-30 2014-12-24 武汉大学深圳研究院 Three-dimensional audio multichannel grouping and clustering coding method and three-dimensional audio multichannel grouping and clustering coding system
CN105828271A (en) * 2015-01-09 2016-08-03 南京青衿信息科技有限公司 Method for converting two-channel signal into three-channel signal
CN106710600A (en) * 2016-12-16 2017-05-24 广州广晟数码技术有限公司 Multi-track audio signal decorrelation coding method and device
US20230106764A1 (en) * 2020-03-09 2023-04-06 Nippon Telegraph And Telephone Corporation Sound signal downmixing method, sound signal coding method, sound signal downmixing apparatus, sound signal coding apparatus, program and recording medium
US20230319498A1 (en) * 2020-03-09 2023-10-05 Nippon Telegraph And Telephone Corporation Sound signal downmixing method, sound signal coding method, sound signal downmixing apparatus, sound signal coding apparatus, program and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination