CN118116397A

CN118116397A - Audio metadata encoding and decoding method, transmission method, encoder terminal and system

Info

Publication number: CN118116397A
Application number: CN202410199133.XA
Authority: CN
Inventors: 关朝洋; 庞超; 焦健波; 周帆; 鹿楠楠; 张南鹏; 曹徐洋; 薛彦欢; 孙璐璐; 万玉鹏; 李勇军; 冀虹
Original assignee: China Media Group
Current assignee: China Media Group
Priority date: 2024-02-22
Filing date: 2024-02-22
Publication date: 2024-05-31

Abstract

The application provides an audio metadata encoding and decoding method, a transmission method, an encoder terminal and a system. Metadata information produced when a user modifies audio parameters is appended to the tail of the original encoded bitstream, so that the new encoded bitstream retains both the original metadata information and the modified metadata information. During audio rendering, if the modified metadata information cannot be realized because of external factors, rendering can still be performed according to the original metadata information, which ensures the flexibility and reliability of audio rendering.

Description

Audio metadata encoding and decoding method, transmission method, encoder terminal and system

Technical Field

The present application relates to the field of audio processing, and in particular to an audio metadata encoding and decoding method, a transmission method, an encoder terminal and a system.

Background

Transparent transmission (pass-through) technology is applicable across a variety of communication protocols and network devices. It can improve the efficiency and stability of data transmission and reduce problems such as packet loss and delay. It also makes the transmission process transparent to upper-layer applications, with no special configuration required, which improves the flexibility and convenience of the system. Because of these advantages, transparent transmission is also widely used in audio metadata transmission.

Currently, the technologies that support interactive metadata modification in transparent transmission scenarios are mainly Dolby MS12 and Fraunhofer MPEG-H. Both have drawbacks, as follows:

Dolby MS12 method: the ES (elementary stream) is decoded into metadata and PCM, and the PCM is then re-encoded into an ES according to the user configuration for transparent output. The drawback is complexity: an additional PCM-to-ES encoding step is required.

Fraunhofer MPEG-H method: the metadata segment in the ES is parsed separately, the metadata is modified according to the user configuration, and the re-encoded metadata replaces the original metadata segment. The drawback is that the original metadata is lost, so decoding and rendering can only follow the modified metadata. If the modified metadata cannot be satisfied in the current scenario, there is no way to compensate.

Summary of the Invention

To address one of the above technical defects, embodiments of the present application provide an audio metadata encoding and decoding method, a transmission method, an encoder terminal and a system.

A first aspect of the embodiments of the present application provides an audio metadata encoding and decoding method, the method comprising:

obtaining a first encoded bitstream, the first encoded bitstream being an original encoded bitstream;

decoding the first encoded bitstream to obtain first metadata information, the first metadata information including audio parameters modifiable by a user;

acquiring second metadata information, the second metadata information being metadata information generated from the set of modified audio parameters after the user modifies audio parameters in the first metadata information;

encoding the second metadata information, and appending the encoded second metadata information to the end of the first encoded bitstream to form a second encoded bitstream.

A second aspect of the embodiments of the present application provides an audio transmission method, the method comprising:

a codec terminal obtaining a first encoded bitstream and decoding it to obtain first metadata information, the first encoded bitstream being an original encoded bitstream, and the first metadata information including audio parameters modifiable by a user;

the codec terminal sending the first metadata information to an audio interactive terminal, and the audio interactive terminal receiving second metadata information generated from the set of modified audio parameters after the user modifies audio parameters in the first metadata information;

the audio interactive terminal sending the second metadata information to the codec terminal, and the codec terminal encoding the second metadata information and appending the encoded second metadata information to the end of the first encoded bitstream to form a second encoded bitstream;

the codec terminal transparently transmitting the second encoded bitstream to a power amplifier, and the power amplifier decoding the second encoded bitstream to obtain the first metadata information and the second metadata information, and outputting sound, according to the first metadata information and the second metadata information, through a speaker connected to the power amplifier.

A third aspect of the embodiments of the present application provides a codec terminal. The codec terminal comprises a processor that stores processor-executable operating instructions for performing the audio metadata encoding and decoding method described in the first aspect of the embodiments of the present application.

A fourth aspect of the embodiments of the present application provides an audio transmission system, the system comprising a speaker, an audio interactive terminal, a power amplifier and a codec terminal;

the codec terminal is configured to obtain a first encoded bitstream and decode it to obtain first metadata information, the first encoded bitstream being an original encoded bitstream, and the first metadata information including audio parameters modifiable by a user;

the audio interactive terminal is configured to receive the first metadata information sent by the codec terminal, and to receive second metadata information generated from the set of modified audio parameters after the user modifies audio parameters in the first metadata information;

the codec terminal is further configured to receive the second metadata information sent by the audio interactive terminal, encode the second metadata information, append the encoded second metadata information to the end of the first encoded bitstream to form a second encoded bitstream, and send the second encoded bitstream to the power amplifier;

the power amplifier is configured to receive the second encoded bitstream sent by the codec terminal, decode it to obtain the first metadata information and the second metadata information, and output sound, according to the first metadata information and the second metadata information, through the speaker connected to the power amplifier.

With the audio metadata encoding and decoding method provided in the embodiments of the present application, the metadata information produced when the user modifies audio parameters is appended to the end of the original encoded bitstream, so that the new encoded bitstream retains both the original metadata information and the modified metadata information. During audio rendering, if the modified metadata information cannot be realized because of external factors, rendering can still be performed according to the original metadata information, which ensures the flexibility and reliability of audio rendering.

Brief Description of the Drawings

The drawings described here provide a further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions explain the present application and do not unduly limit it. In the drawings:

FIG. 1 is a flowchart of the audio metadata encoding and decoding method provided in Embodiment 1 of the present application;

FIG. 2 shows the data format of the first encoded bitstream provided in Embodiment 1 of the present application;

FIG. 3 is a schematic diagram of the operation interface of the audio operation terminal provided in Embodiment 1 of the present application;

FIG. 4 shows the data format of the second encoded bitstream provided in Embodiment 1 of the present application;

FIG. 5 is a flowchart of the audio transmission method provided in Embodiment 2 of the present application;

FIG. 6 is a schematic diagram of the audio transmission system provided in Embodiment 4 of the present application.

Detailed Description

To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list. The embodiments of the present application, and the features within them, can be combined with one another provided they do not conflict.

Embodiment 1

As shown in FIG. 1, this embodiment provides an audio metadata encoding and decoding method, the method comprising:

S101: Obtain a first encoded bitstream.

Specifically, the audio metadata encoding and decoding method proposed in this embodiment is applied to a codec terminal, which may be an audio codec and can perform encoding and decoding operations on audio data. In this embodiment, the codec terminal can be connected to an external device to obtain an encoded bitstream sent by that device. This encoded bitstream is the first encoded bitstream, and can also be regarded as the original encoded bitstream. The original encoded bitstream contains fixed fields, audio-related information and the like. Ways of obtaining the first encoded bitstream include, but are not limited to, obtaining it from a digital television, a local medium or a network link.

S102: Decode the first encoded bitstream to obtain first metadata information.

Specifically, after obtaining the first encoded bitstream, the codec terminal decodes it to obtain the original metadata information it carries, namely the first metadata information. FIG. 2 shows the data format of the first encoded bitstream: a frame header at the head, a metadata segment in the middle, and a bed and object segment at the tail. The metadata segment in the middle carries the first metadata information, which may include audio parameters such as object volume and sound-related information. Some of these audio parameters are fixed parameters, that is, defaults the user cannot change; others are variable parameters, which the user can modify as needed.
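The three-part frame layout of FIG. 2 can be illustrated with a minimal Python sketch. The application does not specify field sizes or a serialization, so the sync word `AUD0`, the two length fields and the JSON encoding of the metadata segment are all assumptions chosen only for illustration.

```python
import json
import struct

# Hypothetical frame header: 4-byte sync word, 2-byte metadata length,
# 4-byte bed/object payload length. None of these sizes come from the source.
SYNC = b"AUD0"
HEADER = struct.Struct(">4sHI")


def build_frame(metadata: dict, payload: bytes) -> bytes:
    """Assemble header + metadata segment + bed/object segment."""
    meta = json.dumps(metadata, ensure_ascii=False).encode("utf-8")
    return HEADER.pack(SYNC, len(meta), len(payload)) + meta + payload


def parse_frame(frame: bytes):
    """Return (first metadata, bed/object payload) from one frame."""
    sync, meta_len, payload_len = HEADER.unpack_from(frame, 0)
    if sync != SYNC:
        raise ValueError("bad sync word")
    meta_start = HEADER.size
    meta_end = meta_start + meta_len
    metadata = json.loads(frame[meta_start:meta_end].decode("utf-8"))
    payload = frame[meta_end:meta_end + payload_len]
    return metadata, payload
```

Carrying explicit lengths in the header lets a decoder locate the metadata segment without touching the bed and object data, which is the property S102 relies on.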

S103: Acquire second metadata information.

Specifically, as noted above, the first metadata information includes audio parameters that the user can modify. When the user modifies such a parameter, the metadata information corresponding to it changes as well, and the metadata information generated after the change is the second metadata information. Note that the second metadata information in this embodiment does not include metadata information for audio parameters that were not changed. The difference between the first metadata information and the second metadata information is explained further below.

In this embodiment, the user can modify the audio parameters through an operation interface. First, after the codec terminal decodes the first encoded bitstream and obtains the first metadata information, the audio parameters related to the first metadata information can be displayed on the operation interface through a UI, as shown in FIG. 3. The audio parameters shown in FIG. 3 are only examples; in practice the layout can be changed freely and parameters can be added or removed. In FIG. 3, the parameters the user can modify include atmosphere, sound object, volume and sound direction. If FIG. 3 shows the audio parameter state corresponding to the first metadata information, then in the first metadata information the atmosphere is standard, the sound object is Chinese commentary, the volume is muted, and the sound direction is set to right. Suppose the user now wants to modify the atmosphere and the sound object, for example changing the atmosphere to the home and away atmosphere and the sound object to English commentary. The atmosphere and the sound object are then the audio parameters the user has selected for modification, and the home and away atmosphere and the English commentary are the corresponding adjustment information. In this embodiment, the atmosphere, the sound object, the home and away atmosphere and the English commentary are combined to generate the second metadata information. The second metadata information therefore contains only the modified audio parameters and their corresponding adjustment information, and does not contain the unmodified audio parameters.
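The rule that the second metadata information holds only the modified parameters can be sketched as a simple difference between the decoded state and the state the user leaves on the operation interface. The function name, the dictionary representation and the parameter keys are assumptions for illustration; the application does not define a data model.

```python
def build_second_metadata(first: dict, user_state: dict, modifiable: set) -> dict:
    """Collect only the user-modifiable parameters whose value changed."""
    second = {}
    for name, value in user_state.items():
        if name not in modifiable:
            continue  # fixed parameters cannot be changed by the user
        if first.get(name) != value:
            second[name] = value  # parameter plus its adjustment information
    return second


# State corresponding to FIG. 3 (keys are hypothetical names).
first = {
    "atmosphere": "standard",
    "sound_object": "Chinese commentary",
    "volume": "mute",
    "sound_direction": "right",
}
# The user switches the atmosphere and the commentary language only.
user_state = dict(first, atmosphere="home and away",
                  sound_object="English commentary")
second = build_second_metadata(first, user_state, set(first))
```

Here `second` contains the atmosphere and sound object entries only; volume and sound direction are left out because they were not modified.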

S104: Encode the second metadata information, and append the encoded second metadata information to the end of the first encoded bitstream to form a second encoded bitstream.

Specifically, after the second metadata information is obtained through the above process, it can be encoded by the encoder terminal. The encoded second metadata information is then appended to the end of the first encoded bitstream to form the second encoded bitstream, as shown in FIG. 4, where the modified metadata segment carries the second metadata information of this embodiment. Because the second metadata information is appended directly to the end of the first encoded bitstream, the first encoded bitstream itself is not changed in any way and retains the original metadata information in full. The second encoded bitstream therefore contains both the first metadata information, which holds the original audio parameters, and the second metadata information, which holds the modified audio parameters. During audio rendering, if the second metadata information cannot be realized because of external factors, rendering can still be performed according to the first metadata information, which ensures the flexibility and reliability of audio rendering.
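The append step of S104 can be sketched as follows. The trailer format is an assumption: the application only states that the encoded second metadata is placed at the tail, so the `MOD0` marker, the 2-byte length field and the JSON encoding are hypothetical. The trailer is written with its length and marker last so that a decoder can find it by reading backwards from the end.

```python
import json
import struct

TAIL_MAGIC = b"MOD0"            # hypothetical end-of-stream marker
TAIL_LEN = struct.Struct(">H")  # hypothetical 2-byte body length


def append_second_metadata(first_stream: bytes, second: dict) -> bytes:
    """Form the second bitstream; the first bitstream bytes are untouched."""
    body = json.dumps(second, ensure_ascii=False, sort_keys=True).encode("utf-8")
    return first_stream + body + TAIL_LEN.pack(len(body)) + TAIL_MAGIC


def split_second_metadata(stream: bytes):
    """Return (first bitstream, second metadata or None if no trailer)."""
    footer = TAIL_LEN.size + len(TAIL_MAGIC)
    if len(stream) < footer or not stream.endswith(TAIL_MAGIC):
        return stream, None
    body_end = len(stream) - footer
    (body_len,) = TAIL_LEN.unpack_from(stream, body_end)
    body_start = body_end - body_len
    if body_start < 0:
        return stream, None
    second = json.loads(stream[body_start:body_end].decode("utf-8"))
    return stream[:body_start], second
```

Because the original bytes are only extended, a receiver that strips the trailer recovers the first encoded bitstream bit for bit. A real format would also need to guard against a payload that happens to end with the marker; this sketch does not.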

In addition, after the user selects the audio parameters to be modified and modifies them, the display of the modified audio parameters on the operation interface can also be updated. For example, as shown in FIG. 3, the atmosphere option is switched from standard to the home and away atmosphere, and the sound object option is switched from Chinese commentary to English commentary, so that the user can see at a glance which audio parameters were selected and how they were adjusted.

Embodiment 2

As shown in FIG. 5, building on Embodiment 1, this embodiment provides an audio transmission method, the method comprising:

S201: A codec terminal obtains a first encoded bitstream and decodes it to obtain first metadata information.

Specifically, in this embodiment the codec terminal can perform encoding and decoding operations on audio data. The codec terminal can be connected to an external device to obtain an encoded bitstream sent by that device. This encoded bitstream is the first encoded bitstream, and can also be regarded as the original encoded bitstream. The original encoded bitstream contains fixed fields, audio-related information and the like. Ways in which the codec terminal obtains the first encoded bitstream include, but are not limited to, a digital television, a local medium or a network link.

After obtaining the first encoded bitstream, the codec terminal decodes it to obtain the original metadata information it carries, namely the first metadata information. The first metadata information may include audio parameters such as object volume and sound-related information. Some of these audio parameters are fixed parameters, that is, defaults the user cannot change; others are variable parameters, which the user can modify as needed.

S202: The codec terminal sends the first metadata information to an audio interactive terminal, and the audio interactive terminal receives second metadata information generated from the set of modified audio parameters after the user modifies audio parameters in the first metadata information.

In this embodiment, the user can modify the audio parameters through an operation interface. First, after the codec terminal decodes the first encoded bitstream and obtains the first metadata information, it sends the first metadata information to the audio interactive terminal. The audio interactive terminal can display the audio parameters related to the first metadata information on its operation interface through a UI, as shown in FIG. 3. The audio parameters shown in FIG. 3 are only examples; in practice the layout can be changed freely and parameters can be added or removed. In FIG. 3, the parameters the user can modify include atmosphere, sound object, volume and sound direction. If FIG. 3 shows the audio parameter state corresponding to the first metadata information, then in the first metadata information the atmosphere is standard, the sound object is Chinese commentary, the volume is muted, and the sound direction is set to right. Suppose the user now wants to modify the atmosphere and the sound object, for example changing the atmosphere to the home and away atmosphere and the sound object to English commentary. The atmosphere and the sound object are then the audio parameters the user has selected for modification, and the home and away atmosphere and the English commentary are the corresponding adjustment information. In this embodiment, the atmosphere, the sound object, the home and away atmosphere and the English commentary are combined to generate the second metadata information. The second metadata information therefore contains only the modified audio parameters and their corresponding adjustment information, and does not contain the unmodified audio parameters.

S203: The audio interactive terminal sends the second metadata information to the codec terminal, and the codec terminal encodes the second metadata information and appends the encoded second metadata information to the end of the first encoded bitstream to form a second encoded bitstream.

Specifically, after obtaining the second metadata information through the above process, the audio interactive terminal sends it to the codec terminal. In this embodiment, the second metadata information can be encoded by the encoder terminal. The encoded second metadata information is then appended to the end of the first encoded bitstream to form the second encoded bitstream. Because the second metadata information is appended directly to the end of the first encoded bitstream, the first encoded bitstream itself is not changed in any way and retains the original metadata information in full. The second encoded bitstream therefore contains both the first metadata information, which holds the original audio parameters, and the second metadata information, which holds the modified audio parameters. During audio rendering, if the second metadata information cannot be realized because of external factors, rendering can still be performed according to the first metadata information, which ensures the flexibility and reliability of audio rendering.

S204: The codec terminal transparently transmits the second encoded bitstream to a power amplifier; the power amplifier decodes the second encoded bitstream to obtain the first metadata information and the second metadata information, and outputs sound, according to the first metadata information and the second metadata information, through a speaker connected to the power amplifier.

Specifically, after the codec terminal encodes the second metadata information and generates the second encoded bitstream, it transparently transmits the second encoded bitstream to the power amplifier. The transparent transmission can be carried out over HDMI, ARC, eARC or S/PDIF; this embodiment places no particular restriction on it. After receiving the second encoded bitstream, the power amplifier can decode it and then output sound, according to the first metadata information and the second metadata information, through the speaker connected to the power amplifier. In this process, the power amplifier can also choose whether to use the first metadata information or the second metadata information for audio rendering, depending on the configuration of the speakers connected to it.
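The choice the power amplifier makes between the two sets of metadata can be sketched as below. The application does not say how the amplifier decides, so the `can_realize` capability callback, the all-or-nothing fallback and the overlay of the second metadata on the first are assumptions made for illustration.

```python
from typing import Callable, Optional


def select_render_metadata(
    first: dict,
    second: Optional[dict],
    can_realize: Callable[[str, object], bool],
) -> dict:
    """Pick the metadata to render with, falling back to the original."""
    if not second:
        return dict(first)  # no user modification was carried in the stream
    for name, value in second.items():
        if not can_realize(name, value):
            return dict(first)  # modified setting not achievable: use original
    # Second metadata holds only changed parameters, so overlay it on the first.
    return {**first, **second}


first = {"atmosphere": "standard", "sound_direction": "right"}
second = {"sound_direction": "rear"}


def stereo_only(name, value):
    """Hypothetical speaker setup that cannot place sound behind the listener."""
    return not (name == "sound_direction" and value == "rear")


fallback = select_render_metadata(first, second, stereo_only)
applied = select_render_metadata(first, second, lambda name, value: True)
```

With the stereo-only check the result equals the first metadata, while a setup that can realize every value renders with the sound direction changed to rear. This fallback is possible only because the second encoded bitstream still carries the original metadata.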

In this embodiment, the metadata information produced when the user modifies audio parameters is appended to the end of the original encoded bitstream, so that the new encoded bitstream retains both the original metadata information and the modified metadata information. During audio rendering, if the modified metadata information cannot be realized because of external factors, rendering can still be performed according to the original metadata information, which ensures the flexibility and reliability of audio rendering.

Embodiment 3

This embodiment provides a codec terminal. The codec terminal comprises a processor that stores processor-executable operating instructions for performing the steps of the audio metadata encoding and decoding method of Embodiment 1.

For the specific working process of the codec terminal proposed in this embodiment, refer to Embodiment 1; it is not repeated here.

Embodiment 4

As shown in FIG. 6, this embodiment provides an audio transmission system comprising a speaker, an audio interactive terminal, a power amplifier and a codec terminal.

For the specific working process of the audio transmission system proposed in this embodiment, refer to Embodiment 2; it is not repeated here.

在本申请的描述中，需要理解的是，术语“中心”、“纵向”、“横向”、“长度”、“宽度”、“厚度”、“上”、“下”、“前”、“后”、“左”、“右”、“竖直”、“水平”、“顶”、“底”“内”、“外”等指示的方位或位置关系为基于附图所示的方位或位置关系，仅是为了便于描述本申请和简化描述，而不是指示或暗示所指的装置或元件必须具有特定的方位、以特定的方位构造和操作，因此不能理解为对本申请的限制。In the description of the present application, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inside", "outside", etc., indicating orientations or positional relationships, are based on the orientations or positional relationships shown in the accompanying drawings, and are only for the convenience of describing the present application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be understood as a limitation on the present application.

此外，术语“第一”、“第二”仅用于描述目的，而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此，限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本申请的描述中，“多个”的含义是至少两个，例如两个，三个等，除非另有明确具体的限定。In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, the features defined as "first" and "second" may explicitly or implicitly include one or more of the features. In the description of this application, the meaning of "plurality" is at least two, such as two, three, etc., unless otherwise clearly and specifically defined.

在本申请中，除非另有明确的规定和限定，术语“安装”、“相连”、“连接”、“固定”等术语应做广义理解，例如，可以是固定连接，也可以是可拆卸连接，或成一体；可以是机械连接，也可以是电连接或可以互相通讯；可以是直接相连，也可以通过中间媒介间接相连，可以是两个元件内部的连通或两个元件的相互作用关系。对于本领域的普通技术人员而言，可以根据具体情况理解上述术语在本申请中的具体含义。In this application, unless otherwise clearly specified and limited, the terms "installed", "connected", "connected", "fixed" and the like should be understood in a broad sense, for example, it can be a fixed connection, a detachable connection, or an integral connection; it can be a mechanical connection, an electrical connection, or can communicate with each other; it can be a direct connection, or an indirect connection through an intermediate medium, it can be the internal connection of two elements or the interaction relationship between two elements. For ordinary technicians in this field, the specific meanings of the above terms in this application can be understood according to specific circumstances.

尽管已描述了本申请的优选实施例，但本领域内的技术人员一旦得知了基本创造性概念，则可对这些实施例作出另外的变更和修改。所以，所附权利要求意欲解释为包括优选实施例以及落入本申请范围的所有变更和修改。Although the preferred embodiments of the present application have been described, those skilled in the art may make other changes and modifications to these embodiments once they have learned the basic creative concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present application.

Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. Thus, if such changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.

Claims

1. An audio metadata encoding and decoding method, characterized in that the method comprises:

obtaining a first encoded code stream, wherein the first encoded code stream is an original encoded code stream;

decoding the first encoded code stream to obtain first metadata information, wherein the first metadata information includes audio parameters modifiable by a user;

acquiring second metadata information, wherein the second metadata information is metadata information generated from the modified audio parameter set after the user modifies the audio parameters in the first metadata information; and

encoding the second metadata information, and appending the encoded second metadata information to the end of the first encoded code stream to form a second encoded code stream.

2. The method according to claim 1, characterized in that obtaining the first encoded code stream comprises obtaining the first encoded code stream from a digital television, a local medium or a network link.

3. The method according to claim 1, characterized in that acquiring the second metadata information comprises:

displaying the first metadata information on an operation interface;

acquiring, through the operation interface, the audio parameters selected by the user and the adjustment information produced by adjusting those audio parameters; and

integrating the audio parameters selected by the user with the corresponding adjustment information to generate the second metadata information.

4. The method according to claim 3, characterized in that, after acquiring, through the operation interface, the audio parameters selected by the user and the adjustment information produced by adjusting those audio parameters, the method further comprises:

adjusting the content displayed on the operation interface for the audio parameters selected by the user, according to those audio parameters and the corresponding adjustment information.

5. An audio transmission method, characterized in that the method comprises:

a codec terminal obtains a first encoded code stream and decodes the first encoded code stream to obtain first metadata information, wherein the first encoded code stream is an original encoded code stream, and the first metadata information includes audio parameters modifiable by a user;

the codec terminal sends the first metadata information to an audio interactive terminal, and the audio interactive terminal receives second metadata information generated from the modified audio parameter set after the user modifies the audio parameters in the first metadata information;

the audio interactive terminal sends the second metadata information to the codec terminal, and the codec terminal encodes the second metadata information and appends the encoded second metadata information to the end of the first encoded code stream to form a second encoded code stream; and

the codec terminal transparently transmits the second encoded code stream to a power amplifier, and the power amplifier decodes the second encoded code stream, obtains the first metadata information and the second metadata information, and outputs sound through a speaker connected to the power amplifier according to the first metadata information and the second metadata information.

6. The method according to claim 5, characterized in that the codec terminal obtains the first encoded code stream from a digital television, a local medium or a network link.

7. The method according to claim 5, characterized in that the audio interactive terminal receiving the second metadata information generated from the modified audio parameter set after the user modifies the audio parameters in the first metadata information comprises:

the audio interactive terminal displays the first metadata information on an operation interface;

the audio interactive terminal acquires, through the operation interface, the audio parameters selected by the user and the adjustment information produced by adjusting those audio parameters; and

the audio interactive terminal integrates the audio parameters selected by the user with the corresponding adjustment information to generate the second metadata information.

8. The method according to claim 7, characterized in that, after the audio interactive terminal acquires, through the operation interface, the audio parameters selected by the user and the adjustment information produced by adjusting those audio parameters, the method further comprises:

the audio interactive terminal adjusts the content displayed on the operation interface for the audio parameters selected by the user, according to those audio parameters and the corresponding adjustment information.

9. The method according to claim 5, characterized in that the codec terminal transparently transmits the second encoded code stream to the power amplifier via HDMI, ARC, eARC or S/PDIF.

10. A codec terminal, characterized in that the codec terminal comprises a processor, wherein the processor stores operating instructions executable by the processor to perform the audio metadata encoding and decoding method according to any one of claims 1 to 4.

11. An audio transmission system, characterized in that the system comprises a speaker, an audio interactive terminal, a power amplifier and a codec terminal;

the codec terminal is configured to obtain a first encoded code stream and decode the first encoded code stream to obtain first metadata information, wherein the first encoded code stream is an original encoded code stream, and the first metadata information includes audio parameters modifiable by a user;

the audio interactive terminal is configured to receive the first metadata information sent by the codec terminal, and to receive second metadata information generated from the modified audio parameter set after the user modifies the audio parameters in the first metadata information;

the codec terminal is further configured to receive the second metadata information sent by the audio interactive terminal, encode the second metadata information, append the encoded second metadata information to the end of the first encoded code stream to form a second encoded code stream, and send the second encoded code stream to the power amplifier; and

the power amplifier is configured to receive the second encoded code stream sent by the codec terminal, decode the second encoded code stream to obtain the first metadata information and the second metadata information, and output sound through the speaker connected to the power amplifier according to the first metadata information and the second metadata information.
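
By way of illustration only, the append-and-recover flow recited in claims 1 and 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the encoding defined by the application: the trailer layout (payload, a 4-byte marker `UMD1`, then a 4-byte big-endian length), the use of JSON for the second metadata information, and the names `append_second_metadata`, `split_second_stream`, `choose_render_params` and `dialog_gain_db` are all hypothetical. The fallback in `choose_render_params` follows the behaviour described in the abstract, where rendering reverts to the original metadata if the modified metadata cannot be realized.

```python
import json
import struct

# Hypothetical marker identifying an appended user-metadata block.
MAGIC = b"UMD1"


def append_second_metadata(first_stream: bytes, second_metadata: dict) -> bytes:
    """Form the second encoded code stream: original bytes, untouched, plus a trailer."""
    payload = json.dumps(second_metadata, sort_keys=True).encode("utf-8")
    # Assumed trailer layout: payload | MAGIC | 4-byte big-endian payload length.
    return first_stream + payload + MAGIC + struct.pack(">I", len(payload))


def split_second_stream(stream: bytes):
    """Return (first_stream, second_metadata); second_metadata is None if no trailer."""
    if len(stream) < 8 or stream[-8:-4] != MAGIC:
        return stream, None
    (length,) = struct.unpack(">I", stream[-4:])
    end = len(stream) - 8
    start = end - length
    if start < 0:
        # Declared length exceeds the stream: treat it as having no trailer.
        return stream, None
    return stream[:start], json.loads(stream[start:end].decode("utf-8"))


def choose_render_params(first_metadata: dict, second_metadata, supported) -> dict:
    """Apply the user's changes when the renderer can realize them, else fall back."""
    if second_metadata is None or not supported(second_metadata):
        return dict(first_metadata)
    return {**first_metadata, **second_metadata}


if __name__ == "__main__":
    original = b"\x00\x01AUDIO"
    stream2 = append_second_metadata(original, {"dialog_gain_db": 3})
    first, user_meta = split_second_stream(stream2)
    print(first == original, user_meta)
```

Because the original bytes are left unmodified and the user's changes travel in a trailer, a receiver in this sketch can always recover the first encoded code stream exactly, which is what allows it to fall back to the first metadata information. A real code stream would need a trailer format that cannot collide with audio payload bytes; that concern is outside the scope of this sketch.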