JP7530751B2

JP7530751B2 - Multiplex signal conversion device, program thereof, and receiver

Info

Publication number: JP7530751B2
Application number: JP2020101091A
Authority: JP
Inventors: 侑輝河村; 知也楠; 裕靖永田; 悠喜山上; 浩一郎今村
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2020-06-10
Filing date: 2020-06-10
Publication date: 2024-08-08
Anticipated expiration: 2040-06-10
Also published as: JP2021197584A

Description

The present invention relates to a multiplex signal conversion device, a program therefor, and a receiver.

MMT (MPEG Media Transport) has been established as an international standard for a new IP (Internet Protocol)-based media transport method that replaces the MPEG-2 TS (Transport Stream) used in conventional digital broadcasting (Non-Patent Document 1). The use of MMT in digital broadcasting services in Japan has also been standardized (Non-Patent Document 2), and the new 4K8K satellite broadcasting service, which adopts MMT, was launched in December 2018.

The application-layer packet format (the data structure of the packet header) defined by MMT is called MMTP (MMT Protocol). As shown in FIG. 8(a), MMTP packets are transmitted unidirectionally over broadcast or communication transmission paths as the payload of UDP (User Datagram Protocol)/IP packets.

In MMT, the processing unit of a video or audio codec is called an MPU (Media Processing Unit). The first data of an MPU must be a random access point that can be processed without depending on previously transmitted data. Non-Patent Document 2, which specifies the use of MMT in broadcasting, treats a GOP (Group Of Pictures) that starts with an intra frame (a frame compressed using intra-frame coding) as an MPU. The term GOP is not used in the HEVC standard, which serves here as an example of a video coding method, but following conventional methods such as MPEG-2 Video, a set of frames starting with an intra frame is sometimes called a GOP for convenience.

As shown in FIG. 8(b), broadcasting services configure GOPs with a period of about 0.5 seconds to ensure random access when the receiving channel is changed. As a specific example, a GOP in HEVC may consist of 32 frames. In AAC (Advanced Audio Coding), used here as an example of an audio coding method, audio sampled at, for example, 48 kHz is encoded independently in blocks of 1024 samples, and each resulting data block is treated as an AU (Access Unit). In general, the beginning of each AU is a random access point, and because an MPU may contain multiple random access points, a set of one or more audio AUs can be treated as an MPU. An MPU transmitted by MMTP can be uniquely identified by its MPU sequence number. The MPU sequence number is written in the MMTP payload header of each MMTP packet that carries the MPU as its payload.
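As a quick check of the figures above, the durations can be computed directly. This is only an illustrative sketch: the video frame rate of 60 frames per second is an assumption, since the text gives the GOP length in frames and an approximate period of 0.5 seconds but no frame rate.

```python
from fractions import Fraction

# Values stated in the text.
GOP_FRAMES = 32             # frames per GOP (HEVC example)
AAC_SAMPLES_PER_AU = 1024   # audio samples per AAC access unit
AAC_SAMPLE_RATE_HZ = 48_000

# Assumption (not stated in the text): a video frame rate of 60 frames/s.
ASSUMED_FRAME_RATE = 60

gop_seconds = Fraction(GOP_FRAMES, ASSUMED_FRAME_RATE)
au_seconds = Fraction(AAC_SAMPLES_PER_AU, AAC_SAMPLE_RATE_HZ)

print(f"GOP duration: {float(gop_seconds):.3f} s")
print(f"AAC AU duration: {float(au_seconds) * 1000:.2f} ms")
```

Under that assumption a 32-frame GOP lasts slightly more than half a second, which is consistent with the "about 0.5 seconds" period mentioned above.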

According to Non-Patent Document 1, the MPU is originally defined on the basis of ISOBMFF (ISO Base Media File Format). Non-Patent Document 1 also defines a method of transmitting the metadata portion of ISOBMFF over MMTP as MPU metadata and movie fragment metadata. In the broadcasting applications defined in Non-Patent Document 2, however, the generation and transmission of movie fragment metadata are omitted to reduce processing latency, and the NAL (Network Abstraction Layer) units generated by the HEVC encoder are multiplexed as they are, as media fragment units, into MMTP/UDP/IP packets and transmitted.

The basic data structure of ISOBMFF is specified in Non-Patent Document 3. Because it is part of the MPEG-4 standard, it is commonly referred to as MP4 (.mp4). The Box format, the metadata description method defined in ISOBMFF, is extensible: new metadata structures can be added and more detailed operating rules can be specified according to application requirements. For example, Non-Patent Document 4 specifies detailed operating rules and data structures for ISOBMFF metadata in MPEG-DASH (Dynamic Adaptive Streaming over HTTP), a video streaming method that uses HTTP (Hypertext Transfer Protocol)/TCP (Transmission Control Protocol) communication. In MPEG-DASH, segments, which are files each containing several seconds to several tens of seconds of video, are published on a web server, and the playback terminal plays the video by downloading the segments one after another according to a manifest file.

HLS (HTTP Live Streaming), specified in Non-Patent Document 5, is known as another video streaming method that uses HTTP/TCP in the same way as MPEG-DASH. HLS is currently used as widely as MPEG-DASH. HLS initially adopted MPEG-2 TS segments, but was later revised to allow ISOBMFF segments, the same format used by MPEG-DASH. As a result, although MPEG-DASH and HLS use different manifest files for playback, they can share the same segments, which makes video delivery over a CDN (Content Delivery Network) or the like more efficient.

The ISOBMFF-based segment format that can be used by both MPEG-DASH and HLS is specified as CMAF in Non-Patent Document 6. CMAF is a newer standard than Non-Patent Documents 1 and 2.

In addition to unifying segment formats, CMAF defines a chunk structure that further subdivides segments as a technique for reducing latency in video streaming. Whereas a segment generally lasts from several seconds to several tens of seconds, a chunk is video data of a much shorter duration, such as a few frames. In ordinary file-based HTTP transfer, a file is transferred only after the entire segment is complete, so a video delay equal to the segment duration (several seconds to several tens of seconds) is unavoidable in principle. In practice, the receiver also buffers several segments to stabilize playback, so the total video delay can reach several tens of seconds to several minutes. In contrast, when CMAF chunks (a few frames each) are delivered to the receiver using Chunked Transfer, an HTTP extension, data is sent chunk by chunk without waiting for the whole segment to complete, which is said to make video streaming with a delay of a few seconds feasible. Metadata generation in units of a few frames was not considered when ISOBMFF was first standardized; the functionality can be said to have been extended to meet the requirements of new applications arising from advances in communication technology.

In video transmission using MMT as well, if metadata is generated only after the entire MPU (which corresponds to a segment) has been encoded and is then transmitted as MPU metadata and movie fragment metadata, a delay similar to that incurred when segments are packaged into files in MPEG-DASH and the like is unavoidable in principle. For this reason, Non-Patent Document 2 omits both the generation of ISOBMFF metadata and its transmission as MPU metadata and movie fragment metadata in order to reduce the latency of broadcast video transmission. Omitting movie fragment metadata, however, raises the problem that the DTS-PTS difference information (dts_pts_offset) cannot be conveyed to the receiver. The DTS-PTS difference information indicates, for each video frame, the difference between the DTS (Decoding Timestamp), which specifies when the frame is decoded, and the PTS (Presentation Timestamp), which specifies when the frame is presented; the difference arises from the inter-frame reference structure of video coding.
Non-Patent Document 2 therefore defines an "extended MPU timestamp descriptor" that carries the DTS-PTS difference information separately, and specifies that it is transmitted as a descriptor in the MP table (MMT Package Table), which is a control message. Here, "descriptor" is the general name for a data structure that extends a control message so that various kinds of auxiliary information can be multiplexed into it. In MMT, for example, the structure of the MP table is specified in Non-Patent Document 1, while the descriptors that extend it are specified in Non-Patent Document 2; in this way, national standardization bodies and service providers can define additional descriptors of their own. The "extended MPU timestamp descriptor" is a structure that lists the DTS-PTS difference information of every frame in the MPU, and it is transmitted separately from the "MPU timestamp descriptor", which indicates the PTS of the first frame of the MPU. The "MPU timestamp descriptor" is specified separately in Non-Patent Document 1.
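The relationship between the two descriptors can be sketched as follows. The class and field names are hypothetical simplifications, not the descriptor syntax of Non-Patent Document 2, and the sign convention (PTS = DTS + offset) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MpuTimestamp:
    """Simplified stand-in for the MPU timestamp descriptor."""
    mpu_sequence_number: int
    first_frame_pts: int  # PTS of the first frame of the MPU, in ticks


@dataclass
class ExtendedMpuTimestamp:
    """Simplified stand-in for the extended MPU timestamp descriptor."""
    mpu_sequence_number: int
    dts_pts_offsets: List[int]  # one entry per frame, in decoding order


def pts_from_dts(dts: List[int], ext: ExtendedMpuTimestamp) -> List[int]:
    """Recover each frame's PTS, assuming PTS = DTS + dts_pts_offset."""
    if len(dts) != len(ext.dts_pts_offsets):
        raise ValueError("one offset is required for every frame in the MPU")
    return [d + o for d, o in zip(dts, ext.dts_pts_offsets)]


# Four frames in decoding order (I, P, B, B), one tick per frame.
ext = ExtendedMpuTimestamp(mpu_sequence_number=7, dts_pts_offsets=[1, 3, 0, 0])
pts = pts_from_dts([0, 1, 2, 3], ext)
print(pts)
```

Because the list must cover every frame of the MPU, a structure of this kind cannot be completed until the last frame of the MPU has been encoded.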

In addition, Non-Patent Document 6 specifies more detailed operating rules for MMT for an actual service (advanced wideband satellite digital broadcasting). Non-Patent Document 6 requires the "extended MPU timestamp descriptor" to be transmitted earlier than the MPU to which it applies. As shown in FIG. 8(b), the PTS written in the "MPU timestamp descriptor" is determined at the beginning of the MPU (GOP) (symbol α). The "extended MPU timestamp descriptor", on the other hand, cannot be generated until video coding of the entire MPU has finished (symbol β). Consequently, transmission of the coded video data of the MPU has had to be delayed by at least the duration of one MPU.

To avoid this video delay, ISOBMFF metadata with the chunk structure newly defined in CMAF could also be used in MMT, as shown in FIG. 9(a). Because a CMAF chunk is a structure based on ISOBMFF, it can be multiplexed in MMTP in accordance with Non-Patent Document 1. The ISOBMFF metadata of a CMAF chunk includes Box-format metadata that indicates, for each frame in the chunk, the difference between its presentation timing and its decoding timing, so the metadata can be generated with a delay of only one chunk length. When this Box-format metadata is applied to MMT, the metadata can be generated and the MPU transmitted with a delay of one chunk length, as shown in FIG. 9(b). This avoids the MPU-length delay that metadata generation would otherwise cause in video transmission using MMT. In the example of FIG. 9(b), a 32-frame GOP is divided into four, and each set of 8 frames forms a chunk. Furthermore, by describing the DTS-PTS difference information in the Box-format metadata structure defined in ISOBMFF and CMAF and transmitting it as MMTP movie fragment metadata, the metadata required by the decoder can be conveyed without using the "extended MPU timestamp descriptor".
Specifically, in CMAF, the DTS-PTS difference information of each frame in a chunk can be described using "sample_composition_time_offset" in the "TrackFragmentRunBox ('trun')".
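The effect of chunking on metadata availability can be illustrated with a small sketch. The function name and the placeholder offset values are hypothetical; only the 32-frame GOP and the 8-frame chunk come from the example of FIG. 9(b).

```python
from typing import List


def split_into_chunks(sample_offsets: List[int],
                      frames_per_chunk: int) -> List[List[int]]:
    """Group per-frame composition offsets into chunk-sized runs.

    Each run corresponds to the offsets that one chunk's 'trun' box
    would carry in this simplified model.
    """
    if frames_per_chunk <= 0:
        raise ValueError("frames_per_chunk must be positive")
    return [sample_offsets[i:i + frames_per_chunk]
            for i in range(0, len(sample_offsets), frames_per_chunk)]


GOP_FRAMES = 32        # from the example in the text
FRAMES_PER_CHUNK = 8   # from the example in the text

placeholder_offsets = [0] * GOP_FRAMES  # placeholder values only
chunks = split_into_chunks(placeholder_offsets, FRAMES_PER_CHUNK)

# Frames that must be encoded before the first metadata can be emitted.
wait_per_mpu_metadata = GOP_FRAMES
wait_per_chunk_metadata = FRAMES_PER_CHUNK
print(len(chunks), wait_per_mpu_metadata, wait_per_chunk_metadata)
```

In this model the wait before the first metadata can be sent drops from 32 frames to 8 frames, which is the chunk-length delay described above.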

The parameter sets of video coding are described next. ISOBMFF identifies codecs with a FourCC (Four Character Code), a code defined by four alphabetic characters. For HEVC, for example, two codes are specified: "hev1" and "hvc1". Here, "hev1" indicates a format in which the video coding parameter sets specified in HEVC, namely the VPS (Video Parameter Set), SPS (Sequence Parameter Set), and PPS (Picture Parameter Set), are included in the media fragment units, whereas "hvc1" indicates a format in which the parameter sets are included in the MPU metadata.

In other words, as shown in FIG. 9(b), MMTP packets in which coded video data and the like are multiplexed with CMAF applied carry the parameter sets either in the MPU metadata (in the case of "hvc1") or in the media fragment units (in the case of "hev1"). Hereinafter, MMTP packets in which coded video data and the like are multiplexed with CMAF applied may be abbreviated as "CMAF-applied MMT". CMAF-applied MMT thus comes in an "hvc1" variant and an "hev1" variant. Although it is a special case, it is also technically possible to transmit each parameter set in both the media fragment units and the MPU metadata.

In contrast, as shown in FIG. 8(b), MMT in which coded video data and the like are multiplexed without applying CMAF does not transmit MPU metadata and carries the parameter sets inside the media fragment units, so it corresponds only to "hev1". Hereinafter, MMTP packets in which coded video data and the like are multiplexed without applying CMAF may be abbreviated as "CMAF-non-applied MMT".

MPU metadata contains the "MovieBox ('moov')" defined in ISOBMFF and is therefore generally called movie metadata. Movie fragment metadata contains the "MovieFragmentBox ('moof')" defined in ISOBMFF and is generally called movie fragment metadata as well. A media fragment unit contains the "MediaDataBox ('mdat')" defined in ISOBMFF and is therefore generally called media data. In CMAF, a processing unit that begins with a random access point, like an MPU, is called a fragment.
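The correspondence between box types and the general terms above can be sketched with a minimal top-level box scanner. It relies on the basic ISOBMFF box header (a 32-bit big-endian size followed by a four-character type) and, as a simplification, rejects the special size values 0 and 1; the function names and the sample data are made up for illustration.

```python
import struct
from typing import List, Tuple

# General terms used in the text for each ISOBMFF box type.
BOX_ROLE = {
    b"moov": "movie metadata",
    b"moof": "movie fragment metadata",
    b"mdat": "media data",
}


def list_boxes(data: bytes) -> List[Tuple[str, str]]:
    """Return (box type, role) for each top-level box in data."""
    boxes = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        # Simplification: 64-bit sizes (size == 1) and open-ended boxes
        # (size == 0) are not handled in this sketch.
        if size < 8 or pos + size > len(data):
            raise ValueError("unsupported or truncated box")
        boxes.append((box_type.decode("ascii"), BOX_ROLE.get(box_type, "other")))
        pos += size
    return boxes


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (for test data only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


sample = make_box(b"moof", b"\x00" * 4) + make_box(b"mdat", b"abc")
print(list_boxes(sample))
```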

"High efficiency coding and media delivery in heterogeneous environments: MPEG media transport", ISO/IEC 23008-1
"Media Transport Method Using MMT in Digital Broadcasting", ARIB STD-B60
"ISO/IEC base media file format", ISO/IEC 14496-12
"Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats", ISO/IEC 23009-1
"Common media application format (CMAF) for segmented media", ISO/IEC 23000-19
"Advanced Wideband Satellite Digital Broadcasting Operational Standards (Volume 3)", ARIB TR-B39

If the CMAF-applied MMT described above is input to a receiver that does not support CMAF, the receiver may fail to play back video and audio correctly or may terminate abnormally because of a processing error. Likewise, if CMAF-non-applied MMT is input to a receiver that supports CMAF, the receiver may fail to play back video and audio correctly or may terminate abnormally because of a processing error.

An object of the present invention is therefore to provide a multiplex signal conversion device, a program therefor, and a receiver that allow the receiver to play back video and audio correctly regardless of whether CMAF is applied.

To solve the above problem, a multiplex signal conversion device according to the present invention converts a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, into a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, and includes a separation unit, a descriptor conversion unit, a descriptor addition unit, an output unit, and a mixing unit.

With this configuration, the separation unit separates movie metadata, movie fragment metadata, a control message, and media data from the CMAF-applied multiplex signal.
The descriptor conversion unit converts the DTS-PTS difference information in the movie fragment metadata into a descriptor.
The descriptor addition unit adds that descriptor to the control message.
The output unit outputs the media data in units of fragments in accordance with the output timing of the control message.
The mixing unit mixes the control message from the descriptor addition unit with the media data from the output unit and outputs the result as a CMAF-non-applied multiplex signal.
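The flow through the five units can be sketched as below. This is only an illustration under stated assumptions: packets are modeled as hypothetical (kind, fields) tuples, one call handles the packets of a single fragment, the descriptor name is made up, and the separated movie metadata is simply not forwarded.

```python
from typing import Dict, List, Tuple

Packet = Tuple[str, Dict]  # (kind, fields); a hypothetical in-memory form


def convert_cmaf_to_non_cmaf(packets: List[Packet]) -> List[Packet]:
    """Convert the packets of one fragment to the CMAF-non-applied form."""
    # Separation unit: split the input by kind.
    control = [f for kind, f in packets if kind == "control_message"]
    moofs = [f for kind, f in packets if kind == "movie_fragment_metadata"]
    media = [f for kind, f in packets if kind == "media_data"]
    # Movie metadata is separated as well; this sketch does not forward it.

    # Descriptor conversion unit: gather the per-frame DTS-PTS offsets.
    descriptor = {
        "name": "extended_mpu_timestamp",
        "dts_pts_offsets": [o for m in moofs for o in m["dts_pts_offsets"]],
    }

    # Descriptor addition unit: append the descriptor to the control message.
    control_out = [
        {**c, "descriptors": list(c.get("descriptors", [])) + [descriptor]}
        for c in control
    ]

    # Output unit and mixing unit: the control message goes first, followed
    # by the media data of the whole fragment.
    return [("control_message", c) for c in control_out] + [
        ("media_data", m) for m in media
    ]


example = [
    ("control_message", {"descriptors": []}),
    ("movie_metadata", {}),
    ("movie_fragment_metadata", {"dts_pts_offsets": [1, 3]}),
    ("media_data", {"frames": 2}),
    ("movie_fragment_metadata", {"dts_pts_offsets": [0, 0]}),
    ("media_data", {"frames": 2}),
]
converted = convert_cmaf_to_non_cmaf(example)
print([kind for kind, _ in converted])
```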

In this way, the multiplex signal conversion device can convert a CMAF-applied multiplex signal into a CMAF-non-applied multiplex signal, so the receiver can play back video and audio correctly regardless of whether CMAF is applied.

Also to solve the above problem, a multiplex signal conversion device according to the present invention converts a CMAF-non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, into a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, and includes a separation unit, a descriptor extraction/deletion unit, a conversion unit, an output unit, and a mixing unit.

With this configuration, the separation unit separates a control message and media data from the CMAF-non-applied multiplex signal.
The descriptor extraction/deletion unit extracts the descriptor containing the DTS-PTS difference information from the control message and deletes that descriptor from the control message.
The conversion unit converts the DTS-PTS difference information in the descriptor into movie fragment metadata.
The output unit outputs the media data in units of chunks in accordance with the output timing of the movie fragment metadata.
The mixing unit mixes the control message from the descriptor extraction/deletion unit, the movie fragment metadata from the conversion unit, and the media data from the output unit, and outputs the result as a CMAF-applied multiplex signal.
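The reverse direction can be sketched in the same style. Again this is an illustration only: the in-memory packet form, the descriptor names, and the fixed chunk size passed as a parameter are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Packet = Tuple[str, Dict]  # (kind, fields); a hypothetical in-memory form
EXT_NAME = "extended_mpu_timestamp"  # hypothetical descriptor name


def convert_non_cmaf_to_cmaf(control: Dict, frames: List[str],
                             frames_per_chunk: int) -> List[Packet]:
    """Convert one MPU (control message plus frames) to the CMAF-applied form."""
    if frames_per_chunk <= 0:
        raise ValueError("frames_per_chunk must be positive")

    # Descriptor extraction/deletion unit.
    extracted = [d for d in control["descriptors"] if d["name"] == EXT_NAME]
    if len(extracted) != 1:
        raise ValueError("expected exactly one extended timestamp descriptor")
    remaining = [d for d in control["descriptors"] if d["name"] != EXT_NAME]
    offsets = extracted[0]["dts_pts_offsets"]
    if len(offsets) != len(frames):
        raise ValueError("one offset is required per frame")

    out: List[Packet] = [("control_message", {**control, "descriptors": remaining})]
    for start in range(0, len(frames), frames_per_chunk):
        end = start + frames_per_chunk
        # Conversion unit: offsets of this chunk become movie fragment metadata.
        out.append(("movie_fragment_metadata",
                    {"dts_pts_offsets": offsets[start:end]}))
        # Output unit: the media data follows its metadata, chunk by chunk.
        out.append(("media_data", {"frames": frames[start:end]}))
    return out  # mixing unit: a single interleaved sequence


control_in = {"descriptors": [
    {"name": "mpu_timestamp", "pts": 0},
    {"name": EXT_NAME, "dts_pts_offsets": [1, 3, 0, 0]},
]}
result = convert_non_cmaf_to_cmaf(control_in, ["I", "P", "B", "B"], 2)
print([kind for kind, _ in result])
```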

In this way, the multiplex signal conversion device can convert a CMAF-non-applied multiplex signal into a CMAF-applied multiplex signal, so the receiver can play back video and audio correctly regardless of whether CMAF is applied.

The present invention can also be realized as a program that causes a computer to function as the multiplex signal conversion device described above.
The present invention can also be realized as a receiver that includes the multiplex signal conversion device described above.

According to the present invention, the receiver can play back video and audio correctly regardless of whether CMAF is applied.

FIG. 1 is a schematic configuration diagram of a broadcasting system according to each embodiment.
FIG. 2 is a block diagram showing the configuration of an MMT conversion device according to the first embodiment.
FIG. 3 is a flowchart showing the operation of the MMT conversion device according to the first embodiment.
FIG. 4 is a block diagram showing the configuration of an MMT conversion device according to a first modified example.
FIG. 5 is a block diagram showing the configuration of an MMT conversion device according to the second embodiment.
FIG. 6 is a flowchart showing the operation of the MMT conversion device according to the second embodiment.
FIG. 7 is a block diagram showing the configuration of an MMT conversion device according to a second modified example.
FIG. 8 shows the prior art: (a) is an explanatory diagram of multiplexing in CMAF-non-applied MMT, and (b) is an explanatory diagram of the data structure of CMAF-non-applied MMT.
FIG. 9 shows the prior art: (a) is an explanatory diagram of multiplexing in CMAF-applied MMT, and (b) is an explanatory diagram of the data structure of CMAF-applied MMT.

Embodiments of the present invention are described below with reference to the drawings. The embodiments described below are intended to embody the technical concept of the present invention and, unless specifically stated otherwise, do not limit the present invention. In the embodiments, identical means are given identical reference numerals, and their description may be omitted.

(First Embodiment)
[Overview of the broadcasting system]
An overview of a broadcasting system 100 according to the first embodiment is described with reference to FIG. 1.
As shown in FIG. 1, the broadcasting system 100 performs digital broadcasting and includes an encoding device 2, a transmission device 3, and a receiver 4. The receiver 4 incorporates an MMT conversion device (multiplex signal conversion device) 1, which is described later.

The encoding device 2 encodes the video of a broadcast program using a predetermined video coding method and outputs the encoded video to the transmission device 3. In this embodiment, the video coding method is HEVC.

The transmission device 3 multiplexes the video and audio of a broadcast program using a predetermined multiplexing method and transmits them to the receiver 4. In this embodiment, the multiplexing method is MMT. That is, the transmission device 3 multiplexes the video and audio input from the encoding device 2 and transmits them to the receiver 4 as an MMTP packet sequence.

The receiver 4 receives and demultiplexes the MMTP packet sequence transmitted by the transmission device 3, and decodes and plays back the video and audio of the broadcast program. Examples of the receiver 4 include an ordinary television, a smartphone, and a tablet. Although FIG. 1 shows only one receiver 4 to keep the drawing simple, there are normally multiple receivers 4.

A transmission device 3 that supports CMAF may transmit CMAF-applied MMT (a CMAF-applied multiplex signal) to a receiver 4 that does not support CMAF. The receiver 4 therefore uses its built-in MMT conversion device 1 to convert the CMAF-applied MMT into CMAF-non-applied MMT (a CMAF-non-applied multiplex signal).
In this embodiment, the CMAF-applied MMT that carries an HEVC video signal as an asset is assumed to correspond to "hvc1", with the parameter sets included in the MPU metadata.

[Configuration of the MMT conversion device]
The configuration of the MMT conversion device 1 is described with reference to FIG. 2.
The MMT conversion device 1 converts the CMAF-applied MMT (hvc1) of FIG. 9(b) into the CMAF-non-applied MMT of FIG. 8(b). As shown in FIG. 2, the MMT conversion device 1 includes a packet filter (separation unit) 10, a message buffer 11, a descriptor conversion unit 12, a descriptor addition unit 13, a parameter set extraction unit 14, a parameter set addition unit 15, an MPU buffer (output unit) 16, and a packet mixing unit (mixing unit) 17.

The packet filter 10 separates movie fragment metadata, MPU metadata (movie metadata), control messages (MP tables), media fragment units (media data), and other packets from the CMAF-applied MMT.

As shown in FIG. 9(b), the packet filter 10 refers to the PID and the fragment type (fragment_type) of each MMTP packet of the CMAF-applied MMT to separate the movie fragment metadata and the other items. Specifically, the packet filter 10 separates from the CMAF-applied MMT the MMTP packets with PID = 0 as control messages (MP tables), those with PID = X and fragment type = 0 as MPU metadata, those with PID = X and fragment type = 1 as movie fragment metadata, and those with PID = X and fragment type = 2 as media fragment units. The packet filter 10 also separates any data other than the above (for example, control messages other than the MP table) from the CMAF-applied MMT as other packets.
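The classification rule above can be sketched as a small function. The packet class and the category names are hypothetical, and for brevity the sketch treats every packet on the control PID as a control message, whereas the text routes control messages other than the MP table to "other packets".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MmtpPacket:
    """Minimal stand-in for an MMTP packet (hypothetical fields)."""
    pid: int
    fragment_type: Optional[int] = None
    payload: bytes = b""


FRAGMENT_KIND = {
    0: "mpu_metadata",
    1: "movie_fragment_metadata",
    2: "media_fragment_unit",
}


def classify(packet: MmtpPacket, asset_pid: int, control_pid: int = 0) -> str:
    """Return the category the packet filter would assign to the packet."""
    if packet.pid == control_pid:
        # Simplification: not checked for being an MP table.
        return "control_message"
    if packet.pid == asset_pid:
        return FRAGMENT_KIND.get(packet.fragment_type, "other")
    return "other"


ASSET_PID = 0x10  # plays the role of "X" in the text; the value is arbitrary
print(classify(MmtpPacket(pid=ASSET_PID, fragment_type=1), ASSET_PID))
```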

The packet filter 10 can identify the PID (= X) that carries the asset to be converted by referring to the asset location information in the MP table. In some cases, the control message with PID = 0, which is the entry point, contains a package list table, and the MP table is transmitted on a different PID referenced from the package list table. In that case, the packet filter 10 refers to the package list table to identify the PID that carries the control message (MP table) and can then separate the control message (MP table).

そして、パケットフィルタ１０は、制御メッセージ（ＭＰテーブル）をメッセージバッファ１１に出力し、ムービーフラグメントメタデータを記述子変換部１２に出力し、ＭＰＵメタデータをパラメータセット抽出部１４に出力する。さらに、パケットフィルタ１０は、メディアフラグメントユニットをパラメータセット追加部１５に出力し、その他のパケットをパケット混合部１７に出力する。 Then, the packet filter 10 outputs the control message (MP table) to the message buffer 11, outputs the movie fragment metadata to the descriptor conversion unit 12, and outputs the MPU metadata to the parameter set extraction unit 14. Furthermore, the packet filter 10 outputs the media fragment unit to the parameter set addition unit 15, and outputs the other packets to the packet mixing unit 17.

メッセージバッファ１１は、パケットフィルタ１０から入力した制御メッセージ（ＭＰテーブル）を蓄積するバッファである。また、メッセージバッファ１１は、記述子変換部１２からの出力指示に従って、制御メッセージ（ＭＰテーブル）を記述子追加部１３に出力する。 The message buffer 11 is a buffer that accumulates the control message (MP table) input from the packet filter 10. In addition, the message buffer 11 outputs the control message (MP table) to the descriptor addition unit 13 according to an output instruction from the descriptor conversion unit 12.

記述子変換部１２は、パケットフィルタ１０から入力したムービーフラグメントメタデータのＤＴＳ－ＰＴＳ差分情報を、制御メッセージに多重して伝送するための記述子に変換するものである。本実施形態では、記述子が拡張ＭＰＵタイムスタンプ記述子であることとする。具体的には、記述子変換部１２は、ムービーフラグメントメタデータを解析して、ＤＴＳ－ＰＴＳ差分情報を拡張ＭＰＵタイムスタンプ記述子の形式に変換する。そして、記述子変換部１２は、ＭＰＵ１個分のＤＴＳ－ＰＴＳ差分情報の変換が完了すると、メッセージバッファ１１に出力指示を行うと共に、記述子追加部１３に拡張ＭＰＵタイムスタンプ記述子を出力する。記述子変換部１２からメッセージバッファ１１への出力指示は、例えば、ＭＰＵシーケンス番号を指定し、そのＭＰＵに対応する記述子（例えば、ＭＰＵタイムスタンプ記述子）を含む制御メッセージ（ＭＰテーブル）を出力させるものである。 The descriptor conversion unit 12 converts the DTS-PTS differential information of the movie fragment metadata input from the packet filter 10 into a descriptor for multiplexing and transmitting the DTS-PTS differential information in a control message. In this embodiment, the descriptor is an extended MPU timestamp descriptor. Specifically, the descriptor conversion unit 12 analyzes the movie fragment metadata and converts the DTS-PTS differential information into the format of an extended MPU timestamp descriptor. Then, when the descriptor conversion unit 12 completes the conversion of the DTS-PTS differential information for one MPU, it instructs the message buffer 11 to output and outputs the extended MPU timestamp descriptor to the descriptor addition unit 13. The output instruction from the descriptor conversion unit 12 to the message buffer 11, for example, specifies an MPU sequence number and causes the control message (MP table) including a descriptor (for example, an MPU timestamp descriptor) corresponding to that MPU to be output.
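
As a rough sketch of this per-MPU accumulation, the class below collects the DTS-PTS offsets found in each movie fragment and hands back one entry per MPU. The names `TimestampEntry` and `DescriptorConverter` are hypothetical, the entry is a simplified stand-in for the extended MPU timestamp descriptor rather than its actual syntax, and detecting completion of an MPU by a change in MPU sequence number is an assumption; the description does not say how completion is detected.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TimestampEntry:
    """Simplified stand-in for one extended MPU timestamp descriptor entry."""
    mpu_sequence_number: int
    dts_pts_offsets: List[int] = field(default_factory=list)


class DescriptorConverter:
    def __init__(self) -> None:
        self._current: Optional[TimestampEntry] = None

    def add_fragment(self, mpu_sequence_number: int,
                     offsets: List[int]) -> Optional[TimestampEntry]:
        """Feed the offsets of one movie fragment.

        Returns the finished entry of the previous MPU when a new MPU starts
        (assumed completion signal), otherwise None.
        """
        completed = None
        if (self._current is not None
                and self._current.mpu_sequence_number != mpu_sequence_number):
            completed = self._current
            self._current = None
        if self._current is None:
            self._current = TimestampEntry(mpu_sequence_number)
        self._current.dts_pts_offsets.extend(offsets)
        return completed

    def flush(self) -> Optional[TimestampEntry]:
        """Return whatever has been accumulated for the MPU in progress."""
        completed, self._current = self._current, None
        return completed
```

A completed entry would then trigger the output instruction to the message buffer 11, with its `mpu_sequence_number` identifying the control message to release.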

記述子追加部１３は、記述子変換部１２から入力した拡張ＭＰＵタイムスタンプ記述子を、メッセージバッファ１１から入力した制御メッセージ（ＭＰテーブル）に追加するものである。つまり、この制御メッセージは、拡張ＭＰＵタイムスタンプ記述子を追加したＭＰテーブルを有する。そして、記述子追加部１３は、この制御メッセージ（ＭＰテーブル）をパケット混合部１７に出力する。さらに、記述子追加部１３は、パケット混合部１７に制御メッセージ（ＭＰテーブル）を出力した後、ＭＰＵバッファ１６に対し、拡張ＭＰＵタイムスタンプ記述子に対応するＭＰＵの出力指示を行う。 The descriptor adding unit 13 adds the extended MPU timestamp descriptor input from the descriptor conversion unit 12 to the control message (MP table) input from the message buffer 11. In other words, this control message has an MP table to which the extended MPU timestamp descriptor has been added. The descriptor adding unit 13 then outputs this control message (MP table) to the packet mixing unit 17. Furthermore, after outputting the control message (MP table) to the packet mixing unit 17, the descriptor adding unit 13 instructs the MPU buffer 16 to output the MPU corresponding to the extended MPU timestamp descriptor.

パラメータセット抽出部１４は、パケットフィルタ１０より入力したＭＰＵメタデータから、映像符号化のパラメータセットを抽出するものである。具体的には、パラメータセット抽出部１４は、ＭＰＵメタデータのＢｏｘ形式のメタデータからＨＥＶＣのパラメータセットを抽出し、抽出したパラメータセットをパラメータセット追加部１５に出力する。 The parameter set extraction unit 14 extracts a video encoding parameter set from the MPU metadata input from the packet filter 10. Specifically, the parameter set extraction unit 14 extracts an HEVC parameter set from the Box-format metadata of the MPU metadata, and outputs the extracted parameter set to the parameter set addition unit 15.

パラメータセット追加部１５は、パラメータセット抽出部１４から入力したパラメータセットを、パケットフィルタ１０から入力したメディアフラグメントユニットに追加するものである。具体的には、パラメータセット追加部１５は、ＨＥＶＣのパラメータセットを、ＭＰＵの先頭フレームのメディアフラグメントユニットの先頭に追加する。そして、パラメータセット追加部１５は、このメディアフラグメントユニットをＭＰＵバッファ１６に出力する。 The parameter set addition unit 15 adds the parameter set input from the parameter set extraction unit 14 to the media fragment unit input from the packet filter 10. Specifically, the parameter set addition unit 15 adds the HEVC parameter set to the beginning of the media fragment unit of the first frame of the MPU. Then, the parameter set addition unit 15 outputs this media fragment unit to the MPU buffer 16.
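
A minimal sketch of the prepend step follows. It assumes that the media fragment unit payload stores each NAL unit behind a 4-byte big-endian length field; that layout and the function name `prepend_parameter_sets` are assumptions made for illustration and are not stated in the description.

```python
import struct
from typing import Iterable


def prepend_parameter_sets(mfu_payload: bytes,
                           parameter_sets: Iterable[bytes]) -> bytes:
    """Place the given parameter set NAL units in front of the first frame's payload.

    Each NAL unit is written as <4-byte big-endian length><NAL bytes>
    (assumed framing), followed by the unchanged original payload.
    """
    prefix = b"".join(struct.pack(">I", len(nal)) + nal for nal in parameter_sets)
    return prefix + mfu_payload
```

Only the media fragment unit of the first frame of each MPU would pass through this function; all other units are forwarded to the MPU buffer 16 unchanged.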

ＭＰＵバッファ１６は、パラメータセット追加部１５から入力したメディアフラグメントユニットを蓄積するバッファである。また、ＭＰＵバッファ１６は、制御メッセージ（ＭＰテーブル）の出力タイミングに従って、パラメータセットが追加されたメディアフラグメントユニットをＭＰＵ（フラグメント）単位で出力する。つまり、ＭＰＵバッファ１６は、記述子追加部１３からの出力指示で指定されたＭＰＵのメディアフラグメントユニットをパケット混合部１７に出力する。記述子追加部１３からＭＰＵバッファ１６への出力指示は、例えば、ＭＰＵシーケンス番号によりＭＰＵを指定して出力させるものである。 The MPU buffer 16 is a buffer that accumulates the media fragment units input from the parameter set adding unit 15. The MPU buffer 16 also outputs the media fragment units to which the parameter sets have been added in MPU (fragment) units according to the output timing of the control message (MP table). In other words, the MPU buffer 16 outputs the media fragment units of the MPU specified by the output instruction from the descriptor adding unit 13 to the packet mixing unit 17. The output instruction from the descriptor adding unit 13 to the MPU buffer 16 is, for example, to specify an MPU by its MPU sequence number and output it.
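
The buffering behaviour can be sketched as a map keyed by MPU sequence number. The class name `MpuBuffer` and its methods are hypothetical; the sketch only illustrates holding media fragment units until the release instruction for their MPU arrives.

```python
from collections import defaultdict
from typing import Dict, List


class MpuBuffer:
    def __init__(self) -> None:
        # Media fragment units waiting for release, grouped per MPU.
        self._pending: Dict[int, List[bytes]] = defaultdict(list)

    def push(self, mpu_sequence_number: int, mfu: bytes) -> None:
        """Store one media fragment unit under its MPU sequence number."""
        self._pending[mpu_sequence_number].append(mfu)

    def release(self, mpu_sequence_number: int) -> List[bytes]:
        """Return and forget all units of the MPU, in arrival order."""
        return self._pending.pop(mpu_sequence_number, [])
```

Releasing only after the matching control message has been output keeps the extended MPU timestamp descriptor ahead of the media data it describes.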

パケット混合部１７は、記述子追加部１３から入力した制御メッセージ（ＭＰテーブル）と、ＭＰＵバッファ１６から入力したメディアフラグメントユニットと、パケットフィルタ１０から入力したその他のパケットとを混合し、ＣＭＡＦ非適用ＭＭＴとして出力するものである。 The packet mixing unit 17 mixes the control message (MP table) input from the descriptor adding unit 13, the media fragment unit input from the MPU buffer 16, and other packets input from the packet filter 10, and outputs the result as a CMAF-non-applied MMT.

ここで、メッセージバッファ１１、記述子変換部１２、記述子追加部１３、パラメータセット抽出部１４、パラメータセット追加部１５及びＭＰＵバッファ１６の各処理においては、ＭＭＴＰパケットの形式を維持して処理してもよいし、又は、ＭＭＴＰパケットのペイロードである処理対象データを一旦抽出した形式で処理してもよい。前者の場合、パケット混合部１７は、複数のＭＭＴＰパケット列を入力として、それらを混合した単一のＭＭＴＰパケット列として出力する。後者の場合、パケット混合部１７は、制御メッセージ（ＭＰテーブル）、メディアフラグメントユニットをペイロードとして含むＭＭＴＰパケットを生成し、それらをその他のパケットとして入力されるＭＭＴＰパケット列に混合して、単一のＭＭＴＰパケット列として出力する。さらに、パケット混合部１７は、必要に応じて、出力するＭＭＴＰパケット列についてパケットシーケンス番号の連続性を修正するなど、ヘッダ部を書き換えてもよい。 Here, the message buffer 11, the descriptor conversion unit 12, the descriptor addition unit 13, the parameter set extraction unit 14, the parameter set addition unit 15, and the MPU buffer 16 may each operate on data that keeps the MMTP packet format, or on the data to be processed after it has been extracted from the MMTP packet payload. In the former case, the packet mixing unit 17 takes a plurality of MMTP packet sequences as input, mixes them, and outputs a single MMTP packet sequence. In the latter case, the packet mixing unit 17 generates MMTP packets that carry the control message (MP table) and the media fragment units as payloads, mixes them into the MMTP packet sequence input as other packets, and outputs a single MMTP packet sequence. Furthermore, the packet mixing unit 17 may rewrite the header part of the output MMTP packet sequence as necessary, for example to restore the continuity of the packet sequence numbers.
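
The header rewrite mentioned at the end of the paragraph above could be sketched as follows. The sketch assumes a 32-bit packet sequence number that is counted separately for each packet ID and wraps to zero; the `Packet` type and the `renumber` function are hypothetical simplifications of an MMTP packet and are not taken from the description.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Packet:
    """Simplified stand-in for an MMTP packet."""
    packet_id: int
    packet_sequence_number: int
    payload: bytes


def renumber(packets: Iterable[Packet], start: int = 0) -> List[Packet]:
    """Assign consecutive sequence numbers per packet ID, wrapping at 2**32."""
    next_number: Dict[int, int] = {}
    out: List[Packet] = []
    for packet in packets:
        number = next_number.get(packet.packet_id, start)
        out.append(replace(packet, packet_sequence_number=number))
        next_number[packet.packet_id] = (number + 1) % (1 << 32)
    return out
```

Renumbering after mixing hides the gaps left by packets that the conversion removed, such as the MPU metadata and movie fragment metadata packets.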

［ＭＭＴ変換装置の動作］ [Operation of the MMT conversion device]
図３を参照し、ＭＭＴ変換装置１の動作について説明する。 The operation of the MMT conversion device 1 will be described with reference to FIG. 3.
図３に示すように、ステップＳ１において、パケットフィルタ１０は、ＣＭＡＦ適用ＭＭＴから、ＭＰＵメタデータと、ムービーフラグメントメタデータと、制御メッセージ（ＭＰテーブル）と、メディアフラグメントユニットと、その他のパケットとを分離する。また、メッセージバッファ１１は、パケットフィルタ１０が分離した制御メッセージ（ＭＰテーブル）を蓄積する。 As shown in FIG. 3, in step S1, the packet filter 10 separates MPU metadata, movie fragment metadata, control messages (MP tables), media fragment units, and other packets from the CMAF-applied MMT. The message buffer 11 stores the control messages (MP tables) separated by the packet filter 10.

ステップＳ２において、パラメータセット抽出部１４は、ＭＰＵメタデータからパラメータセットを抽出する。 In step S2, the parameter set extraction unit 14 extracts a parameter set from the MPU metadata.
ステップＳ３において、パラメータセット追加部１５は、パラメータセットをメディアフラグメントユニットに追加する。メディアフラグメントユニットは、記述子追加部１３から出力指示があるまでＭＰＵバッファ１６にバッファされる。 In step S3, the parameter set addition unit 15 adds the parameter set to the media fragment unit. The media fragment unit is held in the MPU buffer 16 until an output instruction is received from the descriptor addition unit 13.

ステップＳ４において、記述子変換部１２は、ムービーフラグメントメタデータのＤＴＳ－ＰＴＳ差分情報を拡張ＭＰＵタイムスタンプ記述子に変換し、ＭＰＵ１個分のＤＴＳ－ＰＴＳ差分情報の変換が完了すると、メッセージバッファ１１に出力指示を行う。すると、メッセージバッファ１１は、記述子変換部１２からの出力指示に従って、制御メッセージ（ＭＰテーブル）を記述子追加部１３に出力する。 In step S4, the descriptor conversion unit 12 converts the DTS-PTS differential information of the movie fragment metadata into an extended MPU timestamp descriptor, and when conversion of the DTS-PTS differential information for one MPU is complete, it issues an output instruction to the message buffer 11. Then, in accordance with the output instruction from the descriptor conversion unit 12, the message buffer 11 outputs a control message (MP table) to the descriptor addition unit 13.

ステップＳ５において、記述子追加部１３は、拡張ＭＰＵタイムスタンプ記述子を制御メッセージ（ＭＰテーブル）に追加し、制御メッセージ（ＭＰテーブル）を出力する。さらに、記述子追加部１３は、ＭＰＵバッファ１６に対し、拡張ＭＰＵタイムスタンプ記述子に対応するＭＰＵの出力指示を行う。 In step S5, the descriptor adding unit 13 adds the extended MPU timestamp descriptor to the control message (MP table) and outputs the control message (MP table). Furthermore, the descriptor adding unit 13 instructs the MPU buffer 16 to output the MPU corresponding to the extended MPU timestamp descriptor.

ステップＳ６において、ＭＰＵバッファ１６は、記述子追加部１３からの出力指示で指定されたＭＰＵのメディアフラグメントユニットをパケット混合部１７に出力する。 In step S6, the MPU buffer 16 outputs the media fragment units of the MPU specified by the output instruction from the descriptor addition unit 13 to the packet mixing unit 17.
ステップＳ７において、パケット混合部１７は、制御メッセージ（ＭＰテーブル）と、メディアフラグメントユニットと、その他のパケットとを混合し、ＣＭＡＦ非適用ＭＭＴとして出力する。 In step S7, the packet mixing unit 17 mixes the control message (MP table), the media fragment units, and the other packets, and outputs the result as a CMAF-non-applied MMT.

なお、ステップＳ１～Ｓ７の処理は、図３の順序で逐次的に実行せずとも、入力されたパケットの種別や順序に応じて、各ステップの処理順序を入れ替えたり、各ステップの処理を同時並列に実行してもよい。 The processing of steps S1 to S7 does not have to be performed sequentially in the order shown in FIG. 3. Depending on the type and order of input packets, the processing order of each step may be changed, or each step may be performed simultaneously in parallel.

［作用・効果］ [Action and Effects]
以上のように、ＭＭＴ変換装置１は、ＣＭＡＦ適応ＭＭＴをＣＭＡＦ非適応ＭＭＴに変換するので、ＣＭＡＦの適否に関わらず、受信機４が正常に映像・音声を再生できる。このように、受信機４は、ＣＭＡＦで規定されたチャンク構造のＩＳＯＢＭＦＦメタデータに対応しない場合でも、ＭＭＴ変換装置１によって、正常に映像・音声を再生できる。 As described above, the MMT conversion device 1 converts a CMAF-applied MMT into a CMAF-non-applied MMT, so the receiver 4 can play back video and audio normally regardless of whether CMAF is applied. Thus, even if the receiver 4 does not support the chunk-structured ISOBMFF metadata defined by CMAF, it can play back video and audio normally by means of the MMT conversion device 1.

（変形例１） (Modification 1)
図４を参照し、変形例１に係るＭＭＴ変換装置１Ｂについて、第１実施形態と異なる点を説明する。 With reference to FIG. 4, the differences between the MMT conversion device 1B according to Modification 1 and the first embodiment will be described.
変形例１では、ＣＭＡＦ適応ＭＭＴが「ｈｅｖ１」に対応し、パラメータセットをメディアフラグメントユニットに含むこととする。つまり、ＭＭＴ変換装置１Ｂは、パラメータセットが入力のメディアフラグメントユニットに元々含まれており、そのまま出力すればよいので、パラメータセットをＭＰＵメタデータから抽出してメディアフラグメントユニットに追加する必要がない。 In Modification 1, the CMAF-applied MMT uses "hev1", and the parameter set is carried in the media fragment unit. In other words, because the parameter set is already contained in the input media fragment unit, the MMT conversion device 1B can output the media fragment unit as is and does not need to extract the parameter set from the MPU metadata and add it to the media fragment unit.

ＭＭＴ変換装置１Ｂは、図９（ｂ）のＣＭＡＦ適応ＭＭＴ（ｈｅｖ１）を図８（ｂ）のＣＭＡＦ非適応ＭＭＴに変換するものである。図４に示すように、ＭＭＴ変換装置１Ｂは、パケットフィルタ（分離部）１０Ｂと、メッセージバッファ１１と、記述子変換部１２と、記述子追加部１３と、ＭＰＵバッファ（出力部）１６Ｂと、パケット混合部（混合部）１７とを備える。 The MMT conversion device 1B converts the CMAF-adaptive MMT (hev1) of FIG. 9(b) into the CMAF-non-adaptive MMT of FIG. 8(b). As shown in FIG. 4, the MMT conversion device 1B includes a packet filter (separation unit) 10B, a message buffer 11, a descriptor conversion unit 12, a descriptor addition unit 13, an MPU buffer (output unit) 16B, and a packet mixer (mixer) 17.

パケットフィルタ１０Ｂは、ＣＭＡＦ適用ＭＭＴから分離したムービーフラグメントメタデータを記述子変換部１２に出力する。また、パケットフィルタ１０Ｂは、ＣＭＡＦ適用ＭＭＴから分離したメディアフラグメントユニットをＭＰＵバッファ１６Ｂに出力する。なお、ＭＰＵメタデータが入力された場合、パケットフィルタ１０Ｂは、そのＭＰＵメタデータを破棄して出力しない。この他、パケットフィルタ１０Ｂは、第１実施形態と同様のため、説明を省略する。 The packet filter 10B outputs the movie fragment metadata separated from the CMAF-applied MMT to the descriptor conversion unit 12. The packet filter 10B also outputs the media fragment unit separated from the CMAF-applied MMT to the MPU buffer 16B. Note that when MPU metadata is input, the packet filter 10B discards the MPU metadata and does not output it. Other than this, the packet filter 10B is the same as in the first embodiment, so a description thereof will be omitted.

ＭＰＵバッファ１６Ｂは、制御メッセージ（ＭＰテーブル）の出力タイミングに従って、パケットフィルタ１０Ｂから入力したメディアフラグメントユニットをＭＰＵ単位で出力する。この他、ＭＰＵバッファ１６Ｂは、第１実施形態と同様のため、説明を省略する。 The MPU buffer 16B outputs the media fragment units input from the packet filter 10B in MPU units according to the output timing of the control message (MP table). Other than this, the MPU buffer 16B is the same as in the first embodiment, so a description is omitted.

［作用・効果］ [Action and Effects]
以上のように、ＭＭＴ変換装置１Ｂは、ＣＭＡＦ適応ＭＭＴが「ｈｅｖ１」に対応する場合でも、第１実施形態と同様にＣＭＡＦ適応ＭＭＴをＣＭＡＦ非適応ＭＭＴに変換するので、ＣＭＡＦの適否に関わらず、受信機４が正常に映像・音声を再生できる。 As described above, even when the CMAF-applied MMT uses "hev1", the MMT conversion device 1B converts the CMAF-applied MMT into a CMAF-non-applied MMT as in the first embodiment, so the receiver 4 can play back video and audio normally regardless of whether CMAF is applied.

（第２実施形態） (Second Embodiment)
［放送システムの概略］ [Overview of the broadcasting system]
図１を参照し、第２実施形態に係る放送システム１００Ｃの概略について説明する。 An overview of a broadcasting system 100C according to the second embodiment will be described with reference to FIG. 1.
図１に示すように、放送システム１００Ｃは、デジタル放送を行うものであり、符号化装置２と、送出装置３Ｃと、受信機４Ｃとを備える。 As shown in FIG. 1, the broadcasting system 100C performs digital broadcasting and includes an encoding device 2, a transmission device 3C, and a receiver 4C.

本実施形態では、ＣＭＡＦに対応していない送出装置３Ｃが、ＣＭＡＦに対応している受信機４Ｃに対し、ＣＭＡＦ非適応ＭＭＴを送出することとする。そこで、受信機４Ｃは、内蔵したＭＭＴ変換装置（多重信号変換装置）５によって、ＣＭＡＦ非適応ＭＭＴをＣＭＡＦ適応ＭＭＴに変換する。 In this embodiment, a transmission device 3C that does not support CMAF sends a CMAF-non-applied MMT to a receiver 4C that supports CMAF. The receiver 4C therefore converts the CMAF-non-applied MMT into a CMAF-applied MMT using a built-in MMT conversion device (multiplex signal conversion device) 5.

［ＭＭＴ変換装置の構成］ [Configuration of the MMT conversion device]
図５を参照し、ＭＭＴ変換装置５の構成について説明する。 The configuration of the MMT conversion device 5 will be described with reference to FIG. 5.
ＭＭＴ変換装置５は、図８（ｂ）のＣＭＡＦ非適応ＭＭＴを図９（ｂ）のＣＭＡＦ適応ＭＭＴ（ｈｖｃ１）に変換するものである。図５に示すように、ＭＭＴ変換装置５は、パケットフィルタ（分離部）５０と、記述子抽出・削除部５１と、パラメータセット抽出・削除部５２と、メタデータ変換部（変換部）５３と、ＭＰＵバッファ（出力部）５４と、パケット混合部（混合部）５５とを備える。 The MMT conversion device 5 converts the CMAF-non-applied MMT of FIG. 8(b) into the CMAF-applied MMT (hvc1) of FIG. 9(b). As shown in FIG. 5, the MMT conversion device 5 includes a packet filter (separation unit) 50, a descriptor extraction/deletion unit 51, a parameter set extraction/deletion unit 52, a metadata conversion unit (conversion unit) 53, an MPU buffer (output unit) 54, and a packet mixing unit (mixing unit) 55.

パケットフィルタ５０は、ＣＭＡＦ非適用ＭＭＴから、制御メッセージ（ＭＰテーブル）と、メディアフラグメントユニット（メディアデータ）と、その他のパケットとを分離するものである。 The packet filter 50 separates control messages (MP tables), media fragment units (media data), and other packets from the CMAF-non-applied MMT.

図８（ｂ）に示すように、パケットフィルタ５０は、ＣＭＡＦ非適用ＭＭＴ（ＭＭＴＰパケット）のＰＩＤを参照し、メディアフラグメントユニット等の分離を行う。具体的には、パケットフィルタ５０は、ＰＩＤ＝０のＭＭＴＰパケットを制御メッセージ（ＭＰテーブル）、ＰＩＤ＝ＸのＭＭＴＰパケットをメディアフラグメントユニットとして、ＣＭＡＦ非適用ＭＭＴから分離する。また、パケットフィルタ５０は、制御メッセージ（ＭＰテーブル）及びメディアフラグメントユニット以外のデータをその他のパケットとして、ＣＭＡＦ非適用ＭＭＴから分離する。 As shown in FIG. 8(b), the packet filter 50 refers to the PID of the CMAF-non-applied MMT (MMTP packet) and separates media fragment units, etc. Specifically, the packet filter 50 separates MMTP packets with PID=0 as control messages (MP tables) and MMTP packets with PID=X as media fragment units from the CMAF-non-applied MMT. The packet filter 50 also separates data other than the control messages (MP tables) and media fragment units as other packets from the CMAF-non-applied MMT.

ここで、パケットフィルタ５０は、ＭＰテーブル内のアセットロケーション情報を参照することで、変換対象のアセットを伝送するＰＩＤ（＝Ｘ）を特定できる。なお、エントリポイントであるＰＩＤ＝０の制御メッセージにはパッケージリストテーブルが含まれ、パッケージリストテーブルから参照される別のＰＩＤでＭＰテーブルが伝送される場合がある。この場合、パケットフィルタ５０は、パッケージリストテーブルを参照することで制御メッセージ（ＭＰテーブル）を伝送するＰＩＤを特定し、制御メッセージ（ＭＰテーブル）を分離できる。 Here, the packet filter 50 can identify the PID (=X) that transmits the asset to be converted by referencing the asset location information in the MP table. Note that the control message with PID=0, which is the entry point, includes a package list table, and the MP table may be transmitted by another PID referenced from the package list table. In this case, the packet filter 50 can identify the PID that transmits the control message (MP table) by referencing the package list table, and separate the control message (MP table).

また、パケットフィルタ５０は、制御メッセージ（ＭＰテーブル）を記述子抽出・削除部５１に出力し、メディアフラグメントユニットをパラメータセット抽出・削除部５２に出力し、その他のパケットをパケット混合部５５に出力する。 The packet filter 50 also outputs the control message (MP table) to the descriptor extraction and deletion unit 51, outputs the media fragment unit to the parameter set extraction and deletion unit 52, and outputs other packets to the packet mixing unit 55.

記述子抽出・削除部５１は、パケットフィルタ５０より入力した制御メッセージ（ＭＰテーブル）から拡張ＭＰＵタイムスタンプ記述子を抽出すると共に、制御メッセージ（ＭＰテーブル）の拡張ＭＰＵタイムスタンプ記述子を削除するものである。そして、記述子抽出・削除部５１は、制御メッセージ（ＭＰテーブル）から抽出した拡張ＭＰＵタイムスタンプ記述子をメタデータ変換部５３に出力する。さらに、記述子抽出・削除部５１は、拡張ＭＰＵタイムスタンプ記述子を削除した制御メッセージ（ＭＰテーブル）をパケット混合部５５に出力する。 The descriptor extraction/deletion unit 51 extracts the extended MPU timestamp descriptor from the control message (MP table) input from the packet filter 50, and deletes the extended MPU timestamp descriptor from the control message (MP table). The descriptor extraction/deletion unit 51 then outputs the extended MPU timestamp descriptor extracted from the control message (MP table) to the metadata conversion unit 53. Furthermore, the descriptor extraction/deletion unit 51 outputs the control message (MP table) from which the extended MPU timestamp descriptor has been deleted to the packet mixing unit 55.

パラメータセット抽出・削除部５２は、パケットフィルタ５０より入力したメディアフラグメントユニットから映像符号化のパラメータセットを抽出すると共に、メディアフラグメントユニットのパラメータセットを削除するものである。具体的には、パラメータセット抽出・削除部５２は、ＭＰＵ先頭のフレームのメディアフラグメントユニットからＨＥＶＣのパラメータセットを抽出し、抽出したパラメータセットをメタデータ変換部５３に出力する。さらに、パラメータセット抽出・削除部５２は、パラメータセットを削除したメディアフラグメントユニットをＭＰＵバッファ５４に出力する。 The parameter set extraction/deletion unit 52 extracts a video encoding parameter set from the media fragment unit input by the packet filter 50, and deletes the parameter set from the media fragment unit. Specifically, the parameter set extraction/deletion unit 52 extracts a HEVC parameter set from the media fragment unit of the first frame of the MPU, and outputs the extracted parameter set to the metadata conversion unit 53. Furthermore, the parameter set extraction/deletion unit 52 outputs the media fragment unit from which the parameter set has been deleted to the MPU buffer 54.
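
The extract-and-delete step can be sketched as a single pass over the NAL units of the first frame. HEVC identifies VPS, SPS, and PPS by NAL unit types 32, 33, and 34, read from bits 1 to 6 of the first NAL header byte. The 4-byte big-endian length prefix in front of each NAL unit and the function name `split_parameter_sets` are assumptions made for illustration.

```python
import struct
from typing import List, Tuple

HEVC_PARAMETER_SET_TYPES = {32, 33, 34}  # VPS, SPS, PPS


def split_parameter_sets(mfu_payload: bytes) -> Tuple[List[bytes], bytes]:
    """Return (parameter set NAL units, payload with those NAL units removed)."""
    parameter_sets: List[bytes] = []
    remaining = bytearray()
    pos = 0
    while pos < len(mfu_payload):
        if pos + 4 > len(mfu_payload):
            raise ValueError("truncated NAL length field")
        (length,) = struct.unpack_from(">I", mfu_payload, pos)
        end = pos + 4 + length
        if length == 0 or end > len(mfu_payload):
            raise ValueError("invalid NAL unit length")
        nal = mfu_payload[pos + 4:end]
        nal_unit_type = (nal[0] >> 1) & 0x3F
        if nal_unit_type in HEVC_PARAMETER_SET_TYPES:
            parameter_sets.append(nal)
        else:
            # Keep the length prefix together with every retained NAL unit.
            remaining += mfu_payload[pos:end]
        pos = end
    return parameter_sets, bytes(remaining)
```

The first return value would go to the metadata conversion unit 53 and the second to the MPU buffer 54.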

メタデータ変換部５３は、記述子抽出・削除部５１から入力した拡張ＭＰＵタイムスタンプ記述子のＤＴＳ－ＰＴＳ差分情報をムービーフラグメントメタデータに変換するものである。具体的には、メタデータ変換部５３は、拡張ＭＰＵタイムスタンプ記述子を解析して、ＤＴＳ－ＰＴＳ差分情報をＩＳＯＢＭＦＦ及びＣＭＡＦで規定されるＢｏｘ形式のメタデータに変換する。 The metadata conversion unit 53 converts the DTS-PTS difference information of the extended MPU timestamp descriptor input from the descriptor extraction and deletion unit 51 into movie fragment metadata. Specifically, the metadata conversion unit 53 analyzes the extended MPU timestamp descriptor and converts the DTS-PTS difference information into metadata in the Box format defined by ISOBMFF and CMAF.
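
One plausible form of the Box-format output is sketched below, assuming the DTS-PTS difference is carried as the per-sample composition time offset of a track run (`trun`) box. The description does not name the box, so this mapping is an assumption. The sketch writes only the offsets (version 1, flag 0x000800); a real CMAF fragment would also carry fields such as the data offset and sample sizes, plus the enclosing `moof` and `traf` boxes.

```python
import struct
from typing import Sequence


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize an ISOBMFF box: 32-bit size, 4-character type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def make_trun_with_offsets(offsets: Sequence[int]) -> bytes:
    """Build a minimal 'trun' box holding signed composition time offsets."""
    version = 1          # version 1 stores the offsets as signed 32-bit values
    flags = 0x000800     # sample-composition-time-offsets-present
    body = struct.pack(">I", (version << 24) | flags)
    body += struct.pack(">I", len(offsets))          # sample_count
    body += b"".join(struct.pack(">i", offset) for offset in offsets)
    return make_box(b"trun", body)
```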

また、メタデータ変換部５３は、パラメータセット抽出・削除部５２から入力したパラメータセットを含むＭＰＵメタデータ（ムービーメタデータ）を生成する。具体的には、メタデータ変換部５３は、ＨＥＶＣのパラメータセットをＩＳＯＢＭＦＦ及びＣＭＡＦで規定されるＢｏｘ形式のメタデータを生成し、ＭＰＵメタデータとしてパケット混合部５５に出力する。なお、ＭＰＵメタデータは、ＭＰＵの先頭で一度だけ出力する。 The metadata conversion unit 53 also generates MPU metadata (movie metadata) including the parameter set input from the parameter set extraction/deletion unit 52. Specifically, the metadata conversion unit 53 generates metadata in a Box format defined by ISOBMFF and CMAF from the HEVC parameter set, and outputs it to the packet mixing unit 55 as MPU metadata. Note that the MPU metadata is output only once at the beginning of the MPU.

そして、メタデータ変換部５３は、チャンク１個分のＤＴＳ－ＰＴＳ差分情報の変換が完了すると、ムービーフラグメントメタデータをパケット混合部５５に出力すると共に、ＭＰＵバッファ５４に出力指示を行う。この出力指示は、ムービーフラグメントメタデータの出力タイミングに同期させて、ＭＰＵバッファ５４が出力すべきチャンクを指定している。 Then, when the metadata conversion unit 53 has completed the conversion of the DTS-PTS difference information for one chunk, it outputs the movie fragment metadata to the packet mixing unit 55 and issues an output instruction to the MPU buffer 54. This output instruction specifies the chunk that the MPU buffer 54 should output, synchronized with the output timing of the movie fragment metadata.

ＭＰＵバッファ５４は、パラメータセット抽出・削除部５２から入力したメディアフラグメントユニットを蓄積するバッファである。また、ＭＰＵバッファ５４は、ムービーフラグメントメタデータの出力タイミングに従って、チャンク単位でメディアフラグメントユニットを出力する。つまり、ＭＰＵバッファ５４は、メタデータ変換部５３からの出力指示で指定されたチャンクに対応するメディアフラグメントユニットをパケット混合部５５に出力する。メタデータ変換部５３からＭＰＵバッファ５４への出力指示は、例えば、ＭＰＵシーケンス番号とそのＭＰＵの中の何番目のチャンクかにより、チャンクを指定して出力させるものである。 The MPU buffer 54 is a buffer that accumulates the media fragment units input from the parameter set extraction/deletion unit 52. The MPU buffer 54 also outputs the media fragment units in chunk units according to the output timing of the movie fragment metadata. In other words, the MPU buffer 54 outputs the media fragment units corresponding to the chunks specified in the output instruction from the metadata conversion unit 53 to the packet mixing unit 55. The output instruction from the metadata conversion unit 53 to the MPU buffer 54 specifies and outputs a chunk, for example, based on the MPU sequence number and the number of the chunk within that MPU.

パケット混合部５５は、記述子抽出・削除部５１から入力した制御メッセージ（ＭＰテーブル）と、メタデータ変換部５３から入力したＭＰＵメタデータ及びムービーフラグメントメタデータと、ＭＰＵバッファ５４から入力したメディアフラグメントユニットと、パケットフィルタ５０から入力したその他のパケットとを混合し、ＣＭＡＦ適用ＭＭＴとして出力するものである。 The packet mixing unit 55 mixes the control message (MP table) input from the descriptor extraction/deletion unit 51, the MPU metadata and movie fragment metadata input from the metadata conversion unit 53, the media fragment unit input from the MPU buffer 54, and other packets input from the packet filter 50, and outputs the result as a CMAF-applied MMT.

ここで、記述子抽出・削除部５１、パラメータセット抽出・削除部５２及びＭＰＵバッファ５４の各処理においては、ＭＭＴＰパケットの形式を維持して処理し、メタデータ変換部５３でＭＭＴＰパケットを生成してもよいし、又は、ＭＭＴＰパケットのペイロードである処理対象データを一旦抽出した形式で処理してもよい。前者の場合、パケット混合部５５は、複数のＭＭＴＰパケット列を入力として、それらを混合した単一のＭＭＴＰパケット列として出力する。後者の場合、パケット混合部５５は、制御メッセージ（ＭＰテーブル）、ＭＰＵメタデータ、ムービーフラグメントメタデータ、及び、メディアフラグメントユニットをペイロードとして含むＭＭＴＰパケットを生成し、それらをその他のパケットとして入力されるＭＭＴＰパケット列に混合して、単一のＭＭＴＰパケット列として出力する。さらに、パケット混合部５５は、必要に応じて、出力するＭＭＴＰパケット列についてパケットシーケンス番号の連続性を修正するなど、ヘッダ部を書き換えてもよい。 Here, the descriptor extraction/deletion unit 51, the parameter set extraction/deletion unit 52, and the MPU buffer 54 may each operate on data that keeps the MMTP packet format, with the metadata conversion unit 53 generating MMTP packets, or they may operate on the data to be processed after it has been extracted from the MMTP packet payload. In the former case, the packet mixing unit 55 takes a plurality of MMTP packet sequences as input, mixes them, and outputs a single MMTP packet sequence. In the latter case, the packet mixing unit 55 generates MMTP packets that carry the control message (MP table), the MPU metadata, the movie fragment metadata, and the media fragment units as payloads, mixes them into the MMTP packet sequence input as other packets, and outputs a single MMTP packet sequence. Furthermore, the packet mixing unit 55 may rewrite the header part of the output MMTP packet sequence as necessary, for example to restore the continuity of the packet sequence numbers.

［ＭＭＴ変換装置の動作］ [Operation of the MMT conversion device]
図６を参照し、ＭＭＴ変換装置５の動作について説明する。 The operation of the MMT conversion device 5 will be described with reference to FIG. 6.
図６に示すように、ステップＳ１０において、パケットフィルタ５０は、ＣＭＡＦ非適用ＭＭＴから、制御メッセージ（ＭＰテーブル）と、メディアフラグメントユニットと、その他のパケットとを分離する。 As shown in FIG. 6, in step S10, the packet filter 50 separates control messages (MP tables), media fragment units, and other packets from the CMAF-non-applied MMT.

ステップＳ１１において、パラメータセット抽出・削除部５２は、メディアフラグメントユニットから映像符号化のパラメータセットを抽出すると共に、メディアフラグメントユニットのパラメータセットを削除する。 In step S11, the parameter set extraction/deletion unit 52 extracts a video encoding parameter set from the media fragment unit and deletes the parameter set from the media fragment unit.

ステップＳ１２において、記述子抽出・削除部５１は、制御メッセージ（ＭＰテーブル）から拡張ＭＰＵタイムスタンプ記述子を抽出すると共に、制御メッセージ（ＭＰテーブル）の拡張ＭＰＵタイムスタンプ記述子を削除する。 In step S12, the descriptor extraction/deletion unit 51 extracts the extended MPU timestamp descriptor from the control message (MP table) and deletes the extended MPU timestamp descriptor from the control message (MP table).

ステップＳ１３において、メタデータ変換部５３は、抽出したパラメータセットを含むＭＰＵメタデータを生成する。 In step S13, the metadata conversion unit 53 generates MPU metadata including the extracted parameter set.
ステップＳ１４において、メタデータ変換部５３は、拡張ＭＰＵタイムスタンプ記述子のＤＴＳ－ＰＴＳ差分情報を変換したムービーフラグメントメタデータを生成する。 In step S14, the metadata conversion unit 53 generates movie fragment metadata by converting the DTS-PTS difference information of the extended MPU timestamp descriptor.

ステップＳ１５において、ＭＰＵバッファ５４は、ムービーフラグメントメタデータの出力タイミングに従って、チャンク単位でメディアフラグメントユニットを出力する。 In step S15, the MPU buffer 54 outputs the media fragment units in chunk units in accordance with the output timing of the movie fragment metadata.
ステップＳ１６において、パケット混合部５５は、制御メッセージと、ＭＰＵメタデータと、ムービーフラグメントメタデータと、メディアフラグメントユニットと、その他のパケットとを混合し、ＣＭＡＦ適用ＭＭＴとして出力する。 In step S16, the packet mixing unit 55 mixes the control message, the MPU metadata, the movie fragment metadata, the media fragment units, and the other packets, and outputs the result as a CMAF-applied MMT.

なお、ステップＳ１０～Ｓ１６の処理は、図６の順序で逐次的に実行せずとも、入力されたパケットの種別や順序に応じて、各ステップの処理順序を入れ替えたり、各ステップの処理を同時並列に実行してもよい。 The processing of steps S10 to S16 does not have to be performed sequentially in the order shown in FIG. 6. Depending on the type and order of the input packets, the processing order of each step may be changed, or each step may be performed simultaneously in parallel.

［作用・効果］ [Action and Effects]
以上のように、ＭＭＴ変換装置５は、ＣＭＡＦ非適応ＭＭＴをＣＭＡＦ適応ＭＭＴに変換するので、ＣＭＡＦの適否に関わらず、受信機４Ｃが正常に映像・音声を再生できる。このように、受信機４Ｃは、ＣＭＡＦで規定されたチャンク構造のＩＳＯＢＭＦＦメタデータのみに対応する場合でも、ＭＭＴ変換装置５によって、正常に映像・音声を再生できる。 As described above, the MMT conversion device 5 converts a CMAF-non-applied MMT into a CMAF-applied MMT, so the receiver 4C can play back video and audio normally regardless of whether CMAF is applied. Thus, even if the receiver 4C supports only the chunk-structured ISOBMFF metadata defined by CMAF, it can play back video and audio normally by means of the MMT conversion device 5.

（変形例２） (Modification 2)
図７を参照し、変形例２に係るＭＭＴ変換装置５Ｂについて、第２実施形態と異なる点を説明する。 With reference to FIG. 7, the differences between the MMT conversion device 5B according to Modification 2 and the second embodiment will be described.
変形例２では、ＣＭＡＦ適応ＭＭＴが「ｈｅｖ１」に対応し、パラメータセットをメディアフラグメントユニットに含むこととする。つまり、ＭＭＴ変換装置５Ｂは、パラメータセットが入力のメディアフラグメントユニットに元々含まれており、そのまま出力すればよいので、パラメータセットをメディアフラグメントユニットから抽出して削除する必要がない。 In Modification 2, the CMAF-applied MMT uses "hev1", and the parameter set is carried in the media fragment unit. In other words, because the parameter set is already contained in the input media fragment unit, the MMT conversion device 5B can output the media fragment unit as is and does not need to extract the parameter set from the media fragment unit and delete it.

ＭＭＴ変換装置５Ｂは、図８（ｂ）のＣＭＡＦ非適応ＭＭＴを図９（ｂ）のＣＭＡＦ適応ＭＭＴ（ｈｅｖ１）に変換するものである。図７に示すように、ＭＭＴ変換装置５Ｂは、パケットフィルタ（分離部）５０Ｂと、記述子抽出・削除部５１と、メタデータ変換部（変換部）５３Ｂと、ＭＰＵバッファ（出力部）５４Ｂと、パケット混合部（混合部）５５とを備える。 The MMT conversion device 5B converts the CMAF non-adaptive MMT of FIG. 8(b) into the CMAF adapted MMT (hev1) of FIG. 9(b). As shown in FIG. 7, the MMT conversion device 5B includes a packet filter (separation unit) 50B, a descriptor extraction/deletion unit 51, a metadata conversion unit (conversion unit) 53B, an MPU buffer (output unit) 54B, and a packet mixing unit (mixing unit) 55.

パケットフィルタ５０Ｂは、ＣＭＡＦ非適用ＭＭＴから分離した制御メッセージ（ＭＰテーブル）を記述子抽出・削除部５１に出力する。この他、パケットフィルタ５０Ｂは、第２実施形態と同様のため、説明を省略する。 The packet filter 50B outputs the control message (MP table) separated from the CMAF-non-applied MMT to the descriptor extraction/deletion unit 51. In other respects, the packet filter 50B is the same as in the second embodiment, so its description is omitted.

メタデータ変換部５３Ｂは、パラメータセットを含むＭＰＵメタデータを生成しない以外、第２実施形態と同様のため、説明を省略する。なお、メタデータ変換部５３Ｂは、パラメータセットを含まないＭＰＵメタデータ（図示せず）を生成して出力してもよい。 The metadata conversion unit 53B is similar to the second embodiment except that it does not generate MPU metadata that includes a parameter set, so a description thereof will be omitted. Note that the metadata conversion unit 53B may generate and output MPU metadata (not shown) that does not include a parameter set.

ＭＰＵバッファ５４Ｂは、制御メッセージ（ＭＰテーブル）の出力タイミングに従って、パケットフィルタ５０Ｂから入力したメディアフラグメントユニットをチャンク単位で出力する。この他、ＭＰＵバッファ５４Ｂは、第２実施形態と同様のため、説明を省略する。 The MPU buffer 54B outputs the media fragment units input from the packet filter 50B in chunk units according to the output timing of the control message (MP table). Other than this, the MPU buffer 54B is the same as in the second embodiment, so a description is omitted.

［作用・効果］ [Action and Effects]
以上のように、ＭＭＴ変換装置５Ｂは、ＣＭＡＦ適応ＭＭＴが「ｈｅｖ１」に対応する場合でも、第２実施形態と同様にＣＭＡＦ非適応ＭＭＴをＣＭＡＦ適応ＭＭＴに変換するので、ＣＭＡＦの適否に関わらず、受信機４Ｃが正常に映像・音声を再生できる。 As described above, even when the CMAF-applied MMT uses "hev1", the MMT conversion device 5B converts the CMAF-non-applied MMT into a CMAF-applied MMT as in the second embodiment, so the receiver 4C can play back video and audio normally regardless of whether CMAF is applied.

以上、本発明の各実施形態を詳述してきたが、本発明はこれらに限られるものではなく、本発明の要旨を逸脱しない範囲の設計変更等も含まれる。 Although the embodiments of the present invention have been described in detail above, the present invention is not limited to them and encompasses design changes and the like that do not depart from the gist of the invention.
前記した各実施形態では、多重方式がＭＭＴであることとして説明したが、これに限定されない。例えば、ＭＭＴ変換装置への入力及びＭＭＴ変換装置からの出力の少なくとも一方において、多重方式がＤＡＳＨ／ＲＯＵＴＥであってもよい。 In the above embodiments, the multiplexing scheme is described as MMT, but it is not limited to this. For example, the multiplexing scheme may be DASH/ROUTE for at least one of the input to the MMT conversion device and the output from the MMT conversion device.

前記した各実施形態では、映像符号化方式がＨＥＶＣであることとして説明したが、これに限定されない。例えば、映像符号化方式は、ＡＶＣ（Advanced Video Coding）、ＶＶＣ（Versatile Video Coding）であってもよい。また、本発明は、符号化方式が映像符号化方式に限られず、音声符号化方式であるＡＡＣや３ＤＡ（3D Audio）にも適用できる。 In the above-described embodiments, the video encoding method is described as HEVC, but this is not limited to this. For example, the video encoding method may be AVC (Advanced Video Coding) or VVC (Versatile Video Coding). Furthermore, the encoding method of the present invention is not limited to the video encoding method, and can also be applied to audio encoding methods such as AAC and 3DA (3D Audio).

前記した各実施形態では、ＭＭＴ変換装置が受信機に内蔵されていることとして説明したが、これに限定されない。例えば、ＭＭＴ変換装置は、独立したハードウェアとして実装してもよい。また、放送局側の符号化装置又は送出装置がＭＭＴ変換装置を内蔵してもよい。 In each of the above-described embodiments, the MMT conversion device is described as being built into the receiver, but this is not limited to the above. For example, the MMT conversion device may be implemented as independent hardware. Also, the MMT conversion device may be built into the encoding device or transmission device on the broadcasting station side.

また、コンピュータが備えるＣＰＵ、メモリ、ハードディスク等のハードウェア資源を、前記したＭＭＴ変換装置として動作させるプログラムで実現することもできる。これらのプログラムは、通信回線を介して配布してもよく、ＣＤ－ＲＯＭやフラッシュメモリ等の記録媒体に書き込んで配布してもよい。 It can also be realized by a program that causes the hardware resources of a computer, such as a CPU, memory, and hard disk, to operate as the MMT conversion device described above. These programs may be distributed via a communication line, or written to a recording medium such as a CD-ROM or flash memory and distributed.

１，１ＢＭＭＴ変換装置（多重信号変換装置）
１０，１０Ｂパケットフィルタ（分離部）
１１メッセージバッファ
１２記述子変換部
１３記述子追加部
１４パラメータセット抽出部
１５パラメータセット追加部
１６，１６ＢＭＰＵバッファ（出力部）
１７パケット混合部（混合部）
２符号化装置
３，３Ｃ送出装置
４，４Ｃ受信機
５，５ＢＭＭＴ変換装置（多重信号変換装置）
５０，５０Ｂパケットフィルタ（分離部）
５１記述子抽出・削除部
５２パラメータセット抽出・削除部
５３，５３Ｂメタデータ変換部（変換部）
５４，５４ＢＭＰＵバッファ（出力部）
５５パケット混合部（混合部）
１００，１００Ｃ放送システム

1, 1B MMT conversion device (multiplex signal conversion device)
10, 10B Packet filter (separation unit)
11 Message buffer
12 Descriptor conversion unit
13 Descriptor adding unit
14 Parameter set extraction unit
15 Parameter set adding unit
16, 16B MPU buffer (output unit)
17 Packet mixing unit (mixing unit)
2 Encoding device
3, 3C Transmission device
4, 4C Receiver
5, 5B MMT conversion device (multiplex signal conversion device)
50, 50B Packet filter (separation unit)
51 Descriptor extraction/deletion unit
52 Parameter set extraction/deletion unit
53, 53B Metadata conversion unit (conversion unit)
54, 54B MPU buffer (output unit)
55 Packet mixing unit (mixing unit)
100, 100C Broadcasting system

Claims

1. A multiplex signal conversion device that converts a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, into a CMAF non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, the multiplex signal conversion device comprising:
a separation unit that separates movie metadata, movie fragment metadata, a control message, and media data from the CMAF-applied multiplex signal;
a descriptor conversion unit that converts DTS-PTS difference information of the movie fragment metadata into a descriptor;
a descriptor adding unit that adds the descriptor to the control message;
an output unit that outputs the media data in units of fragments in accordance with an output timing of the control message; and
a mixing unit that mixes the control message from the descriptor adding unit and the media data from the output unit, and outputs the result as the CMAF non-applied multiplex signal.
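
By way of a non-limiting illustration, the flow of units recited in this claim can be sketched in Python as follows. The `Unit` class, the `fragment` index used to pair a control message with its fragment, the `dts_pts_offsets` key, and the `dts_pts_descriptor` tag are all hypothetical names introduced for this sketch; they are not taken from the specification or from any standard.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    kind: str       # "movie_meta", "fragment_meta", "control", or "media"
    fragment: int   # hypothetical index of the fragment this unit belongs to
    payload: dict = field(default_factory=dict)


def cmaf_to_non_cmaf(units):
    """Toy model of the claimed pipeline; every name here is illustrative."""
    # Separation unit: sort the input by unit kind.
    by_kind = {"movie_meta": [], "fragment_meta": [], "control": [], "media": []}
    for u in units:
        by_kind[u.kind].append(u)

    # Descriptor conversion unit: DTS-PTS difference information -> descriptor.
    descriptors = {
        m.fragment: {"tag": "dts_pts_descriptor",
                     "offsets": m.payload["dts_pts_offsets"]}
        for m in by_kind["fragment_meta"]
    }

    out = []
    for ctrl in by_kind["control"]:
        # Descriptor adding unit: attach the descriptor to the control message.
        payload = dict(ctrl.payload)
        if ctrl.fragment in descriptors:
            payload["descriptors"] = (
                payload.get("descriptors", []) + [descriptors[ctrl.fragment]]
            )
        # Mixing unit: emit the control message, then the fragment's media data.
        out.append(Unit("control", ctrl.fragment, payload))
        # Output unit: release media data one fragment at a time, timed to
        # the output of the corresponding control message.
        out.extend(m for m in by_kind["media"] if m.fragment == ctrl.fragment)
    return out
```

In this sketch the movie metadata and movie fragment metadata do not appear in the output, since only the control message and the media data are mixed into the CMAF non-applied signal.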

2. The multiplex signal conversion device according to claim 1, further comprising:
a parameter set extraction unit that extracts an encoding parameter set from the movie metadata; and
a parameter set adding unit that adds the parameter set to the media data,
wherein the output unit outputs the media data to which the parameter set has been added in units of the fragments in accordance with the output timing of the control message.
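
As a minimal, non-limiting sketch of the parameter set extraction and adding units, the following Python function copies the parameter sets out of the movie metadata and attaches them to the media data. The `parameter_sets` key and the choice to prepend the sets to the head of the media data are assumptions made for this illustration only.

```python
def add_parameter_sets(movie_metadata: dict, media_data: bytes) -> bytes:
    # Parameter set extraction unit: read the encoding parameter sets from
    # the movie metadata (hypothetical "parameter_sets" key).
    parameter_sets = movie_metadata.get("parameter_sets", [])
    # Parameter set adding unit: prepend the sets to the media data.
    # Prepending is an assumption of this sketch, not a stated requirement.
    return b"".join(parameter_sets) + media_data
```

For example, calling `add_parameter_sets({"parameter_sets": [b"VPS", b"SPS", b"PPS"]}, b"IDR")` returns the three parameter sets followed by the media bytes.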

3. A multiplex signal conversion device that converts a CMAF non-applied multiplex signal, which is a multiplex signal to which CMAF is not applied, into a CMAF-applied multiplex signal, which is a multiplex signal to which CMAF is applied, the multiplex signal conversion device comprising:
a separation unit that separates a control message and media data from the CMAF non-applied multiplex signal;
a descriptor extraction/deletion unit that extracts a descriptor including DTS-PTS difference information from the control message and deletes the descriptor from the control message;
a conversion unit that converts the DTS-PTS difference information of the descriptor into movie fragment metadata;
an output unit that outputs the media data in units of chunks in accordance with an output timing of the movie fragment metadata; and
a mixing unit that mixes the control message from the descriptor extraction/deletion unit, the movie fragment metadata from the conversion unit, and the media data from the output unit, and outputs the result as the CMAF-applied multiplex signal.
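
By way of a non-limiting illustration, the reverse conversion recited in this claim can be sketched in Python as follows. Control messages and media chunks are modeled as plain dictionaries; the `fragment` key used to pair them, the `dts_pts_descriptor` tag, and the `dts_pts_offsets` key are hypothetical names introduced for this sketch.

```python
def non_cmaf_to_cmaf(control_messages, media_chunks):
    """Toy model of the claimed pipeline; inputs are already separated."""
    out = []
    for ctrl in control_messages:
        # Descriptor extraction/deletion unit: pull out the descriptor that
        # carries the DTS-PTS difference information and keep the rest.
        kept, offsets = [], None
        for d in ctrl.get("descriptors", []):
            if d.get("tag") == "dts_pts_descriptor":
                offsets = d["offsets"]
            else:
                kept.append(d)
        out.append(("control", {**ctrl, "descriptors": kept}))

        # Conversion unit: DTS-PTS difference information -> movie fragment
        # metadata (emitted only when a descriptor was found).
        if offsets is not None:
            out.append(("fragment_meta",
                        {"fragment": ctrl["fragment"],
                         "dts_pts_offsets": offsets}))

        # Output unit: release the media data chunk by chunk, timed to the
        # output of the movie fragment metadata. Mixing unit: all three kinds
        # of unit are appended to a single output sequence.
        out.extend(("media", c) for c in media_chunks
                   if c["fragment"] == ctrl["fragment"])
    return out
```

The output sequence places each control message first, then the regenerated movie fragment metadata, then that fragment's chunks, which is one possible ordering consistent with the claim wording.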

4. The multiplex signal conversion device according to claim 3, further comprising a parameter set extraction/deletion unit that extracts an encoding parameter set from the media data and deletes the parameter set from the media data,
wherein the conversion unit generates movie metadata including the parameter set.

5. The multiplex signal conversion device according to claim 1, 2, or 4, wherein the multiplex signal is MMT, the movie metadata is MPU metadata, the media data is a media fragment unit, and the descriptor is an extended MPU timestamp descriptor.

6. The multiplex signal conversion device according to claim 3, wherein the multiplex signal is MMT, the media data is a media fragment unit, and the descriptor is an extended MPU timestamp descriptor.

7. The multiplex signal conversion device according to claim 1 or 2, wherein the fragment is an MPU.

8. A program for causing a computer to function as the multiplex signal conversion device according to any one of claims 1 to 7.

9. A receiver comprising the multiplex signal conversion device according to any one of claims 1 to 7.