
CN110858902A - Synchronization method of decoded tile and display position based on input geographic location and related video decoding device - Google Patents

Synchronization method of decoded tile and display position based on input geographic location and related video decoding device

Info

Publication number
CN110858902A
CN110858902A (application CN201810959428.7A)
Authority
CN
China
Prior art keywords
interest
reconstructed
image
bitstream
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810959428.7A
Other languages
Chinese (zh)
Inventor
王颂文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MStar Semiconductor Inc Taiwan
Original Assignee
MStar Semiconductor Inc Taiwan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MStar Semiconductor Inc Taiwan filed Critical MStar Semiconductor Inc Taiwan
Priority to CN201810959428.7A priority Critical patent/CN110858902A/en
Publication of CN110858902A publication Critical patent/CN110858902A/en
Pending legal-status Critical Current


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/176: adaptive coding in which the coding unit is an image region, the region being a block, e.g. a macroblock
    • H04N19/167: adaptive coding controlled by position within a video image, e.g. region of interest [ROI]
    • H04N19/184: adaptive coding in which the coding unit is bits, e.g. of the compressed video stream
    • H04N19/42: implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/44: decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/70: syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method for synchronizing decoded tiles with display positions, for use in a video bitstream decoding device. The method includes: determining a region of interest of an original image of an input bitstream according to a geographic location, so as to extract the interest-tile data of the region of interest, wherein the region of interest corresponds to an image of interest; reconstructing the input bitstream according to the interest-tile data to generate a reconstructed bitstream; decoding the reconstructed bitstream to generate a reconstructed decoded image; and converting the reconstructed decoded image into the image of interest according to the interest-tile data.

Description

Synchronization Method of Decoded Tile and Display Position Based on Input Geographic Location, and Related Video Decoding Device

Technical Field

The present invention relates to a method for synchronizing decoded tiles with display positions and to a related video decoding device, and more particularly to such a method and device based on an input geographic location.

Background

In 360-degree virtual reality (VR360) applications, when the world-map mode is used, a video decoding device (for example, a virtual reality device) determines the content shown on its screen according to the geographic location selected by the user (for example, a longitude, a latitude, and the corresponding viewing angle). Generally, a virtual reality device downloads an image file (for example, a video bitstream) from a world-map database over a network, temporarily stores the file in built-in memory (for example, a frame buffer), and then decodes the file to obtain the image to be displayed on the screen.

FIG. 1 is a schematic diagram of loading a field of view FOV1 from a world map E_MAP according to a geographic location L1. FIG. 2 is a schematic diagram of loading a field of view FOV2 from the world map E_MAP according to a geographic location L2. As shown in FIGS. 1 and 2, the world map E_MAP can be divided into a plurality of tiles, numbered 0 to 509, according to a plurality of longitudes and latitudes (for example, 30 longitude columns * 17 latitude rows = 510 tiles). Since the world map E_MAP is a flat map projected from the surface of a sphere, the size and shape of the partial image to be displayed vary with latitude and longitude.

For example, in FIG. 1, when the geographic location L1 falls at latitude 0 degrees (the equator), the field of view FOV1 includes tiles numbered 98-111, 128-141, 158-171, 188-201, 218-231, 248-261, 278-291, 308-321, 338-351, 368-381, and 398-411, 154 tiles in total. The content displayed on the screen is a region of interest ROI1, which includes tiles numbered 132-137, 160-169, 189-200, 219-230, 249-260, 279-290, 309-320, 340-349, and 372-307, 80 tiles in total.

In FIG. 2, when the geographic location L2 falls at latitude 56 degrees, the field of view FOV2 includes tiles numbered 0-29, 30-59, 60-171, 90-201, 120-231, 150-261, 180-291, 210-321, 240-351, and 270-381, 300 tiles in total. The content displayed on the screen is a region of interest ROI2, which includes tiles numbered 0-29, 36-53, 67-82, 97-112, 127-142, 157-172, 188-201, 219-230, 250-252, and 257-259, 144 tiles in total.
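The FOV1 example above can be reproduced with a small sketch. Everything here is an assumption for illustration: row-major tile numbering on the 30x17 grid described above (tile 0 at the top-left), wrapping in longitude, and a hypothetical function name.

```python
# Hypothetical sketch: enumerate the tiles of a 30x17 world-map grid
# (30 longitude columns x 17 latitude rows = 510 tiles, numbered 0-509
# as in the figures) covered by a rectangular window of tiles.
# Row-major numbering, left to right, top to bottom, is assumed.

TILE_COLS, TILE_ROWS = 30, 17

def tiles_in_rect(col0, row0, cols, rows):
    """Return the tile numbers covered by a window whose top-left tile
    is at (col0, row0), wrapping around in longitude."""
    tiles = []
    for r in range(row0, row0 + rows):
        for c in range(col0, col0 + cols):
            tiles.append(r * TILE_COLS + (c % TILE_COLS))  # wrap at the date line
    return tiles

# A 14-column x 11-row window starting at column 8, row 3 reproduces
# the FOV1 tile list (98-111 on the first row through 398-411 on the last):
fov = tiles_in_rect(col0=8, row0=3, cols=14, rows=11)
```

Under these assumptions the window yields exactly the 154 tiles of FOV1, beginning at tile 98 and ending at tile 411.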

Comparing FIGS. 1 and 2 shows that the fields of view FOV1 and FOV2 differ in both length and width, so the memory size required for partial image decoding (for example, the size of the frame buffer) also differs.

Current panoramic virtual reality technology uses the High Efficiency Video Coding (HEVC) standard for image encoding and decoding; however, existing hardware designs support a maximum pixel width of 4096 and a maximum pixel height of 2560. Because the pixel width of the field of view FOV2 is 7860, which exceeds the maximum pixel width of 4096, existing HEVC hardware cannot decode the image of the field of view FOV2, and the virtual reality device cannot display it properly.
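The hardware constraint above amounts to a one-line check. A minimal sketch, assuming the 4096x2560 limit stated in the text; the constant and function names are made up for illustration:

```python
# Sketch of the hardware constraint described above: a decoder whose
# maximum supported frame size is 4096x2560 cannot decode FOV2 (7860
# pixels wide) as a single frame.
MAX_W, MAX_H = 4096, 2560  # maximum pixel width/height from the text

def decodable(width, height):
    """True if a frame of the given size fits within the hardware limit."""
    return width <= MAX_W and height <= MAX_H
```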

In addition, the HEVC standard defines the motion-constrained tile set (MCTS), which allows the tiles within an MCTS to be decoded independently without reference to other data. However, the applicant has observed that, during MCTS decoding, if the user moves the viewpoint from the geographic location L1 to the geographic location L2, existing video decoding devices do not synchronously change the target tiles to be decoded (precisely because the tiles in an MCTS can be decoded independently without reference to other data). As a result, the video decoding device maps the decoded image of the region of interest ROI1 onto the display position of the region of interest ROI2, causing the virtual reality device to display an incorrect image.

In view of this, how to solve the problems of tile reconstruction and of synchronizing decoded tiles with display positions, so as to ensure that virtual reality devices display correctly, is an important issue in the art.

Summary of the Invention

Therefore, a main objective of the present invention is to provide a method for synchronizing decoded tiles with display positions, and a related video decoding device, so as to ensure that a virtual reality device can display correctly.

The present invention discloses a method for synchronizing decoded tiles with display positions, for use in a video decoding device. The method includes: determining a region of interest of an original image of an input bitstream according to a geographic location, so as to extract the interest-tile data of the region of interest, wherein the region of interest corresponds to an image of interest; reconstructing the input bitstream according to the interest-tile data to generate a reconstructed bitstream; decoding the reconstructed bitstream to generate a reconstructed decoded image; and converting the reconstructed decoded image into the image of interest according to the interest-tile data.

The present invention further discloses a video decoding device for an electronic device, including: a bitstream receiving unit for receiving an input bitstream; a decoded frame buffer for temporarily storing a reconstructed decoded image of the input bitstream; and a processor, coupled to the bitstream receiving unit and the decoded frame buffer, for executing a process of synchronizing decoded tiles with display positions, wherein the process includes all steps of the above method.

The video decoding device of the present invention can, according to a geographic location input by the user, extract the image decoding data of the input bitstream's image frames that belongs to the region of interest; given the size of the decoded frame buffer, reconstruct the input bitstream from that data to generate a reconstructed bitstream; decode the reconstructed bitstream to generate a reconstructed decoded image and temporarily store it in the decoded frame buffer; and finally reassemble the reconstructed decoded image (for example, the tiles belonging to the region of interest) to restore and produce the decoded image. In this way, the video decoding device of the present invention can decode the tiles belonging to the region of interest without enlarging the decoded frame buffer, which ensures that the electronic device can display the decoded image correctly and avoids increasing the hardware cost of the video decoding device. Furthermore, because the reconstructed bitstream is generated according to the geographic location input by the user and the decoding data of the corresponding region of interest, the decoded image displayed by the electronic device is guaranteed to map correctly to the geographic location input by the user.
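The four steps of the method can be sketched as a data-flow skeleton. The helper functions below are trivial stand-ins, not a real HEVC codec, and every name is hypothetical; only the order and flow of the four steps mirror the method described above.

```python
# Data-flow skeleton of the four-step method. The helpers are
# illustrative stubs that pass structured placeholders along, so only
# the ordering of the steps is demonstrated.

def locate_roi(bitstream, geo):          # step 1: determine the ROI from the geographic location
    return {"tiles": [132, 133], "geo": geo}

def rewrite_bitstream(bitstream, roi):   # step 2: rebuild the bitstream from the interest-tile data
    return {"payload": bitstream, "roi": roi}

def hevc_decode(rebuilt):                # step 3: decode the reconstructed bitstream
    return {"pixels": rebuilt["payload"], "roi": rebuilt["roi"]}

def fill_regions(decoded, roi):          # step 4: convert the decoded image back to the image of interest
    return {"image": decoded["pixels"], "tiles": roi["tiles"]}

def decode_synchronized(input_bitstream, geo_location):
    roi = locate_roi(input_bitstream, geo_location)
    rebuilt = rewrite_bitstream(input_bitstream, roi)
    decoded = hevc_decode(rebuilt)
    return fill_regions(decoded, roi)
```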

Brief Description of the Drawings

FIG. 1 is a schematic diagram of loading a field of view from a world map according to a geographic location.

FIG. 2 is a schematic diagram of loading another field of view from the world map according to another geographic location.

FIG. 3 is a functional block diagram of an electronic device according to an embodiment of the present invention.

FIG. 4 is a schematic diagram of an example encoding format of an HEVC bitstream.

FIGS. 5A to 5F are schematic diagrams of tile grouping and padding according to various embodiments of the present invention.

FIG. 6 is a flowchart of a process of synchronizing decoded tiles with display positions according to an embodiment of the present invention.

FIG. 7 is a flowchart of a bitstream encoding process according to an embodiment of the present invention.

Description of Reference Numerals

E_MAP: world map
L1, L2: geographic locations
FOV1, FOV2: fields of view
ROI1, ROI2: regions of interest
3: electronic device
30: video decoding device
32: graphics processor (GPU)
34: display frame buffer
300: bitstream receiving unit
301: region-of-interest locating unit
302: bitstream rewriting unit
303: video decoder
304: partial-region filling unit
305: decoded frame buffer
6, 7: processes
601, 602, 603, 604, 701, 702, 703, 704, 705: steps

Detailed Description

FIG. 3 is a functional block diagram of an electronic device 3 according to an embodiment of the present invention. The electronic device 3 includes a video decoding device 30, a graphics processing unit (GPU) 32, and a display frame buffer 34. The electronic device 3 may be a virtual reality device, a desktop computer, a notebook computer, a smartphone, or a similar device.

The video decoding device 30 includes a bitstream receiving unit 300, a region-of-interest (ROI) locating unit 301, a bitstream rewriting unit 302, a video decoder 303, a partial-region filling unit 304, and a decoded frame buffer 305. The video decoding device 30 is coupled to the graphics processor 32 and decodes an input bitstream (for example, a video bitstream for panoramic virtual reality) to produce a decoded image for the graphics processor 32; the graphics processor 32 then writes the decoded image into the display frame buffer 34 for display on a screen. In one embodiment, the original image carried by the input bitstream has a resolution of 8K*4K (8192*4320) pixels, while the reconstructed decoded image has a resolution of 4K*2K (4096*2560) pixels.

According to a geographic location input by the user, the video decoding device 30 extracts, from an image frame of the input bitstream, the decoding data belonging to the region of interest; given the size of the decoded frame buffer 305, it reconstructs the input bitstream from that data to generate a reconstructed bitstream; it decodes the reconstructed bitstream to generate a reconstructed decoded image, which is temporarily stored in the decoded frame buffer 305; and finally it reassembles the reconstructed decoded image (for example, the tiles belonging to the region of interest) to restore and produce the decoded image. In this way, the video decoding device 30 can decode the tiles belonging to the region of interest without enlarging the decoded frame buffer 305, which ensures that the electronic device 3 can display the decoded image correctly and avoids increasing the hardware cost of the video decoding device 30. Furthermore, because the reconstructed bitstream is generated according to the geographic location input by the user and the decoding data of the corresponding region of interest, the decoded image displayed by the electronic device 3 is guaranteed to map correctly to the geographic location input by the user.

In one embodiment, the input bitstream is generated according to the coding specification defined by High Efficiency Video Coding (HEVC), as shown in FIG. 4, which illustrates an example encoding format of an HEVC bitstream. An HEVC bitstream consists of multiple Network Abstraction Layer (NAL) unit streams, which contain multiple coded video sequences. Each coded video sequence represents one video and contains an Instantaneous Decoding Refresh (IDR) access unit and a plurality of further access units, where each access unit can be decoded into one image frame of the coded video sequence. An access unit contains multiple syntax elements, such as a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), supplemental enhancement information (SEI), and a plurality of slices, where each slice contains a slice header and slice data, and each slice contains a plurality of tiles. The VPS, SPS, PPS, and SEI are classified as the non-video coding layer (non-VCL); the VPS, SPS, and PPS describe the parameters required to decode an entire image frame, while the SEI describes user-defined metadata, such as the positions of motion-constrained tile sets. The slices are classified as the video coding layer (VCL) and describe the parameters required to decode the tiles they contain.
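The NAL-unit layering described above can be illustrated with a small scanner. This is a sketch: it walks an Annex-B byte stream for three-byte start codes and reads each unit's type from bits 1-6 of the first header byte, as the HEVC specification defines; the sample stream below is hand-made for illustration, not a real encoded sequence.

```python
# Sketch: scan an Annex-B HEVC byte stream for NAL units and report
# each unit's nal_unit_type (bits 1-6 of the first byte after the
# start code). Type values per the HEVC spec: VPS=32, SPS=33, PPS=34,
# prefix SEI=39.

def nal_types(stream: bytes):
    types = []
    i = 0
    while True:
        j = stream.find(b"\x00\x00\x01", i)      # next start code
        if j < 0:
            break
        header = stream[j + 3]                   # first byte of the 2-byte NAL header
        types.append((header >> 1) & 0x3F)       # nal_unit_type
        i = j + 3
    return types

# Hand-made sample: VPS, SPS, PPS, prefix SEI headers (payloads omitted)
sample = (b"\x00\x00\x01" + bytes([32 << 1, 1]) +
          b"\x00\x00\x01" + bytes([33 << 1, 1]) +
          b"\x00\x00\x01" + bytes([34 << 1, 1]) +
          b"\x00\x00\x01" + bytes([39 << 1, 1]))
```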

Referring again to FIG. 3, the bitstream receiving unit 300 is coupled to the region-of-interest locating unit 301 and parses the input bitstream to extract the parameters required to decode an image frame. In one embodiment, the bitstream receiving unit 300 parses the non-video-coding-layer parameter sets of an access unit (for example, the VPS, SPS, and PPS) to extract the parameters required to decode the image frame contained in the access unit.

The region-of-interest locating unit 301 is coupled to the bitstream receiving unit 300 and the bitstream rewriting unit 302, and decodes the input bitstream according to the geographic location to extract the parameters required to decode the region of interest of the image frame (for example, the interest-tile data). In one embodiment, the region-of-interest locating unit 301 decodes the slice headers of the access unit; obtains the complete tile layout of the image frame from the picture parameter set (PPS); and, according to the geographic location (for example, latitude/longitude coordinates) and the tile layout, extracts the interest-tile data belonging to the region of interest from the slices.
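The tile-selection step performed by the ROI locating unit can be sketched as a rectangle-intersection test. Everything here is an assumption for illustration: a uniform tile grid (real PPS tile layouts may have non-uniform columns and rows), pixel-space ROI coordinates, and hypothetical names.

```python
# Hypothetical sketch of the ROI locating step: given a uniform tile
# grid (an assumed simplification of the PPS tile layout) and an ROI
# rectangle in pixels, select the indices of tiles whose area
# intersects the ROI. Image and grid sizes are illustrative.

def interest_tiles(img_w, img_h, cols, rows, roi):
    """roi = (x0, y0, x1, y1) in pixels; returns row-major tile indices."""
    tw, th = img_w // cols, img_h // rows          # uniform tile size
    x0, y0, x1, y1 = roi
    picked = []
    for r in range(rows):
        for c in range(cols):
            tx0, ty0 = c * tw, r * th              # tile's top-left pixel
            if tx0 < x1 and tx0 + tw > x0 and ty0 < y1 and ty0 + th > y0:
                picked.append(r * cols + c)
    return picked
```

For example, on an 8192x4320 image split into a 30x17 grid, an ROI covering the top-left 600x300 pixels intersects the first three tiles of the first two rows.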

In one embodiment, the region-of-interest locating unit 301 can obtain the geographic location by reading the user-defined data described in the supplemental enhancement information, through a system application package (Android application package, APK), or through a hardware positioning module. For example, the user may input a latitude and longitude through the user interface of the electronic device 3, from which the region-of-interest locating unit 301 obtains the geographic location; alternatively, the system application package or the hardware positioning module of the electronic device 3 may automatically detect the user's operation to determine the latitude and longitude and thereby obtain the geographic location.

The bitstream rewriting unit 302 is coupled to the region-of-interest locating unit 301 and the video decoder 303, and converts the input bitstream into a reconstructed bitstream. In one embodiment, given the decoding length and width, the bitstream rewriting unit 302 rewrites the non-video-coding-layer parameter sets of the access unit (for example, the VPS, SPS, PPS, and SEI) according to the interest-tile data of the region of interest, then rewrites the slice addresses in the slice headers, and finally stitches the interest-tile data into the slice data of the access unit at byte level, to produce the reconstructed bitstream.

In one embodiment, the bitstream rewriting unit 302 re-encodes the video parameter set (VPS) to describe the video coding profile and level; re-encodes the sequence parameter set (SPS) to describe the image size and range; and re-encodes the picture parameter set (PPS) to describe the sizes of individual tiles and tile sets.

In one embodiment, the bitstream rewriting unit 302 re-encodes the video usability information (VUI) to describe information such as the aspect ratio, overscan, and color primaries. In addition, the bitstream rewriting unit 302 encodes the slice headers to describe the entry points of the tiles of interest.

In one embodiment, the bitstream rewriting unit 302 re-encodes the supplemental enhancement information (SEI) according to the interest-tile data and the decoding length and width (for example, the length and width of the decoded frame buffer 305), to describe the sizes and positions of the tiles of interest. In another embodiment, the bitstream rewriting unit 302 encodes any dummy bits or proprietary bits in the bitstream payload according to the interest-tile data, to describe the sizes and positions of the tiles of interest.

The video decoder 303 is coupled to the bitstream rewriting unit 302 and the decoded frame buffer 305, and decodes the reconstructed bitstream to output the interest-tile data of the region of interest to the decoded frame buffer 305. In one embodiment, the bitstream rewriting unit 302 re-encodes the input bitstream according to the HEVC specification to produce the reconstructed bitstream, so the video decoder 303 may be an HEVC decoder.

The decoded frame buffer 305 is coupled to the video decoder 303 and the partial-region filling unit 304, and buffers the tiles of interest to produce the reconstructed decoded image. In one embodiment, during decoding the video decoder 303 allocates the buffer addresses and hardware space for the tiles of interest, so as to buffer them in the decoded frame buffer 305.

The partial-region filling unit 304 is coupled to the decoded frame buffer 305 and the graphics processor 32, and converts the reconstructed decoded image into the decoded image, outputting the decoded image to the graphics processor 32 to be buffered in the display frame buffer 34. In one embodiment, the partial-region filling unit 304 may buffer the decoded image directly in the display frame buffer 34 without going through the graphics processor 32.

In one embodiment, given the length and width of the decoded frame buffer 305, the bitstream rewriting unit 302 groups the tiles of interest belonging to the region of interest into a plurality of local images, where each local image corresponds to one motion-constrained tile set (or interest-tile subset). The bitstream rewriting unit 302 can therefore re-encode the input bitstream according to the interest-tile-subset data of these subsets, to produce the reconstructed bitstream. After the video decoder 303 decodes the reconstructed bitstream and buffers the reconstructed decoded image in the decoded frame buffer 305, the partial-region filling unit 304 fills the local images into their display positions on the world map according to the interest-tile-subset data, thereby restoring the region-of-interest image (that is, converting the reconstructed decoded image into the decoded image).

In one embodiment, the interest tile subset data indicates the height and width of the original image of the input bitstream, the number of interest tile subsets, the address at which the reconstructed decoded image is buffered in the decoded frame buffer 305, the address at which the decoded image is buffered in the display frame buffer 34, and the height and width of the region of interest.
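These fields can be collected into a small record; a minimal sketch follows, with field names that are illustrative assumptions rather than the actual syntax element names:

```python
from dataclasses import dataclass

@dataclass
class InterestTileSubsetData:
    # Illustrative field names (assumptions); the text only lists the contents.
    org_width: int         # width of the original image in the input bitstream
    org_height: int        # height of the original image
    num_subsets: int       # number of interest tile subsets (partial images)
    decoded_buf_addr: int  # address of the reconstructed decoded image in buffer 305
    display_buf_addr: int  # address of the decoded image in display frame buffer 34
    roi_width: int         # width of the region of interest
    roi_height: int        # height of the region of interest

subset = InterestTileSubsetData(3840, 1920, 3, 0, 0, 15, 5)
```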

Table 1 is an example of how the region-of-interest locating unit 301 obtains the interest tile data. In the HEVC specification, "user_data_unregistered" is the supplemental enhancement information (SEI) syntax used here to carry the interest tile data (or interest tile subset data);

"uuid_iso_iec_11578" is a universally unique identifier (UUID) indicating that the SEI message describes a tile layout;

"ROI_window" is a function used to extract the interest tile data;

"user_data_payload_byte" is a function used to carry the interest tile data.

(Table 1: the "user_data_unregistered" SEI syntax carrying the interest tile data; published as images in the original document and not reproduced here.)

In the embodiment of Table 1, the region-of-interest locating unit 301 may decode the input bitstream according to the geographic location to produce interest tile data such as "org_width", "org_height", "number_of_position", "src_x", "src_y", "dst_x", "dst_y", "pos_w" and "pos_h", so that the bitstream reconstruction unit 302 can re-encode the input bitstream according to these data. During decoding, when the video decoder 303 reads the "user_data_unregistered" SEI syntax, it can recognize from the "uuid_iso_iec_11578" user-defined tag that the SEI message describes a tile layout, and finally pass the interest tile data to the local area padding unit 304 through the "user_data_payload_byte" function. The local area padding unit 304 can then convert the reconstructed decoded image into the decoded image according to the interest tile data.
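As a sketch of how a decoder-side parser might recover these fields from the SEI payload (the fixed 32-bit big-endian packing below is an assumption for illustration; the real payload encoding is defined by the syntax in Table 1):

```python
import struct

# Field order follows the example in Table 1; the 32-bit big-endian packing
# is an illustrative assumption, not the actual HEVC payload encoding.
FIELDS = ("org_width", "org_height", "number_of_position",
          "src_x", "src_y", "dst_x", "dst_y", "pos_w", "pos_h")

def parse_roi_window(payload: bytes) -> dict:
    # Unpack one 32-bit unsigned integer per field and label the results.
    values = struct.unpack(">%dI" % len(FIELDS), payload[:4 * len(FIELDS)])
    return dict(zip(FIELDS, values))

payload = struct.pack(">9I", 3840, 1920, 3, 0, 0, 0, 0, 10, 5)
roi = parse_roi_window(payload)
```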

In one embodiment, experimental analysis shows that, given the decoded frame buffer and the position and size of each tile, all possible schemes for grouping and padding the tiles into the decoded frame buffer can be enumerated. These schemes can be compiled into a codebook, so that the bitstream reconstruction unit 302 performs tile grouping and padding according to the codebook to reconstruct the bitstream, and the local area padding unit 304 afterwards performs the inverse of the grouping and padding according to the same codebook to restore the decoded image.
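A codebook entry can be modeled as a mapping from decoded-frame-buffer tile positions to display positions, with the padding unit applying the inverse mapping; a toy sketch (the concrete keys and numbers are assumptions):

```python
# One codebook entry: decoded-frame-buffer tile index -> display tile index.
# The key identifying the entry is an illustrative assumption.
codebook = {
    ("roi_layout", 0): {0: 6, 1: 7, 2: 8, 3: 0, 4: 1},
}

def invert(mapping: dict) -> dict:
    # The local area padding unit performs the reverse of grouping/padding.
    return {dst: src for src, dst in mapping.items()}

forward = codebook[("roi_layout", 0)]
backward = invert(forward)
```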

FIG. 5A to FIG. 5F are schematic diagrams of tile grouping and padding schemes according to various embodiments of the present invention. Assume that the decoded frame buffer 305 is 10 units of resolution long (x direction) and 8 units of resolution wide (y direction), and that the display frame buffer 34 is 15 units of resolution long and 9 units of resolution wide, containing display buffer numbers 34(0) to 34(134).

In FIG. 5A to FIG. 5C, the region of interest is grouped into three partial images of interest, where the first partial image of interest is drawn with a right-slanted hatch pattern, the second with a dot pattern, and the third with a left-slanted hatch pattern.

FIG. 5A to FIG. 5C illustrate the case in which the user moves the horizontal viewing angle (i.e., the latitude of the geographic location), so that the display position of the region of interest changes while its size does not. Accordingly, the interest tile data of FIG. 5A indicates that the starting points of the first, second and third partial images of interest are display buffer numbers 34(0), 34(10) and 34(55), respectively; the interest tile data of FIG. 5B indicates starting points 34(30), 34(40) and 34(85), respectively; and the interest tile data of FIG. 5C indicates starting points 34(60), 34(70) and 34(115), respectively.

In FIG. 5D and FIG. 5E, the region of interest is grouped into two partial images of interest, where the first partial image of interest is drawn with a right-slanted hatch pattern and the second with a dot pattern. FIG. 5D and FIG. 5E again illustrate the user moving the horizontal viewing angle (i.e., the latitude of the geographic location), changing the display position of the region of interest but not its size. Accordingly, the interest tile data of FIG. 5D indicates that the starting points of the first and second partial images of interest are display buffer numbers 34(0) and 34(10), respectively, while that of FIG. 5E indicates starting points 34(75) and 34(85). In this way, when the user moves the horizontal viewing angle, the decoded image displayed by the electronic device 3 is guaranteed to map correctly to the geographic location input by the user.

FIG. 5F illustrates the case in which the user moves the vertical viewing angle (i.e., the longitude of the geographic location), so that the region of interest presents a discontinuous image; the first partial image of interest is drawn with a right-slanted hatch pattern and the second with a dot pattern. Accordingly, the interest tile data of FIG. 5F indicates that the starting points of the first and second partial images of interest are display buffer numbers 34(6) and 34(0), respectively; the local area padding unit 304 buffers the tile data of decoded frame buffer numbers 305(0) to 305(8) and 305(9), which belong to the first partial image of interest, into display buffer numbers 34(6) to 34(14) and 34(0), respectively, and so on. In this way, when the user moves the vertical viewing angle, the decoded image displayed by the electronic device 3 is guaranteed to map correctly to the geographic location input by the user.
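The FIG. 5F wraparound can be expressed as a modulo over the display row length; a minimal sketch assuming the 15-unit-wide display frame buffer of these examples:

```python
def wrapped_slot(start: int, offset: int, stride: int = 15) -> int:
    # Display slot for the offset-th tile of a row that starts at `start`
    # and wraps around the right edge of the world map.
    return (start + offset) % stride

# First partial image of FIG. 5F: decoded tiles 305(0)..305(9) land on
# display slots 34(6)..34(14) followed by 34(0).
slots = [wrapped_slot(6, k) for k in range(10)]
```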

Taking FIG. 5A as an example, upon receiving the input bitstream, the region-of-interest locating unit 301 determines from the geographic location that the region of interest contains the tiles corresponding to display buffer numbers 34(0) to 34(74), and generates region-of-interest information accordingly; the bitstream reconstruction unit 302 reconstructs the input bitstream according to the region-of-interest information to generate the reconstructed bitstream; the video decoder 303 decodes the reconstructed bitstream to buffer the image of interest of the region of interest in the decoded frame buffer 305 (with the tile layout of the first, second and third partial images of interest shown in FIG. 5A); finally, the local area padding unit 304 buffers the first partial image of interest into display buffer numbers 34(0) to 34(9), 34(15) to 34(24), 34(30) to 34(39), 34(45) to 34(54) and 34(60) to 34(69), the second partial image of interest into display buffer numbers 34(10) to 34(14), 34(25) to 34(29) and 34(40) to 34(44), and the third partial image of interest into display buffer numbers 34(55) to 34(59) and 34(70) to 34(74). FIG. 5B to FIG. 5F follow by analogy.
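The buffer numbers listed for FIG. 5A follow from laying each partial image row by row into the 15-unit-wide display frame buffer; a minimal sketch:

```python
def display_indices(start: int, width: int, height: int, stride: int = 15):
    # Display-buffer numbers covered by a partial image whose top-left tile
    # sits at display buffer `start`, in rows of `stride` units.
    return [start + row * stride + col
            for row in range(height) for col in range(width)]

first = display_indices(0, 10, 5)   # 34(0)..34(9), 34(15)..34(24), ...
second = display_indices(10, 5, 3)  # 34(10)..34(14), 34(25)..34(29), 34(40)..34(44)
third = display_indices(55, 5, 2)   # 34(55)..34(59), 34(70)..34(74)
```

Together the three partial images exactly tile display buffers 34(0) to 34(74), matching the region of interest determined above.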

The operation of the video decoding device 30 can be summarized as a decoded-tile and display-position synchronization process 6, as shown in FIG. 6, which includes the following steps.

Step 601: Determine a region of interest in an original image of an input bitstream according to a geographic location, so as to capture the interest tile data of the region of interest, where the region of interest corresponds to an image of interest.

Step 602: Reconstruct the input bitstream according to the interest tile data to generate a reconstructed bitstream.

Step 603: Decode the reconstructed bitstream to generate a reconstructed decoded image.

Step 604: Convert the reconstructed decoded image into the image of interest according to the interest tile data.

In the decoded-tile and display-position synchronization process 6, step 601 may be performed by the bitstream receiving unit 300 and the region-of-interest locating unit 301; step 602 by the bitstream reconstruction unit 302; step 603 by the video decoder 303; and step 604 by the local area padding unit 304 and the decoded frame buffer 305. For the detailed operation of process 6, refer to the description above.
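The four steps can be chained as a small pipeline; the toy helpers below stand in for the hardware units and are assumptions for illustration only:

```python
def locate_roi(frame, roi_positions):          # step 601: units 300 and 301
    return {pos: frame[pos] for pos in roi_positions}

def rebuild_bitstream(tile_data):              # step 602: unit 302
    return sorted(tile_data.items())           # compacted "reconstructed bitstream"

def decode(reconstructed_bs):                  # step 603: unit 303
    return list(reconstructed_bs)              # toy decoder: pass-through

def pad_to_display(decoded_tiles):             # step 604: units 304 and 305
    return {pos: tile for pos, tile in decoded_tiles}

frame = {i: "tile%d" % i for i in range(135)}  # original image as 135 tiles
display = pad_to_display(decode(rebuild_bitstream(locate_roi(frame, [0, 1, 15, 16]))))
```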

In step 602, the bitstream reconstruction unit 302 may perform a bitstream encoding process 7 to generate the reconstructed bitstream, as shown in FIG. 7, where the bitstream encoding process 7 includes the following steps.

Step 701: Receive a current tile among the multiple tiles of the input bitstream.

Step 702: Reconstruct a tile layout of the input bitstream according to the interest tile data of the region of interest.

Step 703: Determine whether the current tile belongs to the region of interest. If so, proceed to step 704; if not, return to step 701.

Step 704: Adjust the tile address of the current tile according to the tile layout.

Step 705: Generate the reconstructed bitstream.

In the encoding process 7, at step 702 the bitstream reconstruction unit 302 encodes the non-video-coding-layer parameter subsets of the access unit (for example, the VPS, SPS, PPS and SEI parameter subsets) according to the interest tile data of the region of interest, so as to reconstruct the tile layout of the input bitstream. At steps 703 and 704, the bitstream reconstruction unit 302 encodes the slice headers to adjust the tile address of the current tile. At step 705, the bitstream reconstruction unit 302 encodes the interest tile data into the slice data of the access unit to generate the reconstructed bitstream.
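Steps 701 to 705 amount to a filter-and-readdress loop over the input tiles; a minimal sketch (the data shapes are assumptions):

```python
def reencode(tiles, roi, layout):
    # tiles: iterable of (tile_address, payload); roi: addresses inside the
    # region of interest; layout: old address -> new address taken from the
    # tile layout rebuilt at step 702.
    reconstructed = []
    for addr, payload in tiles:          # step 701: receive current tile
        if addr not in roi:              # step 703: skip tiles outside the ROI
            continue
        reconstructed.append((layout[addr], payload))  # step 704: readdress
    return reconstructed                 # step 705: reconstructed bitstream

tiles = [(a, "t%d" % a) for a in range(6)]
roi = {1, 4}
layout = {a: i for i, a in enumerate(sorted(roi))}  # {1: 0, 4: 1}
out = reencode(tiles, roi, layout)
```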

In summary, the video decoding device of the present invention can capture, according to the geographic location input by the user, the image decoding data belonging to the region of interest in the image frames of the input bitstream; given the size of the decoded frame buffer, reconstruct the input bitstream according to that data to generate a reconstructed bitstream; decode the reconstructed bitstream to generate a reconstructed decoded image and buffer it in the decoded frame buffer; and finally reassemble the reconstructed decoded image (for example, the tiles belonging to the region of interest) to restore and produce the decoded image. In this way, the video decoding device of the present invention can decode only the tiles belonging to the region of interest without enlarging the decoded frame buffer, which ensures that the electronic device can display the decoded image properly while avoiding additional hardware cost in the video decoding device. Moreover, since the reconstructed bitstream is generated from the geographic location input by the user and the image decoding data of the corresponding region of interest, the decoded image displayed by the electronic device is guaranteed to map correctly to the geographic location input by the user.

The above description covers only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the patent application of the present invention shall fall within the scope of the present invention.

Claims (20)

1. A method for synchronizing decoded picture blocks and display positions for a video decoding device comprises:
judging an interest region of an original image of an input bit stream according to a geographic position to capture interest block data of the interest region, wherein the interest region corresponds to an interest image;
reconstructing the input bitstream according to the interest block data to generate a reconstructed bitstream;
decoding the reconstructed bit stream to generate a reconstructed decoded image; and
converting the reconstructed decoded image into the interest image according to the interest block data.
2. The method of claim 1, wherein the interest block data indicates a height and a width of the original image, a number of interest block subsets of the interest region, a decoded frame buffer address of the reconstructed decoded picture, a display frame buffer address of the decoded picture, and a height and a width of the interest region.
3. The method of claim 1, wherein reconstructing the input bitstream to generate the reconstructed bitstream according to the interest block data comprises:
according to the interest block data, a first user-defined syntax element of the input bitstream is re-encoded to generate the reconstructed bitstream.
4. The method of claim 3, wherein decoding the reconstructed bitstream to generate the reconstructed decoded image comprises:
decoding a second user-defined syntax element of the reconstructed bitstream; and
allocating a decoding frame buffer address to the reconstructed decoded picture according to the second user-defined syntax element to generate the reconstructed decoded picture.
5. The method of claim 4, wherein the step of converting the reconstructed decoded image into the image of interest according to the interest block data comprises:
adjusting the decoding frame buffer address of the reconstructed decoded image to a display frame buffer address according to the second user-defined syntax element, so as to convert the reconstructed decoded image into the interest image.
6. The method of claim 3, wherein the first and second user-defined syntax elements are a supplemental enhancement information syntax element, a dummy bit or a dedicated bit of the input bitstream and the reconstructed bitstream, respectively.
7. The method as claimed in claim 1, wherein the geographic location corresponds to a longitude, a latitude and a viewing angle.
8. The method of claim 1, further comprising:
reading a user-defined data of the input bitstream to obtain the geographic location.
9. The method of claim 1, further comprising:
executing a system application component to read the geographic position.
10. The method of claim 1, further comprising:
reading the geographic position through a hardware positioning module of the video decoding device.
11. A video decoding apparatus for an electronic device, comprising:
a bit stream receiving unit for receiving an input bit stream;
a decoding frame buffer for temporarily storing a reconstructed decoded picture of the input bitstream; and
a processor, coupled to the bitstream receiving unit and the decoded frame buffer, for performing a decoding tile and display position synchronization process, wherein the decoding tile and display position synchronization process comprises:
judging an interest region of an original image of the input bit stream according to a geographic position to capture interest block data of the interest region, wherein the interest region corresponds to an interest image;
reconstructing the input bitstream according to the interest block data to generate a reconstructed bitstream;
decoding the reconstructed bitstream to produce the reconstructed decoded image; and
converting the reconstructed decoded image into the interest image according to the interest block data.
12. The video decoding apparatus of claim 11, wherein the interest block data indicates a height and a width of the original picture, a number of interest block subsets of the interest region, a decoded frame buffer address of the reconstructed decoded picture, a display frame buffer address of the decoded picture, and a height and a width of the interest region.
13. The video decoding apparatus of claim 11, wherein the step of reconstructing the input bitstream according to the interest block data to generate the reconstructed bitstream comprises:
according to the interest block data, a first user-defined syntax element of the input bitstream is re-encoded to generate the reconstructed bitstream.
14. The video decoding apparatus of claim 13, wherein the step of decoding the reconstructed bitstream to generate the reconstructed decoded picture comprises:
decoding a second user-defined syntax element of the reconstructed bitstream; and
allocating a decoding frame buffer address to the reconstructed decoded picture according to the second user-defined syntax element to generate the reconstructed decoded picture.
15. The video decoding apparatus of claim 14, wherein the step of converting the reconstructed decoded picture into the picture of interest according to the interest block data comprises:
adjusting the decoding frame buffer address of the reconstructed decoded picture to a display frame buffer address according to the second user-defined syntax element, so as to convert the reconstructed decoded picture into the picture of interest.
16. The video decoding apparatus of claim 13, wherein the first user-defined syntax element and the second user-defined syntax element are a supplemental enhancement information syntax element, a dummy bit or a dedicated bit of the input bitstream and the reconstructed bitstream, respectively.
17. The video decoding apparatus of claim 11, wherein the geographic location corresponds to a longitude, a latitude, and a view angle.
18. The video decoding apparatus of claim 11, wherein the decoding tile and display position synchronization process further comprises:
reading a user-defined data of the input bitstream to obtain the geographic location.
19. The video decoding apparatus of claim 11, wherein the decoding tile and display position synchronization process further comprises:
reading the geographic location through a system application component of the electronic device.
20. The video decoding apparatus of claim 11, wherein the decoding tile and display position synchronization process further comprises:
reading the geographic position through a hardware positioning module of the electronic device.
CN201810959428.7A 2018-08-22 2018-08-22 Synchronization method of decoded tile and display position based on input geographic location and related video decoding device Pending CN110858902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810959428.7A CN110858902A (en) 2018-08-22 2018-08-22 Synchronization method of decoded tile and display position based on input geographic location and related video decoding device


Publications (1)

Publication Number Publication Date
CN110858902A true CN110858902A (en) 2020-03-03

Family

ID=69634806



Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101002471A (en) * 2004-08-13 2007-07-18 庆熙大学校产学协力团 Method and apparatus to encode image, and method and apparatus to decode image data
CN102999939A (en) * 2012-09-21 2013-03-27 魏益群 Coordinate acquisition device, real-time three-dimensional reconstruction system, real-time three-dimensional reconstruction method and three-dimensional interactive equipment
CN104065964A (en) * 2014-06-19 2014-09-24 上海交通大学 Codec method and video codec device for region of interest information
CN105205862A (en) * 2015-10-26 2015-12-30 武汉沃亿生物有限公司 Three-dimensional image reconstruction method and system
US20170019655A1 (en) * 2015-07-13 2017-01-19 Texas Insturments Incorporated Three-dimensional dense structure from motion with stereo vision



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200303