CN110858902A - Synchronization method of decoded tile and display position based on input geographic location and related video decoding device - Google Patents
- Publication number
- CN110858902A
- Authority
- CN
- China
- Prior art keywords
- interest
- reconstructed
- image
- bitstream
- decoding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A method for synchronizing decoded tiles with a display position, for use in a video bitstream decoding device. The method comprises: determining, according to a geographic location, a region of interest (ROI) of an original image of an input bitstream, and extracting tile-of-interest data for the region of interest, wherein the region of interest corresponds to an image of interest; reconstructing the input bitstream according to the tile-of-interest data to generate a reconstructed bitstream; decoding the reconstructed bitstream to generate a reconstructed decoded image; and converting the reconstructed decoded image into the image of interest according to the tile-of-interest data.
Description
Technical Field
The present invention relates to a method for synchronizing decoded tiles with a display position and a related video decoding device, and more particularly to such a method and device in which the synchronization is based on an input geographic location.
Background
In 360-degree virtual reality (VR360) applications, when the world map mode is used, a video decoding device (for example, a virtual reality device) determines what to display on its screen according to the geographic location selected by the user (for example, longitude, latitude, and the corresponding viewing angle). Typically, a virtual reality device downloads an image file (for example, a video bitstream) from a world map database over a network, temporarily stores the file in built-in memory (for example, a frame buffer), and then decodes it to obtain the image to be displayed on the screen.
FIG. 1 is a schematic diagram of loading a field of view FOV1 from a world map E_MAP according to a geographic location L1. FIG. 2 is a schematic diagram of loading a field of view FOV2 from the world map E_MAP according to a geographic location L2. As shown in FIGS. 1 and 2, the world map E_MAP can be divided along multiple longitudes and latitudes into tiles numbered 0~509 (for example, 30 longitude divisions * 17 latitude divisions = 510 tiles). Because the world map E_MAP is a planar map converted from the surface of a sphere, the size or shape of the partitioned image to be displayed varies with longitude and latitude.
For example, in FIG. 1, when the geographic location L1 lies at 0 degrees latitude (the equator), the field of view FOV1 contains tile numbers 98~111, 128~141, 158~171, 188~201, 218~231, 248~261, 278~291, 308~321, 338~351, 368~381, and 398~411, for a total of 154 tiles. The content displayed on the screen is a region of interest ROI1, which contains tile numbers 132~137, 160~169, 189~200, 219~230, 249~260, 279~290, 309~320, 340~349, and 372~307, for a total of 80 tiles.
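The tile ranges listed for FOV1 step by 30 from one row to the next, which suggests row-major numbering on a 30-column grid. The following is a minimal sketch under that assumption; the function names and the row and column bounds are hypothetical, inferred from the listed numbers rather than taken from the figures.

```python
COLS, ROWS = 30, 17  # 30 longitude divisions * 17 latitude divisions = 510 tiles


def tile_number(row: int, col: int) -> int:
    """Row-major tile number (assumed numbering scheme)."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("tile outside the world map grid")
    return row * COLS + col


def tiles_in_rect(row0: int, row1: int, col0: int, col1: int) -> list:
    """All tile numbers inside an inclusive rectangle of rows and columns."""
    return [tile_number(r, c)
            for r in range(row0, row1 + 1)
            for c in range(col0, col1 + 1)]


# FOV1 as listed: rows 3..13 and columns 8..21 (tiles 98~111 up to 398~411).
fov1 = tiles_in_rect(3, 13, 8, 21)
print(len(fov1), fov1[0], fov1[-1])  # 154 98 411
```

Under this numbering, a rectangular field of view is fully described by its corner rows and columns, which is convenient when checking the tile counts quoted in the text.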
In FIG. 2, when the geographic location L2 lies at 56 degrees latitude, the field of view FOV2 contains tile numbers 0~29, 30~59, 60~171, 90~201, 120~231, 150~261, 180~291, 210~321, 240~351, and 270~381, for a total of 300 tiles. The content displayed on the screen is a region of interest ROI2, which contains tile numbers 0~29, 36~53, 67~82, 97~112, 127~142, 157~172, 188~201, 219~230, 250~252, and 257~259, for a total of 144 tiles.
Comparing FIGS. 1 and 2 shows that the fields of view FOV1 and FOV2 differ in both length and width, so the memory size (for example, the frame buffer size) required to decode the partitioned image also differs.
Panoramic virtual reality technology currently uses the High Efficiency Video Coding (HEVC) standard for image encoding and decoding, but existing hardware designs support a maximum pixel width of 4096 and a maximum pixel height of 2560. Because the pixel width of the field of view FOV2, 7860, exceeds the maximum pixel width of 4096, existing HEVC hardware cannot decode the image of FOV2, and the virtual reality device cannot display it properly.
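The hardware limit described above amounts to a simple bounds check. The sketch below is illustrative only: the function name is hypothetical, and the height passed for FOV2 is a placeholder, since the text gives only its width.

```python
MAX_DECODE_WIDTH = 4096   # maximum pixel width stated in the text
MAX_DECODE_HEIGHT = 2560  # maximum pixel height stated in the text


def fits_decoder(width: int, height: int) -> bool:
    """Return True if a picture of this size is within the decoder limits."""
    return width <= MAX_DECODE_WIDTH and height <= MAX_DECODE_HEIGHT


# FOV2 is 7860 pixels wide; its height is not given, so 2560 is a placeholder.
print(fits_decoder(7860, 2560))
print(fits_decoder(4096, 2560))
```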
In addition, the HEVC standard defines the motion-constrained tile set (MCTS), which allows the tiles in an MCTS to be decoded independently without reference to other data. However, the applicant has observed that during MCTS decoding, if the user moves the viewing angle from geographic location L1 to geographic location L2, existing video decoding devices do not synchronously update the target tiles to be decoded (because the tiles in an MCTS can be decoded independently without reference to other data). As a result, the video decoding device maps the decoded image of region of interest ROI1 to the display position of region of interest ROI2, and the virtual reality device displays the wrong image.
In view of this, solving the problems of tile reconstruction and of synchronizing decoded tiles with the display position, so that the virtual reality device displays correctly, is an important issue in the art.
Summary of the Invention
The main purpose of the present invention is therefore to provide a method for synchronizing decoded tiles with a display position, and a related video decoding device, so that the virtual reality device displays correctly.
The present invention discloses a method for synchronizing decoded tiles with a display position, for use in a video decoding device. The method comprises: determining, according to a geographic location, a region of interest of an original image of an input bitstream, and extracting tile-of-interest data for the region of interest, wherein the region of interest corresponds to an image of interest; reconstructing the input bitstream according to the tile-of-interest data to generate a reconstructed bitstream; decoding the reconstructed bitstream to generate a reconstructed decoded image; and converting the reconstructed decoded image into the image of interest according to the tile-of-interest data.
The present invention further discloses a video decoding device for an electronic device, comprising: a bitstream receiving unit for receiving an input bitstream; a decoded frame buffer for temporarily storing a reconstructed decoded image of the input bitstream; and a processor, coupled to the bitstream receiving unit and the decoded frame buffer, for executing a decoded-tile and display-position synchronization process, wherein the process includes all steps of the above method.
The video decoding device of the present invention can, according to the geographic location input by the user, extract from a picture frame of the input bitstream the image decoding data that belongs to the region of interest; reconstruct the input bitstream from that data, for a given decoded frame buffer size, to generate a reconstructed bitstream; decode the reconstructed bitstream to generate a reconstructed decoded image and temporarily store it in the decoded frame buffer; and finally reassemble the reconstructed decoded image (for example, the tiles belonging to the region of interest) to restore and produce the decoded image. The video decoding device can thus decode the tiles belonging to the region of interest without enlarging the decoded frame buffer, which ensures that the electronic device displays the decoded image correctly and avoids additional hardware cost. Moreover, because the reconstructed bitstream is generated from the geographic location input by the user and the image decoding data of the corresponding region of interest, the decoded image displayed by the electronic device is correctly mapped to that geographic location.
Description of Drawings
FIG. 1 is a schematic diagram of loading a field of view from a world map according to a geographic location.
FIG. 2 is a schematic diagram of loading another field of view from the world map according to another geographic location.
FIG. 3 is a functional block diagram of an electronic device according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of an example encoding format of an HEVC bitstream.
FIGS. 5A to 5F are schematic diagrams of tile grouping and filling schemes according to several embodiments of the present invention.
FIG. 6 is a flowchart of a decoded-tile and display-position synchronization process according to an embodiment of the present invention.
FIG. 7 is a flowchart of a bitstream encoding process according to an embodiment of the present invention.
Symbol Description
E_MAP  world map
L1, L2  geographic location
FOV1, FOV2  field of view
ROI1, ROI2  region of interest
3  electronic device
30  video decoding device
32  graphics processing unit
34  display frame buffer
300  bitstream receiving unit
301  ROI locating unit
302  bitstream reconstruction unit
303  video decoder
304  partial region filling unit
305  decoded frame buffer
6, 7  process
601, 602, 603, 604, 701, 702, 703, 704, 705  step
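Detailed Description
FIG. 3 is a functional block diagram of an electronic device 3 according to an embodiment of the present invention. The electronic device 3 includes a video decoding device 30, a graphics processing unit (GPU) 32, and a display frame buffer 34. The electronic device 3 may be a virtual reality device, a desktop computer, a notebook computer, a smartphone, or a similar device.
The video decoding device 30 includes a bitstream receiving unit 300, a region of interest (ROI) locating unit 301, a bitstream reconstruction (bit-stream rewriting) unit 302, a video decoder 303, a partial region filling unit 304, and a decoded frame buffer 305. The video decoding device 30 is coupled to the graphics processing unit 32 and decodes an input bitstream (for example, a video bitstream for panoramic virtual reality) to supply a decoded image to the graphics processing unit 32; the graphics processing unit 32 can then write the decoded image into the display frame buffer 34 for output on a display. In one embodiment, the original image contained in the input bitstream has a resolution of 8K*4K (8192*4320) pixels, and the reconstructed decoded image has a resolution of 4K*2K (4096*2560) pixels.
The video decoding device 30 can, according to a geographic location input by the user, extract from a picture frame of the input bitstream the image decoding data that belongs to the region of interest; reconstruct the input bitstream from that data, for a given size of the decoded frame buffer 305, to generate a reconstructed bitstream; decode the reconstructed bitstream to generate a reconstructed decoded image and temporarily store it in the decoded frame buffer 305; and finally reassemble the reconstructed decoded image (for example, the tiles belonging to the region of interest) to restore and produce the decoded image. The video decoding device 30 can thus decode the tiles belonging to the region of interest without enlarging the decoded frame buffer 305, which ensures that the electronic device 3 displays the decoded image correctly and avoids additional hardware cost for the video decoding device 30. Moreover, because the reconstructed bitstream is generated from the geographic location input by the user and the image decoding data of the corresponding region of interest, the decoded image displayed by the electronic device 3 is correctly mapped to that geographic location.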
In one embodiment, the input bitstream is generated according to the coding specification defined by High Efficiency Video Coding (HEVC). FIG. 4 is a schematic diagram of an example encoding format of an HEVC bitstream. An HEVC bitstream consists of a Network Abstraction Layer (NAL) unit stream that contains multiple coded video sequences. Each coded video sequence is used to display a video and contains an Instantaneous Decoding Refresh (IDR) access unit followed by further access units, each of which can be decoded into one picture frame of the coded video sequence. An access unit contains multiple syntax elements, such as a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), supplemental enhancement information (SEI), and multiple slices, where each slice contains a slice header and slice data, and each slice contains multiple tiles. The VPS, SPS, PPS, and SEI are classified as non-video coding layer (non-VCL) units: the VPS, SPS, and PPS describe the parameters needed to decode the whole picture frame, and the SEI describes user-defined metadata, such as the position of a motion-constrained tile set. The slices are classified as video coding layer (VCL) units and describe the parameters needed to decode the tiles they contain.
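The VCL and non-VCL distinction described above can be read directly from the two-byte HEVC NAL unit header, whose first byte holds a forbidden zero bit followed by the six-bit nal_unit_type. The following sketch only classifies headers; the helper names are hypothetical, while the type values (VPS 32, SPS 33, PPS 34, prefix SEI 39, suffix SEI 40, VCL types below 32) follow the HEVC specification.

```python
VPS_NUT, SPS_NUT, PPS_NUT = 32, 33, 34
PREFIX_SEI_NUT, SUFFIX_SEI_NUT = 39, 40


def nal_unit_type(header: bytes) -> int:
    """Extract nal_unit_type from a two-byte HEVC NAL unit header."""
    if len(header) < 2:
        raise ValueError("HEVC NAL unit header is two bytes long")
    if header[0] & 0x80:
        raise ValueError("forbidden_zero_bit must be 0")
    return (header[0] >> 1) & 0x3F


def is_vcl(nut: int) -> bool:
    """NAL unit types 0..31 are VCL (slice data); 32..63 are non-VCL."""
    return nut < 32


for hdr in (b"\x40\x01", b"\x42\x01", b"\x44\x01", b"\x4e\x01", b"\x26\x01"):
    nut = nal_unit_type(hdr)
    print(nut, "VCL" if is_vcl(nut) else "non-VCL")
```

A bitstream rewriter would typically use such a check to decide which units carry parameter sets to be rewritten and which carry slice data to be filtered.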
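Refer again to FIG. 3. The bitstream receiving unit 300 is coupled to the ROI locating unit 301 and parses the input bitstream to extract the parameters needed to decode a picture frame. In one embodiment, the bitstream receiving unit 300 parses the non-VCL parameter sets of an access unit of the input bitstream (for example, the VPS, SPS, and PPS) to extract the parameters needed to decode the picture frame contained in the access unit.
The ROI locating unit 301 is coupled to the bitstream receiving unit 300 and the bitstream reconstruction unit 302, and decodes the input bitstream according to the geographic location to extract the parameters needed to decode the region of interest of the picture frame (for example, the tile-of-interest data). In one embodiment, the ROI locating unit 301 can decode the slice headers of the access unit; obtain the complete tile configuration of the picture frame from the picture parameter set (PPS); and, according to the geographic location (for example, coordinates containing longitude and latitude) and the tile configuration, extract from the slices the tile-of-interest data belonging to the region of interest.
In one embodiment, the ROI locating unit 301 can obtain the geographic location by reading the user-defined data described by the supplemental enhancement information, through a system application (Android application package, APK), or through a hardware positioning module. For example, the user can enter a longitude and latitude through the user interface of the electronic device 3 for the ROI locating unit 301 to use; alternatively, the system application and the hardware positioning module of the electronic device 3 can automatically detect user operations to determine the longitude and latitude.
The bitstream reconstruction unit 302 is coupled to the ROI locating unit 301 and the video decoder 303, and converts the input bitstream into a reconstructed bitstream. In one embodiment, for a given decoding length and width, the bitstream reconstruction unit 302 rewrites the non-VCL parameter sets of the access unit (for example, the VPS, SPS, PPS, and SEI) according to the tile-of-interest data of the region of interest, then rewrites the slice addresses in the slice headers, and finally stitches the tile-of-interest data into the slice data of the access unit at byte level to generate the reconstructed bitstream.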
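In one embodiment, the bitstream reconstruction unit 302 re-encodes the video parameter set (VPS) to describe the video coding profile and level; re-encodes the sequence parameter set (SPS) to describe the picture size and range; and re-encodes the picture parameter set (PPS) to describe the size of an individual tile and of a tile set.
In one embodiment, the bitstream reconstruction unit 302 re-encodes the video usability information (VUI) to describe information such as the aspect ratio, overscan, and color primaries. In addition, the bitstream reconstruction unit 302 encodes the slice headers to describe the entry points of the tiles of interest.
In one embodiment, the bitstream reconstruction unit 302 re-encodes the supplemental enhancement information (SEI) according to the tile-of-interest data and the decoding length and width (for example, the length and width of the decoded frame buffer 305) to describe the size and position of the tiles of interest. In one embodiment, the bitstream reconstruction unit 302 encodes any dummy bits or proprietary bits in the bitstream payload according to the tile-of-interest data to describe the size and position of the tiles of interest.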
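The video decoder 303 is coupled to the bitstream reconstruction unit 302 and the decoded frame buffer 305, and decodes the reconstructed bitstream to output the tile-of-interest data of the region of interest to the decoded frame buffer 305. In one embodiment, the bitstream reconstruction unit 302 re-encodes the input bitstream according to the HEVC specification to generate the reconstructed bitstream, so the video decoder 303 can be an HEVC decoder.
The decoded frame buffer 305 is coupled to the video decoder 303 and the partial region filling unit 304, and buffers the tiles of interest to form the reconstructed decoded image. In one embodiment, the video decoder 303 can allocate the buffer addresses and hardware space for the tiles of interest during decoding so that they are buffered in the decoded frame buffer 305.
The partial region filling unit 304 is coupled to the decoded frame buffer 305 and the graphics processing unit 32, and converts the reconstructed decoded image into the decoded image, which is output to the graphics processing unit 32 and buffered in the display frame buffer 34. In one embodiment, the partial region filling unit 304 can buffer the decoded image in the display frame buffer 34 directly, bypassing the graphics processing unit 32.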
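In one embodiment, for a given length and width of the decoded frame buffer 305, the bitstream reconstruction unit 302 can group the tiles of interest belonging to the region of interest into multiple partial images, where each partial image corresponds to one motion-constrained tile set (or tile-of-interest set). The bitstream reconstruction unit 302 can therefore re-encode the input bitstream according to the tile-of-interest set data of the tile-of-interest sets to generate the reconstructed bitstream. After the video decoder 303 decodes the reconstructed bitstream and buffers the reconstructed decoded image in the decoded frame buffer 305, the partial region filling unit 304 can fill the partial images into their display positions on the world map according to the tile-of-interest set data, restoring the image of the region of interest (that is, converting the reconstructed decoded image into the decoded image).
In one embodiment, the tile-of-interest set data indicates the height and width of the original image of the input bitstream, the number of tile-of-interest sets, the position at which the reconstructed decoded image is buffered in the decoded frame buffer 305, the position at which the decoded image is buffered in the display frame buffer 34, and the height and width of the region of interest.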
Table 1 is an example of how the ROI locating unit 301 obtains the tile-of-interest data. In the HEVC specification, "user_data_unregistered" is the supplemental enhancement information (SEI) syntax structure, used here to carry the tile-of-interest data (or tile-of-interest set data);
"uuid_iso_iec_11578" is a universally unique identifier (UUID), used here to indicate that the SEI describes the tile layout;
"ROI_window" is a function for extracting the tile-of-interest data;
"user_data_payload_byte" is used to carry the tile-of-interest data.
In the embodiment of Table 1, the ROI locating unit 301 can decode the input bitstream according to the geographic location to produce tile-of-interest data such as "org_width, org_height, number_of_position, src_x, src_y, dst_x, dst_y, pos_w, pos_h", and the bitstream reconstruction unit 302 re-encodes the input bitstream according to this data. During decoding, when the video decoder 303 reads the "user_data_unregistered" SEI syntax, it can recognize that the "uuid_iso_iec_11578" user-defined tag describes the tile layout, and then passes the tile-of-interest data to the partial region filling unit 304 through "user_data_payload_byte". The partial region filling unit 304 can then convert the reconstructed decoded image into the decoded image according to the tile-of-interest data.
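The fields named above can be pictured as a small record carried in the SEI payload bytes. The sketch below is only an illustration: the field names come from the text, but the byte layout (big-endian 16-bit values after a 16-byte UUID), the all-zero placeholder UUID, and the class and function names are assumptions, since Table 1 is not reproduced here.

```python
from __future__ import annotations

import struct
from dataclasses import dataclass

PLACEHOLDER_UUID = bytes(16)  # hypothetical; a real stream carries its own UUID


@dataclass(frozen=True)
class Position:
    src_x: int  # position in the decoded frame buffer
    src_y: int
    dst_x: int  # position in the display frame buffer
    dst_y: int
    pos_w: int  # width of the partial image
    pos_h: int  # height of the partial image


@dataclass(frozen=True)
class RoiWindow:
    org_width: int
    org_height: int
    positions: tuple[Position, ...]  # number_of_position == len(positions)


def pack_roi_window(win: RoiWindow) -> bytes:
    """Serialize to an assumed user_data_unregistered payload layout."""
    out = bytearray(PLACEHOLDER_UUID)
    out += struct.pack(">HHB", win.org_width, win.org_height, len(win.positions))
    for p in win.positions:
        out += struct.pack(">6H", p.src_x, p.src_y, p.dst_x, p.dst_y,
                           p.pos_w, p.pos_h)
    return bytes(out)


def unpack_roi_window(payload: bytes) -> RoiWindow:
    """Inverse of pack_roi_window; rejects payloads with a different UUID."""
    if payload[:16] != PLACEHOLDER_UUID:
        raise ValueError("payload does not describe a tile layout")
    org_w, org_h, count = struct.unpack_from(">HHB", payload, 16)
    offset = 16 + struct.calcsize(">HHB")
    positions = []
    for _ in range(count):
        positions.append(Position(*struct.unpack_from(">6H", payload, offset)))
        offset += struct.calcsize(">6H")
    return RoiWindow(org_w, org_h, tuple(positions))
```

A decoder-side component would check the UUID first, as described in the text, and only then interpret the remaining bytes as layout data.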
In one embodiment, experimental analysis shows that, given the decoded frame buffer and the position and size of each tile, all possible tile grouping and filling schemes for buffering the tiles in the decoded frame buffer can be enumerated. The tile grouping and filling schemes can be compiled into a codebook; the bitstream reconstruction unit 302 groups and fills tiles according to the codebook to reconstruct the bitstream, and the partial region filling unit 304 then performs the inverse grouping and filling operation according to the decoding book to restore the decoded image.
FIGS. 5A to 5F are schematic diagrams of tile grouping and filling schemes according to several embodiments of the present invention. Assume that the decoded frame buffer 305 has a length (x direction) of 10 resolution units and a width (y direction) of 8 resolution units, and that the display frame buffer 34 has a length of 15 resolution units and a width of 9 resolution units and contains display buffer entries 34(0)~34(134).
In FIGS. 5A to 5C, the region of interest is grouped into three partial images of interest: the first is shown with a right-slanted line pattern, the second with a dot pattern, and the third with a left-slanted line pattern.
FIGS. 5A to 5C illustrate the user moving the horizontal viewing angle (that is, the latitude of the geographic location), so that the display position of the region of interest changes while its size stays the same. Accordingly, the tile-of-interest data of FIG. 5A indicates that the first, second, and third partial images of interest start at display buffer entries 34(0), 34(10), and 34(55), respectively; the tile-of-interest data of FIG. 5B indicates start entries 34(30), 34(40), and 34(85); and the tile-of-interest data of FIG. 5C indicates start entries 34(60), 34(70), and 34(115).
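The start entries quoted for FIGS. 5A to 5C differ by 30, which is two rows of the 15-entry-wide display frame buffer. A minimal sketch of that arithmetic follows, assuming the entries are numbered row by row; the helper names are hypothetical.

```python
DISP_W = 15  # display frame buffer 34 is 15 units long (entries per row)


def entry_to_xy(entry: int) -> tuple:
    """Convert a display buffer entry number to (x, y), assuming row-major order."""
    return entry % DISP_W, entry // DISP_W


def shift_rows(start_entries: list, rows: int) -> list:
    """Move every start entry down by a whole number of rows."""
    return [entry + rows * DISP_W for entry in start_entries]


fig_5a = [0, 10, 55]
print(shift_rows(fig_5a, 2))  # [30, 40, 85], the start entries of FIG. 5B
print(shift_rows(fig_5a, 4))  # [60, 70, 115], the start entries of FIG. 5C
```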
In FIGS. 5D and 5E, the region of interest is grouped into two partial images of interest: the first is shown with a right-slanted line pattern and the second with a dot pattern. FIGS. 5D and 5E illustrate the user moving the horizontal viewing angle (that is, the latitude of the geographic location), so that the display position of the region of interest changes while its size stays the same. Accordingly, the tile-of-interest data of FIG. 5D indicates that the first and second partial images of interest start at display buffer entries 34(0) and 34(10), respectively, and the tile-of-interest data of FIG. 5E indicates start entries 34(75) and 34(85). Thus, when the user moves the horizontal viewing angle, the decoded image displayed by the electronic device 3 is correctly mapped to the geographic location input by the user.
FIG. 5F illustrates the user moving the vertical viewing angle (that is, the longitude of the geographic location), so that the region of interest appears as a discontinuous image; the first partial image of interest is shown with a right-slanted line pattern and the second with a dot pattern. Accordingly, the tile-of-interest data of FIG. 5F indicates that the first and second partial images of interest start at display buffer entries 34(6) and 34(0), respectively. The partial region filling unit 304 can copy the tile data of decoded frame buffer entries 305(0)~305(8) and 305(9), which belong to the first partial image of interest, to display buffer entries 34(6)~34(14) and 34(0), respectively, and so on. Thus, when the user moves the vertical viewing angle, the decoded image displayed by the electronic device 3 is correctly mapped to the geographic location input by the user.
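Taking FIG. 5A as an example, when the input bitstream is received, the ROI locating unit 301 can determine from the geographic location that the region of interest contains the tiles corresponding to display buffer entries 34(0)~34(74), and produce the region-of-interest information. The bitstream reconstruction unit 302 reconstructs the input bitstream according to the region-of-interest information to generate the reconstructed bitstream. The video decoder 303 decodes the reconstructed bitstream and buffers the image of interest in the decoded frame buffer 305 (see the tile layout of the first, second, and third partial images of interest in FIG. 5A). Finally, the partial region filling unit 304 copies the first partial image of interest to display buffer entries 34(0)~34(9), 34(15)~34(24), 34(30)~34(39), 34(45)~34(54), and 34(60)~34(69); the second partial image of interest to entries 34(10)~34(14), 34(25)~34(29), and 34(40)~34(44); and the third partial image of interest to entries 34(55)~34(59) and 34(70)~34(74). FIGS. 5B to 5F work in the same way.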
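The filling step for FIG. 5A can be sketched as a set of rectangular copies from the 10*8 decoded frame buffer into the 15*9 display frame buffer. The destination rectangles below follow the entry ranges given in the text; the source positions inside the decoded frame buffer are an assumption (one packing that fits 10*8), because the figure itself is not reproduced here, and the function name is hypothetical.

```python
DEC_W, DEC_H = 10, 8    # decoded frame buffer 305
DISP_W, DISP_H = 15, 9  # display frame buffer 34

# (src_x, src_y, dst_x, dst_y, width, height) for each partial image of interest.
FIG_5A = [
    (0, 0, 0, 0, 10, 5),   # first: starts at 34(0)
    (0, 5, 10, 0, 5, 3),   # second: starts at 34(10); source position assumed
    (5, 5, 10, 3, 5, 2),   # third: starts at 34(55); source position assumed
]


def fill_partial_regions(decoded: list, display: list, regions: list) -> list:
    """Copy each partial image from the decoded buffer to its display position."""
    for src_x, src_y, dst_x, dst_y, w, h in regions:
        for dy in range(h):
            for dx in range(w):
                src = (src_y + dy) * DEC_W + (src_x + dx)
                dst = (dst_y + dy) * DISP_W + (dst_x + dx)
                display[dst] = decoded[src]
    return display


decoded = list(range(DEC_W * DEC_H))  # stand-in tile data: entry n holds n
display = fill_partial_regions(decoded, [None] * (DISP_W * DISP_H), FIG_5A)
filled = [i for i, value in enumerate(display) if value is not None]
print(filled[0], filled[-1], len(filled))  # 0 74 75
```

The three rectangles together cover exactly entries 34(0)~34(74), matching the region of interest described for FIG. 5A, while occupying 75 of the 80 entries of the decoded frame buffer.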
The operation of the video decoding device 30 can be summarized as a decoded-tile and display-position synchronization process 6, shown in FIG. 6, which includes the following steps.
Step 601: Determine, according to a geographic location, a region of interest of an original image of an input bitstream, and extract the tile-of-interest data of the region of interest, wherein the region of interest corresponds to an image of interest.
Step 602: Reconstruct the input bitstream according to the tile-of-interest data to generate a reconstructed bitstream.
Step 603: Decode the reconstructed bitstream to generate a reconstructed decoded image.
Step 604: Convert the reconstructed decoded image into the image of interest according to the tile-of-interest data.
In the decoded-tile and display-position synchronization process 6, step 601 can be performed by the bitstream receiving unit 300 and the ROI locating unit 301; step 602 by the bitstream reconstruction unit 302; step 603 by the video decoder 303; and step 604 by the partial region filling unit 304 and the decoded frame buffer 305. For the detailed operation of process 6, refer to the description above.
In step 602, the bitstream reconstruction unit 302 can perform a bitstream encoding process 7, shown in FIG. 7, to generate the reconstructed bitstream. The bitstream encoding process 7 includes the following steps.
Step 701: Receive a current tile among the tiles of the input bitstream.
Step 702: Reconstruct a tile layout of the input bitstream according to the tile-of-interest data of the region of interest.
Step 703: Determine whether the current tile belongs to the region of interest. If yes, go to step 704; if no, return to step 701.
Step 704: Adjust the tile address of the current tile according to the tile layout.
Step 705: Generate the reconstructed bitstream.
In the encoding process 7, in step 702 the bitstream reconstruction unit 302 encodes the non-VCL parameter sets of the access unit (for example, the VPS, SPS, PPS, and SEI) according to the tile-of-interest data of the region of interest to reconstruct the tile layout of the input bitstream. In steps 703 and 704, the bitstream reconstruction unit 302 encodes the slice headers to adjust the tile address of the current tile. In step 705, the bitstream reconstruction unit 302 stitches the tile-of-interest data into the slice data of the access unit to generate the reconstructed bitstream.
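The tile filtering and address adjustment of steps 701, 703, and 704 can be sketched as a single pass over the input tiles. This is a simplified illustration, not the patented rewriting procedure: it assumes new addresses are handed out sequentially in input order, and the function name is hypothetical. The parameter set rewriting of step 702 and the byte-level stitching of step 705 are not modeled.

```python
def rewrite_tile_addresses(input_tiles, roi_tiles) -> dict:
    """Map each ROI tile's original address to its address in the new layout."""
    roi = set(roi_tiles)
    new_address = {}
    for tile in input_tiles:                  # step 701: take the current tile
        if tile not in roi:                   # step 703: skip tiles outside the ROI
            continue
        new_address[tile] = len(new_address)  # step 704: assign the next address
    return new_address


# Three hypothetical ROI tiles out of the 510 tiles of the world map.
print(rewrite_tile_addresses(range(510), [160, 132, 133]))
```

Tiles that are not in the region of interest never receive an address, which mirrors the "return to step 701" branch of step 703.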
综上所述,本发明的视频解码装置可根据使用者输入的地理位置,撷取输入比特流的图像帧中属于兴趣区的相关图像解码数据;在给定解码帧缓存器尺寸的条件下,根据相关图像解码数据来重建输入比特流,以产生重建比特流;对重建比特流进行图像解码,以产生重建解码图像并将之暂存于解码帧缓存器;最后将重建解码图像(例如,属于兴趣区的图块)进行重组,以还原并产生解码图像。如此一来,本发明的视频解码装置可在不增加解码帧缓存器面积的前提下,针对属于兴趣区的相关图块进行解码,如此可确保电子装置能正常显示解码图像,也可避免增加视频解码装置的硬件成本。此外,由于重建比特流是依据使用者输入的地理位置及对应兴趣区的相关图像解码数据来产生,因此可确保电子装置显示的解码图像能正确映射到使用者输入的地理位置。To sum up, the video decoding device of the present invention can capture the relevant image decoding data belonging to the region of interest in the image frame of the input bitstream according to the geographic location input by the user; under the condition of a given decoding frame buffer size, Reconstruct the input bit stream according to the relevant image decoding data to generate the reconstructed bit stream; perform image decoding on the reconstructed bit stream to generate the reconstructed decoded image and temporarily store it in the decoded frame buffer; finally, the reconstructed decoded image (for example, belonging to regions of interest) are reorganized to restore and produce a decoded image. In this way, the video decoding device of the present invention can decode the relevant picture blocks belonging to the region of interest without increasing the area of the decoding frame buffer, so as to ensure that the electronic device can display the decoded image normally, and avoid increasing the video The hardware cost of the decoding device. In addition, since the reconstructed bitstream is generated according to the geographic location input by the user and the relevant image decoded data corresponding to the region of interest, it can ensure that the decoded image displayed by the electronic device can be correctly mapped to the geographic location input by the user.
The above describes only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810959428.7A CN110858902A (en) | 2018-08-22 | 2018-08-22 | Synchronization method of decoded tile and display position based on input geographic location and related video decoding device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110858902A (en) | 2020-03-03 |
Family
ID=69634806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810959428.7A Pending CN110858902A (en) | 2018-08-22 | 2018-08-22 | Synchronization method of decoded tile and display position based on input geographic location and related video decoding device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110858902A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101002471A (en) * | 2004-08-13 | 2007-07-18 | 庆熙大学校产学协力团 | Method and apparatus to encode image, and method and apparatus to decode image data |
CN102999939A (en) * | 2012-09-21 | 2013-03-27 | 魏益群 | Coordinate acquisition device, real-time three-dimensional reconstruction system, real-time three-dimensional reconstruction method and three-dimensional interactive equipment |
CN104065964A (en) * | 2014-06-19 | 2014-09-24 | 上海交通大学 | Codec method and video codec device for region of interest information |
CN105205862A (en) * | 2015-10-26 | 2015-12-30 | 武汉沃亿生物有限公司 | Three-dimensional image reconstruction method and system |
US20170019655A1 (en) * | 2015-07-13 | 2017-01-19 | Texas Instruments Incorporated | Three-dimensional dense structure from motion with stereo vision |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2020-03-03 |