CN102368824B - Video stereo vision conversion method - Google Patents
Video stereo vision conversion method
- Publication number
- CN102368824B CN2011102762754A CN201110276275A
- Authority
- CN
- China
- Prior art keywords
- viewpoint
- input video
- depth map
- key frame
- video
- Prior art date
- Legal status
- Expired - Fee Related
Landscapes
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Processing Or Creating Images (AREA)
Abstract
本发明提出一种视频立体转换方法,包括以下步骤:获取输入视频的视点数n及裸眼立体显示所需的视点数N,其中,n<N;获取输入视频的每个视点的关键帧并计算其场景深度以获得输入视频的每个视点的关键帧的深度图;根据输入视频的每个视点的相邻两个关键帧及其对应的深度图,获取相邻两个关键帧之间的非关键帧的深度图,并重复该步骤获得输入视频的每个视点的所有非关键帧的深度图;根据输入视频的图像和输入视频的每个视点的深度图,绘制N-n视点的图像,并与输入视频的图像构成N视点图像;对N视点图像进行像素排列以获得适于裸眼立体显示设备的N视点图像。本发明的方法可快捷地将单视点平面、双目立体等视频实现裸眼3D立体显示,节约制作周期及成本。
The present invention proposes a video stereoscopic conversion method, comprising the following steps: obtaining the number of viewpoints n of the input video and the number of viewpoints N required for naked-eye stereoscopic display, where n<N; obtaining the key frames of each viewpoint of the input video and computing their scene depth to obtain the depth maps of the key frames of each viewpoint of the input video; for each viewpoint, obtaining the depth maps of the non-key frames between two adjacent key frames from those two key frames and their corresponding depth maps, and repeating this step to obtain the depth maps of all non-key frames of each viewpoint; rendering images of N-n viewpoints from the images of the input video and the depth maps of each viewpoint, which together with the images of the input video constitute an N-viewpoint image; and arranging the pixels of the N-viewpoint image to obtain an N-viewpoint image suited to a naked-eye stereoscopic display device. The method of the present invention can quickly convert single-viewpoint planar video, binocular stereoscopic video, and the like for naked-eye 3D stereoscopic display, saving production time and cost.
Description
技术领域 Technical Field
本发明涉及计算机视觉技术领域，特别涉及一种视频立体转换方法。The present invention relates to the technical field of computer vision, and in particular to a video stereoscopic conversion method.
背景技术 Background Art
随着3D立体显示技术的不断发展，立体电影、电视、移动设备等立体产品迅速普及，大众对立体视频的需求也越来越多。现有技术中的立体视频通过双目方式进行显示，观看过程中需要通过主动快门式、偏振式、红蓝式眼镜等将双目图像分别发送至人的左右两眼，从而形成立体视觉感知。这种方式需要用户佩戴眼镜，观看不便。With the continuous development of 3D stereoscopic display technology, stereoscopic products such as 3D movies, televisions, and mobile devices have spread rapidly, and public demand for stereoscopic video keeps growing. Stereoscopic video in the prior art is displayed binocularly: during viewing, the two images must be delivered separately to the viewer's left and right eyes through active-shutter, polarized, or red-cyan anaglyph glasses to create stereoscopic perception. This approach requires the user to wear glasses, which is inconvenient.
针对现有技术中需要佩戴眼镜观看3D视频的缺陷，现有技术采用裸眼3D立体显示设备显示。裸眼3D立体显示技术能够让用户无需佩戴辅助设备即可观看到视频所具有的立体效果，是未来家庭、广告、展示等场合中用于立体观看较为理想的方式。现有技术存在的问题是，在裸眼3D立体显示设备上进行立体显示，需要根据该显示设备所提供的观看视点数量N，向该设备同时输入等量视点N的视频信号源，导致裸眼3D立体显示设备的视频源获取困难。To address the prior-art drawback of having to wear glasses to watch 3D video, the prior art employs naked-eye 3D stereoscopic display devices. Naked-eye 3D display technology lets users perceive the stereoscopic effect of a video without any auxiliary equipment, and is an ideal way to view stereoscopic content in future home, advertising, and exhibition settings. The problem with the prior art is that stereoscopic display on a naked-eye 3D device requires simultaneously feeding the device a video source with as many viewpoints as the number N of viewing viewpoints the device provides, which makes video sources for naked-eye 3D display devices difficult to obtain.
现有技术中，采用开发具有N个视点的视频采集设备进行同步采集的方法为裸眼立体显示设备提供视频源，这种方法存在的问题是，一方面制作成本高、周期长，对采集设备及辅助控制设备的要求高，另一方面现有的大量视频资料，例如单视点平面节目源、双目立体节目源等在裸眼3D立体显示设备上难以显示。In the prior art, video sources for naked-eye stereoscopic display devices are provided by developing video capture equipment with N viewpoints for synchronous acquisition. The problems with this method are that, on the one hand, production is costly and time-consuming and places high demands on the capture and auxiliary control equipment; on the other hand, the large body of existing video material, such as single-viewpoint planar program sources and binocular stereoscopic program sources, is difficult to display on naked-eye 3D stereoscopic display devices.
发明内容 Summary of the Invention
本发明的目的旨在至少解决上述技术缺陷之一。The present invention aims to solve at least one of the technical drawbacks described above.
为达到上述目的，本发明提出一种视频立体转换方法，包括以下步骤：S1：获取输入视频的视点数n；S2：获取裸眼立体显示所需的视点数N，其中，n<N；S3：获取所述输入视频的每个视点的关键帧，计算所述输入视频的每个视点的关键帧的场景深度，获得所述输入视频的每个视点的关键帧的深度图；S4：根据所述输入视频的每个视点的相邻两个关键帧和所述相邻两个关键帧的深度图，获得所述相邻两个关键帧之间的非关键帧的深度图；S5：重复步骤S4，获得所述输入视频的每个视点的所有非关键帧的深度图；S6：根据所述输入视频的图像和所述输入视频的每个视点的深度图，绘制N-n视点的图像，并与所述输入视频的图像构成N视点图像；S7：对所述N视点图像进行像素排列以获得适于预定裸眼立体显示设备的N视点图像。To achieve the above object, the present invention proposes a video stereoscopic conversion method comprising the following steps. S1: obtain the number of viewpoints n of the input video. S2: obtain the number of viewpoints N required for naked-eye stereoscopic display, where n<N. S3: obtain the key frames of each viewpoint of the input video, compute the scene depth of those key frames, and obtain the depth maps of the key frames of each viewpoint. S4: from two adjacent key frames of each viewpoint and the depth maps of those two key frames, obtain the depth maps of the non-key frames between the two adjacent key frames. S5: repeat step S4 to obtain the depth maps of all non-key frames of each viewpoint. S6: from the images of the input video and the depth maps of each viewpoint, render images of N-n viewpoints, which together with the images of the input video constitute an N-viewpoint image. S7: arrange the pixels of the N-viewpoint image to obtain an N-viewpoint image suited to the intended naked-eye stereoscopic display device.
在本发明的一个实施例中，所述步骤S1进一步包括：S11：判断所述输入视频的文件个数；S12：如果所述输入视频的文件个数不为1，则所述输入视频的视点数n为所述文件个数；S13：如果所述输入视频的文件个数为1，则进一步判断所述文件的分段点数，所述输入视频的视点数n为所述文件的分段点数。In an embodiment of the present invention, step S1 further comprises: S11: determine the number of files of the input video; S12: if the number of files of the input video is not 1, the number of viewpoints n of the input video is the number of files; S13: if the number of files of the input video is 1, further determine the number of segmentation points of the file, and the number of viewpoints n of the input video is that number of segmentation points.
在本发明的一个实施例中，所述步骤S3进一步包括：如果所述输入视频的视点数n=1，则获取所述视点的关键帧，计算所述视点的关键帧的场景深度以获得所述视点的关键帧的深度图；如果所述输入视频的视点数n=2，则获取所述输入视频的每个视点的关键帧，并计算所述输入视频的视差，并根据所述视差与场景深度的转换关系，获得所述输入视频的每个视点的关键帧的深度图。In an embodiment of the present invention, step S3 further comprises: if the number of viewpoints of the input video is n=1, obtain the key frames of the viewpoint and compute their scene depth to obtain the depth maps of the key frames of that viewpoint; if the number of viewpoints of the input video is n=2, obtain the key frames of each viewpoint, compute the disparity of the input video, and obtain the depth maps of the key frames of each viewpoint from the conversion relationship between disparity and scene depth.
在本发明的一个实施例中，所述步骤S4进一步包括：S41：根据所述每个视点的相邻两个关键帧Kn和Kn+1，以及关键帧Kn对应的深度图DKn和关键帧Kn+1对应的深度图DKn+1，获得所述关键帧Kn和Kn+1之间的非关键帧Fi(i=1,2,...,t)，其中t为非关键帧的个数；S42：计算Kn与F1之间、F1与F2之间、...、Ft-1与Ft之间的光流图，并以所述光流图为基准将DKn在非关键帧Fi(i=1,2,...,t)中进行像素拷贝，获得非关键帧Fi(i=1,2,...,t)的第一深度图；S43：对所述第一深度图进行中值滤波获得第三深度图；S44：计算Kn+1与Ft之间、Ft与Ft-1之间、...、F2与F1之间的光流图，并以所述光流图为基准将DKn+1在非关键帧Fi(i=t,t-1,t-2,...,1)中进行像素拷贝，获得非关键帧Fi(i=1,2,...,t)的第二深度图；S45：对所述第二深度图进行中值滤波获得第四深度图；S46：计算所述第三深度图和所述第四深度图中的对应像素点的均值，获得非关键帧的深度图DFi(i=1,2,...,t)。In an embodiment of the present invention, step S4 further comprises: S41: from the two adjacent key frames Kn and Kn+1 of each viewpoint, together with the depth map DKn corresponding to key frame Kn and the depth map DKn+1 corresponding to key frame Kn+1, obtain the non-key frames Fi (i=1,2,...,t) between Kn and Kn+1, where t is the number of non-key frames; S42: compute the optical-flow fields between Kn and F1, between F1 and F2, ..., and between Ft-1 and Ft, and, taking these flow fields as a guide, copy the pixels of DKn into the non-key frames Fi (i=1,2,...,t) to obtain a first depth map for each non-key frame Fi; S43: apply median filtering to each first depth map to obtain a third depth map; S44: compute the optical-flow fields between Kn+1 and Ft, between Ft and Ft-1, ..., and between F2 and F1, and, taking these flow fields as a guide, copy the pixels of DKn+1 into the non-key frames Fi (i=t,t-1,t-2,...,1) to obtain a second depth map for each non-key frame Fi; S45: apply median filtering to each second depth map to obtain a fourth depth map; S46: average the corresponding pixels of the third and fourth depth maps to obtain the depth maps DFi (i=1,2,...,t) of the non-key frames.
在本发明的一个实施例中，所述步骤S6进一步包括：如果所述输入视频的视点数n=1，则获取所述视点的图像及其深度图，并按照预定期望视点位置进行像素点绘制；如果所述输入视频的视点数n=2，则获取所述每个视点的图像及其深度图，并获得所述每个视点位置与预定的期望视点位置的距离DL和DR以及所述每个视点位置的距离D，所述期望视点位置点的像素值按照以下公式计算，In an embodiment of the present invention, step S6 further comprises: if the number of viewpoints of the input video is n=1, obtain the image of that viewpoint and its depth map, and render pixels according to the predetermined desired viewpoint position; if the number of viewpoints of the input video is n=2, obtain the image and depth map of each viewpoint, obtain the distances DL and DR between each viewpoint position and the predetermined desired viewpoint position as well as the distance D between the viewpoint positions, and compute the pixel value at the desired viewpoint position by the following formula,
Pixel=Pixel(L)*DR/D+Pixel(R)*DL/D，其中，Pixel(L)为与所述预定的期望视点位置距离DR对应视点的像素值，Pixel(R)为与所述预定的期望视点位置距离DL对应视点的像素值。Pixel=Pixel(L)*DR/D+Pixel(R)*DL/D, where Pixel(L) is the pixel value of the viewpoint associated with distance DR from the predetermined desired viewpoint position, and Pixel(R) is the pixel value of the viewpoint associated with distance DL from the predetermined desired viewpoint position.
在本发明的一个实施例中，还包括：如果所述预定的期望视点位置的个数大于1，则根据所述预定的期望视点位置重复执行像素点绘制。In an embodiment of the present invention, the method further comprises: if the number of predetermined desired viewpoint positions is greater than 1, repeat the pixel rendering for each predetermined desired viewpoint position.
在本发明的一个实施例中，所述步骤S6还包括：如果所述输入视频的视点数n>1，根据所述视频n视点的图像及所述视频的视差图，绘制N-n视点的图像，与所述n视点的图像构成所述裸眼立体显示的N视点图像。In an embodiment of the present invention, step S6 further comprises: if the number of viewpoints of the input video is n>1, render the images of N-n viewpoints from the images of the n viewpoints of the video and the disparity map of the video; together with the images of the n viewpoints, they constitute the N-viewpoint image for naked-eye stereoscopic display.
根据本发明实施例的视频立体转换方法，至少具有以下有益效果：The video stereoscopic conversion method according to the embodiments of the present invention has at least the following beneficial effects:
(1)实现输入视频的适用于裸眼立体显示设备的多目视频转换，取得良好的裸眼立体观看效果。(1) It converts the input video into multi-view video suitable for naked-eye stereoscopic display devices, achieving a good naked-eye stereoscopic viewing effect.
(2)节约了裸眼3D立体显示技术的制作成本，同时缩短制作周期。(2) It reduces the high production cost of naked-eye 3D stereoscopic display and shortens the production cycle.
(3)可方便地将现有的大量视频资料在裸眼3D立体显示设备上进行显示。(3) It allows the large body of existing video material to be conveniently displayed on naked-eye 3D stereoscopic display devices.
本发明附加的方面和优点将在下面的描述中部分给出,部分将从下面的描述中变得明显,或通过本发明的实践了解到。Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
附图说明 Brief Description of the Drawings
本发明上述的和/或附加的方面和优点从下面结合附图对实施例的描述中将变得明显和容易理解,其中:The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, wherein:
图1为本发明实施例的视频立体转换方法流程图。FIG. 1 is a flowchart of a video stereo conversion method according to an embodiment of the present invention.
具体实施方式 Detailed Description of the Embodiments
下面详细描述本发明的实施例，所述实施例的示例在附图中示出，其中自始至终相同或类似的标号表示相同或类似的元件或具有相同或类似功能的元件。下面通过参考附图描述的实施例是示例性的，仅用于解释本发明，而不能解释为对本发明的限制。Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended only to explain the present invention and should not be construed as limiting it.
图1为本发明实施例的视频立体转换方法流程图。如图1所示,根据本发明实施例的视频立体转换方法,包括以下步骤:FIG. 1 is a flowchart of a video stereo conversion method according to an embodiment of the present invention. As shown in Figure 1, the video stereo conversion method according to the embodiment of the present invention includes the following steps:
步骤S101:获取输入视频的视点数n。Step S101: Obtain the number n of viewpoints of the input video.
其中，输入视频可包括单视点平面视频和双目立体视频。The input video may include single-viewpoint planar video and binocular stereoscopic video.
具体地，可首先判断输入视频的文件个数。如果输入视频的文件个数不为1，则输入视频的视点数n为文件个数；如果输入视频的文件个数为1，则进一步判断文件的分段点数，输入视频的视点数n为文件的分段点数。Specifically, the number of files of the input video may be determined first. If the number of files of the input video is not 1, the number of viewpoints n of the input video is the number of files; if the number of files is 1, the number of segmentation points of the file is further determined, and the number of viewpoints n of the input video is that number of segmentation points.
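The file/segment branching above can be sketched as follows; `segment_count` is a hypothetical parameter standing in for however an implementation actually counts the segmentation points of a single file:

```python
def infer_viewpoint_count(file_count, segment_count=1):
    """Infer the viewpoint count n of the input video (steps S11-S13)."""
    if file_count != 1:
        # S12: several files -> one viewpoint per file
        return file_count
    # S13: a single file -> one viewpoint per segmentation point
    return segment_count

print(infer_viewpoint_count(2))      # two separate files -> n = 2
print(infer_viewpoint_count(1, 2))   # one file with 2 segments -> n = 2
```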
步骤S102:获取裸眼立体显示所需的视点数N,其中,n<N。Step S102: Obtain the number N of viewpoints required for naked-eye stereoscopic display, where n<N.
其中，根据裸眼立体显示设备获取裸眼立体显示所需的视点数N。The number of viewpoints N required for naked-eye stereoscopic display is determined by the naked-eye stereoscopic display device.
步骤S103:获取输入视频的每个视点的关键帧,计算输入视频的每个视点的关键帧的场景深度,获得输入视频的每个视点的关键帧的深度图。Step S103: Obtain the key frame of each viewpoint of the input video, calculate the scene depth of the key frame of each viewpoint of the input video, and obtain the depth map of the key frame of each viewpoint of the input video.
具体地,在本发明的一个实施例中,可通过如下的方法获得输入视频的每个视点的关键帧的深度图。Specifically, in one embodiment of the present invention, the depth map of the key frame of each viewpoint of the input video may be obtained through the following method.
首先获取输入视频的每个视点的关键帧。First, obtain the key frames of each viewpoint of the input video.
其中，获取输入视频的每个视点的关键帧的方法可参考专利ZL200810225050.4所述的方法，同时也可选择其他的选取算法，例如基于镜头边缘获取关键帧的方法、基于运动分析获取关键帧的方法、基于图像信息获取关键帧的方法和基于视频聚类获取关键帧的方法。The key frames of each viewpoint of the input video may be obtained with the method described in patent ZL200810225050.4, or with other selection algorithms, such as shot-boundary-based key frame extraction, motion-analysis-based key frame extraction, image-information-based key frame extraction, and video-clustering-based key frame extraction.
然后,根据输入视频的每个视点的关键帧计算其场景深度,获得输入视频的每个视点的关键帧的深度图。Then, the scene depth is calculated according to the keyframes of each viewpoint of the input video, and the depth map of the keyframes of each viewpoint of the input video is obtained.
具体地，如果输入视频的视点数n=1，则根据获取到的视点关键帧，计算关键帧的场景深度以获得视点的关键帧的深度图，例如，可利用专利ZL200710117654.2所述的方法获得视点的关键帧的深度图，也可选择其他的方法。Specifically, if the number of viewpoints of the input video is n=1, compute the scene depth of the obtained viewpoint key frames to obtain the depth maps of those key frames; for example, the method described in patent ZL200710117654.2 may be used, or another method may be chosen.
如果输入视频的视点数n=2，则根据获取到的每个视点的关键帧，计算输入视频的视差，再根据视差与场景深度的转换关系，获得输入视频的每个视点的关键帧的深度图。If the number of viewpoints of the input video is n=2, compute the disparity of the input video from the obtained key frames of each viewpoint, and then obtain the depth maps of the key frames of each viewpoint from the conversion relationship between disparity and scene depth.
其中，关键帧的视差计算可采用典型的立体匹配方法，也可选择其他的方法。The disparity of the key frames may be computed with a typical stereo matching method, or with other methods.
视差与场景深度均可用于描述场景的三维信息，视差与场景深度可相互转换。Both disparity and scene depth describe the three-dimensional structure of the scene, and the two can be converted into each other.
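The patent does not fix a particular conversion formula; for a rectified parallel stereo pair the standard relation is Z = f·B/d (focal length f in pixels, baseline B, disparity d in pixels), which a sketch of this conversion might use:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline, eps=1e-6):
    """Convert a disparity map (in pixels) to depth via Z = f * B / d.

    Assumes a rectified, parallel camera pair; eps guards against
    division by zero where no stereo match was found.
    """
    d = np.asarray(disparity, dtype=np.float64)
    return focal_px * baseline / np.maximum(d, eps)

# Larger disparity means the point is closer to the cameras.
z = disparity_to_depth([1.0, 2.0, 4.0], focal_px=800.0, baseline=0.1)
print(z)  # [80. 40. 20.]
```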
步骤S104:根据输入视频的每个视点的相邻两个关键帧和相邻两个关键帧的深度图,获取相邻两个关键帧之间的非关键帧的深度图。Step S104: Obtain the depth map of the non-key frame between the two adjacent key frames according to the two adjacent key frames and the depth maps of the two adjacent key frames of each viewpoint of the input video.
具体地，首先根据每个视点的相邻两个关键帧Kn和Kn+1，以及关键帧Kn对应的深度图DKn和关键帧Kn+1对应的深度图DKn+1，获取关键帧Kn和Kn+1之间的非关键帧Fi(i=1,2,...,t)，其中t为非关键帧的个数。Specifically, first obtain the non-key frames Fi (i=1,2,...,t) between the two adjacent key frames Kn and Kn+1 of each viewpoint, using those key frames and their corresponding depth maps DKn and DKn+1, where t is the number of non-key frames.
计算Kn与F1之间、F1与F2之间、...、Ft-1与Ft之间的光流图，并以光流图为基准将DKn在非关键帧Fi(i=1,2,...,t)中进行像素拷贝，获取非关键帧Fi(i=1,2,...,t)的第一深度图。Compute the optical-flow fields between Kn and F1, between F1 and F2, ..., and between Ft-1 and Ft, and, taking these flow fields as a guide, copy the pixels of DKn into the non-key frames Fi (i=1,2,...,t) to obtain a first depth map for each non-key frame Fi (i=1,2,...,t).
对第一深度图进行中值滤波获取第三深度图。Apply median filtering to each first depth map to obtain a third depth map (i=1,2,...,t).
计算Kn+1与Ft之间、Ft与Ft-1之间、...、F2与F1之间的光流图，并以光流图为基准将DKn+1在非关键帧Fi(i=t,t-1,t-2,...,1)中进行像素拷贝，获取非关键帧Fi(i=1,2,...,t)的第二深度图。Compute the optical-flow fields between Kn+1 and Ft, between Ft and Ft-1, ..., and between F2 and F1, and, taking these flow fields as a guide, copy the pixels of DKn+1 into the non-key frames Fi (i=t,t-1,t-2,...,1) to obtain a second depth map for each non-key frame Fi (i=1,2,...,t).
对第二深度图进行中值滤波获取第四深度图。Apply median filtering to each second depth map to obtain a fourth depth map (i=1,2,...,t).
其中，第三深度图和第四深度图的中值滤波方法相同。The same median filtering method is used for the third and fourth depth maps.
计算第三深度图和第四深度图中的对应像素点的均值，获取非关键帧的深度图DFi(i=1,2,...,t)。Average the corresponding pixels of the third and fourth depth maps to obtain the depth maps DFi (i=1,2,...,t) of the non-key frames.
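The bidirectional propagation of steps S42–S46 can be sketched as follows. This is a simplified stand-in, assuming integer-valued optical-flow fields are already supplied by some external estimator (flow computation itself is out of scope), and using a basic 3×3 median filter:

```python
import numpy as np

def warp_depth(depth, flow):
    """Copy each depth pixel along its (dx, dy) flow vector into the next frame."""
    h, w = depth.shape
    out = np.zeros_like(depth)
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.clip(xs + flow[..., 0].astype(int), 0, w - 1)
    yt = np.clip(ys + flow[..., 1].astype(int), 0, h - 1)
    out[yt, xt] = depth
    return out

def median3(img):
    """3x3 median filter with edge replication (stand-in for S43/S45)."""
    p = np.pad(img, 1, mode='edge')
    h, w = img.shape
    stack = [p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.median(np.stack(stack), axis=0)

def propagate_depth(dk_n, dk_n1, flows_fwd, flows_bwd):
    """Return depth maps DF_i for the t non-key frames between K_n and K_n+1.

    flows_fwd: flows K_n->F_1, F_1->F_2, ..., F_{t-1}->F_t (S42)
    flows_bwd: flows K_n+1->F_t, F_t->F_{t-1}, ..., F_2->F_1 (S44)
    """
    fwd, d = [], dk_n
    for f in flows_fwd:
        d = median3(warp_depth(d, f))   # first maps -> median -> third maps
        fwd.append(d)
    bwd, d = [], dk_n1
    for f in flows_bwd:
        d = median3(warp_depth(d, f))   # second maps -> median -> fourth maps
        bwd.append(d)
    bwd.reverse()                       # bwd[i] now corresponds to F_{i+1}
    return [(a + b) / 2.0 for a, b in zip(fwd, bwd)]  # S46: per-pixel mean

# With zero flow, a non-key frame's depth is the mean of the two key frames.
zero = np.zeros((4, 4, 2))
dfs = propagate_depth(np.full((4, 4), 2.0), np.full((4, 4), 4.0), [zero], [zero])
```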
步骤S105:重复步骤S104,获得输入视频的每个视点的所有非关键帧的深度图。Step S105: Repeat step S104 to obtain the depth maps of all non-key frames of each viewpoint of the input video.
具体地，根据步骤S104所述的方法，计算输入视频所有两个相邻的关键帧之间的非关键帧的深度图，由此获得输入视频的每个视点的所有非关键帧的深度图。Specifically, compute the depth maps of the non-key frames between every pair of adjacent key frames of the input video by the method described in step S104, thereby obtaining the depth maps of all non-key frames of each viewpoint of the input video.
步骤S106:根据输入视频的图像和输入视频的每个视点的深度图,绘制N-n视点的图像,并与输入视频的图像构成N视点图像。Step S106: According to the image of the input video and the depth map of each viewpoint of the input video, draw images of N-n viewpoints, and form N viewpoint images with the images of the input video.
具体地，如果输入视频的视点数n=1，则获取视点的图像及其对应的深度图，并按照预定期望视点位置进行像素点绘制。Specifically, if the number of viewpoints of the input video is n=1, obtain the image of that viewpoint and its corresponding depth map, and render pixels according to the predetermined desired viewpoint position.
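The patent leaves the single-view rendering method open; one common realization is depth-image-based rendering (DIBR), where each pixel is shifted horizontally by a disparity derived from its depth. A rough sketch, in which the linear depth-to-shift mapping is an illustrative assumption and the holes exposed by occlusions are left unfilled:

```python
import numpy as np

def render_virtual_view(image, depth, shift_scale):
    """Forward-warp a grayscale view to a virtual viewpoint.

    Each pixel moves horizontally by round(shift_scale * depth[y, x]);
    a real system would additionally inpaint the holes this exposes.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            nx = x + int(round(shift_scale * depth[y, x]))
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out

img = np.array([[10.0, 20.0, 30.0, 40.0]])
flat = render_virtual_view(img, np.zeros((1, 4)), 1.0)  # zero depth: no shift
```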
如果视频的视点数n=2，则获取每个视点的图像及其对应的深度图，获得每个视点位置与预定的期望视点位置的距离DL和DR以及每个视点位置的距离D，期望视点位置点的像素值按照以下公式计算，If the number of viewpoints of the video is n=2, obtain the image and corresponding depth map of each viewpoint, obtain the distances DL and DR between each viewpoint position and the predetermined desired viewpoint position as well as the distance D between the viewpoint positions, and compute the pixel value at the desired viewpoint position by the following formula,
Pixel=Pixel(L)*DR/D+Pixel(R)*DL/D，其中，Pixel(L)为与预定的期望视点位置距离DR对应视点的像素值，Pixel(R)为与预定的期望视点位置距离DL对应视点的像素值。Pixel=Pixel(L)*DR/D+Pixel(R)*DL/D, where Pixel(L) is the pixel value of the viewpoint associated with distance DR from the predetermined desired viewpoint position, and Pixel(R) is the pixel value of the viewpoint associated with distance DL.
其中，如果预定期望视点位置的个数大于1，则根据预定的期望视点位置重复执行像素点绘制。If the number of predetermined desired viewpoint positions is greater than 1, the pixel rendering is repeated for each predetermined desired viewpoint position.
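Read as weights inversely proportional to distance, with D = DL + DR when the desired viewpoint lies between the two views (an interpretation, not something the patent states explicitly), the blending formula can be sketched as:

```python
def interpolate_pixel(pixel_l, pixel_r, dl, dr):
    """Pixel = Pixel(L)*DR/D + Pixel(R)*DL/D with D = DL + DR.

    The view closer to the desired position receives the larger weight.
    """
    d = dl + dr
    return pixel_l * dr / d + pixel_r * dl / d

print(interpolate_pixel(100.0, 200.0, dl=1.0, dr=1.0))  # midway -> 150.0
print(interpolate_pixel(100.0, 200.0, dl=1.0, dr=3.0))  # nearer left -> 125.0
```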
另外，如果输入视频的视点数n>1，根据视频n视点的图像及视频的视差图，也可绘制N-n视点的图像，与n视点的图像构成裸眼立体显示的N视点图像。绘制方法与基于深度图的绘制方法相同。In addition, if the number of viewpoints of the input video is n>1, the images of N-n viewpoints may also be rendered from the images of the n viewpoints and the disparity map of the video; together with the n-viewpoint images, they constitute the N-viewpoint image for naked-eye stereoscopic display. The rendering method is the same as the depth-map-based rendering method.
步骤S107:对N视点图像进行像素排列以获得适于预定裸眼立体显示设备的N视点图像。Step S107: Perform pixel arrangement on the N-viewpoint image to obtain an N-viewpoint image suitable for a predetermined naked-eye stereoscopic display device.
具体地，对N视点图像进行像素排列可采用专利CN200910088902.4所述的方法，也可采用其他方法。Specifically, the pixel arrangement of the N-viewpoint image may use the method described in patent CN200910088902.4, or other methods.
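The exact pixel arrangement depends on the optics of the target display. The simplest illustrative case, whole-column interleaving for a parallax-barrier panel, looks like the sketch below; real lenticular displays, and presumably the method of CN200910088902.4, interleave at sub-pixel granularity and often along a slanted lens axis:

```python
import numpy as np

def interleave_columns(views):
    """Interleave N same-sized views column by column:
    output column x is taken from view (x mod N)."""
    n = len(views)
    out = np.empty_like(views[0])
    for x in range(out.shape[1]):
        out[:, x] = views[x % n][:, x]
    return out

# Two 2x4 views: columns alternate between view 0 and view 1.
mosaic = interleave_columns([np.zeros((2, 4)), np.ones((2, 4))])
print(mosaic[0])  # [0. 1. 0. 1.]
```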
根据本发明实施例的方法，至少具有以下有益效果：The method according to the embodiments of the present invention has at least the following beneficial effects:
(1)实现输入视频的适用于裸眼立体显示设备的多目视频转换，取得良好的裸眼立体观看效果。(1) It converts the input video into multi-view video suitable for naked-eye stereoscopic display devices, achieving a good naked-eye stereoscopic viewing effect.
(2)节约了裸眼3D立体显示技术的制作成本，同时缩短制作周期。(2) It reduces the high production cost of naked-eye 3D stereoscopic display and shortens the production cycle.
(3)可方便地将现有的大量视频资料在裸眼3D立体显示设备上进行显示。(3) It allows the large body of existing video material to be conveniently displayed on naked-eye 3D stereoscopic display devices.
尽管已经示出和描述了本发明的实施例，对于本领域的普通技术人员而言，可以理解在不脱离本发明的原理和精神的情况下可以对这些实施例进行多种变化、修改、替换和变型，本发明的范围由所附权利要求及其等同限定。Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the invention is defined by the appended claims and their equivalents.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011102762754A CN102368824B (en) | 2011-09-16 | 2011-09-16 | Video stereo vision conversion method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102368824A CN102368824A (en) | 2012-03-07 |
CN102368824B true CN102368824B (en) | 2013-11-20 |
Family
ID=45761373
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2011102762754A Expired - Fee Related CN102368824B (en) | 2011-09-16 | 2011-09-16 | Video stereo vision conversion method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102368824B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103177467A (en) * | 2013-03-27 | 2013-06-26 | 四川长虹电器股份有限公司 | Method for creating naked eye 3D (three-dimensional) subtitles by using Direct 3D technology |
CN105100773B * | 2015-07-20 | 2017-07-28 | 清华大学 | Stereoscopic video production method, stereoscopic view production method and production system |
CN110198411B (en) * | 2019-05-31 | 2021-11-02 | 努比亚技术有限公司 | Depth of field control method and device in video shooting process and computer readable storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100739730B1 (en) * | 2005-09-03 | 2007-07-13 | 삼성전자주식회사 | 3D stereoscopic image processing apparatus and method |
CN100539710C * | 2007-06-21 | 2009-09-09 | 清华大学 | Method for converting planar video into stereoscopic video based on optical flow field |
CN101398855B (en) * | 2008-10-24 | 2010-08-11 | 清华大学 | Video key frame extracting method and system |
CN101400001B (en) * | 2008-11-03 | 2010-06-02 | 清华大学 | A method and system for generating a video frame depth map |
CN101499085B (en) * | 2008-12-16 | 2012-07-04 | 北京大学 | Method and apparatus for fast extracting key frame |
CN101610424B (en) * | 2009-07-13 | 2011-12-21 | 清华大学 | Method and device for synthesizing stereo image |
CN101765022B (en) * | 2010-01-22 | 2011-08-24 | 浙江大学 | A Depth Representation Method Based on Optical Flow and Image Segmentation |