CN108053469A

CN108053469A - Method and device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras

Info

Publication number: CN108053469A
Application number: CN201711433631.2A
Authority: CN
Inventors: 刘烨斌; 王金宝; 戴琼海
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2017-12-26
Filing date: 2017-12-26
Publication date: 2018-05-18

Abstract

The invention discloses a method and device for three-dimensional reconstruction of a human body in complex dynamic scenes under multi-view cameras. The method includes: shooting a target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment; predicting the parts of the human body in the video sequence with a model trained by a deep network to obtain human skeleton information; matching a virtual three-dimensional human body model to the action and rough outline of the person in the image according to the skeleton information to obtain human contour information; and, using the calibrated intrinsic and extrinsic parameters of each viewpoint camera together with the human skeleton information and human contour information, performing three-dimensional modeling of the human body with the Visual Hull method. Combined with camera calibration and related processes, the method reconstructs human objects in video sequences in three dimensions, achieving accurate segmentation of human objects and effectively improving the accuracy and reliability of reconstruction.

Description

Method and device for 3D reconstruction of human body in complex dynamic scenes under multi-view cameras

Technical field

The invention relates to the technical field of computer vision, and in particular to a method and device for three-dimensional reconstruction of a human body in complex dynamic scenes under multi-view cameras.

Background

In computer vision, 3D reconstruction is the process of recovering three-dimensional information from single-view or multi-view images. Because the information in a single view is incomplete, reconstruction from it must rely on prior knowledge, whereas multi-view 3D reconstruction can draw on two-dimensional images from more viewpoints to reconstruct a three-dimensional model. However, most current 3D reconstruction algorithms do not use the two-dimensional information accurately or comprehensively enough. Their computation depends too heavily on information supplied by external devices, such as depth maps from depth cameras, or on the result of segmenting the target from the background, so the reconstructed results remain relatively coarse.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, one object of the present invention is to propose a method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras, which can achieve accurate segmentation of human objects and effectively improve the accuracy and reliability of reconstruction.

Another object of the present invention is to propose a device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras.

To achieve the above objects, an embodiment of one aspect of the present invention proposes a method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras, including the following steps: shooting a target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment; predicting the parts of the human body in the video sequence with a model trained by a deep network to obtain human skeleton information; matching a virtual three-dimensional human body model to the action and rough outline of the person in the image according to the skeleton information to obtain human contour information; and, using the calibrated intrinsic and extrinsic parameters of each viewpoint camera together with the human skeleton information and the human contour information, performing three-dimensional modeling of the human body with the Visual Hull method.

The method of the embodiments of the present invention uses the two-dimensional information provided by multiple viewpoints and estimates the approximate position of the human body with deep learning. It then combines a virtual three-dimensional human body model with the graph cut method to solve the problem that human objects in the target scene are difficult to segment against a complex background, and reconstructs the human objects in the video sequence in three dimensions in combination with camera calibration and related processes, thereby achieving accurate segmentation of human objects and effectively improving the accuracy and reliability of reconstruction.

In addition, the method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to the above embodiments of the present invention may also have the following additional technical features:

Further, in an embodiment of the present invention, shooting the target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment further includes: setting up cameras at multiple angles around the human object, aiming them so that they cover the range of motion of the human object, and ensuring that the cameras at the multiple angles are consistent with one another.

Further, in an embodiment of the present invention, obtaining the human skeleton information further includes: predicting the joint points of the human body with a deep convolutional neural network, and combining the multiple angles to obtain the human skeleton information in three-dimensional space.

Further, in an embodiment of the present invention, obtaining the human contour information further includes: performing background subtraction on a single-angle picture to obtain the position of the person, and applying the graph cut method with the determined joint points as the foreground points of the template, the region obtained by background subtraction as the region to be segmented, and the remaining regions as the background points of the template, thereby obtaining a segmentation result; and using SMPL as the person model with the detected joint points as input, solving a global minimization problem so that the initial three-dimensional human body model is matched to the action of the person, and using the contour of the matched person model as a supplement to the preceding segmentation to obtain the human contour information.

Further, in an embodiment of the present invention, performing three-dimensional modeling of the human body with the Visual Hull method further includes: calibrating the cameras at the multiple angles and obtaining the intrinsic and extrinsic parameters of each camera to build a spatial model of the scene; and, using the Visual Hull method with the human contour information from the multiple views, traversing each point in the spatial model to determine whether it belongs to the human object, where a spatial point whose projections into the multiple two-dimensional views all fall within the human object contour is considered to belong to the three-dimensional human object, until every point in the space has been traversed and the final three-dimensional human body model is obtained.

To achieve the above objects, an embodiment of another aspect of the present invention proposes a device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras, including: an acquisition module, configured to shoot a target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment; a prediction acquisition module, configured to predict the parts of the human body in the video sequence with a model trained by a deep network to obtain human skeleton information; a matching acquisition module, configured to match a virtual three-dimensional human body model to the action and rough outline of the person in the image according to the skeleton information to obtain human contour information; and a processing module, configured to use the calibrated intrinsic and extrinsic parameters of each viewpoint camera together with the human skeleton information and the human contour information to perform three-dimensional modeling of the human body with the Visual Hull method.

The device of the embodiments of the present invention uses the two-dimensional information provided by multiple viewpoints and estimates the approximate position of the human body with deep learning. It then combines a virtual three-dimensional human body model with the graph cut method to solve the problem that human objects in the target scene are difficult to segment against a complex background, and reconstructs the human objects in the video sequence in three dimensions in combination with camera calibration and related processes, thereby achieving accurate segmentation of human objects and effectively improving the accuracy and reliability of reconstruction.

In addition, the device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to the above embodiments of the present invention may also have the following additional technical features:

Further, in an embodiment of the present invention, the acquisition module is also configured to set up cameras at multiple angles around the human object, aim them so that they cover the range of motion of the human object, and ensure that the cameras at the multiple angles are consistent with one another.

Further, in an embodiment of the present invention, the prediction acquisition module is also configured to predict the joint points of the human body with a deep convolutional neural network and to combine the multiple angles to obtain the human skeleton information in three-dimensional space.

Further, in an embodiment of the present invention, the matching acquisition module is also configured to perform background subtraction on a single-angle picture to obtain the position of the person; to apply the graph cut method with the determined joint points as the foreground points of the template, the region obtained by background subtraction as the region to be segmented, and the remaining regions as the background points of the template, thereby obtaining a segmentation result; and, using SMPL as the person model with the detected joint points as input, to solve a global minimization problem so that the initial three-dimensional human body model is matched to the action of the person, and to use the contour of the matched person model as a supplement to the preceding segmentation to obtain the human contour information.

Further, in an embodiment of the present invention, the processing module is also configured to calibrate the cameras at the multiple angles and obtain the intrinsic and extrinsic parameters of each camera to build a spatial model of the scene, and, using the Visual Hull method with the human contour information from the multiple views, to traverse each point in the spatial model to determine whether it belongs to the human object, where a spatial point whose projections into the multiple two-dimensional views all fall within the human object contour is considered to belong to the three-dimensional human object, until every point in the space has been traversed and the final three-dimensional human body model is obtained.

Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.

Description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:

Fig. 1 is a flow chart of a method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a single-angle, single-frame image from an acquired multi-view video sequence according to an embodiment of the present invention;

Fig. 3 is a flow chart of a method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to a specific embodiment of the present invention;

Fig. 4 is a schematic diagram of a human skeleton prediction result of a deep convolutional neural network according to an embodiment of the present invention;

Fig. 5 is a flow chart of a deep convolutional neural network according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a three-dimensional virtual human body matching result according to an embodiment of the present invention;

Fig. 7 is a template diagram used in the graph cut algorithm according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of the result after graph cut according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of the graph cut result combined with the virtual human body result according to an embodiment of the present invention;

Fig. 10 is a schematic diagram of the human body result after precise re-segmentation according to an embodiment of the present invention;

Fig. 11 is a schematic diagram of a three-dimensional human skeleton reconstruction result according to an embodiment of the present invention;

Fig. 12 is a schematic diagram of a three-dimensional human body reconstruction result according to an embodiment of the present invention;

Fig. 13 is a schematic structural diagram of a device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they should not be construed as limiting it.

The method and device for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras proposed according to the embodiments of the present invention are described below with reference to the accompanying drawings, beginning with the method.

Fig. 1 is a flow chart of a method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras according to an embodiment of the present invention.

As shown in Fig. 1, the method for three-dimensional human body reconstruction in complex dynamic scenes under multi-view cameras includes the following steps:

In step S101, the target human body object is shot from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment.

It can be understood that the embodiment of the present invention shoots the target human body object from multiple viewpoints and obtains multi-viewpoint two-dimensional images at the same moment; an acquired image is shown in Fig. 2.

Further, in an embodiment of the present invention, shooting the target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment further includes: setting up cameras at multiple angles around the human object, aiming them so that they cover the range of motion of the human object, and ensuring that the cameras at the multiple angles are consistent with one another.

Specifically, it is only necessary to set up several cameras at different angles around the human object, aimed so that they cover its range of motion. The cameras should be kept as consistent as possible, for example the same model mounted at the same height, which helps to obtain a better reconstruction.

In step S102, the parts of the human body in the video sequence are predicted with a model trained by a deep network to obtain human skeleton information.

It can be understood that, as shown in Fig. 3, the embodiment of the present invention can use a model trained by a deep network to predict the parts of the human body in the video sequence and thereby obtain the skeleton information of the human object. The deep network may be a deep convolutional neural network, and the result of human skeleton prediction is shown in Fig. 4.

For example, the deep convolutional neural network approach as a whole uses response maps of the human body parts to express the spatial constraints between parts, and the response maps are passed through the network as data together with the feature maps. The network is divided into multiple stages, each of which is trained with supervision, which avoids the difficulty of optimizing an overly deep network. Because the same network processes the input features and responses at multiple scales simultaneously, it not only ensures accuracy but also takes the long-range relationships between parts into account. The main flow of the algorithm is: at each scale, compute the response map of each part; for each part, accumulate the response maps over all scales to obtain a total response map; finally, on the total response map of each part, find the point of maximum response, which is taken as the position of that part. This yields the prediction result.
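The accumulate-then-argmax step described above can be sketched as follows. This is a minimal illustration, assuming the per-scale response maps have already been resized to a common resolution; the function name `locate_parts` and the array layout are hypothetical and are not taken from the patent.

```python
import numpy as np

def locate_parts(responses_per_scale):
    """Sum each part's response maps over all scales and return the peak location.

    responses_per_scale: list of arrays shaped (num_parts, H, W), one per scale,
    assumed to be already resized to a common H x W (an assumption of this sketch).
    Returns an integer array of shape (num_parts, 2) holding (row, col) per part.
    """
    total = np.sum(np.stack(responses_per_scale, axis=0), axis=0)  # (P, H, W)
    num_parts, height, width = total.shape
    # Flatten each part's total response map and take the index of its maximum.
    flat_idx = total.reshape(num_parts, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    return np.stack([rows, cols], axis=1)
```

Summing before taking the maximum means a location that responds moderately at several scales can outrank a location that responds strongly at only one scale.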

As shown in Fig. 5, the original image (ori image) input to the deep convolutional neural network is 368*368 in height and width and has 3 channels. In the first and second stages the image passes through two branches of convolutional layers (Convs): one branch produces a score (Score1); the other passes through a convolutional layer, is concatenated (concat) with Score1, and then passes through convolutional layers to produce score2. In the third to sixth stages, the feature map (feature image) obtained during the convolution step of the second-stage branch goes through steps similar to those of the second stage, and the result map is finally output. The center map is a grayscale image with a Gaussian distribution; downsampling it yields a small-size center map (small center map).
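As a rough illustration of the center map, the sketch below builds a Gaussian grayscale image at the 368*368 input size and downsamples it by block averaging. The standard deviation `sigma`, the downsampling `factor` of 8, and the use of average pooling are assumptions made for this example; the patent does not state them.

```python
import numpy as np

def gaussian_center_map(size=368, center=None, sigma=21.0):
    """Grayscale image whose intensity falls off as a Gaussian around `center`.

    sigma is an assumed value; the source only says the map is Gaussian.
    """
    if center is None:
        center = ((size - 1) / 2.0, (size - 1) / 2.0)
    ys, xs = np.mgrid[0:size, 0:size]
    dist_sq = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))

def downsample_mean(image, factor=8):
    """Downsample a 2D image by averaging non-overlapping factor x factor blocks."""
    h, w = image.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by the factor")
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
```

With these assumed values, a 368*368 center map becomes a 46*46 small center map.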

Further, in an embodiment of the present invention, obtaining the human skeleton information further includes: predicting the joint points of the human body with a deep convolutional neural network, and combining the multiple angles to obtain the human skeleton information in three-dimensional space.

That is, the embodiment of the present invention uses a deep convolutional neural network (Deep Pose Machines) to detect the joint points and thereby obtain the skeleton information of the human object.
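The patent does not say how the per-view joint detections are combined into a three-dimensional skeleton. One common option, shown here purely as an assumed illustration, is linear (DLT) triangulation of each joint from its pixel positions in the calibrated views. The helper names `project` and `triangulate_joint` are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P = K [R | t]; returns (u, v)."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)
    return x[:2] / x[2]

def triangulate_joint(projections, pixels):
    """Linear (DLT) triangulation of one joint seen in two or more views.

    projections: list of 3x4 camera matrices; pixels: matching list of (u, v).
    Each view contributes two linear constraints on the homogeneous 3D point;
    the solution is the right singular vector with the smallest singular value.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]
```

Applying `triangulate_joint` to every detected joint gives one 3D point per joint, which together form a skeleton in the calibrated world frame.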

In step S103, a virtual three-dimensional human body model is matched to the action and rough outline of the person in the image according to the skeleton information, to obtain human contour information.

It can be understood that, as shown in Fig. 3, the embodiment of the present invention can use the model trained by the deep network to predict the parts of the human body in the video sequence and segment out the contour information of the target human object. Because a complex background can cause the segmented person to be over-segmented or under-segmented, after step S102 the embodiment of the present invention uses a virtual three-dimensional human body model to match the action and rough outline of the person in the image; the matching result is shown in Fig. 6. A precise re-segmentation is then performed to reach a satisfactory segmentation.

Further, in an embodiment of the present invention, obtaining the human contour information further includes: performing background subtraction on a single-angle picture to obtain the position of the person, and applying the graph cut method with the determined joint points as the foreground points of the template, the region obtained by background subtraction as the region to be segmented, and the remaining regions as the background points of the template, thereby obtaining a segmentation result; and using SMPL as the person model with the detected joint points as input, solving a global minimization problem so that the initial three-dimensional human body model is matched to the action of the person, and using the contour of the matched person model as a supplement to the preceding segmentation to obtain the human contour information.

It can be understood that the embodiment of the present invention can use the joint points as the foreground points in the graph cut method to roughly segment out the contour information of the target human object. As shown in Fig. 7, black in the template is the background, the white area is the region to be segmented, and the gray area in the middle marks the foreground points; the result of segmentation with the graph cut method is shown in Fig. 8. In this step a complex background can cause the segmented person to be over-segmented or under-segmented, so after this stage a virtual three-dimensional human body model is used to match the action and rough outline of the person in the image; the graph cut result combined with the virtual three-dimensional human body model is shown in Fig. 9. Finally, a precise re-segmentation is performed to reach a satisfactory segmentation; the human body result after precise re-segmentation is shown in Fig. 10. The advantage of this method is that it avoids the shortcomings of conventional image segmentation, namely insufficient segmentation accuracy and over-segmentation against complex backgrounds, and achieves accurate segmentation of the human object while also localizing the human body. This part is the core of the three-dimensional reconstruction and plays a key role in the reconstruction result.
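A template of the kind shown in Fig. 7 can be sketched with three labels: background, region to be segmented, and foreground seeds at the joints. The code below is an assumed illustration only; the difference threshold, the seed radius, and the function name `build_template` are hypothetical, and the graph cut solver itself is not included.

```python
import numpy as np

BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2  # black, white, gray in the template figure

def build_template(frame, background, joints, diff_threshold=30, seed_radius=2):
    """Build a three-label template for a graph cut solver.

    frame, background: images of equal shape (H, W) or (H, W, C).
    joints: iterable of (row, col) joint detections.
    Pixels that differ from the background image become the region to segment,
    small squares around the joints become foreground seeds, the rest is background.
    """
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    if diff.ndim == 3:
        diff = diff.max(axis=2)  # strongest change over the color channels
    template = np.where(diff > diff_threshold, UNKNOWN, BACKGROUND).astype(np.uint8)
    h, w = template.shape
    for row, col in joints:
        r0, r1 = max(row - seed_radius, 0), min(row + seed_radius + 1, h)
        c0, c1 = max(col - seed_radius, 0), min(col + seed_radius + 1, w)
        template[r0:r1, c0:c1] = FOREGROUND
    return template
```

The resulting label image would then be handed to a graph cut implementation as its initial mask. Label values differ between libraries, so a remapping step is likely to be needed.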

Specifically, the embodiment of the present invention uses the joint points as the foreground points in the graph cut method to roughly segment out the contour information of the target human object, as follows. First, background subtraction is performed on the single-angle picture to obtain the approximate position of the person. The graph cut method is then applied with the determined joint points as foreground points, the region obtained by background subtraction as the region to be segmented, and the remaining regions as background points, which yields the segmentation result. Second, a virtual three-dimensional human body model is matched to the action of the person in the image: the embodiment of the present invention uses SMPL as the person model and, with the detected joint points as input, solves a global minimization problem so that the initial three-dimensional human body model is matched to the required action of the person. Finally, the contour of the matched person model is used as a supplement to the preceding segmentation.
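The patent states that the matched model contour supplements the graph cut result before the final re-segmentation, but it does not give the combination rule. The sketch below shows one plausible rule, stated here as an assumption: pixels on which both masks agree become sure foreground, pixels covered by only one mask are left for the re-segmentation to decide, and everything else is background. The name `refine_template` is hypothetical.

```python
import numpy as np

BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2

def refine_template(cut_mask, model_mask):
    """Combine a graph cut mask with a projected body-model silhouette.

    cut_mask, model_mask: arrays of equal shape, nonzero where the person is.
    Assumed rule (not from the patent): agreement gives foreground seeds,
    disagreement gives the region to re-segment.
    """
    cut = np.asarray(cut_mask).astype(bool)
    model = np.asarray(model_mask).astype(bool)
    template = np.full(cut.shape, BACKGROUND, dtype=np.uint8)
    template[cut | model] = UNKNOWN      # either source claims the pixel
    template[cut & model] = FOREGROUND   # both sources agree
    return template
```

Under this rule, a limb missed by the graph cut but covered by the model silhouette is reconsidered in the second pass, and background clutter that leaked into the graph cut mask is no longer treated as certain foreground.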

In step S104, the calibrated intrinsic and extrinsic parameters of each viewpoint camera are used together with the human skeleton information and human contour information to perform three-dimensional modeling of the human body with the Visual Hull method.

It can be understood that the embodiment of the present invention can use the calibrated intrinsic and extrinsic parameters of each viewpoint camera, combined with the information of the segmented human object, to perform three-dimensional modeling of the human body with the Visual Hull method.

Further, in an embodiment of the present invention, performing three-dimensional modeling of the human body with the Visual Hull method further includes: calibrating the cameras at the multiple angles and obtaining the intrinsic and extrinsic parameters of each camera to build a spatial model of the scene; and, using the Visual Hull method with the human contour information from the multiple views, traversing each point in the spatial model to determine whether it belongs to the human object, where a spatial point whose projections into the multiple two-dimensional views all fall within the human object contour is considered to belong to the three-dimensional human object, until every point in the space has been traversed and the final three-dimensional human body model is obtained.

Specifically, the cameras at the several viewing angles are first calibrated, and the intrinsic and extrinsic parameters of each camera are obtained to build a spatial model of the scene. The Visual Hull method is then applied with the human object contour information from the multiple views: each point in the spatial model is traversed to determine whether it belongs to the human object, and if the projections of a spatial point into the multiple two-dimensional views all fall within the human object contour, the point is considered to belong to the three-dimensional human object. Once every point in the space has been traversed, the final three-dimensional human body model is obtained. The three-dimensional human skeleton reconstruction result is shown in Fig. 11 and the three-dimensional human body reconstruction result is shown in Fig. 12.
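The point-by-point test described above can be sketched as a silhouette-carving routine. This is a minimal illustration, assuming each camera is given as a 3x4 projection matrix P = K [R | t] built from the calibrated intrinsic and extrinsic parameters and that each silhouette is a boolean mask; the function name `carve_visual_hull` and the nearest-pixel lookup are choices made for this example.

```python
import numpy as np

def carve_visual_hull(points, projections, silhouettes):
    """Keep the 3D points whose projections fall inside every silhouette.

    points: (N, 3) array of candidate points, e.g. voxel centers of the scene model.
    projections: list of 3x4 camera matrices.
    silhouettes: list of (H, W) boolean masks, True inside the human contour.
    Returns a boolean array of length N.
    """
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    inside_all = np.ones(len(points), dtype=bool)
    for P, mask in zip(projections, silhouettes):
        uvw = homogeneous @ P.T
        in_front = uvw[:, 2] > 1e-9                 # ignore points behind the camera
        depth = np.where(in_front, uvw[:, 2], 1.0)  # avoid dividing by zero
        u = np.round(uvw[:, 0] / depth).astype(int)
        v = np.round(uvw[:, 1] / depth).astype(int)
        h, w = mask.shape
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(points), dtype=bool)
        hit[visible] = mask[v[visible], u[visible]]
        inside_all &= hit                           # must be inside in every view
    return inside_all
```

In this sketch a point that projects outside an image is treated as outside the contour, so the working volume should be chosen so that all cameras see it. The surviving points form a voxel approximation of the body, which could be converted to a surface mesh in a later step.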

The method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to the embodiment of the present invention uses the two-dimensional information provided by multiple viewpoints and estimates the approximate position of the human body with a deep learning method. It then combines a virtual three-dimensional human body model with the graph cut method to solve the problem that human body objects in the target scene are difficult to segment because of a complex background, and performs three-dimensional reconstruction of the human body objects in the video sequence in combination with processes such as camera calibration. Accurate segmentation of the human object is thereby achieved, and the accuracy and reliability of the reconstruction are effectively improved.

Next, the device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to an embodiment of the present invention is described with reference to the accompanying drawings.

Fig. 13 is a schematic structural diagram of a device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to an embodiment of the present invention.

As shown in Fig. 13, the device 10 for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras includes an acquisition module 100, a prediction acquisition module 200, a matching acquisition module 300 and a processing module 400.

The acquisition module 100 is used to shoot the target human body object from multiple viewpoints, so as to obtain multi-viewpoint two-dimensional images at the same moment. The prediction acquisition module 200 is used to predict each part of the human body in the video sequence with a model trained by deep network learning, so as to obtain the human skeleton information of the human body. The matching acquisition module 300 is used to match, according to the skeleton information and by means of a virtual three-dimensional human body model, the action and approximate outline of the person in the image, so as to obtain the human body contour information of the human body. The processing module 400 is used to perform three-dimensional modeling of the human body with the Visual Hull method, using the calibrated intrinsic and extrinsic parameter information of each viewpoint camera in combination with the human skeleton information and the human body contour information. The device 10 of the embodiment of the present invention can perform three-dimensional reconstruction of human body objects in a video sequence in combination with processes such as camera calibration, thereby achieving accurate segmentation of the human object and effectively improving the accuracy and reliability of the reconstruction.

Further, in one embodiment of the present invention, the acquisition module 100 is also used to set up the cameras at multiple angles around the human body object, aim them so that they cover the motion range of the human body object, and ensure that the cameras at the multiple angles are consistent with one another.

Further, in one embodiment of the present invention, the prediction acquisition module 200 is also used to predict the joint points of the human body with a deep convolutional neural network, and to combine the multiple angles to obtain the human skeleton information in three-dimensional space.
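The text does not state how the per-view joint predictions are combined into a three-dimensional skeleton. One common option, shown here purely as an assumption, is linear (DLT) triangulation of each joint from its 2D detections and the calibrated projection matrices. The function name `triangulate_joint` and its parameters are hypothetical.

```python
import numpy as np

def triangulate_joint(projections, pixels):
    """Estimate one 3D joint position from its 2D detections in several views.

    projections: list of 3x4 camera matrices P = K [R | t] (from calibration)
    pixels:      list of (u, v) detections of the same joint, one per view
    Uses linear (DLT) triangulation; this choice is an assumption, not the
    method stated in the source.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        P = np.asarray(P, dtype=float)
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)

    # The least-squares solution is the right singular vector associated
    # with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applying the function to every joint in turn would give a set of 3D joint positions. At least two views with a non-degenerate baseline are needed for a well-defined result.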

Further, in one embodiment of the present invention, the matching acquisition module 300 is also used to perform a background subtraction operation on each single-angle picture to obtain the position of the person, and then to segment the picture with the graph cut method: the determined joint points serve as foreground points of the template, the region obtained after background subtraction is the region to be segmented, and the other regions serve as background points of the template, which yields the segmentation result. In addition, with SMPL as the person model and the detected joint points as input, a global minimization problem is solved so that the initial three-dimensional human body model is matched to the action of the person. The contour of the matched person model is used as a supplement to the preceding segmentation, which yields the human body contour information.
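The seeding of the graph cut described above can be sketched as follows. This covers only the construction of the template (seed mask), not the graph cut itself or the SMPL fitting. It assumes that background subtraction is a simple thresholded absolute difference against a known background image and that each joint seeds a small square of foreground pixels. The names `build_seed_mask`, `diff_threshold` and `joint_radius` are hypothetical. The label values follow the mask convention of OpenCV's GrabCut (0 for background, 1 for foreground, 3 for probable foreground), so the resulting array could in principle initialize `cv2.grabCut` in mask mode, although the source does not name a specific graph cut implementation.

```python
import numpy as np

# Label values chosen to match OpenCV's GrabCut mask convention (an assumption).
BACKGROUND, FOREGROUND, UNKNOWN = 0, 1, 3

def build_seed_mask(frame, background, joints, diff_threshold=30, joint_radius=2):
    """Build the graph cut template for one single-angle picture.

    frame, background: (H, W) or (H, W, C) uint8 images of the same size
    joints:            iterable of (u, v) pixel positions of detected joints
    """
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    if diff.ndim == 3:
        diff = diff.max(axis=2)            # strongest change over color channels

    # Everything that did not change is a background point of the template;
    # the changed region is the region still to be segmented.
    seeds = np.full(diff.shape, BACKGROUND, dtype=np.uint8)
    seeds[diff > diff_threshold] = UNKNOWN

    h, w = diff.shape
    for u, v in joints:
        u, v = int(round(u)), int(round(v))
        if not (0 <= u < w and 0 <= v < h):
            continue                       # skip joints detected outside the image
        r0, r1 = max(v - joint_radius, 0), min(v + joint_radius + 1, h)
        c0, c1 = max(u - joint_radius, 0), min(u + joint_radius + 1, w)
        seeds[r0:r1, c0:c1] = FOREGROUND   # joint neighborhood is certain foreground

    return seeds
```

The threshold and the seed radius are illustrative values. Suitable settings would depend on the image noise and resolution of the actual capture setup.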

Further, in one embodiment of the present invention, the processing module 400 is also used to calibrate the cameras at the multiple angles and obtain the intrinsic and extrinsic parameter information of each camera in order to build a spatial model of the scene, and, using the Visual Hull method with the human body contour information from the multiple views, to traverse each point in the spatial model to determine whether it belongs to the human body object. If a spatial point, when projected into the multiple two-dimensional views, lies inside the human body contour in all of them, it is considered to belong to the three-dimensional human body object. Once every point in the space has been traversed, the final three-dimensional human body model is obtained.

It should be noted that the foregoing explanation of the embodiment of the method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras also applies to the device of this embodiment, and is not repeated here.

The device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to the embodiment of the present invention uses the two-dimensional information provided by multiple viewpoints and estimates the approximate position of the human body with a deep learning method. It then combines a virtual three-dimensional human body model with the graph cut method to solve the problem that human body objects in the target scene are difficult to segment because of a complex background, and performs three-dimensional reconstruction of the human body objects in the video sequence in combination with processes such as camera calibration. Accurate segmentation of the human object is thereby achieved, and the accuracy and reliability of the reconstruction are effectively improved.

In the description of the present invention, it should be understood that orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial" and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation. They therefore should not be construed as limiting the present invention.

In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless otherwise specifically defined.

In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "coupled", "connected" and "fixed" should be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise expressly limited. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.

In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that the first and second features are in indirect contact through an intermediary. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature. A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

Although embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims

1. A method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras, characterized in that it comprises the following steps:

Shooting a target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment;

Predicting each part of the human body in the video sequence with a model trained by deep network learning, to obtain human skeleton information of the human body;

Matching, according to the skeleton information and by means of a virtual three-dimensional human body model, the action and approximate outline of the person in the image, to obtain human body contour information of the human body; and

Performing three-dimensional modeling of the human body with the Visual Hull method, using the calibrated intrinsic and extrinsic parameter information of each viewpoint camera in combination with the human skeleton information and the human body contour information.

2. The method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 1, characterized in that shooting the target human body object from multiple viewpoints to obtain multi-viewpoint two-dimensional images at the same moment further comprises:

Setting up cameras at multiple angles around the human body object, aiming them so that they cover the motion range of the human body object, and ensuring that the cameras at the multiple angles are consistent with one another.

3. The method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 1, characterized in that obtaining the human skeleton information of the human body further comprises:

Predicting the joint points of the human body with a deep convolutional neural network, and combining the multiple angles to obtain the human skeleton information in three-dimensional space.

4. The method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 3, characterized in that obtaining the human body contour information of the human body further comprises:

Performing a background subtraction operation on each single-angle picture to obtain the position of the person, and segmenting with the graph cut method, wherein the determined joint points serve as foreground points of the template, the region obtained after background subtraction is the region to be segmented, and the other regions serve as background points of the template, to obtain a segmentation result;

Taking SMPL as the person model and the detected joint points as input, solving a global minimization problem so that the initial three-dimensional human body model is matched to the action of the person, and using the contour of the matched person model as a supplement to the preceding segmentation, to obtain the human body contour information.

5. The method for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 2, characterized in that performing three-dimensional modeling of the human body with the Visual Hull method further comprises:

Calibrating the cameras at the multiple angles and obtaining the intrinsic and extrinsic parameter information of each camera in order to build a spatial model of the scene;

Using the Visual Hull method with the human body contour information from the multiple views, traversing each point in the spatial model to determine whether it belongs to the human body object, wherein a spatial point that, when projected into the multiple two-dimensional views, lies inside the human body contour in all of them is considered to belong to the three-dimensional human body object, until every point in the space has been traversed and the final three-dimensional human body model is obtained.

6. A device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras, characterized in that it comprises:

An acquisition module, used to shoot a target human body object from multiple viewpoints, so as to obtain multi-viewpoint two-dimensional images at the same moment;

A prediction acquisition module, used to predict each part of the human body in the video sequence with a model trained by deep network learning, so as to obtain human skeleton information of the human body;

A matching acquisition module, used to match, according to the skeleton information and by means of a virtual three-dimensional human body model, the action and approximate outline of the person in the image, so as to obtain human body contour information of the human body; and

A processing module, used to perform three-dimensional modeling of the human body with the Visual Hull method, using the calibrated intrinsic and extrinsic parameter information of each viewpoint camera in combination with the human skeleton information and the human body contour information.

7. The device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 6, characterized in that the acquisition module is also used to set up cameras at multiple angles around the human body object, aim them so that they cover the motion range of the human body object, and ensure that the cameras at the multiple angles are consistent with one another.

8. The device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 6, characterized in that the prediction acquisition module is also used to predict the joint points of the human body with a deep convolutional neural network, and to combine the multiple angles to obtain the human skeleton information in three-dimensional space.

9. The device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 8, characterized in that the matching acquisition module is also used to perform a background subtraction operation on each single-angle picture to obtain the position of the person; to segment with the graph cut method, wherein the determined joint points serve as foreground points of the template, the region obtained after background subtraction is the region to be segmented, and the other regions serve as background points of the template, so as to obtain a segmentation result; and, taking SMPL as the person model and the detected joint points as input, to solve a global minimization problem so that the initial three-dimensional human body model is matched to the action of the person, the contour of the matched person model being used as a supplement to the preceding segmentation to obtain the human body contour information.

10. The device for three-dimensional reconstruction of a human body in a complex dynamic scene under multi-view cameras according to claim 7, characterized in that the processing module is also used to calibrate the cameras at the multiple angles and obtain the intrinsic and extrinsic parameter information of each camera in order to build a spatial model of the scene, and, using the Visual Hull method with the human body contour information from the multiple views, to traverse each point in the spatial model to determine whether it belongs to the human body object, wherein a spatial point that, when projected into the multiple two-dimensional views, lies inside the human body contour in all of them is considered to belong to the three-dimensional human body object, until every point in the space has been traversed and the final three-dimensional human body model is obtained.