CN117422848B - Method and device for segmenting three-dimensional model - Google Patents

Method and device for segmenting three-dimensional model

Info

Publication number
CN117422848B
CN117422848B (application CN202311426368.XA)
Authority
CN
China
Prior art keywords
feature vector
model
dimensional
point
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311426368.XA
Other languages
Chinese (zh)
Other versions
CN117422848A (en)
Inventor
梁军 (Liang Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenli Vision Shenzhen Cultural Technology Co., Ltd.
Original Assignee
Shenli Vision Shenzhen Cultural Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenli Vision Shenzhen Cultural Technology Co., Ltd.
Priority to CN202311426368.XA
Publication of CN117422848A
Application granted
Publication of CN117422848B
Legal status: Active
Anticipated expiration

Classifications

    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T7/11 Region-based segmentation
    • G06V10/762 Image or video recognition using pattern recognition or machine learning, using clustering, e.g. of similar faces in social networks
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V10/82 Image or video recognition using pattern recognition or machine learning, using neural networks
    • G06T2200/32 Indexing scheme for image data processing or generation, involving image mosaicing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Architecture (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Processing (AREA)
  • Aerodynamic Tests, Hydrodynamic Tests, Wind Tunnels, And Water Tanks (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

An embodiment of the present application provides a method and device for segmenting a three-dimensional model. The method comprises: for any model point in the point cloud data of the three-dimensional model, obtaining a target image containing the model point from at least one acquired image of the three-dimensional model, where each acquired image corresponds to a different shooting angle of the three-dimensional model; determining a three-dimensional feature vector of the model point, and determining a two-dimensional feature vector of the model point from the target image; fusing the two-dimensional feature vector and the three-dimensional feature vector to obtain a fused feature vector for the model point; and segmenting the three-dimensional model into at least one segmented region according to the fused feature vectors of the model points. The present application effectively ensures the accuracy and rationality of the segmentation of the three-dimensional model.

Description

Method and device for segmenting three-dimensional model
Technical Field
The embodiments of the application relate to computer technology, and in particular to a method and a device for segmenting a three-dimensional model.
Background
In the process of texture-mapping a three-dimensional model there is a technique called UV unwrapping, which unfolds the three-dimensional model into a two-dimensional planar image so that the texture mapping of the three-dimensional model looks more realistic.
UV unwrapping of a three-dimensional model generally comprises two steps, segmentation and parameterization, and the quality of the segmentation determines the quality of the unwrapping. In the related art, when a three-dimensional model is segmented, a plurality of model points with relatively close curvatures are generally grouped into one segmented region.
However, closeness of curvature does not imply that the model points belong to the same part of the three-dimensional model, so the current implementation suffers from a poor segmentation effect on the three-dimensional model.
Disclosure of Invention
The embodiment of the application provides a method and a device for segmenting a three-dimensional model, which are used for solving the problem of poor segmentation effect of the three-dimensional model.
In a first aspect, an embodiment of the present application provides a method for segmenting a three-dimensional model, including:
For any model point in the point cloud data of the three-dimensional model, acquiring a target image containing the model point from at least one acquired image corresponding to the three-dimensional model, wherein the acquired images correspond to different shooting angles of the three-dimensional model;
determining a three-dimensional feature vector of the model point, and determining a two-dimensional feature vector of the model point according to the target image;
Performing fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fusion feature vector corresponding to the model point;
and segmenting the three-dimensional model according to the fused feature vectors corresponding to the model points to obtain at least one segmented region.
In one possible design, segmenting the three-dimensional model according to the fused feature vectors corresponding to the model points to obtain at least one segmented region includes:
performing clustering according to the fused feature vectors corresponding to the model points, and determining the cluster to which each model point belongs;
determining the model points corresponding to any one cluster as a target point set;
and segmenting the three-dimensional model into at least one segmented region according to the target point sets corresponding to the clusters, wherein each segmented region is a region formed by the model points in the same target point set.
In one possible design, the fusing processing is performed according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fused feature vector corresponding to the model point, including:
Performing space mapping processing on the three-dimensional feature vector to obtain a mapped three-dimensional feature vector;
Splicing the two-dimensional feature vector and the mapped three-dimensional feature vector to obtain a spliced feature vector;
Inputting the spliced feature vector into a first fully connected layer to obtain a first intermediate feature vector output by the first fully connected layer;
Processing the first intermediate feature vector according to an attention network to obtain a target feature vector;
And inputting the target feature vector into a fusion decoder to obtain a fusion feature vector corresponding to the model point output by the fusion decoder.
In one possible design, the attention network includes a second fully connected layer, a probability processing unit, and an element multiplication processing unit;
The processing the first intermediate feature vector according to the attention network to obtain a target feature vector includes:
inputting the first intermediate feature vector into the second fully connected layer to obtain a second intermediate feature vector output by the second fully connected layer;
Processing the second intermediate feature vector according to the probability processing unit to obtain a weight feature vector, wherein the weight feature vector is used for indicating weights corresponding to all elements in the first intermediate feature vector;
And performing element multiplication processing on the first intermediate feature vector and the weight feature vector according to the element multiplication processing unit to obtain the target feature vector.
In one possible design, the acquiring, in the at least one acquired image corresponding to the three-dimensional model, a target image including the model point includes:
In at least one acquired image corresponding to the three-dimensional model, determining a shooting area corresponding to each acquired image according to the camera acquisition parameters corresponding to each acquired image, wherein the camera acquisition parameters comprise at least one of the following: camera pose and scaling factor;
For any acquired image, if the model point is determined, according to the coordinate information of the model point, to be located within the shooting area of the acquired image, determining the acquired image as a candidate image;
and determining a target image containing the model point according to the candidate images.
In one possible design, the determining the two-dimensional feature vector of the model point according to the target image includes:
For any target image, determining a target pixel point corresponding to the model point in the target image according to the coordinate mapping relation between the target image and the point cloud data and the coordinate information of the model point;
determining a pixel area taking the target pixel point as a center in the target image, and acquiring coordinate information of the pixel area in the target image;
According to the coordinate information of the pixel region, determining partial feature vectors corresponding to the pixel region in the image feature vectors corresponding to the target image;
and performing pooling according to the partial feature vectors respectively corresponding to the pixel areas of the target images, so as to obtain the two-dimensional feature vector of the model point.
In one possible design, the method further comprises:
For any acquired image, inputting the acquired image into an image encoder to obtain a first feature vector of the acquired image output by the image encoder;
and performing deconvolution operation on the first feature vector to obtain an image feature vector of the acquired image.
In one possible design, the determining the three-dimensional feature vector of the model point includes:
And inputting the model points into a feature extraction network to obtain three-dimensional feature vectors of the model points output by the feature extraction network.
In a second aspect, an embodiment of the present application provides a segmentation apparatus for a three-dimensional model, including:
The acquisition module is used for acquiring, for any model point in the point cloud data of the three-dimensional model, a target image containing the model point from at least one acquired image corresponding to the three-dimensional model, wherein the acquired images correspond to different shooting angles of the three-dimensional model;
The determining module is used for determining the three-dimensional feature vector of the model point and determining the two-dimensional feature vector of the model point according to the target image;
the fusion module is used for carrying out fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fusion feature vector corresponding to the model point;
and the processing module is used for segmenting the three-dimensional model into at least one segmented region according to the fused feature vectors corresponding to the model points.
In one possible design, the processing module is specifically configured to:
performing clustering according to the fused feature vectors corresponding to the model points, and determining the cluster to which each model point belongs;
determining the model points corresponding to any one cluster as a target point set;
and segmenting the three-dimensional model into at least one segmented region according to the target point sets corresponding to the clusters, wherein each segmented region is a region formed by the model points in the same target point set.
In one possible design, the fusion module is specifically configured to:
Performing space mapping processing on the three-dimensional feature vector to obtain a mapped three-dimensional feature vector;
Splicing the two-dimensional feature vector and the mapped three-dimensional feature vector to obtain a spliced feature vector;
Inputting the spliced feature vector into a first fully connected layer to obtain a first intermediate feature vector output by the first fully connected layer;
Processing the first intermediate feature vector according to an attention network to obtain a target feature vector;
And inputting the target feature vector into a fusion decoder to obtain a fusion feature vector corresponding to the model point output by the fusion decoder.
In one possible design, the attention network includes a second fully connected layer, a probability processing unit, and an element multiplication processing unit;
the fusion module is specifically used for:
inputting the first intermediate feature vector into the second fully connected layer to obtain a second intermediate feature vector output by the second fully connected layer;
Processing the second intermediate feature vector according to the probability processing unit to obtain a weight feature vector, wherein the weight feature vector is used for indicating weights corresponding to all elements in the first intermediate feature vector;
And performing element multiplication processing on the first intermediate feature vector and the weight feature vector according to the element multiplication processing unit to obtain the target feature vector.
In one possible design, the acquisition module is specifically configured to:
In at least one acquired image corresponding to the three-dimensional model, determining a shooting area corresponding to each acquired image according to the camera acquisition parameters corresponding to each acquired image, wherein the camera acquisition parameters comprise at least one of the following: camera pose and scaling factor;
For any acquired image, if the model point is determined, according to the coordinate information of the model point, to be located within the shooting area of the acquired image, determining the acquired image as a candidate image;
and determining a target image containing the model point according to the candidate images.
In one possible design, the acquisition module is specifically configured to:
For any target image, determining a target pixel point corresponding to the model point in the target image according to the coordinate mapping relation between the target image and the point cloud data and the coordinate information of the model point;
determining a pixel area taking the target pixel point as a center in the target image, and acquiring coordinate information of the pixel area in the target image;
According to the coordinate information of the pixel region, determining partial feature vectors corresponding to the pixel region in the image feature vectors corresponding to the target image;
and performing pooling according to the partial feature vectors respectively corresponding to the pixel areas of the target images, so as to obtain the two-dimensional feature vector of the model point.
In one possible design, the processing module is further configured to:
For any acquired image, inputting the acquired image into an image encoder to obtain a first feature vector of the acquired image output by the image encoder;
and performing deconvolution operation on the first feature vector to obtain an image feature vector of the acquired image.
In one possible design, the determining module is specifically configured to:
And inputting the model points into a feature extraction network to obtain three-dimensional feature vectors of the model points output by the feature extraction network.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a memory for storing a program;
A processor for executing the program stored in the memory, the processor being configured, when the program is executed, to perform the method of the first aspect and of any of the possible designs of the first aspect described above.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect above and any of the various possible designs of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements a method as described in the first aspect and any of the various possible designs of the first aspect.
The embodiments of the present application provide a method and a device for segmenting a three-dimensional model. Multi-view acquired images of the three-dimensional model are obtained, and the target images containing a given model point are selected from them; the two-dimensional feature vector of the model point is determined from the target images, and its three-dimensional feature vector is determined from the model point itself. The two-dimensional feature vector, which can reflect global features, is then fused with the three-dimensional feature vector, which contains richer feature information, yielding a fused feature vector that accurately and comprehensively reflects the features of the model point. Determining the segmented regions of the three-dimensional model according to the fused feature vectors of the model points therefore effectively ensures the accuracy and rationality of the segmentation.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions of the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present application, and that a person of ordinary skill in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of the UV unwrapping process according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for segmenting a three-dimensional model according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for segmenting a three-dimensional model according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of determining a two-dimensional feature vector according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an implementation of a feature fusion process according to an embodiment of the present application;
FIG. 6 is a second flowchart of a method for segmenting a three-dimensional model according to an embodiment of the present application;
FIG. 7 is a schematic clustering diagram of model points according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a segmentation effect of a three-dimensional model according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a three-dimensional model segmentation apparatus according to an embodiment of the present application;
fig. 10 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In order to better understand the technical scheme of the application, the related technology related to the application is further described in detail.
With the development of computer technology, applications and products related to three-dimensional rendering are receiving more and more attention, and the three-dimensional model is the basis of three-dimensional rendering. For example, in a virtual shooting scene, a virtual scene constructed from three-dimensional models may be rendered to a screen as the background of the virtual shoot and filmed together with a real-world foreground to produce the desired footage. A three-dimensional model is a polygonal representation of an object, typically displayed with a computer or other video device. The displayed object may be a real-world entity or an imaginary one; anything that exists in physical nature can be represented by a three-dimensional model.
The appearance of a three-dimensional model can be modified by applying maps to its surface. For a complex three-dimensional model, the UV texture must first be unwrapped so that the mapping can be applied accurately, after which a suitable UV map is painted to obtain the desired effect.
UV is briefly introduced here first. UV is an abbreviation for the U, V texture-map coordinates, which define the position of each point on the two-dimensional image. These points are correlated with the three-dimensional model to determine where the surface texture is mapped; that is, UV makes each point on the image correspond precisely to a point on the surface of the model object. The gaps between the points are filled by smooth interpolation in software, producing the so-called UV map.
It will be appreciated that every three-dimensional model is composed of numerous faces, and the process of tiling the faces of a three-dimensional model onto a two-dimensional canvas is known as UV unwrapping, which flattens the three-dimensional model into a two-dimensional planar image. UV unwrapping can be understood with reference to fig. 1, which is a schematic view of the UV unwrapping process according to an embodiment of the present application.
As shown in fig. 1, assuming that there is currently the three-dimensional model shown on the left side of fig. 1, after the UV unwrapping operation is performed on it, the UV texture shown on the right side of fig. 1 may, for example, be obtained. Referring to fig. 1, it can also be seen that the three-dimensional model exists in an XYZ coordinate system, while the UV texture obtained after unwrapping exists in a UV coordinate system.
For a point P on the three-dimensional model, in the XYZ coordinate system corresponding to the model, suppose the coordinates of P are represented as (X1, Y1, Z1) as shown in fig. 1; the corresponding point P can then also be found in the UV texture obtained after UV unwrapping, where its coordinates are represented as (U1, V1) as shown in fig. 1.
UV unwrapping of a three-dimensional model typically involves two steps, segmentation and parameterization. The segmentation step divides the three-dimensional model into a number of distinct regions, and the quality of the segmentation determines the quality of the unwrapping: a good segmentation divides the three-dimensional model into reasonable parts that are convenient for making the map, while preserving the integrity of the segmented model as much as possible.
In general, the curvatures of model points on the same surface should differ relatively little, so in the related art, when the three-dimensional model is segmented, a plurality of model points with relatively close curvatures are usually determined as one segmented region. For example, starting from a certain model point and extending outward, model points whose curvature differs from it by less than a preset threshold may be collected; when a certain number of model points has been collected or the extension range exceeds a certain limit, the starting model point and the collected model points of close curvature are determined as one segmented region. Here, model points refer to location points or voxel points in the three-dimensional model. A sketch of this procedure follows.
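For concreteness, the following is a minimal sketch of such curvature-based region growing; the threshold value, the neighbor lookup, and all names are illustrative assumptions rather than details of any particular related-art scheme.

    from collections import deque

    def grow_region(seed, curvature, neighbors, thresh=0.05, max_points=5000):
        """Grow one segmented region outward from a seed model point.

        curvature: dict mapping point id -> curvature value (assumed precomputed).
        neighbors: dict mapping point id -> iterable of adjacent point ids.
        Points whose curvature differs from the seed's by less than `thresh`
        are absorbed until `max_points` is reached or the frontier is empty.
        """
        region, frontier = {seed}, deque([seed])
        while frontier and len(region) < max_points:
            p = frontier.popleft()
            for q in neighbors[p]:
                if q not in region and abs(curvature[q] - curvature[seed]) < thresh:
                    region.add(q)
                    frontier.append(q)
        return region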
However, closeness of curvature does not necessarily mean that model points belong to the same part of the three-dimensional model; conversely, the curvatures of the model points of one part are not necessarily all close. Here, a part refers to a physically meaningful subdivision of the three-dimensional model. For example, in the three-dimensional model of a tree, the trunk may be considered one part, the roots another, and the crown a third. In actual implementation, the division of the three-dimensional model into parts can be determined according to actual requirements.
Therefore, in the current three-dimensional model segmentation scheme, model points belonging to the same part of the model may be scattered across several different segmented regions, so the resulting regions are fragmented and unreasonable; current segmentation processing thus suffers from a poor segmentation effect.
In view of the technical problems introduced above, the present application proposes the following technical concept: extract a three-dimensional feature vector for each model point of the three-dimensional model, cluster the points according to these vectors, and determine the model points of each cluster as one segmented region, so that the segmentation of the three-dimensional model is based on more comprehensive data features. However, because feature information in three-dimensional space is very rich while the computing power of current devices is limited, the three-dimensional feature vector extracted for a model point is a local rather than a global feature, and clustering on three-dimensional feature vectors alone is therefore not very accurate. The application further proposes shooting multi-view images of the three-dimensional model, fusing the two-dimensional feature vectors of the images with the three-dimensional feature vectors of the model points, and clustering the fused feature vectors to segment the model. Since the two-dimensional plane is a dimensionality reduction of three-dimensional space, global features can be extracted from the images, so combining the point cloud with multi-view images allows the three-dimensional model to be segmented more accurately and reasonably.
Based on the above description, the method for segmenting a three-dimensional model according to the present application will be described in detail with reference to specific embodiments. The execution subject of each embodiment of the present application may be a device having a data processing function, such as a server, a processor, or a chip, and the present application is not limited to a specific execution subject, and any device having a data processing function may be used as the execution subject of each embodiment of the present application.
The method for dividing a three-dimensional model provided by the present application will be described with reference to fig. 2, and fig. 2 is a flowchart of the method for dividing a three-dimensional model provided by the embodiment of the present application.
As shown in fig. 2, the method includes:
S201, for any model point in the point cloud data of the three-dimensional model, acquiring a target image containing the model point from at least one acquired image corresponding to the three-dimensional model, wherein the acquired images correspond to different shooting angles of the three-dimensional model.
Point cloud data is a set of vectors in a three-dimensional coordinate system. For example, after the three-dimensional model is generated, it may be scanned and the scanned data recorded in the form of points, each point carrying its own point information, which may include, for example, three-dimensional coordinates, reflection intensity and color information; the specific content of the point information can be selected and set according to actual requirements.
In one possible implementation, the processing of this embodiment may be performed directly on the point cloud data obtained from the three-dimensional model. Alternatively, the point cloud data may first be down-sampled, and the down-sampled data used as the point cloud data processed in the present application. Because the number of points in the raw point cloud is huge, down-sampling reduces the number of model points to be processed and improves segmentation efficiency, while having little effect on the segmentation result, so segmentation accuracy is still ensured.
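As an illustration of this down-sampling step, here is a minimal voxel-grid down-sampling sketch; the patent does not specify a down-sampling method, so the voxel-grid approach, the voxel size, and the names are assumptions.

    import numpy as np

    def voxel_downsample(points: np.ndarray, voxel_size: float = 0.05) -> np.ndarray:
        """Keep one representative model point per occupied voxel.

        points: (N, 3) array of model point coordinates.
        Returns a reduced (M, 3) array with M <= N.
        """
        # Quantize each point to the index of the voxel that contains it.
        voxel_idx = np.floor(points / voxel_size).astype(np.int64)
        # Keep the first point encountered in each occupied voxel.
        _, keep = np.unique(voxel_idx, axis=0, return_index=True)
        return points[np.sort(keep)]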
In the present embodiment, points in the point cloud data of the three-dimensional model are referred to as model points, and thus a plurality of model points are included in the point cloud data of the three-dimensional model. Similar operations are performed for each model point in the point cloud data of the three-dimensional model in this embodiment, and therefore, description will be made below with any model point in the point cloud data of the three-dimensional model.
In this embodiment, at least one acquired image is further captured for the three-dimensional model, where the acquired image includes the three-dimensional model, and each acquired image corresponds to a different capturing view angle of the three-dimensional model.
In one possible implementation, when generating the virtual three-dimensional model, a physical model may be first manufactured, then a physical camera is used to collect images of multiple angles for the physical model, and then the virtual three-dimensional model may be obtained by modeling according to the collected images. In such an implementation, multiple images may be selected directly from among the images taken at multiple angles from the physical camera, resulting in the acquired images described herein.
Alternatively, the virtual three-dimensional model may be generated directly in the electronic device by software. In this implementation, images of the three-dimensional model can be shot from multiple view angles with a virtual camera, so as to obtain the at least one acquired image of this embodiment.
In actual implementation, the shooting angle of each acquired image relative to the three-dimensional model, and the number of acquired images, can be selected according to actual requirements; this embodiment does not limit them. However, the shooting angles of the acquired images should together cover the full 360-degree range around the three-dimensional model as far as possible, so that images are acquired of the model from all angles.
It will be appreciated that a three-dimensional model has volume in three-dimensional space, while the acquired images shot of it are images on a two-dimensional plane; each acquired image therefore contains only part of the model points of the three-dimensional model.
In this embodiment, for any one model point in the point cloud data, a target image including the model point may be acquired from a plurality of acquired images corresponding to the three-dimensional model.
In one possible implementation manner, when determining the target image including the model point in the plurality of acquired images, any one of the acquired images corresponds to a respective camera acquisition parameter, so that whether the acquired image includes the model point can be determined according to the camera acquisition parameter corresponding to the acquired image.
For example, in at least one acquired image corresponding to the three-dimensional model, a shooting area corresponding to each acquired image may be determined according to a camera acquisition parameter corresponding to each acquired image, where the camera acquisition parameter includes at least one of the following: camera pose, scaling factor;
Then, for any acquired image, whether the model point lies within the shooting area of that image is judged according to the coordinate information of the model point; if so, the acquired image is determined to be a candidate image. A target image containing the model point is then determined from the candidate images: for example, a candidate image itself may be determined as the target image, or a partial region of the candidate image containing the model point may be determined as the target image.
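The following sketch illustrates this visibility check under an assumed pinhole camera model; the patent names only camera pose and a scaling factor as acquisition parameters, so the intrinsic matrix and all names here are assumptions.

    import numpy as np

    def is_point_in_view(point_w, R, t, K, img_w, img_h):
        """Check whether a world-space model point projects inside an image.

        R, t: camera pose as a world-to-camera rotation matrix and translation.
        K: 3x3 intrinsic matrix (this is where a scaling factor would enter).
        """
        p_cam = R @ point_w + t
        if p_cam[2] <= 0:                  # the point is behind the camera
            return False
        u, v, _ = K @ (p_cam / p_cam[2])   # perspective projection to pixels
        return 0 <= u < img_w and 0 <= v < img_h

    # Candidate images are those whose shooting area contains the model point:
    # candidates = [img for img, cam in zip(images, cams)
    #               if is_point_in_view(p, cam.R, cam.t, cam.K, cam.w, cam.h)]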
S202, determining three-dimensional feature vectors of the model points, and determining two-dimensional feature vectors of the model points according to the target image.
The three-dimensional feature vector of the model point may then be determined. In this embodiment, the three-dimensional feature vector is a feature vector determined for the model point in three-dimensional space; this does not mean that the vector itself has three dimensions, and it is understood that the vector dimension of the three-dimensional feature vector depends on the actual implementation.
Also, since the target image containing the model point has been acquired, the two-dimensional feature vector of the model point can be determined from the target image. Similarly, the two-dimensional feature vector is a feature vector determined from the two-dimensional data corresponding to the model point on the two-dimensional plane; this does not mean that the vector itself has two dimensions, and its vector dimension likewise depends on the actual implementation.
In addition, because the target images containing the model point are selected from multiple acquired images of multiple view angles, several target images with different shooting angles may be acquired for one model point. The two-dimensional feature vector of the model point is then determined from these multi-view target images, so the vector is generated by combining target images of multiple view angles, which effectively improves the comprehensiveness of the features it contains.
S203, fusion processing is carried out according to the two-dimensional feature vector and the three-dimensional feature vector, and fusion feature vectors corresponding to the model points are obtained.
In this embodiment, the two-dimensional feature vector reflects the features of the model point on the two-dimensional plane. Because a two-dimensional plane contains fewer data features than three-dimensional space, these can be global features of the model point on the plane, such as its position on the plane, its color, and its positional relationship with neighboring pixel points.
The three-dimensional feature vector, in turn, reflects the features of the model point in three-dimensional space. Although it captures only local features of the model point, the data features contained in three-dimensional space are very rich, so the three-dimensional feature vector can reflect spatial features of the model point, such as its position in three-dimensional space and its curvature.
On this basis, the two-dimensional feature vector and the three-dimensional feature vector of the model point are fused to obtain the fused feature vector corresponding to the model point. The fused feature vector reflects, on the one hand, the two-dimensional features of the model point on the planes of the multi-view target images and, on the other hand, its three-dimensional features in three-dimensional space, so it can comprehensively and effectively reflect the data features of the model point.
For example, the two-dimensional feature vector and the three-dimensional feature vector of the model point may be input to a feature fusion network, where the feature fusion network performs fusion processing on the two-dimensional feature vector and the three-dimensional feature vector to output a fused feature vector corresponding to the model point. The specific network structure of the feature fusion network can be selected and set according to actual requirements, so long as the purpose of feature fusion can be achieved, and the embodiment is not limited.
S204, dividing the three-dimensional model according to the fusion feature vectors corresponding to the model points to obtain at least one divided area.
After the above operation is performed on each model point in the point cloud data, a fusion feature vector corresponding to each model point can be obtained, and then the three-dimensional model can be subjected to segmentation processing according to the fusion feature vector corresponding to each model point, so that the three-dimensional model is segmented to obtain at least one segmentation region.
When the three-dimensional model is segmented, it is actually the model points in the point cloud data that are partitioned; the region formed by the model points assigned to one set is then determined as a segmented region.
In one possible implementation, clustering may be performed according to the fused feature vectors corresponding to the model points, so as to partition the model points in the point cloud data and thereby divide the three-dimensional model into a plurality of segmented regions. Alternatively, vector distances may be computed between the fused feature vectors of the model points, and model points whose mutual vector distances are less than or equal to a preset threshold are grouped into one point set, likewise dividing the three-dimensional model into a plurality of segmented regions.
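A minimal sketch of the clustering variant follows; the patent does not fix a clustering algorithm, so k-means, the scikit-learn dependency, and the number of clusters are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def segment_by_clustering(fused: np.ndarray, n_regions: int = 8) -> np.ndarray:
        """Assign each model point to a segmented region via its fused feature vector.

        fused: (N, D) array, one fused feature vector per model point.
        Returns an (N,) array of region labels; points that share a label form
        one target point set, i.e. one segmented region of the model.
        """
        return KMeans(n_clusters=n_regions, n_init=10).fit_predict(fused)

For example, with labels = segment_by_clustering(fused_vectors), the expression points[labels == i] then yields the model points of the i-th segmented region.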
The method for segmenting a three-dimensional model provided by this embodiment of the application comprises: for any model point in the point cloud data of the three-dimensional model, acquiring a target image containing the model point from at least one acquired image corresponding to the three-dimensional model, where the acquired images correspond to different shooting angles of the three-dimensional model; determining a three-dimensional feature vector of the model point, and determining a two-dimensional feature vector of the model point from the target image; fusing the two-dimensional feature vector and the three-dimensional feature vector to obtain a fused feature vector corresponding to the model point; and segmenting the three-dimensional model into at least one segmented region according to the fused feature vectors of the model points. By obtaining multi-view acquired images of the three-dimensional model, selecting the target images containing a model point, determining the model point's two-dimensional feature vector from the target images and its three-dimensional feature vector from the point itself, and finally fusing the two-dimensional feature vector that reflects global features with the three-dimensional feature vector that contains richer feature information, a fused feature vector that accurately and comprehensively reflects the features of the model point is obtained. Determining the segmented regions of the three-dimensional model according to these fused feature vectors therefore effectively ensures the accuracy and rationality of the segmentation.
On the basis of the above description, the flow of the method for dividing a three-dimensional model provided by the present application will be described in detail with reference to fig. 3, and fig. 3 is a schematic flow diagram of the method for dividing a three-dimensional model provided by the embodiment of the present application.
As shown in fig. 3, in the present application, a plurality of acquired images of multiple views of a three-dimensional model may be processed according to a 2D network, so as to obtain a two-dimensional feature vector of a model point of the three-dimensional model. And processing the model points of the three-dimensional model according to the 3D network, so as to obtain the three-dimensional feature vectors of the model points of the three-dimensional model.
And then, carrying out fusion processing on the two-dimensional feature vector and the three-dimensional feature vector based on a fusion network so as to obtain fusion feature vectors corresponding to the model points. And then realizing the segmentation processing of the three-dimensional model based on the fusion feature vectors corresponding to the model points respectively to obtain a plurality of segmentation areas.
Specific implementations of 2D networks, 3D networks, converged networks, and segmentation processes herein are described in detail below in connection with specific embodiments.
The implementation of the 2D network is first further described in connection with fig. 4. Fig. 4 is a schematic flow chart of determining a two-dimensional feature vector according to an embodiment of the present application.
As shown in fig. 4, in this embodiment, the acquired image may be input to the image encoder, so as to obtain the first feature vector of the acquired image output by the image encoder.
The function of the image encoder is to process an image into a feature vector of that image. For example, a SAW image encoder may be chosen to implement the technical scheme of the present application, or the specific image encoder may be selected according to actual requirements; this embodiment does not limit it.
In this embodiment there are multiple acquired images of the three-dimensional model. In one possible implementation, the acquired images may all be input to the image encoder at once, and the image encoder then outputs the first feature vectors corresponding to the respective acquired images. Alternatively, the acquired images may be input to the image encoder one by one, and the first feature vector of each acquired image is obtained in turn.
Referring to fig. 4, after the first feature vector of an acquired image is determined, a further deconvolution operation may be performed on it to obtain the image feature vector of the acquired image.
The deconvolution operation maps the first feature vector into a larger feature space, yielding the image feature vector. Compared with the first feature vector, the image feature vector has a richer feature expression and a larger feature space, which effectively improves its expressive power and compensates, to a certain extent, for the relatively few features that a two-dimensional plane can express for a model point.
In actual implementation, subsequent operations could be performed directly on the first feature vector of the acquired image. However, first deconvolving it into the image feature vector and then performing the subsequent processing further enriches the two-dimensional feature vector of the model point and accordingly yields a better segmentation effect.
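A PyTorch-style sketch of this encode-then-deconvolve head is shown below; the backbone, channel counts, and stride are assumptions, not values taken from the patent.

    import torch
    import torch.nn as nn

    class ImageFeatureHead(nn.Module):
        """Encode an image, then deconvolve the result into a larger feature map.

        The transposed convolution maps the encoder output (the first feature
        vector) into a larger feature space, giving the image feature vector.
        """
        def __init__(self, encoder: nn.Module, enc_channels: int = 256):
            super().__init__()
            self.encoder = encoder  # any backbone returning a (B, C, H, W) feature map
            self.deconv = nn.ConvTranspose2d(
                enc_channels, enc_channels // 2, kernel_size=4, stride=2, padding=1)

        def forward(self, image: torch.Tensor) -> torch.Tensor:
            first = self.encoder(image)   # first feature vector, (B, C, H, W)
            return self.deconv(first)     # image feature vector, (B, C/2, 2H, 2W)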
In this embodiment, the image feature vector of every acquired image is determined in advance; then, for any model point, subsequent data processing only requires the image feature vectors of that model point's target images. This effectively avoids wasting computing resources on extracting features from the target images of every model point separately, which would repeat the same processing many times for the same image.
After the image feature vector of each acquired image has been determined, the image feature vector of each target image is naturally also available, since the target images are selected from the acquired images; the two-dimensional feature vector of the model point can then be determined from the image feature vectors of its target images.
When determining the two-dimensional feature vector of the model point, the image feature vector of the target image could be used directly, since the target image contains the model point. However, the model point occupies a very small proportion of the target image, so the image feature vector of the target image contains a large amount of information that is irrelevant when processing this model point. Therefore, in this embodiment, only the part of the feature vector that is currently of interest is cut out of the target image's feature vector for the subsequent determination of the two-dimensional feature vector, which improves both the relevance of the two-dimensional feature vector to the model point and the processing efficiency.
In one possible implementation manner, for any one target image, there is a coordinate mapping relationship between the first coordinate system corresponding to the point cloud data and the second coordinate system corresponding to the current target image, so that the target pixel point corresponding to the model point can be determined in the target image according to the coordinate mapping relationship and the coordinate information of the model point in the first coordinate system.
For example, the corresponding second coordinate information may be determined in the second coordinate system according to the first coordinate information of the model point in the first coordinate system and the coordinate mapping relation, and then the pixel indicated by the second coordinate information in the second coordinate system may be determined as the target pixel.
This can be understood with reference to fig. 4. As shown in fig. 4, assume that there is point cloud data 301 of the three-dimensional model, that the multi-view acquired images shown in fig. 4 have been captured for the three-dimensional model, and that acquired image 302 and acquired image 303 among them are the target images containing model point a.
In the example of fig. 4, the pixel point a1 in the target image 302 is the target pixel point corresponding to the model point a, and the pixel point a2 in the target image 303 is the target pixel point corresponding to the model point a.
After the target pixel point is determined, a pixel area centered on it can be determined in the target image; for example, the area centered on the target pixel point with a predetermined length as its radius may be taken as the pixel area described here. Coordinate information of the pixel area within the target image is then obtained: for example, the coordinates of each vertex of the pixel area, or the coordinates of its center point and at least one vertex. Any representation suffices as long as the coordinate information can indicate the position of the pixel area in the target image, and the specific implementation can be selected according to actual requirements.
When processing an image feature vector, the partial feature vector of a given position can be obtained from the image feature vector according to position information in the image. In this embodiment, therefore, the partial feature vector corresponding to the pixel area can be determined from the image feature vector of the target image according to the coordinate information of the pixel area; referring to fig. 4, the partial feature vector of the pixel area is extracted from the image feature vector according to the coordinate mapping relation.
In this embodiment, the same processing is performed for each target image of the model point, so the partial feature vectors corresponding to the respective pixel areas of all the target images can be determined. Then, to fuse these several partial feature vectors, and because the number of target images differs between model points, pooling may be performed on the partial feature vectors corresponding to the pixel areas of the target images, referring to fig. 4, so as to obtain a two-dimensional feature vector of fixed size for the model point.
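The following sketch illustrates extracting the partial feature vectors around each projected target pixel and pooling them across target images; the patch radius and the use of max pooling are assumptions, since the patent specifies only "pooling".

    import torch

    def point_2d_feature(feat_maps, pixels, radius: int = 3) -> torch.Tensor:
        """Pool patch features around a model point's projections.

        feat_maps: list of (C, H, W) image feature vectors, one per target image.
        pixels: list of (u, v) target pixel coordinates, one per target image
                (at least one target image is assumed).
        Returns a (C,) two-dimensional feature vector for the model point.
        """
        per_view = []
        for fmap, (u, v) in zip(feat_maps, pixels):
            c, h, w = fmap.shape
            u, v = int(u), int(v)
            u0, u1 = max(0, u - radius), min(w, u + radius + 1)
            v0, v1 = max(0, v - radius), min(h, v + radius + 1)
            patch = fmap[:, v0:v1, u0:u1]  # partial feature vector of the pixel area
            per_view.append(patch.reshape(c, -1).max(dim=1).values)
        # Pool across views so a variable number of target images still
        # yields a fixed-size two-dimensional feature vector.
        return torch.stack(per_view).max(dim=0).values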
The above describes how the two-dimensional feature vector of a model point is determined. To determine the three-dimensional feature vector of the model point, i.e. in the processing of the 3D network, the model point may be input into a feature extraction network, and the three-dimensional feature vector of the model point output by the feature extraction network is obtained. The feature extraction network may be, for example, a UNet-style network, or its specific implementation may be selected according to actual requirements, as long as it can extract features from data in three-dimensional space.
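As an illustration, a minimal per-point feature extractor in the spirit of PointNet is sketched below; the patent requires only some feature extraction network over the point cloud (naming a UNet-style network as one example), so the shared-MLP design and all dimensions here are assumptions.

    import torch
    import torch.nn as nn

    class PointFeatureNet(nn.Module):
        """Map each model point to a three-dimensional feature vector.

        A shared per-point MLP: the same weights are applied to every
        model point, producing one feature vector per point.
        """
        def __init__(self, in_dim: int = 3, out_dim: int = 128):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(in_dim, 64), nn.ReLU(),
                nn.Linear(64, out_dim))

        def forward(self, points: torch.Tensor) -> torch.Tensor:
            # points: (N, in_dim) -> (N, out_dim) three-dimensional feature vectors
            return self.mlp(points)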
The process of fusing the two-dimensional feature vector and the three-dimensional feature vector by the fusion network will be described in further detail with reference to fig. 5. Fig. 5 is a schematic diagram of implementation of feature fusion processing according to an embodiment of the present application.
As shown in fig. 5, in this embodiment the three-dimensional feature vector may first undergo spatial mapping, which maps it into a larger feature space and thereby enhances its feature expression capability.
To facilitate subsequent feature fusion, referring to fig. 5, the two-dimensional feature vector and the mapped three-dimensional feature vector may then be stitched together. The stitching directly concatenates the two vectors, so the stitched feature vector contains the complete two-dimensional feature vector and the complete three-dimensional feature vector.
Referring to fig. 5, the stitched feature vector may first be input to the first fully connected layer. A fully connected layer serves to fuse features, so the first fully connected layer in this embodiment outputs a first intermediate feature vector after fusing the stitched feature vector.
To further improve the feature expression accuracy of the final fusion feature vector, this embodiment may also include an attention network, that is, a network structure built on the attention mechanism, whose role is to strengthen the expression of important features in a feature vector and weaken the expression of unimportant features.
In one possible implementation, referring to fig. 5, a second fully connected layer, a probability processing unit, and an element multiplication processing unit may be included in the attention network.
As shown in fig. 5, in the attention network the first intermediate feature vector may first be input to the second fully connected layer to obtain the second intermediate feature vector output by the second fully connected layer; the second fully connected layer works in the same way as the first fully connected layer described above and is not described again here.
The second intermediate feature vector is then processed by the probability processing unit to obtain a weight feature vector, which indicates the weight corresponding to each element in the first intermediate feature vector. In one possible implementation, the probability processing unit may be a Sigmoid function, which maps each variable into the interval between 0 and 1 so that it can represent the corresponding element weight.
Then, as shown in fig. 5, the element multiplication processing unit performs element-wise multiplication of the first intermediate feature vector and the weight feature vector to obtain the target feature vector output by the attention network.
Element-wise multiplication multiplies each element of the first intermediate feature vector by its counterpart in the weight feature vector. Because the weight feature vector holds the weight of each element of the first intermediate feature vector, this multiplication strengthens the expression of important features and weakens the expression of unimportant features, thereby realizing the attention processing.
In practice, the network structure of the attention network is not limited to the implementation shown in fig. 5; network units may be added or removed according to actual requirements, as long as the attention mechanism described above is realized.
Continuing with fig. 5, after the target feature vector is obtained from the attention network, it may be input to a fusion decoder, which performs further feature fusion on the target feature vector and outputs the fusion feature vector corresponding to the model point. In one possible implementation, the fusion decoder may be a multi-layer fully connected network, which further improves the fusion effect of the finally obtained fusion feature vector.
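Putting the pieces of fig. 5 together, a minimal sketch of the fusion flow might look as follows (assuming PyTorch; all layer widths, the ReLU activations, and the two-layer decoder are assumptions not fixed by the text):

```python
import torch
from torch import nn

class FusionNetwork(nn.Module):
    """Sketch of fig. 5: spatial mapping, stitching, first fully connected
    layer, attention (second fully connected layer + Sigmoid + element-wise
    multiplication), and a fusion decoder."""

    def __init__(self, dim2d: int = 256, dim3d: int = 64,
                 mapped_dim: int = 256, hidden: int = 256):
        super().__init__()
        # Spatial mapping: project the 3D feature into a larger feature space.
        self.mapping = nn.Linear(dim3d, mapped_dim)
        # First fully connected layer fuses the stitched feature vector.
        self.fc1 = nn.Linear(dim2d + mapped_dim, hidden)
        # Attention: second fully connected layer whose Sigmoid output weights
        # each element of the first intermediate feature vector.
        self.fc2 = nn.Linear(hidden, hidden)
        # Fusion decoder: a multi-layer fully connected network.
        self.decoder = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
        )

    def forward(self, feat2d: torch.Tensor, feat3d: torch.Tensor) -> torch.Tensor:
        mapped = self.mapping(feat3d)                  # mapped 3D feature vector
        spliced = torch.cat([feat2d, mapped], dim=-1)  # stitched feature vector
        first = self.fc1(spliced)                      # first intermediate feature vector
        weights = torch.sigmoid(self.fc2(first))       # weight feature vector
        target = first * weights                       # target feature vector
        return self.decoder(target)                    # fusion feature vector
```

The Sigmoid keeps each weight between 0 and 1, so the element-wise product can only attenuate elements of the first intermediate feature vector, which matches the re-weighting role described above.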
Based on the above description, an implementation of three-dimensional model segmentation based on clustering is further described below with reference to fig. 6 and 7. Fig. 6 is a second flowchart of a method for segmenting a three-dimensional model according to an embodiment of the present application, and fig. 7 is a schematic diagram of clustering model points according to an embodiment of the present application.
As shown in fig. 6, the method includes:
S601, clustering is performed according to the fusion feature vector corresponding to each model point, and the cluster corresponding to each model point is determined.
In this embodiment, each model point has its own fusion feature vector, so clustering may be performed on the fusion feature vectors of all model points to obtain a plurality of clusters. Each cluster contains the fusion feature vector of at least one model point, so the cluster corresponding to each model point can be determined.
S602, the model points corresponding to any one cluster are determined as a target point set.
In this embodiment, for each cluster, the model points corresponding to that cluster may be determined as a target point set, thereby obtaining the target point set corresponding to each cluster.
For example, as shown in fig. 7, assume there are currently 10 model points, model point 1 to model point 10, and that the fusion feature vectors of model point 1, model point 2, and model point 5 are assigned to cluster a. It can then be determined that model point 1, model point 2, and model point 5 constitute one target point set.
Similarly, in the example of fig. 7, it can be determined that model point 3, model point 6, and model point 8 constitute one target point set, and model point 4, model point 7, model point 9, and model point 10 constitute one target point set.
The clusters corresponding to the model points are described above by way of example with reference to fig. 7; in actual implementation, the number of model points and the specific clusters corresponding to the model points may be determined according to actual requirements.
S603, the three-dimensional model is segmented into at least one segmented region according to the target point sets corresponding to the clusters, where a segmented region is a region formed by the model points in the same target point set.
After the target point set corresponding to each cluster is obtained, the region formed by the model points in each target point set can be determined as a segmented region. The three-dimensional model is thereby segmented into at least one segmented region, each of which is a region formed by the model points in the same target point set.
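The text leaves the clustering algorithm open; as one concrete reading, the sketch below uses k-means (assuming scikit-learn and NumPy; the number of clusters k is an assumed hyperparameter, and any clustering method over the fusion feature vectors would fit the description equally well):

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_model(fused_features: np.ndarray, k: int) -> list:
    """Cluster per-point fusion feature vectors and group the model points.

    fused_features: (N, D) array, one fusion feature vector per model point.
    Returns one index array (a target point set) per cluster; the model points
    behind each index array form one segmented region.
    """
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(fused_features)
    return [np.where(labels == c)[0] for c in range(k)]
```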
In this embodiment of the application, the fusion feature vectors of the model points are clustered, the model points within one cluster are then determined as a target point set, and finally the region formed by the model points in the same target point set is determined as a segmented region.
Based on the basic scheme described in the above embodiments, the final segmentation effect of the three-dimensional model is further understood with reference to fig. 8. Fig. 8 is a schematic diagram of a segmentation effect of a three-dimensional model according to an embodiment of the present application.
Fig. 8 shows the segmentation effect for one three-dimensional model: the three-dimensional model is illustrated on the left side of fig. 8, and the corresponding segmentation result, in which different segmented regions are shown in different gray levels, is illustrated on the right side. As the example of fig. 8 shows, different parts such as the arms and the head can be determined as different segmented regions in this embodiment, which improves the accuracy of the segmentation.
In summary, in the technical solution of the application, segmentation of a three-dimensional model can be realized more accurately and reasonably by fusing the two-dimensional feature vectors corresponding to multi-view acquired images with the three-dimensional feature vectors corresponding to the point cloud data. Moreover, the fusion is performed at the feature level, rather than extracting features after mapping the point cloud data into a two-dimensional image, so the technical solution can better fuse the two-dimensional and three-dimensional features of each model point in a feature space.
Fig. 9 is a schematic structural diagram of a three-dimensional model segmentation apparatus according to an embodiment of the present application. As shown in fig. 9, the apparatus 90 includes: an acquisition module 901, a determination module 902, a fusion module 903, and a processing module 904.
An obtaining module 901, configured to obtain, for any one model point in point cloud data of a three-dimensional model, a target image including the model point in at least one acquired image corresponding to the three-dimensional model, where each acquired image corresponds to a different shooting view angle of the three-dimensional model;
A determining module 902, configured to determine a three-dimensional feature vector of the model point, and determine a two-dimensional feature vector of the model point according to the target image;
the fusion module 903 is configured to perform fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector, so as to obtain a fusion feature vector corresponding to the model point;
the processing module 904 is configured to segment the three-dimensional model into at least one segmented region according to the fusion feature vectors corresponding to the model points.
In one possible design, the processing module 904 is specifically configured to:
clustering is carried out according to the fusion feature vectors corresponding to the model points, and clustering clusters corresponding to the model points are determined;
Determining the model points corresponding to any one cluster as a target point set;
And dividing the three-dimensional model into at least one divided area according to the target point sets corresponding to the clusters, wherein the divided area is an area formed by model points in the same target point set.
In one possible design, the fusion module 903 is specifically configured to:
Performing space mapping processing on the three-dimensional feature vector to obtain a mapped three-dimensional feature vector;
Splicing the two-dimensional feature vector and the mapped three-dimensional feature vector to obtain a spliced feature vector;
Inputting the spliced feature vector into a first full-connection layer to obtain a first intermediate feature vector output by the first full-connection layer;
Processing the first intermediate feature vector according to an attention network to obtain a target feature vector;
And inputting the target feature vector into a fusion decoder to obtain a fusion feature vector corresponding to the model point output by the fusion decoder.
In one possible design, the attention network includes a second fully connected layer, a probability processing unit, and an element multiplication processing unit;
The fusion module 903 is specifically configured to:
inputting the first intermediate feature vector into the second full-connection layer to obtain a second intermediate feature vector output by the second full-connection layer;
Processing the second intermediate feature vector according to the probability processing unit to obtain a weight feature vector, wherein the weight feature vector is used for indicating weights corresponding to all elements in the first intermediate feature vector;
And performing element multiplication processing on the first intermediate feature vector and the weight feature vector according to the element multiplication processing unit to obtain the target feature vector.
In one possible design, the obtaining module 901 is specifically configured to:
In at least one acquired image corresponding to the three-dimensional model, determining a shooting area corresponding to each acquired image according to camera acquisition parameters corresponding to each acquired image, wherein the camera acquisition parameters include at least one of the following: camera pose and scaling factor;
For any acquired image, if the model point is determined to be positioned in a shooting area of the acquired image according to the coordinate information of the model point, the acquired image is determined to be an image to be selected;
And determining a target image containing the model point according to the image to be selected.
In one possible design, the obtaining module 901 is specifically configured to:
Determining a target pixel point corresponding to the model point in the target image according to the coordinate mapping relation between the target image and the point cloud data and the coordinate information of the model point aiming at any target image;
determining a pixel area taking the target pixel point as a center in the target image, and acquiring coordinate information of the pixel area in the target image;
According to the coordinate information of the pixel region, determining partial feature vectors corresponding to the pixel region in the image feature vectors corresponding to the target image;
And performing pooling according to the partial feature vectors respectively corresponding to the pixel regions of the target images, so as to obtain the two-dimensional feature vector of the model point.
In one possible design, the processing module 904 is further configured to:
For any acquired image, inputting the acquired image to an image encoder to obtain a first feature vector of the acquired image output by the image encoder;
and performing deconvolution operation on the first feature vector to obtain an image feature vector of the acquired image.
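A minimal sketch of this encoder-plus-deconvolution path (assuming PyTorch; the layer sizes, strides, and the single transposed convolution are assumptions, since the text only requires an image encoder followed by a deconvolution that yields the image feature vector):

```python
import torch
from torch import nn

class ImageFeatureNet(nn.Module):
    """Illustrative encoder + deconvolution producing a per-image feature map."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # Image encoder: downsamples and produces the first feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Deconvolution: upsamples back to image resolution so that pixel-region
        # coordinates can index the image feature vector directly.
        self.deconv = nn.ConvTranspose2d(channels, channels, kernel_size=4, stride=4)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W) -> image feature vector (B, C, H, W)
        return self.deconv(self.encoder(image))
```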
In one possible design, the determining module 902 is specifically configured to:
And inputting the model points into a feature extraction network to obtain three-dimensional feature vectors of the model points output by the feature extraction network.
The device provided in this embodiment may be used to implement the technical solution of the foregoing method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
Fig. 10 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 100 of this embodiment includes a processor 1001 and a memory 1002, wherein:
Memory 1002 for storing computer-executable instructions;
the processor 1001 is configured to execute the computer-executable instructions stored in the memory to implement the steps of the method for segmenting a three-dimensional model in the above embodiments. Reference may be made to the relevant description of the foregoing method embodiments.
Alternatively, the memory 1002 may be separate or integrated with the processor 1001.
When the memory 1002 is provided separately, the electronic device further comprises a bus 1003 for connecting the memory 1002 and the processor 1001.
The embodiment of the application also provides a computer readable storage medium, wherein computer execution instructions are stored in the computer readable storage medium, and when a processor executes the computer execution instructions, the method for dividing the three-dimensional model executed by the electronic equipment is realized.
It should be noted that the user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, displayed data, etc.) involved in the present application are all information and data authorized by the user or fully authorized by all parties; the collection, use, and processing of the related data must comply with relevant laws, regulations, and standards, and corresponding operation entries are provided for the user to choose to authorize or refuse.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional module is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (english: processor) to perform some of the steps of the methods according to the embodiments of the application.
It should be understood that the above processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of a method disclosed in connection with the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
The memory may include high-speed RAM and may further include non-volatile memory (NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disk, etc.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The storage medium may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. A method for segmenting a three-dimensional model, comprising:
For any model point in the point cloud data of a three-dimensional model, acquiring a target image containing the model point in at least one acquired image corresponding to the three-dimensional model, wherein each acquired image corresponds to a different shooting view angle of the three-dimensional model;
determining a three-dimensional feature vector of the model point, and determining a two-dimensional feature vector of the model point according to the target image;
Performing fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fusion feature vector corresponding to the model point;
Dividing the three-dimensional model according to the fusion feature vectors corresponding to the model points to obtain at least one divided area;
The determining the two-dimensional feature vector of the model point according to the target image comprises the following steps:
Determining a target pixel point corresponding to the model point in the target image according to the coordinate mapping relation between the target image and the point cloud data and the coordinate information of the model point aiming at any target image;
determining a pixel area taking the target pixel point as a center in the target image, and acquiring coordinate information of the pixel area in the target image;
According to the coordinate information of the pixel region, determining partial feature vectors corresponding to the pixel region in the image feature vectors corresponding to the target image;
And performing pooling according to the partial feature vectors respectively corresponding to the pixel regions of the target images, so as to obtain the two-dimensional feature vector of the model point.
2. The method according to claim 1, wherein the segmenting the three-dimensional model into at least one segmented region according to the respective fusion feature vectors of the model points comprises:
clustering is carried out according to the fusion feature vectors corresponding to the model points, and clustering clusters corresponding to the model points are determined;
Determining the model points corresponding to any one cluster as a target point set;
And dividing the three-dimensional model into at least one divided area according to the target point sets corresponding to the clusters, wherein the divided area is an area formed by model points in the same target point set.
3. The method according to claim 1 or 2, wherein the performing fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fused feature vector corresponding to the model point includes:
Performing space mapping processing on the three-dimensional feature vector to obtain a mapped three-dimensional feature vector;
Splicing the two-dimensional feature vector and the mapped three-dimensional feature vector to obtain a spliced feature vector;
Inputting the spliced feature vector into a first full-connection layer to obtain a first intermediate feature vector output by the first full-connection layer;
Processing the first intermediate feature vector according to an attention network to obtain a target feature vector;
And inputting the target feature vector into a fusion decoder to obtain a fusion feature vector corresponding to the model point output by the fusion decoder.
4. A method according to claim 3, wherein the attention network comprises a second fully connected layer, a probability processing unit and an element multiplication processing unit;
The processing the first intermediate feature vector according to the attention network to obtain a target feature vector includes:
inputting the first intermediate feature vector into the second full-connection layer to obtain a second intermediate feature vector output by the second full-connection layer;
Processing the second intermediate feature vector according to the probability processing unit to obtain a weight feature vector, wherein the weight feature vector is used for indicating weights corresponding to all elements in the first intermediate feature vector;
And performing element multiplication processing on the first intermediate feature vector and the weight feature vector according to the element multiplication processing unit to obtain the target feature vector.
5. The method according to claim 1 or 2, wherein the acquiring, in the at least one acquired image corresponding to the three-dimensional model, a target image including the model point includes:
In at least one acquired image corresponding to the three-dimensional model, determining a shooting area corresponding to each acquired image according to camera acquisition parameters corresponding to each acquired image, wherein the camera acquisition parameters include at least one of the following: camera pose and scaling factor;
For any acquired image, if the model point is determined to be positioned in a shooting area of the acquired image according to the coordinate information of the model point, the acquired image is determined to be an image to be selected;
And determining a target image containing the model point according to the image to be selected.
6. The method according to claim 1, wherein the method further comprises:
Inputting the acquired image to an image encoder for any acquired image to obtain a first feature vector of the acquired image output by the image encoder;
and performing deconvolution operation on the first feature vector to obtain an image feature vector of the acquired image.
7. A three-dimensional model segmentation apparatus, comprising:
The acquisition module is used for acquiring, for any model point in the point cloud data of a three-dimensional model, a target image containing the model point in at least one acquired image corresponding to the three-dimensional model, wherein each acquired image corresponds to a different shooting view angle of the three-dimensional model;
The determining module is used for determining the three-dimensional feature vector of the model point and determining the two-dimensional feature vector of the model point according to the target image;
the fusion module is used for carrying out fusion processing according to the two-dimensional feature vector and the three-dimensional feature vector to obtain a fusion feature vector corresponding to the model point;
the processing module is used for dividing the three-dimensional model to obtain at least one division area according to the fusion feature vectors corresponding to the model points;
The determining module is specifically configured to: determine, for any target image, a target pixel point corresponding to the model point in the target image according to the coordinate mapping relationship between the target image and the point cloud data and the coordinate information of the model point; determine a pixel region centered on the target pixel point in the target image, and acquire coordinate information of the pixel region in the target image; determine, according to the coordinate information of the pixel region, the partial feature vector corresponding to the pixel region in the image feature vector corresponding to the target image; and perform pooling according to the partial feature vectors respectively corresponding to the pixel regions of the target images, so as to obtain the two-dimensional feature vector of the model point.
8. An electronic device, comprising:
a memory for storing a program;
a processor for executing the program stored by the memory, the processor being for performing the method of any one of claims 1 to 6 when the program is executed.
9. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any of claims 1 to 6.
CN202311426368.XA 2023-10-27 2023-10-27 Method and device for segmenting three-dimensional model Active CN117422848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311426368.XA CN117422848B (en) 2023-10-27 2023-10-27 Method and device for segmenting three-dimensional model

Publications (2)

Publication Number Publication Date
CN117422848A CN117422848A (en) 2024-01-19
CN117422848B (en) 2024-08-16

Family

ID=89526153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311426368.XA Active CN117422848B (en) 2023-10-27 2023-10-27 Method and device for segmenting three-dimensional model

Country Status (1)

Country Link
CN (1) CN117422848B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119225530B (en) * 2024-08-30 2025-10-28 北京大学 Man-machine interaction method, device, medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972763A (en) * 2022-07-28 2022-08-30 香港中文大学(深圳)未来智联网络研究院 Laser radar point cloud segmentation method, device, equipment and storage medium
CN115861601A (en) * 2022-12-20 2023-03-28 清华大学 A multi-sensor fusion sensing method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11182644B2 (en) * 2019-12-23 2021-11-23 Beijing Institute Of Technology Method and apparatus for pose planar constraining on the basis of planar feature extraction
CN114170349B (en) * 2020-09-10 2024-12-20 北京达佳互联信息技术有限公司 Image generation method, device, electronic device and storage medium
CN113052066B (en) * 2021-03-24 2022-09-02 中国科学技术大学 Multi-mode fusion method based on multi-view and image segmentation in three-dimensional target detection
CN114463736B (en) * 2021-12-28 2024-11-05 天津大学 A multi-target detection method and device based on multimodal information fusion
CN114820809A (en) * 2022-03-31 2022-07-29 联想(北京)有限公司 Parameter determination method, equipment and computer storage medium
CN115063554B (en) * 2022-06-02 2024-11-26 浙大宁波理工学院 A 3D shape segmentation method based on the fusion of voxel and grid representation
CN115861248A (en) * 2022-12-12 2023-03-28 上海介航机器人有限公司 Medical image segmentation method, medical model training method, medical image segmentation device and storage medium
CN116912486A (en) * 2023-05-16 2023-10-20 东莞理工学院 Target segmentation method and electronic device based on edge convolution and multi-dimensional feature fusion
CN116758093B (en) * 2023-05-30 2024-05-07 首都医科大学宣武医院 Image segmentation method, model training method, device, equipment and medium
CN116912331B (en) * 2023-07-20 2025-07-01 神力视界(深圳)文化科技有限公司 Calibration data generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN117422848A (en) 2024-01-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant