
CN111186139A - Multi-level parallel slicing method for 3D printing model - Google Patents


Info

Publication number
CN111186139A
CN111186139A (application CN201911355386.7A)
Authority
CN
China
Prior art keywords
array
level
computing node
data
patch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911355386.7A
Other languages
Chinese (zh)
Other versions
CN111186139B (en)
Inventor
谷建华
董旭伟
赵天海
王云岚
侯正雄
曹梓祥
李超
吴婕菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN201911355386.7A priority Critical patent/CN111186139B/en
Publication of CN111186139A publication Critical patent/CN111186139A/en
Application granted granted Critical
Publication of CN111186139B publication Critical patent/CN111186139B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B29WORKING OF PLASTICS; WORKING OF SUBSTANCES IN A PLASTIC STATE IN GENERAL
    • B29CSHAPING OR JOINING OF PLASTICS; SHAPING OF MATERIAL IN A PLASTIC STATE, NOT OTHERWISE PROVIDED FOR; AFTER-TREATMENT OF THE SHAPED PRODUCTS, e.g. REPAIRING
    • B29C64/00Additive manufacturing, i.e. manufacturing of three-dimensional [3D] objects by additive deposition, additive agglomeration or additive layering, e.g. by 3D printing, stereolithography or selective laser sintering
    • B29C64/30Auxiliary operations or equipment
    • B29C64/386Data acquisition or data processing for additive manufacturing
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B33ADDITIVE MANUFACTURING TECHNOLOGY
    • B33YADDITIVE MANUFACTURING, i.e. MANUFACTURING OF THREE-DIMENSIONAL [3-D] OBJECTS BY ADDITIVE DEPOSITION, ADDITIVE AGGLOMERATION OR ADDITIVE LAYERING, e.g. BY 3-D PRINTING, STEREOLITHOGRAPHY OR SELECTIVE LASER SINTERING
    • B33Y50/00Data acquisition or data processing for additive manufacturing

Landscapes

  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Materials Engineering (AREA)
  • Manufacturing & Machinery (AREA)
  • Physics & Mathematics (AREA)
  • Mechanical Engineering (AREA)
  • Optics & Photonics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract



The invention relates to a multi-level parallel slicing method for a 3D printing model. The method accelerates the slicing of a three-dimensional model by means of multi-level parallel processing and specifically comprises four levels of slicing parallelization: the computing node level, the multi-GPU level, the thread block level, and the thread level. At each level, a task division and data interaction scheme is designed according to the address space distribution, memory access pattern, and data structure characteristics of that level, so that the load of the parallel execution units is balanced and data traffic is reduced. In the thread-level parallelization, atomic addition on an array index is used to solve the data race problem during parallel insertion. The invention can effectively reduce the time consumed by three-dimensional model slicing without reducing the original slicing precision. In addition, parallel reading of the corresponding sub-model files by the computing nodes reduces hard disk I/O time, and each computing node only needs to process its own sub-model data, which reduces memory usage.


Description

Multi-level parallel slicing method for 3D printing model
Technical Field
The invention belongs to the technical field of 3D printing, and particularly relates to a multi-level parallel slicing method of a 3D printing model.
Background
3D printing refers to the process of digitally modeling and layering a three-dimensional entity and then manufacturing the object layer by layer through additive deposition of a material with related support technology. The layering and slicing process links the computer representation of the three-dimensional model of an article with the finished product: only after being processed and layered in advance into a series of two-dimensional plane data can the three-dimensional digital model be recognized by a 3D printing device and the printing of the actual article be executed.
STL is a data format that bridges CAD software and printing equipment, and is the de facto file format standard in the field of 3D printing. It approximates the original CAD model with a large number of triangular patches, and the vertex information and unit normal vector of each triangular patch are recorded in the STL file. As industrial demands rise, the three-dimensional models of articles to be printed become increasingly complex and the precision requirements on the finished product increasingly strict; these factors greatly increase the time consumed by the slicing process and have gradually become a bottleneck limiting the overall production efficiency of 3D printing.
The document "STL model layered adjacency ordering fast slicing algorithm, Journal of Computer-Aided Design & Computer Graphics, 2011, Vol. 23(4), p. 600-606" discloses a fast slicing algorithm based on layered adjacency ordering of the STL model. The method first finds the maximum and minimum coordinate values of each triangular patch projected in the slicing direction and combines them with the coordinate values of each tangent plane to determine the tangent planes intersecting each triangular patch, then computes the coordinates of the intersection points of each triangular patch with each tangent plane. In addition, for each sliced layer, an intersection linked list is constructed according to the adjacency relations between triangular patches, and the intersections are stored in it. Compared with traditional slicing methods based on topology information extraction and rectangle grouping, the disclosed method achieves a performance improvement. However, it is still a serial algorithm in nature: the parallelization potential of the slicing problem is neither analyzed nor exploited, so slicing a large-scale three-dimensional model remains time-consuming, which limits slicing efficiency.
Disclosure of Invention
Technical problem to be solved
In order to reduce the time consumed by the three-dimensional model slicing process and improve slicing efficiency, the invention provides a multi-level parallel slicing method for 3D printing models. A four-level parallel slicing scheme is designed that exploits the parallelization potential of the slicing problem; the parallel processing of all execution units reduces the time consumed by three-dimensional model slicing without reducing the original slicing precision.
Technical scheme
A multi-level parallel slicing method of a 3D printing model is characterized by comprising the following steps:
Step 1: executing computing-node-level slice parallelization: in the cluster, tasks are divided among the computing nodes with the patch as the basic unit, and the original model file is split according to the division result to obtain several sub-model files;
Step 2: executing multi-GPU-level slice parallelization: within each computing node, tasks are divided among the GPUs with the patch as the basic unit; the corresponding sub-model file is read into memory in several passes according to the division result, the patch and vertex arrays are constructed, and the patch and vertex arrays required for slicing are imported from the computing node's main memory into GPU memory;
Step 3: executing thread-block-level slice parallelization: the patch array imported into GPU memory is divided into several sub-patch arrays among the thread blocks;
Step 4: executing thread-level slice parallelization: the sub-patch array belonging to a thread block is divided among the threads; each thread processes its assigned patches in turn, finds all tangent planes intersecting each patch, computes the tangent segments, and inserts them mutually exclusively into the tangent segment array of the corresponding layer;
Step 5: exporting the tangent segment arrays from each GPU memory to main memory and merging the computation results of the GPUs on the CPU side, producing in each layer one sub-tangent-segment array corresponding to the current computing node;
Step 6: dividing tasks among the computing nodes by layer, and collecting and merging the sub-tangent-segment arrays layer by layer according to the division result, so that one corresponding tangent segment array is obtained in each assigned layer;
Step 7: each computing node processes the per-layer tangent segment arrays present on the current node in turn and executes tangent segment connection in parallel to generate the intra-layer slice contour lines.
Dividing tasks among the computing nodes in step 1 means that, with the patch as the basic unit, the original model file is divided in a continuous equal-division manner, with the remainder assigned to the last computing node; the original model file is then split according to the division result into several sub-model files, one per computing node.
Task division among the GPUs in step 2 means that, with the patch as the basic unit, the sub-model file is divided in a continuous equal-division manner, with the remainder assigned to the last GPU.
Reading in the sub-model files and constructing the arrays in step 2 means that each computing node reads its assigned sub-model file in several passes according to the task division result, the number of passes being equal to the number of GPUs in the node. After reading, the node's main memory contains as many groups of model data as there are GPUs, each group containing a patch array and a vertex array.
Dividing the patches among the thread blocks in step 3 first distributes them in a continuous equal-division manner; the remainder is then assigned one patch at a time to the thread blocks in turn, so that the task load is balanced and each thread block receives a contiguous run of patches.
Dividing the sub-patch array among the threads in step 4 assigns patches one at a time to the threads in turn, so that the task load is balanced and the memory units read by adjacent threads are contiguous, enabling coalesced reads of GPU memory that improve memory access bandwidth.
Mutually exclusive insertion of the tangent segments into the array in step 4 means that when the threads participating in the parallel processing handle their respective triangular patches, the computed tangent segments may belong to the same layer, so data must be inserted into the same array in a mutually exclusive way. The exclusive insertion scheme sets an array index pointing to the next free position to be filled; when inserting data, a thread must perform an atomic add on the array index and use the value it obtains as the insertion position.
Merging the computation results of the GPUs in step 5 means merging the tangent segment data by layer on the CPU side, i.e., tangent segments belonging to the same layer are merged into the same array.
In the step 6, collecting merged tangent segment data among the computing nodes specifically includes the following steps:
1) with the layer as the basic unit, distributing the merging tasks among the computing nodes in a continuous equal-division manner, with the remainder assigned to the last computing node;
2) the computing nodes perform data merging on the distributed layers, collect tangent segment data located on the same layer in all the computing nodes, and merge the tangent segment data into an array in a unified manner; after the combination of the data of the tangent line segments is completed, each computing node discards the data of the layer which does not belong to the current computing node, and the memory space is released.
Advantageous effects
The invention provides a multi-level parallel slicing method for a 3D printing model, which accelerates three-dimensional model slicing by means of multi-level parallel processing and specifically comprises four levels of slicing parallelization: the computing node level, the multi-GPU level, the thread block level, and the thread level. At each level, a task division and data interaction scheme is designed according to the address space distribution, memory access pattern, and data structure characteristics of that level, so that the load of the parallel execution units is balanced and data traffic is reduced. In the thread-level parallelization, the data race problem during parallel insertion is solved by atomic addition on the array index. The method can effectively reduce the time consumed by three-dimensional model slicing without reducing the original slicing precision. In addition, parallel reading of the corresponding sub-model files by the computing nodes reduces hard disk I/O time, and each computing node only needs to process its own sub-model data, which reduces memory usage.
Drawings
FIG. 1 is a block diagram of the flow of each stage of a four-stage parallelized slicing method proposed by the present invention;
FIG. 2 is a diagram of the structure of each level of execution units of the four-level parallelized slice of the 3D printing model proposed by the present invention;
FIG. 3 is a diagram of compute node level parallelization task partitioning in the present invention;
FIG. 4 is a diagram of multi-GPU-level parallelization task partitioning according to the present invention;
FIG. 5 is a diagram of thread block level parallelization task partitioning according to the present invention;
FIG. 6 is a diagram of thread-level parallelization task partitioning according to the present invention;
FIG. 7 is a diagram illustrating the insertion of data into a same-tier array of tangent segments in a mutually exclusive manner by two threads, according to an embodiment;
FIG. 8 is a comparison of slicing time between the current scheme and the original scheme.
Detailed Description
The invention will now be further described with reference to the following examples and drawings:
Referring to figs. 1-2, the invention improves the slicing speed of the three-dimensional model by a multi-level parallel method, specifically comprising four levels of parallelization: from top to bottom, the computing node level, the multi-GPU level, the thread block level, and the thread level.
The method comprises the following steps. Step one: execute computing-node-level slice parallelization: in the cluster, tasks are divided among the computing nodes with the patch as the basic unit, and the original model file is split according to the division result; step two is then executed, and after it finishes, the merged tangent segment data is collected among the computing nodes.
The task division among the computing nodes is to divide the original model file in a continuous equipartition mode by taking a patch as a basic unit, and the equipartition remainder is distributed to the last computing node. And dividing the original model file according to the division result to form a plurality of sub-model files, wherein one sub-model file is allocated to one computing node.
Referring to fig. 3, let N be the total number of patches in the STL model file, n1 the number of computing nodes in the cluster, uniform_count1 the number of patches assigned to each computing node by the equal division, and last_count1 the number of patches assigned to the last computing node; then:
uniform_count1 = ⌊N / n1⌋
last_count1 = uniform_count1 + N % n1

where ⌊·⌋ denotes rounding down and % denotes the remainder. The original STL model file is split into n1 sub-model files according to the patch counts calculated above.
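As a minimal Python sketch (not part of the patent text; the function name is hypothetical), the continuous equal division with the remainder going to the last node can be written as:

```python
def partition_patches(total_patches: int, n_nodes: int) -> list[int]:
    """Continuous equal division: every node gets floor(N/n) patches,
    and the remainder N % n is assigned to the last node."""
    uniform = total_patches // n_nodes
    counts = [uniform] * n_nodes
    counts[-1] += total_patches % n_nodes
    return counts

# e.g. 10 patches over 3 nodes -> [3, 3, 4]
```

The same helper describes the GPU-level split in step two, with the sub-model patch count and GPU count substituted for N and n1.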
The benefits of segmenting the original model file are:
1) the amount of data each computing node reads from the hard disk into memory is reduced, which in turn reduces I/O time;
2) reducing the memory usage amount of each computing node, so that a larger model can be sliced under the condition that the total memory amount of the computing nodes is fixed;
3) each computing node executes slice computation in parallel, which brings about effective reduction of computation time.
The method for collecting merged tangent segment data among the computing nodes specifically comprises the following steps:
1) With the layer as the basic unit, the merging tasks are distributed among the computing nodes in a continuous equal-division manner, and the remainder is assigned to the last computing node.
Let layer_count be the total number of slice layers. Each computing node receives uniform_layer_count layers, and the last computing node additionally receives the remainder, receiving last_layer_count layers in total:

uniform_layer_count = ⌊layer_count / n1⌋
last_layer_count = uniform_layer_count + layer_count % n1
2) The computing nodes perform data merging on their assigned layers, collecting the tangent segment data located on the same layer from all computing nodes and merging it into one array. In one embodiment, the merge operation is performed with the MPI_Gather() interface of MPI, called once per layer. After the tangent segment data has been merged, each computing node discards the data of layers not assigned to it, releasing the memory.
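The per-layer gather can be illustrated with a plain-Python stand-in for the MPI collective (a sketch under the assumption that each node's results are a dict mapping layer index to a list of segments; names are hypothetical):

```python
def merge_layers(per_node_segments: list[dict], n_layers: int, n_nodes: int):
    """Gather tangent segments by layer: each node keeps only its assigned
    layers (continuous equal division, remainder to the last node) and
    merges the matching segments from every node into one array."""
    uniform = n_layers // n_nodes
    merged = [dict() for _ in range(n_nodes)]
    for node in range(n_nodes):
        start = node * uniform
        stop = n_layers if node == n_nodes - 1 else start + uniform
        for layer in range(start, stop):
            # collect this layer's segments from every node's result
            merged[node][layer] = [seg for segs in per_node_segments
                                   for seg in segs.get(layer, [])]
    return merged
```

In the real scheme each `merged[node]` lives on a different machine and the inner collection is one MPI_Gather call per layer.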
Step two: execute multi-GPU-level slice parallelization: within the computing node, tasks are divided among the GPUs with the patch as the basic unit; the sub-model file is read into memory in several passes according to the division result, the patch and vertex arrays are constructed, and the data required for slicing is imported from the computing node's main memory into GPU memory. Step three is then executed, after which the tangent segment data is exported from each GPU memory to main memory and the GPU computation results are merged on the CPU side.
The task division among the GPUs is realized by dividing the sub-model file in a continuous equal division mode by taking a patch as a basic unit, and distributing the remainder after the equal division to the last GPU.
Referring to fig. 4, if the number of GPUs in each compute node is n2, the number of patches included in the sub-model file allocated to the current compute node is count1, the number of patches evenly divided by each GPU is uniform _ count2, and the number of patches divided by the last GPU is last _ count2, then:
uniform_count2 = ⌊count1 / n2⌋
last_count2 = uniform_count2 + count1 % n2
In step two, reading in the sub-model files and constructing the arrays means that the computing node reads its assigned sub-model file in several passes according to the task division result, the number of passes being equal to the number of GPUs in the node. After reading, the node's main memory contains as many groups of model data as there are GPUs, each group containing a patch array and a vertex array.
After reading, the computing node's main memory contains n2 groups of data, each group containing a patch array Face[] and a vertex array Vertex[]. In the patch array, each element records the indices of the three vertices of one patch, an index being the subscript of the vertex in the Vertex[] array. In the vertex array, each element records the x, y, z coordinates of one vertex.
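Building such an indexed representation from the raw, vertex-duplicating triangles of an STL file can be sketched as follows (a hypothetical helper, not the patent's code; tuples stand in for the Face[] and Vertex[] arrays):

```python
def build_arrays(triangles):
    """Deduplicate vertices and build an indexed patch array from raw
    triangles, each given as a tuple of three (x, y, z) points.
    Returns (faces, vertices): faces[j] holds three indices into
    vertices, and vertices[i] holds one (x, y, z) coordinate."""
    vertex_index = {}          # (x, y, z) -> position in vertices
    vertices, faces = [], []
    for tri in triangles:
        idx = []
        for v in tri:
            if v not in vertex_index:
                vertex_index[v] = len(vertices)
                vertices.append(v)
            idx.append(vertex_index[v])
        faces.append(tuple(idx))
    return faces, vertices
```

Two triangles sharing an edge thus share the two vertex indices of that edge instead of storing the coordinates twice.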
Furthermore, the merging of the calculation results of the GPUs is to merge the tangent segment data in layers at the CPU end, that is, to merge the tangent segments belonging to the same layer into the same array.
In one embodiment, OpenMP is used to implement control of multiple GPUs in a compute node, where the number of OpenMP threads created is equal to the number of GPUs in the compute node, and one OpenMP thread takes over one GPU compute card.
Step three: and executing thread block level slice parallelization, dividing the patch array led into the GPU memory among all thread blocks, and executing the step four.
Dividing the patches among the thread blocks first distributes them in a continuous equal-division manner; the remainder is then assigned one patch at a time to the thread blocks in turn, so that the task load is balanced and each thread block receives a contiguous run of patches.
Referring to fig. 5, if the number of thread blocks is n3, the number of patches in the Face[] array in GPU memory is count2, the remainder after equal division is remain_count, the number of patches assigned to a thread block is face_count, and each thread block has a contiguous unique number id, then:
remain_count = count2 % n3
face_count = ⌊count2 / n3⌋ + 1, if id < remain_count
face_count = ⌊count2 / n3⌋, otherwise
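The one-at-a-time remainder distribution among thread blocks can be sketched as a per-block count (a hypothetical helper illustrating the scheme, not the patent's kernel code):

```python
def block_patch_count(count2: int, n_blocks: int, block_id: int) -> int:
    """Patches assigned to thread block `block_id`: the floor share plus
    one extra patch for the first (count2 % n_blocks) blocks, so block
    loads differ by at most one patch."""
    remain = count2 % n_blocks
    return count2 // n_blocks + (1 if block_id < remain else 0)
```

Summing the counts over all block ids recovers the total patch count, confirming the division is exhaustive and disjoint.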
in one embodiment, the thread blocks are organized into one dimension, and the number of thread blocks is reasonably set according to the number of streaming multiprocessors contained in the GPU.
Step four: execute thread-level slice parallelization: the sub-patch array belonging to a thread block is divided among the threads; each thread processes its assigned patches in turn, finds all tangent planes intersecting each patch, computes the tangent segments, and inserts them mutually exclusively into the tangent segment array of the corresponding layer.
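The per-patch geometric kernel, intersecting one triangular patch with one tangent plane to obtain a tangent segment, can be sketched as follows (a minimal illustration, not the patent's implementation; vertices lying exactly on the plane are not handled specially, as a production slicer would require):

```python
def slice_triangle(tri, z):
    """Intersect a triangular patch (three (x, y, z) vertices) with the
    plane Z = z. Returns the tangent segment as a pair of (x, y) points,
    or None if the triangle does not cross the plane."""
    pts = []
    for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
        if (a[2] - z) * (b[2] - z) < 0:          # edge crosses the plane
            t = (z - a[2]) / (b[2] - a[2])       # linear interpolation
            pts.append((a[0] + t * (b[0] - a[0]),
                        a[1] + t * (b[1] - a[1])))
    return (pts[0], pts[1]) if len(pts) == 2 else None
```

A thread runs this for every tangent plane between a patch's minimum and maximum Z, producing one segment per intersecting layer.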
Referring to fig. 6, the number of threads is n4. The sub-patch array is divided among the threads by assigning patches one at a time to the threads in turn, so that the task load is balanced and the memory units read by adjacent threads are contiguous, enabling coalesced reads of GPU memory that improve memory access bandwidth.
Mutually exclusive insertion of the tangent segments into the array means that when the threads participating in the parallel processing handle their respective triangular patches, the computed tangent segments may belong to the same layer, so data must be inserted into the same array in a mutually exclusive way. The exclusive insertion scheme sets an array index pointing to the next free position to be filled; when inserting data, a thread must perform an atomic add on the array index and use the value it obtains as the insertion position.
In one embodiment, the threads are organized in one dimension; the number of threads in a thread block is usually an integer multiple of 32 and is chosen according to hardware resources such as the number of registers on the GPU.
Referring to fig. 7, in an embodiment, an index e records the current position to be filled in the arr array. Thread x and thread y compute tangent segments s1 and s2 of the same tangent plane with their respective triangular patches and want to insert into arr at the same time: thread x executes atomicAdd(e) and then Insert(s1), thread y executes atomicAdd(e) and then Insert(s2), achieving correct mutually exclusive insertion into the same array.
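The fetch-and-add insertion pattern can be simulated on the CPU (a sketch only: a Python lock stands in for the hardware atomicAdd a CUDA kernel would use, and the class name is hypothetical):

```python
import threading

class AtomicArray:
    """CPU-side analogue of the mutually exclusive insertion: a shared
    index is advanced by an atomic (lock-protected) fetch-and-add, and
    each thread writes at the slot it obtained, so no two threads ever
    write the same position."""
    def __init__(self, capacity):
        self._data = [None] * capacity
        self._index = 0
        self._lock = threading.Lock()

    def insert(self, segment):
        with self._lock:                 # stands in for atomicAdd(&index, 1)
            slot = self._index
            self._index += 1
        self._data[slot] = segment       # exclusive write to the claimed slot
        return slot
```

Running many threads against one instance fills the array densely with no lost or duplicated slots, which is the property the atomic add guarantees on the GPU.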
Step five: each computing node executes tangent segment connection in parallel on its assigned layers to generate the intra-layer slice contour lines.
The layers assigned to each computing node are those assigned when the merged tangent segment data was collected among the computing nodes in step one. During data collection and merging, the tangent segment data is divided evenly by layer among the computing nodes in the cluster. After division, the data of each layer is independent, so the computing nodes can execute tangent segment connection in parallel to generate the intra-layer slice contour lines and complete the slicing of the model.
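The in-layer connection step can be sketched as endpoint matching (a simplified illustration with a hypothetical function name; a production slicer must also tolerate floating-point gaps between endpoints):

```python
def connect_segments(segments):
    """Chain the tangent segments of one layer into contour polylines by
    matching exact endpoints. Each segment is a pair of (x, y) points;
    returns a list of polylines (closed loops repeat their first point)."""
    remaining = list(segments)
    contours = []
    while remaining:
        a, b = remaining.pop()
        contour = [a, b]
        extended = True
        while extended:                  # grow the chain while a match exists
            extended = False
            for i, (p, q) in enumerate(remaining):
                if p == contour[-1]:
                    contour.append(q); remaining.pop(i); extended = True; break
                if q == contour[-1]:
                    contour.append(p); remaining.pop(i); extended = True; break
        contours.append(contour)
    return contours
```

For the four edges of a unit square this yields a single closed five-point loop, i.e. one intra-layer contour line.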
Referring to fig. 8, the slicing times of the original scheme and the current scheme are compared, where the original scheme is a serial slicing scheme and the current scheme is the multi-level parallel slicing scheme. The test hardware is a GPU cluster with two computing nodes; the CPU on each node is an Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz, and each node is equipped with three Tesla V100-PCIE-32GB GPU computing cards. The test results show that the multi-level parallel slicing scheme effectively reduces slicing time, and the larger the model, the more pronounced the speed-up.

Claims (9)

1.一种3D打印模型的多层次并行切片方法,其特征在于步骤如下:1. a multi-level parallel slicing method of a 3D printing model is characterized in that the steps are as follows: 步骤1:执行计算节点级切片并行化,在集群内,以面片为基本单元在计算节点间做任务划分,根据划分结果分割原始模型文件得到多个子模型文件;Step 1: Execute the parallelization of computing node-level slices. In the cluster, use the patch as the basic unit to divide tasks among the computing nodes, and divide the original model file according to the division result to obtain multiple sub-model files; 步骤2:执行多GPU级切片并行化,在每个计算节点内,以面片为基本单元在多个GPU间做任务划分,根据划分结果将对应子模型文件分次读入内存并构造面片和顶点数组,将切片所需的面片和顶点数组从计算节点主存导入GPU内存;Step 2: Perform multi-GPU-level slice parallelization. In each computing node, use the patch as the basic unit to divide tasks among multiple GPUs, and read the corresponding sub-model files into the memory in stages according to the division results and construct the patch and vertex arrays, import the patches and vertex arrays required for slicing from the compute node main memory into GPU memory; 步骤3:执行线程块级切片并行化,将导入GPU内存的面片数组在各线程块之间划分为多个子面片数组;Step 3: Execute thread block-level slicing parallelization, and divide the patch array imported into GPU memory into multiple sub-patch arrays between thread blocks; 步骤4:执行线程级的切片并行化,将属于一个线程块的子面片数组在各线程之间划分,线程依次处理分配到的各面片,对每个面片求取与其相交的所有切平面,并计算切线段,线程将切线段互斥地插入对应层的切线段数组;Step 4: Execute thread-level slice parallelization, divide the sub-patch array belonging to a thread block among threads, the threads process the assigned patches in turn, and obtain all the slices that intersect with each patch. 
plane, and computes the tangent segment; each thread inserts its tangent segments into the tangent-segment array of the corresponding layer in a mutually exclusive manner;
Step 5: Export the tangent-segment arrays from each GPU's memory to main memory, merge the per-GPU computation results on the CPU side, and produce in each layer a sub-tangent-segment array corresponding to the current computing node;
Step 6: Divide tasks by layer among the computing nodes; according to the division result, collect and merge the sub-tangent-segment arrays layer by layer, obtaining one tangent-segment array for each layer assigned to the node;
Step 7: Each computing node processes in turn the per-layer tangent-segment arrays held by the current node, connecting the tangent segments in parallel to generate the slice contour lines within each layer.
2. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that the task division among computing nodes in step 1 means: taking the patch as the basic unit, the original model file is divided into contiguous equal parts, and the remainder of the division is assigned to the last computing node; the original model file is then split according to the division result into several sub-model files, one allocated to each computing node.
3. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that the task division among multiple GPUs in step 2 means: taking the patch as the basic unit, the sub-model file is divided into contiguous equal parts, and the remainder of the division is assigned to the last GPU.
4. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that reading the sub-model file and constructing arrays in step 2 means: according to the task-division result, the computing node reads its assigned sub-model file in several passes, the number of passes being equal to the number of GPUs contained in the computing node; after reading, the node's main memory holds the same number of groups of model data, each group containing one patch array and one vertex array.
5. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that dividing the patches among thread blocks in step 3 means: the patches are first distributed as contiguous equal parts, and the remainder of the division is then handed out to the thread blocks one at a time in turn, so that the workload is balanced and each thread block receives contiguous patches.
6. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that dividing the sub-patch array among threads in step 4 means: patches are assigned to the threads one at a time in turn, so that the workload is balanced and the memory cells read by adjacent threads are contiguous, enabling coalesced reads of GPU memory and thereby raising the memory-access bandwidth.
7. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that inserting tangent segments into the array in a mutually exclusive manner in step 4 means: when the threads participating in the parallel computation process their respective triangular patches, the computed tangent segments may belong to the same layer, so the data must be inserted into the same array mutually exclusively; the mutually exclusive insertion scheme sets an array index pointing to the next free position to be filled in the array, stipulates that inserting data requires an atomic add operation on the array index, and takes the value of the array index obtained at that moment as the insertion position.
8. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that merging the per-GPU computation results in step 5 means: the tangent-segment data are merged by layer on the CPU side, i.e. tangent segments belonging to the same layer are merged into the same array.
9. The multi-level parallel slicing method for a 3D printing model according to claim 1, characterized in that collecting and merging tangent-segment data among the computing nodes in step 6 specifically comprises the following steps:
1) taking the layer as the basic unit, the merge tasks are distributed among the computing nodes as contiguous equal parts, and the remainder of the division is assigned to the last computing node;
2) each computing node performs the data merge for its assigned layers, collecting the tangent-segment data of the same layer from all computing nodes and merging them into a single array; after the tangent-segment data have been merged, each computing node discards the data of layers that do not belong to it, freeing the memory space.
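The contiguous equal division used in claims 2, 3 and 9 — base-sized chunks with the remainder of the division handed to the last worker — can be sketched as follows; `partition_contiguous` is a hypothetical helper for illustration, not code from the patent:

```python
def partition_contiguous(n_items, n_workers):
    """Split n_items into n_workers contiguous [start, end) ranges.

    Each worker gets n_items // n_workers items; the remainder after
    equal division goes to the last worker, as in claims 2, 3 and 9.
    """
    base = n_items // n_workers
    ranges, start = [], 0
    for w in range(n_workers):
        # the last worker absorbs the remainder of the division
        size = base if w < n_workers - 1 else base + n_items % n_workers
        ranges.append((start, start + size))
        start += size
    return ranges

print(partition_contiguous(10, 3))  # → [(0, 3), (3, 6), (6, 10)]
```

With 10 patches and 3 nodes, nodes 0 and 1 receive 3 patches each and the last node receives 4, matching the "remainder to the last computing node" rule.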
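Claims 5 and 6 describe two different balancing schemes: thread blocks get contiguous chunks whose sizes differ by at most one patch (the remainder handed out one at a time), while threads within a block take patches round-robin so that adjacent threads read adjacent memory cells, enabling coalesced GPU reads. A hypothetical sketch of both, assuming patches are identified by index:

```python
def partition_balanced(n_items, n_blocks):
    """Claim 5: contiguous chunks whose sizes differ by at most one.

    The remainder of the equal division is handed out one item at a
    time to the first blocks, so every block still gets a contiguous
    run of patches.
    """
    base, rem = divmod(n_items, n_blocks)
    ranges, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < rem else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def cyclic_indices(n_items, n_threads, tid):
    """Claim 6: thread tid takes items tid, tid + n_threads, ...

    Adjacent threads touch adjacent elements in the same step, which
    is what makes coalesced reads of GPU memory possible.
    """
    return list(range(tid, n_items, n_threads))

print(partition_balanced(10, 3))   # → [(0, 4), (4, 7), (7, 10)]
print(cyclic_indices(8, 3, 0))     # → [0, 3, 6]
```

Note the contrast with the node-level scheme of claims 2 and 3: there the whole remainder goes to the last worker, whereas here it is spread one item at a time so block workloads differ by at most one patch.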
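The mutually exclusive insertion of claim 7 hinges on an atomic add whose returned value is the slot to write. A CPU-side Python sketch, using a lock where CUDA code would use `atomicAdd` on the array index (the class and names are illustrative, not from the patent):

```python
import threading

class AtomicAppendArray:
    """Sketch of claim 7: a shared index points at the next free slot;
    advancing it atomically and writing at the value obtained makes
    concurrent insertions into one layer's array safe."""

    def __init__(self, capacity):
        self.data = [None] * capacity
        self.next_free = 0
        self._lock = threading.Lock()

    def insert(self, segment):
        # On a GPU this critical section is a single
        # atomicAdd(&next_free, 1); its return value is the slot.
        with self._lock:
            slot = self.next_free
            self.next_free += 1
        self.data[slot] = segment
        return slot
```

Threads may interleave arbitrarily, but each atomic advance yields a distinct slot, so no two tangent segments ever overwrite each other.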
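The layer-wise merging of claims 8 and 9 — gathering tangent segments of the same layer from every GPU (or node) into one array — reduces to a dictionary merge. A hypothetical sketch, assuming each partial result maps a layer number to its list of segments:

```python
def merge_by_layer(partial_results):
    """Claims 8-9: merge per-GPU / per-node results so that all
    tangent segments belonging to the same layer end up in one array.

    partial_results: iterable of {layer: [segments]} dicts.
    """
    merged = {}
    for result in partial_results:
        for layer, segments in result.items():
            # extend the layer's array; create it on first sight
            merged.setdefault(layer, []).extend(segments)
    return merged

print(merge_by_layer([{0: ['a'], 1: ['b']}, {0: ['c'], 2: ['d']}]))
# → {0: ['a', 'c'], 1: ['b'], 2: ['d']}
```

In the claimed method this merge runs twice: once per node over its GPUs' results (claim 8), then across nodes for each node's assigned layers (claim 9), after which data for unassigned layers is discarded to free memory.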
CN201911355386.7A 2019-12-25 2019-12-25 Multi-level parallel slicing method for 3D printing model Active CN111186139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911355386.7A CN111186139B (en) 2019-12-25 2019-12-25 Multi-level parallel slicing method for 3D printing model

Publications (2)

Publication Number Publication Date
CN111186139A true CN111186139A (en) 2020-05-22
CN111186139B CN111186139B (en) 2022-03-15

Family

ID=70703318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911355386.7A Active CN111186139B (en) 2019-12-25 2019-12-25 Multi-level parallel slicing method for 3D printing model

Country Status (1)

Country Link
CN (1) CN111186139B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101819675A (en) * 2010-04-19 2010-09-01 浙江大学 Method for quickly constructing bounding volume hierarchy (BVH) based on GPU
CN101908087A (en) * 2010-07-16 2010-12-08 清华大学 Parallel Simulation Method of Integrated Circuit Power and Ground Network Based on GPU
EP2750018A2 (en) * 2012-12-27 2014-07-02 LSI Corporation Non-volatile memory program failure recovery via redundant arrays
CN104239133A (en) * 2014-09-26 2014-12-24 北京国双科技有限公司 Log processing method, device and server
CN105630409A (en) * 2014-11-25 2016-06-01 Sap欧洲公司 Dual data storage using an in-memory array and an on-disk page structure
CN106202145A (en) * 2016-06-17 2016-12-07 北京四维新世纪信息技术有限公司 A kind of preprocessing of remote sensing images system of Data-intensive computing
CN106686352A (en) * 2016-12-23 2017-05-17 北京大学 Real-time processing method of multi-channel video data on multi-GPU platform
CN106846236A (en) * 2016-12-26 2017-06-13 中国科学院计算技术研究所 A kind of expansible distributed GPU accelerating method and devices
CN107563955A (en) * 2017-09-12 2018-01-09 武汉锐思图科技有限公司 A kind of parallel map dicing method and system based on GPU
CN108555301A (en) * 2018-05-03 2018-09-21 温州职业技术学院 A kind of Paralleled formula 3 D-printing forming method of large-scale precision metal parts
CN108804077A (en) * 2017-04-28 2018-11-13 英特尔公司 For executing instruction and the logic of floating-point and integer operation for machine learning
CN109159425A (en) * 2018-08-21 2019-01-08 东莞中国科学院云计算产业技术创新与育成中心 Three-dimensional model slicing method and three-dimensional printing device
CN109857543A (en) * 2018-12-21 2019-06-07 中国地质大学(北京) A kind of streamline simulation accelerated method calculated based on the more GPU of multinode
US10379868B1 (en) * 2019-02-04 2019-08-13 Bell Integrator Inc. Optimization method with parallel computations
CN110349255A (en) * 2019-07-15 2019-10-18 万东百胜(苏州)医疗科技有限公司 A kind of organ ultrasonic modelling 3D printing method
US10606353B2 (en) * 2012-09-14 2020-03-31 Interaxon Inc. Systems and methods for collecting, analyzing, and sharing bio-signal and non-bio-signal data

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113297400A (en) * 2021-05-31 2021-08-24 西北工业大学 Metadata extraction method of 3D printing model
CN113297400B (en) * 2021-05-31 2024-04-30 西北工业大学 Metadata extraction method of 3D printing model
WO2023025269A1 (en) * 2021-08-27 2023-03-02 深圳市纵维立方科技有限公司 Slice processing method, printing method, system, and device, and storage medium
CN114311682A (en) * 2022-03-03 2022-04-12 深圳市创想三维科技股份有限公司 Model generation method, apparatus, device and storage medium
CN115972584A (en) * 2022-12-12 2023-04-18 西北工业大学 Parallel slicing method of additive manufacturing model based on CPU and GPU collaboration
CN115972584B (en) * 2022-12-12 2024-07-16 西北工业大学 Method for parallelizing slicing of additive manufacturing model based on cooperation of CPU and GPU

Also Published As

Publication number Publication date
CN111186139B (en) 2022-03-15

Similar Documents

Publication Publication Date Title
CN111186139B (en) Multi-level parallel slicing method for 3D printing model
KR20130016120A (en) System, method, and computer-readable recording medium for constructing an acceleration structure
Keuper et al. Distributed training of deep neural networks: Theoretical and practical limits of parallel scalability
CN109255828B (en) Method for performing intersection test to draw 3D scene image and ray tracing unit
CN102306396B (en) Three-dimensional entity model surface finite element mesh automatic generation method
US20090106530A1 (en) System, method, and computer program product for generating a ray tracing data structure utilizing a parallel processor architecture
CN110516316B (en) A GPU-accelerated method for solving Euler equations by discontinuous Galerkin method
Schlag et al. Scalable edge partitioning
TWI857493B (en) Computer-implemented method, system and non-transitory computer-readable storage medium for neural network computations
CN110188462A (en) LBM algorithm optimization method based on martial prowess framework
CN110543663A (en) A Coarse-grained MPI+OpenMP Mixed Parallel Method for Structural Mesh Region Division
CN103631981A (en) Designing a modeled volume represented by dexels
CN111080653A (en) Method for simplifying multi-view point cloud by using region segmentation and grouping random simplification method
CN102393826A (en) Multi-core parallel processing based flexible scene continuous collision detection method
CN106484532B (en) GPGPU parallel calculating method towards SPH fluid simulation
CN102339479A (en) Stretch-driven mesh parameterization method using spectral analysis
CN115344383A (en) A Parallel Acceleration Method for Streamline Visualization Based on Process Parallelism
CN114861970A (en) Full-source shortest path division and solution method of large-scale graph under limited resources
CN115968467A (en) Memory constrained scheduling
CN118296291A (en) Quick solving method and device for GPU sparse matrix vector multiplication
CN109712181B (en) Method for extracting open-circuit key area on integrated circuit layout line network
CN118132230A (en) Data splitting processing method and processor for neuromorphic chip under many-core architecture
CN114722571B (en) A CPU-GPU collaborative parallel scan line filling method for additive manufacturing
CN117786912A (en) Parallel constrained tetrahedral mesh optimization method based on lock-free atomic operations
CN110083446A (en) A kind of GPU parallel with remote sensing image real-time processing method and system under zero I/O mode

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant