CN100407798C - 3D geometric modeling system and method - Google Patents
3D geometric modeling system and method
- Publication number
- CN100407798C (application numbers CN2005100122739A / CN200510012273A)
- Authority
- CN
- China
- Prior art keywords
- model
- unit
- geometric
- video
- visual analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a three-dimensional geometric modeling system, comprising: several video input devices for capturing video streams of a designer's design actions; multiple single-video-stream visual analysis units for detecting moving and non-moving regions in each video stream, estimating the direction and speed of object motion and predicting the next position, and computing the edge contour of the moving object and estimating contour features; a multi-video-stream visual analysis unit for performing binocular stereo matching, three-dimensional reconstruction, object-trajectory fitting, and moving-object cross-section computation; a real-time interactive semantic recognition unit for processing the output of the multi-video-stream visual analysis unit to obtain human-computer interaction semantics; a three-dimensional geometric modeling unit for obtaining the three-dimensional geometric design shape; a three-dimensional model rendering unit for rendering the geometric model on the video output device; and a video output device for displaying the geometric shape of the object and the three-dimensional geometric form created by the designer.
Description
Technical Field
The present invention relates to a three-dimensional geometric modeling system and method, and in particular to a real-time interactive three-dimensional geometric modeling system and method based on computer stereo vision and applied to computer-aided conceptual design.
Background Art
At present, in the field of industrial product design, the computer-aided design (CAD) technology used in the later, detailed design stage is quite mature. In the automobile industry, for example, the entire vehicle styling pipeline is almost completely computer-aided. The conceptual design of a product, however, is still carried out through hand-drawn sketches. Designers sketch by hand according to their creative intent; the customer selects, from the dozens of sketches drawn, the styles that meet the requirements and returns them to the designers together with individualized requests; the designers then refine the sketches accordingly. After many rounds of submission and feedback, the conceptual design of a new vehicle model is finally settled. Obviously, flat hand-drawn sketches lack visual immediacy, are inconvenient for collaborative design across different sites, and cannot provide direct data for the accurate modeling of the subsequent detailed design. Looking at the development of the automobile industry at home and abroad, the automobile market is approaching saturation, competition is increasingly fierce, and development cycles have shortened to about one year. These new challenges push automobile manufacturers to seek faster and more effective design methods to accelerate product development. Today, as the degree of informatization keeps rising, designers hope to express their design intent freely through automated conceptual design tools and to connect automatically with other stages of the process, so as to fully informatize the automobile design workflow.
It is therefore of great practical value to the automobile industry to study computer-aided conceptual design geometric modeling methods and devices that input a person's geometric design concept into a computer system through natural human-computer interaction and build an intuitive three-dimensional geometric form.
Such a natural, real-time interactive three-dimensional geometric design system involves three technical fields: three-dimensional geometric modeling technology, natural real-time human-computer interaction technology, and the visual computing technology that supports natural real-time interactive geometric design.
In three-dimensional geometric modeling technology, a surface shape can be designed and expressed either by the common method of specifying control vertices, such as the NURBS method, or by generating it through the motion of a surface (or a three-dimensional solid, or a curve). The motion-generation approach has a wide range of applications: for example, the design and expression of slender strip-like object surfaces; an aircraft shape can likewise be decomposed into a union of strip-shaped surfaces (solids). Being intuitive and simple, motion generation simplifies much surface modeling work and is therefore popular with designers. This surface form, called a swept surface, meets efficiency and quality requirements on many occasions better than other modeling methods, instead of being limited to the static modeling style of pushing and pulling control vertices.
Research on the three-dimensional geometric modeling of surfaces generated by real-time interactive motion involves human-computer interaction theory and its implementation. The human-computer interaction model, which establishes the person's control over the moving object for the purpose of modeling, is one of the key factors constraining applications; better human-computer interaction technology makes computers easier to use and raises productivity. The HMGR project (Hand Motion Gesture Recognition System) of the Human Interface Technology Laboratory at the University of Washington studies gesture recognition using hidden Markov models. With this system, interactive user-interface designers can build multi-channel input systems that convert hand motion in three-dimensional space into gesture symbols, allowing it to be combined with other forms of input such as speech and static sign language. The device, however, relies on data-glove sensors and is limited to simple applications.
The heuristic three-dimensional drawing interface developed at Brown University in the United States aims to improve the usability of gesture interfaces and of command-based modeling systems. In this system, the user conveys cues about the intended operation by highlighting the relevant geometric components in the scene. The system infers possible user operations from these cues and presents them as thumbnails, and the user completes the editing operation by clicking a thumbnail. The operation-cue mechanism lets the user specify geometric relationships among graphical components in the scene; when candidate operations are hard to distinguish, multiple thumbnail prompts can resolve the ambiguity.
Japanese patent application No. 00118340 provides a device, a recognition method, and a program recording medium for hand-shape and hand-gesture recognition on images of complex hand shapes.
The Nara Institute of Science and Technology in Japan built a hybrid three-dimensional object modeling system, NIME (Immersive Modeling Environment). Inheriting the advantages of the traditional two-dimensional GUI and of three-dimensional immersive modeling environments, the system combines the 2D and 3D modeling environments into a whole by means of an inclined rear-projection display. On the display surface, modeling interaction uses a two-dimensional GUI; at the same time, field-sequential stereoscopic imaging and a pen-shaped input device with six degrees of freedom enable seamless switching between the 2D and 3D modeling environments.
The three-dimensional geometric modeling of surfaces generated by real-time interactive motion also involves visual computing theory and technology. The research goal of visual computing is to give computers the ability to perceive three-dimensional environmental information from two-dimensional images, that is, to reconstruct three-dimensional object shape and to recover object motion and geometric position in space; it is an important theoretical tool for capturing and studying the sweeping motion of real-world objects and for building geometric models of motion envelopes. Over the past decade, real-time dense stereo disparity matching has become a reality, but until recently systems capable of true real-time processing required dedicated hardware such as digital signal processors (DSPs) or field-programmable gate arrays (FPGAs). For example, the stereo matching system of J. Woodfill and Von Herzen used 16 Xilinx 4025 FPGAs to process 320*240-pixel images at 42 frames per second, while P. Corke and Dunn, using a similar algorithm implemented on FPGA hardware, processed 256*256-pixel images at 30 frames per second.
The Chinese patent with application No. 03153504 uses phase and stereo vision techniques to project a grating onto an object's surface and thereby measure the object's three-dimensional surface profile.
As for hand-drawn sketch recognition tools for conceptual design, the sketching tool of the Computer Graphics Research Group at Brown University combines characteristics of paper-and-pencil sketching with those of computer CAD systems to provide rough 3D polyhedron modeling based on gesture interaction. The tool adopts the early, traditional 2D interface concept and, through hand-drawn sketching, gives users the ability to sketch various three-dimensional primitives according to simple placement rules.
The University of Tokyo designed an interactive free-form surface design tool based on hand-drawn sketches, aiming at a simple and fast free-form modeling system for objects such as rotund little animals. The user interactively draws two-dimensional strokes on the screen, and a three-dimensional polygonal surface is constructed from the two-dimensional silhouette.
The ARCADE research project of the Fraunhofer Institute for Computer Graphics in Germany uses a virtual design desk to study free-form surface modeling. Standing in front of the virtual design desk, the user models free-form surfaces with data-glove gestures. The ARCADE system achieves effective and precise modeling with 3D input devices; its interaction techniques include free-space object creation and creation based on other objects, implicit Boolean operations, 3D picking, rapid-movement modification based on layout context, discrete operations, two-handed input, and more.
The patent with application No. 00103458 discloses a device with which, in the course of word processing, the user writes text and edits documents with a pen and editing gestures.
As the systems above show, the various technologies involved in naturally interactive three-dimensional geometric modeling tools have been studied extensively, driven by the application demands of product conceptual design. Nevertheless, these systems still leave many problems to be solved. In summary, the main shortcomings are the following:
Human-computer interaction is not natural enough. Whether a virtual design desk, data gloves, a three-dimensional mouse, tactile sensors, or an online sketch recognition device, the user must wear or directly touch the three-dimensional input tool. Such complex tools and direct physical coupling between person and machine inconvenience the user, and unnatural design tools hinder the instant capture of a designer's creative inspiration. One of the most direct pieces of evidence for this is that hand-drawn sketching, despite the repeated trial and error it requires, remains the dominant tool in actual design practice.
Geometric modeling interaction is not direct enough. Hand-drawn sketching requires converting a concept into a two-dimensional model and then, through a recognition tool, into a three-dimensional model. Online sketching systems tie these two processes together through the device, which actually tightens the constraints on the designer. Data-glove-based systems operate on models in a virtual scene through hand movements, so the designer creates and modifies models only indirectly: to change the shape of a model, the designer must manipulate control points in order to feed back the conceived conceptual model. Design intent is often constrained and interrupted by design tools such as data gloves, making the design process discontinuous and impairing the design result.
The scope and manner of application are not flexible enough. Sketching systems, pen input, data gloves, force-feedback devices, and virtual design desks each support only a single form of external input. For example, during design a designer may want a simple way to obtain a silhouette of some concrete physical object and build a geometric model from that outline. Existing systems rely on dedicated equipment, and bringing users directly into the conceptual design process remains difficult.
The implementing technology itself is imperfect. Hand-drawn sketches must be converted from two-dimensional to three-dimensional models, with an attendant recognition error rate. Because sketches are arbitrary and highly unconstrained, sample collection is very difficult; above all, sketch semantics are ambiguous and uncertain, so recognition targets can neither be enumerated exhaustively by template definitions nor supported by a predefined dictionary for semantic interpretation. Online sketching improves recognition accuracy, but in practice the geometric design interaction is frequently interrupted by system interaction, so efficiency is low. Data gloves and the like also suffer from sensor problems such as limited spatial range and positioning resolution.
The systems are not general-purpose enough. Because they use dedicated equipment, the systems above are expensive, raising product design, training, and usage costs. This undoubtedly raises the entry threshold for users and limits the range of applications; the popularization of the computer is, after all, attributable to falling costs and growing versatility.
Summary of the Invention
The present invention is proposed to overcome the problems of existing systems. Its purpose is to use general-purpose, convenient, and economical physical devices and a natural, real-time interaction method to complete the creation, modification, and editing of three-dimensional geometric shapes, applying motion-based geometric modeling according to the surface shapes and spatial motion trajectories of the hands or/and hand-held objects in three-dimensional space, thereby achieving geometric modeling for the conceptual design of three-dimensional product shapes.
To achieve this purpose, the present invention provides a three-dimensional geometric modeling method, comprising:
1. A video input step for capturing video streams from multiple video input devices (101) distributed around the designer.
2. A single-video-stream visual analysis step, in which each of the video streams captured in the above step passes through its own single-video-stream visual analysis process, to detect moving and non-moving regions in the video stream, estimate the direction and speed of the moving object and predict its next position, and compute the edge contour of the moving object and estimate contour features.
3. A multi-video-stream visual analysis step for receiving the processing results of the above single-video-stream visual analyses, performing binocular stereo matching, carrying out three-dimensional reconstruction, object-trajectory fitting, and moving-object cross-section computation based on the obtained contours and features, and supplying the resulting object model, object motion trajectory, and object cross-sectional contour to the real-time interactive semantic recognition step.
4. A real-time interactive semantic recognition step for processing the output of the multi-video-stream visual analysis step to obtain human-computer interaction semantics, and interpreting the obtained semantics with semantic definitions pre-stored in a semantic model storage unit.
5. A three-dimensional geometric modeling step for processing the object model, object motion trajectory, and object cross-sectional contour output by the multi-video-stream visual analysis step together with the output of the real-time interactive semantic recognition step, so as to obtain the three-dimensional geometric design shape, and storing the result in a three-dimensional geometric model storage unit.
6. A three-dimensional model rendering step for rendering the three-dimensional geometric model stored in real time in the three-dimensional geometric model storage unit on the video output device.
7. A video output step for displaying the designer's three-dimensional geometric creation on the video output device.
In addition, the present invention provides a three-dimensional geometric modeling system, comprising:
1. Multiple video input devices, distributed around the designer, for capturing video streams of the designer's design actions.
2. A single-video-stream visual analysis unit corresponding to each video input device, so that each video stream captured by the video input devices is processed by its own single-video-stream visual analysis unit, to detect moving and non-moving regions in the video stream, estimate the direction and speed of the moving object and predict its next position, and compute the edge contour of the moving object and estimate contour features.
3. A multi-video-stream visual analysis unit for receiving the processing results of the above single-video-stream visual analysis units, performing binocular stereo matching, carrying out three-dimensional reconstruction and object-trajectory fitting based on the obtained contours and features, computing moving-object cross-sections, and supplying the resulting object model, object motion trajectory, and object cross-sectional contour to the real-time interactive semantic recognition unit.
4. A real-time interactive semantic recognition unit for processing the output of the multi-video-stream visual analysis unit to obtain human-computer interaction semantics, and interpreting the obtained semantics with semantic definitions pre-stored in a semantic model storage unit.
5. A three-dimensional geometric modeling unit for comprehensively processing the output of the multi-video-stream visual analysis unit and the output of the real-time interactive semantic recognition unit, so as to obtain the three-dimensional geometric design shape, and storing the result in a three-dimensional geometric model storage unit.
6. A three-dimensional model rendering unit for taking the three-dimensional geometric model stored in real time in the three-dimensional geometric model storage unit as input and rendering the geometric model on the video output device.
7. A video output device for displaying the designer's three-dimensional geometric creation.
According to another aspect of the present invention, a single-video-stream visual analysis processing method is provided, comprising: an image analysis method that processes the video signal captured by the video input device to obtain feature video streams at different resolution scales and with different feature elements, providing input for motion detection and stereo matching; a real-time motion detection method that detects the moving regions and the non-moving background regions in the video stream; a motion estimation and prediction method that estimates the direction and speed of motion and predicts the next motion position; and a contour computation method for computing the edge contour of the moving object and estimating contour features.
According to yet another aspect of the present invention, a multi-video-stream visual analysis processing method is provided, comprising: a stereo matching algorithm that subjects the data output by motion detection on two video streams to stereo matching and disparity computation, yielding the moving-object region segmentation and the moving object's depth information; a three-dimensional model building method that builds a three-dimensional model from the depth data output by stereo matching; a shape-from-silhouette method that obtains the three-dimensional shape of the moving object from the contour description data produced by contour computation, the already established three-dimensional model of the moving body and its principal projection features, and a shape-from-silhouette algorithm; a cross-section computation method that builds, from the contour data and the motion trajectory, the cross-sectional contour relative to the motion trajectory; and a trajectory fitting method that smooths the data obtained from motion estimation and prediction to yield a smooth motion trajectory.
According to a fifth aspect of the present invention, a real-time interactive semantic recognition method is provided, comprising: a collision detection method that determines from the moving object's trajectory and outline that the semantic type is operational, then performs collision detection against the already built three-dimensional geometric model to determine where and how the collision occurs; an operation semantic analysis method that determines, from the collision detection result, operation semantics such as the operand and the operation type; an interaction semantic analysis method that processes the input obtained from trajectory determination and cross-sectional contour extraction to yield the parsed interaction semantics; and a speech semantic analysis method that obtains the semantic interpretation of interactive speech from speech analysis.
According to a sixth aspect of the present invention, a three-dimensional geometric modeling method is provided, comprising: a repetition processing method for eliminating the repeated, overlapping motions of a moving object in the video images; a motion processing method for eliminating the trembling and jitter of the moving object during its motion; an envelope computation method for computing, from the de-jittered and de-duplicated motion trajectory and cross-sectional contour, the envelope surface generated by the object's motion; and a shape editing method for modifying the established three-dimensional geometric model.
According to a seventh aspect of the present invention, a video output device is provided, comprising a display device for displaying the geometric model, spatial position, and attitude of the moving object, and for displaying the already built three-dimensional geometric form.
According to an eighth aspect of the present invention, a rendering method is provided that renders the three-dimensional geometric design model on the display device and renders the moving object's geometric model, relative position, and attitude on the display device.
Using this system or method for product-shape conceptual design yields beneficial effects in the following respects.
1. By using inexpensive, non-dedicated physical devices, it lowers product design costs and broadens the range of applications.
2. The natural, real-time interaction makes it easy to bring ordinary users into an open conceptual design loop, helping to eliminate passivity and one-sidedness in product-shape geometric design and to feed diverse factors directly into product development, design, and manufacturing.
3. A design environment based on vision and general-purpose equipment makes it easy to bring in environmental characteristics. Environmental characteristics are an important factor affecting product design, comprising physical characteristics of the environment at the micro level and socio-cultural characteristics at the macro level. The device of the invention allows designers to leave the design studio and do conceptual design in the product's usage environment, which is one way to incorporate environmental characteristics well.
4. It supports three-dimensional visual design. The central problem of conceptual design is modeling the designed product, with modeling approaches ranging from formal definitions to high-level visual representations. Among the model representations in common use, namely language models, geometric models, graphic models, object models, knowledge models, and image models, the image model is the closest to human thinking and reasoning. The invention is an important way to bring the visual thinking model into design practice.
5. It delivers a direct three-dimensional digital product. The invention provides a direct three-dimensional digital product as system output, enabling fast design evaluation feedback.
Brief Description of the Drawings
Fig. 1 is a block diagram of the structure of a three-dimensional geometric modeling system 100 according to the first embodiment of the present invention;
Fig. 2 is a schematic diagram of the specific camera layout according to the first embodiment of the present invention;
Fig. 3 is a block diagram of the structure of the single-/multi-video-stream visual analysis units according to the first embodiment of the present invention;
Fig. 4 is a flowchart of the operation of the multi-video-stream visual analysis unit according to the first embodiment of the present invention;
Fig. 5 shows the coordinate system of the stereo recovery operation of the present invention;
Fig. 6 is a flow diagram of semantic recognition according to the first embodiment of the present invention;
Fig. 7 is a flowchart of three-dimensional geometric modeling according to the first embodiment of the present invention;
Fig. 8 is a schematic diagram of a computer system according to the first embodiment of the present invention;
Fig. 9 is a structural block diagram of a three-dimensional geometric modeling system 200 according to a second embodiment of the present invention;
Fig. 10 is a flow diagram of semantic recognition according to the second embodiment of the present invention;
Fig. 11 is a structural block diagram of a three-dimensional geometric modeling system 300 according to a third embodiment of the present invention;
Fig. 12 shows the specific camera layout according to the third embodiment of the present invention;
Fig. 13 is a flowchart of the operation of the multi-video-stream visual analysis unit according to the third embodiment of the present invention;
Fig. 14 is a schematic diagram of a computer system according to the third embodiment of the present invention;
Fig. 15 shows examples of interactive gesture patterns according to the first embodiment of the present invention.
Detailed Description of the Embodiments
First Embodiment
Fig. 1 is a block diagram of the structure of a three-dimensional geometric modeling system 100 according to the first embodiment of the present invention. As shown in Fig. 1, the video input devices 101 may be digital cameras used to capture images of the geometric design and modeling actions of a three-dimensional shape designer. In this embodiment, the video input devices 101 consist of four digital cameras C01, C02, C03, and C04. Their specific layout in one embodiment of the invention is shown in Fig. 2: they are placed at four positions, to the right, right front, left front, and left of the concept designer, at heights above the floor and attitudes suitable for the designer's gestural expression; that is, the designer's design actions should be fully recorded by the cameras without interfering with the designer's operations or other activities. One single-video-stream visual analysis unit 104 is provided for each digital camera. Each camera is connected directly, through a general-purpose interface and in a well-known manner, to its corresponding single-video-stream visual analysis unit 104. Each unit 104 processes the continuous video stream captured by its corresponding video input device 101: it detects the moving and non-moving regions in the stream, estimates the direction and speed of object motion and predicts the next position, computes the object's edge contour and estimates contour features, and supplies the results to the multi-video-stream visual analysis unit 105. The multi-video-stream visual analysis unit 105 receives the outputs of the four single-video-stream visual analysis units 104, including the motion-region detection results, the moving object's direction and speed, and the object's contour and contour features. It then processes these inputs: it performs binocular stereo matching, carries out three-dimensional reconstruction and object-trajectory fitting based on the contours and features, and computes moving-object cross-sections. The outputs of the processing by the multi-video-stream visual analysis unit 105 are the object model, the object's motion trajectory, and the object's cross-sectional contour.
The processing results of the multi-video-stream visual analysis unit 105 are provided as input to the real-time interactive semantic recognition unit 106, which processes them to obtain human-computer interaction semantics and interprets these semantics with the semantic definitions pre-stored in the semantic model storage unit 110. The three-dimensional geometric modeling unit 107 comprehensively processes the output of the multi-video-stream visual analysis unit 105 and the output of the real-time interactive semantic recognition unit 106 to obtain the three-dimensional geometric design shape; its results are stored in the three-dimensional geometric model storage unit 111. Based on the three-dimensional geometric model stored in real time in unit 111, the three-dimensional model rendering unit 108 renders the geometric model on the video output device 109, which displays the geometric shape of the object as well as the three-dimensional geometric form created by the designer.
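To make the data flow above concrete, the following minimal Python sketch strings the units of Fig. 1 together as a per-frame loop. Every class and method name here is an illustrative assumption; the patent specifies the units and their connections, not an implementation.

```python
# A per-frame composition of the units in Fig. 1 (all names hypothetical).
class ModelingPipeline:
    def __init__(self, cameras, single_units, multi_unit,
                 semantic_unit, modeling_unit, renderer):
        self.cameras = cameras              # video input devices 101
        self.single_units = single_units    # one unit 104 per camera
        self.multi_unit = multi_unit        # multi-stream analysis unit 105
        self.semantic_unit = semantic_unit  # semantic recognition unit 106
        self.modeling_unit = modeling_unit  # geometric modeling unit 107
        self.renderer = renderer            # rendering unit 108 -> device 109

    def step(self):
        frames = [camera.capture() for camera in self.cameras]
        # Per-stream analysis: motion regions, motion vectors, contours.
        per_stream = [unit.analyze(frame)
                      for unit, frame in zip(self.single_units, frames)]
        # Cross-stream analysis: stereo matching, 3D reconstruction,
        # trajectory fitting, cross-section computation.
        fused = self.multi_unit.fuse(per_stream)
        # Interaction semantics, then model update and display.
        semantics = self.semantic_unit.recognize(fused)
        model = self.modeling_unit.update(fused, semantics)
        self.renderer.draw(model)
```

Each call in `step` corresponds to one arrow in the block diagram of Fig. 1.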
The operation and structure of the three-dimensional geometric modeling system of this embodiment are described in detail below with reference to the drawings. The system comprises several basic working states: system initialization, camera calibration, object model establishment, and motion-based geometric shape design.
<System Initialization>
The initialization working state of the system is described first. When the three-dimensional modeling designer starts the three-dimensional geometric modeling system 100, the system initialization process begins. System initialization includes loading and establishing the initial system parameters and building the statistical model of the background images.
According to the predetermined settings and the current system configuration, the initialization process first loads the system's initial working-environment parameters; in this example they are shown in Table 1.
Table 1. Initialization parameters of the embodiment:
- System working parameters
- User table
- Model table
- Camera parameter table
- Camera table
- Parameter table of the i-th camera
- Camera layout parameter table
- Layout parameter table of the i-th camera
After the initialization parameters are loaded, the video input devices 101 (C01, C02, C03, and C04) capture a succession of images of the operating background and provide them to the single-video-stream visual analysis unit 104 corresponding to each device, which processes these images to build a statistical model of the initial background. In this embodiment, as shown in Fig. 3, the single-video-stream visual analysis unit 104 further includes an image analysis unit 1041 for processing the video signal captured by the video input device 101 to obtain the statistical model of the background, i.e., feature video streams at different resolution scales and with different feature elements.
In one specific embodiment of the present invention, the image analysis unit 1041 builds the background statistical model as follows. For the background Bi corresponding to each camera Ci in the system, an initial background model Mi is established. For each pixel p in Bi, define μp as the expectation of the color value at that point and σp^2 as the variance of the color-value distribution; over T background frames,
μp = (1/T) Σ(t=1..T) hp^t  (1)
σp^2 = (1/T) Σ(t=1..T) (hp^t − μp)^2  (2)
where hp^t is the color value of point p in the t-th frame image. In this way, the pair (μp, σp^2) at every point p constitutes the background model of Bi:
Mi = { (μp, σp^2) | p ∈ Bi }  (3)
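A minimal sketch of this per-pixel mean/variance background model follows, assuming the T background frames are stacked in a NumPy array; the array layout and function name are illustrative, not from the patent.

```python
import numpy as np

def build_background_model(frames):
    """Per-pixel Gaussian background model from T background frames.

    frames: array of shape (T, H, W) or (T, H, W, 3) with color values hp^t.
    Returns (mu, sigma2), the per-pixel mean and variance of eqs. (1)-(2).
    """
    stack = np.asarray(frames, dtype=np.float64)
    mu = stack.mean(axis=0)       # mu_p = (1/T) * sum_t hp^t
    sigma2 = stack.var(axis=0)    # sigma_p^2 = (1/T) * sum_t (hp^t - mu_p)^2
    return mu, sigma2
```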
In addition, according to predetermined settings, the system generates initial object models with simple, regular geometric surfaces, e.g., cuboids and spheres. By changing the system settings, one can choose which predefined objects to generate, or generate no initial object at all.
<Camera Calibration>
When the system is used for the first time, or when the layout, position, or attitude of the cameras changes, or when a camera is replaced, camera calibration is required; that is, a calibration procedure that establishes the camera parameters is started. In this working state, each camera in the system acquires images, and the internal and external parameters of each camera are computed according to camera calibration methods well known to those skilled in the art. If the system is not being used for the first time, the camera layout, positions, and attitudes are unchanged, and no camera has been replaced, calibration is unnecessary.
<Object Model Establishment>
After the camera calibration procedure, the designer can use the system of this embodiment to perform three-dimensional geometric modeling with a hand or a hand-held object at a suitable position in front of the camera group. The geometric model built from the shape of the hand or the geometry of the held object is called the three-dimensional object model, or simply the object model. The object model is dynamic: it changes instantly with the spatial position, attitude, and shape of the hand and the held object. The present invention uses the object model as a design tool for three-dimensional geometric shape design; at the same time, the model of a simple hand-held object can itself serve as the initial model of a design.
<Motion-Based Geometric Shape Design>
After passing through initialization, camera calibration, and object model establishment, the system enters the motion-based geometric shape design working state. Using a hand or a hand-held object at a suitable position in front of the camera group, the designer inputs a three-dimensional geometric design concept into the three-dimensional geometric modeling system 100 through the shape and motion of the hand or held object, obtaining the three-dimensional geometric modeling result. The established three-dimensional geometric model is stored in real time in the three-dimensional geometric model storage unit 111 in a prescribed file format and output in real time to a video output device 109 such as a CRT or LCD display. It may, for example, be stored on a storage medium such as disk storage, or transmitted and stored via computer network equipment or removable storage devices. In one specific embodiment, a three-dimensional geometric model can be stored in the format shown in Table 2.
Table 2. Data structure of the three-dimensional geometric model
- Model table
- Object list
- Point data structure
- Edge data structure
- Face data structure
As shown in Table 2, the three-dimensional geometric model data structure includes data items such as: model code, model identifier, model type code, model type identifier, model attribute table, model parameter table, version number, object count, and object list pointer. The model identifier uniquely identifies the model, and the model type code indicates the model's type. In this embodiment, three-dimensional geometric models generated as design results and object models generated as design tools are stored in the same data structure, so the model type distinguishes the two. For design convenience, the system also classifies the predefined object models and assigns each a unique model type identifier. The model attribute table defines the attributes of the geometric model, such as scale and position. The version number indicates the version of the geometric model data structure. A model may consist of several objects; the object list describes each object's properties, including object type, object identifier, parent object pointer, child object pointer, geometric data storage structure, and so on.
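The following sketch renders Table 2 as Python dataclasses. The field names paraphrase the data items listed above and are assumptions; the patent prescribes only the items, not their types or encoding.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelObject:
    """One entry of the object list (field names are paraphrases)."""
    object_type: int
    object_id: int
    parent: Optional["ModelObject"] = None
    children: List["ModelObject"] = field(default_factory=list)
    geometry: dict = field(default_factory=dict)   # point/edge/face structures

@dataclass
class GeometricModel:
    """Model table of Table 2 (field names are paraphrases)."""
    model_code: int
    model_id: str
    model_type_code: int    # distinguishes design models from object models
    model_type_id: str
    attributes: dict        # e.g. scale attribute, position attribute
    parameters: dict
    version: str
    objects: List[ModelObject] = field(default_factory=list)
```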
The designer can express the geometric design concept of a conceptual design to the computer in several ways.
The first way is to carry out three-dimensional geometric design directly with the hands; specifically, the designer expresses the three-dimensional geometric design concept through the motion envelope of the hands and through gestures.
The second way is to design three-dimensional geometric shapes with a hand-held object. Specifically: (1) the designer expresses the intended shape purely through the shape of the held object; for example, to design a sphere, the designer can place a ball in front of the cameras, the system automatically builds the ball's three-dimensional geometry as the object model, and the designer then issues a command to copy the object model out as the three-dimensional geometric model; (2) the designer expresses a motion-envelope model jointly through the held object's shape and its motion; for example, the designer holds a ball in front of the cameras, the system automatically builds the ball's three-dimensional geometry as the object model, the designer then moves the ball along an arc in space, and from the object model and the ball's trajectory the system builds the three-dimensional design model formed by the ball's motion envelope, namely an arc-shaped tube in space, and outputs it (a sketch of such an envelope computation follows below).
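A minimal sketch of the envelope in example (2), under the simplifying assumptions that the trajectory is a planar circular arc and that the swept cross-section is the ball's great circle kept normal to the trajectory (so the envelope is a torus-segment tube); all names and parameters are illustrative.

```python
import numpy as np

def sweep_ball_along_arc(ball_radius, arc_radius, arc_angle,
                         n_path=64, n_section=32):
    """Envelope of a ball swept along a planar circular arc: a tube whose
    cross-section is the ball's great circle, kept normal to the trajectory.

    Returns vertices of shape (n_path, n_section, 3).
    """
    t = np.linspace(0.0, arc_angle, n_path)            # arc parameter
    centers = np.stack([arc_radius * np.cos(t),
                        arc_radius * np.sin(t),
                        np.zeros_like(t)], axis=-1)    # trajectory samples
    phi = np.linspace(0.0, 2.0 * np.pi, n_section, endpoint=False)
    verts = np.empty((n_path, n_section, 3))
    up = np.array([0.0, 0.0, 1.0])
    for i, ti in enumerate(t):
        radial = np.array([np.cos(ti), np.sin(ti), 0.0])  # section plane axes
        verts[i] = (centers[i]
                    + ball_radius * np.outer(np.cos(phi), radial)
                    + ball_radius * np.outer(np.sin(phi), up))
    return verts
```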
The third way is to edit and modify an established three-dimensional geometric model. Using the hands or/and a held object, and following the predetermined interaction semantic model, the designer applies three-dimensional geometric edits such as stretching and twisting to an existing model to generate a new one.
Of course, as those skilled in the art will readily understand, the designer can combine the three ways above to express a design concept.
Whichever of the above ways, or combination of ways, is used, the three-dimensional geometric modeling system of this embodiment subjects the video data obtained from the multiple cameras to the following processing:
● Single-video-stream visual analysis
In this stage, each single-video-stream visual analysis unit 104 processes the continuous video stream captured by its corresponding video input device 101: it detects the moving and non-moving regions in the stream, estimates the direction and speed of object motion and predicts the next position, computes the object's edge contour and estimates contour features, and provides the results to the multi-video-stream visual analysis unit 105. Specifically, this involves the following operations:
I. Image Analysis
As shown in Fig. 3, each single-video-stream visual analysis unit 104 includes an image analysis unit 1041. For the input of each video input device 101, the image analysis unit 1041 proceeds as follows: it first acquires each image frame from the video input device 101, then builds a layered structure according to image resolution and outputs the layered image sequence.
In one embodiment of the present invention, the image analysis unit 1041 builds the layered image data as follows: using a three-level pyramid structure, it builds from the original image M_L a sequence of images {M_L, M_(L-1), M_(L-2)}, where M_(i-1) is the image obtained by halving the resolution of M_i. M_L is called the pyramid bottom, or high-resolution layer; M_(L-1) the pyramid middle, or medium-resolution layer; and M_(L-2) the pyramid top, or low-resolution layer. The image pyramid data structure is shown in Table 3.
Table 3  Image data buffer queue description
Frame data structure
During processing, the time-series image data are stored sequentially in the image-processing buffer queue, whose length is 7 image frames. The queue is implemented as a static circular list.
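The pyramid construction and the fixed-length frame buffer just described can be illustrated with a short sketch. The following Python fragment is illustrative only: the three levels, the halving of resolution, and the queue length of 7 follow the text, but the function names, the 2x2 block-average downsampling, and the use of NumPy and a deque are assumptions, not the patent's implementation.

import numpy as np
from collections import deque

def build_pyramid(image, levels=3):
    """Three-level pyramid {M_L, M_L-1, M_L-2}: each level halves resolution.
    Halving is done by a 2x2 block average (an assumption; the text does not
    specify the downsampling filter)."""
    pyramid = [image.astype(np.float32)]
    for _ in range(levels - 1):
        m = pyramid[-1]
        h, w = (m.shape[0] // 2) * 2, (m.shape[1] // 2) * 2
        m = m[:h, :w]
        half = (m[0::2, 0::2] + m[1::2, 0::2] +
                m[0::2, 1::2] + m[1::2, 1::2]) / 4.0
        pyramid.append(half)
    return pyramid  # [high-res M_L, mid-res M_L-1, low-res M_L-2]

# Fixed-length buffer of 7 frames; a deque emulates the static circular list.
frame_queue = deque(maxlen=7)

def on_new_frame(frame):
    frame_queue.append(build_pyramid(frame))  # oldest entry drops out automatically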
II. Real-time motion detection
As shown in FIG. 3, each single-video-stream visual analysis unit 104 also includes a motion detection unit 1042, which performs real-time motion detection on the output of its image analysis unit 1041. The goal of motion detection is to find the moving regions in the image and their motion directions, so as to obtain an accurate segmentation of the moving regions and their edges. The unit detects moving regions on the middle layer of the multi-resolution pyramid with an image-differencing algorithm and obtains the motion direction with an optical-flow method. In this embodiment, the motion detection unit 1042 applies the following operations to the layered data produced by the image analysis unit 1041:
a) Background removal
For the image I_i^t captured by each camera C_i at time t, the foreground region is extracted as follows:
Let h_p be the color value of point p in image I_i^t; the image is binarized by thresholding h_p against the background model, giving each point p a binary value d_p (the binarization formula appears in the source only as an image and is not reproduced here).
All points p of image I_i^t with d_p equal to zero constitute the foreground region F_i.
b) Image difference computation:
The difference image I_d(i, j) is a binary image d(i, j); from the surrounding description it has the standard frame-differencing form
d(i, j) = 1 if |f_k(i, j) − f_k-1(i, j)| > ε, and d(i, j) = 0 otherwise,
where f_k(i, j) and f_k-1(i, j) are two temporally adjacent frames of the image sequence and ε is a small positive number. In the difference image, pixels with value 1 mark the places where motion occurred; the moving region is thus obtained from this formula.
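As a brief hedged sketch of this differencing step (the function name, threshold value, and NumPy formulation are illustrative, not the patent's code):

import numpy as np

def difference_image(f_prev, f_curr, eps=8.0):
    """Binary difference image d(i, j): 1 where two adjacent frames differ by
    more than the small positive threshold eps, i.e. where motion occurred."""
    d = np.abs(f_curr.astype(np.int32) - f_prev.astype(np.int32)) > eps
    return d.astype(np.uint8)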
c) Computation of planar motion parameters
The following describes the computation of the motion velocity c(u, v) on the projection plane by the optical-flow method. The basic steps are:
① For all pixels (i, j) of an image, initialize the optical-flow estimate to c(i, j) = 0;
② Let k denote the iteration index; for all pixels (i, j), compute the updated values using formulas (6) and (7) (given in the source as images and not reproduced here), where
P(i, j) = f_x(i, j)·u + f_y(i, j)·v    (8)
and ū and v̄ denote the means over the u- and v-neighborhoods, computed with a local image-smoothing operator. The parameter λ is chosen according to the noise level in the image: a smaller value when the noise is strong, a larger value when it is weak.
③ The iteration terminates when the convergence criterion (also given in the source as an image; in practice a bound on the change of the flow field between successive iterations) is satisfied.
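The iteration described above has the form of the classical Horn–Schunck optical-flow scheme. Because formulas (6), (7), and the stopping criterion are not reproduced in the text, the sketch below simply assumes the standard Horn–Schunck update; the gradient estimates, the 3x3 neighborhood mean, the role given to λ, and the stopping rule are all illustrative choices rather than the patent's equations.

import numpy as np

def neighborhood_mean(a):
    """3x3 local mean, a simple local smoothing operator."""
    s = np.zeros_like(a)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            s += np.roll(np.roll(a, di, axis=0), dj, axis=1)
    return s / 9.0

def optical_flow(f0, f1, lam=100.0, iters=50, tol=1e-3):
    """Estimate the planar flow c(u, v) between frames f0 and f1."""
    f0 = f0.astype(np.float32); f1 = f1.astype(np.float32)
    fx = (np.roll(f0, -1, axis=1) - f0 + np.roll(f1, -1, axis=1) - f1) / 2.0
    fy = (np.roll(f0, -1, axis=0) - f0 + np.roll(f1, -1, axis=0) - f1) / 2.0
    ft = f1 - f0
    u = np.zeros_like(f0); v = np.zeros_like(f0)   # step 1: flow initialized to 0
    for _ in range(iters):                          # step 2: iterative update
        ub, vb = neighborhood_mean(u), neighborhood_mean(v)
        p = (fx * ub + fy * vb + ft) / (lam + fx ** 2 + fy ** 2)
        u_new, v_new = ub - fx * p, vb - fy * p
        change = np.max(np.abs(u_new - u)) + np.max(np.abs(v_new - v))
        u, v = u_new, v_new
        if change < tol:                            # step 3: stop on convergence
            break
    return u, v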
III. Contour computation
As shown in FIG. 3, the single-video-stream visual analysis unit 104 further includes a contour calculation unit 1043, which locates the hand region within the moving-object region using a skin-color-based detection algorithm. Removing the hand region leaves the region of the hand-held object, and the edge contours of these regions are then computed. In this embodiment, the contour calculation unit 1043 performs hand-edge acquisition and fine contour detection on the results from the motion detection unit 1042:
a) Hand-edge acquisition
In this embodiment, a strategy fusing motion information and skin-color information is used for segmentation and contour detection of the hand region.
Region acquisition based on motion information is described first:
During this process the camera is stationary, and the captured sequence is a color image sequence composed of R, G, B components. In the contour calculation unit 1043, let s = (x, y) denote the image-plane spatial coordinates, t the time coordinate, and i any one of the RGB components; I_t^i is then the intensity image of component i at time t. From the three consecutive frames at times t−Δt, t, and t+Δt, the motion image d_t^i of component i at time t is computed
by formula (13) (three-frame differencing; the formula itself is given in the source as an image), for i = r, g, b.
Combining the (r, g, b) components gives the motion image d_t of the color sequence at time t (formula (14), likewise not reproduced).
Finally, the motion image is smoothed and binarized to obtain the binary motion image.
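Formulas (13) and (14) are not reproduced in the text. A common choice for three-frame differencing is the pointwise minimum (a logical AND after thresholding) of the two adjacent frame differences, combined over the color channels; the sketch below makes exactly that assumption and omits the smoothing step.

import numpy as np

def motion_image(prev, curr, nxt, thresh=10.0):
    """Binary motion image at time t from RGB frames at t-dt, t, t+dt (each HxWx3)."""
    p, c, n = (a.astype(np.float32) for a in (prev, curr, nxt))
    d1 = np.abs(c - p)                    # |I_t - I_(t-dt)| per channel
    d2 = np.abs(n - c)                    # |I_(t+dt) - I_t| per channel
    d_t = np.minimum(d1, d2).max(axis=2)  # AND of the diffs, combined over r, g, b
    return (d_t > thresh).astype(np.uint8)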
Hand-region recognition based on skin-color detection is described next:
It is well known that the same color appears with different brightness under differently distributed illumination, while the perceived chromaticity remains essentially constant. The contour calculation unit 1043 exploits precisely this property, together with the distribution of human skin color in Luv space, and performs skin-color detection in the Luv color space. It operates as follows:
① Color-space conversion: convert the RGB color space to the Luv color space;
② Using the skin-color detection result of the previous frame (or the initial skin-color features) as the starting value, segment the moving region with a mean-shift algorithm. The density distribution of each color element in the current frame is treated as a probability density function, and the difference between the mean of this probability function over the region and the region's center value defines the mean-shift vector. The mean-shift vector always points along the direction of maximum probability density, so the actual direction of maximum density can be found by searching. The computation is as follows:
Let the color feature vector x_i of pixel p_i in the image be defined as
x_i = (L, u, v)    (15)
where L, u, v are the image's relative lightness and the u*, v* chromaticity coordinates. Let x_0 denote the color feature vector at point p_0 and x_i the feature vectors of the points p_i inside the window; in this embodiment the window size is 7. A point of zero density gradient is obtained by iterating the following two steps:
Compute the mean-shift vector m_h,G(x)(x); from the description below it has the standard form
m_h,G(x)(x) = ( Σ_i x_i g(‖(x − x_i)/h‖²) / Σ_i g(‖(x − x_i)/h‖²) ) − x    (16)
where h is the color resolution and g(x) is a multivariate normal kernel (formula (17), not reproduced).
Translate the kernel G(x) by m_h,G(x)(x). Here x is the feature vector at the current window center, and m_h,G(x)(x) is the difference between the G-weighted mean over the window and the window center. This iteration necessarily converges, along a smooth trajectory, to a point where the density gradient is zero.
③ Once the local maximum is determined, the feature class associated with it is identified from the local structure of the feature space, giving the actual location of skin color in the space. Combining the skin-color-based and motion-based detection results yields the edge contour of the hand.
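A minimal sketch of this mean-shift mode seeking in (L, u, v) feature space follows. The Gaussian kernel, the bandwidth value, and the convergence test are standard mean-shift choices assumed for illustration; the Luv conversion is taken as already done, and none of this is the patent's code.

import numpy as np

def mean_shift_mode(features, x0, h=8.0, max_iter=100, tol=1e-3):
    """Climb to a density mode in Luv feature space.
    features: N x 3 array of pixel feature vectors inside the window;
    x0: starting feature vector (e.g. the previous frame's skin-color estimate)."""
    x = x0.astype(np.float64)
    for _ in range(max_iter):
        d2 = np.sum((features - x) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / h ** 2)         # multivariate normal kernel g
        x_new = (w[:, None] * features).sum(axis=0) / w.sum()
        shift = x_new - x                      # mean-shift vector m_h,G(x)
        x = x_new
        if np.linalg.norm(shift) < tol:        # density gradient ~ 0: converged
            break
    return x  # mode of the skin-color density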
b) Fine contour detection
As described above, motion detection on the medium-resolution pyramid layer has already produced a region segmentation at reduced resolution. The contour calculation unit 1043 of this embodiment now performs contour computation on the original image, based on the region-segmentation result, to obtain a more accurate object contour. In this embodiment, edges are detected with the method of S. M. Smith and J. M. Brady (the SUSAN detector), which uses a 5×5 circular window template. The steps are:
① Establish an edge-detection zone from the region-segmentation result; the zone lies within a fixed pixel width of the region edges;
② Place the window center at every pixel position in the edge-detection zone and count the number n(r_0) of pixels r in the window whose brightness is close to that of the window center r_0, to decide whether the pixel is an image edge point. n(r_0) is computed with formula (18):
n(r_0) = Σ_r c(r, r_0)    (18)
where c(r, r_0) expresses how similar the brightness I(r) of a point r in the window is to the brightness I(r_0) of the window center r_0, and t is a brightness threshold: clearly c(r, r_0) = 1 when the brightness difference of the two points is less than t, and 0 otherwise. From n(r_0) the center and direction of the edge are computed, and the edge is thinned by non-maximum suppression.
The contour calculation unit 1043 thus obtains the edge contours of the hand and of the hand-held object.
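A hedged sketch of this Smith–Brady (SUSAN-style) edge test, using the hard similarity c(r, r_0) = 1 when |I(r) − I(r_0)| < t: the 5x5 circular mask and the brightness threshold follow the text, while the geometric decision threshold and all names are illustrative assumptions.

import numpy as np

# 5x5 circular window template (center excluded): offsets within radius ~2.
OFFSETS = [(di, dj) for di in range(-2, 3) for dj in range(-2, 3)
           if (di, dj) != (0, 0) and di * di + dj * dj <= 5]

def usan_count(image, i, j, t=20.0):
    """n(r0): number of window pixels with brightness similar to the center."""
    i0 = float(image[i, j])
    n = 0
    for di, dj in OFFSETS:
        if abs(float(image[i + di, j + dj]) - i0) < t:   # c(r, r0) = 1
            n += 1
    return n

def is_edge_point(image, i, j, t=20.0):
    """Flag an edge when the similar-brightness count falls below a geometric
    threshold (half the mask size here, a common SUSAN-style choice)."""
    return usan_count(image, i, j, t) < len(OFFSETS) // 2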
IV. Motion estimation and prediction
As shown in FIG. 3, the single-video-stream visual analysis unit 104 further includes a motion estimation and prediction unit 1044, which tracks the motion trajectory of the contour computed by the contour calculation unit 1043. One embodiment of the invention uses the center of the target contour as the tracking point, following its motion to obtain a time-discrete trajectory. The tracking result is a series of time-stamped plane coordinates of the contour center, represented as quadruples (x, y, t, i), where i identifies the camera. To improve the system's time efficiency, this embodiment performs motion prediction with a Kalman filter; the prediction is fed to the motion detection stage as a prior estimate for the next frame's motion detection.
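A minimal constant-velocity Kalman filter for this contour-center tracking is sketched below. The state layout (x, y, vx, vy), the time step, and the noise covariances are assumptions made for illustration; the text specifies only that a Kalman filter is used.

import numpy as np

class CenterTracker:
    """Kalman filter over the state [x, y, vx, vy] of a contour center."""
    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
        self.Q = q * np.eye(4)   # process noise
        self.R = r * np.eye(2)   # measurement noise
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        """Prior estimate of the next center position; this is what is fed
        back to the motion detector as its search hint."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct with the measured center (x, y) from contour computation."""
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P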
The components, structure, and operation of the single-video-stream visual analysis unit 104 have now been described. Its output comprises: the resolution-layered image sequence, the contour-feature sequence of that image sequence, and the sequence of motion-space points. The data structure of the unit's output is shown in Table 4.
Table 4  Single-video-stream analysis output data structure description
The single-video-stream analysis output sequence includes a camera identifier, frame-sequence identifier, time stamp, frame-type identifier, image-data attribute table, point-feature attribute table, region-feature attribute table, edge-feature attribute table, original-image data pointer, medium-resolution image data pointer, low-resolution image data pointer, feature-point table pointer, feature-structure table pointer, moving-region table pointer, background-region table pointer, moving-region edge table, and so on. The camera identifier distinguishes the different cameras of the system; the frame-sequence identifier is the camera's frame sequence number; the time stamp records when the image frame was acquired. The attribute tables describe information such as the length of each class of feature data, and the various data pointers give the addresses of the storage structures for the image data and feature data.
● Multi-video-stream visual analysis
The multi-video-stream visual analysis unit 105 of this embodiment is described below with reference to FIG. 3 and FIG. 4. It accepts the outputs of all four single-video-stream visual analysis units 104 and performs the following processing:
I. Stereo matching
The four video input devices can be combined pairwise into three input pairs. The multi-video-stream visual analysis unit 105 of this embodiment includes a stereo matching unit 1051, which uses two of the three dual-view input pairs to construct dual-view stereo matching and thereby compute the 3D coordinates of objects: from the stereo match and the plane coordinates (x, y) on the imaging plane, the depth coordinate (the z coordinate) relative to the camera is computed, and the object's 3D coordinates are then determined from the camera parameters. Depth reconstruction is achieved by dual-view image stereo matching. First, a target point is selected in a non-background region of the image obtained above, and a template window of size m×n is defined around it. To find the matching point of the target point in the other image, a gray-value matrix of size (m+c)×(n+d) over the candidate matching area of the other image is defined as the search window, and matching is performed with a block-matching algorithm: the template window is slid across the search window, a similarity matrix of size (c+1)×(d+1) is computed according to the matching measure, and the image block of the search window corresponding to the maximum (or minimum) entry of the similarity matrix is the best match of the template window.
The stereo matching unit 1051 of this embodiment computes the similarity measure as the sum of absolute differences; from the surrounding description, formula (20) has the form
S(c′, d′) = Σ_u Σ_v |I(u, v) − I′(u + c′, v + d′)|, summed over the m×n window,    (20)
where I(u, v) and I′(u, v) are the two view images, m and n are the width and height of the matching window, and c+m and d+n the width and height of the search area. In implementation, successively selecting different thresholds yields a method considerably faster than the naive evaluation of the sum of absolute differences.
In the regions of likely object boundaries, the stereo matching unit 1051 of this embodiment uses a movable window: the window position is shifted over the input image to obtain more coverage, and the best matching position is selected from these. Standard windows are used in the other regions.
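A hedged sketch of the SAD block matching just described: the template window from one view is slid over the search window of the other view and the minimum-SAD position wins. Window sizes are placeholders, and the threshold-based early-exit speedup mentioned above is omitted for clarity.

import numpy as np

def sad_match(left, right, pt, m=11, n=11, c=32, d=4):
    """Find in `right` the best match for the m x n template centered at
    pt = (row, col) in `left`, searching a (m+c) x (n+d) area. Returns the
    best matching block center (row, col) and its SAD score."""
    r0, c0 = pt
    hm, hn = m // 2, n // 2
    tmpl = left[r0 - hm:r0 + hm + 1, c0 - hn:c0 + hn + 1].astype(np.int32)
    best_pos, best_score = None, np.inf
    for dr in range(-d // 2, d // 2 + 1):        # (d+1) vertical shifts
        for dc in range(-c // 2, c // 2 + 1):    # (c+1) horizontal shifts
            rr, cc = r0 + dr, c0 + dc
            blk = right[rr - hm:rr + hm + 1, cc - hn:cc + hn + 1].astype(np.int32)
            if blk.shape != tmpl.shape:
                continue                         # shift ran off the image
            score = np.abs(tmpl - blk).sum()     # sum of absolute differences
            if score < best_score:
                best_pos, best_score = (rr, cc), score
    return best_pos, best_score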
II. SFS & FBR (shape from silhouette and feature-based 3D reconstruction)
As shown in FIG. 3 and FIG. 4, the multi-video-stream visual analysis unit 105 further includes an SFS&FBR unit 1052 and an object model storage unit 1056. The SFS&FBR unit 1052 performs shape-from-silhouette and feature-based 3D reconstruction, and the object model storage unit 1056 stores the reconstructed object models. Using the contour results of the single-video-stream visual analysis units 104 and the object models already stored in unit 1056, the SFS&FBR unit 1052 recovers the surface shape of the object by the shape-from-silhouette method together with a feature-based object recognition algorithm. The unit performs a shape-from-silhouette operation over a spatial decomposition, with the following algorithm:
① For a spatial point P, compute its image point P′ = (x_h, y_h) according to formulas (24) and (25) (given in the source as images). P′ is the projection of the point P of the world coordinate system (X, Y, Z) into the image-plane coordinate system; C is the vector from the world origin to the projection center, and the formulas also involve the unit vector along the camera's optical axis and the horizontal and vertical unit vectors of the image-plane coordinate system. The coordinate systems, the world system W-XYZ and the image-plane system c-xy, are shown in detail in FIG. 5.
② If P′ falls in the background region of the image, the spatial point P is carved away; otherwise P is retained;
③ A simple spatial octree algorithm is used to simplify the computation;
④ Silhouette clipping across the multiple views yields reconstruction results from several different angles.
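A compact sketch of the carving loop of steps ① and ② (the octree acceleration of step ③ is omitted): each candidate voxel is kept only if it projects into the foreground silhouette of every view. The projection functions and silhouette masks are assumed to come from calibration and the earlier contour stage; names and shapes are illustrative.

import numpy as np

def carve(voxels, cameras, silhouettes):
    """voxels: V x 3 array of candidate spatial points P.
    cameras: list of functions mapping a 3D point to image coordinates (x_h, y_h).
    silhouettes: list of binary masks (1 = foreground), same order as cameras.
    Returns the voxels retained after multi-view silhouette clipping."""
    keep = np.ones(len(voxels), dtype=bool)
    for project, sil in zip(cameras, silhouettes):
        h, w = sil.shape
        for k, P in enumerate(voxels):
            if not keep[k]:
                continue
            x, y = project(P)                  # formulas (24), (25)
            xi, yi = int(round(x)), int(round(y))
            if not (0 <= yi < h and 0 <= xi < w) or sil[yi, xi] == 0:
                keep[k] = False                # P' in background: carve P away
    return voxels[keep]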
III. Object model building
As shown in FIG. 3 and FIG. 4, the multi-video-stream visual analysis unit 105 further includes a model building unit 1053, which, once stereo matches have been obtained, solves for the 3D coordinates of the target in 3D space by the bundle method (bundle adjustment). It operates as follows:
Suppose a set of points X_j in 3D space is observed by a set of cameras with matrices P^i. Writing x_j^i for the coordinates of the j-th spatial point on the image plane of the i-th camera, the problem is: given the set of image coordinates x_j^i, find camera matrices P^i and spatial points X_j such that
P^i X_j = x_j^i    (21)
If no further constraints are imposed on X_j or P^i, this reconstruction is a projective reconstruction: X_j differs from the true reconstruction by an arbitrary 3D projective transformation.
Because of noise, matching errors, and similar factors, the equations x_j^i = P^i X_j will not be satisfied exactly. Such errors are usually assumed Gaussian, and a maximum-likelihood solution is then sought: one estimates projection matrices and the spatial points that project exactly to the image points, and minimizes in every image frame the image distance between the reprojected points and the measured image points, i.e. the sum over i and j of d(P^i X_j, x_j^i)², where d(x, y) is the geometric image distance between the homogeneous points x and y. The maximum-likelihood estimates of X_j and P^i are obtained by adjusting the bundles of rays between each camera center and the 3D spatial points.
The initial values used by this method are the projection-matrix parameters obtained during camera calibration, the estimates from the previous frame, and the initial 3D-reconstruction estimate. A Euclidean 3D reconstruction is obtained by constraining the camera parameters.
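The bundle-adjustment minimization above can be sketched with a generic nonlinear least-squares solver. The fragment below is a toy formulation under stated assumptions, not the patent's solver: it parameterizes cameras as raw 3x4 matrices, leaves the projective gauge freedom unfixed, exploits no sparsity, and relies on SciPy's least_squares.

import numpy as np
from scipy.optimize import least_squares

def reprojection_residuals(params, n_cams, n_pts, observations):
    """observations: list of (cam_index, point_index, x, y) measurements.
    params packs the 3x4 camera matrices followed by the 3D points."""
    P = params[:n_cams * 12].reshape(n_cams, 3, 4)
    X = params[n_cams * 12:].reshape(n_pts, 3)
    res = []
    for i, j, x, y in observations:
        Xh = np.append(X[j], 1.0)          # homogeneous 3D point
        proj = P[i] @ Xh
        res.extend([proj[0] / proj[2] - x, proj[1] / proj[2] - y])
    return np.asarray(res)

def bundle_adjust(P0, X0, observations):
    """Minimize the total squared image distance starting from the
    calibration-derived matrices P0 and the initial reconstruction X0."""
    n_cams, n_pts = len(P0), len(X0)
    x0 = np.concatenate([np.asarray(P0, float).ravel(),
                         np.asarray(X0, float).ravel()])
    sol = least_squares(reprojection_residuals, x0,
                        args=(n_cams, n_pts, observations))
    P = sol.x[:n_cams * 12].reshape(n_cams, 3, 4)
    X = sol.x[n_cams * 12:].reshape(n_pts, 3)
    return P, X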
The 3D model produced by the above method is a point-cloud model, which is then converted into a geometric model represented by subdivision surfaces. Since this conversion is well known to those of ordinary skill in the art, it is not described further here.
The model building unit 1053 builds solid models of objects, including a solid model of the hand and solid models of hand-held objects with simple geometric shapes. A simply shaped hand-held object may be an elastic steel strip of a given shape, or a sphere, a cuboid, and so on; the cross-sectional shapes of these objects are used as the section curves of motion-generated surfaces.
On an operation control command, the system can reconstruct the 3D surface shape of a stationary object in the environment. The reconstruction result is an object model that can be edited interactively and is stored in the object model storage unit 1056. Table 5 describes the data file format of the 3D geometric model.
Table 5  3D geometric model data file format
Point data structure
Edge data structure
Face data structure
IV. Trajectory fitting
The motion velocities and object contours obtained by the single-video-stream visual analysis units 104 are projections of the object's spatial motion onto the camera imaging planes, i.e. trajectory coordinate sequences in the camera image planes, so these plane coordinates are discrete in both time and space. As shown in FIG. 3 and FIG. 4, the multi-video-stream visual analysis unit 105 further includes a trajectory fitting unit 1055 and a motion trajectory storage unit 1058. Using the spatially distributed camera imaging planes and the cameras' relative orientation parameters, the trajectory fitting unit 1055 estimates a continuous description of the spatial motion trajectory by spatial coordinate intersection and curve fitting, and outputs a sequence of spatial point coordinates that is stored in the motion trajectory storage unit 1058. The unit operates as follows:
a) Spatial-point computation from projection matrices
Consider a set of cameras C_k, k = 1, …, n, where n is the total number of cameras. Each camera has absolute positioning parameters P_i(x, y, z) and absolute orientation parameters R_i(α, β, γ). The plane coordinate at time t in a camera's image-plane trajectory sequence is s(x, y, t, i), where x and y are image-plane coordinates, t is time, and i is the camera index. The projection matrix M_i is uniquely determined by the camera's extrinsic and intrinsic parameters.
At any one instant, the projection coordinates s(x, y, t, i) of a spatial point P(X, Y, Z) on each image plane satisfy equations (29) and (30) (the standard pinhole projection constraints, given in the source as images).
From the respective projection matrices and image-plane coordinates of the n cameras, 2n such equations can be assembled, and the spatial point coordinates (X, Y, Z) are solved by the least-squares method.
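Assembling and solving these 2n linear equations can be illustrated as follows. Solving the homogeneous system by SVD is one standard least-squares formulation; since equations (29) and (30) are not reproduced, the row construction below assumes the usual pinhole form x = (M_0·P)/(M_2·P), y = (M_1·P)/(M_2·P).

import numpy as np

def triangulate(Ms, pts):
    """Ms: list of 3x4 projection matrices M_i; pts: list of (x, y) image
    coordinates of the same point at the same instant, one per camera.
    Returns the spatial point (X, Y, Z)."""
    rows = []
    for M, (x, y) in zip(Ms, pts):
        rows.append(x * M[2] - M[0])   # from x = (M_0 . P) / (M_2 . P)
        rows.append(y * M[2] - M[1])   # from y = (M_1 . P) / (M_2 . P)
    A = np.asarray(rows)               # 2n x 4 homogeneous system in P
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1]
    return P[:3] / P[3]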
b) Spatial-point computation by triangulation
Given multiple cameras and their positioning and orientation parameters, the spatial position of the hand can be determined by the principle of triangulation together with the least-squares method. One concrete procedure is to project the positions and attitudes of the cameras in 3D space onto three orthogonal coordinate planes and to compute each coordinate component of the spatial point on its respective plane. This method does not require calibrating the cameras' intrinsic parameters.
c) Trajectory fitting
Trajectory fitting uses cubic B-splines, with fairing conditions used to determine the boundary conditions of the spline fit.
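A short sketch of such a fit with SciPy's parametric spline routines; the smoothing factor here only approximates the fairing conditions mentioned above, and all names are illustrative.

import numpy as np
from scipy.interpolate import splprep, splev

def fit_trajectory(points, smooth=0.5):
    """points: N x 3 time-ordered spatial points from coordinate intersection.
    Returns a callable evaluating the fitted cubic B-spline curve."""
    tck, _ = splprep(np.asarray(points, float).T, k=3, s=smooth)
    def curve(t):                       # t in [0, 1] along the path
        return np.asarray(splev(t, tck))
    return curve

# Usage: resample the fitted trajectory at 100 evenly spaced parameters.
# traj = fit_trajectory(raw_points)
# samples = traj(np.linspace(0.0, 1.0, 100)).T   # 100 x 3 points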
V. Cross-section computation
As shown in FIG. 3 and FIG. 4, the multi-video-stream visual analysis unit 105 further includes a section calculation unit 1054 and a section profile storage unit 1057. From the continuous motion trajectory obtained by the trajectory fitting unit 1055, the section calculation unit 1054 determines the normal plane at each image frame along the trajectory; the projection of the object onto this normal plane is the cross-sectional contour of the moving object at that frame position.
This completes the description of the functions of the main data-processing units 1051, 1052, 1053, 1054, and 1055 of the multi-video-stream visual analysis unit 105. Unit 105 also contains three data storage units: the object model storage unit 1056, the section profile storage unit 1057, and the motion trajectory storage unit 1058.
The object model storage unit 1056 is a permanent storage unit of the system and stores the 3D geometric models of moving objects. Permanent here means that the unit's contents persist when the system is restarted. In this example the object-model data structure is as shown in Table 2. All object models built by the system are stored in unit 1056, each in a model-table data structure. Each model table stores the model number, model identifier, model type number, model type identifier, model attribute table, model parameter table, the number of objects in the model, and the storage-table structures of the individual objects. For rigid objects, each object corresponds to one model table; for hands and constrained deformable objects, unit 1056 stores several model tables for each object and its basic deformations. For a simple object the model table contains only one geometric object; for a complex object, several geometric objects in the model table store its components. The model attribute table describes the model's basic and extensible attributes, and the model parameter table stores the model's basic parameters.
The section profile storage unit 1057 is a working storage unit of the system and stores the time-discrete cross-sectional profiles of the moving object along its direction of motion. In this embodiment a circular queue stores a finite-length time series of object-contour data tables; for example, it can hold the contour tables of several hundred consecutive frames preceding the current working time.
The motion trajectory storage unit 1058 is a working storage unit of the system and stores the time-discrete spatial motion trajectory of the moving object. In this embodiment a circular queue stores a finite-length time series of trajectory data tables; the trajectory data include the spatial coordinates and attitude of the object's geometric center. For example, it can hold the trajectory tables corresponding to the object's cross-sectional profiles.
● Real-time interaction semantic recognition
Real-time interaction semantic recognition is carried out by the real-time interaction semantic recognition unit 106, shown in FIG. 1 and FIG. 6. The unit 106 of the 3D geometric modeling system 100 accepts the output of the multi-video-stream visual analysis unit 105, reads the semantic models from the previously defined semantic model storage unit 110, analyzes and interprets the motion semantics, and outputs 3D modeling commands. It includes a collision detection unit 1061, which reads the trajectories in the motion trajectory storage unit 1058 and detects, by a collision detection method, collisions between the 3D geometric design models in the 3D geometric model storage unit 111 and the object models in the object model storage unit 1056. The collision detection results serve as input to the operation semantic analysis unit 1062 and the interaction semantic analysis unit 1063, which the unit 106 also includes; from those results and the data held in the object model storage unit 1056, the motion trajectory storage unit 1058, and the semantic model storage unit 110, these units derive the moving object's operation semantics with respect to the 3D geometric model. The semantic analysis results are output as 3D modeling commands and stored in a 3D modeling command storage unit 1065.
The real-time interaction semantic recognition unit 106 performs the following basic operations:
I. Collision detection
Using the object models, cross-sectional profiles, and motion trajectories built by the multi-video-stream visual analysis unit 105 (stored, as described above, in units 1056, 1057, and 1058 respectively), the collision detection unit 1061 detects collisions between the 3D geometric design model and the object models. It provides the context for interaction semantic analysis and at the same time gives the designer visual feedback on operations on, and interaction with, the geometric design model.
The collision detection results are supplied directly to the semantic analysis units; the analysis process itself is described in detail below. Real-time collision detection between two objects is implemented with the AABB-tree algorithm, which is well known to those skilled in the art and not detailed here.
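The primitive test at the leaves of an AABB tree can be shown in a few lines; this is a sketch of the interval-overlap test only, not of the tree traversal or the patent's implementation.

import numpy as np

def aabb_overlap(min_a, max_a, min_b, max_b):
    """Axis-aligned boxes overlap iff their extents overlap on every axis;
    an AABB tree applies this test recursively from the roots downward."""
    return bool(np.all(np.asarray(min_a) <= np.asarray(max_b)) and
                np.all(np.asarray(min_b) <= np.asarray(max_a)))

# Example: the hand model's box against a design model's box.
# aabb_overlap((0, 0, 0), (1, 1, 1), (0.5, 0.5, 0.5), (2, 2, 2))  # -> True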
II. Operation semantic analysis unit
The operation semantic analysis unit 1062 interprets operational semantics, for example the "select" and "cancel" operation semantics. In one embodiment of the invention, when the hand model comes into colliding contact with the 3D design model and the contact persists for a certain time, the "select" operation semantics is recognized; when the hand model, already in contact with the 3D design model, stays in contact and remains stationary for a period of time, the "deselect" operation semantics is recognized. The result of semantic interpretation depends on the current pose of the object and the predefined semantic model. To keep operation flexible, the operations fall into several types: operations on already generated models, and operations on the interface menus and toolbars.
a) Operations on interface menus and toolbars
Such interface operations come in two modes: a mouse-and-keyboard mode and a virtual-hand mode. The mouse-and-keyboard mode is the traditional 2D graphical interface mode. The virtual-hand mode resembles the operation of a touch screen in the prior art: the interface's menus and toolbars are operated through the motion and clicks of a virtual hand. When the virtual hand moves into the operating area of the system's graphical interface, the system switches automatically to interface-command mode; the mouse-and-keyboard mode and the virtual-hand mode switch automatically in response to the current input.
b) Operations on generated models
Commonly used interaction semantics are presented as floating balls in the virtual 3D space. These floating balls are called operation balls, and each ball is defined as one operation. When the virtual hand grasps an operation ball, the virtual hand begins that operation. According to the operation context, the operation balls automatically hide, surface, and change their depth in the virtual 3D space, so that the operation ball most likely to be selected sits where the virtual hand can grasp it most easily.
The three kinds of operation (virtual-space operation, GUI operation, and voice-command operation) switch automatically by context: when the virtual hand moves into the menu-command area of the system interface, the system switches to virtual-mouse mode; when it moves into the drawing area, the system changes to the virtual-3D-space modeling mode.
III. Interaction semantic analysis unit
The interaction semantic analysis unit 1063 determines the interaction semantics from the collision detection results, the object model storage unit 1056, the motion trajectory storage unit 1058, and the semantic model storage unit 110. In this embodiment, interaction semantics are expressed through static interaction gestures, which unit 1063 analyzes with an appearance-based recognition method: gesture semantic analysis by template matching against predefined gesture templates. In this embodiment, the template-matching semantic analysis proceeds as follows:
First the edge image undergoes a distance transform: the binary image is transformed into a distance map of the same size as the original edge image, in which the new value of every "pixel" is a distance value. The distance transform is defined as
D(p) = min(d_e(p, q)),  q ∈ O    (31)
where d_e(p, q) is the Euclidean distance between pixels p and q, and O is the set of elements of the target object. The Euclidean distance d_e(p, q) is defined as
d_e(p, q) = √((p_x − q_x)² + (p_y − q_y)²)    (32)
To reduce the amount of computation, the square root is omitted, i.e. formula (33) is used in place of formula (32):
d_e(p, q) = (p_x − q_x)² + (p_y − q_y)²    (33)
After this distance transform, the new value of each point in the resulting distance map is the distance to the object pixel nearest to that point in the original image.
This embodiment uses the directed (one-way) Hausdorff distance h(M, I) for model matching, where M is the set of edge pixels of the selected gesture template and I is the set of edge pixels of the image after edge extraction. For matching and recognition, the edge-extracted image to be recognized first undergoes the Euclidean distance transform to produce a distance map; the template is then translated over the distance map in distance space. Accordingly, h_j(M, I) (the subscript j indexing the translation) is taken as the largest of the values read from the current distance map at the positions of the template's edge pixels; it measures the maximum mismatch between the template at its current translation and the corresponding pixels of the edge image. The decision rule of basic Hausdorff-distance template matching is: take the minimum of the h_j(M, I) obtained over all translations as the measure of similarity between this template and any corresponding object of the template that may exist in the image. If, during translation matching, several models turn out to have very similar scores against the current image, edge-direction information is added to the translation-matching decision. In that case, at a translation point q (taking the pixel at the template's lower-left corner during a translation as the reference point), the condition for the template to match the corresponding position of the edge image is:
① At the j-th translation, let R_j be the ratio of the number of template points satisfying the matching requirement (distance-map value below the threshold τ and edge-direction difference below the threshold θ; formula (34), not reproduced) to the total number of template edge pixels. Here Q is the set of points formed on the edge image by the template's lower-left-corner pixel over the several translations; n(·) denotes the number of elements of a set; Ang(x) is the computed angle value of edge pixel x; τ is a given distance-difference threshold; and θ is a given direction (radian) difference threshold.
② Take P(k) = max_j(R_j), where k = 1, 2, … indexes the templates and J is the number of translations applied to a given template.
③ The gesture indicated by the template corresponding to max_k P(k) is taken as the final recognition result.
This embodiment adopts a modified Hausdorff distance for template matching: the similarity of the template to the image to be recognized is obtained by averaging, rather than maximizing, the distance-map values h_i(M, I) obtained during the translations (formula (35), not reproduced), where N is the number of edge pixels in the template.
Following this method, the basic semantics of several interaction gestures can be determined. FIG. 15 shows several examples of interaction semantic gestures.
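A hedged sketch of this distance-transform-based template matching follows. It uses the true Euclidean distance transform rather than the squared form of formula (33), scores each translation with the modified (mean) Hausdorff measure, and leaves out the edge-direction tie-breaking; names and conventions are illustrative.

import numpy as np
from scipy.ndimage import distance_transform_edt

def match_template(edge_image, template_pts, shifts):
    """edge_image: binary edge map of the image to recognize (nonzero = edge).
    template_pts: K x 2 integer (row, col) edge pixels of a gesture template.
    shifts: iterable of (dr, dc) translations to try.
    Returns (best_shift, best_score): the translation minimizing the mean
    distance-map value under the template's edge pixels."""
    # distance_transform_edt measures distance to the nearest zero, so invert
    # the mask: edge pixels become zeros, background gets distance-to-edge.
    dist_map = distance_transform_edt(edge_image == 0)
    h, w = dist_map.shape
    best_shift, best_score = None, np.inf
    for dr, dc in shifts:
        pts = template_pts + np.array([dr, dc])
        ok = ((pts[:, 0] >= 0) & (pts[:, 0] < h) &
              (pts[:, 1] >= 0) & (pts[:, 1] < w))
        if not np.all(ok):
            continue                     # template sticks out of the image
        score = dist_map[pts[:, 0], pts[:, 1]].mean()  # mean over N edge pixels
        if score < best_score:
            best_shift, best_score = (dr, dc), score
    return best_shift, best_score

The gesture whose template achieves the lowest score across its translations would then be taken as the recognition result, mirroring steps ② and ③ above.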
● 3D geometric modeling
As shown in FIG. 7, the 3D geometric modeling system 100 further includes a 3D geometric modeling unit 107. Based on the motion and pose recognition obtained from the multi-video-stream visual analysis unit 105 and on the semantic analysis results, i.e. the 3D modeling commands, obtained from the real-time interaction semantic recognition unit 106, it builds new 3D geometric design shapes and edits and replaces existing 3D geometric shapes through a repetition processing unit 1071, a jitter processing unit 1072, an envelope calculation unit 1073, and a shape editing unit 1074, reading from and writing to the 3D geometric model storage unit 111 in real time. The 3D geometric modeling unit 107 comprises the following processing steps:
I. Repetition processing
The repetition processing unit 1071 handles the repeated actions produced during object motion, eliminating repetitive and overlapping movements.
II. Jitter processing
The jitter processing unit 1072 smooths and fairs the object's motion, eliminating small jitters in the object's trajectory and attitude.
III. Envelope calculation unit
The basic function of the envelope calculation unit 1073 is to solve the envelope differential equation from the trajectory and cross-sectional profiles from which jitter and repetition have already been removed, and then to compute the envelope surface generated by the object's motion with the Runge-Kutta algorithm. The unit outputs the envelope surface produced by the object's motion.
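The envelope differential equation itself is specific to the patent and not reproduced, but the Runge-Kutta step used to integrate it can be sketched generically; the fragment below is a classical fourth-order step for an arbitrary first-order system, where the right-hand side f would be the envelope equation assembled from the trajectory and section data.

import numpy as np

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Integrating from t0 to t1 in n steps:
# y = y0
# for i in range(n):
#     y = rk4_step(f, t0 + i * (t1 - t0) / n, y, (t1 - t0) / n)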
IV. Shape editing unit
The shape editing unit 1074 replaces the existing 3D geometric model with the motion envelope surface produced by the envelope calculation unit 1073 and performs smooth joining of the new and old patches. The replacement process handles the connection and smoothing of the joint points and joint faces according to constraint and modification rules, modifying the established 3D geometric model.
● Model rendering and display
A modification to the 3D geometric model activates the model rendering process, and the rendering result is output to the display device 109.
This completes the description of the technical solution according to the first specific embodiment of the invention. In addition, as shown in FIG. 8, the 3D geometric modeling system 100 of this embodiment can be built from three general-purpose digital computers: two front-end computers 1201 and 1202 and one back-end computer 1203. Each pair of digital cameras is connected to one front-end computer (1201, 1202), the two front-end computers 1201 and 1202 are connected to the back-end computer 1203, and the back-end computer 1203 is connected to the video output device 109. Each front-end computer 1201, 1202 hosts the single-video-stream visual analysis units 104 corresponding to its attached video input devices 101, together with the stereo matching unit 1051 of the multi-video-stream visual analysis unit 105. The back-end computer 1203 hosts the components of the multi-video-stream visual analysis unit 105 other than the stereo matching unit 1051, as well as the real-time interaction semantic recognition unit 106, the 3D geometric modeling unit 107, and the 3D model rendering unit 108. The storage system of computer 1203 provides the semantic model storage unit 110 and the 3D geometric model storage unit 111. The above components of the invention can be implemented in software, firmware, integrated circuits, and the like.
Second specific embodiment
As shown in FIG. 9, the 3D geometric modeling system 200 according to the second embodiment of the invention further includes an audio input device 202 and a speech recognition unit 203. The audio input device 202 may be a general-purpose microphone together with a sound card that converts natural speech into a digital speech signal. In this embodiment, the audio input device serves as an auxiliary input device, providing the auxiliary input of a multi-channel user-interaction scheme. After audio input is obtained, the speech recognition unit 203 performs speech recognition, that is, it converts the audio input into a restricted language. The unit recognizes only predefined speech patterns; speech input that has not been defined is discarded. For example, the Microsoft Speech 5.X speech recognition engine can be used to recognize the basic utterances. By configuring the engine's XML-file-based restricted grammar, the system can be made to recognize only the voice commands given in the grammar, effectively raising the recognition rate and shutting out other speech input that could cause anomalous system behavior.
The speech recognized by the speech recognition unit 203 is supplied to the real-time interaction semantic recognition unit 206. Besides the collision detection and the operation and interaction semantic analyses of the first embodiment, unit 206 also includes a speech semantic analysis unit 2064, which performs semantic interpretation on the speech already recognized by unit 203. Whenever recognized speech input arrives from the speech recognition unit 203, the speech semantic analysis unit 2064 interprets it semantically according to the system's current context. An example of a semantic parsing file is shown below; the parsing rules marked with <O></O> are optional grammar. Users can modify this part to suit their specific needs, reflecting the idea of personalized human-computer interaction. (The phrases inside the <P> elements are the Chinese voice commands; their meanings correspond to the VALSTR rule names, e.g. CrePoint "create point", CreLine "create line", CreSurface "create surface", Delete, Cancel "undo", Done "end", Edit, SelePoint/SeleLine/SeleSurface "select point/line/surface", zin "zoom out", zout "zoom in", Translation "translate", and X/Y/Z "coordinate value".)
<GRAMMAR LANGID="804">
 <RULE NAME="WithPara" TOPLEVEL="ACTIVE">
  <P>
   <O>
    <L>
     <P>请</P>
     <P>我想</P>
    </L>
   </O>
   <L>
    <P PROPNAME="TYPE_RULEREF" VALSTR="CrePoint">创建点</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="CreLine">创建线</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="CreSurface">创建面</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Delete">删除</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Cancel">撤销</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Done">结束</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Edit">编辑</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="SelePoint">选择点</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="SeleLine">选择线</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="SeleSurface">选择面</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="zin">缩小</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="zout">放大</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Translation">平移</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="X">X坐标值</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Y">Y坐标值</P>
    <P PROPNAME="TYPE_RULEREF" VALSTR="Z">Z坐标值</P>
   </L>
   <L>
    <P VALSTR="value">Value</P>
   </L>
  </P>
 </RULE>
</GRAMMAR>
The operations performed in the real-time interaction semantic recognition unit 206 are shown in FIG. 10. In this embodiment, the unit 206 includes a collision detection unit 2061, an operation semantic analysis unit 2062, an interaction semantic analysis unit 2063, a speech semantic analysis unit 2064, and a 3D modeling command storage unit 2065. Its main working steps are as follows:
The collision detection unit 2061 reads the trajectories in the motion trajectory storage unit 2058 and detects, by a collision detection method, collisions between the 3D geometric design models in the 3D geometric design storage unit 211 and the object models in the object model storage unit 2056. The collision detection results serve as input to the operation semantic analysis unit 2062 and the interaction semantic analysis unit 2063, which derive the moving object's operation semantics with respect to the 3D geometric model from those results and the data stored in the object model storage unit 2056, the motion trajectory storage unit 2058, and the semantic model storage unit 210, and generate 3D modeling commands stored in the 3D modeling command storage unit 2065. The speech semantic analysis unit 2064 obtains the restricted-language recognition results from the speech recognition unit 203, interprets the speech semantics according to the semantic parsing definitions in the semantic model storage unit 210, and likewise generates 3D modeling commands stored in the 3D modeling command storage unit 2065.
Here speech operation, as an auxiliary interaction channel, can be used during the system's human-computer interaction for command operations and for control operations during the drawing process.
Apart from the above, the parts of the second embodiment that are the same as or similar to the first embodiment implement the same technical solution.
Third specific embodiment
FIG. 11 shows the structure of a third specific embodiment of a 3D geometric modeling system 300 applying the invention. The video input devices 301 shown in FIG. 11 are digital cameras; in this embodiment there are six of them, with one single-video-stream visual analysis unit 304 corresponding to each camera, and each camera is connected in the usual way, by cable, directly to a general-purpose computer interface and processed by its single-video-stream visual analysis unit 304. The audio input device 302 is a general-purpose microphone together with a sound card that converts natural speech into a digital speech signal. Unlike the first specific embodiment, this embodiment not only uses more cameras but also connects them differently. Specifically, cameras C01, C02, and C03 are placed at the height of the designer's hand movements, while cameras C04, C05, and C06 are placed above the designer. Because it increases the number of cameras, this configuration better avoids the occlusions that the motion may cause. The other difference is the change in how the cameras are connected: as shown in FIG. 13, the six cameras are divided evenly into two groups, and the three cameras of each group are connected into two image-input pairs usable for dual-view matching, which are supplied to the multi-video-stream visual analysis unit 305 for depth acquisition and stereo matching.
The six cameras are placed at six different positions (in front of, to the front-left of, to the front-right of, above, to the upper-left of, and to the upper-right of the concept designer), at heights above the floor and attitudes suited to capturing the designer's gestures: the designer's design actions should be fully recorded by the cameras without affecting the designer's operations or other activities. One concrete camera layout is shown in FIG. 12.
In this embodiment, the computer system consists of three general-purpose digital computers, as shown in FIG. 14. Every two cameras are connected to one of two front-end computers 1901, 1902; the two front-end computers are connected to a back-end computer 1903, and the back-end computer is connected to the audio input device and the display device.
The operation of this embodiment is essentially the same as that of the first specific embodiment. When the 3D modeling designer starts the device's computer system, the system initialization process runs first, after which 3D geometric models of simple objects are generated. The difference is that, relative to the multi-video-stream visual analysis of the first specific embodiment, the multi-video-stream visual analysis of this embodiment accepts the outputs of all six single-video-stream visual analyses and processes the video inputs in the combinations shown in FIG. 14.
The implementation of the 3D geometric modeling system and method of the invention has been described above. Although the foregoing description refers to specific embodiments of the invention, it should be understood that various modifications can be made without departing from its spirit. The disclosed embodiments are therefore illustrative rather than restrictive in all respects; the scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes that fall within the equivalent meaning and range of the claims are embraced therein.
Claims (14)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2005100122739A CN100407798C (en) | 2005-07-29 | 2005-07-29 | 3D geometric modeling system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2005100122739A CN100407798C (en) | 2005-07-29 | 2005-07-29 | 3D geometric modeling system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1747559A CN1747559A (en) | 2006-03-15 |
CN100407798C true CN100407798C (en) | 2008-07-30 |
Family
ID=36166855
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2005100122739A Expired - Fee Related CN100407798C (en) | 2005-07-29 | 2005-07-29 | 3D geometric modeling system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100407798C (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI720513B (en) * | 2019-06-14 | 2021-03-01 | 元智大學 | Image enlargement method |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4499693B2 (en) * | 2006-05-08 | 2010-07-07 | ソニー株式会社 | Image processing apparatus, image processing method, and program |
CN101276370B (en) * | 2008-01-14 | 2010-10-13 | 浙江大学 | Three-dimensional human body movement data retrieval method based on key frame |
CN101965589A (en) * | 2008-03-03 | 2011-02-02 | 霍尼韦尔国际公司 | Model driven 3d geometric modeling system |
CN101795348A (en) * | 2010-03-11 | 2010-08-04 | 合肥金诺数码科技股份有限公司 | Object motion detection method based on image motion |
CN101908230B (en) * | 2010-07-23 | 2011-11-23 | 东南大学 | A 3D Reconstruction Method Based on Region Depth Edge Detection and Binocular Stereo Matching |
CN101923729B (en) * | 2010-08-25 | 2012-01-25 | 中国人民解放军信息工程大学 | Reconstruction method of three-dimensional shape of lunar surface based on single gray level image |
CN102354345A (en) * | 2011-10-21 | 2012-02-15 | 北京理工大学 | Medical image browse device with somatosensory interaction mode |
US9396292B2 (en) * | 2013-04-30 | 2016-07-19 | Siemens Product Lifecycle Management Software Inc. | Curves in a variational system |
CN103927784B (en) * | 2014-04-17 | 2017-07-18 | 中国科学院深圳先进技术研究院 | A kind of active 3-D scanning method |
CN105323572A (en) * | 2014-07-10 | 2016-02-10 | 坦亿有限公司 | Stereo image processing system, device and method |
CN104571511B (en) | 2014-12-30 | 2018-04-27 | 青岛歌尔声学科技有限公司 | The system and method for object are reappeared in a kind of 3D scenes |
CN104571510B (en) | 2014-12-30 | 2018-05-04 | 青岛歌尔声学科技有限公司 | A kind of system and method that gesture is inputted in 3D scenes |
US10482670B2 (en) | 2014-12-30 | 2019-11-19 | Qingdao Goertek Technology Co., Ltd. | Method for reproducing object in 3D scene and virtual reality head-mounted device |
US9792692B2 (en) * | 2015-05-29 | 2017-10-17 | Ncr Corporation | Depth-based image element removal |
CN105160673A (en) * | 2015-08-28 | 2015-12-16 | 山东中金融仕文化科技股份有限公司 | Object positioning method |
CN105404511B (en) * | 2015-11-19 | 2019-03-12 | 福建天晴数码有限公司 | Physical impacts prediction technique and device based on ideal geometry |
CN105844692B (en) * | 2016-04-27 | 2019-03-01 | 北京博瑞空间科技发展有限公司 | Three-dimensional reconstruction apparatus, method, system and unmanned plane based on binocular stereo vision |
CN107452037B (en) * | 2017-08-02 | 2021-05-14 | 北京航空航天大学青岛研究院 | GPS auxiliary information acceleration-based structure recovery method from movement |
CN111448568B (en) * | 2017-09-29 | 2023-11-14 | 苹果公司 | Environment-based application presentation |
CN109754457A (en) * | 2017-11-02 | 2019-05-14 | 韩锋 | Reconstruct system, method and the electronic equipment of object threedimensional model |
CN107907110B (en) * | 2017-11-09 | 2020-09-01 | 长江三峡勘测研究院有限公司(武汉) | Multi-angle identification method for structural plane occurrence and properties based on unmanned aerial vehicle |
CN109993976A (en) * | 2017-12-29 | 2019-07-09 | 技嘉科技股份有限公司 | Traffic accident monitoring system and method thereof |
CN109783922A (en) * | 2018-01-08 | 2019-05-21 | 北京航空航天大学 | A kind of local product design method, system and its application based on function and environmental factor |
CN108777770A (en) * | 2018-06-08 | 2018-11-09 | 南京思百易信息科技有限公司 | A kind of three-dimensional modeling shared system and harvester |
CN111010590B (en) * | 2018-10-08 | 2022-05-17 | 阿里巴巴(中国)有限公司 | Video clipping method and device |
CN110246212B (en) * | 2019-05-05 | 2023-02-07 | 上海工程技术大学 | Target three-dimensional reconstruction method based on self-supervision learning |
CN110660132A (en) * | 2019-10-11 | 2020-01-07 | 杨再毅 | Three-dimensional model construction method and device |
CN112215933B (en) * | 2020-10-19 | 2024-04-30 | 南京大学 | Three-dimensional solid geometry drawing system based on pen type interaction and voice interaction |
CN114137880B (en) * | 2021-11-30 | 2024-02-02 | 深蓝汽车科技有限公司 | Moving part attitude test system |
CN117910075B (en) * | 2024-03-18 | 2024-06-07 | 中南民族大学 | Hand-drawn geometric modeling method, system and equipment for cloud native CAD software |
CN118097057B (en) * | 2024-04-25 | 2024-07-02 | 中国电建集团昆明勘测设计研究院有限公司 | Method and device for constructing universal geometric modeling body for hydraulic and hydroelectric engineering |
2005-07-29 — CN application CN2005100122739A granted as patent CN100407798C (not active: Expired - Fee Related)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09180003A (en) * | 1995-12-26 | 1997-07-11 | Nec Corp | Method and device for modeling three-dimensional shape |
CN1263302A (en) * | 2000-03-13 | 2000-08-16 | 中国科学院软件研究所 | Pen- and signal-based manuscript editing technique
US20030025788A1 (en) * | 2001-08-06 | 2003-02-06 | Mitsubishi Electric Research Laboratories, Inc. | Hand-held 3D vision system |
CN1404016A (en) * | 2002-10-18 | 2003-03-19 | 清华大学 | Method for establishing a 3D human face model by fusing multi-view and multi-cue 2D information
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI720513B (en) * | 2019-06-14 | 2021-03-01 | 元智大學 | Image enlargement method |
Also Published As
Publication number | Publication date |
---|---|
CN1747559A (en) | 2006-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100407798C (en) | 3D geometric modeling system and method | |
CN109636831B (en) | A method for estimating 3D human pose and hand information
CN112771539B (en) | Using 3D data predicted from 2D images with neural networks for 3D modeling applications
Hackenberg et al. | Lightweight palm and finger tracking for real-time 3D gesture control | |
US9383895B1 (en) | Methods and systems for interactively producing shapes in three-dimensional space | |
AU2022345532B2 (en) | Browser optimized interactive electronic model based determination of attributes of a structure | |
CN108776773B (en) | Three-dimensional gesture recognition method and interaction system based on depth image | |
CN104937635B (en) | Model-based multiple-hypothesis target tracker
Lin et al. | Two-hand global 3D pose estimation using monocular RGB
CN105006016B (en) | A component-level 3D model construction method with Bayesian network constraints
Li et al. | SweepCanvas: Sketch-based 3D prototyping on an RGB-D image | |
CN109359514B (en) | A joint strategy method for gesture tracking and recognition in deskVR
CN107357427A (en) | A gesture recognition control method for virtual reality devices
CN107240129A (en) | Object and small indoor scene recovery and modeling method based on RGB-D camera data
CN104050859A (en) | Interactive digital stereoscopic sand table system | |
JP2011022984A (en) | Stereoscopic video interactive system | |
CN101807114A (en) | Natural interaction method based on three-dimensional gestures
CN110633628A (en) | 3D model reconstruction method for RGB image scenes based on an artificial neural network
CN117011493B (en) | Three-dimensional face reconstruction method, device and equipment based on signed distance function representation
Fadzli et al. | VoxAR: 3D modelling editor using real hands gesture for augmented reality | |
Beccari et al. | A fast interactive reverse-engineering system | |
CN108628455A (en) | A virtual sand painting drawing method based on touch-screen gesture recognition
CN104484034A (en) | Gesture motion element transition frame positioning method based on gesture recognition | |
Bhakar et al. | A review on classifications of tracking systems in augmented reality | |
CN118466805A (en) | Non-contact 3D model human-computer interaction method based on machine vision and gesture recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2008-07-30; Termination date: 2016-07-29 |