CN101499176B - Video game interface method - Google Patents
- Publication number
- CN101499176B CN2008100571820A
- Authority
- CN
- China
- Prior art keywords
- points
- point
- model
- video
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention is a video game interface method that achieves natural interaction with game scenes. A camera captures two-dimensional images of an aircraft carrying three markers; while the aircraft performs various movements, the positions of its three markers can be identified quickly and accurately, and their spatial positions are computed fairly precisely by coordinate transformation. The steps include: obtaining data from the camera; obtaining three point sets by threshold segmentation; obtaining the planar positions of the three points by clustering; and obtaining the spatial positions by coordinate transformation. The present invention is easy to implement and operate, giving game players a new mode of interaction, distinct from the traditional mouse and keyboard, so that players gain greater immersion and derive more enjoyment from the combination of a physical object with the virtual objects in the game.
Description
Technical Field
The present invention belongs to the fields of pattern recognition and digital interactive entertainment, and relates to a process that uses video interaction technology to accomplish human-computer interactive entertainment.
Background Art
Human-computer interaction is an important component of game software, and its role has become increasingly evident as computer hardware and software develop and user demands grow. Gamers have long used traditional interaction devices such as mice, keyboards, joysticks and other dedicated gaming equipment.
The purpose of a game is to give players more entertainment and novelty. The traditional approach is characterized by hand contact, completing the human-computer interaction process in a "touchable" way; players are already thoroughly familiar with it, and it has become increasingly unable to meet their complex and diverse needs. More natural and intelligent interaction methods, such as sound, vision, motion and touch, have therefore drawn wide attention, and video- and audio-based games have gradually appeared on the market and been widely welcomed.
In the field of digital interactive entertainment, video interaction has in recent years become a research hotspot for human-computer interaction in games, and the camera has become standard equipment on personal computers. Current video-based interaction methods include motion detection, face detection and expression recognition, and gesture detection, but these methods are easily disturbed by changes in lighting and the like, and remain unstable.
Summary of the Invention
The purpose of the present invention is a system for obtaining the three-dimensional coordinates of three marker points on a toy aircraft held by the game player. A camera captures two-dimensional images of a toy aircraft carrying three light-emitting diodes, and the three-dimensional spatial positions of the three light emitters are recognized. The present invention is easy to implement and operate, giving game players a new mode of interaction, distinct from the traditional mouse and keyboard, so that players gain greater immersion and derive more enjoyment from the combination of a physical object with the virtual objects in the game.
To achieve the above purpose, the steps of the video game interface method and system provided by the present invention include:
Step 1: capture samples containing the toy aircraft image from the video stream;
Step 2: apply threshold segmentation to the pixel information of the video image to obtain the point set to be clustered in the next step;
Step 3: cluster the point set with density-based clustering (DBSCAN) to obtain the three point clusters that the three markers form on screen, and compute the two-dimensional coordinates of each marker as its cluster's center of gravity;
Step 4: using the perspective-three-point (P3P) method, combined with the camera's intrinsic parameters and the aircraft model's extrinsic parameters, obtain the three-dimensional coordinates of the three points on the toy aircraft;
Step 5: build a model in OpenGL for demonstration.
Further, the step of capturing samples containing the toy aircraft image from the video stream includes:
Step 11: obtain a 320×240-pixel image for each frame from the fisheye camera;
Step 12: represent each pixel by three elements, the R, G, B color components; each component ranges from 0 to 255 and is stored in C++ as an unsigned char variable;
Step 13: each frame therefore holds 320×240×3 variables in total, each pixel being represented by its three components R, G and B;
Step 14: preprocess each frame, rejecting regions whose gray level deviates from the target set value (a minimal sketch of this buffer layout and pre-filter follows).
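By way of illustration only (this code is not part of the patent text), a minimal C++ sketch of the frame buffer layout and gray-level pre-filter described in steps 11-14; the target and tolerance parameters and the function name are hypothetical:

```cpp
#include <vector>
#include <cstdlib>

// One 320x240 frame: 3 unsigned chars per pixel (R, G, B), row-major.
const int W = 320, H = 240;

// Hypothetical pre-filter for step 14: reject pixels whose gray level
// deviates from the target set value by more than a tolerance.
void prefilter(std::vector<unsigned char>& frame, int target, int tol) {
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            unsigned char* p = &frame[(y * W + x) * 3];  // p[0]=R, p[1]=G, p[2]=B
            int gray = (p[0] + p[1] + p[2]) / 3;         // simple average gray level
            if (std::abs(gray - target) > tol)
                p[0] = p[1] = p[2] = 0;                  // discard this pixel
        }
    }
}
```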
Further, the step of applying threshold segmentation to the pixel information of the video image to obtain the point set to be clustered in the next step includes:
Step 21: apply a second preprocessing pass to the preprocessed image, removing isolated noise points and extracting the continuously distributed points whose color information differs from that of the markers by less than the set value;
Step 22: select a threshold segmentation model according to the light intensity;
Step 23: obtain the target points, whose number varies with the chosen threshold segmentation model.
Further, the step of clustering the point set with density-based clustering (DBSCAN) to obtain the three point clusters formed on screen and computing the two-dimensional coordinates of the points as centers of gravity includes:
Step 31: adjust the model's parameter values according to the number of points obtained in the previous step and re-collect points, so that the number of points stays within the set range;
Step 32: convert the points obtained in step 31 into coordinate values in the planar video image;
Step 33: initialize the DBSCAN clustering;
Step 34: read in the two-dimensional point coordinates from step 33 one by one, as the input and initial points of the DBSCAN clustering;
Step 35: preprocess the points read in;
Step 36: establish the connectivity relationships between the points;
Step 37: establish the core points of the clusters, adjusted according to the parameters;
Step 38: determine the clusters;
Step 39: collect the three groups of points given by the clustering and obtain their center-of-gravity coordinates.
Further, the step of obtaining the three-dimensional coordinates of the three points on the toy aircraft by the perspective-three-point method, combined with the camera's intrinsic parameters and the aircraft model's extrinsic parameters, includes:
Step 41: take the center-of-gravity coordinates of the three clusters from step 39;
Step 42: define three coordinate systems: the image coordinate system, the camera coordinate system and the world coordinate system;
Step 43: calibrate the camera with the ARToolKit toolkit to obtain its intrinsic parameters, such as the focal length and optical center;
Step 44: establish the conversion between pixel and millimeter lengths;
Step 45: derive the camera's spatial model from the geometric relationships among the three coordinate systems;
Step 46: obtain the perspective-three-point configuration from the camera's spatial model;
Step 47: obtain the extrinsic parameters of the toy aircraft, i.e. the three side lengths of the triangle formed by the three points, and convert them into pixel units;
Step 48: compute the perspective-three-point coordinate transformation;
Step 49: recover the spatial three-dimensional coordinates of the three points from the perspective-three-point solution.
Further, the step of demonstrating with the OpenGL model includes:
Step 51: initialize the OpenGL environment;
Step 52: correct the obtained three-dimensional data and normalize it into data acceptable to OpenGL;
Step 53: build a triangle model in OpenGL from the three-dimensional data;
Step 54: the OpenGL model moves in real time with the movement of the real toy aircraft (a minimal rendering sketch follows).
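By way of illustration only, a minimal GLUT-based sketch of steps 51-54; the normalization scale, the function names and the window setup are assumptions rather than details from the patent:

```cpp
#include <GL/glut.h>

float pts[3][3];  // the three marker points, refreshed each frame by the tracker

// Step 52 (hypothetical): scale camera-space coordinates into OpenGL's
// normalized viewing volume.
void normalizePoint(const float in[3], float out[3], float scale) {
    for (int i = 0; i < 3; ++i) out[i] = in[i] / scale;
}

// Steps 53-54: draw a triangle through the three tracked points and
// redraw continuously so the model follows the real toy aircraft.
void display() {
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glBegin(GL_TRIANGLES);
    for (int i = 0; i < 3; ++i)
        glVertex3fv(pts[i]);
    glEnd();
    glutSwapBuffers();
    glutPostRedisplay();
}

int main(int argc, char** argv) {
    glutInit(&argc, argv);                                 // step 51
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
    glutCreateWindow("toy aircraft demo");
    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}
```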
The present invention uses computer vision and image processing technology to interact naturally with game scenes. Traditional interaction methods are characterized by hand contact, such as the mouse and keyboard, completing the human-computer interaction process in a "touchable" way. With the development of computer technology, these traditional human-computer interaction techniques have become increasingly unable to meet players' complex and diverse needs. Game players demand more natural and intelligent interaction methods, and the computer-vision approach gives players greater immersion.
In addition, the present invention adopts the density-based DBSCAN algorithm, which can identify clusters of arbitrary shape and requires only that the initialization parameters be set manually, which can be done from experience; it remains quite efficient when the data volume is large. In implementing the algorithm, the search region for core points is restricted, which greatly narrows the search range and improves search efficiency, bringing the clustering time down to about 20 ms and meeting the real-time requirements of video.
Brief Description of the Drawings
Fig. 1 is the initial video output obtained by the present invention;
Fig. 2 is the video capture property diagram of the present invention;
Fig. 3 is the program structure diagram of the present invention as used in a digital game;
Fig. 4 is a schematic diagram of the present invention used in a digital game;
Fig. 5 shows the connectivity of points in the DBSCAN clustering process;
Fig. 6 is a schematic diagram of core points and non-core points in the clustering process;
Fig. 7 is the model of the perspective-three-point (P3P) problem.
Detailed Description of the Embodiments
The present invention is described in detail below with reference to the accompanying drawings. The described embodiments are intended only to aid understanding of the invention and in no way limit it.
The operation of the video game interface method and system is further illustrated below through an example.
All the code in this example is written in C++ and runs in the Microsoft Visual Studio 2005 environment. The program structure is shown in Fig. 3, in which 300, 301, 302 and 303 are components of the present invention and can be connected to the game scene through an API.
As shown in Fig. 1, the first step of the system is image acquisition, item 300 in Fig. 3.
(1-1) The information handled in digital image processing is mostly two-dimensional, so the amount of information to process is large. An image here is represented by a two-dimensional function f(x, y), where x, y are the two-dimensional coordinates and f(x, y) is the color information at point (x, y). The camera collects all the optical information within its lens from the scene; once this information enters the computer, it is converted into a color model conforming to computer standards so that it can enter the program for digital image processing, while the continuity and real-time performance of the video are preserved. Every pixel of the captured image is processed, 320×240 = 76,800 pixels in total. The initial appearance of the captured video is shown in Fig. 1. All subsequent operations and computations of the project are based on these 320×240 pixels of each frame. The video capture properties are shown in Fig. 2.
(1-2) Every color in nature can be composed from the three primary colors red, green and blue (R, G, B). According to how much of the red component it contains, a color can be assigned to one of 256 levels from 0 to 255: level 0 contains no red component, and level 255 contains 100% red. Green and blue are likewise divided into 256 levels. The video capture properties shown in Fig. 2 select exactly this RGB format. Such a representation is easy for the computer to process and provides convenience for the following stages; this is item 301 in Fig. 3.
(1-3) The next step is to recognize the markers on the toy aircraft. The markers currently used are red diodes about 1 cm in diameter, which can be captured well even when not emitting light. Their advantage is the spherical surface, which keeps recognition reliable as the viewing angle changes. The regions to capture are the three circled areas in Fig. 1. The image must be binarized (0/1 labeling), i.e. threshold segmentation. The data then pass to the next step, where clusters are sought to determine the exact positions of the light emitters in the two-dimensional image. Given the original image f(x, y), a suitable piece of color information is chosen in f(x, y) as the threshold t according to some criterion; the image g(x, y) segmented by the above method can then be expressed as:

g(x, y) = 1 if f(x, y) satisfies the threshold condition t, and g(x, y) = 0 otherwise.
Through repeated experiments: let r, g and b respectively denote the information of the three colors, each in the range 0-255, and let _r = r/(r+g+b). A pixel is accepted when it satisfies _r > k and r+g+b > 350, i.e. when the red information reaches a certain proportion of the total. Here k is a variable parameter in the program: when the light is strong more data are collected, so k should be reduced to lower the number of collected points; likewise, k is increased correspondingly when the light is weak (an illustrative sketch of this test follows).
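A minimal C++ sketch of this segmentation test, as one plausible reading of the rule above (the function name is hypothetical; the constant 350 and the parameter k come from the text):

```cpp
// Returns true if a pixel passes the red-marker threshold test:
// the red fraction _r = r/(r+g+b) exceeds k and the pixel is bright
// enough overall (r+g+b > 350). k is tuned to the ambient light.
bool isMarkerPixel(unsigned char r, unsigned char g, unsigned char b, double k) {
    int sum = r + g + b;
    if (sum <= 350) return false;            // too dark: reject outright
    double redFraction = double(r) / sum;    // _r in the text
    return redFraction > k;
}
```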
Taking the point set obtained in step (1-3) as input to the clustering, the positions of the three markers in the planar video can be found, as with the point sets located in Fig. 4; this is item 302 in Fig. 3.
(2-1) DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm. The purpose of clustering is to group data of the same kind together and to separate data of different kinds. DBSCAN is a density-based algorithm; its basic idea is that for every object in a cluster, the neighborhood of a given radius must contain no fewer than a given minimum number of objects. Some definitions used by the DBSCAN algorithm follow.
Definition: Eps-neighborhood of a point
The Eps-neighborhood of a point p is denoted by NEps(p) and defined as:
NEps(p) = {q ∈ D | dist(p, q) ≤ Eps}
Definition: MinPts
A point p is a core point when the number of points in its Eps-neighborhood is not less than MinPts. As shown in Fig. 6, if MinPts is 7 and Eps is the radius of the circle, then A is a core point and B is not.
Definition: cluster
Let D be the set of points in the database. A cluster C is a non-empty set of points in D satisfying the following conditions:
(1) ∀ p, q: if p ∈ C and q is density-reachable from p, then q ∈ C (maximality);
(2) ∀ p, q ∈ C: p is density-connected to q.
As shown in Fig. 4, the points found are exactly the 3 clusters.
Definition: noise
Let C1, ..., Ck be the clusters of database D, i = 1, ..., k. Noise is defined as the set of points in the database that belong to no cluster Ci, i.e., noise = {p ∈ D | ∀ i: p ∉ Ci}.
(2-2) Step 1: reading in and preprocessing the points. As mentioned in the definition of a point, each point has a member of class sSortedList, a linked list of points arranged in order of their distance from this point. For example, given the points A(0, 0), B(1, 1), C(1, 1.5), D(2, 2), E(2.5, 0), the sSortedList member of point A is as shown in Fig. 5.
(2-3) Step 2: establishing the core points. The prerequisite for this step is the point association completed in Step 1. Starting from the first point, the sSortedList member of each point is examined: the MinPts-th node is selected, and if the distance to that node is less than Eps, the point's core_tag value is set to true. This continues until all points have been judged. Fig. 6 shows examples of core and non-core points.
This process can also be understood as drawing a circle of radius Eps around the point being judged: if the number of points inside the circle is not less than MinPts, the point is a core point. The purpose of this step is to complete this preparation for every point. In the program, each point enters the class as a linked list, and each node has members like those in the figure above. Sorting is performed during insertion, with time complexity O(N²). If DBSCAN must process N points in total, the sSortedList member of each point has length N; that is, the whole structure will hold N×N nodes. If N is large, this leads to very large storage requirements and long computation times.
(2-4) Once all the core points have been determined in the previous step, the density-based algorithm has its foundation. The basic idea of DBSCAN is that for every object in a cluster, the neighborhood of a given radius must contain no fewer than a given minimum number of objects. The clustering process proceeds as follows:

First a global cluster number current_class_id is set up and initialized to 1. Then, starting from the first point: if the point is a core point and has not been visited, its class_id is set to the global cluster number current_class_id, its classified_tag is set to true, and the core-point clustering function CorePointCluster is entered. After this process completes, the global cluster number current_class_id is incremented, and this continues until all points have been visited. In this way all clusters are found: points with the same class_id belong to the same cluster, and points whose class_id remains 0 are noise points (a sketch of this loop follows).
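A compact C++ sketch of this labeling loop, assuming a Point type with the fields named in the text (core_tag, classified_tag, class_id); the neighbor query and the expansion routine mirror the CorePointCluster function described above but are reconstructions, not the patent's code:

```cpp
#include <vector>
#include <queue>
#include <cmath>

struct Point {
    double x, y;
    bool core_tag = false;        // set while establishing core points (step 2)
    bool classified_tag = false;  // visited flag
    int  class_id = 0;            // 0 = noise / not yet assigned
};

// Indices of points within Eps of point p (reconstruction of the neighbor query).
std::vector<int> neighbors(const std::vector<Point>& pts, int p, double eps) {
    std::vector<int> out;
    for (int i = 0; i < (int)pts.size(); ++i)
        if (i != p && std::hypot(pts[i].x - pts[p].x, pts[i].y - pts[p].y) <= eps)
            out.push_back(i);
    return out;
}

// Grow one cluster outward from a core point (the role of CorePointCluster).
void corePointCluster(std::vector<Point>& pts, int seed, int id, double eps) {
    std::queue<int> q; q.push(seed);
    while (!q.empty()) {
        int p = q.front(); q.pop();
        for (int n : neighbors(pts, p, eps)) {
            if (pts[n].classified_tag) continue;
            pts[n].classified_tag = true;
            pts[n].class_id = id;
            if (pts[n].core_tag) q.push(n);   // only core points extend the cluster
        }
    }
}

void dbscan(std::vector<Point>& pts, double eps) {
    int current_class_id = 1;                 // global cluster number
    for (int i = 0; i < (int)pts.size(); ++i) {
        if (!pts[i].core_tag || pts[i].classified_tag) continue;
        pts[i].classified_tag = true;
        pts[i].class_id = current_class_id;
        corePointCluster(pts, i, current_class_id, eps);
        ++current_class_id;
    }   // points whose class_id is still 0 are noise
}
```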
(2-5) Improvement of the clustering algorithm. In fact, the whole program only ever uses the MinPts points nearest each point and the points inside the circle of radius Eps. So only these points need to be associated, and their number is certainly less than N. If MinPts and Eps are chosen small, the time and space saved are considerable. This satisfies the real-time requirement of the program (a sketch of such a capped neighbor list follows).
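A sketch of one way to realize this improvement, capping the per-point sorted neighbor list (the sSortedList member) instead of storing all N neighbors; the type and function names here are hypothetical:

```cpp
#include <list>

struct Neighbor { int index; double dist; };

// Hypothetical capped insertion: keep only neighbors inside the Eps circle,
// and at most maxKeep of them, so the structure stays far below N x N nodes.
void insertCapped(std::list<Neighbor>& lst, Neighbor n, double eps, size_t maxKeep) {
    if (n.dist > eps) return;                       // outside the search circle
    auto it = lst.begin();
    while (it != lst.end() && it->dist < n.dist) ++it;
    lst.insert(it, n);                              // list stays sorted by distance
    if (lst.size() > maxKeep) lst.pop_back();       // drop the farthest entry
}
```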
After clustering, the planar coordinates of the three points are obtained; their spatial coordinates are then recovered by the perspective-three-point (P3P) method, part 303 in Fig. 3.
(3-1) First the camera must be calibrated. The purpose of calibration is to obtain the camera's intrinsic parameters, such as the focal length and the coordinates of the optical center.
(3-2) The P3P problem is nonlinear and has no exact closed-form solution, so an approximate algorithm is needed. We apply the algorithm for the two-point problem to solve the three-point problem. As shown in Fig. 7, points m0, m1, m2 are the projections on the image of the spatial points M0, M1, M2; segment m0m1 is the projection of the spatial segment M0M1, whose length is D1; segment m0m2 is the projection of the spatial segment M0M2, whose length is D2; of the angles at vertex M0, the angle ∠M1M0M2 is known and equals α. The configuration of the triangle in space is completely determined by the three image points m0, m1, m2, the distance R0 to point M0 along the extension of segment Om0, and the angles θ1, θ2. Once R0, θ1 and θ2 have been computed, the positions and orientation of the three spatial points M0, M1, M2 are determined.
From the computation of the two-point problem, applying the law of sines to segments M0M1 and M0M2 (with γ1 and γ2 the angles at the camera center O between the ray Om0 and the rays Om1 and Om2 respectively), we obtain the following relations:

R0 = D1·sin(θ1 - γ1)/sin γ1 = D2·sin(θ2 - γ2)/sin γ2 (1)

In this way an equation of the angular relationship is found. Dividing the two expressions in (1) to eliminate R0, we get:

sin(θ1 - γ1) = K·sin(θ2 - γ2) (2)

where K is obtained from the known variables γ1, γ2, D1 and D2:

K = (D2·sin γ1)/(D1·sin γ2)
The other equation relating the angles θ1 and θ2 is obtained from the angle α between the spatial segments M0M1 and M0M2:

cosα = sinθ1sinθ2cosφ + cosθ1cosθ2 (3)

where φ is the angle between m0m1 and m0m2 in the image.
In this way we obtain two equations, (2) and (3), in θ1 and θ2. Approximate solution of the equations: using the estimation method (equation (2) reduces to (4) when the small angles γ1 and γ2 at the camera center are neglected), the following two equations need to be solved:

sinθ1 = K·sinθ2 (4)

sinθ1sinθ2cosφ + cosθ1cosθ2 = cosα (5)
Substituting equation (4) into equation (5), isolating cosθ1cosθ2 and squaring both sides gives:

cos²α + sin²θ1sin²θ2cos²φ - 2cosα·sinθ1sinθ2cosφ = cos²θ1cos²θ2

Writing sinθ2 = sinθ1/K and letting X1 = sin²θ1 turns this into a quadratic equation in X1:

a·X1² + b·X1 + c = 0

The equation has two solutions for X1, but only the smaller one is less than 1 and can be the value of a squared sine. So this quadratic equation yields the solution for sinθ1:

sinθ1 = sqrt( (-b - sqrt(b² - 4ac)) / (2a) )

where

a = sin²φ/K², b = 2cosα·cosφ/K - (1 + 1/K²), c = sin²α.
From the value of sinθ1, two values of θ1 can be obtained (θ1 and π - θ1), corresponding to the two possible directions of segment M0M1. Substituting each into formula (1), R0 = D1·sin(θ1 - γ1)/sin γ1, only one candidate yields a valid positive distance. So the final solution R0 (the length of segment OM0) is unique.
This yields the distance R0 from the camera center, as shown in Fig. 7. Since the angle between each point and the camera's optical center is known, the depth information of each point can be computed from the triangle relations; combined with the planar coordinates, the exact position in space is obtained (a sketch of this computation follows).
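For illustration, a C++ sketch of the approximate P3P distance recovery; since equations (1), (2), (4) and the quadratic coefficients were reconstructed from context above rather than taken verbatim from the patent, treat this only as a sketch under those assumptions:

```cpp
#include <cmath>

// Approximate P3P distance recovery (reconstruction; see hedge above).
// D1, D2: spatial lengths of M0M1, M0M2; gamma1, gamma2: angles at the
// camera center O between ray Om0 and rays Om1, Om2; alpha: angle M1-M0-M2;
// phi: image angle between m0m1 and m0m2. Returns R0 = |OM0|.
double solveR0(double D1, double D2, double gamma1, double gamma2,
               double alpha, double phi) {
    const double PI = std::acos(-1.0);
    double K = (D2 * std::sin(gamma1)) / (D1 * std::sin(gamma2));   // from eq. (2)
    // Quadratic a*X1^2 + b*X1 + c = 0 in X1 = sin^2(theta1), obtained by
    // substituting eq. (4) into eq. (5) and squaring.
    double a = std::pow(std::sin(phi) / K, 2);
    double b = 2.0 * std::cos(alpha) * std::cos(phi) / K - (1.0 + 1.0 / (K * K));
    double c = std::sin(alpha) * std::sin(alpha);
    double X1 = (-b - std::sqrt(b * b - 4.0 * a * c)) / (2.0 * a);  // smaller root
    double sinTheta1 = std::sqrt(X1);
    // Two candidate angles, theta1 and PI - theta1; keep the one that yields
    // a positive distance through eq. (1).
    double t1 = std::asin(sinTheta1);
    double r0 = D1 * std::sin(t1 - gamma1) / std::sin(gamma1);
    if (r0 <= 0.0)
        r0 = D1 * std::sin((PI - t1) - gamma1) / std::sin(gamma1);
    return r0;
}
```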
The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any transformation or substitution conceivable to anyone familiar with the technology, within the technical scope disclosed by the present invention, shall be covered by the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100571820A CN101499176B (en) | 2008-01-30 | 2008-01-30 | Video game interface method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100571820A CN101499176B (en) | 2008-01-30 | 2008-01-30 | Video game interface method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101499176A CN101499176A (en) | 2009-08-05 |
CN101499176B true CN101499176B (en) | 2011-08-31 |
Family
ID=40946240
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008100571820A Expired - Fee Related CN101499176B (en) | 2008-01-30 | 2008-01-30 | Video game interface method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101499176B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105894569A (en) * | 2014-12-03 | 2016-08-24 | 北京航天长峰科技工业集团有限公司 | OpenGL based three-dimensional rapid modeling method |
CN114281285B (en) * | 2021-07-14 | 2024-05-28 | 海信视像科技股份有限公司 | Display device and display method for stably presenting depth data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1613084A2 (en) * | 2004-06-29 | 2006-01-04 | Videra Oy | AV system and control unit |
CN1716273A (en) * | 2004-06-28 | 2006-01-04 | 李剑华 | Outer shape structure of commercial guest greeting robot and identifying method |
CN1909024A (en) * | 2005-08-05 | 2007-02-07 | 颜博文 | Interdynamic advertisement watching board |
- 2008-01-30: CN CN2008100571820A granted as patent CN101499176B (not active; Expired - Fee Related)
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1716273A (en) * | 2004-06-28 | 2006-01-04 | 李剑华 | Outer shape structure of commercial guest greeting robot and identifying method |
EP1613084A2 (en) * | 2004-06-29 | 2006-01-04 | Videra Oy | AV system and control unit |
CN1909024A (en) * | 2005-08-05 | 2007-02-07 | 颜博文 | Interdynamic advertisement watching board |
Also Published As
Publication number | Publication date |
---|---|
CN101499176A (en) | 2009-08-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20110831; termination date: 20190130 |