CN103648106B

CN103648106B - WiFi indoor positioning method of semi-supervised manifold learning based on category matching

Info

Publication number: CN103648106B
Application number: CN201310750528.6A
Authority: CN
Inventors: 谭学治; 周才发; 马琳; 邓仲哲; 何晨光; 迟永钢; 魏守明
Original assignee: Harbin Institute of Technology Shenzhen
Current assignee: Hit Robot Group Co ltd
Priority date: 2013-12-31
Filing date: 2013-12-31
Publication date: 2017-03-22
Anticipated expiration: 2033-12-31
Also published as: CN103648106A

Abstract

The invention discloses a WiFi indoor positioning method of semi-supervised manifold learning based on category matching, and relates to an indoor positioning method. The WiFi indoor positioning method disclosed by the invention is used for solving the problems that a Radio Map database is large and the like in an existing WiFi indoor positioning method. The WiFi indoor positioning method comprises the following steps: 1. collecting Radio Map; 2. carrying out intrinsic dimension analysis on the Radio Map; 3. carrying out clustering analysis on the Radio Map; 4. carrying out dimensionality reduction on the Radio Map; 5. adding RSS in the Radio Map to obtain Radio Mapul; and 6. carrying out dimensionality reduction on the Radio Mapul to obtain a characteristic transformation matrix V, and forming an online positioning database through the Radio Map * and V. The WiFi indoor positioning method also comprises the following steps: 1. online testing RSS; 2. carrying out dimensionality reduction on the RSS to obtain RSS *; 3. outputting a positioning result; and 4. updating the database. The WiFi indoor positioning method disclosed by the invention is applied to the field of network technology.

Description

A WiFi Indoor Localization Method Based on Semi-supervised Manifold Learning of Category Matching

技术领域technical field

本发明涉及一种室内定位方法，具体涉及一种基于类别匹配的半监督流形学习的WiFi室内定位方法。The invention relates to an indoor positioning method, in particular to a WiFi indoor positioning method based on category matching semi-supervised manifold learning.

背景技术Background technique

随着无线局域网络在世界范围的飞速发展和移动终端设备的广泛普及，近年来出现了许多室内定位相关的技术和应用。由于多径效应、信号衰减及室内定位环境的复杂性，基于传统的信号传播模型的室内定位方法难以达到高精度的室内定位要求。基于到达时间（Time of Arrival），到达时间差(Time Difference of Arrival)和到达角度(Angles ofArrival)等定位方法虽然可以基本满足定位精度需求，然而都需要定位终端有额外的硬件设备支持，具有较大局限性，从而导致基于上述几类定位方法的室内定位系统没有得到普及。With the rapid development of wireless local area networks in the world and the widespread popularization of mobile terminal equipment, many technologies and applications related to indoor positioning have emerged in recent years. Due to the multipath effect, signal attenuation and the complexity of the indoor positioning environment, it is difficult for the indoor positioning method based on the traditional signal propagation model to meet the high-precision indoor positioning requirements. Although positioning methods based on Time of Arrival (Time of Arrival), Time Difference of Arrival (Time Difference of Arrival) and Angles of Arrival (Angles of Arrival) can basically meet the positioning accuracy requirements, they all require the positioning terminal to have additional hardware equipment support, which has a large Due to the limitations, the indoor positioning system based on the above-mentioned types of positioning methods has not been popularized.

目前，基于WLAN位置指纹(Finger Print)的WiFi室内定位方法得到了广泛应用。该方法的网络构建方法成本低廉，其使用2.4GHz ISM(Industrial Science Medicine)公共频段且无需在现有设施之上添加定位测量专用硬件。只需要通过移动终端的无线网卡及相应软件测量接收到的接入点(Access Point,AP)的信号强度(Received SignalStrength,RSS)，由此来构建网络信号覆盖图(Radio Map)，进而通过匹配算法来预测移动用户所处位置的坐标，或相对位置。At present, the WiFi indoor positioning method based on the WLAN location fingerprint (Finger Print) has been widely used. The network construction method of this method is low in cost, and it uses the 2.4GHz ISM (Industrial Science Medicine) public frequency band and does not need to add special hardware for positioning measurement on the existing facilities. It only needs to measure the received signal strength (Received SignalStrength, RSS) of the access point (Access Point, AP) through the wireless network card of the mobile terminal and the corresponding software, so as to construct the network signal coverage map (Radio Map), and then pass the matching Algorithms to predict the coordinates, or relative location, of where the mobile user is.

然而通过该方式建立的Radio Map包含有庞大的数据信息，且随着定位区域扩大，Radio Map可能（依据定位匹配方式及算法选择）呈指数形势增长。获得尽可能多的相关数据特征信息对于整个系统来说会提升定位精度，但是处理大量的特征信息增加算法开销，定位算法无法在处理能力有限的移动终端上有效运行，同时某些特征信息可能是对于定位没有作用甚至有负面作用，致使匹配效率降低，从而导致匹配定位算法的实现变得更加复杂，并且定位精度下降。However, the Radio Map established by this method contains huge data information, and as the positioning area expands, the Radio Map may (according to the positioning matching method and algorithm selection) grow exponentially. Obtaining as much relevant data feature information as possible will improve the positioning accuracy for the entire system, but processing a large amount of feature information increases the algorithm overhead, and the positioning algorithm cannot run effectively on mobile terminals with limited processing capabilities. At the same time, some feature information may be It has no effect on positioning or even has a negative effect, resulting in a reduction in matching efficiency, which leads to more complex implementation of matching positioning algorithms and a decrease in positioning accuracy.

当AP的数目增加及定位的参考点（Reference Point）增加时，Radio Map的数据信息增加。此时，Radio Map中代表的AP数目的信息表示了数据的维数。因此，当AP数目增加，Radio Map就变成了高维数据。为减轻处理高维数据的负担，降维算法是有效的解决方法之一。高维数据可能包含很多特征，这些特征都在描述同一个事物，这些特征一定程度上是紧密相连的。如当从各个角度对同一个物体同时拍照时，得到的数据就含有重叠的信息。如果能得到这些数据的一些简化的不重叠的表达，将会极大地提高数据处理运行的效率并一定程度上提高准确度。降维算法的目的也正是在于提高高维数据的处理效率。When the number of APs increases and the reference point (Reference Point) for positioning increases, the data information of the Radio Map increases. At this time, the information on the number of APs represented in the Radio Map indicates the dimensionality of the data. Therefore, when the number of APs increases, the Radio Map becomes high-dimensional data. To reduce the burden of dealing with high-dimensional data, dimensionality reduction algorithm is one of the effective solutions. High-dimensional data may contain many features, all of which describe the same thing, and these features are closely connected to a certain extent. For example, when taking pictures of the same object from various angles at the same time, the obtained data contains overlapping information. If some simplified non-overlapping expressions of these data can be obtained, the efficiency of data processing operation will be greatly improved and the accuracy will be improved to a certain extent. The purpose of dimensionality reduction algorithm is to improve the processing efficiency of high-dimensional data.

除了可以简化数据使其能够高效处理外，降维方法还可以实现数据可视化。由于很多统计学的和机器学习算法对于最优解的准确性很差，降维的可视化应用可以令用户能够实际看到高维数据的空间结构和算法输出的能力，具有很强的应用价值。In addition to simplifying data for efficient processing, dimensionality reduction methods also enable data visualization. Due to the poor accuracy of many statistical and machine learning algorithms for the optimal solution, the visualization application of dimensionality reduction can enable users to actually see the spatial structure of high-dimensional data and the ability of algorithm output, which has strong application value.

目前有很多基于不同目的的降维算法，包括有线性与非线性降维算法。其中PCA(Principal Component Analysis)和LDA(Linear Discriminant Analysis)是典型的线性降维算法。这一类算法对于具有线性结构的高维数据有着良好的处理结果，但对于非线性结构的高维数据没有好的结果。典型的非线性降维算法以流形学习（Manifold Learning）算法。2000年Science杂志上同一期发表了3篇有关于流形学习算法中提出了2种经典的流形学习算法：LLE(Local Linear Embedding)及ISOMAP(Isometric Mapping)。由此，各种基于不同的准则的流形学习算法被提出并有一部分流形学习算法应用于图像处理方面。LDE（Local Discriminant Embedding）算法是流形学习算法中较晚提出的，它是一种的典型的基于特征提取的流形学习算法，而不只基于可视化目标。There are currently many dimensionality reduction algorithms for different purposes, including linear and nonlinear dimensionality reduction algorithms. Among them, PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) are typical linear dimensionality reduction algorithms. This type of algorithm has good processing results for high-dimensional data with a linear structure, but has no good results for high-dimensional data with a nonlinear structure. A typical nonlinear dimensionality reduction algorithm is the Manifold Learning algorithm. In 2000, Science magazine published 3 articles about manifold learning algorithms in the same issue, and proposed 2 classic manifold learning algorithms: LLE (Local Linear Embedding) and ISOMAP (Isometric Mapping). As a result, various manifold learning algorithms based on different criteria have been proposed and some of them are applied to image processing. The LDE (Local Discriminant Embedding) algorithm was proposed later in the manifold learning algorithm. It is a typical manifold learning algorithm based on feature extraction, not just based on the visualization target.

对于上述的降维算法并不能对在线获得的RSS数据或者新增的RSS来提高降维的精度。由于室内定位环境的变化，在不同时刻，特别是在长时间后，RSS数据之间的相关性就会降低。在已有算法中，有一类算法可以将在不同时间的RSS数据同时加入相应的已有数据中，从而增强不同数据之间的相关性，提高降维精度。这一类算法通常称之半监督算法。根据半监督算法的特点，提出半监督鉴别嵌入（Semi-supervised Discriminant Embedding,SDE）算法。The above-mentioned dimensionality reduction algorithm cannot improve the accuracy of dimensionality reduction for the RSS data obtained online or the newly added RSS. Due to changes in the indoor positioning environment, the correlation between RSS data will decrease at different moments, especially after a long time. Among the existing algorithms, there is a class of algorithms that can add RSS data at different times to the corresponding existing data at the same time, thereby enhancing the correlation between different data and improving the accuracy of dimensionality reduction. This type of algorithm is usually called semi-supervised algorithm. According to the characteristics of semi-supervised algorithm, a semi-supervised discriminant embedding (Semi-supervised Discriminant Embedding, SDE) algorithm is proposed.

根据SDE算法的特点，采用类别匹配的方式将在线新得到的RSS加入已有RadioMap，然后进行局部鉴别嵌入降维，从而提出基于类别匹配的半监督局部鉴别嵌入算法（Classification Matching Based Semi-supervised Discriminant Embedding，CM-SDE）。采用CM-SDE算法对Radio Map进行降维，得出降维后的Radio Map，将对降维后的Radio Map室内定位，从而提出基于CM-SDE算法的WiFi室内定位算法。According to the characteristics of the SDE algorithm, the newly obtained online RSS is added to the existing RadioMap by category matching, and then the local discriminative embedding is carried out to reduce the dimensionality, thus a Classification Matching Based Semi-supervised Discriminant Embedding Algorithm (Classification Matching Based Semi-supervised Discriminant Discriminant) is proposed. Embedding, CM-SDE). The CM-SDE algorithm is used to reduce the dimension of the Radio Map, and the dimension-reduced Radio Map is obtained. The indoor positioning of the reduced-dimensional Radio Map is used to propose a WiFi indoor positioning algorithm based on the CM-SDE algorithm.

发明内容Contents of the invention

本发明是要解决现有WiFi室内定位方法中存在的Radio Map数据库大，以及由于在线定位阶段计算复杂度高而引起的难以应用在线阶段获得RSS数据、难于在移动终端实现以及难于保证定位的实时性要求等问题，而提供了一种基于类别匹配的半监督流形学习的WiFi室内定位方法。The present invention is to solve the problem of the large Radio Map database existing in the existing WiFi indoor positioning method, and the difficulty in applying the online phase to obtain RSS data due to the high computational complexity of the online positioning phase, the difficulty in realizing it in the mobile terminal, and the difficulty in ensuring real-time positioning. In order to meet the performance requirements and other issues, a WiFi indoor localization method based on category matching semi-supervised manifold learning is provided.

基于类别匹配的半监督流形学习的WiFi室内定位方法离线阶段定位过程按以下步骤实现：The WiFi indoor positioning method based on category matching semi-supervised manifold learning offline stage positioning process is implemented in the following steps:

一、对待定位的室内区域布置AP，使无线信号覆盖待定位的室内区域，完成WiFi网络构建；1. Arrange APs in the indoor area to be positioned, so that the wireless signal covers the indoor area to be positioned, and complete the construction of the WiFi network;

在待定位的室内区域规则选取并记录参考点的相应坐标，测量并依次记录参考点接收到的所有AP的RSS信号作为位置特征信息，构建Radio Map，并存储Radio Map；Select and record the corresponding coordinates of the reference point in the indoor area to be positioned, measure and record the RSS signals of all APs received by the reference point in turn as location feature information, construct a Radio Map, and store the Radio Map;

二、采用GMST本征维数估计算法对步骤一中构建的Radio Map的本征维数进行分析，得到的本征维数作为CM-SDE算法的输入参数之一，决定Radio Map降维后的维数；2. Use the GMST eigendimension estimation algorithm to analyze the eigendimension of the Radio Map constructed in step 1. The obtained eigendimension is used as one of the input parameters of the CM-SDE algorithm to determine the dimensionality of the Radio Map after dimension reduction. dimension;

三、采用KFCM算法对Radio Map进行聚类分析，实现建立的Radio Map的类别标记，并作为CM-SDE的输入参数之一，并且提供相应的初始聚类中心及类别标记；3. Use the KFCM algorithm to perform clustering analysis on the Radio Map, realize the category mark of the established Radio Map, and use it as one of the input parameters of CM-SDE, and provide the corresponding initial cluster center and category mark;

四、步骤二中的本征维数与步骤三中的类别标记作为输入参数，采用CM-SDE算法对步骤一中构建的Radio Map降维，得出相应的降维后的RadioMap^*，RadioMap^*作为在匹配定位数据库用于在线定位阶段；4. The intrinsic dimension in step 2 and the category mark in step 3 are used as input parameters, and the CM-SDE algorithm is used to reduce the dimension of the Radio Map constructed in step 1, and the corresponding dimension-reduced RadioMap ^* , RadioMap ^* is obtained Used as a matching location database for the online location stage;

五、将不同用户在线定位阶段测试得到的未标记RSS，采用类别匹配的方式加入至已有Radio Map中，得到相应的包含未标记信号覆盖图RadioMap_ul，通过类别匹配更新的聚类中心作为CM-SDE算法中新的类别输入参数；5. Add the unmarked RSS obtained from the online positioning stage test of different users to the existing Radio Map by category matching, and obtain the corresponding RadioMap _ul that contains the unmarked signal coverage map. The cluster center updated by category matching is used as the CM - New category input parameter in SDE algorithm;

六、步骤五中的更新的聚类中心作为输入参数，采用CM-SDE对RadioMap_ul降维得到特征变换矩阵V′，V′与RadioMap^*共同构成在线匹配定位数据库，用于在线阶段定位；其中，所述线阶段定位具体为：6. The updated clustering center in step 5 is used as an input parameter, and the feature transformation matrix V′ is obtained by using CM-SDE to reduce the dimensionality of RadioMap _ul , and V′ and RadioMap ^* together constitute an online matching and positioning database for online stage positioning; , the specific line stage positioning is:

六（一）、在线测试RSS；Six (1), online test RSS;

六（二）、采用V将RSS降维为RSS^*；Six (2), use V to reduce the RSS dimension to RSS ^* ;

六（三）、采用KNN算法进行匹配定位输出定位结果；Six (3), using the KNN algorithm for matching and positioning to output positioning results;

六（四）、用户定位终端定位数据库更新；Six (four), user positioning terminal positioning database update;

即完成了一种基于类别匹配的半监督流形学习的WiFi室内定位方法的离线阶段实现方式。That is, the offline stage implementation of a WiFi indoor positioning method based on category matching semi-supervised manifold learning is completed.

基于类别匹配的半监督流形学习的WiFi室内定位方法在线阶段定位过程通过下述步骤实现：The WiFi indoor positioning method based on semi-supervised manifold learning of category matching is realized in the online stage positioning process through the following steps:

一、在线测试RSS；1. Test RSS online;

二、将得到的待定位点的RSS采用特征变换矩阵降维变换得到RSS*；2. The RSS of the point to be located is obtained by using the feature transformation matrix dimensionality reduction transformation to obtain RSS*;

三、采用KNN算法对RSS*与Radio Map*匹配位，对待定位点的具体位置坐标进行预测并进行在线数据的更新，其实现过程为：3. Use the KNN algorithm to match RSS* and Radio Map*, predict the specific location coordinates of the positioning point and update the online data. The implementation process is as follows:

（1）在线阶段，测试点处接收的RSS=[AP₁,AP₂,…,AP_n]，与特征变换矩阵V′相乘，从而得出降维后的RSS′=[AP₁,AP₂,…AP_d]，其中d表示本征维数；(1) In the online stage, the RSS=[AP ₁ ,AP ₂ ,…,AP _n ] received at the test point is multiplied by the feature transformation matrix V′, so as to obtain the reduced RSS′=[AP ₁ ,AP ₂ ,...AP _d ], where d represents the intrinsic dimension;

（2）采用KNN算法实现RSS′与Radio Map*的匹配，采用与RSS′最近的K个参考点的坐标的平均值作为测试点(x′,y′)，其表达式为：(2) Use the KNN algorithm to realize the matching between RSS′ and Radio Map*, and use the average value of the coordinates of the K reference points closest to RSS′ as the test point (x′, y′), whose expression is:

式中，(x′,y′)为测试点预测的坐标，(x_i,y_i)为第i个近邻点的坐标，K为KNN算法中近邻的数目；In the formula, (x′, y′) is the predicted coordinate of the test point, ( _xi , y _i ) is the coordinate of the i-th neighbor point, and K is the number of neighbors in the KNN algorithm;

四、用户定位终端定位数据库更新，即完成了基于类别匹配的半监督流形学习的WiFi在线阶段室内定位方法。4. The user positioning terminal positioning database is updated, that is, the indoor positioning method in the WiFi online stage is completed based on category matching semi-supervised manifold learning.

发明效果：Invention effect:

针对本发明提出的CM-SDE算法及背景需要求，通过对哈尔滨工业大学科学园2A栋12层走廊构成的室内定位区域进行定位。采用联想V450笔记本电脑并结合NetStumbler软件测试所有参考点处的RSS，构成Radio Map，并采用CM-SDE算法对Radio Map降维并采用KNN算法实现室内定位。在附图4的仿真中，降维后的Radio Map的维数为原始Radio Map维数的三分之一。从附图4的仿真结果来看，采用CM-SDE算法与初始的KNN的算法的定位性能可比拟，但CM-SDE的定位复杂度仅为三分之一，且CM-SDE算法可以有效地应用实时得到的新的RSS来Radio Map的密度，从而有效的提高定位精度。In view of the CM-SDE algorithm and background requirements proposed by the present invention, the positioning is carried out through the indoor positioning area formed by the corridor on the 12th floor of Building 2A, Science Park of Harbin Institute of Technology. Using Lenovo V450 laptop and NetStumbler software to test the RSS at all reference points to form a Radio Map, and use the CM-SDE algorithm to reduce the dimension of the Radio Map and use the KNN algorithm to achieve indoor positioning. In the simulation of FIG. 4 , the dimensionality of the reduced Radio Map is one third of that of the original Radio Map. From the simulation results in Figure 4, the positioning performance of the CM-SDE algorithm is comparable to that of the initial KNN algorithm, but the positioning complexity of CM-SDE is only one-third, and the CM-SDE algorithm can effectively Apply the new RSS obtained in real time to the density of the Radio Map, thereby effectively improving the positioning accuracy.

附图说明Description of drawings

图1是本发明中离线数据库定位实施流程图；实线箭头表示步骤之间的数据传输；Fig. 1 is the implementation flowchart of off-line database positioning in the present invention; The solid line arrow represents the data transmission between the steps;

图2是本发明中在线数据库定位实施流程图；实线箭头表示步骤之间的数据传输；Fig. 2 is the implementation flowchart of online database location in the present invention; The solid line arrow represents the data transmission between the steps;

图3是基于WiFi的室内定位网络的构建及实验环境示意图；Figure 3 is a schematic diagram of the construction and experimental environment of a WiFi-based indoor positioning network;

图4是采样网格图；Fig. 4 is a sampling grid diagram;

图5是采用CM-SDE与KNN算法定位性能对比图；其中，表示CM-SDE，表示KNN。Figure 5 is a comparison chart of positioning performance using CM-SDE and KNN algorithms; among them, Indicates CM-SDE, stands for KNN.

具体实施方式detailed description

具体实施方式一：本实施方式的基于类别匹配的半监督流形学习的WiFi室内定位方法离线阶段定位过程按以下步骤实现：Specific implementation mode one: The offline stage positioning process of the WiFi indoor positioning method based on category matching semi-supervised manifold learning in this embodiment is implemented in the following steps:

六（一）、在线测试RSS；Six (1), online test RSS;

所述离线阶段与在线阶段均在定位终端完成。Both the offline phase and the online phase are completed at the positioning terminal.

具体实施方式二：本实施方式与具体实施方式一不同的是：步骤二中采用GMST本征维数估计算法对步骤一中构建的Radio Map的本征维数进行分析，其计算公式为：Specific embodiment two: the difference between this embodiment and specific embodiment one is: adopt GMST intrinsic dimension estimation algorithm to analyze the intrinsic dimension of the Radio Map constructed in step one in step two, its calculation formula is:

测地距最小生成树算法 Geodesic minimum spanning tree algorithm

上式中中的a表示最小生成树的线性拟合表达式y=ax+b的斜率。In the above formula A in represents the slope of the linear fitting expression y=ax+b of the minimum spanning tree.

其它步骤及参数与具体实施方式一相同。Other steps and parameters are the same as those in Embodiment 1.

具体实施方式三：本实施方式与具体实施方式一或二不同的是：步骤四中生成新的降维后的信号覆盖图RadioMap^*与步骤六中的V′表达式为：Specific embodiment three: the difference between this embodiment and specific embodiment one or two is: in the step 4, generate a new dimensionality-reduced signal coverage map RadioMap ^* and the V' expression in the step 6 is:

Radio Map*=V′·XRadio Map*=V′·X

X是需要降维的Radio Map。X is a Radio Map that needs dimensionality reduction.

其它步骤及参数与具体实施方式一或二相同。Other steps and parameters are the same as those in Embodiment 1 or Embodiment 2.

具体实施方式四：本实施方式与具体实施方式一至三之一不同的是：步骤五中类别匹配的实现，其核心流程分两步完成：Specific implementation mode four: this implementation mode is different from one of specific implementation modes one to three in that: the realization of category matching in step five, its core process is completed in two steps:

第一步，寻找未标记RSS的类别属性，由下式完成类别属性标记：The first step is to find the category attribute of unmarked RSS, and complete the category attribute labeling by the following formula:

第二步：对RSS进行门限检测：通过计算并判定广义符号值与门限值的关系，从而实现Radio Map及类别标记数据的更新，广义符号值及聚类中心的更新分别下两式完成：Step 2: Threshold detection of RSS: By calculating and determining the relationship between the generalized symbol value and the threshold value, the update of the Radio Map and category tag data is realized, and the update of the generalized symbol value and cluster center is completed by the following two formulas:

其中，所述门限值V_T=0.9N。Wherein, the threshold value V _T =0.9N.

其它步骤及参数与具体实施方式一至三之一相同。Other steps and parameters are the same as those in Embodiments 1 to 3.

具体实施方式五：基于类别匹配的半监督流形学习的WiFi在线阶段室内定位方法通过下述步骤实现：Specific embodiment five: the WiFi online stage indoor positioning method based on category matching semi-supervised manifold learning is realized through the following steps:

一、在线测试RSS；1. Test RSS online;

四、用户定位终端定位数据库更新，即完成了基于类别匹配的半监督流形学习的WiFi在线阶段室内定位方法；4. The user positioning terminal positioning database is updated, that is, the indoor positioning method in the WiFi online stage is completed based on category matching semi-supervised manifold learning;

所述离线阶段在服务器上完成；在线阶段在定位终端上完成。The offline stage is completed on the server; the online stage is completed on the positioning terminal.

具体实施方式六：本实施方式与具体实施方式五不同的是：步骤三中对待定位点的具体位置坐标进行预测并进行在线数据的更新由权利要求1中所述的离线数据库和在线数据库实现：Specific embodiment six: the difference between this embodiment and specific embodiment five is: in step 3, the specific position coordinates of the positioning point to be predicted and the updating of online data are realized by the offline database and the online database described in claim 1:

离线定位数据库的实现方式为：将定位用户本次定位测得的RSS作为未标记数据采用类别匹配方式加入到Radio Map中，并在移动终端上实现对本地在线匹配定位数据库的更新，实现动态的更新本地的数据，从而实现离线数据库定位方式；The implementation of the offline positioning database is as follows: the RSS measured by the positioning user in this positioning is added to the Radio Map as unmarked data by category matching, and the local online matching positioning database is updated on the mobile terminal to realize dynamic Update local data to realize offline database positioning;

在线数据库的实现方式为：用户在线定位完成后，将用户本次在线测得的RSS值上传至在线定位数据库所在的服务器，并在服务器端将在线定位数据库进行更新将将在线定位数据传回上传RSS数据的定位终端。The implementation of the online database is as follows: After the user’s online positioning is completed, the RSS value measured online by the user is uploaded to the server where the online positioning database is located, and the online positioning database is updated on the server side, and the online positioning data is sent back to upload Positioning terminal for RSS data.

其它步骤及参数与具体实施方式一至五之一相同。Other steps and parameters are the same as one of the specific embodiments 1 to 5.

仿真实验：Simulation:

一、结合附图3对本仿真实验做出详细说明：图示为哈尔滨工业大学科学园2A栋12层的平面图示意，基于WiFi的室内定位系统就是基于该实验环境下建立。在实验环境中，总共布置27个AP，AP布置的位置为蓝色无线发射信号形状标记所在处。AP离房间地面高度为2米。在离线阶段，在联想V450笔记本上安装NetStumbler软件，在所有参考点的四个不同的方位上连续采样记录AP的100个RSS值，以及AP的相关信息。将所有的采样点的物理坐标及相应的物理坐标及RSS值存储为定位过程所调用的数据，建立Radio Map。在实验环境共有900个参考点，其采样密度为0.5米×0.5米，如附图4所示。Radio Map作为CM-SDE算法的输入参数及本征维数估计算法的输入数据。1. A detailed description of this simulation experiment is given in conjunction with Figure 3: the diagram shows the floor plan of the 12th floor of Building 2A, Science Park, Harbin Institute of Technology, and the WiFi-based indoor positioning system is established based on this experimental environment. In the experimental environment, a total of 27 APs are arranged, and the location of the AP arrangement is where the blue wireless transmission signal shape mark is located. The height of the AP from the room floor is 2 meters. In the offline stage, NetStumbler software was installed on Lenovo V450 notebook, and 100 RSS values of APs and related information of APs were continuously sampled and recorded in four different orientations of all reference points. Store the physical coordinates of all sampling points and the corresponding physical coordinates and RSS values as the data called by the positioning process to establish a Radio Map. There are 900 reference points in the experimental environment, and the sampling density is 0.5m×0.5m, as shown in Figure 4. The Radio Map is used as the input parameter of the CM-SDE algorithm and the input data of the intrinsic dimension estimation algorithm.

二、Radio Map的本征维数的获取通过下述步骤实现：2. The acquisition of the intrinsic dimension of the Radio Map is achieved through the following steps:

本征维数是对于高维数据进行本征空间维数及空间重建所需最小的独立变量的个数。在具体实际计算中，由于高维数据的本征并不明显，通常不是寻求得到确切的本征维数，而是寻求估计本征维数的可信取值。具体的说，给定一个来自高维空间的样本，本征维数估计算法的中心任务和重要内容就是通过这些样本数据来确定这个高维结构的本征维数。The intrinsic dimension is the minimum number of independent variables required for the intrinsic spatial dimension and spatial reconstruction of high-dimensional data. In actual calculations, since the eigenvalues of high-dimensional data are not obvious, it is usually not to seek the exact eigendimension, but to seek a credible value for estimating the eigendimension. Specifically, given a sample from a high-dimensional space, the central task and important content of the eigendimension estimation algorithm is to determine the eigendimension of this high-dimensional structure through these sample data.

Radio Map的本征维数的估计是CM-SDE算法的重要输入参数，这关系到降维的结果是否能够代表Radio Map的高维空间的特征，因此准确有效的本征维数的估计至关重要。目前，常用本征维数估计算法分为两类：局部估计与全局估计。采用全局算法估计对RadioMap的本征维数进行估计，并作为CM-SDE算法的输入变量。本实验中采用测地线最小生成树算法（Geodesic Minimum Spanning Tree,GMST）对Radio Map的本征维数进行估计。The estimation of the eigendimension of the Radio Map is an important input parameter of the CM-SDE algorithm, which is related to whether the result of dimensionality reduction can represent the characteristics of the high-dimensional space of the Radio Map, so accurate and effective estimation of the eigendimension is crucial important. At present, the commonly used eigendimension estimation algorithms are divided into two categories: local estimation and global estimation. The eigendimension of RadioMap is estimated by global algorithm estimation, and it is used as the input variable of CM-SDE algorithm. In this experiment, the geodesic minimum spanning tree algorithm (Geodesic Minimum Spanning Tree, GMST) is used to estimate the intrinsic dimension of the Radio Map.

下面对GMST算法的理论进行分析。The theory of GMST algorithm is analyzed below.

测地线最小生成树(GMST)估计是基于测地线最小生成树的长度函数依赖于本征维数d。GMST是指定义在数据集X上的近邻曲线的最小生成树。GMST的长度函数L(X)是在测地线最小生成树中所有边缘对应的欧氏距离之和。Geodesic minimum spanning tree (GMST) estimation is based on the length function of the geodesic minimum spanning tree depending on the intrinsic dimension d. GMST refers to the minimum spanning tree of neighbor curves defined on the data set X. The length function L(X) of the GMST is the sum of the Euclidean distances corresponding to all edges in the geodesic minimum spanning tree.

GMST估计在数据集X上构造一条近邻曲线G，其中，在X内每一个数据点x_i都和它的k个近邻相连接。测地线最小生成树T定义为X上的最小曲线，它具有长度GMST estimation constructs a neighbor curve G on the data set X, where each data point _xi in X is related to its k neighbors connected. A geodesic minimum spanning tree T is defined as the smallest curve on X that has length

其中，是曲线G的所有子树集合，e是树T的一个边缘，g_e是边缘e对应的欧氏距离，其计算公式为式(2)所示。in, is the set of all subtrees of the curve G, e is an edge of the tree T, g _e is the Euclidean distance corresponding to the edge e, and its calculation formula is shown in formula (2).

g_e=||x_i-x_j||,x_i,x_j∈e (2)g _e =||x _i -x _j ||,x _i ,x _j ∈ e (2)

在GMST估计中，一些子集由各种大小m组成，并且子集A的GMST的长度L(A)也需要计算。理论上，是线性的，从而可以由y=ax+b这种形式的函数来估计，通过最小二乘法可以估算出变量a和b。可以证明，由a的估算值和能够得到本征维数的估计。由GMST算法给出本征维数d的表达式为式(3)所示。本征维数d是CM-SDE算法的另一个重要的输入参数。In GMST estimation, some subset consists of various sizes m, and the length L(A) of the GMST of the subset A also needs to be calculated. In theory, It is linear, so it can be estimated by a function of the form y=ax+b, and the variables a and b can be estimated by the least square method. It can be shown that from the estimated value of a and An estimate of the eigendimension can be obtained. The expression of the intrinsic dimension d given by the GMST algorithm is shown in formula (3). The intrinsic dimension d is another important input parameter of the CM-SDE algorithm.

三、CM-SDE是一种半监督流形学习算法，在CM-SDE算法实现的过程中需要对所有的参考点进行类标记。考虑到目前的WiFi室内定位环境中，参考点数目将近1000点，并没有人为地对所有的参考点进行类标记，而是采用一定的分类算法对参考点的类别进行标记。3. CM-SDE is a semi-supervised manifold learning algorithm. During the implementation of CM-SDE algorithm, it is necessary to classify all reference points. Considering that in the current WiFi indoor positioning environment, the number of reference points is nearly 1000 points, and all the reference points are not artificially labeled, but a certain classification algorithm is used to mark the category of the reference points.

聚类的目标是将数据集X={x₁,x₂,…,x_n}划分为c类且各类数据之间互不相关。基本的聚类算法按如下步骤实现：The goal of clustering is to divide the data set X={x ₁ ,x ₂ ,…,x _n } into c categories and the data of each category are not related to each other. The basic clustering algorithm is implemented as follows:

（1）生成c个聚类中心，记为v_i，i=1,2,…,c。(1) Generate c cluster centers, denoted as v _i , i=1,2,...,c.

（2）将数据集X={x₁,x₂,…,x_n}的每个元素归类，采用最近邻（Nearest Neighbor）算法判定元素的归属关系，其等价表达式为：(2) Classify each element of the data set X={x ₁ ,x ₂ ,…,x _n }, and use the nearest neighbor (Nearest Neighbor) algorithm to determine the belonging relationship of the elements. The equivalent expression is:

式(4)中，x_i为第i个数据点，G_i为第i类构成的邻接关系图。D(x_i,v_j)表示计算x_i与v_j之间的欧式距离。In formula (4), x _i is the i-th data point, and G _i is the adjacency graph formed by the i-th class. D( _xi ,v _j ) means to calculate the Euclidean distance between x _i and v _j .

（3）聚类中心的更新，对于第i类的聚类中心更新为：(3) The update of the cluster center, for the cluster center of the i-th class, the update is:

式(5)中，|·|表示计算某类中元素的个数。In formula (5), |·| means to calculate the number of elements in a certain class.

（4）收敛性校验及迭代(4) Convergence check and iteration

若满足以下四种情况下的收敛性条件之一，则迭代停止，否则重复执行（2）～（3），直到迭代收敛或者达到最大执行次数。四种收敛性判定条件为：If one of the convergence conditions in the following four cases is satisfied, the iteration stops, otherwise (2)-(3) are repeated until the iteration converges or the maximum number of executions is reached. The four convergence criteria are:

条件一：聚类中心不变；Condition 1: The cluster center remains unchanged;

条件二：每个聚类的元素不变；Condition 2: The elements of each cluster are unchanged;

条件三：聚类中心变化收敛于半径ε内；Condition 3: The change of the cluster center converges within the radius ε;

条件四：聚类元素变化收敛于半径ε内。Condition 4: The change of clustering elements converges within the radius ε.

总的来说，上述收敛性判定条件可以表述为下式：In general, the above convergence judgment conditions can be expressed as the following formula:

max||v_i-v_i′||≤ε,ε≥0 (6)max||v _i -v _i ′||≤ε,ε≥0 (6)

其中，v_i′表示更新后的第i类的聚类中心。Among them, v _i ′ represents the updated cluster center of the i-th class.

从上述聚类算法的基本实现方式分析，用于判定元素的归属关系的算法构成了聚类的核心。不同类型的聚类算法提出不同的归类指标，一般将该函数称为损耗函数（LossFunction）。基本的聚类方法中采用欧式距离作为损耗函数。本专利采用KFCM算法对RadioMap进行类别分析得出聚类中心和Radio Map的类别标记。KFCM算法的理论分析如下：From the analysis of the basic implementation of the clustering algorithm above, the algorithm used to determine the belonging relationship of elements constitutes the core of clustering. Different types of clustering algorithms propose different classification indicators, and this function is generally called a loss function (LossFunction). The basic clustering method uses Euclidean distance as the loss function. This patent uses the KFCM algorithm to analyze the category of the RadioMap to obtain the cluster center and the category label of the Radio Map. The theoretical analysis of the KFCM algorithm is as follows:

引入核函数的模糊c均值聚类的目标是将原始的数据集所在空间变换至无穷维的希尔伯特空间（Hilbert Space），再对变换后的空间作相应的聚类分析。通过核函数的变换，将原始数据之间的类别特征进一步变换后更易于表述和区分。基于核函数的模糊c均值聚类算法的目标函数为：The goal of fuzzy c-means clustering with kernel function is to transform the space where the original data set is located into infinite-dimensional Hilbert Space (Hilbert Space), and then perform corresponding cluster analysis on the transformed space. Through the transformation of the kernel function, it is easier to express and distinguish the category features between the original data after further transformation. The objective function of the fuzzy c-means clustering algorithm based on kernel function is:

式(7)中，Φ(x_k)、W_i分别表示在希尔伯特空间下的数据集及相应的聚类中心。通过推导可以得出KFCM算法的解表述为：In formula (7), Φ(x _k ) and W _i represent the data set and the corresponding cluster center in the Hilbert space respectively. Through derivation, it can be concluded that the solution of KFCM algorithm is expressed as:

KFCM的解的关键在于计算希尔伯特空间的损耗函数或者相似度函数。在本文中考虑引入高斯核函数（Gaussian Kernel Function）的FCM（Fuzzy C-Means）算法的理论分析及其实现。高斯核函数如式(9)所示。The key to the solution of KFCM is to calculate the loss function or similarity function of the Hilbert space. In this paper, the theoretical analysis and implementation of the FCM (Fuzzy C-Means) algorithm that introduces the Gaussian Kernel Function (Gaussian Kernel Function) are considered. The Gaussian kernel function is shown in formula (9).

希尔伯特空间中，由式表述其相应的损耗函数，该式进一步表述为式(10)。In Hilbert space, by Expressing its corresponding loss function, this formula is further expressed as formula (10).

式(10)中，<·,·>表示计算相应式的核函数值。而实际上，无穷空间的变换不存在，因此，对式(10)进一步简化为式(11)所示。In formula (10), <·,·> means to calculate the kernel function value of the corresponding formula. In fact, the transformation of infinite space does not exist, therefore, formula (10) is further simplified as shown in formula (11).

式(11)的全展开式为式(12)所示。The fully expanded formula of formula (11) is shown in formula (12).

式(12)中，<Φ(x_k),Φ(x_j)>由高斯核函数计算，即：In formula (12), <Φ(x _k ),Φ(x _j )> is calculated by Gaussian kernel function, namely:

在算法实现中，不是随机生成聚类中心，而是从数据集X={x₁,x₂,…,x_n}中随机选择c个元素作为聚类中心，构成集合Y={y₁,…,y_c}。因此，初始化的损耗函数值计算如式所示。In the implementation of the algorithm, instead of randomly generating cluster centers, c elements are randomly selected from the data set X={x ₁ ,x ₂ ,…,x _n } as cluster centers to form a set Y={y ₁ , ..., y _c }. Therefore, the initialized loss function value is calculated as shown in Eq.

四、运用CM-SDE算法实现对Radio Map进行降维并获取特征权值矩阵过程通过下述步骤实现：4. Use the CM-SDE algorithm to reduce the dimension of the Radio Map and obtain the feature weight matrix through the following steps:

CM-SDE算法是基于标记数据与未标记数据的类间散度及类内散度最大化的一种流形学习算法。在对CM-SDE算法进行理论分析之前对CM-SDE算法给定的输入数据做如下说明：输入高维数据点数据点x_i的类标记为y_i∈{1,2,…,P}，其中P表示将高维数据划分为P个子流形，即将输入的高维数据分成P类，记P类的聚类中心为V={v₁,v₂,…,v_P}。将输入的高维数据表示成矩阵的形式：X=[x₁,x₂,…,x_m]∈R^n×m。从矩阵表示的形式来看，矩阵中的列代表一个高维数据点。The CM-SDE algorithm is a manifold learning algorithm based on maximizing the inter-class divergence and intra-class divergence between labeled data and unlabeled data. Before the theoretical analysis of the CM-SDE algorithm, the input data given by the CM-SDE algorithm is explained as follows: Input high-dimensional data points The class label of the data point _xi is y _i ∈ {1,2,…,P}, where P means to divide the high-dimensional data into P submanifolds, that is, to divide the input high-dimensional data into P classes, and record the clustering of P classes The class center is V={v ₁ ,v ₂ ,…,v _P }. Express the input high-dimensional data in the form of matrix: X=[x ₁ ,x ₂ ,…,x _m ]∈R ^n×m . From the form of matrix representation, the columns in the matrix represent a high-dimensional data point.

对于包含未标记数据的RadioMap_ul，其中所有的未标记数据X_u=[x_u1,x_u2,…,x_uk]∈R^n×k进行类别匹配，同时已标记数据记为X_l=[x_l1,x_l2,…,x_lc]∈R^n×c。对于X_u中的所有数据进行有序类别匹配。有序的含义是当分配某一个未标记数据后，会影响相应类的聚类中心，因此会对下一未标记数据的类判别会有影响。本专利中主要的考虑信号的采集的时间顺序。假定X_u是按时间顺序排列。采用式(15)计算类的归属x_u1，并采用式(5)更新相应的聚类中心。然后依次将所有的未标记数据进行类别匹配For a RadioMap _ul containing unlabeled data, all unlabeled data X _u =[x _u1 ,x _u2 ,…,x _uk ]∈R ^n×k are used for category matching, and the labeled data are recorded as X _l =[x _l1 ,x _l2 ,…,x _lc ]∈R ^n×c . Ordinal category matching is performed for all data in X _u . The meaning of order is that when a certain unlabeled data is assigned, it will affect the cluster center of the corresponding class, so it will affect the class discrimination of the next unlabeled data. The time sequence of signal acquisition is mainly considered in this patent. Assume X _u is in chronological order. Use formula (15) to calculate the class belonging x _u1 , and use formula (5) to update the corresponding cluster center. Then classify all unlabeled data in turn

CM-SDE算法的目标函数为：The objective function of the CM-SDE algorithm is:

式(16)中S_w、S_b、S_t分别表示类内散度、类间散度及总散度可以由式(17)计算：In formula (16), S _w , S _b , and S _t represent intra-class divergence, inter-class divergence and total divergence respectively, which can be calculated by formula (17):

式(17)中，为第i类的均值，l_i为第i类的采样点的数目；为全体采样点的均值，N为采样点的数目。In formula (17), is the mean value of the i-th class, l _i is the number of sampling points of the i-th class; is the mean value of all sampling points, and N is the number of sampling points.

对于式(16)所示的目标函数同样可以表示局部鉴别嵌入流形学习算法的目标函数形式，其表达式为：For the objective function shown in formula (16), it can also represent the objective function form of the local discriminative embedding manifold learning algorithm, and its expression is:

式(18)中，w_ij表示同类数据间的权重分配，w_ij′表示不同类数据间的权重分配，分别表示为W_N×N和W_N×N′。权重计算过程由两步完成。第一步：构造邻域图。根据高维数据点的类标记信息及其近邻关系构造无方向图G及G′。其中近邻关系是采用KNN算法给出的准则，即选择数据点最近的K个点作为其邻居，G表示当x_i与x_j的类标记信息y_i=y_j时且x_i、x_j互为K近邻关系；G′示当x_i与x_j的类标记信息y_i≠y_j时且x_i、x_j互为K近邻关系。第二步：计算权值矩阵。根据第一步构造的邻接图采用类高斯函数进行权值矩阵的计算。其表达式(19)、(20)为所示。公式中w_ij表示近邻点x_i与x_j之间的权值，||x_i-x_j||²为近邻点.与x_j之间的距离，采用矩阵方式计算距离，t为权值归一化参数，U、L分别表示未标记和已标记的采样点的数目。根据分析可以知道，W_N×N和W_N×N′可以由三部分构成，分别是：已标记数据与已标记数据之间的权重、已标记数据与未标记数据之间的权重及未标记数据与未标记数据之间的权重，分别表示为： In formula (18), w _ij represents the weight distribution among data of the same type, and w _ij ′ represents the weight distribution among data of different types, expressed as W _N×N and W _N×N ′ respectively. The weight calculation process is completed in two steps. Step 1: Construct a neighborhood graph. Construct undirected graphs G and G′ according to the class label information of high-dimensional data points and their neighbor relations. Among them, the neighbor relationship is the criterion given by the KNN algorithm, that is, select the nearest K points of the data point as its neighbors, and G means that when the class label information y _i of x _i and x _j = y _j and x _i and x _j mutually is the K-nearest neighbor relationship; G′ indicates that when the class label information y _i _≠ y _j of x _i and x _j _is the K-nearest neighbor relationship. Step 2: Calculate the weight matrix. According to the adjacency graph constructed in the first step, a Gaussian function is used to calculate the weight matrix. Its expressions (19), (20) are as shown. In the formula, w _ij represents the weight between the neighbor point x _i and x _j , || _xi -x _j || ² is the distance between the neighbor point and x _j , the distance is calculated by matrix, and t is the weight Normalization parameters, U, L represent the number of unmarked and marked sampling points, respectively. According to the analysis, it can be known that W _N×N and W _N×N ′ can be composed of three parts, namely: the weight between marked data and marked data, the weight between marked data and unmarked data, and the weight between unmarked data and unmarked data. The weights between data and unlabeled data are expressed as:

由上述计算公式及矩阵的性质可以得：由此可以推导出W_N×N和W_N×N′表示为分块矩阵的形式，如式(21)所示。From the above calculation formula and the properties of the matrix, we can get: From this, it can be deduced that W _N×N and W _N×N ′ are expressed in the form of block matrix, as shown in formula (21).

根据矩阵的计算式：计算式表示为矩阵A的矩阵的计算方法，计算式给出的方法与矩阵的迹的计算式一致，即：||A||²=tr(AA^T)。由此式(18)可以表示为矩阵的迹的计算方式：According to the calculation formula of the matrix: The calculation formula is expressed as the calculation method of the matrix of the matrix A, and the method given by the calculation formula is consistent with the calculation formula of the trace of the matrix, namely: ||A|| ² =tr(AA ^T ). From this formula (18) can be expressed as the calculation method of the trace of the matrix:

式(22)可以简化为：Formula (22) can be simplified as:

由矩阵迹的计算的标量性质及权值元素均为实数，可以将式(23)简化为：The scalar properties and weight elements of the calculation of the matrix trace are all real numbers, and the formula (23) can be simplified as:

根据简单的数学关系，可以将式(24)简化为：According to the simple mathematical relationship, formula (24) can be simplified as:

J(V)=2tr{V^T[X(D′-W′_N×N)X^T]V} (25)J(V)=2tr{V ^T [X(D′-W′ _N×N )X ^T ]V} (25)

式(25)中：X为输入数据，λ和v为特征值与特征向量，W和W′分别为G及G′对应的权值矩阵，D及D′均为对角阵，其对角元素可以由式(26)表示。In formula (25): X is the input data, λ and v are the eigenvalues and eigenvectors, W and W′ are the weight matrices corresponding to G and G′ respectively, D and D′ are diagonal matrices, and their diagonal Elements can be represented by formula (26).

根据式(25)的推导方式，同理可以将式(18)中的约束条件写成如式(25)相似的形式，由此，可以将(18)表示为如下形式：According to the derivation method of formula (25), similarly, the constraints in formula (18) can be written in a form similar to formula (25). Therefore, (18) can be expressed as the following form:

对式(27)应用拉格朗日（Lagrange）乘数法，可以得出式(28)所示：Applying the Lagrange multiplier method to formula (27), it can be obtained as shown in formula (28):

X(D′-W′_N×N)X^Tv=λX(D-W_N×N)X^Tv (28)X(D′-W′ _N×N )X ^T v=λX(DW _N×N )X ^T v (28)

对式(28)进行广义特征值分解，得出其特征值分解的特征值及特征向量，表示为：λ=[λ₁,λ₂,…,λ_n]^T，其对应的特征向量为：v=[v₁,v₂,…,v_n]^T。取前d个最大的特征值对应的特征向量构成变换矩阵V=[v₁,v₂,…,v_d]。由CM-SDE算法的输出数据变换方法可以得出，降维后数据为：Carry out generalized eigenvalue decomposition on formula (28), and obtain the eigenvalue and eigenvector of its eigenvalue decomposition, expressed as: λ=[λ ₁ ,λ ₂ ,…,λ _n ] ^T , and the corresponding eigenvector is: v=[v ₁ ,v ₂ ,…,v _n ] ^T . Take the eigenvectors corresponding to the first d largest eigenvalues to form a transformation matrix V=[v ₁ ,v ₂ ,…,v _d ]. From the output data transformation method of the CM-SDE algorithm, it can be concluded that the data after dimensionality reduction is:

z_i=V^Tx_i (29)z _i = V ^T x _i (29)

式(29)中，z_i表示输入高维数据点x_i变换后的低维输出数据。从本专利中发明内容给出的离线阶段的实施步骤为：第一步先采用CM-SDE算法对所有参考点Radio Map进行降维处理，得到相应的参考点的降维后的Radio Map，即作为在线阶段的匹配定位数据库（RadioMap^*）。第二步对添加未标记数据的Radio Map，即RadioMap_ul进行降维处理，得到特征变换矩阵V′。由此可以建立离线阶段所需要数据库：RadioMap^*和V′。In Equation (29), _zi represents the low-dimensional output data transformed from the input high-dimensional data point _xi . The implementation steps of the off-line stage given by the content of the invention in this patent are: the first step is to use the CM-SDE algorithm to perform dimensionality reduction processing on all reference point Radio Maps, and obtain the corresponding dimensionality reduction Radio Maps of the reference points, namely As a matching localization database (RadioMap ^* ) for the online phase. The second step is to reduce the dimensionality of the Radio Map with unmarked data, that is, RadioMap _ul , to obtain the feature transformation matrix V′. From this the required databases for the offline phase can be created: RadioMap ^* and V'.

五、由不同的用户在线阶段定位阶段获得的RSS是未标记类别属性的，其加入Radio Map，并构成Radio Map_ul的过程称为类别。其实现方法如下所述：5. The RSS obtained by different users in the online stage positioning stage is not marked with category attributes, and the process of adding it to Radio Map and forming Radio Map _ul is called category. Its implementation method is as follows:

将不同用户在线阶段定位阶段测试得到的未标记RSS，采用类别匹配的方式加入至已有Radio Map中，得到相应的包含未标记信号覆盖图RadioMap_ul；通过类别匹配方法增加Radio Map的数据量，进而提高Radio Map的密度，为CM-SDE算法提供新的降维数据，同时可以更新聚类中心，为CM-SDE算法提供新的类别数据。类别匹配方法分为两步，其实现过程如下所述：Add the unmarked RSS obtained from the positioning stage test of different users in the online stage to the existing Radio Map by category matching, and obtain the corresponding RadioMap _ul that contains unmarked signal coverage; increase the data volume of the Radio Map through the category matching method, Furthermore, the density of the Radio Map can be increased to provide new dimensionality reduction data for the CM-SDE algorithm. At the same time, the cluster center can be updated to provide new category data for the CM-SDE algorithm. The category matching method is divided into two steps, and its implementation process is as follows:

第一步，寻找未标记RSS的类别属性。记一组未标记RSS为RSS_i，与步骤三中的聚类中心进行匹配，由式(4)完成RSS_i的类别标记。In the first step, look for category attributes of untagged RSS. Record a group of unlabeled RSS as RSS _i , match it with the cluster center in step 3, and complete the category labeling of RSS _i by formula (4).

第二步：对RSS进行门限检测。对于聚类中心vi表示为v_i=(v_i1,v_i2,…,v_iN)，N为室内定位系统中AP的个数。RSS_i表示为RSS_i=(RSS_i1,RSS_i2,…,RSS_iN)。计算下式所定义的广义符号值：Step 2: Perform threshold detection on the RSS. The cluster center vi is expressed as v _i =(v _i1 ,v _i2 ,...,v _iN ), and N is the number of APs in the indoor positioning system. RSS _i is expressed as RSS _i =(RSS _i1 ,RSS _i2 ,...,RSS _iN ). Computes the generalized symbolic value defined by:

其中，sgn(·)定义为：Among them, sgn( ) is defined as:

当S_i大于设定门限值时，则将RSS_i加入Radio Map中，并更新聚类中心，否则舍弃RSS_i，不加入Radio Map中。在本专利中，门限值V_T=0.9N。聚类中心的更新由式(5)完成：When S _i is greater than the set threshold value, add RSS _i to the Radio Map and update the cluster center, otherwise discard RSS _i and not add it to the Radio Map. In this patent, the threshold V _T =0.9N. The update of the cluster center is completed by formula (5):

六、基于CM-SDE算法的WiFi室内定位方法的离线数据库实现方式：6. Offline database implementation of WiFi indoor positioning method based on CM-SDE algorithm:

离线数据库方式由三部分构成。第一，所有参考点的Radio Map的建立，并采用CM-SDE算法得到RadioMap^*。第二，再随机采样U点未标记数据并添加入原有的Radio Map中，并用CM-SDE算法得到V′，并将形成的定位数据库下载（存储）到定位的移动终端。第三，在线定位实现及Radio Map更新。第三部分的具体实现如下所述：The offline database mode consists of three parts. First, establish the Radio Map of all reference points, and use the CM-SDE algorithm to obtain RadioMap ^* . Second, randomly sample the unmarked data of point U and add it to the original Radio Map, and use the CM-SDE algorithm to obtain V′, and download (store) the formed positioning database to the positioning mobile terminal. Third, realize online positioning and update Radio Map. The specific implementation of the third part is as follows:

在线阶段，测试点处接收的RSS=[AP₁,AP₂,…,AP_n]，n表示室内定位系统布置的AP的数目。将RSS与特征变换矩阵V′相乘，从而得出降维后的RSS′=[AP₁,AP₂,…,AP_d]，其中d表示本征维数。再采用KNN算法实现RSS′与RadioMap^*的匹配。采用与RSS′最近的K个参考点的坐标的平均值作为测试点(x′,y′)，其表达式为：In the online phase, the RSS received at the test point=[AP ₁ ,AP ₂ ,...,AP _n ], where n represents the number of APs deployed by the indoor positioning system. Multiply the RSS with the feature transformation matrix V′ to obtain the dimensionally reduced RSS′=[AP ₁ ,AP ₂ ,…,AP _d ], where d represents the intrinsic dimension. Then use the KNN algorithm to realize the matching between RSS′ and RadioMap ^* . The average value of the coordinates of the K reference points closest to RSS' is used as the test point (x', y'), and its expression is:

将定位用户本次定位测得的RSS作为未标记数据采用类别匹配方式加入到RadioMap中，并在移动终端上实现对本地在线匹配定位数据库的更新，实现动态的更新本地的数据，从而实现离线数据库定位方式，附图1所示于类别匹配的半监督流形学习的WiFi室内定位方法在用户定位终端实现；The RSS measured by the positioning user in this positioning is added to the RadioMap as unmarked data by category matching, and the local online matching positioning database is updated on the mobile terminal, and the local data is dynamically updated, thereby realizing the offline database The positioning method, shown in Figure 1, is implemented in the user positioning terminal in the WiFi indoor positioning method of the semi-supervised manifold learning of category matching;

七、基于CM-SDE算法的WiFi室内定位方法的在线数据库实现方式：7. Online database implementation of WiFi indoor positioning method based on CM-SDE algorithm:

在线数据库方式由四部分构成。第一，所有参考点的Radio Map的建立，并采用CM-SDE算法得到RadioMap^*。第二，再随机采样U点未标记数据并添加入原有的Radio Map中，并用CM-SDE算法得到V′，并将形成的定位数据库下载（存储）到定位的移动终端。第三，在线定位实现及Radio Map更新。第三部分的具体实现如下所述：The online database method consists of four parts. First, establish the Radio Map of all reference points, and use the CM-SDE algorithm to obtain RadioMap ^* . Second, randomly sample the unmarked data of point U and add it to the original Radio Map, and use the CM-SDE algorithm to obtain V′, and download (store) the formed positioning database to the positioning mobile terminal. Third, realize online positioning and update Radio Map. The specific implementation of the third part is as follows:

第四部分：用户在线定位完成后，将用户本次在线测得的RSS值上传至在线定位数据库所在的服务器，并在服务器端将在线定位数据库进行更新将将在线定位数据传回上传RSS数据的定位终端，即附图2所示的离线阶段在在线定位数据库所在服务器上完成，而在线阶段在定位终端完成。Part 4: After the user's online positioning is completed, upload the RSS value measured online by the user to the server where the online positioning database is located, and update the online positioning database on the server side, and send the online positioning data back to the server that uploaded the RSS data The positioning terminal, that is, the offline phase shown in Figure 2 is completed on the server where the online positioning database is located, while the online phase is completed on the positioning terminal.

Claims

1. The WiFi indoor positioning method based on semi-supervised manifold learning of class matching is characterized in that the positioning process of the WiFi indoor positioning method based on semi-supervised manifold learning of class matching in the off-line stage is realized according to the following steps:

firstly, arranging an access point AP in an indoor area to be positioned, enabling a wireless signal to cover the indoor area to be positioned, and completing the construction of a wireless fidelity WiFi network;

selecting and recording corresponding coordinates of a reference point in an indoor area to be positioned regularly, measuring and sequentially recording signal strength RSS signals of all APs received by the reference point as position characteristic information, constructing a network signal coverage Map Radio Map, and storing the Radio Map;

secondly, analyzing the intrinsic dimension of the Radio Map constructed in the first step by adopting a geodesic minimum spanning tree GMST intrinsic dimension estimation algorithm, and determining the dimension of the Radio Map after dimension reduction by using the obtained intrinsic dimension as one of input parameters of a class matching-based semi-supervised local identification embedded CM-SDE algorithm;

thirdly, performing clustering analysis on the Radio Map by adopting a kernel fuzzy C mean KFCM algorithm to realize the established class mark of the Radio Map, using the class mark as one of input parameters of the CM-SDE, and providing a corresponding initial clustering center and the corresponding class mark;

fourthly, taking the intrinsic dimension in the second step and the category mark in the third step as input parameters, adopting a CM-SDE algorithm to reduce the dimension of the Radio Map constructed in the first step, and obtaining a corresponding dimension-reduced network signal overlay Radio Map^*，RadioMap^*As in the matching location database for the online location phase;

fifthly, adding the unmarked RSS obtained by the test of different users in the on-line positioning stage into the existing Radio Map by adopting a category matching mode to obtain the corresponding Radio Map containing the unmarked signal overlay_ulThe cluster center updated by class matching is used as a new class input parameter in the CM-SDE algorithm;

sixthly, taking the updated clustering center in the step five as an input parameter, and adopting a CM-SDE algorithm to perform the RadioMap_ulDimension reduction to obtain characteristic transformation matrixes V ', V' and RadioMap^*Together forming an online matching positioning database for online stage positioning; wherein the line stage positioning specifically comprises:

sixthly, testing RSS on line;

sixthly, adopting V to reduce the RSS dimension into the reduced signal strength RSS^*；

Sixthly, matching and positioning by adopting a K nearest neighbor method KNN algorithm to output a positioning result;

sixthly, updating a user positioning terminal positioning database;

namely, an off-line stage implementation mode of the WiFi indoor positioning method based on semi-supervised manifold learning of class matching is completed.

2. The WiFi indoor positioning method based on class matching semi-supervised manifold learning as recited in claim 1, wherein in the second step, GMST eigen-dimension estimation algorithm is adopted to analyze eigen-dimension of the Radio Map constructed in the first step, and the calculation formula is as follows:

d_{int r i n s i \dim} = \frac{1}{1 - a},

in the above formulaA in (a) represents the slope of the linear fitting expression y of the minimum spanning tree, ax + b.

3. The WiFi indoor positioning method based on semi-supervised manifold learning of class matching as claimed in claim 1, wherein the new dimension-reduced signal coverage map RADIO MAP is generated in step four^*And V' in the sixth step is expressed as:

Radio Map*＝V′·X

x is the Radio Map that needs dimension reduction.

4. The WiFi indoor positioning method based on semi-supervised manifold learning of class matching as claimed in claim 1, wherein class matching is implemented in step five, and the core process is completed in two steps:

firstly, finding the category attribute of the unmarked RSS, and finishing the marking of the category attribute according to the following formula:

the second step is that: performing threshold detection on RSS: by calculating and determining the generalized symbol value and the threshold value V_TThe updating of the Radio Map and the class mark data is realized, and the updating of the generalized symbol value and the clustering center is respectively completed by the following two formulas:

S_{i} = Σ_{j = 1}^{N} sgn ({RSS}_{i j} - v_{i j})

v_{i} = \frac{1}{| G_{i} |} \underset{x_{k} &Element; G_{i}}{Σ} x_{k}

wherein the threshold value V_T0.9N; c is the number of clustering centers; x is the number of_iIs the ith data point; g_iThe adjacency graph is formed by the ith class; d (x)_i,v_j) Representing a calculation x_iAnd v_jEuropean distance betweenSeparating; | represents calculating the number of elements in a certain class; n is the number of APs in the indoor positioning system; sgn (·) is defined as:

5. the WiFi indoor positioning method based on semi-supervised manifold learning of class matching as claimed in claim 1, wherein the WiFi indoor positioning method based on semi-supervised manifold learning of class matching is implemented in an online stage positioning process through the following steps:

firstly, testing RSS on line;

secondly, performing dimensionality reduction transformation on the obtained RSS to be located by adopting a feature transformation matrix to obtain RSS;

thirdly, matching RSS and Radio Map by adopting a KNN algorithm, predicting the specific position coordinate of the to-be-positioned point and updating online data, wherein the implementation process comprises the following steps:

(1) in the online phase, RSS received at the test point is [ AP ]₁,AP₂,…,AP_n]Multiplying the obtained result by a feature transformation matrix V 'to obtain the RSS' after dimension reduction ═ AP₁,AP₂,…AP_d]Wherein d represents an intrinsic dimension;

(2) the matching of RSS 'and Radio Map is realized by adopting a KNN algorithm, and the average value of the coordinates of K reference points nearest to the RSS' is used as a test point (x ', y'), wherein the expression is as follows:

(x^{'}, y^{'}) = \frac{1}{K} Σ_{i = 1}^{K} (x_{i}, y_{i})

wherein (x ', y') is the predicted coordinate of the test point, (x)_i,y_i) The coordinate of the ith neighbor point is shown, and K is the number of neighbors in the KNN algorithm;

and fourthly, updating a positioning database of the user positioning terminal, namely completing the WiFi on-line stage indoor positioning method based on semi-supervised manifold learning of category matching.

6. The WiFi indoor positioning method based on semi-supervised manifold learning of class matching according to claim 5, wherein the prediction of the specific position coordinates of the point to be positioned and the update of the online data in the third step are realized by an offline database and an online database:

the off-line database is realized in the following way: adding RSS measured by a positioning user in the positioning process into a Radio Map as unmarked data in a category matching mode, updating a local online matching positioning database on the mobile terminal, and dynamically updating local data so as to realize an offline database positioning mode;

the online database is realized in the following way: after the online positioning of the user is completed, the RSS value measured online by the user is uploaded to a server where an online positioning database is located, the online positioning database is updated at the server side, and the online positioning data is returned to a positioning terminal which uploads the RSS data.