CN110472775A

CN110472775A - Method for predicting the foothold of a suspect in a series of cases

Info

Publication number: CN110472775A
Application number: CN201910683777.5A
Authority: CN
Inventors: 柳林; 廖薇薇
Original assignee: Guangzhou University
Current assignee: Guangzhou University
Priority date: 2019-07-26
Filing date: 2019-07-26
Publication date: 2019-11-19

Abstract

The invention discloses a method for predicting the foothold of a suspect in a series of cases. The method first obtains the residence and activity range information of all mobile phone users and generates a first residence kernel density map from the residence information. Based on the activity range information, it obtains the residence information of the mobile phone users who match the crime location information of the suspect to be predicted and generates a second residence kernel density map. It then applies a crime distance decay function, derived from the crime locations and residences of captured criminals, to the crime location information of the suspect to be predicted, obtaining the probability surface of the suspect's residence model. Finally, it combines the first and second residence kernel density maps with the residence model probability surface using a Bayesian algorithm to obtain a probability density map for predicting the suspect's foothold. With this technical scheme there is no need to constantly account for factors such as time, region, and traffic environment that reduce the accuracy of the prediction model, and the accuracy of the predicted foothold is improved.

Description

A method for predicting the whereabouts of suspects in a series of cases

Technical field

The invention relates to the technical fields of public security and crime geography, and in particular to a method for predicting the foothold of a suspect in a series of cases.

Background

Serial cases are characterized by offenders who commit many crimes, use practiced methods, and are skilled at evading investigation. Such cases easily cause public panic and undermine social stability. Accurately predicting the foothold of a serial-case suspect can therefore effectively improve the efficiency of police investigation.

In the prior art, the foothold of a serial-case suspect is usually predicted by building a journey-to-crime estimation model (JTC model). The JTC model constructs a crime distance decay function from solved serial crimes and then infers the area most likely to be the suspect's foothold. However, current JTC models must constantly account for the effect of factors such as time, region, and traffic environment on prediction accuracy. When a suspect commits crimes near his or her residence, predicting the residence with the JTC model gives good results; when the suspect commits crimes far from the residence, the JTC model performs poorly. The precision and accuracy of existing JTC models therefore still need improvement.

Summary of the invention

The embodiments of the present invention propose a method for predicting the foothold of a suspect in a series of cases. The method does not need to constantly account for factors such as time, region, and traffic environment that reduce the accuracy of the prediction model, and it improves the accuracy of the predicted foothold.

An embodiment of the present invention provides a method for predicting the foothold of a suspect in a series of cases, comprising:

Obtaining first residence information and activity range information of all mobile phone users, wherein the first residence information is used to generate a first residence kernel density map of all mobile phone users;

According to the activity range information, extracting, by time and place, the mobile phone users who match the first crime location information of the series of cases of the suspect to be predicted, and obtaining a second residence kernel density map of the suspect to be predicted from the first residence information of the extracted mobile phone users, using a preset spatial kernel density analysis algorithm;

Fitting, with a preset negative exponential function, the distance distribution curve between the preset second crime location information and second residence information of the series of cases of captured criminals, to obtain a crime distance decay function;

Obtaining the residence model probability surface of the suspect to be predicted from the first crime location information, using the crime distance decay function;

Obtaining a probability density map for predicting the foothold of the suspect to be predicted from the first residence kernel density map, the second residence kernel density map, and the residence model probability surface, using a preset Bayesian algorithm.

Optionally, obtaining the first residence information and activity range information of all mobile phone users specifically comprises:

The first residence information includes several residences, and the activity range information includes several activity ranges and the activity times corresponding one-to-one to those activity ranges;

Extracting and aggregating the mobile phone signaling data of all mobile phone users to obtain all base station point data;

Constructing base station Thiessen polygons from all the base station point data and, using a preset Monte Carlo simulation algorithm, obtaining the base station position of each mobile phone user at each moment;

Applying the information entropy method to the base station positions to obtain the first residence information and activity range information of all mobile phone users.

Optionally, applying the information entropy method to the base station positions specifically comprises:

Extracting the base station positions of each mobile phone user during the night-time period and calculating the proportion of the night-time period that each user stays at each base station position;

Calculating the total night-time entropy of each mobile phone user from these stay proportions;

When the total night-time entropy of the i-th mobile phone user is less than a preset threshold, marking the base station position with the largest stay proportion as the residence of the i-th mobile phone user, and marking the user's other base station positions as the activity range of the i-th mobile phone user.

Optionally, the spatial kernel density analysis algorithm uses a quartic kernel function.

Optionally, fitting, with a preset negative exponential function, the distance distribution curve between the preset second crime location information and second residence information of the series of cases of captured criminals to obtain the crime distance decay function specifically comprises:

The second crime location information includes several second crime locations, and the second residence information includes the second residences corresponding one-to-one to the second crime locations;

Calculating the distance between each second crime location and its corresponding second residence to obtain crime distance data;

Plotting the crime distance data on the horizontal axis and the proportion of cases on the vertical axis to obtain the crime distance distribution curve;

Fitting the crime distance distribution curve with a negative exponential function to obtain the crime distance decay function f(d_ij):

where d_ij is the Euclidean distance to the residence, and the parameters a_k, b_k, and c_k are constants that must be obtained by training on the crime distance distribution data of actually captured criminals.

Optionally, obtaining the residence model probability surface of the suspect to be predicted from the first crime location information, using the crime distance decay function, specifically comprises:

The first crime location information includes several first crime locations and the first crime times corresponding one-to-one to the first crime locations;

The residence model probability surface of the suspect to be predicted is calculated as follows:

Generating a grid of N × N meter cells in the prediction area, calculating the distance x_ij from the center of each cell to the first crime location, substituting x_ij into the crime distance decay function to calculate, for each cell, the residence model probability that it is the foothold of the suspect to be predicted, and finally obtaining the residence model probability surface of the suspect to be predicted.

Optionally, the probability density map for predicting the foothold of the suspect to be predicted is obtained from the first residence kernel density map, the second residence kernel density map, and the residence model probability surface, using a preset Bayesian algorithm, as follows:

$$P(JTC \mid O) = \frac{P(O \mid JTC)\,P(JTC)}{P(O)}$$

where P(JTC|O) is the probability density of the foothold of the suspect to be predicted, P(JTC) is the residence model probability surface of the suspect to be predicted, P(O|JTC) is the second residence kernel density map, and P(O) is the first residence kernel density map.

Implementing the embodiments of the present invention has the following beneficial effects:

In the method provided by the embodiments of the present invention, the residence and activity range information of all mobile phone users is obtained first, and the first residence kernel density map is generated from the residence information. Based on the activity range information, the residence information of the mobile phone users who match the crime location information of the suspect to be predicted is obtained, and the second residence kernel density map is generated. The crime distance decay function, derived from the crime locations and residences of captured criminals, is applied to the crime location information of the suspect to be predicted to obtain the residence model probability surface. The first and second residence kernel density maps and the residence model probability surface are then combined using the Bayesian algorithm to obtain the probability density map for predicting the suspect's foothold. Compared with the prior art, which predicts the foothold with a traditional JTC model, the technical solution of the present invention does not need to constantly account for the loss of JTC prediction accuracy caused by factors such as time, region, and traffic environment. Instead, it studies individual spatial behavior through mobile phone signaling data or other big data carrying location information, which further improves the accuracy of the JTC model's foothold prediction.

Further, the present invention extracts and aggregates the mobile phone signaling data of mobile phone users to obtain all base station point data, and processes the base station point data with Thiessen polygons and the Monte Carlo simulation algorithm, which makes the estimated position of a user within a given base station's coverage at a given moment more accurate.

Further, when obtaining the first residence information and activity range information of all mobile phone users, the present invention applies the entropy method to the base station position of each mobile phone user at each moment, which further improves the accuracy of the first residence information and activity range information.

Further, when identifying candidates for the suspect to be predicted, the present invention uses the activity range information of all mobile phone users to extract, by time and place, the mobile phone users who match the first crime location information of the series of cases. The data obtained for the suspect to be predicted is therefore not affected by time and region, which effectively improves its accuracy.

Further, when obtaining the probability density map of the foothold of the suspect to be predicted, the present invention processes the data within a Bayesian framework, which improves the precision of the predicted foothold.

Description of drawings

Fig. 1 is a schematic flow chart of an embodiment of the method for predicting the foothold of a suspect in a series of cases provided by the present invention;

Fig. 2 is a flow chart of mobile phone signaling data processing in the present invention;

Fig. 3 is a schematic diagram of base stations and Thiessen polygons in the present invention;

Fig. 4 is a fitted distance decay curve in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Example 1

Fig. 1 is a schematic flow chart of an embodiment of the method for predicting the foothold of a suspect in a series of cases provided by the present invention. As shown in Fig. 1, the method includes steps 101 to 105, described below.

Step 101: Obtain first residence information and activity range information of all mobile phone users, wherein the first residence information is used to generate a first residence kernel density map of all mobile phone users.

In this embodiment, the first residence information of all mobile phone users includes several first residences, and the activity range information of all mobile phone users includes several activity ranges and the activity times corresponding one-to-one to those activity ranges.

In this embodiment, several base station signal receivers may share the same longitude and latitude, and a single base station signal receiver may carry several groups of mobile phone signaling data. The mobile phone signaling data of all mobile phone users is therefore extracted and aggregated, and the base stations at the same longitude and latitude are merged into one point, to obtain all base station point data. This solution extracts the mobile phone signaling data of mobile phone users only when police investigation work requires it, and is therefore not an invention-creation that harms the public interest in violation of section 3.1.3 of the Patent Law.
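The aggregation described above can be sketched as a simple group-by on coordinates. This is a minimal illustration, not the patent's implementation: the record layout `(user_id, timestamp, lon, lat)`, the function name `aggregate_stations`, the rounding precision, and the sample coordinates are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_stations(records, ndigits=6):
    """Merge receivers that share a longitude/latitude into one base station point.

    records: iterable of (user_id, timestamp, lon, lat) tuples (hypothetical layout).
    Returns (points, by_point):
      points   maps (lon, lat) -> integer point id
      by_point maps point id   -> list of (user_id, timestamp) seen at that point
    """
    points = {}
    by_point = defaultdict(list)
    for user_id, timestamp, lon, lat in records:
        # Rounding guards against float noise; the precision is an assumption.
        key = (round(lon, ndigits), round(lat, ndigits))
        point_id = points.setdefault(key, len(points))
        by_point[point_id].append((user_id, timestamp))
    return points, dict(by_point)


# Made-up records: the first two come from different receivers at the same site.
records = [
    ("u1", "01:00", 10.000000, 20.000000),
    ("u2", "01:05", 10.000000, 20.000000),
    ("u1", "09:00", 10.050000, 20.010000),
]
points, by_point = aggregate_stations(records)
```

Here the three records collapse to two base station points, and the first point keeps both signaling records.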

In this embodiment, step 101 is specifically as follows: extract and aggregate the mobile phone signaling data of all mobile phone users to obtain all base station point data; construct base station Thiessen polygons from all the base station point data and, using a preset Monte Carlo simulation algorithm, obtain the base station position of each mobile phone user at each moment; apply the information entropy method to the base station positions to obtain the first residence information and activity range information of all mobile phone users; and apply the spatial kernel density analysis algorithm to the first residence information to generate the first residence kernel density map of all mobile phone users.

In this step, the base station position of each mobile phone user at each moment is obtained as follows. Within the study area, when the i-th mobile phone user is located at base station point j at time k, a Thiessen polygon is constructed around base station point j (X_j, Y_j) (see Fig. 3), such that the Euclidean distance from any point (x_p, y_p) inside the polygon to base station point j (X_j, Y_j) is smaller than the Euclidean distance from (x_p, y_p) to any other base station point. The Euclidean distance is defined as:

$$d = \sqrt{(x_p - X_j)^2 + (y_p - Y_j)^2}$$

A minimum bounding rectangle is generated for the Thiessen polygon built around base station point j (X_j, Y_j), and random points are generated with the Monte Carlo simulation algorithm. The probability that the i-th mobile phone user is located inside the Thiessen polygon of base station point j is set to 1, and the probability of being outside the polygon but inside the minimum bounding rectangle is set to 0. A random point inside the Thiessen polygon of base station point j is recorded as the specific base station position of the i-th mobile phone user at time k, and this position is stored under the mobile phone user ID. The same processing is applied to the base station point of every user at every moment, yielding the base station position of each mobile phone user at each moment. The detailed mobile phone signaling processing flow is shown in Fig. 2.
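One way to read the Monte Carlo step is as rejection sampling: draw uniform points in the bounding rectangle and keep the first one that falls inside the Thiessen polygon (probability 1 inside, 0 outside). The sketch below assumes the polygon vertices have already been computed elsewhere, for example by a GIS tool, and that the "minimum bounding rectangle" is axis-aligned. The function names and the sample polygon are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_point_in_cell(polygon, rng, max_tries=10_000):
    """Draw a uniform random point inside a Thiessen cell by rejection sampling."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    for _ in range(max_tries):
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):  # accepted: inside the cell
            return x, y
    raise RuntimeError("no point accepted; check the polygon")


# Made-up Thiessen cell of one base station, coordinates in meters.
cell = [(0.0, 0.0), (400.0, 0.0), (500.0, 300.0), (200.0, 500.0), (-100.0, 250.0)]
rng = random.Random(42)
x, y = sample_point_in_cell(cell, rng)
```

Because rejected draws are simply discarded, accepted points are uniformly distributed over the polygon, which matches the 1/0 probability assignment in the text.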

In this step, the information entropy method is applied to the base station positions as follows: extract the base station positions of each mobile phone user during the night-time period and calculate the proportion of the night-time period that each user stays at each base station position; calculate the total night-time entropy of each mobile phone user from these stay proportions; when the total night-time entropy of the i-th mobile phone user is less than a preset threshold, mark the base station position with the largest stay proportion as the residence of the i-th mobile phone user, and mark the user's other base station positions as the activity range of the i-th mobile phone user.

For example, all base station positions of the i-th mobile phone user during the night-time period (such as 0:00 to 6:00) are extracted, and the proportion of time the user stays at each base station position is calculated as:

$$P_{ij} = \frac{T_{ij}}{T}$$

where P_ij is the proportion of time the i-th mobile phone user stays at base station position j, T_ij is the length of time the i-th mobile phone user stays at base station position j, and T is a constant representing the total night-time duration included in the calculation. The total night-time entropy of the i-th mobile phone user is then calculated as:

$$H(U_i) = -\sum_{j=1}^{n} P_{ij} \log P_{ij}$$

where H(U_i) is the total night-time entropy of the i-th mobile phone user, P_ij is the proportion of time the i-th mobile phone user stays at base station position j, and n is the number of base station positions at which the i-th mobile phone user stays. The smaller the entropy H(U_i), the less frequently the i-th mobile phone user switches between base station positions at night and the more stably the user remains at one base station position.

When the night-time period is set to 0:00 to 6:00, 1.5 is used as the threshold. If H(U_i) ≤ 1.5, the base station position with the largest night-time stay proportion is selected as the residence of the i-th mobile phone user, and the trajectory points generated by the user's activity in areas other than the residence are recorded as the activity range.
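The entropy rule can be illustrated with a short sketch. It assumes Shannon entropy over the stay proportions with a natural logarithm (the text does not state the base), and it leaves users above the threshold without a residence, which the text does not specify. The function name `detect_home` and the sample stay times are hypothetical.

```python
import math


def detect_home(stay_seconds, total_seconds, threshold=1.5):
    """Pick a residence from night-time stays using the entropy rule.

    stay_seconds:  dict mapping base station position -> seconds stayed at night
    total_seconds: T, the total night-time duration included in the calculation
    Returns (home, activity_range, entropy). home is None when the entropy
    exceeds the threshold (an assumption; the text leaves this case open).
    """
    ratios = {j: t / total_seconds for j, t in stay_seconds.items() if t > 0}
    # Natural log is an assumption; the source does not state the base.
    entropy = -sum(p * math.log(p) for p in ratios.values())
    if entropy > threshold:
        return None, sorted(ratios), entropy
    home = max(ratios, key=ratios.get)
    activity_range = sorted(j for j in ratios if j != home)
    return home, activity_range, entropy


NIGHT = 6 * 3600  # 0:00 to 6:00 in seconds

# A user who spends five of six night hours at position "A".
home, activity, h = detect_home({"A": 18000, "B": 3600}, NIGHT)

# A user spread evenly over six positions has entropy ln(6), above 1.5.
roamer_home, _, roamer_h = detect_home({s: 3600 for s in "ABCDEF"}, NIGHT)
```

The first user gets residence "A" with activity range ["B"]; the second user switches positions too often to be assigned a residence under the 1.5 threshold.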

Step 102: According to the activity range information, extract, by time and place, the mobile phone users who match the first crime location information of the series of cases of the suspect to be predicted, and obtain the second residence kernel density map of the suspect to be predicted from the first residence information of the extracted mobile phone users, using a preset spatial kernel density analysis algorithm.

In this embodiment, the first crime location information of the series of cases of the suspect to be predicted includes several first crime locations and the first crime times corresponding one-to-one to those first crime locations.

In this embodiment, step 102 is specifically as follows. The urban space is divided into K square grid cells of N × N meters, and each cell is given a number. Spatial overlay analysis is then used to obtain the cell numbers of each mobile phone user's activity range and the cell numbers of the first crime locations of the series of cases. The mobile phone users who match the series of cases are selected by the conditions that the cell number of the first crime location equals the cell number of the user's activity range and that the first crime time equals the user's activity time. Next, the first residences of the matched mobile phone users are extracted, and the spatial kernel density analysis algorithm is applied, with the cell side length N meters as the kernel density search radius, to generate the second residence kernel density map of the suspect to be predicted.
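The matching rule (same grid cell, same time) can be sketched as follows. Several details are assumptions made for the example: projected coordinates in meters, time expressed as a discrete slot such as an hour, and a `min_matches` parameter for how many crimes in the series a user must coincide with, which the text does not specify. All names are hypothetical.

```python
from collections import defaultdict


def cell_of(x, y, cell_size):
    """Index (column, row) of the cell_size-meter grid cell containing (x, y)."""
    return int(x // cell_size), int(y // cell_size)


def match_users(crimes, activity, cell_size, min_matches=1):
    """Select users seen in the same cell and time slot as crimes in the series.

    crimes:   list of (x, y, time_slot) for the unsolved series
    activity: list of (user_id, x, y, time_slot) activity-range points
    Returns the set of user ids that coincide with at least min_matches crimes.
    """
    crime_keys = defaultdict(set)
    for idx, (x, y, slot) in enumerate(crimes):
        crime_keys[(cell_of(x, y, cell_size), slot)].add(idx)
    hits = defaultdict(set)
    for user_id, x, y, slot in activity:
        key = (cell_of(x, y, cell_size), slot)
        if key in crime_keys:
            hits[user_id] |= crime_keys[key]
    return {u for u, matched in hits.items() if len(matched) >= min_matches}


# Made-up data: two crimes, 100 m cells, time slots given as hours.
crimes = [(120.0, 80.0, 22), (950.0, 430.0, 3)]
activity = [
    ("u1", 150.0, 20.0, 22),   # same cell and hour as crime 0
    ("u1", 910.0, 499.0, 3),   # same cell and hour as crime 1
    ("u2", 150.0, 20.0, 21),   # right cell, wrong hour
    ("u3", 250.0, 80.0, 22),   # right hour, neighbouring cell
    ("u4", 130.0, 90.0, 22),   # matches crime 0 only
]
any_match = match_users(crimes, activity, 100.0)
all_match = match_users(crimes, activity, 100.0, min_matches=2)
```

Requiring a match on every crime in the series narrows the candidate set sharply, at the cost of missing a suspect whose phone was off during one of the crimes; which trade-off applies is a design choice outside the text.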

The spatial kernel density algorithm in this step uses a quartic kernel function. For a first residence (X_i, Y_i) in the first residence information of the matched mobile phone users, and any base station position (x_j, y_j) within N meters of it, the kernel function value at base station position (x_j, y_j) is given by:

$$K(t) = k\,(1 - t^2)^2$$

where k is a constant, t = d_ij / N, d_ij is the Euclidean distance from base station position (x_j, y_j) to the first residence (X_i, Y_i), and N is the kernel density search radius. Using a quartic kernel function makes the data distribution in the residence kernel density map more precise.
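As a rough illustration, the quartic kernel can be evaluated by summing the contribution of every residence within the search radius at each target location. The sketch assumes the standard quartic form k(1 − t²)² for t < 1 and zero beyond the radius; the function names, the default k = 1, and the sample coordinates are assumptions.

```python
import math


def quartic_kernel(distance, radius, k=1.0):
    """Quartic kernel value for a point at `distance` with search radius `radius`."""
    t = distance / radius
    return k * (1.0 - t * t) ** 2 if t < 1.0 else 0.0


def kernel_density(residences, targets, radius, k=1.0):
    """Sum the kernel contributions of all residences at each target location."""
    density = []
    for tx, ty in targets:
        total = 0.0
        for rx, ry in residences:
            total += quartic_kernel(math.hypot(tx - rx, ty - ry), radius, k)
        density.append(total)
    return density


# Two made-up residences 100 m apart, evaluated at their midpoint and far away.
residences = [(0.0, 0.0), (100.0, 0.0)]
density = kernel_density(residences, [(50.0, 0.0), (500.0, 0.0)], radius=100.0)
```

At the midpoint each residence contributes (1 − 0.5²)² = 0.5625, so the density there is 1.125, while the distant location lies outside both search radii and gets 0.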

Step 103: Fit, with a preset negative exponential function, the distance distribution curve between the preset second crime location information and second residence information of the series of cases of captured criminals, to obtain the crime distance decay function.

In this embodiment, step 103 is specifically as follows. The second crime location information includes several second crime locations, and the second residence information includes the second residences corresponding one-to-one to them. The distance between each second crime location and its corresponding second residence is calculated to obtain crime distance data, and the crime distance data are used to build the crime distance distribution curve. The horizontal axis is the crime distance divided into bands, for example 0-1000 m and 1001-2000 m; the vertical axis is the proportion of cases, that is, the share of cases falling within each distance band. A negative exponential function is then fitted to the crime distance distribution curve to give the crime distance decay function f(d_ij):

where d_ij is the Euclidean distance from base station position (x_j, y_j) to the second residence (X_i, Y_i), and the parameters a_k, b_k, and c_k are constants that must be obtained by training on the crime distance distribution curve of actually captured criminals.

This step draws the crime distance distribution curve from a large amount of second crime location and second residence data for the series of cases of captured criminals and fits it with a negative exponential function, which makes the resulting crime distance decay function more accurate.
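The binning and fitting can be sketched as below. The exact functional form with parameters a_k, b_k, and c_k is not reproduced in this text, so the sketch substitutes a simpler two-parameter form f(d) = a·exp(−b·d), fitted by least squares on ln(y). That form, the function names, and the sample values are assumptions and should not be read as the patent's formula.

```python
import math


def bin_shares(distances, bin_width):
    """Share of cases per distance band; returns (band midpoints, shares).

    Empty bands are omitted because the log-linear fit cannot use a share of 0.
    """
    counts = {}
    for d in distances:
        band = int(d // bin_width)
        counts[band] = counts.get(band, 0) + 1
    n = len(distances)
    bands = sorted(counts)
    mids = [(b + 0.5) * bin_width for b in bands]
    shares = [counts[b] / n for b in bands]
    return mids, shares


def fit_negative_exponential(xs, ys):
    """Least-squares fit of ln(y) = ln(a) - b * x; returns (a, b).

    Assumed form f(d) = a * exp(-b * d), not the three-parameter original.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    slope = sxl / sxx
    return math.exp(mean_l - slope * mean_x), -slope


# Made-up crime-to-residence distances in meters, binned into 1000 m bands.
mids, shares = bin_shares([200.0, 800.0, 1500.0, 2600.0], 1000.0)

# Sanity check of the fit on synthetic points generated from a known curve.
xs = [500.0, 1500.0, 2500.0, 3500.0]
ys = [0.5 * math.exp(-0.001 * x) for x in xs]
a, b = fit_negative_exponential(xs, ys)
```

On the synthetic points the fit recovers a ≈ 0.5 and b ≈ 0.001. With real case data a nonlinear least-squares fit of the full three-parameter form would be the closer match to the text.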

Step 104: Obtain the residence model probability surface of the suspect to be predicted from the first crime location information, using the crime distance decay function.

In this embodiment, step 104 is calculated as follows. A grid of 100 m × 100 m cells is generated in the prediction area, the distance x_ij from the center of each cell to the first crime location is calculated, and x_ij is substituted into the crime distance decay function to calculate, for each cell, the residence model probability that it is the foothold of the suspect to be predicted. The result is the residence model probability surface of the suspect to be predicted.
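A minimal sketch of the surface calculation follows. How the values for several crime locations are combined is not stated in the text; the sketch sums the decay values over all crime locations and normalizes the grid to sum to 1, both of which are assumptions. The decay function used here is a placeholder, and all names are hypothetical.

```python
import math


def jtc_surface(crime_sites, x_range, y_range, cell_size, decay):
    """Return {(col, row): probability} over a grid of cell_size-meter cells.

    Each cell's score is the sum of decay(distance) from the cell center to
    every crime site (assumed combination rule), normalized over the grid.
    """
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    cols = int((x_max - x_min) // cell_size)
    rows = int((y_max - y_min) // cell_size)
    scores = {}
    for col in range(cols):
        for row in range(rows):
            cx = x_min + (col + 0.5) * cell_size
            cy = y_min + (row + 0.5) * cell_size
            scores[(col, row)] = sum(
                decay(math.hypot(cx - sx, cy - sy)) for sx, sy in crime_sites
            )
    total = sum(scores.values())
    return {cell: s / total for cell, s in scores.items()}


def example_decay(d):
    """Placeholder decay curve; a fitted function would be used in practice."""
    return math.exp(-0.002 * d)


# One made-up crime site in the middle of a 300 m x 300 m area, 100 m cells.
surface = jtc_surface([(150.0, 150.0)], (0.0, 300.0), (0.0, 300.0), 100.0, example_decay)
```

With a monotonically decreasing decay curve the cell containing the crime site scores highest, which is the behavior the background section criticizes for offenders who travel far from home.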

Step 105: Based on the first residence kernel density map, the second residence kernel density map and the residence-model probability surface, and using a preset Bayesian algorithm, obtain the probability density map used to predict the foothold of the suspect to be predicted.

In this embodiment, step 105 is computed as follows:

$$P(\mathrm{JTC} \mid O) = \frac{P(\mathrm{JTC}) \cdot P(O \mid \mathrm{JTC})}{P(O)}$$

where P(JTC|O) is the probability density of the foothold of the suspect to be predicted, P(JTC) is the residence-model probability surface of the suspect to be predicted, P(O|JTC) is the second residence kernel density map, and P(O) is the first residence kernel density map.
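A cell-by-cell version of this combination might look like the sketch below. It assumes the standard Bayes form, posterior = prior × likelihood / evidence, applied independently to each grid cell, with the three inputs already on the same grid. Setting cells with zero P(O) to 0 and renormalising the result are choices of this sketch, and the name `bayes_combine` is hypothetical.

```python
def bayes_combine(prior, likelihood, evidence):
    """Combine three equally sized grids cell by cell.

    prior      -- P(JTC), residence-model probability surface
    likelihood -- P(O|JTC), second residence kernel density map
    evidence   -- P(O), first residence kernel density map
    """
    raw = [
        [p * l / e if e > 0 else 0.0  # assumed handling of empty cells
         for p, l, e in zip(p_row, l_row, e_row)]
        for p_row, l_row, e_row in zip(prior, likelihood, evidence)
    ]
    total = sum(map(sum, raw))
    if total == 0:
        return raw
    return [[v / total for v in row] for row in raw]

# Hypothetical 1 x 3 grids; the last cell has no resident users at all.
posterior = bayes_combine([[0.5, 0.3, 0.2]],
                          [[0.2, 0.6, 0.9]],
                          [[0.4, 0.4, 0.0]])
```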

As can be seen from the above, with the technical solution of this embodiment the residence, activity range and activity time of every mobile phone user can be obtained from mobile phone signaling data. The activity range and activity time extracted for each user are then matched against the crime location and crime time to find the users who match the suspect to be predicted. As a result, acquiring data on the suspect to be predicted is not constrained by time or region, which effectively improves the accuracy of that data.

To better illustrate the process and principle of this embodiment, a detailed example follows:

Obtain 10,785,815 mobile phone signaling records of all mobile phone users in ZG City for December 28, 2018. Extracting and aggregating the signaling data yields 10,594 base station points, and a Thiessen polygon is constructed around each base station point. The signaling data show that mobile phone user i is at base station point j at time k. The minimum bounding rectangle of the Thiessen polygon built around base station point j is generated, and random points are generated within it by Monte Carlo simulation. The probability that user i lies inside the Thiessen polygon of base station point j is set to 1, and the probability of lying outside the polygon but inside the minimum bounding rectangle is set to 0. A random point inside the Thiessen polygon of base station point j is recorded as the specific base station location of user i at time k, and this location is stored under the mobile phone user ID. The same processing is applied to the base station point of every user at every time, giving the base station location of each mobile phone user at each time.
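The Monte Carlo step amounts to rejection sampling: draw uniform points in the minimum bounding rectangle and keep the first one that falls inside the Thiessen polygon. The sketch below assumes the polygon is already available as a list of vertices (its construction is not shown) and uses a ray-casting point-in-polygon test. The function names and the triangular cell are hypothetical.

```python
import random

def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices in order."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_point_in_cell(poly, rng):
    """Sample uniformly in the bounding rectangle until a point is inside poly."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    while True:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, poly):  # probability 1 inside, 0 outside
            return x, y

# Hypothetical triangular cell (metres) and a seeded generator.
cell_polygon = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
location = random_point_in_cell(cell_polygon, random.Random(0))
```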

From the per-time base station locations obtained above, extract all base station locations of each mobile phone user during the night period (00:00-06:00), calculate the proportion of the period between 00:00 and 06:00 that each user stays at each of these base station locations, and calculate each user's total night-time entropy. If the total entropy of user i is ≤ 1.5, the base station location with the largest night-time stay proportion is selected as the residence of user i, and the other base stations visited by user i outside the residence are marked as the activity range of user i. This embodiment identifies the residences of 1,500,871 mobile phone users in total. Each user's activity range and residence are linked through the mobile phone user ID.
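The residence rule can be sketched as follows. The text gives the threshold of 1.5 but not the entropy formula or the logarithm base, so this sketch assumes Shannon entropy with base 2 over the night-time stay proportions. The function name, the station labels and the choice to take the activity range from all of a user's observations are hypothetical.

```python
import math
from collections import Counter

def identify_residence(night_stations, all_stations, threshold=1.5):
    """Return (residence, activity_range, entropy) for one user.

    night_stations -- station observed at each night-time sample (00:00-06:00)
    all_stations   -- stations observed over the whole day
    residence is None when the night-time entropy exceeds the threshold.
    """
    counts = Counter(night_stations)
    total = sum(counts.values())
    shares = {s: n / total for s, n in counts.items()}
    # Shannon entropy in bits (assumed definition of the "total entropy").
    entropy = -sum(p * math.log2(p) for p in shares.values())
    if entropy > threshold:
        return None, set(), entropy
    home = max(shares, key=shares.get)
    return home, set(all_stations) - {home}, entropy

night = ["A"] * 5 + ["B"]
home, activity, h = identify_residence(night, night + ["C", "B"])
scattered = identify_residence(list("ABCDEFGH"), list("ABCDEFGH"))
```

In this example the user spends five of six night-time samples at station A, so the entropy is low and A is taken as the residence, while a user spread evenly over eight stations has an entropy of 3 bits and is left without a residence.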

Obtain information on captured criminals in theft cases in ZG City from 2012 to June 2016, and geocode the second crime locations and second residences using an online geocoding service interface. A total of 282 criminals who committed at least 3 offences in different areas of the city are selected and randomly divided into two groups: a training sample of 133 people and a test sample of 149 people. The test sample is treated as the suspects of the series of cases to be predicted.

First, ZG City is divided into a spatial grid of 2,562 square cells of 1609 m × 1609 m, and each cell is numbered. For suspect A of a series of cases to be predicted, grid numbers are assigned according to the first crime time and first crime location. From the processed activity range data set of all mobile phone users, the users whose activity time and activity range grid number both match the first crime time and the first crime location grid number of suspect A are extracted, together with their residences. A spatial kernel density analysis of these residences generates the second kernel density map, with a cell size of 100 m × 100 m and a search radius of 2000 m.
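The matching and density steps can be sketched together as below. Several details are assumptions of this sketch: time is matched at hour granularity, a user is kept if they match any one of the suspect's crimes, and the kernel is the two-dimensional quartic (biweight) kernel with the search radius as bandwidth. All function names and sample records are hypothetical.

```python
import math

def grid_id(x, y, x_min, y_min, n_cols, cell=1609.0):
    """Row-major number of the square cell containing (x, y)."""
    return int((y - y_min) // cell) * n_cols + int((x - x_min) // cell)

def match_users(activity_records, crimes, x_min, y_min, n_cols):
    """Users seen in the same grid cell and hour as any crime.

    activity_records -- iterable of (user_id, hour, x, y)
    crimes           -- list of (hour, x, y)
    """
    keys = {(h, grid_id(x, y, x_min, y_min, n_cols)) for h, x, y in crimes}
    return {uid for uid, h, x, y in activity_records
            if (h, grid_id(x, y, x_min, y_min, n_cols)) in keys}

def quartic_kde(points, x_min, y_min, n_cols, n_rows, cell=100.0, radius=2000.0):
    """Kernel density of points on a grid, using an assumed quartic kernel."""
    norm = 3.0 / (math.pi * radius ** 2)
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        cy = y_min + (r + 0.5) * cell
        for c in range(n_cols):
            cx = x_min + (c + 0.5) * cell
            for px, py in points:
                u2 = ((cx - px) ** 2 + (cy - py) ** 2) / radius ** 2
                if u2 < 1.0:
                    grid[r][c] += norm * (1.0 - u2) ** 2
    return grid

records = [("u1", 14, 1700.0, 100.0), ("u2", 14, 5000.0, 100.0),
           ("u3", 9, 1700.0, 100.0)]
matched = match_users(records, [(14, 2000.0, 500.0)], 0.0, 0.0, 10)
residences = {"u1": (150.0, 150.0)}  # hypothetical residence lookup by user ID
density = quartic_kde([residences[u] for u in matched], 0.0, 0.0, 3, 3)
```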

The training sample consists of 133 captured series-case criminals involved in 467 cases. For each case, the Euclidean distance between the second crime location and its corresponding second residence is calculated, and the proportion of cases within each distance bin is counted. With the binned crime distance on the horizontal axis and the case proportion on the vertical axis, the crime distance distribution curve is constructed and then fitted with a negative exponential function to obtain the crime distance decay function. As shown in Figure 4, the fitted curve has an R² of 0.9935, indicating a good fit.
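The binning that produces the distribution curve can be sketched as below, assuming projected coordinates in metres and 1000 m bins. The sketch uses half-open bins [0, 1000), [1000, 2000), and so on, which differs slightly at the bin edges from bins written as 0-1000 m and 1001-2000 m. The function name and the sample pairs are hypothetical.

```python
import math

def crime_distance_distribution(pairs, bin_width=1000.0):
    """Proportion of cases per distance bin.

    pairs -- list of ((crime_x, crime_y), (home_x, home_y)) in metres
    Returns {bin_index: proportion}, where bin i covers
    [i * bin_width, (i + 1) * bin_width).
    """
    counts = {}
    for (cx, cy), (rx, ry) in pairs:
        d = math.hypot(cx - rx, cy - ry)
        b = int(d // bin_width)
        counts[b] = counts.get(b, 0) + 1
    total = len(pairs)
    return {b: n / total for b, n in sorted(counts.items())}

# Four hypothetical cases with distances of 500, 1500, 2500 and 500 m.
pairs = [((0.0, 0.0), (300.0, 400.0)),
         ((0.0, 0.0), (900.0, 1200.0)),
         ((0.0, 0.0), (1500.0, 2000.0)),
         ((0.0, 0.0), (0.0, 500.0))]
curve = crime_distance_distribution(pairs)
```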

The study area is rasterized into cells of 100 m × 100 m. The distance from the center of each cell to the first crime location is calculated and substituted into the crime distance decay function to obtain the residence-model probability of each cell being the foothold of the suspect to be predicted, which gives the residence-model probability surface of the suspect to be predicted.

For the first residences of all mobile phone users, spatial kernel density analysis with a search radius of 2000 m and a cell size of 100 m × 100 m generates the first kernel density map of all mobile phone users. Following the Bayesian framework, the raster calculation is:

$$P(\mathrm{JTC} \mid O) = \frac{P(\mathrm{JTC}) \cdot P(O \mid \mathrm{JTC})}{P(O)}$$

where P(JTC|O) is the probability density of the foothold of the suspect to be predicted, P(JTC) is the residence-model probability surface of the suspect to be predicted, P(O|JTC) is the second residence kernel density map, and P(O) is the first residence kernel density map. The result is the probability density map used to predict the foothold of the suspect to be predicted.

This embodiment produces probability density maps of the footholds of the test sample, i.e. the 149 suspects to be predicted. Their accuracy is compared with that of the traditional JTC model, which considers only the distance decay curve (JTC in Table 1); the Bayesian JTC model, which considers only the OD matrix of captured criminals (CBJTC in Table 1); and the center point model (Cmd in Table 1).

The accuracy evaluation uses the following three indicators:

Effectiveness: the proportion of all test samples for which a prediction result can be obtained. A result counts as obtained when data matching the sample to be predicted can be found in the training samples and the output probability at the suspect's true foothold is not 0.

Search cost (precision): the area the police must search, starting from the point of highest probability density and proceeding in order of decreasing density, before reaching the true residence of the suspect to be predicted. The present invention uses the search cost ratio, i.e. the ratio of the searched area to the total area of ZG City.
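On a grid of equal-area cells the search cost ratio reduces to a count of cells, as in the sketch below. It assumes that the grid covers the whole city and that cells tied with the true cell are all counted as searched. The function name and the sample grid are hypothetical.

```python
def search_cost_ratio(density, true_cell):
    """Share of cells searched, in descending density order, up to the true cell.

    density   -- 2-D list of predicted probability densities
    true_cell -- (row, col) of the cell containing the suspect's true residence
    """
    target = density[true_cell[0]][true_cell[1]]
    flat = [v for row in density for v in row]
    searched = sum(1 for v in flat if v >= target)  # ties count as searched
    return searched / len(flat)

# Hypothetical 2 x 2 surface; the true residence lies in the second-ranked cell.
ratio = search_cost_ratio([[0.1, 0.4], [0.3, 0.2]], (1, 0))
```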

Error distance (accuracy): the Euclidean distance between the point of highest probability density of the predicted foothold and the true residence of the suspect to be predicted. This embodiment reports the proportion of test samples with an error distance < 1609 m (1 mile, the same as the ZG City grid cell size) and the proportion with an error distance < 804.5 m.
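The error distance and the two reported proportions can be computed as in this sketch, which takes the center of the highest-density cell as the predicted point. The function names and sample values are hypothetical.

```python
import math

def error_distance(density, true_xy, x_min, y_min, cell=100.0):
    """Distance (m) from the centre of the highest-density cell to true_xy."""
    cells = ((r, c) for r in range(len(density)) for c in range(len(density[0])))
    peak_r, peak_c = max(cells, key=lambda rc: density[rc[0]][rc[1]])
    px = x_min + (peak_c + 0.5) * cell
    py = y_min + (peak_r + 0.5) * cell
    return math.hypot(px - true_xy[0], py - true_xy[1])

def share_within(errors, limit):
    """Proportion of samples whose error distance is below limit."""
    return sum(1 for e in errors if e < limit) / len(errors)

# Peak cell is row 0, column 1, whose centre is (150, 50).
err = error_distance([[0.1, 0.4], [0.3, 0.2]], (450.0, 450.0), 0.0, 0.0)
shares = (share_within([err, 2000.0], 1609.0), share_within([err, 2000.0], 804.5))
```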

Table 1. Comparison of the present invention with existing methods

It can thus be seen that, by mining the spatial behavior patterns of criminals and mobile phone users, the present invention combines mobile phone signaling data with the traditional JTC model to predict the foothold of the suspect to be predicted, overcomes the traditional JTC model's reliance on distance as the sole factor, and further improves the precision and accuracy of the prediction. In this embodiment, compared with the traditional JTC model, the search cost in police investigation is reduced by 56.74%, and the proportion of cases with an error distance of less than 1609 m rises from 26.85% to 39.42%.

By fusing massive mobile phone signaling data with historical crime data, the present invention avoids the sparse OD matrix problem of the traditional Bayesian JTC model. In this embodiment, the prediction accuracy of the traditional Bayesian JTC model is only 58.39%, whereas that of this embodiment is 91.94%. The present invention effectively mitigates the technical problem of prediction failure that arises when no match for the suspect to be predicted can be found from the first crime location and crime time of the series of cases. It effectively improves prediction precision and accuracy, enabling the police to accurately predict a suspect's foothold.

The above is a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and such improvements and refinements are also regarded as falling within the scope of protection of the present invention.

Claims

1. A method for predicting the foothold of a suspect in a series of cases, characterized by:

obtaining first residence information and activity range information of all mobile phone users, wherein the first residence information is used to generate a first residence kernel density map of all mobile phone users;

extracting, from the activity range information and according to time and place, the mobile phone users who match the first crime location information of the series of cases of the suspect to be predicted, and obtaining a second residence kernel density map of the suspect to be predicted from the first residence information of the extracted mobile phone users, using a preset spatial kernel density analysis algorithm;

fitting, with a preset negative exponential function, the distance distribution curve between the second crime location information and the second residence information of preset series cases of captured criminals, to obtain a crime distance decay function;

obtaining the residence-model probability surface of the suspect to be predicted from the first crime location information, using the crime distance decay function;

obtaining, from the first residence kernel density map, the second residence kernel density map and the residence-model probability surface, and using a preset Bayesian algorithm, a probability density map for predicting the foothold of the suspect to be predicted.

2. The method for predicting the foothold of a suspect in a series of cases according to claim 1, characterized in that obtaining the first residence information and activity range information of all mobile phone users specifically comprises:

the first residence information comprising several residences, and the activity range information comprising several activity ranges and the activity times in one-to-one correspondence with the activity ranges;

extracting and aggregating the mobile phone signaling data of all mobile phone users to obtain all base station point data;

constructing base station Thiessen polygons from all the base station point data and, using a preset Monte Carlo simulation algorithm, obtaining the base station location of each mobile phone user at each time;

calculating on each base station location using the information entropy method, to obtain the first residence information and activity range information of all mobile phone users.

3. The method for predicting the foothold of a suspect in a series of cases according to claim 2, characterized in that calculating on each base station location using the information entropy method specifically comprises:

extracting the base station locations of each mobile phone user during the night period, and calculating the stay proportion of each mobile phone user at every base station location during the night period;

calculating the total night-time entropy of each mobile phone user from the stay proportions of all mobile phone users;

when the total night-time entropy of the i-th mobile phone user is less than a preset threshold, marking the base station location with the largest stay proportion for the i-th mobile phone user as the residence of the i-th mobile phone user, and marking the other base station locations of the i-th mobile phone user outside the residence as the activity range of the i-th mobile phone user.

4. The method for predicting the foothold of a suspect in a series of cases according to claim 1, characterized in that the spatial kernel density analysis algorithm uses a quartic kernel function.

5. The method for predicting the foothold of a suspect in a series of cases according to claim 1, characterized in that fitting, with a preset negative exponential function, the distance distribution curve between the second crime location information and the second residence information of preset series cases of captured criminals, to obtain a crime distance decay function, specifically comprises:

the second crime location information comprising several second crime locations, and the second residence information comprising the second residences in one-to-one correspondence with the second crime locations;

calculating the distance between each second crime location and its corresponding second residence, to obtain crime distance data;

taking the crime distance data as the horizontal axis and the case proportion as the vertical axis, to obtain the crime distance distribution curve;

fitting the crime distance distribution curve with a negative exponential function, to obtain the crime distance decay function f(d_ij):

where d_ij denotes the Euclidean distance to the residence, and the parameters a_k, b_k and c_k are constants that must be obtained by training on the crime distance distribution data of actually captured criminals.

6. The method for predicting the foothold of a suspect in a series of cases according to claim 1, characterized in that obtaining the residence-model probability surface of the suspect to be predicted from the first crime location information, using the crime distance decay function, specifically comprises:

the first crime location information comprising several first crime locations and the first crime times in one-to-one correspondence with the first crime locations;

the residence-model probability surface of the suspect to be predicted being calculated as follows:

generating a grid of N m × N m cells over the prediction area, calculating the distance x_ij from the center of each grid cell to the first crime location, and substituting x_ij into the distance decay function to calculate, for each cell, the residence-model probability that the cell is the foothold of the suspect to be predicted, finally obtaining the residence-model probability surface of the suspect to be predicted.

7. The method for predicting the foothold of a suspect in a series of cases according to claim 1, characterized in that the probability density map for predicting the foothold of the suspect to be predicted is obtained from the first residence kernel density map, the second residence kernel density map and the residence-model probability surface, using a preset Bayesian algorithm, calculated as follows:

$$P(\mathrm{JTC} \mid O) = \frac{P(\mathrm{JTC}) \cdot P(O \mid \mathrm{JTC})}{P(O)}$$

where P(JTC|O) is the probability density of the foothold of the suspect to be predicted, P(JTC) is the residence-model probability surface of the suspect to be predicted, P(O|JTC) is the second residence kernel density map, and P(O) is the first residence kernel density map.