CN107172637B - Method and device for classifying calls - Google Patents

Method and device for classifying calls

Info

Publication number
CN107172637B
Authority
CN
China
Prior art keywords
calls
call
preset
feature
entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710348411.3A
Other languages
Chinese (zh)
Other versions
CN107172637A (en)
Inventor
彼得·巴詹诺夫·瓦拉瑞也斯基
克里斯托·劳得亚斯
王高虎
李汐
王瑞岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201710348411.3A
Publication of CN107172637A
Application granted
Publication of CN107172637B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/10 Scheduling measurement reports; Arrangements for measurement reports

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

An embodiment of the present application discloses a method and apparatus for classifying calls. The method includes: acquiring a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by the communication terminal to the base station during the call Hi; segmenting each of the N calls using each of w preset time windows, and determining the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call; and classifying the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window. In the embodiments of the present application, the category of a call can be determined from the number of cells, the number of handovers, and the custom features corresponding to the call, which helps improve the accuracy of call classification.

Figure 201710348411

Description

A method and apparatus for classifying calls

Technical field

The present application relates to the field of communication technologies, and in particular to a method and apparatus for classifying calls.

Background

During network optimization, operators often need to use a processing apparatus such as a server to classify the mobility state of each call in an acquired data packet. Note that when communication terminals such as mobile phones and tablet computers (portable Android device, Pad) are used for a conversation, one conversation is considered to correspond to two calls: one call for the terminal that initiates the conversation and one for the terminal that receives it. In addition, when a portable communication device such as a mobile phone or Pad accesses the Internet through a base station, one Internet access session also corresponds to one call. In the data packet acquired by the server, each call corresponds to one data set, and each data set includes all measurement reports (MRs) sent by the communication terminal to the base station during that call.

The inventors of the present application found that, when classifying calls, the prior art performs only simple processing on the data set of a call and classifies it using two parameters, the number of cells and the number of handovers. Consider a call in which the user walks back and forth within a certain area while using the terminal: both the number of cells and the number of handovers in the data set may be relatively large, so the prior art may assign the call to the high-speed movement category, which is clearly inconsistent with the actual mobility state of the call. The prior art is therefore inaccurate when classifying calls.

Summary of the invention

Embodiments of the present application provide a method and apparatus for classifying calls, to improve the accuracy of call classification.

In a first aspect, an embodiment of the present application provides a method for classifying calls, the method including:

acquiring a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by the communication terminal to the base station during the call Hi, where N is an integer greater than 1 and 1≤i≤N;

segmenting each of the N calls using each of w preset time windows, and determining the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, and the custom feature is related to the mobility state of the call;

classifying the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.

In the embodiments of the present application, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:

the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj;

the handover entropy represents the uncertainty of the cells accessed by the call Hj;

the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed;

the inter-station distance speed is the greatest distance between the mean of all positions obtained for the call Hj and any individual position of the call.
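
The three non-entropy features defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say whether the standard deviation is the population or sample form (the population form is used here), whether cells are counted as distinct cells or as accesses (distinct cells are used here), or what coordinate system positions use (planar `(x, y)` pairs are assumed). All function names are hypothetical.

```python
import math
from statistics import pstdev


def rss_std(rss_values):
    """Population standard deviation of one call's received signal strength samples."""
    return pstdev(rss_values) if len(rss_values) > 1 else 0.0


def outdoor_cell_ratio(cell_ids, outdoor_cells):
    """Percentage of distinct accessed cells that are outdoor-type cells.

    Counting distinct cells (rather than accesses) is an assumption.
    """
    distinct = set(cell_ids)
    if not distinct:
        return 0.0
    return 100.0 * len(distinct & set(outdoor_cells)) / len(distinct)


def max_distance_from_mean(positions):
    """Greatest Euclidean distance between the mean position and any position.

    Positions are assumed to be planar (x, y) pairs in a common unit.
    """
    if not positions:
        return 0.0
    mean_x = sum(x for x, _ in positions) / len(positions)
    mean_y = sum(y for _, y in positions) / len(positions)
    return max(math.hypot(x - mean_x, y - mean_y) for x, y in positions)
```

A terminal that stays within a small area yields a small `max_distance_from_mean` even if it hands over frequently, which is the property the background section says the prior art lacks.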

In some possible implementations, the handover entropy is calculated according to the following formula:

Figure BDA0001297009240000021

where entropy is the handover entropy corresponding to the call Hj, and

Figure BDA0001297009240000022

where N denotes the total number of cells accessed by the call Hj, i denotes the i-th cell accessed by the call Hj, #i denotes the number of times the call Hj accessed the i-th cell, T denotes the total number of accesses across the different cells, and pi denotes the probability of accessing the i-th cell during the period.
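
The formulas themselves are present only as images in this text, so the following is a sketch rather than the patent's exact expression. It assumes the standard Shannon form, with pi taken as #i divided by T, which is what the symbol definitions suggest. The base-2 logarithm and the function name `handover_entropy` are assumptions.

```python
import math
from collections import Counter


def handover_entropy(accessed_cells):
    """Shannon entropy of the cells accessed during one call segment.

    accessed_cells: sequence of cell identifiers, one entry per access.
    Assumes p_i = #i / T and a base-2 logarithm; neither is confirmed
    by the source, whose formulas are not reproduced in the text.
    """
    counts = Counter(accessed_cells)   # #i for each cell i
    total = sum(counts.values())       # T, the total number of accesses
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions, a call that alternates between two cells has an entropy of 1 bit however many handovers occur, while a call that passes once through each of many cells has a higher value.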

As can be seen, because the above technical solution sets custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers corresponding to the call but also from the custom features, which helps improve the accuracy of call classification.

In some possible implementations, depending on the nature of the accessed cell,

the received signal strength standard deviation in the preset feature set corresponding to the call Hj includes: the received signal strength standard deviation of the primary serving cell, that of the neighboring cells, and that of all cells;

the handover entropy in the preset feature set corresponding to the call Hj includes: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.

In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the preset feature set corresponding to any of the N' calls further includes an average speed, where N'≥1.

In some possible implementations, the precise geographic location information includes location information obtained through AGPS or an OTT positioning service.

In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, classifying the N calls into preset mobility state categories includes:

for any call Hk of the N' calls, determining the category of the call Hk from its average speed and the average speeds corresponding respectively to the preset mobility state categories, where 1≤k≤N';

using a supervised learning algorithm, according to the category corresponding to each of the N' calls and the set of feature sets corresponding to each of the N' calls, to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and obtaining from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its set of feature sets, and determining the mobility state of each of the (N-N') calls from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.
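
The three steps above (label the located calls by speed, learn class regions, classify the remaining calls) can be sketched as follows. The text names no specific supervised algorithm, so a nearest-centroid classifier is used purely as a stand-in. The category names and the speed thresholds in `SPEED_BANDS` are invented for illustration and do not come from the source.

```python
import math

# Hypothetical categories and upper speed bounds in km/h (not from the source).
SPEED_BANDS = [(1.0, "static"), (10.0, "walking"), (float("inf"), "vehicle")]


def label_by_speed(avg_speed_kmh):
    """Step 1: label a located call from its average speed."""
    for upper, label in SPEED_BANDS:
        if avg_speed_kmh < upper:
            return label


def fit_centroids(vectors, labels):
    """Step 2: learn one centroid per category in M-dimensional feature space.

    Nearest-centroid is a stand-in for the unspecified supervised algorithm;
    its implied boundaries are the bisectors between centroids.
    """
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for d, value in enumerate(vec):
            acc[d] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Step 3: classify a call without location data by its mapped position."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))
```

In practice any classifier that partitions the feature space (for example a support vector machine or a decision tree) could take the place of `fit_centroids` and `predict`; the data flow stays the same.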

In some possible implementations, classifying the N calls into preset mobility state categories includes:

when the N' calls include geographic information system (GIS) information,

acquiring the positions of specified ground-object information in the GIS information, where the specified ground-object information is ground-object information strongly correlated with the preset mobility states, for example a residence, shopping mall, park, road, intersection, highway, or railway;

for any call Hk of the N' calls, determining the category of the call Hk from its average speed and the speeds corresponding respectively to the preset mobility state categories, where 1≤k≤N';

determining the ground-object information matched by the call Hk;

determining the set J of ground-object information corresponding to the N' calls;

using a supervised learning algorithm, according to the category corresponding to each of the N' calls, the set of feature sets corresponding to each of the N' calls, and the set J of ground-object information, to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and obtaining from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its set of feature sets, and determining the mobility state of each of the (N-N') calls from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

In some possible implementations, the set Ji of calls, among the N' calls, corresponding to any item of ground-object information in the set J is determined. Suppose the set Ji includes N'' calls, the set of mobility state types corresponding to those N'' calls is Ji', and the number of mobility state types in Ji' is N'''. If the number of calls corresponding to a given mobility state type in Ji' is less than

Figure BDA0001297009240000031

then the vector F corresponding to the calls of that mobility state type is copied. This can improve the classification accuracy for less probable categories.
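
The copying step amounts to oversampling rare categories before training. The sketch below shows that idea only. The actual threshold is given by a formula image that is not reproduced in this text, so `min_count` is a hypothetical stand-in supplied by the caller, and the function name is likewise illustrative.

```python
from collections import Counter


def oversample_rare(vectors, labels, min_count):
    """Duplicate the feature vectors of categories with fewer than min_count calls.

    min_count is a placeholder for the source's threshold, which is only
    available as an image. Each rare vector is copied once.
    """
    counts = Counter(labels)
    out_vectors, out_labels = list(vectors), list(labels)
    for vec, lab in zip(vectors, labels):
        if counts[lab] < min_count:
            out_vectors.append(list(vec))  # copy so later edits do not alias
            out_labels.append(lab)
    return out_vectors, out_labels
```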

A supervised learning algorithm is then used, according to the mobility state category corresponding to each of the N' calls, the vector F, and the vector corresponding to the preset feature set of each call, to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and the region of the M-dimensional space occupied by each preset mobility state category is obtained from the boundaries, where M is the number of features in the preset feature set corresponding to the call Hk;

for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, the mapped position of each such call in the M-dimensional space is obtained from its preset feature set, and the mobility state of each of the (N-N') calls is determined from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

In some possible implementations, when the N calls do not include precise geographic location information, classifying the N calls into preset mobility state categories includes:

obtaining, from the set of preset feature sets corresponding to the N calls, N vectors corresponding respectively to the N calls, each vector corresponding to a preset feature set;

dividing the N calls into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;

classifying the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.
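
A minimal sketch of this unlabelled path follows: cluster the feature vectors into M sets, then map each set to a category with a rule. The text names neither the unsupervised algorithm nor the expert rules, so plain k-means with a naive deterministic initialisation is used as a stand-in, and `expert_rule` with its thresholds on the first feature (assumed to be the cell count) is invented for illustration.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means; the first k vectors seed the centroids (naive)."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


def expert_rule(centroid):
    """Hypothetical rule: map a cluster to a category by its mean cell count."""
    cells = centroid[0]
    if cells <= 1.5:
        return "static"
    if cells <= 4:
        return "walking"
    return "vehicle"


def classify_unlabelled(vectors, k):
    """Each call inherits the category assigned to its cluster."""
    assign, centroids = kmeans(vectors, k)
    cluster_class = [expert_rule(c) for c in centroids]
    return [cluster_class[a] for a in assign]
```

Choosing more clusters than categories, as the text requires, lets several clusters map to the same category, so one mobility state may cover more than one region of the feature space.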

In a second aspect, an embodiment of the present application provides an apparatus for classifying calls, the apparatus including:

an acquisition unit, configured to acquire a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by the communication terminal to the base station during the call Hi, where N is an integer greater than 1 and 1≤i≤N;

a first processing unit, configured to segment each of the N calls using each of w preset time windows and determine the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, and the custom feature is related to the mobility state of the call;

a classification unit, configured to classify the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.

In the embodiments of the present application, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:

the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj;

the handover entropy represents the uncertainty of the cells accessed by the call Hj;

the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed;

the inter-station distance speed is the greatest distance between the mean of all positions obtained for the call Hj and any individual position of the call.

In some possible implementations, the handover entropy may be calculated according to the following formula:

Figure BDA0001297009240000041

where entropy is the handover entropy corresponding to the call Hj, and

Figure BDA0001297009240000042

where N denotes the total number of cells accessed by the call Hj, i denotes the i-th cell accessed by the call Hj, #i denotes the number of times the call Hj accessed the i-th cell, T denotes the total number of accesses across the different cells, and pi denotes the probability of accessing the i-th cell during the period.

As can be seen, because the above technical solution sets custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers corresponding to the call but also from the custom features, which helps improve the accuracy of call classification.

In some possible implementations, depending on the nature of the accessed cell,

the received signal strength standard deviation in the preset feature set corresponding to the call Hj includes: the received signal strength standard deviation of the primary serving cell, that of the neighboring cells, and that of all cells;

the handover entropy in the preset feature set corresponding to the call Hj includes: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.

In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the preset feature set corresponding to any of the N' calls further includes an average speed, where N'≥1.

In some possible implementations, the precise geographic location information includes location information obtained through AGPS or an OTT positioning service.

In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the classification unit is specifically configured to:

for any call Hk of the N' calls, determine the category of the call Hk from its average speed and the average speeds corresponding respectively to the preset mobility state categories, where 1≤k≤N';

use a supervised learning algorithm, according to the category corresponding to each of the N' calls and the set of feature sets corresponding to each of the N' calls, to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its set of feature sets, and determine the mobility state of each of the (N-N') calls from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

In some possible implementations, the classification unit is specifically configured to:

when the N' calls include GIS information,

acquire the positions of specified ground-object information in the GIS information, where the specified ground-object information is ground-object information strongly correlated with the preset mobility states;

for any call Hk of the N' calls, determine the category of the call Hk from its average speed and the speeds corresponding respectively to the preset mobility state categories, where 1≤k≤N';

determine the ground-object information matched by the call Hk;

determine the set J of ground-object information corresponding to the N' calls;

use a supervised learning algorithm, according to the category corresponding to each of the N' calls, the set of feature sets corresponding to each of the N' calls, and the set J of ground-object information, to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its set of feature sets, and determine the mobility state of each of the (N-N') calls from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

In some possible implementations, when the N calls do not include precise geographic location information, the classification unit is specifically configured to:

obtain, from the set of preset feature sets corresponding to the N calls, N vectors corresponding respectively to the N calls, each vector corresponding to a preset feature set;

divide the N calls into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;

classify the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.

In a third aspect, an embodiment of the present application provides a storage medium. The storage medium is a non-volatile computer-readable storage medium that stores at least one program, each program including instructions that can be executed by an apparatus having a processor to perform some or all of the steps of any method for classifying calls provided in the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides an apparatus for classifying calls, including:

a processor and a storage component coupled to each other, where the processor is configured to perform the method of any one of claims 1 to 9.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present application or in the background more clearly, the accompanying drawings used in the embodiments or the background are described below.

FIG. 1 is a schematic diagram of a scenario to which an embodiment of the present application applies;

FIG. 2 is a schematic flowchart of a method for classifying calls provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of the actual movement track of each communication terminal in FIG. 1 and the corresponding inter-station distance speed;

FIG. 4 is a schematic structural diagram of an apparatus for classifying calls provided by another embodiment of the present application.

Detailed description

The embodiments of the present application are described below with reference to the accompanying drawings.

Referring to FIG. 1, FIG. 1 is a schematic diagram of a scenario to which an embodiment of the present application applies. As shown in FIG. 1, the apparatus 101 for classifying calls obtains a data packet from multiple base stations (BTS1, BTS2, BTS3, BTS4, ..., BTSn). The data packet includes data sets respectively corresponding to multiple calls, and the data set corresponding to any call includes all measurement reports (Measurement Report, MR) that the communication terminal sends to the base station in sequence during the call. An MR is information that the communication terminal feeds back to the apparatus 101 for classifying calls. It includes the information used in the embodiments of the present application, such as the serving cell and neighboring cells received by the terminal and the signal strengths corresponding to these cells. Communication terminal A in FIG. 1 is located in a car and moves at high speed with the car. Communication terminal B is located indoors and is stationary. Communication terminal C moves at low speed as its user walks. As FIG. 1 shows, over the times T1, T2 and T3 the terminals move different distances: terminal A moves farthest, terminal C second farthest, and terminal B does not move. In the prior art, the mobility state of a communication terminal is usually determined from the number of cells and the number of handovers corresponding to the terminal's call. It should be noted that when a communication terminal moves rapidly within a small area, the number of cells and the number of handovers are both small, so the prior art may conclude that the terminal is moving slowly or is stationary. The prior art therefore classifies calls inaccurately.

To improve the accuracy of classifying the mobility state of a communication terminal, the embodiments of the present application introduce custom features related to the mobility state of a call. The apparatus 101 for classifying calls obtains a data packet corresponding to multiple calls and classifies the mobility states of the multiple calls.

Specifically, as shown in FIG. 2, the method for classifying calls provided by an embodiment of the present application includes the following steps:

S201. Acquire a data packet, where the data packet includes N data sets respectively corresponding to N calls, and the data set corresponding to any call Hi among the N calls includes all measurement reports (MRs) that the communication terminal sends to the base station in sequence during the call Hi, where N is an integer greater than 1 and 1≤i≤N.

S202. Segment each of the N calls using each of w preset time windows, and determine the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, and the custom feature is related to the mobility state of the call.

Here, the number of cells is the total number of cells accessed within a period of time (for example, 30 seconds, 1 minute, or 5 minutes).

The number of handovers is the total number of handovers that occur within a period of time (for example, 30 seconds, 1 minute, or 5 minutes).
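The two baseline features can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the source: the function name `cell_and_handover_counts` is hypothetical, the input is taken to be the serving-cell ID from each MR in arrival order, and a handover is assumed to be any change of serving cell between consecutive MRs.

```python
def cell_and_handover_counts(serving_cells):
    """Return (number of distinct cells, number of handovers) for one window.

    serving_cells: serving-cell IDs in the order the MRs arrived (assumed input).
    """
    cell_count = len(set(serving_cells))
    # Assumption: a handover is any change of serving cell between consecutive MRs.
    handover_count = sum(
        1 for prev, cur in zip(serving_cells, serving_cells[1:]) if prev != cur
    )
    return cell_count, handover_count
```

A terminal that bounces between two cells yields a small cell count but a large handover count, which is why the two values are kept as separate features.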

The at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:

The received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj. Specifically, it may be the standard deviation of the received signal strength values obtained from multiple measurements within a period of time (for example, 30 seconds, 1 minute, or 5 minutes). Depending on which cells transmit the received signal, it can be divided into the received signal strength standard deviation of the primary serving cell, of the neighboring cells, and of all cells. In general, the faster the communication terminal moves, the larger the received signal strength standard deviation.
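As a minimal sketch of this feature, assuming the signal strength samples of one window are available as a list of dBm values: the source does not say whether the population or the sample standard deviation is intended, so the population form is used here as an assumption, and the name `rss_std` is hypothetical.

```python
import statistics


def rss_std(rss_dbm):
    """Population standard deviation of received signal strength samples (dBm).

    Assumption: the population form is used; the source does not specify.
    """
    if len(rss_dbm) < 2:
        return 0.0  # a single sample carries no spread information
    return statistics.pstdev(rss_dbm)
```

The same function could be applied separately to serving-cell samples, neighboring-cell samples, and all samples to obtain the three variants mentioned above.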

The handover entropy represents the uncertainty of the cells accessed by the call Hj. Specifically, it may be the uncertainty with which the communication terminal accesses different cells within a period of time (for example, 30 seconds, 1 minute, or 5 minutes), and its value lies between 0 and 1. For example, if the communication terminal is stationary, it accesses only one cell during this period and its handover entropy is 0. The faster the terminal moves, the more cells it accesses and the more handovers occur within a period of time, and the closer its handover entropy is to 1. For example, a handover entropy of 0.2 suggests that the terminal is moving at low speed, and a handover entropy of 0.9 suggests that it is moving at high speed.

In some possible implementations, the handover entropy can be calculated according to the following formula:

Figure BDA0001297009240000071

where entropy is the handover entropy corresponding to the call Hj, and

Figure BDA0001297009240000072

where N denotes the total number of cells accessed by the call Hj, i denotes the i-th cell accessed by the call Hj, #i denotes the number of times the call Hj accesses the i-th cell, T denotes the total number of accesses to the different cells, and pi denotes the probability of accessing the i-th cell during this period.

Depending on the nature of the accessed cells, the handover entropy includes: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.
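The formula images are not reproduced in this text, so the following Python sketch is only one plausible reading of the definitions above, not the patent's formula. It assumes pi = #i / T and a Shannon entropy normalized by log N so that the value stays between 0 and 1. The normalization and the name `handover_entropy` are assumptions.

```python
import math
from collections import Counter


def handover_entropy(accessed_cells):
    """Normalized Shannon entropy of the cells accessed during one window.

    Assumptions (not confirmed by the source): p_i = #i / T, and the sum
    -sum(p_i * log p_i) is divided by log N to keep the result in [0, 1].
    """
    counts = Counter(accessed_cells)   # #i for each cell i
    n_cells = len(counts)              # N
    if n_cells <= 1:
        return 0.0                     # a single cell means no uncertainty
    total = sum(counts.values())       # T
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n_cells)
```

With this reading, a call that stays on one cell scores 0 and a call that spreads its accesses evenly over several cells scores 1.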

The outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed. If the outdoor cell ratio is below 50%, the communication terminal can be considered to be indoors and stationary.
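A minimal sketch of this ratio follows. It assumes that the set of outdoor-type cell IDs is known from network planning data and that each accessed cell is counted once. Both points, and the name `outdoor_cell_ratio`, are assumptions for illustration.

```python
def outdoor_cell_ratio(accessed_cells, outdoor_cells):
    """Percentage of distinct accessed cells that are outdoor-type cells.

    outdoor_cells: IDs of cells known to be outdoor type (assumed to be available).
    """
    distinct = set(accessed_cells)
    if not distinct:
        return 0.0
    return 100.0 * len(distinct & set(outdoor_cells)) / len(distinct)
```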

The inter-station distance speed is the greatest distance between the mean of all position information obtained for the call Hj and any of the positions of the call. Specifically, it may be the greatest distance between the mean of all position information obtained by the user within a period of time (including MR positioning, AGPS positioning, and OTT positioning) and any of those positions. The larger the inter-station distance speed, the faster the communication terminal is considered to be moving. When a communication terminal leaves a position and returns to it within a period of time, the inter-station distance speed characterizes the user's movement better than the simple speed model, that is, the ratio of the start-to-end distance to the elapsed time. Referring to FIG. 3, the actual trajectory of each communication terminal is shown as a dashed line. The start of each solid arrow marks the mean of the position information, the arrowhead marks the point farthest from that mean within the specified time window, and the length of the arrow is the inter-station distance speed. As FIG. 3 shows, the inter-station distance speed is largest for the terminal on the left, next largest for the terminal in the middle, and smallest for the terminal on the right.
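The definition can be illustrated with a short Python sketch. It assumes the positions of one window have already been projected to planar (x, y) coordinates in metres. The projection step and the name `inter_station_distance_speed` are assumptions, not part of the source.

```python
import math


def inter_station_distance_speed(positions):
    """Greatest distance from the mean position to any position in the window.

    positions: list of (x, y) tuples in metres in a local planar frame (assumed).
    """
    if not positions:
        return 0.0
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    return max(math.hypot(x - cx, y - cy) for x, y in positions)
```

For an out-and-back trajectory such as `[(0, 0), (100, 0), (0, 0)]` the start-to-end displacement is zero, while this feature is about 66.7 m, which matches the motivation given above.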

S203. Classify the N calls into the preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.

It should be noted that the feature sets of the sub-calls obtained by segmenting a call with any preset time window are calculated first, and the average of these feature sets is then taken as the feature set corresponding to that preset time window. For example, if a call H1 is divided by a preset 30-second time window into 6 sub-calls, the feature set of each of the 6 sub-calls is calculated, the average of each feature over the 6 feature sets is then computed, and these averages are taken as the feature set of the call H1.
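The segment-then-average procedure can be sketched as follows. The data layout (timestamped samples), the fixed window alignment starting at the first sample, and the names `split_into_windows` and `averaged_features` are assumptions made for illustration only.

```python
def split_into_windows(samples, window_s):
    """Split (timestamp_s, value) samples, sorted by time, into fixed windows."""
    if not samples:
        return []
    start = samples[0][0]
    windows = {}
    for t, value in samples:
        # Assumption: windows are aligned to the first sample of the call.
        windows.setdefault(int((t - start) // window_s), []).append(value)
    return [windows[k] for k in sorted(windows)]


def averaged_features(samples, window_s, feature_fns):
    """Compute each feature per sub-call, then average it over the sub-calls."""
    segments = split_into_windows(samples, window_s)
    per_segment = [[fn(seg) for fn in feature_fns] for seg in segments]
    return [sum(column) / len(column) for column in zip(*per_segment)]
```

Calling `averaged_features` once for each of the w preset window lengths would give one feature set per window length for the call.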

As can be seen, because the above technical solution defines custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers corresponding to the call but also from the custom features, which helps improve the accuracy of call classification.

In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, classifying the N calls into the preset mobility state categories includes:

For any call Hk among the N' calls, determining the category of the call Hk by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1≤k≤N'. For example, if N is 1000 and N' is 120, then the data sets of 120 of the 1000 calls include precise geographic location information. The precise geographic location information includes location information obtained through the Assisted Global Positioning System (AGPS) or through Over The Top (OTT) Internet application services. The average speed of each of the 120 calls can be obtained from its geographic location information, and the mobility state category of each call is determined from the average speeds corresponding to the preset mobility state categories.

Using a supervised learning algorithm, determining, from the category corresponding to each of the N' calls and the set of feature sets corresponding to each of the N' calls, the boundary in M-dimensional space between any two of the preset mobility state categories, and obtaining from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its set of feature sets, and determining the mobility state of the call from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.
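The three steps above (label by average speed, learn boundaries, classify the remaining calls) can be sketched with a deliberately simple stand-in. The source does not name a supervised learning algorithm, so a nearest-centroid classifier is used here purely as an illustration. The speed thresholds, category names, and function names are all hypothetical.

```python
import math

# Hypothetical upper speed bounds in m/s; the source gives no numeric values.
SPEED_CLASSES = [(0.5, "stationary"), (3.0, "low"), (float("inf"), "high")]


def label_from_speed(avg_speed):
    """Map an average speed to a preset mobility state category."""
    for upper, name in SPEED_CLASSES:
        if avg_speed <= upper:
            return name


def train_centroids(vectors, labels):
    """Stand-in supervised learner: one mean vector per category."""
    sums, counts = {}, {}
    for vec, lab in zip(vectors, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for d, x in enumerate(vec):
            acc[d] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def classify(vec, centroids):
    """Assign the category whose centroid is nearest in M-dimensional space."""
    return min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))
```

With nearest-centroid classification, the boundary between any two categories is the hyperplane equidistant from their centroids, which is one simple instance of the pairwise boundaries described above. A real deployment would more likely use a stronger learner.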

In some possible implementations, when the N' calls include GIS information:

Obtain the locations of the specified ground-feature information in the GIS information, where the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states. The ground-feature information may be, for example, a residence, a shopping mall, a park, a road, an intersection, a highway, or a railway.

For any call Hk among the N' calls, determine the category of the call Hk by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1≤k≤N';

Determine the ground-feature information matched by each call Hk;

Determine the set J of ground-feature information corresponding to the N' calls;

Using a supervised learning algorithm, determine, from the category corresponding to each of the N' calls, the set of feature sets corresponding to each of the N' calls, and the set J of ground-feature information, the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk.

In some possible implementations, the set Ji of calls among the N' calls that corresponds to any item of ground-feature information in the set J is determined. If the set Ji includes N'' calls, the set of mobility state types corresponding to the N'' calls is Ji', and the number of mobility state types in the set Ji' is N''', then, if the number of calls corresponding to some mobility state type in the set Ji' is less than

Figure BDA0001297009240000081

the vectors F corresponding to the calls of that mobility state type are duplicated. This improves the classification accuracy for categories with lower probability.

Using a supervised learning algorithm, determine, from the mobility state category corresponding to each of the N' calls, the vectors F, and the vector corresponding to the preset feature set of each call, the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the preset feature set corresponding to the call Hk;

For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its preset feature set, and determine the mobility state of the call from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

For example, suppose the ground-feature information in the GIS is a park and 150 calls with precise geographic location pass through the park, of which 92 have the mobility state slow movement, 51 have the mobility state stationary, and 7 have the mobility state high-speed movement. Because the number of high-speed calls is less than 150/3=50, the vectors corresponding to the high-speed calls can be duplicated until the number of high-speed vectors reaches 50 or more. The vectors corresponding to the preset feature sets of the 150 calls, together with the 43 duplicated vectors of high-speed calls (taking 50 high-speed calls after duplication as an example), are then used with a supervised learning algorithm to determine the boundary in M-dimensional space between any two of the preset mobility state categories, and the region of the M-dimensional space occupied by each preset mobility state category is obtained from the boundaries.
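The duplication step in this example can be sketched in Python. The threshold is taken from the worked example (number of calls divided by number of mobility state types, 150/3 = 50). Cycling through the existing vectors when duplicating, and the name `oversample_rare_classes`, are assumptions.

```python
import math
from collections import Counter


def oversample_rare_classes(vectors, labels):
    """Duplicate vectors of under-represented categories up to the threshold.

    Threshold assumed from the worked example: len(labels) / number of categories.
    """
    counts = Counter(labels)
    threshold = len(labels) / len(counts)          # e.g. 150 / 3 = 50
    out_vectors, out_labels = list(vectors), list(labels)
    for lab, n in counts.items():
        if n >= threshold:
            continue
        members = [v for v, l in zip(vectors, labels) if l == lab]
        for k in range(math.ceil(threshold) - n):
            # Cycle through the existing vectors of this category (assumption).
            out_vectors.append(list(members[k % len(members)]))
            out_labels.append(lab)
    return out_vectors, out_labels
```

Applied to the park example, the 7 high-speed vectors gain 43 copies, giving 193 training vectors in total.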

For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its set of feature sets, and determine the mobility state of the call from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

In some possible implementations, when the N calls do not include precise geographic location information, N vectors corresponding to the preset feature set, one for each of the N calls, are obtained from the set of preset feature sets corresponding to the N calls;

The N calls are divided into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories. For example, suppose there are 4 preset mobility states: stationary, low-speed movement, medium-speed movement, and high-speed movement. If N is 1000, in some possible implementations of the present invention the 1000 calls can be divided into 20 sets by an unsupervised learning algorithm, and the mobility state of each of the 20 sets is then determined using expert algorithms such as factor analysis, iterative algorithms, or principal component analysis. For example, a set whose handover entropy is 0 is determined to be stationary, and a set whose handover entropy is 0.7 or more and whose number of handovers is greater than 5 is determined to be moving at high speed. The category of any call is then the same as the category of the set to which it belongs.
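The rule-based labeling of clusters can be sketched as follows, assuming the cluster assignment has already been produced by some unsupervised algorithm (which the source does not name). Only the two rules quoted in the example are encoded. Applying them to cluster means, the fallback label "undetermined", and the name `label_clusters` are assumptions.

```python
def label_clusters(vectors, cluster_ids):
    """Label each call with the mobility state assigned to its cluster.

    vectors: [handover_entropy, handover_count] per call (assumed layout).
    cluster_ids: cluster index per call, from any unsupervised algorithm.
    """
    groups = {}
    for vec, cid in zip(vectors, cluster_ids):
        groups.setdefault(cid, []).append(vec)
    cluster_label = {}
    for cid, members in groups.items():
        mean_entropy = sum(m[0] for m in members) / len(members)
        mean_handovers = sum(m[1] for m in members) / len(members)
        if mean_entropy == 0:
            cluster_label[cid] = "stationary"
        elif mean_entropy >= 0.7 and mean_handovers > 5:
            cluster_label[cid] = "high-speed"
        else:
            cluster_label[cid] = "undetermined"  # further expert rules would go here
    return [cluster_label[cid] for cid in cluster_ids]
```

Using more clusters than categories, as described above, lets several clusters map to the same mobility state.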

Referring to FIG. 4, an embodiment of the present application provides an apparatus 400 for classifying calls. Specifically, the apparatus 400 shown in FIG. 4 may include an obtaining unit 401, a first processing unit 402, and a classification unit 403.

The obtaining unit 401 is configured to perform step S201 in FIG. 2 of the method embodiment of the present invention. For the implementation of the obtaining unit 401, refer to the description of step S201 in FIG. 2 of the method embodiment. Details are not repeated here.

The first processing unit 402 is configured to perform step S202 in FIG. 2 of the method embodiment of the present invention. For the implementation of the first processing unit 402, refer to the description of step S202 in FIG. 2 of the method embodiment. Details are not repeated here.

The classification unit 403 is configured to perform step S203 in FIG. 2 of the method embodiment of the present invention. For the implementation of the classification unit 403, refer to the description of step S203 in FIG. 2 of the method embodiment. Details are not repeated here.

Optionally, in some possible implementations of the present invention, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:

The received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj;

The handover entropy represents the uncertainty of the cells accessed by the call Hj;

The outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed;

The inter-station distance speed is the greatest distance between the mean of all position information obtained for the call Hj and any of the positions of the call.

Optionally, in some possible implementations of the present invention, the handover entropy is calculated according to the following formula:

Figure BDA0001297009240000101

where entropy is the handover entropy corresponding to the call Hj, and

Figure BDA0001297009240000102

where N denotes the total number of cells accessed by the call Hj, i denotes the i-th cell accessed by the call Hj, #i denotes the number of times the call Hj accesses the i-th cell, T denotes the total number of accesses to the different cells, and pi denotes the probability of accessing the i-th cell during this period.

Optionally, in some possible implementations of the present invention, depending on the nature of the accessed cells:

The received signal strength standard deviation in the preset feature set corresponding to the call Hj includes: the received signal strength standard deviation of the primary serving cell, the received signal strength standard deviation of the neighboring cells, and the received signal strength standard deviation of all cells;

The handover entropy in the preset feature set corresponding to the call Hj includes: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.

Optionally, in some possible implementations of the present invention, when the data sets corresponding to N' of the N calls include precise geographic location information, the preset feature set corresponding to any of the N' calls further includes the average speed, where N'≥1.

The precise geographic location information includes location information obtained through AGPS or OTT positioning services.

Optionally, in some possible implementations of the present invention, when the data sets corresponding to N' of the N calls include precise geographic location information, the classification unit is specifically configured to:

For any call Hk among the N' calls, determine the category of the call Hk by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1≤k≤N';

Using a supervised learning algorithm, determine, from the category corresponding to each of the N' calls and the set of feature sets corresponding to each of the N' calls, the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its set of feature sets, and determine the mobility state of the call from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

Optionally, in some possible implementations of the present invention, the classification unit is specifically configured to:

When the N' calls include GIS information,

Obtain the locations of the specified ground-feature information in the GIS information, where the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states;

For any call Hk among the N' calls, determine the category of the call Hk by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1≤k≤N';

Determine the ground-feature information matched by each call Hk;

Determine the set J of ground-feature information corresponding to the N' calls;

Using a supervised learning algorithm, determine, from the category corresponding to each of the N' calls, the set of feature sets corresponding to each of the N' calls, and the set J of ground-feature information, the boundary in M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;

For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its set of feature sets, and determine the mobility state of the call from the mapped position and the regions of the M-dimensional space occupied by the preset mobility state categories.

Optionally, in some possible implementations of the present invention, when the N calls do not include precise geographic location information, the classification unit is specifically configured to:

Obtain, from the set of preset feature sets corresponding to the N calls, N vectors corresponding to the preset feature set, one for each of the N calls;

Divide the N calls into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;

Classify the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.

An embodiment of the present application further provides a computer storage medium. The computer storage medium may store a program which, when executed, performs some or all of the steps of any of the methods for classifying calls described in the above method embodiments.

An embodiment of the present application further provides an apparatus for classifying calls, including a processor and a storage component coupled to each other, where the processor is configured to execute any of the methods for classifying calls described in the above method embodiments.

The steps of the methods in the embodiments of the present application may be reordered, combined, or omitted according to actual needs.

The units of the apparatus in the embodiments of the present application may be combined, divided, or omitted according to actual needs.

A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium includes any medium that can store program code, such as a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.

Claims (20)

1. A method of classifying a call, the method comprising:
acquiring a data packet, wherein the data packet comprises N data sets respectively corresponding to N calls, and the data set corresponding to any call Hi in the N calls comprises all measurement reports MR which are received by a base station and are sequentially sent to the base station by a communication terminal in the process of calling Hi, wherein N is an integer greater than 1, and i is greater than or equal to 1 and less than or equal to N;
segmenting each of the N calls using each of w preset time windows, and determining a feature set corresponding to each call after segmentation by each preset time window, wherein the feature set comprises the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, and the custom feature is related to the mobility state of the call;
and classifying the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.
2. The method of claim 1, wherein
the at least one custom feature comprises at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed; wherein
the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hi;
the handover entropy represents the uncertainty of the cells accessed by the call Hi;
the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hi as a percentage of the total number of cells accessed; and
the inter-station distance speed is the farthest distance between the mean of all position information obtained for the call Hi and any position of the call.
3. The method of claim 2, wherein
the handover entropy is calculated according to the following formula:
$$\mathrm{Entropy} = -\sum_{i=1}^{N} p_i \log p_i$$
wherein Entropy is the handover entropy corresponding to the call Hi, and
$$p_i = \frac{\#i}{T}$$
where N denotes the total number of cells accessed by the call Hi, i denotes the i-th cell accessed by the call Hi, #i denotes the number of times the call Hi accesses the i-th cell, T denotes the total number of cell accesses, and p_i denotes the probability of accessing the i-th cell during this period.
4. The method of claim 3, wherein,
depending on the nature of the accessed cell,
the received signal strength standard deviation in the preset feature set corresponding to the call Hi comprises: the received signal strength standard deviation of the primary serving cell, the received signal strength standard deviation of the neighboring cells, and the received signal strength standard deviation of all cells; and
the handover entropy in the preset feature set corresponding to the call Hi comprises: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.
5. The method of claim 4, wherein
when the data sets corresponding to N' of the N calls include accurate geographical location information, the preset feature set corresponding to any one of the N' calls further comprises an average speed, wherein N' ≥ 1.
6. The method of claim 5, wherein
the accurate geographical location information comprises location information obtained through AGPS or an OTT location service.
7. The method of claim 6, wherein
when the data sets corresponding to N' of the N calls include accurate geographical location information, classifying the N calls into the preset mobility state categories comprises:
for any call Hk of the N' calls, determining the category of the call Hk according to its average speed and the speeds corresponding to the preset mobility state categories, wherein 1 ≤ k ≤ N';
determining, using a supervised learning algorithm, the boundary between any two preset mobility state categories in an M-dimensional space according to the category of each of the N' calls and the feature set of each of the N' calls, and obtaining, according to the boundaries, the region that each preset mobility state category occupies in the M-dimensional space, wherein M is the number of features in the feature set corresponding to the call Hk; and
for each of the (N-N') calls of the N calls whose data sets do not include accurate geographical location information, obtaining the mapped position of the call in the M-dimensional space according to its feature set, and determining the mobility state of the call according to the mapped position and the regions that the preset mobility state categories occupy in the M-dimensional space.
8. The method of claim 6, wherein classifying the N calls into the preset mobility state categories comprises:
when the N' calls include GIS information,
acquiring the positions of specified ground-feature information in the GIS information, wherein the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states;
for any call Hk of the N' calls, determining the category of the call Hk according to its average speed and the speeds corresponding to the preset mobility state categories, wherein 1 ≤ k ≤ N';
determining the ground-feature information matching the call Hk;
determining a set J of ground-feature information corresponding to the N' calls;
determining, using a supervised learning algorithm, the boundary between any two preset mobility state categories in an M-dimensional space according to the category of each of the N' calls, the feature set of each of the N' calls, and the ground-feature information set J, and obtaining, according to the boundaries, the region that each preset mobility state category occupies in the M-dimensional space, wherein M is the number of features in the feature set corresponding to the call Hk; and
for each of the (N-N') calls of the N calls whose data sets do not include accurate geographical location information, obtaining the mapped position of the call in the M-dimensional space according to its feature set, and determining the mobility state of the call according to the mapped position and the regions that the preset mobility state categories occupy in the M-dimensional space.
9. The method of claim 2, wherein
when none of the N calls includes accurate geographical location information, classifying the N calls into the preset mobility state categories comprises:
obtaining N vectors, one for each of the N calls, according to the preset feature sets corresponding to the N calls;
dividing the N calls into M sets according to the N vectors and an unsupervised learning algorithm, wherein M is greater than the number of preset mobility state categories; and
assigning each of the M sets to one of the preset mobility state categories according to expert rules, wherein the category of any call is the same as that of the set to which the call belongs.
10. An apparatus for classifying a call, the apparatus comprising:
an obtaining unit, configured to obtain a data packet, wherein the data packet comprises N data sets respectively corresponding to N calls, and the data set corresponding to any call Hi of the N calls comprises all measurement reports (MRs) that a communication terminal sends in sequence to a base station, and that the base station receives, during the call Hi, wherein N is an integer greater than 1 and 1 ≤ i ≤ N;
a first processing unit, configured to segment each of the N calls using each of w preset time windows, and to determine a feature set for each call as segmented by each preset time window, wherein the feature set comprises the number of cells, the number of handovers, and the value of at least one custom feature of the segmented call, and the custom feature is related to the mobility state of the call; and
a classification unit, configured to classify the N calls into preset mobility state categories according to the feature set of each call as segmented by each preset time window.
11. The apparatus of claim 10, wherein
the at least one custom feature comprises at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed; wherein
the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hi;
the handover entropy represents the uncertainty of the cells accessed by the call Hi;
the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hi as a percentage of the total number of cells accessed; and
the inter-station distance speed is the farthest distance between the mean of all position information obtained for the call Hi and any position of the call.
12. The apparatus of claim 11, wherein
the handover entropy is calculated according to the following formula:
$$\mathrm{Entropy} = -\sum_{i=1}^{N} p_i \log p_i$$
wherein Entropy is the handover entropy corresponding to the call Hi, and
$$p_i = \frac{\#i}{T}$$
where N denotes the total number of cells accessed by the call Hi, i denotes the i-th cell accessed by the call Hi, #i denotes the number of times the call Hi accesses the i-th cell, T denotes the total number of cell accesses, and p_i denotes the probability of accessing the i-th cell during this period.
13. The apparatus of claim 12, wherein,
depending on the nature of the accessed cell,
the received signal strength standard deviation in the preset feature set corresponding to the call Hi comprises: the received signal strength standard deviation of the primary serving cell, the received signal strength standard deviation of the neighboring cells, and the received signal strength standard deviation of all cells; and
the handover entropy in the preset feature set corresponding to the call Hi comprises: the entropy of the primary serving cell, the entropy of the neighboring cells, and the entropy of all cells.
14. The apparatus of claim 13, wherein
when the data sets corresponding to N' of the N calls include accurate geographical location information, the preset feature set corresponding to any one of the N' calls further comprises an average speed, wherein N' ≥ 1.
15. The apparatus of claim 14, wherein
the accurate geographical location information comprises location information obtained through AGPS or an OTT location service.
16. The apparatus of claim 15, wherein
when the data sets corresponding to N' of the N calls include accurate geographical location information, the classification unit is specifically configured to:
for any call Hk of the N' calls, determine the category of the call Hk according to its average speed and the speeds corresponding to the preset mobility state categories, wherein 1 ≤ k ≤ N';
determine, using a supervised learning algorithm, the boundary between any two preset mobility state categories in an M-dimensional space according to the category of each of the N' calls and the feature set of each of the N' calls, and obtain, according to the boundaries, the region that each preset mobility state category occupies in the M-dimensional space, wherein M is the number of features in the feature set corresponding to the call Hk; and
for each of the (N-N') calls of the N calls whose data sets do not include accurate geographical location information, obtain the mapped position of the call in the M-dimensional space according to its feature set, and determine the mobility state of the call according to the mapped position and the regions that the preset mobility state categories occupy in the M-dimensional space.
17. The apparatus of claim 15, wherein, when the N' calls include GIS information, the classification unit is specifically configured to:
acquire the positions of specified ground-feature information in the GIS information, wherein the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states;
for any call Hk of the N' calls, determine the category of the call Hk according to its average speed and the speeds corresponding to the preset mobility state categories, wherein 1 ≤ k ≤ N';
determine the ground-feature information matching the call Hk;
determine a set J of ground-feature information corresponding to the N' calls;
determine, using a supervised learning algorithm, the boundary between any two preset mobility state categories in an M-dimensional space according to the category of each of the N' calls, the feature set of each of the N' calls, and the ground-feature information set J, and obtain, according to the boundaries, the region that each preset mobility state category occupies in the M-dimensional space, wherein M is the number of features in the feature set corresponding to the call Hk; and
for each of the (N-N') calls of the N calls whose data sets do not include accurate geographical location information, obtain the mapped position of the call in the M-dimensional space according to its feature set, and determine the mobility state of the call according to the mapped position and the regions that the preset mobility state categories occupy in the M-dimensional space.
18. The apparatus of claim 11, wherein
when none of the N calls includes accurate geographical location information, the classification unit is specifically configured to:
obtain N vectors, one for each of the N calls, according to the preset feature sets corresponding to the N calls;
divide the N calls into M sets according to the N vectors and an unsupervised learning algorithm, wherein M is greater than the number of preset mobility state categories; and
assign each of the M sets to one of the preset mobility state categories according to expert rules, wherein the category of any call is the same as that of the set to which the call belongs.
19. A storage medium, wherein the storage medium is a non-volatile computer-readable storage medium storing at least one program, each program comprising instructions that, when executed by an apparatus having a processor, cause the apparatus to perform the method of classifying a call according to any one of claims 1 to 9.
20. An apparatus for classifying a call, comprising:
a processor and a memory component coupled to each other; wherein the processor is configured to perform the method of any one of claims 1 to 9.
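
For illustration only, the per-window feature set of claims 1 and 2 and the handover entropy of claim 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the report layout (timestamp in seconds, cell ID, received signal strength), the base-2 logarithm, the counting of a handover as a change of cell between consecutive reports inside one window, and the set of outdoor cell IDs are all hypothetical choices that the claims do not specify.

```python
import math
from collections import Counter
from statistics import pstdev


def handover_entropy(cell_ids):
    """Entropy of the accessed cells, with p_i = #i / T.

    Assumption: log base 2; the claims do not state the base.
    """
    counts = Counter(cell_ids)          # #i for each distinct cell
    total = sum(counts.values())        # T, total number of cell accesses
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_features(reports, window_s, outdoor_cells=frozenset()):
    """Split one call into fixed-length time windows and compute features.

    reports: list of (timestamp_s, cell_id, rss) tuples sorted by time
             (hypothetical layout for a call's measurement reports).
    window_s: length of the preset time window, in seconds.
    outdoor_cells: hypothetical set of cell IDs known to be outdoor cells.
    """
    if not reports:
        return []
    start = reports[0][0]
    buckets = {}
    for t, cell, rss in reports:
        buckets.setdefault(int((t - start) // window_s), []).append((cell, rss))

    features = []
    for idx in sorted(buckets):
        cells = [c for c, _ in buckets[idx]]
        rss = [r for _, r in buckets[idx]]
        distinct = set(cells)
        features.append({
            "window": idx,
            "num_cells": len(distinct),
            # Cell changes between consecutive reports inside this window only.
            "num_handovers": sum(a != b for a, b in zip(cells, cells[1:])),
            "rss_std": pstdev(rss),
            "handover_entropy": handover_entropy(cells),
            "outdoor_ratio": sum(c in outdoor_cells for c in distinct) / len(distinct),
        })
    return features


# Example: one call segmented by several preset window lengths.
reports = [(0, "A", -80), (5, "A", -82), (12, "B", -90), (18, "C", -70)]
by_window = {w: window_features(reports, w, outdoor_cells={"B"}) for w in (10, 20)}
```

Computing the features once per window length, as in `by_window`, mirrors the claim language of segmenting each call by each of the w preset time windows; the resulting per-window dictionaries could then be flattened into the vectors used by the classification step.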
CN201710348411.3A 2017-05-17 2017-05-17 Method and device for classifying calls Active CN107172637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710348411.3A CN107172637B (en) 2017-05-17 2017-05-17 Method and device for classifying calls

Publications (2)

Publication Number Publication Date
CN107172637A CN107172637A (en) 2017-09-15
CN107172637B true CN107172637B (en) 2021-01-29

Family

ID=59815400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710348411.3A Active CN107172637B (en) 2017-05-17 2017-05-17 Method and device for classifying calls

Country Status (1)

Country Link
CN (1) CN107172637B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant