CN107172637B - Method and device for classifying calls - Google Patents
- Publication number
- CN107172637B, CN201710348411.3A
- Authority
- CN
- China
- Prior art keywords
- calls
- call
- preset
- feature
- entropy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/10—Scheduling measurement reports ; Arrangements for measurement reports
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
Embodiments of the present application disclose a method and an apparatus for classifying calls. The method includes: acquiring a data packet that includes N data sets corresponding respectively to N calls, where the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by a communication terminal to a base station during the call Hi; segmenting each of the N calls with each of w preset time windows and determining, for each call, the feature set obtained after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call; and classifying the N calls into preset mobility state categories according to the feature sets obtained for each call after segmentation by each preset time window. Because the category of a call is determined from its number of cells, its number of handovers, and the custom features, the embodiments help improve the accuracy of call classification.
Description
Technical Field
The present application relates to the field of communication technologies, and in particular to a method and an apparatus for classifying calls.
Background
During network optimization, operators often need a processing apparatus such as a server to classify the mobility state of each call in an acquired data packet. Note that when communication terminals such as mobile phones or tablet computers (portable Android device, Pad) are used for a conversation, one conversation is considered to correspond to two calls: one call for the terminal that initiates the conversation and one call for the terminal that receives it. In addition, when a portable communication device such as a mobile phone or Pad accesses the Internet through a base station, each Internet access session also corresponds to one call. In the data packet acquired by the server, each call corresponds to one data set, and each data set includes all measurement reports (MRs) sent by the communication terminal to the base station during that call.
The inventors of the present application found that the prior art classifies a call by simple processing of the data set corresponding to the call, using only two parameters: the number of cells and the number of handovers. Take a call in which the user walks back and forth within an area while using the communication terminal: both the number of cells and the number of handovers in the data set may be large, so the prior art may assign the call to the high-speed category, which is clearly inconsistent with the actual mobility state of the call. The prior art is therefore inaccurate when classifying calls.
Summary of the Invention
Embodiments of the present application provide a method and an apparatus for classifying calls, which are used to improve the accuracy of call classification.
In a first aspect, an embodiment of the present application provides a method for classifying calls, the method including:
acquiring a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by a communication terminal to a base station during the call Hi, where N is an integer greater than 1 and 1≤i≤N;
segmenting each of the N calls with each of w preset time windows, and determining the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, the custom feature being related to the mobility state of the call;
classifying the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.
In the embodiments of the present application, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:
the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj;
the handover entropy represents the uncertainty of the cells accessed by the call Hj;
the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed;
the inter-station distance speed is the farthest distance between the mean of all position information obtained for the call Hj and any of the positions of the call.
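The last two custom features can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names are hypothetical, the ratio is assumed to be taken over distinct accessed cells, and positions are assumed to be planar (x, y) coordinates in metres, for example the site coordinates of the accessed cells.

```python
import math


def outdoor_cell_ratio(accessed_cells, outdoor_cells):
    """Percentage of distinct accessed cells that are outdoor-type cells.

    Assumption: the ratio counts each accessed cell once, not once per access.
    """
    distinct = set(accessed_cells)
    if not distinct:
        return 0.0
    return 100.0 * len(distinct & set(outdoor_cells)) / len(distinct)


def farthest_distance_from_mean(positions):
    """Largest Euclidean distance between the mean position and any position.

    Assumption: positions are planar (x, y) pairs in metres.
    """
    if not positions:
        return 0.0
    mean_x = sum(x for x, _ in positions) / len(positions)
    mean_y = sum(y for _, y in positions) / len(positions)
    return max(math.hypot(x - mean_x, y - mean_y) for x, y in positions)
```

A call that keeps returning to the same few sites yields a small distance from the mean even when its handover count is high, which is the kind of case the cell and handover counts alone misjudge.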
In some possible implementations, the handover entropy is calculated according to the following formula:
[formula not reproduced in this text]
where entropy is the handover entropy corresponding to the call Hj; N is the total number of cells accessed by the call Hj; i denotes the i-th cell accessed by the call Hj; #i is the number of times the call Hj accesses the i-th cell; T is the total number of accesses to the different cells; and pi is the probability of accessing the i-th cell during this period.
As can be seen, because the above technical solution sets custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers corresponding to the call but also from the custom features, which helps improve the accuracy of call classification.
In some possible implementations, depending on the nature of the accessed cells:
the received signal strength standard deviation in the preset feature set corresponding to the call Hj includes the received signal strength standard deviation of the primary serving cell, of the neighboring cells, and of all cells;
the handover entropy in the preset feature set corresponding to the call Hj includes the entropy of the primary serving cell, of the neighboring cells, and of all cells.
In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the preset feature set corresponding to any of the N' calls further includes an average speed, where N'≥1.
In some possible implementations, the precise geographic location information includes location information obtained through AGPS or an OTT positioning service.
In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, classifying the N calls into preset mobility state categories includes:
determining the category of any call Hk of the N' calls by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1≤k≤N';
using a supervised learning algorithm, based on the category of each of the N' calls and the feature sets of each of the N' calls, to determine the boundary in an M-dimensional space between any two of the preset mobility state categories, and obtaining from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its feature sets, and determining the mobility state of each of the (N-N') calls from the mapped position and the regions occupied by the preset mobility states in the M-dimensional space.
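The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the patent's method: the text only says "a supervised learning algorithm", so a nearest-centroid classifier is swapped in for simplicity, and the speed bands in `SPEED_BANDS` are hypothetical values that the source does not give.

```python
import math

# Hypothetical speed bands in m/s (upper bound, label); not taken from the source.
SPEED_BANDS = [(0.5, "static"), (3.0, "low_speed"), (math.inf, "high_speed")]


def label_from_speed(avg_speed):
    """Step 1: label a call that has precise location by its average speed."""
    for upper, label in SPEED_BANDS:
        if avg_speed < upper:
            return label


def train_centroids(vectors, labels):
    """Step 2 (stand-in): one centroid per category in the M-dimensional space."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for d, value in enumerate(vec):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(vec, centroids):
    """Step 3: map a call without location to the category of the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))
```

With nearest centroids, the boundary between two categories is the hyperplane equidistant from their centroids; any other supervised classifier that partitions the M-dimensional space could take its place.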
In some possible implementations, classifying the N calls into preset mobility state categories includes:
when the N' calls include geographic information system (GIS) information,
obtaining the locations of the ground-feature information specified in the GIS information, where the specified ground-feature information is ground-feature information strongly related to the preset mobility states, for example a residence, shopping mall, park, road, intersection, highway, or railway;
determining the category of any call Hk of the N' calls by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1≤k≤N';
determining the ground-feature information matched by the call Hk;
determining the set J of ground-feature information corresponding to the N' calls;
using a supervised learning algorithm, based on the category of each of the N' calls, the feature sets of each of the N' calls, and the set J of ground-feature information, to determine the boundary in an M-dimensional space between any two of the preset mobility state categories, and obtaining from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its feature sets, and determining the mobility state of each of the (N-N') calls from the mapped position and the regions occupied by the preset mobility states in the M-dimensional space.
In some possible implementations, for any item of ground-feature information in the set J, the set Ji of calls among the N' calls that correspond to it is determined. If the set Ji includes N'' calls, the set of mobility state types of the N'' calls is Ji', and the number of mobility state types in Ji' is N'''. If the number of calls corresponding to some mobility state type in Ji' is less than [formula not reproduced in this text], the vectors F of the calls corresponding to that mobility state type are copied. This improves the classification accuracy for low-probability categories.
A supervised learning algorithm is then used, based on the mobility state category of each of the N' calls, the vectors F, and the vector corresponding to the preset feature set of each call, to determine the boundary in an M-dimensional space between any two of the preset mobility state categories, and the region of the M-dimensional space occupied by each preset mobility state category is obtained from the boundaries, where M is the number of features in the preset feature set corresponding to the call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, the mapped position of each such call in the M-dimensional space is obtained from its preset feature set, and the mobility state of each of the (N-N') calls is determined from the mapped position and the regions occupied by the preset mobility states in the M-dimensional space.
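The copying step for under-represented mobility state types can be sketched as below. The threshold formula does not survive in this text, so `min_count` is a hypothetical parameter, and copying vectors cyclically until a class reaches `min_count` is an assumption; the source only says the vectors are copied.

```python
def oversample_minority(samples, min_count):
    """Duplicate the vectors of any class with fewer than min_count samples.

    samples: list of (vector, label) pairs for the calls matched to one
    ground feature. min_count is a hypothetical stand-in for the threshold
    whose formula is missing from the text.
    """
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)

    result = list(samples)
    for label, vecs in by_label.items():
        missing = max(0, min_count - len(vecs))
        for k in range(missing):
            # Copy existing vectors in turn; no new feature values are invented.
            result.append((list(vecs[k % len(vecs)]), label))
    return result
```

The enlarged sample list would then be passed to the supervised learner in place of the original one, so that rare categories carry more weight when the boundaries are fitted.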
In some possible implementations, when the N calls do not include precise geographic location information, classifying the N calls into preset mobility state categories includes:
obtaining, from the preset feature sets corresponding to the N calls, N vectors that correspond respectively to the N calls and to their preset feature sets;
dividing the N calls into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;
classifying the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.
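A minimal sketch of this unsupervised path follows. The source names neither the clustering algorithm nor the expert rules, so plain k-means with a naive deterministic initialisation is used as a stand-in, and `expert_rule` is a hypothetical rule that reads the handover entropy from the last component of each cluster centre.

```python
import math


def kmeans(vectors, m, iterations=20):
    """Plain k-means; the first m vectors seed the centres (naive, deterministic)."""
    centres = [list(v) for v in vectors[:m]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(m), key=lambda c: math.dist(v, centres[c])) for v in vectors
        ]
        for c in range(m):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centres


def expert_rule(centre):
    """Hypothetical expert rule on the centre's handover entropy (last component)."""
    entropy = centre[-1]
    if entropy < 0.1:
        return "static"
    if entropy < 0.5:
        return "low_speed"
    return "high_speed"


def classify_unlabelled(vectors, m):
    """Cluster into m sets, label each set by rule, and give each call its set's label."""
    assignment, centres = kmeans(vectors, m)
    set_labels = [expert_rule(c) for c in centres]
    return [set_labels[a] for a in assignment]
```

Choosing M larger than the number of categories lets several clusters share one category, so a category with more than one typical behaviour does not have to be represented by a single centre.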
In a second aspect, an embodiment of the present application provides an apparatus for classifying calls, the apparatus including:
an acquisition unit, configured to acquire a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by a communication terminal to a base station during the call Hi, where N is an integer greater than 1 and 1≤i≤N;
a first processing unit, configured to segment each of the N calls with each of w preset time windows and to determine the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, the custom feature being related to the mobility state of the call;
a classification unit, configured to classify the N calls into preset mobility state categories according to the feature set corresponding to each call after segmentation by each preset time window.
In the embodiments of the present application, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:
the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj;
the handover entropy represents the uncertainty of the cells accessed by the call Hj;
the outdoor cell ratio is the number of outdoor-type cells accessed by the call Hj as a percentage of the total number of cells accessed;
the inter-station distance speed is the farthest distance between the mean of all position information obtained for the call Hj and any of the positions of the call.
In some possible implementations, the handover entropy can be calculated according to the following formula:
[formula not reproduced in this text]
where entropy is the handover entropy corresponding to the call Hj; N is the total number of cells accessed by the call Hj; i denotes the i-th cell accessed by the call Hj; #i is the number of times the call Hj accesses the i-th cell; T is the total number of accesses to the different cells; and pi is the probability of accessing the i-th cell during this period.
As can be seen, because the above technical solution sets custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers corresponding to the call but also from the custom features, which helps improve the accuracy of call classification.
In some possible implementations, depending on the nature of the accessed cells:
the received signal strength standard deviation in the preset feature set corresponding to the call Hj includes the received signal strength standard deviation of the primary serving cell, of the neighboring cells, and of all cells;
the handover entropy in the preset feature set corresponding to the call Hj includes the entropy of the primary serving cell, of the neighboring cells, and of all cells.
In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the preset feature set corresponding to any of the N' calls further includes an average speed, where N'≥1.
In some possible implementations, the precise geographic location information includes location information obtained through AGPS or an OTT positioning service.
In some possible implementations, when the data sets corresponding to N' of the N calls include precise geographic location information, the classification unit is specifically configured to:
determine the category of any call Hk of the N' calls by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1≤k≤N';
use a supervised learning algorithm, based on the category of each of the N' calls and the feature sets of each of the N' calls, to determine the boundary in an M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its feature sets, and determine the mobility state of each of the (N-N') calls from the mapped position and the regions occupied by the preset mobility states in the M-dimensional space.
In some possible implementations, the classification unit is specifically configured to:
when the N' calls include GIS information,
obtain the locations of the ground-feature information specified in the GIS information, where the specified ground-feature information is ground-feature information strongly related to the preset mobility states;
determine the category of any call Hk of the N' calls by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1≤k≤N';
determine the ground-feature information matched by the call Hk;
determine the set J of ground-feature information corresponding to the N' calls;
use a supervised learning algorithm, based on the category of each of the N' calls, the feature sets of each of the N' calls, and the set J of ground-feature information, to determine the boundary in an M-dimensional space between any two of the preset mobility state categories, and obtain from the boundaries the region of the M-dimensional space occupied by each preset mobility state category, where M is the number of features in the feature set corresponding to the call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its feature sets, and determine the mobility state of each of the (N-N') calls from the mapped position and the regions occupied by the preset mobility states in the M-dimensional space.
In some possible implementations, when the N calls do not include precise geographic location information, the classification unit is specifically configured to:
obtain, from the preset feature sets corresponding to the N calls, N vectors that correspond respectively to the N calls and to their preset feature sets;
divide the N calls into M sets according to the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;
classify the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.
In a third aspect, an embodiment of the present application provides a storage medium. The storage medium is a non-volatile computer-readable storage medium that stores at least one program, and each program includes instructions that, when executed by an apparatus having a processor, perform some or all of the steps of any of the methods for classifying calls provided in the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides an apparatus for classifying calls, including:
a processor and a storage component coupled to each other, where the processor is configured to perform the method of any one of claims 1 to 9.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the background more clearly, the accompanying drawings used in the embodiments or the background are described below.
FIG. 1 is a schematic diagram of a scenario to which an embodiment of the present application is applied;
FIG. 2 is a schematic flowchart of a method for classifying calls provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the actual movement track of each communication terminal in FIG. 1 and the corresponding inter-station distance speed;
FIG. 4 is a schematic structural diagram of an apparatus for classifying calls provided by another embodiment of the present application.
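Detailed Description
The embodiments of the present application are described below with reference to the accompanying drawings.
Refer to FIG. 1, which is a schematic diagram of a scenario to which an embodiment of the present application is applied. As shown in FIG. 1, the apparatus 101 for classifying calls acquires a data packet from multiple base stations (BTS1, BTS2, BTS3, BTS4, ..., BTSn). The data packet includes data sets corresponding respectively to multiple calls, and the data set corresponding to any call includes all measurement reports (MRs) sent in sequence by the communication terminal to the base station during the call. An MR is information that the communication terminal feeds back to the apparatus 101 for classifying calls; it contains the information used by the embodiments of the present application, such as the serving cell and neighboring cells received by the terminal and the signal strength of each of these cells. In FIG. 1, communication terminal A is in a car and moves at high speed with the car. Communication terminal B is indoors and stationary. Communication terminal C moves at low speed with a user who is walking. FIG. 1 shows that at the different times T1, T2, and T3 the terminals have moved different distances: terminal A has moved farthest, terminal C less far, and terminal B not at all. In the prior art, the mobility state of a communication terminal is usually determined from the number of cells and the number of handovers corresponding to the terminal's call. Note that when a communication terminal moves quickly within a small area, the number of cells and the number of handovers are small, so the prior art may conclude that the terminal is moving slowly or is stationary. The prior art is therefore inaccurate when classifying calls.
To improve the accuracy of classifying the mobility state of communication terminals, the embodiments of the present application introduce custom features related to the mobility state of a call. The apparatus 101 for classifying calls acquires the data packet corresponding to multiple calls and classifies the mobility states of those calls.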
Specifically, as shown in FIG. 2, the method for classifying calls provided by an embodiment of the present application includes the following steps:
S201. Acquire a data packet, where the data packet includes N data sets corresponding respectively to N calls, and the data set corresponding to any call Hi of the N calls includes all measurement reports (MRs) sent in sequence by a communication terminal to a base station during the call Hi, where N is an integer greater than 1 and 1≤i≤N.
S202. Segment each of the N calls with each of w preset time windows, and determine the feature set corresponding to each call after segmentation by each preset time window, where the feature set includes the number of cells, the number of handovers, and the value of at least one custom feature corresponding to the segmented call, the custom feature being related to the mobility state of the call.
The number of cells is the total number of cells accessed within a period of time (for example 30 seconds, 1 minute, or 5 minutes).
The number of handovers is the total number of handovers that occur within a period of time (for example 30 seconds, 1 minute, or 5 minutes).
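Step S202 and the two basic counts can be sketched as follows. This is an illustration only: a measurement report is modelled as a hypothetical `(timestamp_seconds, serving_cell_id)` pair, reports are assumed to be sorted by time, the number of cells is taken as the number of distinct serving cells in a segment, and a handover is counted whenever two consecutive reports in the same segment name different serving cells.

```python
def split_by_window(reports, window_s):
    """Split one call's reports into consecutive segments of window_s seconds."""
    if not reports:
        return []
    start = reports[0][0]
    segments = {}
    for ts, cell in reports:
        segments.setdefault(int((ts - start) // window_s), []).append((ts, cell))
    return [segments[k] for k in sorted(segments)]


def cell_and_handover_counts(segment):
    """Return (number of distinct serving cells, number of serving-cell changes)."""
    cells = [cell for _, cell in segment]
    handovers = sum(1 for a, b in zip(cells, cells[1:]) if a != b)
    return len(set(cells)), handovers


def features_per_window(reports, windows_s=(30, 60, 300)):
    """Apply each preset window length in turn, as S202 does for the w windows."""
    return {
        w: [cell_and_handover_counts(seg) for seg in split_by_window(reports, w)]
        for w in windows_s
    }
```

Using several window lengths gives the classifier both short-term and long-term views of the same call; a terminal bouncing between two cells shows many handovers but only two cells at every window length.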
The at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:
the received signal strength standard deviation is the standard deviation of the received signal strength values of the call Hj. Specifically, it may be the standard deviation of the received signal strength values obtained from multiple measurements within a period of time (for example 30 seconds, 1 minute, or 5 minutes). Depending on which cells transmit the received signals, it can be divided into the received signal strength standard deviation of the primary serving cell, of the neighboring cells, and of all cells. In general, the faster the communication terminal moves, the larger the received signal strength standard deviation.
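The three variants of this feature can be computed with the Python standard library, as in the sketch below. The function names and the dBm inputs are hypothetical, and the population standard deviation is an assumption, since the text does not say whether the population or sample form is intended.

```python
import statistics


def rssi_std(values):
    """Population standard deviation of signal strength values; 0.0 if empty."""
    return statistics.pstdev(values) if values else 0.0


def rssi_std_features(serving_dbm, neighbor_dbm):
    """Standard deviations for the serving cell, the neighboring cells, and all cells."""
    return {
        "serving": rssi_std(serving_dbm),
        "neighbor": rssi_std(neighbor_dbm),
        "all": rssi_std(list(serving_dbm) + list(neighbor_dbm)),
    }
```

A stationary terminal tends to report nearly constant values and therefore a deviation close to zero, while a moving terminal sees its signal strength change as it approaches and leaves cells.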
The handover entropy represents the uncertainty of the cells accessed by call Hj. Specifically, it may be the uncertainty with which the communication terminal accesses different cells within a period of time (for example, 30 seconds, 1 minute, or 5 minutes), and it takes a value between 0 and 1. For example, if the communication terminal is stationary, it accesses only one cell during the period and its handover entropy is 0. The faster the communication terminal moves, the more cells it accesses and the more handovers occur within the period, and the closer its handover entropy is to 1. For example, a handover entropy of 0.2 suggests that the communication terminal is moving at low speed, and a handover entropy of 0.9 suggests that it is moving at high speed.
In some possible implementations, the handover entropy can be calculated according to the following formula:
[Formula image]
where entropy is the handover entropy corresponding to call Hj, N is the total number of cells accessed by call Hj, i denotes the i-th cell accessed by call Hj, #i is the number of times call Hj accesses the i-th cell, T is the total number of accesses to the different cells, and pi is the probability of accessing the i-th cell during this period.
Depending on the nature of the accessed cells, the handover entropy includes: the entropy of the primary serving cell, the entropy of neighboring cells, and the entropy of all cells.
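The formula itself does not survive in this text, so the following is only a sketch under an assumption: a Shannon entropy over the access probabilities pi = #i / T, normalized by log N so that the result lies between 0 and 1 and equals 0 when only one cell is accessed. This matches the stated range and the stationary case, but the normalization is a guess and may differ from the patent's actual formula.

```python
import math
from collections import Counter


def handover_entropy(accessed_cells):
    """Normalized Shannon entropy of cell accesses (assumed form).

    accessed_cells: one cell ID per access, so T = len(accessed_cells)
    and #i is the count of cell i.
    """
    counts = Counter(accessed_cells)
    n_cells = len(counts)             # N: number of distinct cells
    if n_cells <= 1:
        return 0.0                    # single cell: no uncertainty
    total = sum(counts.values())      # T: total number of accesses
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_cells)  # assumed normalization to [0, 1]


print(handover_entropy(["A"] * 5))            # 0.0
print(handover_entropy(["A", "A", "A", "B"]))
```

Under this assumed normalization, a terminal that alternates evenly between only two cells also reaches 1, so the feature is best read together with the cell and handover counts.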
The outdoor cell ratio is the number of outdoor-type cells accessed by call Hj as a percentage of the total number of cells accessed. If the outdoor cell ratio is below 50%, the communication terminal can be considered to be indoors and stationary.
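A minimal sketch of this ratio, assuming the set of outdoor-type cell IDs is known from network configuration data (the `outdoor_cells` argument is a hypothetical input) and that the ratio is taken over distinct accessed cells:

```python
def outdoor_cell_ratio(accessed_cells, outdoor_cells):
    """Percentage of distinct accessed cells that are outdoor-type cells."""
    distinct = set(accessed_cells)
    if not distinct:
        return 0.0
    outdoor = distinct & set(outdoor_cells)
    return 100.0 * len(outdoor) / len(distinct)


ratio = outdoor_cell_ratio(["A", "B", "B", "C", "D"], {"A", "C", "D"})
print(ratio)         # 75.0
print(ratio < 50.0)  # False, so the indoor/stationary rule does not apply
```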
The inter-station distance speed is the farthest distance between the mean of all position information obtained for call Hj and any position of the call. Specifically, it may be the farthest distance between the mean of all position information obtained by the user within a period of time (including MR positioning, AGPS positioning, and OTT positioning) and any of those positions. The larger the inter-station distance speed, the faster the communication terminal is considered to be moving. When a communication terminal leaves a position and returns to it within the period, the inter-station distance speed characterizes the user's movement better than the simple speed model, that is, the ratio of the start-to-end distance to the time difference. Referring to FIG. 3, the actual movement trajectory of each communication terminal is shown as a dotted line. The start of each solid arrow marks the mean of the position information, the arrowhead marks the point farthest from that mean within the specified time window, and the length of the arrow is the inter-station distance speed. As FIG. 3 shows, the inter-station distance speed is largest for the left communication terminal, second largest for the middle one, and smallest for the right one.
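The contrast with the simple start-to-end model can be sketched as follows. Positions are assumed to be planar coordinates in meters (a real system would first project latitude and longitude); both function names are illustrative.

```python
import math


def inter_station_distance_speed(positions):
    """Farthest distance from the mean position to any observed position."""
    if not positions:
        return 0.0
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    return max(math.hypot(x - cx, y - cy) for x, y in positions)


def start_end_distance(positions):
    """Distance between the first and last position (simple speed model)."""
    if not positions:
        return 0.0
    (x0, y0), (x1, y1) = positions[0], positions[-1]
    return math.hypot(x1 - x0, y1 - y0)


# A terminal that drives around a block and returns to where it started.
loop = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
print(start_end_distance(loop))            # 0.0
print(inter_station_distance_speed(loop))  # about 84.85
```

For the closed loop, the start-to-end distance is zero even though the terminal clearly moved, while the distance from the mean position stays large.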
S203: Classify the N calls into the preset mobility state categories according to the feature set corresponding to each call after splitting by each preset time window.
It should be noted that the feature sets of the multiple sub-calls obtained by splitting a call with any preset time window are computed first, and their average is then taken as the feature set for that preset time window. For example, if call H1 is split by a preset 30-second time window into 6 sub-calls, the feature set of each of the 6 sub-calls is computed, each feature is averaged across the 6 feature sets, and the averages are taken as the feature set of call H1.
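The split-then-average procedure can be sketched as below. The sketch assumes each measurement report is reduced to a (timestamp in seconds, serving cell ID) pair and uses only the cell count and handover count as features; all names are illustrative.

```python
def split_into_windows(reports, window_s):
    """Group (timestamp_s, cell) reports into consecutive fixed-length windows."""
    if not reports:
        return []
    start = reports[0][0]
    windows = {}
    for ts, cell in reports:
        windows.setdefault(int((ts - start) // window_s), []).append(cell)
    return [windows[k] for k in sorted(windows)]


def window_features(cells):
    """Feature vector for one sub-call: [cell count, handover count]."""
    handovers = sum(1 for a, b in zip(cells, cells[1:]) if a != b)
    return [len(set(cells)), handovers]


def averaged_feature_set(reports, window_s):
    """Average each feature over all sub-calls produced by one window size."""
    per_window = [window_features(w) for w in split_into_windows(reports, window_s)]
    if not per_window:
        return []
    return [sum(col) / len(per_window) for col in zip(*per_window)]


reports = [(0, "A"), (10, "A"), (20, "B"), (30, "B"), (40, "C"), (50, "C")]
# One averaged feature set per preset time window (here w = 2 window sizes).
for window_s in (30, 60):
    print(window_s, averaged_feature_set(reports, window_s))
# 30 [2.0, 1.0]
# 60 [3.0, 2.0]
```

In this simplified version a handover that falls exactly on a window boundary is not counted in either sub-call, which is one reason several window sizes can be informative.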
As can be seen, because the above technical solution defines custom features related to the mobility state of a call, the mobility state can be judged not only from the number of cells and the number of handovers of the call but also from the custom features, which helps improve the accuracy of call classification.
In some possible implementations, when the data sets of N' of the N calls include precise geographic location information, classifying the N calls into the preset mobility state categories includes:
For any call Hk among the N' calls, determining the category of call Hk by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1 ≤ k ≤ N'. For example, if N is 1000 and N' is 120, the data sets of 120 of the 1000 calls include precise geographic location information. The precise geographic location information includes information obtained through the Assisted Global Positioning System (AGPS) or through Internet application services (Over The Top, OTT). From the geographic location information of these 120 calls, the average speed of each call can be obtained, and the mobility state category of each call is determined from the average speeds corresponding to the preset mobility state categories.
Using a supervised learning algorithm, based on the category of each of the N' calls and the feature sets of each of the N' calls, determining the boundary between any two preset mobility state categories in an M-dimensional space, and obtaining from the boundaries the region occupied by each preset mobility state category in the M-dimensional space, where M is the number of features in the feature set corresponding to call Hk;
For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtaining the mapped position of each such call in the M-dimensional space from its feature sets, and determining its mobility state from the mapped position and the region occupied by each preset mobility state in the M-dimensional space.
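The label-then-learn flow above can be sketched as follows. The text does not name a specific supervised learning algorithm or give speed thresholds, so this sketch makes two loud assumptions: the speed thresholds in `SPEED_CLASSES` are hypothetical, and a nearest-centroid classifier stands in for the unspecified algorithm (its decision boundaries are the perpendicular bisectors between class centroids in the M-dimensional feature space).

```python
import math

# Hypothetical upper speed bounds in m/s; the source gives no numeric values.
SPEED_CLASSES = [(0.5, "stationary"), (3.0, "low"), (15.0, "medium"), (math.inf, "high")]


def label_from_speed(avg_speed_mps):
    """Label a call that has precise location data by its average speed."""
    for upper, name in SPEED_CLASSES:
        if avg_speed_mps < upper:
            return name


def fit_centroids(features, labels):
    """Stand-in supervised learner: one mean feature vector per category."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Classify a call without location data by its nearest class centroid."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))


# N' labelled calls: [cell count, handover count, handover entropy].
speeds = [0.1, 0.2, 20.0, 25.0]
features = [[1, 0, 0.0], [1, 0, 0.0], [6, 5, 0.9], [7, 6, 0.95]]
labels = [label_from_speed(s) for s in speeds]
centroids = fit_centroids(features, labels)
print(predict(centroids, [5, 4, 0.8]))  # high
print(predict(centroids, [1, 1, 0.1]))  # stationary
```

Any classifier that partitions the feature space into per-category regions could replace the nearest-centroid step; it is used here only because it is short and dependency-free.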
In some possible implementations, when the N' calls include GIS information:
Obtain the location of the specified ground-feature information in the GIS information, where the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states. Ground-feature information may be, for example, residences, shopping malls, parks, roads, intersections, highways, or railways.
For any call Hk among the N' calls, determine the category of call Hk by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1 ≤ k ≤ N';
Determine the ground-feature information matched by call Hk;
Determine the set J of ground-feature information corresponding to the N' calls;
Using a supervised learning algorithm, based on the category of each of the N' calls, the feature sets of each of the N' calls, and the set J of ground-feature information, determine the boundary between any two preset mobility state categories in an M-dimensional space, and obtain from the boundaries the region occupied by each preset mobility state category in the M-dimensional space, where M is the number of features in the feature set corresponding to call Hk.
In some possible implementations, for any ground-feature information in the set J, the set Ji of calls among the N' calls that correspond to that ground-feature information is determined. If the set Ji includes N'' calls, the set of mobility state types corresponding to the N'' calls is Ji', and the number of mobility state types in Ji' is N'''. If the number of calls corresponding to a given mobility state type in Ji' is less than N''/N''' (the threshold used in the park example, 150/3 = 50), the vectors F of the calls of that type are replicated. This improves classification accuracy for less probable categories.
Using a supervised learning algorithm, based on the mobility state category of each of the N' calls, the vectors F, and the vector of the preset feature set of each call, determine the boundary between any two preset mobility state categories in an M-dimensional space, and obtain from the boundaries the region occupied by each preset mobility state category in the M-dimensional space, where M is the number of features in the preset feature set corresponding to call Hk;
For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its preset feature set, and determine its mobility state from the mapped position and the region occupied by each preset mobility state in the M-dimensional space.
For example, suppose the ground-feature information from the GIS is a park, and 150 calls with precise geographic locations pass through the park: 92 of them are in the slow-moving state, 51 are stationary, and 7 are moving at high speed. Because the number of high-speed calls is less than 150/3 = 50, the vectors of the high-speed calls can be replicated until there are at least 50 high-speed vectors. Then, using the vectors of the preset feature sets of the 150 calls together with the 43 replicated high-speed vectors (taking 50 high-speed calls after replication as an example), a supervised learning algorithm determines the boundary between any two preset mobility state categories in the M-dimensional space, and the region occupied by each preset mobility state category in the M-dimensional space is obtained from the boundaries.
For the (N-N') calls among the N calls whose data sets do not include precise geographic location information, the mapped position of each such call in the M-dimensional space is obtained from its feature sets, and its mobility state is determined from the mapped position and the region occupied by each preset mobility state in the M-dimensional space.
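The replication step in the park example can be sketched as follows. The threshold is taken as the number of calls divided by the number of mobility state types, as in the 150/3 = 50 example; cycling through the existing vectors to produce the copies is an illustrative choice, and the function name is hypothetical.

```python
import math
from collections import Counter
from itertools import cycle, islice


def oversample_rare_classes(vectors, labels):
    """Replicate vectors of under-represented classes up to the threshold."""
    counts = Counter(labels)
    if not counts:
        return [], []
    threshold = len(labels) / len(counts)  # calls / mobility state types
    target = math.ceil(threshold)
    out_vectors, out_labels = list(vectors), list(labels)
    for label, count in counts.items():
        if count < threshold:
            pool = [v for v, l in zip(vectors, labels) if l == label]
            # Cycle through the existing vectors until the class reaches target.
            copies = list(islice(cycle(pool), target - count))
            out_vectors.extend(copies)
            out_labels.extend([label] * len(copies))
    return out_vectors, out_labels


labels = ["slow"] * 92 + ["stationary"] * 51 + ["high"] * 7
vectors = [[float(i)] for i in range(len(labels))]  # placeholder feature vectors
new_vectors, new_labels = oversample_rare_classes(vectors, labels)
print(Counter(new_labels)["high"], len(new_labels))  # 50 193
```

The 7 high-speed vectors gain 43 copies, matching the figures in the example, while the other two classes are left unchanged.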
In some possible implementations, when the N calls do not include precise geographic location information, N vectors corresponding to the preset feature sets, one for each of the N calls, are obtained from the preset feature sets of the N calls;
Using the N vectors and an unsupervised learning algorithm, the N calls are divided into M sets, where M is greater than the number of preset mobility state categories. For example, suppose there are 4 preset mobility states: stationary, low-speed, medium-speed, and high-speed. If N is 1000, in some possible implementations of the present invention, the 1000 calls can be divided into 20 sets by an unsupervised learning algorithm, and the mobility state of each of the 20 sets is then determined by expert algorithms such as factor analysis, iterative algorithms, or principal component analysis. For example, a set whose handover entropy is 0 is determined to be stationary, and a set whose handover entropy is at least 0.7 and whose number of handovers is greater than 5 is determined to be moving at high speed. The category of any call is the same as the category of the set to which it belongs.
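The cluster-then-label flow can be sketched as below. The text does not name the unsupervised algorithm, so a small k-means with deterministic farthest-point initialization is used as a stand-in. The two expert rules come from the examples above; the "low_or_medium" fallback and the sample data are assumptions for illustration.

```python
import math


def farthest_point_init(points, k):
    """Deterministic initialization: repeatedly pick the farthest point."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=20):
    """Stand-in unsupervised learner returning (assignment, centroids)."""
    centroids = farthest_point_init(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def expert_rule(centroid):
    """Map a cluster centroid [handover entropy, handover count] to a state."""
    entropy, handovers = centroid
    if entropy <= 1e-9:
        return "stationary"
    if entropy >= 0.7 and handovers > 5:
        return "high"
    return "low_or_medium"  # hypothetical fallback; not specified in the text


def classify_by_clusters(points, k):
    assignment, centroids = kmeans(points, k)
    cluster_state = [expert_rule(c) for c in centroids]
    # Each call inherits the state of the cluster it belongs to.
    return [cluster_state[a] for a in assignment]


calls = [[0.0, 0], [0.0, 0], [0.0, 0], [0.3, 2], [0.35, 2],
         [0.5, 4], [0.55, 4], [0.8, 7], [0.9, 8], [0.85, 9]]
print(classify_by_clusters(calls, 4))
```

Using more clusters than categories (here 4 clusters for 3 states) lets several clusters map to the same state, in the spirit of the 20-sets-for-4-states example.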
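Referring to FIG. 4, an embodiment of the present application provides an apparatus 400 for classifying calls. Specifically, the apparatus 400 shown in FIG. 4 may include an obtaining unit 401, a first processing unit 402, and a classification unit 403.
The obtaining unit 401 is configured to perform step S201 in FIG. 2 of the method embodiment of the present invention. For its implementation, refer to the description of step S201 in FIG. 2 of the method embodiment; details are not repeated here.
The first processing unit 402 is configured to perform step S202 in FIG. 2 of the method embodiment of the present invention. For its implementation, refer to the description of step S202 in FIG. 2 of the method embodiment; details are not repeated here.
The classification unit 403 is configured to perform step S203 in FIG. 2 of the method embodiment of the present invention. For its implementation, refer to the description of step S203 in FIG. 2 of the method embodiment; details are not repeated here.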
Optionally, in some possible implementations of the present invention, the at least one custom feature includes at least one of the following: received signal strength standard deviation, handover entropy, outdoor cell ratio, and inter-station distance speed, where:
the received signal strength standard deviation is the standard deviation of the received signal strength values of call Hj;
the handover entropy represents the uncertainty of the cells accessed by call Hj;
the outdoor cell ratio is the number of outdoor-type cells accessed by call Hj as a percentage of the total number of cells accessed;
the inter-station distance speed is the farthest distance between the mean of all position information obtained for call Hj and any position of the call.
Optionally, in some possible implementations of the present invention, the handover entropy is calculated according to the following formula:
[Formula image]
where entropy is the handover entropy corresponding to call Hj, N is the total number of cells accessed by call Hj, i denotes the i-th cell accessed by call Hj, #i is the number of times call Hj accesses the i-th cell, T is the total number of accesses to the different cells, and pi is the probability of accessing the i-th cell during this period.
Optionally, in some possible implementations of the present invention, depending on the nature of the accessed cells,
the received signal strength standard deviation in the preset feature set corresponding to call Hj includes: the received signal strength standard deviation of the primary serving cell, of neighboring cells, and of all cells;
the handover entropy in the preset feature set corresponding to call Hj includes: the entropy of the primary serving cell, the entropy of neighboring cells, and the entropy of all cells.
Optionally, in some possible implementations of the present invention, when the data sets of N' of the N calls include precise geographic location information, the preset feature set of any of the N' calls further includes the average speed, where N' ≥ 1.
The precise geographic location information includes location information obtained through AGPS or OTT positioning services.
Optionally, in some possible implementations of the present invention, when the data sets of N' of the N calls include precise geographic location information, the classification unit is specifically configured to:
determine, for any call Hk among the N' calls, the category of call Hk by comparing its average speed with the average speeds corresponding to the preset mobility state categories, where 1 ≤ k ≤ N';
use a supervised learning algorithm, based on the category of each of the N' calls and the feature sets of each of the N' calls, to determine the boundary between any two preset mobility state categories in an M-dimensional space, and obtain from the boundaries the region occupied by each preset mobility state category in the M-dimensional space, where M is the number of features in the feature set corresponding to call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its feature sets, and determine its mobility state from the mapped position and the region occupied by each preset mobility state in the M-dimensional space.
Optionally, in some possible implementations of the present invention, the classification unit is specifically configured to:
when the N' calls include GIS information,
obtain the location of the specified ground-feature information in the GIS information, where the specified ground-feature information is ground-feature information strongly correlated with the preset mobility states;
determine, for any call Hk among the N' calls, the category of call Hk by comparing its average speed with the speeds corresponding to the preset mobility state categories, where 1 ≤ k ≤ N';
determine the ground-feature information matched by call Hk;
determine the set J of ground-feature information corresponding to the N' calls;
use a supervised learning algorithm, based on the category of each of the N' calls, the feature sets of each of the N' calls, and the set J of ground-feature information, to determine the boundary between any two preset mobility state categories in an M-dimensional space, and obtain from the boundaries the region occupied by each preset mobility state category in the M-dimensional space, where M is the number of features in the feature set corresponding to call Hk;
for the (N-N') calls among the N calls whose data sets do not include precise geographic location information, obtain the mapped position of each such call in the M-dimensional space from its feature sets, and determine its mobility state from the mapped position and the region occupied by each preset mobility state in the M-dimensional space.
Optionally, in some possible implementations of the present invention, when the N calls do not include precise geographic location information, the classification unit is specifically configured to:
obtain, from the preset feature sets of the N calls, N vectors corresponding to the preset feature sets, one for each of the N calls;
divide the N calls into M sets using the N vectors and an unsupervised learning algorithm, where M is greater than the number of preset mobility state categories;
classify the M sets into the preset mobility state categories according to expert rules, so that the category of any call is the same as the category of the set to which it belongs.
An embodiment of the present application further provides a computer storage medium. The computer storage medium may store a program that, when executed, performs some or all of the steps of any of the call classification methods described in the above method embodiments.
An embodiment of the present application further provides an apparatus for classifying calls, including a processor and a storage component coupled to each other, where the processor is configured to execute any of the call classification methods described in the above method embodiments.
The steps in the methods of the embodiments of the present application may be reordered, combined, or omitted according to actual needs.
The units in the apparatus of the embodiments of the present application may be combined, divided, or omitted according to actual needs.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium includes any medium that can store program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710348411.3A CN107172637B (en) | 2017-05-17 | 2017-05-17 | Method and device for classifying calls |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107172637A (en) | 2017-09-15 |
CN107172637B (en) | 2021-01-29 |
Family
ID=59815400
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710348411.3A Active CN107172637B (en) | 2017-05-17 | 2017-05-17 | Method and device for classifying calls |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107172637B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115373002A (en) * | 2021-05-19 | 2022-11-22 | 中兴通讯股份有限公司 | A road user identification method, device, storage medium and electronic device |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7689210B1 (en) * | 2002-01-11 | 2010-03-30 | Broadcom Corporation | Plug-n-playable wireless communication system |
CN101453770A (en) * | 2007-12-07 | 2009-06-10 | 华为技术有限公司 | Measurement control method and apparatus |
CN102026309B (en) * | 2009-09-10 | 2013-07-31 | 电信科学技术研究院 | Method, system and device for detecting mobile state of terminal |
WO2013023346A1 (en) * | 2011-08-12 | 2013-02-21 | Huawei Technologies Co., Ltd. | Method for estimating mobility state |
CN103167551B (en) * | 2011-12-15 | 2016-06-29 | 华为技术有限公司 | A kind of method of reported by user equipment UE measurement result and subscriber equipment |
WO2014047795A1 (en) * | 2012-09-26 | 2014-04-03 | 华为技术有限公司 | Method and device for estimating moving status and measurement report method and device |
CN104080135B (en) * | 2013-03-29 | 2018-02-13 | 电信科学技术研究院 | A kind of network selecting method and equipment |
US9854527B2 (en) * | 2014-08-28 | 2017-12-26 | Apple Inc. | User equipment transmit duty cycle control |
CN105101247B (en) * | 2015-09-01 | 2018-08-03 | 重庆邮电大学 | Mobile status estimation Enhancement Method based on switching type certain weights and device |
-
2017
- 2017-05-17 CN CN201710348411.3A patent/CN107172637B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN107172637A (en) | 2017-09-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||