Disclosure of Invention
It is an object of the present invention to solve at least the above problems and to provide at least the advantages to be described later.
Still another object of the present invention is to provide a method for recommending a next point of interest based on multi-granularity self-attention, which recommends the next point of interest more accurately by applying a multi-granularity self-attention network that comprehensively considers a POI sequence and a POI business circle sequence.
To achieve these objects and other advantages and in accordance with the purpose of the invention, a method for recommending next points of interest based on multi-granularity self-attention is provided, comprising:
constructing a multi-granularity self-attention network model, wherein the multi-granularity self-attention network model comprises a first encoder for perceptually encoding time, a second encoder for perceptually encoding space and a decoder for decoding POI characteristics;
collecting historical check-in data of a plurality of users, training the multi-granularity self-attention network model by taking the historical check-in data of part of the users as a training set, taking the historical check-in data of the other part of the users as a test set, and predicting the next interest point of the users by utilizing the trained multi-granularity self-attention network model;
The training set training process and the test set testing process comprise the steps of extracting POI sequences containing time features and space features according to the user historical check-in data, and extracting POI business circle sequences containing time features and space features according to the user historical check-in data;
and using the POI sequence and the POI business circle sequence as input sequences, and processing the POI sequence and the POI business circle sequence through the multi-granularity self-attention network model to obtain the next POI id of the user.
Preferably, the user history check-in data includes a user id, a POI id, a check-in time, a category to which the POI belongs, a POI category, and a POI geographical location.
Preferably, the decoder includes a first decoder decoding the category of the POI, a second decoder decoding the category to which the POI belongs, and a third decoder decoding the POI id.
Preferably, the first encoder and the second encoder each consist of a feature layer, an aggregation layer, and a self-attention network.
Preferably, the processing of the POI sequence and the POI business circle sequence through the multi-granularity self-attention network model to obtain the next POI id of the user comprises the following steps of:
using a first encoder to encode the time features in the POI sequence to obtain time perception encoding of the POI sequence, using a second encoder to encode the space features in the POI sequence to obtain space perception encoding of the POI sequence, and synthesizing the time perception encoding of the POI sequence and the space perception encoding of the POI sequence to obtain POI sequence representation;
using a first encoder to encode the time features in the POI business circle sequence to obtain time perception encoding of the POI business circle sequence, using a second encoder to encode the space features in the POI business circle sequence to obtain space perception encoding of the POI business circle sequence, and synthesizing the time perception encoding of the POI business circle sequence and the space perception encoding of the POI business circle sequence to obtain a POI business circle sequence representation;
using the time perception encoding of the POI sequence and the time perception encoding of the POI business circle sequence as input quantities, and processing the input quantities by a first decoder to obtain a predicted value of the next POI category;
using the aggregate representation of the time features in the POI sequence obtained in the encoding process and the aggregate representation of the time features in the POI business circle sequence obtained in the encoding process as input quantities, and processing the input quantities by a second decoder to obtain a predicted value of the category to which the next POI belongs;
and taking the POI sequence representation, the POI business circle sequence representation, the predicted value of the next POI category and the predicted value of the category to which the next POI belongs as input quantities, and processing the input quantities by a third decoder to obtain the next POI id of the user.
Preferably, a layer of feedforward neural network is added to the self-attention network, and the activation function used is the ReLU function.
Preferably, the activation function of the first decoder is a sigmoid function, and the loss function is a cross entropy loss function.
Preferably, the loss function of the second decoder is a BPR loss function.
Preferably, the loss function of the third decoder is a BPR loss function.
Preferably, an AdaGrad optimizer is used in the training process of the multi-granularity self-attention network model.
The invention has the advantages that the POI sequence is used as a fine-grained sequence and the POI business circle sequence is used as a coarse-grained sequence; modeling at both the coarse-grained and fine-grained levels effectively improves the expressive capability of the model; a multi-granularity self-attention network is adopted to capture the ordered transition patterns of the two granularities; an active task of predicting the next POI category and an auxiliary task of predicting the category to which the next POI belongs are introduced; and finally, the next point of interest is recommended by integrating the sequence representations of the two granularities with the results of the two tasks.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Detailed Description
The present invention is described in further detail below with reference to the drawings to enable those skilled in the art to practice the invention by referring to the description.
It should be noted that the described embodiments of the present application are only some embodiments, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
As shown in fig. 1, the present invention provides a method for recommending next interest points based on multi-granularity self-attention, which includes:
S101, constructing a multi-granularity self-attention network model, wherein the multi-granularity self-attention network model comprises a first encoder for perceptually encoding time, a second encoder for perceptually encoding space and a decoder for decoding POI characteristics;
Here the first encoder and the second encoder each consist of a feature layer, an aggregation layer and a self-attention network.
The self-attention network uses an attention mechanism to calculate the correlation between each element and all other elements. The self-attention score is calculated mainly through the following processes:
Firstly, Q, K and V are obtained by multiplying the input matrix X with the weight matrices:
Q = XW_Q;
K = XW_K;
V = XW_V;
Secondly, the attention result Z is calculated from Q, K and V, and the calculated result is normalized by a softmax function:
Z = softmax(QK^T / √d)V;
wherein the function of √d is to scale the result. Since the previous calculations are all linear, a layer of feedforward neural network is added so that the model has better nonlinear fitting capability; the activation function used is the ReLU function:
F = FFN(Z) = ReLU(ZW_1 + b_1);
Finally, the result F of the self-attention network is obtained.
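As an illustration only (not the patented implementation), the self-attention computation described above can be sketched in NumPy; the single attention head, the toy dimensions, and all weight names are assumptions for the sketch:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_block(X, W_Q, W_K, W_V, W_1, b_1):
    """Scaled dot-product self-attention followed by a one-layer
    ReLU feedforward network, as in the steps above."""
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    d = Q.shape[-1]
    Z = softmax(Q @ K.T / np.sqrt(d)) @ V   # scores scaled by sqrt(d), softmax-normalized
    F = np.maximum(0.0, Z @ W_1 + b_1)      # FFN with ReLU activation
    return F

# Toy example: sequence length 4, model width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_Q, W_K, W_V = (rng.normal(size=(8, 8)) for _ in range(3))
W_1, b_1 = rng.normal(size=(8, 8)), np.zeros(8)
F = self_attention_block(X, W_Q, W_K, W_V, W_1, b_1)
print(F.shape)  # (4, 8)
```

The output F has the same sequence length as the input, one encoded vector per check-in position.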
Here, the POI features include a POI id, a category to which the POI belongs, a POI category, and a POI geographical position, and the decoder that decodes the POI features includes a first decoder that decodes the POI category, a second decoder that decodes the category to which the POI belongs, and a third decoder that decodes the POI id.
S102, collecting historical check-in data of a plurality of users, training the multi-granularity self-attention network model by taking the historical check-in data of part of the users as a training set, taking the historical check-in data of the other part of the users as a test set, and predicting the next interest point of the users by utilizing the trained multi-granularity self-attention network model;
For example, suppose there are 10 users, each having some check-in history. Assuming a user's check-in history is p1 to p10, then within the training set, p1 to p9 are used as the input to predict p10 for each user. Assuming the ratio of the training set to the test set is 8:2, the histories of 8 users are used for training to obtain the learnable parameters of the multi-granularity self-attention network model, and the histories of the remaining 2 users are used as the test set, on which the next point of interest of the user is predicted separately. Of course, this is merely illustrative; in practical application, the training set grows continuously as the number of users increases, so that the prediction results of the multi-granularity self-attention network model become more and more accurate.
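As a sketch of the split described above, the user-level train/test partition and the within-user input/target construction might look as follows (the function names and the fixed random seed are illustrative assumptions, not part of the invention):

```python
import random

def split_users(user_ids, train_ratio=0.8, seed=42):
    """Split users (not individual check-ins) into training and test
    sets, e.g. 8 of 10 users for training and 2 for testing."""
    ids = sorted(user_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

def to_example(checkins):
    """Within one user's history, the first n-1 check-ins form the
    input sequence and the last check-in is the prediction target."""
    return checkins[:-1], checkins[-1]

train_users, test_users = split_users(range(1, 11))
print(len(train_users), len(test_users))   # 8 2
history = [f"p{i}" for i in range(1, 11)]  # p1 ... p10
inputs, target = to_example(history)
print(target)                              # p10
```

Splitting by user rather than by check-in keeps each test user's full history unseen during training.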
The training set training process and the testing set testing process herein include:
S201, extracting POI sequences containing time features and space features according to the user historical check-in data, and extracting POI business circle sequences containing time features and space features according to the user historical check-in data;
Here, the user history check-in data includes a user id (u), a POI id (p), a check-in time (t), a category (c) to which the POI belongs, a POI category (y), and a POI geographical location (g), and the usage parameters are expressed as (u, p, t, c, y, g).
The user id (u), the POI id (p), the check-in time (t) and the POI geographical location (g) are readily understood from their literal meaning. The category (c) to which the POI belongs can be roughly classified into catering, hospitals, education and the like. The POI category (y) is explained with the following example:
As shown in fig. 4, the user Alice has the following historical access record p2 → p3 → … → p8 on the electronic map, wherein the individual POIs p2, p3, p4 and p5 lie within the business circle POI p1. In this model, the POI sequence (fine-grained sequence) is p2 → p3 → … → p8, and the POI business circle sequence (coarse-grained sequence) is p1 → p6 → p7 → p8, where p2 → p3 → p4 → p5 is reduced to the business circle POI p1. Since the individual POIs p6, p7 and p8 have no corresponding business circle POI at the coarser granularity, p6, p7 and p8 are still kept in order to preserve the integrity of the sequence context semantics. If the POI category (y) is distinguished by 0 and 1, the category (y) of the POIs p2 to p8 in the POI sequence takes the value 0, and the category (y) of p1 in the POI business circle sequence takes the value 1.
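The construction of the coarse-grained sequence from the fine-grained sequence can be sketched as follows (the function name and the mapping dictionary are illustrative assumptions; only the collapsing rule comes from the example above):

```python
def to_business_circle_sequence(poi_seq, circle_of):
    """Collapse a fine-grained POI sequence into a coarse-grained
    business circle sequence: consecutive POIs mapping to the same
    business circle are replaced by that circle's id; POIs with no
    business circle are kept as-is to preserve context semantics."""
    coarse = []
    for p in poi_seq:
        q = circle_of.get(p, p)          # fall back to the POI itself
        if not coarse or coarse[-1] != q:
            coarse.append(q)
    return coarse

# Alice's example: p2..p5 lie inside business circle p1; p6..p8 have none.
circle_of = {"p2": "p1", "p3": "p1", "p4": "p1", "p5": "p1"}
fine = ["p2", "p3", "p4", "p5", "p6", "p7", "p8"]
print(to_business_circle_sequence(fine, circle_of))
# ['p1', 'p6', 'p7', 'p8']
```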
In the POI sequence, t_n^u denotes the temporal feature of user u at time step n, and g_n^u denotes the spatial feature of user u at time step n. Similarly, in the POI business circle sequence, the temporal feature and the spatial feature of user u at time step n are defined in the same way.
S202, using the POI sequence and the POI business circle sequence as input sequences, and processing through the multi-granularity self-attention network model to obtain the next POI id of the user.
This step specifically comprises the following sub-steps:
S301, encoding the time features in the POI sequence by using the first encoder to obtain the time perception encoding of the POI sequence.
In the POI sequence, t_n^u is the temporal feature of user u at time step n; the first encoder consists of a feature layer, an aggregation layer, and a self-attention network.
First, the temporal feature aggregation is expressed as:
A^t_P = e_y + e_h + e_c + B_1;
wherein e_y is the embedding of the POI category, e_h is the embedding of the check-in timestamp mapped to 24 hours, e_c is the embedding of the category to which the POI belongs, and B_1 is a learnable parameter.
The aggregated representation of the temporal features is then fed into the self-attention network, resulting in the time perception encoding of the POI sequence:
E^t_P = SAN(A^t_P);
wherein SAN denotes the self-attention network, and E^t_P is the time perception encoding of the POI sequence.
And using a second encoder to encode the spatial features in the POI sequence to obtain spatial perception encoding of the POI sequence, and synthesizing the temporal perception encoding of the POI sequence and the spatial perception encoding of the POI sequence to obtain POI sequence representation.
In the POI sequence, g_n^u is the spatial feature of user u at time step n. It is noted that in the process of extracting the features, the spatial coordinates g_n^u are not used directly; instead, the distance between the current position and the last position is recorded as d_n^u. The second encoder is also composed of a feature layer, an aggregation layer, and a self-attention network.
First, the spatial feature aggregation is expressed as:
A^s_P = e_p + e_d + B_2;
wherein e_p is the embedding corresponding to the POI id, e_d is the embedding of the interval d_n^u between the current position and the last position, and B_2 is a learnable parameter.
Next, the type of d_n^u is set to an integer type to reduce the computational complexity, and the aggregated representation of the spatial features is fed into the self-attention network to obtain the space perception encoding of the POI sequence:
E^s_P = SAN(A^s_P);
wherein SAN denotes the self-attention network, and E^s_P is the space perception encoding of the POI sequence. Finally, the time perception encoding and the space perception encoding of the POI sequence are combined into the POI sequence representation:
R_P = E^t_P + E^s_P.
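The encoder structure described above (feature layer, aggregation layer, self-attention network) can be sketched as follows. This is a minimal illustration under stated assumptions: the aggregation is taken as an element-wise sum plus a learnable bias, and an identity function stands in for the self-attention network, whose real form is given earlier:

```python
import numpy as np

def encode_sequence(e_a, e_b, e_c, B, san):
    """Sketch of one encoder: the feature embeddings are aggregated
    (here, illustratively, by element-wise sum plus a learnable bias B)
    and the aggregate is passed through the self-attention network."""
    agg = e_a + e_b + e_c + B   # aggregation layer output
    return san(agg), agg        # perception encoding, and the aggregate itself

rng = np.random.default_rng(1)
n, d = 5, 8                     # sequence length, embedding width (toy values)
e_y, e_h, e_cat = (rng.normal(size=(n, d)) for _ in range(3))
B1 = np.zeros(d)
identity_san = lambda x: x      # stand-in for the self-attention network
enc, agg = encode_sequence(e_y, e_h, e_cat, B1, identity_san)
print(enc.shape)  # (5, 8)
```

The aggregate `agg` is kept alongside the encoding because the second decoder later consumes the aggregate representations directly.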
S302, encoding the time features in the POI business circle sequence by using the first encoder to obtain the time perception encoding of the POI business circle sequence.
In the POI business circle sequence, the temporal feature of user u at time step n is defined analogously; the first encoder consists of a feature layer, an aggregation layer, and a self-attention network.
First, the temporal feature aggregation is expressed as:
A^t_B = e_y + e_h + e_c + B_3;
wherein e_y is the embedding of the POI category, e_h is the embedding of the check-in timestamp mapped to 24 hours, e_c is the embedding of the category to which the POI belongs, and B_3 is a learnable parameter.
Then, the aggregated representation of the temporal features is fed into the self-attention network, resulting in the time perception encoding of the POI business circle sequence:
E^t_B = SAN(A^t_B);
wherein SAN denotes the self-attention network, and E^t_B is the time perception encoding of the POI business circle sequence.
Using the second encoder to encode the spatial features in the POI business circle sequence to obtain the space perception encoding of the POI business circle sequence, and synthesizing the time perception encoding of the POI business circle sequence and the space perception encoding of the POI business circle sequence to obtain the POI business circle sequence representation;
In the POI business circle sequence, the spatial feature of user u at time step n is defined analogously. Notably, the spatial coordinates are not used directly in the process of extracting the features; instead, the distance between the current position and the last position is recorded as the interval. The second encoder is also composed of a feature layer, an aggregation layer, and a self-attention network.
First, the spatial feature aggregation is expressed as:
A^s_B = e_p + e_d + B_4;
wherein e_p is the embedding corresponding to the POI id, e_d is the embedding of the interval between the current position and the last position, and B_4 is a learnable parameter.
Next, the type of the interval is set to an integer type to reduce the computational complexity, and the aggregated representation of the spatial features is fed into the self-attention network to obtain the space perception encoding of the POI business circle sequence:
E^s_B = SAN(A^s_B);
wherein SAN denotes the self-attention network, and E^s_B is the space perception encoding of the POI business circle sequence. Finally, the time perception encoding and the space perception encoding of the POI business circle sequence are combined into the POI business circle sequence representation:
R_B = E^t_B + E^s_B.
S303, using the time perception encoding of the POI sequence and the time perception encoding of the POI business circle sequence as input quantities, and processing the input quantities by the first decoder to obtain a predicted value of the next POI category;
To predict the next POI category, the time perception encodings from the two sequences are used:
H^t = E^t_P + E^t_B;
ŷ = σ(H^t W + b);
wherein H^t is the aggregation of the time perception encodings of the two sequences, σ is the sigmoid function, and ŷ is the predicted value of the next POI category.
For the prediction of the next POI category, the cross-entropy loss function is used:
L_y = -Σ [ y·log(ŷ) + (1 - y)·log(1 - ŷ) ].
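Since the category y is binary (0 for an individual POI, 1 for a business circle POI), the first decoder reduces to a sigmoid classifier with a binary cross-entropy loss. A minimal sketch, in which the sum aggregation and the weight names are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def first_decoder(h_poi, h_circle, W, b):
    """Aggregate the two time-aware encodings (here by sum) and apply
    a sigmoid to predict the next POI category y in {0, 1}."""
    h = h_poi + h_circle
    return sigmoid(h @ W + b)

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy, clipped for numerical stability.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Toy usage: zero weights give the uninformed prediction 0.5.
h_p, h_c = np.ones(4), np.ones(4)
W, b = np.zeros((4, 1)), np.zeros(1)
print(first_decoder(h_p, h_c, W, b))                       # [0.5]
y_true, y_pred = np.array([0.0, 1.0]), np.array([0.1, 0.9])
print(round(float(cross_entropy(y_true, y_pred)), 4))      # 0.1054
```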
S304, using the aggregate representation obtained from the time features in the POI sequence in the encoding process and the aggregate representation obtained from the time features in the POI business circle sequence in the encoding process as input quantities, and processing the input quantities by the second decoder to obtain a predicted value of the category to which the next POI belongs;
In order to predict the category to which the next POI belongs, the time-coded aggregate representations from the two sequences are first calculated, then their aggregation A_rep is calculated, and finally the category to which the next POI belongs is calculated:
A_rep = A^t_P + A^t_B;
ĉ = (A_rep + e_u) W_c;
wherein A_rep is the time-aware representation, e_u is the representation of the user, and ĉ is the predicted value of the category to which the next POI belongs.
For the prediction of the category to which the next POI belongs, the BPR loss function is used:
L_c = -Σ ln σ(ĉ_pos - ĉ_neg);
wherein ĉ_pos and ĉ_neg are the scores of the positive category and of a sampled negative category, respectively.
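The BPR (Bayesian Personalized Ranking) loss used by the second and third decoders scores positive/negative pairs rather than absolute labels. A minimal sketch with illustrative pair scores:

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    """BPR loss: for each (positive, negative) pair, maximize the
    probability that the positive item ranks above the negative one,
    i.e. minimize -ln sigma(s_pos - s_neg), averaged over pairs."""
    diff = pos_scores - neg_scores
    return float(-np.mean(np.log(1.0 / (1.0 + np.exp(-diff)))))

pos = np.array([2.0, 1.5])   # scores of observed items
neg = np.array([0.5, 1.0])   # scores of sampled negatives
print(round(bpr_loss(pos, neg), 4))  # 0.3377
```

The loss shrinks as the positive scores pull further ahead of the negatives, which is exactly the ranking behavior wanted for category and POI-id recommendation.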
S305, using the POI sequence representation, the POI business circle sequence representation, the predicted value of the next POI category and the predicted value of the category to which the next POI belongs as input quantities, and obtaining the next POI id of the user through processing of a third decoder.
In order to recommend the next POI, the POI sequence representation, the POI business circle sequence representation, the predicted value of the next POI category and the predicted value of the category to which the next POI belongs are aggregated; then the aggregation P_rep of the spatial aggregate representations of the two sequences is calculated, and finally the next POI id is recommended:
M = Agg(R_P, R_B, ŷ, ĉ);
P_rep = A^s_P + A^s_B;
p̂ = (M + P_rep + e_u) W_p;
wherein M, the aggregation of the first encoder results and the second encoder results of the two sequences together with the predictions of the two tasks, is the multi-granularity sequence representation; P_rep is the space-aware representation; e_u is the representation of the user; and p̂ is the predicted value of the next POI id of the user.
For recommending the next POI, the BPR loss function is used:
L_p = -Σ ln σ(p̂_pos - p̂_neg).
For the total loss function, there is the following representation:
L = λ_y·L_y + λ_c·L_c + λ_p·L_p + λ·||Θ||²;
wherein λ_y, λ_c and λ_p are used to weight the different losses, λ_y + λ_c + λ_p = 1, λ is the regularization coefficient, and Θ = (W, b) is the set of learnable parameters.
Here, when steps S303 to S305 are performed during training, an AdaGrad optimizer is further used to optimize the learnable parameters.
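AdaGrad adapts the learning rate of each parameter by the accumulated squared gradients, so frequently updated parameters receive smaller steps. A single update step can be sketched as follows (toy values; the learning rate is an illustrative default):

```python
import numpy as np

def adagrad_step(theta, grad, G, lr=0.01, eps=1e-8):
    """One AdaGrad update: accumulate squared gradients in G and
    scale each parameter's step by 1/sqrt(G + eps)."""
    G = G + grad ** 2
    theta = theta - lr * grad / (np.sqrt(G) + eps)
    return theta, G

theta, G = np.array([1.0, 1.0]), np.zeros(2)
grad = np.array([0.5, 2.0])
theta, G = adagrad_step(theta, grad, G)
print(np.round(theta, 3))  # [0.99 0.99]
```

Note that on the first step both coordinates move by the same amount despite different gradients; the per-parameter scaling only differentiates them as gradient history accumulates.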
According to the method in this embodiment, the POI sequence is used as the fine-grained sequence and the POI business circle sequence as the coarse-grained sequence; modeling at both the coarse-grained and fine-grained levels effectively improves the expressive capability of the model; a multi-granularity self-attention network is adopted to capture the ordered transition patterns of the two granularities; an active task of predicting the next POI category and an auxiliary task of predicting the category to which the next POI belongs are introduced; and finally, the next point of interest is recommended by integrating the sequence representations of the two granularities with the results of the two tasks.
Although embodiments of the present invention have been disclosed above, the invention is not limited to the details and embodiments shown and described. It is well suited to various fields of use that will be readily apparent to those skilled in the art, and accordingly the invention is not limited to the specific details and illustrations shown and described herein, without departing from the general concepts defined by the claims and their equivalents.