CN113342963B

CN113342963B - Service recommendation method and system based on transfer learning

Info

Publication number: CN113342963B
Application number: CN202110476286.0A
Authority: CN
Inventors: 戴鸿君; 雷超
Original assignee: Shandong University
Current assignee: Shandong Mass Institute Of Information Technology; Shandong University
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2022-03-04
Anticipated expiration: 2041-04-29
Also published as: CN113342963A

Abstract

The invention relates to a service recommendation method and system based on transfer learning, comprising the following steps: S1, establishing a data set: collecting related data, filtering out the valid data and establishing a data set; S2, service semantic modeling: extracting the geographic position information from the different information in the data set, performing cluster analysis, fusing the geographic position information into the original information as a feature in a suitable proportion, and performing service semantic modeling; S3, transfer learning: combining cross-domain prior information with the service semantic modeling information and refining both; S4, data fusion matrix decomposition: after the correction coefficients are obtained, determining the latent factors of a user-topic matrix and a topic-service matrix, applying different weight decay techniques to the latent factors determined a priori, and using an ADAM optimizer to resist overfitting; and S5, performing service recommendation for the user. The method and system provide stable service recommendation quality in a complex environment and a good service experience for the user in a warm start environment.

Description

Service recommendation method and system based on transfer learning

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a service recommendation method and system based on transfer learning.

Background

In recent years, as privacy problems have come to light and users' security awareness has grown, people tend to transmit as little personal data to servers as possible. A recommendation system can therefore obtain only sparse user data, yet it must still recommend suitable services to users in such an environment. This is the cold start problem.

In a cold start environment, only a small amount of data (such as GPS, biometric fingerprint, phone, historical data, etc.) of a user can be acquired, so how to complete service recommendation for the user based on existing sparse information becomes a problem to be solved urgently.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a service recommendation method based on transfer learning, which is used for solving the problem of describing the service by using the transfer learning under sparse user data, thereby providing proper service recommendation for a user and improving the service recommendation quality under a cold start environment.

The invention also provides a service recommendation system based on the transfer learning.

For the cold start problem, the invention is summarized as follows: position information is extracted from the web services, one-hot encoded after a clustering algorithm is applied, and the encoded information is added to the service description. Semantic modeling is performed on the service description information, the extracted semantic model is migrated on the basis of GoogleNews, the similarity is calculated to obtain a correction coefficient, and matrix decomposition is performed. Service recommendation results for different users are thereby obtained.

Interpretation of terms:

1. Latent Dirichlet allocation, LDA for short, is a topic model that gives the topic of each document in a document set in the form of a probability distribution. It is also an unsupervised learning algorithm: no manually labeled training set is needed during training, only a document set and the specified number of topics k. Another advantage of LDA is that, for each topic, some words can be found to describe it. LDA was first proposed in 2003 by David M. Blei, Andrew Y. Ng and Michael I. Jordan, and is currently used in text mining, including text topic identification, text classification and text similarity calculation.

2. One-hot encoding, also known as one-bit-effective encoding, uses an N-bit status register to encode N states; each state is represented by its own independent register bit, and only one bit is active at any time. In FPGA or ASIC designs with abundant flip-flop resources, one-hot encoding helps guarantee the circuit characteristics and makes full use of the large number of flip-flops.

3. Gibbs sampling is a Markov chain Monte Carlo (MCMC) algorithm used in statistics to approximate a sample sequence from a multivariate probability distribution when direct sampling is difficult. The sequence can be used to approximate the joint distribution, the marginal distribution of a subset of the variables, or to compute an integral (e.g., the expected value of a variable). Some variables may be known, and these need not be sampled.

4. The GoogleNews model, shown in fig. 2, is a model that Google trained on GoogleNews data and is one of the models commonly used in natural language processing. The training corpus contains about three million distinct words, and each word is mapped to a 300-dimensional vector for use in various forms of processing. The invention builds on the original model and trains with skip-gram, a technique that predicts the context from a given input.

5. The K-means algorithm is a hard clustering algorithm and a typical prototype-based objective-function clustering method. It takes a distance from the data points to the prototype as the objective function to be optimized and derives the update rule of the iterative operation by solving for the extremum of that function. The K-means algorithm uses Euclidean distance as the similarity measure and seeks the optimal classification for a given initial cluster center vector V such that the evaluation index J is minimized. The algorithm uses the sum of squared errors criterion function as the clustering criterion function.

6. The weight decay technique retains, to a certain extent, the information acquired after data fusion in the initial stage of training during the matrix decomposition process. It explicitly indicates an optimization direction to the optimizer when the loss function is optimized and maintains that direction to a certain extent during training, so that previously acquired information is not forgotten too quickly under the pull of the loss function's objective.

The technical scheme of the invention is as follows:

A service recommendation method based on transfer learning comprises the following steps:

S1, establishing a data set: collecting relevant data on the network, filtering out the valid data, and establishing a data set;

S2, service semantic modeling: extracting the geographic position information from the different information in the data set, performing cluster analysis, fusing the geographic position information into the original information as a feature in a suitable proportion, performing service semantic modeling, and outputting a document-topic probability distribution matrix and a topic-word probability distribution matrix;

S3, transfer learning: combining cross-domain prior information with the service semantic modeling information and refining both; the service semantic modeling information is the document-topic probability distribution matrix and the topic-word probability distribution matrix output in step S2; in this step, information from the GoogleNews model domain is migrated into the native description information data set. The step mainly uses a fine-tuning technique based on GoogleNews, uses a negative sampling technique for word embedding during migration in order to accelerate convergence and avoid overfitting, and finally optimizes by means of a loss function;

S4, data fusion matrix decomposition: after the correction coefficients are obtained, determining the latent factors of a user-topic matrix and a topic-service matrix, applying different weight decay techniques to the latent factors determined a priori, and using an ADAM optimizer to resist overfitting, thereby relieving data sparsity in a cold start environment;

S5, performing service recommendation for the user, the recommendation result for the user being obtained from the estimated scoring matrix.

Preferably, in step S2, the service semantic modeling includes the following specific steps:

S201, carrying out data set labeling work:

analyzing the text content information Info_c and the geographic position information Info_l in the data set using latent Dirichlet allocation; using the unsupervised LDA clustering algorithm to compute a topic model in which services are close to one another, and projecting the description information onto a vector space composed of several topics, wherein the geographic position information is one-hot encoded and treated as special vocabulary, and the special vocabulary is added to the text content information according to its frequency of occurrence;

the added Top-N vocabulary l formally satisfies the following formula (I), after which new description information Info is generated:

in formula (I), the coefficient γ is determined by the formula; ω, η and δ are parameters set manually in the program to limit the range of γ; l_i is the frequency with which the different terms of a document appear in that document; f_wd is the frequency with which the different words of the document appear in the whole corpus; and f_l is the Top-N geographic position count frequency for the different services;
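As a rough illustration of the fusion in step S201, the sketch below treats each clustered location as a special token and appends the Top-N most frequent location tokens to a service's text tokens. The function name `fuse_location_tokens`, the `LOC_` token prefix and the fixed `repeat` count are assumptions made for illustration; in the method itself the proportion is governed by formula (I) and its coefficient γ, which this sketch does not implement.

```python
from collections import Counter

def fuse_location_tokens(text_tokens, location_clusters, top_n=2, repeat=1):
    """Append the top_n most frequent location clusters as special tokens.

    `repeat` is a hypothetical stand-in for the proportion that formula (I)
    would determine; it is not the gamma coefficient of the method.
    """
    counts = Counter(location_clusters)
    # most_common sorts clusters by how often they were requested
    top = [cluster for cluster, _ in counts.most_common(top_n)]
    special = [f"LOC_{cluster}" for cluster in top for _ in range(repeat)]
    return list(text_tokens) + special

description = ["used", "paperback", "novel"]
request_clusters = [3, 1, 3, 3, 1, 7]  # cluster id of each request's location
info = fuse_location_tokens(description, request_clusters)
print(info)
```

Because the location tokens sit in the same token list as ordinary words, a topic model can consume the fused description without any change to its input format.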

S202, it is difficult to estimate parameters on the whole data set directly from the service description information and the original latent Dirichlet allocation, so Gibbs sampling is used for approximation, specifically: a service description d, parameters α and β, a number of topics K and geographic position information l are input, where α and β are the parameters required by latent Dirichlet allocation; service semantic modeling is performed; and a document-topic probability distribution matrix Φ and a topic-word probability distribution matrix Θ are obtained after the service semantic modeling.
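The approximation in step S202 can be sketched with a textbook collapsed Gibbs sampler for LDA. This is a minimal sketch using only the Python standard library, not the implementation used in the invention; the function name `lda_gibbs`, the default hyperparameters and the toy documents are assumptions. Following the naming used in this description, the first returned matrix plays the role of the document-topic matrix Φ and the second that of the topic-word matrix Θ.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab): two row-normalised probability
    matrices as nested lists, plus the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    n_dk = [[0] * K for _ in docs]      # topic counts per document
    n_kw = [[0] * V for _ in range(K)]  # word counts per topic
    n_k = [0] * K                       # total tokens per topic
    z = []                              # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(K)        # random initial assignment
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][index[w]] += 1
            n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], z[d][i]
                # remove the token from the counts, then resample its topic
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1
    doc_topic = [
        [(n_dk[d][t] + alpha) / (len(doc) + K * alpha) for t in range(K)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[t][v] + beta) / (n_k[t] + V * beta) for v in range(V)]
        for t in range(K)
    ]
    return doc_topic, topic_word, vocab

docs = [["book", "novel", "LOC_3"], ["phone", "camera", "LOC_1"], ["novel", "book", "book"]]
phi, theta, vocab = lda_gibbs(docs, num_topics=2, iters=50)
```

Each row of both matrices sums to one, so a row can be read directly as a probability distribution over topics or over words.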

Preferably, in step S3, information of different domains is fused by using transfer learning, that is, cross-domain prior information and service semantic modeling information are combined and refined, and the method includes the following specific steps:

S301, calculating the frequency f(w_j) of the different words in the different description information, where f(w_j) is the frequency with which the word w_j occurs in the whole corpus and j is the index of the word; f(w_j) has logically the same meaning as f_wd above, but because the description information has changed (step S201), the symbol f(w_j) is used instead.

Computing the set of negative samples P(w_i):

ξ is an empirical parameter, w_i is a word, and i and j are indices of different words;
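A minimal sketch of a negative sampling distribution follows. It assumes the smoothed unigram form commonly used with word2vec, P(w_i) = f(w_i)^ξ / Σ_j f(w_j)^ξ, which matches the symbols defined above but is an assumption about the exact formula; the function name and the toy corpus are likewise illustrative.

```python
from collections import Counter

def negative_sampling_distribution(tokens, xi=0.75):
    """Return {word: P(word)} with P(w_i) = f(w_i)**xi / sum_j f(w_j)**xi."""
    freq = Counter(tokens)
    # Raw counts can stand in for relative frequencies because the
    # normalisation below cancels the common factor.
    weights = {w: c ** xi for w, c in freq.items()}
    total = sum(weights.values())
    return {w: weight / total for w, weight in weights.items()}

corpus = ["book", "book", "book", "book", "novel"]
p = negative_sampling_distribution(corpus)
print(p)
```

With an exponent below 1, rare words are drawn as negatives more often than their raw frequency would suggest; here "novel" receives a probability above its raw share of 0.2.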

S302, based on the GoogleNews model, fine-tuning on the description information Info with a negative sampling technique to obtain a migration model f, and obtaining, under the migration model f, the high-dimensional mapping vectors of the different words in the description information.

The migration model f is based on skip-gram and negative sampling; it is the migration model obtained from the GoogleNews data set and the new description information data set after a certain number of iteration steps.

S303, using the K-means algorithm to obtain the similarity and the center vectors of the different words, wherein the center vector is one of the outputs of the K-means algorithm, different words have different center vectors, and the similarity between different words is the distance between their vectors. The correction coefficient w is then calculated in combination with the result of service semantic modeling, as shown in formula (II):

in formula (II), the correction coefficient w is used to measure the different topic words in the service description in order to determine f(s); dis() denotes a vector distance function, which may be cosine distance, Euclidean distance, etc.; m is the number of words in the class to which the word belongs; R is the maximum distance within the class; and the center vector is the one output by the K-means algorithm for that class.
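The quantities that step S303 takes from K-means can be sketched as follows: the center vector of each class, the number of words m in the class, and the maximum distance R within the class. This is a plain standard-library K-means with Euclidean distance on toy two-dimensional vectors; reading R as the largest member-to-center distance is an assumption, and the correction coefficient of formula (II) itself is not computed here.

```python
import math
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain K-means with Euclidean distance; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # assignment step: each vector joins its nearest center
        labels = [min(range(k), key=lambda c: math.dist(v, centers[c])) for v in vectors]
        # update step: each center moves to the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def cluster_stats(vectors, centers, labels):
    """Per class: m = number of members, R = largest distance to the center."""
    stats = {}
    for c, center in enumerate(centers):
        dists = [math.dist(v, center) for v, lab in zip(vectors, labels) if lab == c]
        stats[c] = {"m": len(dists), "R": max(dists, default=0.0)}
    return stats

word_vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centers, labels = kmeans(word_vectors, k=2)
print(cluster_stats(word_vectors, centers, labels))
```

Swapping `math.dist` for a cosine distance would give the other choice of dis() mentioned in the text.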

More preferably, ξ is 0.75;

further preferably, the high-dimensional mapping vectors of the different words in the description information are obtained under the migration model f through word2vec.

Specifically, the different words in the description information are one-hot encoded, and the output after the hidden layer is taken as the word's representation in the high-dimensional space; that is, the words are embedded into the high-dimensional space, which is known in the industry as word embedding.
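The relation between one-hot encoding and the hidden-layer output can be shown on a toy example: multiplying a one-hot vector by the input-to-hidden weight matrix simply selects the row of that matrix belonging to the word. The vocabulary and the 3 x 2 weight matrix below are made-up values for illustration; the GoogleNews model described above uses 300 dimensions.

```python
def one_hot(index, size):
    """Vector of length `size` with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def hidden_layer(one_hot_vec, weights):
    """Multiply a one-hot row vector by the input-to-hidden weight matrix."""
    dim = len(weights[0])
    return [sum(x * row[d] for x, row in zip(one_hot_vec, weights)) for d in range(dim)]

vocab = ["book", "novel", "LOC_3"]
W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # toy weights, one row per word

vec = hidden_layer(one_hot(vocab.index("novel"), len(vocab)), W)
print(vec)  # the embedding of "novel" is the second row of W
```

This is why practical implementations store the weight matrix as a lookup table instead of performing the multiplication.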

Preferably, the step S4 is implemented as follows:

S401, obtaining the initialized user-topic matrix P_i according to the correction coefficient w:

f(s_j) is determined by the topic words inherent in the different description documents and is the service score accumulated from those topic words; s_j is a service; h_i is the set of all services used by user i; and r_i,j is the user's rating of the different services;

S402, the topic-service matrix Q is initialized as Q = Φ^T; the latent factors of the topic-service matrix are all the parameters in the initialized topic-service matrix Q;

S403, in the scoring matrix, the missing values are initialized according to the user rating information in the data set established in step S1. Because a user rates only some of the items, the other items are unrated, and the missing values are the entries the user has not rated, as shown in formula (III):

in formula (III), R_i,j is the score of user i for service j; Φ_j is the document-topic probability distribution matrix Φ output in step S202; h_i is the set of all services used by user i; f(s_k) is the service score accumulated from the topic words inherent in service s_k; s_k is a service; N_i is the number of services used by the user; R_k,j is the score of user k for service j; and U is the set of all users;

S404, based on the weight decay technique, the relevant information brought by transfer learning is retained to a certain extent in the initial stage of matrix decomposition; in the t-th iteration of training the loss function with the ADAM optimizer, a decay coefficient determines how strongly the pre-initialized information, including the estimated scoring matrix R, the user-topic matrix P and the topic-service matrix Q, influences the optimization target in the different iterations;

The r_u,i data are randomly sampled to obtain a training set and a test set, with 80% used for training and the rest for prediction. Formula (IV) is used as the loss function during training, and training is then carried out with a weight decay technique based on the ADAM optimization algorithm.
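The random 80/20 split described above can be sketched as follows. The `(user, item, rating)` triple layout, the function name and the fixed seed are assumptions for illustration.

```python
import random

def split_ratings(ratings, train_fraction=0.8, seed=0):
    """Shuffle (user, item, rating) triples and split them into train and test."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy data: 10 users x 10 items, every pair rated 5.
ratings = [(u, i, 5) for u in range(10) for i in range(10)]
train, test = split_ratings(ratings)
print(len(train), len(test))
```

Lowering `train_fraction` to 0.1, 0.2 or 0.3 gives the kind of sparse training set used to simulate a cold start environment.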

dw_t,u,i is the decay coefficient of user u for item i during the t-th training iteration; dr is the preset initial decay coefficient; t is the training step of the t-th iteration; r_u,i is the score of user u for item i;

in the data fusion matrix decomposition, the loss function optimized with the ADAM optimizer is shown in formula (IV):

in formula (IV), J(r) denotes the loss function in the form most common in machine learning; the prediction term is the user scoring matrix predicted by the data fusion matrix decomposition; λ is the regularization coefficient; P_u,f is the user-topic matrix determined in step S401; and Q_f,i is the topic-service matrix determined in step S402.
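For orientation, the sketch below computes a loss of the kind commonly used in matrix decomposition: squared error over the observed ratings plus an L2 penalty weighted by λ. It is an assumption that formula (IV) has this general shape, and the decay coefficient dw_t,u,i of the weight decay technique is deliberately left out. P is a users x topics matrix and Q a topics x services matrix, both as nested lists; the function names are illustrative.

```python
def predict(P, Q, u, i):
    """Predicted score: dot product of user row P[u] with service column i of Q."""
    return sum(P[u][f] * Q[f][i] for f in range(len(Q)))

def mf_loss(ratings, P, Q, lam=0.1):
    """Squared error over observed (u, i, r) triples plus L2 penalty on P and Q."""
    error = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    penalty = sum(x * x for row in P for x in row) + sum(x * x for row in Q for x in row)
    return error + lam * penalty

P = [[1.0, 0.0]]      # one user, two topics
Q = [[2.0], [3.0]]    # two topics, one service
print(mf_loss([(0, 0, 2.0)], P, Q, lam=0.5))
```

An optimizer such as ADAM would then adjust the entries of P and Q to reduce this value on the training set.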

The data set obtained in step S2 is data into which the geographic position information has been merged after service semantic modeling; the data set obtained in step S3 consists of the vectors to which the different words are mapped in the high-dimensional space after transfer learning; and S404 in step S4 performs the matrix decomposition training, finally yielding the user scoring matrix predicted by the matrix decomposition model.

Preferably, in step S5, the service recommendation is performed on the user, and the service recommendation result for the user is obtained in the estimated score matrix, which is implemented as follows:

S501, organizing a database from the collected information of different users; the specific implementation process comprises the following steps:

first, at the client, information that the user has consented to share and that is not sensitive is collected through a device such as a mobile phone;

then, the collected data are transmitted to the server, which filters out the valid data as in step S1, formats them, and obtains the scoring information and geographic position information of the different items in the user's history; meanwhile, the description information of the different items is captured from the network as in step S1;

finally, a database is built to store the scoring information and geographic position information of the different items in the user's history together with the description information of the different items captured from the network;

S502, based on the data in the database built in step S501, service semantic modeling is performed as in step S2 and transfer learning as in step S3 in turn, yielding more accurate results and the weight coefficients of the different services in the high-dimensional vector space;

S503, the predicted scores of the user for the different items (the score of user u for item i) are obtained through the data fusion matrix decomposition and sorted from high to low, a higher score indicating that the model predicts a higher degree of user interest in the item. The top few items are recommended as the service recommendation result.
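The ranking in step S503 amounts to sorting one user's row of the predicted scoring matrix and keeping the first few entries. In the sketch below, the function name `recommend` and the optional `already_used` filter are assumptions; the text does not state whether services the user has already used are excluded.

```python
def recommend(predicted_row, already_used=(), top_n=3):
    """Return indices of the top_n highest predicted scores, best first."""
    used = set(already_used)
    candidates = [(score, j) for j, score in enumerate(predicted_row) if j not in used]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [j for _, j in candidates[:top_n]]

scores_for_user = [3.1, 4.8, 2.0, 4.9]  # predicted score per service index
print(recommend(scores_for_user, top_n=2))
```

For the toy row above the two highest scores belong to services 3 and 1, in that order.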

A service recommendation system based on transfer learning comprises a data set establishing module, a service semantic modeling module, a transfer learning module, a matrix decomposition module and a service recommendation module;

the data set establishing module is configured to execute step S1, collecting data such as user and service description information; the service semantic modeling module is configured to execute step S2, preliminarily processing the data, including introducing the geographic position information, performing semantic modeling and obtaining the different topics; the transfer learning module is configured to execute step S3, migrating the preliminary semantic processing result in combination with information from different domains to obtain a more detailed description result; the matrix decomposition module is configured to execute step S4, fusing the user information with the semantic processing result, obtaining the initialized user-topic matrix, topic-service matrix and scoring matrix, and training in combination with the weight decay technique; the service recommendation module is configured to execute step S5, returning service recommendation results that may interest the user according to the scores of different users for different services.

The invention has the beneficial effects that:

1. The invention provides a service recommendation method based on transfer learning for the cold start environment, which provides stable service recommendation quality in a complex environment and a good service experience for users in a warm start environment.

2. When a user enters a new environment, the user does not need to upload all personal information to the cloud; good service recommendation quality can be obtained from only part of the information, which improves the security of user privacy.

3. By adopting semantic modeling and transfer learning, the invention can mine deeper semantic information from the original description information to compensate for data sparsity in a cold start environment; by combining initialization with the weight decay technique, it can, compared with traditional matrix decomposition, avoid overfitting to a certain extent and improve the training speed.

Drawings

FIG. 1 is a flow chart illustrating a method for service recommendation based on transfer learning;

FIG. 2 is a schematic diagram of a network structure of the GoogleNews model;

FIG. 3(a) is a schematic diagram of a root mean square error curve of a training set and a testing set using a weight decay technique;

FIG. 3(b) is a schematic diagram of the root mean square error curves of the training set and the testing set by using the naive MF technique.

Detailed Description

The service recommendation technique based on transfer learning according to the invention is described in detail below with reference to the drawings and the detailed description.

Example 1

A service recommendation method based on transfer learning, as shown in fig. 1, includes the following steps:

The relevant data collected on the network are divided into two parts: one part is the data set downloaded directly from http://www2.informatik.uni-freiburg.de/~cziegler/BX/, which contains user information, item information and the users' rating information for the different items; the other part is the corresponding description information captured from https://www.amazon.com according to the item numbering sequence in the data, which serves as supplementary information for the different items.

Specifically, the capture is performed by a Python script that simulates browser behavior using the Beautiful Soup and Selenium libraries. The script accesses the Amazon website at random intervals, retrieves the supplementary description at a specific position for a specific item number according to the tag information, and saves it to a file named after that number in a separate folder. During data filtering, data that conform to the format are retained and data that do not are removed.

The cross-domain prior information is the word vector information that Google obtained by training on GoogleNews data. Because the GoogleNews model was trained on GoogleNews data, words with similar relations in the GoogleNews data also retain similar relations in the high-dimensional space to a certain extent.

Optimization uses a loss function whose mathematical model is formula (IV); once the loss function is defined, matrix decomposition is performed with the ADAM optimizer in combination with the weight decay technique, finally yielding the decomposed user-topic matrix, topic-service matrix and scoring matrix.

Example 2

The service recommendation method based on the transfer learning in embodiment 1 is characterized in that:

in step S2, performing semantic modeling for the service, which includes the following specific steps:

S201, carrying out data set labeling work:

analyzing the text content information Info_c and the geographic position information Info_l in the data set using latent Dirichlet allocation. The text content information is the description information captured from the Amazon website in the data set established in step S1. The geographic position information is, in the directly downloaded part of the data set established in step S1, the geographic position from which the different users requested the different items.

Using the unsupervised LDA clustering algorithm to compute a topic model in which services are close to one another, and projecting the description information onto a vector space composed of several topics, wherein the geographic position information is one-hot encoded and treated as special vocabulary, and the special vocabulary is added to the text content information according to its frequency of occurrence;


S202, it is difficult to estimate parameters on the whole data set directly from the service description information and the original latent Dirichlet allocation, so Gibbs sampling is used for approximation, specifically: a service description d, parameters α and β, a number of topics K and geographic position information l are input, where α and β are the parameters required by latent Dirichlet allocation; service semantic modeling is performed; and a document-topic probability distribution matrix Φ and a topic-word probability distribution matrix Θ are obtained after the service semantic modeling. The document-topic probability distribution matrix Φ is one of the outputs of the LDA algorithm and describes the document-topic joint probability distribution, i.e. the sampling probability of each topic for the different documents. The topic-word probability distribution matrix Θ is likewise one of the outputs of the LDA algorithm and describes the topic-word joint probability distribution, i.e. the sampling probability of each word for the different topics.

In step S3, information in different fields is fused by using transfer learning, that is, cross-domain prior information and service semantic modeling information are combined and refined, and the specific steps are as follows:

The set of negative samples P(w_i) is computed with the empirical parameter ξ = 0.75, and the center vector is obtained from the K-means algorithm as in step S303.

The high-dimensional mapping vectors of the different words in the description information are obtained under the migration model f through word2vec.

Specifically, the different words in the description information are one-hot encoded, and the output after the hidden layer is taken as the word's representation in the high-dimensional space; that is, the words are embedded into the high-dimensional space, which is known in the industry as word2vec.

The specific implementation steps of step S4 are as follows:

S402, the topic-service matrix Q is initialized as Q = Φ^T; the latent factors of the topic-service matrix are all the parameters in the initialized topic-service matrix Q;

The r_u,i data are randomly sampled to obtain a training set and a test set, with 80% used for training and the rest for prediction. Formula (IV) is used as the loss function during training, and training is then carried out with a weight decay technique based on the ADAM optimization algorithm.

The data set obtained in step S2 is data into which the geographic position information has been merged after service semantic modeling; the data set obtained in step S3 consists of the vectors to which the different words are mapped in the high-dimensional space after transfer learning; and S404 in step S4 performs the matrix decomposition training, finally yielding the user scoring matrix predicted by the matrix decomposition model.

The effect of model training is evaluated with MAE (mean absolute error) and RMSE (root mean square error) as evaluation indexes. Under the TensorFlow machine learning framework, with Python as the development language, formula (IV) is used as the loss function during training, and training is carried out with a weight decay technique based on the ADAM optimization algorithm.
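The two evaluation indexes can be computed as in the following standard-library sketch; the function names and the toy ratings are illustrative and are not results from the experiments.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [5, 3, 4]
predicted = [4, 3, 2]
print(mae(actual, predicted), rmse(actual, predicted))
```

RMSE squares the errors before averaging, so it penalizes a few large mistakes more heavily than MAE does.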

In step S5, service recommendation is performed on the user, and a service recommendation result for the user is obtained in the estimated score matrix, which is specifically implemented as follows:

In the related test, shown in table 1, a cold start environment is simulated by selecting 10%, 20% and 30% of the data respectively as the training set and the rest as the verification set, and a comparison test is performed against conventional algorithms. Table 1 shows the comparison results in the cold start environment.

TABLE 1

In the related results, shown in table 2, a warm start environment is simulated by selecting 60%, 70% and 80% of the data respectively as the training set and the rest as the verification set, and a comparison test is performed against conventional algorithms; the method of the invention, TLMF, achieves the best result, i.e. the lowest value, under both the MAE and the RMSE evaluation. Table 2 shows the comparison results in the warm start environment.

TABLE 2

In general, the method of the invention provides better service recommendation quality for the user in both cold start and warm start environments.

As shown in fig. 3(a) and fig. 3(b), in the matrix decomposition training process the invention (fig. 3(a)) uses the weight decay technique; in fig. 3(a), curve a is the root mean square error (RMSE) of the training set and curve b is the RMSE of the test set. The naive MF (fig. 3(b)) uses neither the weight decay technique nor the latent factor initialization technique; in fig. 3(b), curve a is the RMSE of the training set and curve b is the RMSE of the test set. Comparing the two, the invention converges faster and more stably, while the naive MF shows clear overfitting (the results are good on the training set but poor on the validation set, so they do not generalize).

Example 3

A service recommendation system based on transfer learning is used for realizing the service recommendation method based on transfer learning in embodiment 1 or 2, and comprises a data set establishing module, a service semantic modeling module, a transfer learning module, a matrix decomposition module and a service recommendation module;

the data set establishing module is used for executing step S1, collecting data such as user and service description information; the service semantic modeling module is used for executing step S2, preliminarily processing the data, including introducing the geographic position information, performing semantic modeling and obtaining the different topics; the transfer learning module is used for executing step S3, migrating the preliminary semantic processing result in combination with information from different domains to obtain a more detailed description result; the matrix decomposition module is used for executing step S4, fusing the user information with the semantic processing result, obtaining the initialized user-topic matrix, topic-service matrix and scoring matrix, and training in combination with the weight decay technique; the service recommendation module is used for executing step S5, returning service recommendation results that may interest the user according to the scores of different users for different services.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A service recommendation method based on transfer learning, characterized in that it comprises the following steps:

S1. Establish a data set: collect relevant data on the Internet, filter the valid data in the data, and establish a data set;

S2. Service semantic modeling: extract geographic location information from the different information in the data set, perform cluster analysis, fuse it into the original information as a feature in an appropriate proportion, perform service semantic modeling, and output the document-topic probability distribution matrix and the topic-word probability distribution matrix; the specific implementation steps are as follows:

S201. Perform dataset labeling work:

Use latent Dirichlet allocation (LDA) to analyze the text content information Info_c and the geographic location information Info_l in the data set; use LDA to calculate the topic model closest to each service, and map the description information to a vector space composed of multiple topics, in which the geographic location information, after one-hot encoding, is regarded as a special word, and the special word is added to the text content information according to its frequency of occurrence; the added Top-N word l satisfies the following formula (I), and new description information Info is then generated:

In formula (I), ω, η and δ are manually set parameters that limit the range of γ; f_wd refers to the frequency, within the document, of the different words appearing in the document; the corresponding corpus-level term refers to the frequency of those words in the entire corpus; f_l is the Top-N geolocation count frequency of different services;

S202. Use Gibbs sampling for approximate inference, specifically: input the service description d, the parameters α and β, the number of topics K, and the geographic location information l, where α and β are the hyperparameters required by latent Dirichlet allocation (LDA); perform service semantic modeling, and obtain the document-topic probability distribution matrix φ and the topic-word probability distribution matrix Θ after service semantic modeling;

S3. Transfer learning: combine and refine the cross-domain prior information and the service semantic modeling information, where the service semantic modeling information is the document-topic probability distribution matrix and the topic-word probability distribution matrix output in step S2, and optimize with a loss function;

S4. Data fusion matrix decomposition: after obtaining the correction coefficient, determine the hidden factors of the user-topic matrix and the topic-service matrix, apply different weight decay techniques to the a priori determined hidden factors, and use the ADAM optimizer to combat overfitting, so as to alleviate data sparsity in a cold-start environment;

S5. Perform service recommendation for the user, and obtain the service recommendation result for the user from the estimated score matrix.
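Steps S201 and S202 can be sketched as below: geolocation labels are appended to each description as special tokens, and a collapsed Gibbs sampler then estimates the two LDA matrices. This is an illustrative sketch under stated assumptions, not the claimed implementation. Formula (I), which governs how often a geolocation token is added, is not reproduced in this text, so the fixed `geo_repeat` count is a hypothetical stand-in; the function names and default hyperparameters are likewise assumptions.

```python
import random


def fuse_geo_tokens(words, geo_labels, geo_repeat=2):
    """Append each geolocation label as a special token.

    geo_repeat is a hypothetical fixed count standing in for formula (I).
    """
    return list(words) + [f"GEO_{g}" for g in geo_labels for _ in range(geo_repeat)]


def lda_gibbs(docs, K, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (doc_topic, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    wid = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * K for _ in docs]      # topic counts per document
    nkw = [[0] * V for _ in range(K)]  # word counts per topic
    nk = [0] * K                       # total tokens per topic
    z = []                             # topic assignment of every token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v, k = wid[w], z[d][n]
                # Remove the token, then resample its topic from the conditional.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [(ndk[d][j] + alpha) * (nkw[j][v] + beta) / (nk[j] + V * beta)
                           for j in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][n] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    doc_topic = [[(ndk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
                 for d, doc in enumerate(docs)]
    topic_word = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
                  for k in range(K)]
    return doc_topic, topic_word, vocab
```

Here `doc_topic` plays the role of the document-topic matrix (denoted φ in the claim) and `topic_word` the role of the topic-word matrix (denoted Θ); each row is a probability distribution.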

2. The service recommendation method based on transfer learning according to claim 1, characterized in that, in step S3, transfer learning is used to fuse information from different domains, that is, the cross-domain prior information and the service semantic modeling information are combined and refined; the specific steps are as follows:

S301. Calculate the frequency f(w_j) of the different words in the different description information, where f(w_j) is the frequency with which word w_j appears in the whole corpus and j is the index of the word; calculate the negative sampling set P(w_i):

ξ is an empirical parameter, w_i is a word, and i and j are the indices of different words;

S302. Starting from the Google News model, fine-tune on the description information Info using the negative sampling technique to obtain a migration model f, and obtain the high-dimensional mapping vectors of the different words in the description information under the migration model f;

S303. Use the K-means algorithm to obtain the similarity and center vector of different words under the K-means algorithm, and calculate the correction coefficient w with the result of the service semantic modeling, as shown in formula (II):

In formula (II), dis() represents the distance function between vectors, m is the number of words in the class to which the word belongs, R is the maximum intra-class distance, and the center vector is the K-means center of that class.
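The quantities that formula (II) refers to (center vector, m, and R) can be computed as in the sketch below. Formula (II) itself is not reproduced in this text, so the correction coefficient w is deliberately not computed. The Euclidean metric and the reading of R as the largest member-to-center distance are assumptions, and all function names are hypothetical.

```python
import math
import random


def dis(a, b):
    """Euclidean distance between two vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dis(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def cluster_stats(vectors, centers, labels):
    """Per cluster: m (member count), R (largest member-to-center distance), center."""
    stats = {}
    for c, center in enumerate(centers):
        dists = [dis(v, center) for v, lab in zip(vectors, labels) if lab == c]
        stats[c] = {"m": len(dists), "R": max(dists, default=0.0), "center": center}
    return stats
```

In practice the input vectors would be the word embeddings obtained in step S302; small two-dimensional vectors are enough to check the logic.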

3. The service recommendation method based on transfer learning according to claim 2, wherein ξ = 0.75.
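The negative sampling distribution of step S301 can be illustrated as follows. The formula for P(w_i) is not reproduced in this text; the sketch assumes the standard word2vec form, in which P(w_i) is proportional to f(w_i) raised to the power ξ and normalized over all words, which is consistent with the symbols defined in claim 2 and with ξ = 0.75. The function name is hypothetical.

```python
from collections import Counter


def negative_sampling_distribution(corpus, xi=0.75):
    """Return {word: P(word)} with P(w_i) proportional to f(w_i) ** xi.

    Assumed form: P(w_i) = f(w_i)**xi / sum_j f(w_j)**xi (standard word2vec).
    """
    counts = Counter(w for doc in corpus for w in doc)
    weights = {w: c ** xi for w, c in counts.items()}
    total = sum(weights.values())
    return {w: weight / total for w, weight in weights.items()}
```

Raising the frequency to a power below 1 flattens the distribution, so rare words are drawn as negative samples more often than their raw frequency would suggest.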

4. The service recommendation method based on transfer learning according to claim 2, characterized in that the high-dimensional mapping vectors of the different words in the description information under the migration model f are obtained through word2vec.

5. The service recommendation method based on transfer learning according to claim 2, characterized in that the specific implementation steps of step S4 are as follows:

S401. Obtain an initialized user-topic matrix according to the correction coefficient w

f(s_j) is determined by the internal subject words in the different description documents and is the service score accumulated by those subject words; s_j is a service, h_i is the set of all services used by user i, and r_{i,j} are the user ratings for different services;

S402. The topic-service matrix Q is initialized as Q = φ^T; the implicit factors of the topic-service matrix are the parameters of the initialized topic-service matrix Q;

S403. In the scoring matrix, the missing values are initialized and determined according to the user-scoring information in the data set established in step S1, as shown in formula (III):

In formula (III), R_{i,j} is the rating of service j by user i, Φ_j is the document-topic probability distribution matrix φ output in step S202, h_i is the set of all services used by user i, f(s_k) is the service score accumulated by the internal subject words of the service s_k, s_k is a service, N_i is the number of services used by the user, R_{k,j} is the score of the service j by the user i, and U is the set of all users;

S404. Based on the weight decay technique, the ADAM optimizer is used to iteratively train the loss function; in the t-th training iteration, the decay coefficient takes a set form that determines, at different iterations, the influence of the pre-initialized information (the estimated score matrix R, the user-topic matrix P and the topic-service matrix Q) on the optimization objective;

dw_{A,u,i} is the decay coefficient of user u for item i during the t-th training; dr is the preset initial decay coefficient; t is the index of the iterative training step; r_{u,i} is user u's rating for item i;

In the matrix factorization of data fusion, the loss function minimized with the ADAM optimizer is shown in formula (IV):

In formula (IV), J(r) is the loss function in the form most commonly used in machine learning, the predicted term is the user rating matrix predicted by the data fusion matrix decomposition, λ is the regularization coefficient, P_{u,f} is the user-topic matrix determined in step S401, and Q_{f,i} is the topic-service matrix determined in step S402.
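The training of step S404 can be sketched as a matrix factorization optimized with Adam, starting from prior-initialized P and Q. Formula (IV) and the form of the decay coefficient dw are not reproduced in this text, so this sketch assumes the common objective of squared rating error plus a λ-weighted L2 penalty on P and Q, and it omits the per-rating decay coefficient entirely. The function name and all default hyperparameters are hypothetical.

```python
import math


def factorize(ratings, P, Q, lam=0.02, lr=0.05, epochs=200,
              b1=0.9, b2=0.999, eps=1e-8):
    """Adam on an assumed loss: squared error + lam * (|P|^2 + |Q|^2).

    ratings maps (user, item) -> rating. P (users x F) and Q (F x items) are
    updated in place from their prior initial values. Returns the loss history.
    """
    F = len(Q)
    params = [(P, u, f) for u in range(len(P)) for f in range(F)]
    params += [(Q, f, i) for f in range(F) for i in range(len(Q[0]))]
    m = [0.0] * len(params)  # first-moment estimates
    v = [0.0] * len(params)  # second-moment estimates
    history = []
    for t in range(1, epochs + 1):
        # Gradient of the L2 penalty, then add the data-fit gradient.
        gP = [[2 * lam * x for x in row] for row in P]
        gQ = [[2 * lam * x for x in row] for row in Q]
        loss = lam * (sum(x * x for row in P for x in row)
                      + sum(x * x for row in Q for x in row))
        for (u, i), r in ratings.items():
            err = sum(P[u][f] * Q[f][i] for f in range(F)) - r
            loss += err * err
            for f in range(F):
                gP[u][f] += 2 * err * Q[f][i]
                gQ[f][i] += 2 * err * P[u][f]
        history.append(loss)
        for n, (M, a, b) in enumerate(params):
            g = gP[a][b] if M is P else gQ[a][b]
            m[n] = b1 * m[n] + (1 - b1) * g
            v[n] = b2 * v[n] + (1 - b2) * g * g
            m_hat = m[n] / (1 - b1 ** t)  # bias-corrected moments
            v_hat = v[n] / (1 - b2 ** t)
            M[a][b] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return history
```

After training, the predicted rating of item i by user u is the dot product of row u of P with column i of Q; in the claimed method P and Q would come from steps S401 and S402 rather than from constant values.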

6. The service recommendation method based on transfer learning according to claim 1, wherein in step S5 service recommendation is performed for the user and the service recommendation result for the user is obtained from the estimated score matrix; the specific implementation steps are as follows:

S501. Organize the database according to the collected information of different users; the specific implementation process includes:

First, collect user information;

Then, filter the valid data as described in step S1, format the valid data, and obtain the rating information and geographic location information of different items in the user's historical information, together with the description information of the different items;

Finally, build a database to store the scoring information and geographic location information of different items in the user's historical information and the description information of different items crawled from the Internet;

S502. Based on the data in the database constructed in step S501, perform service semantic modeling through step S2 and then transfer learning through step S3;

S503. Obtain the user's predicted scores for different items by decomposing the data fusion matrix, sort the user's predicted scores for different items from high to low, and recommend the first few items as service recommendation results.
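The ranking in step S503 can be sketched as follows. The function name, the default of three recommendations, and the option to skip items the user has already used are assumptions made for illustration; the claim only specifies sorting predicted scores from high to low and taking the first few items.

```python
def recommend_top_n(predicted_scores, already_used=(), n=3):
    """Rank items by predicted score, highest first.

    already_used (hypothetical option) lists items to leave out of the result.
    """
    candidates = [(item, score) for item, score in predicted_scores.items()
                  if item not in already_used]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]
```

For example, `recommend_top_n({"a": 4.5, "b": 3.0, "c": 4.9}, n=2)` ranks item `c` ahead of item `a`.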

7. A service recommendation system based on transfer learning, characterized in that it is used to implement the service recommendation method based on transfer learning according to any one of claims 1-6, and comprises a data set establishment module, a service semantic modeling module, a transfer learning module, a matrix decomposition module, and a service recommendation module; the data set establishment module is used to execute step S1; the service semantic modeling module is used to execute step S2; the transfer learning module is used to execute step S3; the matrix decomposition module is used to execute step S4; and the service recommendation module is used to execute step S5.