CN113342963B - Service recommendation method and system based on transfer learning - Google Patents
- Publication number
- CN113342963B CN202110476286.0A
- Authority
- CN
- China
- Prior art keywords
- service
- information
- matrix
- user
- different
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/387—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention relates to a service recommendation method and system based on transfer learning, comprising the following steps. S1, establishing a data set: collecting related data, filtering out the valid data, and establishing a data set. S2, service semantic modeling: extracting the geographic location information from the different information in the data set, performing cluster analysis, fusing the geographic location information into the original information as a feature in a suitable proportion, and performing service semantic modeling. S3, transfer learning: combining and refining cross-domain prior information and the service semantic modeling information. S4, data fusion matrix decomposition: after the correction coefficients are obtained, determining the latent factors of a user-topic matrix and a topic-service matrix, applying different weight decay to the latent factors determined a priori, and using an Adam optimizer to resist overfitting. S5, performing service recommendation for the user. The method and system provide stable service recommendation quality in complex environments and a good service experience for users in a hot start environment.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a service recommendation method and system based on transfer learning.
Background
In recent years, as privacy problems have been exposed and users' security awareness has grown, people tend to transmit as little personal data to servers as possible, so a recommendation system can obtain only sparse user data. The system must nevertheless recommend suitable services to users in such an environment; this is the cold start problem.
In a cold start environment, only a small amount of user data (such as GPS, biometric fingerprint, phone, and historical data) can be acquired, so completing service recommendation for a user from the existing sparse information has become an urgent problem.
Disclosure of Invention
To address the defects of the prior art, the invention provides a service recommendation method based on transfer learning. It uses transfer learning to describe services when user data are sparse, thereby providing suitable service recommendations to users and improving service recommendation quality in a cold start environment.
The invention also provides a service recommendation system based on the transfer learning.
For the cold start problem, the invention is summarized as follows: location information is extracted from the web services, clustered, and one-hot encoded, and the encoded information is added to the service description. Semantic modeling is performed on the service description information, the extracted semantic model is transferred on the basis of GoogleNews, the similarity is calculated to obtain a correction coefficient, and matrix decomposition is performed. Service recommendation results are then obtained for different users.
Interpretation of terms:
1. Latent Dirichlet Allocation (LDA) is a topic model that gives the topics of each document in a document collection in the form of a probability distribution. It is an unsupervised learning algorithm: training needs no manually labeled training set, only the document collection and the specified number of topics k. A further advantage of LDA is that each topic can be described by a set of words. LDA was first proposed in 2003 by David M. Blei, Andrew Y. Ng, and Michael I. Jordan, and is currently used in text mining, including text topic identification, text classification, and text similarity calculation.
2. One-hot encoding uses an N-bit status register to encode N states; each state has its own independent register bit, and only one bit is active at any time. In FPGA or ASIC designs with abundant flip-flop resources, one-hot encoding preserves the circuit characteristics and takes full advantage of the large number of flip-flops.
3. Gibbs sampling is a Markov chain Monte Carlo (MCMC) algorithm used in statistics to approximately draw a sequence of samples from a multivariate probability distribution when direct sampling is difficult. The sequence can be used to approximate the joint distribution, the marginal distribution of a subset of the variables, or to compute an integral (such as the expected value of a variable). Some variables may be known, and these need not be sampled.
4. The GoogleNews model, shown in fig. 2, is a model trained by Google on GoogleNews data and is one of the models commonly used in natural language processing. Its vocabulary contains about three million distinct words, and each word is mapped to a 300-dimensional vector for use in various kinds of processing. The invention builds on the original model and trains with skip-gram, a technique that predicts the context from a given input word.
5. The K-means algorithm is a hard clustering algorithm and a typical prototype-based objective-function clustering method. It takes a distance from the data points to the prototypes as the objective function to be optimized and derives the update rule for its iterations by solving for the extremum of that function. K-means uses Euclidean distance as the similarity measure and seeks the optimal partition for the initial cluster center vectors V so that the evaluation index J is minimized. The algorithm uses the sum-of-squared-errors criterion as its clustering criterion function.
6. Weight decay, as used here, retains to a certain extent, at the initial stage of training, the information obtained after data fusion during matrix decomposition. It gives the optimizer an explicit optimization direction when optimizing the loss function and keeps that direction for part of the training process, so that the previously obtained information is not forgotten too quickly under the pull of the loss function's objective.
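As an illustration of terms 2 and 5, the following minimal sketch clusters a few geographic coordinates with a plain K-means loop and turns each cluster label into a one-hot vector and a location token. The coordinates, the `LOC_` token prefix, and the function names are assumptions made for this example and do not come from the patent.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm using squared Euclidean distance."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers

def one_hot(index, size):
    """Vector of length `size` with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]

# Hypothetical (latitude, longitude) pairs: two near one city, two near another.
coords = [(36.65, 117.12), (36.67, 117.10), (31.23, 121.47), (31.20, 121.50)]
labels, centers = kmeans(coords, k=2)
codes = [one_hot(lab, 2) for lab in labels]   # one-hot cluster codes
tokens = [f"LOC_{lab}" for lab in labels]     # special vocabulary for the description
print(tokens, codes)
```

Treating each cluster as a single token keeps nearby locations under one vocabulary entry, which is what allows the location to be mixed into a bag-of-words description.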
The technical scheme of the invention is as follows:
A service recommendation method based on transfer learning comprises the following steps:
S1, establishing a data set: collecting relevant data from the network, filtering out the valid data, and establishing a data set;
S2, service semantic modeling: extracting the geographic location information from the different information in the data set, performing cluster analysis, fusing the geographic location information into the original information as a feature in a suitable proportion, performing service semantic modeling, and outputting a document-topic probability distribution matrix and a topic-word probability distribution matrix;
S3, transfer learning: combining and refining cross-domain prior information and the service semantic modeling information, where the service semantic modeling information is the document-topic probability distribution matrix and the topic-word probability distribution matrix output in step S2. In this step, information from the GoogleNews model domain is transferred into the local description information data set. The step mainly uses fine-tuning based on GoogleNews; during transfer, negative sampling is used for word embedding to accelerate convergence and avoid overfitting, and the result is finally optimized with a loss function;
S4, data fusion matrix decomposition: after the correction coefficients are obtained, determining the latent factors of the user-topic matrix and the topic-service matrix, applying different weight decay to the latent factors determined a priori, and using an Adam optimizer to resist overfitting, thereby relieving data sparsity in a cold start environment;
S5, performing service recommendation for the user: obtaining the service recommendation result for the user from the estimated scoring matrix.
Preferably, in step S2, the service semantic modeling comprises the following specific steps:
S201, labeling the data set:
The text content information Info_c and the geographic location information Info_l in the data set are analyzed using latent Dirichlet allocation. The unsupervised LDA clustering algorithm computes a topic model in which similar services lie close to one another and projects the description information onto a vector space spanned by several topics. The geographic location information is one-hot encoded and treated as special vocabulary, which is added to the text content information according to its frequency of occurrence;
The added Top-N vocabulary l formally satisfies formula (I), and new description information Info is then generated:
In formula (I), the coefficient γ is determined by the formula; ω, η, and δ are parameters set manually in the program to limit the range of γ; l_i is the frequency, within a document, of the different terms appearing in that document; f_wd is the frequency, in the whole corpus, of the different words appearing in the document; f_l is the Top-N geographic location count frequency of the different services;
S202, it is difficult to estimate the parameters over the whole data set directly from the service description information and the original latent Dirichlet allocation, so Gibbs sampling is used for approximation. Specifically: the inputs are a service description d, the parameters α and β required by latent Dirichlet allocation, the number of topics K, and the geographic location information l; service semantic modeling is performed, yielding a document-topic probability distribution matrix Φ and a topic-word probability distribution matrix Θ.
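The Gibbs-sampling approximation in S202 can be sketched with a textbook collapsed Gibbs sampler for LDA. This is a generic illustration, not the patent's implementation: the toy documents, the `LOC_` location tokens, the hyperparameter values, and the function name `lda_gibbs` are all assumptions. The returned `doc_topic` and `topic_word` matrices play the roles the text assigns to Φ and Θ.

```python
import random

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents.

    Returns (doc_topic, topic_word, vocab); both matrices are nested lists
    whose rows are probability distributions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    V, D, K = len(vocab), len(docs), n_topics

    ndk = [[0] * K for _ in range(D)]  # topic counts per document
    nkw = [[0] * V for _ in range(K)]  # word counts per topic
    nk = [0] * K                       # total token count per topic
    z = []                             # topic assignment of every token

    # Random initialisation of the topic assignments.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
        z.append(zd)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v, k = wid[w], z[d][n]
                # Remove the current token from all counts.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                # Unnormalised full conditional p(z = j | all other assignments).
                weights = [
                    (ndk[d][j] + alpha) * (nkw[j][v] + beta) / (nk[j] + V * beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][n] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1

    # Posterior mean estimates from the final counts.
    doc_topic = [[(ndk[d][k] + alpha) / (len(docs[d]) + K * alpha) for k in range(K)]
                 for d in range(D)]
    topic_word = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
                  for k in range(K)]
    return doc_topic, topic_word, vocab

# Hypothetical service descriptions with location tokens already merged in.
docs = [
    "hotel room booking LOC_0".split(),
    "hotel booking breakfast LOC_0".split(),
    "flight ticket airline LOC_1".split(),
    "airline flight baggage LOC_1".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2)
for row in doc_topic:
    print([round(p, 3) for p in row])
```

Because the sampler only resamples one token's topic at a time from count tables, it avoids the intractable joint estimation that the text describes as difficult to carry out directly.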
Preferably, in step S3, information from different domains is fused by transfer learning, that is, cross-domain prior information and service semantic modeling information are combined and refined, with the following specific steps:
S301, calculating the frequency f(w_j) of the different words in the different description information, where f(w_j) is the frequency with which word w_j occurs in the whole corpus and j is the index of the word. f(w_j) has logically the same meaning as f_wd above, but because the description information has changed (step S201), the symbol f(w_j) is used instead.
Computing the negative sample set P(w_i), where ξ is an empirical parameter, w_i is a word, and i, j are indices of different words;
S302, fine-tuning the GoogleNews model on the description information Info using negative sampling to obtain a migration model f, and obtaining under f the high-dimensional mapping vectors of the different words in the description information.
The migration model f is based on skip-gram and negative sampling; it is the transferred model over the GoogleNews data set and the new description information data set, obtained after a certain number of iterations.
S303, using the K-means algorithm to obtain the similarity and the center vectors of the different words. The center vectors are one of the outputs of the K-means algorithm; different words have different center vectors, and the similarity of different words is the distance between their vectors. The correction coefficient w is calculated in combination with the result of the service semantic modeling, as shown in formula (II):
In formula (II), the correction coefficient w weighs the different topic words in the service description in order to determine f(s); dis() is a vector distance function, which may be cosine distance, Euclidean distance, or the like; m_t is the number of words in the cluster to which the word belongs; R is the maximum distance within the cluster; and the center vector is the one output by K-means for that cluster.
More preferably, ξ is 0.75;
Further preferably, the high-dimensional mapping vectors of the different words in the description information are obtained under the migration model f through word2vec. Specifically, the different words in the description information are one-hot encoded, and the hidden-layer output is taken as the representation of each word in the high-dimensional space; that is, the words are embedded into the high-dimensional space, which is known in the industry as word embedding.
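The symbols defined in S301 (word frequencies f(w_j), an empirical exponent ξ preferably equal to 0.75) match the negative-sampling distribution commonly used with word2vec, P(w_i) = f(w_i)^ξ / Σ_j f(w_j)^ξ. The formula itself does not survive in this text, so the sketch below assumes that standard form; the token list and function names are illustrative only.

```python
from collections import Counter
import random

def negative_sampling_distribution(tokens, xi=0.75):
    """Assumed standard form: P(w_i) = f(w_i)**xi / sum_j f(w_j)**xi."""
    counts = Counter(tokens)
    weights = {w: c ** xi for w, c in counts.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def draw_negatives(dist, positive, n, rng):
    """Draw n negative words, excluding the positive (true context) word."""
    words = [w for w in dist if w != positive]
    weights = [dist[w] for w in words]
    return rng.choices(words, weights=weights, k=n)

# Hypothetical corpus: one frequent word and one rare word.
tokens = ["hotel"] * 16 + ["sauna"]
dist = negative_sampling_distribution(tokens)
print(dist)  # the rare word gets more mass than its raw frequency of 1/17
print(draw_negatives(dist, positive="hotel", n=3, rng=random.Random(0)))
```

Raising frequencies to a power below 1 flattens the distribution, so rare words are drawn as negatives more often than their raw frequency would allow.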
Preferably, step S4 is implemented as follows:
S401, obtaining the initialized user-topic matrix P_i from the correction coefficient w, where f(s_j), the service score accumulated from the topic words inherent in the different description documents, is determined by those topic words; s_j is a service; h_i is the set of all services used by user i; and r_{i,j} is the user's rating of the different services;
S402, the topic-service matrix Q is initialized as Q = Φ^T; the latent factors of the topic-service matrix are all the parameters in the initialized topic-service matrix Q;
S403, in the scoring matrix, the missing values are initialized from the user rating information in the data set established in step S1. Because a user rates only some of the items, the other items are unrated; the missing values are the ratings the user has not given. This is shown in formula (III):
In formula (III), R_{i,j} is the score of user i for service j; Φ_j is taken from the document-topic probability distribution matrix Φ output in step S202; h_i is the set of all services used by user i; f(s_k) is the service score accumulated from the topic words inherent in service s_k; s_k is a service; N_i is the number of services used by the user; R_{k,j} is the score of user k for service j; and U is the set of all users;
S404, based on weight decay, the information brought by transfer learning is retained to a certain extent in the initial stage of matrix decomposition. When the Adam optimizer performs the t-th iteration of training on the loss function, a decay coefficient determines how strongly the pre-initialized information, namely the estimated scoring matrix R, the user-topic matrix P, and the topic-service matrix Q, influences the optimization target at that iteration;
The r_{u,i} data are randomly sampled into a training set and a test set: 80% are used for training and the rest for prediction. Formula (IV) is used as the loss function in training, and training then applies weight decay on top of the Adam optimization algorithm.
dw_{t,u,i} is the decay coefficient of user u for item i at the t-th training iteration; dr is the preset initial decay coefficient; t is the training step of the t-th iteration; r_{u,i} is the score of user u for item i;
In the data fusion matrix decomposition, the loss function used with the Adam optimizer is shown in formula (IV):
In formula (IV), J(r) denotes the loss in the form most common in machine learning; the predicted user scoring matrix is the output of the data fusion matrix decomposition; λ is the regularization coefficient; P_{u,f} is the user-topic matrix determined in step S401; and Q_{f,i} is the topic-service matrix determined in step S402.
After service semantic modeling in step S2, the data set incorporates the geographic location information; after transfer learning in step S3, it is mapped to vectors of the different words in the high-dimensional space; and after the matrix decomposition training of S404 in step S4, the user scoring matrix predicted by the matrix decomposition model is finally obtained.
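The general idea of S404 can be sketched as matrix factorisation trained with Adam, where an extra penalty pulls P and Q toward their initial values and fades as training proceeds. Formula (IV) and the decay formula are not reproduced in this text, so everything specific here is an assumption: the squared-error loss with L2 regularisation, the anchoring weight `dr ** t`, the hyperparameters, the random priors standing in for the S401/S402 initialisation, and the toy ratings.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted score: dot product of user row u of P and item column i of Q."""
    return sum(P[u][f] * Q[f][i] for f in range(len(Q)))

def rmse(ratings, P, Q):
    return math.sqrt(sum((predict(P, Q, u, i) - r) ** 2 for u, i, r in ratings)
                     / len(ratings))

def train_mf(ratings, P0, Q0, lam=0.02, dr=0.9, lr=0.05, epochs=200):
    """Full-batch Adam on squared error + L2 + a fading pull toward (P0, Q0).

    The anchoring weight dr**t is an illustrative schedule only; it is not
    the decay formula of the patent.
    """
    P = [row[:] for row in P0]
    Q = [row[:] for row in Q0]
    F = len(Q0)
    params, priors = (P, Q), (P0, Q0)
    m = [[[0.0] * len(row) for row in M] for M in params]  # first moments
    v = [[[0.0] * len(row) for row in M] for M in params]  # second moments
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, epochs + 1):
        anchor = dr ** t
        # Gradients of the regularisation and prior-anchoring terms.
        grads = [
            [[2 * lam * M[a][b] + 2 * anchor * (M[a][b] - M0[a][b])
              for b in range(len(M[a]))] for a in range(len(M))]
            for M, M0 in zip(params, priors)
        ]
        gP, gQ = grads
        # Gradients of the squared error on the observed ratings.
        for u, i, r in ratings:
            err = predict(P, Q, u, i) - r
            for f in range(F):
                gP[u][f] += 2 * err * Q[f][i]
                gQ[f][i] += 2 * err * P[u][f]
        # Adam update with bias correction.
        for M, G, mm, vv in zip(params, grads, m, v):
            for a in range(len(M)):
                for b in range(len(M[a])):
                    mm[a][b] = b1 * mm[a][b] + (1 - b1) * G[a][b]
                    vv[a][b] = b2 * vv[a][b] + (1 - b2) * G[a][b] ** 2
                    m_hat = mm[a][b] / (1 - b1 ** t)
                    v_hat = vv[a][b] / (1 - b2 ** t)
                    M[a][b] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return P, Q

rng = random.Random(0)
cells = [(u, i) for u in range(3) for i in range(3)]
ratings = [(u, i, float(r)) for (u, i), r in zip(cells, [5, 3, 1, 4, 3, 1, 2, 2, 5])]
rng.shuffle(ratings)
cut = int(0.8 * len(ratings))                 # 80% for training, the rest held out
train, test = ratings[:cut], ratings[cut:]
P0 = [[rng.uniform(0.1, 1.0) for _ in range(2)] for _ in range(3)]  # stand-in prior
Q0 = [[rng.uniform(0.1, 1.0) for _ in range(3)] for _ in range(2)]  # stand-in prior
P, Q = train_mf(train, P0, Q0)
print("train RMSE before/after:", rmse(train, P0, Q0), rmse(train, P, Q))
```

Early in training the anchoring term dominates and keeps the factors near their informed starting point; as it fades, the data term takes over, which is the behaviour the text attributes to its weight decay.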
Preferably, in step S5, service recommendation is performed for the user and the service recommendation result is obtained from the estimated scoring matrix, as follows:
S501, organizing a database from the information collected for different users, specifically:
First, at the client, information that is non-sensitive and that the user has agreed to share is collected through a device such as a mobile phone;
Then, the collected data are transmitted to the server, which filters out the valid data as in step S1, formats them, and obtains the rating information and geographic location information of the different items in the user's history; meanwhile, description information for the different items is captured from the network as in step S1;
Finally, a database is built to store the rating information and geographic location information of the different items in the user's history together with the description information of those items captured from the network;
S502, based on the data in the database built in step S501, service semantic modeling is performed as in step S2 and then transfer learning as in step S3, yielding more accurate results and the weight coefficients of the different services in the high-dimensional vector space.
S503, the data fusion matrix decomposition yields the predicted scores of the user for the different items (the score of user u for item i), which are sorted from high to low; a higher score means the model predicts a higher degree of user interest in the item. The top few items are recommended as the service recommendation result.
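The ranking in S503 amounts to sorting one user's row of the predicted scoring matrix and keeping the top entries. The sketch below assumes a plain list of predicted scores; skipping items the user has already rated is an added assumption, not something the text states.

```python
def recommend(predicted_row, already_rated=(), top_n=3):
    """Return the indices of the top_n highest-scoring items for one user."""
    candidates = [
        (score, item)
        for item, score in enumerate(predicted_row)
        if item not in already_rated
    ]
    # Sort by score from high to low; ties are broken by the lower item index.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in candidates[:top_n]]

# Hypothetical predicted scores of one user u for five items.
scores_u = [4.2, 1.0, 3.7, 4.9, 2.5]
print(recommend(scores_u, already_rated={3}, top_n=2))
```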
A service recommendation system based on transfer learning comprises a data set establishing module, a service semantic modeling module, a transfer learning module, a matrix decomposition module, and a service recommendation module;
The data set establishing module is configured to execute step S1: it collects data such as user information and service description information. The service semantic modeling module is configured to execute step S2: it preprocesses the data, including introducing geographic location information, performing semantic modeling, and obtaining the different topics. The transfer learning module is configured to execute step S3: it transfers the preliminary semantic processing result in combination with information from different domains to obtain a more detailed description. The matrix decomposition module is configured to execute step S4: it fuses the user information with the semantic processing results, obtains the initialized user-topic matrix, topic-service matrix, and scoring matrix, and trains them with weight decay. The service recommendation module is configured to execute step S5: it returns service recommendation results likely to interest the user according to the scores of different users for different services.
The invention has the beneficial effects that:
1. The invention provides a service recommendation method based on transfer learning for the cold start environment; it provides stable service recommendation quality in complex environments and a good service experience for users in a hot start environment.
2. When a user enters a new environment, the user's personal information need not all be uploaded to the cloud; good service recommendation quality can be obtained from only part of the information, which improves the security of user privacy.
3. By adopting semantic modeling and transfer learning, the invention mines deeper semantic information from the original description information to compensate for data sparsity in a cold start environment; by combining initialization with weight decay, it avoids overfitting to a certain extent and improves training speed compared with traditional matrix decomposition.
Drawings
FIG. 1 is a flow chart illustrating a method for service recommendation based on transfer learning;
FIG. 2 is a schematic diagram of a network structure of the GoogleNews model;
FIG. 3(a) is a schematic diagram of a root mean square error curve of a training set and a testing set using a weight decay technique;
FIG. 3(b) is a schematic diagram of the root mean square error curves of the training set and the testing set by using the naive MF technique.
Detailed Description
A service recommendation technique based on transfer learning according to the invention is described in detail below with reference to the drawings and the detailed description.
Example 1
A service recommendation method based on transfer learning, as shown in fig. 1, comprises the following steps:
S1, establishing a data set: collecting relevant data from the network, filtering out the valid data, and establishing a data set;
The data collected from the network fall into two parts. One part is the data set downloaded directly from http://www2.informatik.uni-freiburg.de/~cziegler/BX/, which contains user information, item information, and the users' rating information for different items. The other part is the description information captured from https://www.amazon.com, following the order of the item numbers in that data, as supplementary information for the different items.
Specifically, capture uses a Python script that simulates browser behavior with the Beautiful Soup and selenium libraries and accesses the Amazon website at random intervals. The supplementary description at a specific position under a specific number is retrieved according to the tag information, saved to a file named after the corresponding number, and placed in a separate folder. When filtering, the data that conform to the format are retained and the data that do not are removed.
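The format filter described above might look like the following sketch, which keeps only well-formed rating records. The three-column, semicolon-separated, quoted layout and the 0 to 10 rating range are assumptions about the downloaded file, and `filter_ratings` is a hypothetical helper name; the text itself only says that non-conforming data are removed.

```python
import csv

def filter_ratings(lines, min_rating=0, max_rating=10):
    """Keep rows of the assumed form "user-id";"item-id";"rating"."""
    kept = []
    for row in csv.reader(lines, delimiter=";", quotechar='"'):
        if len(row) != 3:
            continue  # wrong number of fields (also drops empty lines)
        user, item, rating = (field.strip() for field in row)
        if not (user.isdigit() and item and rating.isdigit()):
            continue  # header row or non-numeric ids/ratings
        if not min_rating <= int(rating) <= max_rating:
            continue  # rating outside the assumed scale
        kept.append((int(user), item, int(rating)))
    return kept

# Hypothetical sample lines, including a header and three malformed records.
sample = [
    '"User-ID";"ISBN";"Book-Rating"',
    '"276725";"034545104X";"0"',
    '"276726";"0155061224"',
    '"276727";"0446520802";"11"',
    '"abc";"0446520802";"5"',
    '"276729";"052165615X";"3"',
]
rows = filter_ratings(sample)
print(rows)
```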
S2, service semantic modeling: extracting the geographic location information from the different information in the data set, performing cluster analysis, fusing the geographic location information into the original information as a feature in a suitable proportion, performing service semantic modeling, and outputting a document-topic probability distribution matrix and a topic-word probability distribution matrix;
S3, transfer learning: combining and refining cross-domain prior information and the service semantic modeling information, where the service semantic modeling information is the document-topic probability distribution matrix and the topic-word probability distribution matrix output in step S2. In this step, information from the GoogleNews model domain is transferred into the local description information data set. The step mainly uses fine-tuning based on GoogleNews; during transfer, negative sampling is used for word embedding to accelerate convergence and avoid overfitting, and the result is finally optimized with a loss function;
The cross-domain prior information is the word vector information that Google obtained by training on GoogleNews data. Because the GoogleNews model was trained on GoogleNews data, words that are similar in the GoogleNews data also remain similar, to a certain extent, in the high-dimensional space.
Optimizing with a loss function means taking formula (IV) as the mathematical model of the loss. Once the loss function is defined, matrix decomposition is performed using the Adam optimizer combined with weight decay, finally yielding the decomposed user-topic matrix, topic-service matrix, and scoring matrix.
S4, data fusion matrix decomposition: after the correction coefficients are obtained, determining the latent factors of the user-topic matrix and the topic-service matrix, applying different weight decay to the latent factors determined a priori, and using an Adam optimizer to resist overfitting, thereby relieving data sparsity in a cold start environment;
S5, performing service recommendation for the user: obtaining the service recommendation result for the user from the estimated scoring matrix.
Example 2
The service recommendation method based on the transfer learning in embodiment 1 is characterized in that:
in step S2, performing semantic modeling for the service, which includes the following specific steps:
s201, carrying out data set labeling work:
analyzing textual content information Info in a dataset using latent dirichlet distributioncAnd geographical location information Infol(ii) a The text content information refers to the description information captured from amazon website in the data set created in step S1. The geographical location information refers to the geographical location of the data set directly downloaded from the data set established in step S1 when different users request different items.
Calculating a topic model with each service close to each other by using an implicit Dirichlet distribution unsupervised clustering algorithm, and projecting and mapping description information to a vector space consisting of a plurality of topics, wherein the geographic position information is treated as a special vocabulary through one-hot coding, and the special vocabulary is added to the text content information according to the occurrence frequency;
the added Top-N vocabulary l formally satisfies the following formula (I), and then generates a new description information Info:
in the formula (I), the compound is shown in the specification,the gamma coefficient is determined by a formula, omega, eta, delta are parameters manually set in a program to limit the range of gamma, liRefers to the frequency of different terms appearing in a document in the document; f. ofwdThe frequency of different words appearing in the document in the whole corpus is referred to; f. oflTop-N geographic location count frequencies for different services;
it is difficult to estimate parameters on the whole data set directly using the service description information and the original implicit dirichlet S202. The approximate processing is carried out by using Gibbs sampling, specifically comprising the following steps: inputting a service description d, parameters alpha and beta, a theme number K and geographical position information l, wherein the parameters alpha and beta are parameters required in implicit Dirichlet distribution, performing service semantic modeling, and obtaining a document-theme probability distribution matrix phi and a theme-word probability distribution matrix theta after the service semantic modeling. The document-topic probability distribution matrix phi is one of the LDA algorithm outputs and describes the document-topic joint probability distribution, that is, the sampling probability of different documents corresponding to each topic. The topic-term probability distribution matrix Θ, which is one of the LDA algorithm outputs, describes the topic-term joint probability distribution, i.e. the sampling probability that different topics correspond to respective terms.
In step S3, information in different fields is fused by using transfer learning, that is, cross-domain prior information and service semantic modeling information are combined and refined, and the specific steps are as follows:
S301, calculating the frequency f(w_j) of the different words in the different description information, where f(w_j) is the frequency of occurrence of the word w_j in the whole corpus and j is the index of the word; f(w_j) has logically the same meaning as f_wd above, but because the description information has changed (step S201), the symbol f(w_j) is used instead.
Computing the negative-sampling distribution P(w_i): ξ is an empirical parameter, w_i is a word, and i and j are indices of different words;
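The negative-sampling step can be sketched as follows. The formula for P(w_i) is not reproduced in this text, so the sketch assumes the usual word2vec form, P(w_i) = f(w_i)^ξ / Σ_j f(w_j)^ξ, which matches the symbols defined above and the value ξ = 0.75 given in the description. The function names and the toy frequencies are hypothetical.

```python
import random

def negative_sampling_distribution(freqs, xi=0.75):
    """Assumed form: P(w_i) = f(w_i)**xi / sum_j f(w_j)**xi.

    freqs maps each word to its corpus frequency f(w).
    """
    weights = {w: f ** xi for w, f in freqs.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def draw_negatives(freqs, n, xi=0.75, seed=0):
    """Draw n negative-sample words according to P(w_i)."""
    p = negative_sampling_distribution(freqs, xi)
    rng = random.Random(seed)
    return rng.choices(list(p), weights=list(p.values()), k=n)

# Hypothetical toy frequencies: 16**0.75 = 8 and 1**0.75 = 1,
# so "hotel" receives probability 8/9 instead of the raw 16/17.
p = negative_sampling_distribution({"hotel": 16, "spa": 1})
```

Raising the frequencies to a power below 1 flattens the distribution, so rare words are drawn as negatives more often than their raw frequency would allow.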
S302, starting from the GoogleNews model, fine-tuning on the description information Info using negative sampling to obtain a migration model f, and obtaining the high-dimensional mapping vectors of the different words in the description information under the migration model f;
The migration model f is based on skip-gram with negative sampling; after a certain number of iterations, a migration model covering both the GoogleNews data set and the new description information data set is obtained.
S303, using the K-means algorithm to obtain the similarity and the center vectors of the different words, where the center vector is one of the outputs of the K-means algorithm, different words now have different center vectors, and the similarity of different words is the distance between their vectors. A correction coefficient w is then calculated in combination with the result of the service semantic modeling, as shown in formula (II):
In formula (II), the correction coefficient w weights the different topic words in the service description in order to determine f(s); dis() denotes a vector distance function, for example the cosine distance or the Euclidean distance; m_t is the number of words in the class to which the word belongs; R is the maximum distance within the class; and the center vector is the K-means center of that class.
where the empirical parameter ξ = 0.75;
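The clustering in S303 can be sketched with a plain Lloyd's K-means over word vectors. The sketch only computes the quantities named in the explanation of formula (II): the center vector of each class, the number of words in the class (m) and the maximum in-class distance (R). Formula (II) itself is not reproduced here, so the correction coefficient w is not computed. The Euclidean distance is one of the two options the text mentions, and all function names are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """dis(): Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(vectors, centers):
    """Index of the nearest center for every vector."""
    return [min(range(len(centers)), key=lambda c: euclidean(v, centers[c]))
            for v in vectors]

def kmeans(vectors, k, n_iter=50, seed=0):
    """Lloyd's K-means. Returns (labels, centers)."""
    centers = [list(v) for v in random.Random(seed).sample(vectors, k)]
    for _ in range(n_iter):
        labels = assign(vectors, centers)
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(vectors, centers), centers

def cluster_stats(vectors, labels, centers):
    """Per class: center vector, word count m, and maximum in-class distance R."""
    stats = []
    for c, center in enumerate(centers):
        dists = [euclidean(v, center) for v, lab in zip(vectors, labels) if lab == c]
        stats.append({"center": center, "m": len(dists), "R": max(dists, default=0.0)})
    return stats
```

In practice the input vectors would be the word embeddings produced under the migration model f; the toy two-dimensional vectors used to exercise the code are only for illustration.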
The high-dimensional mapping vectors of the different words in the description information under the migration model f are obtained through word2vec. Specifically, the different words in the description information are one-hot encoded, and the output of the hidden layer is taken as the representation of each word in the high-dimensional space; that is, the words are embedded into a high-dimensional space, a technique known in the industry as word2vec.
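The relationship between one-hot encoding and the hidden-layer output can be shown in a few lines. Because the skip-gram hidden layer is linear, multiplying a one-hot vector by the input weight matrix simply selects one row of that matrix, and that row is the word's embedding. The vocabulary and weight values below are hypothetical toy data.

```python
def one_hot(index, size):
    """One-hot vector with a 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def hidden_layer(x, W):
    """Linear hidden layer h = x @ W, with W of shape V x N (no activation)."""
    return [sum(x[v] * W[v][n] for v in range(len(W))) for n in range(len(W[0]))]

vocab = {"hotel": 0, "beach": 1, "sanya": 2}  # hypothetical toy vocabulary
W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]      # hypothetical 3 x 2 input weights

# The product selects the row of W that belongs to "beach".
embedding = hidden_layer(one_hot(vocab["beach"], len(vocab)), W)
```

This is why trained word2vec models are normally queried by row lookup rather than by an explicit matrix product.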
The specific implementation steps of step S4 are as follows:
S401, obtaining the initialized user-topic matrix P_i according to the correction coefficient w: f(s_j) is determined by the topic words inherent in the different description documents and is the service score accumulated over those inherent topic words; s_j is a service; h_i is the set of all services used by user i; and r_{i,j} is the user's rating of the different services;
S402, the topic-service matrix Q is initialized as Q = Φ^T; the latent factors of the topic-service matrix are all the parameters in the initialized topic-service matrix Q;
S403, in the rating matrix, the missing values are initialized according to the user-rating information in the data set established in step S1; because a user rates only some of the items, the remaining items are unrated, and the missing values are the ratings the user has not given, as shown in formula (III):
In formula (III), R_{i,j} is the rating of service j by user i; Φ_j is taken from the document-topic probability distribution matrix Φ output in step S202; h_i is the set of all services used by user i; f(s_k) is the service score accumulated over the topic words inherent in service s_k; s_k is a service; N_i is the number of services used by the user; R_{k,j} is the rating of service j by user k; and U is the set of all users;
S404, based on the weight decay technique, the information brought by transfer learning is retained to a certain extent in the initial stage of matrix decomposition; an ADAM optimizer is used for the iterative training of the loss function, and at the t-th training step the decay determines how strongly the pre-initialized information, including the pre-estimated rating matrix R, the user-topic matrix P and the topic-service matrix Q, influences the optimization target in the different iterations;
The data r_{u,i} are randomly sampled to obtain a training set and a test set, with 80% used for training and the rest for prediction. Formula (IV) is used as the loss function during training, and training is carried out with the weight decay technique based on the ADAM optimization algorithm.
dw_{t,u,i} is the decay coefficient for user u and item i at the t-th training step; dr is the preset initial decay coefficient; t is the index of the training step; and r_{u,i} is the rating of item i by user u;
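The random 80/20 sampling of the ratings described above can be sketched as follows. The function name, the triple layout `(user, item, rating)` and the fixed seed are illustrative assumptions.

```python
import random

def split_ratings(ratings, train_fraction=0.8, seed=0):
    """Randomly split (user, item, rating) triples into train and test lists."""
    shuffled = list(ratings)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

The same helper covers the cold-start and warm-start protocols reported later by passing a `train_fraction` of 0.1, 0.2 or 0.3, or of 0.6, 0.7 or 0.8.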
In the decomposition of the data fusion matrix, the loss function optimized with the ADAM optimizer is shown in formula (IV):
In formula (IV), J(r) denotes the loss function in the form most commonly used in machine learning; the predicted user rating matrix is the one produced by the data fusion matrix decomposition; λ is the regularization coefficient; P_{u,f} is the user-topic matrix determined in step S401; and Q_{f,i} is the topic-service matrix determined in step S402.
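Formula (IV) is not reproduced in this text. The sketch below therefore assumes the common regularized squared-error objective for matrix decomposition, J = Σ (r_{u,i} − Σ_f P_{u,f} Q_{f,i})² + λ(‖P‖² + ‖Q‖²), and it replaces the ADAM optimizer and the decay coefficient dw_{t,u,i} with plain stochastic gradient descent to stay self-contained. It illustrates the shape of the computation only, and every function name is hypothetical.

```python
def predict(P, Q, u, i):
    """Predicted rating: sum over latent factors f of P[u][f] * Q[f][i]."""
    return sum(P[u][f] * Q[f][i] for f in range(len(Q)))

def loss(ratings, P, Q, lam):
    """Assumed objective: squared error on observed ratings plus L2 penalty."""
    err = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    reg = sum(v * v for row in P for v in row) + sum(v * v for row in Q for v in row)
    return err + lam * reg

def sgd_epoch(ratings, P, Q, lam=0.01, lr=0.01):
    """One pass of plain SGD (a stand-in for ADAM), updating P and Q in place.

    The L2 penalty is applied once per observed rating, as is common in
    SGD-based matrix factorization.
    """
    for u, i, r in ratings:
        e = r - predict(P, Q, u, i)
        for f in range(len(Q)):
            p_uf, q_fi = P[u][f], Q[f][i]
            P[u][f] += lr * (e * q_fi - lam * p_uf)
            Q[f][i] += lr * (e * p_uf - lam * q_fi)
```

In the method described here, P and Q would start from the initializations of steps S401 and S402 rather than from arbitrary values, which is what lets the transferred information shape the early iterations.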
The data set obtained in step S2 is the data incorporating geographic location information after service semantic modeling; the data set obtained in step S3 consists of the vectors to which the different words are mapped in the high-dimensional space after transfer learning; and S404 of step S4 trains the matrix decomposition, finally yielding the user rating matrix predicted by the matrix decomposition model.
The effect of model training is evaluated with MAE (mean absolute error) and RMSE (root mean square error) as the evaluation indices. Under the TensorFlow machine learning framework, with Python as the programming language, formula (IV) is used as the loss function during training, and training is performed with the weight decay technique based on the ADAM optimization algorithm.
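The two evaluation indices have standard definitions and can be computed as in this minimal sketch; the function names are illustrative.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length rating sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error between two equal-length rating sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

Both are computed only over the held-out ratings; RMSE penalizes large individual errors more heavily than MAE, so the two are usually reported together.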
In step S5, service recommendation is performed on the user, and a service recommendation result for the user is obtained in the estimated score matrix, which is specifically implemented as follows:
S501, organizing a database from the information collected from different users; the specific implementation process comprises the following steps:
first, at the client, non-sensitive information that the user has agreed to share is collected through a device such as a mobile phone;
then, the collected data are transmitted to the server, which filters out the valid data as in step S1, formats them, and obtains the rating information and geographic location information of the different items in the user's history; meanwhile, the description information of the different items is crawled from the network as in step S1;
finally, a database is built to store the rating information and geographic location information of the different items in the user's history, together with the description information of the different items crawled from the network;
S502, based on the data in the database constructed in step S501, the service semantic modeling of step S2 and the transfer learning of step S3 are performed in sequence, yielding more accurate results and the weight coefficients of the different services in the high-dimensional vector space;
S503, the predicted ratings of the different items by the user (the rating of item i by user u) are obtained through the decomposition of the data fusion matrix and sorted from high to low, a higher rating meaning that the model predicts a higher degree of interest of the user in the item. The first few items are recommended as the service recommendation result.
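The ranking in S503 can be sketched as a sort over one user's row of the predicted rating matrix. Skipping services the user has already used is an assumption added for the example, since the text does not state it, and the function and parameter names are hypothetical.

```python
def recommend_top_n(predicted_row, used_services, n=3):
    """Rank services by predicted rating, high to low, and return the top n ids.

    predicted_row: predicted ratings of one user for every service.
    used_services: set of service ids to skip (an assumption of this sketch).
    """
    candidates = [(score, j) for j, score in enumerate(predicted_row)
                  if j not in used_services]
    # Sort by descending score; break ties by the smaller service id.
    candidates.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in candidates[:n]]
```

Passing an empty set for `used_services` reproduces the plain high-to-low ordering described in the step.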
In the related tests, the method was compared with conventional algorithms in a cold-start environment, in which 10%, 20% and 30% of the data were selected in turn as the training set and the rest as the validation set. Table 1 shows the comparative results in the cold-start environment.
TABLE 1
In a warm-start environment, 60%, 70% and 80% of the data were selected in turn as the training set and the rest as the validation set, and the method was again compared with conventional algorithms. As shown in Table 2, the method of the present invention (TLMF) achieved the best result, that is, the lowest error, under both the MAE and the RMSE evaluation. Table 2 shows the comparative results in the warm-start environment.
TABLE 2
In general, the method of the invention provides better service recommendation quality for the user in both cold-start and warm-start environments.
As shown in Fig. 3(a) and Fig. 3(b), during matrix decomposition training the present invention (Fig. 3(a)) uses the weight decay technique, whereas naive MF (Fig. 3(b)) uses neither the weight decay technique nor the latent factor initialization technique. In both figures, curve a is the root mean square error (RMSE) on the training set and curve b is the RMSE on the test set. Comparing the two, the method of the invention converges faster and more stably, while naive MF shows clear overfitting (that is, the results are good on the training set but poor on the validation set, so they do not generalize).
Example 3
A service recommendation system based on transfer learning is used for realizing the service recommendation method based on transfer learning in embodiment 1 or 2, and comprises a data set establishing module, a service semantic modeling module, a transfer learning module, a matrix decomposition module and a service recommendation module;
the data set establishing module is used to execute step S1, collecting data such as user information and service description information; the service semantic modeling module is used to execute step S2, performing preliminary processing of the data, including introducing the geographic location information, performing semantic modeling and obtaining the different topics; the transfer learning module is used to execute step S3, transferring the preliminary semantic processing result in combination with information from different domains to obtain a more detailed description result; the matrix decomposition module is used to execute step S4, fusing the user information with the semantic processing results, decomposing to obtain the initialized user-topic matrix, topic-service matrix and rating matrix, and training with the weight decay technique; and the service recommendation module is used to execute step S5, returning the service recommendation results that may interest the user according to the ratings of the different services by the different users.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110476286.0A CN113342963B (en) | 2021-04-29 | 2021-04-29 | Service recommendation method and system based on transfer learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113342963A CN113342963A (en) | 2021-09-03 |
CN113342963B true CN113342963B (en) | 2022-03-04 |
Family
ID=77469141
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110476286.0A Active CN113342963B (en) | 2021-04-29 | 2021-04-29 | Service recommendation method and system based on transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113342963B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114329231A (en) * | 2021-12-31 | 2022-04-12 | 北京百度网讯科技有限公司 | Object feature processing method, device, electronic device and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109241440A (en) * | 2018-09-29 | 2019-01-18 | 北京工业大学 | It is a kind of based on deep learning towards implicit feedback recommended method |
CN109471982A (en) * | 2018-11-21 | 2019-03-15 | 南京邮电大学 | A QoS-aware Web Service Recommendation Method Based on User and Service Clustering |
CN110807154A (en) * | 2019-11-08 | 2020-02-18 | 内蒙古工业大学 | A recommendation method and system based on a hybrid deep learning model |
CN110990600A (en) * | 2019-12-04 | 2020-04-10 | 腾讯科技(深圳)有限公司 | Multimedia file recommendation method, multimedia file recommendation device, multimedia file parameter adjustment device, multimedia file recommendation medium and electronic equipment |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102591915B (en) * | 2011-12-15 | 2013-09-11 | 南京大学 | Recommending method based on label migration learning |
AU2017214469A1 (en) * | 2016-02-01 | 2018-07-26 | Prevencio, Inc. | Diagnostic and prognostic methods for cardiovascular diseases and events |
CN108363804B (en) * | 2018-03-01 | 2020-08-21 | 浙江工业大学 | Local model weighted fusion Top-N movie recommendation method based on user clustering |
CN108629010B (en) * | 2018-05-07 | 2022-03-18 | 南京大学 | Web service recommendation method based on theme and service combination information |
CN110399742B (en) * | 2019-07-29 | 2020-12-18 | 深圳前海微众银行股份有限公司 | A training and prediction method and device for a federated transfer learning model |
CN110955775A (en) * | 2019-11-11 | 2020-04-03 | 南通大学 | A picture book recommendation method based on implicit query |
CN110968675B (en) * | 2019-12-05 | 2023-03-31 | 北京工业大学 | Recommendation method and system based on multi-field semantic fusion |
CN112307351A (en) * | 2020-11-23 | 2021-02-02 | 中国科学院计算技术研究所 | Model training and recommending method, device and equipment for user behavior |
Also Published As
Publication number | Publication date |
---|---|
CN113342963A (en) | 2021-09-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | |
Effective date of registration: 20250523 Address after: Licheng Alexander Road in Ji'nan City, Shandong province 250100 No. 27 Patentee after: SHANDONG University Country or region after: China Patentee after: SHANDONG MASS INSTITUTE OF INFORMATION TECHNOLOGY Address before: Licheng Alexander Road in Ji'nan City, Shandong province 250199 No. 27 Patentee before: SHANDONG University Country or region before: China |