Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 shows a flowchart of a personalized recommendation method based on temporal entropy according to an embodiment of the present invention. The method includes:
101. clustering the existing user and item rating information to obtain the cluster centers and the user and item information contained in each cluster;
102. if the current user is determined not to be a newly added system user, calculating the time entropy of the current user for items according to the relevant attribute feature information of the current user and the user's different rating records for the items;
103. determining the user type of the current user according to the calculated time entropy of the current user for the items;
104. performing pattern matching against the cluster centers according to the user type of the current user, so as to recommend diverse item information to the current user.
Preferably, the method further comprises:
if the current user is determined to be a newly added system user, recommending the top N ranked items to the current user using the following item popularity calculation:
where i_pop is the item popularity, U_i denotes the set of users in the recommendation system who have rated item i, and I is the number of all items in the recommendation system.
Preferably, according to the user interest preference, the user type of the current user is determined to be one of the following four types: recently like and past like, recently like but past dislike, recently dislike but past like, recently dislike and past dislike.
Preferably, the performing pattern matching in the cluster center of the cluster according to the user type of the current user so as to perform diversity item information recommendation for the current user includes:
performing pattern matching in the following two interest preference patterns of the cluster center of the cluster according to the user type of the current user: a regular interest preference mode and a non-regular interest preference mode; wherein the regular interest preference pattern is liked recently and liked in the past; the irregular interest preference pattern comprises: recently liked but not liked in the past, recently disliked but liked in the past, recently disliked and not liked in the past.
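The mapping from the four user types to the two interest preference patterns can be sketched as follows. This is a minimal illustration only: the text derives the user type from the time entropy, whereas this sketch assumes a simpler rule (mean score above a hypothetical `like_threshold`, split at a hypothetical `recent_cutoff` timestamp) purely to show how the four labels and the regular/irregular patterns relate. The names `Rating`, `user_type`, `like_threshold` and `recent_cutoff` are assumptions and do not come from the original.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    score: float    # rating on the 1-5 scale
    timestamp: int  # any monotonically increasing time value


def user_type(ratings, like_threshold=3.0, recent_cutoff=0):
    """Return (user_type, pattern) for one user's rating history.

    Assumption: a period counts as "liked" when its mean score exceeds
    like_threshold; a period without ratings counts as "disliked".
    """
    recent = [r.score for r in ratings if r.timestamp >= recent_cutoff]
    past = [r.score for r in ratings if r.timestamp < recent_cutoff]

    def likes(scores):
        return bool(scores) and sum(scores) / len(scores) > like_threshold

    recent_label = "RecentLikes" if likes(recent) else "RecentDislikes"
    past_label = "PastLikes" if likes(past) else "PastDislikes"
    utype = f"{recent_label}-{past_label}"
    # Only "liked recently and liked in the past" is the regular pattern.
    pattern = "regular" if utype == "RecentLikes-PastLikes" else "irregular"
    return utype, pattern


# Hypothetical history: low score in the past, high scores recently.
history = [Rating(2, 1), Rating(5, 10), Rating(4, 12)]
result = user_type(history, like_threshold=3.0, recent_cutoff=5)
```

Here `result` pairs the type label with the pattern used for matching against the cluster centers.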
Preferably, the time entropy of the current user for items is calculated according to the relevant attribute feature information of the current user and the user's different rating records for the items by using the following time entropy formula:
where H is the time entropy, n is the number of time intervals, score_i is the user's rating of item i, m_i is the sum of the ratings of all users for item i in a time interval, and M is the sum of all m_i.
Corresponding to the above method embodiment, FIG. 2 shows a schematic structural diagram of a personalized recommendation device based on temporal entropy according to an embodiment of the present invention. The device includes:
the clustering unit 21, configured to cluster the existing user and item rating information to obtain the cluster centers and the user and item information contained in each cluster;
the calculating unit 22, configured to calculate the time entropy of the current user for items according to the relevant attribute feature information of the current user and the user's different rating records for the items if it is determined that the current user is not a newly added system user;
the judging unit 23, configured to determine the user type of the current user according to the calculated time entropy of the current user for the items;
and the matching unit 24, configured to perform pattern matching against the cluster centers according to the user type of the current user, so as to recommend diverse item information to the current user.
Preferably, FIG. 3 shows a schematic structural diagram of another personalized recommendation device based on temporal entropy according to an embodiment of the present invention. In addition to the clustering unit 21, the calculating unit 22, the judging unit 23 and the matching unit 24, the device further includes:
a recommending unit 25, configured to recommend the top N ranked items to the current user, if it is determined that the current user is a newly added system user, using the following item popularity calculation:
where i_pop is the item popularity, U_i denotes the set of users in the recommendation system who have rated item i, and I is the number of all items in the recommendation system.
Preferably, the judging unit 23 is specifically configured to determine, according to the user interest preference, that the user type of the current user is one of the following four types: recently like and past like, recently like but past dislike, recently dislike but past like, recently dislike and past dislike.
Preferably, the matching unit 24 is specifically configured to perform pattern matching in the following two interest preference patterns of the cluster center of the cluster according to the user type of the current user: a regular interest preference mode and a non-regular interest preference mode; wherein the regular interest preference pattern is liked recently and liked in the past; the irregular interest preference pattern comprises: recently liked but not liked in the past, recently disliked but liked in the past, recently disliked and not liked in the past.
Preferably, the calculating unit 22 is specifically configured to calculate the time entropy of the current user for items, according to the relevant attribute feature information of the current user and the user's different rating records for the items, by using the following time entropy formula:
where H is the time entropy, n is the number of time intervals, score_i is the user's rating of item i, m_i is the sum of the ratings of all users for item i in a time interval, and M is the sum of all m_i.
The technical scheme of the embodiment of the invention has the following beneficial effects. Recommendations are made to the user with the personalized recommendation method based on time entropy. On the one hand, clustering the user and item information and calculating item popularity improves recommendation efficiency and accuracy and addresses the user cold-start problem. On the other hand, calculating the time entropy of the user for items makes effective use of the user's multiple interests, which improves the diversity of the recommendation method.
The above embodiment of the present invention provides a personalized recommendation method based on temporal entropy (PTCF) for making recommendations to a user. The method first clusters the existing user and item rating information in the recommendation system to obtain the corresponding cluster centers and the information of the different clusters. It then determines whether the target user is a newly added system user. If not, it calculates the time entropy of the user for items according to the user's relevant attribute feature information and the user's different rating records for the items, determines which type the user belongs to, and performs pattern matching for the target user in the recommendation system, so that diversity recommendation is performed for the user using an item diversity calculation formula. Otherwise, it recommends the top N ranked items to the target user using the item popularity calculation.
The main recommendation mechanism of the PTCF method is as follows: the recommendation algorithm not only uses clustering-based collaborative filtering, but also takes into account that a user's interests change with various factors, and therefore performs diversity recommendation for the user. The method thus avoids the inefficiency of computing recommendations over all rating information, and it also exploits the diversified information in the users' item ratings, thereby effectively improving recommendation accuracy and recommendation diversity.
The main content of the embodiment of the invention is the research and application of the time-entropy-based personalized recommendation method PTCF. It mainly comprises clustering the original user and item rating information, calculating the time entropy of the target user's ratings for different items, classifying the target user and performing pattern matching, and making the corresponding recommendation depending on whether the target user is a newly added system user. The technical scheme adopted is as follows: 1) clustering the user and item rating information with the RLPSO_KM clustering algorithm to obtain the cluster centers and the corresponding cluster information; 2) introducing the definition of time entropy, calculating the time entropy for different items of target users who have already joined the system, and then classifying and pattern matching the target users so that effective recommendations are made to them; 3) making recommendations to target users newly added to the system by calculating item popularity.
The technical scheme adopted by the embodiment of the invention is a personalized recommendation method based on time entropy, and the method comprises the following implementation steps:
(1) Clustering the user and item rating information. First, the users' rating information for items is preprocessed. Then the processed information is clustered with RLPSO_KM, an improved K-means clustering algorithm; this involves setting a series of parameters such as the initial cluster centers, the initial number of clusters and the number of clustering iterations. Finally, the cluster centers and the user and item information contained in each cluster are output.
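The clustering step can be illustrated with a small sketch. The text does not give the details of RLPSO_KM, so the code below uses plain Lloyd's K-means as a stand-in and should not be read as the improved algorithm itself. The function name `kmeans`, the seeding rule (the first k points) and the sample rating vectors are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means (a stand-in, not RLPSO_KM).

    points: list of equal-length rating vectors, one per user.
    Returns (centers, assignment), where assignment[i] is the cluster
    index of points[i].
    """
    # Assumption: seed the centers with the first k points.
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        for idx, p in enumerate(points):
            assignment[idx] = min(
                range(k), key=lambda c: math.dist(p, centers[c])
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment


# Hypothetical user rating vectors over four items (scores 1-5).
users = [[5, 4, 1, 1], [4, 5, 1, 2], [1, 1, 5, 4], [2, 1, 4, 5]]
centers, assignment = kmeans(users, k=2)
```

The returned `centers` play the role of the cluster centers, and `assignment` gives the users contained in each cluster.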
(2) Personalized recommendation based on temporal entropy. The interest preferences of different users tend to be different, with some users always enjoying the same type of movie, and the opposite for others, with their favorite types of movies possibly changing with their mood, friends and fashion trends.
Here, we classify a user's interest preference as like (Likes) or dislike (Dislikes) according to the user's ratings of items. Likewise, measured in time, preferences are divided into recent (Recent) and past (Past), so the user's interest preferences fall into four categories. The first type is RecentLikes-PastLikes. The interest preferences of such users are regular: they generally like only one type of movie and keep liking it for a long time without changing. The second type is RecentLikes-PastDislikes and the third type is RecentDislikes-PastLikes. The interest preferences of these users change over time. The last type is RecentDislikes-PastDislikes. The items rated by these users show a diversified trend and appear random and irregular.
By classifying users, we group the user types into two modes: the first mode (users of the first type) and the second mode (users of the second, third and fourth types). The rating values and time information associated with the items indicate which mode a user belongs to. For example, if a user frequently watches the same type of movie, we consider that user to follow the first mode. This mode is simple, and movies of the same type tend to be evenly distributed. Conversely, a user in the second mode follows no regular pattern. Inspired by this, a novel pattern mining method based on time entropy is provided. The method measures the time distribution of a user's historical rating records, and the time entropy is defined as follows:
In formula (1), n is the number of time intervals, score_i is the user's rating of item i, m_i is the sum of the ratings of all users for item i in a time interval, and M is the sum of all m_i.
Usually, an item has several tags, and the item can be described simply by those tags. For a movie, the tags may represent the type of the movie. A user may rate the same tag at different times, i.e. a tag may belong to several rating periods. Time entropy measures the degree of disorder of the timestamps, with a higher value indicating that the user prefers the tag.
Table 1 contains the rating information of 4 users for 8 items. u_i (i = 1, 2, 3, 4) denotes the set of users, and i_k (k = 1, 2, ..., 8) denotes the set of items. Each rating is between 1 and 5, and the table lists the number of times each user accessed an item and the rating information of the tags to which the item belongs.
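The idea that time entropy measures how a user's ratings of a tag are spread over time intervals can be sketched as below. The exact formula (1) is not reproduced in the text, so this sketch assumes a plain Shannon entropy with weights p_i = m_i / M and ignores the score_i factor; it is an approximation for illustration, not the formula of the method. The function name `time_entropy` and the sample data are hypothetical.

```python
import math


def time_entropy(interval_scores):
    """Shannon-style entropy of scores spread over n time intervals.

    interval_scores[i] stands for m_i, the summed score in interval i.
    Assumption: p_i = m_i / M with M = sum of all m_i; base-2 logarithm.
    """
    total = sum(interval_scores)  # M
    if total == 0:
        return 0.0
    entropy = 0.0
    for m in interval_scores:
        if m > 0:  # empty intervals contribute nothing
            p = m / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical data with 5 time intervals.
spread_out = time_entropy([4, 4, 4, 0, 0])     # ratings in 3 intervals
concentrated = time_entropy([12, 0, 0, 0, 0])  # all ratings in 1 interval
```

Under these assumptions, a user whose ratings of a tag recur across several intervals gets a higher value than a user who rated it in a single interval, which matches the reading that a higher time entropy indicates a stronger preference for the tag.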
Table 1 User-item rating information
As shown in Table 2, there are 5 time intervals. User u_1 has 3 time distribution records, and user u_2 also has 3 time distribution records.
Table 2 User ratings for tag_1
But which of user u_1 and user u_2 prefers tag_1? Their time entropies are as follows:
Comparing the two time entropies shows that user u_1 prefers tag_1 more than user u_2 does. This is consistent with the fact that user u_1 rated tag_1 higher than user u_2 did.
Diversity recommendation has attracted a lot of attention; it balances the similarity and difference of the items in the item list to improve recommendation efficiency. One common way of measuring diversity is to maximize the sum of the differences between items. Thus, for a given item, we define its diversity formula as follows:
In formula (2), fr(i) denotes the frequency with which tag_1 occurs in the set, z(i) is the sum of the similarities between tag_1 and all tags in the candidate set, and α is the adjustment coefficient, which is set to 2 here.
In formula (4), r_ui and r_uj denote the ratings of user u for item i and item j, respectively; the two mean terms are the average rating of item j over all users who rated item j and the average rating of item i over all users who rated item i; U_ij is the set of users who rated both item i and item j; and sim(i, j) takes a value in [-1, 1].
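The quantities defined for formula (4) match a Pearson-style correlation between two items. Since the formula itself is not reproduced in the text, the sketch below is an assumption built only from those definitions: deviations from each item's mean rating, summed over the users in U_ij who rated both items. The function name `item_similarity` and the sample ratings are hypothetical.

```python
import math


def item_similarity(ratings, i, j):
    """Pearson-style similarity between items i and j.

    ratings: dict mapping user -> dict mapping item -> score.
    Each item mean is taken over all users who rated that item; the sums
    run over U_ij, the users who rated both items.
    """
    raters_i = [u for u in ratings if i in ratings[u]]
    raters_j = [u for u in ratings if j in ratings[u]]
    if not raters_i or not raters_j:
        return 0.0
    mean_i = sum(ratings[u][i] for u in raters_i) / len(raters_i)
    mean_j = sum(ratings[u][j] for u in raters_j) / len(raters_j)
    common = [u for u in raters_i if j in ratings[u]]  # U_ij
    num = sum(
        (ratings[u][i] - mean_i) * (ratings[u][j] - mean_j) for u in common
    )
    den_i = math.sqrt(sum((ratings[u][i] - mean_i) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings[u][j] - mean_j) ** 2 for u in common))
    if den_i == 0 or den_j == 0:
        return 0.0  # assumption: undefined similarity is reported as 0
    return num / (den_i * den_j)


# Hypothetical ratings of three items by three users.
sample = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 1, "b": 2, "c": 5},
    "u3": {"a": 3, "b": 3, "c": 3},
}
sim_ab = item_similarity(sample, "a", "b")  # items rated alike
sim_ac = item_similarity(sample, "a", "c")  # items rated oppositely
```

By the Cauchy-Schwarz inequality the returned value stays within [-1, 1], in line with the stated range of sim(i, j).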
(3) Calculating item popularity. For target users newly added to the system, the N items that are most popular among users are recommended. We define the popularity formula for items in the recommendation system as follows:
In formula (5), U_i denotes the set of users in the recommendation system who have rated item i, and I is the set of all items in the recommendation system.
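The popularity-based recommendation for new users can be sketched as follows. Formula (5) is not reproduced in the text, so the sketch assumes the natural reading of the definitions, popularity = |U_i| / |I|, where |I| is taken here as the number of items that have at least one rating. The function name `top_n_popular` and the sample data are hypothetical.

```python
def top_n_popular(ratings, n):
    """Return (top_items, popularity) for a cold-start recommendation.

    ratings: dict mapping user -> dict mapping item -> score.
    Assumption: popularity of item i is |U_i| / |I|.
    """
    raters = {}  # item -> set of users who rated it (U_i)
    for user, items in ratings.items():
        for item in items:
            raters.setdefault(item, set()).add(user)
    total_items = len(raters)  # |I|
    popularity = {
        item: len(users) / total_items for item, users in raters.items()
    }
    # Sort by descending popularity; ties are broken by item name.
    ranked = sorted(popularity, key=lambda it: (-popularity[it], it))
    return ranked[:n], popularity


# Hypothetical ratings: item "a" has 3 raters, "b" has 2, "c" has 1.
sample = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4},
    "u3": {"a": 2, "b": 1, "c": 5},
}
top_items, popularity = top_n_popular(sample, n=2)
```

Because |I| is the same for every item, dividing by it does not change the ranking; it only scales the popularity values.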
In summary, the user-item rating information used by the collaborative filtering algorithm is clustered, the time entropy of the user for items is calculated, the user's mode is determined, and diversity recommendation is then performed for the user; for users newly added to the system, the top N items are recommended. In this way, on the one hand, the recommendation efficiency of the recommendation system is improved by computing the user's neighborhood within a cluster; on the other hand, calculating the time entropy of the user for items better reflects the user's interests, compensates for the lack of variety of traditional recommendation systems, and gives the recommendation system better diversity.
In order to make the objects, technical solutions and features of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. The overall flow chart of the personalized recommendation method based on the temporal entropy is shown in fig. 4.
The individual steps are illustrated below:
(1) The user and rating information is clustered with the RLPSO_KM clustering algorithm, which improves the recommendation efficiency of the recommendation system.
(2) A personalized recommendation method based on time entropy is provided, and a time entropy calculation is introduced so that the diversified interests of users are better reflected.
(3) An item popularity calculation is provided for making recommendations to newly added system users, which addresses the cold-start problem in the recommendation system.
The experimental environment was as follows:
The application example of the invention verifies the actual effect of the personalized recommendation method based on time entropy through experiments. The experimental environment is a Windows 7 (64-bit) host with 8 GB of memory and a 1 TB hard disk. The data sets are MovieLens (10M) and data crawled from Douban.com. The average of 10 runs of the algorithm is taken as the final experimental result to verify recommendation accuracy, and the diversity of the personalized recommendation method based on time entropy is compared with that of other diversity recommendation methods.
First, the application example of the present invention clusters the rating information of users and items. The rating information is vectorized or converted into features so that it meets the clustering input format. The processed information is then clustered with the improved clustering algorithm RLPSO_KM, which involves setting a series of parameters such as the initial cluster centers, the initial number of clusters and the number of clustering iterations. Finally, the center of each cluster and the user and item information contained in each cluster are output.
Next, the personalized recommendation method based on time entropy is established. The time entropy of the user for items is calculated with the time entropy calculation, the user is then classified and pattern matched, and finally diversity recommendation is performed for the user. The recommendation efficiency of this algorithm is compared with that of the other algorithms in FIG. 5 and FIG. 6.
Finally, item popularity is calculated. For new users who have just joined the system, the user cold-start problem is addressed by calculating item popularity and recommending the N most popular items.
The comparison in FIG. 5 between the proposed method and other algorithms clearly shows that, on both the MovieLens and Douban data sets, the proposed PTCF achieves higher accuracy than the PMF, BPMF, SVD and McoC algorithms; when N is 10, its accuracy is 1.04% and 1.07% higher than that of the McoC algorithm on the MovieLens and Douban data sets, respectively. Moreover, whether the Top-N value is 10, 20 or 50, the MAP value of the proposed method is clearly higher than that of the other algorithms, which also illustrates the advantage of the PTCF method in recommendation efficiency.
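For readers unfamiliar with the Top-N metrics mentioned here, the following sketch shows one common way to compute precision at N and mean average precision (MAP). The text does not state which exact definitions were used in the experiments, so the normalization chosen below (average precision divided by min(|relevant|, N)) is an assumption, and all function names and sample data are hypothetical.

```python
def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    top = recommended[:n]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def average_precision_at_n(recommended, relevant, n):
    """Average of the precision values at each rank where a hit occurs."""
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:n], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    # Assumption: normalize by the best achievable number of hits.
    denom = min(len(relevant), n)
    return score / denom if denom else 0.0


def mean_average_precision(recs_by_user, relevant_by_user, n):
    """Mean of average_precision_at_n over all users with recommendations."""
    if not recs_by_user:
        return 0.0
    total = sum(
        average_precision_at_n(recs, relevant_by_user.get(user, set()), n)
        for user, recs in recs_by_user.items()
    )
    return total / len(recs_by_user)


# Hypothetical recommendation list and ground truth for one user.
recs = {"u1": ["a", "b", "c", "d"]}
truth = {"u1": {"a", "c"}}
p_at_4 = precision_at_n(recs["u1"], truth["u1"], 4)
map_at_4 = mean_average_precision(recs, truth, 4)
```

With this data, hits occur at ranks 1 and 3, so the average precision combines the precision values 1/1 and 2/3.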
FIG. 6 compares recommendation lists of different lengths and shows that, regardless of the data set and of whether the number of recommended items is 10, 20 or 50, the diversity of the proposed PTCF method remains essentially stable and does not change significantly. Compared with the CUTA algorithm, when N is 10 the diversity of the proposed personalized recommendation method is improved by 4 times on the MovieLens data set, and it is improved by 1.05% compared with the CUTATime algorithm.
Taken together, the experiments show that the application example of the invention makes recommendations to the user with the personalized recommendation method based on time entropy. On the one hand, clustering the user and item information and calculating item popularity improves recommendation efficiency and accuracy and addresses the user cold-start problem. On the other hand, calculating the time entropy of the user for items makes effective use of the user's multiple interests, which improves the diversity of the recommendation method.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification or the claims is intended to mean a "non-exclusive or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside in different components in a user terminal.
In one or more exemplary designs, the functions described above in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, such computer-readable media can include, but is not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Additionally, any connection is properly termed a computer-readable medium, and, thus, is included if the software is transmitted from a website, server, or other remote source via a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wirelessly, e.g., infrared, radio, and microwave. Disk and disc, as used herein, include compact discs, laser discs, optical discs, DVDs, floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above may also be included in the computer-readable medium.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.