Personalized News Recommendation Based On Click Behavior
Jiahui Liu, Peter Dolan, Elin Rønby Pedersen
Google Inc.
1600 Amphitheatre Parkway, Mountain View, CA 94043, USA
{jiahui, peterdolan, elinp}@google.com
ABSTRACT
Online news reading has become very popular as the web provides access to news articles from millions of sources around the world. A key challenge of news websites is to help users find the articles that are interesting to read. In this paper, we present our research on developing a personalized news recommendation system in Google News. For users who are logged in and have explicitly enabled web history, the recommendation system builds profiles of users’ news interests based on their past click behavior. To understand how users’ news interests change over time, we first conducted a large-scale analysis of anonymized Google News users’ click logs. Based on the log analysis, we developed a Bayesian framework for predicting users’ current news interests from the activities of that particular user and the news trends demonstrated in the activity of all users. We combine the information filtering mechanism using learned user profiles with an existing collaborative filtering mechanism to generate personalized news recommendations. The combined method was deployed in Google News. Experiments on the live traffic of the Google News website demonstrated that the combined method improves the quality of news recommendation and increases traffic to the site.

Author Keywords
Personalization, user modeling, news trend.

ACM Classification Keywords
H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CHI 2009, April 4–9, 2009, Boston, MA, USA.
Copyright 2009 ACM 978-1-60558-246-7/08/04…$5.00

INTRODUCTION
News reading has changed with the advance of the World Wide Web, from the traditional model of news consumption via physical newspaper subscription to access to thousands of sources via the internet. News aggregation websites, like Google News and Yahoo! News, collect news from various sources and provide an aggregate view of news from around the world. A critical problem with news service websites is that the volume of articles can be overwhelming to users. The challenge is to help users find news articles that are interesting to read.

Information filtering is a technology that responds to this challenge of information overload in general. Based on a profile of user interests and preferences, systems recommend items that may be of interest or value to the user. Information filtering plays a central role in recommender systems, as it is able to recommend information that has not been rated before and accommodates the individual differences between users [3, 8]. Information filtering has been applied in various domains, such as email [16], news [4, 5, 20], and web search [15, 18]. In the domain of news, this technology particularly aims at aggregating news articles according to user interests and creating a “personal newspaper” for each user.

An accurate profile of users’ current interests is critical for the success of information filtering systems. Some systems [1, 19] require users to manually create and update profiles. This approach places an extra burden on users, something very few are willing to take on. Instead, systems can construct profiles automatically from users’ interaction with the system.

In this paper, we describe our research on developing a personalized news recommendation system based on profiles learned from user activity in Google News. The Google News website, available at http://news.google.com, is one of the most popular news websites in the world, receiving millions of page views and clicks from users around the world.

The nature of news reading makes news information filtering distinct from information filtering in other domains. When visiting a news website, the user is looking for new information, information that she did not know before, that may even surprise her. Since user profiles are inferred from past user activity, it is important to know how users’ news interests change over time and how effective it
would be to use the past user activities to predict their future behavior.

To understand this issue, we conducted a large-scale log analysis of Google News users to measure the stability of users’ news interests. We found that their interests do vary over time but follow the aggregate trend of news events. Based on these findings, we develop a Bayesian model to predict the news interests of an individual user from the activities of that particular user and the news trend demonstrated in the activities of a group of users. To recommend news stories to users, the system takes into account the genuine interests of the individual user and the current news trend. Therefore, the user will receive news tailored to her interests without missing the important news events, even when those events do not strictly match the user’s particular interests.

We combined the news information filtering method with the collaborative filtering method previously developed for Google News [7] to generate personalized recommendations for news access. The combined method was evaluated in a live experiment: a subset of the live traffic at Google News used the combined method; the result showed significant improvement over the existing collaborative filtering method. The experiment on live traffic also revealed a number of interesting issues related to recommendations, serendipitous exploration and user satisfaction. We will discuss these issues later in this paper.

The contribution of this paper is three-fold. First, we report a large-scale log analysis of the consistency of users’ news interests. Second, we propose a novel method for predicting a user’s news interests based on click behavior, which combines the genuine interests of the user and the current news trend. Third, we combined information filtering and collaborative filtering for personalized news recommendation and ran an experiment on the live traffic, showing improved results.

PERSONALIZATION IN GOOGLE NEWS
Google News is a computer-generated news website that aggregates headlines from news sources worldwide. It classifies news articles into different topic categories (e.g. “world”, “sport”, “entertainment”, etc.) and displays them in corresponding sections, as do standard news websites, but with fully automated text-based classification. Google News serves millions of users around the world, and provides numerous editions for different countries and languages.

Users usually visit Google News starting from the homepage. The homepage of the standard edition has the Top Stories section at the top of the page, followed by topic-based sections of news articles, like “world” and “sport”.

If a user signs in to her Google Account and explicitly enables Web History, the system will record her click history and generate a personalized section for her, named “Recommended for [account]”, containing stories recommended based on her click history in Google News. For our analysis, the recorded click histories were fully anonymized and kept secure according to the Google Privacy Policy.

A previous Google News recommendation system was developed using a collaborative filtering method [7]. It recommends news stories that were read by users with similar click history. This method has two major drawbacks in recommending news stories. First, the system cannot recommend stories that have not yet been read by other users, a problem that is often referred to as the first-rater problem [7, 8]. For news recommendations, this is a serious problem, as news service websites strive to present the most up-to-date information to users in a timely manner. News articles presented in Google News are usually published within one hour. However, the collaborative filtering method has to wait several hours to collect enough clicks to recommend the news story to users, resulting in undesirable time lags between break-out news and recommendations. Second, not all users are equal to each other, and the collaborative filtering method may not account for the individual variability between users [3]. For example, we observed that entertainment news stories are constantly recommended to most of the users, even to those users who never clicked on entertainment stories. The reason is that entertainment news stories are generally very popular, thus there are always enough clicks on entertainment stories from a user’s “neighbors” to make the recommendation.

A solution to these two problems would be to build profiles of users’ genuine interests and use them to make news recommendations. The profiles would help the system filter out the stories that are not of interest to the user, such as the entertainment news mentioned above. A news story may also be recommended to the user if it matches her interest, even if the story has not been clicked on by other users.

In this paper, we describe an information filtering method to recommend news articles based on their topic categories, which are assigned by text classifiers. Based on a user’s news reading history, the information filtering component predicts the topic categories of interest to her each time she visits Google News. News articles in those categories are ranked higher in the candidate list and will be recommended to the user. We chose to recommend news stories at the general level of topic categories instead of fine-grained topics because of the nature of news reading: most users visit news websites with the attitude of “show me something interesting,” rather than having any specific information goals [7]. Over-specializing the user profile may limit the recommendations to news that the user already knew, which is obviously undesirable for news reading.

The user activities that Google News records are the user’s clicks on the Google News website. The system records the event and the time when a user clicks on the page. Each
click on a news article is treated as a positive vote for the topic category of that article.

There are two practical constraints on our news information filtering algorithm. First, a user’s news interests may change over time. The system should be able to incrementally update the user’s profile to reflect change in interest. Second, there is a large variance in the click history size of the users. A successful algorithm needs to degrade gracefully, i.e. be able to provide reasonable recommendations even when there is little information about the user.

RELATED WORK
Two different technologies are commonly used in recommender systems: information filtering and collaborative filtering. The information filtering approach recommends information based on profiles; these profiles are built by analyzing the content of items that the user accessed and favored in the past. In contrast, the collaborative filtering approach does not consider the content of items, but uses the opinions of peer users to generate recommendations. In this paper, we focus on developing an effective information filtering mechanism for news recommendation in a large-scale website.

The information filtering approach has been applied to provide personalized selection of news information in various forms such as personal news agents [1], news readers for wireless devices [2, 3] and web-based news aggregators [19]. These systems build user profiles from information explicitly provided by the user or implicitly observed in user activities. The profiles are then compared with the content of news articles to generate personalized recommendations.

Tan and Teo [19] presented a personalized news system, named PIN. PIN retrieves and ranks news articles according to the user’s profile, which is initially defined by the user as a list of keywords and then learned from user feedback using neural network technology. When interacting with PIN, users provide explicit feedback by rating the articles. A similar system, News Dude [1], reads news to users, supporting a series of feedback options such as “interesting”, “not interesting”, “I already know this”, etc. A special purpose news browser for PDAs, named WebClipping2, is implemented by Carreira et al. [3]. WebClipping2 uses a Bayesian classifier to calculate the probability that a specific article would be interesting to the user. Rather than requiring users to provide explicit feedback, WebClipping2 observes the total reading time, number of lines read and some other characteristics of user behavior to infer the user’s interests. Another personal news agent, PVA [8], uses a proxy to collect the user’s page clicks and browsing time, in order to construct a “personal view” that reflects user interests. PVA is applied and evaluated to provide personalized news access.

Unlike these news personalization systems, our news recommendation system infers user interest based on users’ click behavior on the news website. There are no ratings or negative votes to gauge what the user dislikes. For privacy protection reasons, Google News does not record detailed information about the clicks, such as the amount of time spent on the page. Thus, the system needs to make reasonable predictions with the limited and noisy information of user activity on the website.

Recently, there has been some research on user modeling based on click histories, mostly with the aim of enhancing personalized web search. For instance, Qiu and Cho [13] presented a formal framework and a method to automatically learn user interest based on past click history. The learned user interest is integrated in Topic-Sensitive PageRank to generate personalized ranking. Speretta and Gauch [17] classified queries and snippets of clicked search results to create user profiles, which were then used to re-rank search results. Kim and Chan [9] proposed to model user interest in a hierarchy of concepts, going from general to specific. The hierarchy is learned from the web pages bookmarked by the user using clustering methods.

An important issue in user modeling, particularly for news access, is the change in user interest over time. Billsus and Pazzani [1] found that there are two types of user interest in news reading: short-term and long-term. The short-term interest usually is related to hot news events and changes quickly. In contrast, long-term interest often reflects actual user interest. Accordingly, News Dude [1] uses a multi-strategy machine learning approach to create separate models of short-term and long-term interest. Chen et al. [8] analyzed the change of user interest in news over time and used special mechanisms to update user profiles to reflect the user’s current interests. Liang and Lai [14] proposed a time-based approach to build user profiles from browsing behavior, which took into account the time spent by the user on reading the articles and the recency of user activity.

Compared to the above methods, our method is unique in that it captures the dynamic changes of user interest in the context of news trend. The system discovers the genuine interest of users and combines the genuine interest with the current news trend to predict the user’s current news interest.

The second technology for recommender systems is collaborative filtering. Collaborative filtering has been applied to personalized news reading applications, such as GroupLens [12] and the first version of the Google News recommender.

Information filtering and collaborative filtering each have their advantages and limitations [3]. Some research has tried to combine both methods and achieved encouraging results [3, 6]. The combined method benefits from both methods, providing early predictions that cover all items and users, and improving the recommendations as the number of users and ratings increases. In Google News, we combined our
new information filtering method with the collaborative
filtering method previously developed for Google News [7]
to generate personalized recommendations for news access.
The live traffic experiment showed that the combined
method improved the quality of news recommendation.
Figure 2 plots the click distribution for the United States population over time. For the clarity of the figure, only the 4 most representative categories are shown. Figure 2 shows many fluctuations in the news interests of the general public in the US, which were also observed in plots of other countries (not shown in this paper). Furthermore, some topic categories (e.g. “national”) showed greater variation than others (e.g. “health”). This phenomenon may be explained by the fact that there are more and bigger break-out news stories in national politics than in health.

We hypothesize that the interest change of a country’s general public corresponds to the big news events in that country. The log analysis provided empirical evidence for this hypothesis. For example, the US election campaign starting in late 2007 attracted a large amount of attention to national political news. Figure 2 shows that the percentage of national news clicks doubled during the election campaign compared to before the campaign. Those users who usually paid little attention to national politics probably read more national news about the election campaign because of the importance of the event. Similarly, the 2008 Olympic Games in August 2008 produced a spike in the general interest in sports news in several different countries, as shown in Figure 3.

Moreover, the log analysis shows that there are regional differences in the news trend represented in the click distributions of the general public. Figure 3 shows the change of interests in sports news in three different countries: United Kingdom, Spain and United States. Overall, Spanish users read more sports news than British and American users. Figure 3 shows spikes in June 2008 and August 2008, which correspond to the Euro Cup in June and the Olympic Games in August respectively. But the American users showed much lower interest in the Euro Cup than users in the two European countries. On the other hand, the American users’ interest in sports news dropped dramatically in November 2007, when the baseball season ended. However, there was no such trend in Spain and the UK.

Influence of the General News Trends on Individual Interest Change
The previous subsections analyze the interest change of individual users and the general public. A natural question that follows is whether the general news trends influence the interest change of individual users. To understand this question, we compare the click distribution of individual users with the click distribution of the general public in the same time period. We also computed the d1 and d∞ distances between an individual user and the general public in a randomly picked different location. If the user’s interest is influenced by the local news trend, her click distribution should be more similar to the general click distribution of the location that she belongs to than to those of other locations. The average d1 and d∞ distances are presented in Table 1.

Table 1. Comparison in click distributions between individual users and the general public

                      d1 distance    d∞ distance
Same location         0.92           0.31
Different location    1.13           0.39

As shown in the table, an individual user’s click distribution is more similar to the click distribution of the general public in the same location than to that of a randomly selected location. Using a t-test, both the d1 and d∞ distances in the same location are significantly lower than those in the different location, at the confidence level of 99%.

We can draw the following conclusions from this log analysis:

• The news interests of individual users do change over time.
• The click distributions of the general public reflect the news trend, which corresponds to the big news events.
• There exist different news trends in different locations.
• To a certain extent, the individual user’s news interests correspond with the news trend in the location that the user belongs to.

BAYESIAN FRAMEWORK FOR USER INTEREST PREDICTION
The log analysis reveals that the click distributions of individual users are influenced by the local news trend. For example, Spanish users read more sports news during the Euro Cup. Similar phenomena were also reported in a user study of the lifecycle of news interests [8]. Based on these findings, we decompose a user’s news interests into two parts: the user’s genuine interests and the influence of the local news trend. The user’s genuine interests originate from the personal characteristics of the user, such as gender, age, profession, etc. and are thus relatively stable over time. On the other hand, when deciding what to read, users are also influenced by the news trend in the location that they belong to. This kind of influence produces short-term effects and changes over time. The genuine interests and news trend influence correspond to the “long-term” and “short-term” interests discussed in [1]. However, we used distinct methods to predict the user’s news interests. More importantly, we model the “short-term” interests from the perspective of news trend using the click patterns of the general public, instead of only using the user’s own feedback.

We developed an approach using Bayesian frameworks [10] to predict users’ current news interest based on the click patterns of the individual users and the group of users in the country. The predicted interests are used in news information filtering. The approach works as follows: first, the system predicts the user’s genuine news interests regardless of the news trend, using the user’s clicks in each past time period; second, the predictions made with data in a series of past time periods are combined to gain an accurate prediction of the user’s genuine news interests; finally, the system predicts the user’s current interests by combining her genuine news interests and the current news trend in her location.

Predicting User’s Genuine News Interest
For a specific time period $t$ in the past, we observed the click distribution of individual users, $D(u, t)$, and the click distribution of all the users in a country, $D(t)$, which represents the news trend in that country in the time period. We would like to learn the user’s genuine interests revealed in $D(u, t)$ regardless of the influence of $D(t)$. The genuine interest of a user in topic category $c_i$ is modeled as $p^t(\text{click} \mid \text{category} = c_i)$, the probability of the user clicking on an article about $c_i$. Using Bayes’ rule, $p^t(\text{click} \mid \text{category} = c_i)$ is computed as follows:

$$\text{interest}^t(\text{category} = c_i) = p^t(\text{click} \mid \text{category} = c_i) = \frac{p^t(\text{category} = c_i \mid \text{click})\, p^t(\text{click})}{p^t(\text{category} = c_i)} \tag{3}$$

$p^t(\text{category} = c_i \mid \text{click})$ is the probability that the user’s clicks are in category $c_i$. It can be estimated by the click distribution $D(u, t)$ observed in time period $t$, as computed in Equation 1.

$p^t(\text{category} = c_i)$ is the prior probability of an article being about category $c_i$. This is the proportion of news articles published about that category in the time period, which correlates with the news trend in the location. As more news events happen in a given topic category, more news articles will be written in that category. Thus, we can approximate this probability with the click distribution of the general public, $D(t)$.

$p^t(\text{click})$ is the prior probability of the user clicking on any news article, regardless of the article category.

According to Equation 3, $p^t(\text{click} \mid \text{category} = c_i)$ represents the extent to which the user’s interest in the topic category differs from that of the general public of the same location. If the user reads a lot of sports news while a lot of users are reading it, the user may not be particularly interested in sports but may read the sports news because of some hot sports event. In contrast, an extraordinarily large proportion of clicks on sports news is a strong signal of the user’s genuine interest in sports.

Combining Predictions of Past Time Periods
Equation 3 computes the user’s genuine news interest based on the click distributions in a particular time period. To accurately gauge the user’s genuine interests, we combine the predictions made over multiple time periods as follows:

$$\text{interest}(\text{category} = c_i) = \frac{\sum_t \left( N^t \times \text{interest}^t(\text{category} = c_i) \right)}{\sum_t N^t} = \frac{\sum_t \left( N^t \times \frac{p^t(\text{category} = c_i \mid \text{click})\, p^t(\text{click})}{p^t(\text{category} = c_i)} \right)}{\sum_t N^t} \tag{4}$$

Here, $N^t$ is the total number of clicks by the user in time period $t$. We can assume that the prior probability of a user clicking on any article is constant over time. Thus, Equation 4 becomes Equation 5:
$$\text{interest}(\text{category} = c_i) = \frac{p(\text{click}) \times \sum_t \left( N^t \times \frac{p^t(\text{category} = c_i \mid \text{click})}{p^t(\text{category} = c_i)} \right)}{\sum_t N^t} \tag{5}$$

Predicting User’s Current News Interest
As we discussed before, the user’s news interest is decomposed into two parts: the genuine news interest and the influence of news trends. The previous section calculated the user’s genuine news interests based on her past click behavior. To gauge the current news trend, we use the click distribution of the general public in a short current time period (e.g. in the past hour), represented as $p^0(\text{category} = c_i)$. Because of the large number of users, there are enough clicks in the short current time period to accurately estimate the popular topic categories in the location.

The ultimate goal is to predict the click distribution of the user for the near future. Again, we use Bayes’ rule:

$$p^0(\text{category} = c_i \mid \text{click}) = \frac{p^0(\text{click} \mid \text{category} = c_i)\, p^0(\text{category} = c_i)}{p^0(\text{click})} \tag{6}$$

We estimate $p^0(\text{click} \mid \text{category} = c_i)$ with the genuine news interests, $\text{interest}(\text{category} = c_i)$, computed in Equation 5, and assume the probability of a user clicking on any news article is constant, thus,

$$p^0(\text{category} = c_i \mid \text{click}) \propto \frac{\text{interest}(\text{category} = c_i)\, p^0(\text{category} = c_i)}{p(\text{click})} \propto \frac{p^0(\text{category} = c_i) \times \sum_t \left( N^t \times \frac{p^t(\text{category} = c_i \mid \text{click})}{p^t(\text{category} = c_i)} \right)}{\sum_t N^t} \tag{7}$$

In addition to the user’s past clicks, we add a set of virtual clicks, with the same click distribution as that of the current news trend, i.e. $p^0(\text{category} = c_i)$. Thus, the final estimation of the user’s news interests in the near future is

$$p^0(\text{category} = c_i \mid \text{click}) \propto \frac{p^0(\text{category} = c_i) \times \left( \sum_t \left( N^t \times \frac{p^t(\text{category} = c_i \mid \text{click})}{p^t(\text{category} = c_i)} \right) + G \right)}{\sum_t N^t + G} \tag{8}$$

$G$ is the number of virtual clicks (set to 10 in the system), which can be regarded as a smoothing factor. When the system observes very few (even zero) clicks from the user, the system will predict the user’s interest mostly based on the current news trend, which is still a reasonable estimation. On the other hand, if $\sum_t N^t$ is much larger than $G$, the estimation is mainly based on the user’s own click distribution in the past.

Another advantage of the proposed approach is that the user’s interests can be updated incrementally. The system can save the values of $N^t$ and $\frac{p^t(\text{category} = c_i \mid \text{click})}{p^t(\text{category} = c_i)}$ for each past time period. When updating the user’s profile, the system only needs to compute the value for the most recent time period and recompute the weighted sum with the saved values.

NEWS RECOMMENDATION
In order to rank the list of candidate articles to be recommended, the system generates an information filtering score, $\text{IF}(\text{article})$, and a collaborative filtering score, $\text{CF}(\text{article})$, for each article. $\text{IF}(\text{article})$ is based on the topic category of that article and the user’s interest predicted using Equation 8. The collaborative method implemented in [7] computes $\text{CF}(\text{article})$. The two scores are combined in ranking the candidates for news recommendation:

$$\text{Rec}(\text{article}) = \text{IF}(\text{article}) \times \text{CF}(\text{article}) \tag{9}$$

Combining the information filtering method and the collaborative method offers the advantages of both methods and shows improved performance over using the collaborative method alone. In the next section, we describe our evaluation of the combined method on the live traffic of Google News.

LIVE TRAFFIC EXPERIMENT
To evaluate the performance of the combined method and understand the user experience with personalized news recommendation, we conducted experiments on a fraction (about 10,000 users) of the live traffic at Google News. The users were randomly assigned to a control group and a test group. The two groups had about the same number of users. When a logged-in Google News user (who also has explicitly enabled web history) visits the website, a section of recommended news is generated particularly for that user. In our experiment, the users in the control group get recommended news from the existing collaborative filtering method, while the new combined method is used for the test group. Aggregate clickthrough rate analysis was then performed over fully anonymized click logs.

The experiment was run for 34 days, from 1/10/2009 to 2/17/2009. The user’s clicks in the past 12 months are used as history to compute the user’s interests. To gain greater accuracy in estimating the news trend in the past, we calculated the click distributions of the general public for each week. The current news trend, $p^0(\text{category} = c_i)$, is
Figure 5. CTR of the Google News homepage
Figure 4. CTR of the recommended news section reading: CTR of the Google News homepage and the
frequency of visiting the Google News site.
estimated with the click distribution of the general public in The CTR of the Google News homepage is calculated as
the past day.

Three different metrics are used to measure the performance of the recommender and the user's experience: click-through rate (CTR) of the recommended news section, CTR of the Google News homepage, and frequency of visits to the Google News website. We calculated the three metrics for each user on a daily basis. The performance of the control and test groups was derived by averaging the measurements of all the users in the corresponding group. We report the experiment results for the three aspects below.

CTR of the recommended news section is calculated as the number of clicks on the recommended news articles every time the user visits the Google News website. It directly measures the quality of the recommendations, that is, how many of the recommendations are clicked on, and thus liked, by the user. Figure 4 shows the CTR of the recommended news section for the control and test groups over the 34 days of the experiment. The values are scaled so that the CTR of the control group on the first day is 1. As shown in the figure, the CTR in the test group is higher than the CTR in the control group on 33 of the 34 days of the experiment. This shows that the proposed news interest prediction method improved the quality of news recommendations. On average, the combined method that incorporates the information filtering method improves the CTR over the existing collaborative filtering method by 30.9%.

The recommended news section is only one part of the Google News website, which presents to the user many other standard non-personalized news sections along with the recommended news section, such as top stories, world news, business news, etc. We would like to analyze the effect of the improved recommender on the user experience of the whole website. Two metrics are computed to evaluate the news recommender in the larger context of the news website. CTR of the Google News homepage is calculated as the total number of clicks for each page visit made by the user. Figure 5 plots the measurements for the control and test groups in the experiment. Interestingly, there is not much difference in the CTR of the homepage between the two groups. Although the test group clicked on more news articles in the recommended news section (shown in Figure 4), the total number of articles that a user is willing to click on in each website visit seems to be constant. In other words, the improved recommender "stole" clicks from other non-personalized sections, rather than increasing the overall number of clicks. The experiment demonstrated that the improved news recommender led to more focused news reading in the test group. As the recommender was improved to present news articles that better matched the user's interests, the users seemed to pay more attention to the recommended news section and spend less time and effort finding interesting news articles in the non-personalized sections.

We measure the overall satisfaction with the Google News website by the frequency of website visits, calculated as the number of times the user visits the website in a day.

Figure 6. Frequency of website visit per day
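The three metrics defined above, and the way they are averaged over users and scaled for plotting, can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the experiment: the `Visit` record, its field names, the function names, and the toy data are all hypothetical; only users with at least one visit on a given day are counted; and the average improvement is taken as the ratio of the groups' mean daily values, since the exact averaging behind the reported percentages is not specified in the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Visit:
    """One homepage visit (hypothetical record layout)."""
    user: str
    group: str          # "control" or "test"
    day: int            # day index within the experiment
    rec_clicks: int     # clicks in the recommended news section
    total_clicks: int   # clicks anywhere on the homepage


def daily_user_metrics(visits):
    """Return {(group, day, user): (rec_ctr, home_ctr, visit_freq)}.

    rec_ctr and home_ctr are clicks per visit; visit_freq is visits per day.
    """
    buckets = defaultdict(list)
    for v in visits:
        buckets[(v.group, v.day, v.user)].append(v)
    metrics = {}
    for key, vs in buckets.items():
        n = len(vs)
        metrics[key] = (
            sum(v.rec_clicks for v in vs) / n,
            sum(v.total_clicks for v in vs) / n,
            n,
        )
    return metrics


def group_daily_means(metrics):
    """Average the per-user values within each (group, day)."""
    rows = defaultdict(list)
    for (group, day, _user), values in metrics.items():
        rows[(group, day)].append(values)
    return {k: tuple(mean(col) for col in zip(*r)) for k, r in rows.items()}


def scaled_series(means, metric_index, first_day):
    """Scale one metric so the control group's first-day value equals 1."""
    baseline = means[("control", first_day)][metric_index]
    return {k: v[metric_index] / baseline for k, v in means.items()}


def average_lift(means, metric_index):
    """Relative improvement of test over control (assumed: ratio of means)."""
    test_days = {d for (g, d) in means if g == "test"}
    control_days = {d for (g, d) in means if g == "control"}
    days = sorted(test_days & control_days)
    test = mean(means[("test", d)][metric_index] for d in days)
    control = mean(means[("control", d)][metric_index] for d in days)
    return test / control - 1.0


# Toy data for illustration only; these are not values from the experiment.
visits = [
    Visit("u1", "control", 1, 1, 4),
    Visit("u1", "control", 1, 0, 2),
    Visit("u2", "control", 1, 1, 3),
    Visit("u3", "test", 1, 2, 4),
    Visit("u3", "test", 1, 1, 2),
    Visit("u3", "test", 1, 0, 3),
    Visit("u4", "test", 1, 1, 3),
]
means = group_daily_means(daily_user_metrics(visits))
rec_ctr_scaled = scaled_series(means, metric_index=0, first_day=1)
rec_ctr_lift = average_lift(means, metric_index=0)
home_ctr_lift = average_lift(means, metric_index=1)
visit_freq_lift = average_lift(means, metric_index=2)
```

In the toy data the test group clicks more in the recommended section and visits more often while its clicks per visit match the control group's, which mirrors the qualitative pattern reported for the experiment.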
Figure 6 shows the frequency of website visits for the control and test groups. It is evident in the figure that the test group visited Google News more often than the control group on most of the days in the experiment period. On average, the frequency of website visits in the test group is 14.1% higher than in the control group.

In summation, the proposed news interest prediction method improved the quality of news recommendations. More recommended news articles were clicked on by the users in the test group using the new combined method than by the control group using the existing collaborative filtering method. As a result, users seemed to like Google News more and visited the website more often. However, the total amount of attention that users are willing to pay per visit seems to be constant. As users clicked on more recommended news articles, they clicked on fewer articles in the standard non-personalized sections. Further in-depth user studies would be needed to understand the effects of personalization on information exploration and serendipitous discovery.

CONCLUSION AND FUTURE WORK
In this paper, we present our research on developing an effective information filtering mechanism for news recommendations in a large-scale website such as Google News. We first conducted a log analysis of how users' interests in news topics change over time. The log analysis demonstrated variations in users' news interests and showed that the news interests of individual users are influenced by the local news trend. Based on these findings, we decompose users' news interests into two parts: the genuine interests and the influence of local news trends. A Bayesian framework is proposed to model a user's genuine interests using her past click history and predict her current interests by combining her genuine interests and the local news trend. The method for predicting users' interests was used in news information filtering, and it was combined with the existing collaborative filtering method to generate personalized news recommendations. We conducted an experiment with the news recommender using the combined method on a fraction of live traffic on the Google News website. Compared with the existing collaborative filtering method, the experiment showed that the combined method improved the quality of news recommendations and attracted more frequent visits to the Google News website.

The research can be extended in the following directions in the future. Position bias can be investigated and incorporated in modeling users' interests using the click behavior. More advanced methods for combining the information filtering and collaborative filtering mechanisms can also be studied to better leverage the advantages of both mechanisms. In addition, our live traffic experiment revealed that the improved recommender increased the CTR of the recommended news section while reducing the CTR of other standard sections. Further user studies can be conducted to investigate this phenomenon and to better understand the effect of personalization on news exploration.

REFERENCES
1. Billsus, D., Pazzani, M. A hybrid user model for news story classification. In Proceedings of the Seventh International Conference on User Modeling, 1999.
2. Billsus, D., Pazzani, M. J. User Modeling for Adaptive News Access. User Modeling and User-Adapted Interaction, v.10 n.2-3, p.147-180, 2000.
3. Carreira, R., Crato, J. M., Gonçalves, D., Jorge, J. A. Evaluating adaptive user profiles for news classification. Proceedings of the 9th international conference on Intelligent user interfaces, 2004.
4. Chen, C. C., Chen, M. C., Sun, Y. PVA: a self-adaptive personal view agent system. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001.
5. Chen, Y-S., Shahabi, C. Automatically improving the accuracy of user profiles with genetic algorithm. In Proceedings of IASTED International Conference on Artificial Intelligence and Soft Computing, 2001.
6. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M. Combining Content-Based and Collaborative Filters in an Online Newspaper. In Proceedings of ACM SIGIR Workshop on Recommender Systems, 1999.
7. Das, A. S., Datar, M., Garg, A., Rajaram, S. Google news personalization: scalable online collaborative filtering. Proceedings of the 16th international conference on World Wide Web, 2007.
8. Good, N., Schafer, J. B., Konstan, J. A., Borchers, A., Sarwar, B., Herlocker, J., Riedl, J. Combining collaborative filtering with personal agents for better recommendations. Proceedings of the 16th national conference on Artificial intelligence and the 11th Innovative applications of artificial intelligence conference, 1999.
9. Kim, H. R., Chan, P. K. Learning implicit user interest hierarchy for context in personalization. Proceedings of the 8th international conference on Intelligent user interfaces, January 12-15, 2003.
10. Jensen, F. V. Bayesian Networks and Decision Graphs. Springer, 2001.
11. Katakis, I., Tsoumakas, G., Banos, E., Bassiliades, N., Vlahavas, I. An adaptive personalized news dissemination system. Journal of Intelligent Information Systems, Volume 32, Issue 2, 2009.
12. Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R., Riedl, J. GroupLens: Applying collaborative filtering to Usenet news. Commun. ACM 40, 77-87, 1997.
13. Lee, U., Liu, Z., Cho, J. Automatic identification of user goals in Web search. Proceedings of the 14th international conference on World Wide Web, 2005.
14. Liang, T.-P., Lai, H.-J. Discovering User Interests from Web Browsing Behavior: An Application to Internet News Services. IEEE Computer Society, Los Alamitos, CA, USA, 2002.
15. Liu, F., Yu, C., Meng, W. Personalized Web Search For Improving Retrieval Effectiveness. IEEE Transactions on Knowledge and Data Engineering, 2004.
16. Maes, P. Agents that reduce work and information overload. Communications of the ACM, v.37 n.7, p.30-40, July 1994.
17. Speretta, M., Gauch, S. Personalized Search based on User Search Histories. In IEEE/WIC/ACM International Conference on Web Intelligence, 2005.
18. Sugiyama, K., Hatano, K., Yoshikawa, M. Adaptive web search based on user profile constructed without any effort from users. In Proceedings of the 13th International Conference on World Wide Web, 2004.
19. Tan, A., Teo, C. Learning User Profiles for Personalized Information Dissemination. Proceedings of 1998 IEEE International Joint Conference on Neural Networks, pp. 183-188, May 1998.
20. Tan, A., Teo, C. Learning user profiles for personalized information dissemination. In Proceedings of 1998 IEEE International Joint Conference on Neural Networks, 1998.
21. Wedig, S., Madani, O. A large-scale analysis of query logs for assessing personalization opportunities. Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 2006.